Simpson's Paradox in A/B Testing: Don't Let Aggregate Data Fool You

Have you ever looked at A/B test results where the overall winner lost in every single segment? Or vice versa? If so, you might have encountered Simpson's Paradox – a statistical illusion that can lead to costly optimization mistakes.

What is Simpson's Paradox?

Simpson's Paradox occurs when a trend that appears within each of several groups of data disappears or reverses when those groups are combined. In A/B testing, this means an overall result (like total conversion rate) can contradict the results seen within every individual segment (like mobile vs. desktop users).

A Simple A/B Test Example

Imagine testing a new landing page (Variant B) against the original (Variant A). Overall, Variant B wins:

  • Overall:
    • Variant A: 10% Conversion Rate
    • Variant B: 12% Conversion Rate (Winner! ...or is it?)

But when you segment by device type:

  • Desktop Users:
    • Variant A: 15% Conversion Rate (Winner)
    • Variant B: 14% Conversion Rate
  • Mobile Users:
    • Variant A: 6% Conversion Rate (Winner)
    • Variant B: 5% Conversion Rate

Suddenly, Variant A is the winner on both desktop and mobile, directly contradicting the overall result. That's Simpson's Paradox!
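
To see that these percentages can really coexist, here is a minimal Python sketch. The visitor and conversion counts are hypothetical (the example above only gives rates); they were picked so that every segment rate and both overall rates match the numbers listed.

```python
# Hypothetical counts, chosen only so the rates match the example above.
# (variant, segment) -> (visitors, conversions)
counts = {
    ("A", "desktop"): (4000, 600),  # 15%
    ("A", "mobile"): (5000, 300),   # 6%
    ("B", "desktop"): (7000, 980),  # 14%
    ("B", "mobile"): (2000, 100),   # 5%
}


def rate(visitors, conversions):
    """Conversion rate for one cell of the table."""
    return conversions / visitors


def overall_rate(variant):
    """Pool all segments for one variant, then compute the conversion rate."""
    visitors = sum(v for (var, _), (v, _c) in counts.items() if var == variant)
    conversions = sum(c for (var, _), (_v, c) in counts.items() if var == variant)
    return conversions / visitors


for (variant, segment), (v, c) in counts.items():
    print(f"Variant {variant}, {segment}: {rate(v, c):.1%}")

for variant in ("A", "B"):
    print(f"Variant {variant}, overall: {overall_rate(variant):.1%}")
```

Notice that in these made-up counts Variant B gets most of its visitors from desktop, while Variant A gets most of its visitors from mobile. That imbalance is what lets B win overall while losing in both segments.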

Why Does This Happen?

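The paradox arises when a confounding variable is distributed unevenly between the variations. In our example, 'device type' is the confounding variable: desktop users convert at a much higher rate than mobile users, regardless of which page they see. Each variation's overall conversion rate is a weighted average of its segment rates, weighted by the share of that variation's traffic that came from each segment.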

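If Variant B received a much larger share of its traffic from the higher-converting segment (Desktop users in this case), that segment's performance would disproportionately influence B's overall average, masking the fact that Variant A performed better within each segment. For the numbers above to hold, roughly 78% of Variant B's traffic would have to be desktop, compared with about 44% of Variant A's.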
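
The weighted-average idea can be sketched in a few lines of Python. The segment rates come from the example; the traffic mixes (4/9 desktop for A, 7/9 desktop for B) are not stated in the example and are assumptions, solved so that the blends come out to 10% and 12%. The function name `blended_rate` is just an illustrative choice.

```python
from fractions import Fraction


def blended_rate(segment_rates, traffic_shares):
    """Overall rate = sum of (segment rate x that segment's share of traffic)."""
    if sum(traffic_shares.values()) != 1:
        raise ValueError("traffic shares must sum to 1")
    return sum(segment_rates[s] * traffic_shares[s] for s in segment_rates)


# Segment conversion rates from the example.
rates_a = {"desktop": Fraction(15, 100), "mobile": Fraction(6, 100)}
rates_b = {"desktop": Fraction(14, 100), "mobile": Fraction(5, 100)}

# Assumed traffic mixes (not given in the example).
mix_a = {"desktop": Fraction(4, 9), "mobile": Fraction(5, 9)}
mix_b = {"desktop": Fraction(7, 9), "mobile": Fraction(2, 9)}

overall_a = blended_rate(rates_a, mix_a)
overall_b = blended_rate(rates_b, mix_b)
print(f"A overall: {float(overall_a):.1%}")
print(f"B overall: {float(overall_b):.1%}")

# Give B the same traffic mix as A and the reversal disappears:
b_with_a_mix = blended_rate(rates_b, mix_a)
print(f"B overall with A's mix: {float(b_with_a_mix):.1%}")
```

When both variations are blended with the same traffic mix, the overall comparison agrees with the segments, which is why the skew in traffic (not the page itself) is what produces the reversal.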

Why It Matters for A/B Testing

Relying solely on aggregate A/B test results can lead you to:

  • Implement a losing variation, hurting your conversions and KPIs.
  • Miss crucial insights about how different user segments interact with your site.
  • Make poor strategic decisions based on misleading data.

How to Avoid the Simpson's Paradox Trap

The solution is straightforward:

  1. Always Segment Your Data: Don't stop at the overall results. Analyze performance across key segments relevant to your test (e.g., device type, traffic source, new vs. returning visitors, browser).
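  2. Understand Segment Distribution: Check if traffic or user distribution across segments is heavily skewed between variations. In a properly randomized test, each variation should see roughly the same segment mix, so an uneven distribution is a red flag both for potential paradoxes and for a problem with how traffic was allocated.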
  3. Look for Consistency: Check if the trend observed in the overall result holds true across important segments.
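
The three checks above can be combined into one small helper. This is a rough sketch rather than a full analysis tool: the function name `segment_report`, the input layout, and the example counts are all hypothetical, it looks only at raw rates (no significance testing), and ties go to whichever variant sorts first.

```python
def segment_report(counts):
    """counts maps (variant, segment) -> (visitors, conversions)."""
    variants = sorted({var for var, _ in counts})
    segments = sorted({seg for _, seg in counts})

    # Pooled visitors and conversion rate per variant (the "overall" view).
    visitors = {v: sum(counts[(v, s)][0] for s in segments) for v in variants}
    overall = {
        v: sum(counts[(v, s)][1] for s in segments) / visitors[v]
        for v in variants
    }
    overall_winner = max(variants, key=lambda v: overall[v])

    # Step 1: segment the data and find the winner inside each segment.
    segment_winners = {
        s: max(variants, key=lambda v: counts[(v, s)][1] / counts[(v, s)][0])
        for s in segments
    }

    # Step 2: compare how each variant's traffic is spread across segments.
    shares = {
        v: {s: counts[(v, s)][0] / visitors[v] for s in segments}
        for v in variants
    }
    max_share_gap = max(
        max(shares[v][s] for v in variants) - min(shares[v][s] for v in variants)
        for s in segments
    )

    # Step 3: consistency check. A full reversal means the overall winner
    # does not win in any segment.
    full_reversal = all(w != overall_winner for w in segment_winners.values())

    return {
        "overall_winner": overall_winner,
        "segment_winners": segment_winners,
        "max_share_gap": max_share_gap,
        "full_reversal": full_reversal,
    }


# Hypothetical counts consistent with the rates in the example.
example = {
    ("A", "desktop"): (4000, 600),
    ("A", "mobile"): (5000, 300),
    ("B", "desktop"): (7000, 980),
    ("B", "mobile"): (2000, 100),
}
report = segment_report(example)
print(report)
```

A large `max_share_gap` together with `full_reversal` being true is the pattern to watch for. How large a gap counts as "heavily skewed" is a judgment call that depends on your traffic volume, so no threshold is hard-coded here.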

Conclusion: Look Beyond the Surface

Simpson's Paradox is a powerful reminder that aggregate data can hide critical nuances. For accurate A/B test analysis and informed decision-making, segmentation isn't optional – it's essential. Always dig deeper than the overall numbers to understand the true impact of your changes.

Published: 4/22/2025