What does "Survivorship Bias" mean?
Definition of Survivorship Bias in the context of A/B testing (online controlled experiments).
What is Survivorship Bias?
Survivorship bias is a systematic under- or overestimation of an effect on a parameter of interest, caused by a difference between the population remaining at the end of an experiment and the population that entered it. In extreme cases the bias can account for the entire observed effect size, with no genuine effect present at all.
Survivorship bias can affect an A/B test in which the primary KPI is a per-user or per-order metric (as most key performance indicators are: conversion rate, average revenue per user, average order value) and in which the final decision is based on only a portion of the users and user interactions. One reason to restrict the analysis this way is to avoid learning effects. For example, in an online controlled experiment that lasted 8 weeks, only the data from the last 4 weeks might be analyzed, allowing 4 weeks for any learning effects to wear off or subside significantly. As a result, users who did not like the treatment, and who are therefore much less likely to participate in the last 4 weeks, are filtered out of the treatment group. This inflates the treatment group's per-user metrics and makes the treatment look more successful than it is.
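The filtering mechanism described above can be illustrated with a small deterministic calculation. This is only a sketch under invented assumptions: the two user segments, their conversion rates, and their retention shares are hypothetical numbers chosen for illustration, and the names `Segment`, `full_rate`, and `late_window_rate` are not from any library. The treatment is given no true effect at all: each segment converts at the same rate in both arms, and the arms differ only in how many low-intent users are still active in the late window.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    users: int                  # users of this type who entered each arm
    conv_rate: float            # per-user conversion rate, identical in both arms
    retention_control: float    # share still active in weeks 5-8, control arm
    retention_treatment: float  # share still active in weeks 5-8, treatment arm


def full_rate(segments):
    """Conversion rate over everyone who entered the test (same for both arms)."""
    users = sum(s.users for s in segments)
    conversions = sum(s.users * s.conv_rate for s in segments)
    return conversions / users


def late_window_rate(segments, arm):
    """Conversion rate computed only over users still active in weeks 5-8."""
    active = 0.0
    conversions = 0.0
    for s in segments:
        retention = s.retention_control if arm == "control" else s.retention_treatment
        survivors = s.users * retention
        active += survivors
        conversions += survivors * s.conv_rate
    return conversions / active


# Hypothetical population: low-intent users dislike the treatment and
# mostly stop visiting; high-intent users are retained equally in both arms.
segments = [
    Segment(users=5000, conv_rate=0.02, retention_control=0.5, retention_treatment=0.2),
    Segment(users=5000, conv_rate=0.06, retention_control=0.5, retention_treatment=0.5),
]

all_users = full_rate(segments)
control_late = late_window_rate(segments, "control")
treatment_late = late_window_rate(segments, "treatment")

print(f"all users, either arm:  {all_users:.4f}")
print(f"control, weeks 5-8:     {control_late:.4f}")
print(f"treatment, weeks 5-8:   {treatment_late:.4f}")
print(f"apparent relative lift: {treatment_late / control_late - 1:.1%}")
```

With these made-up numbers, both arms convert at 4% over all users who entered, yet the late-window analysis shows the treatment at roughly 4.9% against 4% for control. The entire apparent lift comes from which users remain in the treatment group, not from any change in their behavior.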
A classic example of survivorship bias comes from World War II, when engineers were considering which parts of bombers should receive additional armor. Planes that returned from raids over Germany were examined and meticulous records were made of each hit a plane took. Areas with the highest proportion of hits were identified and the military wanted to armor-plate those parts. The statistician Abraham Wald, however, pointed out that these would be the worst places to add armor: the mere fact that there was so much damage on these parts and the planes still made it back to Britain suggests that these parts are not crucial to a plane's survival. Instead, armor should be added to the parts that showed no damage, since a plane hit there apparently did not make it back.
Like this glossary entry? For an in-depth and comprehensive reading on A/B testing stats, check out the book "Statistical Methods in Online A/B Testing" by the author of this glossary, Georgi Georgiev.