What does "Statistical Significance" mean?
Definition of Statistical Significance in the context of A/B testing (online controlled experiments).
What is Statistical Significance?
Aliases: statistically significant, significant
For a result of an A/B test to be statistically significant, it has to cross the predefined significance threshold set when designing the test. The threshold is usually expressed in terms of a p-value: observing a p-value lower than the threshold results in the rejection of the relevant null hypothesis. For example, with a threshold of 0.05, a p-value of 0.02 is statistically significant and thus the null hypothesis can be rejected at that significance level (0.05).
Furthermore, the null could be rejected at any threshold higher than the observed significance level.
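The decision rule above can be illustrated with a minimal Python sketch. It assumes a one-sided, pooled two-proportion z-test, which is only one of several ways a p-value might be computed for an A/B test. The function names and the conversion counts are hypothetical and chosen purely for illustration.

```python
import math


def one_sided_p_value(conv_a, n_a, conv_b, n_b):
    """P-value for H0: rate_b <= rate_a (assumed pooled two-proportion z-test)."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))


def is_significant(p_value, alpha):
    """A result is significant when the p-value is below the threshold."""
    return p_value < alpha


# Hypothetical data: control (A) and variant (B), 10,000 users each.
p = one_sided_p_value(conv_a=1000, n_a=10000, conv_b=1100, n_b=10000)

# The same observed p-value leads to rejection at every threshold above it
# and to no rejection at every threshold below it.
for alpha in (0.10, 0.05, 0.001):
    print(f"alpha={alpha}: p={p:.4f}, significant={is_significant(p, alpha)}")
```

The threshold is an input fixed at the design stage, while the p-value is computed from the data afterwards. The loop over several thresholds is only there to show the relationship between the two.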
If defined by its complement, the confidence level, as is often the case for historical reasons in the Conversion Rate Optimization industry, a test is statistically significant if it achieves a confidence level higher than the required threshold. For example, with a threshold of 90%, an observed significance level of 0.02 corresponds to a confidence level of 98%, and since 98% is larger than 90% the result is statistically significant.
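The conversion between the two conventions is simple arithmetic and can be sketched as follows. The function names are hypothetical, and the strict "greater than" comparison mirrors the wording above.

```python
def observed_confidence(p_value):
    """Confidence level expressed as the complement of the observed p-value."""
    return 1.0 - p_value


def is_significant_by_confidence(p_value, required_confidence):
    """Significant when the observed confidence exceeds the required level."""
    return observed_confidence(p_value) > required_confidence


# The example from the text: p = 0.02 against a 90% confidence threshold.
print(observed_confidence(0.02))
print(is_significant_by_confidence(0.02, 0.90))
```

Comparing 1 - p against a required confidence level is equivalent to comparing p against a significance threshold of 1 minus that level, so both conventions reach the same decision.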
Observing a significant outcome can logically lead to one of three conclusions: (1) a rare outcome was observed under a true null hypothesis, with the observed p-value measuring how rare; (2) the null hypothesis can be rejected; (3) the statistical model is inadequate (it does not reflect reality because its assumptions do not hold).
Like this glossary entry? For an in-depth and comprehensive reading on A/B testing stats, check out the book "Statistical Methods in Online A/B Testing" by the author of this glossary, Georgi Georgiev.
Articles on Statistical Significance
Statistical Significance in A/B Testing – a Complete Guide
blog.analytics-toolkit.com
P-value Definition and Interpretation
www.gigacalculator.com
P-values and Confidence Intervals Explained
blog.analytics-toolkit.com
Concise guide to A/B testing statistics
blog.analytics-toolkit.com
Related A/B Testing terms
Significance Threshold
alpha
p-value
One-Tailed Test
Two-Tailed Test
Confidence Level
