What does "p-value" mean?

Definition of p-value in the context of A/B testing (online controlled experiments).

What is a p-value?

The p-value, denoted by the lowercase letter "p", is the probability of observing a test statistic as extreme as or more extreme than the one actually observed, under the assumption that the null hypothesis is true. It is a post-hoc statistic, meaning that it can only be computed after a test is completed (or at interim analyses, with appropriate p-value adjustments). In proper notation it is p = P(d(X) ≥ d(x0); H0), where P stands for probability, d(X) is a test statistic (distance function), x0 is a typical realization of X (the observed data), and H0 is the selected null hypothesis. The distance function often comes in the form of a t-score or a z-score.
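As a minimal sketch of the notation above, the following computes p = P(d(X) ≥ d(x0); H0) for a one-sided comparison of two conversion rates. It assumes a pooled two-proportion z-score as the distance function and a normal approximation; the function names and the visitor and conversion counts are hypothetical and chosen only for illustration.

```python
import math


def z_score(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z statistic, used here as d(x0).

    Assumes the null hypothesis H0: rate_b <= rate_a (one-sided).
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (rate_b - rate_a) / se


def one_sided_p_value(z):
    """P(Z >= z) for a standard normal Z, i.e. P(d(X) >= d(x0); H0)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical data: control converts 1000 of 10000, variant 1100 of 10000.
z = z_score(conv_a=1000, n_a=10000, conv_b=1100, n_b=10000)
p = one_sided_p_value(z)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A t-score would follow the same pattern, with the tail probability taken from a t distribution instead of the standard normal.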

One can think of the p-value as a summary statistic that combines three pieces of information: the size of the observed difference between two or more test groups, the sample size, and the characteristics of the frequency distribution (and thus the variance) of the parameter of interest.
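The dependence on these three inputs can be illustrated with a small sketch. It assumes a one-sided z-test on the difference of two group means with a known, common standard deviation and equal group sizes; the function name and all numbers are hypothetical.

```python
import math


def p_value(diff, sd, n_per_group):
    """One-sided p-value for an observed difference in means.

    Assumes equal group sizes and a known common standard deviation,
    so the standard error of the difference is sd * sqrt(2 / n).
    """
    se = sd * math.sqrt(2 / n_per_group)
    z = diff / se
    return 0.5 * math.erfc(z / math.sqrt(2))


# Same observed difference and spread, growing sample size.
for n in (1000, 5000, 20000):
    print(f"n = {n:>6}: p = {p_value(0.01, 0.3, n):.4f}")

# Same sample size and difference, growing spread (variance).
for sd in (0.2, 0.3, 0.5):
    print(f"sd = {sd}: p = {p_value(0.01, sd, 5000):.4f}")
```

Under these assumptions the p-value falls as the observed difference or the sample size grows, and rises as the variance grows.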

The p-value is usually viewed as a measure of how surprising a result is under the assumption that the null hypothesis is true. Once we define a significance threshold past which we consider a result so unexpected that we are willing to reject the null hypothesis, we can compare the observed significance level (the p-value) with that threshold: if the p-value is lower than the threshold, we can reject the null.

The interpretation of the p-value uses a probabilistic variant of the modus tollens logic: H → e, not-e ∴ not-H. Another way to interpret it is as a strong argument from coincidence: there was a low probability that something would happen assuming the null was true, yet it did happen, so it has to be an unusual coincidence (to the extent that the p-value is low), which warrants the conclusion to reject the null hypothesis. In an A/B testing context, observing a p-value below the significance threshold means that we would implement a variant in place of the current state of affairs (we have a "winner").
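The argument from coincidence can be checked with a rough simulation of A/A tests, in which the null hypothesis is true by construction. The sketch below assumes normally distributed observations with a known standard deviation of 1 and a one-sided z-test; the function names, run count, group size and seed are arbitrary choices made for illustration.

```python
import math
import random


def one_sided_p(mean_a, mean_b, sd, n):
    """One-sided p-value for the difference of two group means (known sd)."""
    se = sd * math.sqrt(2 / n)
    z = (mean_b - mean_a) / se
    return 0.5 * math.erfc(z / math.sqrt(2))


def simulate_aa_tests(runs=2000, n=50, alpha=0.05, seed=1):
    """Share of simulated A/A tests whose p-value falls below alpha.

    Both groups are drawn from the same distribution, so every
    "significant" result here is a coincidence under a true null.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(runs):
        mean_a = sum(rng.gauss(0, 1) for _ in range(n)) / n
        mean_b = sum(rng.gauss(0, 1) for _ in range(n)) / n
        if one_sided_p(mean_a, mean_b, sd=1.0, n=n) < alpha:
            rejections += 1
    return rejections / runs


rate = simulate_aa_tests()
print(f"share of A/A tests with p < 0.05: {rate:.3f}")
```

The share should land close to the chosen threshold, which is what makes a low p-value in a real test hard to attribute to chance alone.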

In terms of the predesignated type I error rate alpha (α), observing a given p-value means that we would have rejected the null at any level α that is greater than the observed p-value.
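A short sketch of this relationship, assuming the convention that the null is rejected when the p-value is strictly below α. The function name and the observed p-value of 0.012 are hypothetical.

```python
def rejects_null(p_value, alpha):
    """Decision rule assumed here: reject H0 when p is strictly below alpha."""
    return p_value < alpha


observed_p = 0.012  # hypothetical observed p-value

# The same result is "significant" at some thresholds and not at others.
for alpha in (0.10, 0.05, 0.02, 0.01, 0.001):
    verdict = "reject H0" if rejects_null(observed_p, alpha) else "fail to reject H0"
    print(f"alpha = {alpha:<5}: {verdict}")
```

In a real test the threshold is fixed before the data are seen; the loop only shows which predesignated levels this one result would have satisfied.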

In terms of confidence intervals, observing a given p-value means that a confidence interval with a confidence level lower than (1 - p) would not cover the null hypothesis. For example, if the p-value is 0.01, a confidence interval at any level below 99% would not include values under the null.
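This duality can be sketched numerically. The example assumes a one-sided z-test of H0: difference ≤ 0 paired with a one-sided interval bounded from below, both under a normal approximation; the observed difference, its standard error and the function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def one_sided_p_value(diff, se, null_value=0.0):
    """P(Z >= z) for the observed difference, against H0: diff <= null_value."""
    z = (diff - null_value) / se
    return 1.0 - STD_NORMAL.cdf(z)


def lower_bound(diff, se, confidence):
    """Lower limit of a one-sided interval at the given confidence level."""
    return diff - STD_NORMAL.inv_cdf(confidence) * se


diff, se = 0.01, 0.0043  # hypothetical observed difference and standard error
p = one_sided_p_value(diff, se)
print(f"p = {p:.4f}")

# Below the (1 - p) level the bound sits above 0 (null excluded),
# at (1 - p) it sits at 0 up to rounding, above that level it drops below 0.
for level in (0.95, 1 - p, 0.999):
    print(f"confidence {level:.4f}: lower bound = {lower_bound(diff, se, level):.5f}")
```

The same correspondence holds between a two-sided p-value and a two-sided interval, provided both are built from the same statistical model.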



Related A/B Testing terms

Statistical Significance, Significance Threshold, Type I Error, alpha, One-Tailed Test, Two-Tailed Test, Significance Test


About the author

Georgi Z. Georgiev

Georgi has over twenty years of experience in online marketing, web analytics, statistics, and design of business experiments.

Author of the book "Statistical Methods in Online A/B Testing", white papers on statistical analysis of A/B tests, and a speaker, he has been distinguished as a winner in the Data & Analytics category of the 2024 Experimentation Thought Leadership Awards.
