What does "Multiple Comparisons" mean?

Definition of Multiple Comparisons in the context of A/B testing (online controlled experiments).

What is Multiple Comparisons?

There is no strict definition of multiple comparisons in the statistics literature. Sometimes the term refers to comparing multiple groups with each other or against a shared control group, while in other cases it refers to comparing only two groups on multiple characteristics. In A/B testing, multiple comparisons most often refers to the former case, and especially to significance testing of multiple test groups against a common control group, known as multivariate testing (MVT) or A/B/n testing.

Regardless of the precise definition, whenever several statistical comparisons are made and any one of them is enough to reject the null hypothesis, the Family-Wise Error Rate (FWER) needs to be controlled. Here "family" refers to a set of logically connected significance tests and "error rate" refers to the type I error rate.
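To see why the FWER needs controlling, consider a minimal sketch that assumes the comparisons are independent, which is a simplification: comparisons against a shared control are positively correlated, so the true rate is somewhat lower but still above the nominal level. The function name `fwer_independent` is hypothetical. With three comparisons each run at a 0.05 significance threshold, the chance of at least one false positive is about 14%.

```python
def fwer_independent(alpha, num_comparisons):
    """Probability of at least one type I error across independent tests.

    Assumption: the tests are independent and every null hypothesis is true.
    """
    return 1.0 - (1.0 - alpha) ** num_comparisons


# Each comparison is run at the same unadjusted threshold.
for k in (1, 2, 3, 5, 10):
    print(f"{k:>2} comparisons: FWER = {fwer_independent(0.05, k):.3f}")
```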

When one performs multiple comparisons in the above sense, Dunnett’s test is the most powerful (as in statistical power) procedure which retains error guarantees. It is essentially a p-value adjustment and so is often referred to as Dunnett’s Correction. Using a FWER-correcting procedure also has consequences during the planning stage, when the statistical design is decided on (with or without a risk-reward analysis).
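The idea behind Dunnett's correction can be illustrated with a rough simulation. This is a sketch, not the exact procedure: it assumes a one-sided test, equal group sizes, and a large-sample z approximation with known variance, whereas the actual test relies on a multivariate t distribution. The function name `simulated_dunnett_critical_z` and all parameter values are hypothetical. The simulation estimates the critical value that keeps the FWER at the chosen level by looking at the largest of the correlated variant-versus-control statistics, and compares it with the unadjusted and Bonferroni thresholds.

```python
import random
from statistics import NormalDist


def simulated_dunnett_critical_z(num_variants, alpha, num_sims=100_000, seed=1):
    """Monte Carlo estimate of a one-sided critical z for many-to-one comparisons.

    Assumptions: all nulls are true, equal group sizes, known unit variance.
    """
    rng = random.Random(seed)
    max_stats = []
    for _ in range(num_sims):
        control = rng.gauss(0.0, 1.0)
        # Each statistic is standard normal; sharing one control makes the
        # statistics positively correlated (correlation 0.5).
        max_z = max(
            (rng.gauss(0.0, 1.0) - control) / 2 ** 0.5
            for _ in range(num_variants)
        )
        max_stats.append(max_z)
    max_stats.sort()
    # Empirical (1 - alpha) quantile of the maximum statistic.
    return max_stats[int((1 - alpha) * num_sims)]


alpha = 0.05
num_variants = 3  # an A/B/C/D test: three variants against one control

unadjusted_z = NormalDist().inv_cdf(1 - alpha)
bonferroni_z = NormalDist().inv_cdf(1 - alpha / num_variants)
dunnett_like_z = simulated_dunnett_critical_z(num_variants, alpha)

print(f"unadjusted:   {unadjusted_z:.3f}")
print(f"Dunnett-like: {dunnett_like_z:.3f}")
print(f"Bonferroni:   {bonferroni_z:.3f}")
```

The simulated threshold lands between the other two: stricter than no adjustment, but less strict than Bonferroni because it accounts for the correlation induced by the shared control. That lower threshold is where the power advantage comes from.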

Tests with multiple comparisons necessitate a larger sample size in order to maintain the same level of statistical power. Sample size and power calculations need to take into account the multiple comparisons correction which will be applied after the data is gathered. Otherwise, one is likely to end up with an underpowered test.
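The loss of power from ignoring the correction at the planning stage can be sketched as follows. The example assumes a one-sided z-test and uses a Bonferroni adjustment as a simple, more conservative stand-in for Dunnett's correction, so the drop it shows is somewhat larger than Dunnett's would produce. The function `one_sided_power` and the chosen values (0.05 threshold, 80% power, three comparisons) are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def one_sided_power(alpha, standardized_effect):
    """Power of a one-sided z-test for a given standardized effect.

    standardized_effect is the true effect divided by its standard error,
    so it already reflects the sample size.
    """
    critical_z = STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - STD_NORMAL.cdf(critical_z - standardized_effect)


alpha = 0.05
target_power = 0.80
num_comparisons = 3

# Standardized effect that a sample sized for a plain A/B test would deliver:
# exactly enough for the target power at the unadjusted threshold.
planned_effect = STD_NORMAL.inv_cdf(1 - alpha) + STD_NORMAL.inv_cdf(target_power)

power_as_planned = one_sided_power(alpha, planned_effect)
# Same data, but each comparison is now judged at the stricter adjusted threshold.
power_after_correction = one_sided_power(alpha / num_comparisons, planned_effect)

print(f"power as planned:       {power_as_planned:.2f}")
print(f"power after correction: {power_after_correction:.2f}")
```

Under these assumptions the per-comparison power falls well below the planned 80% once the adjusted threshold is applied, which is why the correction has to be part of the sample size calculation.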

For example, a test with just one variant (A/B) may take 300,000 users in total, or 150,000 per test group. Adding two more variants to it (A/B/C/D) would increase the required sample size to 473,324 users in total (about 58% more than the A/B test), or about 118,000 users per test group. The above assumes that Dunnett's correction is used. Other, less powerful methods will result in even larger sample size requirements.

Like this glossary entry? For an in-depth and comprehensive reading on A/B testing stats, check out the book "Statistical Methods in Online A/B Testing" by the author of this glossary, Georgi Georgiev.

Articles on Multiple Comparisons

Multivariate Testing – Best Practices & Tools for MVT (A/B/n) Tests
blog.analytics-toolkit.com

Related A/B Testing terms

Dunnett’s Correction
Bonferroni Correction
Multiple Testing
Family-Wise Error Rate


About the author

Georgi Z. Georgiev

Georgi has over twenty years of experience in online marketing, web analytics, statistics, and design of business experiments.

Author of the book "Statistical Methods in Online A/B Testing", white papers on statistical analysis of A/B tests, and a speaker, he has been distinguished as a winner in the Data & Analytics category of the 2024 Experimentation Thought Leadership Awards.
