What does "Sequential Testing" mean?

Definition of Sequential Testing in the context of A/B testing (online controlled experiments).

What is Sequential Testing?

Aliases: sequential monitoring, group-sequential design, GSD, GST

Sequential testing is the practice of making decisions during an A/B test by sequentially monitoring the data as it accrues. Sequential testing employs stopping rules based on error-spending functions that control the overall type I error rate of the procedure. It should not be confused with unaccounted peeking at the data with intent to stop, which inflates the type I error rate.
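The difference between unaccounted peeking and a single fixed-sample analysis can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: A/A tests (no true effect), equally spaced looks, a normally distributed test statistic, and a nominal two-sided alpha of 0.05 applied naively at every look. The function name and parameters are hypothetical and do not come from any particular tool.

```python
import random
from statistics import NormalDist


def false_positive_rate(looks, alpha=0.05, sims=20000, seed=1):
    """Share of simulated A/A tests declared significant at any of
    `looks` equally spaced analyses when the nominal fixed-sample
    threshold is reused at every look (unaccounted peeking)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(sims):
        cumulative = 0.0
        for k in range(1, looks + 1):
            # Standardized contribution of one stage under the null hypothesis
            cumulative += rng.gauss(0.0, 1.0)
            z = cumulative / k ** 0.5  # z score of the data gathered so far
            if abs(z) > z_crit:
                rejections += 1
                break  # the test is stopped at the first "significant" look
    return rejections / sims


single_look = false_positive_rate(looks=1)
five_looks = false_positive_rate(looks=5)
print(f"one analysis:  {single_look:.3f}")
print(f"five analyses: {five_looks:.3f}")
```

With one analysis the simulated rate should stay near the nominal level, while with five uncorrected looks it should come out noticeably higher. A sequential design replaces the fixed threshold with boundaries chosen so that the overall rate stays at the target level.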

Sequential testing is usually done using a so-called group-sequential design (GSD), and such tests are sometimes called group-sequential trials (GST) or group-sequential tests. They can also be performed using an adaptive sequential design when necessary, although such designs offer no efficiency improvements and are much more complex.

The benefit of a sequential testing approach is the improved efficiency of the test. For example, one can cut down test duration / sample size by 20-80% (see article references) while maintaining error probability. The added flexibility of being able to analyze the data as it gathers is also highly desirable as a way of reducing business risk and opportunity costs. Implementing a winning variant as quickly as possible is desirable, and so is stopping a test which has little chance of demonstrating an effect or which is in fact actively harming the users exposed to the treatment.

A drawback is the increased computational complexity: the stopping time itself is now a random variable and needs to be accounted for in an adequate statistical model in order to draw valid conclusions. Stopping based on the observed data also introduces bias, since the sample mean at the time of stopping is no longer an unbiased estimate of the effect, and this requires the use of bias-reducing / bias-correcting techniques.
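The bias of the naive estimate can be seen in a short simulation. This is a minimal sketch under assumptions that are not taken from the text: unit-variance observations, five equally sized stages, stopping for efficacy only, and a hypothetical constant z boundary of 2.5 that is not a calibrated efficacy boundary. All names and values are illustrative.

```python
import random


def mean_estimate_at_stop(true_effect=0.1, stage_n=100, stages=5,
                          z_bound=2.5, sims=20000, seed=7):
    """Average of the naive estimate (sample mean at the time of stopping)
    over many simulated tests that stop early when z exceeds `z_bound`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(sims):
        running_sum, n = 0.0, 0
        for _stage in range(stages):
            # Sum of `stage_n` unit-variance observations, drawn as one normal
            running_sum += rng.gauss(true_effect * stage_n, stage_n ** 0.5)
            n += stage_n
            z = running_sum / n ** 0.5
            if z > z_bound:
                break  # early stop for efficacy
        total += running_sum / n  # naive estimate at the stopping time
    return total / sims


true_effect = 0.1
naive_mean = mean_estimate_at_stop(true_effect=true_effect)
print(f"true effect:            {true_effect:.4f}")
print(f"average naive estimate: {naive_mean:.4f}")
```

Under these assumptions the average naive estimate should exceed the true effect, because tests that stop early do so precisely when the observed effect happens to be large. Bias-reducing estimators aim to correct for this.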

The control of type I errors is achieved by way of an alpha-spending function, while control of the type II error rate is handled by a beta-spending function. The two functions produce two decision boundaries: an efficacy boundary limiting the test statistic (z score) from above and a futility boundary limiting it from below. The boundaries can be maintained even when one deviates from the original design in terms of the number and timing of interim analyses. Crossing the efficacy boundary results in stopping the trial with a decision to reject the null hypothesis, while crossing the futility boundary results in stopping with a decision to accept it. The bias-reduction methods are closely linked to the type of spending functions employed. For most cases there exist near-unbiased estimators with good properties.
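To make the idea of an alpha-spending function concrete, the sketch below evaluates two commonly cited Lan-DeMets forms, an O'Brien-Fleming-type and a Pocock-type function, at a few information fractions. It is an assumption that these particular forms are relevant here, since the text does not say which spending functions are used. The function names and the chosen fractions are hypothetical.

```python
import math
from statistics import NormalDist


def obf_type_spend(t, alpha=0.05):
    """O'Brien-Fleming-type spending: cumulative alpha spent by
    information fraction t (0 < t <= 1). Spends very little early on."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / math.sqrt(t)))


def pocock_type_spend(t, alpha=0.05):
    """Pocock-type spending: cumulative alpha spent by information
    fraction t. Spends alpha more evenly across the test."""
    return alpha * math.log(1 + (math.e - 1) * t)


# Assumed schedule: five equally spaced analyses
fractions = [0.2, 0.4, 0.6, 0.8, 1.0]
obf_cumulative = [obf_type_spend(t) for t in fractions]
pocock_cumulative = [pocock_type_spend(t) for t in fractions]

for t, a, b in zip(fractions, obf_cumulative, pocock_cumulative):
    print(f"t={t:.1f}  OBF-type={a:.5f}  Pocock-type={b:.5f}")
```

Both functions reach the full alpha at t = 1 and depend only on the information fraction, which is why an analysis performed at an unplanned time can still be evaluated by calling the function at the fraction actually observed. Converting the spent alpha into z score boundaries requires numerical integration over the joint distribution of the interim statistics, which is not shown here.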

Like this glossary entry? For an in-depth and comprehensive reading on A/B testing stats, check out the book "Statistical Methods in Online A/B Testing" by the author of this glossary, Georgi Georgiev.

Articles on Sequential Testing

Improving ROI in A/B Testing: the AGILE AB Testing Approach
blog.analytics-toolkit.com

Efficient AB Testing with the AGILE Statistical Method
blog.analytics-toolkit.com

20-80% Faster A/B Tests? Is it real?
blog.analytics-toolkit.com

Efficient A/B Testing in Conversion Rate Optimization: The AGILE Statistical Method
www.analytics-toolkit.com

Sequential Testing is About Improving Business Returns
blog.analytics-toolkit.com

Comparison of the statistical power of sequential tests: SPRT, AGILE, and Always Valid Inference
blog.analytics-toolkit.com

Fully Sequential vs Group Sequential Tests
blog.analytics-toolkit.com

Related A/B Testing terms

AGILE A/B Test, Error-Spending Function, Optional Stopping, Alpha-Spending, Beta-Spending, Average Sample Size, Adaptive Sequential Design

About the author

Georgi Z. Georgiev

Georgi has over twenty years of experience in online marketing, web analytics, statistics, and design of business experiments.

Author of the book "Statistical Methods in Online A/B Testing", white papers on statistical analysis of A/B tests, and a speaker, he has been distinguished as a winner in the Data & Analytics category of the 2024 Experimentation Thought Leadership Awards.
