What does "Severity" mean?

Definition of Severity in the context of A/B testing (online controlled experiments).

What is Severity?

Severity is a principle for assessing the error probabilities of tests with respect to particular claims about a parameter of interest. It is also the name of a measure of the error-detecting capability of the test to which a given statistical hypothesis was subjected. The strong severity principle states: "We have evidence for a claim C just to the extent it survives a stringent scrutiny. If C passes a test that was highly capable of finding flaws or discrepancies from C, and yet none or few are found, then the passing result, x, is evidence for C." [1]

Mathematically, severity has much in common with p-values and confidence intervals. A formal expression is SEV(T, x0, H), which reads "the severity with which claim H passes test T with outcome x0". For a one-sided claim about a mean this becomes SEV(μ > μ1) = P(d(X) ≤ d(x0); μ = μ1), where d(X) is the test statistic and d(x0) its observed value.
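For a standard one-sided z-test, the formula above reduces to a normal CDF evaluated at a shifted test statistic. The following is a minimal sketch of that computation; the observed lift, standard deviation, and sample size are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def severity(xbar, mu1, sigma, n):
    """SEV(mu > mu1) = P(d(X) <= d(x0); mu = mu1) for a one-sided z-test.

    Under mu = mu1 the statistic d(X) is standard normal around mu1,
    so the probability equals Phi((xbar - mu1) / (sigma / sqrt(n))).
    """
    se = sigma / sqrt(n)  # standard error of the sample mean
    return NormalDist().cdf((xbar - mu1) / se)

# Hypothetical A/B test: observed mean lift 0.03, sigma 0.1, n = 400
sev = severity(xbar=0.03, mu1=0.02, sigma=0.1, n=400)
```

Here the severity of the claim μ > 0.02 works out to Φ(2.0) ≈ 0.977, since the observed lift sits two standard errors above the claimed discrepancy.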

The main benefit of using severity logic and presentation is that it offers a coherent measure of the evidential support for a specified statistical hypothesis. For example, observing SEV(δ > 0.02) = 0.99 means that the testing procedure would have produced such a result, or a more extreme one, with probability of only 1 − 0.99 = 0.01 (1%) if in fact δ ≤ 0.02. Severity can be assessed for different claims about the parameter of interest: severity curves, which plot severity against the claimed discrepancy, are especially helpful for assessing a test's capabilities at a glance.
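A severity curve can be sketched by evaluating the same normal-CDF expression over a grid of claimed discrepancies δ1. The observed values below are hypothetical, matching no particular experiment; the point is the monotone shape of the curve.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical experiment outcome: observed lift, std. deviation, sample size
xbar, sigma, n = 0.03, 0.1, 400
se = sigma / sqrt(n)

# SEV(delta > delta1) for a grid of claimed discrepancies delta1.
# Severity decreases as the claimed discrepancy grows more ambitious.
curve = {d1: NormalDist().cdf((xbar - d1) / se)
         for d1 in (0.00, 0.01, 0.02, 0.03, 0.04)}
```

Claims well below the observed lift (e.g. δ > 0.00) pass with severity near 1, the claim at the observed value itself (δ > 0.03) has severity exactly 0.5, and claims beyond it drop quickly toward 0.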

Severity is useful in combating fallacies of rejection (misguided interpretations of a rejection of the null hypothesis) as well as fallacies of acceptance (misguided interpretations of the failure to reject the null hypothesis) when communicating the outcomes of an A/B test to stakeholders.

[1] Mayo, D. (2018) "Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars". Cambridge University Press.
