What does "Confidence Threshold" mean?

Definition of Confidence Threshold in the context of A/B testing (online controlled experiments).

What is a Confidence Threshold?

The confidence threshold is the complement of the significance threshold and is usually expressed as a percentage. For example, a significance threshold of 0.05 corresponds to a 95% confidence threshold (1 - 0.05 = 0.95). Due to the duality between confidence intervals and p-values, the two notions are interchangeable, provided the interval can be computed analytically or estimated through simulation.
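The duality can be checked numerically. The following is a minimal sketch, assuming a normally distributed estimator and a two-sided test of a point null at zero; the function names, the estimate, and the standard error are illustrative assumptions and not part of the glossary definition. It converts each confidence threshold to its significance threshold and confirms that "p-value below the significance threshold" and "interval excludes the null value" give the same verdict.

```python
from statistics import NormalDist


def significance_from_confidence(confidence_pct):
    """Convert a confidence threshold in percent to a significance threshold."""
    return 1 - confidence_pct / 100


def two_sided_p_value(estimate, se, null_value=0.0):
    """Two-sided p-value under a normal approximation (assumption)."""
    z = (estimate - null_value) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def two_sided_interval(estimate, se, confidence_pct):
    """Two-sided confidence interval built from the same standard error."""
    alpha = significance_from_confidence(confidence_pct)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate - z_crit * se, estimate + z_crit * se


# Illustrative (made-up) effect estimate and standard error.
estimate, se = 0.008, 0.0032
p = two_sided_p_value(estimate, se)

# For each threshold, record (p-value verdict, interval verdict).
agreement = {}
for confidence_pct in (90, 95, 99, 99.9):
    alpha = significance_from_confidence(confidence_pct)
    low, high = two_sided_interval(estimate, se, confidence_pct)
    excludes_null = not (low <= 0.0 <= high)
    agreement[confidence_pct] = (p < alpha, excludes_null)

for confidence_pct, (by_p_value, by_interval) in agreement.items():
    print(confidence_pct, by_p_value, by_interval)
```

The two verdicts coincide at every threshold because the interval and the p-value are derived from the same estimate and the same standard error.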

Just like the significance threshold, the confidence threshold is chosen during the planning of an A/B test. The threshold communicates that we require a confidence interval (CI) at that confidence level to exclude all values under the null hypothesis. It corresponds to a probability of committing a type I error (registering a false positive) of 1 minus the confidence threshold, with the threshold expressed as a proportion. The chosen probability should be deemed acceptable under the specific circumstances of the test in question. The threshold is used to compute the sample size needed for a uniformly most powerful test at that threshold, given the specified minimum effect of interest (MEI) and statistical power against a composite hypothesis with a lower bound at the MEI.
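To illustrate how the threshold enters the sample size calculation, here is a minimal sketch under stated assumptions: a fixed-sample, one-sided z-test for the difference between two conversion rates, using the normal approximation with unpooled variances. The function name, the baseline rate, and the MEI are hypothetical values chosen for illustration; a real test plan may use a different formula.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(baseline_rate, mei_abs, confidence=0.95, power=0.80):
    """Approximate users per group for a one-sided two-proportion z-test.

    baseline_rate: expected conversion rate of the control group
    mei_abs: minimum effect of interest as an absolute difference in rates
    confidence: confidence threshold as a proportion (0.95 means 95%)
    power: desired statistical power against an effect equal to the MEI
    """
    alpha = 1 - confidence                      # significance threshold
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # one-sided critical value
    z_power = NormalDist().inv_cdf(power)
    p1 = baseline_rate
    p2 = baseline_rate + mei_abs
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance_sum / mei_abs ** 2)


# Hypothetical plan: 5% baseline rate, 1 percentage point MEI, 80% power.
sizes = {
    confidence: sample_size_per_group(0.05, 0.01, confidence=confidence)
    for confidence in (0.90, 0.95, 0.999)
}
for confidence, n in sizes.items():
    print(f"{confidence:.1%} confidence threshold: {n} users per group")
```

Under these assumptions, raising the confidence threshold while holding the MEI and power fixed increases the required sample size, which is why the threshold has to be fixed before the test starts.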

After the test is completed, the observed confidence interval at the level specified by the confidence threshold is examined to see if it covers values under the null hypothesis. If it does not, then the null hypothesis is rejected.
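The decision rule can be sketched in code. This example assumes a one-sided test with the composite null hypothesis "the variant is no better than the control", so the interval has only a lower bound; it uses a normal-approximation (Wald) interval for the difference in conversion rates. The function names and the counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def lower_bound_of_difference(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Lower limit of a one-sided interval for the rate difference (B - A).

    Uses the unpooled normal approximation, which is an assumption of
    this sketch rather than a prescribed method.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(confidence)
    return (p_b - p_a) - z_crit * se


def rejects_null(lower_bound, null_upper_limit=0.0):
    """Reject when the interval excludes every value under the null,
    that is, when its lower bound lies above the largest null value."""
    return lower_bound > null_upper_limit


# Made-up outcome: 500 of 10,000 users convert in A, 580 of 10,000 in B.
lb_95 = lower_bound_of_difference(500, 10_000, 580, 10_000, confidence=0.95)
lb_999 = lower_bound_of_difference(500, 10_000, 580, 10_000, confidence=0.999)

print("95% threshold rejects the null:", rejects_null(lb_95))
print("99.9% threshold rejects the null:", rejects_null(lb_999))
```

With these counts the same data lead to rejection at a 95% threshold but not at a 99.9% threshold, since the wider interval reaches below zero.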

The confidence threshold is often set to 95%, but when choosing the threshold for a particular test one should ideally consider the specific risks and rewards associated with that test. A test for a major decision that has wide-ranging consequences and is hard to reverse might require a very high confidence threshold, say 99.9%. On the other hand, a test in which the decision has limited scope and is easy to reverse if necessary can be planned with a much lower confidence threshold of 90%, which is a weaker evidential requirement. Sample size and test duration considerations also need to be taken into account.
