What does "A/B Testing" mean?

Comprehensive definition of A/B Testing. Meaning and key characteristics.

What is A/B Testing?

Alias: split testing

A/B testing is the process of conducting a business experiment, typically an online controlled experiment. It exists to enable growth and innovation with controlled levels of risk. In an A/B test one or more proposed changes are compared against "business as usual". It is done in a scientific manner with actual users of a service or product, as opposed to testing with focus groups.

A/B testing allows a causal link to be established, and the size and direction of the effect caused by the tested changes to be estimated, despite the variability inherent to observations in all kinds of business data. The split testing process is widely used by most of the top tech companies: Google, Microsoft, Amazon, Booking, and others reportedly conduct thousands or even tens of thousands of tests per year.

How does A/B testing work?

In an A/B test, actual users are randomized into several groups. Some are randomly assigned to a control group, meaning they experience the service unchanged. The rest are randomly assigned to one or more test groups, where they experience one or more proposed changes. The randomization process by itself gives the resulting data a number of useful properties that observational data lacks.
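In practice, assignment to groups is often done by hashing a user identifier together with an experiment name, so that each returning user always lands in the same group without any stored state. The following is a minimal sketch of that approach; the function name, experiment label, and group count are illustrative, not a reference to any particular tool:

```python
import hashlib

def assign_variant(user_id: str, experiment: str, n_variants: int = 2) -> int:
    """Deterministically assign a user to a variant (0 = control).

    Hashing the user ID together with an experiment-specific label gives
    each user a stable, effectively random bucket, so a returning visitor
    always sees the same experience for the same experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % n_variants

# Example: split 10,000 users between control (0) and one test group (1)
groups = [assign_variant(f"user-{i}", "new-checkout") for i in range(10_000)]
control_share = groups.count(0) / len(groups)  # should be close to 0.5
```

Because the hash output is effectively uniform, the split converges to the intended proportions, while different experiment labels produce independent splits of the same user base.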

The test is then monitored at one or several points in time. Its performance is evaluated based on the chosen performance indicator (metric). There are strategies for planning and evaluating tests with varying levels of complexity and robustness.
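For a conversion-rate metric, one common evaluation method is a two-proportion z-test comparing the control and test groups. The sketch below uses only the Python standard library; the counts are hypothetical and the one-sided test direction is an illustrative choice, not the only valid design:

```python
from math import sqrt, erf

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """One-sided z-test for a lift in conversion rate of B over A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 1 - 0.5 * (1 + erf(z / sqrt(2)))  # P(Z >= z) under H0
    return z, p_value

# Hypothetical data: 2,000 users per arm, 100 vs 130 conversions
z, p = two_proportion_z_test(100, 2000, 130, 2000)
```

With these numbers the observed lift (5.0% vs 6.5%) yields a p-value below the conventional 0.05 threshold, though a real test plan would fix the sample size, significance threshold, and monitoring schedule in advance.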

An important feature of controlled experiments is the ability to extrapolate from a limited set of data to future outcomes. To ensure the validity of these predictions, the users in a test have to be a good representation of the expected future users. There are different strategies for ensuring the generalizability of A/B test results.

What is the purpose of A/B testing?

While the ultimate purpose of the split testing approach is to allow for innovation with measured risk, the immediate goal of A/B testing is to:

  • infer the causal effect of a given user experience on a performance measure
  • obtain a reliable estimate of the size and direction of the effect
  • enable a prediction of future business outcomes based on a limited sample

In this sense split testing is both a device for managing business risk and a tool for obtaining reliable business information. Test outcomes can be useful in making the decision to implement a given change or to refrain from doing so. As such, A/B testing is often employed as a tool in the process of conversion rate optimization (CRO), landing page optimization, marketing optimization, etc.

The risk management aspect of split testing is especially important. By testing most changes of high potential impact before committing to them, a business ensures its ability to innovate while minimizing the risk of taking steps which would be detrimental to key business metrics.

In this sense there is no such thing as a "successful" or "unsuccessful" experiment. An A/B test, if executed correctly, has fulfilled its purpose regardless of outcome. It has either enabled a useful innovation to be implemented with controlled risk, or has blocked a harmful change from being released.

What can be split tested?

Most changes one can make to a website, app, customer service process, business process, etc. can be tested. The main categories of what can be tested include:

  • user interface & user experience (copy, design, call-to-action elements, personalization, images, landing pages, purchase or lead acquisition processes, etc.)
  • backend systems (algorithms for search results, sorting, and recommendations; email interactions; databases; CMSs; HTML & CSS, etc.)
  • marketing & advertising (ads, target audiences, bidding approaches, messaging, etc.)
  • support interactions (email templates, CRMs, support / ticketing systems, etc.)

Be mindful that some kinds of tests may be unethical or outright illegal in certain jurisdictions. One example of a sensitive test direction is price sensitivity testing.

What are typical metrics in online A/B testing?

While conversion rates are often used to judge the outcome of an online experiment, a test may use any kind of metric relevant to the question being asked of it. Tests based on average revenue per user (ARPU) are often more informative than those based on a conversion rate, since ARPU also incorporates information about the variability in the value of individual conversions.

For example, a business where each lead has a roughly equal value may use a conversion rate to measure success. The same goes if acquiring a new customer is valuable in itself and the individual value of an acquisition matters little. On the other hand, an online store selling hundreds or thousands of products would likely prefer to measure revenue, since the difference between one purchase and another may be severalfold. A revenue-based metric will give stakeholders more certainty regarding the expected business effect of a change.
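The distinction between the two metrics can be illustrated with simulated per-user revenue data; the order values below are invented for illustration only:

```python
import random

random.seed(42)

# Hypothetical per-user revenue for a multi-product store: most users
# buy nothing, and buyers spend widely varying amounts.
order_values = [0, 0, 0, 0, 0, 0, 0, 0, 0, 20, 45, 250]
revenues = [random.choice(order_values) for _ in range(10_000)]

conversions = sum(1 for r in revenues if r > 0)
conversion_rate = conversions / len(revenues)  # treats every buyer equally
arpu = sum(revenues) / len(revenues)           # weighs buyers by order value

# A $250 order and a $20 order look identical to the conversion rate,
# but ARPU captures the difference -- at the cost of higher variance,
# which typically means larger sample sizes are needed.
```

This is why a change that nudges users toward fewer but larger purchases can look like a loss on conversion rate while being a clear win on ARPU.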

Like this glossary entry? For an in-depth and comprehensive reading on A/B testing stats, check out the book "Statistical Methods in Online A/B Testing" by the author of this glossary, Georgi Georgiev.
