The Complete Guide to A/B Testing
All the resources you need to get started with A/B testing
Conversion optimization and A/B testing are two ways for companies to increase profits. Their promise is a simple one: generate more revenues with the same amount of traffic. In light of high acquisition costs and complex traffic sources, why not start by getting the most out of your current traffic?
Amazon is very familiar with A/B testing - they're constantly testing to improve UX and conversion rates
Surprisingly, average conversion rates for e-commerce sites continue to hover between 1% and 3%. Why? Because conversion is a complex mechanism that depends on a number of factors, including the quality of traffic generated, user experience, offer quality, the website’s reputation, as well as what the competition is doing.
E-commerce professionals will naturally aim to minimize any negative impact the interplay of the above elements might have on consumers along the buyer journey. A variety of tools exist to help them achieve this, including A/B testing, a discipline that uses data to help you make the best decisions.
What is A/B testing?
A/B testing involves comparing two versions of a web page or application to see which performs better. These variations, known as A and B, are presented randomly to users. A portion of them will be directed to the first version, and the rest to the second. A statistical analysis of the results then determines which version, A or B, performed better, according to certain predefined indicators such as conversion rate.
A/B testing involves comparing the performance of two or more versions of a web page.
In other words, you can verify which version gets the most clicks, subscriptions, purchases, and so on. These results can then help you determine an optimal marketing strategy.
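The mechanics of random assignment can be sketched in a few lines. The snippet below is a minimal illustration (the function and variable names are ours, not any vendor's API): it assigns each user to A or B and keeps the assignment stable across visits by seeding on the user id. Real tools typically persist the assignment via a cookie or user profile instead.

```python
import random

def assign_variation(user_id: str, split: float = 0.5) -> str:
    """Assign a user to variation A or B.

    Seeding the generator with the user id keeps the assignment
    stable across visits without storing any state; production
    tools usually persist it via a cookie or user profile.
    """
    rng = random.Random(user_id)
    return "A" if rng.random() < split else "B"

# Tally how the first 1,000 users would be split between A and B
counts = {"A": 0, "B": 0}
for i in range(1000):
    counts[assign_variation(f"user-{i}")] += 1
```

With a 50/50 split, the tally comes out roughly even, and calling the function twice with the same user id always returns the same variation.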
The origins of A/B testing
A/B testing was originally a way of comparing two versions of something in order to determine which was the most efficient. This method was therefore around long before the advent of the internet.
Ronald Fisher, a British biologist and statistician, was the first to present this idea using mathematics in the 1920s. The process made it possible to scientifically analyze two different experiences. Fisher’s work was a major advancement in the world of science. Several years later, the principle of A/B testing was introduced for clinical trials in medicine.
It was only in the 1960s that the concept was taken up in the field of marketing, and A/B testing as we know it today has only existed since the 1990s. It quickly became an accepted means for direct marketing specialists to test various messages (each differing in one way) on a sample of consumers in order to see which was the most effective.
Developments in digital technology progressively allowed for richer perspectives, in that they increased the number and kind of testing possibilities, as well as the way performance indicators were measured. Applied to a website, A/B testing allows one to test an almost unlimited number of page versions and precisely measures each version’s performance, using indicators like user actions or purchase behavior. Improvements to technology have also enabled the development of dedicated A/B testing solutions that facilitate the implementation of such tests and the analysis of their results.
What types of websites are relevant for A/B testing?
Any website can benefit from A/B testing, since they all have a ‘reason for being’ - and this reason is quantifiable. Whether you’re an online store, a news site or a lead generation site, you are aiming to improve your conversion rate, whatever kind of conversion that may be.
The term “lead” refers to a sales lead, or prospective client. Especially relevant here are e-mails sent in order to boost sales. In this case, A/B testing makes use of information about the people contacted, such as their gender or age range.
In a media context, it’s more relevant to talk about ‘editorial A/B testing’. In industries that work closely with the press, the idea behind A/B testing is to test the success of a given content category - for example, to see if it’s a perfect fit with the target audience. Here, as opposed to the above example, A/B testing has an editorial function, not a sales one.
Unsurprisingly, the aim of using A/B testing in an e-commerce context is to measure how well a website or online commercial app is selling its merchandise. A/B testing uses the number of completed sales to determine which version performs best. It’s particularly important to look at the home page and the configuration of the product page, but it’s also a good idea to consider all the visual elements involved in completing a purchase (buttons, calls-to-action).
What types of A/B tests should you use?
There are several types of A/B tests. You should choose the one that best fits your particular situation.
- Classic A/B test. The classic A/B test presents users with two variations of your pages at the same URL. That way, you can compare two or more variations of the same element.
- Split tests or redirect tests. The split test redirects your traffic towards one or several distinct URLs. If you are hosting new pages on your server, this could be an effective approach.
- Multivariate or MVT test. Lastly, multivariate testing measures the impact of multiple changes on the same web page. For example, you can modify your banner, the color of your text, your presentation, and more.
In terms of technology, you can:
- Use A/B testing on websites. A/B testing on the web makes it possible to compare a version A and B of a page. After this, the results are analyzed according to predefined objectives—clicks, purchases, subscriptions, and so on.
- Use A/B testing for native mobile iPhone or Android applications. A/B testing is more complex with applications. This is because it is not possible to present two different versions once the application has been downloaded and deployed on a smartphone. Numerous tools are therefore available so that you can instantly update your application. You can easily modify your design and directly analyze the impact of this change.
- Use server-side A/B testing via APIs. An API is a programming interface that enables connection with an application for data exchange. APIs let you automatically create campaigns or variations from saved data.
It is possible to test on multiple devices with solutions like AB Tasty.
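For server-side testing in particular, a common pattern is deterministic bucketing: hash the user id together with the experiment name so that every user gets a stable variation without any stored state. A minimal Python sketch with illustrative names, not a specific vendor's API:

```python
import hashlib

def bucket(user_id: str, experiment: str, split: float = 0.5) -> str:
    """Deterministically bucket a user into variation A or B.

    Hashing user id + experiment name maps each user to a stable
    position in [0, 1); the same user always sees the same variation,
    and different experiments bucket users independently.
    """
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    position = int(digest, 16) / 16 ** 32  # uniform in [0, 1)
    return "A" if position < split else "B"
```

Because the assignment is a pure function of its inputs, any server can compute it without a shared session store.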
A/B testing and conversion optimization
A/B testing is one useful tool that helps establish a broader conversion optimization strategy, but it is by no means sufficient all on its own. An A/B testing solution lets you statistically validate certain hypotheses, but alone, it cannot give you a sophisticated understanding of user behavior. However, understanding user behavior is certainly key to understanding problems with conversion.
It’s therefore essential to enrich A/B testing with information provided by other means. This will allow you to gain a fuller understanding of your users, and crucially, help you come up with hypotheses to test.
There are many sources of information you can use to gain this fuller picture:
- Web analytics data. Although this data does not explain user behavior, it may bring conversion problems to the fore (e.g. identifying shopping cart abandonment). It can also help you decide which pages to test first.
- Ergonomics evaluation. These analyses make it possible to inexpensively understand how a user experiences your website.
- User test. Though limited by sample size constraints, this qualitative data can provide a myriad of information not otherwise available using quantitative methods.
- Heatmap and session recording. These methods offer visibility on the way that users interact with elements on a page or between pages.
- Client feedback. Companies collect large amounts of feedback from their clients (e.g. opinions listed on the site, questions for customer service). This analysis can be complemented by tools such as customer satisfaction surveys or live chats.
How to find A/B test ideas
Your A/B tests must be complemented by additional information in order to identify conversion problems and offer an understanding of user behavior. This analysis phase is critical, and must help you to create “strong” hypotheses. The tools mentioned above and their associated disciplines will help. A correctly formulated hypothesis is the first step towards a successful A/B testing program and must respect the following rules:
- be linked to a clearly discerned problem that has identifiable causes
- mention a possible solution to the problem
- indicate the expected result, which is directly linked to the KPI to be measured
For example, if the identified problem is a high abandon rate for a registration form that seems like it could be too long, a hypothesis might be: “Shortening the form by deleting optional fields will increase the number of contacts collected.”
Use the entire array of tools available to understand your users’ obstacles
What elements should be tested on a website?
What should you test on your site? This question comes up again and again because companies often don’t know how to explain their conversion rates, whether good or bad. If a company knew its users were having trouble understanding its product, it wouldn’t bother testing the location or color of an add-to-cart button - this would be off topic. Instead, it would test various wordings of its customer benefits. Every situation is different. Rather than providing an exhaustive list of elements to test, we prefer to give you a framework for identifying them. Below are some good places to start:
Titles and headers
You can start by changing the title or content of your articles so that they draw people in. Regarding form, a change of color or font can also make a difference.
Call to action
The call to action is a very important button. Color, copy, the position and type of words used ('buy', 'add to cart', 'order', etc.) can have a decisive impact on your conversion rate.
Other buttons can also play a crucial role. Try varying their size (small, large, etc.), their shape (square, round, etc.) and their color (black, white, colored) to see which converts the most visitors.
Images
Images are just as important as text. It is recommended to try different images. For example, if you run a ready-to-wear e-commerce site, check whether photos of clothing on models perform better than photos of the articles alone. Also play with the size and aesthetic of your photos (hue, saturation, brightness, etc.) as well as their placement (right, left, top, bottom).
Page structure
The structure of your pages, whether home page or category pages, should be particularly well crafted. You can add a carousel, choose fixed images, change your banner, or present a few flagship products on the home page.
Product recommendations
Use different algorithms to transform your visitors into customers or increase their cart value: similar articles, most-searched products and so on. You can suggest things to potential customers that are likely to interest them.
Think over your action plan to generate additional profits. For example, if you sell a flagship product, why not diversify by offering additional products or complementary services?
Forms
It is important to create a clear and concise form. You can try modifying a field title, removing optional fields, changing field placement, or formatting using lines or columns.
Pricing
A/B testing on pricing can be delicate. This is because you cannot sell the same product or service at different prices. You’ll need to use a little ingenuity when testing. For example, if you offer services, you can create a low-cost offering with fewer options. If you sell products, offer a different color, shape or material.
Conversion funnel
You can test different page flows by offering conversion funnels in one or several steps. For example, you can put the payment method and delivery information on a single page or separate them into two.
How to implement an A/B test
A rigorous testing methodology is indispensable for getting results from your A/B tests. Below are the steps to implement your own A/B tests.
1. Assemble a project team
The success of your tests does not only rely on the A/B testing tool. Instead, it relies on the experience of those in charge of conversion optimization. It is important to bring together different talents capable of data analysis, identifying conversion problems, and putting themselves in the shoes of end users to suggest appropriate solutions.
Two profiles are also useful: a project manager and a sponsor. The project manager coordinates teams and owns the test roadmap. The sponsor champions the optimization initiative at department level and is responsible for return on investment. If you don’t have these resources, at a bare minimum designate a contact who centralizes testing and analyzes results.
2. Prioritize the tests to conduct
You’ve used various methods to zero in on some conversion problems, and you were able to formulate different hypotheses to test. You now need to prioritize your next steps to establish a roadmap, formalize the A/B testing program, and set a pace for test implementation. Several criteria will help you prioritize these hypotheses:
- Estimated gain. To do this, you must analyze the potential of your solutions. What is the estimated gain? What are the chances of increasing your conversion rate? By investigating your data, you can quickly identify the pages with strong optimization potential (high exit rate, short session time, etc.).
- Page traffic volume. It is also important to prioritize pages that attract the most traffic. If you have a low number of visitors, the impact of your A/B testing will be difficult to observe. Additionally, it is preferable to make only one change at a time instead of setting up tests that require too many modifications.
- Ease of implementation. To prioritize your tests, you must also determine how easy each of your hypotheses will be to test. The simpler a solution is (easy graphic design changes, no special technical skills needed, etc.), the fewer resources it will require.
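These three criteria map directly onto common prioritization frameworks such as ICE (Impact, Confidence, Ease). As a sketch, using made-up hypotheses and ratings purely for illustration, a roadmap could be ranked like this:

```python
def ice_score(impact: int, confidence: int, ease: int) -> int:
    """Multiply 1-10 ratings for estimated gain (impact), supporting
    data and traffic (confidence), and ease of implementation."""
    return impact * confidence * ease

# Hypothetical hypotheses with (impact, confidence, ease) ratings
hypotheses = [
    ("Shorten registration form", (8, 7, 9)),
    ("Redesign home page carousel", (9, 4, 3)),
    ("Change CTA copy", (5, 6, 10)),
]

# Highest-scoring hypotheses go to the top of the roadmap
roadmap = sorted(hypotheses, key=lambda h: ice_score(*h[1]), reverse=True)
```

The exact weighting is a team choice; the point is to make the trade-off between potential gain and effort explicit rather than implicit.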
Following this prioritization, your roadmap will have begun to take shape. To formalize it properly, it is recommended that you write it up in as much detail as possible and share it, in order to mobilize, align, and coordinate the efforts of the stakeholders involved. This roadmap will be used to manage A/B testing activity.
3. Implement tests
The way you set up a test will vary depending on the kind of A/B testing solution you’re using and your usual way of working.
Some A/B testing tools are rather complex to put in place, and require you to get your technical teams involved to directly modify the test pages’ source code. Other tools make it possible to start a test without any technical skills. In the latter of these cases, the user modifies the site pages via a WYSIWYG (What You See Is What You Get) editor. This type of tool requires little time to get used to and users quickly become autonomous.
In terms of getting tests up and running, companies usually go one of two ways: either the company manages the entire test from A to Z, or it outsources to an external provider who, aside from offering consultancy services, also defines the design of the different variations, develops graphical and written elements as needed, then implements the tests via one of the tools on the market.
The choice of an A/B testing solution and a mode of operation depends on the maturity of a company and its resources. Each case is different and solutions must be tailored to needs and constraints. A complex tool is useless if users wish to be autonomous; if they opt for one anyway, they become dependent on a service provider to use it. Conversely, a tool that is too simple may become a limitation as needs evolve.
4. Analyze test results
Take a look at our section dedicated to this topic. It discusses reliability and statistical method, as well as best practices to interpret test data.
5. Document tests
It is essential to document and archive tests to efficiently share information with all those in charge of optimizing conversions. Documenting a test means keeping a written trace of the following information after each test:
- Test name
- Test period
- Hypothesis tested
- A description of the variations implemented (with screenshots)
- Results and information learned from the test
- Potential future monetary gain.
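One lightweight way to keep this written trace is a structured record per test. The sketch below uses a Python dataclass whose field names simply mirror the checklist above (the sample values are invented), so each archived test can be serialized to JSON or stored in a shared knowledge base:

```python
from dataclasses import dataclass, asdict

@dataclass
class TestRecord:
    """One archived A/B test, mirroring the checklist above."""
    name: str
    period: str
    hypothesis: str
    variations: list       # descriptions, with screenshot paths
    results: str
    estimated_gain: float  # potential future monetary gain

record = TestRecord(
    name="Registration form length",
    period="two weeks",
    hypothesis="Shortening the form will increase contacts collected",
    variations=["Original (6 fields)", "Variation B (3 fields)"],
    results="B lifted completions; learning: optional fields deter users",
    estimated_gain=0.0,  # fill in from the testing tool, if available
)
archive = asdict(record)  # plain dict, ready to dump as JSON
```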
6. Implement the best performing versions
Once a variation clearly outperforms the original, it is time to put the winning version into production. Depending on the company, the time between site releases can be long. In order not to miss out on gains, most A/B testing solutions make it possible to display the winning variation to 100% of visitors while the changes are being pushed to production.
Next, it is important to make sure that the boost in performance observed during the test period holds over the long term. A number of external factors could explain why the optimization generated improved results during the test. For example, during the end-of-year holiday period, a feeling of urgency increases and conversion rates can naturally improve. If a test shows that a variation outperforms the original by 10% during the holidays, performance outside of this very specific period might not be as strong.
7. Distribute test results
It is important to communicate what is learned from the tests conducted, particularly to managers. The most important insights gained, which affect other aspects of activity, must also be shared with other departments (sales, marketing, communication, etc.). Lastly, if the A/B testing tool enables evaluation of monetary gains for tests (the differential of revenue generated between the original page and the variations), mentioning these gains makes it possible to calculate the ROI of the testing program, thus justifying the investment dedicated to it.
8. Conduct permanent testing
A/B testing is a continual optimization process. Lessons learned from each test feed into new test hypotheses, thus enriching the roadmap. It is over the long term that these efforts pay off. The first tests may well not produce the desired results, since building expertise takes time.
How to analyze A/B results
The test analysis phase is the most sensitive. The A/B testing solution must at least offer a reporting interface indicating the conversions recorded per variation, the conversion rate, the percentage of improvement compared with the original, and the statistical reliability index for each variation. The most advanced tools go beyond the raw data, segmenting results by dimension (e.g. traffic source, geographical location of visitors, customer typology, etc.).
Primary and secondary objectives
Result analysis depends on objectives set beforehand and their corresponding indicators. While nothing prevents you from measuring several indicators during a test, it is important to identify one priority indicator in order to decide between variations. It is not rare to see tests produce opposite effects on two indicators (e.g. an increase in the number of purchases, but a drop in average cart value). Primary objectives are what motivated your test—registrations, orders, subscriptions, etc. Secondary objectives offer additional indications about the behavior of your visitors (bounce rate, session duration, etc.). They allow you to hone your modifications and improve your conversion rate.
Knowing how to analyze the results of a test is critical to making sound decisions.
Understanding A/B test statistics
Before it is possible to analyze test results, the main difficulty involves obtaining a sufficient level of statistical confidence. A threshold of 95% is generally adopted. This means that the probability that result differences between variations are due to chance is very low. The time necessary to reach this threshold varies considerably according to the traffic of the tested pages, the initial conversion rate for the measured objective, and the impact of the modifications made. It can range from a few days to several weeks. For low-traffic sites, it is advisable to test a page with higher traffic. Before the threshold is reached, it is pointless to draw any conclusions.
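To make the 95% threshold concrete, here is a self-contained sketch of a two-proportion z-test (a normal approximation closely related to the chi-square test) that turns raw visitor and conversion counts into a confidence level. This is illustrative code, not a replacement for your testing tool's statistics engine:

```python
from math import sqrt, erf

def confidence_level(visitors_a, conv_a, visitors_b, conv_b):
    """Probability that the observed difference is not due to chance
    (two-sided two-proportion z-test, normal approximation)."""
    p_a, p_b = conv_a / visitors_a, conv_b / visitors_b
    p_pool = (conv_a + conv_b) / (visitors_a + visitors_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))
    if se == 0:
        return 0.0
    z = abs(p_a - p_b) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))  # two-sided
    return 1 - p_value
```

For example, 2% versus 3% conversion on 5,000 visitors each comes out comfortably above the 95% threshold, while 2% versus 2.1% on the same traffic does not.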
Furthermore, the statistical tests used to calculate the confidence level (such as the chi-square test) assume a very large sample size. Should the sample size be low, exercise caution when analyzing the results, even if the test indicates a reliability of more than 95%. With a low sample size, it is possible that leaving the test active for a few more days will greatly modify the results. This is why it is advisable to have a sufficiently sized sample. There are scientific methods to calculate the size of this sample (use our sample size calculator) but, from a practical standpoint, it is advisable to have a sample of at least 5,000 visitors and 75 conversions per variation.
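The visitor rule of thumb can also be derived from a standard power calculation. The sketch below uses the common approximation n ≈ 16·p(1-p)/delta² (for roughly 95% significance and 80% power, where delta is the absolute lift you want to detect); treat it as a rough estimate and use a proper calculator for real planning:

```python
def sample_size_per_variation(baseline_rate: float, min_effect: float) -> int:
    """Approximate visitors needed per variation to detect an absolute
    lift of `min_effect` over `baseline_rate` at roughly 95% significance
    and 80% power, via n ~ 16 * p * (1 - p) / delta^2."""
    p = baseline_rate
    return int(16 * p * (1 - p) / min_effect ** 2) + 1

# Detecting a lift from 2% to 2.5% takes on the order of 12,500
# visitors per variation; larger lifts need far fewer.
n = sample_size_per_variation(0.02, 0.005)
```

Notice how sensitive the result is to the detectable effect: halving the minimum lift you care about quadruples the required sample.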
There are two types of statistical tests:
- Frequentist tests. The chi-square test and other frequentist methods are based purely on the observed data, so results can only be analyzed once the test has ended, typically at a 95% confidence level.
- Bayesian tests. The Bayesian method is deductive. Drawing on the laws of probability, it lets you analyze results before the end of the test. Be sure, however, to read the confidence interval correctly. Check out our dedicated article to see all there is to know about the advantages of the Bayesian approach for A/B testing.
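The Bayesian approach can be illustrated with a short Monte Carlo simulation: give each variation a Beta posterior over its conversion rate (uniform prior), sample both repeatedly, and count how often B beats A. A sketch for intuition, not a production methodology:

```python
import random

def prob_b_beats_a(visitors_a, conv_a, visitors_b, conv_b,
                   draws=20_000, seed=42):
    """Estimate P(conversion rate of B > conversion rate of A) by
    sampling Beta(conversions + 1, failures + 1) posteriors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        a = rng.betavariate(conv_a + 1, visitors_a - conv_a + 1)
        b = rng.betavariate(conv_b + 1, visitors_b - conv_b + 1)
        if b > a:
            wins += 1
    return wins / draws
```

With 100 versus 150 conversions on 5,000 visitors each, the probability that B is genuinely better comes out above 99%. Unlike a frequentist p-value, this number has a direct reading: the chance that B beats A given the data so far.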
Lastly, although site traffic makes it possible to quickly obtain a sufficiently sized sample, it is recommended that you leave the test active for several days to take into account differences in behavior observed by weekday, or even by time of day. A minimum duration of one week is preferable, ideally two weeks. In some cases, this period can even be longer, particularly if the conversion concerns products for which the buying cycle requires time (complex products/services or B2B). As such, there is no standard duration for a test.
Tips and best practices for A/B testing
Below are several best practices that can help you avoid running into trouble. They are the result of the experiences, both good and bad, of our clients during their testing activity.
1. Ensure data reliability for the A/B testing solution
Conduct at least one A/A test to ensure the random assignment of traffic to the different versions. This is also an opportunity to compare the A/B testing solution’s figures with those of your web analytics tool, to verify that they are in the same ballpark; they will never correspond exactly.
2. Conduct an acceptance test before starting
Do some results seem counter-intuitive? Was the test set up correctly and were the objectives correctly defined? In many cases, time dedicated to acceptance testing saves precious time which would be spent interpreting false results.
3. Test one variable at a time
This makes it possible to precisely isolate the impact of the variable. If the location of an action button and its label are modified simultaneously, it is impossible to identify which change produced the observed impact.
4. Conduct one test at a time
For the same reasons cited above, it is advisable to conduct only one test at a time. The results will be difficult to interpret if two tests are conducted in parallel, especially if they're on the same page.
5. Adapt the number of variations to traffic volume
If there is a high number of variations for little traffic, the test will run for a very long time before giving any interesting results. The lower the traffic allocated to the test, the fewer variations there should be.
6. Wait for statistical reliability before acting
As long as the test has not attained a statistical reliability of at least 95%, it is not advisable to make any decisions. Otherwise, the probability that the observed differences are due to chance rather than to the modifications made is too high.
7. Let tests run long enough
Even if a test rapidly displays statistical reliability, it is necessary to take into account the size of the sample and differences in behavior linked to the day of the week. It is advisable to let a test run for at least a week—two ideally—and to have recorded at least 5,000 visitors and 75 conversions per version.
8. Know when to end a test
If a test takes too long to reach a reliability rate of 95%, it is likely that the element tested does not have any impact on the measured indicator. In this case, it is pointless to continue the test, since this would unnecessarily monopolize a part of the traffic that could be used for another test.
9. Measure multiple indicators
It is recommended to measure multiple objectives during the test: one primary objective to help you decide between versions, and secondary objectives to enrich the analysis of results. These indicators can include click rate, cart addition rate, conversion rate, average cart value, and others.
10. Take note of marketing actions during a test
External variables can distort the results of a test. Oftentimes, traffic acquisition campaigns attract a population of users with unusual behavior. It is preferable to limit these collateral effects by taking note of such campaigns when interpreting test results.
11. Segment tests
In some cases, conducting a test on all of a site’s users makes no sense. If a test aims to measure the impact of different formulations of customer advantages on a site’s registration rate, showing it to already-registered users is pointless. The test should instead target new visitors.
Examples of A/B tests
Many of you are looking for ideas for your next A/B tests. If you have been reading the previous chapters, you know there is no magic bullet and that textbook cases are site-specific. However, since you just can’t help yourself, here are some links to a few examples.
Choosing an A/B testing platform
We can only recommend you use AB Tasty. In addition to offering a full A/B testing solution, AB Tasty offers a suite of software to optimize your conversions. You can analyze your users’ behavior with heatmaps and session recording (also known as user session replay). You can also customize your website in terms of numerous targeting criteria and audience segmentation.
But, in order to be exhaustive, and also to provide you with as much valuable information as possible when it comes to choosing a vendor, here are a few articles to help you choose your A/B testing tool.
Other forms of A/B testing
A/B testing is not limited to modifications to your site’s pages. You can apply the concept to all your marketing activities, such as traffic acquisition via e-mail marketing campaigns, AdWords campaigns, Facebook Ads, and much more.
The best blogs on A/B testing and conversion
We obviously recommend you read our very own blog, but other experts in international optimization also publish very pertinent articles on the subject of A/B testing and conversion more generally. Here is our selection to stay up to date with the world of CRO.
To be sure you keep up to date with the latest tech trends, you can also use Twitter: relevant experts are almost all present on this social network.
Influencers on the themes of A/B testing and conversion optimization:
- Chris Goward - @chrisgoward
- Kissmetrics - @Kissmetrics
- Unbounce - @unbounce
- Peep Laja - @peeplaja
- Bryan Eisenberg - @TheGrok
- Oli Gardner - @oligardner
- Rand Fishkin - @randfish
- Nelio - @NelioSoft
- UserTesting - @usertesting
- Michael Aagaard - @ContentVerve
- Roger Dooley - @rogerdooley
- MoreVisibility - @MoreVisibility
- WiderFunnel - @WiderFunnel
- Brian Massey - @bmassey
- invesp - @invesp
- Get Elastic - @getelastic
- Online Behavior - @onbehavior
- Moz - @Moz
- SiteTuners - @SiteTuners
- Tim Ash - @tim_ash
- ConversionAid - @ConversionAid
- Conversioner - @Conversioner_
- Theo van der Zee - @theovdzee
- Scott Brinker - @chiefmartec
- Venture Harbour - @VentureHarbour
- PPC Hero - @ppchero
- WordStream - @WordStream
- MarketingExperiments - @MktgExperiments
- ConversionConference - @ConversionConf
Table of Contents
- Introduction
- What is A/B testing?
- The origins of A/B testing
- What types of websites are relevant for A/B testing?
- What types of A/B tests should you use?
- A/B testing and conversion optimization
- How to find A/B test ideas
- What elements should be tested on a website?
- How to implement an A/B test
- How to analyze A/B results
- Tips and best practices for A/B testing
- Examples of A/B tests
- Choosing an A/B testing platform
- Other forms of A/B testing
- The best blogs on A/B testing and conversion