
The Pros and Cons of Multivariate Tests

Wait! New to multivariate testing? If so, we recommend you first read our article, Multivariate Testing: All you need to know about multivariate testing.


During an A/B test, you must modify only one element at a time (for example, the wording of an action button) to be able to attribute the impact. If you simultaneously change this button’s wording and color (for example, a blue “Buy” button vs. a red “Purchase” button) and see an improvement, how do you know whether it was the wording or the color change that really contributed to the result? The contribution of one may be negligible, or the two may have contributed equally.

The benefits of multivariate tests

A multivariate test aims to answer this question. With this type of experiment, you test a hypothesis in which several variables are modified at once and determine which of all the possible combinations performs best. If you change two variables and each has three possibilities, you have nine combinations to compare (the number of variants of the first variable multiplied by the number of variants of the second).
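
To make the arithmetic concrete, here is a minimal Python sketch (the element names and variant lists are hypothetical) that enumerates every combination a full multivariate test would have to cover:

```python
# Hypothetical elements: a page headline with 3 variants and a button with 3 variants.
from itertools import product

headline_variants = ["Headline A", "Headline B", "Headline C"]
button_variants = ["Buy", "Purchase", "Add to cart"]

# Every headline is crossed with every button wording: 3 x 3 = 9 combinations.
combinations = list(product(headline_variants, button_variants))
print(len(combinations))  # 9

for headline, button in combinations:
    print(f"{headline} + {button}")
```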

Multivariate testing has three benefits:

  • avoid conducting several A/B tests one after the other, saving you time, since a multivariate test can be seen as several A/B tests run simultaneously on the same page,
  • determine the contribution of each variable to the measured gains,
  • measure the interaction effects between several supposedly independent elements (for example, page title and visual illustration).

Types of multivariate tests

There are two major methods for conducting multivariate tests:

  • “Full Factorial”: this is the method usually referred to as multivariate testing. All combinations of variables are designed, and each is tested on an equal share of your traffic. If you test two variants for one element and three variants for another, each of the six combinations will be assigned 16.66% of your traffic (see the sketch after this list).
  • “Fractional Factorial”: as its name suggests, only a fraction of all combinations is actually exposed to your traffic. The conversion rate of the untested combinations is statistically inferred from the ones actually tested. This method has the disadvantage of being less precise, but it requires less traffic.
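
As a rough sketch of the difference (traffic figures and variant names are hypothetical), a full factorial design splits traffic evenly across every combination, while a fractional design only exposes a subset of combinations and infers the rest statistically:

```python
from itertools import product

wording_variants = ["Buy", "Purchase"]        # 2 variants for one element
color_variants = ["blue", "red", "green"]     # 3 variants for another

# Full factorial: every combination gets an equal share of traffic.
combos = list(product(wording_variants, color_variants))
share = 100 / len(combos)                     # 6 combinations -> ~16.66% each
for combo in combos:
    print(f"{combo}: {share:.2f}% of traffic")

# Fractional factorial: only a chosen subset is actually exposed to visitors;
# the conversion rates of the remaining combinations are inferred statistically.
tested_subset = combos[::2]                   # illustrative subset, not a real design
print("Actually tested:", tested_subset)
```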

While multivariate testing may seem like a panacea, you should be aware of several limitations that, in practice, reduce its appeal in certain situations.

Limits of multivariate tests

The first limit concerns the volume of visitors you must expose to your test to obtain usable results. By multiplying the number of variables and possibilities tested, you quickly reach a large number of combinations, and the sample assigned to each combination shrinks accordingly. Where a typical A/B test allocates 50% of your traffic to the original and 50% to the variant, a multivariate test allocates only 5, 10, or 15% of your traffic to each combination. In practice, this often translates into longer tests and an inability to reach the statistical reliability needed for decision-making. This is especially true if you are testing deeper pages with lower traffic, which is often the case with checkout funnels or landing pages for traffic acquisition campaigns.
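
A back-of-the-envelope sketch (all traffic and conversion figures are hypothetical, and the sample-size rule of thumb assumes roughly 80% power at a 5% significance level) shows how quickly the required test duration grows with the number of combinations:

```python
# Rule of thumb for the visitors needed per variant to detect an absolute lift
# `delta` from a baseline conversion rate `p`: n ≈ 16 * p * (1 - p) / delta**2.
baseline_rate = 0.05        # 5% conversion rate on the tested page (hypothetical)
min_detectable_lift = 0.01  # we want to detect at least a 1-point absolute uplift
daily_visitors = 10_000     # hypothetical daily traffic on the page

n_per_variant = 16 * baseline_rate * (1 - baseline_rate) / min_detectable_lift ** 2

for combinations in (2, 6, 9):   # A/B test vs. two multivariate designs
    visitors_per_combo_per_day = daily_visitors / combinations
    days = n_per_variant / visitors_per_combo_per_day
    print(f"{combinations} combinations: ~{days:.0f} days to gather "
          f"{n_per_variant:.0f} visitors per combination")
```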

The second disadvantage relates to why a multivariate test gets chosen in the first place. In some cases, it is an admission of weakness: users do not know exactly what to test and hope that by testing several things at once, they will find something worth keeping. These tests often involve only small modifications. A/B testing, on the other hand, imposes greater rigor and better identification of test hypotheses, which generally leads to more creative tests supported by data and with better results.

The third disadvantage is complexity. Conducting an A/B test is much simpler, especially when analyzing the results. You do not need to perform complex mental gymnastics to understand why one element interacts positively with another in one case and not in another. Keeping the process simple and fast to execute allows you to be more confident and to iterate quickly on your optimization ideas.

Conclusion

While multivariate tests are attractive on paper, running tests for too long only to obtain weak statistical reliability can make them a less attractive option in some cases. To obtain actionable results quickly, it is better in 90% of cases to stick to traditional A/B tests (or A/B/C/D tests). This is the ratio we observe among our customers, including those with audiences of hundreds of thousands or even millions of visitors. The remaining 10% of tests are better reserved for fine-tuning, once you are comfortable with the testing practice, have achieved significant gains through your A/B tests, and are looking to exceed certain conversion thresholds or gain a few extra incremental points.

Finally, it is always helpful to remember that, more than the type of test (A/B vs. multivariate), it is the quality of your hypotheses – and by extension that of your work of understanding conversion problems – which will be the determining factor in getting boosts and convincing results from your testing activity.



A/A Testing: A Waste of Time or Useful Best Practice?

A/A testing is little known and its usefulness is hotly debated, but it brings added value for those looking to integrate an A/B testing solution with rigor and precision.

But before we begin…

What is A/A testing?

A/A testing is a derivative of A/B testing (check out our A/B testing definition). However, instead of comparing two different versions (of your homepage, for example), here we compare two identical versions.

Two identical versions? Yes!

The main purpose of A/A testing is simple: verify that the A/B testing solution has been correctly configured and is effective.

We use A/A testing in three cases:

  • To check that an A/B testing tool is accurate
  • To set a conversion rate as reference for future tests
  • To decide on an optimal sample size for A/B tests

Checking the accuracy of the A/B Testing tool

When performing an A/A test, we compare two strictly identical versions of the same page.

Of course, an A/A test is expected to show similar conversion values for both versions. The idea here is to prove that the test solution is working correctly.

Logically, we run an A/A test when setting up a new A/B testing solution or when switching from one solution to another.

However, a “winner” is sometimes declared between two identical versions. We must then seek to understand why, and this is the benefit of A/A testing:

  • The test may not have been conducted correctly
  • The tool may not have been configured correctly
  • The A/B testing solution may not be effective.
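
Chance can also play a role: at a typical 95% confidence level, roughly one A/A test in twenty will declare a false “winner” even when everything is configured correctly. The simulation sketch below (parameters are hypothetical, and the z-test helper is our own illustration, not a feature of any particular tool) makes this visible:

```python
import math
import random

def is_significant(conv_a, n_a, conv_b, n_b, z_crit=1.96):
    """Two-proportion z-test at ~95% confidence (pooled standard error)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return se > 0 and abs(p_a - p_b) / se > z_crit

random.seed(42)
true_rate, visitors, runs = 0.05, 2_000, 500   # identical "A" and "A" variants
false_winners = 0
for _ in range(runs):
    conv_a = sum(random.random() < true_rate for _ in range(visitors))
    conv_b = sum(random.random() < true_rate for _ in range(visitors))
    if is_significant(conv_a, visitors, conv_b, visitors):
        false_winners += 1

print(f"False winners in {runs} simulated A/A tests: {false_winners} (~5% expected)")
```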

Setting a reference conversion rate

Let’s imagine that you want to set up a series of A/B tests on your homepage. You set up the solution, but a problem arises: you do not know which conversion rate to compare the different versions to.

In this case, an A/A Test will help you find the “reference” conversion rate for your future A/B tests.

For example, you begin an A/A test on your homepage where the goal is to fill out a contact form. When you compare the two versions, you get nearly identical figures (and this is normal): 5.01% and 5.05% conversions. You can now use this data with the certainty that it truly represents your conversion rate and launch your A/B tests to try to exceed it. If an A/B test then tells you that a “better” variant achieves 5.05% conversion, it actually means that there is no progress.
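
As a small illustration using the figures from this example (the visitor and conversion counts below are hypothetical, chosen to give 5.01% and 5.05%), you can pool the two A/A arms into a single reference rate with a margin of error, and then judge whether a later “better” variant is actually outside that margin:

```python
import math

def reference_rate(conv_a, visitors_a, conv_b, visitors_b):
    """Pool both A/A arms and return the reference rate with a ~95% margin of error."""
    conversions = conv_a + conv_b
    visitors = visitors_a + visitors_b
    rate = conversions / visitors
    margin = 1.96 * math.sqrt(rate * (1 - rate) / visitors)
    return rate, margin

rate, margin = reference_rate(1_002, 20_000, 1_010, 20_000)   # 5.01% and 5.05%
print(f"Reference conversion rate: {rate:.2%} ± {margin:.2%}")

candidate = 0.0505   # a "better" variant reporting 5.05%
verdict = "real progress" if candidate > rate + margin else "within noise, no progress"
print(f"Variant at {candidate:.2%}: {verdict}")
```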

Finding a sample size for future tests

The problem in comparing two similar versions is the “luck” factor.

Since the tests are formulated on a statistical basis, there is a margin of error that can influence the results of your A/B testing campaigns.

It’s no secret how to reduce this margin of error: you have to increase the sample size to reduce the risk that random factors (so-called “luck”) skew the results.

By performing an A/A test, you can “see” at what sample size the test solution comes closest to “perfect equality” between your identical versions.

In short, an A/A test allows you to find the sample size at which the “luck” factor is minimized; you can then use that sample size for your future A/B tests. That said, A/B tests generally require a smaller sample, since a genuine difference between variants is easier to detect than the near-perfect equality of an A/A test.
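
A minimal sketch (assuming a hypothetical 5% baseline rate) of why larger samples tame the “luck” factor: the margin of error on a measured conversion rate shrinks with the square root of the sample size:

```python
import math

baseline = 0.05   # hypothetical conversion rate observed in the A/A test
for n in (1_000, 5_000, 20_000, 100_000):
    margin = 1.96 * math.sqrt(baseline * (1 - baseline) / n)   # ~95% margin of error
    print(f"sample size {n:>7}: {baseline:.2%} ± {margin:.2%}")
```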

A/A testing: a waste of time?

The question is hotly debated in the field of A/B Testing: should we take the time to do an A/A test before doing an A/B test?

And that is the heart of the issue: time.

Performing A/A tests takes considerable time and traffic

In fact, performing an A/A test takes considerably more time than an A/B test, since the volume of traffic needed to show that the two “identical” variants lead to the same conversion rate is substantial.

The problem, according to ConversionXL, is that A/A testing is time-consuming and encroaches on traffic that could be used to conduct “real tests,” i.e., those intended to compare two variants.

Finally, A/A testing is much easier to set up on high traffic sites.

The idea is that if you run a site that has just launched or has low traffic, it is pointless to spend your time on an A/A test: focus instead on optimizing your purchase funnel or your Customer Lifetime Value. The results will be much more convincing and, above all, much more interesting.

An interesting alternative: data comparison

To check the accuracy of your A/B testing solution, there is another approach that is easy to set up: have your A/B testing solution integrate with another source of analytics data.

By doing this, you can compare the data and see if it points to the same result: it’s another way to check the effectiveness of your test solution.

If you notice significant differences in the data between the two sources (the sketch after the list below shows a simple check), you know that one of them is:

  • Either poorly configured,
  • Or ineffective and must be changed.
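
A minimal sketch of such a check (the figures, metric names, and 5% tolerance threshold are all hypothetical): pull the same metrics from the testing tool and from your analytics source, and flag any gap beyond what you are willing to tolerate:

```python
# Counts reported for the same page over the same period (hypothetical figures).
testing_tool = {"visitors": 50_000, "conversions": 2_450}
analytics_source = {"visitors": 48_900, "conversions": 2_410}

TOLERANCE = 0.05   # flag relative gaps larger than 5% (arbitrary threshold)

for metric in ("visitors", "conversions"):
    tool_value = testing_tool[metric]
    analytics_value = analytics_source[metric]
    gap = abs(tool_value - analytics_value) / max(tool_value, analytics_value)
    status = "OK" if gap <= TOLERANCE else "check configuration"
    print(f"{metric}: tool={tool_value}, analytics={analytics_value}, "
          f"gap={gap:.1%} -> {status}")
```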
