Sample Ratio Mismatch: What Is It and How Does It Happen?

A/B testing can fall victim to a few types of experimental flaws.

Yes, you read that right – A/B testing is important for your business, but only if you have trustworthy results. To get reliable results, you must be on the lookout for errors that might occur while testing.

Sample ratio mismatch (SRM) is a term that gets thrown around in the A/B testing world, and it’s essential to understand its importance during experimentation.

In this article, we will break down what sample ratio mismatch means, when it is and is not a problem, why it happens, and how to detect it.

Sample ratio mismatch overview

Sample ratio mismatch is an experimental flaw where the expected traffic allocation doesn’t match the observed number of visitors in each test variation.

In other words, an SRM is evidence that something went wrong.

Being aware of sample ratio mismatch is crucial in A/B testing.

Now that you have the basic idea, let’s break this concept down piece by piece.

What is a “sample”?

The “sample” portion of SRM refers to the traffic allocation.

Traffic allocation refers to how the traffic is split toward each test variation. Typically, the traffic will be split equally (50/50) during an A/B test. Half of the traffic will be shown the new variation and the other half will go toward the control version.

This is how an equal traffic allocation will look for a basic A/B test with only one variant:

[Image: equal traffic allocation in an A/B test]

If your test has two or even three variants, the traffic can still be allocated equally so that each version receives the same amount of traffic. An equal traffic allocation in an A/B/C test will be split 33/33/33.

For both A/B and A/B/C tests, traffic can also be split unevenly, such as 60/40, 30/70, 20/30/50, etc. Although this is possible, it is not a recommended practice if you want accurate and trustworthy results from your experiment.
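To make the idea of a traffic split more concrete, here is a minimal Python sketch of how a visitor might be assigned to a variant using hash-based bucketing. The function, weights, and hashing scheme are illustrative assumptions and not a description of any specific testing platform.

```python
import hashlib

def assign_variant(user_id, experiment_id, weights):
    """Deterministically bucket a user into a variant according to the given weights.

    Illustrative sketch only: real platforms may hash and bucket differently.
    `weights` maps variant names to fractions that sum to 1.0, e.g. {"A": 0.5, "B": 0.5}.
    """
    # Hash the user together with the experiment so the same user always
    # lands in the same variant for this test, but not across tests.
    digest = hashlib.md5(f"{experiment_id}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) / 16**32  # uniform value in [0, 1)

    cumulative = 0.0
    for variant, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return variant
    return variant  # guard against floating-point rounding on the last variant

# A 50/50 A/B split and an equal 33/33/33 A/B/C split
print(assign_variant("user-123", "homepage-test", {"A": 0.5, "B": 0.5}))
print(assign_variant("user-123", "pricing-test", {"A": 1/3, "B": 1/3, "C": 1/3}))
```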

Even when following this best practice guideline, equally allocated traffic will not eliminate the chance of an SRM. This type of mismatch can still occur and must be checked for, no matter the circumstances of the test.

Defining sample ratio mismatch (SRM)

Now that we have a clear picture of what the “sample” is, we can build a better understanding of what SRM means:

  • SRM happens when the ratio of the sample does not match the desired 50/50 (or even 33/33/33) traffic allocation
  • SRM occurs when the observed traffic allocation to each variant does not match the allocation chosen for the test
  • The control version and variation receive undesired mismatched samples

Whichever words you choose to describe SRM, we can now understand our original definition with more confidence:

“Sample ratio mismatch is an experimental flaw where the expected traffic allocation doesn’t match the observed number of visitors in each test variation.”

[Image: sample ratio mismatch]

Is SRM always a problem?

To put it simply, SRM occurs when one test version receives a noticeably different amount of visitors than what was originally expected.

Imagine that you have set up a classic A/B test: Two variations with 50/50 traffic allocation. You notice at one point that version A receives 10,000 visitors and version B receives 10,500 visitors.

Is this truly a problem? What exactly happened in this scenario?

The reality is that while conducting an A/B test, sticking perfectly to the allocation scheme is not always possible, since the assignment must be random. The small difference in traffic noted in the example above is something we would typically refer to as a “non-problem.”

If you are seeing a similar traffic allocation on your A/B test in the final stages, there is no need to panic.

A randomly generated traffic split has no way of knowing exactly how many visitors will stumble upon the A/B test during the given time frame. This is why, toward the end of the test duration, there may be a small difference in the traffic allocation even though the vast majority (95%+) of traffic is correctly allocated.

When is SRM a problem?

Some tests may have SRM due to flaws in the experimental setup.

When the SRM is a big problem, there will be a noticeable difference in traffic allocation.

If you see 1,000 visitors directed to one variant and 200 directed to the other, this is an issue. In cases like this, spotting SRM does not require a dedicated mathematical formula, as it is evident enough on its own.

However, such an extreme difference in traffic allocation is rare. That’s why it’s essential to run an SRM check on the visitor counts of each variant before every test analysis.
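Outside of the obvious cases, a common way to run this check (one option among others, not a method prescribed here) is a chi-squared goodness-of-fit test comparing the observed counts against the expected split, where a very small p-value flags an SRM. Below is a minimal sketch using SciPy, with an assumed threshold of 0.001:

```python
from scipy.stats import chisquare

def srm_check(observed_counts, expected_ratios, alpha=0.001):
    """Flag a sample ratio mismatch using a chi-squared goodness-of-fit test.

    observed_counts: visitors (ideally unique users) seen in each variant.
    expected_ratios: the traffic split the test was configured with.
    alpha: significance threshold; 0.001 is an assumed convention, not a rule.
    """
    total = sum(observed_counts)
    expected_counts = [total * ratio for ratio in expected_ratios]
    _, p_value = chisquare(f_obs=observed_counts, f_exp=expected_counts)
    return p_value, p_value < alpha

# The obvious case from above: 1,000 visitors vs. 200 on an intended 50/50 split
p_value, is_srm = srm_check([1000, 200], [0.5, 0.5])
print(f"p-value = {p_value:.2e}, SRM detected: {is_srm}")  # SRM detected: True
```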

Does SRM occur frequently?

Sample ratio mismatch can happen more often than we think. According to a study by Microsoft and Booking.com, about 6% of experiments experience this problem.

Furthermore, if the test includes a redirect to an entirely new page, SRM can be even more likely.

Since we rely heavily on tests and trust their conclusions to make strategic business decisions, it’s important to detect SRM as early as possible when it happens during your A/B test.

Can SRM still affect tests using Bayesian statistics?

The reality is that everyone needs to be on the lookout for SRM, no matter what type of statistical test they are running. This includes experiments using the Bayesian method.

No method is exempt from the possibility of a statistically significant mismatch between the observed and expected traffic allocation. No matter the test, if its underlying assumptions are not met, the results will be unreliable.

Sample ratio mismatch: why it happens

Sample ratio mismatch can happen due to a variety of different root causes. Here we will discuss three common examples that cause SRM.

One common example is when the redirection to one variant doesn’t work properly for visitors with poor connections.

Another classic example is when a direct link to one variant is shared on social media, bringing everyone who clicks the link straight to that variant. This error prevents the traffic from being properly distributed among the variants.

In a more complex case, it’s also possible that JavaScript code included in the test crashes one variant for certain visitor configurations. In this situation, some of the visitors sent to the crashing variant won’t be collected and counted properly, which leads to SRM.

All of these examples involve a selection bias: a non-random group of visitors is excluded. These visitors are arriving directly from a link shared on social media, have a poor connection, or are landing on a crashing variant.

In any case, when these issues occur, the SRM is an indication that something went wrong and you cannot trust the numbers and the test conclusion.

Checking for SRM in your A/B tests

Something important to be aware of when doing an SRM check is that the metric to check needs to be “users” and not “visitors.” Users are the specific people allocated to each variation, whereas the visitors metric counts the number of sessions each user makes.

It’s important to differentiate between users and visitors because results may be skewed if a user comes back to their variation multiple times. An SRM detected on “visitors” may not be a reliable signal, but an SRM detected on “users” is evidence of a problem.
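As an illustration of that distinction, here is a small sketch that counts unique users per variant rather than raw sessions, built on a hypothetical event log (the field names are made up for the example):

```python
from collections import defaultdict

# Hypothetical event log: one entry per session, so a returning user appears several times
sessions = [
    {"user_id": "u1", "variant": "A"},
    {"user_id": "u1", "variant": "A"},  # same user, second visit
    {"user_id": "u2", "variant": "B"},
    {"user_id": "u3", "variant": "A"},
]

visits_per_variant = defaultdict(int)   # counts every session ("visitors")
users_per_variant = defaultdict(set)    # counts each person once ("users")

for session in sessions:
    visits_per_variant[session["variant"]] += 1
    users_per_variant[session["variant"]].add(session["user_id"])

print(dict(visits_per_variant))                               # {'A': 3, 'B': 1}
print({v: len(ids) for v, ids in users_per_variant.items()})  # {'A': 2, 'B': 1}
```

The deduplicated user counts are the ones to feed into an SRM check like the one sketched above.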

SRM in A/B testing

Testing for sample ratio mismatch may seem a bit complicated or unnecessary at first glance. In reality, it’s quite the opposite.

Understanding what SRM is, why it happens, and how it can affect your results is crucial in A/B testing. Running an A/B test to help make key decisions is only helpful for your business if you have reliable data from those tests.

Want to get started on A/B testing for your website? AB Tasty is a great example of an A/B testing tool that allows you to quickly set up tests with low-code implementation of front-end or UX changes on your web pages, gather insights via an ROI dashboard, and determine which route will increase your revenue.



The ROI of Experimentation

When you hear ‘A/B Testing’, do you think straight away of revenue gain? Uplift? A dollars and cents outcome? 

According to David Mannheim, CEO of the Conversion Rate Optimization (CRO) agency User Conversion, you probably do – and shouldn’t. Here’s why:

Unfortunately, it’s just not that simple. 

Experimentation is more than just a quick strategy to uplift your ROI.

In this article we will discuss why we experiment, the challenges of assessing return on investment (ROI), prioritization, and what A/B testing experimentation is really about. 

Why do we experiment?

Technically speaking, experimentation is performed to support or reject a hypothesis. Experimentation provides you with valuable insights into cause-and-effect relationships by determining the outcome of a certain test when different factors are manipulated in a controlled setting. 

In other words, if there is no experiment, there is no way to refute a hypothesis and reduce the risk of losing business or negatively impacting metrics.

Experimentation is about prioritization, minimizing risk and learning from the outcome. The tests you choose to implement should be developed accordingly. It’s not necessarily about making the “right” or “wrong” decision; experimentation helps you make better decisions based on data.

In visual terms, experimentation will look something like this:

[Image: ROI frustration backlog]

Online experiments in the business world must be carefully designed to learn, accomplish a specific purpose, and/or measure a key performance indicator that may not have an immediate financial effect. 

However, far too often it’s the key stakeholders (or HiPPOs – the “highest paid person’s opinion”) who decide which tests get implemented first. Their primary concern? The amount of time it will take to see a neat revenue uplift.

This tendency leads us to the following theory:

The ROI of experimentation is impossible to achieve because the industry is conditioned to think that A/B testing is only about gain.

Frustrations and challenges of ROI expectations 

You may be asking yourself at this point, What’s so bad about expecting revenue uplift from A/B tests? Isn’t it normal to expect a clear ROI?

It is normal; however, the issue isn’t just that simple.

We’ve been conditioned to expect a neat formula with a clean-cut solution: “We invested X, we need to get Y.”  

This is a misleading CRO myth that gets in the way. 

Stakeholders have come to erroneously believe that every test they run should function like this – which has set unrealistic ROI expectations for conversion optimization practitioners.

As you can imagine, this way of thinking creates frustration for those implementing online experimentation tests.

[Image: experiment backlog example]

What people often overlook is the complexity of the context in which they are running their experimentation tests and assessing their ROI.

It’s not always possible to accurately measure everything online, which makes putting an exact number on ROI next to impossible.

Although identifying the impact of experiments can be quite a challenge due to the complexity of the context, there are some online tools that exist to measure your ROI efforts as accurately as possible. 

AB Tasty is an example of an A/B testing tool that allows you to quickly set up tests with low-code implementation of front-end or UX changes on your web pages, gather insights via an ROI dashboard, and determine which route will increase your revenue.

Aside from the frustration that arises from the ingrained expectation that ROI means immediate financial improvement, three of the biggest challenges of assessing the ROI of experimentation are forecasting, working with averages, and running multiple tests at once.

Challenge #1: Forecasting

The first challenge with assessing the ROI of experimentation is forecasting. A huge range of factors impacts an analyst’s ability to accurately project revenue uplift from any given test, such as:

  • Paid traffic strategy
  • Online and offline marketing
  • Newsletters
  • Offers
  • Bugs
  • Device traffic evolution
  • Season
  • What your competitors are doing
  • Societal factors (e.g. Brexit)

When it comes to estimating a revenue projection for the following year from a single experiment, it’s impossible to predict an exact figure. It’s only possible to forecast an ROI trend or an expected average.

Expecting a perfectly accurate and precise prediction for each experiment you run just isn’t realistic – the context of each online experimentation test is too complex.

Challenge #2: Working with averages

The next challenge is that your CRO team is working with averages – in fact, the averages of averages.

Let’s say you’ve run an excellent website experiment on a specific audience segment – and you experienced a high uplift in conversion rate. 

If you then take a look at your global conversion rate for your entire site, there’s a very good chance that this uplift will be swallowed up in the average data. 

Your revenue wave will have shrunk to an undetectable ripple. And this is a big issue when trying to assess overall conversion rate or revenue uplift – there are just too many external factors to get an accurate picture.
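To see how this dilution plays out, here is a quick back-of-the-envelope calculation; the traffic share, baseline conversion rate and uplift below are purely illustrative numbers:

```python
# Purely illustrative numbers
segment_share = 0.10    # tested segment as a share of all site traffic
baseline_cr = 0.030     # conversion rate everywhere before the test
segment_uplift = 0.20   # relative uplift measured inside the segment

segment_cr = baseline_cr * (1 + segment_uplift)              # 3.6% within the segment
global_cr = segment_share * segment_cr + (1 - segment_share) * baseline_cr

print(f"Global conversion rate: {global_cr:.4%}")                      # 3.0600%
print(f"Site-wide relative lift: {global_cr / baseline_cr - 1:.2%}")   # 2.00%
```

A 20% win inside the segment shows up as only a 2% nudge in the site-wide figure – small enough to disappear behind all the other factors moving the global average.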

With averages, the bottom line is that you’re shifting an average. Averages make it very difficult to get a clear understanding. 

On average, an average customer, exposed to an average A/B test, will perform… averagely.

Challenge #3: Multiple tests

The third challenge of ROI expectations happens when you want to run multiple online experiments at one time and try to aggregate the results. 

Again, it’s tempting to run simple math equations to get a clear-cut answer for your gain, but the reality is more complicated than this. 

Grouping together multiple experiments and their results will provide you with blurred results.

This makes ROI calculations for experimentation a nightmare for those simultaneously running tests. Keeping experiments and their respective results separate is the best practice when running multiple tests.

Should it always be “revenue first”?

Is “revenue first” the best mentality? When you step back and think about it, it doesn’t make sense for conversion optimizers to expect revenue gain, and only revenue gain, to be the primary indicator of success driving their entire experimentation program.

What would happen if all businesses always put revenue first?

That would mean no free returns for an e-commerce site (returns don’t increase gain!), no free sweets in the delivery packaging (think ASOS), the most inexpensive product photographs on the site, and so on.

If you were to put immediate revenue gain first – as stakeholders so often want to do in an experimentation context – the implications are even more unsavory. 

Let’s take a look at some examples: you would offer the skimpiest customer service to cut costs, push ‘buy now!’ offers unendingly, discount everything, and forget any kind of brand loyalty initiatives. Need we go on?

In short, focusing too heavily on immediate, clearly measurable revenue gain inevitably cannibalizes the customer experience. And this, in turn, will diminish your revenue in the long run.

What should A/B testing be about?

One big thing experimenters can do is work with binomial metrics.

Avoid the fuzziness and much of the complexity by running tests that aim to give you a yes/no, black or white answer.

[Image: examples of binomial metrics]
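As an illustration, a binomial metric like “did the visitor convert, yes or no” can be compared between two variants with a standard test of proportions. The counts and the 5% threshold below are illustrative assumptions, not a prescribed setup:

```python
from scipy.stats import chi2_contingency

# Illustrative counts: [converted, did not convert] for control (A) and variant (B)
table = [
    [300, 9_700],   # A: 3.0% conversion rate
    [360, 9_640],   # B: 3.6% conversion rate
]

chi2, p_value, dof, _ = chi2_contingency(table)
print(f"p-value = {p_value:.4f}")
print("Detectable difference between A and B?", "yes" if p_value < 0.05 else "no")
```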

Likewise, be extremely clear and deliberate with your hypothesis, and be savvy with your secondary metrics: Use experimentation to avoid loss, minimize risk, and so on.

But perhaps the best thing you can do is modify your expectations.

Instead of saying, “experimentation should unfailingly lead to a clear revenue gain, each and every time,” you might want to start saying, “experimentation will allow us to make better decisions.”

[Image: a good experimentation model]

These better decisions – combined with all of the other efforts the company is making – will move your business in a better direction, one that includes revenue gain.

The ROI of experimentation theory

With this in mind, we can slightly modify the original theory of the ROI of experimentation:

The ROI of experimentation is difficult to achieve and should be contextualized for different stakeholders and businesses. We should not move completely away from a dollar-sign way of thinking, but we should deprioritize it. “Revenue first” is not the best mentality in all cases – especially in situations as complex as calculating the ROI of experiments.