Article

9min read

Inconclusive A/B Test Results – What’s Next?

Have you ever run an experiment that left you with an unexpected result, unsure of what to do next? Many people face this when they receive neutral, flat, or inconclusive A/B test results, and it is the question we aim to answer.

In this article, we are going to discuss what an inconclusive experimentation result is, what you can learn from it, and what the next step is when you receive this type of result.

What is an inconclusive experiment result?

We have two definitions of an inconclusive experiment: a practitioner’s answer and a more detailed one. The basic practitioner’s answer is numerical and depends on the statistics your platform reports:

  • The probability of a winner is less than 90-95%
  • The p-value is bigger than 0.05
  • The lift confidence interval includes 0

In other words, an inconclusive result happens when the results of an experiment are not statistically significant or the uplift is too small to be measured.
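As a rough illustration of the criteria above, here is a minimal sketch that computes a p-value and a confidence interval for the difference between two conversion rates. It assumes a classic two-sided z-test with a normal approximation; your platform may use a different (for example Bayesian) method, and the function name and the visitor counts are hypothetical.

```python
from statistics import NormalDist


def summarize_ab(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Two-sided z-test and CI for the difference of two conversion rates.

    Assumption: normal approximation, original = A, variant = B.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a

    # Pooled standard error, used for the test of "no difference".
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error, used for the confidence interval.
    se = (p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)

    # Inconclusive if the p-value is above the threshold
    # or the interval still contains 0.
    inconclusive = p_value > alpha or ci[0] <= 0 <= ci[1]
    return {"p_value": p_value, "ci_abs_diff": ci, "inconclusive": inconclusive}


# Hypothetical counts: 3.00% vs 3.15% conversion on 10,000 visitors each.
result = summarize_ab(300, 10_000, 315, 10_000)
print(result)
```

With these made-up numbers the interval straddles 0 and the p-value is well above 0.05, so the result would be labeled inconclusive.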

However, let’s take note of the true meaning of “significance” in this case: it refers to a threshold that has been set in advance on a metric or statistic. If this threshold is crossed, then an action is taken, usually implementing the winning variation.

Setting thresholds for experimentation

It’s important to note that the user sets the threshold and there are no magic formulas for calculating a threshold value. The only mandatory rule is that the threshold must be set before the experiment begins. Fixing it in advance is what allows the statistical hypothesis protocol to mitigate the risks of making a poor decision or missing an opportunity during experimentation.

To set a proper threshold, you will need a mix of statistical and business knowledge considering the context.

There is no golden rule, but there is a widespread consensus for using a “95% significance threshold.” However, it’s best to use this generalization cautiously as using the 95% threshold may be a bad choice in some contexts.

To make things simple, let’s consider that you’ve set a significance threshold that fits your experiment context. Then, having a “flat” result may have different meanings – we will dive into this more in the following sections.

The best tool: the confidence interval (CI)

The first thing to do after the planned end of an experiment is to check the confidence interval (CI), which gives useful information without any notion of significance. These intervals are usually built at a 95% confidence level. Loosely speaking, this means that there is a 95% chance that the real value lies between the interval’s boundaries. You can consider the boundaries to be an estimate of the best and worst-case scenarios.

Let’s say that your experiment tests a collaboration with a brand ambassador (or influencer) to attract more attention and sales. You want to see the impact the brand ambassador has on the conversion rate. There are several possible scenarios depending on the CI values:

Scenario 1:

The confidence interval of the lift is [-1% : +1%]. This means that in the best-case scenario, the ambassador effect is a 1% gain and in the worst-case scenario, the effect is -1%. If the revenue from this 1% relative gain is less than the cost of the ambassador, then you know that it’s okay to stop this collaboration.

A basic estimation can be done by taking this 1% of your global revenue from an appropriate period. If this is smaller than the cost of the ambassador, then there is no need for “significance“ to validate the decision – you are losing money.
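The basic estimation described above can be sketched in a few lines. The revenue and cost figures below are hypothetical, and the function names are ours, not part of any platform.

```python
def best_case_gain(period_revenue, ci_upper_lift):
    """Revenue gained over the period if the true lift equals the CI upper bound."""
    return period_revenue * ci_upper_lift


def covers_cost(period_revenue, ci_upper_lift, cost):
    """True if even the best-case gain exceeds the cost of the collaboration."""
    return best_case_gain(period_revenue, ci_upper_lift) > cost


# Hypothetical figures: 200,000 in revenue over the period,
# a CI upper bound of +1%, and an ambassador cost of 5,000.
print(covers_cost(200_000, 0.01, 5_000))
```

If this check fails for the upper bound of the interval, no lower value inside the interval can pass it, which is why the decision does not need “significance”.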

Sometimes neutrality is a piece of actionable information.

Scenario 2: 

The confidence interval of the lift is [-1% : +10%]. Although this sounds promising, it’s important not to make quick assumptions. Since 0 is still in the confidence interval, you’re still unsure if this collaboration has a real impact on conversion. In this case, it would make sense to extend the experiment period because the gain is more likely to be positive than negative.

It’s best to extend the experimentation period until the left bound gets to a “comfortable” margin.

Let’s say that the cost of the collaboration is covered if the gain is as small as 3%. Then any CI of the form [3%, XXX%] will be okay. With a CI like this, you are ensuring that the worst-case scenario is break-even. And with more data, you will also have a better estimate of the best-case scenario, which will most likely be lower than the initial 10%.

Important notice: do not repeat this too often, otherwise you may be waiting until your variant beats the original just by chance.
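The reasoning in the two scenarios can be summarized as a simple rule that compares the CI bounds with the break-even lift. This is a simplified sketch under our own assumptions (the function name and the three labels are hypothetical), not a substitute for judgment about the context.

```python
def next_step(ci_low, ci_high, break_even_lift):
    """Suggest an action from the CI of the lift and the break-even lift."""
    if ci_high < break_even_lift:
        return "stop"       # even the best case does not cover the cost
    if ci_low >= break_even_lift:
        return "implement"  # even the worst case covers the cost
    return "extend"         # break-even still lies inside the interval


# Break-even lift of 3%, as in the example above.
print(next_step(-0.01, 0.01, 0.03))  # Scenario 1
print(next_step(-0.01, 0.10, 0.03))  # Scenario 2
print(next_step(0.03, 0.08, 0.03))   # a "comfortable" interval
```

The "extend" branch is the one to use sparingly, for the reason given in the notice above.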

When extending a testing period, it’s safer to do it by looking at the CI rather than the “chances to win” or p-value, because the CI provides you with an estimate of the effect size. When the variant wins only by chance (a risk that grows each time you extend the testing period), it will yield a very small effect size.

You will notice the size of the gain by looking at the CI, whereas a p-value (or any other statistical index) will not inform you about the size. Extending a test again and again until it reaches significance is a known statistical mistake called p-hacking. P-hacking is basically running an experiment until you get what you expect.

The dangers of P-hacking in experimentation

It’s important to be cautious of p-hacking. Statistical tests are meant to be used once. Splitting the analysis into segments can, to some extent, be seen as running several different experiments. Therefore, if making a single decision at a 95% significance level means accepting a 5% risk of having a false positive, then checking 2 segments implicitly leads to roughly doubling this risk to 10%.

The following advice may help to mitigate this risk:

  • Limit the number of segments you are studying to only segments that could have a reason to interact differently with the variation. For example: for a user interface modification, the screen size or the browser used may affect how the modification is displayed, but the geolocation will not.
  • Use segments that convey strong information regarding the experiment. For example: changing the wording of a message probably has no link to the browser used. It is more likely to interact with the emotional needs of the visitors, which is something you can capture with new AI technology when using AB Tasty.
  • Don’t check the smallest segments. The smallest segments will not greatly impact your business overall and are often the least statistically significant. Raising the significance threshold may also be useful to mitigate the risk of having a false positive.
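To make the “doubling” of the risk concrete, here is a small sketch of how the chance of at least one false positive grows with the number of segments checked. It assumes the checks are independent, which is only an approximation for real segments. The Bonferroni correction shown is one standard way of raising the threshold; it is our choice for illustration, not something the platform necessarily applies.

```python
def familywise_risk(alpha, n_checks):
    """Chance of at least one false positive across independent checks."""
    return 1 - (1 - alpha) ** n_checks


def bonferroni_alpha(alpha, n_checks):
    """Stricter per-check threshold that keeps the overall risk near alpha."""
    return alpha / n_checks


for k in (1, 2, 5, 10):
    print(k, round(familywise_risk(0.05, k), 4), bonferroni_alpha(0.05, k))
```

For 2 segments the overall risk is 9.75%, which is the “roughly 10%” mentioned above.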

Should you extend the experiment period often?

If you notice that you often need to extend the experiment period, you might be skipping an important step in the test protocol: estimating the sample size you need for your experiment.

Unfortunately, many people skip this part of the protocol, thinking that they can fix it later by extending the period. However, this is bad practice for several reasons:

  • This brings you close to P-hacking
  • You may lose time and traffic on tests that will never be significant

Sample size calculators ask a question you cannot answer in advance: what will be the size of the lift? It’s impossible to know, because the reason you test and experiment is precisely that you do not know the outcome. This is one reason why experimenters don’t often use sample size calculators.

A far more intuitive approach is to use a Minimal Detectable Effect (MDE) calculator. Based on the base conversion rate and the number of visitors you send to a given experiment, an MDE calculator can help you answer the question: what is the smallest effect, if one exists, that you may be able to detect?

For example, if the total traffic on a given page is 15,000 visitors over 2 weeks and the conversion rate is 3%, the calculator will tell you that the MDE is about 25% (relative). This means that what you are about to test must have quite a big impact: going from 3% to 3.75% (25% relative growth).
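For readers who want to see where such a figure comes from, here is a minimal sketch of an MDE calculation. It rests on assumptions that are ours: a 50/50 traffic split, a two-sided test at a 5% threshold, 80% power, and a normal approximation that uses the baseline rate for the variance of both arms. Real calculators may use different settings and give slightly different numbers.

```python
from statistics import NormalDist


def relative_mde(baseline_rate, total_visitors, alpha=0.05, power=0.80):
    """Approximate smallest relative lift detectable with the given traffic."""
    n_per_arm = total_visitors / 2  # assumes a 50/50 split
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    # Standard error of the difference between the two conversion rates.
    se = (2 * baseline_rate * (1 - baseline_rate) / n_per_arm) ** 0.5
    absolute_mde = (z_alpha + z_power) * se
    return absolute_mde / baseline_rate


# 15,000 visitors over the test period, 3% baseline conversion rate.
print(f"{relative_mde(0.03, 15_000):.0%}")
```

Under these assumptions the result comes out close to the 25% figure quoted above. Since the MDE shrinks with the square root of the traffic, quadrupling the traffic only halves it.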

If your variant only changes the colors of a small button, developing an entire experiment may not be worth the time. Even if the new colors are better and give you a small uplift, it will not be significant in the classic statistical way (having a “chance to win” >95% or a p-value < 0.05).

On the other hand, if your variation tests a big change such as offering a coupon or a brand new product page format, then this test has a chance to give usable results in the given period.

Digging deeper into ‘flatness’

Some experiments may appear to be flat or inconclusive when in reality, they need a closer look.

For example, frequent visitors may be puzzled by your changes because they expect your website to remain the same, whereas new visitors may instantly prefer your variation. The effects in these two groups may cancel each other out when you look only at the overall results instead of investigating the data further. This is why it’s very important to take the time to dig into your visitor segments, as it can provide useful insights.

This can lead to very useful personalization, where the variation is shown only to the segment that benefits from it.
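The cancellation effect is easy to reproduce with made-up numbers. In the sketch below, all counts are hypothetical and chosen so that the two segments move in opposite directions.

```python
def relative_lift(control, variant):
    """Relative lift of the variant over the control; inputs are (conversions, visitors)."""
    (conv_c, n_c), (conv_v, n_v) = control, variant
    rate_c, rate_v = conv_c / n_c, conv_v / n_v
    return (rate_v - rate_c) / rate_c


# Hypothetical results per segment: (conversions, visitors).
segments = {
    "new visitors":       {"control": (150, 5_000), "variant": (180, 5_000)},
    "returning visitors": {"control": (200, 5_000), "variant": (170, 5_000)},
}

for name, arms in segments.items():
    print(name, f"{relative_lift(arms['control'], arms['variant']):+.0%}")

# Overall result: sum conversions and visitors across segments.
total_control = tuple(sum(s["control"][i] for s in segments.values()) for i in (0, 1))
total_variant = tuple(sum(s["variant"][i] for s in segments.values()) for i in (0, 1))
print("overall", f"{relative_lift(total_control, total_variant):+.0%}")
```

Here the overall result is perfectly flat even though each segment shows a clear difference. Keep in mind that each segment check is an additional test, so the cautions about p-hacking still apply.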

What is the next step after receiving an inconclusive experimentation result?

Let’s consider that your variant has no effect at all, or at least not enough to have a business impact. This still means something. If you reach this point, it means that all previous ideas fell short: you discovered no behavioral difference despite the changes you made in your variation.

What is the next step in this case? The next step is actually to go back to the previous step – the hypothesis. If you are correctly applying the testing protocol, you should have stated a clear hypothesis. It’s time to use it now.

There might be several meta-hypotheses about why your hypothesis has not been validated by your experiment:

  • The signal is too weak. You might have made a change, but perhaps it’s barely noticeable. If you offered free shipping, your visitors might not have seen the message if it’s too low on the page.
  • The change itself is too weak. In this case, try to make the change more significant. If you have increased the size of the product picture on the page by 5%, it’s time to try 10% or 15%.
  • The hypothesis might need revision. Maybe the trend is reversed. For instance, if the confidence interval of the gain is more on the negative side, why not try implementing the opposite idea?
  • Think of your audience. Even if you have a strong belief in your hypothesis, it may simply be time to change your mind about what is important for your visitors and try something different.

It’s important to notice that this change is something that you’ve learned thanks to your experiment. This is not a waste of time – it’s another step forward to better knowing your audience.

Yielding an inconclusive experiment

An experiment that does not yield a clear winner (or loser) is often called neutral, inconclusive, or flat. It still produces valuable information if you know how and where to search. It’s not an end, it’s just another step further in your understanding of who you’re targeting.

In other words, an inconclusive experiment result is always a valuable result.


Article

10min read

Software Development Team Best Practices

Software development isn’t just developers writing code. It also includes less technical processes that precede the actual development process such as the planning and the testing stages as well as post-development when the software is released and feedback is gathered from end-users.

What this means is that many teams beyond development are involved in the software development life cycle, such as product, design, testing, sales and marketing teams. These teams are all involved in achieving common objectives and ensuring a high-quality product.

Software development teams

To understand why it’s so important for teams to work cross-functionally, it helps to take a look at the different types of teams involved in software development. This will uncover the best practices to get these teams on the road for enhanced collaboration.

When we visualize a software development team, it’s easy to imagine a group of developers writing and releasing code but the reality is that it’s much more than that.

A software development team brings together a wide range of expertise from different teams within an organization to ensure the success of a project. This means that most teams are not solely made up of developers: while developers are responsible for creating the product, there also need to be people dedicated to building the vision of the product, managing its life cycle, testing it, marketing it and so on.

A software development team typically consists of the following roles:

With various teams from different departments coming together, there’s a great advantage in having expertise across multiple disciplines, which can bring innovative solutions to problems and insights which otherwise would be overlooked if these teams were to work in silos. Therefore, each of these roles is key to the effective development of your product.

Many factors will influence the structure of your development team such as project complexity, budget and size as well as the needs and expectations of stakeholders.

Ultimately, the team you put together will determine your project’s likelihood of success or failure and their collaboration will be key to achieving desired outcomes.

Why is cross-functional collaboration important?

It’s inevitable for different teams to clash during projects. For example, developers and product teams tend to approach projects from vastly different perspectives and their metrics for success will also differ.

Product managers are often focused on achieving outcomes and overall business objectives. They’re looking to quickly validate their ideas and just as quickly release features that will bring in more revenue for the business. Developers, for their part, aim to build the best possible product by focusing on the deliverables that come with building this product. 

The same applies to other teams within the organization with the different types of mindset, skills and goals coming into play during a project.

It’s vital for cross-functional teams to communicate and collaborate effectively around a shared goal in order to successfully achieve it.

Effective collaboration results in better products. Less conflict between the different teams translates to more time dedicated to building and releasing high-quality products. In other words, when teams are on the same page, it allows them to focus on what really matters, which is creating value for the customer through quality software.

Furthermore, cross-functional collaboration is a key driver for creativity and innovation. When different people come together to work on a project, they encourage each other to consider things from different angles, which often results in new ideas that can help companies gain competitive advantages.

Software development team best practices

To enhance productivity and to continue to deliver value to customers, software development teams should stick to some best practices to put them on the path to improved collaboration and success.

Define clear goals

While each team has their own set of internal goals, they still need to make sure that those goals align with the overall business objectives of the company (and product). It’s essential that all teams have a shared understanding of business goals and how to achieve them.

This means keeping all teams coordinated and aligned around the product vision throughout the software development life cycle. It also involves determining the project scope and requirements that will best achieve these objectives.

Once shared goals are established, teams can work with each other instead of against each other even as they perform their own distinct tasks while pushing ahead in the same direction. 

The responsibilities of everyone involved in the software development process need to be clearly defined to avoid clashes and create a sense of accountability. The clearer the roles, the less chance of confusion as teams go deeper into the project.

This is a time when having the right leadership can make all the difference. Leaders must clearly define roles and responsibilities and ensure that all teams understand how their work creates value for the organization and its customers. 

Choose the appropriate project management methodology

Depending on the size and complexity of your project and teams, it’s important to choose a methodology that works well with the organization’s culture and values. 

From Waterfall to Agile methodologies, there are many approaches you can choose. For large projects with a clearly defined start and end-point, the Waterfall approach would work best. However, for projects that are adaptable and divided into smaller sprints, an Agile approach is better suited.

This will serve as the foundation for how you begin to structure your teams and the type of tools they should implement in their daily workflows.

Invest in DevOps

Many software development companies organize themselves in a way that often leads to functional silos so that all the different teams tend to work in isolation while focused on their own goals.

As we’ve discussed, a project also consists of many moving parts and effective communication throughout the different stages of the project becomes complicated as there’s less visibility during these stages.

The DevOps methodology mainly grew out of frustration with the silos between teams, primarily development and operations teams. However, the term has evolved in modern software development to encompass a set of practices and tools that increase an organization’s ability to quickly deliver new software by promoting enhanced collaboration and communication between different teams.

DevOps goes beyond adopting the right tools to achieve high performance and better quality. It also involves undergoing a cultural transformation and teams adopting a shared culture and mindset that allows them to focus on quality of software and speed of delivery.

Choose the right tools

There are a number of collaborative tools your teams can adopt to establish efficient communication practices. The tools you choose will serve as the foundation to help teams work together towards common goals.

Among those are ones that can help you plan your projects from the bigger picture to the little details, which is especially useful for large organizations with multiple teams and team members collaborating. This also enables teams to have greater visibility over each member’s progress in the project and empowers teams to share input and encourage fast feedback loops in order to build a culture of DevOps and open communication.

Different teams will use their own tools to track work and progress throughout a project. For example, product teams will typically use roadmapping tools to plan, prioritize and track features while development teams will use development tools such as version control systems.

Therefore, there are many tools to choose from depending on the needs of the organization and teams from collaboration to project management tools as well as automation tools to streamline processes.

For example, teams adopting DevOps practices into their workflows will also need to choose the right stack of tools according to their unique business needs in order to implement DevOps successfully.

Communicate constantly and efficiently

From the onset of a project, all teams and other stakeholders will need to be involved in the decision-making process to keep issues down to a minimum and avoid miscommunication further down the road.

For example, many product teams often take the lead on setting the product vision and requirements without reviewing them with the development and engineering teams. In this scenario, it’s essential to understand that developers are the ones who will be writing the code for the product in question and so they must be consulted early on to give them the necessary context to build and prioritize the right features.

In this scenario, product managers and owners will need to effectively communicate the strategic direction and vision of a product with the help of a dedicated product roadmap. This will enable developers to understand why they’re building the product, its value and how it’s linked to overall business objectives.

A project often consists of many moving parts, which means it might not be possible for teams to have full visibility over them. However, many important decisions are collaborative in nature and require input from multiple sides. It’s imperative that everyone on the team is encouraged to actively provide feedback and share information about the progress of development. This also helps teams identify any potential roadblocks and take quick action to address them.

Set KPIs and metrics to track performance

Depending on the business objectives set at the beginning, teams will need to establish metrics to track and measure performance throughout the development process and beyond. 

These metrics will be essential in the short and long term to make data-driven decisions to optimize products and improve team performance.

Teams can set any kind of metrics that will allow them to assess their efficiency. These could include productivity metrics such as velocity and cycle time, customer metrics such as customer satisfaction and net promoter scores, and more DevOps-specific metrics such as DORA metrics.

Software development is a team effort 

To produce high-quality software that meets your customers’ expectations and needs, different teams have to come together within a collaborative environment to solve complex problems.

If organizations work on fostering collaboration across the entire software development life cycle, software development teams can overcome challenges, maximize productivity and software quality as well as deliver better value to customers. It’s a win-win situation.

At the end of the day, all teams within an organization are looking to accomplish the same end-goals and outcomes, mainly to keep the business running smoothly and bring in profit as well as create top-notch products that customers will love.