One of the pioneers of experimentation shares a humbling reality check: Most ideas will fail (and it’s a good thing)
Few people have accumulated as much experience as Ronny Kohavi when it comes to experimentation. His work at tech giants such as Amazon, Microsoft and Airbnb — just to name a few — has laid the foundation of modern online experimentation.
Before the idea of “build fast, deploy often” took hold across tech companies, developers followed a waterfall model that saw fewer releases (sometimes every 2-3 years). The shortening of development cycles in the early 2000s thanks to the Agile methodology and an uptick in online experimentation created the perfect storm for a software development revolution ― and Ronny was at the center of it all.
AB Tasty’s VP Marketing Marylin Montoya set out to uncover the early days of experimentation with Ronny and why failure is actually a good thing. Here are some of the key takeaways from their conversation.
Progressive deployments as a safety net
A typical cycle of experimentation involves exposing the test to 50% of the population for an average of two weeks before a gradual release. But Ronny suggests coming at it from a different vantage point: Starting with a small audience of just 2% before ramping up to 50%. The slower ramp-up gives you the time to detect any egregious issues or a degradation in metric values in near real time.
In an experiment, we may focus on just two features, but we have a large set of guardrails that suggest we shouldn’t be degrading X, Y or Z. Statistical data that you’re collecting could also suggest that you’re impacting something you didn’t mean to. Hence, the usage of progressive deployments in which you can identify external factors and easily rollback your test.
It’s like if you’re cooling water: You may realize you’re changing the temperature, but it’s not until you reach 0ºC (32ºF) that ice forms. You suddenly realize that when you get to a certain point, something very big happens. So, deploying at a safe velocity and monitoring the results can lead to huge improvements.
Your great idea? It will most likely fail.
Nothing gives you a better reality check than experimentation at scale. Everyone thinks they’re doing the best stuff in the world until it’s in the hands of their users. That’s when the real feedback kicks in.
Over two-thirds of ideas actually fail to move the metrics that they were designed to improve — a statistic Ronny shares from his time at Microsoft, where he founded the experimentation platform team of more than 100 data scientists, developers and program managers.
Don’t be deterred, however. In the world of experimentation, failing is a good thing. Fail fast, pivot fast. Being able to realize that the direction you’re going in isn’t as promising as previously thought enables you to use those new findings to enrich your next actions.
At Airbnb, Ronny’s experimentation team deployed a lot of machine learning algorithms to improve search. Out of 250 ideas tested in controlled experiments, only 20 of them proved to have a positive impact on the key metrics — meaning over 90% of ideas failed to move the needle. On the flip side, however, the 20 ideas that did succeed in some form? Those resulted in a 6% improvement in booking conversion, worth hundreds of millions of dollars.
The starter kit to experimentation
It’s easier today to convince leadership to invest in experimentation because there are plenty of successful use cases out there. Ronny’s advice is to start with a team that has iteration capital. If you’re able to run more experiments and a certain percentage are pass/fail, this ability to try ideas is key.
Pick a scenario where you can easily integrate the experimentation process into the development cycle and then work your way on to more complex scenarios. The value of experimentation is clearer because deployments are happening more often. If you’re working in a team that deploys every six months, there’s not a lot of wiggle room because everyone has already invested their efforts into this idea that the feature cannot fail. Which, as Ronny pointed out earlier, has a low probability of success.
Is experimentation for every company? The short answer is no. A company has to have certain ingredients in order to unlock the value of experimentation. One ingredient you need is being in a domain where it’s easy to make changes, such as website services or software. A second ingredient is you need enough users. Once you have tens of thousands of users, you can start experimenting and doing it at scale. And lastly, make sure you have trustworthy results from which you are taking your decisions.
What else can you learn from our conversation with Ronny Kohavi?
- How experimentation becomes central to your product build
- Why experimentation is at the root of top tech companies
- The role leaders play in evangelizing an experimentation culture
- How to build an environment for true experimentation and trustworthy results
About Ronny Kohavi
Ronny Kohavi is an authority in experimentation, having worked on controlled experiments, machine learning, search, personalization and AI for nearly three decades. Ronny previously was vice president and technical fellow at Airbnb. Prior to that, Ronny led the Analysis and Experimentation at Microsoft’s Cloud and AI group and was the director of data mining and personalization at Amazon. Ronny has also co-authored “Trustworthy Online Controlled Experiments : A Practical Guide to A/B Testing.,” which is currently the #1 best-selling data-mining book on Amazon.
About 1,000 Experiments Club
The 1,000 Experiments Club is an AB Tasty-produced podcast hosted by Marylin Montoya, VP of Marketing at AB Tasty. Join Marylin and the Marketing team as they sit down with the most knowledgeable experts in the world of experimentation to uncover their insights on what it takes to build and run successful experimentation programs.