David Mannheim explains a remastered approach to personalization for long-term customer loyalty
With over 15 years of experience in digital businesses, David Mannheim has helped many companies, such as ASOS, Sports Direct and Boots, to improve and personalize their digital experience and conversion strategy. He was also the founder of one of the UK’s largest independent conversion optimization consultancies – User Conversion.
With his experience as an advisor helping e-commerce businesses to innovate and iterate on personalization and creativity at speed, David has recently published his own book, in which he tackles the “Person in Personalisation”: why he believes personalization has lost its purpose and what to do about it. David is currently building a solution to tackle this epidemic with his new platform, Made With Intent – a product that helps retailers understand the intent and mindset of their audience, not just their behaviors or what page they’re on.
AB Tasty’s VP Marketing Marylin Montoya spoke with David about the current state of personalization and the importance of going back to the basics and focusing on putting the person back in personalization. He also highlights the need for brands to build a relationship with customers based on trust and loyalty, particularly in the digital sphere instead of focusing on immediate gratification.
Here are some key takeaways from their conversation.
Personalization is about being personal
David stresses the importance of not forgetting the first three syllables at the beginning of personalization. In other words, it’s imperative to remember that personalization is about being personal and putting the person at the heart of everything – it’s all about customer-centricity.
For David, personalization nowadays has become too commercialized and too focused on immediate gratification. Instead, the focus should be on metrics such as customer lifetime value and loyalty. Personalization should be a strategic value add rather than a tactical add-on used solely to drive short-term sales and growth.
“If we move our metrics to focus more on the long-term metrics of customer satisfaction, more quality than quantity, more about customer lifetime value and loyalty as well as recognizing the intangibles, not just the tangibles, I think that puts brands in a much better place.”
He further argues that there is a sort of frustration point when it comes to the topic of personalization and who actually does it well. This frustration was clear after David interviewed 153 experts for his book, most of whom struggled to answer the question of “who does personalization well” and found it difficult to name any brands outside of the typical “big players” such as Netflix and Amazon.
This frustration, David believes, stems from the difficulty of replicating an in-store experience in a human-to-screen relationship. Nonetheless, when customers are loyal to a brand, that same loyalty should be reciprocated from the brand side as well to make a customer feel they’re more than just a number. The idea is to achieve a sort of familiarity and acknowledgment with the customer and create a genuine, authentic relationship with them. This is the key to unlocking customer-centricity.
It’s about offering a personalized experience that focuses on adding value for each individual customer, rather than exploiting value where only customers end up with a commercialized experience geared towards driving growth for the company itself.
Disparity between brands’ and customers’ perceptions of personalization
Citing Sailthru’s Personalization Index, David refers to a particular finding in their yearly report: 71% of brands think they excel in personalization, but only 34% of customers actually agree.
In that sense, there’s a mismatch between customers’ expectations and brands’ own expectations of what is competent customer service.
He refers to recommendations as one example that brands primarily incorporate into their personalization strategy. However, he believes recommendations only address the awareness part of the AIDA model (Awareness, Interest, Desire and Action).
“Product discovery for me is only one piece of a puzzle. If you take personalization back to what it’s designed to be, to be personal, well, where is the familiarity? Where’s the acknowledgment? Where’s the connection? Where’s the conversation?” David argues.
What’s missing is a core, intangible ingredient that helps create a relationship between two individuals, in this case, a human and a brand. Because brands have difficulty pinpointing what that is, they choose instead to base their personalization strategy on something more tangible and visible – recommendations.
For brands, the recommendations narrative is fully immersed within customer expectations and so encompasses the idea of personalization, particularly as that’s the approach that the “bigger” brands have adopted when it comes to personalizing the user experience.
“It becomes an expectation. I go on X website so I expect the bare minimum which is to see things that are relevant to what I search for or the things that I’m interested in… This is what people associate personalization to,” David says.
Recommendations are an essential first step of personalization but David argues the future of personalization requires brands to go even further.
Brands should focus on building trust
In order for brands to build that sense of familiarity and truly become more personal with customers, brands need to take personalization to the next stage beyond awareness. For example, customers should be able to trust that a brand is recommending to them what they actually need rather than what makes the most profit.
David believes that the concept of trust is missing in a human-to-screen relationship, which is what’s hindering brands from reaching that next level.
In other words, it’s about transforming the whole approach to personalization, along with its purpose, to demonstrate greater care for the few rather than “trying to get the many,” and in doing so establish trust with customers. Brands should shift their focus to care, which David believes is what makes a brand truly customer-centric.
“I think it’s an initiative, if you can call it that, to focus on care. It does make the brand more customer-centric. You’re putting the customer, their experiences and expectations first with the purpose of providing a better experience for them.”
In that sense, two crucial aspects play into the concept of trust, according to David: competence and care.
Brands need to demonstrate competence: customers should be able to trust that they’re being recommended the most suitable products for their needs rather than the ones with the highest profit margin – in other words, products that are best for the customer instead of the business. At the same time, brands need to demonstrate care by being more personable with customers to create a connection between brand and consumer.
“The more caring you are, the more you can demonstrate trust,” David says.
“Think of banking. Banking demonstrates all the competence in the world, but no care whatsoever. And that therefore destroys their trust. Think of the other way around. Think of your grandma giving you a sweater at Christmas. I’m sure you trust your grandma, but you won’t trust her to buy you a Christmas present, for example.”
For David, context is a prerequisite for trust and the best way to understand user context is through intent, which is where the difference between persuasion and manipulation lies. This is why he has been busy building Made With Intent for the past 8 months focused on that very same concept.
When it comes to recommendations, in particular, it’s essential to contextualize them and understand customer intent. Only then can a brand excel at its recommendation strategy and create a relationship of trust where customers can be confident they’re being recommended products unique to them only.
What else can you learn from our conversation with David Mannheim?
His take on AI and its role in personalization
Ways brands can demonstrate care to build trust and familiarity with their consumers
How brands can shift their personalization approach
About David Mannheim
David has worked in the digital marketing industry for over 15 years and along with founding one of the UK’s largest independent conversion optimization consultancies, he has worked with some of the UK’s biggest retailers to improve and personalize their digital experience and conversion strategy. Today, David has published his own book about personalization and is also building a new platform that helps retailers understand the intent and mindset of their audience, not just their behaviors or what page they’re on.
About 1,000 Experiments Club
The 1,000 Experiments Club is an AB Tasty-produced podcast hosted by Marylin Montoya, VP of Marketing at AB Tasty. Join Marylin and the Marketing team as they sit down with the most knowledgeable experts in the world of experimentation to uncover their insights on what it takes to build and run successful experimentation programs.
In the dynamic realm of e-commerce, selecting the right experience optimization platform (EOP) is essential for achieving success. But how do you assess its impact on your website’s performance and unleash its full potential on your site?
We’re here to guide you with key questions to ask experimentation and personalization solutions you’re assessing, specifically designed to help you evaluate performance – so buckle up and continue reading to unlock new levels of success!
Bonus audio resource: Curious to know more about what AB Tasty does to address performance and optimize customer experience? Listen to this insightful discussion between Léo, one of our product managers, and Margaret, our product marketing manager. In this chat, Léo explains what AB Tasty specifically does to improve performance for our customers. Want to know even more? Check out Léo’s in-depth blog post.
#1. Does the platform offer 99.9% uptime and availability?
Downtime can be a nightmare for your business. Make sure the EOP is known for its reliability and high uptime. Although it might not sound like a big deal, the difference between 99.5% uptime and 99.9% uptime is huge. With 99.9% uptime, you can expect less than 9 hours of downtime annually, vs. 99.5% which can mean nearly 2 full days of downtime in a year. It’s crucial to choose a platform that can keep your website accessible to customers as often as possible, ensuring a seamless shopping experience around the clock and more revenue for your business.
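For a quick sanity check on those numbers, here’s the back-of-the-envelope arithmetic (a minimal sketch based on a 365-day year):

```python
# Maximum yearly downtime implied by an uptime guarantee.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours

def max_downtime_hours(uptime_pct: float) -> float:
    """Hours per year a site can be down while still meeting the uptime percentage."""
    return HOURS_PER_YEAR * (1 - uptime_pct / 100)

print(max_downtime_hours(99.9))  # ~8.76 hours per year
print(max_downtime_hours(99.5))  # ~43.8 hours per year, i.e. nearly 2 full days
```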
#2. Does the platform prioritize website speed and load time?
It goes without saying that in the fast-paced online world, speed matters. Does the EOP offer features that prioritize website load time? Look for optimization techniques such as caching, image compression and code optimization to ensure quick and smooth page loading. A snappy website keeps customers engaged and drives conversions.
#3. Does the platform provide a comprehensive performance center?
Acting on detailed performance data ensures your website is always giving users the best experience. Does the EOP offer comprehensive insights into reducing the tag or campaign weight for optimal performance and user experience? Your EOP should have a performance center that guides you toward campaign optimization, including ways to reduce tag weight, identify heavy or old campaigns you can delete, and verify targeting.
#4. Do the performance metrics they’re showing you come from sites that are active?
Some EOPs might show you performance metrics that include sites that aren’t actually active. An inactive site has a much lighter tag weight than an active site, which makes their performance metrics look much better than they actually are. Always ask the EOP if their metrics are from active sites to ensure you’re seeing the most accurate representation of what you can expect if you go with them.
#5. Are they regularly adding new features to enhance performance?
To stay ahead in the rapidly evolving digital ecosystem, it’s imperative that your EOP consistently adds new features to optimize performance. With regular updates like these, you can ensure you’re meeting user expectations, addressing emerging challenges, enhancing performance metrics, and keeping an edge on the competition.
Take, for example, dynamic imports. Using dynamic imports has a huge advantage. When we were using a monolithic approach, as some EOPs still do, removing a semi-colon in one campaign and pushing this change to production meant that all visitors had to download the full package again, even though only one character out of tens of thousands had changed. With dynamic imports, visitors only re-download the new version of that one changed campaign – and that’s it. Simple.
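To make the caching difference concrete, here’s a simplified Python sketch (a conceptual illustration only, not AB Tasty’s actual delivery code): think of each file’s content hash as its browser cache key. With a monolithic bundle, a one-character edit changes the single shared hash, so every visitor re-fetches everything; with per-campaign modules, only the edited campaign gets a new hash.

```python
import hashlib

def fingerprint(code: str) -> str:
    """Stand-in for a content hash used as a cache key."""
    return hashlib.sha256(code.encode()).hexdigest()[:8]

campaigns = {"campaign_a": "banner code", "campaign_b": "reco widget code"}

bundle_before = fingerprint("".join(campaigns.values()))   # monolithic bundle
module_b_before = fingerprint(campaigns["campaign_b"])     # per-campaign module

campaigns["campaign_a"] += ";"  # a one-character change to campaign_a only

bundle_after = fingerprint("".join(campaigns.values()))
module_b_after = fingerprint(campaigns["campaign_b"])

print(bundle_before != bundle_after)      # True: the whole bundle must be re-downloaded
print(module_b_before == module_b_after)  # True: campaign_b stays cached, only campaign_a is re-fetched
```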
#6. Can the platform handle spikes in web traffic?
E-commerce sites often face surges in traffic during peak periods or promotional events like Black Friday. How does the EOP handle increased web traffic without compromising performance? Look for platforms with content delivery networks (CDNs) that handle load balancing and scalability to ensure your website remains stable and accessible during high-demand periods.
#7. Does the platform have both server-side and client-side offers?
Having both server-side and client-side capabilities is crucial for e-commerce companies, especially given how much e-commerce now happens on mobile and in apps. Server-side optimizes performance with zero flicker and a seamless mobile experience, while client-side enhances user experience and puts the power of experimentation and personalization into the hands of marketers, freeing up developer time. Using both approaches enables holistic optimization and consistent experiences, drives business growth, and leads to more satisfied customers.
#8. What level of local customer support and documentation does the platform offer?
Technical support and comprehensive documentation are vital for a smooth experience with your platform. What kind of reliable customer support channels does the EOP provide? Look for platforms that offer timely assistance in your locality and language, plus extensive documentation, empowering you to resolve issues and make the most of your platform’s features. Review peer review sites like G2 to understand which EOPs consistently offer the best service.
#9. Is the platform scalable and adaptable to future needs?
As your e-commerce business grows, your optimization needs may change. To what degree is the EOP scalable and flexible enough to accommodate future requirements without affecting performance? Does the platform have well-known medium and large client brands with high traffic demands? Choose a platform that can adapt to evolving business goals and easily incorporate new features. This ensures the platform remains aligned with your growing needs.
#10. Can you test out the tag for yourself?
Tags should be easy to implement. You want to make sure that the one you go with is compatible with your system. While industry reports can give you an idea of what you can expect, they aren’t representative of your site. The best way to tell is to test it for yourself on your site. This lets you see if what the EOP says is actually what you get. It can also give you an idea of implementation, use, accuracy, reliability and confidence. Finally, it lets you see if there may be any issues that could arise and gives the EOP a chance to address them immediately.
Evaluate the Performance of EOPs to unlock your potential
By asking these key questions, you can begin to evaluate the performance of experience optimization platforms and ensure you select one that helps you unlock your potential. Focus on uptime, speed, traffic handling, mobile optimization, integration capabilities, support, and scalability – and ensure the EOP has an answer for every one of these questions, with proof to back it up. This way, you’ll be able to make an informed decision and optimize your e-commerce site for a seamless user experience, driving higher conversions and business growth.
Go through the checklist below, whether you have an EOP already in place, or are looking to start your EOP journey, and ask providers what they offer:
☑️ Does the platform offer 99.9% uptime and availability?
☑️ How does the platform prioritize website speed and load time?
☑️ What does the platform’s performance center look like?
☑️ How does the platform handle spikes in traffic?
☑️ Does the platform offer both server-side and client-side optimization?
☑️ Does the platform integrate with the tools and systems that you already use?
☑️ What level of support and documentation does the platform offer?
☑️ Is the platform scalable and adaptable to your future business needs?
In modern software development, teams adopting a DevOps methodology aim to ship more frequent releases in smaller batches so they can validate them and test their impact.
This reduces the risk of a big-bang release that could introduce buggy features and damage the user experience. It also avoids having to do a full rollback and then start the rollout all over again.
This ultimately means that software organizations are constantly releasing new updates and features to improve their products’ stability and quality and to deliver the best user experience possible.
Having a set plan in place to introduce new features allows teams to roll out releases to gather feedback and optimize accordingly before going for a full release.
What is a feature rollout plan?
A feature rollout, as the name implies, is when new product features (or updates to existing features) are released to end-users. It’s all the processes that go into gradually introducing a feature to a set of users to test its functionality before deploying to all your users.
Put simply, the main purpose of a feature rollout plan is to keep all the teams involved in the development and release of new features on the same page by making it easier to identify the key elements of each phase in the rollout.
Failing to manage the release of these new features efficiently could lead to low-quality releases and a negative impact on the user experience. This could severely damage a company’s reputation and competitiveness in a world where customer expectations are at an all-time high. In that sense, a solid rollout plan will drive greater adoption of your software by customers and give all the teams involved better organized workflows.
It’s therefore generally recommended to put together a detailed, robust plan early in the development process rather than scrambling at the last minute, as the successful release of your new features requires meticulous preparation.
Feature rollout process
It’s important to first highlight the steps involved in a feature rollout so teams can effectively incorporate the requirements of each phase into their planning.
Typically, the rollout process is divided into the following phases:
Design and planning – Define your objectives and KPIs, identify the key stakeholders involved, set deliverables and communicate the plan to teams. This includes determining which features to prioritize and release, so the rollout plan can be built accordingly.
Develop the rollout strategy – Identify the target users whose needs are best addressed by the new feature and determine how you will give them access to it – your deployment strategy.
Build – Develop the feature and manage its progress throughout the development process.
Controlled rollout – Validate and test your features with controlled rollouts, using feature flags for example.
Collect feedback – Put in place a constant feedback loop.
Full release – Once the feature has been optimized and refined according to the feedback collected, it is ready to be released to all users.
You will also need to identify and anticipate any potential roadblocks and challenges along the way in your planning and address them early on.
As you advance in the rollout process, plan in-house training sessions and a user onboarding strategy, along with proper documentation, to support your feature rollout. These serve as a guide for users (both internal and external) to understand the feature in depth and its value proposition.
Therefore, based on the above, your rollout plan should ideally include the following components to make sure your releases go without any hiccups:
Main objective and goals for each phase
Action steps and the teams involved
Timeframe to provide clarity and set expectations for all teams
Metrics to observe
Checkpoints to monitor progress and ensure the project stays on track
Best practices for creating the ideal plan
All in all, to have an efficient rollout plan at hand, you can follow these best practices:
Start early
As already mentioned, you need to draw up your plan early, well before the deployment stage. For a successful feature launch, you should start working on your rollout plan as soon as the development process kicks off.
Planning a seamless feature rollout could take months so the earlier you start considering all the elements within your plan, the easier it will be to keep your teams aligned and avoid any mishaps along the way.
Be flexible
It’s important that your plan allows for enough flexibility and can be adapted throughout the development process. This means your rollout plan shouldn’t be so rigid that it cannot be updated as priorities and timelines continuously shift throughout the software development lifecycle.
Define a clear rollout strategy
Your rollout plan will revolve around the strategy you adopt to roll out your new features. This means determining how you’ll release them and which type of deployment strategy is best suited to the feature at hand.
For example, should you choose a small group of beta users to opt in to test your product first to collect feedback and optimize your product before going for a full launch? Or is it better to run alpha testing on internal users first before releasing to real-world users?
Alternatively, you may decide to do a progressive rollout using canary deployment where you start with a small percentage of your users then expand the rollout process gradually until it’s released to all your users.
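To illustrate how a percentage-based canary rollout is often implemented, here’s a minimal sketch; the hashing scheme and the 10% starting threshold are assumptions for the example, not any particular vendor’s implementation:

```python
import hashlib

def in_rollout(user_id: str, feature: str, rollout_pct: float) -> bool:
    """Deterministically bucket a user into (or out of) a gradual rollout."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 100  # stable bucket in [0, 99] per user and feature
    return bucket < rollout_pct

# Start the canary at 10% of users, then widen gradually (10 -> 25 -> 50 -> 100)
# as feedback and monitoring confirm the feature is healthy.
print(in_rollout("user-42", "new-checkout", rollout_pct=10))
```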
Set a tentative timeline
Being flexible doesn’t mean having no deadlines. You need to set a rough timeline for your rollout process, with a clear target release date for your team.
Setting a realistic timeline creates accountability by allowing individuals to outline their own responsibilities and build a personal roadmap that defines smaller deadlines leading up to the rollout release.
Set milestones
Setting key milestones in your feature rollout plan is a useful way to keep all stakeholders aligned and in sync throughout the project. With a clearly defined roadmap, they can monitor progress as the software moves from one stage of the rollout to the next.
Keep stakeholders in the loop
As we’ve seen, a feature rollout process requires coordination and collaboration between stakeholders and multiple teams across an organization.
Early on, establish a core team including relevant and key stakeholders from each department to get their input on key decisions in the rollout process and provide them with all the information needed to understand the value of the new feature and to ensure a successful rollout.
Outline an external communication plan
So you’ve developed and released your new feature but how do you make sure that your target users know about your exciting new releases?
You will need to make sure that you set a communication strategy so that customers know your software release is available. This is particularly important when you’re releasing new changes or updates to your features so customers know you’re continuously striving to improve your products.
Afterwards, you will also have to determine how you will collect the feedback you need to iterate on your products throughout the rollout process.
However, as we’ve mentioned in the previous point, make sure that your communication strategy includes all relevant stakeholders, external and internal users, and your customer-facing teams. Clear and consistent communication is required from top management so that teams are aware of and understand the vision and strategy behind any new feature.
Why do you need a feature rollout plan?
One of the biggest advantages of a feature rollout plan is that it allows for enhanced collaboration and communication among teams involved in the feature rollout process.
A rollout plan helps keep teams on the same page and moving towards the same objectives to get your software into the hands of your users. Feature rollouts usually require close collaboration between many teams, not just development teams, so a plan helps keep those different teams aligned around the same end goals.
Furthermore, such a plan gives teams more control over the release process: as new features are gradually introduced, it defines exactly who gets to see each new feature and when.
We also mentioned the importance of identifying potential roadblocks in your feature rollout process. A rollout plan helps you discover and anticipate these roadblocks so you can work on removing them before they interfere with the new feature release. Otherwise, you might only come across them when it’s far too late in the process, significantly delaying your release.
Above all, a rollout plan’s primary purpose is to manage and mitigate potential risk, which includes having a backup plan in case things go awry during the rollout, so that any negative impact on your user base is minimized as much as possible.
Feature flags: The foolproof ingredient for successful rollouts
There are many ways and strategies to roll out new features, one of which includes the use of feature flags.
Feature flags are a powerful software development tool that allows teams to mitigate risk of release by separating code deployment from release.
This means that teams can hide new features behind a flag and turn them on for certain user segments while keeping them switched off for the rest while they monitor performance and impact on KPIs.
Feature flags, therefore, are an essential ingredient in your feature rollout plans for your teams to have more control over their releases and perform gradual rollouts of new features to gather necessary feedback.
There are many deployment and rollout strategies you can use alongside feature flags including A/B testing, canary deployments and blue/green deployments to test new features before committing to a full rollout.
Your release strategy can also be more specific. For example, you can choose to release your feature to users in a certain country while keeping it turned off for everyone else.
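As a simple sketch of what such a country-targeted flag can look like in code (a hypothetical in-house flag check, not a specific vendor’s SDK):

```python
from dataclasses import dataclass, field

@dataclass
class FeatureFlag:
    enabled: bool = False
    allowed_countries: set[str] = field(default_factory=set)  # empty set = everyone

    def is_on(self, user_country: str) -> bool:
        """The feature is shown only if the flag is on and the user matches the segment."""
        if not self.enabled:
            return False
        return not self.allowed_countries or user_country in self.allowed_countries

# The code is deployed everywhere, but the feature is released only to users in France.
new_search = FeatureFlag(enabled=True, allowed_countries={"FR"})

if new_search.is_on(user_country="FR"):
    print("render the new search experience")
else:
    print("render the current experience")
```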
Feature rollout is not a one-time event. Rather, it’s a continuous process that many teams will need to partake in.
For that reason, releasing and implementing new features can be very stressful. There are a lot of elements and risks involved in the process, which means having a clear plan in place can make it much easier.
A well-designed plan is key to providing a structured framework, or blueprint, for planning and executing the rollout process efficiently, and it’s an indispensable tool for successful implementation and coordination among teams.
Ultimately, the success of any project will depend on how well cross-functional teams work together towards shared objectives by communicating, defining clear goals, adapting quickly to changes as they occur while staying motivated and productive.
Metrics play an essential role in measuring performance and influencing decision-making.
However, relying on certain metrics alone can lead you to misguided conclusions and poor strategic choices. Potentially misleading metrics are often referred to as “pitfall metrics” in the world of Conversion Rate Optimization (CRO).
Pitfall metrics are data indicators that can give you a distorted version of reality or an incomplete view of your performance if analyzed in isolation. Pitfall metrics can even cause you to backtrack in your performance if you’re not careful about how you evaluate these metrics.
Metrics are typically split into two categories:
Session metrics: Any metrics that are measured on a session instead of a visitor basis
Count metrics: Metrics that count events (for instance number of pages viewed)
Some metrics can fall into both categories. Needless to say, that’s the worst option, for a few main reasons: no real statistical model exists for metrics that mix both categories, there is no direct or simple link to business objectives, and these metrics may not lend themselves to standard optimization.
While metrics are very valuable for business decisions, it’s crucial to use them wisely and be mindful of potential pitfalls in your data collection and analysis. In this article, we will explore and explain why some metrics are unwise to use in practice in CRO.
Session-based metrics vs. visitor-based metrics
One problem with session-based metrics is that “power users” (AKA users returning for multiple sessions during the experimentation) will lead to a bias with the results.
Let’s remember that during experimentation, the traffic split between the variations is a random process.
Typically, you think of a traffic split as producing random but very even groups. When we talk about big groups of users, this is generally true. However, when you consider a small group, it’s very unlikely that you will get an even split in terms of visitor behaviors, intentions and types.
Let’s say that you have 12 power users who need to be randomly divided between two variations, and that these power users have 10x more sessions than the average user. It’s quite likely that you will end up with a 4-and-8 split, a 2-and-10 split, or some other uneven split – an exactly even split is the exception rather than the rule (the quick simulation below bears this out). You will then end up in one of two very likely situations:
Situation 1: A handful of power users may make you believe you have a winning variation when none actually exists
Situation 2: The winning variation is masked because it received too few of these power users
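A quick simulation makes the point. This sketch assumes a plain 50/50 random assignment of the 12 power users:

```python
import random
from collections import Counter

random.seed(0)
runs = 10_000
splits = Counter()
for _ in range(runs):
    in_a = sum(random.random() < 0.5 for _ in range(12))  # power users landing in variation A
    splits[in_a] += 1

even = splits[6] / runs
print(f"exactly even 6/6 split: {even:.0%}")      # roughly 23% of the time
print(f"uneven split:           {1 - even:.0%}")  # roughly 77% of the time
```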
Another problem with session-based metrics is that a session-based approach blurs the meaning of important metrics like transaction rates. The recurring problem here is that not all visitors display the same type of behavior. If average buyers need 3 sessions to make a purchase while some need 10, this is a difference in user behavior and does not have anything to do with your variation. If your slow buyers are not evenly split between the variations, then you will see a discrepancy in the transaction rate that doesn’t actually exist.
Moreover, the metric itself loses part of its intuitive meaning. If your real conversion rate per unique visitor is around 3%, counting by session instead will likely show something closer to a 1% conversion rate.
This is not only disappointing but very confusing.
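The arithmetic behind the drop is simple: the number of conversions stays the same while the denominator grows with every extra session. A minimal illustration, assuming buyers average three sessions each:

```python
visitors = 10_000
conversions = 300                # 3% of unique visitors end up buying
sessions_per_visitor = 3         # assumed average for this illustration
sessions = visitors * sessions_per_visitor

print(conversions / visitors)    # 0.03 -> 3% conversion rate per unique visitor
print(conversions / sessions)    # 0.01 -> the same activity reads as a 1% rate per session
```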
Imagine a variation urging visitors to buy sooner by using “stress marketing” techniques. Let’s say this leads to a one-session purchase instead of three sessions. You will see a huge gain (3x) on the conversion rate per session. BUT this “gain” is not an actual gain, since the number of conversions – and therefore the revenue earned – stays the same. It’s also good to keep in mind that visitors under pressure may not feel very happy or comfortable with such a quick purchase and may not return.
It’s best practice to avoid using session-based metrics unless you don’t have another choice as they can be very misleading.
Understanding count metrics
We will come back to our comparison of these two types of metrics. But for now, let’s get on the same page about “count metrics.” To understand why count metrics are harder to assess, you need to have more context on how to measure accuracy and where exactly the measure comes from.
To model the accuracy of a rate measurement, we use a beta distribution. In the graph below, we see the measure of two conversion rates – one blue and one orange. The X-axis is the rate and the Y-axis is the likelihood. When trying to measure the probability that the two rates are different, we implicitly look at the part of the two curves that overlaps.
In this case, the two curves have very little overlap. Therefore, the probability that these two rates are actually different is quite high.
The more narrow or compact the distribution is, the easier it is to see that they’re different.
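For readers who want to reproduce this kind of comparison, here’s a small sketch using SciPy; the visitor and conversion counts are invented for the example:

```python
import numpy as np
from scipy import stats

# Two observed conversion rates modeled as beta distributions:
# variation A: 100 conversions out of 1,000 visitors; variation B: 130 out of 1,000.
a = stats.beta(100 + 1, 900 + 1)
b = stats.beta(130 + 1, 870 + 1)

x = np.linspace(0, 0.25, 1000)
dx = x[1] - x[0]
overlap = float(np.sum(np.minimum(a.pdf(x), b.pdf(x))) * dx)  # shared area under both curves
print(f"overlapping area: {overlap:.2f}")  # a small overlap -> the two rates are very likely different
```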
The fundamental difference between conversion and count distributions
Conversion metrics are bounded in [0, 1] as a rate, or [0%, 100%] as a percentage. But for count metrics the range is open: counts lie in [0, +infinity).
The following figure shows a gamma distribution (in orange) that may be used with this kind of data, along with a beta distribution (in blue).
These two distributions are based on the same data: 10 visitors and 5 successes. When considering unique conversions, this is a 0.5 success rate (or 50%). In the context of multiple conversions, it’s a process with an average of 0.5 conversions per visitor.
Notice that the orange curve (for the count metric) is non-zero above x = 1; this clearly shows that it expects there will sometimes be more than 1 conversion per visitor.
We will see that comparisons between this kind of metric depend on whether we consider it as a count metric or as a rate. There are two options:
Either we consider that the process is a conversion process, using a beta distribution (in blue), which is naturally bounded in [0, 1].
Or we consider that the process is a count process, using a gamma distribution (in orange), which is not bounded on the right side.
On the graph, we see an inherent property of count data distributions: they are asymmetric. The right tail goes to 0 more slowly than the left part, which naturally makes the distribution more spread out than the beta distribution.
Since both curves are distributions, the area under each curve must be 1.
As you can see, the beta distribution (in blue) has a higher peak than the gamma distribution (in orange), which shows that the gamma distribution is more spread out. This is a hint that count distributions are harder to estimate accurately than conversion distributions, and it’s also why we need more visitors to assess a difference when using count metrics than when using conversion metrics.
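To make the comparison concrete, here’s a hedged sketch of the two models fitted to the same data (10 visitors, 5 conversions); the exact parameterization shown is one reasonable choice among several:

```python
from scipy import stats

# Same data for both models: 10 visitors, 5 conversions (0.5 conversions per visitor).
conversion_model = stats.beta(5, 5)           # rate bounded in [0, 1]
count_model = stats.gamma(a=5, scale=1 / 10)  # rate bounded only on the left, in [0, +inf)

print(conversion_model.mean(), conversion_model.std())  # ~0.50, ~0.15
print(count_model.mean(), count_model.std())            # ~0.50, ~0.22 -> visibly more spread out
```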
To understand this problem you have to imagine two gamma distribution curves, one for each variation of an experiment. Then, gradually shift one on the right, showing an increasing difference between the two distributions. (see figure below)
Since both curves are right-skewed, the overlap region will occur on at least one of the skewed parts of the distributions.
This means that differences will be harder to assess with count data than with conversion data. This comes from the fact that count data works on an open range, whereas conversion rates work on a closed range.
Do count metrics need more visitors to get accurate results?
It’s more complex than that in the CRO context: more visitors alone won’t solve the problem, because typical statistical tests for count metrics are not suited to CRO in practice.
Most of these tests come from the industrial world. A classic usage of count metrics is counting the number of failures of a machine in a given timeframe. In this context, the risk of failure doesn’t depend on previous events. If a machine already had one failure and has been repaired, the chance of a second failure is considered to be the same.
This hypothesis is not suited to the number of pages viewed by a visitor. In reality, if a visitor has already seen two pages, there’s a higher chance that they will see a third page than for a visitor who has only seen one page (and who therefore has a high probability of “bouncing”).
The industrial model does not fit in the CRO context since it deals with human behavior, making it much more complex.
Not all conversions have the same value
The next CRO struggle also comes from the direct exploitation of formulas from the industrial world.
If you run a plant that produces goods with machines, and you test a new kind of machine that produces more goods per day on average, you will conclude that these new machines are a good investment. Because the value of a machine is linear with its average production, each extra product adds the same value to the business.
But this is not the same in CRO.
Imagine this experiment result for a media company:
Variation B yields an extra 1,000 page views compared to the original, A. Based on that data, you put variation B into production. Now let’s say that variation B lost 500 people who each saw 2 pages, and won 20 people who each saw 100 pages. That makes a net benefit of 1,000 page views for variation B (20 × 100 − 500 × 2).
But what about the value? These 20 people, even if they spent a lot of time on the site, may not be worth the same as 500 people who come back regularly.
In CRO, each extra unit added to a count metric does not carry the same value, so you cannot treat a measured increment as direct added value.
In applied statistics, one adds an extra layer to the analysis: a utility function that links extra counts to value. This utility function is very specific to the problem and is unknown for most CRO problems. So even if you get more conversions in a count metric context, you are unsure about the real value of this gain (if any).
Some count metrics are not meant to be optimized
Let’s see some examples where raising the number of a count metric might not be a good thing:
Page views: If the count of page views rises, you can think it’s a good thing because people are seeing more of your products. However, you can also think that people get lost and need to browse more pages to find what they need.
Items added to cart: We have the same idea for the number of products added to the cart. If you do not check how many products remain in the cart at the checkout stage, you don’t know if the variation helps to sell more or if it just makes the product selection harder.
Products purchased: Even the number of products purchased may be misleading if used alone as a business objective in an optimization context. Visitors could be buying two cheaper products instead of one high-quality (and more expensive) product.
You can’t tell just by looking at these KPIs if your variation or change is good for your business or not. There is more that needs to be considered when looking at these numbers.
How do we use this count data then?
We’ve seen in this article how counterintuitive session-based optimization is and, even worse, how misleading count metrics can be in CRO.
Unless you have both business and statistics expertise on hand, it’s best practice to avoid them, at least as a sole KPI.
As a workaround, you can use several conversion metrics with specific thresholds, set using business knowledge. For instance (a minimal sketch follows below):
Use one conversion metric for counts in the range [1, 5], called “light users.”
Use another conversion metric for the range [6, 10], called “medium users.”
Use another one for the range [11, +infinity), called “heavy users.”
Splitting up the conversion metrics in this way will give you a clearer signal about where you gain or lose conversions.
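Here’s a minimal sketch of that bucketing; the thresholds are placeholders that should really come from your own business knowledge:

```python
def engagement_bucket(page_views: int) -> str:
    """Turn a raw count into one of three conversion-style segments."""
    if page_views == 0:
        return "none"
    if page_views <= 5:
        return "light"
    if page_views <= 10:
        return "medium"
    return "heavy"

# Each visitor now contributes at most one conversion to exactly one bucket,
# so each bucket can be analyzed as a plain conversion rate per variation.
visitors = {"v1": 2, "v2": 7, "v3": 31}
print({vid: engagement_bucket(n) for vid, n in visitors.items()})
# {'v1': 'light', 'v2': 'medium', 'v3': 'heavy'}
```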
Another piece of advice is to use several KPIs to have a broader view.
For instance, although analyzing the product views alone is not a good idea – you can check the overall conversion rate and average order value at the same time. If product views and conversion KPIs are going up and the average order value is stable or goes up, then you can conclude that your new product page layout is a success.
Counterintuitive Metrics in CRO
Now you see that, except for conversions counted on a unique-visitor basis, nearly all other metrics can be very counterintuitive to use in CRO. Mistakes can happen because the statistics work differently, and also because these metrics and their evolutions can be interpreted in several ways.
It’s important to understand that CRO skill is a mix of statistics, business and UX knowledge. Since it’s very rare to have all this within one person, the key is to have the needed skills spread across a team with good communication.
We are once again thrilled to share that as a continued part of our strategy to optimize how you access AB Tasty’s platform of experimentation and personalization solutions, Epoq by AB Tasty is being streamlined to join the AB Tasty brand and website.
AB Tasty’s acquisition of Epoq in October 2022 realized a shared vision of empowering digital teams to deliver relevant and engaging shopping experiences along the consumer journey and brought search and product recommendations to our best-in-class experience optimization platform.
Placing Epoq within the AB Tasty brand represents an exciting next step for AB Tasty as we consolidate all our solutions under one place and one name.
The AB Tasty and Epoq websites are now one. All resources and landing pages previously hosted on Epoq’s website (epoq.de) can be found in one location on the AB Tasty website (abtasty.com).
If you have questions about what this means for you, you’ve come to the right place. Below we will dig into what is changing, helpful links and resources and some general FAQs.
As always, our team of AB Tasty magic makers are available to answer any additional questions that might pop up along the way. If you have any more questions after reading this, don’t hesitate to send us an email at hello@abtasty.com and we will update this page as needed.
How are AB Tasty and Epoq related?
AB Tasty acquired AI-powered personalization provider Epoq, ushering in a new era of experience optimization. Through recommendations and intelligent search, the acquisition expands AB Tasty’s best-in-class offering to provide relevant and engaging customer experiences, simplifying access for digital teams (from marketing to product to technology) with a single platform that delivers a 360-degree view to further optimize the digital customer experience.
What do you mean when you say merge? Will the Epoq website be gone for good?
By merging we mean all content around our leading Experience Optimization Platform will be available on one website. The Epoq website will no longer be available but all the search and recommendation content you have come to love will not disappear and will continue to be available on the AB Tasty website. New articles and insights to help you build your 1-1 personalization strategies will continue to be added.
Why are we merging the Epoq and AB Tasty websites?
The website merge aims to make it easier for everyone to access all the information around AB Tasty’s EOP solutions in one place, including content around the products, technology and impact. It groups the combined knowledge of Epoq’s and AB Tasty’s experts into one resource hub, giving marketing and product teams best practices and insights into experimentation and personalization strategy.
What will happen to all the resources (blog posts, guides, e-books, etc.) on epoq.de?
Epoq’s resources section will be moved to the AB Tasty website. All the Epoq content will be redirected to help customers find the content quickly and easily.
How can I log into Epoq? And where can I access the documentation?
You can log into the Epoq Control Desk, the new AB Tasty Search & Recommendation Workspace, through a link on the AB Tasty website in the upper right corner. The documentation can be accessed through the menu of the workspace called “Developer Documentation” where you will be forwarded to our knowledge base.
Will there be any changes to the products or services offered?
The products and services themselves will not change. The joint product range will become a unique platform for optimizing the digital customer experience, offering our clients even more opportunities to differentiate and stand out in the market.
How will the merger affect customers who are new to Epoq? Where can I sign up for a demo for AB Tasty’s intelligent site search and recommendation solution?
If you’re new and you’d like to try out AB Tasty Search or Recommendations, click the banner below or click the “Get a demo” button on the top right-hand corner of the page to explore how AI-powered 1:1 personalization can help you deliver memorable digital experiences.
Have any additional questions about Epoq and AB Tasty? Send us an email at hello@abtasty.com to let us know and stay tuned for more exciting updates and information still to come!
We invited Oliver Walker from our partner Hookflash to talk us through the practical ways you can use GA4 with your experimentation.
Although many people talk about GA4 as a different platform from the previous version (Universal Analytics), conceptually it lets you do largely the same things. Its primary functions are to help you understand and optimize your media; to understand and optimize your website; and to understand and segment your website visitors into audiences. However, GA4 has several features that can really help you power an experimentation program.
Here we’ll outline how to use GA4 to its full potential to drive results for your testing program.
Understanding User Behavior
At its core, Google Analytics has always been great at helping website owners understand their website traffic: where visitors started their journey, where they ended it, and whether they sought help halfway through. What we already know about GA4 is that it’s not the most intuitive tool in the world, so here are some quick tips on that front:
Landing Pages – use Explorations – although there is a default report for landing pages… it’s not the best. Not just because there’s a known bug resulting in an empty row, but also because it doesn’t have the most useful metrics, e.g. bounce rate or engagement rate. If you build a report in Explorations, you can use a different dimension (called “Landing page + query string”) and choose the metrics you’d find useful.
Exit rate – similar to the above, you no longer get Exits (or Exit Rate) in the default Pages and Screens report. Again, rebuilding the report in Explorations gives you both the ability to add Exits as a metric and to choose your preferred pages dimension. The default dimension in the Pages and Screens report does not include query strings, but if you’d prefer to use one that does, choose the dimension “Page path + query string”.
Site search – and finally, where has the Site Search report gone?! There’s no longer a default report for this, but you can rebuild it in Explorations. You can see which search terms were looked for most often by building an Exploration with the dimension “Search term” and the metric “Event count”.
Understanding User Flow
What Universal Analytics was not particularly good at is visualizing how people traverse a website. The flow reports were horribly sampled and merely teased you as to what you could have had. GA4 has on-the-fly path exploration reports that can be used and tweaked very flexibly. You can find these within Explorations too – just choose Path Exploration and then tweak, as per the following:
Get the pages view – for some unhelpful reason, the default view within each step is Event Name. In the visualization, click the drop-down underneath Step +1 and change Event Name to your preferred page dimension to get a view of how users move from page to page.
Double-click the page you are interested in to see where users go next. You can also click the “+15 more” (or whichever number) link at the bottom of each column to get the longer tail.
Choosing a dimension to “break down” by lets you easily compare routes through the site for different users, for example mobile vs. desktop, or for each of the different browsers. Likewise, you can use segments here to review a certain audience type, e.g. non-UK traffic or Purchasers.
Audience targeting & triggers
Speaking of audiences, this was always a great feature of Universal Analytics, and when Google Optimize was in its pomp, the ability to share audiences from UA to Optimize was one of its prime features. With GA4 you get the same ability to build audiences and share them natively with other Google Marketing Platform (GMP) products, plus some neat additional elements:
The ability to use user behavior to trigger new types of goals. For example, if you’re a publisher and you want people to read a certain number of articles in a particular time frame, it’s possible to create an audience for this and then have that behavior trigger a new event. These are called audience triggers. And they become a powerful new metric with which to optimize your testing campaigns, by importing that conversion into your chosen testing tool.
This is generally a great leap forward, as GA4 also has the concept of users being added to, and removed from, audience groups – a feature most testing tools don’t have.
Advanced analysis using BigQuery
The final area where GA4 really steps forward beyond its predecessor is that all GA4 accounts have a native integration with Google BigQuery. Whilst the integration itself is free, it’s worth noting that you do incur costs by storing or processing data in BigQuery, although a good partner will be able to advise on what that might look like for you.
So where does BigQuery help? The data schema provided by integrating GA4 and BigQuery is raw-level data – that means each row is effectively an event, with a time stamp, and all the associated parameters. It lets you have a greater degree of flexibility over what you analyze, provided you’re able to query the data (using SQL, or your friendly AI-driven chat tool.) For example:
If you want to understand how long it takes a user to complete a particular flow or set of actions. It’s worth noting that Google Analytics does batch events, so this isn’t perfect, but it’s easier than within the interface.
If you want to look at user flows at an even greater level of detail, for example, how users traverse through the site having landed at a particular page
If you want to stitch together any data that GA didn’t capture but that also exists in Google Cloud, e.g. following a lead to submission through to outcome.
If you want to conduct a deeper analysis within your post-experiment analyses. All testing platforms will pass events and parameters to denote whether a user was part of an experiment and the variation they saw, so GA4 is a powerful additional tool for deep-diving into results – a minimal query sketch follows below.
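As an example of what such a post-experiment query might look like on the GA4 BigQuery export (a sketch only: the project, dataset, event name and parameter key are assumptions that depend on how your testing tool pushes data into GA4):

```python
from google.cloud import bigquery

client = bigquery.Client()

query = """
SELECT
  (SELECT value.string_value
     FROM UNNEST(event_params)
     WHERE key = 'variation_name') AS variation      -- parameter key is an assumption
  , COUNT(DISTINCT user_pseudo_id) AS exposed_users
FROM `my-project.analytics_123456.events_*`          -- replace with your GA4 export table
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240131'
  AND event_name = 'experiment_impression'           -- event name is an assumption
GROUP BY variation
"""

for row in client.query(query).result():
    print(row.variation, row.exposed_users)
```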
It’s not all doom and gloom
Yup, GA4 does have some limitations; it’s a big change to a tool that lots of people loved, and it’s hard to pick up. BUT when you start to understand certain concepts and familiarize yourself with its capabilities, there are lots of features to help you with your experimentation program.
Every visitor shopping online wants to find a product that precisely meets their expectations quickly and efficiently. To achieve this, you can offer your potential customers purchasing advice to guide them throughout their buying journey.
In this article, you will discover the different forms of virtual shopping assistants available in e-commerce and the advantages they bring to you and your customers.
What are virtual shopping assistants?
Virtual shopping assistants, enabled by shopping engagement software, provide your shoppers with support in their product selection through an interactive and personalized exchange. By asking precise questions, your customers can find products that align with their wishes and needs more quickly.
This approach is based on the purchase advice provided in brick-and-mortar retail, aiming to overcome the impersonal components of online shops and enhance the individual user experience.
How do virtual shopping assistants differ from faceted search?
With faceted search, your customers can filter their search results in the online shop to view the products that interest them. For example, when searching through an e-commerce apparel shop they can use faceted navigation to select features, such as women’s blue capris in size 40, providing a user-friendly experience.
However, customers need to already know exactly what they want to buy to filter accordingly. If a customer is uncertain about their purchase or unsure about the specific product features they desire, they require support in the form of virtual shopping assistants.
What kinds of virtual shopping assistants are available?
There are various formats of virtual shopping assistants in e-commerce that can be integrated at different points of the customer journey. Let’s take a closer look at two categories: person-to-person communication tools and automated tools that can handle multiple customer inquiries in real time.
Virtual shopping assistants with human-to-human communication
Below, we present two examples of virtual shopping assistants that utilize human-to-human communication:
Live chat
Live chat is a messenger tool that allows your customers to directly contact an employee of your online shop. Typically integrated as a pop-up window on the company website, it facilitates one-to-one communication, resembling the experience of brick-and-mortar retail.
Video consultation
Video consultation is a rising trend in the e-commerce industry.
Customers visiting your e-commerce site may still be exploring their needs, making phone, chat or email interactions insufficient. With video consulting, customers can engage in face-to-face conversations with an employee of your online shop, ask questions, and receive individual advice on your products and processes.
For instance, customers can share their screens and present their ideas and inspiration to the sales representative, leading to a more targeted sales pitch. This combination of online shopping with personalized attention replicates the experience of boutique purchases and ultimately boosts customer loyalty and satisfaction.
The advantage: Your customers receive immediate, personalized answers to their questions about products and processes while they browse your shop. Especially for complex products that require explanation, customer-oriented live chat can positively influence purchase decisions. Additionally, you can offer appointments for individual purchase advice.
Virtual shopping assistants with AI-based tools
Now, let’s explore two examples of online consulting software that utilize AI-based tools for real-time interactions with multiple customers at once.
AI-based chatbots
Chatbots using artificial intelligence can respond to hundreds of customer inquiries simultaneously and in real time.
With the emergence of large language model chatbots such as OpenAI’s ChatGPT and Google’s Bard, brands have the potential to revolutionize how they engage with their customers online.
Depending on how the tool is programmed, it can recognize natural language, generate suitable answers from text blocks and databases on your website, and even escalate queries to a human employee if necessary. This enables personnel-friendly automation of various processes.
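As a toy sketch of that answer-or-escalate logic (keyword matching stands in for the natural-language understanding a real chatbot would get from a large language model or intent classifier):

```python
FAQ = {
    "shipping": "Standard delivery takes 2-4 business days.",
    "return": "You can return any item within 30 days.",
}

def answer(question: str) -> str:
    """Answer from the FAQ knowledge base, or escalate to a human employee."""
    q = question.lower()
    for topic, reply in FAQ.items():
        if topic in q:
            return reply
    return "Let me connect you with one of our advisors."  # escalation path

print(answer("How long does shipping take?"))
print(answer("Can I pay with a gift card?"))  # no match -> hand over to a human
```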
Guided Selling
Guided Selling involves guiding your customers through the product selection process to facilitate a confident purchase decision. This is particularly useful for potential buyers who may not possess enough knowledge about the products to make an informed choice.
For instance, when it comes to purchasing a stroller, expectant parents can feel overwhelmed by the countless models available. Guided Selling assists them in narrowing down the selection through targeted questions, leading to the ideal stroller. This can be seen in the example from babymarkt.de, who uses Guided Selling from AB Tasty to provide better shopping experiences for their customers.
This form of assistance, where a customer is guided step-by-step through the consultation process based on specific questions, is especially suitable for products that require explanation and mirrors the experience of a sales pitch in brick-and-mortar retail. Guided Selling can also be used for self-explanatory products, where customers can find the right product selection by selecting certain tags.
What makes Guided Selling special is that the results can be personalized to display suitable products based on the individual click and buying behavior of your customer. This ensures that your customer receives not only products that match their desired features and requirements but also their unique preferences.
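As a rough illustration of how guided questions and click behavior could come together, here is a minimal sketch in TypeScript; the product catalog, attributes and scoring are invented for the example.

```typescript
// Minimal sketch: narrow a catalog with answers to guided questions,
// then rank the remaining products by the shopper's past click behavior.
// Product data, attributes and scoring are purely illustrative.

interface Product {
  name: string;
  attributes: string[]; // e.g. "lightweight", "foldable", "twin"
}

const catalog: Product[] = [
  { name: "City Stroller", attributes: ["lightweight", "foldable"] },
  { name: "All-Terrain Stroller", attributes: ["foldable", "off-road"] },
  { name: "Twin Stroller", attributes: ["twin", "foldable"] },
];

// Answers to the guided questions, expressed as required attributes.
function narrowSelection(products: Product[], requiredAttributes: string[]): Product[] {
  return products.filter((product) =>
    requiredAttributes.every((attr) => product.attributes.includes(attr))
  );
}

// Rank products by how many of their attributes the shopper clicked on before.
function personalize(products: Product[], clickedAttributes: string[]): Product[] {
  const score = (product: Product) =>
    product.attributes.filter((attr) => clickedAttributes.includes(attr)).length;
  return [...products].sort((a, b) => score(b) - score(a));
}

const shortlist = narrowSelection(catalog, ["foldable"]);
console.log(personalize(shortlist, ["lightweight"]).map((p) => p.name));
// City Stroller comes first because the shopper previously browsed lightweight models.
```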
Why is good customer engagement important in e-commerce?
Customers who feel well-advised are happy to come back. This applies to both brick-and-mortar stores and e-commerce shops. In addition, there are other reasons for using shopping engagement software like virtual shopping assistants.
Personalized shopping experience
When potential buyers walk into a brick-and-mortar store, they can approach the on-site sales consultants to find the right product.
By integrating this service into your online shop in the form of live chats, video advice or Guided Selling, you enable your customers to recreate the feeling of an interactive, personalized shopping experience.
Shoppers become customers
Virtual shopping assistants help you convert potential buyers into customers. By putting customers in direct contact with your team or catalog, they get answers to their questions that can positively influence their purchase decision.
For very personal products such as mattresses, a virtual shopping assistant tool helps visitors to find the one that exactly meets their needs from the multitude of models.
A better user experience
Your visitors appreciate positive experiences throughout their customer journey.
Support through virtual shopping assistants gives them a secure feeling when choosing a product and more frequently leads to a purchase decision. In addition, virtual shopping assistants make shopping easier: You present your customers with suitable solutions, they feel understood and the positive user experience is anchored in their memory.
Higher conversion
With virtual shopping assistants and shopper engagement software, you can reduce lost sales opportunities and thus increase your conversions. Sometimes potential buyers leave a shop because they didn’t find a product that is actually there. If they can easily ask a sales representative about the product via live chat, it will improve their shopping experience.
Your potential customers have already added products to their shopping cart, so why are they abandoning the checkout process? One possible reason: They had a question about a process that was not answered quickly enough. With an AI-based chatbot available during the checkout, these questions can be solved quickly and efficiently.
Higher customer satisfaction
The personalized service of a virtual shopping assistant creates an intimate atmosphere – a 1:1 exchange reminiscent of brick-and-mortar experiences. This not only strengthens potential buyers’ trust in your company but also their satisfaction. And satisfied customers turn into loyal customers.
Fewer Returns
Implementing virtual shopping assistants in your shop reduces the risk of returns. The two most common reasons for returns are that the product didn't fit or that the customer didn't like it.
With personal, targeted advice, you can help your customers to choose the right products that meet their wishes and needs as precisely as possible. This reduces your costs and makes your returns management easier.
Conclusion: Virtual shopping assistants make e-commerce more human
Virtual shopping assistants are a must-have in e-commerce. They offer advantages for you as an e-commerce marketer as well as for your customers.
Live chats or chatbots, video advice and Guided Selling make it easier for potential buyers to select a product and improve their user experience. In a 1:1 exchange, they receive personalized answers to their questions – the online shop becomes more human. At the same time, you benefit from higher customer loyalty and fewer returns, which means you can increase your sales.
The concept of feature flags is straightforward and easy to implement, at least at first. In the beginning, you usually manage a single flag by modifying a configuration file, but once you start using multiple flags they become harder to manage and harder to keep in sync across different teams and functions.
Undoubtedly, feature flags become increasingly important as engineering and product teams begin to see their benefits. By separating code deployment from feature release, teams can now deliver new functionalities to users safely and quickly.
Feature flags are also extremely versatile and their uses can extend to a number of different scenarios to achieve various tasks by all teams across your organization. As feature flags help developers release faster with lower risk, it makes sense that teams would want to extend their usage across these additional use cases.
We can look at feature flag implementation as a journey: it starts with one simple use case and then evolves into more advanced implementations involving different stakeholders. This article will illustrate that journey by walking through the different use cases of feature flags, from simple to more complex, and help you consider whether building or buying a feature flag management system best fits your goals.
Are you looking for a feature flagging solution packed full of features with an easy-to-use dashboard? AB Tasty is the all-in-one feature flagging, rollout, experimentation and personalization solution that empowers you to create a richer digital experience — fast.
The value of feature flags
Before we go deeper into the build vs buy topic, it’s important to highlight exactly why you need feature flags in your daily workflows and the value they can bring to your teams.
As we’ve mentioned, feature flags can be used across a range of use cases. Here’s a quick overview of when feature flags are especially handy:
User targeting and feature testing: When you have a new feature but you’re not yet ready for a big bang release; instead, you want to have the control to target who sees this new feature to collect necessary feedback for optimization purposes.
Testing in production: When you want to test in production by gradually rolling out a new feature or change to validate it.
Kill switch: When you want to have the ability to quickly roll back a feature in case anything goes wrong and turn it off while the issue is being fixed.
This means that feature flags are a great way to continuously (and progressively) roll out releases with minimal risk by controlling who gets access to your releases and when.
The journey begins with a simple step: if/else statements
A feature flag in a code is essentially an IF statement. Here is a very straightforward, basic example:
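Below is a minimal sketch in TypeScript; the flag name and the two checkout functions are purely illustrative.

```typescript
// A feature flag is, at its core, just a conditional around two code paths.
const enableNewCheckout = true; // the flag value, hard-coded here for simplicity

if (enableNewCheckout) {
  renderNewCheckout(); // new code path, shown when the flag is on
} else {
  renderOldCheckout(); // existing code path, shown when the flag is off
}

function renderNewCheckout(): void {
  console.log("Showing the new checkout flow");
}

function renderOldCheckout(): void {
  console.log("Showing the existing checkout flow");
}
```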
You can start off with a simple if/else statement, which is usually fine for short-lived flags, but less so if you plan to keep the flag around for a long time or use it for more advanced use cases that require more sophistication. Feature flags have evolved beyond a single use case and can serve a variety of purposes. Inserting a few IF statements is easy; it's maintaining a feature flag management system that's hard work, requiring time, resources and commitment.
You can implement a feature flag by reading from a config file in order to control which code path will be exposed to your subset of users. Using a config file at the beginning may seem like a viable solution, but in the long term it may not be so practical, resulting in technical debt that accumulates over time.
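As a rough sketch of that config-file approach, assuming a small JSON file (the file name and flag key are invented for the example):

```typescript
// flags.json (checked into the repo or edited on the server):
// { "enableNewCheckout": false }

import { readFileSync } from "fs";

interface FlagConfig {
  enableNewCheckout: boolean;
}

// Read the flag values from a config file so they can be changed
// without touching the application code itself.
const flags: FlagConfig = JSON.parse(readFileSync("flags.json", "utf-8"));

if (flags.enableNewCheckout) {
  console.log("New checkout exposed to this user");
} else {
  console.log("Existing checkout shown");
}
```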
At that point, a simple flagging solution will no longer suffice and you'll need a more advanced one. Implementing such a solution in-house can be quite costly and requires a lot of maintenance, which is why many teams turn to a third-party option.
Bumps along the road: Evolving use cases
When you’re just starting out, you’ll implement a feature flag from a config file with an easy on/off toggle to test and roll out new features. Sounds simple enough. Then, one flag turns into 10 or 20 and as you keep adding to these flags leading to the aforementioned technical debt issue as it becomes harder to pinpoint which of your active feature flags need to be removed. In this case, a proactive approach to managing your flags is essential in the form of a comprehensive feature flag management system.
At the start of your feature flag journey, you may be covering a single use case, such as experimentation through release management, but over time, once you've seen first-hand the difference feature flags make to your releases, you may want to apply them across a variety of use cases.
Test in production
You may, for example, want to test in production but only internally, so you expose the feature to people within your organization. You may also use feature flags to manage entitlements, meaning that only a small subset of users can access your feature, such as users with a premium subscription to your product or service. These types of flags are referred to as permission toggles, and they require a system that can handle different levels of permissions for different users.
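A minimal sketch of such a permission toggle might look like this; the user shape, plan names and feature are invented for illustration.

```typescript
// Permission toggle sketch: the flag's state depends on the current user,
// here on whether they have a premium subscription. Types are illustrative.

interface User {
  id: string;
  plan: "free" | "premium";
  isInternal: boolean; // e.g. an employee testing in production
}

function canSeeBetaDashboard(user: User): boolean {
  // Expose the feature to internal testers and paying customers only.
  return user.isInternal || user.plan === "premium";
}

const visitor: User = { id: "u-42", plan: "premium", isInternal: false };

if (canSeeBetaDashboard(visitor)) {
  console.log("Rendering the beta dashboard");
} else {
  console.log("Rendering the standard dashboard");
}
```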
To be able to carry out such controlled roll-outs, your feature flagging system should enable you to make such context-specific flagging decisions, for example, for carrying out A/B tests.
For example, you might want to expose your feature to 5, 10 or 15% of your users, or test it on users from a certain region. A good feature management system provides the means to take such specific contexts into account when making flagging decisions. That context can include additional information about the user, such as the server handling the request or the geographic market the request is linked to.
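To illustrate what a context-specific flagging decision can look like, here is a minimal sketch combining a percentage rollout with a region filter; the hashing scheme, flag key and region list are illustrative rather than how any particular vendor implements it.

```typescript
// Sketch of a context-aware flag decision: a feature is exposed to a
// percentage of users in selected regions. Hashing and regions are illustrative.
import { createHash } from "crypto";

interface FlagContext {
  userId: string;
  region: string; // e.g. taken from the request's geo lookup
}

// Map a user ID deterministically onto 0-99 so the same user
// always falls into the same bucket across requests.
function bucketFor(userId: string, flagKey: string): number {
  const digest = createHash("md5").update(`${flagKey}:${userId}`).digest();
  return digest.readUInt32BE(0) % 100;
}

function isFeatureEnabled(ctx: FlagContext): boolean {
  const rolloutPercentage = 15; // expose to 15% of eligible users
  const eligibleRegions = ["FR", "DE", "UK"];
  if (!eligibleRegions.includes(ctx.region)) return false;
  return bucketFor(ctx.userId, "new-search") < rolloutPercentage;
}

console.log(isFeatureEnabled({ userId: "u-42", region: "FR" }));
```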
As a result, feature flags allow you to choose who you want to release your feature to, so the new code can be targeted at the specific group of users whose feedback you need. This requires a system that allows you to run feature experimentation on those users and attach KPIs to your releases to monitor their reception. However, some companies may not have the time, resources or experience to collect this kind of rich data.
Kill switches
Feature flags can be used to kill off non-essential features or disable broken features in production. As soon as your team logs an error, the feature can be turned off with the click of a button while the issue is investigated, and turned back on just as easily once it's ready for deployment. This requires a two-way communication pathway between your monitoring tools and the internal flag system, which can be complex to set up and maintain; such kill switches usually call for a mature feature flag platform.
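As a rough sketch of a kill switch, here the flag is read on every request so that turning it off takes effect immediately; the in-memory flag store simply stands in for a real management dashboard or monitoring integration.

```typescript
// Kill switch sketch: the flag is evaluated on every request, so flipping it
// off instantly hides the broken feature while the team investigates.
// The in-memory store stands in for a real flag management dashboard.

const flagStore: Record<string, boolean> = {
  "product-recommendations": true,
};

function isEnabled(flagKey: string): boolean {
  return flagStore[flagKey] ?? false;
}

function renderProductPage(): void {
  if (isEnabled("product-recommendations")) {
    console.log("Rendering recommendations widget");
  } else {
    console.log("Recommendations hidden (kill switch active)");
  }
}

renderProductPage();

// Monitoring logs an error spike, so an operator flips the flag off.
flagStore["product-recommendations"] = false;
renderProductPage();
```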
Feature flag hell
When implementing feature flags, you must continuously be aware of the state of each flag. Otherwise, you can become overwhelmed by the number of flags in your system and lose control of them because you're unable to keep track of and maintain them properly. Things can get complicated fast as you add more code to your codebase, so you need to make sure the system you have in place is equipped to handle and reduce those costs.
You’ve probably already come across the term ‘merge hell’, but there’s also such a thing as ‘feature flag hell’: the point at which you’ve added so many flags that your code becomes a nightmare to reason about.
As mentioned above, you can start off with a simple if/else statement but more sophistication will be needed to implement these more advanced use cases.
It is also important to be able to manage the configuration of your in-house system. Any small configuration change can have a major impact on the production environment. Therefore, your system will need to have access controls, audit logs and custom permissions to restrict who can make changes.
Your system will also need environment-aware configuration that carries a flag’s configuration from one environment to the next. Most systems should support at least two environments, one for development and one for production, each with its own SDK key. You can then control the flag’s value depending on the environment it’s being used in: for example, the flag could be ‘true’ in development but ‘false’ in production.
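A minimal sketch of environment-aware configuration might look like the following; the environment names, SDK keys and flag are invented for the example.

```typescript
// Sketch of environment-aware flag configuration: the same flag can hold
// different values (and use a different SDK key) per environment.
// The structure and key names are illustrative.

type Environment = "development" | "production";

const flagConfig: Record<Environment, { sdkKey: string; enableNewCheckout: boolean }> = {
  development: { sdkKey: "dev-sdk-key", enableNewCheckout: true },
  production: { sdkKey: "prod-sdk-key", enableNewCheckout: false },
};

// Pick the environment from the runtime, defaulting to development.
const env: Environment =
  process.env.NODE_ENV === "production" ? "production" : "development";

const config = flagConfig[env];
console.log(`Using SDK key ${config.sdkKey}`);
console.log(`New checkout enabled: ${config.enableNewCheckout}`);
```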
Having different environments prevents you from accidentally exposing something in production before you are prepared. When you have all these flags across different environments, it becomes harder to keep everyone in sync, which leads us back to the issue of ‘feature flag hell’ if you don’t have the right system in place.
Feature flags categorization
With such varied use cases, it would not make sense to place all feature flags under one category and call it a day. Here we will categorize flags by their longevity and how dynamic they are.
Static vs dynamic
The configuration for some flags will need to be more dynamic than for others. Flipping a toggle can be a simple on/off switch. However, other categories of toggle are more dynamic and require more sophisticated, very context-specific flagging decisions, which are needed for advanced use cases such as A/B testing. For example, permission toggles, usually used for the entitlements mentioned earlier, tend to be the most dynamic type of flag, as their state depends on the current user and is evaluated on a per-user basis.
Long- vs short-lived
We can also categorize flags based on how long their decision logic will remain in the codebase. Some flags are transient in nature, such as release toggles, which can be removed within a few days and whose decision logic can be implemented through a simple if/else statement. Flags that will last longer, such as permission toggles and kill switches, need more maintainable implementation techniques.
It is therefore important that your feature management solution can keep track of all your flags, distinguish one from another, and indicate which flags are no longer needed or in use and should be removed.
Challenges of an in-house system
As use cases grow, so do the challenges of developing an in-house feature flagging system. The challenges organizations face when building such a system include:
Many organizations start out with a basic implementation where the config change for every release needs to be made manually, which is time-consuming. Similarly, when rolling out releases, customer IDs are compiled manually, so keeping track of which features have been rolled out to each user becomes a major challenge.
Most of these manual processes are carried out by the engineering team, so product managers cannot make changes from their end and depend on engineers to make those changes for them.
The preceding point also raises the question of what you want your engineers to devote their time to. Your engineers will need to dedicate a large chunk of their time maintaining your in-house feature flagging tool which could divert their attention from building new features that could drive revenue for your company.
This ties into the lack of a UI that could serve as an audit log, tracking when changes are made and by whom. Without a UI, only engineers can control feature rollouts; product managers cannot run such deployments themselves or see which features are rolled out to which users. A centralized dashboard is needed so that all relevant stakeholders can monitor feature impact.
As mentioned previously, monitoring and cleaning up old flags becomes increasingly difficult as more flags are created. As flag adoption increases, people across your organization will find it harder to track which flags are still active.
Eventually, if your team does not remove these flags from the system, technical debt would become a major issue. Even keeping track of who created which flag and for what purpose could become a problem if the in-house system doesn’t provide such data.
Thus, while the advantages of feature flags are numerous, they can be outweighed by the technical debt you accumulate over time, which will slow you down if you don’t take control of and keep track of your feature flags’ lifecycles.
There are often high costs associated with maintaining such in-house tools, as well as with upgrading them, so over time both these costs and your technical debt accumulate.
Besides the rising costs, building and maintaining a feature flagging system requires ample resources and a high degree of technical expertise as such systems require a solid infrastructure to handle large amounts of data and traffic, which many smaller organizations lack.
Such in-house tools are usually built initially to address one pain point so they have minimal functionality and thus cannot be used widely across teams and lack the scalability required to handle a wide range of uses and functions.
Time spent developing a feature flag solution is time that could have been spent building features for your customers, so you will need to consider how much time you are willing to dedicate to such a system.
On the other hand:
Buying a platform from a third-party vendor can be cost-effective, as you avoid the costs associated with building one. There are still ongoing costs with a bought platform, but with many options out there, companies can find one that suits their needs and budget.
Third-party systems typically come with ongoing support and maintenance from the vendor, including comprehensive documentation, so you don’t have to worry about handling the upkeep yourself or bearing the costs of maintaining the platform for large-scale implementations.
Perhaps one of the biggest advantages of buying a solution is its immediate availability and market readiness as the solution is ready-made with expert support and pre-existing functionalities. Thus, you can save valuable time and your teams can quickly implement feature flags in their daily workflows to accelerate releases and time-to-market.
Time dedicated to building and maintaining your in-house solution could otherwise be spent developing innovative and new revenue-generating features.
Safe landing: How to proceed
To ensure a safe arrival at the final stop of your feature flag journey (depending on why and how you’re using feature flags), you will need to decide whether an in-house or a third-party solution is right for you. With each additional use case, maintaining an in-house solution may become burdensome. In other words, as the scope of the in-house system grows, so do the challenges of building and maintaining it.
Let’s consider some scenarios where the “buy” end of the argument wins:
Your flag requirements are widening: your company is experiencing high growth, your teams are growing, and teams beyond development and engineering are becoming more involved in your feature flag journey, each with different requirements.
With increasing flag usage and build-up, it becomes harder to keep track of all of them in your system, eventually leading to messy code.
You’re now working with multiple languages, so maintaining SDKs for each of them becomes highly complex.
You have an expanding customer base, which means a higher volume of demand and faster release velocity, straining home-grown systems.
You need more advanced features that can handle the needs of more complex use cases. In-house systems usually lack advanced functionalities as they are usually built for immediate needs unlike third-party tools that come equipped with sophisticated features.
All these different scenarios illustrate the growing scope of feature flag usage which in turn means an increase in scope for your feature flagging system, which could pose a serious burden on in-house solutions that often lack the advanced functionalities to grow as you grow.
Many third-party feature flagging platforms come equipped with a user-friendly UI dashboard that teams can easily use to manage their feature flag usage.
AB Tasty’s Feature Experimentation and Rollouts is a solution that all teams within an organization, from development to product, can leverage to streamline software development and delivery processes. Product teams can run sophisticated omnichannel experiments to get critical feedback from real-world users, while development teams can continuously deploy new features and test them in production to validate them.
Teams also have full visibility over all the flags in their system in our “flag tracking dashboard”, where they can control who gets access to each flag, so that when the time comes they can retire unused flags and avoid a build-up of technical debt.
A feature flag system is a must
At this point, you may decide that using a third-party feature flag management tool is the right choice for you. Which one you opt for will largely depend on your needs. As already pointed out, implementing your own solution is possible at first but it can be quite costly and troublesome to maintain.
Keep in mind the following before selecting a feature flag solution:
Pain points: What are your goals? What issues are you currently facing in your development and/or production process?
Use cases: We’ve already covered the many use cases where feature flags can be employed, so consider what you will be using feature flags for. You also need to consider who will be using them (is it just your developers, or are there stakeholders beyond developers such as Product, Sales, etc.?)
Needs and resources: Carefully weigh the build vs buy decision taking into account factors such as total costs and budget, the time required to build the platform, the scope of your solution (consider the long-term plan for your system), and whether there is support across multiple programming languages (the more languages you use, the more tools you will need to support them).
Following the aforementioned points, your feature flagging management system will need to be: stable, scalable, flexible, highly-supported and multi-language compatible.
It’s more than fine to start simple, but don’t lose sight of the higher value feature flags can bring to your company, well beyond the use case of decoupling deploy from release. To better manage and monitor your flags, the general consensus is to rely on a feature flag management tool. This will make feature flag management a piece of cake and can help speed up your development process.
With AB Tasty, formerly known as Flagship, we take feature flagging to the next level, offering more than just switching features on and off, with high-performance, highly scalable managed services. Our solution is catered not just to developers but can be widely used across different teams within your organization. Sign up for a free trial today to learn how you can ship with confidence anytime, anywhere.
How teams decide to deploy software is an important consideration before starting the software development process.
This means long before the code is written and tested, teams need to carefully plan the deployment process of new features and/or updates to ensure it won’t negatively impact the user experience.
Having an efficient deployment strategy in place is crucial to ensure that high quality software is delivered in a quick, efficient, consistent and safe way to your intended users with minimal disruptions.
In this article, we’ll go through what a deployment strategy is, the different types of strategies you can implement in your own processes and the role of feature flags in successful rollouts.
What is a deployment strategy?
A deployment strategy is a technique adopted by teams to successfully launch and deploy new application versions or features. It helps teams plan the processes and tools they will need to successfully deliver code changes to production environments.
It’s worth noting that there’s a difference between deployment and release though they may seem synonymous at first.
Deployment is the process of rolling out code to a test or live environment while release is the process of shipping a specific version of your code to end-users and the moment they get access to your new features. Thus, when you deploy software, you’re not necessarily exposing it to real-world users yet.
In that sense, a deployment strategy is the process by which code is pushed from one environment into another to test and validate the software and then eventually release it to end-users. It’s basically the steps involved in making your software available to its intended users.
This strategy is now more important than ever as modern standards for software development are demanding and require continuous deployment to keep up with customer demands and expectations.
Having the right strategy will help ensure minimal downtime and will reduce the risk of errors or bugs so users get the best experience possible. Otherwise, you may find yourself dealing with high costs due to the number of bugs that need to be fixed, and with disgruntled customers, which could severely damage your company’s reputation.
Types of deployment strategies
Teams have a number of deployment strategies to choose from, each with their own pros and cons depending on the team objectives.
The deployment strategy an organization opts for will depend on various factors including team size, the resources available as well as how complex your software is and the frequency of your deployment and/or releases.
Below, we’ll highlight some of the most common deployment strategies that are often used by modern software development and DevOps teams.
A recreate deployment strategy involves scaling the previous version of the software down to zero, removing it, and then deploying the new one. This requires shutting down the initial version of the application before replacing it with the updated version.
This is considered to be a simple approach as developers only have to deal with one scaling process at a time without having to manage parallel application deployments.
However, this strategy will require the application to be inaccessible for some time and could have significant consequences for users. This means it’s not suited for critical applications that always need to be available and works best for applications that have relatively low traffic where some downtime wouldn’t be a major issue.
A rolling deployment strategy involves updating running instances of the software with the new release.
Rolling deployments offer more flexibility in scaling up to the new software version before scaling down the old version. In other words, updates are rolled out to subsets of instances one at a time; the window size refers to the number of instances updated at a time. Each subset is validated before the next update is deployed to ensure the system remains functioning and stable throughout the deployment process.
This type of deployment strategy prevents disruptions in service because you update incrementally, which means fewer users are affected by any faulty update, and you direct traffic to the updated deployment only once it’s ready to accept traffic. If any issue is detected while a subset is being deployed, the rollout can be stopped while the issue is fixed.
However, rollback may be slow as it also needs to be done gradually.
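To make the window-by-window mechanics described above more concrete, here is a minimal sketch of a rolling rollout; the instance list, window size and health check are invented for illustration.

```typescript
// Sketch of a rolling deployment: instances are updated in windows of a fixed
// size, and each window must pass a health check before the next one starts.
// Instance names and the health check are purely illustrative.

const instances = ["web-1", "web-2", "web-3", "web-4", "web-5", "web-6"];
const windowSize = 2;

async function deployTo(instance: string, version: string): Promise<void> {
  console.log(`Deploying ${version} to ${instance}`);
}

async function isHealthy(instance: string): Promise<boolean> {
  // In practice this would call the instance's health endpoint.
  console.log(`Health check on ${instance}: ok`);
  return true;
}

async function rollingDeploy(version: string): Promise<void> {
  for (let i = 0; i < instances.length; i += windowSize) {
    const batch = instances.slice(i, i + windowSize);
    await Promise.all(batch.map((instance) => deployTo(instance, version)));

    // Validate the batch before continuing; stop the rollout on failure.
    const checks = await Promise.all(batch.map(isHealthy));
    if (checks.some((ok) => !ok)) {
      throw new Error(`Rollout halted: unhealthy instance in batch ${batch.join(", ")}`);
    }
  }
  console.log("All instances updated");
}

rollingDeploy("v2.3.0").catch(console.error);
```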
A blue/green deployment strategy consists of setting up two identical production environments nicknamed “blue” and “green” which run side-by-side, but only one is live, receiving user transactions. The other is up but idle.
At any given time, only one environment is live and receiving user transactions, typically the blue environment running the current application version. Meanwhile, teams use the idle green environment, which hosts the new version, as the test or staging environment to conduct the final round of testing when preparing to release a new feature.
Afterwards, once they’ve validated the new feature, the load balancer or traffic router switches all traffic from the blue to the green environment where users will be able to see the updated application.
The blue environment is maintained as a backup until you are able to verify that your new active environment is bug-free. If any issues are discovered, the router can switch back to the original environment, the blue one in this case, which has the previous version of the code.
This strategy has the advantage of easy rollbacks. Because you have two separate but identical production environments, you can easily make the shift between the two environments, switching all traffic immediately to the original (for example, blue) environment if issues arise.
Teams can also seamlessly switch between previous and updated versions and cutover occurs rapidly with no downtime. However, for that reason this strategy may be very costly as it requires a well-built infrastructure to maintain two identical environments and facilitate the switch between them.
A canary deployment is a strategy that significantly reduces the risk of releasing new software by allowing you to release it gradually to a small subset of users. Traffic is directed to the new version using a load balancer or feature flag, while the rest of your users continue to see the current version.
This set of users identifies bugs, broken features, and unintuitive features before your software gets wider exposure. These users could be early adopters, a demographically targeted segment or a random sample.
You start testing on this subset of users and then, as you gain more confidence in your release, you widen it and direct more users to it.
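As a rough sketch of that widening rollout, here is a minimal traffic-routing function whose canary share grows as confidence does; the bucketing logic is illustrative, since in practice a load balancer or feature flag platform makes this decision.

```typescript
// Sketch of canary routing: a growing share of users is sent to the new
// version while the rest stay on the stable one. Percentages and the
// bucketing are illustrative.

let canaryPercentage = 5; // start small, widen as confidence grows

function routeRequest(userId: string): "canary" | "stable" {
  // Derive a stable bucket from the user ID so each user always lands
  // on the same version during the rollout.
  const bucket =
    [...userId].reduce((hash, char) => hash * 31 + char.charCodeAt(0), 7) % 100;
  return bucket < canaryPercentage ? "canary" : "stable";
}

console.log(routeRequest("user-1001")); // most users still see "stable"

// Metrics look good after the first stage, so the canary is widened.
canaryPercentage = 25;
console.log(routeRequest("user-1001"));
```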
Canary deployments are less risky than blue-green deployments as you’re adopting a gradual approach to deployment instead of switching from one environment to the next.
While blue/green deployments are ideal for minimizing downtime and when you have the resources available to support two separate environments, canary deployments are better suited for testing a new feature in a production environment with minimal risk and are much more targeted.
In that sense, canary deployments are a great way to test in production on live users but on a smaller scale to avoid the risks of a big bang release. It also has the advantage of a fast rollback should anything go wrong by redirecting users back to the older version.
However, deployment is done in increments, which is less risky but also requires monitoring for a considerable period of time which may delay the overall release.
A/B testing, also known as split testing, involves comparing two versions of a web page or application to see which performs better, where variations A and B are presented randomly to users. In other words, users are divided into two groups with each group receiving a different variation of the software application.
A statistical analysis of the results then determines which version, A or B, performed better, according to certain predefined indicators.
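As a minimal sketch of that analysis step, assuming conversions are simply counted per variation and a two-proportion z-test is the predefined indicator (the figures are invented):

```typescript
// Sketch of the analysis step for an A/B test: compare conversion rates of
// variations A and B with a two-proportion z-test. Figures are invented.

interface VariationResult {
  visitors: number;
  conversions: number;
}

function zScore(a: VariationResult, b: VariationResult): number {
  const pA = a.conversions / a.visitors;
  const pB = b.conversions / b.visitors;
  // Pooled conversion rate under the null hypothesis that A and B perform equally.
  const pooled = (a.conversions + b.conversions) / (a.visitors + b.visitors);
  const standardError = Math.sqrt(pooled * (1 - pooled) * (1 / a.visitors + 1 / b.visitors));
  return (pB - pA) / standardError;
}

const variationA: VariationResult = { visitors: 5000, conversions: 400 }; // 8.0%
const variationB: VariationResult = { visitors: 5000, conversions: 460 }; // 9.2%

const z = zScore(variationA, variationB);
// |z| > 1.96 corresponds roughly to 95% confidence for a two-sided test.
console.log(`z = ${z.toFixed(2)}, significant: ${Math.abs(z) > 1.96}`);
```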
A/B testing enables teams to make data-driven decisions based on the performance of each variation and allows them to optimize the user experience to achieve better outcomes.
It also gives them more control over which users get access to the new feature while monitoring results in real-time so if results are not as expected, they can redirect visitors back to the original version.
However, A/B tests require a representative sample of your users and they also need to run for a significant period to gain statistically significant results. Moreover, determining the validity of the results without a knowledge database can be challenging as several factors may skew these results.
AB Tasty is an example of an A/B testing tool that allows you to quickly set up tests with low code implementation of front-end or UX changes on your web pages, gather insights via an ROI dashboard, and determine which route will increase your revenue.
Feature flags: The perfect companion for your deployment strategy
Whichever deployment strategy you choose, feature flags can be easily combined with it to improve the speed and quality of the software delivery process while minimizing risk.
By decoupling deployment from release, feature flags enable teams to choose which set of users get access to which features to gradually roll out new features.
For example, feature flags can help you manage traffic in blue-green deployments as they can work in conjunction with a load balancer to manage which users see which application updates and feature subsets.
Instead of switching over entire applications to shift to the new environment all at once, you can cut over to the new application and then gradually turn individual features on and off on the live and idle systems until you’ve completely upgraded.
Feature flags also allow for control at the feature level. Instead of rolling back an entire release if one feature is broken, you can use feature flags to roll back and switch off only the faulty feature. The same applies to canary deployments, which operate at the scale of a whole deployment: feature flags can help you avoid a full rollback, because if anything goes wrong you only need to kill that one feature instead of the entire deployment.
Feature flags also offer great value for running experiments and feature testing, such as A/B tests, by allowing highly granular user targeting and control over individual features.
Put simply, feature flags are a powerful tool to enable the progressive rollout and deployment of new features, run A/B testing and test in production.
What is the right deployment strategy?
Choosing the right deployment strategy is imperative to ensure efficient, safe and seamless delivery of features and updates of your application to end-users.
There are plenty of strategies to choose from, and while there is no right or wrong choice, each comes with its own advantages and disadvantages.
Whichever strategy you opt for will depend on several factors, including the needs and objectives of the business, the complexity of your application and the type of targeting you’re looking to implement, i.e. whether you want to test a new feature on a select group of users to validate it before a wider release.
No matter your deployment strategy, AB Tasty is your partner for easier and low risk deployments with Feature Experimentation and Rollouts. Sign up for a free trial to explore how AB Tasty can help you improve your software delivery processes.
Marianne Stjernvall explains the evolution of CRO and the importance of centralizing your CRO Program to create a data-driven organization
Before becoming a leading specialist in CRO and A/B testing, Marianne Stjernvall was studying computer and systems science when a company reached out to her on LinkedIn about a position as a CRO specialist, which turned out to be, for her, the perfect mix of logic, programming, data, business and people.
Since then, she has founded the Queen of CRO, where she acts as an independent CRO consultant helping many organizations with experimentation, CRO, personalization and creating a data-driven culture for growth.
Previously, Marianne worked for companies such as iProspect, TUI and Coop Sverige where she spearheaded their CRO roadmap and developed a culture of experimentation. Additionally, she was awarded CRO Practitioner of the Year in 2020.
AB Tasty’s VP Marketing Marylin Montoya spoke with Marianne on the importance of contextualizing A/B test data to make better-informed decisions. Marianne also shared her own take on the much debated build vs buy topic and some wise advice from her years of experience with CRO and experimentation.
Here are some key takeaways from their conversation.
The importance of contextualizing data
For Marianne, CRO is becoming a big part of product development and delivery. She highlights the importance of this methodology when it comes to collecting data and acting on it in order to drive decisions.
Marianne stresses the importance of putting data into context and deriving insights from that data. This means companies need to be able to answer why they’re collecting certain information and what they plan to do with that information or data.
CRO is the key to unlocking many of those insights from the vast amount of data organizations have at hand and to pinpoint exactly what they need to optimize.
“What are you going to do with that information? You need context to provide insights and that, I think, is what CRO actually is about,” Marianne says.
This is what makes CRO so powerful as it enables organizations to take more valuable actions based on the insights derived from data.
When done right, testing within the spectrum of CRO can move organizations onto a completely different path from the one they were on before, towards a more innovative and transformative journey.
Centralize and standardize your experimentation processes first
When companies are just starting to create their experimentation or CRO program, Marianne recommends centralizing parts of it and running tests within a framework or process, to avoid teams running their own tests that trample over one another.
Otherwise, you could have different teams, such as marketing, product development and CRO teams, executing tests with no set process in place which could potentially lead to chaos.
“You will be taking decisions on A/B tests on basically three different data sets because you will be checking different kinds of data. So having an ownership of that to produce this framework and process, this is how the organization should work with these kinds of tests,” says Marianne.
With established frameworks and processes in place, organizations can set rules on how to carry out tests to get better value out of them and create ownership for the entire organization. The trick is to start small with one team and extend these processes over time to the next team, and so on.
This is especially important as Marianne argues that many organizations cannot increase their test velocity because they don’t have set processes to act on the data they get from their A/B tests. This includes how they’re calculating the tests, how they’re determining the winning or losing variation and what kind of goals or KPIs they’ve set up.
In other words, experimentation needs to be democratized as a starting point to allow an organization to naturally evolve around CRO.
Putting people at the center of your CRO program
When it comes to the build vs buy debate, Marianne argues that an A/B testing tool will not automatically solve everything.
“A great A/B testing tool can make you comfortable in that we have all the grounds covered with that. Now we can actually execute on this, but the rest is people and the organization. That’s the big work.”
In fact, companies tend to blame the tech side of things when their A/B testing is not going as planned. For Marianne, that has nothing to do with the tool; the issue primarily lies with people and processes.
As far as the build vs buy debate, before deciding to build a tool in-house, companies should first ask themselves why they want to build their own tool beyond the fact it’s more cost-efficient. This is because these tools need time to get set up and running. It may not be so cost-effective as many tend to think when choosing to build their own tool.
Marianne believes that companies should focus their energy and time on building processes and educating teams on these processes instead. In other words, it’s about people first and foremost; that’s where the real investment lies.
Nevertheless, before starting the journey of building their own tool, companies should evaluate themselves internally to understand how teams are utilizing and incorporating data obtained from tests into their feature releases.
If you’re just starting on your CRO journey, it’s largely about organizing your teams and involving them in these processes you’re building. The idea is to build engagement across all teams so that this journey happens in the organization as a whole. (An opinion that was shared by 1,000 Experiments Club podcast guest Ben Labay).
What else can you learn from our conversation with Marianne Stjernvall?
What to consider when choosing the right A/B testing tool
Her own learnings from experiments she’s run
How to get HIPPOs more involved during A/B testing
How “failed” tests and experiments can be a learning experience
About Marianne Stjernvall
Having worked with CRO and experimentation for a decade and executed more than 500 A/B tests, Marianne Stjernvall has helped over 30 organizations grow their CRO programs. Today, Marianne has turned her passion for creating experimental organizations with a data-driven culture into her own consultancy, the Queen of CRO. She also regularly teaches at schools to pass on her CRO knowledge and show the full spectrum of what it takes to execute on CRO, A/B testing and experimentation.
About 1,000 Experiments Club
The 1,000 Experiments Club is an AB Tasty-produced podcast hosted by Marylin Montoya, VP of Marketing at AB Tasty. Join Marylin and the Marketing team as they sit down with the most knowledgeable experts in the world of experimentation to uncover their insights on what it takes to build and run successful experimentation programs.