
Bayesian vs. Frequentist: How AB Tasty Chose Our Statistical Model

The debate about the best way to interpret test results is becoming increasingly relevant in the world of conversion rate optimization.

Torn between two inferential statistical methods (Bayesian vs. frequentist), practitioners argue fiercely over which is the “best.” At AB Tasty, we’ve carefully studied both approaches, and for us there is a clear winner.

There are a lot of discussions regarding the optimal statistical method: Bayesian vs. frequentist (Source)

 

Let’s dive in and explore the logic behind each method, along with the main differences and advantages each one offers. In this article, we’ll go over:

[toc]

 

What is hypothesis testing?

The statistical hypothesis testing framework in digital experimentation can be expressed as two opposite hypotheses:

  • H0 states that there is no difference between the treatment and the original, meaning the treatment has no effect on the measured KPI.
  • H1 states that there is a difference between the treatment and the original, meaning that the treatment has an effect on the measured KPI.

 

The goal is to compute indicators that will help you make the decision of whether to keep or discard the treatment (a variation, in the context of AB Tasty) based on the experimental data. We first determine the number of visitors to test, collect the data, and then check whether the variation performed better than the original.

There are two hypotheses in the statistical hypothesis framework (Source)

 

Essentially, there are two approaches to statistical hypothesis testing:

  1. Frequentist approach: Comparing the data to a model.
  2. Bayesian approach: Comparing two models (that are built from data).

 

From the very beginning, AB Tasty chose the Bayesian approach for its reporting and experimentation efforts.

 

What is the frequentist approach?

In this approach, we build a model Ma for the original (A) that gives the probability p of seeing some data Da. It is a function of the data:

Ma(Da) = p

Then we can compute a p-value, Pv = Ma(Db), which is the probability of seeing the data measured on variation B if it had been produced by the original (A).

Intuitively, if Pv is high, the data measured on B could also have been produced by A (supporting hypothesis H0). On the other hand, if Pv is low, there is very little chance that the data measured on B was produced by A (supporting hypothesis H1).

A widely used threshold for Pv is 0.05. This is equivalent to considering that, for the variation to have had an effect, there must be less than a 5% chance that the data measured on B could have been produced by A.
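As an illustration, here is a minimal sketch of this kind of frequentist computation in Python, using the example figures that appear later in this article (1,000 visitors with 100 conversions on A, 1,000 with 130 on B). It relies on scipy, treats A’s observed rate as if it were the true model, and is not AB Tasty’s implementation.

# Minimal frequentist sketch: how likely is B's data under a model built from A?
# Illustrative only; treats A's observed rate as if it were exactly known.
from scipy.stats import binomtest

visitors_a, success_a = 1000, 100   # original (A)
visitors_b, success_b = 1000, 130   # variation (B)

rate_a = success_a / visitors_a     # model Ma: binomial with A's observed rate

# p-value: probability of data at least as extreme as B's if it came from A's process
result = binomtest(success_b, visitors_b, p=rate_a, alternative="two-sided")
print(f"p-value = {result.pvalue:.4f}")   # below 0.05 -> treated as significant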

This approach’s main advantage is that you only need to model A. This seems appealing because A is the original variation and has existed for longer than B, so you might think you could collect data on A over a long period in order to build an accurate model. Sadly, the KPIs we monitor are rarely stationary: transaction and click rates vary strongly over time, which is why you need to build the model Ma and collect the data on B during the same period to produce a valid comparison. Clearly, this advantage doesn’t apply in a digital experimentation context.

This approach is called frequentist, as it measures how frequently specific data is likely to occur given a known model.

It is important to note that, as we have seen above, this approach does not compare the two processes.

Note: since p-values are not intuitive, they are often converted into a probability like this:

p = 1 - Pvalue

and wrongly presented as the probability that H1 is true (meaning that a difference between A and B exists). In fact, it is the probability that the data collected on B was not produced by process A.

 

What is the Bayesian approach (used at AB Tasty)?

In this approach, we will build two models, Ma and Mb (one for each variation), and compare them. These models, which are built from experimental data, produce random samples corresponding to each process, A and B. We use these models to produce samples of possible rates and compute the difference between these rates in order to estimate the distribution of the difference between the two processes.

Contrary to the first approach, this one does compare two models. It is referred to as the Bayesian approach or method.

Now, we need to build a model for A and B.

Clicks can be represented as binomial distributions, whose parameters are the number of tries and a success rate. In the digital experimentation field, the number of tries is the number of visitors and the success rate is the click or transaction rate. In this case, it is important to note that the rates we are dealing with are only estimates on a limited number of visitors. To model this limited accuracy, we use beta distributions (which are the conjugate prior of binomial distributions).

These distributions model the likelihood of a success rate measured on a limited number of trials.

Let’s take an example:

  • 1,000 visitors on A with 100 successes
  • 1,000 visitors on B with 130 successes

 

We build the model Ma = beta(1 + success_a, 1 + failures_a), where success_a = 100 and failures_a = visitors_a - success_a = 900.

You may have noticed a +1 for success and failure parameters. This comes from what is called a “prior” in Bayesian analysis. A prior is something you know before the experiment; for example, something derived from another (previous) experiment. In digital experimentation, however, it is well documented that click rates are not stationary and may change depending on the time of the day or the season. As a consequence, this is not something we can use in practice; and the corresponding prior setting, +1, is simply a flat (or non-informative) prior, as you have no previous usable experiment data to draw from.
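As a rough sketch in Python (scipy assumed, purely for illustration), the model Ma described above can be built and queried like this:

# Beta model for A with a flat (+1) prior, as described above -- illustrative only
from scipy.stats import beta

visitors_a, success_a = 1000, 100
failures_a = visitors_a - success_a          # 900

m_a = beta(1 + success_a, 1 + failures_a)    # Ma = beta(101, 901)

# Likelihood (density) of a few candidate click rates under Ma
for rate in (0.05, 0.10, 0.11, 0.15):
    print(f"{rate:.0%}: likelihood {m_a.pdf(rate):.2f}")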

For the three following graphs, the horizontal axis is the click rate, while the vertical axis is the likelihood of that rate given the observed data (for A, 100 successes in 1,000 trials).

(Source: AB Tasty)

 

Reading this graph: 10% is the most likely rate, 5% or 15% are very unlikely, and 11% is about half as likely as 10%.

The model Mb is built the same way with data from experiment B:

Mb = beta(1 + 130, 1 + 870)

 

(Source: AB Tasty)

 

For B, the most likely rate is 13%, and the width of the curve is similar to the previous one.

Then we compare A and B rate distributions.

Blue is for A and orange is for B (Source: AB Tasty)

 

We see an overlapping area, around a 12% conversion rate, where both models have a similar likelihood. To quantify this overlap, we need to sample from both models and compare the samples.

We draw samples from distribution A and B:

  • s_a[i] is the i-th sample from A
  • s_b[i] is the i-th sample from B

 

Then we apply a comparison function to these samples:

  • the relative gain: g[i] = 100 * (s_b[i] - s_a[i]) / s_a[i] for all i.

 

It is the difference between the possible rates for A and B, relative to A (multiplied by 100 for readability in %).

We can now analyze the samples g[i] with a histogram:

The horizontal axis is the relative gain, and the vertical axis is the likelihood of this gain (Source: AB Tasty)

 

We see that the most likely value for the gain is around 30%.

The yellow line shows where the gain is 0, meaning no difference between A and B. Samples below this value (a negative gain) correspond to cases where A > B; samples above it are cases where A < B.

We then define the gain probability as:

GP = (number of samples where g > 0) / (total number of samples)

 

With 1,000,000 (10^6) samples for g, we have 982,296 samples that are >0, making B>A ~98% probable.

We call this the “chances to win” or the “gain probability” (the probability that you will win something).
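The whole sampling comparison above can be sketched in a few lines of Python (numpy assumed; an illustration of the method, not AB Tasty’s production code):

# Monte Carlo comparison of the two beta models -- illustrative sketch
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

s_a = rng.beta(1 + 100, 1 + 900, n)   # samples of possible rates for A
s_b = rng.beta(1 + 130, 1 + 870, n)   # samples of possible rates for B

g = 100 * (s_b - s_a) / s_a           # relative gain, in %
gain_probability = (g > 0).mean()     # share of samples where B beats A
print(f"chances to win ~ {gain_probability:.1%}")   # roughly 98% with these data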

The gain probability is shown here (see the red rectangle) in the report:

(Source: AB Tasty)

 

Using the same sampling method, we can compute classic analysis metrics like the mean, the median, percentiles, etc.

Looking back at the previous chart, the vertical red lines indicate where most of the blue area lies; intuitively, which gain values are the most likely.

We have chosen to expose a best- and worst-case scenario with a 95% confidence interval. It excludes the 2.5% most extreme best cases and the 2.5% most extreme worst cases, leaving out a total of 5% of what we consider rare events. This interval is delimited by the red lines on the graph. We consider that the real gain (as if we had an infinite number of visitors to measure it) lies somewhere in this interval 95% of the time.

In our example, this interval is [1.80%; 29.79%; 66.15%], meaning that it is quite unlikely that the real gain is below 1.80%, and also quite unlikely that it is above 66.15%. There is an equal chance that the real gain is above or below the median, 29.79%.
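Continuing the earlier sketch, these three values can be read directly from the gain samples g (numbers will differ slightly from run to run):

# 95% interval and median of the relative gain, from the same samples as above
low, median, high = np.percentile(g, [2.5, 50, 97.5])
print(f"[{low:.2f}%; {median:.2f}%; {high:.2f}%]")   # close to the [1.80%; 29.79%; 66.15%] quoted above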

The confidence interval is shown here (in the red rectangle) in the report (on another experiment):

(Source: AB Tasty)

 

What are “priors” for the Bayesian approach?

Bayesian frameworks use the term “prior” to refer to the information you have before the experiment. For instance, common knowledge tells us that e-commerce transaction rates are mostly under 10%.

It would have been very interesting to incorporate this, but such assumptions are hard to make in practice because seasonality has a huge impact on click rates. In fact, this is the main reason why we collect data on A and B at the same time. Most of the time, we already have data on A before the experiment, but we know that click rates change over time, so we need to collect them at the same time on all variations for a valid comparison.

It follows that we have to use a flat prior, meaning that the only thing we know before the experiment is that rates lie in [0%, 100%], and that we have no idea what the gain might be. This is the same assumption the frequentist approach makes, even if it is never stated explicitly.

 

Challenges in statistics testing

As with any testing approach, the goal is to eliminate errors. There are two types of errors that you should avoid:

  • False positive (FP): When you pick a winning variation that is not actually the best-performing variation.
  • False negative (FN): When you miss a winner, either by declaring no winner or by declaring the wrong one at the end of the experiment.

Performance on both these measures depends on the threshold used (p-value or gain probability), which depends, in turn, on the context of the experiment. It’s up to the user to decide.

Another important parameter is the number of visitors used in the experiment, since this has a strong impact on the false negative errors.

From a business perspective, a false negative is a missed opportunity. Mitigating false negative errors is all about the size of the population allocated to the test: basically, throwing more visitors at the problem.

The main problem then is false positives, which mainly occur in two situations:

  • Very early in the experiment: Before the targeted sample size is reached, the gain probability may briefly rise above 95%. Users who are too impatient and draw conclusions without enough data often end up acting on false positives.
  • Late in the experiment: The targeted sample size is reached, but no significant winner is found. Some users believe in their hypothesis too much and want to give it another chance by letting the test run longer.

 

Both of these problems can be eliminated by strictly respecting the testing protocol: Setting a test period with a sample size calculator and sticking with it.
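For reference, here is a generic sketch of the kind of calculation a sample size calculator performs (statsmodels assumed; the baseline rate and minimum detectable effect are hypothetical inputs, and this is not AB Tasty’s own calculator):

# Generic sample-size estimate for comparing two proportions -- illustrative only
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline_rate = 0.10    # hypothetical current conversion rate
target_rate = 0.12      # hypothetical smallest effect worth detecting

effect = proportion_effectsize(target_rate, baseline_rate)
n_per_variation = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.8, alternative="two-sided"
)
print(f"visitors needed per variation: {n_per_variation:.0f}")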

At AB Tasty, we provide a visual checkmark called “readiness” that tells you whether you respect the protocol (a period that lasts a minimum of 2 weeks and has at least 5,000 visitors). Any decision outside these guidelines should respect the rules outlined in the next section to limit the risk of false positive results.

This screenshot shows how the user is informed as to whether they can take action.

(Source: AB Tasty)

 

Looking at the report during the data collection period (before the “readiness” checkmark appears) should be limited to verifying that data collection is working correctly and to spotting extreme cases that require emergency action, not to making business decisions.

 

When should you finalize your experiment?

Early stopping

“Early stopping” is when a user wants to stop a test before reaching the allocated number of visitors.

A user should wait for the campaign to reach at least 1,000 visitors and only stop if a very big loss is observed.

If a user wants to stop early for a supposed winner, they should wait at least two weeks and only use full weeks of data. This tactic is reasonable when the business cost of a false positive is acceptable, since the performance of the supposed winner is more likely to be close to the original than an actual loss.

Again, if this risk is acceptable from a business strategy perspective, then this tactic makes sense.

If a user sees a winner (with a high gain probability) at the beginning of a test, they should keep a margin for the worst-case scenario. A lower bound on the gain that is near or below 0% can still drift below zero, or far below it, by the end of the test, undermining the high gain probability seen at the beginning. Refusing to stop early while the left confidence bound is low helps rule out false positives early in a test.

For instance, a gain probability of 95% with a confidence interval like [-5.16%; 36.48%; 98.02%] is characteristic of early stopping. The gain probability is above the accepted standard, so one might be tempted to push 100% of the traffic to the winning variation. However, the worst-case scenario (-5.16%) is well below 0%. This indicates a possible false positive and, in any case, a risky bet whose worst-case scenario loses 5% of conversions. It is better to wait until the lower bound of the confidence interval is at least above 0%; a little margin on top is even safer.

 

Late stopping

“Late stopping” is when, at the end of a test, without finding a significant winner, a user decides to let the test run longer than planned. Their hypothesis is that the gain is smaller than expected and needs more visitors to reach significance.

When deciding whether to extend a test beyond the protocol, one should consider the confidence interval more than the gain probability.

If the user wants to test longer than planned, we advise only extending very promising tests. This means having a high best-case value (the right bound of the gain confidence interval should be high).

For instance, a gain probability of 99% with a confidence interval of [0.42%; 3.91%] is typical of a test that shouldn’t be extended past its planned duration: a great gain probability, but a modest best-case scenario (only 3.91%).

Consider that with more samples, the confidence interval will shrink. This means that if there is indeed a winner at the end, its best-case scenario will probably be smaller than 3.91%. So is it really worth it? Our advice is to go back to the sample size calculator and see how many visitors will be needed to achieve such accuracy.

Note: These numerical examples come from a simulation of A/A tests, selecting the failed ones.

 

Confidence intervals are the solution

Using the confidence interval instead of only looking at the gain probability will strongly improve decision-making. Beyond the problem of false positives, it also matters for the business: a variation’s gain needs to cover the cost of implementing it in production. Keep in mind that the original is already there and has no additional cost, so there is always an implicit, practical bias toward the original.

Any optimization strategy should have a minimal threshold on the size of the gain.

Another issue, known as the multiple comparison problem, may arise when testing more than two variations. In this case, a Holm-Bonferroni correction is applied.
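As a rough illustration of how a Holm-Bonferroni correction works on a set of p-values (a generic sketch of the step-down procedure, not AB Tasty’s exact implementation on its Bayesian indices):

# Holm-Bonferroni step-down procedure -- generic illustrative sketch
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the corresponding hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # indices, smallest p first
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):              # step-down threshold
            reject[i] = True
        else:
            break                                          # stop at first non-rejection
    return reject

print(holm_bonferroni([0.01, 0.04, 0.03]))                 # [True, False, False]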

 

Why AB Tasty chose the Bayesian approach

Wrapping up, which is better: the Bayesian vs. frequentist method?

As we’ve seen in the article, both are perfectly sound statistical methods. AB Tasty chose the Bayesian statistical model for the following reasons:

  • Using a probability index that corresponds better to what users intuitively expect, rather than a p-value or a disguised version of one;
  • Providing confidence intervals for more informed business decisions (not all winners are worth pushing to production); confidence intervals are also a means of mitigating false positive errors.

 

At the end of the day, it makes sense that the frequentist method was originally adopted by so many companies when it first came into play. After all, it’s an off-the-shelf solution that’s easy to code and can be easily found in any statistics library (this is a particularly relevant benefit, seeing as how most developers aren’t statisticians).

Nonetheless, even though it was a great resource when it was introduced into the experimentation field, there are better options now — namely, the Bayesian method. It all boils down to what each option offers you: While the frequentist method shows whether there’s a difference between A and B, the Bayesian one actually takes this a step further by calculating what the difference is.

To sum up, when you’re conducting an experiment, you already have the values for A and B. Now, you’re looking to find what you will gain if you change from A to B, something which is best answered by a Bayesian test.

 



Net Promoter Score (NPS): Your Ultimate Guide to the What, Why, and How

In a world where customers increasingly seek to buy into a brand rather than simply buy from a brand, it’s critical that companies create experiences that turn customers into loyal fans, rather than treating them as simple business transactions.

Customer satisfaction alone is no longer enough to thrive in today’s economy. The goal is to earn your customers’ fierce loyalty with authenticity and transparency, while aligning your offers and actions with a mission that speaks to them.

By measuring the net promoter score (NPS), businesses gain unique insight into how consumers perceive their customer journey in a number of different ways. Companies that use NPS to analyze customer feedback and identify areas of improvement hold the keys to optimizing rapid and effective business growth.

In this article, we’ll cover why measuring NPS is essential to scaling business sustainably, how to gather and calculate NPS feedback, and best practices to increase response rates and run successful NPS campaigns.

[toc]

What is NPS?

Let’s start with a little history. The Net Promoter Score was officially pioneered and coined by Fred Reichheld in the early 2000s, and has since become an invaluable methodology for traditional and online businesses alike. The value lies in using data to effectively quantify customer loyalty and its effect on business performance — a factor that was previously challenging to measure at scale.

What is NPS? (Source)

The system works by asking customers a version of this question: How likely are you to recommend our brand/product/service to a friend or colleague? Answers range on a scale of 0-10, from “not at all likely” to “extremely likely.” Depending on their answers, respondents are separated into one of three categories.

  • Promoters (score 9-10): Loyal customers who keep buying and actively promote and refer your brand to their circle of friends, family, and/or colleagues.
  • Passives (score 7-8): Customers who’ve had satisfactory or standard experiences with your brand, and are susceptible to competitors’ offers.
  • Detractors (score 0-6): Unhappy customers who risk damaging your brand with public complaints and negative word-of-mouth.

To calculate the final net promoter score, subtract the percentage of detractors from the percentage of promoters. The metric ranges from a low of -100 to a maximum of 100, the latter if every customer is a promoter.
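As a quick illustration of this calculation in Python (with hypothetical survey answers):

# NPS = % promoters (9-10) minus % detractors (0-6), on a -100..100 scale
def net_promoter_score(scores):
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

responses = [10, 9, 9, 8, 7, 6, 10, 3, 9, 5]   # hypothetical survey answers
print(net_promoter_score(responses))            # 20.0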

For many e-commerce companies, high customer retention, referral, and positive reviews are all critical drivers of success. NPS helps these businesses understand overall buyer behaviors and identify which customer profiles have the potential to be brand enthusiasts, enabling marketers to adjust their strategy to convert passives into promoters.

Simply put, NPS surveys are a simple and powerful method for companies to calculate how customer experience management impacts their overall business performance and growth.

How to gather NPS feedback

Common methods used to gather NPS feedback are email, SMS, and website pop-ups or chat boxes. Regardless of which method is used, there is a common set of steps to ensure a successful NPS campaign:

  1. Set clear objectives before sending out the NPS survey. Save time and increase the relevance of survey responses by determining exactly what kind of feedback you’re looking for before launching the survey.
  2. Segment recipients with customer behavior profiles. Get specific with your survey questions by customizing them to different audiences based on their unique history and interaction(s) with your brand.
  3. Make surveys short, concise, and timely. Instead of lengthy annual or quarterly feedback requests, increase response rates by sending quick and easy surveys to customers soon after they’ve had meaningful interactions with your brand.
  4. Use an automation tool to optimize survey delivery. Whether it’s with an email marketing platform or website widget integration, using automation tools to design and deliver your NPS surveys streamlines the entire feedback process, while reducing the margin for human error.

Integrating the NPS survey directly into the customer journey on your website increases the response rate and the relevance of the feedback. To implement an NPS survey like this, try using an intuitive visual editor like AB Tasty with NPS widget capabilities.

AB Tasty’s visual editor enables marketers of all levels to:

  • Modify visual and interactive elements on the website without any manual coding necessary;
  • Set up action-tracking to directly measure the performance of variations you’ve created;
  • Use the NPS widget to customize the content and feel of surveys across one or more pages of the website; and
  • Track the evolution of customer loyalty and benchmark against competitor performance via the NPS report.

Below are two case studies of clients who’ve used the AB Tasty NPS widget with highly successful campaigns to collect customer feedback and gain valuable insight to improve their customer experiences.

How to calculate NPS feedback

So what makes a good NPS score? A general rule of thumb states that anything below 0 means your business has some work to do … and a “good score” falls between 0 and 30. However, the true value of an NPS score depends on several factors, namely what industry your business is in.

If your NPS score isn’t as high as you’d hoped, don’t fret! There is always room for improvement and the good news is that it’s easy to implement actionable changes to optimize your NPS campaigns, no matter where you are on the scale.

When benchmarking for NPS, look at competitors that are in the same industry and relatively similar size as your company to get the most accurate visualization possible. Look for graphs that map out average NPS data by industry to get more insights on performance and opportunities for improvement in your sector.

It’s important to understand that comparing your business’s results to significantly larger or unrelated brands not only leads to inaccurate interpretation of the data, but also sets unrealistic and irrelevant goals for customer experience teams.

How to increase your NPS response rate

Reaching your customers with your NPS survey is just one half of the battle. The other half is getting enough customers to actually respond to it, which is critical to calculate an NPS score that accurately reflects your company’s customer satisfaction performance. Here are some tips for boosting your NPS response rate:

  • Customize your NPS survey. Take the time to brand your survey with the proper fonts and colors, following your brand design guide. Given the fact that the average person sees upwards of 6,500 ads in a day, information overload is a real struggle for consumers and marketers alike. A consistent look and feel from your survey helps customers recognize and trust your brand, making it an easy transition to take the next step in their customer journey.
  • Personalize the message. Studies show that personalized subject lines increase email open rates by 26%. If you’re sending the survey in an email, use merge fields or tags to automatically add each recipient’s name into the subject line or body of the email.
  • Use responsive design. 75% of customers complete surveys on their phone. Make sure your survey is fully functional and accessible from all devices (i.e., desktop, mobile, and tablet), as well as on as many operating systems and internet browsers as possible.
  • Offer incentives for completing the survey. From gift cards, cash, and promo codes to raffles, offering monetary rewards is an easy method to increase engagement, especially for longer surveys. However, this should be researched and done carefully to avoid review bias and more seriously, legal issues.

Why you should use NPS

Taking customer feedback seriously is important business. As of 2020, 87% of people read online reviews for local businesses, and 79% said they trust online reviews as much as a personal recommendation from friends or family. This means your customers’ perception of your brand can literally make or break it.

It’s clear that looking at sales revenue as the sole determiner of success is not sustainable for long-term business growth. Neither is assuming that several user case scenarios represent the majority without the data to prove it.

NPS is an especially powerful metric for e-commerce, as it uses data to help businesses identify truly relevant areas for improvement and opportunities to build a strong and loyal customer base that is so vital to thrive in this sector.

Building a strong relationship with your customer base and incentivizing brand promoters is crucial to succeeding in the e-commerce market

Rather than guesstimating what priorities should be, businesses can use longer surveys with open-ended questions to evaluate how their customers feel about specific aspects of the business (e.g., products, website, and brand) and target strategy accordingly.

When calculated correctly, NPS is the key to determining the likelihood of repeat business and acquisition driven by brand promoters. Marketing and product teams can boost customer retention and increase sales with customized products they know buyers want. Happy customers love loyalty programs and referral rewards, which also bring in new business with significantly less spend than cold advertising.

When is the ideal time to send users an NPS survey?

Deciphering what time customers are most likely to open emails, or when they’re more responsive to brand communications, is one of the biggest challenges for marketing teams.

Some studies suggest that the best time of the week to send emails is Tuesday at 10 a.m. But as many marketers know from experience, a one-time-fits-all solution doesn’t truly exist (though we wish it did!).

Depending on your industry and audience, your brand’s ideal time to hit send will likely change over time — and experimentation and optimization are the best ways to stay on top of it.

Identifying the right time to send customer satisfaction surveys requires continual testing of different elements like message personalization and audience segmentation

However, it is possible to find ideal times based on data you likely already have, by focusing on meaningful interactions between the brand and the customer.

One of the optimal times to send an NPS survey is shortly after customers have had a meaningful interaction with the brand. This could be after a customer finishes a purchase cycle, receives a product, or even speaks with customer service.

During this time, the customer experience is still top-of-mind, which means they are more likely to complete a feedback survey with higher chances of providing more detailed — and honest — insights.

It’s also better to send short surveys more frequently. Asking for smaller amounts of feedback more often than once or twice a year enables you to monitor customer satisfaction with a quicker response time.

With regular feedback surveys, businesses can catch unhappy customers early and make prompt changes to address problems in the customer journey, increasing customer retention.

Another benefit of this practice is that businesses can also identify highly successful campaigns throughout the year and prioritize resources on scaling strategies that are already proven to work well.

Do’s and don’ts for running an effective NPS campaign

Do:

  • Add open-ended questions. If you want more qualitative insight to support your business decisions, ask customers for specific input, as Eurosport did in this campaign.
  • Send from a person. Humans value real connections. Increase NPS response rate by sending surveys with the name and email of a real employee, not an automatic “no-reply” bot address.
  • Integrate your NPS survey into the user journey. To boost your reach beyond email surveys, use an NPS widget on your website for increased response rate and in-depth responses. Match your survey’s design to flow with the product page UX.

Don’t:

  • Disrupt the customer journey. Don’t overdo it with pop-up surveys or make them difficult to close; this can distract customers from their website experience and increase bounce rate.
  • Ask only one question. Don’t ask for just a 0-10 score. To collect actionable insight, add a follow-up question after the NPS score to ask why they gave that rating.
  • Keep NPS results to yourself. Transparency makes cross-team collaboration more effective and creative. NPS data is valuable not only for customer-facing teams, but also for marketing and product teams to improve the customer experience.

Optimize your NPS strategy

In summary, NPS is incredibly user-friendly and simple to implement. This metric helps brands gain actionable insight into their customer loyalty and satisfaction, and identify opportunities to significantly boost customer retention and acquisition.

NPS widgets and automated feedback collection enable cross-team collaborators to work more cohesively on customer experience campaigns

Businesses can use this data to run their operations better and smarter, and also improve cross-team collaboration on enhancing the customer experience. Regular testing and following best practices enable teams to continually improve their NPS strategy and reach higher response rates.

Ready to integrate your next NPS campaign directly into your website and customer journey? With an intuitive interface and no-code visual editor, AB Tasty enables you to fully customize the entire NPS survey live on your website, and experiment with different triggers to optimize your NPS strategy.

Our NPS widget makes it easy to scale this process quickly within even the fastest growing companies — give it a spin today.


AB Tasty’s NPS Widget Case Studies:

  1. How Eurosport’s Survey Pop-In Got 5K Responses in Less Than Two Weeks
  2. Avid Transforms Internal Culture and Website Experience with AB Tasty