What is Blue-Green Deployment?

One of the most critical metrics in DevOps is the speed at which you deliver new features. When developers, ops teams, and support staff are aligned, they can get new software into production quickly, and software that generates value sooner is often the deciding factor in whether your company gains an edge on the competition.

Quick delivery also shortens the time between software development and user feedback, which is essential for teams practicing CI/CD.

One practice you should consider adding to your CI/CD toolkit is the blue-green deployment. This process helps reduce both technical and business risks associated with software releases.

In this model, two identical production environments nicknamed “blue” and “green” are running side-by-side, but only one is live, receiving user transactions. The other is up but idle.

In this article, we’ll go over how blue-green deployments work. We’ll discuss the pros and cons of using this approach to release software. We’ll also compare how they stack up against other deployment methodologies and give you some of our recommended best practices for ensuring your blue-green deployments go smoothly.

[toc]

How do blue-green deployments work?

One of the most challenging steps in a deployment process is the cutover from testing to production. It must happen quickly and smoothly to minimize downtime.

A blue-green deployment methodology addresses this challenge by utilizing two parallel production environments. At any given time, only one of them is the live environment receiving user transactions. In the image below, that would be green. The blue idle system is a near-identical copy.

A blue-green deployment routing diagram (Source)

Your team will use the idle blue system as your test or staging environment to conduct the final round of testing when preparing to release a new feature. Once the new software is working correctly on blue, your ops team can switch routing to make blue the live system. You can then implement the feature on green, which is now idle, to get both systems resynchronized.
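To make the mechanics concrete, here is a minimal sketch of that routing flip in Python. The environment URLs and the router's single "upstream" field are hypothetical stand-ins for whatever load balancer, proxy, or DNS mechanism your team actually uses:

```python
# A minimal cutover sketch. All names and URLs here are illustrative.
ENVIRONMENTS = {
    "blue": "https://blue.internal.example.com",
    "green": "https://green.internal.example.com",
}

def cut_over(router_config: dict, new_live: str) -> dict:
    """Point the router at the new live environment in one atomic change."""
    if new_live not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {new_live}")
    router_config["upstream"] = ENVIRONMENTS[new_live]
    return router_config

router = {"upstream": ENVIRONMENTS["green"]}  # green is live today
router = cut_over(router, "blue")             # blue goes live; green idles
print(router)  # {'upstream': 'https://blue.internal.example.com'}
```

The key property is that the switch is one small change, which is what makes both the cutover and any rollback fast.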

Generally speaking, that is all there is to a blue-green deployment. You have a great deal of flexibility in how the parallel systems and cut-overs are structured. For example, you might not want to maintain parallel databases, in which case all you will change is routing to web and app servers. For another project, you may use a blue-green deployment to release an untested feature on the live system, but set it behind a feature flag for A/B user testing.

Example

Let’s say you’re in charge of the DevOps team at a niche e-commerce company. You sell clothing and accessories popular in a small but high-value market. On your site, customers can customize and order products on-demand.

Your site’s backend consists of many microservices in a few different containers. You have microservices for inventory management, order management, customization apps, and a built-in social network to support your customers’ niche community.

Your team releases early and often, and you credit your CI/CD model for your continued popularity. But this niche community is global, so your site sees fairly steady traffic throughout any given day. Finding a lull in which to update your production system is always tricky.

When one of your teams announces that their updated customization interface is ready for final testing in production, you decide to release it using a blue-green deployment so it can go out right away.

Animation of load balancer adjusting traffic from blue to green (Source)

The next day before lunch, your team decides they’re ready to launch the new customizer. At that moment, all traffic routes to your blue production system. You update the software on your idle green system and ask testers to put it through QA. Everything looks good, so your ops team uses a load balancer to redirect user sessions from blue to green.

Once traffic is completely filtered over to green, you make it the official production environment and set blue to idle. Your dev team pushes the updated customizer code to blue, puts in their lunch order, and takes a look at your backlog.

Pros: Benefits & use cases

One of the primary advantages of blue-green deployments over other software release strategies is how flexible they are. They can be beneficial in a wide range of environments and many use cases.

Rapid releasing

For product owners working within CI/CD frameworks, blue-green deployments are an excellent method to get your software into production. You can release software practically any time. You don’t need to schedule a weekend or off-hours release because, in most cases, all that is necessary to go live is a routing change. Because there is no associated downtime, these deployments have no negative impact on users.

They’re less disruptive for DevOps teams, too. No one has to rush updates during a set outage window, a scramble that leads to deployment errors and unnecessary stress. Executive teams will be happier as well: they won’t have to watch the clock during downtime, tallying up lost revenue.

Simple rollbacks

The reverse process is equally fast. Because blue-green deployments utilize two parallel production environments, you can quickly flip back to the stable one should any issues arise in your live environment.

This reduces the risks inherent in experimenting in production. Your team can back out of any issues with a simple routing change to the stable production environment. There is a risk of losing user transactions when you cut back—which we’ll get into a little further down—but many strategies for managing that situation are available.

You can temporarily set your app to be read-only during cutovers. Or you could do rolling cutovers with a load balancer while you wait for transactions to complete in the live environment.
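As an illustration of the read-only option, here is a minimal Python sketch. The shared maintenance flag and the request handler are hypothetical; in practice, the flag would live in a config store every server in the fleet can read:

```python
# Read-only guard during a cutover; names are illustrative only.
MAINTENANCE = {"read_only": False}

def handle_request(method: str, path: str) -> str:
    if MAINTENANCE["read_only"] and method in ("POST", "PUT", "DELETE"):
        return "503: writes paused during deployment cutover, retry shortly"
    return f"200: {method} {path} handled"

MAINTENANCE["read_only"] = True            # just before the routing switch
print(handle_request("POST", "/orders"))   # write rejected
print(handle_request("GET", "/catalog"))   # reads still served
MAINTENANCE["read_only"] = False           # writes resume on the new live system
```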

Built-in disaster recovery

Because blue-green deployments use two production environments, they implicitly offer disaster recovery for your business systems. A dual production environment is its own hot backup.

Load balancing

Blue-green parallel production environments also make load balancing easy. When the two environments are functionally identical, you can use a load balancer or feature toggle in your software to route traffic to different environments as needed.

Easier A/B testing

Another use case for parallel production environments is A/B testing. You can load new features onto your idle environment and then split traffic with a feature toggle between your blue and green systems.

Collect data from those split user sessions, monitor your KPIs, and then, if analyses of the new feature look good in your management system, you can flip traffic over to the updated environment.
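As a hedged sketch, the Python below splits sessions between blue and green deterministically. The 50/50 share and the hashing scheme are illustrative assumptions; hashing the user ID keeps each user's assignment stable across sessions:

```python
import hashlib

def ab_environment(user_id: str, green_share: float = 0.5) -> str:
    """Deterministically assign a user to the blue or green environment."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in [0, 100)
    return "green" if bucket < green_share * 100 else "blue"

for uid in ("alice", "bob", "carol"):
    print(uid, "->", ab_environment(uid))
```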

Cons: Challenges to be aware of

Blue-green deployments offer a great deal of value, but integrating the infrastructure and practices required to carry them out creates challenges for DevOps teams. Before integrating blue-green deployments into your CI/CD pipeline, it is worth understanding these challenges.

Resource-intensive

As is evident by now, to perform a blue-green deployment, you will need to resource and maintain two production environments. The costs of this, in money and sysadmin time, might be too high for some organizations.

Others may only be able to commit such resources to their highest-value products. If that is the case, does the DevOps team release software in a CI/CD model for some products but not others? That may not be sustainable.

Extra database management

Managing your database—or multiple databases—when you have parallel production environments can be complicated. You need to account for anything downstream of the software update in both your blue and green environments, such as any external services you’re invoking.

For example, what if your feature change requires you to rename a database column? As soon as you rename it on blue, the green environment, which still runs the old code, won’t function with that database anymore.

Can your entire production environment even function with two separate databases? That’s often not the case if you’re using your blue and green systems for load balancing, testing, or any function other than as a hot backup.

A blue-green deployment diagram with a single database (Source)

Product management

Aside from system administration, managing a product that runs on two near-identical environments requires more resources, too. Product managers need reliable ways to track how their software is performing, which services different teams are updating, and the KPIs associated with each. A product and feature management dashboard that monitors and coordinates all of these activities becomes essential.

Blue-green deployments vs. rolling deployments

Blue-green deployments are, of course, not the only option for performing rapid software releases. Another popular approach is to conduct a rolling deployment.

Rolling deployments also require a production environment that consists of multiple servers hosting an application, often, but not always, with a load balancer in front of them for routing traffic. When the DevOps team is ready to update their application, they configure a staggered release, pushing to one server after another.

While the release is rolling out, some live servers will be running the updated application, while others have the older version. This contrasts with a blue-green deployment, where the updated software is either live or not for all users.

As users initiate sessions with the application, they might either reach the old copy of the app or the new one, depending on how the load balancer routes them. When the rollout is complete, every new user session that comes in will reach the software’s updated version. If an error occurs during rollout, the DevOps team can halt updates and route all traffic to the remaining known-good servers until they resolve the error.

A rolling deployment diagram (Source)

Rolling deployments are a viable option for organizations with the resources to host such a large production environment. For those organizations, they are an effective method for releasing small, gradual updates, as you would in agile development methodologies.

There are other use cases where blue-green deployments may be a better fit. For example, if you’re making a significant update where you don’t want any users to access the old version of your software, you would want to take an “all or nothing” approach, like a blue-green deployment.

If your application requires a high degree of technical or customer support, that burden is magnified during rolling deployment windows, when support staff can’t tell which version of the application any given user is running.

Blue-green deployments vs. canary releasing

Rolling and blue-green deployments aren’t the only release strategies out there. Canary deployments are another alternative. In a canary release, only a subset of all production servers receives the software update at first. But instead of continuing to roll the update out to the rest, this partial release is held in place for testing purposes. A load balancer or a feature flag then directs a subset of users to the new software.

Canary releasing makes sense when you want to collect data and feedback from an identifiable set of users about updated software. Practicing canary releases dovetails nicely with broader rolling deployments, as you can gradually roll the updated software out to larger and larger segments of your user base until you’ve finished updating all production servers.

Best practices

You have many options for releasing software quickly. If you’re considering blue-green deployments as your new software release strategy, we recommend you adopt some of these best practices.

Automate as much as possible

Scripting and automating as much of the release process as possible has many benefits. Not only does the cutover happen faster, but there’s less room for human error. A dev can’t accidentally skip a checklist item if a script or a management platform handles the checklist. And if everything is packaged in a script, any developer (or non-developer) can carry out the deployment; you don’t need to wait for your system expert to get back to the office.
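As a sketch of what that automation might look like, the Python below runs a smoke test against the idle environment and only switches routing if it passes. The /healthz endpoint and the switch_routing callback are assumptions for illustration, not any specific platform's API:

```python
import urllib.request

def smoke_test(base_url: str) -> bool:
    """Return True if the idle environment answers its health check."""
    try:
        with urllib.request.urlopen(f"{base_url}/healthz", timeout=5) as resp:
            return resp.status == 200
    except OSError:
        return False

def deploy(idle_url: str, switch_routing) -> None:
    if not smoke_test(idle_url):
        raise RuntimeError("smoke test failed; aborting cutover")
    switch_routing(idle_url)  # e.g., update the load balancer upstream

# deploy("https://green.internal.example.com", switch_routing=print)
```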

Monitor your systems

Always make sure to monitor both blue and green environments. For a blue-green deployment to go smoothly, you need to know what is going on in both your live and idle systems.

Both systems will likely need the same set of monitoring alerts, but set to different priorities. For example, you’ll want to know the second there is an error in your live system, while the same error in the idle system may only need to be addressed sometime that business day.

A developer monitoring deployment (Source)

Write backward and forward-compatible code

In some cases, new and old versions of your software won’t be able to run simultaneously during a cutover. If you need to alter your database schema, for example, structure your updates so that both blue and green systems remain functional throughout the cutover.

One way to handle these situations is to break your releases down into a series of even smaller release packages. Let’s say our e-commerce company is expanding its inventory and needs to update its database by renaming a field from “shirt” to “longsleeve_shirt” for clarity.

They might break this update down by:

  • Releasing a feature flag-enabled intermediary version of their code that can interpret results from both “shirt” and “longsleeve_shirt” (sketched below);
  • Running a migration across their entire database to rename the field;
  • Releasing the final version of the code—or flipping their feature flag—so the software only uses “longsleeve_shirt.”
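A minimal sketch of that intermediary step might look like the Python below, where a module-level flag stands in for a real feature flag service and inventory rows are plain dicts:

```python
READ_BOTH = True  # intermediary release: tolerate old and new field names

def shirt_type(record: dict) -> str:
    """Read the field under either name during the migration window."""
    if READ_BOTH:
        return record.get("longsleeve_shirt", record.get("shirt", ""))
    return record["longsleeve_shirt"]  # final release: new name only

print(shirt_type({"shirt": "flannel"}))            # pre-migration rows work
print(shirt_type({"longsleeve_shirt": "henley"}))  # migrated rows work too
```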

Do smaller, more frequent deployments

Smaller, more frequent updates are already an integral practice in agile development and CI/CD, and the practice matters even more if you’re going to conduct blue-green deployments. Shorter deployment cycles tighten feedback loops, informing the next release and making each incremental upgrade more effective and more valuable for your organization.

Restructure your applications into microservices

This approach goes hand-in-hand with conducting smaller deployments. Restructuring application code into sets of microservices allows you to manage updates and changes more easily. Different features are compartmentalized in a way that makes them easier to update in isolation.

Use feature flags to reduce risk further

By themselves, blue-green deployments create a single, short window of risk. You’re updating everything all at once, but you can cut back to the stable environment if an issue arises.

Blue-green deployments also carry a fairly consistent amount of administrative overhead with each cutover. You can reduce this overhead through automation, but you’re still going to follow the same process whether you’re updating a single line of code or overhauling your entire e-commerce suite.

AB Tasty feature flag management

Feature flags can offer a very granular level of control over how and when users experience newly available software. A feature flag works like a powerful “if” statement: at runtime, the software follows one of two or more code paths depending on a provided condition.

Those conditions can be simple “yes/no” checks, or they can be complex decision trees. Feature flags help make software releases more manageable by controlling what is turned on or off at a feature-by-feature level.
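As a rough illustration, the Python below implements that kind of “if” statement against a tiny in-memory flag store. The flag name, the plan-based condition, and the store itself are all hypothetical:

```python
FLAGS = {
    "new_customizer": {"enabled": True, "allowed_plans": {"pro", "beta"}},
}

def is_enabled(flag_name: str, user: dict) -> bool:
    """Evaluate a flag for a user; unknown flags default to off."""
    flag = FLAGS.get(flag_name, {"enabled": False})
    return flag["enabled"] and user.get("plan") in flag.get("allowed_plans", set())

user = {"id": "u42", "plan": "pro"}
if is_enabled("new_customizer", user):
    print("render the new customizer")      # new code path
else:
    print("render the current customizer")  # stable code path
```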

For example, our e-commerce company can perform a blue-green deployment of their customizer microservice but leave the new code turned off behind a feature flag in the live system. Then, the DevOps team can turn on that feature according to whatever condition they wish, whenever it is convenient.

The team might want to do more A/B testing in production. Maybe they want to run additional fitness tests. Or it might make more sense for the team to do a canary release of the customizer for an identified set of early adopters.

Your feature flags can work in conjunction with a load balancer to manage which users see which application and feature subsets while performing a blue-green deployment. Instead of switching over entire applications all at once, you can cut over to the new application and then gradually turn individual features on and off on the live and idle systems until you’ve completely upgraded. This gradual process reduces risk and helps you track down any bugs as individual features go live one-by-one.

You can manually control feature flags in your codebase, or you can use feature flag services for more robust control. These platforms offer detailed reporting and KPI tracking along with a deep set of DevOps management tools.

We recommend using feature flags in any major application release when you’re doing a blue-green deployment. They’re valuable even in smaller deployments where you’re not necessarily switching environments. You can enable features gradually one at a time on blue, leaving green on standby as a hot backup if a major problem arises. Combining feature flags with blue-green deployments is an excellent way to perform continuous delivery at any scale.

Consider adding blue-green deployments to your DevOps arsenal

Blue-green deployments are an excellent method for managing software releases of any size, whether they involve a whole application, a major update, a single microservice, or a small feature change.

It is essential to consider how well blue-green deployments will integrate into your existing delivery process before adopting them. This article detailed how blue-green deployments work, the pros and cons of using them in your delivery process, and how they stack up against other possible deployment methods. You should now have a better sense of whether blue-green deployments might be a viable option for your organization.

Want to see other ways to improve your delivery process? Request a demo of our feature flagging platform today.

Canary Deployment and Canary Testing Explained

Picking an effective deployment strategy is an important decision for every DevOps team. Many options exist, and you want to find the strategy that best aligns with how you work. Today, we’ll go over canary deployments.

Are you an agile organization? Are you performing continuous integration and continuous delivery (CI/CD)? Are you developing a web app? Mobile app? Local desktop or cloud-based app? These factors, and many others, will determine how effective any given deployment strategy will be.

But no matter which strategy you use, remember that deployment issues are inevitable. A merge may go wrong, bugs may appear, or human error may cause a problem in production. The point is, don’t wear yourself out trying to find a deployment strategy that will be perfect. That strategy doesn’t exist.

Instead, try to find a strategy that is highly resilient and adaptive to the way you work. Rather than trying to prevent inevitable errors, deploy code in a way that minimizes their impact and lets you respond quickly when they do occur.

Canary deployments can help you put your best code into production as efficiently as possible. In this article, we’ll go over what they are and what they aren’t. We’ll go over the pros and cons, compare them to other deployment strategies, and show you how you can easily begin performing such deployments with your team.

In this article, we’ll go over:

[toc]

What is a canary deployment?

Canary deployments are a best practice for teams who’ve adopted a continuous delivery process. With this strategy, a new feature is first made available to a small subset of users. The new feature is monitored for several minutes to several hours, depending on the traffic volume, or just long enough to collect meaningful data. If the team identifies an issue, the new feature is quickly pulled. If no problems are found, the feature is made available to the entire user base.

The term “canary deployment” has a fascinating history. It comes from the phrase “canary in a coal mine,” which refers to the historical use of canaries and other small songbirds as living early-warning systems in mines. Miners would bring caged birds with them underground. If the birds fell ill or died, it was a warning that odorless toxic gases, like carbon monoxide, were present. While inhumane, it was an effective process used in Britain and the US until 1986, when electronic sensors replaced canaries.

Canary bird on digital background

A canary deployment turns a subset of your users—ideally a bug-tolerant subset—into your own early warning system. That user group identifies bugs, broken features, and unintuitive features before your software gets wider exposure.

Your canary users could be self-identified early adopters, a demographically targeted segment, or a random sampling: whichever mix of users makes the most sense for verifying your new feature in production.

One helpful way to think about canary deployments is as a form of risk management. You are free to push new, exciting features more regularly without having to worry that any one new feature will harm the experience of your entire user base.

Canary releases vs. canary deployments

The phrases “canary release” and “canary deployment” are sometimes used interchangeably, but in DevOps, they really should be thought of as separate. A canary release is a test build of a complete application. It could be a nightly release or a beta, for example.

Canary release example for a local app

Teams will often distribute canary releases hoping that early adopters and power users, who are more familiar with development processes, will download the new application for real-world testing. The browser teams at Mozilla and Google, and many other open-source projects, are fond of this release strategy.

On the other hand, canary deployments are what we described earlier. A team will release new features into production with early adopters or different user subsets, routed to the new software by a load balancer or feature flag. Most of the user base still sees the current, stable software.

Canary deployment example for a web app

Canary deployment pros and cons

Canary deployments can be a powerful and effective release strategy. But they’re not the correct strategy in every possible scenario. Let’s run through some of the pros and cons so you can better determine whether they make sense for your DevOps team.

Pros

Support for CI/CD processes

Canary deployments shorten feedback loops on new features delivered to production. DevOps teams get real-world usage data faster, which allows them to refine and integrate the next round of features faster and more effectively. Shorter development loops like this are one of the hallmarks of continuous integration/continuous delivery processes.

Granular control over feature deployments

If your team conducts smaller, regular feature deployments, you reduce the risk of errors disrupting your workflow. If you catch a mistake in a deployment, you won’t have exposed many users to it, and it will be a minor matter to resolve. You won’t have exposed your entire user population, and you won’t need to pull colleagues off planned work to fix a major production issue.

Real-world testing

Internal testing has its place, but it is no substitute for putting your application in front of real-world users. Canary deployments are an excellent strategy for conducting small-scale real-world testing without imposing the significant risks of pushing an entirely new application to production.

Developer working on a laptop

Quickly improve engagement

Besides offering better technical testing, canary deployments allow you to quickly see how users engage with your new features. Are session lengths increasing? Are engagement metrics rising in the canary? If no bugs are found, get that feature in front of everyone.

There is no need to wait for a more extensive test deployment to complete. Engage those users and get iterating on your next feature.

More data to make business cases

Developers may see the value in their code, but DevOps teams still need to make business cases to leadership and the broader organization when they need more resources.

Canary deployments can quickly show you what demand might be for new features. Conduct a deployment for a compelling new feature on a small group of influencer users to get them talking. Use engagement and publicity metrics to make the case why you want to push a major new initiative tied to that feature.

Stronger risk management

Canary deployments are effectively a series of microtests. Rolling out new features incrementally and verifying them one at a time with canary testing can significantly reduce the total cost of errors or larger system issues. You’ll never need to roll back a major release, suffer a PR hit, or rework a large and unwieldy codebase.

Cons

More overhead

Like any complex process, canary deployments come with some downsides. If you’re going to use a load balancer to partition users, you will need additional infrastructure and will take on some additional administration.

In this scenario, you create a second production environment and backend that will run alongside your primary environment. You will have two codebases, two app servers, potentially two web servers, and networking infrastructure to maintain.

Canary release step 1
Canary release step 2
Canary release step 3

Alternatively, many DevOps teams use feature flags to manage their canary deployments on a single system. A feature flag can partition users into a canary test at runtime within a single code base. Canary users see the new feature, and everyone else runs the existing code.

Deploying local applications is hard

If you’re developing a locally installed application, you run the risk of users needing to initiate a manual update to get the latest version of your software. If your canary deployment sits in that latest update, your new feature may not get installed on as many client systems as you need to get good test results.

In other words, the more your software runs client-side, the less amenable it is to canary deployments. A full canary release might be a more suitable approach to get real-world test results in this scenario.

Users are still exposed to software issues

While the whole point of a canary deployment is to expose only a few users to a new feature to spare the broader user base, you will still expose end users to less-tested code. If the fallout from even a few users encountering a problem with a particular feature is too significant, then consider skipping this kind of deployment in favor of more rigorous internal testing.

How to perform a canary deployment

Planning out a canary deployment takes a few simple steps:

Identify your canary group

There are several different ways you can select a user group to be your canary.

Random subset

Pick a truly random sampling of different users. While you can do this with a load balancer, feature flag management software can easily route a certain percentage of total traffic to a canary test using a simple modulo.

Early adopters

If you run an early adopter program for highly engaged users, consider using them as your canary group. Make it a perk of their program. In exchange for tolerating bugs they might encounter in a canary deployment, you can offer them loyalty rewards.

By region

You might want to assign a specific region to be your canary. For example, you could route European IPs to your canary deployment during late evening hours. You would avoid exposing daytime users to your new features but still get a handful of off-hours user sessions to use as a test.

Internal testers

You can always configure sessions from your internal subnets to be the canary.
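A hedged sketch combining several of these selection rules into one eligibility check might look like the Python below. The subnet prefix, region code, evening-hours window, and 5 percent sample are all illustrative assumptions:

```python
import hashlib
from datetime import datetime, timezone

def in_canary(user: dict, percent: int = 5) -> bool:
    # Internal testers (here, a private subnet) are always in the canary.
    if user.get("ip", "").startswith("10.0."):
        return True
    # European users during late evening hours (UTC) join the canary.
    if user.get("region") == "EU" and datetime.now(timezone.utc).hour >= 21:
        return True
    # Everyone else: a stable random sample via hashing, e.g., 5 percent.
    bucket = int(hashlib.sha256(user["id"].encode()).hexdigest(), 16) % 100
    return bucket < percent

print(in_canary({"id": "u7", "ip": "203.0.113.9", "region": "US"}))
```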

CI/CD and canary deployment diagram

Decide on your canary metrics

The purpose of conducting a canary deployment is to get a firm “yes” or “no” answer to the question of whether your feature is safe to push into wider production. To answer that question, you first need to decide what metrics you’re going to use and install the means for monitoring performance.

For example, you may decide you want to monitor the following (a simple pass/fail check over these metrics is sketched below):

  • Internal error counts
  • CPU utilization
  • Memory utilization
  • Latency

You can customize feature management software quickly and easily to monitor performance analytics. These platforms can be excellent tools for encouraging a culture of experimentation.

Decide how to transition from canary to full deployment

As discussed, canary deployments should only last on the order of several minutes to several hours; they are not intended to be overly long experiments. Because the timeframe is so short, your team should decide up front how many users or sessions you want in the canary and how you’re going to move to full deployment once your metrics hit positive benchmarks.

For example, you could go with a 5/95 random canary deployment. Configure a feature flag to move a random 5 percent of your users to the canary test while the remaining 95 percent stay on the stable production release. If you see positive results, remove the flag and deploy the feature completely.

Or you might want to take a more conservative approach. Another popular canary strategy is to deploy a canary test logarithmically, going from a 1 percent random sample to 10 percent to see how the new feature stands up to a larger load, then up to a full 100 percent.
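A sketch of that staged ramp in Python might look like the following, assuming a flag service whose exposure percentage can be set programmatically through a callback:

```python
STAGES = [1, 10, 100]  # logarithmic ramp: 1% -> 10% -> 100%

def ramp(set_exposure, passes_checks) -> bool:
    """Widen exposure stage by stage, stopping if any check fails."""
    for percent in STAGES:
        set_exposure(percent)
        if not passes_checks():
            set_exposure(0)  # kill the canary and investigate
            return False
    return True  # the feature is fully deployed

# Example wiring with stand-in callbacks:
ok = ramp(lambda p: print(f"exposure set to {p}%"), lambda: True)
print("full deployment" if ok else "rolled back")
```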

Determine what infrastructure you need

Once your team is on the same page about the approach you’ll take, you’ll need to make sure you have all the proper infrastructure in place to make your canary deployment go off without a hitch.

You need a system for partitioning the user base and for monitoring performance. You can use a router or load balancer for the partitioning, but you can also do it right in your code with a feature flag. Feature flags are often more cost-effective and quicker to set up, and they can be the more powerful solution.

Canary vs. blue/green deployments

Canary deployments are also sometimes confused with blue/green deployments. Both can use parallel production environments—managed with a load balancer or feature flag—to mitigate the risk of software issues.

In a blue/green deployment, those environments start identical, but only one receives traffic (the blue server). Your team releases a new feature onto the hot backup environment (the green server). Then the router, feature flag, or whatever is managing traffic gradually shifts new user sessions from blue to green until 100 percent of all traffic goes to green. Once the cutover is complete, the team updates the now-idle blue server with the new feature, and it becomes the hot backup environment.

The way the switchover is handled in these two strategies differs because of the desired outcome. Blue/green deployments are used to eliminate downtime. Canary deployments are used to test a new feature in a production environment with minimal risk and are much more targeted.

Blue-green deployment diagram with a single database

Use feature flags for better deployments

When you boil it right down, a feature flag is nothing more than an “if” statement from which users take different code paths at runtime depending on a condition or conditions you set. In a canary deployment, that condition is whether the user is in the canary group or not.

Let’s say we’re running a fledgling social networking site for esports fans. Our DevOps team has been hard at work on a content recommender that gives users real-time recommendations based on livestreams they’re watching. The team has refined the recommendation feature to be significantly faster. It has performed well in internal testing, and now they want to see how it performs under real-world conditions.

The team doesn’t want to invest time and money into installing new physical infrastructure to conduct a canary deployment. Instead, the team decides to use a feature flag to expose the new recommendation engine to a random 5 percent sample of the user base.

The feature flag splits users into two groups with a simple modulo when they load a live stream. Within minutes, the team gets results back from a few thousand user sessions running the new code. It does, in fact, load faster and improve user engagement, but there is an unanticipated spike in CPU utilization on the production server. Ops staff realize it is about to degrade performance, so they kill the canary flag.
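A minimal sketch of the pattern in this story is the Python below: a modulo split plus a kill switch that ops can flip without redeploying. The flag store and all names are hypothetical:

```python
flag_state = {"recommender_canary": {"live": True, "percent": 5}}

def use_new_recommender(user_id: int) -> bool:
    """Route a user to the new code only while the canary flag is live."""
    flag = flag_state["recommender_canary"]
    return flag["live"] and (user_id % 100) < flag["percent"]

print(use_new_recommender(3))  # True: user 3 falls in the 5% canary
flag_state["recommender_canary"]["live"] = False  # ops kill the canary
print(use_new_recommender(3))  # False: everyone is back on stable code
```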

Canary test settings in Flagship management software

The team agrees not to proceed with rollout until they can debug why the new code caused the unexpected server CPU spike. Thanks to the real-world test results provided by the canary deployment, they have a pretty good idea of what was going on and get back to work.

Feature flags streamline and simplify canary deployments. They remove the need for a second production environment, and using feature flag management software like AB Tasty allows for sophisticated testing and analysis.