Article

10min read

Measure your DevOps Performance: DORA Metrics

As software development becomes more decentralized and the number of teams working on different projects, often in different places, grows, it becomes much harder to set and track metrics that measure performance across these teams.

And yet data is now more important than ever. Data is a company’s most valuable asset for measuring how efficiently teams perform over time as they deliver the best products and user experience to customers.

This is especially relevant for DevOps teams where there’s a need for a clear framework to measure their performance.

This is where DORA metrics come in.

What are DORA metrics?

DORA metrics provide a standard framework that helps leaders implementing a DevOps methodology in their organization measure the performance of their teams.

This framework was the result of a six-year research program conducted by Google Cloud’s DevOps Research and Assessment (DORA) team after analyzing survey responses from over 32,000 professionals worldwide. Their goal was to determine the most effective ways to develop and deliver software.

Through the use of behavioral science, the research identified four key metrics that would indicate the performance of a software development team. 

With these metrics, teams can measure their software delivery performance, monitor it over a period of time and be able to easily identify areas of improvement to optimize performance. In that sense, they shed light on the capabilities that drive high performance in technology delivery.

Therefore, DORA metrics are especially relevant for DevOps teams as they provide concrete data to measure performance and improve the effectiveness of DevOps operations. They also allow teams to assess whether they are building and delivering software that meets customer requirements, and to gain insights into how to improve and provide more value for customers.

The four DORA metrics

In this section, we will list the four main metrics that the DORA team identified for DevOps teams to measure their performance. 

The 2022 State of DevOps report, updated each year, charts the ranges of each metric according to the different categories of performers.

The four key metrics used are:

  1. Deployment frequency

Deployment frequency measures velocity. In this case, the goal is to measure how often an organization successfully deploys code to production or releases it to end users.

This is an important metric particularly for DevOps teams whose ultimate goal is to release software quickly and frequently. It helps teams to measure their productivity and agility as well as uncover issues and bottlenecks in their workflow that may be slowing things down.

High performing teams deploy on-demand, multiple times a day. Thus, this metric stresses the importance of continuous development and deployment, which is one of the principles of a DevOps methodology.

Each organization will need to consider what constitutes a “successful” deployment for its teams, for example by taking into account what level of traffic is sufficient to represent a successful deployment.
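As an illustration, here is a minimal sketch of how deployment frequency might be computed from deployment timestamps exported from a CI/CD tool. The sample data and helper function are hypothetical, not part of any DORA tooling:

```python
from datetime import datetime

# Hypothetical deployment timestamps exported from a CI/CD tool.
deployments = [
    datetime(2023, 3, 1, 9, 30),
    datetime(2023, 3, 1, 15, 10),
    datetime(2023, 3, 2, 11, 45),
    datetime(2023, 3, 6, 10, 5),
]

def deployment_frequency_per_day(timestamps):
    """Average number of successful deployments per day over the observed window."""
    if not timestamps:
        return 0.0
    window_days = (max(timestamps) - min(timestamps)).days + 1
    return len(timestamps) / window_days

print(f"{deployment_frequency_per_day(deployments):.2f} deployments/day")
```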

How to improve this metric:

To enhance this metric, it’s usually best to ship code in small batches on a frequent basis. This reduces the risk of deploying bugs and increases the speed of delivery. Implementing an automated CI/CD pipeline will also enable you to increase deployment speed.

  2. Lead time for changes

Lead time for changes is the amount of time it takes a commit to get into production. Therefore, this metric also seeks to measure velocity and gives an indication of a team’s cycle time. The lower the lead time for changes, the more efficient the team is at deploying code.

This metric requires looking at two pieces of data: when the commit happened and when it was deployed. The goal is to track the time from when code is committed until it is finished and deployed, to uncover any inefficiencies in a team’s processes. The average time can then be used to analyze overall performance.

In other words, the purpose of this metric is to give an indication of the waiting time between the initial implementation of a change and its deployment. A high lead time may suggest inefficiencies in the CI/CD pipeline and not enough automation, especially if every change has to go through manual testing, which significantly slows things down.
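To make the calculation concrete, below is a minimal sketch assuming you can pair each deployed commit’s timestamp with its deployment timestamp; the sample data and helper are hypothetical:

```python
from datetime import datetime
from statistics import mean

# Hypothetical (commit_time, deploy_time) pairs for changes that reached production.
changes = [
    (datetime(2023, 3, 1, 9, 0), datetime(2023, 3, 1, 14, 0)),
    (datetime(2023, 3, 2, 10, 0), datetime(2023, 3, 3, 9, 0)),
    (datetime(2023, 3, 4, 8, 0), datetime(2023, 3, 4, 20, 0)),
]

def average_lead_time_hours(pairs):
    """Average time from commit to production deployment, in hours."""
    return mean((deploy - commit).total_seconds() / 3600
                for commit, deploy in pairs)

print(f"Average lead time: {average_lead_time_hours(changes):.1f} hours")
```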

How to improve this metric:

Again, here it’s best to work with smaller changes. This allows for faster feedback so developers can immediately fix any issues. Teams should also eliminate bottlenecks and integrate automated testing at every stage of the CI/CD pipeline to detect issues early on. 

Feature flags are also a great tool to lower lead time as any unfinished changes can be hidden behind a flag while other changes can be deployed.

  3. Change failure rate

This represents the percentage of deployments that cause a failure in production. In other words, it measures any changes to code that resulted in incidents, rollbacks or any other failures. It depends on the number of deployments attempted and how many of those resulted in failures in production.

As a result, this metric is a measure of stability and quality, while the previous two focus mainly on the speed of software delivery.

This metric is calculated by dividing the number of deployments that resulted in failures by the total number of deployments. The resulting percentage gives insight into how much time is dedicated to fixing errors as opposed to delivering new code.
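As a sketch, the arithmetic itself is simple; the hard part, as noted below, is deciding what counts as a failure. The counts here are hypothetical:

```python
def change_failure_rate(failed: int, total: int) -> float:
    """Percentage of deployments that caused a failure in production."""
    return 0.0 if total == 0 else failed / total * 100

# Hypothetical counts: 3 failed deployments out of 40 attempted.
print(f"Change failure rate: {change_failure_rate(3, 40):.1f}%")  # 7.5%
```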

The lower the rate the better. High performing teams have a change failure rate of 0-15%. 

Consequently, a low change failure rate is a sign that a team has an efficient deployment process in place, which can mainly be achieved by automating every step of the process to avoid common manual errors.

It’s important to note, however, that this metric can be hard to quantify as the definition of failure can vary widely. Therefore, it’s best for each organization to set goals for its teams according to their unique business objectives. 

How to improve this metric:

Automation is also crucial to improving this metric. Automated tests can evaluate code at every stage of its development, so issues are caught and fixed early on and are less likely to make it to production. Creating critical feedback loops is also necessary to achieve a low change failure rate and prevent similar incidents from happening again in the future.

  4. Time to restore service

Also referred to as “mean time to recovery” (MTTR), this indicates how long it takes for an organization to recover from a failure in production that impacts user experience.

This metric, like change failure rate, is meant to determine the stability of a system or application when unplanned outages occur. Thus, measuring the time to restore service requires information about when the incident occurred and when it was resolved and the fix deployed.
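For illustration, here is a minimal sketch of the calculation, assuming each incident record carries the time it was opened and the time service was restored; the incident data is hypothetical:

```python
from datetime import datetime
from statistics import mean

# Hypothetical (incident_start, service_restored) timestamp pairs.
incidents = [
    (datetime(2023, 3, 1, 14, 0), datetime(2023, 3, 1, 15, 30)),
    (datetime(2023, 3, 8, 9, 0), datetime(2023, 3, 8, 9, 45)),
]

def mean_time_to_restore_minutes(records):
    """Average time from incident start to service restoration, in minutes."""
    return mean((restored - started).total_seconds() / 60
                for started, restored in records)

print(f"MTTR: {mean_time_to_restore_minutes(incidents):.0f} minutes")
```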

Therefore, the “time to restore service” metric is important as it encourages teams to build more stable systems and create action plans to be able to respond immediately to any failures. High performing teams will resort to deploying in small batches to reduce risk while increasing speed of delivery.

This is particularly applicable to DevOps teams as they place high emphasis on the idea of continuous monitoring, which will in turn help them to improve their performance when it comes to this metric.

How to improve this metric: 

Consider using feature flags. Feature flags act as switches that enable you to turn a change on or off in production. If something goes wrong with a change in production, you can toggle the flag off with minimal disruption while the issue is being resolved, which will help reduce your MTTR.
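To illustrate the idea independently of any particular feature flag product, here is a minimal sketch of a flag acting as a kill switch; the `flags` store and the `new_checkout_flow` flag are hypothetical stand-ins for whatever configuration your flag service exposes:

```python
# Hypothetical flag store; in practice this would be served by a feature flag service.
flags = {"new_checkout_flow": True}

def is_enabled(flag_name: str) -> bool:
    """Check whether a feature is currently switched on."""
    return flags.get(flag_name, False)

def checkout(cart: list) -> str:
    if is_enabled("new_checkout_flow"):
        return f"new checkout flow with {len(cart)} items"  # the risky change
    return f"legacy checkout flow with {len(cart)} items"   # safe fallback

# If the new flow misbehaves in production, disable it instead of rolling back:
flags["new_checkout_flow"] = False
print(checkout(["book", "pen"]))  # -> legacy checkout flow with 2 items
```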

The DORA metrics can then be compiled into a dashboard. To help with this, the DORA team created the Four Keys dashboard template to generate data based on the metrics and visualize the results.

The dashboard gives a higher-level view for senior stakeholders of their organization’s DORA metrics to understand how their teams are performing and what corrections can be done to remedy any problems.
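Purely as a sketch of the idea, and not the actual Four Keys pipeline, the four measurements above could be combined into a single summary record for such a dashboard:

```python
# Hypothetical summary record for a DORA dashboard, using the example values
# from the sketches above; a real pipeline would compute these continuously
# from CI/CD and incident data.
dora_summary = {
    "deployment_frequency_per_day": 0.67,
    "lead_time_for_changes_hours": 13.3,
    "change_failure_rate_percent": 7.5,
    "time_to_restore_minutes": 68,
}

for metric, value in dora_summary.items():
    print(f"{metric}: {value}")
```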

Why are DORA metrics important?

As we’ve already mentioned, DORA metrics are a great way to keep track of the performance of DevOps teams and identify areas of improvement.

They help organizations assess their delivery process and encourage teams to streamline their processes and increase the speed of delivery while maintaining quality.

As a result, the main benefits of these metrics are:

  • More effective decision-making: with the data acquired from these metrics, teams know which aspects need improvement and where to focus. Teams can easily detect issues and bottlenecks within the software development process and devise an action plan to address them. Decisions will be based on data rather than on opinions or gut feelings, which may be misleading.
  • Better value: DORA metrics give teams an indication of whether they are delivering value to customers, by evaluating the efficiency of the value stream and finding areas within the delivery process to improve in order to build higher-quality products.
  • Continuous improvement: this is particularly important, as it is one of the main pillars of a DevOps methodology. Using DORA metrics, teams get insight into their performance and can set goals to improve the quality and delivery of software.

Challenges of DORA metrics

DORA metrics have a lot of advantages, but they do come with their own challenges as well.

One of the main challenges with these metrics is that they will vary across organizations and teams, which often have different definitions and processes in place. In other words, no two products or teams are the same, and each may operate at its own level of complexity. As a result, it’s important to put this data into context before making decisions.

DORA metrics give a good overall picture of how teams are performing in certain categories. It’s important to have a valid way to keep track of the data, but don’t rely solely on these metrics.

Teams may be facing issues beyond what is accounted for in these metrics. DORA metrics focus mainly on outcomes rather than on the inputs and processes that lead to them. Sometimes there’s more to the story than what DORA metrics measure, so tread carefully.

Ultimately, enhancing performance will be unique to each organization. Work on shifting your attention to your team and goals to give context to the story all these metrics are telling. Focus on building the right culture for your team and providing them with the tools they need to enhance performance. This, in turn, will help them deliver business value faster. 

DORA metrics: The key to unlocking more value

DORA metrics are a great starting point, especially to help teams make informed decisions about what can be improved and the steps to take to achieve that. 

They give a good indication of a team’s progress along their DevOps journey and encourage the implementation of the key principles of DevOps including shipping in small batches more frequently.

In particular, they enable teams to assess and analyze the efficiency of their development and delivery processes by offering a framework for measuring performance across two important variables in DevOps: speed (deployment frequency and lead time for changes) and stability (change failure rate and time to restore service).

Teams will then be able to create more value for their customers faster. Above all, DORA metrics are a way for teams to shift their focus to maximizing velocity and stability.

It’s important to note that tracking these metrics should be in line with your organizational goals and customers’ needs, to give them context, make sense of them and improve them.

Feature flags are another effective way to improve performance across these metrics. They allow you to ship new changes in small batches and hide any that are not yet ready, speeding up deployment while reducing the risk of big-bang releases and making problems easier to detect and resolve.

Get a demo of AB Tasty to unlock the value feature flags can bring to your teams today.


Article

7min read

Understanding User Behavior Through Data

As part of our customer-centric data series, we are speaking with AB Tasty partners about how brands can use data to better profile customers, understand their needs, and forge valuable emotional connections with them, as well as measure overall digital impact and build a data-based, customer-centric vision.

You can catch the beginning of the series here. For our 8th installment, we spoke with Helen Wilmot, UX Director at Dentsu International. Dentsu is one of the largest marketing and digital companies in the world and provides communication, marketing and digital strategies across a range of disciplines. Helen leads UX strategy and research for Dentsu.

Why customers come to Dentsu

Dentsu is an integrated global company. Clients often come for the different channels it operates in, as well as for having several disciplines collaborating under one roof to serve them. More than that, Dentsu is a multinational company with a network of global expertise focused on providing value to clients, so that they see they are linked to something larger. The UX and optimization team, specifically, is tightly integrated with other channels such as brand and digital strategy, SEO, and commercial, and all of those aspects are taken on board when looking at optimization.

That means growth is one of the major factors at play: “We are firmly focused on being customer-centric and providing value for the users,” says Helen. “We also live in the real world and we want to tie that value to real growth. We know that being customer-centric drives growth, study after study after study shows that, but it’s important that we can show that ourselves in our own behaviors.”

Understanding user behavior

Helen stresses that there is no single way to understand user behavior and as such it will often depend on what is appropriate for the research or business problem at hand. Dentsu combines many techniques when looking to understand customer behavior, often a balance of qualitative and quantitative data. In many cases they are looking for a breadth of data to make it accurate – interviews, usability testing, AI, eye tracking, data insights, and card sorting all play a part.

The techniques employed depend on the customer and the business model. Consistently, however, they are led by data. Researchers perform usability tests, both at a distance via remote studies and through in-person moderated user testing. Having users try something out in their own environment on their own devices also helps Dentsu assess more natural behavior. They sometimes employ ethnographic techniques, borrowed from anthropology, in which researchers are embedded in users’ real experiences. Looking at interaction with a topic or task in real time, this research is more generative and exploratory but can help uncover larger issues. New technology in behavioral science and neuroscience can look at emotions, implicit response testing and eye behavior.

What customers say they want and how they actually behave

Specifically, when Helen’s team conducts interviews they look at how users behave as well as what they say. 

“User testing should be about observing,” shares Helen. “If the user said they found the task easy, but they clearly didn’t, it’s a sign to use our critical thinking to evaluate that feedback.”

So Dentsu always dig deeper to back those statements up with data. They have a rigorous approach to their research, including a laddering technique used to get to the root cause of people’s thoughts and experiences. “It’s all working within the reality of how we know people’s minds work. We know that people are terrible at remembering exactly how they felt and terrible at predicting our own behavior. So that’s not to say that experiences don’t matter, but as researchers, our job is to work within the reality of how human psychology works too,” says Helen.

The importance of testing and experimentation

Another way Dentsu works to validate user feedback is with a reliance on A/B testing. In the past, they have had users report certain things that motivate them, but the A/B tests do not back up that information. For Dentsu, testing out a hypothesis is a crucial part of their optimization process and it is unthinkable that they would go ahead with new ideas without testing them first. 

“I think it’s a bit of an act of hubris if you don’t go ahead and test,” stresses Helen. “The risk just absolutely shoots up, and you can’t de-risk a solution without testing it. Even if a solution is successful and is fit for purpose, there are always iterative changes that you can make. That is the beauty of a testing and experimentation mindset. You are never finished, and we never think that what we do is perfect.”

Testing allows Dentsu to move at speed. Some tests don’t always work, but Helen points out that losing tests also brings a wealth of information for other ideas. Knowing something doesn’t work for your users informs you for your next test and when you are working with future customers, it can point you in the right direction as to what techniques are currently effective for messaging and psychological persuasion.

The KPIs and metrics within user behavior

For UX, one of the main KPIs is NPS (Net Promoter Score), just as it was for Realise. In usability tests, Dentsu will look at the number of errors someone makes. If they are testing a certain structure or tree testing, they will look at success rate and directness. NPS complements all of these because numerous studies associate it with growth. There is also a holistic view to take when it comes to user behavior metrics: which areas within the business are being affected by friction on the website? It could be customer support, payment or delivery, but they are all vital to the user experience.

A good user journey

A good user journey will always provide revenue. The two are intrinsically linked for Dentsu. If you get it right, people become more engaged in the brand, more engaged with what you are offering and it makes them more likely to do what you want. Helen wants to go further, though. Usability and revenue are vital aspects, but they are only the beginning.

Helen explains, “I think usability is the absolute baseline that we should aim for, and we really should be focused on delighting customers while creating emotionally resonant experiences. There is strong data on the link between emotion, memory and brand perception, and by creating these rich, emotionally resonant experiences we can boost lifetime value.”

It seems that in the current climate, people are placing an emphasis on the bottom line and conversion, but, as Helen shares, the original thinkers on UX, such as Donald Norman, saw delight as a source of insight into optimization. Brand and user experience are inevitably related, and users who have an emotional connection with a brand will give you more and more chances to be present in their lives.

“It’s important to view delight as a bit of a North Star. If you make your users happy, you’ll make shareholders happy as well.”

You can find out more about our customer-centric data series by looking at our previous installment on how to solve real user problems with a CRO strategy.