Agile and DevOps Methodology (2019)

What is Agile and DevOps Methodology?

There are large organizations in the industry using leading-edge techniques such as Agile and DevOps to develop software faster and more efficiently than anyone ever thought possible. This tutorial explains how Agile and DevOps methodology is used, with examples.


Think of Google, Amazon, and Facebook. Currently, however, the majority of software is not developed by leading-edge groups like these, but by more traditional organizations using less efficient approaches. This blog is written to help leaders of these traditional organizations understand how to successfully transform their development and delivery processes.


Improving the effectiveness of software development in traditional organizations is essential because software is a key way businesses now compete across a broad range of industries.


Mechanical engineers who designed and built cars once led the automobile industry. Then, through no fault of their own, they found that computers had infiltrated their product and become a larger part of the value they provide to their customers.


Now, instead of showing off the car engine, salespeople start with a screen for the entertainment and control system, all based on software.


Financial institutions that used to depend on traders working the floor and brokers forging customer relationships are finding that software for managing trades and interacting with their customers is helping them stay competitive.


Retail has gone from building, stocking, and managing stores to creating software that provides a common customer experience across stores, websites, and mobile devices and that manages inventory more efficiently across all these channels.


While it is clear that software is becoming a more and more important aspect of how these companies need to compete, most large, traditional organizations are struggling to deliver. They can’t respond to changes in the marketplace fast enough, and the businesses are getting frustrated.


These companies are typically struggling with lots of hard-to-change, tightly coupled legacy software that requires them to coordinate development, qualification, and deployment efforts across hundreds to thousands of engineers, making frequent deliveries impossible.


The deliveries they do provide require lots of brute-force manual effort that is frustrating and burning out their teams.


The net result is that most large, traditional organizations are finding it more and more difficult to compete in the marketplace and deliver the software innovations that their businesses require. Their current software delivery approaches are constraining their businesses and limiting their ability to compete.


Because their current approaches don’t work, many larger organizations are looking to leverage the successes that smaller businesses have seen using Agile methodologies.


They bring in Agile coaches and start forming Agile teams to apply Agile principles at the team level. The problem with this approach is that in small organizations, a couple of small Agile teams can organize to support the business.


In large, traditional organizations, however, most of the time individual teams can’t independently deliver value to the customer because it requires integrating work across hundreds of developers and addressing all the inefficiencies of coordinating this work. These are issues that the individual teams can’t and won’t solve on their own.


This is why the executives need to lead the transformation. They are uniquely positioned to lead the all-important cultural changes and muster the resources to make the necessary organization-wide technical changes.


This tutorial will provide a fundamentally different approach for transforming the software development processes in large, traditional organizations by addressing the organization-wide issues that you, the executives, are uniquely positioned to handle.


While most Agile implementations start with a focus on applying Agile principles at the team level, the approach presented in this blog focuses on applying the basic principles of Agile and DevOps across the organization.


It is based on what we, as executives leading complex transitions in large, traditional organizations, have found to be most effective for delivering solid business results.


Case study 

Many specifics referenced in this blog are leveraged from a case study of transformation at HP, detailed in A Practical Approach to Large-Scale Agile Development.


This case study includes the following dramatic results:

  1. Development costs reduced from ~$100M to ~$55M
  2. 140% increase in the number of products being supported
  3. Increased capacity for innovation from 5% to 40%
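To put the first and third figures in relative terms, here is a minimal sketch of the arithmetic. The variable and function names are ours, chosen for illustration; the inputs are the approximate values quoted in the list above.

```python
# Approximate figures quoted in the case study; names are illustrative only.
cost_before_musd = 100   # development cost before the transformation, in $M
cost_after_musd = 55     # development cost after the transformation, in $M
innovation_before_pct = 5    # share of capacity available for innovation, before
innovation_after_pct = 40    # share of capacity available for innovation, after


def percent_reduction(before, after):
    """Relative reduction, expressed as a percentage of the starting value."""
    return (before - after) * 100 / before


def growth_factor(before, after):
    """How many times larger the new value is than the old one."""
    return after / before


cost_cut = percent_reduction(cost_before_musd, cost_after_musd)
innovation_gain = growth_factor(innovation_before_pct, innovation_after_pct)

print(f"Development cost reduction: {cost_cut:.0f}%")
print(f"Innovation capacity: {innovation_gain:.0f}x the starting level")
```

Under these inputs the cost reduction works out to 45% and the capacity for innovation to eight times its starting level.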


The organization at HP achieved these results by applying DevOps and Agile principles at scale. Our focus was on applying the principles at the executive staff level, and we left the teams with as much flexibility in operational choices as possible.


There were some groups that applied all the team-level Agile principles and some that chose to operate with more traditional methods.


What we found, in the end, is that there were not dramatic differences in the teams’ productivity based on the methods they used. There were, though, dramatic improvements in the overall productivity of the entire organization.


This led to our conclusion that how teams come together to deliver value in large organizations is the first-order effect, while how individual teams work is a second-order effect.


Therefore, this blog will primarily focus on how to transform the way the teams come together to provide value to the business by integrating all their changes early and often in an operation-like environment.


This is one of the most important steps in improving the effectiveness of large organizations because it forces teams to resolve conflicts early, before too much time and effort is wasted on code that won’t work together in production.


Then, when that part of the transformation is complete, the organization will have the right framework in place to continue improving and fine-tuning how the individual teams work with more traditional Agile methods at the team level.


Executives need to understand that applying Agile and DevOps principles at scale both differs significantly from typical Agile implementations and provides quicker time to value. To see why, they first need to understand the challenges that large organizations experience with traditional approaches.


The first thing executives need to understand about our approach is that it is paramount to begin with business objectives. You should never “do Agile or DevOps” just so you can say you did them.


A large-scale transformation is too much work and turmoil just to be able to say you are “doing Agile.” We believe that the key reason executives would be willing to take on this much change is that their current development processes are failing to meet the overarching needs of the business.


Executives are in the best position to understand those failings and the needs of the business, so they are best suited to clarify the objectives of the transformation.


Later, we will go into how executives begin to lead the transformation, using these objectives to communicate the vision, prioritize improvements, and show progress.

Once the business objectives have clarified the long-term goals of the transformation, executives then will use an enterprise-level continuous improvement process to engage the organization throughout the journey.


Because it is so hard to measure process improvements with software, executives can’t just manage the change by metrics like they would other parts of their business.


They are going to have to engage with the organization to get a more qualitative understanding of what is working and what needs fixing next. This transformation can’t be top-down, just like it can’t be bottom-up.


The continuous improvement process is designed to engage the broader organization in setting objectives the team feels are important and achievable.


Additionally, since a transformation of this size can take years and is going to be such a discovery process, it is designed to capture and respond to what everyone is learning along the way.


The executives will use a combination of the business objectives and the continuous improvement process to lead the transformation and prioritize improvements based on what will provide the biggest benefit to the business. 


In this blog, we will use the term enterprise-level to describe an organization with software development efforts that require 100 or more engineers to coordinate the development, qualification, and release of their code. It does not refer to a coordinated effort across an organization the size of HP, because that would just be too complex.


The plan for transforming the organization should be kept as small as possible to reduce complexity. But if different applications in the enterprise have to be qualified together to ensure they work in production, then they should be included as part of the same enterprise-level transformation.


Once the business objectives and continuous improvement process are in place, executives can start changing development processes by applying Agile and DevOps principles at scale.


This will require two big changes: applying Agile principles to the planning process and using DevOps to address the basic Agile principle of being able to economically release smaller batches of changes on a more frequent basis.


Executives need to understand that managing software and the planning process in the same way that they manage everything else in their organization is not the most effective approach.


Software has a few characteristics that are different enough from everything else that it makes sense to take a different approach.


First, each new software project is new and unique, so there is a higher degree of uncertainty in the planning. Second, the ability of organizations to predict the effectiveness of software changes is so poor that literally 50% of all software is never used or does not meet its business objectives.


Third, unlike any other asset in business, software that is developed correctly is much more flexible and cheaper to change in response to shifts in the market.


If the planning process doesn’t take these differences into account, executives are likely to make the classic mistake of locking in their most flexible asset to deliver features that will never be used or won’t ever meet the business intent.


Additionally, if executives don’t design the planning process correctly, it can end up using a lot of the organization’s capacity without providing much value.


In the last section, we will cover how to design processes to minimize investments in planning and requirement breakdown but still support the critical business decisions by breaking the planning process into different time horizons and locking in capacity over time.


The DevOps approach

The DevOps approach of integrating working code across the organization in an operation-like environment is one of the biggest challenges for large, traditional organizations, but it provides the most significant improvements in aligning the work across teams.


It also provides the real-time feedback engineers need to become better developers. For this to work well, the continuous deployment pipeline needs to be designed to quickly and efficiently localize issues in large, complex systems and organizations.


It requires a large amount of test automation, so it is important that the test automation framework is designed to quickly localize issues and can easily evolve with the application as it changes over time.


This is a big challenge for most organizations, so the executives need to make sure to start with achievable goals and then improve stability over time as the organization’s capabilities improve.


Teams can’t and won’t drive this level of change, so the executives need to understand these concepts in enough detail to lead the transformation and ensure their teams are on the right track. 


Transforming development and delivery processes in a large, traditional organization requires a lot of technical changes that will require some work, but by far the biggest challenges are with changing the culture and how people work on a day-to-day basis.


What do these cultural shifts look like? Developers treat keeping a stable trunk in a production-like environment as job #1. Development and Operations teams use common tools and environments to align them on a common objective.


The entire organization agrees that the definition of done at the release branch means that the feature is signed off, defect-free, and the test automation is ready in terms of test coverage and passing rates.


The organization embraces the unique characteristics of software and designs a planning process that takes advantage of the software’s flexibility. These are big changes that will take time, but without the executives driving these cultural shifts, the technology investments will be of limited value.


From business objectives and continuous improvement to planning and DevOps, Leading the Transformation takes you through the step-by-step process of how to apply Agile and DevOps principles at scale.


It is an innovative approach that promises markedly better business results with less up-front investment and organizational turmoil.




Traditional implementations that focus on scaling small Agile teams across the organization are very different from applying Agile and DevOps principles at scale. Executives play a key role in communicating the advantages of the latter approach and in explaining how it differs from what is typically done in the industry.


This blog outlines the basic Agile principles for executives and highlights the limitations of the typical approach of scaling small teams across the organization. This information is vital to executives looking to avoid the struggles of traditional implementations and to capitalize on the business benefits of a successful transformation.


Waterfall Method vs. Agile

As many executives know, the Waterfall Method of development leverages project management principles used for managing many types of projects. It starts by gathering requirements and then planning all the work.


Development begins after the planning, and then the software is integrated for the final qualification and release. The goal of this approach is to structure the program such that you can determine the schedule, scope, and resources up front.


Large, complex software development projects, however, are fundamentally different than other types of projects, and traditional project management approaches are not well equipped to deal with these differences.


Software development is such a discovery process that many of the assumptions made in the planning stage quickly become obsolete during development.


Additionally, integration and qualification tend to uncover major issues late in the process, which results in frequent and costly schedule slips and/or working the teams to death.


The first step in leading the transformation is understanding that Agile principles are a response to the shortcomings of using traditional Waterfall project management approaches for software. They were proposed as a framework to address these unique software development challenges.


Instead of long phases of requirements, planning, development, and qualification, there are much smaller iterations where complete features are integrated and qualified on a regular basis. Additionally, the entire code base is kept stable so that the code can be released at the end of each iteration if required.


This fixes the schedule and resources while letting the scope absorb the program uncertainty. The features are all worked on in priority order, with the most valuable features being developed first.
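A minimal sketch of this idea follows, assuming a hypothetical backlog in which each feature carries a priority rank and a rough effort estimate. The `Feature` class, the `plan_release` function, and all the numbers are invented for illustration; they are not taken from any particular Agile tool or from the case study.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    priority: int      # 1 = most valuable to the business
    effort_days: int   # rough estimate, hypothetical


def plan_release(features, capacity_days):
    """Work features strictly in priority order until fixed capacity runs out.

    Schedule and resources are fixed (capacity_days); scope is whatever fits.
    Returns (delivered, deferred) as lists of feature names.
    """
    ordered = sorted(features, key=lambda f: f.priority)
    delivered = []
    remaining = capacity_days
    for index, feature in enumerate(ordered):
        if feature.effort_days > remaining:
            # Capacity is exhausted: everything from here on is deferred.
            return delivered, [f.name for f in ordered[index:]]
        delivered.append(feature.name)
        remaining -= feature.effort_days
    return delivered, []


# Hypothetical backlog, deliberately listed out of priority order.
backlog = [
    Feature("export", priority=3, effort_days=15),
    Feature("login", priority=1, effort_days=10),
    Feature("search", priority=2, effort_days=20),
    Feature("themes", priority=4, effort_days=5),
]

delivered, deferred = plan_release(backlog, capacity_days=35)
print("Delivered:", delivered)
print("Deferred:", deferred)
```

With a fixed capacity of 35 days, the two highest-priority features are delivered and the rest are deferred. The release date does not move; only the scope does.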


Agile practitioners have numerous examples where after delivering less than 50% of the original must-have features, the customer is happy with the product and no longer requesting more features.


Contrast this with the Waterfall Method, where there are no regular code drops. The qualification and integration process would not have started until all the must-have features were complete, taking much longer to deliver any value and creating features that may not have been necessary.


While there are many other benefits to Agile, this highlights the key breakthrough for the business and as such is imperative for executives to understand when contemplating leading a large-scale Agile transformation.


Change Management Capacity

Transitioning to Agile is a very big effort for a large organization. There are technical and process changes required. Frequently, organizations focus on technical solutions. While required, they represent a smaller portion of the effort. Most of the real challenges are in organizational change management and shifting the culture.


Executives need to understand that the capacity of the organization to absorb change is the biggest constraint to rolling out these improvements. This means that organizational change management capacity is the most precious resource, and it should be actively managed and used sparingly.


The approach to rolling out an enterprise-level Agile transition should focus on the business breakthroughs the Agile principles are intended to achieve while taking into consideration the capacity of an organization to change. This is where executives can add knowledge and expertise.


The Limitations of Traditional Agile Implementation: An Executive Perspective

What follows is an example of a large (1,000-developer) organization that tries to enable small Agile teams in the organization. The example is not real, but it is a snapshot of what is being done in the industry today.


The first step is to select a few pilot teams with eight to ten members who will start being “Agile.” These teams will gain valuable experience and create some best practices. Once these teams have demonstrated the advantages of Agile, they create a plan for how it should be done for this organization.


The plan will need to scale across the organization, so in the end, there are going to be ~100 Agile teams. Agile coaches will be hired to get a few teams going, and then within a year, all the teams should be up and running.


Throughout the year, each coach can probably ramp up about five teams; thus, this rollout could require in the range of 20 coaches. With a cost of $150/hour, this adds up to over $2M/year.


When forming Agile teams, it is important to understand that you want teams that can own the whole feature end to end, so you need to pick members from the different component teams to form the prototype teams.


This works fine with just a few teams, but when we start rolling this out organization-wide, we are going to need a complete reorganization. Once everyone is in teams, creating effective workspaces for collaboration and moving everyone around will probably take another $1–2M.


The next step is making sure the teams have the right tool for documenting and tracking their Agile stories, which will probably run $1M or more. All of a sudden, we are up to over $5M for the transition.
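The numbers in this example can be checked with a short back-of-the-envelope calculation. This is only a sketch: the team size of ten is taken from the upper end of the "eight to ten" range, each quoted cost is used at its stated lower bound, and the implied coaching hours are our own inference rather than a figure from the example.

```python
import math

# Inputs quoted in the example; variable names are illustrative.
developers = 1000
team_size = 10                    # assumption: upper end of "eight to ten"
teams_per_coach = 5
coach_rate_usd_per_hour = 150
coaching_budget_usd = 2_000_000   # "over $2M/year", used as a lower bound
reorg_usd = (1_000_000, 2_000_000)  # "$1-2M" for workspaces and moves
tooling_usd = 1_000_000           # "$1M or more", used as a lower bound

teams = developers // team_size
coaches = math.ceil(teams / teams_per_coach)

# Hours per coach per year that the quoted coaching budget would pay for.
implied_hours_per_coach = coaching_budget_usd / (coaches * coach_rate_usd_per_hour)

total_low = coaching_budget_usd + reorg_usd[0] + tooling_usd
total_high = coaching_budget_usd + reorg_usd[1] + tooling_usd

print(f"Teams: {teams}, coaches: {coaches}")
print(f"Implied hours per coach per year: {implied_hours_per_coach:.0f}")
print(f"Transition cost: ${total_low / 1e6:.0f}M to ${total_high / 1e6:.0f}M")
```

At these lower bounds the total lands between $4M and $5M. The coaching figure implies roughly 670 billable hours per coach per year, so it appears to assume part-time engagements; full-time coaches at the same rate would cost considerably more.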


We had better make sure we talk to the executives who can commit that level of resources. This will require an ROI justification for the CFO.


So now we are committed to a big investment and have promised a big return to the top-level executives. At this point, we have an implementation that engages and includes the engineers and top-level executives. The management team in the middle, however, does not have a clear role except to provide their support and stay out of the way.


There are a number of problems with this whole approach. You are now $4–5M into a transition, and you still don’t have a plan for having always-releasable code for the enterprise or an enterprise backlog.


Teams may have a clear local definition of “done” in their dedicated environments and team backlogs, but at the enterprise level, you have not changed the process for releasing code.


Also, this approach has driven lots of organizational change that may be met with some resistance. We started with taking managers out of the process because they are a big part of the problem and don’t understand how to coach Agile teams. Can you see why and how they might undermine this transformation?


Next, we have a complete reorganization, which always tends to be a cause for concern and creates resistance to change. Add to that moves designed to get teams together in collaborative workspaces.


You can see how this approach is going to create a lot of turmoil and change while not fundamentally changing the frequency of providing code to the customers for feedback.


The other big challenge is getting the teams to buy in and support how they are going to approach the details of their day-to-day work. The first prototype teams are going to be successful because they had a lot of input in defining the team-level processes and have ownership of its success.


The problem most organizations have is that once they have determined the “right” way for their teams to do Agile, they feel the next step is teaching everyone else to do it that way.


This approach of telling the teams how to do their day-to-day work feels like it is contrary to the good management practices that we learned early in our careers.


In most cases, for any problem, there are at least three different approaches that will all achieve the same solution. On the one hand, if we were really smart managers, we could pick the best approach and tell the team how to do it.


If, on the other hand, we let the team pick how to meet the objective, they are more likely to make their idea more successful than they would have made our idea.


If we told them how to do it and it failed, it was clear that we didn’t have a very good idea. If they got to pick the “how,” they were much more likely to do whatever it took to make their idea successful. Just being managers or being part of the prototype team did not mean we were any more likely to pick the best idea.


Therefore, as leaders we feel it is important, wherever possible, to provide the framework with the objectives and let the team have as much design flexibility as possible in defining how the work will get done. It provides them with more interesting work, and they take more ownership of the results.


In addition, when the situation changes, those doing the work are likely to sense it and adapt more quickly than an executive would.



What we hope executives walk away with after reading this example is that most Agile implementations struggle to provide expected business results because they focus on rolling out Agile teams the “right” way instead of applying Agile principles at scale.


This approach creates a lot of change management challenges in an organization without fundamentally addressing the basic Agile business principles of an enterprise backlog and always-releasable code. We believe our approach offers answers to a lot of these struggles.


Having the executives and managers leading the transformation by setting the business objectives and running the continuous improvement process engages them in the transformation.


Focusing on improving the organization-wide planning and delivery processes provides clarity on the business breakthroughs the basic Agile principles were intended to provide.


Providing a framework for prioritizing and integrating work across the teams provides the basic processes for improving the effectiveness of large organizations while providing the teams with as much flexibility as possible in defining how they work on a day-to-day basis.


What follows is a detailed account of how each step of the process builds on the previous one. What you end up with is a concrete plan to apply Agile principles at scale to help executives lead the transformation in their own businesses.




The reason these two Agile authors say “don’t do Agile” is that we don’t think you can ever be successful or get all the possible business improvements if your objective is simply to do Agile and be done. Agile is such a broad and evolving methodology that it can’t ever be implemented completely.


Someone in your organization can at any time Google “What is Agile development” and then argue for pair programming or Extreme Programming or less planning, and you begin a never-ending journey to try all the latest ideas without any clear reason why.

Additionally, Agile is about continuous improvement, so by definition, you will never be done.


At HP we never set out to do Agile. Our focus was simply on improving productivity. The firmware organization had been the bottleneck for the LaserJet business for a couple of decades.


In the few years before this transformation started, HP tried to spend its way out of the problem by hiring developers around the world, to no avail. Since throwing money at the problem didn’t work, we needed to engineer a solution.


We set off on a multiyear journey to transform the way we did development with the business objective of freeing up capacity for innovation and ensuring that, after the transformation, the firmware would not be the bottleneck for shipping new products. This clear objective really helped guide our journey and prioritize the work along the way.


Based on this experience and others like it, we think the most important first step in any transformation is to develop a clear set of business objectives tuned to your specific organization to ensure you are well positioned to maximize the impact of the transformation on business results.


We see many companies that embark on a “do Agile” journey. They plan a big investment. They hire coaches to start training small Agile teams and plan a big organizational change.


They go to conferences to benchmark how well they are “doing DevOps or Agile.” They see and feel improvements, but the management teams struggle to show bottom-line business results to the CFO.


Not having clear business objectives is a key source of the problem. If they started out by focusing on the business and just using DevOps ideas or implementing some Agile methods that would provide the biggest improvements, they would find it much easier to show bottom-line financial results. This worked at HP.


When we started, firmware had been the bottleneck in the business for a couple of decades and we had no capacity for innovation.


At the end of a three-plus-year journey, adding a new product to our plans was not a major cost driver. We had dramatically reduced costs from $100M to $55M per year and increased our capacity for innovation by eight times.


To be clear, achieving these results was a huge team effort. For example, it required us to move to a common hardware platform so that a single trunk of code could be applied to the entire product lineup. Without collaboration with our partners throughout the business, we could not have achieved these results.


Having a set of high-level business objectives that the entire organization is focused on is the only way to get this type of cross-organizational cooperation and partnership.


These types of results will not happen when you “do Agile.” It takes a laser-like focus on business objectives, a process for identifying inefficiencies in the current process, and an ongoing, continuous improvement process.


Where to Start

Once you have a clear set of business objectives in place, the next step is determining where to start the transformation. You can’t do everything at once and this is going to be a multiyear effort, so it is important to start where you will get the biggest benefits.


From our perspective, there are two options that make sense for determining where to start. The first is the activity-based accounting and cycle-time approach that we used at HP.


You start with a clear understanding of how people are spending their time and the value the software is intended to provide to your business.


This approach addresses the biggest cost and cycle-time drivers that are not key to your business objectives. The challenge with this approach is that sometimes it can be very time-consuming to get a good understanding of all the cost and cycle-time drivers.


The other approach is to focus on the areas that are typically the biggest sources of inefficiencies in most enterprise software development efforts: maintaining always-releasable code and your planning processes. Then apply DevOps and Agile principles to these areas.


The beauty of software is that once you develop a new feature, the marginal cost of delivering that feature should be almost zero.


This is not the case in most organizations that are either using Waterfall development methodologies or have focused their Agile implementations on scaling Agile teams.


Neither does a good job of addressing the biggest organization-level inefficiencies like fixing the planning process or enabling easy, more frequent releases of new features.


This approach starts with applying DevOps and Agile principles at scale to address these enterprise-level inefficiencies first.


Activity-Based Accounting and Cycle-Time Approach

At HP our clear business objectives of freeing up capacity for innovation and no longer being the bottleneck for the business meant we needed to dramatically improve productivity.


Therefore, we started by understanding our cost and cycle-time drivers to identify waste in our development processes. We determined these by mapping our processes, thinking about our staffing, digging through our finances, and looking back at our projects under development. 


This is an important first step. Most people understand how they are spending money from a cost-accounting perspective. They have detailed processes for allocating costs out to products or projects. They don’t have an activity-based accounting view of what is driving the costs.


This step requires either a deep understanding of how people spend their time or a survey of the organization. Also keep in mind that while it does not have to be accurate to three significant digits, you do need to have a reasonably good feel for the cost and cycle-time drivers to prioritize your improvement ideas. 


Once we were clear about our business objectives, cycle-times, and cost drivers, we were ready to start our improvement process. We focused on waste in the system, which we defined as anything driving time or cost that was not key to our business objectives.
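As a rough illustration of what an activity-based view can look like, the sketch below totals survey responses by activity and ranks the ones that are not key to the business objectives. Everything here is hypothetical: the activity names, the engineer-day figures, and the `rank_waste` helper are invented for the example and are not data from HP.

```python
from collections import defaultdict

# Hypothetical survey rows:
# (activity, engineer-days per month, key to business objectives?)
survey = [
    ("new feature development", 120, True),
    ("manual regression testing", 300, False),
    ("porting code between branches", 250, False),
    ("build and integration support", 150, False),
    ("customer-requested innovation", 80, True),
    ("manual regression testing", 100, False),
]


def rank_waste(rows):
    """Total the effort spent on activities that are not key to the
    business objectives and return them, largest cost driver first."""
    totals = defaultdict(int)
    for activity, days, key_to_objectives in rows:
        if not key_to_objectives:
            totals[activity] += days
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for activity, days in rank_waste(survey):
    print(f"{days:>4} engineer-days/month  {activity}")
```

The ranking does not need to be precise. It only needs to be good enough to show which cost drivers deserve attention first.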


It was only at this point that we considered changing our development approach to align with some of the DevOps and Agile methods.


We also wanted to make sure we were starting where we could show quick progress. There are lots of different starting points and an organization can only change so many things at once, so we wanted to make sure we began where we would get the biggest return on our investment.


Applying DevOps and Agile Principles at Scale

While starting with activity-based accounting and cycle-time drivers is probably the most accurate approach, it will quickly point you to the processes for maintaining always-releasable code and your planning processes, just as it did for us at HP.


Writing code is similar in large and small organizations, but the processes for planning it, integrating it with everyone else, qualifying it, and getting it to the customer are dramatically different.


In almost every large, traditional organization, this is by far the biggest opportunity for improvement. Far too often these parts of the process take a lot of resources and time because traditional organizations tend




It is difficult to address the unique and business-specific results that come out of the activity-based accounting and cycle-time driver approach. Therefore, the rest of this blog will focus on applying Agile and DevOps principles at scale to provide ideas and recommendations to address the traditional challenges of transforming these parts of your development process.


Whether you do the detailed activity-based accounting and cycle-time view of your process or start with applying DevOps and Agile principles at scale, it is important that you begin with clear business objectives.


This transformation is going to take a lot of effort, and if you don’t have clear business objectives driving the journey, you can’t expect the transformation to provide the expected business results.


At HP we were able to get two to three times the improvement in business results because that was the focus of all our changes.


If we couldn’t see clearly how a change would help meet those goals, we did not waste any time with it, even though it might have been popular in the Agile community. It is this type of focus on the business objectives that enables the expected business results.


These objectives also help in the organizational change management process, where you can constantly remind the team of where you are, where you are going, and the expected benefits.




Executives can’t just manage this transformation with metrics. Since software productivity is so hard to measure, they must continuously engage with the organization and its people throughout the journey to get a qualitative feel for what is and isn’t working.


This continuous improvement process is more than just making sure all the small Agile teams in the organization are doing retrospectives to become more effective. It requires creating a culture of continuous improvement where enterprise-level objectives are set and goals are defined for the iteration.


Having iteration checkpoints with retrospectives and incorporating new ideas into the objectives for the next iteration are both essential parts of continuous improvement.


Executives need to engage in the process to ensure you experience the most important principle of Agile development: learning and adjusting along the way.


At HP we made sure we had a set of objectives that we used to drive the organization during each iteration. There were typically four to seven high-level objectives with measurable sub-bullets we felt were most important to achieve, things like shipping the next set of printers or taking the next step forward in our test automation framework.


It also included things like improving our test passing rates, since stability had dropped, or a focus on feature throughput, since development was getting behind.


This required our second highest priority to be improving the stability of the codebase on CE and getting the simulator automated testing in place.


The third priority was addressing all the product-specific requirements of the new products in this release window. We also had the first prototypes of the next generation products showing up that were based on Windows CE and an ARM processor.


This led to our fourth priority of ensuring we could port the common codebase to the ARM processor. Our fifth priority was getting the first XPe on MIPS processor products ready for system qualification on the final products.


While it includes some team-level stories, it focuses more importantly on the enterprise-level deliverables. Executives work with the organization to set these kinds of objectives so that everyone feels they are important and achievable.


They also make sure the objectives are based on what the teams are actually doing and achieving. This kind of collaboration helps build a culture of trust.


These are very high-level strategic objectives that include business objectives and process improvements. Can you see how it is much more than just an aggregate of the team-level stories?


Can you also see why everyone across the organization would have to make these their top priorities instead of team-level stories if we were really going to make progress?


These objectives provided an organization-wide set of priorities that drove work throughout the organization. If you were working on one of these top priorities, the organization understood people needed to help if you asked. Each team was expected to prioritize the team’s stories that were required to meet these objectives.


Once the team had met the strategic objectives, they were empowered to use the rest of their capacity to make the best use of their time and prioritize the team-level stories.
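The ordering described above, with stories that support an enterprise objective first and the team's own stories afterward, can be pictured with a minimal Python sketch. The `Story` class, the field names, and the sample stories are hypothetical illustrations, not the tooling we used at HP.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Story:
    title: str
    team_rank: int                        # the team's own priority (1 = highest)
    objective_rank: Optional[int] = None  # rank of the enterprise objective it supports, if any


def prioritize(backlog: List[Story]) -> List[Story]:
    """Stories tied to enterprise objectives come first (by objective rank),
    followed by the remaining stories in the team's own order."""
    strategic = sorted(
        (s for s in backlog if s.objective_rank is not None),
        key=lambda s: (s.objective_rank, s.team_rank),
    )
    team_level = sorted(
        (s for s in backlog if s.objective_rank is None),
        key=lambda s: s.team_rank,
    )
    return strategic + team_level


# Hypothetical backlog for one team.
backlog = [
    Story("Refactor logging module", team_rank=1),
    Story("Port driver layer to ARM", team_rank=3, objective_rank=4),
    Story("Automate simulator tests", team_rank=2, objective_rank=2),
]
ordered = [s.title for s in prioritize(backlog)]
```

Here the team's favorite story ends up last, because the other two stories support enterprise objectives. Once those are done, the team is free to order the rest however it sees fit.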


This approach allowed for defining aligned objectives for the enterprise while also empowering the teams. It was a nice balance of top-down and bottom-up implementation.


Tracking Progress and Understanding Challenges


These objectives guided all of our work. There was a website that aggregated all of our metrics and enabled tracking every objective down through the organization. The metrics would start at the overall organization, then cascade down to each section manager, and then down to each team.
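As a rough illustration of this kind of cascade, the Python sketch below rolls one metric up from team to section to organization. The report layout, the section and team names, and every number are made up for the example; the real site tracked many more metrics.

```python
from collections import defaultdict

# (section, team, objective, tests_passed, tests_run) -- all values are invented
team_reports = [
    ("Section A", "Team 1", "Codebase stability", 180, 200),
    ("Section A", "Team 2", "Codebase stability", 95, 100),
    ("Section B", "Team 3", "Codebase stability", 60, 100),
]


def roll_up(reports, level):
    """Aggregate pass rates at the 'team', 'section', or 'organization' level."""
    totals = defaultdict(lambda: [0, 0])
    for section, team, objective, passed, run in reports:
        prefix = {
            "team": (section, team),
            "section": (section,),
            "organization": (),
        }[level]
        totals[prefix + (objective,)][0] += passed
        totals[prefix + (objective,)][1] += run
    return {key: passed / run for key, (passed, run) in totals.items()}


org_view = roll_up(team_reports, "organization")
section_view = roll_up(team_reports, "section")
team_view = roll_up(team_reports, "team")
```

Reading the views from the top down shows where to walk the floor: the organization-level number looks acceptable, the section view points at Section B, and the team view points at Team 3.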


As executives, we would each start the morning at our desks spending 30–45 minutes reviewing these metrics, so we were able to know exactly how we were doing meeting these objectives and could identify where we were struggling.


We did not have many status meetings or Scrum of Scrum meetings because the data was always at everyone’s fingertips.


The leadership team would then spend most of their days walking the floor trying to understand where we were struggling and why. This is a new role for most executives and one we encourage executives to embrace if this process is going to be successful. We became investigative reporters trying to understand what was working and what needed to be improved.


People wanted to meet the objectives that we had set together and that they had felt were reasonable at the beginning of the iteration, so something must have happened if they were struggling.


These discussions around what got in the way were one of the most valuable parts of the process both in terms of learning what needed improving and in terms of changing the culture.


When high-level executives first started showing up at the desks of engineers that were struggling, it was intimidating for the engineers. After a while, though, the engineers realized we were just there to help, and the culture started to change.


Adjusting Based on Feedback and Aligning for the Next Iteration


During the last week of our monthly iteration, we would start evaluating what we could and could not get done in that iteration. We would also integrate what we learned about where the organization was struggling in setting the objectives for the next iteration.


Objectives usually started out as a discussion between the program manager and the director with a notepad standing in front of a poster of the previous iteration objectives. In about 20 minutes, we would create a rough draft of objectives for the next iteration.


We say “rough draft” because we had not yet gotten the perspectives and support of the broader organization. Over the next few days, there were reviews in lab staff meetings, the section managers’ meetings, and a project managers’ meeting where we scrubbed and finalized the objectives into a list everyone felt was achievable and critical to our success.


We know that aligning the organization on enterprise iteration goals takes longer than most small Agile teams take to set objectives for an iteration, but for a large organization it felt like a good balance of driving strategy at the top and getting the buy-in and support from the troops.



A culture of continuous improvement at the enterprise level is important for any large successful transformation. At HP, we never set off to implement a three-year detailed plan to transform our business.


We just started with our business objectives and worked down a path of continuous improvement one month at a time. In the end, we looked back and realized we had delivered dramatic business results.


It required setting enterprise-level objectives each iteration, tracking progress, and making the right adjustments for the next iteration. You are never going to know all the right answers in the beginning, so you are going to need an effective enterprise-level process for learning and adjusting along the way.


An enterprise-level continuous improvement process is a key tool executives use for leading transformations and learning from the organization during the journey.




Convincing large, traditional organizations to embrace Agile principles for planning is difficult because most executives are unfamiliar with the unique characteristics of software development or the advantages of Agile.


As we have discussed, they expect to successfully manage software development using a Waterfall planning and delivery process, just like they successfully manage other parts of their business.


This approach, though, does not utilize the unique advantages of software or address its inherent difficulties. Software is almost infinitely flexible.


It can be changed right up to the time the product is introduced. Sometimes it can be changed even later than that with things like software or firmware upgrades, websites, and software as a service (SaaS).


Software does have its disadvantages, too. Accurately scheduling long-term deliveries is difficult, and more than 50% of all software developed is either not used or does not meet its business intent.


If executives managing software do not take these differences into account in their planning processes, they are likely to make the classic mistake of creating detailed, inaccurate plans for developing unused features.


At the same time, they are eliminating flexibility, which is the biggest advantage of software, by locking in commitments to these long-range plans.


Unique Characteristics of Software


Agile software development was created because traditional planning and project management techniques were not working. Unlike traditional projects, software is almost infinitely flexible.


If you’ve ever lived through a full metal mask roll for a custom ASIC on a laser printer to resolve a design problem, you are fully aware of just how long, painful, and expensive this can be. A simple human error made during the ASIC design can result in a schedule slip of several months and cost overruns of $1M.


This same principle can be applied in most non-software projects. Once the two-by-fours are cut, they’re not going to get any longer. Once the concrete is poured and dried, it’s not going to change shape.


The point is that if you’re managing a construction project or a hardware design effort, you must do extensive requirements gathering and design work up front; otherwise, mistakes are very time-consuming and costly to fix. The project management discipline has evolved to match this lack of flexibility and minimize the likelihood of making these kinds of mistakes.


Software, on the other hand, if managed and developed correctly, is less expensive and easier to change. The problem is that most software is managed using the same techniques that are applied to these other inflexible disciplines.


In the software world, we call it the Waterfall development life cycle. By applying these front-end heavy management techniques to software development we limit ourselves in two ways.


First, we rob ourselves of software’s greatest benefit: flexibility. Instead of doing extensive planning and design work that locks us into a rigidly defined feature set, we should take advantage of the fact that software can be quickly and easily modified to incorporate new functionality that we didn’t even know about when we did the planning and scheduling several months prior.


The software process, if done correctly, can not only accept these late design changes but can actually embrace and encourage them to ensure that the customers’ most urgent needs are met, regardless of when they are discovered.


Second, we drive software costs up far higher than they should be. When software is developed using the Waterfall model, many organizations will spend 20–30% or more of their entire capacity to do the requirement gathering and scheduling steps.


During these steps, the software teams make many assumptions and estimates because every new software project you do is unique.


It is unlike any other piece of software you already have. Even if you are rearchitecting or refactoring existing functionality, the implementation of the code is unique and unlike what you had before.


Compare this to remodeling your kitchen, where Waterfall planning works well. Your contractor has installed a thousand sinks, dishwashers, and cabinets. He will use the same nails, screws, and fittings that he has used before.


His estimates of cost and schedule are based on years of experience doing the same activities over and over, finding similar issues.


There is some discovery during the actual implementation, but it is much more limited. With software, your teams have limited experience doing exactly what you’re asking them to do.


As a result, the actual implementation is filled with many discoveries and assumptions made up front that are later found to be incorrect when actual development begins.


Additionally, integration and qualification tend to uncover major issues late in the process.


If the customer sees the first implementation of a feature months after development began and you then discover that you misunderstood what was wanted, or if the market has shifted and other features are now more important, you have just wasted a significant amount of time and money.


It is far better to show the customer new functionality right as it becomes available, which is simply not possible using the Waterfall methodology.


Because software development is so different from how executives usually do business, it can take a while to overcome resistance to applying Agile principles to the planning process.


We can’t stress enough that if you don’t design your planning process correctly, it can end up using a lot of the capacity of the organization without providing much value.


Creating a planning process that embraces Agile principles starts with executives understanding and accepting the limits of long-term predictability for software schedules.


For a relatively small investment in planning, you can get a reasonable first pass at the plan. More investment can result in better accuracy up to a point until you start reaching diminishing returns on your investment.


Most traditional organizations, when faced with the reality of just how inaccurate their software planning processes are, tend to react by investing more and more in planning.


They do this because they are convinced that with enough effort they will make their plan accurate. It works for every other part of their business, so why not with software?


The reality is that with software you are reaching a point of diminishing returns, and at that point, the best way to learn more about the schedule is to start writing code. Once the code starts coming together, the team learns a lot more about their assumptions and can better define their schedule for shorter time frames.


There are two other approaches traditional organizations tend to use when addressing this dilemma of long-term accuracy in plans. The first, used widely in most successful Waterfall organizations, is to put ample buffer into the schedule.


This is typically defined as the time between a milestone like “functionality complete” and the software release.


If the analysis shows the development is going to take six months, then commit to the delivery in nine to twelve months. The other is to commit to the schedule and, when it is not on track, just add resources and work the Development teams night and day on a death march until the program ships.


In either case, you are not changing the curve in the graph. Your schedule is still inaccurate. You are just trying to fight that reality with different techniques. Getting the management chain to acknowledge and accept the realities of this curve is the first step toward embracing the Agile planning principles and techniques.


Executives also need to understand that more than 50% of software developed is never used or does not meet its intended business objectives. That is the metric for the industry as a whole.


Your company might be better, but even if your group is well above average, having 30% of all software you develop not meeting objectives is a big opportunity for improvement.


Additionally, for software capabilities that are defined for release 12–24 months in the future, conditions are likely to change, so even if the capabilities are properly delivered they will not meet the then-current business objectives.


Therefore, it makes more sense to capitalize on the flexibility of software and delay software feature decisions as late as possible so the commitment is made based on the most current understanding of the market.


Don’t lock in your most flexible asset to commitments for features that are not likely to meet your business objectives. Embrace the flexibility of software by creating a planning process that is designed to take advantage of the software’s unique flexibility so you can best respond to the ever-changing market.


Process Intent


Now that you have a better understanding of the unique characteristics of software, the key to designing a good software planning process in an enterprise is being very clear about the Process Intent.


The planning process is used primarily to support different business decisions. So you should be clear about the decisions required and invest the least amount possible to obtain enough information to support these decisions.


While the curve of the graph shows it is hard to get a high level of accuracy for long-range plans, that does not mean that businesses can live without firm long-range commitments for software.


The challenge is to determine what requires a long-term commitment and ensure these commitments don’t take up the majority of your capacity.


Ideally, if these long-term commitments require less than 50% of your development capacity, a small planning investment can provide the information required to make a commitment.


On the other hand, if these firm long-range commitments require 90–110% of the capacity, this creates two significant problems. First, when the inevitable discovery does occur during development, you don’t have any built-in capabilities to handle the adjustments.


Second, you don’t have any capacity left to respond to new discoveries or shifts in the market. In either case, if you do need to make a change, it is going to take an expensive, detailed replanning effort to determine what won’t get done so you can respond to the changes.


The planning process also needs a view of the priorities across the teams. It should look across the organization so the most important things are being developed first.


Otherwise, the individual teams are at risk of sub-optimizing the system by developing capabilities the teams feel are important but are less important to the overall line of business they are supporting.


There needs to be a process for moving people or whole teams across initiatives to ensure the right overall business priorities are being developed first.


The LaserJet Planning Example


An example from the HP LaserJet business can help clarify how this planning process can work. HP had to make significant investments in manufacturing capacity 12-plus months before the software or firmware development was complete, so being able to commit to shipping the product on time was critical.


Before we completed the rearchitecting and re-engineering of the development processes, porting the code to a new product and the testing required to release it were taking ~95% of the resources.


This resulted in large investments in planning to ensure enough accuracy for committing to a product launch. To make matters worse, the organization wanted firm long-term commitments to every feature that Marketing considered a “MUST.”


Additionally, Marketing had learned that if a feature was not considered a “MUST” then it was never going to happen, so almost everything became a “MUST” feature 12-plus months before the product was introduced, adding on another 50–55% of demand on capacity.


This led to an organization that was trying to make long-term commitments with 150% of its capacity, requiring a large investment in planning to clarify what could and couldn’t be done.
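The arithmetic behind that overcommitment is simple, and a few lines of Python make it explicit. The function name and the dictionary labels are ours for illustration; the percentages are the ones quoted above, using the upper end of the 50–55% range and treating "less than 50%" as an upper bound.

```python
def headroom_pct(commitments_pct):
    """Capacity left over, in percentage points, after firm commitments.

    A negative result means the organization is overcommitted.
    """
    return 100 - sum(commitments_pct.values())


# Before the rearchitecture: porting/release plus the "MUST" features.
before = {"porting and release": 95, "MUST features": 55}

# After the rearchitecture: long-range commitments held to at most 50%.
after = {"long-range commitments": 50}

headroom_before = headroom_pct(before)  # -50: committed to 150% of capacity
headroom_after = headroom_pct(after)    # 50: half the capacity still uncommitted
```

A negative headroom is the signal that any discovery during development will force a replanning cycle, because there is nothing left to absorb it.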


When the planning was done and it was clear that not all the “MUST” features could be completed, we needed another planning cycle to prove the “MUST” features really couldn’t be delivered—because after all, they were “MUST” features.


This vicious cycle was taking precious capacity away from a team that could have been delivering new capabilities and products.


Breaking this logjam required a significant investment in the code architecture and development processes. When just porting the code to a new product and the release process were taking ~95% of the capacity, it was not realistic to create an effective long-range planning process.


Therefore, we rearchitected the code so that porting to a new product did not require a major investment and we automated our testing so it was easier to release the code on new products.


These changes meant that the long-range commitments required from the business were taking less than 50% of our capacity.


Additionally, we separated out the long-term manufacturing commitment requirements from the feature commitment decisions that could wait until later. Only then was it possible to develop a nice, lightweight, long-range planning process.


This new planning process consisted of three approaches for different time horizons to support different business objectives and decisions.


The goal was to avoid locking in the capacity as long as possible but still support the decisions required to run the business. We worked to keep the long-range commitments to less than 50% of the capacity.


For shorter time frames when there was less uncertainty, we would commit another 30%. Then for the last part of the capacity, we didn’t really plan but instead focused on delivering to a prioritized backlog. This approach built in flexibility with the reserved capacity to respond to the inevitable changes in the market and discoveries during development.
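To make the split concrete, here is a small Python sketch that divides a capacity figure across the three horizons. The 50% and 30% figures come from the text, with 50% treated as a ceiling; the remaining 20%, the horizon labels, and the 400 engineer-week capacity are assumptions for the example.

```python
# (horizon label, share of capacity in percent)
HORIZONS = [
    ("long-range commitments", 50),    # "less than 50%": used here as a ceiling
    ("shorter-range commitments", 30),
    ("prioritized backlog", 20),       # the remainder, deliberately left unplanned
]


def allocate(capacity_engineer_weeks):
    """Split total capacity across the planning horizons (whole weeks)."""
    return {
        label: capacity_engineer_weeks * pct // 100
        for label, pct in HORIZONS
    }


plan = allocate(400)  # hypothetical capacity for one release window
```

The last bucket is the important one: it is capacity with no features attached, so a market shift or a development surprise can be absorbed without a replanning exercise.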


The first and longest range phase focused on the commitment to ship the product on a specific date. In this phase, the unique capabilities of the new printer were characterized and high-level estimations were put in place to ensure there wasn’t anything that would preclude committing to an introduction date.


Because this was taking less than 50% of the capacity, it did not require a lot of detailed rigor in the planning for introductions 12-plus months in the future.


The next phase focused on major marketing initiatives that the organization leaders wanted to target over the next six months.


This involved the system engineers working with the Marketing leadership team to clarify at a high level what capabilities they would like to message across the product line in the next marketing window.


The system engineers would then roughly estimate the capacity left after commitments to shipping new products and the initiative’s demands on the different teams in the organization. 


The other change in the planning processes at HP was to minimize the requirements inventory. When we were not making long-term commitments to all the features, we did not have to invest in breaking down the requirements into details until we were much closer to starting development and knew the requirements were going to be used and much less likely to change.


For the long-range commitments, the only requirements details created were the unique characteristics of the new printer. Then, in the initial phase, the new features were only defined in enough detail to support the high-level estimates.


Once these initiatives were getting closer to development, the system engineers would break the initiatives into more detailed user stories so that everyone better understood what was expected.


Then right before development started, these user stories were reviewed in feature kickoff meetings with all the engineers involved in the development along with the Marketing person and system engineers.


At this time the engineers had a chance to ask any clarifying questions or recommend potentially better approaches to the design. Then after everyone was aligned, the Development engineers would break down the high-level user stories into more detailed developer requirements, including their schedule estimates.


This just-in-time approach to managing our requirements provided a couple of key advantages. First, the investment in breaking the requirements down into more detail was delayed until we knew they would get prioritized for development.


Second, since there were not large piles of requirements inventory in the system, there were not a lot of requirements that needed to be reworked when the understanding of the market changed.


The net result of all these changes is that planning went from taking ~20% of the organization’s capacity down to less than 5%, freeing up an extra 15% of the capacity to focus on delivering business value. At the same time, the management team had the information required to make business decisions for the different planning horizons.


By delaying locking in all the capacity as long as possible, the organization was able to use the inherent flexibility of software to increase the likelihood that the new features would be used and would deliver the intended business results.




Creating an enterprise planning process that leverages the principles of Agile starts with embracing the characteristics that are unique to software. It also requires planning for different time horizons and designing a planning process that takes advantage of the flexibility software provides.


These changes in the planning processes can also help to eliminate waste in the requirements process by reducing the amount of inventory that isn’t ever prioritized or requires rework as the understanding of the market changes.


The HP LaserJet example shows how embracing the principles of Agile can provide significant business advantages. This example is not a prescription for how to do it, but it does highlight some important concepts every organization should consider.


First, the planning process should be broken down into different planning horizons to support the business decisions required for different time frames.


Second, if the items that must have long-term commitments require more than 50% of your capacity, you should look for architectural and process improvements so that those commitments are not a major capacity driver.


Third, since the biggest inherent advantage of software is its flexibility, you should not eliminate that benefit by allowing your planning process to lock in all of your capacity to long-range commitments—especially since these long-range features are the ones most likely to end up not meeting the business objectives.


Lastly, you should consider moving to the just-in-time creation of requirements detail to minimize the risk of rework and investment in requirements that will never get prioritized for development.


The specifics of how the planning process is designed need to be tuned to meet the needs of the business.


It is the executives’ role in the organization to appreciate how software development is different from the rest of their business and be willing to drive these new approaches across the organization.


If they use the traditional approach of locking all their capacity to long-range commitments, then when there is discovery during development or in the market, it will require expensive replanning cycles.


Or even worse, the organization will resist changing its plans and deliver features that don’t provide any value.


Instead, executives need to get the organization to appreciate that this discovery will occur and delay locking in the capacity as long as possible so they can avoid these expensive replanning cycles and take advantage of the flexibility software provides.


Executives need to understand that the Agile enterprise planning process can provide significant competitive advantages for those companies willing to make the change and learn how to effectively manage software.




Before applying DevOps principles at scale, it is important for executives to ensure they are working from a solid foundation and that they understand the fundamentals currently in place in their organization, or else they will needlessly struggle to transform their development processes.


The first fundamental is clean architectures that enable smaller teams to work independently in an enterprise and make it possible to find defects with fast-running unit or subsystem tests.


The second is the build process and the ability to manage different artifacts as independent components. The third is test automation.


Applying DevOps and Agile principles at scale requires lots of automated testing. Expect to create, architect, and maintain at least as much test code and automation scripts as you create production code.


Soundly architected test code leads to soundly architected production code that is easy to understand and maintain. If done well, this is a key enabler. If done wrong, it can turn into a maintenance nightmare that will cause a lot of problems, and the tests will not quickly localize coding issues.



Executives need to understand the characteristics of their current architecture before starting to apply DevOps principles at scale. Having software based on a clean, well-defined architecture provides a lot of advantages.


Almost all of the organizations presenting leading-edge delivery capabilities at conferences have architectures that enable them to quickly develop, test, and deploy components of large systems independently.


These smaller components with clean interfaces enable them to run automated unit or subsystem tests against any changes and to independently deploy changes for different components. In situations like this, applying DevOps principles simply involves enabling better collaboration at the team level.


On the other hand, large, traditional organizations frequently have tightly coupled legacy applications that can’t be developed and deployed independently.


Ideally, traditional organizations would clean up the architecture first so that they could have the same benefits of working with smaller, faster-moving independent teams. The reality is that most organizations can’t hold off process improvements waiting for these architectural changes.


Therefore, executives are going to have to find a pragmatic balance between improving the development processes in a large, complex system and fixing the architecture so the systems are less complex over time.


We encourage you to clean up the architecture when and where you can, and we also appreciate that this is not very realistic in the short term for most traditional organizations. As a result, we will focus on how to apply DevOps principles at scale assuming you still have a tightly coupled legacy architecture.


In these situations where you are coordinating the work across hundreds to thousands of people, the collaboration across Development and Operations requires much more structured approaches like Continuous Delivery.


Embedded software and firmware have the unique architectural challenge of leveraging common stable code across the range of products they need to support.


If the product differences are allowed to propagate throughout the code base, the Development team will be overwhelmed porting the code from one product to another.


In these cases, it is going to be important to minimize the product-to-product hardware differences, to isolate the code differences in smaller components that support the product variation, or both. The architectural challenge is to isolate the product variation so that as much of the code as possible can be leveraged unchanged across the product line.
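One way to picture this isolation is a small interface that holds the product differences while the shared code stays untouched. The sketch below is only an illustration: the `PaperFeeder` interface, the two feeder classes, and their speeds are hypothetical and do not come from any real product line.

```python
from abc import ABC, abstractmethod


class PaperFeeder(ABC):
    """Hypothetical interface: the only place product differences may appear."""

    @abstractmethod
    def sheets_per_minute(self) -> int:
        ...


class DesktopFeeder(PaperFeeder):
    """Product-specific component for an assumed low-end product."""

    def sheets_per_minute(self) -> int:
        return 20


class EnterpriseFeeder(PaperFeeder):
    """Product-specific component for an assumed high-end product."""

    def sheets_per_minute(self) -> int:
        return 60


def estimate_job_seconds(feeder: PaperFeeder, pages: int) -> float:
    """Common code: used unchanged by every product in the line."""
    return pages * 60 / feeder.sheets_per_minute()
```

Supporting a new product in this sketch means adding one small feeder class; `estimate_job_seconds` and everything built on top of it is not ported or modified.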


The Build Process

The next step in creating a solid foundation is to validate that the build process will enable you to manage different parts of your architecture independently. Some organizations do not have this fundamental in place. There is a simple test that can evaluate how ready you are with your build process.


This idea may seem simple, but it is a very basic building block because keeping the system stable requires building up and testing large software systems in a structured manner with different stages of testing and artifact promotion.


If we are thinking of these artifacts as independent entities but in reality can’t manage them that way, then we are not set up for success.


If your software does not pass this simple test, you need to start by fixing your build process and modifying your architecture to ensure that all components can be built and deployed into testing environments independently of the others.


Test Automation

A large amount of test automation is necessary when changing the development processes for large, traditional organizations. Without a solid foundation here, your feedback loops are going to be broken and there won’t be an effective method for determining when to promote code forward in your pipeline.


Writing good test automation is even more difficult than writing good code because it requires strong coding skills plus a devious mind to think about how to break the code. It is frequently done poorly because organizations don’t give it the time and attention that it requires.


Because we know it is important, we always try to focus a lot of attention on test automation. Still, in almost every instance we look back on, we wish we had invested more because it is so critical.


You just can’t keep large software systems stable without a very large number of automated tests running on a daily basis. This testing should start with unit- and services-level testing, which is fairly straightforward. Ideally, you would be able to use these tests to find all of your defects.


This works well for software with clean architectures but tends not to work as well in more tightly coupled systems with business logic in the user interface (UI), where you need to depend more on system-level UI testing.


In this case, dealing with thousands of automated tests can turn into a maintenance nightmare if they are not architected correctly. Additionally, if the tests are not designed correctly, then you can end up spending lots of time triaging failures to localize the offending code.


Therefore, it is very important that you start with a test automation approach that will make it efficient to deal with thousands of tests on a daily basis. A good plan includes the right people, the right architecture, and executives who can maintain the right focus.


Test Environment

Running a large number of automated tests on an ongoing basis is going to require creating environments where it is economically feasible to run all these tests. These test environments also need to be as much like production as possible so you are quickly finding any issues that would impact delivery to the customer.


For websites, software as a service, or packaged software this is fairly straightforward with racks of servers. For embedded software or firmware this is a different story.


There the biggest challenge is running a large number of automated tests cost-effectively in an environment that is as close as possible to the real operational environment.


Since the code is being developed in unison with the product, it is typically cost prohibitive, if not impossible, to create a large production-like test farm.


Therefore, the challenge is to create simulators and emulators that can be used for real-time feedback and a deployment pipeline that builds up to testing on the product.


A simulator is code that runs on a blade server or virtual machine and mimics how the product interacts with the code being developed.


The advantage here is that you can set up a server farm that can quickly run thousands of hours of testing a day in a cost-effective way. The disadvantage is that it is not fully representative of your product, so you are likely to continue finding different defects as you move to the next stages of your testing.


The objective here is to speed up and increase the test coverage and provide feedback to enable developers to find and fix as many defects as possible as early and thus as cheaply as possible. The simulator testing is fairly effective for finding defects in the more software-like parts of your embedded solution.
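As a rough illustration of simulator-based testing, the sketch below runs a piece of control logic against a scripted stand-in for the hardware. Everything in it is assumed for the example: the `SimulatedThermalSensor` class, the `read_celsius` method, and the fan-control thresholds are hypothetical, not taken from any real product.

```python
class SimulatedThermalSensor:
    """Hypothetical stand-in for product hardware: replays scripted readings."""

    def __init__(self, readings_celsius):
        self._readings = iter(readings_celsius)

    def read_celsius(self) -> float:
        return next(self._readings)


def fan_duty_percent(temp_celsius: float) -> int:
    """Code under development: map a temperature to a fan duty cycle."""
    if temp_celsius < 40:
        return 0
    if temp_celsius >= 80:
        return 100
    # Linear ramp between the two assumed thresholds.
    return int((temp_celsius - 40) * 100 / 40)


def run_control_loop(sensor, steps: int) -> list:
    """Drive the control logic against any object that offers read_celsius()."""
    return [fan_duty_percent(sensor.read_celsius()) for _ in range(steps)]


# A test that needs no hardware and can run on any server.
assert run_control_loop(SimulatedThermalSensor([25, 60, 95]), 3) == [0, 50, 100]
```

Because `run_control_loop` depends only on the `read_celsius` method, the same logic could later be pointed at an emulator or the real product; the simulator simply lets thousands of such checks run cheaply first.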


The challenge is that lots of products include embedded firmware running on custom ASICs.

In this case, it is almost impossible to find defects in the interactions between the hardware and firmware unless you are running the code on the custom ASICs. This is the role of emulators.


An emulator is like the simulator in that it uses code that mimics the product, but in this case it also includes the electronic boards from the product with the custom ASICs. It does not include the entire product.


This is incrementally better than simulator testing because it is more like production. The challenge here is that these are more expensive to build and maintain than the simulator-based generic blade servers.


Additionally, early in the development cycle the new custom ASICs are very expensive and in limited supply. Therefore, the test environments for the deployment pipeline are going to require a balance of simulators and emulators.


Finally, since these emulator environments are still not fully production-like, there will still need to be testing on the product. Creating processes for enabling small-batch releases and quick feedback for developers is going to require large amounts of automated testing running on the code every day.


The challenge for embedded software and firmware is that this is not really practical on the product.


Therefore, robust simulators and emulators that can be trusted to quickly find the majority of the defects are a must.


Executives will need to prioritize and drive this investment in emulators and simulators because too often embedded organizations try to transform development processes with unreliable testing environments, which will not work.


Designing Automated Tests for Maintainability

A big problem in most organizations is that they delegate the test automation task to the quality assurance (QA) organization and ask their manual testers to learn to automate what they have been doing manually for years.


Some organizations buy tools to automatically record what the manual testers are doing and just play it back as the automated testing, which is even worse.


The problem with record and playback is that as soon as something on the UI or display changes, tests begin to fail and you have to determine if it is a code defect or a test defect. All you really know is that something changed.


Since new behavior frequently causes the change, organizations get in the habit of looking to update the test instead of assuming it found a defect.


This is the worst-case scenario for automated testing: developers start ignoring the results of the tests because they assume it is a test issue instead of a code issue.


The other approach, having manual testers write automated tests, is a little better, but it tends to produce brittle tests: very long scripts that just replicate the manual testing process.


This works fine for a reasonable number of tests when the software is not changing much. The problem, as we will demonstrate in the example below, comes when the software starts to change and you have thousands of tests. Then the upkeep of these tests turns into a maintenance nightmare.


Whenever the code changes, it requires going through all the thousands of tests that reference that part of the software to make sure they get the right updates.


In this case, organizations start to ignore the test results because lots of tests are already failing due to the QA team not being able to keep all their tests running and up to date.


The best approach for automated testing is to pair a really good development architect who knows the fundamentals of object-oriented programming with a QA engineer who knows how the code is manually tested in the current environment. They can then work together to write a framework for automated testing.


A good example can be found in the book Cucumber & Cheese: A Tester’s Workshop by Jeff Morgan. He shows how to create an object-oriented approach to a test framework using a puppy adoption website as an example.


Instead of writing long monolithic tests for different test cases that navigate the website in similar ways, he takes an object-oriented approach.


Each page on the website is represented by a page object model. Each time the test lands on that page, a data magic gem automatically fills in random values for any data required for that page. The test then simply defines how to navigate through the website and what to validate.


With this approach, when a page on the website changes, that one page object model is the only thing that needs to change to update all the tests that reference that page. This results in tests that are much less brittle and easier to maintain.
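The page object idea can be sketched in a few lines. This is not Jeff Morgan's code, which is written in Ruby; it is a minimal Python illustration in which the `FakeBrowser` class, the `CheckoutPage` class, and the locator strings are all hypothetical. A real suite would pass in an actual browser driver instead of the fake.

```python
class FakeBrowser:
    """Hypothetical stand-in for a real browser driver; it only records actions."""

    def __init__(self):
        self.actions = []

    def fill(self, locator: str, value: str) -> None:
        self.actions.append(("fill", locator, value))

    def click(self, locator: str) -> None:
        self.actions.append(("click", locator))


class CheckoutPage:
    """Page object: the only place that knows this page's locators."""

    NAME_FIELD = "#order_name"
    ADDRESS_FIELD = "#order_address"
    PLACE_ORDER_BUTTON = "input[value='Place Order']"

    def __init__(self, browser):
        self.browser = browser

    def place_order(self, name: str, address: str) -> None:
        self.browser.fill(self.NAME_FIELD, name)
        self.browser.fill(self.ADDRESS_FIELD, address)
        self.browser.click(self.PLACE_ORDER_BUTTON)


def test_adopter_can_place_order():
    """The test states intent; it never mentions a locator."""
    browser = FakeBrowser()
    CheckoutPage(browser).place_order("Pat Example", "1 Sample Street")
    assert browser.actions[-1] == ("click", CheckoutPage.PLACE_ORDER_BUTTON)
```

If the address field were renamed on the site, only `ADDRESS_FIELD` would change; every test that calls `place_order` would keep working as written.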


Using Page Object Models and other similar techniques will go a long way towards reducing your test development and maintenance costs.


However, if your approach is to move fully to automated testing, the use of Page Object Models will be insufficient to drive your maintenance costs to an acceptable level. There are some additional techniques and processes that you will need to put in place.


Creating a Test Results Database

As discussed above, the first step in automated testing is to make sure the framework is stable and maintainable. The next step is ensuring there is a good tool for managing and reporting the test results.


When there are tens to hundreds of automated tests running every day, it is fairly easy to debug and triage every failing test individually. When the number of tests grows into the thousands, this approach is no longer practical or effective.


Managing this number of tests requires using a statistical approach to driving up test passing rates and stability. It also requires grouping the tests that are associated with different parts of the code base and designing the tests so they quickly localize the code issues.


These test results then need to be in a tool or database that allows you to look at pass rates across different builds and test stages.
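A minimal sketch of such a results store, using Python's built-in `sqlite3` module, might look like the following. The table layout, the column names, and the helper functions are assumptions made for illustration; a real organization would likely use whatever reporting tool its pipeline already supports.

```python
import sqlite3


def create_results_db() -> sqlite3.Connection:
    """Create an in-memory database with one row per test execution."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results ("
        " build TEXT, stage TEXT, component TEXT, test_name TEXT, passed INTEGER)"
    )
    return conn


def record(conn, build, stage, component, test_name, passed) -> None:
    """Store the outcome of a single test run."""
    conn.execute(
        "INSERT INTO results VALUES (?, ?, ?, ?, ?)",
        (build, stage, component, test_name, int(passed)),
    )


def pass_rates(conn) -> dict:
    """Return {(build, stage): pass rate in percent}."""
    rows = conn.execute(
        "SELECT build, stage, 100.0 * SUM(passed) / COUNT(*)"
        " FROM results GROUP BY build, stage ORDER BY build, stage"
    )
    return {(build, stage): rate for build, stage, rate in rows}
```

With results recorded per build and stage, a query like `pass_rates` turns thousands of individual outcomes into a handful of numbers that can be compared from one build to the next.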


Designing Automated Tests to Quickly Localize Defects

We mentioned that one of the most common mistakes that organizations make when moving from manual to automated testing is to take their existing QA teams and simply have them start automating their existing manual scripts.


This approach will cause some serious issues and will prevent you from having a cost-effective testing process. Since we’ve fallen into this pit a few times ourselves, we want to make sure that you are aware of the issues with this approach so that you can avoid them.


An example that will help illustrate the issues with automating manual test scripts is a large e-commerce website with newly added functionality that allows it to accept a new brand of credit card for payment.


We want some tests that demonstrate that you can actually check out and purchase a product using this new type of card.


To test this, our manual tester will go to the homepage of our website. From there they might search for a product using the keyword search, and after finding and selecting a product, they will add it to the cart. From there they might go to the cart and see if the product they selected is the one that actually ended up in the cart.


Next, they might sign in as a registered user so that they can determine if the loyalty points functionality will work using this new brand of credit card. To sign in as a registered user, they will have to enter their username and password.


Now they are signed in and can add this new brand of credit card to their account and see if the website now accepts it as a valid form of payment. Assuming so, they’re now ready to check out.


They click on the checkout button and land on the checkout page. Since they’re logged in as a registered user, their name, address, and phone number should be prepopulated in the correct fields on the page.


After validating this information, the manual tester is now ready to do the part of the test that we have been waiting for: Can I select this new brand of credit card as my form of payment and actually purchase the product I selected?


The problem with automating this script is that the new automated test can fail for any number of reasons that have nothing to do with the new credit card or the checkout functionality.


For example, if keyword search is not working today, this checkout test will fail because it can’t find a product to put in the cart.


If the sign in functionality is broken, the test will fail because it can’t get past this stage of the test.


Because this new test was written to validate our new functionality and it failed, the responsibility to triage this test failure lands upon the team developing the new credit card capability.


In reality, there will be many of these tests: some with valid and invalid credit card numbers, some that attempt to purchase a gift card, and some that attempt to purchase a product but ship it to an alternate address.


The list goes on. The result will be that on any given day our team could walk in to discover that most of their test suite that was working yesterday is now failing.


As they begin to triage each failing test, they quickly learn that most are not failing because of the new functionality that was supposed to be tested, but because something else in the system is broken. Therefore, they can’t trust the test failure to quickly localize the cause of the failure.


It was designed and labeled as a test to check out with a new credit card, and it is not until someone manually debugs the failure that it can be determined it is due to a code failure somewhere else in the system.


Tests that are designed this way require a lot of manual intervention to localize the offending code. Once you start having thousands of tests you will realize this is just not feasible.


The tests need to be written so that they are easy to update when the code changes, but they also need to be written so that they quickly localize the offending code to the right team.


One of the best and most important tools at your disposal to enable efficient triage is component-based testing. Think back to the architectural drawing.


What you want is a set of automated tests that fully exercise each component without relying on the functionality of the other components.


In the traditional Waterfall method, we called this subsystem testing, and we expected each component or subsystem team to mock out the system around them and then test their component.


Once all components were tested in this way, we moved on to the integration phase and put them all together and began system testing.


However, it is now possible to develop an automated testing framework that allows you to isolate the various components of the system, even when the system is fully integrated and deployed.


Returning to our example of checking out with the new credit card, what we really want to do is start the test on the URL associated with the checkout page, with the user already signed in and products already loaded into the cart.


We don’t want to exercise the website to get all this data in place. We want the automation framework to do this work for us.


With manual testing, there is no way for this to happen. The tester has no choice but to walk through the entire workflow, end to end.


However, a test automation framework can be built out that will do all of these things for you and allow your automated tests to exercise just the functionality you want to test.
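To make the idea concrete, here is a minimal sketch in which a framework helper puts the signed-in user and the loaded cart in place directly, so the test touches only checkout. The `FakeStore` class, the `seed_checkout_state` helper, and the card brand names are hypothetical; in a real system the helper would write to a database or call a service rather than set dictionary entries.

```python
class FakeStore:
    """Hypothetical system under test, reduced to in-memory state."""

    def __init__(self):
        self.sessions = {}  # session id -> username
        self.carts = {}     # username -> list of product ids
        self.accepted_cards = {"visa", "newcard"}

    def checkout(self, session_id: str, card_brand: str) -> str:
        user = self.sessions.get(session_id)
        if user is None:
            return "not signed in"
        if not self.carts.get(user):
            return "cart empty"
        if card_brand not in self.accepted_cards:
            return "card rejected"
        return "order placed"


def seed_checkout_state(store: FakeStore, user: str, products: list) -> str:
    """Framework helper: set up user and cart without using search or sign-in."""
    session_id = f"session-{user}"
    store.sessions[session_id] = user
    store.carts[user] = list(products)
    return session_id


def test_checkout_accepts_new_card():
    """Exercises only the checkout component."""
    store = FakeStore()
    session_id = seed_checkout_state(store, "pat", ["sku-123"])
    assert store.checkout(session_id, "newcard") == "order placed"
```

In this sketch a broken keyword search or sign-in page cannot make `test_checkout_accepts_new_card` fail, because the test never passes through them.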


If our checkout tests and automation framework support the ability to test just the checkout functionality, things begin to change in a couple of very positive ways.


First, if these componentized tests begin to fail, you have a high level of assurance that the functionality that is broken is actually the functionality that the test is designed to exercise. When the team begins to triage the test failures, the likelihood that it is not their failure has decreased dramatically.


The team will now quickly learn to pay attention to the test results because a pass or fail is much more likely to be an accurate representation of the state of their functionality.


The second big change is in your ability to statistically analyze your test metrics. When your automated test suite consists of a large number of tests that are just an automated version of your manual tests, it is very difficult to look at a large number of test results each day and determine which functionality is working and which functionality is broken.


Even if you organize these tests into components, you will not be able to determine the status of each component. Remember from our example: when your checkout tests are all failing, you don’t know if the problem is in the checkout functionality or not. You have no idea what is broken.


If instead, you take a component-based approach of designing and grouping tests, then you can quickly localize a drop in pass rates to the offending code without a lot of manual triage intervention. This makes it feasible to efficiently respond to the thousands of automated tests that will be running every day.
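Once tests are grouped by component, spotting the offending area can be a simple comparison of pass rates between two builds. The sketch below assumes each result is a `(component, passed)` pair and uses an arbitrary 10-point threshold; both the data shape and the threshold are illustrative choices, not recommendations from the text.

```python
from collections import defaultdict


def pass_rate_by_component(results) -> dict:
    """results: iterable of (component, passed) pairs -> {component: fraction passed}."""
    totals = defaultdict(lambda: [0, 0])  # component -> [passed, run]
    for component, passed in results:
        totals[component][0] += int(passed)
        totals[component][1] += 1
    return {c: p / n for c, (p, n) in totals.items()}


def components_with_drop(previous, current, threshold=0.10) -> list:
    """Return components whose pass rate fell by more than `threshold` between builds."""
    before = pass_rate_by_component(previous)
    after = pass_rate_by_component(current)
    return sorted(
        c for c in after
        if c in before and before[c] - after[c] > threshold
    )


# Yesterday everything passed; today 4 of 10 search tests fail.
yesterday = [("checkout", True)] * 10 + [("search", True)] * 10
today = [("checkout", True)] * 10 + [("search", True)] * 6 + [("search", False)] * 4
assert components_with_drop(yesterday, today) == ["search"]
```

This kind of comparison is only meaningful when each test exercises one component; with end-to-end scripts, the same search defect would also drag down the checkout pass rate.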


Using Test Automation as Documentation

There is one more fundamental issue that involves automated testing as well as the actual code development process: the requirements-gathering and documentation process.


In most large organizations, requirements are gathered from Marketing or some other part of the business. The software teams don’t just dream up new features and implement them.


The problem is that these requirements get gathered and distributed to the Development team, who go about their work of developing code and testing for this new functionality. Invariably, the various teams interpret these requirements differently.


The Development teams put functionality in place that doesn’t match what the test does, and when the teams go back to the business for clarification, they often discover that neither the developers nor the testers got it right.


This leads to the Waterfall practice of more and more requirement reviews and estimating how long it will take to get something done.


The problem is amplified in any organization that is affected by things like the Sarbanes-Oxley Act, which necessitates that requirements be associated with a piece of code and with the tests used to validate that the code matches the requirement and is correct.


Any software that deals with financial transactions or that drives things such as medical devices comes under close scrutiny. Associating written requirements (specifications) with actual software and tests can be a time-consuming, expensive, and difficult process.


However, there are now tools and processes in place to significantly reduce the impact in this area. Most of the tools focus on creating specifications in an executable form.


The idea is to have the business write the specification in an executable form so that the automated test and the specification are the same thing. On the web side of the house, Cucumber is an excellent example.


The Cucumber framework uses a Feature File, which is written in an easy-to-read, English-like format.


The Feature File is the specification and is designed in such a way that it can be easily created by non-programmers. The Feature File is also executable code that is consumed by the test automation framework.


Instead of writing prose specifications that must later be translated into tests, the specifications can now be written in such a way that they actually become the automated tests.


This eliminates the need to manually associate specifications, code, and tests to meet audit requirements, and it gets everyone working off a common definition of the automated test from the very beginning.
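The mechanism behind an executable specification can be shown with a toy runner. To be clear, this is not Cucumber and does not use its API: the `step` decorator, the `run_feature` function, and the example wording are all invented for illustration, and the sketch only shows how plain Given/When/Then lines can be bound to code.

```python
import re

# The specification, readable by non-programmers (hypothetical wording).
FEATURE = """\
Given a cart containing 1 item
When the shopper pays with a "NewCard" card
Then the order is accepted
"""

STEPS = []  # (compiled pattern, function) pairs


def step(pattern):
    """Register a function as the implementation of a specification line."""
    def register(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return register


@step(r"a cart containing (\d+) item")
def given_cart(ctx, count):
    ctx["items"] = int(count)


@step(r'the shopper pays with a "(\w+)" card')
def when_pay(ctx, brand):
    # Stand-in for calling the real checkout code.
    ctx["accepted"] = ctx["items"] > 0 and brand in {"Visa", "NewCard"}


@step(r"the order is accepted")
def then_accepted(ctx):
    assert ctx["accepted"], "order was not accepted"


def run_feature(text):
    """Execute each line of the spec; raise if a line has no matching step."""
    ctx = {}
    for line in text.strip().splitlines():
        sentence = line.split(" ", 1)[1]  # drop the Given/When/Then keyword
        for pattern, func in STEPS:
            match = pattern.fullmatch(sentence)
            if match:
                func(ctx, *match.groups())
                break
        else:
            raise LookupError(f"no step matches: {line}")
    return ctx
```

Calling `run_feature(FEATURE)` runs the specification as a test. If the business changes a line of the text and no step matches it, the run fails immediately, which is what keeps the written specification and the automated test from drifting apart.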



Software Development Practices

Transforming the software development practices of large, traditional organizations is a very big task, which has the potential for dramatic improvements in productivity and efficiency.


This transformation requires a lot of changes to how the software is developed and deployed. Before going too far down this path, it is important to make sure there is a solid foundation in place to support the changes.


Executives need to understand the basic challenges of their current architecture and work to improve it over time. The build process needs to support managing different artifacts in the system as independent entities.


Additionally, a solid, maintainable test automation framework needs to be in place so developers can trust the ability to quickly localize defects in their code when it fails. Until these fundamentals are in place, you will have limited success effectively transforming your processes.