Firefighting in Product Development: Past the Tipping Point
By Josh Simons on May 18, 2006
Give an engineer a problem involving interactions between lots of complicated bits of technology and she'll go and solve it. It happens all the time. So why is it that we don't turn those powers of analysis loose on the complicated system that we call the software development process? Applying some rigor to understanding the dynamics of the development process makes huge sense. Turning the crank--producing high-quality products quickly and efficiently--is the essence of product engineering. Doing it well can get new products to market more quickly with higher quality and save your company millions of dollars in the process. Doing it well makes your customers happy and can create scads of new customers.
Well, good news. A group of MIT researchers has applied system dynamics to create mathematical models of the product development cycle and has published the results in several papers. One of my favorites, and the topic of this blog entry, is "Past the Tipping Point: The Persistence of Firefighting in Product Development" by Repenning, Goncalves, and Black at the MIT Sloan School of Management.
Some of their findings are surprising and counterintuitive.
Consider, for example, firefighting--the unplanned allocation of resources to fix problems late in the development cycle. We all do it. We suffer the short-term scheduling hit on other projects to do the diving save. It hurts and we move on.
But it turns out it isn't that simple. The research shows that these short-term reallocations of resources can, in fact, have serious long-term effects on an organization's ability to turn the product development crank. Think death spiral.
For the purposes of explication, the authors created a simple model that illustrates their primary findings without getting into the messier details of their more complex, true-to-life models. The model is based on several assumptions.
First, assume the organization releases a product every 12 months. The development cycle for a product consists of a 12-month concept development phase, followed by a 12-month product design and testing phase. At any given time, the organization is working on two products simultaneously--the design and testing phase for this year's product and the concept phase for next year's product.
Second, assume that the overall resource pool for the organization is fixed. If unanticipated problems arise during the design and test phase for this year's product, resources will be borrowed from next year's concept development work. This is the typical firefighting scenario in which longer term work is either skipped or delayed to deal with shorter term problems.
And, third, the model assumes that upstream concept development work that is skipped (for example, creating clear specifications of customer requirements) will create additional rework in the subsequent design and test phase for that product. There is ample industry evidence to support the reasonableness of this assumption.
The resulting system dynamics model (see diagram from the paper, below) has two feedback loops. In the Rework Loop, more design problems in this year's product lead to more resources being tasked to fix those problems, which in turn reduces the number of design problems in the product. The other, the Tipping Loop, is intimately tied to this first loop. In the Tipping Loop, as the resources assigned to do rework on this year's product increase, the resources available to work on the concept phase for next year's product decrease. This, in turn, results in less work being done on concept development activities for next year's product, which then, after a delay, results in more design problems in the subsequent design and test cycle.
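To make the two loops concrete, here is a minimal Python sketch of the three assumptions above. It is my own toy version, not the paper's equations: the function name `simulate` and every number in it (a pool of 110 units, 50 units each of planned design and concept work, 1.5 units of rework per unit of skipped concept work) are assumptions picked purely for illustration.

```python
def simulate(shock, years=12, pool=110.0, design_base=50.0,
             concept_needed=50.0, rework_multiplier=1.5):
    """Toy firefighting model (illustrative assumptions, not the paper's).

    Returns the concept work completed in each year after a one-time
    burst of unplanned design problems ("shock") in year 0.
    """
    skipped = 0.0   # concept work skipped in the previous year
    history = []
    for year in range(years):
        # Unplanned problems arrive only once, in the first year.
        unplanned = shock if year == 0 else 0.0
        # Tipping Loop: last year's skipped concept work comes back,
        # amplified, as rework in this year's design and test phase.
        rework = rework_multiplier * skipped + unplanned
        # Fixed pool: planned design work and firefighting are served
        # first, and concept work gets whatever is left over.
        available = pool - design_base - rework
        concept_done = min(concept_needed, max(0.0, available))
        skipped = concept_needed - concept_done
        history.append(concept_done)
    return history


if __name__ == "__main__":
    for shock in (25.0, 35.0):
        print(shock, [round(x, 1) for x in simulate(shock)])
```

With these assumed numbers, a shock of 25 units is absorbed and concept work climbs back to the full 50 units within a few years, while a shock of 35 units drives concept work to zero, where it stays. The only difference between the two runs is the size of a single temporary disruption.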
So, what happens when you run the model under various scenarios? The model has a tipping point, as illustrated in the diagram below. The tipping point, which is marked with the blue circle, divides the phase space into two regions. The arrows indicate the direction of motion in phase space for each of the two regions. In the bad region, the amount of up-front work completed this year is low enough that it causes significant rework in the following design and test phase. That rework pulls resources off of the up-front work being done in next year's concept phase, which in turn creates even more defects in the subsequent design and test phase, and so on. In the good region the loop runs the other way: enough up-front work gets done that the rework shrinks from one year to the next. The diagram illustrates both the vicious and virtuous cycles.
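One way to see why such a point exists is a toy one-variable map. This is an illustrative assumption of mine, not the paper's model. Let s_t be the concept work skipped in year t, S the total concept work planned per year, m the rework created per unit of skipped work, and k the slack capacity left over when nothing goes wrong:

```latex
\begin{align*}
  s_{t+1} &= \min\bigl(S,\ \max(0,\ m\,s_t - k)\bigr) \\
  s^{*} &= m\,s^{*} - k
    \;\Longrightarrow\; s^{*} = \frac{k}{m-1} \\
  s_{t+1} - s^{*} &= m\,(s_t - s^{*})
    \quad \text{(while neither bound is hit)}
\end{align*}
```

If m > 1 and s* lies between 0 and S, any deviation from s* is multiplied by m every year. Start just below s* and the skipped work decays to 0; start just above it and the skipped work grows until it hits S. That unstable fixed point plays the role of the tipping point. For example, with m = 1.5 and k = 10, s* = 20.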
The research shows that product development organizations have a Tipping Point and if you push them too far, you run the risk of pushing the organization into a death spiral of decreasing capability. Even worse, just as airline pilots cannot rely on their innate sense of physical orientation to pilot a plane, engineering management's intuition just doesn't work in dealing with these kinds of organizational problems. Our gut says temporary resource shifts should have only temporary ramifications. At some point, that just ain't so.
Pilots can be trained to fly on instruments. It remains to be seen if the same can be said for engineering management. This research is an attempt to raise awareness of the issues and highlight some of the non-intuitive ramifications of persistent firefighting in development organizations. I recommend reading the paper for a much more complete explanation of the model and its consequences.