Plan Analysis: CoCoMo
By Bob Hueston on Apr 25, 2007
This is the third in the series of analytical techniques for plans: CoCoMo.
CoCoMo
CoCoMo is the Constructive Cost Model, which is an empirical model for software development projects. The model was created by examining many projects, from small to large, simple to complex, using various programming languages. You give it the number of lines of code, and other information about your product, your team, and your development environment, and it tells you how long projects like this normally take. The model is very accurate; quite frankly, eerily accurate.
I was first introduced to CoCoMo in 1987, when I was working in the aerospace industry. Over the ten years that followed, I used CoCoMo as an integral part of all software planning. Even after I left aerospace for commercial product development, I continued to use CoCoMo and evangelize it to others.
Overview
CoCoMo works by, well, quite frankly, I have no idea how it works. It just does. The CoCoMo model was developed by reviewing many projects, from small to large, embedded to interactive, and an equation was developed that best fit the empirical data. Actually, several equations were developed, ranging from a simple (basic) version with just a few variables to a complex (expert) version with dozens of variables.
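The post does not spell out the equations. As a rough illustration of what the simple (basic) version looks like, the form published with the original 1981 model is usually quoted as effort = a * KLOC^b and schedule = c * effort^d. The sketch below uses the commonly cited coefficients for the three project modes; the function name and the 32 KLOC input are illustrative assumptions, not something taken from the post.

```python
# Commonly cited Basic COCOMO (1981) coefficients, keyed by project mode.
# effort   = a * KLOC**b      (staff-months)
# schedule = c * effort**d    (calendar months)
BASIC_COEFFICIENTS = {
    "organic":       (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded":      (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc, mode="organic"):
    """Return (effort in staff-months, schedule in calendar months)."""
    a, b, c, d = BASIC_COEFFICIENTS[mode]
    effort = a * kloc ** b
    schedule = c * effort ** d
    return effort, schedule


# 32 KLOC is an arbitrary size chosen only to compare the three modes.
for mode in BASIC_COEFFICIENTS:
    effort, schedule = basic_cocomo(32, mode)
    print(f"{mode:14s} {effort:7.1f} staff-months over {schedule:5.1f} months")
```

The only inputs here are the size and a coarse project mode, which is why this form counts as the one with "just a few variables".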
When I learned how to use CoCoMo, there were worksheets that you'd fill out, then you'd spend a few minutes crunching the numbers and equations. One of the first things I did as a junior engineer was put the equations into a Lotus 1-2-3 spreadsheet. [This was back when most engineers had TI-55 III calculators and some had a VT-220 terminal on their desk. Few even had PCs or knew what Lotus 1-2-3 was. I wonder how many young engineers today know what Lotus 1-2-3 was.] Today, there are online versions, including one from the University of Southern California, which greatly simplify the task.
The new tools are very simple to use. You start by entering the number of source lines of code: new, reused or modified. Then you answer several questions to define "attributes". The lines of code plus the attributes constitute the "variables" of the CoCoMo model equations. As a suggestion, leave all of the attributes at "nominal" and review the questions to see if any attributes really should be adjusted up or down; nominal works for most things. The attributes are divided into four categories: product, project, platform and personnel, described below.
Product attributes include how reliable the product needs to be (is it the software that controls the autopilot system for a commercial jetliner, or is it xeyes?), size and complexity of the database, and product complexity (are the algorithms well understood, or cutting edge?).
Project attributes cover how the project is executed: the use of engineering tools and development methodologies, extent of distributed collaboration required, and the overall schedule demands.
Platform attributes include execution and memory constraints (is the platform an 8051 with 128 bytes of RAM, or a high-end server with 128 CPUs and a terabyte of RAM?). They also include platform volatility (is the hardware still in development, or is it a mature product that is already shipping?).
Personnel attributes address how capable the engineering team is, their familiarity with the product, the platform, and the language. There is always a tendency to claim your team members are above average, but in reality, most teams are "nominal".
After setting all of the attributes, you click a button, and it gives you a measure of the staff-months to execute the project, as well as a schedule (calendar months) measure. This isn't to say that your project will take this long or cost this much. But it is a measure of what similar projects have cost.
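The post does not say which calibration the online tool uses. Assuming it follows the published COCOMO II.2000 post-architecture form, the effort figure is a size term raised to an exponent, multiplied by one "effort multiplier" per attribute, with nominal attributes contributing 1.0. In the sketch below the constants are that calibration as commonly published, while the attribute names and the two multiplier values are placeholders chosen for illustration, not values from the model's tables.

```python
import math

A = 2.94                          # effort constant (COCOMO II.2000, assumed)
B = 0.91                          # base exponent (COCOMO II.2000, assumed)
NOMINAL_SCALE_FACTOR_SUM = 18.97  # the five scale factors, all left at nominal


def estimate_effort(ksloc, effort_multipliers=None,
                    scale_factor_sum=NOMINAL_SCALE_FACTOR_SUM):
    """Staff-months = A * size**E * product of the effort multipliers."""
    exponent = B + 0.01 * scale_factor_sum
    multipliers = effort_multipliers or {}
    # Attributes left at nominal are simply omitted (they would multiply by 1.0).
    return A * ksloc ** exponent * math.prod(multipliers.values())


# Hypothetical ratings: one attribute raised, one lowered, the rest nominal.
ratings = {"required_reliability": 1.10, "personnel_capability": 0.90}

nominal = estimate_effort(20)
adjusted = estimate_effort(20, ratings)
print(f"nominal:  {nominal:.1f} staff-months")
print(f"adjusted: {adjusted:.1f} staff-months")
```

Because the attributes enter as plain multipliers, leaving everything at nominal and then adjusting one or two ratings changes the answer by exactly the product of the adjusted values.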
CoCoMo: Historical Examples
As an example, take a small project my team just completed. It took a little over two years to develop the software. For the first year I had two people working on it; for the remaining year and a quarter there was just one person available to work on the project. Total cost was approximately 39 staff-months (2 x 12 + 1 x 15), and in the end there were 9,600 lines of C++ code.
The software ran on an existing OS and an existing CPU, with enough memory and storage. But it was controlling a newly designed hardware system that attached to the computer, so I set the platform volatility attribute to "high". I left all other attributes at nominal. If I wanted to spend more time, I could probably tweak them, but for a quick demo, I just accepted the default settings. I plugged these values into the CoCoMo tool, and it predicted a cost of 37.9 staff-months, within 3% of the actual cost. CoCoMo also predicted that the project could have been completed in about a year with a little more than three people. Perhaps, but my schedule was driven more by staff and hardware availability than by time-to-market.
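The "about a year with a little more than three people" figure can be approximated from the 37.9 staff-month estimate. This is a sketch that assumes the tool uses the COCOMO II.2000 schedule equation with all scale factors at nominal; the post does not confirm the calibration, and the constants and function name below are assumptions.

```python
C = 3.67                          # schedule constant (COCOMO II.2000, assumed)
D = 0.28                          # base schedule exponent (COCOMO II.2000, assumed)
NOMINAL_SCALE_FACTOR_SUM = 18.97  # the five scale factors, all left at nominal


def schedule_months(effort_staff_months,
                    scale_factor_sum=NOMINAL_SCALE_FACTOR_SUM):
    """Calendar months = C * effort**F, with F = D + 0.2 * 0.01 * sum(SF)."""
    exponent = D + 0.2 * 0.01 * scale_factor_sum
    return C * effort_staff_months ** exponent


effort = 37.9                     # staff-months reported by the tool in the post
months = schedule_months(effort)
average_staff = effort / months   # average headcount over the schedule

print(f"schedule: {months:.1f} calendar months")
print(f"average staff: {average_staff:.1f} people")
```

Under these assumptions the sketch lands close to a year and a little more than three people, in line with what the tool reported.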
Another project I completed recently had 95,000 lines of code, and took a team of 12 people just under three years to complete and ship. That comes to about 420 staff-months of development. I plugged the 95,000 number into CoCoMo, and since there was nothing earth-shattering about the project, I left all the attributes at nominal. CoCoMo came up with 439.8 staff-months. OK, CoCoMo is high by almost 20 staff-months, but keep in mind that's an error of only 4.7%. Pretty good, when you realize that all I gave CoCoMo was one number: the lines of code. Next time, I'll tell CoCoMo that my team is above average in capability.
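With every attribute at nominal, the estimate depends only on the size. As a sketch, assuming the tool uses the published COCOMO II.2000 constants (an assumption; the post does not name the calibration), the all-nominal case collapses to a one-line formula, and the error against the actual 420 staff-months is simple arithmetic.

```python
def nominal_effort(ksloc):
    """All-nominal COCOMO II.2000 effort in staff-months (assumed calibration).

    2.94 is the effort constant; the exponent is 0.91 plus 0.01 times the
    sum of the five nominal scale factors (18.97).
    """
    return 2.94 * ksloc ** (0.91 + 0.01 * 18.97)


actual = 420                     # staff-months actually spent, from the post
predicted = nominal_effort(95)   # 95,000 lines of code = 95 KSLOC
error = (predicted - actual) / actual

print(f"predicted: {predicted:.1f} staff-months")
print(f"error vs actual: {error:.1%}")
```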
Using CoCoMo on historical data is a nice confirmation of the model. But a model is only valuable if it can predict the future, and this model is only useful if your variables are accurate. Specifically, you need an accurate measure of the source lines of code that you're going to develop or reuse. But quite frankly, I often find it's easier for many engineers to tell me the amount of code they need to produce rather than the amount of time it's going to take them.
If you can find another project that is similar, it can be fairly easy to come up with a reasonably accurate measure of the lines of code you will need to develop. Consider the last example. I have an old email from before the project started in which someone points out that this project is about half the scope of another project we had finished the previous year. I just checked, and that earlier project developed 208,000 lines of code. With 104,000 lines of code, CoCoMo predicts 485 staff-months of effort. A very quick and rough back-of-the-envelope measure predicted the number of lines of code to within 10% of actual, and CoCoMo gave us a measure of the staff costs to within about 15%. Not bad for 10 minutes of analysis. Compare that to the four weeks we spent at the start of the project listing all of the high-level requirements, decomposing them into tasks and sub-tasks, and creating plans: CoCoMo is much faster and more accurate.
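The sizing-by-analogy arithmetic in this example can be written out as a short sketch. All numbers come from the post; the variable and function names are illustrative.

```python
def relative_error(estimate, actual):
    """Absolute difference between estimate and actual, as a fraction of actual."""
    return abs(estimate - actual) / actual


reference_sloc = 208_000   # size of the earlier, similar project
scope_ratio = 0.5          # "about half the scope", per the old email
estimated_sloc = reference_sloc * scope_ratio

actual_sloc = 95_000       # what the project actually shipped
estimated_effort = 485     # staff-months CoCoMo reported for 104,000 lines
actual_effort = 420        # staff-months actually spent

size_error = relative_error(estimated_sloc, actual_sloc)
effort_error = relative_error(estimated_effort, actual_effort)

print(f"estimated size: {estimated_sloc:,.0f} lines ({size_error:.1%} off)")
print(f"estimated effort: {estimated_effort} staff-months ({effort_error:.1%} off)")
```

The effort error comes out larger than the size error because the model's effort grows slightly faster than linearly with size.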
CoCoMo in Plan Analysis
CoCoMo does not develop plans for you. It is a tool for analyzing plan data.
I find the best way to use CoCoMo is to confirm or contradict the detailed planning work that you are doing. After defining your tasks and measuring them, the work adds up to some total cost for the project. You can then use CoCoMo to see if the sum of the tasks is reasonable, as a sanity check. If your detailed plan differs from CoCoMo by more than, say, ten or twenty percent, I would start to worry.
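As a sketch of that sanity check: total the bottom-up task estimates, compare the sum with the model's figure, and flag the plan when the two differ by more than a chosen tolerance. The task names, the numbers, and the 20% default are hypothetical placeholders.

```python
def check_plan(task_estimates, model_estimate, tolerance=0.20):
    """Compare a bottom-up plan total (staff-months) with a model estimate.

    Returns (plan_total, deviation, within_tolerance), where deviation is the
    signed difference as a fraction of the model estimate.
    """
    plan_total = sum(task_estimates.values())
    deviation = (plan_total - model_estimate) / model_estimate
    return plan_total, deviation, abs(deviation) <= tolerance


# Hypothetical bottom-up plan, in staff-months per task group.
tasks = {"design": 40, "implementation": 120, "integration and test": 60}

plan_total, deviation, ok = check_plan(tasks, model_estimate=250)
print(f"plan total: {plan_total} staff-months, deviation: {deviation:+.0%}")
print("plan looks reasonable" if ok else "worth a closer look")
```

A signed deviation is kept so that you can tell whether the detailed plan is lighter or heavier than what similar projects have cost.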
I've also found CoCoMo to be an excellent independent tool for defending project cost. When selling a plan, you can present detailed plans and show how you came up with your costs. Then you can present how CoCoMo confirms your analysis with a similar cost figure. When a person can back up a plan with a well-established model such as CoCoMo, it adds a lot of credibility to the plan.
Other Plan Analysis Techniques: