Like most developers, I believe, I create classes of two distinct types: the ones where my head is full of contextual information and I need to bang out a class that meets my immediate needs, and the ones that I patiently and carefully design. The former constitute the majority of my classes, alas, and they are initially sloppy beasts: I hit a keystroke combo and the IDE creates a new class with the copyright block in place, I give it a name, and I immediately start hacking. (If I’m refactoring code, then the IDE will either pull in the code from another class or generate new code for me. I love today’s tools!) I can go for a while putting into the code all the things teeming in my brain without fear that I am creating an unsightly tangle. I have no fear because I know I will refactor the code, write the necessary tests, compile it, finish cleaning it up, run the tests, and move on.
But I’m increasingly coming to the opinion that this series of steps, which is familiar to every developer, creates a lot of unnecessary activity. A better approach is to write down the same information in words rather than code. Suppose that, instead of code, I wrote the following comments:
“This class validates a ticket number by computing a pair of check digits. Since we acquired [company name], the check-digit algorithm varies by vendor, so first look up the vendor #. The constructor accepts the ticket number, and the principal method returns an enum indicating a valid number or the type of error. All exceptions are caught and converted to error enums. This method is called only by [class name] and so should be private.”
Now, I’ve captured what I wanted to put into code and my most immediate problem is coming up with a good name for the class. With the key details captured, I can develop it at leisure and write good code that does not need a lot of refactoring. If I’m a TDD zealot, I can start writing a test. Either way, I’m good to go.
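As a rough illustration, the comment above might turn into a skeleton like the one below before any real logic exists. This is only a sketch under stated assumptions: the names TicketValidator, Result, lookUpVendor, and computeCheckDigits are hypothetical, as are the specific error values, the idea that the check digits sit at the end of the ticket number, and the package-private visibility. The helper bodies are deliberately left unwritten.

```java
/**
 * Validates a ticket number by computing a pair of check digits.
 *
 * The check-digit algorithm varies by vendor, so the vendor number is
 * looked up first. Every exception is caught and converted to a Result,
 * so callers never need a try/catch.
 *
 * All names in this sketch are hypothetical.
 */
final class TicketValidator {

    /** Either VALID or the kind of error that was found. */
    enum Result { VALID, UNKNOWN_VENDOR, BAD_CHECK_DIGITS, INTERNAL_ERROR }

    /** The ticket number exactly as the caller supplied it. */
    private final String ticketNumber;

    /** Stores the ticket number; no validation happens until validate(). */
    TicketValidator(String ticketNumber) {
        this.ticketNumber = ticketNumber;
    }

    /**
     * Principal method: reports whether the ticket number is valid.
     * Never throws; any runtime exception becomes INTERNAL_ERROR.
     */
    Result validate() {
        try {
            int vendor = lookUpVendor();
            if (vendor < 0) {
                return Result.UNKNOWN_VENDOR;
            }
            // Assumption: the check digits are the last characters
            // of the ticket number.
            String expected = computeCheckDigits(vendor);
            return ticketNumber.endsWith(expected)
                    ? Result.VALID
                    : Result.BAD_CHECK_DIGITS;
        } catch (RuntimeException e) {
            return Result.INTERNAL_ERROR;
        }
    }

    /** Returns the vendor number for this ticket, or -1 if unknown. */
    private int lookUpVendor() {
        throw new UnsupportedOperationException("body not written yet");
    }

    /** Returns the two check digits the given vendor's algorithm expects. */
    private String computeCheckDigits(int vendor) {
        throw new UnsupportedOperationException("body not written yet");
    }

    public static void main(String[] args) {
        // The helpers are still stubs, so this prints INTERNAL_ERROR.
        System.out.println(new TicketValidator("1234567890").validate());
    }
}
```

Even at this stage the class compiles and runs, and a reader can learn its contract from the comments alone. Filling in the two helper bodies is then the only work left.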
An excellent new book called A Philosophy of Software Design advocates using comments as a design step for classes. The book is written by John Ousterhout, who is the creator of Tcl and Tk and who teaches at Stanford University (and earlier at Berkeley) when not working at his company, which specializes in large-project continuous delivery tools. Ousterhout suggests when creating a class that you use the following steps, which are a significant enlargement of what I’ve described above:
In Ousterhout’s experience, the benefits are threefold. First, when the code is done, it’s properly commented and the comments are entirely up to date. Second, the comment-first approach enables you to focus on the abstractions rather than being distracted by the implementation. Third, the comments will reveal complexity: If a method or variable requires a long, complex comment, it probably needs to be rethought and simplified. That’s a lot of benefits!
Of the things on which to comment, the most important in Ousterhout’s view are abstractions (which are difficult to tease out from reading the implementation code) and the reason why the code exists. In sum, a developer working on your code for the first time should be able to scan the class’s comments and have a good idea of what it does and an overview of the most important implementation aspects.
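To make that concrete, here is a small sketch of comments that state the abstraction and the reason for the code rather than restating the implementation. The class RecentLines and everything in it are hypothetical, invented for illustration only; they are not taken from Ousterhout’s book.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/**
 * A bounded window over the most recent lines of output.
 *
 * Why it exists (hypothetical scenario): crash reports should show the
 * last few log lines for context without keeping the whole log in memory.
 *
 * Abstraction: callers add lines and ask for a snapshot. Eviction of old
 * lines is invisible to them.
 */
final class RecentLines {

    /** Maximum number of lines retained; always positive. */
    private final int capacity;

    /** Retained lines, oldest first. Its size never exceeds capacity. */
    private final Deque<String> lines = new ArrayDeque<>();

    /** Creates an empty window that retains at most capacity lines. */
    RecentLines(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Records a line, dropping the oldest one if the window is full. */
    void add(String line) {
        if (lines.size() == capacity) {
            lines.removeFirst();
        }
        lines.addLast(line);
    }

    /**
     * Returns the retained lines, oldest first. The list is a copy, so
     * changing it does not affect this object.
     */
    List<String> snapshot() {
        return new ArrayList<>(lines);
    }
}
```

None of these comments says “removes the first element of the deque.” A newcomer can read that from the code. What the code cannot say is why the window is bounded or what callers may rely on, so that is what the comments cover.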
If this approach appeals to you, as it does to me, Ousterhout suggests that you use it until you’re accustomed to writing code this way. He believes, and I agree, that doing so will convert you by delivering cleaner, clearer code that is fully commented.
PS: Many open source projects wish they had more contributors, but their code is often so poorly commented and devoid of documentation that it’s impossible for potential contributors to climb on board. Ousterhout’s approach would go a long way toward addressing that problem.
Andrew Binstock (@platypusguy) is the lead developer on the Jacobin JVM project—a JVM written entirely in Go. He was formerly the editor in chief of Java Magazine, and before that he was the editor of Dr. Dobb’s Journal. Earlier, he cofounded the company behind the open source iText PDF library. He lives in Northern California with his wife, and when he’s not coding, he studies piano.