Thursday Sep 24, 2009

Second Order

I had "that argument" with someone again recently. You know, the one about Java performance vs. C code. It's an argument I am very tired of having and I'm tempted to go "Barney Frank" on the next person to suggest there's a difference that matters.

One of the supposed advantages of C is that it offers greater opportunity for optimization. Maybe I have been looking at the wrong code lately but I see very little code these days that instruction-level optimization, inline assembly or hand tweaking would make better. The biggest problem is not that careful optimization of small functions fails to produce speed improvements but that those improvements are, overall, of marginal benefit. Here's some really bad code:
static void main(String[] args) {
    List sortedArgs = new LinkedList();
    for(String arg : args) {

Even though this is just a contrived example, yes, I do regularly see code this inefficient. Would optimizing the sort() operation have much impact on the overall behaviour of this program? How about the choice of LinkedList? Would using a bulk insertion strategy such as addAll() help? If there are many arguments passed in, the biggest gains would almost certainly come from sorting the arguments only once and possibly using a different data structure to store the strings. Let's try that again:
static void main(String[] args) {
    String[] sortedArgs = Arrays.copyOf(args, args.length);

As cautions against premature optimization suggest, it is critical to carefully examine why your hot code is burning so many cycles before optimizing. Frequently the real problem is with the calling code and not the exact piece of code which shows up in the profiler as your hot spot. When planning optimizations it's usually best to look "up the stack" and evaluate the usage pattern for the bottleneck code being called before diving in to make a targeted fix. Most likely, at best, you'll only get a marginal improvement by making a local optimization. For the really big wins you should be focusing on the higher order causes.
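
Since the listings above are incomplete, here is a minimal sketch of the contrast they describe. It assumes the "really bad" version re-sorts the list after every insertion, which is what the discussion of sort(), LinkedList and "sorting the arguments only once" implies. The class name SortOnce and the method names sortEachInsert and sortOnce are hypothetical, chosen only for this illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

public class SortOnce {
    // Assumed shape of the slow version: a full sort after every insertion.
    static List<String> sortEachInsert(String[] args) {
        List<String> sortedArgs = new LinkedList<String>();
        for (String arg : args) {
            sortedArgs.add(arg);
            Collections.sort(sortedArgs); // runs once per argument
        }
        return sortedArgs;
    }

    // The "higher order" fix: copy the arguments and sort exactly once.
    static String[] sortOnce(String[] args) {
        String[] sortedArgs = Arrays.copyOf(args, args.length);
        Arrays.sort(sortedArgs);
        return sortedArgs;
    }

    public static void main(String[] args) {
        System.out.println(sortEachInsert(args));
        System.out.println(Arrays.toString(sortOnce(args)));
    }
}
```

Both methods produce the same ordering. No amount of tuning inside sort() would rescue the first one, because the waste is in how often the caller invokes it.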

Tuesday Oct 21, 2008

Who am I writing this for?

I'm working with a new code base and a new build process for a small project. The project makes use of the 'jstyle' code style checker tool. I've been running seriously afoul of its rules by having the temerity to leave extra whitespace at the end of lines and add extra whitespace in expressions.

I've complained before about overly rigorous coding standards but the "remove your extra whitespace" insistence is driving me nuts. I'm sorry, I just don't care. My editor puts in whitespace to make aligning text easier. I can't see whitespace and I normally don't scour my source to make sure that my editor hasn't added any. Since the compiler doesn't care, why complain about it? (I also happen to like using "flowed" format for Javadoc, in the hope that someday my text editor will free me from having to do line wrapping manually within documentation comments.)

There's a lot of stuff that just doesn't matter. There's a limit to how much "extra" energy people have for "doing the right thing". Being pedantic about minor issues depletes that energy without accomplishing much.

Tuesday Mar 11, 2008

80 Column Blues

There's generally a lot of disagreement about the best way to format code. One way to make every coding style unreadable is to enforce or require an 80 column/character line limit. A lot of coding standards still include an 80 character line length limit, including the Code Conventions for the Java Programming Language. I've also seen 72 and 77 character line length limits in other coding standards. An 80 character line limit may have been a necessity in 1968, made good sense in 1978, was probably useful in 1988 and possibly helpful in 1998 but it's definitely annoying in 2008.

Left to my own devices I find that for block comments I write to about 100 characters per line, which seems to be about the same line length as a typeset book, i.e. a comfortable reading length. For code my average line length, not considering leading whitespace, is probably 40 characters or less, again a comfortably short reading length. With whitespace my average line of code generally ends somewhere between column 65 and 75. Using inner classes and other constructs can result in deeply nested braces producing 60 or more characters of initial whitespace. This is where 80 character line limits really make things look ugly. I've seen code that has been "de-indented" (leading whitespace was removed) and still had fewer than 30 non-whitespace characters per line. The result was nearly unreadable.

In my own code I occasionally write lines with more than 80 non-whitespace characters. These are commonly logging, method calls, and statements using my friend, "?", the infamous "Riddler" conditional operator. Even though these statements are longer they generally aren't too difficult to parse and I don't find the extended line length a problem.

My current conclusions are:

  • Except for Cro-Magnon freaks we all have text editors that can easily edit text that's wider than 80 columns.
  • Forced line breaks to meet width requirements don't make code more readable. Forced line breaks often reduce readability.
  • 80 column hard limits should be dropped from modern coding standards.
  • For block non-code text the width used should be a comfortable reading width. Really long lines of text are hard to read.
  • Consideration of maximum acceptable line length should ignore any leading whitespace. The intelligibility, or lack thereof, of a line of code is not strongly tied to the total line length but to the number of non-whitespace characters in the line.
  • The readability of a line of code is probably related to the density of the syntax. Lines using only simple syntax or syntax that can be easily decomposed can be longer and remain readable.
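
The suggestion to ignore leading whitespace when judging line length could be sketched roughly as follows. This is only an illustration: the class name LineLengthCheck, the method names and the MAX_CONTENT_LENGTH value are all hypothetical, and it does not describe how jstyle or any other existing checker works.

```java
import java.util.Arrays;

public class LineLengthCheck {
    // Hypothetical limit, picked only for this example.
    static final int MAX_CONTENT_LENGTH = 80;

    // Number of whitespace characters (spaces or tabs) at the start of the line.
    static int leadingWhitespace(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return i;
    }

    // Line length with the indentation left out of the count.
    static int contentLength(String line) {
        return line.length() - leadingWhitespace(line);
    }

    static boolean tooLong(String line) {
        return contentLength(line) > MAX_CONTENT_LENGTH;
    }

    public static void main(String[] args) {
        // A short statement nested 60 columns deep.
        char[] indent = new char[60];
        Arrays.fill(indent, ' ');
        String nested = new String(indent) + "return cachedValue == null ? load() : cachedValue;";
        System.out.println("total length:   " + nested.length());
        System.out.println("content length: " + contentLength(nested));
        System.out.println("too long:       " + tooLong(nested));
    }
}
```

A deeply nested but short statement passes this check even though its last character sits well past column 80, while a genuinely dense line is flagged regardless of where it starts.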


