Sunday Jun 14, 2009

Audio for JavaOne interview available

A couple of weeks back I recorded an interview in which I discussed The Developer's Edge. I've just found the audio up at BlogTalkRadio; it's about 15 minutes long.

Thursday Jun 11, 2009

Code complete: burn this chapter

That's a somewhat inflammatory title for this post, but continuing from my previous post on the book Code Complete, I think that the chapters (25 & 26) on performance really do not contain good advice or good examples.

To make this more concrete, consider the example on pg 593 where Steve McConnell compares the performance of these two code fragments:

for i = 1 to 10
  a[ i ] = i
end for

a[ 1 ] = 1
a[ 2 ] = 2
a[ 3 ] = 3
a[ 4 ] = 4
a[ 5 ] = 5
a[ 6 ] = 6
a[ 7 ] = 7
a[ 8 ] = 8
a[ 9 ] = 9
a[ 10 ] = 10

Steve finds that Visual Basic and Java run the unrolled version of the loop faster.

There are a couple of examples that talk about incorrect access ordering for arrays. Here's some C code that illustrates the problem:

Slow code:

for (column=0; column < max_column; column++)
  for (row=0; row < max_row; row++)

Fast code:

for (row=0; row < max_row; row++)
  for (column=0; column < max_column; column++)

On page 599 it is suggested that the slow code is inefficient because it might cause paging to disk; on page 623 it is suggested that the higher trip count loop should be inside to amortise the initialisation overhead for each execution of the inner loop. Neither of these explanations is right. As I'm sure most of you recognise, the code is slow because of cache misses incurred when accessing non-adjacent memory locations. There is a cost to initialisation of the inner loop, but nothing significant, and yes, you could get paging to disk, but only if you are running out of memory (and if you're running out of memory, you're hosed anyway!). You're more likely to get TLB misses (and perhaps that is what Mr McConnell intended to say).
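To make the cache argument concrete, here is a minimal sketch of the two loop orders with a loop body filled in. The array name, its dimensions, and the function names are my own assumptions for illustration; they are not taken from the book. C lays out a two-dimensional array row by row, so the row-first loop touches adjacent memory while the column-first loop strides across it.

```c
#include <stddef.h>

/* Hypothetical dimensions chosen for illustration only. */
#define MAX_ROW 512
#define MAX_COLUMN 512

/* C stores this row by row: grid[row][0] .. grid[row][MAX_COLUMN-1] are adjacent. */
static double grid[MAX_ROW][MAX_COLUMN];

/* Set every element to 1.0 so both sums have a known value. */
void fill_grid(void)
{
    for (size_t row = 0; row < MAX_ROW; row++)
    {
        for (size_t column = 0; column < MAX_COLUMN; column++)
        {
            grid[row][column] = 1.0;
        }
    }
}

/* "Slow" order: consecutive accesses are a whole row apart in memory,
   so each access is likely to land on a different cache line. */
double sum_column_first(void)
{
    double total = 0.0;
    for (size_t column = 0; column < MAX_COLUMN; column++)
    {
        for (size_t row = 0; row < MAX_ROW; row++)
        {
            total += grid[row][column];
        }
    }
    return total;
}

/* "Fast" order: consecutive accesses are adjacent in memory,
   so each cache line fetched is fully used before moving on. */
double sum_row_first(void)
{
    double total = 0.0;
    for (size_t row = 0; row < MAX_ROW; row++)
    {
        for (size_t column = 0; column < MAX_COLUMN; column++)
        {
            total += grid[row][column];
        }
    }
    return total;
}
```

Both functions compute the same sum; only the order in which memory is visited differs. Whether you see a measurable gap depends on the array size, the cache sizes, and whether the compiler interchanges the loops for you.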

I consider the above issues to be quite serious and, unfortunately, I'm not terribly happy with the rest of the material either. Hence my recommendation to ignore (or burn ;) these chapters. I'll go through my other reservations now.

Lack of details. The timing information is presented with no additional information (pg 623) "C++ Straight Time = 4.75 Code-Tuned Time = 3.19 Time Savings = 33%". What was the compiler? What compiler flags were given? What was the test harness?

The book presents this as if "C++" somehow runs this code slowly, but in reality it's more likely to be a test of the effectiveness of the compiler, and of the user's ability to use the compiler. I'd be surprised if any compiler with minimal optimisation enabled did not do the loop interchange operation necessary to get good performance. Which leads to my next observation:

Don't compilers do this? I think the book falls into one of the common "optimisation book" traps, where lots of ink is spent describing and naming the various optimisations. This gives the false impression that it is necessary for the expert programmer to be able to identify these optimisations and apply them to their program. Most compilers will apply all these optimisations - after all, that is what compilers are supposed to do - take the drudgery out of producing optimal code. It's great for page count to enumerate all the possible ways that code might be restructured for performance, but for most situations the restructuring will lead to code that has the same performance.

Profiling. It's not there! To me the most critical thing that a developer can do to optimise their program is to profile it. Understanding where the time is being spent is the necessary first step towards improving the performance of the application. This omission is alarming. The chapter already encourages users to do manual optimisations where there might be no gains (at the cost of time spent doing restructuring that could be better spent writing new code, and the risk that the resulting code is less maintainable), but without profiling the application, the users are basically encouraged to do this over the entire source code, not just the lines that actually matter.

Assembly language. Yes, I love assembly language, there's nothing I enjoy better than working with it (no comment), but I wouldn't encourage people to drop into it for performance reasons, unless they had utterly exhausted every other option. The book includes an example using Delphi where the assembly language version ran faster than the high-level version. My guess is that the compilers had some trouble with aliasing, and hence had more loads than were necessary - a check of the assembly code that the compilers generated would indicate that, and then it's pretty straightforward to write assembly-language-like high level code that the compiler can produce optimal code for. [Note that I view reading and analysing the code at the assembly language level to be very useful, but I wouldn't recommend leaping into writing assembly language without a good reason.]
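As an illustration of the kind of aliasing problem I'm guessing at, here is a small C sketch. It is not the Delphi example from the book, and the function names are hypothetical. In the first version the result is updated through a pointer that could, as far as the compiler knows, point into the input array, so it may have to write the running total back to memory on every iteration. The second version accumulates in a local variable, which the compiler is free to keep in a register.

```c
#include <stddef.h>

/* total might alias an element of values, so the compiler may be forced
   to store *total on every trip round the loop. */
void sum_via_pointer(const int *values, size_t count, int *total)
{
    *total = 0;
    for (size_t i = 0; i < count; i++)
    {
        *total += values[i];
    }
}

/* "Assembly-language-like" version: accumulate in a local variable,
   which cannot alias anything, and write the result out once at the end. */
void sum_via_local(const int *values, size_t count, int *total)
{
    int running = 0;
    for (size_t i = 0; i < count; i++)
    {
        running += values[i];
    }
    *total = running;
}
```

When total does not point into values, both functions give the same answer. Whether the first one really generates extra memory traffic depends on the compiler and flags, which is exactly why looking at the generated assembly is worthwhile before rewriting anything.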

So what would I recommend?

  • Profile. Always profile. This will indicate where the time is being spent, and what sort of gains you should expect from optimising parts of the application.
  • Know the tools. Make sure that you know what compiler flags are available, and that you are requesting the right kind of things from the compiler. All too often there are stories about how A is faster than B, which are due to people not knowing how to use the tools.
  • Identify those parts of the code where the time is spent, and examine them in detail to determine if it's a shortcoming of the compiler, the compiler flags, or an ambiguity in the source code, that causes time to be spent there. Many performance problems can be solved by adding a new flag, or perhaps a minor tweak to the source code.
  • Only when you have exhausted all other options, and you know that you can get a significant performance gain should you start wildly hacking at the source code, or recoding parts in assembly language.

The other thing to recommend is a read of Bart Smaalders' Performance Anti-patterns.

Wednesday Jun 10, 2009

Code complete: coding style

I read Code Complete a couple of years back. It's an interesting book to read, but there were two parts that annoyed me. I was giving a presentation the other week on "Coding for performance" and I happened to mention the book, and say that I had these two reservations. So I figure I should write them up more formally.

My first issue was, basically, me just being a stick-in-the-mud. Those of you who regularly read my blog will see that I favour the following style of indenting:

  if (some condition)
  {
     do something;
  }

If you've read Solaris Application Programming, you'll see that I actually use quite a few styles. In writing that book, there were particular places where there was limited space on the page and I ended up juggling the style to make it fit the medium. So I have preferences, but I'm pragmatic.

Anyway, CC on page 746 identifies my preferred style as "unindented begin-end pairs" and says the following: "Although this approach looks fine, it violates the Fundamental Theorem of Formatting; it doesn't show the logical structure of the code."


So I wanted to read up on this Fundamental Theorem. Perhaps I'm misreading the text, but this is how it appeared to me: (pg 739) "A control construct in Visual Basic always has a beginning statement ... and it always has a corresponding End statement." (pg 740) "The controversy about formatting control blocks arises in part from the fact that some languages don't require block structures." (pg 740) "Uncoupling begin and end from the control structure - as languages like C++ and Java do - with { and } - leads to questions about where to put the begin and end. Consequently, many indentation problems are problems only because you have to compensate for poorly designed language structures." [Emphasis mine.] I read this as, basically, you need to make your untidy C/C++/Java code look more like VB. I guess that's why it's taken me a couple of years to calm down sufficiently to post this ;)

Actually, I don't entirely disagree. But let's return to this point in a moment. Let's start with the two approaches to indenting that are recommended in the book.

First of all, what is probably the most common style "pure block emulation":

  if (something) {
    do something;
  }

The other recommended style is "begin and end as block boundaries":

  if (something)
    {
    do something;
    }

On page 745, a study by Hansen and Yim (1987) indicates that there's no difference in understandability between these two styles. Excellent - so it doesn't matter! I'm sure that if "unindented begin-end pairs" were also included in the survey then it too would provide indistinguishable understandability.

Anyway, the difference between the recommended "begin and end as block boundaries" and the shunned "unindented begin-end pairs" is basically four spaces, which I don't personally think is a lot.

Heading back to why I might actually agree with some of his comments. It is very easy to introduce a bug in a program where the begin and end braces have been omitted. For example:

  if ( a > max )
    max = a;

  if ( a > max )
    printf("New max = %i\n",a);
    max = a;
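The fragments above can be turned into a small compilable sketch. The function names are hypothetical, chosen only for this illustration. In the unbraced version the indentation suggests that both statements belong to the if, but only the printf does, so the assignment now runs every time.

```c
#include <stdio.h>

/* Buggy: only the printf is controlled by the if. Despite the
   indentation, max = a is executed unconditionally. */
int track_max_unbraced(int a, int max)
{
    if ( a > max )
        printf("New max = %i\n", a);
        max = a;
    return max;
}

/* Fixed: the braces make both statements part of the if body. */
int track_max_braced(int a, int max)
{
    if ( a > max )
    {
        printf("New max = %i\n", a);
        max = a;
    }
    return max;
}
```

Calling track_max_unbraced(3, 10) returns 3, silently losing the real maximum, whereas track_max_braced(3, 10) returns 10 as intended.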

So, whilst I agree that the absence of brackets can be a problem, I don't necessarily think that rigid adherence to a particular style naturally follows as the only solution to that problem.

I do have some rules that I tend to obey:

  • Indenting is a personal/project preference. There are tools out there that can render source code pretty much how you like it. The UI is a view of the source, and it doesn't really matter what the style of the source is. If I find the source hard to read, then I can process it to make it conform to whatever layout works best for me to solve the problem that I'm working on.
  • Always use begin and end brackets. They add a single character and can avoid the problem demonstrated above.
  • I tend to favour clarity over a rigid adherence to particular styles. I'm not above placing an entire statement on a single line when I feel that it is the best way to present the information. Taking the previous example:
    Multi-line:
      if ( a > max ) {
        max = a;
      }
    Single line:
      if ( a > max ) { max = a; }

Wednesday Jun 03, 2009

The Developer's Edge talk in Second Life

Just finished talking in Second Life. The slides from the talk are available from SLX. I've got into the habit of writing a transcript for my SL presentations - basically in case the audio fails for some reason.

The talk focuses a bit more on the way that people now get information (through blog posts, articles, indexed by search engines) and the Q&A after the talk was more about that than the technical content of the book. This is a domain that I've given a fair amount of thought to. When writing technical books there is a challenge to balance the information so that it includes the necessary details without writing material that will be out of date by the time that the book hits the press. Fortunately a large amount of the information that developers need is relatively long lived. The challenges come when describing a particular revision of the software, or a particular processor - details which can be very useful for people, but also details which may not age gracefully!

Tuesday Feb 12, 2008

CMT related books has a side bar featuring books that are relevant to CMT. Mine is featured as one of the links. The other two books are Computer Architecture - a quantitative approach which features the UltraSPARC T1, and Chip-Multiprocessor Architecture which is by members of the team responsible for the UltraSPARC T1 processor.


Darryl Gove is a senior engineer in the Solaris Studio team, working on optimising applications and benchmarks for current and future processors. He is also the author of the books:
Multicore Application Programming
Solaris Application Programming
The Developer's Edge