Tuesday Jun 28, 2005

I need to write code

So I have the odd bit of insomnia. You know, when you wake up at, say, 4am (like this morning), and you're instantly awake and thinking about stuff. Pondering. Planning. Designing. That's the most difficult bit of awakeness for me. Alan and I used to do on-call support for Australia and New Zealand, so we have this --- I guess it's a "talent" --- of being instantly awake when you wake up at night. It's a bitch, it really really is. So anyway, I have a tape diagnostic tool to maintain. It's all mine precious! and it was born out of frustration with discovering that tape drives log information which they are not obliged to pass up to the kernel. I need to update it for some new tape drive models, add in some more pretty-printing, (re-)write the documentation, and generally make it ready to hand over to our Explorer team. Not that they're going to maintain it, but so that they can use it and [SGR mode = "Potential Problem Analysis" submode = "Take Preventive Actions"] deal with the "if $developer gets hit by a bus" issue. I also contribute code on a definitely ad-hoc basis to a crash dump analysis tool we have called Solaris Crash Analysis Tool aka Solaris CAT. My focus for this at the moment is on portability. Since Solaris is a multi-architecture operating environment, we greatly prefer to minimise the architecture-specific code. Especially since you can get almost exactly the same kernel issues on x86, x64 and sparc. Of course since I've been back from my holiday it's been a little bit of a numbers game with my day job and I'm going to have to wait for another few days before I can sync up my workspace and get back to the portability. Then there's another tool I have in my head. It's still only at the planning stage, but it occupies my mind on my daily commute. It too is taking shape due to a lack of existing options. I like coding because it's creative and helps me to solve problems. When I can't code I get frustrated. More later, when I've written some code :)

Sunday Jun 26, 2005

The second question...

I remembered what the second exam question was: define a
struct Element
to hold single elements from the periodic table, including:
  • an int for the atomic number,
  • a string for the element's name, and
  • an array of 7 ints to hold the number of electrons in each shell
Then write a function
void getElement(Element \*element)
to get the information. Fairly simple really as long as you document any assumptions, make sure you include at least _some_ error checking, remember to typedef the struct and remember that since the routine is a void... we're hurtling into pointers. Of course talking about this question afterwards, a lot of my peers indicated that they had trouble handling the pointer aspects of the question: they couldn't get a handle on it. (Terrible geek joke, sorry!) Now I wonder whether the examiner will understand my use of the /\* ARGSUSED \*/ marker.....

Thursday Jun 23, 2005

I think the exam went well

Last Thursday (23rd June) I had my "Embedded" C programming exam. I think I went alright with it. 50 marks were allotted for writing actual code, and 50 marks for syntax-like stuff about the C programming language. Some of the syntax questions were a complete gift -- "what are the file handle names for the screen and keyboard in C?" and "what is the point of header files?"

The first "write me some code" question involved writing a routine to round a double to a provided number of decimal places. Now of course writing my response in the exam I was enamoured of my neat response:
double round(double number, int noOfPlaces) {

    double temp = number;

    temp \*= pow(10, noOfPlaces);
    temp += 0.5;
    return ( floor(temp) / pow(10, noOfPlaces));

Which was just fine until I got back to work (no rest for the wicked) and realised that I should have had some error checking in there for noOfPlaces being positive. So a more reliable or accurate version would be something like this:

double round(double number, int noOfPlaces) {

    double temp = number;

    if (noOfPlaces < 1) {
        return (number);

    temp \*= pow(10, noOfPlaces);
    temp += 0.5;
    return ( floor(temp) / pow(10, noOfPlaces));

And of course we should have some library variable (errno equivalent) to indicate an error condition.
No matter! On with the exam....

For the life of me I can't now remember what the second question was.

The third question was about numbers which are relatively prime (they have no common factors). We had to write a whole program to determine whether two numbers were or were not relatively prime.

Some people came out and mentioned that they had only found one set of factors, rather than checking the factors for both. Of course in a 'I have to code this in real life' situation I would have put in some short-circuits (are the input numbers equal? is one prime?) but this was an exam of course so I didn't really have the mental space to think of the useful short-circuits that I'd do otherwise. But I reckon I got a decent answer written for that one too.

What I found worrisome was that one of the girls who (if I'm thinking of the right person) did really well at Electronics-n-Circuits last semester came out of this exam in tears. Of course, it could be the case that she didn't write enough code for this subject, but I don't know.

So I think I did ok. Since I was on 40/40 from the assignments I'm hoping to do fairly well overall. Have to wait until July to find out though.

Friday Apr 15, 2005

Linus vs Tridge: asbestos suits needed to protect innovation

El Reg has found a posting from Linus on the BitKeeper vs Andrew Tridgell bunfight. It makes for interesting reading. I was particularly interested to read this paragraph:
...But that's not what Tridge did. He didn't write a "better SCM than BK". He didn't even try - it wasn't his oal. He just wanted to see what the protocols and data was, without actually producing any replacement for the (inevitable) problems he caused and knew about...
I find these interesting for two reasons. Firstly, Tridge seems to have been engaging in some research. This is something which most other people would find quite laudable. Research is also what drives innovation --- you can't improve upon something unless you know at least a bit (or a lot!) about that something. Secondly, Linus' email leaves a lot open to interpretation. On my first reading of that paragraph I was left with the impression that Linus is accusing Tridge of creating data/metadata problems (what other people might call data corruption) within the linux kernel repository. Clearly a bad thing to be implying. I had to read the paragraph a few times before it occurred to me that Linus was probably only talking about Larry pulling the license. Then later in the email is this tidbit:
I'll write my own kernel source tracking tool because I can't use the best any more.
If we take Donald E. Knuth as any sort of reliable guideline on diversions like this, then there won't be any more innovations coming from Linus involving the linux kernel, because he'll be spending all his time and effort designing, debugging and generally re-inventing the source code management wheel. And finally, this bunfight has made it into the mainstream media. I like Sam Varghese's final paragraph:
All that this incident has done is to bring to the fore the fact that free software and open source software are definitely not one and the same thing and that compromises made at one point could well come back to bite those who make them.

Wednesday Apr 13, 2005

Why quotes are important, or, the importance of correctness

In the lecture today we covered arrays (and pointers) in C. About time too I thought. By way of an example the prepackaged lecture slides from Hanly and Koffman's C Program Design for Engineers, 2nd Ed showed how to use a character array as a string:
char b[] = ``Ned Flanders";
... is stored in memory as ...
`N' `e' `d' ` ` `F` `l' `a' `n' `d' `e' `r' `s' `\\0'
How many of you would expect that an example provided in a textbook on the C language would be legal code in C? My lecturer didn't appear to think so, which I think is outrageous and depressing at the same time. How are my fellow students supposed to learn the language when the examples presented are wrong? I've learnt a few things over the years when presenting information for others or teaching an SGR class. One of those things is that your examples must be correct --- in a language class, you must be able to compile the example! Five slides further on was a multi-dimensional char array example with not a single correct element. Folks, if you're going to display character constants in C, use the single quote or apostrophe ('). Using a "backtick" or ` character will not work. At all. Ever. If you are a unix sysadmin or perl programmer, you'll know the importance of the backtick. Back when I was a sysadmin (before I joined Sun) I used to read the book reviews on www.perl.org. I remember quite vividly a reviewer shredding what was otherwise a decent book because the font used for printing did not have a correct backtick glyph: at least half of the example code looked wrong on the page and was useless as a teaching or reference example.

Thursday Apr 07, 2005

Algorithms and satisfaction

One thing that working on this first programming assignment has made me realise is that nowhere in the course structure for my Computer Systems Engineering degree @ UTS is there a subject on algorithms. Now Chris, one of PTS' senior senior engineering staff who happens to be based in the Sydney office, mentioned the other day that when he did his degree (cue the Four Yorkshiremen skit) all the programming courses were about algorithms first and languages second. So of course when we've all been sitting around at lunch time challenging each other to find better ways to find perfect numbers, Chris has looked at what Nathan and I did, and then ripped the guts out and replaced them with better and better algorithms. And I do mean, better and better --- both in terms of cpu usage and memory usage. I remember being fairly good at working out good algorithms in my first degree but being terrible at coding them. Now I think the balance has turned somewhat --- my coding is much better and I'm re-learning how to develop algorithms. A few months ago I had a chance to fix a bug in the mpt(7D) driver (this is the on-board scsi controller in the v20z, v40z and v440). The problem manifested itself on a v20z with Solaris 9: handling an interrupt while handling an interrupt blew the kernel thread stack away because of the way we were doing scatter-gather list allocation for each scsi command structure. I re-designed the way we do this allocation to use an as-needed allocation of exactly as much kernel memory as is needed for that particular scsi command. Now while it took me a few goes to get all the necessary pieces re-written, it was tremendously satisfying to know that I personally had figured out how to do this, and it was efficient, it was elegant and it was mine. Now if I'd been tasked with fixing that bug straight out of uni first time around, it probably would have taken me twice as long and not been nearly as elegant --- because I didn't have the exposure to good code and good algorithms. But with a few years (ok, 10+) behind me, it was actually quite easy. Now what I need to do is try to pass this experience on to my colleagues and the young-uns who I'm studying with. That is going to be the challenge!

More on compiler options, this time with long integers

When I wrote earlier this week about amd64 compiler performance was all done with the unsigned int. I figured for a laugh this evening that I'd see what I got by using unsigned longs instead.

It's interesting. In both case below (testing the first million natural numbers) the compiler options differ by ---xarch=amd64. The consistent options were ---fast ---xlibmil ---xlibmopt

Data ModelWallclock time

This I find rather strange. Why is it that longs in 64bit mode make execution nearly 6 seconds longer?

Tuesday Apr 05, 2005

Sun Studio 10 compiler options for amd64

amd64 Compiler Options

One of the nifty things about Sun Studio 10 is that it includes a code generator for 64bit operation on amd64 (and ia32e) chips. I thought I'd see what results I could get when compiling my programming assignment in 64bit mode

All results below are from using timex(1) for the first million natural numbers.

ModelOptions usedsize(bytes)runtimespeedup(percent)
32bitnone82602m 17.20sec
32bit---xlibmil ---xlibmopt ---native79481m 45.85sec22.9
64bit---fast ---xarch=amd64 ---xlibmil1024016.46sec88.0
64bit---fast ---xarch=amd64 ---xlibmil ---xlibmopt1024815.79sec88.5
WOW! So what does the gcc (v3.4.3) that Sun supplies with Solaris 10 and Solaris Express give me?
ModelOptions usedsize(bytes)runtimespeedup(percent)
32bitnone82601m 38.64sec
32bit---O385881m 10.93sec28.1
32bit---O3 ---march=opteron862058.42sec40.8
64bit---O3 ---m64 ---march=opteron1096837.47sec62.0
64bit---O5 ---m64 ---march=opteron1096837.48sec62.0
It's all very interesting. gcc is frequently used in HPC environments, one result of the popularity of linux-based supercomputers and no vendor compilers. Another thing that gcc is known for is darned good code generation on ia32 and suboptimal code on sparc (in some cases). I was surprised to see the gcc results above. Perhaps a gcc-savvy reader could suggest some more command line options which match Sun Studio's ---fast etc. I'm keen to find out.

Is n prime?

While I was finding the right path to travel with my assignment, I figured out a fairly efficient method of calculating whether n is prime or not:
unsigned int
prime(unsigned it ofn) {
         \* :: returns 1 if ofn is prime, 0 otherwise
         \* precondition: ofn > 0

         unsigned int root; /\* |_sqrt(ofn)_| \*/
         unsigned int count;
         unsigned int mark = 1; /\* our boundary condition var \*/

         /\* we only need to go to floor(sqrt(ofn)) \*/
         root = (unsigned int)floor(sqrt(ofn));

         for (count = 2; count <= root; count++) {
                 if (!(ofn % count)) {
                         /\* not prime, so escape out \*/
         return (mark);
How much more efficient could I get with this routine?

The benefits of peer review, and make(1)

The benefits of peer review, and make(1)

The subject I'm studying this semester ("Embedded" C) is designed to teach us how to get started with software engineering, using C as the language of choice. That's all well and good except that the subject title doesn't match, we don't learn about make(1) and makefiles until week 11 or 12, and there's no coverage of revision control systems like SCCS, RCS or CVS. When I was doing first year uni for the first time (1991) at ANU, we learnt about make(1) in the first three weeks of first semester. I find it mind-boggling that this second-year subject not only assumes that we know nothing about make and that it's only really relevant for multi-source projects, but also that we have to wait until week 11 or 12 to get a hint of it. The question I emailed my tutor involved whether we could use functions from math.h --- because if we could then we'd need to link with -lm, but without providing a makefile we're dependent upon the tutor doing the right thing. (And providing a valid and accurate compiler invocation as a comment in the source). If we could provide our assignment as, say, a tar file then providing a makefile would be easy. As it turns out, my tutor said we could use math.h functions, and that if we provided a valid compiler invocation as a comment then he'd follow it. The lecturer and tutors could have used the faculty's existing automatic submission procedures and saved some time.... Bah! Ok, so I'm displaying my "old-fart" annoyance and impatience here but it's my blog and I'll whinge if I want to.... :)

So what about peer review?

We were given the spec for our first assignment two weeks ago now, and true to form, we have to display our understanding of looping in order to print the factors of all the perfect numbers up to 10000. A quick googling for "perfect number algorithm" gave me this very useful page at the University of St Andrews to get started with, and a decent number of references for further reading as well. A close reading of the assignment spec revealed that factoring (2\^(p-1)) \* (2\^p --- 1) would not meet the requirements, so it was back to iteration and brute-forcing. Along the way a virtual water-cooler conversation with some colleagues got them interested in the assignment too. More on that later. My first version used a singly-linked list to store the factors, called malloc(3c) for each new element, and was rather inefficient in both cpu and memory due to repeated and very hot code paths. It also featured a sort-on-insert for the list which in general terms is ok, but when you're appending already-sorted elements it's horrendously wasteful of cpu cycles. At least I had a working and spec-compliant (then extended so I could look at the first n natural numbers) assignment within 24 hours. In my first degree it would often take me days to get the algorithm figured, let alone coded. Experience helps! The next day at work I mentioned that I'd got a working and compliant version done. Of course within about 5 minutes the guys I was talking with had brainstormed up another 3 ways of approaching the problem. Did I mention that I like peer review? That's exactly what my colleagues were doing, only not for a product that we ship. It was an exhilarating experience and something I just didn't get when I was doing my first degree. I also mentioned the assignment to my brother-in-law (who is a bit of a computational maths afficionado), and we managed to come up with yet another way of solving the assignment. So on Friday I did my first re-write, to use a doubly-linked list, only calculate and save the factors up to floor(sqrt(n)), then print the rest of the factors in ascending order by walking back up the list if in fact n was a perfect number. Rather than making my laptop grind to halt by using excessive amounts of cpu and memory, my laptop slowed to a crawl. This was an improvement! [Side note: in my new address I'm able to catch a train to and from work, so my commute is the same duration, but half of it is me walking to or from a station. Much nicer!] During the train ride home on Thursday I realised that rather than incurring a malloc(3c) and free(3c) penalty along with a linked-list traversal, I could instead malloc(3c) an array and dump the factors up to floor(sqrt(n)) into it. Yes, there is memory wasted but it's only a few ints and I need to keep an index for the array, BUT I don't have a malloc(3c) and free(3c) penalty for each n, I don't have any linked-list traversals or element comparisons, and I need to ensure that when I'm printing factors I don't hit a 0 division. (I'm using 0 as the boundary condition for the final factor under floor(sqrt(n))). A quick run with this new code means that I can factorise the first million natural numbers in 1m37.7sec using 1236kb on my amd64 laptop running as a 32bit app in 64bit mode Solaris "next" @ 1.8GHz. I'm sure it's not the most cpu-efficient algorithm and I know I could do more to decrease the memory usage, but it is waaaaay faster (approx 40x) than the first and second versions that I produced.

Did I have a point?

Yes, I did. Without having some semblance of peer review I would probably have submitted my code after the first working version. With peer-review I was able to make my code much more cpu- and memory-efficient, and along the way inspire a bit of a coding challenge around the office. I've since created a coding mailing list which we're starting to use for fun things. We might even have a challenge along the way.

I work at Oracle in the Solaris group. The opinions expressed here are entirely my own, and neither Oracle nor any other party necessarily agrees with them.


« July 2016