The benefits of peer review, and make(1)
The subject I'm studying this semester ("Embedded" C) is designed to teach us how to get started with software engineering, using C as the language of choice.
That's all well and good, except that the subject title doesn't match the content: we don't learn about make(1) and makefiles until week 11 or 12, and there's no coverage of revision control systems like SCCS, RCS or CVS.
When I was doing first year uni for the first time (1991) at ANU, we learnt about make(1) in the first three weeks of first semester. I find it mind-boggling that this second-year subject not only assumes that we know nothing about make and that it's only really relevant for multi-source projects, but also that we have to wait until week 11 or 12 to get a hint of it.
The question I emailed my tutor involved whether we could use functions from math.h, because if we could then we'd need to link with -lm; without providing a makefile we're dependent upon the tutor doing the right thing (and upon us providing a valid and accurate compiler invocation as a comment in the source). If we could submit our assignment as, say, a tar file, then providing a makefile would be easy.
As it turns out, my tutor said we could use math.h functions, and that if we provided a valid compiler invocation as a comment then he'd follow it. The lecturer and tutors could have used the faculty's existing automatic submission procedures and saved some time....
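For what it's worth, a makefile for a single-file assignment that links with -lm is only a few lines. The source and program names below are made up for illustration:

```make
# Hypothetical layout: the assignment lives in perfect.c.
CC     = cc
CFLAGS = -Wall
LDLIBS = -lm          # needed for the math.h functions

perfect: perfect.c
	$(CC) $(CFLAGS) -o perfect perfect.c $(LDLIBS)

clean:
	rm -f perfect
```

The tutor then only has to type make, whatever flags the student's code happens to need.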
Ok, so I'm displaying my "old-fart" annoyance and impatience here but it's my blog and I'll whinge if I want to....
So what about peer review?
We were given the spec for our first assignment two weeks ago now, and true to form, we have to display our understanding of looping in order to print the factors of all the perfect numbers up to 10000.
A quick googling for "perfect number algorithm" gave me this very useful page at the University of St Andrews to get started with, and a decent number of references for further reading as well.
A close reading of the assignment spec revealed that factoring 2^(p-1) * (2^p - 1) would not meet the requirements, so it was back to iteration and brute-forcing.
Along the way a virtual water-cooler conversation with some colleagues got them interested in the assignment too. More on that later.
My first version used a singly-linked list to store the factors, called malloc(3c) for each new element, and was rather inefficient in both cpu and memory due to repeated and very hot code paths. It also featured a sort-on-insert for the list, which in general terms is ok, but when you're appending already-sorted elements it's horrendously wasteful of cpu cycles.
At least I had a working and spec-compliant (then extended so I could look at the first n natural numbers) assignment within 24 hours. In my first degree it would often take me days to get the algorithm figured out, let alone coded. Experience helps!
The next day at work I mentioned that I'd got a working and compliant version done. Of course, within about 5 minutes the guys I was talking with had brainstormed up another 3 ways of approaching the problem. Did I mention that I like peer review? That's exactly what my colleagues were doing, only not for a product that we ship. It was an exhilarating experience and something I just didn't get when I was doing my first degree.
I also mentioned the assignment to my brother-in-law (who is a bit of a computational maths aficionado), and we managed to come up with yet another way of solving the assignment.
So on Friday I did my first re-write: use a doubly-linked list, calculate and save only the factors up to floor(sqrt(n)), then, if n did turn out to be a perfect number, print the rest of the factors in ascending order by walking back up the list.
Rather than grinding to a halt under excessive amounts of cpu and memory, my laptop merely slowed to a crawl. This was an improvement!
[Side note: in my new address I'm able to catch a train to and from work, so my commute is the same duration, but half of it is me walking to or from a station. Much nicer!]
During the train ride home on Thursday I realised that rather than incurring a malloc(3c) and free(3c) penalty along with a linked-list traversal, I could instead malloc(3c) an array and dump the factors up to floor(sqrt(n)) into it. Yes, there is memory wasted, but it's only a few ints plus an index for the array; in exchange I don't pay a malloc(3c) and free(3c) penalty for each n, and I don't have any linked-list traversals or element comparisons. I do need to ensure that when I'm printing factors I don't hit a division by 0, since I'm using 0 as the boundary condition for the final factor under floor(sqrt(n)).
A quick run with this new code means that I can factorise the first million natural numbers in 1m37.7sec using 1236kb, on my amd64 laptop running it as a 32-bit app on 64-bit Solaris "next" @ 1.8GHz. I'm sure it's not the most cpu-efficient algorithm and I know I could do more to decrease the memory usage, but it is waaaaay faster (approx 40x) than the first and second versions that I produced.
Did I have a point?
Yes, I did. Without some semblance of peer review I would probably have submitted my code after the first working version. With peer review I was able to make my code much more cpu- and memory-efficient, and along the way inspire a bit of a coding challenge around the office. I've since created a coding mailing list which we're starting to use for fun things. We might even have a challenge along the way.