Thursday Feb 25, 2010

This blog is moving

The more observant amongst you will have noticed a number of people who have blogs on have decided that they should move them elsewhere. The most recent I noticed was Simon Phipps, and realised that I need to follow suit.

For the curious, no, I didn't get a "don't come Monday" :-)

I've been running my own blog at for quite a while now, and I figure it's a reasonable enough location to blog from in general.

So without further ado, Please Update Your Feeds(tm)!

Thursday Jan 01, 2009

The quietest NYE ever...

This bundle is the reason why.

Thursday May 22, 2008

ITWire interviewed a mate of mine

I met a bloke called Arjen Lentz years ago. And I do mean years ago. Via HUMBUG, a user group I joined at its inception. He was with MySQL at the time, either their only or perhaps one of 2 employees that the company had in Australia.

He really drove home to me that it really was possible to work remotely from your boss, and from any other colleague, and develop software, and do so effectively.

Last year he moved on from MySQL to found his own consulting company, Open Query. Training, design, tuning... anything you want to do with MySQL (and PostgreSQL for that matter) he'll be able to help you do it. As far as I know, he's doing really well. I'm not sure that I'd like the startup-on-my-own thing - makes me just a little bit nervous right now - so I've got a lot of respect for those who do. (My best mate Leighton does this as well with Bare Metal Software).

Anyway.... Arjen was interviewed by one of Australia's well-known OpenSource commentators, Sam Varghese, and that interview is now live on ITWire.

If there's truly anybody out there who still wonders how you can make money from Open Source Software, Arjen is a great example - and he only lives about 10km from me :-)

Tuesday Mar 18, 2008

...filled with a bunch of nutcases

Just read this interview with James Gosling. On the front page of the SMH no less.

My favourite quote from the article is this:

At 52, Mr Gosling is a researcher at Sun Microsystems where his main interest is software development tools. "The reason why I stay is it's filled with a bunch of nutcases. Sun is a (relatively) small organisation, so there is a culture of tolerating craziness. It is open and understanding to risk; to an idea that might not be what people are expecting."

Ain't that the truth!

I've often said (to myself, at least!) that I work with some scarysmart people here at Sun, and it's nice to know that I'm not the only one who thinks that we're more than a little wacky.

Sunday Oct 07, 2007

Technorati fun stuff

One thing I'd forgotten about my homeblog was that I'd registered it with Technorati.

Please excuse this post as a minor distraction while I get this blog claimed as well.

Tuesday Sep 04, 2007

My plate is full

I've got a few things on my plate at the moment. Firstly and most importantly, we're in the process of getting our backport (for PSARC/2006/703 MPxIO extension for Serial Attached SCSI and PSARC/2007/046 stmsboot(1M) extension for mpt(7D)) approved. All the AIs from the CTeam review have been completed for the feature gate, so we're waiting on CTeam's deliberations.

Secondly, I've got general SAS stuff to address, which includes getting my head around how we get from realmode and grub up to the kernel. So far this bit has been really disgusting and is kinda doing my head in.

Then I've got two OpenSolaris sponsored items: DLG's mfi and JBK's drop-in replacement for a sparc libdisasm (ie, unencumbering it). I've also got my scsi inquiry util to get finalised and fasttracked, and then there's a fix for Silicon Image Sil311x cards which have raid firmware installed.

Finally... I promised Simon that I'd provide links to the OpenSolaris implementations of Sun's opened Hardware wiki.

All in all, a fairly interesting collection of things to be working on, and there's not a spare cycle to be had.

Saturday Sep 01, 2007

It feels good to be back

As some might have noticed, I posted a status update to my home blog this morning. I'm very happy to note that various databases within Sun have now propagated changes through, so now I can blog from work, too. I'm planning to keep for personal stuff including my photos, and for work stuff. Hopefully neither will be boring!

One thing I've noticed over the past 10 months while on contract to Sun is that I've felt kinda like a zombie - back, but not really. I don't feel that way any more :-)

Monday Sep 04, 2006

Moving, moving, moving

I've finally gotten my act together, acquired a static ip, installed apache, tomcat, roller ... futzed sufficiently with css etc and gotten my home blog listed in the aggregator.

If you'd like to subscribe directly, my feed is XML

Tuesday Aug 08, 2006

For everything there is a season

This morning I had a phone call from my senior manager and HR. Unfortunately, I haven't managed to avoid the RIF axe.

I'm annoyed, angry, disappointed, saddened, disheartened and yet surprisingly upbeat.

Why? Well since the rumours started flying a few weeks ago that August 3rd was D-Day I've had a lot of sleepless nights worrying about what might happen, and what I could do if I got retrenched. I managed to do a lot (a heckuvalot) of introspection, and realised that for all the hard work I've put into this company over the past ~ 7 years, I have actually gained more than I realised.

I've worked my way up from frontline phone support, through backline kernel/storage, to CPRE and PTS, where I dealt directly with engineering contacts within Sun, within Oracle and within Veritas.... I also got my appetite for coding whetted by working on bugfixes for some of the scsi hba drivers, discovered the joy of kernel crash dump analysis, tape diagnostics and how to read and interpret the FC and SCSI standards.

From there, last year, I made the leap to NWS...DMG... no, Storage! software development, where until today I was working on the protocol layer for what we fondly called "leadville" which is available under CDDL on

I've had a ball. I've experienced Sun's software design and engineering processes and principles first hand. I've worked with people across the entire planet (yes, I do mean that), built up professional relationships with some of the smartest people in the company (and also the planet), and revelled in having (for me) the absolutely coolest email address ever.

So while I'm cooling my heels and working out what I do next, I'll still be active in the OpenSolaris community. I'll be reading up on all those things I was putting off for my Christmas/New Year holiday, cutting a bit of code in ON and working on porting a few drivers.

Once that cool-off time is up I expect I'll be doing contract sysadmin work in Sydney - there is, after all, a lot of it about right now.... but I'd much rather be developing kernel code for Solaris!

So to all the people who've been online with me in irc channels internal and external (especially in the last week when I've been stressing out), to all the people who I've worked with over the last ~ 7 years, I'd just like to say thankyou.

If I play my cards right and the opportunities arise then one day I hope to be back @Sun.COM. Until then....

[In a few days I'll be re-hosting this blog on my own server. When I do, I'll update this site]

Monday Jul 03, 2006

Vale Stephen Peak, 1972-2006

I went to a memorial service today for one of the blokes who I went to school with. He died last week in his sleep while visiting a mate in Dublin.

We'd been in the same tute group and house throughout highschool, and while we weren't close, I did (and do) regard him as somebody who I could trust my life to.

The cause of death --- after an autopsy --- is apparently unknown. That's more than a little disturbing for me since we were very close in age (< 6 months iirc).

I saw his Dublin mate today at the service; he'd flown over here to give a eulogy..... quite an effort indeed, especially since Des had found Stephen in the morning. (Des --- thankyou).

I caught up with a few people who had also been at school with him, and it was good to see them. It was also good to see Stephen's older brother and his parents, to let them know (show them?) that they weren't alone.

Not that I could possibly have any concept of what it's like to bury a child or brother.

I'm still rattled.

Monday Jun 19, 2006

Making more use of my iPod on Solaris

Ok, so here's something I only just realised.

For some time now it's been annoying me that whenever I plug my iPod into any of my Solaris hosts, I get the "DO NOT DISCONNECT" logo, with the data-transfer spinwheel icon active in the top left of the window.

$ cfgadm -la
Ap_Id                          Type         Receptacle   Occupant     Condition
usb1/3                         usb-storage  connected    configured   ok

$ cfgadm -lav usb1/3
Ap_Id                          Receptacle   Occupant     Condition  Information
When         Type         Busy     Phys_Id
usb1/3                         connected    configured   ok         Mfg: Apple  Product: iPod  NConfigs: 1  Config: 0  <no cfg str descr>
unavailable  usb-storage  n        /devices/pci@0,0/pci108e,5347@2,1:3

And of course vold is running. But if you then eject the removable disk that vold has mounted, you can then use the USB port to merely charge your iPod while you listen to its contents using your headphones.

How fiendishly simple.
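
For anyone who wants to script the check rather than eyeball it, here is a minimal sketch that picks the configured usb-storage attachment points out of cfgadm -la output. The function name is made up for illustration, and it assumes the five whitespace-separated columns shown above with no spaces inside the Type field.

```python
# Sample copied from the cfgadm -la output above.
CFGADM_OUTPUT = """\
Ap_Id                          Type         Receptacle   Occupant     Condition
usb1/3                         usb-storage  connected    configured   ok
"""


def configured_usb_storage(text):
    """Return the Ap_Ids of usb-storage devices that are currently configured."""
    points = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, short lines and the column header.
        if len(fields) < 5 or fields[0] == "Ap_Id":
            continue
        ap_id, dev_type, _receptacle, occupant = fields[:4]
        if dev_type == "usb-storage" and occupant == "configured":
            points.append(ap_id)
    return points


if __name__ == "__main__":
    print(configured_usb_storage(CFGADM_OUTPUT))
```

This only reads the listing; ejecting the disk that vold has mounted is still a separate step.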

Friday Jun 09, 2006

On FMA - it really does work..... even on my Ultra 20

A few weeks ago I took delivery of a brand, spanking new Sun Ultra 20 workstation, purchased for me by my department.

I installed build 38 of Solaris Express and promptly BFU'd to whatever the relevant nightly build was at the time. I do like to live on the edge :-)

Anyway, I noticed after a while (we're talking hours here, not weeks) that there was only ever one process showing up as on cpu when I ran prstat. Curious, I ran psrinfo -v, which told me that core 0 of my dual-core Opteron was faulted and offline.

Damn! This is a new machine, less than a day old. What on earth could have gone wrong with the cpu?

I thought I'd have a look at the FMA telemetry that was being generated. This was a bit of a shock:

# fmdump
Apr 27 18:50:23.6205 4cd32003-36a3-c3f8-ea93-b7edc762dd9f AMD-8000-JF
Apr 27 18:50:53.5720 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 07:13:56.8810 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 12:46:32.0074 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 13:37:59.2926 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 13:50:35.5724 412579b7-ed8d-607a-905c-e3fb998f290e ZFS-8000-D3
Apr 28 13:54:46.0114 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 13:54:46.3803 378726c1-1d68-c0c3-d0fd-9fb2b1431834 ZFS-8000-CS
Apr 28 14:23:09.6371 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 29 05:40:24.2258 a4d4edf8-520d-e625-8223-84c7ce652524 AMD-8000-2F
May 01 15:47:52.8092 abea0bc6-80b1-e022-edd1-d4a385117e0d AMD-8000-2F

Now except for the ZFS* messages (which occurred when I was playing around with my scsi multipack), we've got two SUNW-MSG-ID strings which you can look up at
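
A listing like the one above is easy to summarise by message ID. What follows is a minimal sketch, assuming each event line has exactly five whitespace-separated fields (month, day, time, UUID, SUNW-MSG-ID) as shown; the function names are made up for illustration and are not part of any FMA tooling.

```python
from collections import Counter

# Lines copied from the fmdump listing above.
FMDUMP_OUTPUT = """\
Apr 27 18:50:23.6205 4cd32003-36a3-c3f8-ea93-b7edc762dd9f AMD-8000-JF
Apr 27 18:50:53.5720 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 07:13:56.8810 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 12:46:32.0074 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 13:37:59.2926 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 13:50:35.5724 412579b7-ed8d-607a-905c-e3fb998f290e ZFS-8000-D3
Apr 28 13:54:46.0114 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 13:54:46.3803 378726c1-1d68-c0c3-d0fd-9fb2b1431834 ZFS-8000-CS
Apr 28 14:23:09.6371 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 29 05:40:24.2258 a4d4edf8-520d-e625-8223-84c7ce652524 AMD-8000-2F
May 01 15:47:52.8092 abea0bc6-80b1-e022-edd1-d4a385117e0d AMD-8000-2F
"""


def tally_msg_ids(text):
    """Count how many logged events carry each SUNW-MSG-ID."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Expect: month, day, time, uuid, msg-id. Anything else is skipped.
        if len(fields) != 5:
            continue
        counts[fields[4]] += 1
    return counts


def uuids_for(text, msg_id):
    """Return the distinct event UUIDs logged against one message ID, in order."""
    seen = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[4] == msg_id and fields[3] not in seen:
            seen.append(fields[3])
    return seen


if __name__ == "__main__":
    for msg_id, n in tally_msg_ids(FMDUMP_OUTPUT).most_common():
        print(msg_id, n)
```

On the listing above this counts seven AMD-8000-JF entries spread over two UUIDs, and two AMD-8000-2F entries. A three-column header line, if present in the captured output, is skipped by the field-count check.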

If you want to see what FMA has logged as faulted you can run

# fmdump -v -u 4cd32003-36a3-c3f8-ea93-b7edc762dd9f
Apr 27 18:50:23.6205 4cd32003-36a3-c3f8-ea93-b7edc762dd9f AMD-8000-JF
100% fault.cpu.amd.datapath

Problem in: hc:///motherboard=0/chip=0/cpu=0
Affects: cpu:///cpuid=0
FRU: hc:///motherboard=0/chip=0

Ok, so there's a serious looking problem with core 0 in my cpu. Good thing I've got two cores. A quick psradm -n 0 got the core to say that it was back online, but I wasn't really sure I'd done anything to fix it.
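
The "Problem in", "Affects" and "FRU" values in the fmdump -v output are resource identifiers built from a scheme and a path of name=instance pairs. Here is a rough sketch of pulling one apart. It handles only the simple shape seen in this post (scheme, then ":///", then slash-separated name=number pairs) and makes no attempt at the full FMRI grammar; parse_fmri is a made-up helper name.

```python
def parse_fmri(fmri):
    """Split e.g. 'hc:///motherboard=0/chip=0/cpu=0' into scheme and pairs."""
    scheme, sep, path = fmri.partition(":///")
    if not sep or not scheme:
        raise ValueError("not a scheme:///path identifier: %r" % fmri)
    pairs = []
    for part in path.split("/"):
        name, eq, instance = part.partition("=")
        if not eq or not name:
            raise ValueError("expected name=instance, got %r" % part)
        # Instances in this post are always small integers.
        pairs.append((name, int(instance)))
    return scheme, pairs


if __name__ == "__main__":
    # Values copied from the fmdump -v output above.
    for fmri in ("hc:///motherboard=0/chip=0/cpu=0",
                 "cpu:///cpuid=0",
                 "hc:///motherboard=0/chip=0"):
        print(parse_fmri(fmri))
```

Read this way, the fault is located at cpu 0 on chip 0, while the replaceable unit (the FRU) is the whole chip, which is why one bad core means swapping the entire processor.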

What about the other AMD* messages, what do they mean?

# fmdump -v -u abea0bc6-80b1-e022-edd1-d4a385117e0d
May 01 15:47:52.8092 abea0bc6-80b1-e022-edd1-d4a385117e0d AMD-8000-2F
100% fault.memory.dimm_sb

Problem in: hc:///motherboard=0/chip=0/memory-controller=0/dimm=1
Affects: mem:///motherboard=0/chip=0/memory-controller=0/dimm=1
FRU: hc:///motherboard=0/chip=0/memory-controller=0/dimm=1

Ok, that's looking a tad worse. Especially when I try fmadm repair abea0bc6-80b1-e022-edd1-d4a385117e0d --- look at what I got in /var/adm/messages:

May 9 18:04:00 pieces fmd: [ID 441519 daemon.error] SUNW-MSG-ID: FMD-8000-0W, TYPE: Defect, VER: 1, SEVERITY: Minor
May 9 18:04:00 pieces EVENT-TIME: Tue May 9 18:03:59 EST 2006
May 9 18:04:00 pieces PLATFORM: Sun Ultra 20 Workstation, CSN: 0614FK40E2, HOSTNAME: pieces
May 9 18:04:00 pieces SOURCE: fmd-self-diagnosis, REV: 1.0
May 9 18:04:00 pieces EVENT-ID: ee29d053-e3aa-cbe8-a458-ec8528f5bf99
May 9 18:04:00 pieces DESC: The Solaris Fault Manager received an event from a component to which no automated diagnosis software is currently subscribed. Refer to for more information.
May 9 18:04:00 pieces AUTO-RESPONSE: Error reports from the component will be logged for examination by Sun.
May 9 18:04:00 pieces IMPACT: Automated diagnosis and response for these events will not occur.
May 9 18:04:00 pieces REC-ACTION: Run pkgchk -n SUNWfmd to ensure that fault management software is installed properly. Contact Sun for support.
May 9 18:04:00 pieces fmd: [ID 441519 daemon.error] SUNW-MSG-ID: FMD-8000-0W, TYPE: Defect, VER: 1, SEVERITY: Minor
May 9 18:04:00 pieces EVENT-TIME: Tue May 9 18:04:00 EST 2006
May 9 18:04:00 pieces PLATFORM: Sun Ultra 20 Workstation, CSN: 0614FK40E2, HOSTNAME: pieces
May 9 18:04:00 pieces SOURCE: fmd-self-diagnosis, REV: 1.0
May 9 18:04:00 pieces EVENT-ID: 27b29562-976e-ea5f-f554-d9010393029f
May 9 18:04:00 pieces DESC: The Solaris Fault Manager received an event from a component to which no automated diagnosis software is currently subscribed. Refer to for more information.
May 9 18:04:00 pieces AUTO-RESPONSE: Error reports from the component will be logged for examination by Sun.
May 9 18:04:00 pieces IMPACT: Automated diagnosis and response for these events will not occur.
May 9 18:04:00 pieces REC-ACTION: Run pkgchk -n SUNWfmd to ensure that fault management software is installed properly. Contact Sun for support.


So I logged a call with Sun Support requesting a new dual-core cpu and two new dimms. (Yes, I know that the telemetry only mentioned one dimm, but they're shipped in pairs). The parts duly came, and I installed them.

A day or so before the parts came I noticed that root got email saying that the fault log was too busy to rotate. This got me worried as well..... fmdump was showing 6 error events against dimm #1 every minute, which I thought was quite excessive.

So it was time for a quick search of the archives of FMA-discuss but nothing seemed to match. Time for an email to the team to find out whether they'd seen anything like this. Unfortunately not.

So I replaced the cpu (that worked just fine afterwards) and both dimms.... and the fma telemetry for the dimms continued.

Now I was getting worried. Really, really worried.

I'd taken the necessary precautions when replacing the cpu and the dimms, I'd tried running fmadm repair against the dimm uuids, and I'd even tried unloading just about all of the fma modules. (All that produced was messages to the effect of "hey! I've got an error and I dunno what to do with it.")

So I got in contact with the FMA core team and one of their number ssh'd into my workstation and dug around for a bit. I also got an email from Gavin Maltby letting me know that I actually had a single bit error on that dimm. From that he surmised that there was a single pin gone bad in the slot.... and could I spare the downtime to have a look please?

So at lunchtime that day I shut down the box, took the necessary static precautions and removed the dimms.

Lo! and behold, I saw that there was indeed a bent pin in the second slot away from the cpu.

So I moved the pair of dimms to the other two slots, powered the system on.... ran fmadm repair on the dimm uuid, and life was good again.

I'd love to provide a "moral" to this anecdote, but there isn't one. All I can say is that this FMA stuff really does work and if you're not running Solaris 10 (or Express) by now then you are missing out.

Manpages that you will find useful for FMA include
fmadm(1M), fmd(1M), fmdump(1M), and fmstat(1M)

And don't forget the OpenSolaris Fault Management community pages too.

Thursday May 25, 2006

This just in: DTrace ported to FreeBSD-current

A friend of mine who uses FreeBSD passed on this announcement:
List: freebsd-current
Subject: DTrace for FreeBSD - Status Update
From: John Birrell
Date: 2006-05-25 6:55:10
Message-ID: 20060525065510.GA20475 () what-creek ! com

It's nearly 8 weeks since I started porting DTrace to FreeBSD and I thought I would post a status update including today's significant emotional event. 8-)

For those who don't know what DTrace is or which company designed it, here are a few links:

The BigAdmin:
A Blurb:
The Guide:
My FreeBSD Project Page:

Much of the basic DTrace infrastructure is in place now. Of the 1039 DTrace tests that Sun runs on Solaris, 793 now pass on FreeBSD.

We've got the following providers:

- dtrace
- profile
- syscall
- sdt
- fbt

As of today, loading those providers on a GENERIC kernel gives 32,519 probes.

Today's significant emotional event added over 30,000 of those, thanks to the Function Boundary Tracing (fbt) provider. It provides the instrumentation of the entry and return of every (non-leaf) function in the kernel and (non-DTrace provider) modules.

[snip script and output]

There is still a lot of work to do and while that goes on, the code has to remain in the FreeBSD perforce server. It isn't ready to get merged into CVS-current yet.

I have asked the perforce-admins to mirror the project out to CVS (via, but I'm not sure what the hold-up there is.

I had hoped that one or two of the Google SoC students would contribute to this, but I only received one proposal and that wasn't for anything that would help get DTrace/FreeBSD completed.

There are things people can do to help. Some of them are build related; some are build tool related; some are user-land DTrace specific; and the rest are kernel related.

Speak up if you are interested in working on this!

--
John Birrell

I thought you might like to know. What a great way to start my day!

Monday May 22, 2006

There's a new blog in town: Solaris CAT

We're ramping things up somewhat with Solaris CAT... getting v4.2 ready to put on SDLC and adding support for more and more of the new (not so new now!) features that are to be found in Solaris 10 and Solaris Express (aka OpenSolaris).

For my part, I've been working out the bugs in our x86 and x64 disassembler implementation so that we can provide support for x86 and x64/amd64 for Solaris 10 onwards with version 5.0 of Solaris CAT ..... In fact I fixed a really nasty one just a few minutes ago which only happened on x86. It's been giving me the irrits for a few months (in between working on my day job of course!).

Now that I've got that fixed I can go back to working on the other stuff that we need in order to make the x86 and x64 versions as complete as possible. Of course there will be things that aren't applicable to those platforms and we'll give you a polite message to remind you.

At any rate, the team has a group blog here: Solaris CAT where we'll provide news, how-to guides and other neat things.

Ooh ooh ooh how could I forget! v4.2 (almost ready) has support for CTF. So if you want to get with the program on CTF (because let's face it, stabs is an evil and disgusting flying monkey) then if you follow my instructions you'll be able to see your structures inside Solaris CAT. Now that is cool.

Wednesday Apr 26, 2006

Scott isn't going, he's just stepping back from the limelight

I got up earlyish yesterday to catch our 3rd quarter earnings announcement. We did ok --- in line with Wall St estimates. That was nothing compared to the "whoa, wtf?!?!?" that I experienced when I saw Jonathan's message talking about Scott's legacy.

A quick reload on the browser for Yahoo! finance and I saw that Scott had stepped down as CEO (and that there was a USD0.40 spike in the share price).

I am so very, very glad that Scott hasn't resigned and left the company. All he's doing is taking a step backwards out of the limelight to become chairman of the board. He's still going to be part of Sun but (I hope) getting more sleep at night.

Today I read Ashlee Vance's interview with Scott. It was fine vintage Scott, from the Scott that I've known and admired for so many years.

I particularly liked this comment (page 2):

I don't have the same sense of urgency to go blow away customer value that someone who is trying to make a buck this quarter on the street who has no vested interest in the long-term shareholder value of this company has. I don't listen to that garbage.

So take that, Laura Conigliaro! And incidentally, I saw this report mentioning a portfolio manager whose company sold SUNW after the bubble burst and ".... hasn't touched Sun since." Hmmm. I've still got my SUNW stock that I bought just before the height of the bubble. I'm keeping it because I believe in this company --- what it stands for, what it produces, what it values. I'm not in the stock market to make a quick dollar.


