By eschrock on Aug 14, 2004
In a footnote a few days ago, I commented on the fact that the history of Solaris debugging could roughly be divided into three 'eras'. As someone interested in UNIX history, I decided to dig through the Solaris archives and put together a chronology of Solaris debuggability and observability tools. For fun, I divided it into eras to parallel Earth's history. And I swear I'm not out to make anyone feel like a dinosaur (or a prokaryote, for that matter).
I've only been around for one of these "dawn of a new era" arrivals, DTrace[1]. When one of these revolutionary tools arrives, it's amazing to see how quickly engineers avoid their own past. Try asking Bryan to debug a performance problem on Solaris 9, and you'll probably get some choice phrases politely explaining that while he appreciates the importance of your problem, he would rather throw himself down a slide of broken glass and into a vat of rubbing alcohol. Being the neophyte that I am, I've only ventured into the 'Paleozoic era' on one occasion. After an MDB session on a Solaris 8 crashdump (paraphrased slightly):
    $ mdb 0
    > ::print
    mdb: invalid command '::print': unknown dcmd name
    > ::help print
    mdb: unknown command: print
    > ::please print
    mdb: invalid command '::please': unknown dcmd name
    > ::ihateyou
    $
I quickly ran away screaming, never to return. I think I ended up hiding in a corner of my office for two hours, cradling my DTrace answerbook and whispering "there's no place like home" over and over. I'm still a spoiled brat, but at least I have respect and admiration for those Solaris veterans who crawled through debugging hell so that I could live a comfortable life[2]. It's also made me feel sorry for the Linux (and Windows) developers out there. Not in the Nelson Muntz "Ha ha! You don't have DTrace!" sense. More like "Poor little guy. It's not his fault his species never evolved opposable thumbs." There are a lot of brilliant Linux developers out there, stuck in a movement that doesn't embrace debugging or observability as fundamental goals. But this post is supposed to be about history, not Linux. So without further ado, my brief history of Solaris (soon to be available in refrigerator magnet form):
| Era / Year | Tool or feature |
|---|---|
| | adb, ptrace, crash |
| 1994 | Kernel slab allocator |
| PROTEROZOIC | Next generation /proc |
| | pkill and pgrep |
| 1998 | savecore on by default |
| 1999 | libproc for corefiles |
| | lockstat kernel profiling |
| | EOL of crash(1M) |
| 2001 | live process control for MDB |
| | EOL of adb(1) |
| | pargs and preap |
| MESOZOIC | kernel CTF data |
| | libumem(3LIB) and umem_debug(3MALLOC) |
| | ::typegraph for mdb(1) |
| | coreadm(1M) content control |
| 2004 | DTrace pid provider for x86 |
| | pfiles with pathnames |
| | DTrace sched, proc providers |
| | CTF for core libraries |
| | DTrace I/O provider |
| | DTrace MIB, fpuinfo providers |
These are my choices based on SCCS histories and putback logs. Obviously, I've failed to include some things. Leave a comment or email if you think something's not getting the recognition it deserves (keeping in mind this is a blog post, not a book).
[1] I actually started exactly one day before DTrace integrated. But I had some experience (albeit limited) as an intern the previous year.
[2] In all seriousness, it's not that I never have to debug anything, or that the problems we have today are somehow orders of magnitude simpler than those in the past. What these tools provide is a dramatic reduction in time to root-cause. You still need the same inquisitive and logical mind to debug hard problems; it's just that good tools let you form questions and get answers faster than you could before. Really good tools (like DTrace) let you ask the previously unanswerable questions. You may have been able to debug the problem before, but you would have ended up running around in circles trying to get data that's now immediately available thanks to DTrace.