
Recent Posts

Solaris

Oracle Instant Client: now available in IPS

Over the last few years I've spent a fair amount of time working deep inside the ON (OS/Networking) consolidation, improving the build system (https://blogs.oracle.com/jmcp/entry/my_own_private_crash_n) and enhancing some general gate plumbing. One significant aspect of that plumbing was migrating our automated bug update system from Sun's Bugster to Oracle's bug database. When Oracle's acquisition of Sun took effect we were still developing Solaris 11, so we got an exemption from the "thou shalt migrate to the one true bug tracking tool" edict. Once we had shipped Solaris 11, however, we had to get cracking on that migration.

My small part in that process was writing a python script to provide a gate tools interface. We needed to provide a way for engineers to check that their bugids, synopses and pre-integration states were correct, as well as automatically updating bug states on integration ("fix available"), backout ("fix failed") and build close (when the gate staff mark a bug as "fix delivered"). This was a surprisingly large amount of work, even though it resulted in only about 1500 lines of code (95% python, 5% shell). The majority of the effort came from learning the database schema and its APIs - and for that I needed to use sqlplus.

Those of you who need to interact directly with Oracle databases will be familiar with this tool, and a great many people use the version that comes with their full Oracle database installation. There is another way of obtaining it - the Oracle Instant Client. Until now you needed to download the Instant Client from https://www.oracle.com/technetwork/database/features/instant-client/index-100365.html as zipfiles and unpack the bits you needed into a convenient location. Some mucking around with LD_LIBRARY_PATH was necessary, too.

As a follow-on from the bug service migration project I developed a hankering to see the Oracle Instant Client made available in IPS form, and I am delighted to announce that with the Oracle Solaris 11.3 beta release this is now possible for the 12.1.0.2.0 release of the Oracle Instant Client.

If you've downloaded the Oracle Instant Client from OTN, you will be aware that the zipfiles are split into 32- and 64-bit versions of the basic libraries, sqlplus, the ODBC and JDBC supplements and the software development kit (sdk). What we are providing with our delivery is slightly different from what you'll find on OTN, because we've combined a few logically aligned packages into one:

pkg:/database/oracle/instantclient
pkg:/database/oracle/instantclient/jdbc-supplement
pkg:/database/oracle/instantclient/odbc-supplement
pkg:/developer/oracle/instantclient/sdk

There is also a pkg:/consolidation/instantclient/instantclient-incorporation which ties them all together. The contents of pkg:/database/oracle/instantclient almost completely match the OTN 'basic', 'sqlplus' and 'wrc' zipfiles - in both 32- and 64-bit versions. The sdk, odbc-supplement and jdbc-supplement packages match what is provided in the OTN zipfiles.

To install these packages, once you have set your solaris publisher to the Beta release repo, just utter

# pkg install pkg:/consolidation/instantclient/instantclient-incorporation

As newer versions of the Instant Client are released, we will update the version in https://pkg.oracle.com to match, and you will notice that the package FMRI tracks the Database release version (12.1) rather than the Solaris release.
We have also updated the runpaths in the libraries and binaries so there is no need for you to set LD_LIBRARY_PATH in a wrapper script - though you might find it useful to add /usr/oracle/instantclient/12.1/bin to your $PATH. As a side note, you might find it useful to set ORACLE_HOME if you are going to build bindings such as cx_Oracle for Python or DBD::Oracle for Perl. Finally, I could not have done this without the assistance of Chris Jones, the Instant Client program manager - thankyou Chris!
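As a rough sketch of what that ends up looking like in practice (the ORACLE_HOME value is my assumption based on the bin path above, and the sqlplus connect string is obviously just a placeholder):

# pkg install pkg:/consolidation/instantclient/instantclient-incorporation
$ export PATH=$PATH:/usr/oracle/instantclient/12.1/bin
$ export ORACLE_HOME=/usr/oracle/instantclient/12.1    # assumption: parent of the bin dir, useful for cx_Oracle / DBD::Oracle builds
$ sqlplus scott/tiger@//dbhost:1521/orcl                # quick smoke test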


Solaris

My own private crash-n-burn farm: using kernel zones for speedy testing

I've spent most of the last two years working on a complete rewrite of the ON consolidation build system for Solaris 12. (We called it 'Project Lullaby' because we were putting nightly to sleep). This was a massive effort for our team of 5, and when I pushed the changes at the end of February we wound up with about 121k lines of change over close to 6000 files. Most of those were Makefiles (so you can understand why I'm now scarred!).

We had to do an incredible amount of testing for this project. Introducing new patterns and paradigms for the complete Makefile hierarchy meant that we had to be very careful to ensure that we didn't break the OS. To accomplish this we used (and overloaded) the resources of the DIY test group and also made use of a feature which is now available in 11.2 - kernel zones.

Kernel zones are a type-2 hypervisor, so you can run a separate kernel in them. If you've used non-global zones (ngz) on Solaris in the past, you'll recall the niggle of having to keep those ngz in sync with the global zone when it comes to SRUs and releases. Using kernel zones offered several advantages to us: I could run tests whenever I wanted on my desktop system (a newish quad-core Intel Core i5 system with 32gb ram), I could quickly test updates of the newly built bits, I could keep the zone at the same revision while booting the global zone with a new build, and (this is my favourite) I could suspend the zone while rebooting the global zone.

Our testing of Lullaby in kernel zones had two components: #1, does it actually boot? and #2, assuming I can boot the kz with Lullaby-built bits, can I then build the workspace in the kz and then boot those new bits in that same kernel zone?

Creating a kernel zone is very, very easy:

limoncello: # zonecfg -z crashs12 create -t SYSsolaris-kz
limoncello: # zoneadm -z crashs12 install -x install-size=40g
limoncello: # zoneadm -z crashs12 boot

I could have used one of the example templates (eg /usr/share/auto_install/sc_profiles/sc_sample.xml), but for this use-case I just logged in and created the necessary users, groups and automount entries, and installed compilers by hand (meaning pkg install rather than tar xf). To start with, I ensured that crashs12 was running the same development build as my global zone, but I removed the various hardware drivers I had no need for.

The very first test I ran in crashs12 was a test of libc and the linker subsystem. Building libc is rather tricky from a make(1s) point of view, due to having several generated (rather than source-controlled) files as part of the base. The linker is even more complex - there's a reason that we refer to Rod and Ali as the 'linker aliens'! Once I had my fresh kz configured appropriately, I created a new BE, mounted it, then blatted the linker and libc bits onto it and rebooted (there's a sketch of this cycle at the end of this post). I was really, really happy to see the kz come up and give me a login prompt.

Several weeks after that we got to the point of successful full builds, so I installed the Lullaby-built bits and rebooted:

root@crashs12:~# pkg publisher
PUBLISHER                     TYPE    STATUS  P  LOCATION
nightly                       origin  online  F  file:///net/limoncello/space/builds/jmcp/lul-jmcp/packages/i386/nightly-nd//repo.osnet/
extra (non-sticky, disabled)  origin  online  F  file:///space/builds/jmcp/test-lul-lul/packages/i386/nightly/repo.osnet-internal/
solaris (non-sticky)          origin  online  T  http://internal/repo
root@crashs12:~# pkg update --be-name lul-test-1
root@crashs12:~# reboot

This booted, too, but I couldn't get any network-related tests to work. Couldn't ssh in or out. Couldn't for the life of me work out what I'd done wrong in the build, so I asked the linker aliens and Roger for help - they were quick to realise that in my changes to the libsocket Makefiles, I'd missed the filter option. Once I fixed that, things were back on track.

Now that Lullaby is in the gate and I'm working on my next project, I'm still using crashs12 for spinning up a quick test "system", and I'm migrating my 11.1 VirtualBox environment to an 11.2 kernel zone. The 11.2 zone, incidentally, was configured and installed in about 4 minutes using an example AI profile (see above) and a unified archive. Kernel zones: you know you want them.
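The "new BE, blat the bits onto it, reboot" cycle mentioned above looks roughly like this with beadm - the BE name is made up, and in my case the "copy the bits" step was a by-hand copy of the freshly built linker and libc objects rather than anything packaged:

root@crashs12:~# beadm create lul-libc-test
root@crashs12:~# beadm mount lul-libc-test /mnt
  (copy the newly built libc and linker bits under /mnt)
root@crashs12:~# beadm unmount lul-libc-test
root@crashs12:~# beadm activate lul-libc-test
root@crashs12:~# reboot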


OpenSolaris

And before I forget it \*again\* ...

One of the things I'm currently responsible for (as onnv gatekeeper) is maintenance of our gatehooks. From time to time I need to make changes and test them in a sandbox, and I keep my sandbox pretty small - hg update takes a while when you have to run it over all of ON :-)

Anyway, with the most recent round of changes (to support running the hooks with Mercurial built against python 2.6), I just spent the best part of a day beating my head against two things.

Firstly, I'd forgotten that my sandbox was configured to exec mercurial with the --trace option. With python 2.6 and Mercurial 1.3.1 this has the effect of making any process exit, even a successful one (exit(0)), die with a traceback:

$ $HG push ssh://onhg@localhost//scratch/gate/trees/minirepo-131_26
pushing to ssh://onhg@localhost//scratch/gate/trees/minirepo-131_26
searching for changes
Are you sure you wish to push? [y/N]: y
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
remote: caching changes for gatehooks...
remote: ...changes cached
remote: Sanity checking your push...
remote: ...Sanity checks passed
remote: pushing to /scratch/gate/trees/minirepo-131_26-clone
remote: searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
remote: Traceback (most recent call last):
remote:   File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/dispatch.py", line 43, in _runcatch
remote:     return _dispatch(ui, args)
remote:   File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/dispatch.py", line 449, in _dispatch
remote:     return runcommand(lui, repo, cmd, fullargs, ui, options, d)
remote:   File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/dispatch.py", line 317, in runcommand
remote:     ret = _runcommand(ui, options, cmd, d)
remote:   File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/dispatch.py", line 501, in _runcommand
remote:     return checkargs()
remote:   File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/dispatch.py", line 454, in checkargs
remote:     return cmdfunc()
remote:   File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/dispatch.py", line 448, in <lambda>
remote:     d = lambda: util.checksignature(func)(ui, *args, **cmdoptions)
remote:   File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/util.py", line 402, in check
remote:     return func(*args, **kwargs)
remote:   File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/commands.py", line 2752, in serve
remote:     s.serve_forever()
remote:   File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/sshserver.py", line 46, in serve_forever
remote:     sys.exit(0)
remote: SystemExit: 0

The second one was equally frustrating: our hooks have a Test parameter, which you set to True or False in your gate's hgrc. Setting it to True means that the hook does not do any actual work. Guess which value I'd left it set to in my minirepo?
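For my own future reference, the relevant knobs look roughly like this in an hgrc. This is a sketch only: the [gatehooks] section name and the Test key belong to our private gate hooks, and the hook path shown here is invented, so treat it as illustrative rather than gospel:

[hooks]
# where the gate's changegroup/pretxnchangegroup hooks get wired in
# (this path is a made-up example)
pretxnchangegroup.gate = /opt/onbld/gatetools/hooks/pretxnchangegroup

[gatehooks]
# True  => dry-run: the hooks log what they would do, but do no actual work
# False => the hooks really do update bugs, caches, etc.
Test = False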


OpenSolaris

Migrating from Solaris Express to OpenSolaris

There's currently no way to do an in-place upgrade from Solaris Express's SysV packaging to OpenSolaris' IPS packaging, so you have to think outside the box just a little. My Ultra 40 M2 has been happily chugging away with SXCE builds since I took delivery of it, but with build 131 fast approaching (when we start delivering a nightly IPS repo rather than SysV packages), I figured I should put some effort into migrating to the new world.

Fortunately for me, I've been running with ZFS root since it was first available (build 80 or so), and when I reinstalled my laptop last year I put OpenSolaris on it. It's been upgraded to snv_126 in the last week, too. Here's what I did.

After making sure I had enough space in my u40m2 ("blinder") root pool, I created a zfs snapshot of the current BE on the laptop ("gedanken"). Then I destroyed it, pruned sundry filesystems and a bunch of packages which I don't need (eg, I have no use for almost all the localisations), and re-created the snapshot.

gedanken# zfs snapshot rpool/ROOT/opensolaris-7@yay
blinder# zfs create rpool/ROOT/opensolaris-7

Then I sent it from gedanken to blinder:

gedanken# zfs send rpool/ROOT/opensolaris-7@yay | \
    ssh blinder zfs recv -v rpool/ROOT/opensolaris-7
[trundle]
received 20.8GB stream in 2707 seconds (7.85MB/sec)

Now the action switches to blinder (add an entry to the grub menu.lst, and remember to set the default):

# zpool set bootfs=rpool/ROOT/opensolaris-7 rpool
# zfs set canmount=noauto rpool/ROOT/opensolaris-7
# zfs set mountpoint=/ rpool/ROOT/opensolaris-7

Time for some customisations on the new pool:

# zfs set mountpoint=/mnt rpool/ROOT/opensolaris-7
# cp /etc/ssh/sshd*key* /mnt/etc/ssh
# cp /etc/hostid /mnt/etc/hostid
# cp /etc/inet/hosts /mnt/etc/inet/hosts
# cp /etc/X11/xorg.conf /mnt/etc/X11
# cp /etc/hostname.nge0 /mnt/etc
# cp /var/spool/cron/crontabs/onhg /mnt/var/spool/cron/crontabs

I think that's it - I really should have run this from within script(1)!

# cd /
# zfs umount rpool/ROOT/opensolaris-7
# zfs set mountpoint=/ rpool/ROOT/opensolaris-7

Time for the acid test!

# init 6

[dammit, this "fast reboot" stuff is TOO FAST!]

root@blinder:~# cat /etc/release
                 OpenSolaris Development snv_127 X86
      Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
                   Use is subject to license terms.
                      Assembled 06 November 2009
root@blinder:~# uname -a
SunOS blinder 5.11 snv_127 i86pc i386 i86pc

and X came up just fine. After making sure that all the bits that I expected were there, I was very pleased to call this exercise a success. The only bit that remains to be done is configuring my non-global zone, but I'll leave that for another post.
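The menu.lst entry I glossed over above looks roughly like this on a findroot-capable OpenSolaris build - the title and the pool signature name are illustrative, so crib from an existing entry on your own system rather than copying this verbatim:

title OpenSolaris snv_127 (migrated BE)
findroot (pool_rpool,0,a)
bootfs rpool/ROOT/opensolaris-7
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive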


OpenSolaris

Vodafone Mobile Broadband with a Huawei K3520/E169

Since I'll be even more remote from my office for a few days, due to acceptance of my talk about gatekeeping for SAGE-AU, I thought I should acquire one of those funky mobile broadband solutions so I could keep in contact with the gate. It's been a bit of a pain to get working, but now that it is I figure I should provide some details of what I've done. Firstly, a big thankyou to my mate Arjen who brought his Huawei E220 dongle over and let me muck around with it for a few hours.

The solution I chose was Vodafone's Mobile Prepaid Broadband, which comes with a Huawei E169, aka K3520, usb dongle. This has the increasingly popular "ZeroCD(tm)" feature - when you plug it in, it defaults to showing a storage device (usb device class 8) rather than a communications device (usb device class 2). This storage device has the windows drivers and application on it, which then kicks the device into being a communications device. All good, or something, as long as you're running XP, Vista or Mac OS X. Not so good, clearly, for yours truly.

A bunch of google hits came up with http://darkstar-solaris.blogspot.com/2008/10/huawei-e169-usb-umts-gprs-modem.html, http://www.opensolaris.org/jive/thread.jspa?messageID=272246, http://in.opensolaris.org/jive/thread.jspa?messageID=227212 and http://my2ndhead.blogspot.com/2008/11/opensolaris-huawei-e220-swisscom-and.html, which got me started - I needed to use the usbsacm(7D) driver, so it was time to update_drv. Being a bit lazy, and suspecting that there'd be a few options that needed covering, I just hand-edited /etc/driver_aliases to add in all the possible compatible entries for the device (starting with usb12d1,1001.0). Still no joy - the darn thing was still showing up as a storage device even after hotplugging.

On the off-chance that the attached-at-power-on state might be different, I rebooted with it attached, and... lo and behold, it was different! No usb,class8, just compatible properties which allowed me to attach it to the usbsacm driver and get 3 /dev/term entries. After a hotplug op, however, it was back to showing up as a storage device, which was most undesirable.

So I tried looking for what solutions the linux world has come up with to the problem, came across usb_modeswitch (which gave the clue that sending a "USB_REQ_SET_FEATURE, 00000001" would kick it properly), and also some Huawei forum posts. One thread in particular caught my eye: "K3520 (E169) microSD lost", which featured a comment from a Huawei employee instructing the user to utter "at^u2diag=255" in a hyperterm session in order to get back their microSD card slot.

So being inquisitive, and making a guess, when I had the dongle connected from boot, I ran

# tip /dev/term/0
connected
ATZ
OK
at^u2diag=0
OK
~.

and hotplugged the device. On re-insertion (after waiting 1 minute), I saw that there was no storage device found, just usbsacm instances. W0000t! So now all I had to do was trawl my memories to recall how to do ppp (eeeek!) and I would be connectable.

Therein lies a lot of pain, so I'll cut straight to the "this works for me" part. The aliases I've got in /etc/driver_aliases are

usbsacm "usb12d1,1001.0"
usbsacm "usb12d1,1001"

The peers file that I'm using is

/dev/huawei
720000
debug
logfile /var/tmp/pppd.log
crtscts
asyncmap a0000
idle 300
passive
defaultroute
usepeerdns
:0.0.0.0
noccp
novj
lcp-echo-interval 0
lock
modem
connect '/usr/bin/chat -s -t60 -f /etc/ppp/voda-chat'
noauth
persist

Note that I symlinked /dev/term/0 to /dev/huawei - purely because I wanted to.

The chat script is

ABORT 'BUSY'
ABORT 'NO CARRIER'
"" AT&F
OK AT\136u2diag=0
OK ATE0V1E1X1\136SYSCFG=2,2,3FFFFFFF,1,2
OK ATS7=60
OK AT+CGDCONT=1,"IP","vfprepaymbb"
OK AT+CGQMIN=1
OK AT+CGQREQ=1
OK "ATD \052 99\052\052 2 \043"
REPORT CONNECT
'' CONNECT

Note the use of octal characters - Solaris' /usr/bin/chat doesn't like the caret (^), asterisk (*) or hash (#) in a chat script, so you have to work around that by using \136 for caret, \052 for asterisk, and \043 for hash. Also, note that the prepaid solution uses an Access Point Name (APN) of "vfprepaymbb" rather than the contract/postpaid "vfinternet.au".

The other thing of interest is the ":0.0.0.0" in the peers file. This is what I needed to add in order to get around the problem

[ID 702911 daemon.debug] Peer refused to provide his address; assuming 192.168.1.1

Once I'd done that, things seemed to work just fine. It was rather weird to see a ping time from my non-global zone to the laptop via 3G taking about 170msec even though both machines are within my arm's reach!

I just need to get some ip-up and ip-down scripts figured out, and then I'll be all raring to go.

One final thing: there is apparently a reasonably annoying bug with usbsacm - 6840063 usbsacm stops sending data out when pushed hard (fixed in snv_120) - and an RFE, 6588968 Provide support for 3G USB broadband modem device from Huawei Technologies. I don't know for sure whether this works if you're not running snv_120, but rather than disabling a core, you could try

# pbind -b 0 `pgrep pppd`

as part of your ip-up script. I'm going to try it and see.
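For what it's worth, the ip-up hook I have in mind is tiny - just a sketch, assuming the pbind workaround above is actually what you want (pppd invokes ip-up with the interface, tty, speed, local and remote addresses as arguments, none of which we need here):

#!/bin/sh
# /etc/ppp/ip-up - run by pppd once the link comes up.
# Bind pppd to cpu 0 as a workaround for 6840063 on pre-snv_120 bits.
/usr/sbin/pbind -b 0 `pgrep pppd`
exit 0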


Conferences

KCA2009 - winding down #1

Now that I've had a weekend to start recovering from Kernel Conference Australia, it's time to start catching up on a few blog entries. There'll be several over the next few weeks as I work through everything I want to mention.

Thankyous

A massive thankyou to our volunteers on the ground (James Lever, Greg Black, Nikolai Lusan, Daniel Dawson) who all did a stellar job and made it possible for me to be the shiny happy face of the conference :-)

Thankyou to the review committee (Jake Carroll, Andre van Eyssen, David Gwynne and John Sonnenschein) who helped put the program together.

Thankyou to our speakers, without whom there would not have been any conference to attend.

Thankyou to Deirdre Straughan who got us live streamed sessions on ustream and recorded many hours of video.

Thankyou to Claire Operie, Gabriele Decelis and Diana Pearce who handled the logistics of getting the event off the ground (registrations, the event website, catering, swag etc).

Thankyou to Mitch Roberts for setting up the demo room and answering hundreds of questions about pretty much every piece of technology Sun sells. He also showed off some really amazing VDI and demonstrated the Fishworks kit. Mmmmmmmmmmmm!

Thankyou to Jake Carroll for setting up the infrastructure we needed at the Queensland Brain Institute. Thankyou especially to QBI's director Professor Perry Bartlett, FAA, and Ian Duncan for generously allowing us to use QBI's facilities for the conference.

Thankyou to the Brisbane branch of Sun Microsystems who got behind the idea, provided connections, spread the word and encouraged people to come along.

Thankyou to everybody who attended KCA and thought it would be something worthwhile. I hope you went away excited, energised and enthused about all the really cool technologies that you can find in Open Source kernels, and perhaps even inspired to contribute to your favourite kernel in the future.

Finally, thankyou to our sponsors - Frontline, OpenSolaris, and Sun Microsystems.

Presentations, Videos, Papers, Proceedings

A number of people have asked me over the last few days whether the presentations will be made available online. The answer is a resounding YES, but not just yet. What the review committee (well, just Andre and myself) are doing is creating a Proceedings of the conference. Within that we will have the papers or slideware and speaker notes that each presenter wrote. When we have that finished I will announce it on this blog, and I will email all the delegates to provide details on how to obtain a copy. (Hopefully it'll be in the National Library of Australia with proper CiP data too.)

The videos that we recorded are being cleaned up and will be uploaded to somewhere with a lot of space. Soon. I don't know when, but when I do know, I'll announce it here.

Will we run this again in 2010?

The feedback I've received from people attending KCA has generally been very positive about the event, and encourages me to organise another KCA for next year. We'll have to wait and see how things pan out following the buyout vote, but I'm very hopeful that we'll be able to make it happen.


Conferences

KCA2009 - earlybird registrations close in 1 week!

It's only one week before the earlybird registration period for Kernel Conference Australia closes. As a quick reminder, in addition to our excellent keynote speakers Jeff Bonwick, Bill Moore and Max Alt, here are the people you'll be able to meet, listen to and learn from at KCA:

Fernando Gont: Results of a Security Assessment of Common Implementation Strategies of the TCP and IP Protocols
Henning Brauer (OpenBSD): Faster Packets: Performance Tuning in the OpenBSD Network Stack and PF
Gavin Maltby (Sun Microsystems): Hardware & Software Fault Management Architecture
Pawel Dawidek (FreeBSD): GEOM - The FreeBSD way of handling storage
John Sonnenschein (Sun Microsystems): Driver and Filesystem Development with the Solaris and OpenSolaris DDI/DKI
David Gwynne (University of Queensland): MCLGETI: Effective Network Livelock Mitigation and More
Cristina Cifuentes (Sun Microsystems): Finding Bugs in Open Source Kernels Using Parfait
Sherry Moore (Sun Microsystems): Fast reboot support (and more) for OpenSolaris
Max Bruning (Bruning Systems): Porting USB HID Device Drivers Between Linux and OpenSolaris
James Morris (Red Hat): Linux Kernel Security Overview
Percy Pari-Salas (Bond University): Automated Testing of OpenSolaris
Vivek Joshi (Sun Microsystems): Porting OpenSolaris across architectures
Jayakara Kini (Sun Microsystems): Crossbow for OpenSolaris Developers
Garrett D'Amore (Sun Microsystems): Boomer: the new OpenSolaris audio system
Pramod Batni (Sun Microsystems): Debugging and Diagnosing Interesting Kernel Problems
Stewart Smith (Sun Microsystems): (Ab)use the Kernel: what a database server can do to your kernel

So what are you waiting for? Hurry up and register!


Conferences

Kernel Conference Australia - it's coming!

Over the years it's been a source of frustration for me that the conferences which I wanted to attend were either too expensive (time, travel, registration etc) or not covering topics I was interested in. Late last year I realised I should stop grumbling about it and fix it myself. So it is with great pleasure that I can announce that this July 15th to 17th, in Brisbane, there will be an Open Source kernel-focused conference: Kernel Conference Australia. We don't have absolutely all of the organisational bits together yet, but here's what we do have:

World class speakers
Jeff Bonwick (Sun, ZFS)
Bill Moore (Sun, ZFS)
Sherry Moore (Sun, x64 guru)
Gavin Maltby (Sun, FMA guru)
Dave Stewart (Intel, OpenSolaris group)

Fantastic location
Our venue is the Queensland Brain Institute, within the University of Queensland.

Excellent climate
The University is situated in Brisbane, Queensland, Australia. Yes, it will be winter... but winter in Brisbane is a beautiful time to visit.

Call for Papers
The Call for Papers is not quite ready (still a few details to be ironed out), but if you'd like to be considered for a presentation spot you should be thinking about a topic in any of these areas:

Cross-architecture kernel development
Porting an OS to a new architecture
Filesystems
System performance visualisation (DTrace, SystemTap?)
Image visualisation (GPU kernels)
Fault Management
(globally) Distributed kernel development - how to make it work
Virtualisation
Clustering (HPC and High Availability)
Distributed systems
Kernel Testing - methodologies, interesting problems found
Traps and pitfalls found when porting drivers between OSes
Realtime performance and scheduling
Embedded OSes and drivers (including control systems)
Patents and Open Source
The state of OS kernel research / what's new / work in progress

Apart from the above, we expect that pretty much any kernel-focused topic for an Open Source licensed OS will be considered by the organising committee.

Target OSes
OpenSolaris (of course!)
The BSD family: FreeBSD / OpenBSD / NetBSD
Linux
minix
research OSes (L4 variants, GNU HURD etc etc)

If you would like to help with sponsoring the conference, please let me know via email (jmcp at my employer . com).


OpenSolaris

On stmsboot(1m)

When I started working with SAS (November 2006), our group's project (MPxIO support in mpt(7d)) was already off and running. The part that I was responsible for was booting - which meant fixing stmsboot(1m). Initially I was disappointed that I'd been given what I thought was such a small part of the problem to work on, but I quickly realised that there was a lot more to it than my first impression revealed.

Since we were under some pretty tight time pressure, I didn't really have time to do a redesign of stmsboot to make it more sustainable. The expected arrival of ZFS root meant that there was also some uncertainty about how that would tie in - nothing was nailed down, so I had to make some guesses and keep my eyes peeled for when ZFS root eventuated, and then see what further changes needed to be made. We putback those changes into snv_63, had a few followups in subsequent builds, and all seemed ok.

Then in February 2008 there was a thread on storage-discuss about how to obtain a particular device's lun number after running devfsadm -C (or boot -r, for that matter). I did a little digging and figured out that it would indeed be possible to provide that information - if you were willing to do a little digging and make use of a scsi_vhci ioctl() or two. Using hba-private data, unfortunately, so quite unsupportable. But it got me thinking, and I logged 6673281 stmsboot needs more clues as a placeholder.

Then a short while later I noticed that the -L and -lX options to stmsboot(1m) were broken as of snv_83 (nobody had worked on stmsboot(1m) since I made my changes in build 63). Since this is an essential part of the actual interface, I figured it was important enough to log (6673278 stmsboot -L/l is broken on snv_83 and later) but was unable to do much about it until I got Pluggable fwflash(1m) out of the way first. I was also annoyed to find that there were problems with updating /etc/vfstab, too (6691090 stmsboot -d failed to update /etc/vfstab with non-mpxio device paths)... things were not looking good, and I was watching code rot for real. Staggering! The kicker was 6707555 stmsboot is lost in a ZFS root world, and so I knew what I had to do - redesign and rewrite stmsboot from scratch.

The Redesign

I started with 4 guiding principles:

require only one reboot
listing of MPxIO-enabled devices should be *fast*
minimise filesystem-dependent lookups, and
use libdevinfo and devlinks as much as possible.

I then looked at the overall effects that we need to achieve with the stmsboot(1m) command (the corresponding invocations are sketched after the See also list below):

enable MPxIO for all MPxIO-capable devices
enable MPxIO for specific MPxIO-capable drivers
enable MPxIO for specific MPxIO-capable HBA ports
disable MPxIO for all MPxIO-capable devices
disable MPxIO for specific MPxIO-capable drivers
disable MPxIO for specific MPxIO-capable HBA ports
update MPxIO settings for all MPxIO-capable drivers
update MPxIO settings for specific MPxIO-capable drivers
list the mapping between non-MPxIO and MPxIO-enabled devices
list device guids, if available

What does the old code do?

The code makes use of a shell script (/usr/bin/stmsboot), a private binary (/lib/mpxio/stmsboot_util) and an SMF service (/lib/svc/method/mpxio-upgrade) which runs on reboot. The private binary does the heavy lifting, providing a way for the shell script and SMF service to determine what a device's new MPxIO or non-MPxIO mapping is. The old private binary also walked through the device link entries in /dev/rdsk when called with the -L or -l $controller options, printing any device mappings. Finally, the private binary handles the task of re-writing /etc/vfstab.

The shell script (stmsboot) is the user interface part of the facility. Its chief task is to edit the driver.conf(4) files for the supported drivers (fp(7d) and mpt(7d)), and to set the eeprom bootpath variable on the x86/x64 platform if disabling or updating MPxIO configurations. (Failing to do this would prevent an x86/x64 host from booting.) The shell script also makes backup copies of modified files, and creates a file with instructions on how to recover a system which has failed to boot properly after running the stmsboot script.

The SMF service is armed by the stmsboot script, and runs on reboot. It mounts /usr and root as read-write, invokes the private /lib/mpxio/stmsboot_util binary to rewrite /etc/vfstab, updates the dump configuration and any SVM metadevice (/dev/md) device mappings, and then (in the old form) reboots the system.

What has changed

The new design makes use of a private cache of device data (stored using an nvlist) gathered from libdevinfo(3LIB) functions, and obviates the requirement for a second reboot since the vfstab rewriting function is reliable - we use the kernel's concept of what devices it has attached, so we're always consistent. In addition, the new design provides a significant improvement in execution time when listing device mappings: we don't need to trawl through device links on disk but instead use libdevinfo functions and our private cache to provide the required information.

The data that we store in the cache for each device attached to an MPxIO-capable controller is

its devid (eg, id1,sd@n5000cca00510a7cc/aS_________________________________________3QF0EAFP/a)
its physical path (eg, /pci@0,0/pci10de,5c@9/pci108e,4131@1/sd@0,0)
its devlink path (eg, /dev/dsk/c2t0d0, which becomes c2t0d0)
its MPxIO-enabled devlink path (eg, /dev/rdsk/c3t500000E011637CF0d0, which becomes c3t500000E011637CF0d0)
whether MPxIO is enabled for the device in the running system (as a boolean_t B_TRUE or B_FALSE)

These are stored as nvlist properties:

#define NVL_DEVID    "nvl-devid"
#define NVL_PATH     "nvl-path"
#define NVL_PHYSPATH "nvl-physpath"
#define NVL_MPXPATH  "nvl-mpxiopath"
#define NVL_MPXEN    "nvl-mpxioenabled"

When we've found an MPxIO-capable device, we check whether it exists in our cached version, and if not, we create an nvlist containing the above properties, keyed off the device's devid. This nvlist is added to the global nvlist. In order to speed up operations later, we also add some inverse mappings to the global nvlist:

devfspath -> devid
current devlink path -> devid
current MPxIO-enabled path -> devid
device physical path -> devid

This allows us to search for any of those paths and get the appropriate devid back, the nvlist of which we can then query for the desired properties.

When the mpxio-upgrade service is invoked, we need to determine the mapping for the root device in the currently running system and mount that device as read-write in order to continue with the boot process. We do this by reading the entry for root in /etc/vfstab and finding the physical path of that device in the running system. We mount /devices/$physicalpath as read-write, then re-invoke stmsboot_util to find the devlink (/dev/dsk...) path for root, which we then remount. This two-remount approach is required because the devlink facility is not available to us at this early stage of the boot process (devfsadm is not running yet) - not until we can determine what the root device is and mount it as read-write.

Once root and /usr have been remounted, we can then invoke stmsboot_util to re-write the vfstab. This is a fairly simple process of scanning through each line of the file, finding those which start with /dev/dsk, determining their mapping in the current system, and re-writing that line. As a safeguard, the new version of the vfstab is written to /etc/mpxio, and we let the mpxio-upgrade script take care of copying that file to /etc/vfstab. Once the vfstab has been updated, we run dumpadm and, if necessary, metadevadm. Finally, we re-generate the system's boot archive - which in fact is the longest single operation of all!

After this, we can disable the svc:/system/device/mpxio-upgrade:default service and exit. When the mpxio-upgrade script exits, the svc:/system/filesystem/usr:default service takes over and the boot process completes normally - with the new device mappings already active and working. No second reboot required!

I'm not going to claim that the new form of stmsboot(1m) is a beautiful thing, but I do believe that the architecture and implementation it has now are much more solid and should be easier to extend in the future if required.

Update (17 October 2008, 07:08 Australia/Brisbane): Jason's comment reminded me that I should have mentioned - I pushed these changes into build snv_99 and you can see them in the changelog.

See also:
Solaris Express SAN Configuration and Multipathing Guide
Solaris ZFS Administration Guide
Linker and Libraries Guide
SMF
devfsadm(1m) stmsboot(1m) zfs(1m) zpool(1m)
libdevinfo(3LIB) libnvpair(3LIB) libdevid(3LIB)
fp(7d) mpt(7d) scsi_vhci(7d)
driver.conf(4)
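In terms of the command line, the effects listed above map onto invocations along these lines (paraphrased from stmsboot(1m) of that era; the controller number is obviously system-specific):

# stmsboot -e           enable MPxIO on all supported HBA ports
# stmsboot -D fp -e     enable MPxIO for fp(7d)-attached ports only
# stmsboot -D mpt -d    disable MPxIO for mpt(7d)-attached ports
# stmsboot -u           update MPxIO settings after editing driver.conf(4)
# stmsboot -L           list non-MPxIO to MPxIO device name mappings
# stmsboot -l 3         list the mappings for controller 3 only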


OpenSolaris

Patches for Sun Studio 12 required in order to build ONNV

We moved the ONNV gate to use Sun Studio 12 during this last week.

As it happens, I've been asked to maintain a build server for a related group, and to help that group bootstrap their development. I realised that I really should find out what patches are required for Sun Studio 12 so that we've got the same toolchain in use, and a quick mail to Nick resulted in the following lists:

X86/X64

Patch applied   Current rev   Patch description
124873-04       06            Sun Studio 12_x86: Patch for dbx 7.6 Debugger
126996-03       04            Sun Studio 12_x86: Patch for Performance Analyzer Tools
124864-07       07            Sun Studio 12_x86: Patch for Sun C++ Compiler
124868-06       06            Sun Studio 12_x86: Patch for C 5.9 compiler
124869-02       02            Sun Studio 12_x86: Patch for Sun Performance Library
124876-02       02            Sun Studio 12_x86: Patch for Debugger GUI 3.0
126496-02       02            Patch for Sun Studio 12_x86 debuginfo handling
126498-09       09            Sun Studio 12_x86: Sun Compiler Common patch for x86 backend
126504-01       01            Sun Studio 12_x86: Patch for Sun Distributed Make 7.8
127002-04       04            Sun Studio 12_x86: Patch for Fortran 95 8.3 Compiler
127003-01       01            Sun Studio 12_x86: Patch for Fortran 95 8.3 Dynamic Libraries
127144-03       03            Sun Studio 12_x86: Patch for Fortran 95 8.3 Support Library
127148-01       01            Sun Studio 12_x86: Patch for update notification
127153-01       01            Sun Studio 12_x86: Patch for IDE
127157-01       01            Sun Studio 12_x86: Patch for install utilities

SPARC

Patch applied   Current rev   Patch description
124870-02       03            Sun Studio 12: Patch for Sun Performance Library
124872-04       06            Sun Studio 12: Patch for dbx 7.6 Debugger
126995-03       04            Sun Studio 12: Patch for Performance Analyzer Tools
127000-04       05            Sun Studio 12: Patch for Fortran 95 8.3 Compiler
124861-08       08            Sun Studio 12: Compiler Common patch for Sun C C++ F77 F95
124863-07       07            Sun Studio 12: Patch for Sun C++ Compiler
124867-02       07            Sun Studio 12: Patch for C 5.9 compiler
124875-02       02            Sun Studio 12: Patch for Debugger GUI 3.0
126495-02       02            Patch for Sun Studio 12 debuginfo handling
126503-01       01            Sun Studio 12: Patch for Sun Distributed Make 7.8
127001-01       01            Sun Studio 12: Patch for Fortran 95 8.3 Dynamic Libraries
127143-03       03            Sun Studio 12: Patch for Fortran 95 8.3 Support Library
127147-01       01            Sun Studio 12: Patch for update notification
127152-01       01            Sun Studio 12: Patch for IDE
127156-01       01            Sun Studio 12: Patch for install utilities

You can get these patches from SunSolve.

Update: Thanks to Cyril (see comments) for pointing out my typo for the rev of 124864. Another case of "typing while tired" :(. Cyril - it's fixed now.
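If you're applying these by hand rather than letting smpatch do it, the usual dance looks roughly like this - the download directory is just an example, and the patch revisions shown are the first couple from the x86 list above:

# cd /var/tmp/patches                 # wherever you unzipped the SunSolve downloads
# unzip 124873-06.zip
# unzip 126996-04.zip
# patchadd -M /var/tmp/patches 124873-06 126996-04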


ZFS

Better late than never - a ZFS bringover-like util

So... it's a little late, but better late than never. Now that I've got my u40m2 re-configured and redone my local source code repositories (not hg repos... yet), I figured it was time to make the other part of what I've mentioned to customers a reality.

The first part is this: bringing over the source from $GATE, wx init, cd $SRC, /usr/bin/make setup on both UltraSPARC and x64 buildboxen, and then zfs snapshot followed by two zfs clone ops so that I get to build on UltraSPARC and x64 buildboxen in the same workspace at the same time. Yes, this is a really ugly workaround for "Should be able to build for sparc and x86 in a single workspace", and while I'm the RE for that bug, it's probably not going to be fixed for a while.

So here's the afore-mentioned "other part": a kinda-sorta replacement for bringover, using ZFS snapshots and clones. Both Bill and DarrenM have mentioned something of this in the past, and you know what - the script I just hacked together is about 3 lines of content, 1 line of #! magic and 16 lines of arg checking. Herewith is the script. No warranties, guarantees or anything. Use at your own risk. It works for me, but your mileage may vary. Suggestions and improvements cheerfully accepted.

#!/bin/ksh
#
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
#
#
# Copyright 2008 Sun Microsystems, Inc. All rights reserved.
# Use is subject to license terms.
#
#
# Version number? if this needs a rev.... there's something
# really, really wrong
#
# we use the following process to snapshot, clone
# and mount an up to date image of $GATE ::
#
# zfs snapshot sink/$GATE@$wsname   {$GATE is onnv-gate|on10-feature-patch|on10-patch}
# zfs clone sink/$GATE@$wsname sink/src/$wsname    ## ignore "failed to mount"
# zfs set mountpoint=/scratch/src/build/{rfes|bugs}/$wsname sink/src/$wsname
#
# arg1 is "b" or "r" - bug or rfe
# arg2 is $GATE - onnv-gate, on10-feature-patch, on10-patch
# arg3 is wsname
#
# first, some sanity checking of the args
#
if [ "$1" != "b" -a "$1" != "r" ]; then
	echo "Is this a bug (b) or an rfe (r) ?"
	exit 1;
fi

if [ "$2" != "onnv-gate" -a "$2" != "on10-feature-patch" -a "$2" != "on10-patch" ]; then
	echo "unknown / invalid gate specified ($2). Please choose one of "
	echo "onnv-gate, on10-feature-patch or on10-patch."
	exit 2;
fi

GATE=$2
BR=
WSNAME=$3

if [ "$1" = "b" ]; then
	BR="bugs"
else
	BR="rfes"
fi

#
# ASSUMPTION1: our $GATE is a dataset under pool "sink"
# ASSUMPTION2: we have another dataset called "sink/src"
# ASSUMPTION3: our user has delegated admin privileges, and can mount
#              a cloned snapshot under /scratch/src/.....
#
zfs snapshot sink/$GATE@$WSNAME
zfs clone sink/$GATE@$WSNAME sink/src/$WSNAME >> /dev/null 2>&1
zfs set mountpoint=/scratch/src/build/$BR/$WSNAME sink/src/$WSNAME

exit 0

Note the ASSUMPTIONx lines - they're specific to my workstation, and you will almost definitely want to change them to suit your system.
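For completeness, here's how I'd kick it off - assuming the script is saved as (say) zfs-bringover somewhere in my PATH; the script name and the workspace name are made up for the example:

$ zfs-bringover b onnv-gate 6707555-fix
$ cd /scratch/src/build/bugs/6707555-fix

and away you go, with a cloned, writable copy of the gate mounted and ready to build in.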


ZFS

Oh, if only I'd had

Back when I got my first real break as a sysadmin, one of my first tasks was to upgrade the Uni's finance office server, a SparcServer 1000. Running Solaris 2.5 with a gaggle of external unipacks and multipacks for Oracle 7.$mumble, I organised an outage with the DBAs and the Finance stakeholders, practiced installing Solaris 2.6 on a new system (we'd just got an E450), and at the appointed time on the Saturday morning I rocked up and got to work on my precisely specified upgrade plan.

That all went swimmingly (though looooooooowly) until the time came to reboot after the final SDS 4.1 mirror had been created. The primary system board decided that it really didn't like me, and promptly died along with the boot prom. PANIC!!

At that point I didn't know all that much about the innards of the SS1000, otherwise I probably would have just engaged in some swaptronics with the other three boards. However, I was green, nervous, and - by that point - very tired of sitting in a cold, loud machine room for 12 hours. I turned the box off, rang the local Sun support office and left a message (we didn't have weekend coverage on any of our systems then), rang my boss and the primary stakeholder in the Finance unit, and went home.

Come Monday morning, all hell broke loose - the Accounts groups were unable to do any work, and the DBAs had to do a very quick enable of the DR system so I could get time to work on the problem with Sun. The "quick enable" took around 4 hours, if I'm remembering it correctly. Fortunately for me, not only were the DBAs quite sympathetic and very quick to help, but Miriam on the support phone number (who later hired me) was able to diagnose the problem and organise a service call to replace the faulty board. She also calmed me down, which I really, really appreciated. (Thankyou Miriam!)

So ... why am I dredging this up? Because I've just done a LiveUpgrade (LU) from Solaris Nevada build 91 to build 93, with ZFS root, and it took me a shade under 90 minutes. Total. Including the post-installation reboot. Not only would I have gone all gooey at the idea of being able to do something like LU back in that job, but if I could have done it with ZFS and not had to reconfigure all the uni- and multi-pack devices, I probably could have had the whole upgrade done in around 4 or 5 hours rather than 12. (Remember, of course, that while the SS1000 could take quite a few cpus, they were still very very very very sloooooooooow.)

Here's a transcript of this evening's upgrade:

# uname -a
SunOS gedanken 5.11 snv_91 i86pc i386 i86xpv

(remove the snv_91 LU packages)
pkgrm SUNWlu... packages from snv_91

(add the snv_93 LU packages)
pkgadd SUNWlu... packages from snv_93

(Create my LU config)
# lucreate -n snv_93 -p rpool
Checking GRUB menu...
Analyzing system configuration.
No name for current boot environment.
INFORMATION: The current boot environment is not named - assigning name .
Current boot environment is named .
Creating initial configuration for primary boot environment .
The device is not a root device for any boot environment; cannot get BE ID.
PBE configuration successful: PBE name PBE Boot Device .
Comparing source boot environment file systems with the file system(s) you specified for the new boot environment.
Determining which file systems should be in the new boot environment.
Updating boot environment description database on all BEs.
Updating system configuration files.
Creating configuration for boot environment .
Source boot environment is .
Creating boot environment .
Cloning file systems from boot environment to create boot environment .
Creating snapshot for on .
Creating clone for on .
Setting canmount=noauto for in zone on .
Saving existing file in top level dataset for BE as //boot/grub/menu.lst.prev.
File propagation successful
Copied GRUB menu from PBE to ABE
No entry for BE in GRUB menu
Population of boot environment successful.
Creation of boot environment successful.

-bash-3.2# zfs list
NAME                             USED  AVAIL  REFER  MOUNTPOINT
rpool                           50.0G   151G    35K  /rpool
rpool/ROOT                      7.06G   151G    18K  legacy
rpool/ROOT/snv_91               7.06G   151G  7.06G  /
rpool/ROOT/snv_91@snv_93        71.5K      -  7.06G  -
rpool/ROOT/snv_93                128K   151G  7.06G  /tmp/.alt.luupdall.2695
rpool/WinXP-Host0-Vol0          3.57G   151G  3.57G  -
rpool/WinXP-Host0-Vol0@install  4.74M      -  3.57G  -
rpool/dump                      4.00G   151G  4.00G  -
rpool/export                    7.47G   151G    19K  /export
rpool/export/home               7.47G   151G  7.47G  /export/home
rpool/gate                      5.86G   151G  5.86G  /opt/gate
rpool/hometools                 2.10G   151G  2.10G  /opt/hometools
rpool/optcsw                     225M   151G   225M  /opt/csw
rpool/optlocal                  1.20G   151G  1.20G  /opt/local
rpool/scratch                   14.4G   151G  14.4G  /scratch
rpool/swap                         4G   155G  64.6M  -

# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
snv_91                     yes      yes    yes       no     -
snv_93                     yes      no     no        yes    -

Golly, that was so easy! Here I was rtfming for the LU with UFS syntax... not needed at all.

# time luupgrade -u -s /media/SOL_11_X86 -n snv_93
No entry for BE in GRUB menu
Copying failsafe kernel from media.
Uncompressing miniroot
Uncompressing miniroot archive (Part2)
13367 blocks
Creating miniroot device
miniroot filesystem is
Mounting miniroot at
Mounting miniroot Part 2 at
Validating the contents of the media .
The media is a standard Solaris media.
The media contains an operating system upgrade image.
The media contains version .
Constructing upgrade profile to use.
Locating the operating system upgrade program.
Checking for existence of previously scheduled Live Upgrade requests.
Creating upgrade profile for BE .
Checking for GRUB menu on ABE .
Saving GRUB menu on ABE .
Checking for x86 boot partition on ABE.
Determining packages to install or upgrade for BE .
Performing the operating system upgrade of the BE .
CAUTION: Interrupting this process may leave the boot environment unstable or unbootable.
Upgrading Solaris: 100% completed
Installation of the packages from this media is complete.
Restoring GRUB menu on ABE .
Adding operating system patches to the BE .
The operating system patch installation is complete.
ABE boot partition backing deleted.
PBE GRUB has no capability information.
PBE GRUB has no versioning information.
ABE GRUB is newer than PBE GRUB. Updating GRUB.
GRUB update was successful.
Configuring failsafe for system.
Failsafe configuration is complete.
INFORMATION: The file on boot environment contains a log of the upgrade operation.
INFORMATION: The file on boot environment contains a log of cleanup operations required.
WARNING: packages failed to install properly on boot environment .
INFORMATION: The file on boot environment contains a list of packages that failed to upgrade or install properly.
INFORMATION: Review the files listed above. Remember that all of the files are located on boot environment . Before you activate boot environment , determine if any additional system maintenance is required or if additional media of the software distribution must be installed.
The Solaris upgrade of the boot environment is partially complete.
Installing failsafe
Failsafe install is complete.

real    83m24.299s
user    13m33.199s
sys     24m8.313s

# zfs list
NAME                             USED  AVAIL  REFER  MOUNTPOINT
rpool                           52.5G   148G  36.5K  /rpool
rpool/ROOT                      9.56G   148G    18K  legacy
rpool/ROOT/snv_91               7.07G   148G  7.06G  /
rpool/ROOT/snv_91@snv_93        18.9M      -  7.06G  -
rpool/ROOT/snv_93               2.49G   148G  5.53G  /tmp/.luupgrade.inf.2862
rpool/WinXP-Host0-Vol0          3.57G   148G  3.57G  -
rpool/WinXP-Host0-Vol0@install  4.74M      -  3.57G  -
rpool/dump                      4.00G   148G  4.00G  -
rpool/export                    7.47G   148G    19K  /export
rpool/export/home               7.47G   148G  7.47G  /export/home
rpool/gate                      5.86G   148G  5.86G  /opt/gate
rpool/hometools                 2.10G   148G  2.10G  /opt/hometools
rpool/optcsw                     225M   148G   225M  /opt/csw
rpool/optlocal                  1.20G   148G  1.20G  /opt/local
rpool/scratch                   14.4G   148G  14.4G  /scratch
rpool/swap                         4G   152G  64.9M  -

-bash-3.2# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
snv_91                     yes      yes    yes       no     -
snv_93                     yes      no     no        yes    -

# luactivate snv_93
System has findroot enabled GRUB
Generating boot-sign, partition and slice information for PBE
Saving existing file in top level dataset for BE as //etc/bootsign.prev.
WARNING: packages failed to install properly on boot environment .
INFORMATION: on boot environment contains a list of packages that failed to upgrade or install properly. Review the file before you reboot the system to determine if any additional system maintenance is required.
Generating boot-sign for ABE
Saving existing file in top level dataset for BE as //etc/bootsign.prev.
Generating partition and slice information for ABE
Copied boot menu from top level dataset.
Generating direct boot menu entries for PBE.
Generating xVM menu entries for PBE.
Generating direct boot menu entries for ABE.
Generating xVM menu entries for ABE.
Disabling splashimage
Re-enabling splashimage
No more bootadm entries. Deletion of bootadm entries is complete.
Changing GRUB menu default setting to
Done eliding bootadm entries.

**********************************************************************
The target boot environment has been activated. It will be used when you reboot. NOTE: You MUST NOT USE the reboot, halt, or uadmin commands. You MUST USE either the init or the shutdown command when you reboot. If you do not use either init or shutdown, the system will not boot using the target BE.
**********************************************************************
In case of a failure while booting to the target BE, the following process needs to be followed to fallback to the currently working boot environment:
1. Boot from Solaris failsafe or boot in single user mode from the Solaris Install CD or Network.
2. Mount the Parent boot environment root slice to some directory (like /mnt). You can use the following command to mount: mount -Fzfs /dev/dsk/c1t0d0s0 /mnt
3. Run utility with out any arguments from the Parent boot environment root slice, as shown below: /mnt/sbin/luactivate
4. luactivate, activates the previous working boot environment and indicates the result.
5. Exit Single User mode and reboot the machine.
**********************************************************************

Modifying boot archive service
Propagating findroot GRUB for menu conversion.
File propagation successful
File propagation successful
File propagation successful
File propagation successful
Deleting stale GRUB loader from all BEs.
File deletion successful
File deletion successful
File deletion successful
Activation of boot environment successful.

# date
Friday, 4 July 2008 9:45:41 PM EST
# init 6
propagating updated GRUB menu
Saving existing file in top level dataset for BE as //boot/grub/menu.lst.prev.
File propagation successful
File propagation successful
File propagation successful
File propagation successful

Here I reboot and then login.

# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
snv_91                     yes      no     no        yes    -
snv_93                     yes      yes    yes       no     -

# lufslist -n snv_91
               boot environment name: snv_91
Filesystem               fstype   device size  Mounted on  Mount Options
-----------------------  -------  -----------  ----------  -------------
/dev/zvol/dsk/rpool/swap swap     4294967296   -           -
rpool/ROOT/snv_91        zfs      20630528     /           -

# lufslist -n snv_93
               boot environment name: snv_93
               This boot environment is currently active.
               This boot environment will be active on next system boot.
Filesystem               fstype   device size  Mounted on  Mount Options
-----------------------  -------  -----------  ----------  -------------
/dev/zvol/dsk/rpool/swap swap     4294967296   -           -
rpool/ROOT/snv_93        zfs      10342821376  /           -

Cor! That was so easy I think I need to fall off my chair.

Thinking about this for a moment, I needed just 6 commands and around 90 minutes to upgrade my laptop. If only I'd had this technology available to me back then.

Finally, let me send a massive, massive thankyou to the install team and the ZFS team for all their hard work to get these technologies integrated and working pretty darned smoothly together.
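Stripped of all the output, the "just 6 commands" above were (the pkgrm/pkgadd lines are abbreviated exactly as in the transcript - you feed them the SUNWlu... packages from the old and new build respectively):

# pkgrm  (the SUNWlu... packages from snv_91)
# pkgadd (the SUNWlu... packages from snv_93)
# lucreate -n snv_93 -p rpool
# luupgrade -u -s /media/SOL_11_X86 -n snv_93
# luactivate snv_93
# init 6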


amd64

Trap for the unwary

I did a bios upgrade on my laptop the other day - from A05 to A08. Thought nothing of it until I re-installed the beast with build 91 to get some ZFS root goodness. (Note that currently you have to use the text-mode installer to do this.)

xVM told me, none too politely, that it couldn't find any virtualization capabilities in my cpus, so it wasn't going to be my friend any more. I logged 6714698 snv_91 xVM spurious failure on VT-enabled hardware and provided what I thought was enough info (prtpicl -v and prtconf -v output). Turns out I should have also provided the output from xm info and xm dmesg. When I did, I noticed these lines:

xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p

and

(xVM) Processor #0 6:15 APIC version 20
(xVM) Processor #1 6:15 APIC version 20
(xVM) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
(xVM) Enabling APIC mode: Flat. Using 1 I/O APICs
(xVM) Using scheduler: SMP Credit Scheduler (credit)
(xVM) Detected 2194.558 MHz processor.
(xVM) VMX disabled by Feature Control MSR.
(xVM) CPU0: Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz stepping 0b
(xVM) Booting processor 1/1 eip 90000
(xVM) VMX disabled by Feature Control MSR.
(xVM) CPU1: Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz stepping 0b
(xVM) Total of 2 processors activated.

What the...?

A quick jump into the bios revealed that there was a new option - Virtualization support. It was, of course, turned off by default. Turning it on and booting the xVM kernel showed me some much nicer output from those commands:

xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64

and

(xVM) Processor #0 6:15 APIC version 20
(xVM) Processor #1 6:15 APIC version 20
(xVM) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
(xVM) Enabling APIC mode: Flat. Using 1 I/O APICs
(xVM) Using scheduler: SMP Credit Scheduler (credit)
(xVM) Detected 2194.555 MHz processor.
(xVM) HVM: VMX enabled
(xVM) VMX: MSR intercept bitmap enabled
(xVM) CPU0: Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz stepping 0b
(xVM) Booting processor 1/1 eip 90000
(xVM) CPU1: Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz stepping 0b
(xVM) Total of 2 processors activated.

Now as soon as I get a spare cycle or three, I can go and see about building an S10 domU for backport builds. That'll be fun!
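If you hit the same thing, a quick way to see what the hypervisor thinks of your cpus is just to grep the output quoted above (the grep patterns are my own convenience, nothing official):

# xm info | grep xen_caps      # hvm-... entries mean VT/HVM support is visible
# xm dmesg | grep -i vmx       # look for "VMX disabled by Feature Control MSR"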


OpenSolaris

Got me a new laptop

Got me a new laptop two weeks ago - spiffy new Dell XPS M1530, dual core Intel T7500 cpu, 4gb ram, 320Gb sata disk, the ultrabright 1680x1050 screen, Intel 4965abg wireless, builtin webcam. Very nice.

Except that the builtin wired nic is a Marvell Yukon FE+. Not supported by skge, or yukonx from Marvell, and while there's a patch for FreeBSD, it hasn't been ported or integrated into the myk driver that Masa Murayama wrote.

I logged 6660771 need GLDv3 driver support for Marvell Yukon FE+ in Solaris but it's not resolved yet.

Note for the unwary: when I tried the skge and yukonx drivers, I got system panics:

update_drv -v -a -i ' "pci11ab,22e" ' [skge|yukonx]

which results in a message like this:

ERROR: yukonx0: SkGeHwInnit: Currently not supported!

So being the Bright, Resourceful, Usually Correct and Exact person that I am, I emailed Masa directly asking for help.

A number of myk test iterations later and I've now got a working myk driver. Not totally sure when he's going to post the updated version to his website, but the version I've found success with is 2.6.0t9 - it's still missing a few things but it seems to be able to give me 11.mumble Mbyte/sec over my 100Mbit/sec switch to blinder (u40m2) - pretty good indeed.

I also needed to install the Opensound drivers but once PSARC/2008/043 is integrated I don't think that'll be necessary.

Now I can go off to the Sun TechDays conference next week with all the bits working together.

Thankyou Masa - you're a champ!
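(Purely for illustration, this is the general shape of binding a PCI id to a driver and checking the result. The pci11ab,22e id is the one quoted above; whether it belongs in myk's alias list is an assumption on my part, not a recommendation - see the caveat about panics with skge/yukonx.)

# Illustrative only: add a PCI id alias to an installed driver, then check.
update_drv -v -a -i '"pci11ab,22e"' myk   # register the alias against the driver
grep myk /etc/driver_aliases              # confirm the alias was recorded
prtconf -D | grep myk                     # see whether the driver attached to anything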


Coding

I'm annoyed at SES2 section 6.1.13.3

One of the things I'm working on at the moment is a firmware flashing utility. We've got an existing one in Solaris, called fwflash(1m), and one thing that PSARC made very clear is that They don't want a proliferation of firmware flashing utilities inside Solaris. So I'm working on making fwflash(1m) pluggable. There's a good deal of work required to make this succeed, mostly in the implementation of a plugin interface, and a specific plugin for the area that has a requirement I need to solve.

That requirement pretty much mandates the use of SCSI Enclosure Services-2 (SES2), which is all good and well except when we get to section 6.1.13.3, which deals with the Additional Element Status descriptor protocol-specific information for SAS. I'm particularly annoyed at sections 6.1.13.3.3 (SAS Expanders) and 6.1.13.3.4 (SCSI Initiator Port, SCSI Target Port, Enclosure Services Controller Electronics).

The problem is that - as far as I can see, after about a week's worth of serious and detailed investigation - these sections overlap in how they deliver a data payload to you. So figuring out whether you've got a SAS expander, or one of a SCSI Initiator Port, SCSI Target Port or Enclosure Services Controller Electronics, is actually incredibly difficult.

I could punt and look at the size of the data payload, except that there'll be cases where Expanders vs (the rest) will coincide in terms of payload size. Or I could assume that everything I see there is an Expander - which would be wrong. Or I could do a massive amount of extra engineering in order to approximate what is probably the answer. Or I could use a lookup table to match against the devices which I really want and need to get access to. Right now, the lookup table is winning - a fact about which I am *not* happy.

So, what used to be elegant code in my first prototype is now quite ugly. I'm not happy about it; this vagueness in SES2 has kicked my schedule around and has caused sleepless nights while trying to figure out a way forward.

The SCSI family of standards are normally very well defined, very clear, and precise. I'm not impressed with SES2, that's for sure.

Technorati tags: SCSI, SCSI Enclosure Services, SES2, Solaris, OpenSolaris, fwflash, firmware, flashing, PSARC


amd64

Bios bugs annoy the heck out of me

In the last few days I've been kinda-sorta prevented from successfully LiveUpgrading due to a freakin' annoying bug in my Ultra20-M2 system bios:

6636511 u20m2 bios version 1.45.1 still can't distinguish disks on the same sata channel

(It's in a closed prod/cat/subcat, sorry.)

The gist of the bug is that I've got two identical Seagate 320Gb disks (ST3320620AS, 320072933376 bytes) in my system, providing /, /zroot (for my zones, it's ufs), and sink - my zpool. No matter which two SATA ports I plug those two disks into, Shidokht's /sbin/biosdev util cannot do anything but report either no disks found, or (if run with -d) that the matchcount for the devices is greater than 1. This means that /usr/lib/lu/lumkboot, which is called as part of lucreate and friends, cannot do the needful. Hence LU fails.

Yesterday I finally cracked and went off to purchase two new 320Gb disks (one Western Digital, the other a Samsung) in order to see how deep the bug goes. This became particularly important after JanD attempted to reproduce

6628268 u20 and u20m2 + snv_75a with non-global zones refuses to allow LU (lucreate)

with an u20m2 and two identical Hitachi 250Gb disks. He wasn't able to, despite having the same model disk, with the same firmware version in each slot.

At the moment my box is having a grand old time, 1hr10 into a zpool replace:

farnarkle:jmcp $ zpool status sink
  pool: sink
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 67.52% done, 0h28m to go
config:

        NAME               STATE     READ WRITE CKSUM
        sink               DEGRADED     0     0     0
          mirror           DEGRADED     0     0     0
            c2t0d0s7       ONLINE       0     0     0
            replacing      DEGRADED     0     0     0
              c3t0d0s7/old FAULTED      0     0     0  corrupted data
              c3t0d0s7     ONLINE       0     0     0

errors: No known data errors

To get to the point where zpool could replace the device, I made sure the slices on the new disk were in order, then ran zpool replace sink c3t0d0s7. That's it - it's really nifty.

I've got one more thing to try (swapping the cables around for c3t0 and c3t1), which I think I'll have a go at in about 40 minutes. Whatever the results of that test, it's not looking good for the bios when it's got Seagate-branded disks attached.
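(For completeness, a minimal sketch of the replacement procedure. The device names come from the zpool status output above, and copying the label with prtvtoc/fmthard is my assumption about what "made sure the slices on the new disk were in order" involved.)

# Clone the partition table from the surviving side of the mirror, then
# let ZFS resilver onto the new disk.
prtvtoc /dev/rdsk/c2t0d0s2 | fmthard -s - /dev/rdsk/c3t0d0s2
zpool replace sink c3t0d0s7
zpool status sink      # watch the resilver progress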


Conferences

Beijing Sun Tech Day 2007 wrap up

Wow ... what a day! I hung out with my friends (including Robs) at the HCTS - Hardware Certification Test Suite booth for most of the day. We were showing off the latest prototype demo of the Slim Installer LiveCD, which provided a very nice segue to the OpenSolaris Device Detection Tool. I lost track of how many people I talked to and handed out August 2007 OpenSolaris starter kits to, but it was definitely a steady stream of people.

Just after lunch we managed to trip a surge protector, so the se3320 and t2k sitting in front of me were silent. Then a bunch of students from Beijing IT Institute rocked up and wanted to know all about the t2k. So I showed them the fan trays, swung the unit around and started talking about LOM and the various interfaces.... then figured what the heck, might as well yank the cover off entirely and show them the insides.

So a good 45 minutes later, after many many questions and much discussion about the benefits of multi-core computing and OpenSolaris, they invited me to visit their campus! Unfortunately I can't take them up on the offer since I'm flying home today (gee it's late as I type this!) but my Beijing colleagues will definitely visit them.

The last session of the day for us was a demo of a device driver writing utility which our team is working on as a NetBeans / Sun Studio plugin. I think it went quite well, though I did get the impression that a lot of the attendees didn't really understand what a device driver was!

Other highlights of the day were meeting Josh Berkus of PostgreSQL fame, and (finally!) meeting Jim Grisanzio in the flesh, albeit briefly.

A few photos from today (captions only):
Crowds at registration
More crowds at registration
Queueing near the PostgreSQL booth
Steve talking about xVM
Josh Berkus
Fiona talking about HCTS and HCTLive
Josh Berkus
Ryan about to start the device driver demo session
Hands on experience in the demo
Example template code from the demo
Ada explaining during the demo
Kevin explaining during the demo
Javen explaining during the demo

Technorati tags: Sun Microsystems, SCERI, Sun Tech Days, HCTS, Device Detection Tool, Device Drivers, Solaris, OpenSolaris, UltraSPARC, CoolThreads, OpenSPARC, Caiman, Project Indiana


Solaris

Today is a very good day

Today I'm ecstatic to be able to announce that the S10 patches for our backport are finally available on sunsolve.sun.com. We've delivered PSARC 2006/703 MPxIO extension for Serial Attached SCSI, and (my personal favourite) PSARC 2007/046 stmsboot(1M) extension for mpt(7D).

The patches that you need to install are

sparc: 125081-10
(We recommend that on sparc you also install 127747-01 as well, due to 6466248)

and

x86/x64: 125082-10

The full list of rfes and bugs is as follows:

6443044 add mpxio support to SAS mpt driver
6502231 stmsboot needs to support SAS devices
6544226 mpt needs mdb module
6242789 primary path comes up as standby instead online even if auto-failback is enabled
6442215 mpt.conf maybe overwritten because filetype within SUNWckr package is 'f'
6449836 stmsboot -d failed to boot if several LUNs or targets map to same partition
6510425 properties "flow_control" and "queue" in mpt.conf are useless
6525558 untagged command unlikely to be sent to HBA during heavy I/O
6541750 CAM5.1.1b2: 2530, MPT2: Vdbench bailed out after I pull ctlr-A out
6545198 build should allow architecture-dependent class action scripts
6546164 stmsboot does not remove sun4u SMF service, erroneously lists parallel SCSI HBAs
6548867 mpxio-upgrade script has fatally mis-defined variable
6550585 mpt driver has a memory leak in mpt_send_tur
6550591 mpt should not print unnecessary messages
6550849 WARNING: mpt TEST_UNIT_READY failure
6554029 mpt should get maxdevice from portfacts, not IOCfacts
6554556 stmsboot's privilege message is not quite correct
6556832 after ctlr brought online, some paths failed to come back
6560371 mpt hangs during ST2530 firmware upgrade
6566097 mpt: sd targets under mpt are not power-manageable
6566815 changes for 6502231 broke g11n in stmsboot
6531069 SCSI2 (tc_mhioctkown test cases) testing are showing UNRESOLVED results for ST2530
6546465 mpt: kernel panic due to NULL pointer reference in an error code path
6556852 mpt needs to support Sun Fire x4540 platform
6588204 mpt_check_scsi_io_error() incorrectly tests IOCStatus register
6588278 mpt driver doesn't check GUID of LUN when the path online
6591973 panic in mdi_pi_free() when remapping devices
6613189 T125082-09 and T125081-09 don't work - missing misc/scsi module from deliverables

As an interesting side note, during the development process we stumbled across

6566270 Seagate Savvio 10k1 disks do not enumerate under scsi_vhci

You'll probably see this if you have a Galaxy or T2000/T1000 system. (Unfortunately you need a service contract to view the bug report due to its category).

And on a personal note, I'd like to thank the other members of our team for working so well together - with Greg in Melbourne, Javen and Dolpher up in Beijing, test teams in Beijing, Menlo Park, Broomfield and San Diego, and yours truly in Sydney (and now Brisbane) - we have truly been a virtual team. I reckon we've demonstrated that physical distance does not get in the way of designing, developing, testing and (most importantly) delivering good software that provides solutions for our customers.

Technorati tags: MPxIO, Solaris, stmsboot, PSARC, mpt(7d), Serial Attached SCSI, backport, Solaris 10, S10, patch, virtual team, software engineering
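(A minimal sketch of pulling the bits onto a sparc box. The patch ids are the ones listed above; the exact stmsboot invocation you want will depend on your configuration, so treat this as an outline rather than a procedure.)

patchadd 125081-10     # the backport patch (125082-10 on x86/x64)
patchadd 127747-01     # recommended companion patch on sparc (see 6466248)
stmsboot -e            # enable MPxIO; it will tell you about the reboot it needs
stmsboot -L            # after the reboot, list non-STMS to STMS device name mappings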


General

For everything there is a season

This morning I had a phone call from my senior manager and HR. Unfortunately, I haven't managed to avoid the RIF axe.

I'm annoyed, angry, disappointed, saddened, disheartened and yet surprisingly upbeat.

Why? Well, since the rumours started flying a few weeks ago that August 3rd was D-Day, I've had a lot of sleepless nights worrying about what might happen, and what I could do if I got retrenched. I managed to do a lot (a heckuvalot) of introspection, and realised that for all the hard work I've put into this company over the past ~7 years, I have actually gained more than I realised.

I've worked my way up from frontline phone support, through backline kernel/storage, to CPRE and PTS, where I dealt directly with engineering contacts within Sun, within Oracle and within Veritas.... I also got my appetite for coding whetted by working on bugfixes for some of the scsi hba drivers, discovered the joy of kernel crash dump analysis, tape diagnostics and how to read and interpret the FC and SCSI standards. From there, last year, I made the leap to NWS... DMG... no, Storage! software development, where until today I was working on the protocol layer for what we fondly called "leadville", which is available under CDDL on cvs.opensolaris.org.

I've had a ball. I've experienced Sun's software design and engineering processes and principles first hand. I've worked with people across the entire planet (yes, I do mean that), built up professional relationships with some of the smartest people in the company (and also the planet), and revelled in having (for me) the absolutely coolest email address ever.

So while I'm cooling my heels and working out what I do next, I'll still be active in the OpenSolaris community. I'll be reading up on all those things I was putting off for my Christmas/New Year holiday, cutting a bit of code in ON and working on porting a few drivers. Once that cool-off time is up I expect I'll be doing contract sysadmin work in Sydney - there is, after all, a lot of it about right now.... but I'd much rather be developing kernel code for Solaris!

So to all the people who've been online with me in irc channels internal and external (especially in the last week when I've been stressing out), to all the people who I've worked with over the last ~7 years, I'd just like to say thankyou. If I play my cards right and the opportunities arise then one day I hope to be back @Sun.COM. Until then....

[In a few days I'll be re-hosting this blog on my own server. When I do, I'll update this site]


ZFS

A few tidbits re ZFS Root

I was lurking in #opensolaris @ irc.freenode.net when drdoug007 mentioned he'd been blogging about an intense Solaris minimisation project. I had a look... he's managed to get the entire image to around 42Mb uncompressed.

I then noticed he had a few entries re ZFS Root, and I remembered that there were three things I've discovered over the past week courtesy of my adventures with the ZFS delete queue that I really should mention.

(1) When you boot single-user off media, you need to ensure that your boot archive contains a version of the zfs module which has the same on-disk filesystem format. In the process of repeated bfus and update_nonONs I'd also upgraded my ondisk format from ZFS Version 1 to ZFS Version 3. The older module doesn't like the newer format. I yelled a bit when I realised this (it was a Friday evening.....).

Lesson: always make sure that your media is up to date with your on-disk format!

(2) The next thing I realised was that if you zfs export your root pool before you reboot (so that you can boot from the allegedly-fixed ondisk boot archive) then you'll see a panic on the next boot because your pool isn't imported and therefore the boot archive won't know where to look for the rest of the OS. That's a bit of an annoyance, to say the least!

Lesson: DON'T EXPORT YOUR ROOT POOL!!!

(3) I had to boot off media a few times because I stuffed up my boot archive. I found that in that scenario there was a gap in the bootadm logic which meant that effectively zero-length archives were created since lofiadm and zfs wouldn't play nice with each other.

Lesson: when booting to single-user, mount -o remount,rw / ; cp {root_pool/root_filesystem}/usr/bin/mkisofs /usr/bin

Technorati Tags: Solaris, OpenSolaris, ZFS
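(And a minimal sketch of the checks behind lessons (1) and (3); which pool and BE names apply will obviously depend on your system.)

zpool upgrade               # show the on-disk format version of each imported pool
zpool upgrade -v            # show the versions the running zfs bits understand
bootadm update-archive -v   # rebuild the boot archive and report what changed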


Solaris

nge, ultra20, nevada .... no packets!

One problem that I've been having with my bleeding-edge commitment has been with the nge driver. I noticed that after my bfu gave me nge version 1.4, I couldn't get any packet responses when I pinged. With v1.3 I could, so I logged a bug against the nge driver.

That was all well and good, except that I have absolutely *no* idea what to look for when debugging network issues. If snoop can't give me a clue, I'm stuffed.

Yesterday I bfu'd to the 2nd July nightly bits (which contained nge v1.6), rebooted, saw the same lack of packet response and did my nge shuffle. Update the boot archive, reboot, kaboom! Turns out there were some putbacks for GLD v3 which nge v1.3 isn't compatible with. Slight problem for me then, because I couldn't get my network..... it was more of a notwork. Not good.

With the aid of Murayama's nfo driver (yay for usb storage!) I was able to determine that the problem wasn't actually with the nic driver, but either above that in the stack, or below it, in the hardware.

I have Brendan Gregg's DTrace Toolkit installed, so I ran dtruss on a ping to my gateway. That showed me that everything seemed to be working ok from the above-the-nic part of the issue. So that left the hardware itself.

Since I knew that nge v1.3 worked just fine, I was left to poke around in the hardware..... and the only thing I could find was in the bios for this box. It turns out that there's a setting in one of the Advanced Settings pages, called MAC Media Interface. Somewhere in my futzing around (I can't help myself, you know how it is), I'd set that particular item to "MII". That's the wrong thing. I actually needed that set to "RGMII", which stands for "Reduced Gigabit Media Independent Interface". Once I'd done that, all my network stuff came good.

That's one setting I won't be playing with again!

Technorati Tags: Solaris, OpenSolaris, ethernet
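(For anyone in the same boat, these are the sort of quick checks I was leaning on; the instance name is an assumption based on my box.)

modinfo | grep nge     # which nge revision the kernel actually loaded
ifconfig nge0          # interface state and address
snoop -d nge0 icmp     # watch whether ICMP requests and replies hit the wire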


ZFS

I've got my space back

When I got my Ultra 20 I decided to continue living on the bleeding edge and make a contribution to quality in Solaris Express. That, of course, means ZFS Root. So I followed the instructions on Tabriz' and Tim Foster's blogs, and quickly wound up with a zfs-rooted system.

All well and good until I noticed that a lot of my disk space had disappeared. It turns out that I was suffering from two issues:

6420204 root filesystem's delete queue is not running

and

6436526 delete_queue thread reporting drained when it may not be true

So the first workaround was to re-set the "readonly" flag to "off" and see how well it worked. Turns out that that wasn't quite enough, I only got about 2gb back. Then a little while later Mark Shellenbaum putback his fix for 6436526, so I bfu'd to that.

One of the things that --- at this point --- you have to remember to do with ZFS Root is to copy your new boot archive from the ZFS Root environment to your ufs boot partition. It took me two reboots to remember that :(

So once I'd got the correct module installed and booted, I had about 7 hours of downtime as a lot of my delete queue got flushed. I had to reboot several times because my disk decided it couldn't cope with the load, and also because the delete queue managed to trip over some assertions in the vm code. (Some pathetic reason like ASSERT(proc_pageout != NULL);.....)

And that was all cool. BUT there was more to come. While chatting with Mark in the ZFS irc room he mentioned that Tabriz had a fix for 6420204 which involved adding a line to /lib/svc/method/fs-usr:

[ "$fstype" = zfs ] && mntopts="${mntopts},rw"

So I did that, and rebooted..... lo and behold, there was more stuff in the delete queue which needed to be taken care of. After another 10 hours and 8 panics, my ZFS Root partition is finally back to using the expected 7% rather than 70+%.

Life is good and all seems good with the world.

Technorati Tags: ZFS, Solaris, OpenSolaris
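(The quick way to see whether the space has actually come back; the dataset names on your system will obviously differ from mine.)

df -h /                        # overall capacity before and after the delete queue drains
zfs list -o name,used,avail    # the per-dataset view of the same thing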


Coding

On ::findleaks -- wot is it?

A few days ago I was asked by a partner engineer:

When we were looking at a core file we ran ::findleaks and it came back with some buffers from our driver. However, the buffers are from our buffer pool. They are allocated at driver attach and will be freed at driver detach. The core file was from a Leadville stack panic during device discovery at boot, so I expected the driver to be fully present. The ::findleaks is identifying 5 buffers. We actually have allocated and pooled 512 of these buffers. I am wondering why ::findleaks considers these buffers as "leaked". Could you help me understand what this means?

I'm glad you asked. To start with, the ::findleaks dcmd (manual page entry) has a ::help entry within mdb:

> ::help findleaks

NAME
  findleaks - search for potential kernel memory leaks

SYNOPSIS
  [ addr ] ::findleaks [-dfv]

DESCRIPTION
  Does a conservative garbage collection of the heap in order to find
  potentially leaked buffers. Similar leaks are coalesced by stack trace,
  with the oldest leak picked as representative. The leak table is cached
  between invocations.

  addr, if provided, should be a function or PC location. Reported leaks
  will then be limited to those with that function or PC in their stack trace.

  The 'leak' and 'leakbuf' walkers can be used to retrieve coalesced leaks.

OPTIONS
  -d    detail each representative leak (long)
  -f    throw away cached state, and do a full run
  -v    report verbose information about the findleaks run

ATTRIBUTES
  Target: kvm
  Module: genunix
  Interface Stability: Unstable

You can follow along with the findleaks code via http://cvs.opensolaris.org/source/xref/on/usr/src/cmd/mdb/common/modules/genunix/leaky.c but be warned --- the code is more than just a tad tricky to follow.

Ok, as to how you really make use of ::findleaks, have a look through this transcript:

$ mdb -k 12
mdb: warning: dump is from SunOS 5.11 onnv-gate:2006-01-12; dcmds and macros may not match kernel implementation
Loading modules: [ unix krtld genunix specfs dtrace uppc pcplusmp ufs ip sctp usba s1394 fctl nca lofs random nfs fcip cpc ipc ptm sppp ]

> ::status
debugging crash dump vmcore.12 (64-bit) from doppio
operating system: 5.11 onnv-gate:2006-01-12 (i86pc)
panic message:
BAD TRAP: type=e (#pf Page fault) rp=fffffffffbc58a90 addr=a0 occurred in module "" due to a NULL pointer dereference
dump content: kernel pages only

> ::msgbuf
### snippage ###
kernel memory allocator:
invalid free: buffer not in cache
buffer=ffffffff87a8cda0  bufctl=ffffffff879d70d8  cache: streams_mblk
previous transaction on buffer ffffffff87a8cda0:
thread=ffffffff94795c00  time=T-15569.454683674  slab=ffffffff879b2780  cache: streams_mblk
kmem_cache_free_debug+148
kmem_cache_free+51
dblk_destructor+73
kmem_cache_free+1c0
dblk_lastfree+7e
freeb+b0
struiocopyout+95
kstrgetmsg+973
sotpi_recvmsg+20d
socktpi_read+9c
fop_read+29
read+2a4

panic[cpu0]/thread=fffffe800000bc80:
kernel heap corruption detected

fffffe800000baa0 genunix:kmem_error+4ab ()
fffffe800000baf0 genunix:kmem_slab_free+dd ()
fffffe800000bb50 genunix:kmem_magazine_destroy+127 ()
fffffe800000bb90 genunix:kmem_depot_ws_reap+9d ()
fffffe800000bbc0 genunix:kmem_cache_reap+35 ()
fffffe800000bc60 genunix:taskq_thread+200 ()
fffffe800000bc70 unix:thread_start+8 ()
### snippage ###

> ::findleaks
BYTES  LEAKED  VMEM_SEG          CALLER
16729  1       ffffffff848bf528  AcpiOsTableOverride+0x15f
------------------------------------------------------------------------
Total 1 kmem_oversize leak, 16729 bytes

CACHE             LEAKED  BUFCTL            CALLER
ffffffff82a1d008  55      fffffed4cc4c89e0  allocb+0x65
ffffffff82a1d008  128     ffffffff879ecab0  allocb+0x65
ffffffff80040008  128     ffffffff879e6578  dblk_constructor+0x57
ffffffff80040008  55      ffffffff935f9040  dblk_constructor+0x57
ffffffff8003a748  1       ffffffff93a5de80  devi_attach+0x94
ffffffff8002f748  1       ffffffff89c92678  kobj_alloc+0x88
ffffffff8002e008  2       ffffffff8a188ce8  kobj_alloc+0x88
ffffffff80039008  1       ffffffff8970c4c0  kobj_alloc+0x88
ffffffff84896748  116     ffffffff983da998  rootnex_dma_allochdl+0x5a
ffffffff84896748  128     ffffffff879ec9d8  rootnex_dma_allochdl+0x5a
------------------------------------------------------------------------
Total 615 buffers, 997408 bytes

> ffffffff879e6578::bufctl -v
ADDR             BUFADDR          TIMESTAMP    THREAD           CACHE            LASTLOG          CONTENTS
ffffffff879e6578 ffffffff87a8a4e0 b7d45a53f    ffffffff82aefe80 ffffffff80040008 ffffffff808dc340 0
  kmem_cache_alloc_debug+0x2a6
  kmem_cache_alloc+0x1fc
  dblk_constructor+0x57
  kmem_cache_alloc+0x237
  allocb+0x65
  FillTxDescriptor+0x2b
  FillTxRing+0x2e
  0xfffffffff5136697
  SkGeAttachReq+0x116
  SkGeProto+0x8d
  SkGeWsrv+0x26
  runservice+0x62
  queue_service+0x5b
  stream_runservice+0x10a
  strput+0x1f3

So what we do is take the bufctl address for each of the leaked bufs, and run it through the ::bufctl dcmd with the "-v" option, which allows us to see the stack trace of the thread which allocated that particular buf. From that we can see which function (and where in it) leaked the memory.

In the msgbuf for a panic like the one above you see the address of the suspect bufctl, so you can do this:

> ffffffff879d70d8::bufctl -v
ADDR             BUFADDR          TIMESTAMP    THREAD           CACHE            LASTLOG          CONTENTS
ffffffff879d70d8 ffffffff87a8cda0 6c5b3b526b33 ffffffff94795c00 ffffffff80040008 ffffffff80b770c0 ffffffff820514c8
  kmem_cache_free_debug+0x148
  kmem_cache_free+0x51
  dblk_destructor+0x73
  kmem_cache_free+0x1c0
  dblk_lastfree+0x7e
  freeb+0xb0
  struiocopyout+0x95
  kstrgetmsg+0x973
  sotpi_recvmsg+0x20d
  socktpi_read+0x9c
  fop_read+0x29
  read+0x2a4

Note that we have two thread addresses (right hand column):

> ffffffff820514c8::thread
ADDR             STATE      FLG  PFLG SFLG PRI    EPRI   PIL INTR
ffffffff820514c8 inval/badd cafe badd cafe -13570 -17699 0   ffffffff91af94e0

> ffffffff94795c00::thread
ADDR             STATE      FLG  PFLG SFLG PRI    EPRI   PIL INTR
ffffffff94795c00 run        1002 104  3    59     0      0   n/a

Ok, we know that we're going through streams code, and that the panic stack went through dblk_destructor -- therefore it's reasonable to assume that we should look at the two leaks from dblk_constructor as a first point:

> ffffffff879e6578::bufctl -v
ADDR             BUFADDR          TIMESTAMP    THREAD           CACHE            LASTLOG          CONTENTS
ffffffff879e6578 ffffffff87a8a4e0 b7d45a53f    ffffffff82aefe80 ffffffff80040008 ffffffff808dc340 0
  kmem_cache_alloc_debug+0x2a6
  kmem_cache_alloc+0x1fc
  dblk_constructor+0x57
  kmem_cache_alloc+0x237
  allocb+0x65
  FillTxDescriptor+0x2b
  FillTxRing+0x2e
  0xfffffffff5136697
  SkGeAttachReq+0x116
  SkGeProto+0x8d
  SkGeWsrv+0x26
  runservice+0x62
  queue_service+0x5b
  stream_runservice+0x10a
  strput+0x1f3

> ffffffff935f9040::bufctl -v
ADDR             BUFADDR          TIMESTAMP    THREAD           CACHE            LASTLOG          CONTENTS
ffffffff935f9040 ffffffff934fe7c0 7a840278b8ac fffffe80000b3c80 ffffffff80040008 ffffffff805b7e00 ffffffff8268c6e8
  kmem_cache_alloc_debug+0x2a6
  kmem_cache_alloc+0x2ac
  dblk_constructor+0x57
  kmem_cache_alloc+0x237
  allocb+0x65
  FillRxRing+0x46
  ReceiveIrq+0xad3
  SkGeIsrOnePort+0xa5
  av_dispatch_autovect+0x97

Which for me means logging a call with SysKonnect.de since SkGe is their code.

You can find out more about ::findleaks by reading Jonathan Adams' blog entry on the implementation, and reading the manpage for umem_debug --- you can use ::findleaks on userland stuff too, which is a real boon.

I hope the above - although rambling - gives you enough to get started with on the magic of ::findleaks.

Technorati Tags: mdb, Solaris, OpenSolaris
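(One extra pointer: the userland recipe looks roughly like this, per the umem_debug(3MALLOC) manpage; "./myapp" is a hypothetical program name, and this is a sketch rather than a polished procedure.)

# Run the app with libumem's debugging allocator, grab a core, then let
# ::findleaks do its conservative garbage collection over the user heap.
LD_PRELOAD=libumem.so UMEM_DEBUG=default ./myapp &
gcore $!                          # produces core.<pid> of the running process
echo ::findleaks | mdb core.$!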


OpenSolaris

OpenSolaris... 1 year on

It's been a pretty wild year since Sun opened the lid on OpenSolaris. Well, I reckon it's been wild, at any rate :-)

As of a few minutes ago we've had the 100th community-member putback to the codebase completed. We've seen the opening of the vast majority of the ON gate (both kernel and userspace), the NWS gate (my favourite), the JDS consolidation .... lots!

There's a thriving irc channel (#opensolaris @ irc.freenode.net) which has been a fantastic support forum as well as being a meeting place, and we've got projects and sub-communities from A to Z.

I've noticed a real difference in the way that discussions (whether architectural or otherwise) now happen inside Sun, and it's a fantastic thing.... kinda like a caterpillar turning into a butterfly, if you will. There's been a visible crossover from discussions held in public, with the odd flamefest or two included.

Ok, now for the self-promotion part! I've used my blog to document a few procedures, provide some info on currently-supported Emulex FC hbas and QLogic FC hbas, and to provide a link for my presentation Getting to know the SAN Stack.

And ok, sure, I blogged about other stuff too since I set up my blog early last year. But my point is that the opening of Solaris to the world has allowed Sun and non-Sun people to expand everybody's knowledge about what goes on under the hood, and it's coincided with a push from within to engage people outside Sun who might otherwise never have noticed. Even if you're not going to run Solaris or OpenSolaris, by being able to engage with Sun's staff on --- effectively --- a one-to-one basis, you have the opportunity to get new ideas. And so do we.

You win, I win, we all win.

I'm really glad to be a part of this community, both now and in the future.

Technorati tags: Solaris OpenSolaris Sun


General

On FMA - it really does work..... even on my Ultra 20

A few weeks ago I took delivery of a brand, spanking new Sun Ultra 20 workstation, purchased for me by my department. I installed build 38 of Solaris Express and promptly BFU'd to whatever the relevant nightly build was at the time. I do like to live on the edge :-)

Anyway, I noticed after a while (we're talking hours here, not weeks) that there was only ever one process showing up as on cpu when I ran prstat. Curious, I ran psrinfo -v, which told me that core 0 of my dual-core Opteron was faulted and offline.

Damn! This is a new machine, less than a day old. What on earth could have gone wrong with the cpu?

I thought I'd have a look at the FMA telemetry that was being generated. This was a bit of a shock:

# fmdump
TIME                 UUID                                 SUNW-MSG-ID
Apr 27 18:50:23.6205 4cd32003-36a3-c3f8-ea93-b7edc762dd9f AMD-8000-JF
Apr 27 18:50:53.5720 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 07:13:56.8810 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 12:46:32.0074 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 13:37:59.2926 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 13:50:35.5724 412579b7-ed8d-607a-905c-e3fb998f290e ZFS-8000-D3
Apr 28 13:54:46.0114 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 28 13:54:46.3803 378726c1-1d68-c0c3-d0fd-9fb2b1431834 ZFS-8000-CS
Apr 28 14:23:09.6371 5baaf5a5-2bbf-43c7-e3fe-ab24b007c3f7 AMD-8000-JF
Apr 29 05:40:24.2258 a4d4edf8-520d-e625-8223-84c7ce652524 AMD-8000-2F
May 01 15:47:52.8092 abea0bc6-80b1-e022-edd1-d4a385117e0d AMD-8000-2F

Now except for the ZFS* messages (which occurred when I was playing around with my scsi multipack), we've got two SUNW-MSG-ID strings which you can look up at http://www.sun.com/msg.

If you want to see what FMA has logged as faulted you can run

# fmdump -v -u 4cd32003-36a3-c3f8-ea93-b7edc762dd9f
TIME                 UUID                                 SUNW-MSG-ID
Apr 27 18:50:23.6205 4cd32003-36a3-c3f8-ea93-b7edc762dd9f AMD-8000-JF
  100%  fault.cpu.amd.datapath
        Problem in: hc:///motherboard=0/chip=0/cpu=0
           Affects: cpu:///cpuid=0
               FRU: hc:///motherboard=0/chip=0

Ok, so there's a serious looking problem with core 0 in my cpu. Good thing I've got two cores. A quick psradm -n 0 got the core to say that it was back online, but I wasn't really sure I'd done anything to fix it.

What about the other AMD* messages, what do they mean?

# fmdump -v -u abea0bc6-80b1-e022-edd1-d4a385117e0d
TIME                 UUID                                 SUNW-MSG-ID
May 01 15:47:52.8092 abea0bc6-80b1-e022-edd1-d4a385117e0d AMD-8000-2F
  100%  fault.memory.dimm_sb
        Problem in: hc:///motherboard=0/chip=0/memory-controller=0/dimm=1
           Affects: mem:///motherboard=0/chip=0/memory-controller=0/dimm=1
               FRU: hc:///motherboard=0/chip=0/memory-controller=0/dimm=1

Ok, that's looking a tad worse. Especially when I try fmadm repair abea0bc6-80b1-e022-edd1-d4a385117e0d --- look at what I got in /var/adm/messages:

May 9 18:04:00 pieces fmd: [ID 441519 daemon.error] SUNW-MSG-ID: FMD-8000-0W, TYPE: Defect, VER: 1, SEVERITY: Minor
May 9 18:04:00 pieces EVENT-TIME: Tue May 9 18:03:59 EST 2006
May 9 18:04:00 pieces PLATFORM: Sun Ultra 20 Workstation, CSN: 0614FK40E2, HOSTNAME: pieces
May 9 18:04:00 pieces SOURCE: fmd-self-diagnosis, REV: 1.0
May 9 18:04:00 pieces EVENT-ID: ee29d053-e3aa-cbe8-a458-ec8528f5bf99
May 9 18:04:00 pieces DESC: The Solaris Fault Manager received an event from a component to which no automated diagnosis software is currently subscribed. Refer to http://sun.com/msg/FMD-8000-0W for more information.
May 9 18:04:00 pieces AUTO-RESPONSE: Error reports from the component will be logged for examination by Sun.
May 9 18:04:00 pieces IMPACT: Automated diagnosis and response for these events will not occur.
May 9 18:04:00 pieces REC-ACTION: Run pkgchk -n SUNWfmd to ensure that fault management software is installed properly. Contact Sun for support.
May 9 18:04:00 pieces fmd: [ID 441519 daemon.error] SUNW-MSG-ID: FMD-8000-0W, TYPE: Defect, VER: 1, SEVERITY: Minor
May 9 18:04:00 pieces EVENT-TIME: Tue May 9 18:04:00 EST 2006
May 9 18:04:00 pieces PLATFORM: Sun Ultra 20 Workstation, CSN: 0614FK40E2, HOSTNAME: pieces
May 9 18:04:00 pieces SOURCE: fmd-self-diagnosis, REV: 1.0
May 9 18:04:00 pieces EVENT-ID: 27b29562-976e-ea5f-f554-d9010393029f
May 9 18:04:00 pieces DESC: The Solaris Fault Manager received an event from a component to which no automated diagnosis software is currently subscribed. Refer to http://sun.com/msg/FMD-8000-0W for more information.
May 9 18:04:00 pieces AUTO-RESPONSE: Error reports from the component will be logged for examination by Sun.
May 9 18:04:00 pieces IMPACT: Automated diagnosis and response for these events will not occur.
May 9 18:04:00 pieces REC-ACTION: Run pkgchk -n SUNWfmd to ensure that fault management software is installed properly. Contact Sun for support.

WTF!?!?!?

So I logged a call with Sun Support requesting a new dual-core cpu and two new dimms. (Yes, I know that the telemetry only mentioned one dimm, but they're shipped in pairs). The parts duly came, and I installed them.

A day or so before the parts came I noticed that root got email saying that the fault log was too busy to rotate. This got me worried as well..... fmdump was showing 6 error events against dimm #1 every minute, which I thought was quite excessive. So it was time for a quick search of the archives of FMA-discuss, but nothing seemed to match. Time for an email to the team to find out whether they'd seen anything like this. Unfortunately not.

So I replaced the cpu (that worked just fine afterwards) and both dimms.... and the fma telemetry for the dimms continued. Now I was getting worried. Really, really worried.

I'd taken the necessary precautions when replacing the cpu and the dimms, I'd tried running fmadm repair against the dimm uuids, and I'd even tried unloading just about all of the fma modules. (All that produced was messages to the effect of "hey! I've got an error and I dunno what to do with it.")

So I got in contact with the FMA core team and one of their number ssh'd into my workstation and dug around for a bit. I also got an email from Gavin Maltby letting me know that I actually had a single bit error on that dimm. From that he surmised that there was a single pin gone bad in the slot.... and could I spare the downtime to have a look please?

So at lunchtime that day I shut down the box, took the necessary static precautions and removed the dimms. Lo! and behold, I saw that there was indeed a bent pin in the second slot away from the cpu. So I moved the pair of dimms to the other two slots, powered the system on.... ran fmadm repair on the dimm uuid, and life was good again.

I'd love to provide a "moral" to this anecdote, but there isn't one. All I can say is that this FMA stuff really does work, and if you're not running Solaris 10 (or Express) by now then you are missing out.

Manpages that you will find useful for FMA include fmadm(1M), fmd(1M), fmdump(1M), and fmstat(1M).

And don't forget the OpenSolaris Fault Management community pages too.
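(The whole triage loop above boils down to a handful of commands; the UUID here is the one from my fmdump output, so substitute your own.)

fmdump                                             # list fault events
fmdump -v -u abea0bc6-80b1-e022-edd1-d4a385117e0d  # detail for one event: what, where, which FRU
psradm -n 0                                        # bring an offlined cpu core back online
fmadm repair abea0bc6-80b1-e022-edd1-d4a385117e0d  # tell fmd the FRU has been dealt with
fmstat                                             # see which fmd modules are doing the work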


General

This just in: DTrace ported to FreeBSD-current

A friend of mine who uses FreeBSD passed on this announcement:

http://marc.theaimsgroup.com/?l=freebsd-current&m=114854018213275&w=2

List: freebsd-current
Subject: DTrace for FreeBSD - Status Update
From: John Birrell
Date: 2006-05-25 6:55:10
Message-ID: 20060525065510.GA20475 () what-creek ! com

[Download message RAW]

It's nearly 8 weeks since I started porting DTrace to FreeBSD and I thought I would post a status update including today's significant emotional event. 8-)

For those who don't know what DTrace is or which company designed it, here are a few links:

The BigAdmin: http://www.sun.com/bigadmin/content/dtrace/
A Blurb: http://www.sun.com/2004-0518/feature/index.html
The Guide: http://docs.sun.com/app/docs/doc/817-6223
My FreeBSD Project Page: http://people.freebsd.org/~jb/dtrace/index.html

Much of the basic DTrace infrastructure is in place now. Of the 1039 DTrace tests that Sun runs on Solaris, 793 now pass on FreeBSD.

We've got the following providers:
- dtrace
- profile
- syscall
- sdt
- fbt

As of today, loading those providers on a GENERIC kernel gives 32,519 probes. Today's significant emotional event added over 30,000 of those, thanks to the Function Boundary Tracing (fbt) provider. It provides the instrumentation of the entry and return of every (non-leaf) function in the kernel and (non-DTrace provider) modules.

[snip script and output]

There is still a lot of work to do and while that goes on, the code has to remain in the FreeBSD perforce server. It isn't ready to get merged into CVS-current yet. I have asked the perforce-admins to mirror the project out to CVS (via cvsup10.freebsd.org), but I'm not sure what the hold-up there is.

I had hoped that one or two of the Google SoC students would contribute to this, but I only received one proposal and that wasn't for anything that would help get DTrace/FreeBSD completed.

There are things people can do to help. Some of them are build related; some are build tool related; some are user-land DTrace specific; and the rest are kernel related. Speak up if you are interested in working on this!

--- John Birrell

I thought you might like to know. What a great way to start my day!
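(If you want to see the same sort of numbers on a Solaris box, something like this works; the awk column choice assumes the standard dtrace -l output layout.)

dtrace -l -P fbt | wc -l                                        # probes from the fbt provider alone
dtrace -l | awk 'NR>1 {print $2}' | sort | uniq -c | sort -rn   # probe counts per provider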


General

There's a new blog in town: Solaris CAT

We're ramping things up somewhat with Solaris CAT... getting v4.2 ready to put on SDLC and adding support for more and more of the new (not so new now!) features that are to be found in Solaris 10 and Solaris Express (aka OpenSolaris).

For my part, I've been working out the bugs in our x86 and x64 disassembler implementation so that we can provide support for x86 and x64/amd64 for Solaris 10 onwards with version 5.0 of Solaris CAT..... In fact I fixed a really nasty one just a few minutes ago which only happened on x86. It's been giving me the irrits for a few months (in between working on my day job of course!).

Now that I've got that fixed I can go back to working on the other stuff that we need in order to make the x86 and x64 versions as complete as possible. Of course there will be things that aren't applicable to those platforms and we'll give you a polite message to remind you.

At any rate, the team has a group blog here: Solaris CAT, where we'll provide news, how-to guides and other neat things.

Ooh ooh ooh, how could I forget! v4.2 (almost ready) has support for CTF. So if you want to get with the program on CTF (because let's face it, stabs is an evil and disgusting flying monkey) then if you follow my instructions you'll be able to see your structures inside Solaris CAT. Now that is cool.


Coding

linux kernel bug count increasing, apparently

I just read an article pointed to from OSNews regarding Andrew Morton's gut feel about the linux kernel bug count.

I'll happily admit to not being too keen on linux. That's because my employer pays me to work on Solaris and I see vast differences between the philosophies of the two OSes, and Solaris is more my personal style. *shrug*

But look, I don't care whether you run or develop for Solaris or Linux or any of the BSDs or MacOSX or even MS-Windows. What I care about is quality. If you're writing code for an OS, for cryin' out loud make that code as bugfree as possible! Do your users, your customers and your bosses a favour by building in quality from the get-go. That means taking care when you design your product, documenting as you go, making sure that you don't just arbitrarily change the way that widget X works but that if you do make changes you do so for a justifiable reason.

What I want to see in linux is the sort of stability that I see in Solaris. And since I write drivers, that means I want to see stable in-kernel interfaces. I find it incredibly frustrating to read comments in the linux communities that boil down to "no, *nobody* needs a stable in-kernel interface in linux, all you need to worry about is kernel->userspace." My business is inside the kernel. If you want me to support linux, then do yourself a favour and make it easier for me to justify doing so.

The Solaris Writing Device Drivers guide (which is the basis for the SunEd course as well) has been used by thousands of developers all over the world for many, many years. It's one of the reasons --- apart from the Binary Compatibility guarantee --- (in my completely unhumble opinion) for the success of Solaris. ISVs and IHVs know that they can depend upon the DDI/DKI interfaces being stable. That means that their development and support costs are lower because they don't need to keep re-certifying their widget for a new release of the kernel, whether that release is from a new version of Solaris (eg 2.6, 7, 8, 9, 10, ....) or from a kernel patch.

What I see in linux is that the kernel is growing organically. Organic growth is generally a good thing to have. However, what I would like to see is the linux kernel having some semblance of design in its interfaces, so that the need to go and change those interfaces is minimised. That makes it easier for ISVs and IHVs to justify supporting linux, and easier to justify supporting OSes other than MS-Windows and MacOSX. It's win-win for everybody.

Finally, I find Greg Kroah's comments...

remember we are talking about GPL released drivers here, if your code doesn't fall under this category, good luck, you are on your own here, you leech

...to be quite inflammatory and unnecessary. Not every company on the planet likes GPLv2, and when one is fighting a battle to provide support for non-MS/non-Mac OSen in the first place, such comments do not make it easy to win the battle with management.


General

Scott isn't going, he's just stepping back from the limelight

I got up earlyish yesterday to catch our 3rd quarter earnings announcement. We did ok --- in line with Wall St estimates. That was nothing compared to the "whoa, wtf?!?!?" that I experienced when I saw Jonathan's message talking about Scott's legacy.

A quick reload on the browser for Yahoo! finance and I saw that Scott had stepped down as CEO (and that there was a USD0.40 spike in the share price).

I am so very, very glad that Scott hasn't resigned and left the company. All he's doing is taking a step backwards out of the limelight to become chairman of the board. He's still going to be part of Sun but (I hope) getting more sleep at night.

Today I read Ashlee Vance's interview with Scott. It was fine vintage Scott, from the Scott that I've known and admired for so many years. I particularly liked this comment (page 2):

I don't have the same sense of urgency to go blow away customer value that someone who is trying to make a buck this quarter on the street who has no vested interest in the long-term shareholder value of this company has. I don't listen to that garbage.

So take that! Laura Conigliaro. And incidentally, I saw this report mentioning a portfolio manager whose company sold SUNW after the bubble burst and ".... hasn't touched Sun since." Hmmm. I've still got my SUNW stock that I bought just before the height of the bubble. I'm keeping it because I believe in this company --- what it stands for, what it produces, what it values. I'm not in the stock market to make a quick dollar.


Australia

It's great to be home again

Late on the afternoon of 30 March my beloved rang me and asked "do you mind if I go up to the Gold Coast for two weeks? Work needs me to mind their Southport office and train a new staff member. Oh, and I want you to come with me too." Now when she said "for two weeks" that really did mean two weeks..... starting with the red-eye flight up to Brisbane on Monday the 3rd to pick up an office car and then head off to Southport.

I think I've hinted before in this blog that I really, really dislike Sydney, and I take any and every opportunity to get out. Her phone call came at just the right time, since the day we were to travel up (2nd) was when my sister-in-law was scheduled to have her baby. As if either of us would pass up the opportunity to see our new relative!

Before we could fly up to Brisbane, we had to take coder cat off to the boarding cattery, I had to get the tyres replaced on the car so that when we got back from Queensland we could drive on good tyres up to Coffs Harbour, and then ....... then we had the weekend to deal with!

My brother-in-law has two speeds: dead and a bazillion km/h. He used to sail the Laser one-design Olympic class dinghy, but has recently (last 2 years) taken up mountain biking. The event he competed in (solo, not part of a team) on 1/2 April was the 3rd annual N-ZO 24 hour race ... yes, you read that correctly, a 24 hour mountain-bike race. We had agreed that we'd bring him dinner on the Saturday evening, hang around for a bit, wander back to the Sebel Hawkesbury Resort (yes, we've been pampering ourselves a tad) and have a good night's sleep before heading back to Dargle Farm in the morning to see the end of the race and then drive him home. By the time we got back from Dargle Farm on the Sunday arvo we were pretty tired.

So we flew up on the Monday feeling quite exhausted. J got a very quick briefing from the manager in the office there, we picked up the car, grabbed an espresso and headed south. When we got to Surfers the roads had changed a heckuvalot from what we were expecting based on our vintage-1995 refidex, so it was a little bit stressful, but we managed to find our way to Candlelight Holiday Apartments without too much fuss. We stayed there for three reasons: (1) it had broadband, (2) it had a reasonable kitchen, and (3) it was cheap. So while J did a 10 minute commute up to her Southport office I made use of the in-room dsl modem and Dan McDonald+team's punchin software to connect and work quite remotely.

We went up to Brisbane on the Tuesday to see our new nephew O (2.386kg =~ 5lb4oz, 46cm but very healthy) and spend a little bit of time with my sister-in-law and my brother. It was sooooo good to see them again, and to see the new addition to the family.

On the Friday evening we wandered down to Currumbin to have dinner with L+R+C and kinda restart the conversations we'd been having when we visited them in March. On the Saturday we went to SeaWorld, spent the day, got a bit sunburnt and had one of those all-too-rare days where you just have a great time with no strings or regrets or anything. It was all good. On Sunday we had brunch with L+R+C before heading up to my brother+sister-in-law's place to see O at home. We had about 4 hours with them and took much better photos with natural light. (The hospital's lighting was actually quite dim).

One thing I did enjoy about being at Surfers was that Q1 (20th tallest tower in the world) was only a few hundred metres away, and we took the opportunity to go up it. It was just after sunset when we got up there (300m up to the obs deck iirc) and I got to take a photo.

J had to start training the new employee the next day and cram as much as possible into 3.5 days, because we had to get back to Brisbane to drop off the car and debrief on Thursday afternoon before catching a rush-hour flight back to Sydney. This is exactly the sort of thing that she excels at, and that made her possibly the best person to train the new employee. Anyway... the debrief went very well - the Brisbane manager was very grateful that J had been able to come up at short notice and get them out of a hole, and made encouraging comments regarding the possibility of working for the company up there. (Music to my ears :))

We got back to Sydney late-ish on the Thursday, had enough time to do some washing on the Friday morning before packing to go to Coffs for 10 days. It took us 90 minutes to get from our place just north of the Bridge to the start of the F3 freeway, a journey which normally - outside of school holidays and long weekends - takes us about 30 minutes. After that it just seemed like ages and ages before we got there - nearly 9 hours later for about 530km.

J's best friend from Sydney Uni's OT course had her hen's night on Easter eve and her wedding was last Friday, 21st April. J actually took some time off and had a relax, but I worked remotely again, this time via wireless in Ocean Spray Holiday Apartments. I did have a few problems getting connected - I think due to a combination of the lowish-powered antenna in my laptop and the complex's firewall, but with the dedication of the owners and a MIMO cardbus wifi adapter I managed to get connected to Sun via the Cisco vpn-3000 system. Yes, that meant that I had to work in the MS-WindowsXP environment. Let me say now, I just don't like that at all. Still, I managed to get enough work done that I didn't feel guilty at taking last Friday off to go to the wedding. Saturday was our relaxing recovery day before we headed back to Sydney on Sunday.

I figured that a snapshot of the sunrise from our balcony in Coffs would be a nice way to remember the place. The trip back was considerably better - it only took us 6 hours to get back, and we were back in town early enough to go and pick up coder cat a day early.

We're glad to be back in town (even though it's Sydney)... and I'm glad to have access to all of what makes my kitchen hum - my good sharp knives, herbs, spices, non-stick and heavy-base frypans... all those things that you miss when you have to spend three weeks away.


OpenSolaris

Proof that Novell's COO doesn't understand Open Source

I saw a pointer to a story in The Australian's IT section today. The article is an interview with Novell's Robert Hovsepian, in which he makes some statements which indicate to me that he just doesn't understand what Open Source software is.

Statements like this one:

"The whole spirit of open source is to have one base of code," he said. "Open-sourcing Solaris --- while it's appreciated that you can see the low-level pieces of code --- doesn't move the overall effort of the open-source community further down the track. It creates a fork, which none of us likes."

Huh? How is OpenSolaris a fork? Of what? Solaris? I don't think so. OpenSolaris is certainly not a fork of linux either. If you want to get historical and really pedantic, then it's possible (but disingenuous) to describe Solaris as a fork of AT&T's Unix known as SVR4. Wikipedia even has this you-beaut picture which shows the lineage.

And the concept of having "one base of code" --- that's soooooo closed! So take that! OpenBSD, FreeBSD and NetBSD!

Seriously, Mr Hovsepian should go back to Wikipedia and check out the description of Open Source, and then perhaps he should have a wander over to The Open Source Initiative and check a few facts about licensing, as well as looking at the variety of OSI-certified Open Source licenses and what's using them.

I do appreciate his recognition that OpenSolaris' source code is available for one-n-all to view. However, more than anything, I'm disappointed that the President and COO of a heavyweight in the industry that's built on Open Source software should display such ignorance. Mr Hovsepian, I know you know better than that.


Australia

Our tropical island holiday

Well, it was 8 months we had to wait, but it was well worth it. Since J and I returned from our trip to Europe last year, it seemed as if every single weekend was spent doing something and there was precious little relaxing going on. So we were really, really pleased when February rolled around and we were able to take some time away.

We flew up to Townsville on the Friday afternoon, and spent a night at the Aquarius On The Beach overlooking The Strand, an area which has changed a lot since the first time I saw it 5 years ago. We had a reasonably nice if late dinner at the Seaview Hotel's Steakhouse (see below for mini-review).

On the Saturday we caught the ferry across to Magnetic Island. The ferry now pulls into the new (well, since we visited last!) harbour at Nelly Bay rather than at Picnic Bay. We caught the bus around to Horseshoe Bay, where we checked in to a unit at Sails on Horseshoe. Unit 1 has a great aspect - at the front, with only a small strip of road before the bay foreshore. As you can see, the view from the balcony is pretty good.

It was actually quite difficult for us to remember how to relax (8 months since we had anything more than a long weekend off), so we put some effort in: wake up, breakfast on the balcony, go for a swim (note the stinger net --- this is tropical North Queensland). Then after our swim we'd have lunch, have a snooze for a few hours, then swim again, have dinner, read for a bit (we got through all of the 6 novels we brought with us) and then sleep.

One morning I got up just before daybreak to take some photos. Since Nathan showed me how to use my camera better, I figured I'd play around a bit and see what happened. I also went around to Alma Bay (which was hosting the island's annual surf carnival the day we arrived) and climbed about 50m up the cliffs. As you can see, the water was perfect!

Later in the week we went around to Gregory Bay (where the car ferry used to come into) because J said there were Rock Wallabies to see. Just like these two.

We did try to get ourselves on a trip out to the Great Barrier Reef, but the company we rang up said "Sorry, we're trucking our vessel up to Cairns for repairs in the off-season." !!!!!

So instead on the Thursday we went on an eco-tourism trip: sea kayaking around from Horseshoe Bay to Balding and Radical Bays. The bloke who runs it has won several awards over the last 10 years he's been operating, and does lots of work with schools from all over.

We finished our stay on the island with dinner at Barefoot Art Food Wine restaurant and gallery. They have a really good wine list, and the eye fillet was one of the best I've had anywhere in the world. I'll go into more detail on the food a little later. In the meantime, suffice to say that this place is totally, totally recommended.

Then we went back to Townsville, staying at the Aquarius again, swam, went to The Point restaurant for dinner, and came back to Sydney on the Saturday.

So we had a great time, we did just about nothing, and by the end of the week we actually felt relaxed and refreshed. It was a great change.


OpenNWS

Current Sun/Emulex fc hba support table

It's a bit long, but here 'tis: the current support table for Sun-branded Emulex and Emulex-branded Emulex fibre-channel HBAs. (This is the same information that I presented in my talk to SOSUG on Wednesday night and which various people expressed interest in.)

Vendor | HBA model | Vendor ID | Device ID | Subsys Vendor ID | Subsys Device ID | Model | Sparc minimum rev (S8/S9) | Sparc minimum rev (S10) | x86/x64 minimum rev
Sun | SG-XPCI1FC-EM2 | 10df | fc00 | 10df | fc00 | LP10000-S | SAN (SFK) 4.4.6 | S10 update 1 or S10 + 120222-04 | S10 update 1 or S10 + 120223-04
Sun | SG-XPCI2FC-EM2 | 10df | fc00 | 10df | fc00 | LP10000DC-S | SAN (SFK) 4.4.6 | S10 update 1 or S10 + 120222-04 | S10 update 1 or S10 + 120223-04
Sun | SG-XPCIE1FC-EM4 | 10df | fc20 | 10df | fc21 | LPe11000-S | Not supported | S10 update 2 or S10 + 120222-06 | S10 update 2 or S10 + 120223-06
Sun | SG-XPCIE2FC-EM4 | 10df | fc20 | 10df | fc22 | LPe11002-S | Not supported | S10 update 2 or S10 + 120222-06 | S10 update 2 or S10 + 120223-06
Sun | SG-XPCI1FC-EM4 | 10df | fc10 | 10df | fc11 | LP11000-S | tbd | tbd | tbd
Sun | SG-XPCI2FC-EM4 | 10df | fc10 | 10df | fc12 | LP11002-S | tbd | tbd | tbd
Emulex | LP10000 | 10df | fa00 | 10df | fa00 | LP10000 | SAN (SFK) 4.4.7 | S10 update 1 or S10 + 120222-04 | S10 update 1 or S10 + 120223-04
Emulex | LP10000DC | 10df | fa00 | 10df | fa00 | LP10000DC | SAN (SFK) 4.4.7 | S10 update 1 or S10 + 120222-04 | S10 update 1 or S10 + 120223-04
Emulex | LP10000ExDC | 10df | fa00 | 10df | fa00 | LP10000ExDC | SAN (SFK) 4.4.7 | S10 update 1 or S10 + 120222-04 | S10 update 1 or S10 + 120223-04
Emulex | LPe11000 | 10df | fe00 | 10df | fe00 | LPe11000 | SAN (SFK) 4.4.7 | S10 update 1 or S10 + 120222-04 | S10 update 1 or S10 + 120223-04
Emulex | LPe11002 | 10df | fe00 | 10df | fe00 | LPe11002 | SAN (SFK) 4.4.7 | S10 update 1 or S10 + 120222-04 | S10 update 1 or S10 + 120223-04
Emulex | LP11000 | 10df | fd00 | 10df | fd00 | LP11000 | SAN (SFK) 4.4.7 | S10 update 1 or S10 + 120222-04 | S10 update 1 or S10 + 120223-04
Emulex | LP11002 | 10df | fd00 | 10df | fd00 | LP11002 | SAN (SFK) 4.4.7 | S10 update 1 or S10 + 120222-04 | S10 update 1 or S10 + 120223-04
Emulex | LP9802 | 10df | f980 | 10df | f980 | LP9802 | SAN (SFK) 4.4.7 | S10 update 1 or S10 + 120222-04 | S10 update 1 or S10 + 120223-04
Emulex | LP9002DC | 10df | f900 | 10df | f900 | LP9002DC | SAN (SFK) 4.4.7 | S10 update 1 or S10 + 120222-04 | S10 update 1 or S10 + 120223-04
Emulex | LP9002L | 10df | f900 | 10df | f900 | LP9002L | SAN (SFK) 4.4.7 | S10 update 1 or S10 + 120222-04 | S10 update 1 or S10 + 120223-04
Emulex | LP9002S | 10df | f095 | 10df | f095 | LP9002S | SAN (SFK) 4.4.7 | S10 update 1 or S10 + 120222-04 | S10 update 1 or S10 + 120223-04


OpenNWS

Current Sun/QLogic fc hba support table

It's a bit long, but here 'tis: the current support table for Sun-branded QLogic and QLogic-branded QLogic fibre-channel HBAs. (This is the same information that I presented in my talk to SOSUG on Wednesday night and which various people expressed interest in.)

Vendor | HBA model | Vendor ID | Device ID | Subsys Vendor ID | Subsys Device ID | Sparc minimum rev (S8/S9) | Sparc minimum rev (S10) | x86/x64 minimum rev
Sun | SG-XPCI1FC-QLC | 1077 | 6322 | 1077 | 132 | Not Supported | Not Supported | S10 update 1 or S10 + 119131-13
Sun | 6799A | 1077 | 2200A | 1077 | 4082 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
Sun | SG-XPCI1FC-QF2/x6767A | 1077 | 2310 | 1077 | 106 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
Sun | SG-XPCI2FC-QF2/x6768A | 1077 | 2312 | 1077 | 10A | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
Sun | X6727A | 1077 | 2200A | 1077 | 4083 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
Sun | SG-XPCI1FC-QF4 | 1077 | 2422 | 1077 | 140 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
Sun | SG-XPCI2FC-QF4 | 1077 | 2422 | 1077 | 141 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
Sun | SG-XPCIE1FC-QF4 | 1077 | 2432 | 1077 | 142 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
Sun | SG-XPCIE2FC-QF4 | 1077 | 2432 | 1077 | 143 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QCP2340 | 1077 | 2312 | 1077 | 109 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QCP2342 | 1077 | 2312 | 1077 | 10B | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QLA200 | 1077 | 6312 | 1077 | 119 | Not Supported | Not Supported | S10 update 1 or S10 + 119131-13
QLogic | QLA210 | 1077 | 6322 | 1077 | 12F | Not Supported | Not Supported | S10 update 1 or S10 + 119131-13
QLogic | QLA2310 | 1077 | 2310 | 1077 | 106 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QLA2310F/QLA2310FL | 1077 | 2310 | 1077 | 9 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QLA2340/QLA2340L | 1077 | 2312 | 1077 | 100 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QLA2342/QLA2342L | 1077 | 2312 | 1077 | 101 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QLA2344/QLA2344-P | 1077 | (2) 2312 | 1077 | 102 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QLA2440 | 1077 | 2422 | 1077 | 145 | Not Supported | Not Supported | S10 update 1 or S10 + 119131-13
QLogic | QLA2460 | 1077 | 2422 | 1077 | 133 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QLA2462 | 1077 | 2422 | 1077 | 134 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QLE2360 | 1077 | 2432 | 1077 | 117 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QLE2362 | 1077 | 2432 | 1077 | 118 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QLE2440 | 1077 | 2432 | 1077 | 147 | Not Supported | Not Supported | S10 update 1 or S10 + 119131-13
QLogic | QLE2460 | 1077 | 2432 | 1077 | 137 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QLE2462 | 1077 | 2432 | 1077 | 138 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QSB2340 | 1077 | 2312 | 1077 | 104 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13
QLogic | QSB2342 | 1077 | 2312 | 1077 | 105 | SAN (SFK) 4.4.8 | S10 update 1 or S10 + 119130-13 | S10 update 1 or S10 + 119131-13


General

Day 2, 3rd test, Australia vs South Africa

I had a lot of fun yesterday --- I went to day 2 of the 3rd Australia vs South Africa test at the SCG. I went with a bunch of friends who work for Honeywell (a link via a school-friend), and my mate R who flew in for the New Year's Eve fireworks was able to join us too. Since he works in the same building as BMC, I figure when R gets home they'll be able to watch some of BMC's cricket DVDs.

Building on the knowledge that Nathan enlightened me with on NYE, I played a bit with my camera :-) Here are some of the pictures I took: before play started for the day; at lunch, when the Milo kiddies cricket teams came out and played; the second ball after lunch; and South Africa declaring at 9/451.

For some reason, when Andre Nel came on, the crowd booed him. I noticed that when he'd finished his over and came to start fielding near us, he waited until it was clear the ball wasn't coming near him and then hurried to sign about 20 autographs --- seems like a good sport to me!

We were all really happy to see Ricky Ponting score his 8000th test run. He's only the 3rd Australian to get to that total, and the scoreboard lit up.

All in all, a really nice day. Pity South Africa scored so slowly all day, and Australia lost 3 quick wickets in the last 45 minutes of play ... Ah well, that's the way things go! There's more info on current matches at CricInfo.


General

New Years Eve 2005, Sydney

We parked ourselves on Henry Lawson Drive in McMahon's Point for Sydney's NY celebrations. A friend of mine from our Menlo Park office had come over with his wife just for the fireworks, so we made sure we got a good spot. The downside of course was that since the local council (http://www.northsydney.nsw.gov.au) had road closures we had to be there before midday. It was a loooong day!

Still, we had a good time, were there with friends (both local and international), and I also learnt something about my Nikon Coolpix 5400 so after the 9pm fireworks my pictures improved dramatically. (Thankyou Nathan!)

Here are some of the photos I took:

[five photos: nye_sydney_01.jpg through nye_sydney_05.jpg]

That girl in the last photo ... we politely asked her to sit down when the midnight fireworks started (there was also a 9pm set but she was canoodling for those); all her friends asked her to sit down as well, but she insisted that it was her right to stand up. I think she assumed that everybody was going to stand up as well --- a completely nuts idea actually, because everybody around us had been there for at least 9 hours (if not longer) by that time and we couldn't be bothered! There's also the wobbly camera shot issue to take into account. Oh well, image cropping with The Gimp (http://www.gimp.org) will save my photos.

We had a great night. Our friends from the USA thought it was most definitely worth the 13.5 hour flight from San Francisco. NYE 2005/06 rocked. Hippy Moo Ear to everybody!


OpenSolaris

Getting started with your own CTF data

There was a question (http://www.opensolaris.org/jive/thread.jspa?threadID=4614&tstart=0) posed on the mdb-discuss forum (http://www.opensolaris.org/jive/forum.jspa?forumID=4) today, wondering how to add CTF data to kernel modules which you develop. I gave a kinda-useless answer (basically UTSL: http://cvs.opensolaris.org/source/search?q=ctfconvert), but recalled that this question had been asked on #opensolaris a few weeks ago and I'd promised to write up a procedure.

So I found my copy of Murayama-san's rh driver source code (http://homepage2.nifty.com/mrym3/taiyodo/eng/index.htm), did a quick build of the fp module from the NWS consolidation, and then figured out how J. Random Developer can add the CTF data in.

Firstly, grab yourself a copy of the ON build (onbld) tools from genunix.org (http://www.genunix.org/mirror/SUNWonbld-20051116.i386.tar.bz2). This package contains the utilities ctfconvert and ctfmerge. (Unfortunately they don't appear to have manpages yet.)

Then with your driver, for each .o file that gets linked to make your driver, run

$ ctfconvert -g -l [label] objectfile.o

Then, when each has been ctfconverted, run

$ ctfmerge -l [label] -o [output driver name] [list of .o files]

It's depressingly simple! Now here's what I did for the rh driver:

$ /usr/ccs/bin/make
$ cd i386
$ /ws/onnv-tools/onbld/bin/i386/ctfconvert -g -l RH gem.o
$ /ws/onnv-tools/onbld/bin/i386/ctfconvert -g -l RH rh_gem.o
$ /ws/onnv-tools/onbld/bin/i386/ctfmerge -l RH -o rh rh_gem.o gem.o
$ cd amd64
$ /ws/onnv-tools/onbld/bin/i386/ctfconvert -g -l RH gem.o
$ /ws/onnv-tools/onbld/bin/i386/ctfconvert -g -l RH rh_gem.o
$ /ws/onnv-tools/onbld/bin/i386/ctfmerge -l RH -o rh rh_gem.o gem.o

Then I did the usual make install and adddrv.sh, and I could now do this in a session of mdb -k:

> ::status
debugging live kernel (64-bit) on doppio
operating system: 5.11 onnv-gate:2005-12-11 (i86pc)
> ::modinfo ! grep rh
232 fffffffff4feed58     adf8   1 rh (via rhine nic driver v1.0.24)
> ::print -t struct gem_stats
{
    uint32_t intr
    uint32_t crc
    uint32_t errrcv
    uint32_t overflow
    uint32_t frame
    uint32_t missed
    uint32_t runt
    uint32_t frame_too_long
    uint32_t norcvbuf
    uint32_t collisions
    uint32_t first_coll
    uint32_t multi_coll
    uint32_t excoll
    uint32_t nocarrier
    uint32_t defer
    uint32_t errxmt
    uint32_t underflow
    uint32_t xmtlatecoll
}

And even better, with a development version of Solaris CAT I can use the stype command:

SolarisCAT(live/11X)> stype gem_stats
struct gem_stats { (size: 0x48 bytes)
   typedef uint32_t = unsigned intr; (offset 0x0 bytes, size 0x4 bytes)
   typedef uint32_t = unsigned crc; (offset 0x4 bytes, size 0x4 bytes)
   typedef uint32_t = unsigned errrcv; (offset 0x8 bytes, size 0x4 bytes)
   typedef uint32_t = unsigned overflow; (offset 0xc bytes, size 0x4 bytes)
   typedef uint32_t = unsigned frame; (offset 0x10 bytes, size 0x4 bytes)
   typedef uint32_t = unsigned missed; (offset 0x14 bytes, size 0x4 bytes)
   typedef uint32_t = unsigned runt; (offset 0x18 bytes, size 0x4 bytes)
   typedef uint32_t = unsigned frame_too_long; (offset 0x1c bytes, size 0x4 bytes)
   typedef uint32_t = unsigned norcvbuf; (offset 0x20 bytes, size 0x4 bytes)
   typedef uint32_t = unsigned collisions; (offset 0x24 bytes, size 0x4 bytes)
   typedef uint32_t = unsigned first_coll; (offset 0x28 bytes, size 0x4 bytes)
   typedef uint32_t = unsigned multi_coll; (offset 0x2c bytes, size 0x4 bytes)
   typedef uint32_t = unsigned excoll; (offset 0x30 bytes, size 0x4 bytes)
   typedef uint32_t = unsigned nocarrier; (offset 0x34 bytes, size 0x4 bytes)
   typedef uint32_t = unsigned defer; (offset 0x38 bytes, size 0x4 bytes)
   typedef uint32_t = unsigned errxmt; (offset 0x3c bytes, size 0x4 bytes)
   typedef uint32_t = unsigned underflow; (offset 0x40 bytes, size 0x4 bytes)
   typedef uint32_t = unsigned xmtlatecoll; (offset 0x44 bytes, size 0x4 bytes)
} ;
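If your driver has more than a couple of object files, it's easy to wrap those two steps in a small script. The following is only a minimal sketch: the tools path matches the onnv-tools layout I used above, and the MYDRV label and mydrv module name are placeholders for whatever your driver is actually called. Run it from the directory containing your objects, after your normal make has linked the module.

#!/bin/sh
# Minimal sketch: add CTF data to an already-built driver module.
# CTFTOOLS, LABEL and DRIVER are placeholders - adjust to suit your setup.
CTFTOOLS=/ws/onnv-tools/onbld/bin/i386
LABEL=MYDRV
DRIVER=mydrv

# Convert the debug data in every object file to CTF ...
for obj in *.o; do
    "$CTFTOOLS/ctfconvert" -g -l "$LABEL" "$obj"
done

# ... then merge the per-object CTF data into the linked module.
"$CTFTOOLS/ctfmerge" -l "$LABEL" -o "$DRIVER" *.o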


Solaris

How do I find out what my network device is?

I've been hanging out on #opensolaris@irc.freenode.net a lot recently, doing my bit to help people get over the initial hump of installing Solaris and OpenSolaris. This evening we've been talking about devices, specifically NICs, and figuring out what driver they need.

So how does one go about this if one has no idea what the driver should be? Well, start by running prtpicl -v and either pipe the output through /usr/bin/less or dump it to a file. Then you need to know what you're looking for: search for "Ethernet" or "Network" and you can't get too far off the track. That will appear in a stanza like this:

pci1458,e000 (obp-device, 187220000034b)
  :DeviceID 0xb
  :UnitAddress 2
  :device-id 17184
  :vendor-id 4523
  :revision-id 19
  :class-code 131072
  :unit-address b
  :subsystem-id 57344
  :subsystem-vendor-id 5208
  :min-grant 23
  :max-latency 31
  :interrupts 1
  :devsel-speed 1
  :fast-back-to-back
  :66mhz-capable
  :power-consumption 01 00 00 00 01 00 00 00
  :model Ethernet controller
  :compatible (1872200000357TBL) | pci11ab,4320.1458.e000.13 | | pci11ab,4320.1458.e000 | | pci1458,e000 | | pci11ab,4320.13 | | pci11ab,4320 | | pciclass,020000 | | pciclass,0200 |
  :reg 00 58 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 58 02 02 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 14 58 02 01 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 30 58 02 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00
  :assigned-addresses 10 58 02 82 00 00 00 00 00 00 00 f5 00 00 00 00 00 40 00 00 14 58 02 81 00 00 00 00 00 94 00 00 00 00 00 00 00 01 00 00 30 58 02 82 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00
  :pm-hardware-state needs-suspend-resume
  :devfs-path /pci@0,0/pci10de,ed@e/pci1458,e000@b
  :driver-name skge
  :binding-name pci1458,e000
  :bus-addr b
  :instance 0
  :_class obp-device
  :name pci1458,e000

See, up there at :model Ethernet controller. Now the next property is the very important :compatible part. It's so important to this blog that I'll excerpt it:

  :compatible (1872200000357TBL) | pci11ab,4320.1458.e000.13 | | pci11ab,4320.1458.e000 | | pci1458,e000 | | pci11ab,4320.13 | | pci11ab,4320 | | pciclass,020000 | | pciclass,0200 |

These strings are PCI Consortium identifiers. Let's walk through them one by one.

identifier                 | which is what?
pci11ab,4320.1458.e000.13  | vendor,device.subvendor.subdevice.revision
pci11ab,4320.1458.e000     | vendor,device.subvendor.subdevice
pci1458,e000               | vendor,device
pci11ab,4320.13            | vendor,device.revision
pciclass,020000            | PCI Consortium device class, specific
pciclass,0200              | PCI Consortium device class, general

Ok, that's all well and good, but how do I use that information? Well, let's assume for a second that you want to find a network device in general. So searching through your prtpicl -v output you'll look for pciclass,0200. That will give you a pointer to the pci vendor,deviceid information, which you can then check /etc/driver_aliases for:

$ grep pci1458,e000 /etc/driver_aliases
skge "pci1458,e000"

This tells me that this particular pci identifier (pci1458,e000) is a device alias for the skge driver from SysKonnect.

The example we came across on #opensolaris this evening was

  :compatible (1e4000001e7TBL) | pci14e4,1677.1028.179.1 | | pci14e4,1677.1028.179 | | pci1028,179 | | pci14e4,1677.1 | | pci14e4,1677 | | pciclass,020000 | | pciclass,0200 |

which a quick search of /etc/driver_aliases reveals is actually a Broadcom NIC which we supply a driver for.

As it happens, for this particular system we had to specify more than just the vendor,deviceid:

# update_drv -a -i '"pci14e4,1677.1028.179"' bge

This reported

warning: driver (bge) successfully added to the system but failed to attach

which was a pain, but progress. So we asked this new user to run

# svcadm clear svc:/network/physical:default

because it was showing as 'maintenance' ... and suddenly all that we had left was a piddly little routing problem. Joy!

Now, if we didn't supply a driver, the thing to do would be to Goooooooogle for the pci vendor,deviceid string and "solaris driver" -- for NICs you'll frequently come up with a hit for Masayuki Murayama's collection of drivers. There is a really nifty utility available for linux called lspci which will let you see what you've got installed in your system and on your motherboard. It makes use of a file of known PCI vendor and device ids, so it can present them as readable names rather than raw numbers.
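If you'd rather not do the /etc/driver_aliases search by hand, the lookup is easy to script. This is just a sketch of the manual procedure above (the identifiers in the list are the Broadcom example from this post, so substitute the ones from your own :compatible property): it simply walks them from most to least specific until an existing driver alias claims one.

#!/bin/sh
# Sketch: walk a device's compatible identifiers from most to least
# specific and report the first one that /etc/driver_aliases claims.
# The identifiers below are the Broadcom example above; replace them
# with the ones from your own device's :compatible property.
for id in pci14e4,1677.1028.179.1 pci14e4,1677.1028.179 pci1028,179 \
          pci14e4,1677.1 pci14e4,1677 pciclass,020000 pciclass,0200; do
    match=`grep "$id" /etc/driver_aliases`
    if [ -n "$match" ]; then
        echo "$id is claimed by: $match"
        break
    fi
done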


ZFS

Back in the old days

In 1998 and 1999 I was a Solaris system administrator at one of Sydney's 5 universities. I was quite green --- this was my second sysadmin job --- and I'd been given the task of administering the University's Finance arm's server and the Uni's DR server. It was quite a challenge for me: each box was a SparcServer-1000 with about 30 attached disks. The DR host had those disks all nicely physically organised into an SSA-100 array, whereas the Finance host had unipacks and multipacks crowded around it in a semi-neat fashion. I had to learn VxVM (for the DR box) and SDS (for the Finance host) very quickly, and I realised that doing so from the command line perspective was very clearly the way to go ... in a disaster I wouldn't have a graphical head to let me look at the semi-pretty gui that VxVM 2.5 and SDS 4.1 required.

Working on the Finance box required close interaction with the Oracle DBAs that we had --- they'd frequently want to move Oracle datafiles around in order to maximise access speed, whether that meant having the partition on the "fast" end of the disk or on a fast spindle or a faster scsi controller or on a lower scsi id. That was a pain. The DR box was another challenge because I had to somehow make the 30x4gb disks appear to be a single storage pool for whichever host had to be hosted there in a DR situation. Since we had three hosts which might get that experience it was a bit difficult. While all of them ran Oracle, they each ran different versions of Oracle and had different application filesystem requirements ... you get the idea I'm sure.

After a year working in that part of the Uni I moved to work for a smaller (30 people) group in the research division. It was great being top dog in the sysadmin group ... since I was their only sysadmin ;-) I migrated that group off a Novell NetWare server which quite seriously crashed every day. I got them onto an E250 running Solaris 2.6, Samba and PC-Netlink (for the Macs!). Once again I had to carefully carve up the internal disks and the luns from the attached A1000 (never, ever remove lun0 from an A1000 if you want it to work). I had to worry about quotas, and how much space to allocate for the application I wrote for them, and how much to allocate for sendmail spool files too. I recall that I ended up creating my filesystems (a) to not exceed the size I could ufsdump to a single DLT7000 tape (35gb), and (b) by grabbing the few megabytes at the end of what I figured were correctly sized filesystems and using an SDS concat to make something out of.

Hideous! Time-wasting! Ugly! Oh, if only we'd had ZFS back then ...

One of ZFS' main aims is to end the suffering of the humble (and not so humble!) system admin. With those hosts it would not have been difficult to add more storage or a new filesystem:

zpool add financepool mirror c9t0d0 c10t0d0

With the research group I wouldn't have had to worry about setting quotas for each filesystem by editing the quotatab and remembering to mount the filesystem with quotas turned on:

zfs set quota=1g research/home/louise

Not only could I have had some compression on those Oracle datafiles:

zfs set compression=on finance
zfs set compression=on oracledata

but I could also have used zfs to send incremental backups of the relevant bits from the finance host to the DR box:

zfs backup -i finance/application@12:00 finance/application@12:01 | ssh DRbox zfs restore -d /finance/application

Do you get it? Do you understand why we've been desperately keen to get ZFS into your hands? Do you want to start making use of all this?

It's quite fine with me (us?) if you want to keep using SVM and VxVM. Really, it is. When you're ready, please go and have a look at ZFS. Drop it onto a test machine and play with it. Look at the source code and the documentation, and reach out to the possibilities of spending your time productively rather than in slicing up disks.
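If you're trying this on a current release, here is roughly what those examples look like today. It's only a sketch: the pool and dataset names are the ones from the examples above, the device names are illustrative, and the zfs backup/restore commands shown above were later renamed to zfs send/receive.

# create a mirrored pool, then grow it later (device names are illustrative)
zpool create financepool mirror c9t0d0 c10t0d0
zpool add financepool mirror c11t0d0 c12t0d0

# quotas and compression are simply dataset properties
# (assumes pools called research and finance exist, as in the examples above)
zfs set quota=1g research/home/louise
zfs set compression=on finance/oracledata

# incremental replication to the DR box: snapshot, then send only the delta
zfs snapshot finance/application@12:00
zfs snapshot finance/application@12:01
zfs send -i finance/application@12:00 finance/application@12:01 | ssh DRbox zfs receive -d finance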

