By jmcp on Dec 21, 2014
I put this post up on my personal blog:
I've spent most of the last two years working on a complete rewrite of the ON consolidation build system for Solaris 12. (We called it 'Project Lullaby' because we were putting nightly to sleep). This was a massive effort for our team of 5, and when I pushed the changes at the end of February we wound up with about 121k lines of change over close to 6000 files. Most of those were Makefiles (so you can understand why I'm now scarred!).
We had to do an incredible amount of testing for this project. Introducing new patterns and paradigms for the complete Makefile hierarchy meant that we had to be very careful to ensure that we didn't break the OS. To accomplish this we used (and overloaded) the resources of the DIY test group and also made use of a feature which is now available in 11.2 - kernel zones.
Kernel zones are a type-2 hypervisor, so you can run a separate kernel in them. If you've used non-global zones (ngz) on Solaris in the past, you'll recall the niggle of having to have those ngz in sync with the global when it comes to SRUs and releases.
Using kernel zones offered several advantages to us: I could run tests whenever I wanted on my desktop system (a newish quad-core Intel Core i5 system with 32GB of RAM), I could quickly test updates of the newly built bits, I could keep the zone at the same revision while booting the global zone with a new build, and (this is my favourite) I could suspend the zone while rebooting the global zone.
Our testing of Lullaby in kernel zones had two components: first, does it actually boot? And second, assuming I can boot the kz with Lullaby-built bits, can I then build the workspace in the kz and boot those new bits in that same kernel zone?
Creating a kernel zone is very, very easy:
limoncello: # zonecfg -z crashs12 create -t SYSsolaris-kz
limoncello: # zoneadm -z crashs12 install -x install-size=40g
limoncello: # zoneadm -z crashs12 boot
I could have used one of the example templates (eg /usr/share/auto_install/sc_profiles/sc_sample.xml) but for this use-case I just logged in and created the necessary users, groups, automount entries and installed compilers by hand. (Meaning pkg install rather than tar xf).
To start with, I ensured that crashs12 was running the same development build as my global zone, but I removed the various hardware drivers I had no need for.
The very first test I ran in crashs12 was a test of libc and the linker subsystem. Building libc is rather tricky from a make(1S) point of view, due to having several generated (rather than source-controlled) files as part of the base. The linker is even more complex - there's a reason that we refer to Rod and Ali as the 'linker aliens'! Once I had my fresh kz configured appropriately, I created a new BE, mounted it, then blatted the linker and libc bits onto it and rebooted. I was really, really happy to see the kz come up and give me a login prompt.
Several weeks after that we got to the point of successful full builds, so I installed the Lullaby-built bits and rebooted:
root@crashs12:~# pkg publisher
PUBLISHER TYPE STATUS P LOCATION
nightly origin online F file:///net/limoncello/space/builds/jmcp/lul-jmcp/packages/i386/nightly-nd//repo.osnet/
extra (non-sticky, disabled) origin online F file:///space/builds/jmcp/test-lul-lul/packages/i386/nightly/repo.osnet-internal/
solaris (non-sticky) origin online T http://internal/repo
root@crashs12:~# pkg update --be-name lul-test-1
This booted, too, but I couldn't get any network-related tests to work. Couldn't ssh in or out. Couldn't for the life of me work out what I'd done wrong in the build, so I asked the linker aliens and Roger for help - they were quick to realise that in my changes to the libsocket Makefiles, I'd missed the filter option. Once I fixed that, things were back on track.
Now that Lullaby is in the gate and I'm working on my next project, I'm still using crashs12 for spinning up a quick test "system" and I'm migrating my 11.1 Virtualbox environment to an 11.2 kernel zone. The 11.2 zone, incidentally, was configured and installed in about 4 minutes using an example AI profile (see above) and a unified archive.
Kernel zones: you know you want them.
As of 25 March 2012, my @sun.com email address and alias will be deactivated. If you are still contacting me via firstname.lastname at sun.com, or myalias at sun.com, then please update your address book so that you use firstname.lastname at oracle.com instead.
I'm delighted to announce the inaugural meeting of the Brisbane Oracle Solaris User Group (BrOSUG).
Join us for pizza as we discuss Installation and Packaging in Solaris 11.
Many of you are familiar with Solaris Jumpstart, SysV packages and patches. Solaris 11 changes all of this. Come along and find out how it has changed, and why.
What: Brisbane Oracle Solaris User Group (BrOSUG).
When: 17 October 2011, 12:30pm - 2:30pm
Where: Oracle House, 300 Ann St, Brisbane.
(Come to reception on level 14).
Please confirm your intention to attend with:
James McPherson James dot McPherson-AT-oracle-DOT-com
+61 7 3031 7173
Mark Farmer Mark dot Farmer-AT-oracle-DOT-com
+61 7 3031 7106
One of the things I'm currently responsible for (as onnv gatekeeper) is maintenance of our gatehooks. From time to time I need to make changes and test them in a sandbox, and I keep my sandbox pretty small: hg update takes a while when you have to run it over all of ON.
Anyway, with the most recent round of changes (to support running the hooks with Mercurial built against python 2.6), I just spent the best part of a day beating my head against two things. Firstly, I'd forgotten that my sandbox was configured to exec Mercurial with the
option. With python 2.6 and Mercurial 1.3.1, this has the effect of making any exit, even a successful exit(0), produce a traceback:
$ $HG push ssh://onhg@localhost//scratch/gate/trees/minirepo-131_26
pushing to ssh://onhg@localhost//scratch/gate/trees/minirepo-131_26
searching for changes
Are you sure you wish to push? [y/N]: y
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
remote: caching changes for gatehooks...
remote: ...changes cached
remote: Sanity checking your push...
remote: ...Sanity checks passed
remote: pushing to /scratch/gate/trees/minirepo-131_26-clone
remote: searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
remote: Traceback (most recent call last):
remote: File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/dispatch.py", line 43, in _runcatch
remote: return _dispatch(ui, args)
remote: File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/dispatch.py", line 449, in _dispatch
remote: return runcommand(lui, repo, cmd, fullargs, ui, options, d)
remote: File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/dispatch.py", line 317, in runcommand
remote: ret = _runcommand(ui, options, cmd, d)
remote: File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/dispatch.py", line 501, in _runcommand
remote: return checkargs()
remote: File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/dispatch.py", line 454, in checkargs
remote: return cmdfunc()
remote: File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/dispatch.py", line 448, in <lambda>
remote: d = lambda: util.checksignature(func)(ui, *args, **cmdoptions)
remote: File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/util.py", line 402, in check
remote: return func(*args, **kwargs)
remote: File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/commands.py", line 2752, in serve
remote: File "/opt/local/mercurial/1.3.1/lib/python2.6/site-packages/mercurial/sshserver.py", line 46, in serve_forever
remote: SystemExit: 0
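The mechanism behind that scary-looking output is easy to reproduce in plain Python (a minimal illustration of how SystemExit behaves, not Mercurial's actual code):

```python
import sys

# sys.exit() works by raising SystemExit, so a harness that catches
# every raised exception and prints a traceback will report even a
# perfectly clean exit(0) as though it were a crash.
def clean_exit():
    sys.exit(0)

try:
    clean_exit()
except SystemExit as e:
    caught = e.code

print(caught)  # 0: a "successful" exit still arrives as an exception
```

That's exactly what the sshserver output above shows: the server finished normally, but the dispatch wrapper reported the SystemExit as a traceback ending in `SystemExit: 0`.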
The second one was equally frustrating: our hooks have a Test parameter, which you set to True or False in your gate's hgrc. Setting it to True means that the hook does not do any actual work. Guess which value I'd left it set to in my minirepo?
The more observant amongst you will have noticed that a number of people who have blogs on blogs.sun.com have decided to move them elsewhere. The most recent I noticed was Simon Phipps, and I realised that I need to follow suit.
For the curious, no, I didn't get a "don't come Monday".
I've been running my own blog at http://www.jmcp.homeunix.com/blog for quite a while now, and I figure it's a reasonable enough location to blog from in general.
So without further ado, Please Update Your Feeds(tm)!
gedanken# zfs snapshot rpool/ROOT/opensolaris-7@yay
blinder# zfs create rpool/ROOT/opensolaris-7

gedanken# zfs send rpool/ROOT/opensolaris-7@yay | \
    ssh blinder zfs recv -v rpool/ROOT/opensolaris-7
[trundle] received 20.8GB stream in 2707 seconds (7.85MB/sec)

# zpool set bootfs=rpool/ROOT/opensolaris-7 rpool
# zfs set canmount=noauto rpool/ROOT/opensolaris-7
# zfs set mountpoint=/ rpool/ROOT/opensolaris-7

# zfs set mountpoint=/mnt rpool/ROOT/opensolaris-7
# cp /etc/ssh/sshd*key* /mnt/etc/ssh
# cp /etc/hostid /mnt/etc/hostid
# cp /etc/inet/hosts /mnt/etc/inet/hosts
# cp /etc/X11/xorg.conf /mnt/etc/X11
# cp /etc/hostname.nge0 /mnt/etc
# cp /var/spool/cron/crontabs/onhg /mnt/var/spool/cron/crontabs

# cd /
# zfs umount rpool/ROOT/opensolaris-7
# zfs set mountpoint=/ rpool/ROOT/opensolaris-7

# init 6
[dammit, this "fast reboot" stuff is TOO FAST!]

root@blinder:~# cat /etc/release
                  OpenSolaris Development snv_127 X86
      Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
                   Use is subject to license terms.
                      Assembled 06 November 2009
root@blinder:~# uname -a
SunOS blinder 5.11 snv_127 i86pc i386 i86pc
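As a quick cross-check of that transfer rate (my arithmetic, not part of the original session), 20.8GB over 2707 seconds comes out right next to the 7.85MB/sec that zfs recv reported, allowing for rounding in the printed figures:

```python
# Back-of-envelope check on the zfs recv statistics: 20.8GB sent in
# 2707 seconds, treating the units as binary (GiB -> MiB).
size_mib = 20.8 * 1024          # 21299.2 MiB
rate = size_mib / 2707          # MiB per second
print(round(rate, 2))           # ~7.87, in line with the reported 7.85MB/sec
```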
It's been a bit of a pain to get working, but now that it is, I figure I should provide some details of what I've done.
The solution I chose was Vodafone's Mobile Prepaid Broadband, which comes with a Huawei E169, aka K3520 usb dongle. This has the increasingly popular "ZeroCD(tm)" feature - when you plug it in, it defaults to showing a storage device (usb device class 8) rather than as a communications device (usb device class 2). This storage device has the windows drivers and application on it, which then kicks the device into being a communications device. All good, or something, as long as you're running XP, Vista or Mac OS X.
Not so good, clearly, for yours truly.
A bunch of google hits came up with http://darkstar-solaris.blogspot.com/2008/10/huawei-e169-usb-umts-gprs-modem.html, http://www.opensolaris.org/jive/thread.jspa?messageID=272246, http://in.opensolaris.org/jive/thread.jspa?messageID=227212 and http://my2ndhead.blogspot.com/2008/11/opensolaris-huawei-e220-swisscom-and.html, which got me started - I need to use the usbsacm(7D) driver, so it was time to update_drv. Being a bit lazy, and suspecting that there'd be a few options that needed covering, I just hand-edited /etc/driver_aliases to add in all the possible compatible entries for the device (starting with usb12d1,1001.0). Still no joy - darn thing was still showing up as a storage device even after hotplugging.
On the off-chance that the attached-at-power-on state might be different I rebooted with it attached, ... lo and behold, it was different! No usb,class8, just compatible properties which allowed me to attach it to the usbsacm driver, and get 3 /dev/term entries. After a hotplug op, however, it was back to showing up as a storage device, which was most undesirable.
So I tried looking for what solutions the linux world has come up with to the problem, came across usb_modeswitch (which gave the clue that sending a "USB_REQ_SET_FEATURE, 00000001" would kick it properly), and also some Huawei forum posts.
One thread in particular caught my eye: Thread: K3520 (E169) microSD lost which featured a comment from a Huawei employee instructing the user to utter "at^u2diag=255" in a hyperterm session in order to get back their microSD card slot.
So being inquisitive, and making a guess, when I had the dongle connected from boot, I ran
# tip /dev/term/0
connected
ATZ
OK
at^u2diag=0
OK
~.

Then I hotplugged the device. On re-insertion (after waiting 1 minute), I saw that there was no storage device found, just usbsacm instances. W0000t!
So now all I had to do was trawl my memories to recall how to do ppp (eeeek!) and would be connectable.
Therein lies a lot of pain, so I'll cut straight to the "this works for me" part:
The aliases I've got in /etc/driver_aliases are
usbsacm "usb12d1,1001.0"
usbsacm "usb12d1,1001"
The peers file that I'm using is
/dev/huawei
720000
debug
logfile /var/tmp/pppd.log
crtscts
asyncmap a0000
idle 300
passive
defaultroute
usepeerdns
:0.0.0.0
noccp
novj
lcp-echo-interval 0
lock
modem
connect '/usr/bin/chat -s -t60 -f /etc/ppp/voda-chat'
noauth
persist
Note that I symlinked /dev/term/0 to /dev/huawei - purely because I wanted to.
The chat script is
ABORT 'BUSY'
ABORT 'NO CARRIER'
"" AT&F
OK AT\136u2diag=0
OK ATE0V1E1X1\136SYSCFG=2,2,3FFFFFFF,1,2
OK ATS7=60
OK AT+CGDCONT=1,"IP","vfprepaymbb"
OK AT+CGQMIN=1
OK AT+CGQREQ=1
OK "ATD \052 99\052\052 2 \043"
REPORT CONNECT
'' CONNECT
Note the use of octal characters - Solaris' /usr/bin/chat doesn't like the caret (^), asterisk (*) or hash (#) in a chat script, so you have to work around that by using \136 for caret, \052 for asterisk, and \043 for hash. Also, note that the prepaid solution uses an Access Point Name (APN) of "vfprepaymbb" rather than the contract/postpaid "vfinternet.au".
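For the curious, those octal values really do map to the characters in question; here's a quick Python check of the table (purely illustrative, nothing Solaris-specific):

```python
# The chat script's octal escapes: \136, \052 and \043 are the octal
# character codes for caret, asterisk and hash respectively.
for code, ch in ((0o136, '^'), (0o052, '*'), (0o043, '#')):
    assert chr(code) == ch
print("all escapes check out")
```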
The other thing of interest is the ":0.0.0.0" in the peers file. This is what I needed to add in order to get around the problem:
[ID 702911 daemon.debug] Peer refused to provide his address; assuming 192.168.1.1
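As I understand it, pppd's addressing option takes the form local:remote, and leaving the local half empty tells pppd to accept whatever address it's assigned during negotiation. A tiny illustration of how that option string breaks down (plain Python, purely to show the structure):

```python
# pppd's address option is "local:remote". With ":0.0.0.0" the local
# half is empty (take whatever address we're given) and 0.0.0.0 stands
# in for the remote address, so negotiation can proceed even when the
# peer refuses to state its own address.
option = ":0.0.0.0"
local, remote = option.split(":")
print(repr(local), remote)  # '' 0.0.0.0
```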
Once I'd done that, things seemed to work just fine. It was rather weird to see a ping time from my non-global zone to the laptop via 3G taking about 170msec even though both machines are within my arm's reach!
I just need to get some ip-up and ip-down scripts figured out, and then I'll be all raring to go.
One final thing: there is apparently a reasonably annoying bug with usbsacm:
6840063 usbsacm stops sending data out when pushed hard (fixed in snv_120), and an RFE
I don't know for sure whether this works if you're not running snv_120, but rather than disabling a core, you could try
# pbind -b 0 `pgrep pppd`
as part of your ip-up script. I'm going to try it and see.
Now that I've had a weekend to start recovering from Kernel Conference Australia, it's time to start catching up on a few blog entries. There'll be several over the next few weeks as I work through everything I want to mention.
A massive thankyou to our volunteers on the ground (James Lever, Greg Black, Nikolai Lusan, Daniel Dawson) who all did a stellar job and made it possible for me to be the shiny happy face of the conference.
Thankyou to the review committee (Jake Carroll, Andre van Eyssen, David Gwynne and John Sonnenschein) who helped put the program together.
Thankyou to our speakers, without whom there would not have been any conference to attend.
Thankyou to Claire Operie, Gabriele Decelis and Diana Pearce who handled the logistics of getting the event off the ground (registrations, the event website, catering, swag etc).
Thankyou to Mitch Roberts for setting up the demo room and who answered hundreds of questions about pretty much every piece of technology Sun sells. He also showed off some really amazing VDI and demonstrated the Fishworks kit. Mmmmmmmmmmmm!
Thankyou to Jake Carroll for setting up the infrastructure we needed at the Queensland Brain Institute. Thankyou especially to QBI's director Professor Perry Bartlett, FAA and Ian Duncan for generously allowing us to use QBI's facilities for the conference.
Thankyou to the Brisbane branch of Sun Microsystems who got behind the idea and provided connections, spread the word and encouraged people to come along.
Thankyou to everybody who attended KCA, who thought it would be something worthwhile. I hope you went away excited, energised and enthused about all the really cool technologies that you can find in Open Source kernels, and even be inspired to contribute to your favourite kernel in the future.
A number of people have asked me over the last few days whether the presentations will be made available online. The answer is a resounding YES, but not just yet. What the review committee (well, just Andre and myself) are doing is creating a Proceedings of the conference. Within that we will have the papers or slideware and speaker notes that each presenter wrote. When we have that finished I will announce it on this blog, and I will email all the delegates to provide details on how to obtain a copy. (Hopefully it'll be in the National Library of Australia with proper CiP data too).
The videos that we recorded are being cleaned up and will be uploaded to somewhere with a lot of space. Soon. I don't know when, but when I do know, I'll announce it here.
The feedback I've received from people attending KCA has generally been very positive about the event, and encourages me to organise another KCA for next year. We'll have to wait and see how things pan out following the buyout vote, but I'm very hopeful that we'll be able to make it happen.
The speaker presentations (slideware and copious notes, as well as plain text so we can incorporate them into the forthcoming Proceedings volume) are coming in and are looking really good. It'll take a bit of time after the conference ends before we can get the Proceedings finalised, but the presentations themselves will be available fairly soon afterwards; I'll mention the url here when I get it finalised.
There's still time for you to register and come along to hear, learn from and hang out with some of the finest minds in Open Source today.
Dates: 15 - 17 July, 2009
Venue: Queensland Brain Institute, University of Queensland
Just a short reminder that if you want to come to Kernel Conference Australia at the earlybird price of $195, then you've got until the end of this Friday, 12 June, to get your registration happening.
If you're interested in any of these areas:
The student price is still $95, too.
For full abstracts please see http://wikis.sun.com/display/KCA2009/KCA2009+Conference+Agenda
For the conference homepage, see http://au.sun.com/sunnews/events/2009/kernel/
And for registration, Go Without Delay To
I look forward to seeing you there.
I work at Oracle in the Solaris group. The opinions expressed here are entirely my own, and neither Oracle nor any other party necessarily agrees with them.