My own private crash-n-burn farm: using kernel zones for speedy testing

I've spent most of the last two years working on a complete rewrite of the ON consolidation build system for Solaris 12. (We called it 'Project Lullaby' because we were putting nightly to sleep). This was a massive effort for our team of 5, and when I pushed the changes at the end of February we wound up with about 121k lines of change over close to 6000 files. Most of those were Makefiles (so you can understand why I'm now scarred!).

We had to do an incredible amount of testing for this project. Introducing new patterns and paradigms for the complete Makefile hierarchy meant that we had to be very careful to ensure that we didn't break the OS. To accomplish this we used (and overloaded) the resources of the DIY test group and also made use of a feature which is now available in 11.2 - kernel zones.

Kernel zones are a type-2 hypervisor, so you can run a separate kernel in them. If you've used non-global zones (ngz) on Solaris in the past, you'll recall the niggle of having to have those ngz in sync with the global when it comes to SRUs and releases.

Using kernel zones offered several advantages to us: I could run tests whenever I wanted on my desktop system (a newish quad-core Intel Core i5 system with 32gb ram), I could quickly test updates of the newly built bits, I could keep the zone at the same revision while booting the global zone with a new build, and (this is my favourite) I could suspend the zone while rebooting the global zone.

Our testing of Lullaby in kernel zones had two components: #1 does it actually boot? and #2 assuming I can boot the kz with Lullaby-built bits, can I then build the workspace in the kz and then boot those new bits in that same kernel zone?

Creating a kernel zone is very, very easy:


limoncello: # zonecfg -z crashs12 create -t SYSsolaris-kz
limoncello: # zoneadm -z crashs12 install -x install-size=40g
limoncello: # zoneadm -z crashs12 boot

I could have used one of the example templates (eg /usr/share/auto_install/sc_profiles/sc_sample.xml) but for this use-case I just logged in and created the necessary users, groups, automount entries and installed compilers by hand. (Meaning pkg install rather than tar xf).

To start with, I ensured that crashs12 was running the same development build as my global zone, but I removed the various hardware drivers I had no need for.

The very first test I ran in crashs12 was a test of libc and the linker subsystem. Building libc is rather tricky from a make(1s) point of view, due to having several generated (rather than source-controlled) files as part of the base. The linker is even more complex - there's a reason that we refer to Rod and Ali as the 'linker aliens'! Once I had my fresh kz configured appropriately, I created a new BE, mounted it, then blatted the linker and libc bits onto it and rebooted. I was really, really happy to see the kz come up and give me a login prompt.

Several weeks after that we got to the point of successful full builds, so I installed the Lullaby-built bits and rebooted:


root@crashs12:~# pkg publisher
PUBLISHER TYPE STATUS P LOCATION
nightly origin online F file:///net/limoncello/space/builds/jmcp/lul-jmcp/packages/i386/nightly-nd//repo.osnet/
extra (non-sticky, disabled) origin online F file:///space/builds/jmcp/test-lul-lul/packages/i386/nightly/repo.osnet-internal/
solaris (non-sticky) origin online T http://internal/repo
root@crashs12:~# pkg update --be-name lul-test-1
root@crashs12:~# reboot

This booted, too, but I couldn't get any network-related tests to work. Couldn't ssh in or out. Couldn't for the life of me work out what I'd done wrong in the build, so I asked the linker aliens and Roger for help - they were quick to realise that in my changes to the libsocket Makefiles, I'd missed the filter option. Once I fixed that, things were back on track.

Now that Lullaby is in the gate and I'm working on my next project, I'm still using crashs12 for spinning up a quick test "system" and I'm migrating my 11.1 Virtualbox environment to an 11.2 kernel zone. The 11.2 zone, incidentally, was configured and installed in about 4 minutes using an example AI profile (see above) and a unified archive.

Kernel zones: you know you want them.

Comments:

Post a Comment:
Comments are closed for this entry.
About

I work at Oracle in the Solaris group. The opinions expressed here are entirely my own, and neither Oracle nor any other party necessarily agrees with them.

Search

Archives
« July 2015
SunMonTueWedThuFriSat
   
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
 
       
Today