Saturday Jun 28, 2008

VirtualBox meets JET...

So first, the why. I was working on a project (described in detail in earlier blahg entries) that used JET/Jumpstart to maintain "golden images" of systems for provisioning in the application and web tiers. First reason for doing this was just to have a JET configuration that I could play with in my hotel room while waiting for room service. Sitting around the office behind the firewalls just to get access to the boxes can be a bummer after about 8pm.

Second reason for this, I'm a geek. Once I had the idea in my head, I just had to get it to work. Third reason, it is a really handy way to set up machine variants within VMs on your desktop or laptop for demos and testing software. Yeah, you can "clone" VMs, and copy the disk images and configs around. But you never know when you are going to "fat finger" a VM, or forget some important step when munging things by hand. Using a JET server allows you to save a config, re-use it, and always start with a fresh "machine" to work on, in a known state. You can even save off your favorite VMs as "flar" images to restore on demand; using compressed images can save *tons* of space, which is especially important when you are on a disk-limited road warrior laptop system.

Step one, download Sun xVM VirtualBox and read all of the assorted docs and technical info. Join the community, read the Wikis, read some blogs.

Step two, install the downloaded package on your system of choice. In my case, I am running it on my laptop and desktop machines, both under Windows XP. There are lots of folks running other host systems, read the info, read the blogs.

Step three, download your Solaris or OpenSolaris DVD image of choice. No need to burn the image on to DVD media, we'll just mount the ISO DVD image as a virtual DVD later. While you are cruising, join some discussions, cruise through the forums, read some documentation, join the movement. (Can you tell I really like OpenSolaris?)

Step four, (well, maybe after you play around with VirtualBox a bit and get comfortable with it) create your Jet Server virtual machine (VM). Double click on the desktop icon, and then click the "New" button. For my configuration, I went with a 32G root disk, added in 32G for /export (only allocated from your hard drive as you use the space), 512Meg of RAM (I have 8Gig on my desktop and 4Gig in my laptop), and defined the CD drive to be a virtual drive, mounting the ISO image that we just downloaded.

The networks here are the tricky part. I am using a "Back End Network" (BEN) for management. I will keep my JET/Jumpstart traffic on the BEN, and use a second interface as a "public network". My BEN network is defined as "internal network" or "intnet" (inter-VM traffic only). My external interface could be NAT or host bridge; either way, the Windows host can get to the VM, and the VM can see the outside world (Firefox can get to Google). I configured the VM through the GUI, but then ran into a little "issue".

There was a bug in the earlier (1.3.X) version of VirtualBox that broke "intnet" network connections. A couple packets would fly back and forth between VMs, and then the connections would just disappear. You could snoop and see packets going out, and sometimes being received, but answering packets never made it back to the originating host. I don't know if that is fixed in the current version (1.6), but there is a workaround. In the Windows host, click [Start] -> [Run], and then type in "cmd" to get a shell window. Use "cd" to go to the VirtualBox installation directory (C:\Program Files\Sun\xVM VirtualBox in my case), and use the command line utility VBoxManage.exe. The magic mojo here is (for my VM named JETserver):

  "C:\\...xVM VirtualBox> VboxManage modifyvm JETserver -nic1 intnet

Note that the network interfaces are "nic1" (my BEN) and "nic2" (my public interface).
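Both interfaces can be set in one pass. Here is a dry-run sketch (RUN=echo just prints the commands; drop it to run them for real from the VirtualBox install directory). The "nat" value for nic2 is my assumption for the public side; substitute your host-bridge setting if that is what you use:

```shell
# Dry-run helper: print the VBoxManage calls that pin nic1 to the
# internal network (the BEN) and nic2 to NAT for a named VM.
RUN=echo
set_jet_nics() {
    $RUN VBoxManage modifyvm "$1" -nic1 intnet   # BEN: VM-to-VM traffic only
    $RUN VBoxManage modifyvm "$1" -nic2 nat      # public side of the VM
}
```

Running `set_jet_nics JETserver` prints the two commands so you can sanity-check them before removing the RUN=echo line.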

Now just click the "Start" button, followed quickly by pressing "F12" on your keyboard to choose the boot device. This option only appears during the splash screen of the virtual BIOS, so you have a second or two to hit the key. Choose CDROM for your boot device and away you go installing Solaris as usual. You can also juggle the boot devices under [General] -> [Advanced] -> [boot devices] in the configuration pane. Make sure that if you created a second drive for the JET stuff (as I did), you set it up during the install, or fdisk/format/newfs it once the machine is booted. Also make sure that you set your boot devices again, unmount the virtual CDROM device, or hit F12 really quickly again to boot from disk after the install.

There we go, a Solaris VM. Yay! Now we need to make it into a JET server.

Step five (OK, so four was rather long), download the Jumpstart Enterprise Toolkit (JET) software from the JET home on BigAdmin. BigAdmin is your friend. Cruise around, read the JET Wiki, read docs, join communities, etc. as usual. Also take a cruise through Mike Ramchand's blog, and check out the Yahoo! JETJumpStart group.

Step six, follow the instructions for installing JET, using that ISO file again as a virtual CDROM device. Make sure you run:


to get the DHCP and PXE boot stuff in place, since we will be using those goodies to boot our client VMs.

The documentation for JET is included in the package in PDF format. Read it (at least give it a cursory pass through) before continuing. The simple stuff is really simple. The more complex and intricate operations can get tricky.

Step seven, create a new VM to be a JET client. This is simple; here is a screenshot of my JSclient machine to show the settings (don't forget to jump out to the command line again for VBoxManage.exe after you configure it through the GUI):

Take note of the ethernet address of your virtual adapter, and use that in your JET template. Make sure you assign the IP address of your JET client in the template file to be on the same network as your BEN "intnet" connection on the JET server VM. If you want to get fancy, you can add a public interface to the JET client configuration and add that to the JET template as well (DHCP or fixed IP with default router and DNS server info in the template).
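As a sketch, the relevant client template lines look something like this. The variable names here are from my memory of the JET base_config template and the values are invented, so check the template shipped with your JET version before trusting any of it:

```shell
# Hypothetical values; 08:00:27 is the VirtualBox MAC prefix (OUI),
# and the IP must sit on the same subnet as the BEN "intnet"
base_config_ClientEther=08:00:27:c1:9e:42        # from the VM's network settings
base_config_sysidcfg_ip_address=192.168.1.101    # BEN subnet, made-up address
base_config_sysidcfg_netmask=255.255.255.0
```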

Step eight, boot the sucker, hit "F12", and choose network boot (PXE) (or again, juggle your boot devices under the [General] -> [Advanced] -> [boot devices] configuration pane). If things don't work as planned, you can snoop your nic1 from the JETserver VM to debug things. That's it. Simple. I'll post some more screen shots and details of my template files as time permits.


Friday Jun 20, 2008

Pass the buck (or a message) using motd & EEPROM...

Another nifty challenge that we had on this project was juggling resources. We had about 8 machines in our development lab, and about 30 engineers working on them at any given time. One machine was reserved as the Jumpstart/JET server, one as the N1SPS/N1SM server, and two for cluster development. One machine was mostly dedicated to the "golden flar image" development, leaving three machines for everyone else to scramble for as target server systems for deployment and functional/unit testing.

This qualifies as the hack of the day for me. We could all send emails around, but the volume and conflicting needs would be overwhelming. We could (and did) maintain a spreadsheet and whiteboard of who had what assets reserved for what periods of time. We could (and did) keep the assets in the project plan docs, though "issues" and "need a machine for debugging this new error that just popped up" often make those things out of date before they can be sent to the printer.

We ended up assigning a "group owner" to the machines as they were allocated. So the "Solaris" team might own a machine for a couple hours or a couple days. That person was responsible for knowing who was working on the machine, the state of the machine, and what tasks were to be completed before giving the machine back to the pool for reassignment.

Simple things like adding/deleting/cloning zones, messing with ndd settings, playing with secondary network adapters to configure IPMP aren't conflicts. Re-deploying or JET installing a machine that people are counting on could set teams back several hours or days. How can you pass a message other than having people run down the hall, and pop into the work rooms one at a time screaming "who just rebooted my machine, what's happening?"

You can place a message in the /etc/motd file (yeah, it is old school, but it works) letting everyone know what is happening on the machine. The "group owner" of the machine at any given time is responsible for placing notes there to keep incoming users informed as to what activities and instabilities might exist on the machine, as well as the contact info for the machine owner.

somehost# cat /etc/motd

Machine:  somehost
Owner:    Bill Walker (703.555.1234)
Purpose:  JET/flar work

Current State:  
        Al:  Doing ndd settings testing
        Jim:  Working on Packaging tools
        Bill:  Working on Issues 118 and 181, SSH key mismatch issue
               Working on Issue 58, root homedir changes halfway
                   through provisioning process.

The other nifty item is the EEPROM's oem-banner variable. If the /etc/motd can inform folks coming in as to what is being done and by whom, the EEPROM oem-banner can alert folks who might be coming in through the ILOM and trying to re-provision or re-JET the machine out from under you. The last thing you want after working on a machine for 3-4 hours is to see "Connection closed", and find out that someone just typed "boot net - install" at the ok> prompt.

Historically, the OBP's oem-banner variables were created so that hardware manufacturers could "relabel" machines and resell them. Replacing the skin of the machine and putting a new logo on it was easy, but the console power-on message at POST time had a banner identifying the machine model and manufacturer, and on graphic (frame buffer) equipped systems, a graphic image depending on the machine and frame buffer type. The oem-banner and oem-logo variables allow machines to display different banners and graphic logos, essentially hiding the manufacturer of the machine for OEMs. We are going to appropriate these data elements for our own use in this case.

somehost# eeprom oem-banner="STOP!!!  System Group box for flar development
somehost> call before re-installing!  Bill Walker (703.555.1234)"
somehost# eeprom oem-banner?=true

Now when someone goes to the console to re-install the box, they will see:

{0} ok 
{0} ok boot net - install

STOP!!!  System Group box for flar development
call before re-installing!  Bill Walker (703.555.1234)

Boot device: /pci@0/pci@0/pci@1/pci@0/pci@2/network@0  File and 
args: - install
1000 Mbps full duplex  Link up
Requesting Internet Address for 0:14:4f:d3:9a:0
Requesting Internet Address for 0:14:4f:d3:9a:0

Since the "Requesting Internet Address" stuff takes several minutes, they should have plenty of time to go back to the ILOM, issue a "break -y", and make the call before the disks get scribbled over.
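When the work wraps up and the machine goes back to the pool, the banner can be turned back off. This is standard eeprom usage (quoting the ? keeps the shell from trying to glob it):

```shell
somehost# eeprom "oem-banner?=false"
somehost# eeprom oem-banner=""
```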

Your mileage may vary, but I thought this was a good idea. :)


Thursday Jun 19, 2008

JET checkpoint, our milestone...

At this point, we have the packages and patches that we want, and the disk configuration that we want. We have a running system that we can fiddle with and install configuration files into. From this point forward, the system flar becomes evolutionary. Files like complex resolv.conf, NTP configurations (ntp.conf), passwd, shadow, netmasks, networks, sshd_config and the like can be configured on the JET provisioned host and then wrapped up into our "Golden Image" flar for N1SPS/N1SM to use.

For this project, we tried to maintain a line of separation between the "system" side of the configurations, and the "deployed applications" side. Within the deployed applications pile, we have two groups: IT and management applications, and end-user applications. Things like SunMC, backup tools, and administrative utilities are deployed through the IT and management steps. End-user applications (web servers, application servers, message queue) will be provisioned through other steps owned by different groups under N1SPS and N1SM (with DI Dynamic Infrastructure). Rather than use both JET and N1SPS/N1SM to maintain our system configuration, we concentrated on the JET side, and only worked within the deployment frameworks where it was necessary.

One accidental side effect that we leveraged early on was provided by the JET framework itself. All of the JET generated scripts that twiddle the bits after the Jumpstart of the system are installed on the JET installed machine. In /var/opt/sun/jet, we find the post_install scripts that were used to create the system. This includes the scripts to set up the root disk mirroring, the metadbs, and the zone partitioning. The scripts for other configuration tasks, generating ssh keys, setting ndd configurations, enabling/disabling services, installing our JASS/SST driver, adding services that will run a JASS audit every time the system boots, and our /etc/system setup script that I noted earlier, are also present.

Since all of this work was done to automate the installation and make our repetitive configuration tasks easier and safer, why not leverage that work in the deployment frameworks? So our flar not only contains the software that we want to be installed, and the configurations of the services within that system, but it also contains the necessary scripts to deploy that image and those configurations on a piece of hardware, and regenerate a lot of the settings and configuration dynamically. Very cool.

As an example of how this is a timesaver for us, we had to juggle the zone soft partition sizes in mid-project. If we were using the N1SPS and N1SM (including DI Dynamic Infrastructure goodies) plans to deploy the disk configurations, we would have had to juggle configuration information for all of our configured hosts in the test environment. In our case, we modified the JET template with the new sizes, ran the "make_client -f", did a "boot net - install" on our test machine, and then rolled up a new flar revision for the deployment folks to use with the new disk configuration embedded in the flar through the inclusion of the JET generated scripts. Grand total time to do this was a couple hours, and because this was a "system side" change, the IT/management and end-user applications folks under N1SPS and N1SM were never impacted by the change. In fact, they didn't even notice that the change had taken place. Very cool.

Late in the project, we did repeat all of our evolutionary steps on a single flar revision just to test our release notes and documentation. We went back and installed the packages we wanted from the DVD, and applied all of our documented changes to produce the "final flar" image from scratch. This was a great way to test our documentation and our procedures, and did uncover a few issues for us. Still, grand total time to produce the final system image flar with embedded scripts was one day, including testing. Nice.

One of my co-workers (Hi Jim!) became deeply involved with the packaging and cluster mechanics of Solaris distributions in this project. He wrote several tools (that I will let him write about at a future date and hopefully contribute to OpenSolaris) for juggling and debugging the package dependencies and installation order information. He wrote one tool in particular that I found incredibly useful. The tool "cooks" the package information from the installation media, takes a snapshot of the pkginfo from an installed machine, and creates the ordered list of packages needed to satisfy all of the pkgadd dependencies. This will allow us to create our own "metaclusters". So we can now add "SUNWCgolden" to the standard SUNWCall, SUNWCprog, SUNWCuser, and SUNWCmin Jumpstart options. No more "packages to add" and "packages to remove" as post-install tasks to generate our flar from the installation media. Jim rocks.
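Jim's tool is his own, but the core idea (order the packages so prerequisites install first) can be sketched with the SVR4 "depend" files and tsort(1). The media layout and the "P" (prerequisite) line format are standard SVR4 packaging; everything else here is my guess at a minimal version:

```shell
# pkg_order <dir>: print the packages under <dir> in dependency order,
# prerequisites first, by feeding "prereq pkg" edges to tsort(1).
# Assumes the SVR4 layout <dir>/<PKG>/install/depend, where lines
# starting with "P" name prerequisite packages.
pkg_order() {
    for pkg in "$1"/*; do
        [ -d "$pkg" ] || continue
        p=`basename "$pkg"`
        echo "$p $p"    # self-edge so packages with no deps still appear
        if [ -f "$pkg/install/depend" ]; then
            # each "P" line yields an edge: prerequisite before this package
            awk -v me="$p" '$1 == "P" { print $2, me }' "$pkg/install/depend"
        fi
    done | tsort
}
```

Something like `pkg_order /cdrom/Solaris_10/Product` (path hypothetical) would then print a pkgadd-safe ordering.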


In the zone with SVM and JET...

We now have a decent OS image, our root disk layout finished and mirrored, along with the associated metadbs live. You DID test at this point and make sure that the template file checks out and the target machine installs, right?

The partitions that we need to use for zone allocations are created for us, slice s6 on each of our 8 disks. The root disk and the root mirror disk have about 80G in s6, and the remaining 6 disks have about 143G in s6. We need to glue all of that free disk space together into a giant mirrored metadevice. We'll call it d100. In our sds_metadevices_to_init variable, we need to add:

d101 d102 d100

This created metadevices d101, d102 and d100. Metadevice d101 would be defined as a 4-way concatenation of partitions c1t0d0s6, c1t2d0s6, c1t4d0s6, and c1t6d0s6. Metadevice d102 would be defined as a 4-way concatenation of partitions c1t1d0s6, c1t3d0s6, c1t5d0s6, and c1t7d0s6. We didn't see a real need for striping in this configuration for performance, and we want to retain the option of stacking more disk partitions on top of the concatenation at some point, if that becomes an issue. Remember that these disks are on a single controller, with the throttles and bottlenecks associated with that hardware limitation.

The two metadevices d101 and d102 were then mirrored into a new metadevice named d100. This could have been accomplished by the command lines:

# metainit d101 4 1 c1t0d0s6 1 c1t2d0s6 1 c1t4d0s6 1 c1t6d0s6
# metainit d102 4 1 c1t1d0s6 1 c1t3d0s6 1 c1t5d0s6 1 c1t7d0s6
# metainit d100 -m d101
# metattach d100 d102

Now that we have a ~500GB disk partition called d100, we need to slice out the soft partitions. Since we were unable to use ZFS, SVM soft partitions give us the maximum flexibility when slicing out the zone partitions. For now, we will be creating fairly standard sizes and mountpoints for the zones to be created on, but that might change later.

The first zone (100) is for the Message Queue, and will contain 4 filesystems (two, plus Live Upgrade space). Zones 101 through 135 will use the same 6 partitions and sizes. Naming for the zone space metadevices is d[zone number][partition number].

	d1000 d1001 d1002 d1003
	d1010 d1011 d1012 d1013 d1014 d1015
	...
	d1350 d1351 d1352 d1353 d1354 d1355

This is the equivalent of running over 200 "metainit d#### -p d100 [size]" commands by hand. I created this section with a quick and dirty shell script: an outer loop counting up each zone, with an inner loop of echoes that created the lines for that zone. Ugly, but quick and effective.
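The generator loop looked roughly like this. The 4g soft partition size is invented, and the real script echoed JET template lines rather than commands, but the shape is the same:

```shell
# Emit the metainit equivalents for zones 100-135: four soft
# partitions for zone 100, six each for zones 101-135, named d<zone><part>.
gen_softparts() {
    zone=100
    while [ "$zone" -le 135 ]; do
        if [ "$zone" -eq 100 ]; then nparts=4; else nparts=6; fi
        part=0
        while [ "$part" -lt "$nparts" ]; do
            echo "metainit d${zone}${part} -p d100 4g"   # size is a placeholder
            part=`expr $part + 1`
        done
        zone=`expr $zone + 1`
    done
}
```

That emits 214 lines (4 + 35 x 6), which is where the "over 200 commands" figure comes from.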

# Creating filesystems
# If you wish to newfs any UFS filesystems, then you can do them from here.
# As a side effect, the devices need not be DiskSuite ones....
# sds_newfs_devices:	Space separated list of devices to newfs - full paths
# If you wish to specify extra newfs options, then use a custom script to
sds_newfs_devices=" /dev/md/dsk/d1000 

This tells JET to newfs all of those metadevices that we just softpartitioned out of the d100 space. This is the most time-consuming piece of the machine build, lasting about 10-15 minutes. Again, this section was initially created with a quick and dirty shell script full of "for" loops and "echo" statements.
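The newfs list came out of the same kind of loop. A trimmed sketch that emits just zone 100's metadevices:

```shell
# Print a JET sds_newfs_devices line covering zone 100's four metadevices
gen_newfs_line() {
    printf 'sds_newfs_devices="'
    for part in 0 1 2 3; do
        printf ' /dev/md/dsk/d100%s' "$part"
    done
    printf ' "\n'
}
```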

# sds_mount_devices:	space separated list of tuples that will be used
#			to populate the /etc/vfstab file
#	tuples of the form
#	blockdevice:mountpoint:fsckpass:mountatboot:options
#		blockdevice:	/dev/dsk/.... or /dev/md/dsk/....
#		mountpoint:	/export/big_raid_device	
#		fsckpass:	5th column from /etc/vfstab
#		mountatboot:	"yes" or "no"
#		options:	7th column from /etc/vfstab

This section creates the /etc/vfstab entries for the partitions that we just created. Yes, we are mounting the Live Upgrade space for both the global zone, and for our full root local zones, but that keeps folks from stealing space from my d100 or mounting those allocated partitions for other nefarious uses. The space is there and mounted into the global zone for when I need to use it for LU. Philosophical arguments on this item > /dev/null and as always, your mileage may vary.
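One tuple as an example, following the format in the comments above. The mountpoint name is made up, and the options field follows normal vfstab usage:

```shell
# blockdevice:mountpoint:fsckpass:mountatboot:options
sds_mount_devices="/dev/md/dsk/d1000:/zones/mq:2:yes:logging"
```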

Now we should be ready to cook our JET template and give it a shot:

jetserver# pwd

jetserver# ../bin/make_client -f myhost
Gathering network information..
        Client: (
        Server: (, SunOS)
Solaris: client_prevalidate
         Clean up /etc/ethers
Solaris: client_build
Creating sysidcfg
Creating profile
Adding base_config specifics to client configuration
Adding flash specifics to client configuration
FLASH: Modifying client profile for flash install
FLASH: Removing package/cluster/usedisk entries from profile
Adding eiscd specifics to client configuration
EISCD: EISCD: No profile changes needed
Adding sds specifics to client configuration
SDS: Configuring preferred metadevice numbers
Solaris: Configuring JumpStart boot for myhost
         Starting SMF services for JumpStart
Solaris: Configure bootparams build
cleaning up preexisting install client "myhost"
removing myhost from bootparams
updating /etc/bootparams
Force bootparams terminal type
-Restart bootparamd
Running '/opt/SUNWjet/bin/check_client  myhost'
        Client: (
        Server: (, SunOS)
Checking product base_config/solaris
Checking product flash
FLASH: Checking nfs://
Checking product custom
Checking product eiscd
Checking product sds
Check of client myhost

-> Passed....

YAY!!! We now have a template that (according to JET) might actually work. Let's go over to the target machine and give it a shot:

sc> break -y
sc> console

Enter #. to return to ALOM.

{0} ok 
{0} ok 
{0} ok boot net - install

Boot device: /pci@0/pci@0/pci@1/pci@0/pci@2/network@0  File and 
args: - install
1000 Mbps full duplex  Link up
Requesting Internet Address for 0:14:4f:d3:9a:0
Requesting Internet Address for 0:14:4f:d3:9a:0
Requesting Internet Address for 0:14:4f:d3:9a:0

... blah blah blah ...

Using sysid configuration file
Search complete.
Discovering additional network configuration...

Completing system identification...

Starting remote procedure call (RPC) services: done.
System identification complete.
Starting Solaris installation program...
Searching for JumpStart directory...
Using rules.ok from
Checking rules.ok file...
Using begin script: Utils/begin
Using derived profile: Utils/begin
Using finish script: Utils/finish
Executing JumpStart preinstall phase...
Executing begin script "Utils/begin"...
Installation of myhost at 09:45 on 19-Jun-2008
Loading JumpStart Server variables
Loading JumpStart Server variables
Loading Client configuration file
Loading Client configuration file
FLASH: Running flash begin script....
FLASH: Running flash begin script....
CUSTOM: Running custom begin script....
CUSTOM: Running custom begin script....
Begin script Utils/begin execution completed.
Searching for SolStart directory...
Checking rules.ok file...
Using begin script: install_begin
Using finish script: patch_finish
Executing SolStart preinstall phase...
Executing begin script "install_begin"...
Begin script install_begin execution completed.

So far so good, it found the right stuff. Now it should set up the disks and extract my flar.

Processing profile
	- Opening Flash archive
	- Validating Flash archive
	- Selecting all disks
	- Configuring boot device
	- Using disk (c1t0d0) for "rootdisk"
	- Configuring / (c1t0d0s0)
	- Configuring swap (c1t0d0s1)
	- Configuring /GZ_VAR_LU (c1t0d0s3)
	- Configuring /GZ_ROOT_LU (c1t0d0s4)
	- Configuring /var (c1t0d0s5)
	- Configuring  (c1t0d0s7)
	- Configuring  (c1t2d0s7)
	- Configuring  (c1t3d0s7)
	- Configuring  (c1t4d0s7)
	- Configuring  (c1t5d0s7)
	- Configuring  (c1t6d0s7)
	- Configuring  (c1t7d0s7)
	- Configuring  (c1t0d0s6)
	- Configuring  (c1t2d0s6)
	- Configuring  (c1t3d0s6)
	- Configuring  (c1t4d0s6)
	- Configuring  (c1t5d0s6)
	- Configuring  (c1t6d0s6)
	- Configuring  (c1t7d0s6)
	- Deselecting unmodified disk (c1t1d0)

Verifying disk configuration
	- WARNING: Changing the system's default boot device in 

Verifying space allocation
	NOTE: 1 archives did not include size information

Preparing system for Flash install

Configuring disk (c1t0d0)
	- Creating Solaris disk label (VTOC)

Configuring disk (c1t2d0)
	- Creating Solaris disk label (VTOC)

Configuring disk (c1t3d0)
	- Creating Solaris disk label (VTOC)

Configuring disk (c1t4d0)
	- Creating Solaris disk label (VTOC)

Configuring disk (c1t5d0)
	- Creating Solaris disk label (VTOC)

Configuring disk (c1t6d0)
	- Creating Solaris disk label (VTOC)

Configuring disk (c1t7d0)
	- Creating Solaris disk label (VTOC)

Creating and checking UFS file systems
	- Creating / (c1t0d0s0)
	- Creating /GZ_VAR_LU (c1t0d0s3)
	- Creating /GZ_ROOT_LU (c1t0d0s4)
	- Creating /var (c1t0d0s5)

Beginning Flash archive processing

Predeployment processing
16 blocks
16 blocks
16 blocks

No local customization defined

Extracting archive: patchedflar
	Extracted    0.00 MB (  0% of 1302.60 MB archive)
	Extracted    1.00 MB (  0% of 1302.60 MB archive)
	Extracted    2.00 MB (  0% of 1302.60 MB archive)
	Extracted 1302.00 MB ( 99% of 1302.60 MB archive)
	Extracted 1302.60 MB (100% of 1302.60 MB archive)
	Extraction complete

Postdeployment processing

No local customization defined

Customizing system files
	- Mount points table (/etc/vfstab)
	- Unselected disk mount points (/var/sadm/system/data/vfstab.unselected)
	- Network host addresses (/etc/hosts)
	- Environment variables (/etc/default/init)

Cleaning devices

Customizing system devices
	- Physical devices (/devices)
	- Logical devices (/dev)

Installing boot information
	- Installing boot blocks (c1t0d0s0)
	- Installing boot blocks (/dev/rdsk/c1t0d0s0)
	- Updating system firmware for automatic rebooting

Installation log location
	- /a/var/sadm/system/logs/install_log (before reboot)
	- /var/sadm/system/logs/install_log (after reboot)

Flash installation complete
Executing JumpStart postinstall phase...
Executing finish script "Utils/finish"...

YAY!!! Still going well. Now we'll see if the JET stuff actually works. At this point, we see lots of information about moving the scripts around, copying stuff in, and setting up for the reboots and activities that happen between those reboots. After the first reboot, we see:

Boot device: /pci@0/pci@0/pci@2/scsi@0/disk@0,0:a  File and args:
SunOS Release 5.10 Version Generic_127127-11 64-bit
Copyright 1983-2008 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
/dev/rdsk/c1t0d0s4 is clean
/dev/rdsk/c1t0d0s3 is clean
JumpStart (/var/opt/sun/jet/post_install/S99jumpstart) started @ Thu 
Jun 19 09:52:42 MST 2008
Loading JumpStart Server variables
Loading Client configuration file
Running additional install files for reboot Platform/1
NFS Mounting Media Directories
Mounting nfs:// on 
Mounting nfs:// on 
CUSTOM: Running 001.custom.001.set_etc_system
SDS: Running 002.sds.001.create_fmthard
SDS: Running 002.sds.001.set_boot_device
SDS: Running 002.sds.002.create_metadb
SDS: Running 002.sds.003.create_root_mirror
SDS: Running 002.sds.007.create_user_devices
SDS: Running 002.sds.008.create_filesystems
SDS: Running 002.sds.009.create_vfstab_entries
NFS Unmounting Media Directories
Unmounting /var/opt/sun/jet/js_media/pkg
Unmounting /var/opt/sun/jet/js_media/patch
Save existing system entries
Copying file /etc/system to /etc/system.prejs2
Jun 19 10:11:03 myhost reboot: rebooted by root

Jun 19 10:11:04 myhost syslogd: going down on signal 15

Jun 19 10:11:04 rpc.metad: Terminated

syncing file systems... done

YAY!!! Our disk configurations all executed without errors or warnings. Now let's see if it comes back clean after the reboot.

Boot device: rootdisk  File and args: 

SunOS Release 5.10 Version Generic_127127-11 64-bit
Copyright 1983-2008 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
/dev/md/rdsk/d4 is clean
/dev/md/rdsk/d3 is clean
/dev/md/rdsk/d1000 is clean
/dev/md/rdsk/d1001 is clean
/dev/md/rdsk/d1002 is clean
/dev/md/rdsk/d1003 is clean
/dev/md/rdsk/d1352 is clean
/dev/md/rdsk/d1353 is clean
/dev/md/rdsk/d1354 is clean
/dev/md/rdsk/d1355 is clean

... [lots of other activity deleted]

SDS: Running 003.sds.001.attach_mirrors
SDS: Running 003.sds.002.attach_user_mirrors
Disable & delete SMF tag svc:/site/jetjump
JumpStart is complete @ Thu Jun 19 10:15:55 MST 2008

myhost console login: 



Wednesday Jun 18, 2008

JET and SVM, a match made in heaven...

The flar that we created earlier now has all the packages and patches (oh yeah, JET patches the system for you as well) in it that we want, and has the basic filesystems installed for us on the root disk. We implemented the root disk layout (reserving s7 for metadb in the SVM sections later) with this section of the JET template:

#  Define Root disk configuration
#    Make sure that /var has a Live Upgrade slice
#    if /var is a separate filesystem.
#    Mount the LU spaces to make sure that someone
#    doesn't come along later and use that "free space".
#    s6 is the "freehog" partition to contain all free
#    space after the static slices are allocated
#    s7 is defined later and used as a metadb space.

Hint number one... Put lots of comments in your template files to remind yourself (and others who come along later) why and how you did things. These template files can be rather large and complex.

This section defines the rootdisk (reserved word in JET) with a / partition of 8G on s0, swap space of 32G on s1, /var of 8G on s5, and some Live Upgrade partitions for / and /var on s3 and s4 with matching sizes (important). The metadb space for SVM will be on s7 (defined later in the template), and any leftover space will be allocated to a partition on s6, but not mounted. We will add this free space to our pile of space for use by zones later on.
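Roughly what that section looks like. The variable names are from my memory of the JET base_config template (and the "free" freehog keyword is my recollection too), so verify them against your own copy before use:

```shell
base_config_profile_root=8192             # / on s0
base_config_profile_swap=32768            # swap on s1
base_config_profile_s3_mtpt="/GZ_VAR_LU"  # LU slice for /var, size must match
base_config_profile_s3_size=8192
base_config_profile_s4_mtpt="/GZ_ROOT_LU" # LU slice for /, size must match
base_config_profile_s4_size=8192
base_config_profile_s5_mtpt="/var"
base_config_profile_s5_size=8192
base_config_profile_s6_size=free          # freehog: all leftover space, unmounted
```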

# Any devices we need to skip (spare slices on root disks etc)
# Format: c?t?d?s?
sds_skip_devices="c1t0d0s6 c1t1d0s6"

This tells JET that we don't want to use the root mirroring phase to set up the partitions that we will use later as part of the zone disk space.

#  Additional disks under JET control.  Skip the
#  root mirror disk (c1t1) as it is defined in the
#  mirroring steps below.
base_config_profile_additional_disks="c1t2d0 c1t3d0 \
     c1t4d0 c1t5d0 c1t6d0 c1t7d0 "

This is where comments come into play. Without this comment, you might think that the root mirror disk should be listed in "additional disks under JET control"; it just makes sense. But no, that would make very ugly things happen.

Now that we have reserved the disks for JET to use, we need to layout a disk partition scheme for them. Again, we don't want to mount them, and we will use s7 later on to add metadbs for SVM to use:

#  Define layout of the additional disks to use
#  s6 as the freehog space.  s7 will have already
#  been reserved by the metadb definitions below.

That sets up all of our physical disks (except for the metadb stuff, but that will come along later). At this point, we could install the machine and make sure that all is well, and our rootdisk works as expected. The next step is to add in the SVM stuff. Let me repeat, now is a REALLY good time to stop, try things out, and make sure that your root disk and "additional disks" are configured the way you want them.

There are three basic pieces in the SVM configuration that we need to worry about. We need to set up the metadb copies, copy the configuration of the rootdisk and mirror it, and then we need to set up the leftover diskspace and put it into a big metadevice to use later for zone space soft partitions.

# Kernel options
# If you need to increase the number of metasets from the 
# default (4) or the number of metadevices per metaset from 
# the default (128), then enter the figures here.
# Please note that increasing these numbers can significantly 
# increase the time taken to do a reconfiguration boot, or a 
# drvconfig etc.

We needed to increase the default number of metadevices from 128 to something higher. We did some testing, and making this number in the thousands didn't hurt our performance, so we erred on the side of safety with 4000.
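
On the installed system that tuning lands in /kernel/drv/md.conf (not /etc/system); the net effect of the numbers above is roughly this fragment:

```
# /kernel/drv/md.conf - effect of the template values above
nmd=4000;
md_nsets=4;
```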

# This variable defines where SVM will create the metadbs.
# If any meta state databases are to be created, add 
# the c?t?d?s? number here. If you need multiple copies 
# (i.e. metadb -c 3), suffix the device with a : and the 
# number of copies. e.g. c0t0d0s7:3
# eg: sds_database_locations="c3t0d0s7:3 c1t0d0s7:3"
sds_database_locations="rootdisk.s7:3 c1t2d0s7:3 c1t3d0s7:3 \
     c1t4d0s7:3 c1t5d0s7:3 c1t6d0s7:3 c1t7d0s7:3"

In theory, according to the template comments, you don't need to specify the rootdisk or the root mirror disk in this variable. We didn't notice the comment until we already had a working configuration, and left things alone. Your mileage may vary, but we don't get warnings or errors with this configuration and everything is working fine for us.

We have set up metadb partitions in this step, and placed three copies on each metadb partition. Again, this was a part of the build specification, and traditional for the customer. That is 24 copies of the metadb, and I am not sure if I would have configured things this way if the choice was mine. Definitely do your due diligence and make your own configuration decisions wisely. Your mileage may vary.

# This variable ensures that partitions are created to 
# hold the metadbs defined above. 
# Specify locations in one of the following forms
#	s:size	       - creates s7 on the rootdisk
#	c?t?d?s:size    - creates slice on specified device
sds_database_partition="s7:32 c1t2d0s7:32 c1t3d0s7:32 \
      c1t4d0s7:32 c1t5d0s7:32 c1t6d0s7:32 c1t7d0s7:32"

We end up with eight metadb partitions, one on each disk, 32MB each: the seven listed here, plus the one JET creates on the root mirror because of the fmthard setting below. This section also reserves the proper space on s7 of the rootdisk. We had issues when we tried to configure the metadb partition on s7 in the section of the template where we defined the rest of the rootdisk partitions. JET is smart enough to slice s7 out for us before calculating the "free" space for s6 on the root disk.

# If the boot device alias needs setting then do it here
# ie sds_root_alias="rootdisk"
# This will update the boot-device field to ${sds_root_alias} 
# net and add the name to devalias, removing any previous 
# one of the same name

# If we do have a root mirror, then set the devalias device 
# to this name

# If you are using a two disk system and are mirroring the 
# root device,
# you may want to enable md:mirrored_root_flag in the 
# kernel (/etc/system).

# You should read the info doc about this and fully understand 
# the implications of setting it... i.e. it's not just a case 
# of always setting it!

Here we assign a name for the root mirror disk. This will be the alias used in the boot prom to set up the root mirror as a second bootable device. We also set the md:mirrored_root_flag in /etc/system. Definitely read the Infodoc that the template mentions and make your own decision on this one.

# By default, the root disk will be mirrored slice by 
# slice; the metadevices will start with d10 for the 
# first slice (sub-mirrors d11 and d12), d20 for
# the second slice (sub-mirrors d21 and d22) upwards.
# If you wish to use your own numbering scheme for the 
# metadevices, please specify them here, in the following 
# format- 
#    :mirror md:sub mirror 1:sub mirror 2

In this section, we are defining the device names for the partitions of the root disk. We were following a local numbering scheme, and this worked well for us. One interesting note here that took a couple of hours to debug: apparently a metadevice may be named "d0", but no name may begin with "d0" followed by anything else. So d01 is not allowed. d001 is not allowed. d02843 is not allowed. Oops. I'll definitely remember that one.
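
The rule we tripped over is easy to encode. Here is a hedged sketch (check_md_name is our own helper for illustration, not part of JET or SVM) that accepts "d0" and "d" followed by digits not starting with 0, and rejects everything else:

```shell
# Validate an SVM metadevice name per the rule above.
check_md_name() {
    case "$1" in
        d0)         return 0 ;;  # plain d0 is allowed
        d0*)        return 1 ;;  # d01, d001, d02843 are not
        d*[!0-9]*)  return 1 ;;  # non-digits after the leading d
        d[0-9]*)    return 0 ;;  # d11, d80, d2843 ...
        *)          return 1 ;;  # doesn't even start with d
    esac
}
```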

# If the root device is to be mirrored, define the mirror 
# here (c?t?d?).
# sds_use_fmthard can be set to "yes" | "no" | ""
# If sds_use_fmthard is set to "yes", then JET will create 
# metadb partitions and create the metadb as defined on 
# the root disk. You DO NOT need to specify the root mirror 
# in the sds_database_partition nor sds_database_locations 
# variable.
# If sds_use_fmthard is set to "no" or "", then JET will 
# create the data partitions for you, but you will have 
# to populate the sds_database_*  variables if you want a 
# metadb to exist on the root mirror.
# You MUST set fmthard=yes for Solaris 9 and above.
sds_root_mirror="c1t1d0"
sds_use_fmthard="yes"

Wow. That section is easy. Those two lines are all it takes to mirror the root disk. Just tell JET what disk to mirror it to, and tell JET to use "fmthard" to copy the partition table. This causes the installation to run a "prtvtoc" command against the configured root disk, and then feed that output to "fmthard -s" against the disk defined in sds_root_mirror. Simple.
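
Under the hood, the copy boils down to the classic one-liner, with our root disk and mirror plugged in:

```shell
prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2
```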

For our configuration though, we will add specific naming for the root disk and root mirror disk partitions and metadevice names:

#  sds_metadevices_to_init="d81:1:1:/dev/dsk/c1t0d0s0 d80:-m:d81"
#	Equivalent to
#		metainit d81 1 1 /dev/dsk/c1t0d0s0
#		metainit d80 -m d81
#       This will create a one-way mirror on d80 to d81.
#       Example of combined "" and "command line" syntax:
#	(this would all be on one line, but has been split for 
#       clarity)
#  sds_metadevices_to_init="d71 d72 d70 d81:1:1:/dev/dsk/c1t0d0s0
#				d82:1:1:/dev/dsk/c2t0d0s0
#				d80:-m:d81 d91 d92 d90"
sds_metadevices_to_init="d91 d92 d0 \
			d11 d12 d1 \
			d31 d32 d3 \
			d41 d42 d4 \
			d51 d52 d5"
There we have it. Those pieces of the template define the metadbs and the root disk mirroring, and set up the partitions that we will use later to create the space for our zones. Fairly painless and straightforward, and definitely easier and safer than doing all of the twiddly bits by hand. Absolutely easier than repeating those manual tasks for 100 servers too! Next entry, I'll delve deeper into the zone partition space allocation, and the soft partitioning that we used to accomplish that piece.

This, again, is a REALLY good time to stop and try the template out. At this point, we have sliced up the root disk, created metadbs, sliced up the extra disks, and mirrored the root disk. These are the really tricky parts that will break a machine in interesting ways that are more difficult to debug. If we know that these parts are working, then creating the spaces on our leftover partitions for the zones will happen on a running and (hopefully) stable system environment, making debugging much easier.


JET is your friend...

One of the challenges in this project is keeping an eye on the evolutionary qualities of the underlying components. In other words, we know that the basic elements of disk layout, packages added, configuration files, etc. will be in flux throughout the development and test cycles. In order to streamline these activities and make them a tad safer, we decided to implement as much as possible with the Jumpstart Enterprise Toolkit (JET).

JET allows you to create template files, describing not only basic Jumpstart configurations, but many of the most common "post install" tasks that administrators do manually. Things like JASS (Solaris Security Toolkit), Solaris Volume Manager (SVM), more complex network configurations like IPMP, and many other goodies are either implemented in JET already, or easily integrated as a repeatable and automated task. In our case, with IPMP, over 200 filesystems using SVM, and some post-install tasks (dropping in config files, setting up some standard /etc/system and ndd settings, etc.), JET is definitely a great first step on our road to our final goal of doing this through N1SPS/N1SM.

I won't cut and paste my whole template file into my blog, but I will note a few of the key sections. We started with an existing JET template from the current production environment. This had the benefit of bringing along several post-install scripts that installed default configuration files for syslog, NTP, and SSH, and gave us standard and proven locations and methods for installing some of the extra software that this project required (expect, some GNU tools, etc.). Since we are working with a flar and not using the standard pkgadd method, this line in the template file (from /opt/SUNWjet/Templates by default) gives us the starting system image that we want:

#       Identify flar to load

That's it. Just that line, and all the stuff from our hand-built machine is installed on the new box. Of course, we still need to specify a bunch of other configuration things, like disk layout, SVM configs, etc., but I'll get to those later.

In addition to our standard flar, we want to install some changes to /etc/system, and twiddle some extra bits. This is accomplished with a script that we wrote and saved in /opt/SUNWjet/Clients/common.files called set_etc_system. We just place that script into the JET common.files directory and configure the template to run it in reboot 1 (JET installs do several reboots after the initial flar load):

#  Override default custom scripts.  
#  Set default locale to POSIX/C to get around buggy use of "tr"

Our set_etc_system script isn't super complicated, or error-proof, but I am using it as a simple example. We wrote it to work around a mistake we kept making: forgetting to clean out settings between flar revisions. So for each setting that we want to be there, if the setting already exists, leave it alone; if it doesn't exist in /etc/system, append our setting to the file:

#!/bin/sh
set -a
#   set_etc_system
#   1.1 - Bill Walker < >
#   Set some sane /etc/system variables if they don't exist
BN=`/usr/bin/basename ${0}`

cat >> ${ROOTDIR}/etc/system << EOF
* Added by JET set_etc_system `date +%d%b%y`
EOF

if grep " autoup" ${ROOTDIR}/etc/system >/dev/null
then
        echo "${BN} : autoup already in /etc/system..."
else
        cat >> ${ROOTDIR}/etc/system << EOF
set autoup=480
EOF
fi

if grep " tune_t_fsflushr" ${ROOTDIR}/etc/system >/dev/null
then
        echo "${BN} : tune_t_fsflushr already in /etc/system..."
else
        cat >> ${ROOTDIR}/etc/system << EOF
set tune_t_fsflushr=60
EOF
fi

if grep " rlim_fd_cur" ${ROOTDIR}/etc/system >/dev/null
then
        echo "${BN} : rlim_fd_cur already in /etc/system..."
else
        cat >> ${ROOTDIR}/etc/system << EOF
set rlim_fd_cur=1024
EOF
fi
Simple but repetitive tasks such as this are amazingly easy with JET. Every time you find yourself doing some task more than once on an install or deployment, take a couple of minutes to create a script to do it for you, and add it to the custom_scripts list in your JET template. Cleaning out the SSH known_hosts entries for users, creating a complex /etc/resolv.conf file, adding a new service, making sure that certain services are disabled, or sending yourself an email or message reporting success or failure: scripting tasks like these saves time and makes deployment safer and faster.
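
As one more worked example, here is a hypothetical custom script in the same spirit (the name, layout, and messages are ours, not a JET standard) that empties root's SSH known_hosts on the freshly installed image so stale host keys don't carry over between flar revisions:

```shell
#!/bin/sh
# clear_known_hosts - hypothetical JET custom script example.
# JET sets ROOTDIR to the client's root mount point during the install.
clear_known_hosts() {
    KH="${1}/root/.ssh/known_hosts"
    if [ -f "${KH}" ]; then
        cp /dev/null "${KH}"    # truncate in place, keeping ownership/modes
        echo "cleared ${KH}"
    fi
}

clear_known_hosts "${ROOTDIR:-/a}"
```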

Not much new and exciting information here if you already use JET, but I'll be digging a bit deeper into JET in this blahg later.




