Heads up on Kernel patch installation issues with jumpstart or ZFS Root
By Gerry Haskins on Jun 25, 2009
I'd like to give you a heads-up on a couple of Kernel patch installation issues:
1. There was a bug (since fixed) in the Deferred Activation Patching functionality in a ZFS Root environment on x86 only. See Sun Alert 263928. An error message to the effect that a Class Action Script has failed to complete, and that the environment for Deferred Activation Patching could not be set up, may be seen. The relevant CR is 6850329: "KU 139556-08 fails to apply on x86 systems that have ZFS root filesystems and corrupts the OS". The following error message is returned:
mv: cannot rename /var/run/.patchSafeMode/root/lib/libc.so.1.20102 to /lib/libc.so.1: Device busy
ERROR: Move of /var/run/.patchSafeMode/root/lib/libc.so.1.20102 to dstActual failed
usage: puttext [-r rmarg] [-l lmarg] string
pkgadd: ERROR: class action script did not complete successfully
Installation of <SUNWcslr> failed.
This issue is fixed in Patch Utilities patch 119255-70 or later revision.
2. There are reproducible issues using jumpstart finish scripts and other scenarios to install Kernel patch 137137-09 followed by Kernel patch 139555-08. Here's the gist of the issue which I've pulled from an engineering email thread on the subject:
Issue 1: I have a customer whose system is not booting after applying the patch cluster with Live Upgrade (LU).
Solution 1: If using 'luupgrade -t', then you must ensure that the latest version of the LU patch is installed first. Currently the latest revision is 121430-36 on SPARC and 121431-37 on x86. Once these patches are installed, LU will automatically handle the build of the boot archive when 'luactivate' is called, thus avoiding the problem.
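As a rough illustration of that pre-check, here is a minimal sketch that reads `patchadd -p` style output and reports the highest installed revision of a given patch. The function names `installed_rev` and `rev_at_least` are made up for this example, and it assumes a POSIX-style shell (for instance ksh on Solaris 10) rather than the legacy /bin/sh.

```shell
#!/bin/sh
# installed_rev BASE
# Reads `patchadd -p` style lines ("Patch: 121430-36 Obsoletes: ...") on stdin
# and prints the highest revision seen for patch BASE, or nothing if absent.
# (Hypothetical helper name, not part of any Solaris tool.)
installed_rev() {
    base=$1
    max=0
    while read -r tag id rest; do
        [ "$tag" = "Patch:" ] || continue
        case $id in
            "$base"-*) ;;
            *) continue ;;
        esac
        rev=${id#*-}    # text after the dash, e.g. "36" or "08"
        rev=${rev#0}    # drop one leading zero so "08" is compared as 8
        if [ "$rev" -gt "$max" ]; then
            max=$rev
        fi
    done
    if [ "$max" -gt 0 ]; then
        echo "$max"
    fi
}

# rev_at_least BASE MIN
# Succeeds if stdin shows patch BASE installed at revision MIN or later.
rev_at_least() {
    have=$(installed_rev "$1")
    [ -n "$have" ] && [ "$have" -ge "$2" ]
}
```

On a SPARC system one might then run `patchadd -p | rev_at_least 121430 36` before calling 'luupgrade -t', and stop if it fails.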
Issue 2: There are other ways to get oneself into situations where a boot archive is out of sync, e.g. if using jumpstart finish scripts to apply patches that include 137137-09. Basically any operation that involves patching to an ABE outside of 'luupgrade' will involve a manual build of the boot-archive.
Solution 2: One must manually rebuild the boot-archive on the /a partition after applying the patches. Otherwise once the system boots, the boot-archive will be out of sync.
Here's some more detail on the jumpstart finish script version of this:
We've seen the same panic a few times when the latest patch cluster is applied via a finish script to a boot environment prior to s10u6 via a jumpstart installation. It appears that the boot archive is out of sync with the kernel on the system. The boot archive was created from the 137137-09 patch and not updated after the 139555-08 kernel was applied, therefore the mismatch between the kernel and the boot archive.
In these instances updating the boot archive allows the system to boot successfully. Boot failsafe (ok boot -F failsafe) will detect an out of sync boot archive. Execute the automated update then reboot. This will now boot from the later kernel (139555-08) which successfully installed from the finish script.
I reproduced the problem in a jumpstart installation environment applying the latest 10_Recommended patch cluster from a finish script. The initial installation was S10U5 which is deployed from a miniroot that has no knowledge of a boot archive (my theory anyway). This is similar to a live upgrade environment if the boot environment doing the patching is also boot archive unaware (meaning the kernel is pre 137137-09).
In the jumpstart scenario the immediate problem was solved by updating the boot archive by booting failsafe as previously described. The permanent solution was to update the boot archive from the finish script after the patch cluster installation completed. BTW, all patches in the patch cluster installed successfully per /var/sadm/system/logs/finish.log.
In a standard jumpstart the boot device (install target) is mounted to /a, therefore adding the following entry to the finish script solved the problem:
/a/boot/solaris/bin/create_ramdisk -R /a
Depending on the finish script configuration and variables, the following would also work:
$ROOTDIR/boot/solaris/bin/create_ramdisk -R $ROOTDIR
Issue 3: The above issues are sometimes mis-diagnosed as CR 6850202: "bootadm fails to build bootarchive in certain configurations leading to unbootable system".
But CR 6850202 will only be encountered in very specific circumstances, all of which must occur in order to hit this specific bug, namely:
1. Install u6 SUNWCreq - there's no mkisofs so we build ufs boot archive
2. Limit /tmp to 512M - thus forcing the ufs build to happen in /var/run
3. Have a separate /var - bootadm.c only lofs nosub mounts "/" when creating the alt root for DAP patching build of boot archive
4. Install 139555-08
You must have all 4 of the above in order to hit this, i.e. step 4 must be installing a DAP patch such as a Kernel patch associated with a Solaris 10 Update such as 139555-08.
Solution 3: Removing the 512MB limit (or whatever limit has been imposed) on /tmp in /etc/vfstab and/or adding SUNWmkcd (and probably SUNWmkcdS) so that mkisofs is available on the system is sufficient to avoid the code path that fails this way.
If the system has already been left unbootable, booting failsafe and recreating the boot archive from there will succeed.
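For anyone who wants to check a system against the first three conditions before patching, here is a minimal sketch. The function names are hypothetical, it assumes a POSIX-style shell (e.g. ksh) and a UFS layout described in /etc/vfstab, and it does not look at condition 4 (which patch is about to be installed) or at the actual size of the /tmp limit.

```shell
#!/bin/sh
# tmp_is_size_limited VFSTAB
# Succeeds if the /tmp entry in the given vfstab carries a size= mount option.
tmp_is_size_limited() {
    while read -r dev fsckdev mnt fstype pass atboot opts; do
        case $dev in '#'*|'') continue ;; esac   # skip comments and blank lines
        if [ "$mnt" = "/tmp" ]; then
            case ",$opts," in
                *,size=*) return 0 ;;
            esac
        fi
    done < "$1"
    return 1
}

# has_separate_var VFSTAB
# Succeeds if /var is listed as its own mount point in the given vfstab.
has_separate_var() {
    while read -r dev fsckdev mnt rest; do
        case $dev in '#'*|'') continue ;; esac
        [ "$mnt" = "/var" ] && return 0
    done < "$1"
    return 1
}

# matches_cr6850202_setup [VFSTAB]
# Succeeds only if mkisofs is missing AND /tmp is size-limited AND /var is
# separate, i.e. conditions 1 to 3 from the list above all hold.
matches_cr6850202_setup() {
    vfstab=${1:-/etc/vfstab}
    command -v mkisofs >/dev/null 2>&1 && return 1
    tmp_is_size_limited "$vfstab" || return 1
    has_separate_var "$vfstab" || return 1
    return 0
}
```

If `matches_cr6850202_setup` succeeds on a system that is about to receive a DAP Kernel patch, applying Solution 3 first would be the cautious choice.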
Here's further input from one of my senior engineers, Enda O'Connor:
If using Live Upgrade (LU), and LU on the live partition is up to date in terms of the latest revision of the LU patch, 121430 (SPARC) and 121431 (x86), the boot-archive will be built automatically once the user runs shutdown (after luactivate to activate the new BE). This is done from a kill script in rc0.d.
If using a jumpstart finish script, or jumpstart profile to patch a pre-U6 image with latest kernel patches, then you need to run create_ramdisk from the finish script after all patching/packaging operations have been finished. Alternatively, you can patch your pre-U6 miniroot to the U6 SPARC NewBoot level (137137-09), at which point the modified miniroot will handle the build of the boot_archive after the finish script has run.
If patching U6 and upwards from jumpstart, the boot_archive will get built automatically after finish script has run, so there's no issue in this scenario.
If using any home grown technology to patch or install/modify software on an Alternate Boot Environment ( ABE ), such as ufsrestore/cpio/tar for example, you must always run create_ramdisk manually before booting to said ABE.
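To make the finish script and home grown cases concrete, here is a minimal sketch of a helper that rebuilds the boot archive on an alternate root. The function name `rebuild_boot_archive` is made up for this example; the create_ramdisk path and the -R option are the ones quoted earlier, and /a is assumed as the default because that is where a standard jumpstart mounts the install target.

```shell
#!/bin/sh
# rebuild_boot_archive [ROOT]
# Rebuilds the boot archive for the boot environment mounted at ROOT
# (default /a). Hypothetical helper, meant to run after all patching
# and packaging steps have finished.
rebuild_boot_archive() {
    root=${1:-/a}
    tool="$root/boot/solaris/bin/create_ramdisk"

    if [ ! -x "$tool" ]; then
        # Tool not present on the target: report it and let the caller decide.
        echo "create_ramdisk not found under $root, boot archive not rebuilt" >&2
        return 1
    fi

    # Same invocation as in the finish script entry above.
    "$tool" -R "$root"
}
```

In a finish script the last line would then be something like `rebuild_boot_archive "${ROOTDIR:-/a}"`; for an ABE populated with ufsrestore/cpio/tar, pass the ABE's mount point instead.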