Common Live Upgrade Problems
By User12611829-Oracle on Jun 30, 2011
To help with your navigation, here is an index of the common problems.
- lucreate(1M) copies a ZFS root rather than making a clone
- luupgrade(1M) and the Solaris autoregistration file
- Watch out for an ever growing /var/tmp
lucreate(1M) copies a ZFS root rather than making a clone

# lucreate -n s10u9-baseline
Checking GRUB menu...
System has findroot enabled GRUB
Analyzing system configuration.
Comparing source boot environment file systems with the file system(s) you specified for the new boot environment.
Determining which file systems should be in the new boot environment.
Updating boot environment description database on all BEs.
Updating system configuration files.
Creating configuration for boot environment .
Source boot environment is .
Creating boot environment .
Creating file systems on boot environment .
Creating file system for > in zone on .

The error indicator -----> /usr/lib/lu/lumkfs: test: unknown operator zfs

Populating file systems on boot environment .
Checking selection integrity.
Integrity check OK.
Populating contents of mount point >.

This should not happen ------> Copying.

Ctrl-C and cleanup

If you weren't paying close attention, you might not even know this is an error. The symptoms are lucreate times that are far too long because of the extraneous copy or, the one that alerted me to the problem, a root file system that is filling up, again thanks to a redundant copy.
This problem has already been identified and corrected, and a patch (121431-58 or later for x86, 121430-57 for SPARC) is available. Unfortunately, this patch has not yet made it into the Solaris 10 Recommended Patch Cluster. Applying the prerequisite patches from the latest cluster is a recommendation from the Live Upgrade Survival Guide blog, so an additional step will be required until the patch is included. Let's see how this works.
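Before applying the patch, it helps to know which revision is already installed. The following is a minimal sketch, not part of the original procedure: the function names installed_rev and patch_is_current are made up for illustration, and the only thing it relies on is the "Patch: ID-REV" field that patchadd -p prints. It assumes a POSIX shell (ksh or bash rather than the legacy Solaris /bin/sh) and an awk that accepts -v, which on Solaris 10 likely means nawk or /usr/xpg4/bin/awk.

```shell
# Hypothetical helpers; both read `patchadd -p` output on standard input.

# Print the highest installed revision of patch ID $1 (for example 121431).
# Prints nothing if the patch is not installed at all.
installed_rev() {
    awk -v id="$1" '
        $1 == "Patch:" {
            split($2, p, "-")                       # p[1] = ID, p[2] = revision
            if (p[1] == id && p[2] + 0 > max) max = p[2] + 0
        }
        END { if (max != "") print max }
    '
}

# Succeed if patch ID $1 is installed at revision $2 or later.
patch_is_current() {
    rev=$(installed_rev "$1")
    [ -n "$rev" ] && [ "$rev" -ge "$2" ]
}

# Example use on an x86 system (commented out so the sketch has no side effects):
# patchadd -p | patch_is_current 121431 58 || echo "121431-58 or later is needed"
```

Matching on the Patch: field alone avoids the false hits a plain grep for the patch ID gives on patches that merely list it under Requires:.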
# patchadd -p | grep 121431
Patch: 121429-13 Obsoletes: Requires: 120236-01 121431-16 Incompatibles: Packages: SUNWluzone
Patch: 121431-54 Obsoletes: 121436-05 121438-02 Requires: Incompatibles: Packages: SUNWlucfg SUNWluu SUNWlur

# unzip 121431-58
# patchadd 121431-58
Validating patches...
Loading patches installed on the system...
Done!
Loading patches requested to install.
Done!
Checking patches that you specified for installation.
Done!
Approved patches will be installed in this order:
121431-58
Checking installed patches...
Executing prepatch script...
Installing patch packages...
Patch 121431-58 has been successfully installed.
See /var/sadm/patch/121431-58/log for details
Executing postpatch script...
Patch packages installed:
SUNWlucfg
SUNWlur
SUNWluu

# lucreate -n s10u9-baseline
Checking GRUB menu...
System has findroot enabled GRUB
Analyzing system configuration.
INFORMATION: Unable to determine size or capacity of slice .
Comparing source boot environment file systems with the file system(s) you specified for the new boot environment.
Determining which file systems should be in the new boot environment.
INFORMATION: Unable to determine size or capacity of slice .
Updating boot environment description database on all BEs.
Updating system configuration files.
Creating configuration for boot environment .
Source boot environment is .
Creating boot environment .
Cloning file systems from boot environment to create boot environment .
Creating snapshot for on .
Creating clone for on .
Setting canmount=noauto for > in zone on .
Saving existing file in top level dataset for BE as //boot/grub/menu.lst.prev.
Saving existing file in top level dataset for BE as //boot/grub/menu.lst.prev.
Saving existing file in top level dataset for BE as //boot/grub/menu.lst.prev.
File propagation successful
Copied GRUB menu from PBE to ABE
No entry for BE in GRUB menu
Population of boot environment successful.
Creation of boot environment successful.

This time it took just a few seconds. A cursory examination of the offending ICF file (/etc/lu/ICF.3 in this case) shows that the duplicate root file system entry is now gone.
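The check for a duplicate root entry can also be scripted. This is a rough sketch under the assumption that ICF lines use the colon-separated layout that /etc/lu/ICF.3 has on this system (BE name, mount point, device, file system type, size). The function name root_entries is hypothetical, and the ICF format is not a documented interface, so treat the result as a hint rather than a verdict.

```shell
# Hypothetical helper: count the entries in an ICF file whose mount point is "/".
# Assumes fields are colon-separated with the mount point in field 2.
root_entries() {
    awk -F: '$2 == "/" { n++ } END { print n + 0 }' "$1"
}

# Example use (commented out; the file only exists on a Live Upgrade system):
# root_entries /etc/lu/ICF.3
```

A count above 1 would point at the duplicate root entry; a healthy file should report exactly 1.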
# cat /etc/lu/ICF.3
s10u8-baseline:-:/dev/zvol/dsk/panroot/swap:swap:8388608
s10u8-baseline:/:panroot/ROOT/s10u8-baseline:zfs:0
s10u8-baseline:/vbox:pandora/vbox:zfs:0
s10u8-baseline:/setup:pandora/setup:zfs:0
s10u8-baseline:/export:pandora/export:zfs:0
s10u8-baseline:/pandora:pandora:zfs:0
s10u8-baseline:/panroot:panroot:zfs:0
s10u8-baseline:/workshop:pandora/workshop:zfs:0
s10u8-baseline:/export/iso:pandora/iso:zfs:0
s10u8-baseline:/export/home:pandora/home:zfs:0
s10u8-baseline:/vbox/HardDisks:pandora/vbox/HardDisks:zfs:0
s10u8-baseline:/vbox/HardDisks/WinXP:pandora/vbox/HardDisks/WinXP:zfs:0

This error can show up in a slightly different form. When activating a new boot environment, propagation of the bootloader and configuration files may fail with an error indicating that an old boot environment could not be mounted. That prevents the activation from taking place, and you will find yourself booting back into the old BE.
Again, the root cause is the root file system entry in /etc/vfstab. Even though the mount at boot time flag is set to no, it confuses lumount(1M) as it cycles through the boot environments during the propagation phase. To correct this problem, boot back to the offending boot environment and remove the vfstab entry for /.
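To find the entry before editing anything, a small filter is enough. The sketch below is illustrative only: vfstab_root_entries is a made-up name, and it simply prints the non-comment lines whose third field (the mount point column in vfstab) is /. It assumes a POSIX shell rather than the legacy Solaris /bin/sh.

```shell
# Hypothetical helper: print vfstab lines that declare a mount point of "/".
# $1 is the vfstab to inspect and defaults to /etc/vfstab.
vfstab_root_entries() {
    awk '$1 !~ /^#/ && $3 == "/" { print }' "${1:-/etc/vfstab}"
}

# Example use (commented out), first on the running BE, then on a mounted one:
# vfstab_root_entries
# vfstab_root_entries /mnt/etc/vfstab
```

Any line it prints is a candidate for removal; review it by hand before changing the file.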
luupgrade(1M) and the Solaris autoregistration file

Here's what the "error" looks like.
# luupgrade -u -s /mnt -n s10u9-baseline
System has findroot enabled GRUB
No entry for BE in GRUB menu
Copying failsafe kernel from media.
61364 blocks
miniroot filesystem is
Mounting miniroot at
ERROR: The auto registration file <> does not exist or incomplete.
The auto registration file is mandatory for this upgrade.
Use -k argument along with luupgrade command.
autoreg_file is path to auto registration information file.
See sysidcfg(4) for a list of valid keywords for use in this file.
The format of the file is as follows.
oracle_user=xxxx
oracle_pw=xxxx
http_proxy_host=xxxx
http_proxy_port=xxxx
http_proxy_user=xxxx
http_proxy_pw=xxxx
For more details refer "Oracle Solaris 10 9/10 Installation Guide: Planning for Installation and Upgrade".

As with the previous problem, this is also easy to work around. Assuming that you don't want to use the auto-registration feature at upgrade time, create a file that contains just autoreg=disable and pass the filename on to luupgrade.
Here is an example.
# echo "autoreg=disable" > /var/tmp/no-autoreg
# luupgrade -u -s /mnt -k /var/tmp/no-autoreg -n s10u9-baseline
System has findroot enabled GRUB
No entry for BE in GRUB menu
Copying failsafe kernel from media.
61364 blocks
miniroot filesystem is
Mounting miniroot at
#######################################################################
NOTE: To improve products and services, Oracle Solaris communicates configuration data to Oracle after rebooting.
You can register your version of Oracle Solaris to capture this data for your use, or the data is sent anonymously.
For information about what configuration data is communicated and how to control this facility, see the Release Notes or www.oracle.com/goto/solarisautoreg.
INFORMATION: After activated and booted into new BE , Auto Registration happens automatically with the following Information
autoreg=disable
#######################################################################
Validating the contents of the media .
The media is a standard Solaris media.
The media contains an operating system upgrade image.
The media contains version <10>.
Constructing upgrade profile to use.
Locating the operating system upgrade program.
Checking for existence of previously scheduled Live Upgrade requests.
Creating upgrade profile for BE .
Checking for GRUB menu on ABE .
Saving GRUB menu on ABE .
Checking for x86 boot partition on ABE.
Determining packages to install or upgrade for BE .
Performing the operating system upgrade of the BE .
CAUTION: Interrupting this process may leave the boot environment unstable or unbootable.

The Live Upgrade operation now proceeds as expected. Once the system upgrade is complete, we can manually register the system. If you want to do a hands-off registration during the upgrade, see the Oracle Solaris Auto Registration section of the Oracle Solaris Release Notes for instructions on how to do that.
Watch out for an ever growing /var/tmp

# df -k /
Filesystem                kbytes   used    avail    capacity Mounted on
rpool/ROOT/s10x_u8wos_08a 20514816 4277560 13089687 25%      /

So far, so good. Solaris is just a bit over 4GB. Another 3GB is used by the swap and dump devices. That should leave plenty of room for half a dozen or so patch cycles (assuming 1GB each) and an upgrade to the next release.
Now, let's put on the latest recommended patch cluster. Note that I am following the suggestions in my Live Upgrade Survival Guide, installing the prerequisite patches and the LU patch before actually installing the patch cluster.
# cd /var/tmp
# wget patchserver:/export/patches/10_x86_Recommended-2012-01-05.zip .
# unzip -qq 10_x86_Recommended-2012-01-05.zip
# wget patchserver:/export/patches/121431-69.zip
# unzip 121431-69
# cd 10_x86_Recommended
# ./installcluster --apply-prereq --passcode (you can find this in README)
# patchadd -M /var/tmp 121431-69
# lucreate -n s10u8-2012-01-05
# ./installcluster -d -B s10u8-2012-01-05 --passcode
# luactivate s10u8-2012-01-05
# init 0

After the new boot environment is activated, let's upgrade to the latest release of Solaris 10. In this case, it will be Solaris 10 8/11 (u10).
Yes, this does seem like an awful lot is happening in a short period of time. I'm trying to demonstrate a situation that really does happen when you forget something as simple as a patch cluster clogging up /var/tmp. Think of this as one of those time lapse video sequences you might see in a nature documentary.
# pkgrm SUNWluu SUNWlur SUNWlucfg
# pkgadd -d /cdrom/sol_10_811_x86 SUNWluu SUNWlur SUNWlucfg
# patchadd -M /var/tmp 121431-69
# lucreate -n s10u10-baseline
# echo "autoreg=disable" > /var/tmp/no-autoreg
# luupgrade -u -s /cdrom/sol_10_811_x86 -k /var/tmp/no-autoreg -n s10u10-baseline
# luactivate s10u10-baseline
# init 0

As before, everything went exactly as expected. Or so I thought, until I logged in the first time and checked the free space in the root pool.
# df -k /
Filesystem                 kbytes   used     avail   capacity Mounted on
rpool/ROOT/s10u10-baseline 20514816 10795038 2432308 82%      /

Where did all of the space go? A back-of-the-napkin calculation gives 4.5GB (s10u8) + 4.5GB (s10u10) + 1GB (patch set) + 3GB (swap and dump) = 13GB used, and a 20GB pool with 13GB used should have 7GB free. But there's only 2.4GB free?
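The same napkin arithmetic, spelled out so the numbers can be swapped for another system. This is only a sketch: expected_free_gb and kb_to_gb are hypothetical helper names, the sizes are the estimates quoted above, and the conversion uses decimal gigabytes to match the rounded figures in this post. It assumes a POSIX shell and an awk that accepts -v (nawk or /usr/xpg4/bin/awk on Solaris 10).

```shell
# Hypothetical helper: pool size in GB, followed by each expected consumer in GB.
expected_free_gb() {
    pool=$1; shift
    printf '%s\n' "$@" | awk -v pool="$pool" '{ used += $1 } END { printf "%g\n", pool - used }'
}

# Hypothetical helper: convert a df -k "avail" figure (KB) to decimal GB.
kb_to_gb() {
    awk -v kb="$1" 'BEGIN { printf "%.1f\n", kb / 1000000 }'
}

expected_free_gb 20 4.5 4.5 1 3   # prints 7
kb_to_gb 2432308                  # prints 2.4
```

The gap between the two results, roughly 4.6GB, is the space that still has to be accounted for.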
This is about the time that I smack myself on the forehead and realize that I put the patch cluster in /var/tmp. Old habits die hard. This is not a problem, I can just delete it, right?
Not so fast.
# du -sh /var/tmp
5.4G /var/tmp
# du -sh /var/tmp/10*
3.8G /var/tmp/10_x86_Recommended
1.5G /var/tmp/10_x86_Recommended-2012-01-05.zip
# rm -rf /var/tmp/10*
# du -sh /var/tmp
3.4M /var/tmp

Imagine the look on my face when I check the pool free space, expecting to see 7GB free.
# df -k /
Filesystem                 kbytes   used    avail   capacity Mounted on
rpool/ROOT/s10u10-baseline 20514816 5074262 2424603 68%      /

We are getting closer, I suppose. At least my root filesystem size is reasonable (5GB vs 11GB). But the free space hasn't changed at all.
Once again, I smack myself on the forehead. The patch cluster is also in the other two boot environments. All I have to do is get rid of them there too, and I'll get my free space back. Right?
# lumount s10u8-2012-01-05 /mnt
# rm -rf /mnt/var/tmp/10_x86_Recommended*
# luumount s10u8-2012-01-05
# lumount s10x_u8wos_08a /mnt
# rm -rf /mnt/var/tmp/10_x86_Recommended*
# luumount s10x_u8wos_08a

Surely, the free space will now be 7GB.
# df -k /
Filesystem                 kbytes   used    avail   capacity Mounted on
rpool/ROOT/s10u10-baseline 20514816 5074265 2429261 68%      /

This is when I smack myself on the forehead for the third time in one afternoon. Just getting rid of them in the boot environments is not sufficient. It would be if I were using UFS as a root filesystem, but lucreate will use the ZFS snapshot and cloning features when used on a ZFS root. So the patch cluster is in the snapshot, and the oldest one at that.
Let's try this all over again, but this time I will put the patches somewhere else that is not part of a boot environment. If you are thinking of using root's home directory, think again - it is part of the boot environment. If you are running out of ideas, let me suggest that /export/patches might be a good place to put them.
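One way to sanity check a staging directory before downloading anything into it is to look at which dataset it lives on. The sketch below rests on an assumption drawn from the df output in this post: boot environment datasets sit under POOL/ROOT/, as in rpool/ROOT/s10u10-baseline, while shared datasets such as rpool/export do not. The function names are hypothetical, and a POSIX shell is assumed.

```shell
# Hypothetical helper: print the dataset (first df column) that holds path $1.
dataset_of() {
    df -k "$1" | awk 'NR > 1 { print $1; exit }'
}

# Hypothetical helper: succeed if dataset name $1 looks like part of a boot
# environment, assuming the POOL/ROOT/BE naming seen in this post.
is_be_dataset() {
    case "$1" in
        */ROOT/*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example use (commented out):
# if is_be_dataset "$(dataset_of /var/tmp)"; then
#     echo "/var/tmp is inside a boot environment; stage patches elsewhere"
# fi
```

Anything that passes is_be_dataset will be captured by the next lucreate snapshot, so large downloads are better kept on a dataset that fails the test.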
Doing the exercise again, with the patches in /export/patches, I get similar results (to be expected), but this time the patches are in a shared ZFS dataset (/export).
# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
s10x_u8wos_08a             yes      no     no        yes    -
s10u8-2012-01-05           yes      no     no        yes    -
s10u10-baseline            yes      yes    yes       no     -
# df -k /
Filesystem                 kbytes   used    avail   capacity Mounted on
rpool/ROOT/s10u10-baseline 20514816 5184578 2445140 68%      /
# df -k /export
Filesystem                 kbytes   used    avail   capacity Mounted on
rpool/export               20514816 5606384 2445142 70%      /export

This means that I can delete them, and reclaim the space.
# rm -rf /export/patches/10_x86_Recommended*
# df -k /
Filesystem                 kbytes   used    avail   capacity Mounted on
rpool/ROOT/s10u10-baseline 20514816 5184578 8048050 40%      /

Now, that's more like it. With this free space, I can continue to patch and maintain my system as I had originally planned - estimating a few hundred MB to 1.5GB per patch set.