Monday Jul 05, 2010

Example: updating the boot archive via failsafe boot on SPARC systems with a configured SVM root mirror

The boot archive feature has been part of Solaris 10 for SPARC since Solaris 10 10/08 (Update 6). When the system is configured with a Solaris Volume Manager (SVM) root mirror, take care when using the failsafe boot (boot -F failsafe) from the 'ok' prompt.

If the following or a similar message appears, it may be necessary to use the Solaris failsafe boot:

Rebooting with command: boot
Boot device: disk:a File and args:
SunOS Release 5.10 Version Generic_141414-07 64-bit
Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hardware watchdog enabled
Hostname: scnode1

WARNING: The following files in / differ from the boot archive:

changed /kernel/drv/did.conf

The recommended action is to reboot to the failsafe archive to correct
the above inconsistency. To accomplish this, on a GRUB-based platform,
reboot and select the "Solaris failsafe" option from the boot menu.
On an OBP-based platform, reboot then type "boot -F failsafe". Then
follow the prompts to update the boot archive. Alternately, to continue
booting at your own risk, you may clear the service by running:
"svcadm clear system/boot-archive"

Jun 29 15:51:28 svc.startd[8]: svc:/system/boot-archive:default: Method "/lib/svc/method/boot-archive" failed with exit status 95.
Jun 29 15:51:28 svc.startd[8]: system/boot-archive:default failed fatally: transitioned to maintenance (see 'svcs -xv' for details)
Requesting System Maintenance Mode
(See /lib/svc/share/README for more information.)
Console login service(s) cannot run

Root password for system maintenance (control-d to bypass):

Normally, after this message the issue can be solved by:
a) logging in to the system
b) running "svcadm clear system/boot-archive"
c) waiting until the system continues to boot
The system is already booted from the SVM root mirror, so the svcadm clear command takes effect on all devices within the root mirror.
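
The steps a) to c) can be sketched as a small helper for the maintenance shell. This is only a sketch: the function names run and clear_boot_archive and the DRY_RUN switch are made up for illustration. By default it just prints the commands, which are the two named in the console message above.

```shell
#!/bin/sh
# Sketch: inspect and clear the boot-archive service after logging in.
# run, clear_boot_archive and DRY_RUN are illustrative names, not Solaris tools.

# Print the command when DRY_RUN=1 (default), execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

clear_boot_archive() {
    # Show why the service is in maintenance (the console message points to svcs -xv).
    run svcs -xv system/boot-archive
    # Clear the maintenance state so that the boot can continue.
    run svcadm clear system/boot-archive
}
```

Call clear_boot_archive to see the commands, or DRY_RUN=0 clear_boot_archive to execute them.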

If this is not successful, a Solaris failsafe boot is necessary. The issue is that the failsafe boot does not mount the SVM root mirror, so manual intervention is necessary to update the boot archive on all devices within the SVM root mirror. This procedure does NOT require unmirroring the SVM root mirror to update the boot archive.

The procedure in short:
1) Start failsafe boot:
ok boot -F failsafe
a) When the system asks whether to update the boot archive, answer 'n'.
Do you wish to automatically update this boot archive? [y,n,?] n

b) When the system asks for a device to be mounted, answer 'q'
Please select a device to be mounted (q for none) [?,??,q]: q
and a shell starts.

2) Mount the single root drive read-only! (use format if you do not know the root drive)
# mount -o ro /dev/dsk/c0t0d0s0 /a
3) Copy the md.conf file from the root drive to the active boot environment
# cp /a/kernel/drv/md.conf /kernel/drv/
4) Unmount the single root drive
# umount /a
5) Update the md driver information
# update_drv -f md
devfsadm: mkdir failed for /dev 0x1ed: Read-only file system
6) Important: Wait a minute; it can take some time to configure the md devices!
7) Sync the meta information (if you do not know the SVM root mirror, run metastat)
# metasync d0
7a) Now fsck can be safely run on d0 if necessary:
# fsck -o f -y /dev/md/rdsk/d0
Repeat this command if any file system errors were fixed, until the file system is clean. This can take 3 or 4 runs of fsck. The force option '-o f' prevents fsck from skipping the file system check. It is recommended to use the -y option as well to confirm all changes to the file system.
8) Mount the SVM mirror
# mount /dev/md/dsk/d0 /a
9) Update the boot archive on the SVM mirror
# bootadm update-archive -v -R /a
some output will be shown
9a) OPTIONAL: If you see error messages like
- cannot find: /mnt/etc/mach: No such file or directory
- cannot find: /a/etc/cluster/nodeid: No such file or directory
- cannot find: /a/etc/devices/mdi_ib_cache: No such file or directory
then try to force the update of the archive with:
# bootadm update-archive -f -v -R /a
The -f option should force the bootadm command to do the update despite the errors, BUT this could also fail. To make sure that the boot archive is updated in any case, run:
# touch /a/kernel/drv/md.conf
# bootadm update-archive -v -R /a
Updating the timestamp of the file /a/kernel/drv/md.conf forces the update of the boot archive.
10) Unmount the SVM root mirror
# umount /a
11) Reboot the system from the repaired SVM mirror
# init 6
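
Steps 2 to 10 above can be sketched as a dry-run script for the failsafe shell. This is a sketch under assumptions, not a tested tool: the names run, update_archive_on_mirror, ROOT_SLICE, MIRROR, SETTLE_SECS, FORCE_TOUCH and DRY_RUN are made up for illustration, and the device names are simply the ones from the example. It assumes a POSIX-style shell, so the shell in the failsafe environment may need small adjustments. By default it only prints the commands. The optional fsck of step 7a and the final init 6 are intentionally left out.

```shell
#!/bin/sh
# Dry-run sketch of steps 2 to 10 for the failsafe shell.
# All variable and function names here are illustrative, not part of Solaris.

ROOT_SLICE=${ROOT_SLICE:-c0t0d0s0}   # single root drive slice (example value)
MIRROR=${MIRROR:-d0}                 # SVM root mirror (example value)
SETTLE_SECS=${SETTLE_SECS:-60}       # step 6: give the md devices time to appear

# Print the command when DRY_RUN=1 (default), execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

update_archive_on_mirror() {
    run mount -o ro "/dev/dsk/$ROOT_SLICE" /a || return 1   # step 2
    run cp /a/kernel/drv/md.conf /kernel/drv/ || return 1   # step 3
    run umount /a || return 1                               # step 4
    run update_drv -f md    # step 5; the devfsadm mkdir message is not treated as fatal
    run sleep "$SETTLE_SECS"                                # step 6
    run metasync "$MIRROR" || return 1                      # step 7
    run mount "/dev/md/dsk/$MIRROR" /a || return 1          # step 8
    if [ "${FORCE_TOUCH:-0}" = "1" ]; then
        run touch /a/kernel/drv/md.conf                     # step 9a fallback
    fi
    run bootadm update-archive -v -R /a                     # step 9
    run umount /a                                           # step 10
}
```

Running update_archive_on_mirror prints the command sequence for review. Setting DRY_RUN=0 executes it, and FORCE_TOUCH=1 adds the touch from step 9a before the archive update.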

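The repeated fsck of step 7a can be sketched as a loop. This is a sketch under a stated assumption: it treats the text "FILE SYSTEM WAS MODIFIED" in the fsck output as the sign that something was fixed and another run is needed. Please check that your fsck prints this marker before relying on it. The names fsck_until_clean, FSCK and MAX_RUNS are made up for illustration.

```shell
#!/bin/sh
# Sketch: repeat the forced fsck until a run no longer modifies the file system.
# fsck_until_clean, FSCK and MAX_RUNS are illustrative names.

FSCK=${FSCK:-fsck}          # overridable, e.g. to try the loop with a stub
MAX_RUNS=${MAX_RUNS:-5}     # give up after this many runs

fsck_until_clean() {
    dev=$1
    n=1
    while [ "$n" -le "$MAX_RUNS" ]; do
        echo "fsck run $n on $dev"
        out=`$FSCK -o f -y "$dev" 2>&1`
        rc=$?
        echo "$out"
        case $out in
            *"FILE SYSTEM WAS MODIFIED"*)
                n=`expr "$n" + 1` ;;    # something was fixed: check again
            *)
                return "$rc" ;;         # nothing modified: pass on the fsck status
        esac
    done
    echo "file system still modified after $MAX_RUNS runs" >&2
    return 1
}
```

For the example above the call would be fsck_until_clean /dev/md/rdsk/d0.
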
See also:
DocID 1340586.1 How to access (root) disk under Solaris Volume Manager Control (SVM) from failsafe or CDROM and update the boot_archive in Solaris 10

Friday Feb 08, 2008

Decrease boot time of Sun Cluster

The boot time of your system can increase dramatically if you use Sun Cluster 3.2 in combination with volume management software. Bug 6626457 (boot time increases exponentially with S10U4 + VxVM 5.0 + Sun Cluster 3.2) describes the issue in more detail. Long boot times can be critical from a high availability point of view.

Be sure you have installed the following patch to decrease the boot time.

127718-04 SunOS 5.10: svc.startd and rpc.metad patch
127719-04 SunOS 5.10_x86: svc.startd and rpc.metad patch

It's worth installing this patch anyway, because the boot time also decreases if you are running Sun Cluster without volume management software.
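
Whether the patch is already installed can be checked from the patch list. The following is a sketch that assumes the list comes from showrev -p and that each entry starts with "Patch: <id>-<revision>". The function name has_patch is made up for illustration. On Solaris 10 it may be necessary to replace awk with nawk, because the default awk may not support -v.

```shell
#!/bin/sh
# Sketch: check a patch list (as printed by "showrev -p") for a minimum revision.
# has_patch is an illustrative name. The list is read from stdin.

has_patch() {
    # $1 = patch id (e.g. 127718), $2 = minimum revision (e.g. 04)
    awk -v id="$1" -v min="$2" '
        $1 == "Patch:" {
            split($2, p, "-")                       # "127718-04" -> id and revision
            if (p[1] == id && p[2] + 0 >= min + 0) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Usage on SPARC: showrev -p | has_patch 127718 04 && echo "patch installed". On x86 use 127719 instead.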



