Monday Jul 05, 2010

Example to update the boot archive via failsafe boot on sparc systems with configured SVM root mirror

The boot archive feature was introduced for SPARC in Solaris 10 10/08 (Update 6). When the system is configured with a Solaris Volume Manager (SVM) root mirror, take care when using the failsafe boot (boot -F failsafe) from the OBP 'ok' prompt.

If the following or a similar message appears, it may be necessary to use the Solaris failsafe boot:

Rebooting with command: boot
Boot device: disk:a File and args:
SunOS Release 5.10 Version Generic_141414-07 64-bit
Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hardware watchdog enabled
Hostname: scnode1

WARNING: The following files in / differ from the boot archive:

changed /kernel/drv/did.conf

The recommended action is to reboot to the failsafe archive to correct
the above inconsistency. To accomplish this, on a GRUB-based platform,
reboot and select the "Solaris failsafe" option from the boot menu.
On an OBP-based platform, reboot then type "boot -F failsafe". Then
follow the prompts to update the boot archive. Alternately, to continue
booting at your own risk, you may clear the service by running:
"svcadm clear system/boot-archive"

Jun 29 15:51:28 svc.startd[8]: svc:/system/boot-archive:default: Method "/lib/svc/method/boot-archive" failed with exit status 95.
Jun 29 15:51:28 svc.startd[8]: system/boot-archive:default failed fatally: transitioned to maintenance (see 'svcs -xv' for details)
Requesting System Maintenance Mode
(See /lib/svc/share/README for more information.)
Console login service(s) cannot run

Root password for system maintenance (control-d to bypass):

Normally the issue can be solved after this message by
a) logging in to the system
b) running "svcadm clear system/boot-archive"
c) waiting until the system continues to boot
In this case the system is already booted from the SVM root mirror, so the result of the svcadm clear command reaches all devices within the root mirror.


If this is not successful, a Solaris failsafe boot is necessary. The problem is that the failsafe boot does not mount the SVM root mirror, so manual intervention is needed to update the boot archive on all devices within the SVM root mirror. This procedure does NOT require unmirroring the SVM root mirror to update the boot archive.

The procedure in short:
1) Start the failsafe boot:
ok boot -F failsafe
When the system asks for the device to be mounted, answer with 'q':
...
Please select a device to be mounted (q for none) [?,??,q]: q
A shell then starts.

2) Mount the single root drive read-only! (use format if you do not know the root drive)
# mount -o ro /dev/dsk/c0t0d0s0 /a
3) Copy the md.conf file from the root drive to the active boot environment
# cp /a/kernel/drv/md.conf /kernel/drv/
4) Unmount the single root drive
# umount /a
5) Update the md driver information
# update_drv -f md
devfsadm: mkdir failed for /dev 0x1ed: Read-only file system
#
6) Important: Wait a minute, it can take some time to configure the md devices!
7) Sync the SVM root mirror (run metastat if you do not know the name of the root mirror)
# metasync d0
8) Mount the SVM mirror
# mount /dev/md/dsk/d0 /a
9) Update the boot archive on the SVM mirror
# bootadm update-archive -v -R /a
some output will be shown
10) Unmount the SVM root mirror
# umount /a
11) Reboot the system from the repaired SVM mirror
# init 6
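
The device names in the procedure (c0t0d0s0, d0) are only examples, so it can help to write down the exact commands for your own system before you go to the failsafe shell. The following is a minimal sketch, not an official tool: the function name print_failsafe_steps is made up for this post, it assumes a POSIX-style shell such as ksh or bash on some other machine, and the "sleep 60" line is only my stand-in for the "wait a minute" in step 6. It prints the commands and executes nothing.

```shell
# Print the failsafe command sequence for review; nothing is executed.
# $1 = root slice of one submirror (example: c0t0d0s0)
# $2 = SVM root mirror metadevice (example: d0)
print_failsafe_steps() {
    root_slice=$1
    root_mirror=$2
    if [ -z "$root_slice" ] || [ -z "$root_mirror" ]; then
        echo "usage: print_failsafe_steps <root-slice> <root-mirror>" >&2
        return 1
    fi
    # One line per command, in the order of steps 2) to 11) above.
    cat <<EOF
mount -o ro /dev/dsk/$root_slice /a
cp /a/kernel/drv/md.conf /kernel/drv/
umount /a
update_drv -f md
sleep 60
metasync $root_mirror
mount /dev/md/dsk/$root_mirror /a
bootadm update-archive -v -R /a
umount /a
init 6
EOF
}
```

For example, "print_failsafe_steps c0t1d0s0 d10" prints the same ten commands with your slice and mirror name filled in.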





Monday Oct 19, 2009

Solaris 10 kernel patches

This is a short overview of Solaris 10 kernel patches. The table shows which kernel patch revision is included in each Solaris 10 update release, together with its patch dependencies. Installing the kernel patch of a Solaris 10 update release is not the same as upgrading to that update release. Sometimes it is advisable to upgrade from one Solaris 10 update release to a higher one. There is no general rule, but when several Solaris 10 update releases would be skipped, an upgrade is advisable. The list is sorted from newest to oldest.

Last Update: 14.Apr.2014

SPARC      x86        Release / status                          Notes (SPARC / x86)
---------  ---------  ----------------------------------------  ----------------------------------------
150400-11  150401-11  newest patch ID (check current revision)  requires 148888-05 / 148889-05
148888-05  148889-05  highest release                           requires 147147-26 / 147148-26
147147-26  147148-26  Solaris 10 01/13 Update11                 requires 144500-19 / 144501-19
147440-27  147441-27  highest release                           obsoleted by 147147-26 / 147148-26
144500-19  144501-19  Solaris 10 08/11 Update10                 requires 142909-17 / 142910-17;
                                                                post U10 releases have 147440-01 / 147441-01
144488-17  144489-17  highest release                           obsoleted by 144500-19 / 144501-19
142909-17  142910-17  Solaris 10 09/10 Update9                  requires 141444-09 / 141445-09
142900-15  142901-15  highest release                           obsoleted by 142909-17 / 142910-17
141444-09  141445-09  Solaris 10 10/09 Update8                  requires 139555-08 / 139556-08
141414-10  141415-10  highest release                           obsoleted by 141444-09 / 141445-09
139555-08  139556-08  Solaris 10 05/09 Update7                  requires 137137-09 / 137138-09
138888-08  138889-08  highest release                           obsoleted by 139555-08 / 139556-08
137137-09  137138-09  Solaris 10 10/08 Update6                  requires 127127-11 / 127128-11
137111-08  137112-08  highest release                           obsoleted by 137137-09 / 137138-09
127127-11  127128-11  Solaris 10 05/08 Update5                  requires 120011-14 / 120012-14
127111-11  127112-11  highest release                           obsoleted by 127127-11 / 127128-11
120011-14  120012-14  Solaris 10 08/07 Update4                  requires 118833-36 / 118855-36
125100-10  125101-10  highest release                           obsoleted by 120011-14 / 120012-14
118833-36  118855-36  highest release                           a must-have for Solaris 10 11/06 Update3
118833-33  118855-33  Solaris 10 11/06 Update3
118833-17  118855-14  Solaris 10 06/06 Update2
118822-30  118844-30  highest release                           obsoleted by 118833-36 / 118855-36
118822-25  118844-26  Solaris 10 01/06 Update1
118822-10  118844-11  Solaris 10 03/05 HW1
-          -          Solaris 10

The patches named in the Notes column are titled "SunOS 5.10: kernel patch" (SPARC) and "SunOS 5.10_x86: kernel patch" (x86); 120011-14 and 120012-14 are titled "Kernel Update patch".
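
To find your place in the table, the running kernel patch can be read from "uname -v", which prints a string like the Generic_141414-07 shown in the boot banner of the failsafe post above. The following is a small sketch under stated assumptions: the function names kernel_patch_id and patch_at_least are made up for this post, and the code assumes ksh or bash (the legacy /bin/sh does not support the parameter expansions used). Because the kernel patch ID changes between update releases, the comparison only makes sense within one patch ID.

```shell
# Print the patch ID embedded in a "uname -v" string such as Generic_141414-07.
# Returns non-zero if the string carries no patch ID (for example plain "Generic").
kernel_patch_id() {
    case $1 in
        Generic_[0-9]*-[0-9]*) echo "${1#Generic_}" ;;
        *) return 1 ;;
    esac
}

# Succeed if installed patch $1 has the same base ID as wanted patch $2
# and at least its revision. Different base IDs are never comparable.
patch_at_least() {
    [ "${1%-*}" = "${2%-*}" ] || return 1
    [ "${1#*-}" -ge "${2#*-}" ]
}
```

On a live system this could be used as: patch_at_least "$(kernel_patch_id "$(uname -v)")" 141414-10 && echo "kernel patch is at 141414-10 or later".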


Tuesday Nov 13, 2007

nxge driver and patch 120011-14

This post is about the different nxge driver packages used for network interface cards (NICs). At the moment, if the unbundled nxge package of Solaris 10 11/06 Update3 is installed, the system panics as follows on the next boot after patch 120011-14 is installed.

panic[cpu1]/thread=2a100aa9cc0: BAD TRAP: type=31 rp=2a100aa8f90 addr=0 mmu_fsr=0 occurred in module "genunix" due to a NULL pointer dereference
sched: trap type = 0x31
pid=0, pc=0x1200598, sp=0x2a100aa8831, tstate=0x80001601, context=0x0
g1-g7: 7ae2d074, 1, 0, 7ae2d000, 7ae2cfd4, 10, 2a100aa9cc0


Patch 120011-14 requires the bundled package version of Solaris 10 8/07. Get the package SUNWnxge.u (sun4u architecture) or SUNWnxge.v (sun4v architecture) from the Solaris 10 8/07 Update4 distribution or later, or download the packages here.


The NICs that use the nxge driver are:
Sun Dual 10GbE Fibre PCIe x8 Low Profile Card (X1027A-z)
Dual 10GbE XFP PCIe x8 ExpressModule for Blade Servers (X1028A-z)
Sun Quad Gigabit Ethernet UTP PCIe x8 Card (X4447A-z and X7287A-z)


Unbundled package version for Solaris 10 11/06 Update3.

t2000# pkginfo -l SUNWnxge
   PKGINST: SUNWnxge
      NAME: Sun x8 10G/1G Ethernet Adapter Driver
  CATEGORY: system
      ARCH: sparc.sun4u
   VERSION: 1.0,REV=2007.01.12.10.0
   BASEDIR: /
    VENDOR: Sun Microsystems, Inc.
      DESC: Sun x8 10G/1G Ethernet Adapter Driver
    PSTAMP: miro20070112193338
...

Bundled package version in Solaris 10 8/07 Update4.

t2000# pkginfo -l SUNWnxge
   PKGINST: SUNWnxge
      NAME: Sun NIU leaf driver
  CATEGORY: system
      ARCH: sparc.sun4u
   VERSION: 11.10.0,REV=2007.07.08.17.44
   BASEDIR: /
    VENDOR: Sun Microsystems, Inc.
      DESC: Sun NIU 10Gb/1Gb driver
    PSTAMP: on10ptchfeat20070708174804
...


Keep in mind (currently):
Solaris 10 11/06 Update3 works only with the unbundled version 1.0,REV=2007.01.12.10.0
Solaris 10 11/06 Update3 with 120011-14 works only with the bundled version 11.10.0,REV=2007.07.08.17.44
Solaris 10 08/07 Update4 works only with the bundled version 11.10.0,REV=2007.07.08.17.44
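
Before installing 120011-14 it is worth checking which of the two package versions is on the system. The following is a minimal sketch, assuming a POSIX-style shell: the function names nxge_version and nxge_flavor are made up for this post, and the classification knows only the two version strings quoted above. Anything else is reported as unknown.

```shell
# Read "pkginfo -l SUNWnxge" output on stdin and print the VERSION value.
nxge_version() {
    sed -n 's/^ *VERSION: *//p'
}

# Classify a version string using only the two versions quoted in this post.
nxge_flavor() {
    case $1 in
        1.0,REV=*)     echo unbundled ;;
        11.10.0,REV=*) echo bundled ;;
        *)             echo unknown ;;
    esac
}
```

On a live system: nxge_flavor "$(pkginfo -l SUNWnxge | nxge_version)"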


Workarounds:

1) If the system has already suffered the panic described above
  a) Boot the system from net or DVD.
  b) Mount the / filesystem to /a.
     Also mount /var to /a/var if /var is a separate partition
     NOTE: If you have a root mirror - take care of it. Additional steps are necessary.
  c) Remove the unbundled SUNWnxge package with
     # pkgrm -R /a SUNWnxge
  d) Add the bundled package. Use SUNWnxge.v for sun4v architecture
     or SUNWnxge.u for sun4u architecture. e.g: for sun4v
     # pkgadd -R /a -d . SUNWnxge.v
  e) Install the nxge patch for bundled nxge package 127741-01
     # patchadd -R /a 127741-01
  f) umount the filesystems and reboot the system
2) Prevent the panic (tested)
  a) Remove the unbundled package SUNWnxge with
     # pkgrm SUNWnxge
  b) Add the bundled package. Use SUNWnxge.v for sun4v architecture
     or SUNWnxge.u for sun4u architecture. e.g: for sun4v
     # pkgadd -d . SUNWnxge.v
     Note: Do NOT reboot now. The system will panic if 120011-14 is not installed!
  c) Install nxge patch for bundled nxge package 127741-01
     # patchadd 127741-01
  d) Install patch 120011-14 on Solaris 10 11/06 Update3
     Note: The patch 120011-14 requires a lot of other patches. Maybe you require a patch management tool!
  e) Reboot the system
Attention: What's NOT working
It's NOT possible to install patch 120011-14 before replacing the SUNWnxge package. After the installation of 120011-14 the system is blocked for package operations. The message is:
# pkgrm SUNWnxge
pkgrm: ERROR: unable to remove any package from the system until it is rebooted. One or more patches have updated the system but these changes are not yet enabled. Additional package operations are not permitted until the system is rebooted.



TAKE CARE ABOUT ALERT 1000628.1: Solaris 10 Systems May Fail to Come up if Patches Are Applied After Kernel Patches 120011-14 (SPARC) and 120012-14 (x86) and Before the Reboot. Workaround: To avoid this issue, reboot the system immediately after installing patch 120011-14 or 120012-14.


Additional information for i386 architecture.
The kernel update patch is 120012-14.
In Solaris 10 8/07 the bundled package name is SUNWnxge.i with version 11.10.0,REV=2007.07.08.17.21
The nxge patch for bundled Solaris 10 8/07 Update4 nxge package is 127742-01.

Wednesday Jan 10, 2007

pkcs11_softtoken failure after installation of Solaris 10 11/06

After the installation of Solaris 10 11/06 (Update3) you will see the following message on the console

Jan 8 15:14:32 host345 java[2021]: pkcs11_softtoken: Keystore version failure.

You can simply ignore the message. This is a known internal bug and will be fixed in the patches
118918-23 - Sparc version
118919-20 - X86 version

So there is no need to open a case :-)

Wednesday Jan 03, 2007

IPMP in a test subnet

IP multipathing (IPMP) can be used with probe-based failure detection in a test subnet. The behavior in this case changed from Solaris 9 onwards: the deprecated flag is now fully implemented as described in the ifconfig man page. "An address associated with a deprecated interface will not be used as source address for outbound packets unless either there are no other addresses available on the interface or the application has bound to this address explicitly".
This means that Solaris 9 and 10 systems in a test subnet will not become probe targets for each other. So when upgrading from Solaris 8 to Solaris 9 or higher you may need to change your configuration.
The major advantage is that the test addresses can come from a different subnet than the data addresses. This saves data addresses in the production network. Private network addresses as specified by RFC 1918 (e.g. 10/8, 172.16/12, or 192.168/16) can be used in the test subnet as well.


Typical setup of IPMP with test subnet:

ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 10.16.51.123 netmask fffffe00 broadcast 10.16.51.255
        groupname ipmp0
        ether 0:3:ba:19:90:d9
ce0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
        inet 192.168.100.123 netmask ffffff00 broadcast 192.168.100.255
ce1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
        inet 192.168.100.122 netmask ffffff00 broadcast 192.168.100.255
        groupname ipmp0
        ether 0:3:ba:19:90:da

If all test partner hosts also use IPMP in the test subnet, be aware that the test partners have the deprecated flag ON. On Solaris 9 and Solaris 10, an interface with the deprecated flag ON does not answer the "all hosts" multicast, which is used for automatic IPMP test partner detection (only used if the test subnet has no default router). To solve this, add a logical network interface with the deprecated/nofailover flags OFF to answer the "all hosts" multicast.
e.g. (see the new interface ce0:2):

ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 10.16.51.123 netmask fffffe00 broadcast 10.16.51.255
        groupname ipmp0
        ether 0:3:ba:19:90:d9
ce0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
        inet 192.168.100.123 netmask ffffff00 broadcast 192.168.100.255
ce0:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 192.168.100.200 netmask ffffff00 broadcast 192.168.100.255
ce1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
        inet 192.168.100.122 netmask ffffff00 broadcast 192.168.100.255
        groupname ipmp0
        ether 0:3:ba:19:90:da

The hostname files for this example:
/etc/hostname.ce0
10.16.51.123 netmask + broadcast + group ipmp0 up \
addif 192.168.100.123 netmask + broadcast + deprecated -failover up \
addif 192.168.100.200 netmask + broadcast + up
/etc/hostname.ce1
192.168.100.122 netmask + broadcast + group ipmp0 deprecated -failover up
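
To check quickly which interfaces of a test partner carry the deprecated flag (and therefore will not answer the "all hosts" multicast), the ifconfig output can be filtered. The following is a minimal sketch, assuming a POSIX-style shell and the "ifconfig -a" output format shown above; the function name deprecated_interfaces is made up for this post.

```shell
# Read "ifconfig -a" output on stdin and print the name of every
# interface whose flags line contains DEPRECATED.
deprecated_interfaces() {
    sed -n 's/^\([^ ]*\): flags=[^<]*<.*DEPRECATED.*/\1/p'
}
```

On a live system: ifconfig -a | deprecated_interfaces
With the example configuration above, ce0:1 and ce1 are listed, while ce0 and ce0:2 are not.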



For more about IPMP you should refer to:
Infodoc 1010640.1: Summary of typical IPMP Configurations
Infodoc 1008064.1: IPMP Link-based Failure Detection with Solaris [TM] 10 Operating System (OS) and higher

Monday Oct 23, 2006

Device in use checking and Solaris Volume Manager

The "Device in use checking" feature was introduced on Solaris 10 with the kernel update patch 118833-17. The patch 118833-17 is also included in Solaris 10 06/06 aka Update2.
Actually the feature was introduced in 118833-04, but the patches 118833-04 through 118833-16 were never released on SunSolve.
If you use the format command to repartition a disk drive which is part of a metaset, you can get messages like:

Error occurred with device in use checking: No such device
or
Error occurred with device in use checking: Permission denied
or
/dev/dsk/cXXXXXXX is part of SVM volume diskset:XXX. Please see metaclear(1M).
/dev/dsk/cXXXXXXX contains an SVM mdb. Please see metadb(1M).

and you may not be able to label the disk.

To prevent these messages and to label the disk successfully, set the shell variable NOINUSE_CHECK=1 and export it, so that the format command sees it in its environment.
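
If you do not want to leave the variable set in your login shell, it can be passed to a single command only. The following is a minimal sketch, assuming a POSIX-style shell; the function name run_without_inuse_check is made up for this post.

```shell
# Run the given command with NOINUSE_CHECK=1 in its environment only;
# the calling shell itself is left unchanged.
run_without_inuse_check() {
    env NOINUSE_CHECK=1 "$@"
}
```

For example: run_without_inuse_check format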

For further details refer to the "Hands Up" "Device checking for fs utilities".
If you have a SunSolve account you can look at Infodoc 1005435.1
