Monday Dec 19, 2011

More robust control of zfs in Solaris Cluster 3.x

In some situations there is a possibility that a zpool controlled by a SUNW.HAStoragePlus resource will not be exported correctly. Please refer to the details in Document 1364018.1: Potential Data Integrity Issues After Switching Over a Solaris Cluster High Availability Resource Group With Zpools

I'd like to mention this because ZFS is used more and more in Solaris Cluster environments. Therefore I highly recommend installing the following patches to get a more reliable Solaris Cluster environment in combination with zpools on SC3.3 and SC3.2. So, if you are already running such a setup, start planning NOW to install the following patch revision (or higher) for your environment. A quick check of the currently installed revision is sketched after the patch lists.

Solaris Cluster 3.3:
145333-10 Oracle Solaris Cluster 3.3: Core Patch for Oracle Solaris 10
145334-10 Oracle Solaris Cluster 3.3_x86: Core Patch for Oracle Solaris 10_x86

Solaris Cluster 3.2:
144221-07 Solaris Cluster 3.2: CORE patch for Solaris 10
144222-07 Solaris Cluster 3.2: CORE patch for Solaris 10_x86
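
A quick way to check which core patch revision is currently installed on a node, using the SPARC patch IDs from the lists above (a minimal sketch; grep for the ID that matches your cluster release and platform):
# showrev -p | grep 145333      (Solaris Cluster 3.3)
# showrev -p | grep 144221      (Solaris Cluster 3.2)
The revision reported in the output should be the one listed above or higher.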

Tuesday Mar 15, 2011

Setup of local zpool on local or shared device with Solaris Cluster

There may be a need (for whatever reason) to configure a local zpool in a Solaris Cluster environment. By a local zpool I mean a zpool that is only available on one Solaris Cluster node WITHOUT using SUNW.HAStoragePlus. Such a local zpool can be configured on local devices (connected to only one node) or on shared devices (accessible from all nodes in the cluster via SAN). However, in the case of a shared device it would be better to set up a zone on the SAN switch so that the device is only visible to one host.

The following procedure is necessary to use local devices in a local zpool:

In this example I use the local device c1t3d0 to create a local zpool
a) Look for the did device of the device which should be used by the zpool
# scdidadm -l c1t3d0
49 node0:/dev/rdsk/c1t3d0 /dev/did/rdsk/d49
b) Check the settings of the used did device
# cldg show dsk/d49
Note: Only one node should be in the node list
c) Set localonly flag for the did device. Optional: set autogen flag
# cldg set -p localonly=true -p autogen=true dsk/d49
or disable fencing for the did device
# cldev set -p default_fencing=nofencing d49
d) Verify the settings
# cldg show dsk/d49
e) Create the zpool
# zpool create localpool c1t3d0
# zfs create localpool/data
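
A quick way to double-check the result of the procedure above (a minimal sketch using the did device and pool name from this example; the exact property layout of the cldg output may vary between releases):
# cldg show dsk/d49 | grep -i localonly
# zpool status localpool
The device group should show the localonly setting and the pool should be ONLINE.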


The following procedure is necessary to use shared devices in a local zpool:

In this example I use the shared device c6t600C0FF00000000007BA1F1023AE1711d0 to create a local zpool
a) Look for the did device of the device which should be used by the zpool
# scdidadm -L c6t600C0FF00000000007BA1F1023AE1711d0
11 node0:/dev/rdsk/c6t600C0FF00000000007BA1F1023AE1710d0 /dev/did/rdsk/d11
11 node1:/dev/rdsk/c6t600C0FF00000000007BA1F1023AE1710d0 /dev/did/rdsk/d11
b) Check the settings of the used did device
# cldg show dsk/d11
c) Remove the node which should not access the did device
# cldg remove-node -n node1 dsk/d11
d) Set localonly flag for the did device. Optional: set autogen flag
# cldg set -p localonly=true -p autogen=true dsk/d11
or disable fencing for the did device
# cldev set -p default_fencing=nofencing d11
e) Verify the settings
# cldg show dsk/d11
f) Create the zpool
# zpool create localpool c6t600C0FF00000000007BA1F1023AE1711d0
# zfs create localpool/data
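
After step c) only one node should be left in the node list of the device group; a minimal sketch to confirm this with the did device from the example:
# cldg show dsk/d11 | grep -i node
Only node0 should appear in the output.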


If you forget to do this for a local zpool, there is a possibility that the zpool will be in FAULTED state after a boot.
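
If that happens, the pool can usually be recovered once the flags are set correctly; a rough sketch, reusing the names from the first example (verify the device and pool names against your own configuration before running it):
# cldg set -p localonly=true -p autogen=true dsk/d49
# zpool export localpool
# zpool import localpool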

Friday Oct 30, 2009

Kernel patch 141444-09 or 141445-09 with Sun Cluster 3.2

As stated in my last blog post, the following kernel patches are included in Solaris 10 10/09 Update8:
141444-09 SunOS 5.10: kernel patch or
141445-09 SunOS 5.10_x86: kernel patch

Update 10.Dec.2009:
Support of Solaris 10 10/09 Update8 with Sun Cluster 3.2 1/09 Update2 has now been announced. The recommendation is to use core patch 126106-39 (sparc) / 126107-39 (x86) with Solaris 10 10/09 Update8. Note: the -39 Sun Cluster core patch is a feature patch because the -38 Sun Cluster core patch is part of Sun Cluster 3.2 11/09 Update3, which has already been released.
For new installations/upgrades with Solaris 10 10/09 Update8 use:
* Sun Cluster 3.2 11/09 Update3 with Sun Cluster core patch -39 (fix problem 1)
* Use the patches 142900-02 / 142901-02 (fix problem 2)
* Add "set nautopush=64" to /etc/system (workaround for problem 3)

For patch updates to 141444-09/141445-09 use:
* Sun Cluster core patch -39 (fix problem 1)
* Also use patches 142900-02 / 142901-02 (fix problem 2)
* Add "set nautopush=64" to /etc/system (workaround for problem 3)


It's time to point out that there are some issues with these kernel patches in combination with Sun Cluster 3.2.

1.) The patch breaks the zpool cachefile feature if using SUNW.HAStoragePlus

a.) If the kernel patch 141444-09 (sparc) / 141445-09 (x86) is installed on a Sun Cluster 3.2 system where the Sun Cluster core patch 126106-33 (sparc) / 126107-33 (x86) is already installed then hastorageplus_prenet_start will fail with the following error message:
...
Oct 26 17:51:45 nodeA SC[,SUNW.HAStoragePlus:6,rg1,rs1,hastorageplus_prenet_start]: Started searching for devices in '/dev/dsk' to find the importable pools.
Oct 26 17:51:53 nodeA SC[,SUNW.HAStoragePlus:6,rg1,rs1,hastorageplus_prenet_start]: Completed searching the devices in '/dev/dsk' to find the importable pools.
Oct 26 17:51:54 nodeA zfs: [ID 427000 kern.warning] WARNING: pool 'zpool1' could not be loaded as it was last accessed by another system (host: nodeB hostid: 0x8516ced4). See: http://www.sun.com/msg/ZFS-8000-EY
...


b.) If the kernel patch 141444-09 (sparc) / 141445-09 (x86) is installed on a Sun Cluster 3.2 system where the Sun Cluster core patch 126106-35 (sparc) / 126107-35 (x86) is already installed, then hastorageplus_prenet_start will work, but the zpool cachefile feature of SUNW.HAStoragePlus is disabled. Without the zpool cachefile feature the zpool import time increases because the import scans all available disks. The messages look like:
...
Oct 30 15:37:45 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 148650 daemon.notice] Started searching for devices in '/dev/dsk' to find the importable pools.
Oct 30 15:37:45 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 148650 daemon.notice] Started searching for devices in '/dev/dsk' to find the importable pools.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 547433 daemon.notice] Completed searching the devices in '/dev/dsk' to find the importable pools.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 547433 daemon.notice] Completed searching the devices in '/dev/dsk' to find the importable pools.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 792255 daemon.warning] Failed to update the cachefile contents in /var/cluster/run/HAStoragePlus/zfs/zpool1.cachefile to CCR table zpool1.cachefile for pool zpool1 : file /var/cluster/run/HAStoragePlus/zfs/zpool1.cachefile open failed: No such file or directory.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 792255 daemon.warning] Failed to update the cachefile contents in /var/cluster/run/HAStoragePlus/zfs/zpool1.cachefile to CCR table zpool1.cachefile for pool zpool1 : file /var/cluster/run/HAStoragePlus/zfs/zpool1.cachefile open failed: No such file or directory.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 205754 daemon.info] All specified device services validated successfully.
...


If the ZFS cachefile feature is not required AND the above kernel patches are installed, problem a.) is resolved by installing Sun Cluster core patch 126106-35 (sparc) / 126107-35 (x86).
Solution for a) and b):
126106-39 Sun Cluster 3.2: CORE patch for Solaris 10
126107-39 Sun Cluster 3.2: CORE patch for Solaris 10_x86

Alert 1021629.1: A Solaris Kernel Change Stops Sun Cluster Using "zpool.cachefiles" to Import zpools Resulting in ZFS pool Import Performance Degradation or Failure to Import the zpools
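
To verify that the cachefile feature is active again after patching, you can check (while the resource is online) whether the cachefile mentioned in the messages above is present; a minimal sketch, assuming the pool name zpool1 from the example:
# ls -l /var/cluster/run/HAStoragePlus/zfs/zpool1.cachefile
# zpool status zpool1
If the file is missing and the warnings above keep appearing, the pool is still being imported via a full device scan.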

2.) The patch breaks probe-based IPMP if more than one interface is in the same IPMP group

After installing the already mentioned kernel patch
141444-09 SunOS 5.10: kernel patch or
141445-09 SunOS 5.10_x86: kernel patch
the probe-based IPMP feature is broken if the system uses more than one interface in the same IPMP group. This means all Solaris 10 systems which use more than one interface in the same probe-based IPMP group are affected!

After installing this kernel patch the following errors will be sent to the system console after a reboot:
...
nodeA console login: Oct 26 19:34:41 in.mpathd[210]: NIC failure detected on bge0 of group ipmp0
Oct 26 19:34:41 in.mpathd[210]: Successfully failed over from NIC bge0 to NIC e1000g0
...

Workarounds:
a) Use link-based IPMP instead of probe-based IPMP (see the sketch after this list)
b) Use only one interface per IPMP group if using probe-based IPMP
See the blog "Tips to configure IPMP with Sun Cluster 3.x" for more details if you would like to change the configuration.
c) Do not install the kernel patch listed above. Note: A fix is already in progress and can be obtained via a service request. I will update this blog when the general fix is available.
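
For workaround a), a minimal link-based IPMP sketch, reusing the interfaces bge0 and e1000g0 from the console messages above (the data address 192.168.10.10 is only a placeholder):
/etc/hostname.bge0:
192.168.10.10 netmask + broadcast + group ipmp0 up
/etc/hostname.e1000g0:
group ipmp0 up
Because no test addresses are configured, in.mpathd falls back to link-based failure detection for this group.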

Solution:
142900-02 SunOS 5.10: kernel patch
142901-02 SunOS 5.10_x86: kernel patch

Alert 1021262.1 : Solaris 10 Kernel Patches 141444-09 and 141445-09 May Cause Interface Failure in IP Multipathing (IPMP)
This is reported in Bug 6888928

3.) When applying the patch Sun Cluster can hang on reboot

After installing the already mentioned kernel patch
141444-09 SunOS 5.10: kernel patch or
141511-05 SunOS 5.10_x86: ehci, ohci, uhci patch
the Sun Cluster nodes can hang during boot because they have exhausted the default number of autopush structures. When the clhbsndr module is loaded, it causes many more autopushes to occur than would otherwise happen on a non-clustered system. By default, only nautopush=32 of these structures are allocated.

Workarounds:
a) Do not install the kernel patch mentioned above with Sun Cluster
b) Boot in non-cluster mode and add the following to /etc/system (see the command sketch after this list):
set nautopush=64
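
In practice workaround b) boils down to something like this on a SPARC node (a rough sketch; on x86 add -x to the kernel line in the GRUB menu instead of using 'reboot -- -x'):
# reboot -- -x                             (boot this node in non-cluster mode)
# echo "set nautopush=64" >> /etc/system   (add the workaround setting)
# reboot                                   (boot back into the cluster)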

Solution:
126106-42 Sun Cluster 3.2: CORE patch for Solaris 10
126107-42 Sun Cluster 3.2: CORE patch for Solaris 10_x86
For Sun Cluster 3.1 the issue is fixed in:
120500-26 Sun Cluster 3.1: Core Patch for Solaris 10

Alert 1021684.1: Solaris autopush(1M) Changes (with patches 141444-09/141511-04) May Cause Sun Cluster 3.1 and 3.2 Nodes to Hang During Boot
This is reported in Bug 6879232

Friday May 08, 2009

Administration of zpool devices in Sun Cluster 3.2 environment


Configure zpools in Sun Cluster 3.2 carefully, because it is possible to use the same physical device in different zpools on different nodes at the same time. The zpool command does NOT check whether the physical device is already in use by another zpool on another node. For example, if node1 has an active zpool using device c3t3d0, it is still possible to create a new zpool with c3t3d0 on another node (assumption: c3t3d0 is the same shared device on all cluster nodes).

If problems occur due to such administration mistakes, errors like the following have been seen during testing:

NODE1# zpool import tank
cannot import 'tank': I/O error

NODE2# zpool import tankothernode
cannot import 'tankothernode': one or more devices is currently unavailable

NODE2# zpool import tankothernode
cannot import 'tankothernode': no such pool available

NODE1# zpool import tank
cannot import 'tank': pool may be in use from other system, it was last accessed by NODE2 (hostid: 0x83083465) on Fri May 8 13:34:41 2009
use '-f' to import anyway
NODE1# zpool import -f tank
cannot import 'tank': one or more devices is currently unavailable


Furthermore, the zpool command will also use a disk without any warning if it is already in use by a Solaris Volume Manager diskset or a Symantec (Veritas) Volume Manager disk group.

Summary for Sun Cluster environments:
ALWAYS MANUALLY CHECK THAT THE DEVICE YOU WANT TO USE FOR A ZPOOL IS REALLY FREE! A possible set of checks is sketched below.
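
A possible set of checks to run on every cluster node before adding a shared device to a new zpool (a sketch; adjust it to the volume managers actually in use):
# zpool status                  (pools currently imported on this node)
# zpool import                  (pools visible to this node but not yet imported)
# metaset                       (Solaris Volume Manager disksets and their drives)
# vxdisk -o alldgs list         (Veritas Volume Manager disks and disk groups)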


This is addressed in bug 6783988.

