Friday Oct 30, 2009

Kernel patch 141444-09 or 141445-09 with Sun Cluster 3.2

As stated in my last blog, the following kernel patches are included in Solaris 10 10/09 Update8:
141444-09 SunOS 5.10: kernel patch or
141445-09 SunOS 5.10_x86: kernel patch

Update 10.Dec.2009:
Support of Solaris 10 10/09 Update8 with Sun Cluster 3.2 1/09 Update2 is now announced. The recommendation is to use Sun Cluster core patch 126106-39 (sparc) / 126107-39 (x86) with Solaris 10 10/09 Update8. Note: The -39 Sun Cluster core patch is a feature patch because the -38 Sun Cluster core patch is part of Sun Cluster 3.2 11/09 Update3, which is already released.
For new installations/upgrades with Solaris 10 10/09 Update8 use:
\* Sun Cluster 3.2 11/09 Update3 with Sun Cluster core patch -39 (fix problem 1)
\* Use the patches 142900-02 / 142901-02 (fix problem 2)
\* Add "set nautopush=64" to /etc/system (workaround for problem 3)

For patch updates to 141444-09/141445-09 use:
\* Sun Cluster core patch -39 (fix problem 1)
\* Also use patches 142900-02 / 142901-02 (fix problem 2)
\* Add "set nautopush=64" to /etc/system (workaround for problem 3)


It's time to point out that there are some issues with these kernel patches in combination with Sun Cluster 3.2.

1.) The patch breaks the zpool cachefile feature if using SUNW.HAStoragePlus

a.) If the kernel patch 141444-09 (sparc) / 141445-09 (x86) is installed on a Sun Cluster 3.2 system where the Sun Cluster core patch 126106-33 (sparc) / 126107-33 (x86) is already installed then hastorageplus_prenet_start will fail with the following error message:
...
Oct 26 17:51:45 nodeA SC[,SUNW.HAStoragePlus:6,rg1,rs1,hastorageplus_prenet_start]: Started searching for devices in '/dev/dsk' to find the importable pools.
Oct 26 17:51:53 nodeA SC[,SUNW.HAStoragePlus:6,rg1,rs1,hastorageplus_prenet_start]: Completed searching the devices in '/dev/dsk' to find the importable pools.
Oct 26 17:51:54 nodeA zfs: [ID 427000 kern.warning] WARNING: pool 'zpool1' could not be loaded as it was last accessed by another system (host: nodeB hostid: 0x8516ced4). See: http://www.sun.com/msg/ZFS-8000-EY
...


b.) If the kernel patch 141444-09 (sparc) / 141445-09 (x86) is installed on a Sun Cluster 3.2 system where the Sun Cluster core patch 126106-35 (sparc) / 126107-35 (x86) is already installed then hastorageplus_prenet_start will work but the zpool cachefile feature of SUNW.HAStoragePlus is disabled. Without the zpool cachefile feature the time of zpool import increases because the import will scan all available disks. The messages look like:
...
Oct 30 15:37:45 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 148650 daemon.notice] Started searching for devices in '/dev/dsk' to find the importable pools.
Oct 30 15:37:45 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 148650 daemon.notice] Started searching for devices in '/dev/dsk' to find the importable pools.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 547433 daemon.notice] Completed searching the devices in '/dev/dsk' to find the importable pools.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 547433 daemon.notice] Completed searching the devices in '/dev/dsk' to find the importable pools.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 792255 daemon.warning] Failed to update the cachefile contents in /var/cluster/run/HAStoragePlus/zfs/zpool1.cachefile to CCR table zpool1.cachefile for pool zpool1 : file /var/cluster/run/HAStoragePlus/zfs/zpool1.cachefile open failed: No such file or directory.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 792255 daemon.warning] Failed to update the cachefile contents in /var/cluster/run/HAStoragePlus/zfs/zpool1.cachefile to CCR table zpool1.cachefile for pool zpool1 : file /var/cluster/run/HAStoragePlus/zfs/zpool1.cachefile open failed: No such file or directory.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 205754 daemon.info] All specified device services validated successfully.
...


If the ZFS cachefile feature is not required AND the above kernel patches are installed, problem a.) is resolved by installing Sun Cluster core patch 126106-35 (sparc) / 126107-35 (x86).
Solution for a) and b):
126106-39 Sun Cluster 3.2: CORE patch for Solaris 10
126107-39 Sun Cluster 3.2: CORE patch for Solaris 10_x86
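
To check whether a node already carries the required core patch revision, the installed patch list can be compared against the revisions named above. The following is only a minimal sketch: it assumes the text comes from `showrev -p` and that each entry starts with `Patch: <id>-<rev>`; the function names are made up for this illustration.

```python
import re

# Assumption: every installed patch is listed on a line that begins with
# "Patch: NNNNNN-RR", as printed by `showrev -p`.
PATCH_LINE = re.compile(r"^Patch:\s+(\d{6})-(\d{2})\b")


def installed_revision(showrev_text, base_id):
    """Return the highest installed revision of patch base_id, or None."""
    best = None
    for line in showrev_text.splitlines():
        match = PATCH_LINE.match(line)
        if match and match.group(1) == base_id:
            revision = int(match.group(2))
            best = revision if best is None else max(best, revision)
    return best


def meets(showrev_text, required):
    """True if the patch named by required (e.g. '126106-39') is installed
    at that revision or a higher one."""
    base_id, revision = required.split("-")
    have = installed_revision(showrev_text, base_id)
    return have is not None and have >= int(revision)
```

For example, `meets(text, "126106-39")` on a sparc node tells you whether the fix for a) and b) is already in place, where `text` holds the captured patch list.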

Alert 1021629.1: A Solaris Kernel Change Stops Sun Cluster Using "zpool.cachefiles" to Import zpools Resulting in ZFS pool Import Performance Degradation or Failure to Import the zpools

2.) The patch breaks probe-based IPMP if more than one interface is in the same IPMP group

After installing the already mentioned kernel patch:
141444-09 SunOS 5.10: kernel patch or
141445-09 SunOS 5.10_x86: kernel patch
then the probe-based IPMP feature is broken if the system uses more than one interface in the same IPMP group. This means that all Solaris 10 systems which use more than one interface in the same probe-based IPMP group are affected!

After installing this kernel patch the following errors will be sent to the system console after a reboot:
...
nodeA console login: Oct 26 19:34:41 in.mpathd[210]: NIC failure detected on bge0 of group ipmp0
Oct 26 19:34:41 in.mpathd[210]: Successfully failed over from NIC bge0 to NIC e1000g0
...

Workarounds:
a) Use link-based IPMP instead of probe-based IPMP
b) Use only one interface in the same IPMP group if using probe-based IPMP
See the blog "Tips to configure IPMP with Sun Cluster 3.x" for more details if you would like to change the configuration.
c) Do not install the kernel patch listed above. Note: A fix is already in progress and can be requested via a service request. I will update this blog when the general fix is available.

Solution:
142900-02 SunOS 5.10: kernel patch
142901-02 SunOS 5.10_x86: kernel patch

Alert 1021262.1 : Solaris 10 Kernel Patches 141444-09 and 141445-09 May Cause Interface Failure in IP Multipathing (IPMP)
This is reported in Bug 6888928

3.) When applying the patch Sun Cluster can hang on reboot

After installing the already mentioned kernel patch:
141444-09 SunOS 5.10: kernel patch or
141511-05 SunOS 5.10_x86: ehci, ohci, uhci patch
the Sun Cluster nodes can hang during boot because they have exhausted the default number of autopush structures. When the clhbsndr module is loaded, it causes many more autopushes to occur than would otherwise happen on a non-clustered system. By default, only nautopush=32 of these structures are allocated.

Workarounds:
a) Do not use the mentioned kernel patch with Sun Cluster
b) Boot in non-cluster-mode and add the following to /etc/system
set nautopush=64
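
Before rebooting into cluster mode it can be worth confirming that the workaround line is really active in /etc/system and not commented out. This is a rough sketch only: the function names are invented for this example, and it assumes the usual /etc/system rule that lines beginning with `*` or `#` are comments.

```python
import re

SET_NAUTOPUSH = re.compile(r"set\s+nautopush\s*=\s*(\S+)$")


def nautopush_setting(system_text):
    """Return the last active nautopush value in /etc/system text, or None."""
    value = None
    for raw in system_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines ('*' or '#' in the first column).
        if not line or line[0] in "*#":
            continue
        match = SET_NAUTOPUSH.match(line)
        if match:
            try:
                # Base 0 accepts plain decimal and 0x-prefixed hex values.
                value = int(match.group(1), 0)
            except ValueError:
                pass  # leave unparsable values for a human to look at
    return value


def workaround_missing(system_text, minimum=64):
    """True if nautopush is unset or lower than the value suggested above."""
    value = nautopush_setting(system_text)
    return value is None or value < minimum
```

Reading the file with `open("/etc/system").read()` and passing the text to `workaround_missing()` is enough for a quick check on each node.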

Solution:
126106-42 Sun Cluster 3.2: CORE patch for Solaris 10
126107-42 Sun Cluster 3.2: CORE patch for Solaris 10_x86
for Sun Cluster 3.1 the issue is fixed in:
120500-26 Sun Cluster 3.1: Core Patch for Solaris 10

Alert 1021684.1: Solaris autopush(1M) Changes (with patches 141444-09/141511-04) May Cause Sun Cluster 3.1 and 3.2 Nodes to Hang During Boot
This is reported in Bug 6879232

Thursday Dec 11, 2008

Tips to configure IPMP with Sun Cluster 3.x

Configure IPMP (probe based or link based):
Set up IPMP (IP network multipathing) groups on all nodes for all public network interfaces which are used for an HA dataservice. This article summarizes the possibilities and known issues. An overview of IPMP can be found in System Administration Guide: IP Services.


Example probe-based IPMP group active-active with interfaces qfe0 and qfe4 with one production IP:

Entry of /etc/hostname.qfe0:
<production_IP_host> netmask + broadcast + group ipmp1 up \\
addif <test_IP_host> netmask + broadcast + deprecated -failover up

Entry of /etc/hostname.qfe4:
<test_IP_host> netmask + broadcast + group ipmp1 deprecated -failover up
The IPMP group name ipmp1 is freely chosen in this example!

If the defaultrouter is NOT 100% available please read
Technical Instruction 1010640.1: Summary of typical IPMP Configurations
and
Technical Instruction 1001790.1: The differences between Network Adapter Failover (Sun Cluster 3.0) and IP Multipathing (Sun Cluster 3.1)

Notes:
\* Do not use test IP for normal applications.
\* When using Solaris9 12/02 or later & Sun Cluster 3.1 Update1 or later there is no need for an IPMP test address if you have only 1 IP address in the IPMP group. (RFE 4511634, 4741473)
  e.g. entry of /etc/hostname.qfe0:
    <production_IP_host> netmask + broadcast + group ipmp1 up
\* The test IPs for all adapters in the same IPMP group must belong to a single IP subnet.
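
The last note can be checked mechanically once the test addresses and their prefix lengths are known. A small sketch using only the Python standard library; the function name and the `address/prefix` input format are choices made for this example.

```python
import ipaddress


def same_subnet(test_addrs):
    """True if all 'address/prefix' strings fall into one IP subnet.

    Example input: ['192.168.100.123/24', '192.168.100.122/24']
    """
    networks = {ipaddress.ip_interface(addr).network for addr in test_addrs}
    return len(networks) == 1
```

If the function returns False for the test IPs of one IPMP group, the group violates the single-subnet rule above.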


Example link-based IPMP group active-active with interfaces qfe0 and qfe4 with one production IP:

Entry of /etc/hostname.qfe0:
<production_IP_host> netmask + broadcast + group ipmp1 up
Entry of /etc/hostname.qfe4:
<dummy_IP_host> netmask + broadcast + deprecated group ipmp1 up

Notes:
\* Do NOT use the 0.0.0.0 IP address as dummy_IP_host for link based due to bug 6457375.
\* At this time the recommendation is to use a valid IP address, but it can also be a dummy IP address.

Bug 6457375 is fixed in kernel update patch 138888-01 (sparc) or 138889-01 (x86). These kernel patches are based on Solaris 10 10/08 Update6. With them it is possible to use the 0.0.0.0 IP address as described in the following example:
Entry of /etc/hostname.qfe0:
<production_IP_host> netmask + broadcast + group ipmp1 up
Entry of /etc/hostname.qfe4:
group ipmp1 up

Further Details:
Technical Instruction 1008064.1: IPMP Link-based Only Failure Detection with Solaris 10 Operating System (OS)


Hints / Checkpoints for all configurations:
  • You need an additional IP for each logical host.

  • If a firewall is used between clients and an HA service running on this cluster, and if this HA service uses UDP and does not bind to a specific address, the IP stack chooses the source address for all outgoing packets from the routing table. Because the routing table might change, there is no guarantee that the same source address is chosen for all packets, so it is necessary to configure all addresses available on a network interface as valid source addresses in the firewall. More details can be found in the blog Why a logical IP is marked as DEPRECATED?

  • An IPMP group in an active-standby configuration is also possible.

  • In the /etc/default/mpathd file, the value of TRACK_INTERFACES_ONLY_WITH_GROUPS must be yes (default).

  • In case of Sun Cluster 3.1, the value of FAILBACK in the /etc/default/mpathd file must be yes (default). Bug 6429808. Fixed in Sun Cluster 3.2.

  • Use only one IPMP group in the same subnet. Using more than one IPMP group in the same subnet is not supported.

  • The SC installer adds an IPMP group to all public network adapters. If desired remove the IPMP configuration for network adapters that will NOT be used for HA dataservices.

  • Remove IPMP groups from dman interfaces (SunFire 12/15/20/25K) if they exist. (Bug 6309869)
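
The two /etc/default/mpathd checkpoints above can be verified with a few lines of code. This is a sketch under assumptions: it treats the file as simple `KEY=value` lines with `#` comments, treats a missing key as the default yes (as stated in the hints), and the function names are invented here.

```python
def mpathd_settings(text):
    """Parse KEY=value lines from /etc/default/mpathd text into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def mpathd_problems(text, require_failback=True):
    """Return the names of settings that are not 'yes'.

    require_failback should be True for Sun Cluster 3.1 only; the FAILBACK
    restriction is described above as fixed in Sun Cluster 3.2.
    """
    settings = mpathd_settings(text)
    keys = ["TRACK_INTERFACES_ONLY_WITH_GROUPS"]
    if require_failback:
        keys.append("FAILBACK")
    # A key that is absent counts as the default, which is yes.
    return [key for key in keys if settings.get(key, "yes").lower() != "yes"]
```

An empty result means both checkpoints are satisfied for the chosen cluster release.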

Wednesday Jan 03, 2007

IPMP in a test subnet

IP multipathing (IPMP) can be used with probe-based failure detection in a test subnet. In this case the functionality changed from Solaris 9 onwards. The deprecated flag has been fully implemented as per the ifconfig man page: "An address associated with a deprecated interface will not be used as source address for outbound packets unless either there are no other addresses available on the interface or the application has bound to this address explicitly".
This means that Solaris 9 and 10 systems in the test subnet will not become probe targets for each other. So, in case of an upgrade from Solaris 8 to Solaris 9 or higher, you may need to change your configuration.
The major advantage is that you can use different IPs in the test subnet than the IP data addresses. This saves IP data addresses in the production network. Private network addresses as specified by RFC 1918 (e.g. 10/8, 172.16/12, or 192.168/16) can be used in the test subnet as well.


Typical setup of IPMP with test subnet:

ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 10.16.51.123 netmask fffffe00 broadcast 10.16.51.255
        groupname ipmp0
        ether 0:3:ba:19:90:d9
ce0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
        inet 192.168.100.123 netmask ffffff00 broadcast 192.168.100.255
ce1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
        inet 192.168.100.122 netmask ffffff00 broadcast 192.168.100.255
        groupname ipmp0
        ether 0:3:ba:19:90:da

If all test partner hosts are also using IPMP for the test subnet, be aware that the test partners have the deprecated flag ON. In case of Solaris 9 and Solaris 10, an interface which has the deprecated flag ON does not answer the "all hosts" multicast which is used for the automatic IPMP test partner detection (only used if the test subnet is without a defaultrouter). To solve this issue you can add a logical network interface with the deprecated/nofailover flags OFF to answer the "all hosts" multicast.
e.g. (note the new interface ce0:2):

ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 10.16.51.123 netmask fffffe00 broadcast 10.16.51.255
        groupname ipmp0
        ether 0:3:ba:19:90:d9
ce0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
        inet 192.168.100.123 netmask ffffff00 broadcast 192.168.100.255
ce0:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 192.168.100.200 netmask ffffff00 broadcast 192.168.100.255
ce1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
        inet 192.168.100.122 netmask ffffff00 broadcast 192.168.100.255
        groupname ipmp0
        ether 0:3:ba:19:90:da

The hostname files for this example:
/etc/hostname.ce0
10.16.51.123 netmask + broadcast + group ipmp0 up \\
addif 192.168.100.123 netmask + broadcast + deprecated -failover up \\
addif 192.168.100.200 netmask + broadcast + up
/etc/hostname.ce1
192.168.100.122 netmask + broadcast + group ipmp0 deprecated -failover up
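
To confirm on a running system that at least one address in the test subnet is able to answer the "all hosts" multicast, the ifconfig output shown above can be scanned for addresses without the DEPRECATED flag. The following is a sketch only: it assumes the `ifconfig -a` layout printed in this article, and the function names are hypothetical.

```python
import ipaddress
import re

# Header line such as "ce0:1: flags=9040843<UP,...,NOFAILOVER> mtu 1500 index 2"
HEADER = re.compile(r"^(\S+): flags=[0-9a-f]+<([^>]*)>")
# Indented address line such as "        inet 192.168.100.123 netmask ..."
INET = re.compile(r"^\s*inet (\S+)")


def parse_ifconfig(text):
    """Return a list of dicts with name, flags (set) and inet address."""
    entries = []
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = {
                "name": header.group(1),
                "flags": set(header.group(2).split(",")),
                "inet": None,
            }
            entries.append(current)
            continue
        inet = INET.match(line)
        if inet and current is not None:
            current["inet"] = inet.group(1)
    return entries


def non_deprecated_in_subnet(text, test_subnet):
    """Names of interfaces in test_subnet that lack the DEPRECATED flag."""
    network = ipaddress.ip_network(test_subnet)
    return [
        entry["name"]
        for entry in parse_ifconfig(text)
        if entry["inet"]
        and ipaddress.ip_address(entry["inet"]) in network
        and "DEPRECATED" not in entry["flags"]
    ]
```

With the second ifconfig listing above and the test subnet `192.168.100.0/24`, only the added logical interface should be reported; an empty list would mean no address in the test subnet can act as a probe target for the partners.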



For more about IPMP you should refer to:
Infodoc 1010640.1: Summary of typical IPMP Configurations
Infodoc 1008064.1: IPMP Link-based Failure Detection with Solaris [TM] 10 Operating System (OS) and higher

About

I'm still mostly blogging around Solaris Cluster and support. Independently if for Sun Microsystems or Oracle. :-)
