private interconnect and patch 138888/138889


In specific Sun Cluster 3.x configurations a cluster node cannot join the cluster. Most of the time this issue comes up after the installation of the kernel update patch
138888-01 until 139555-08 or higher SunOS 5.10: Kernel Patch OR
138889-01 until 139556-08 or higher SunOS 5.10_x86: Kernel Patch
AND
Sun Cluster 3.x using an Ethernet switch (with VLAN) for the private interconnect
AND
Sun Cluster 3.x using e1000g, nxge, bge or ixgb (GLDv3) interfaces for the private interconnect.

During boot of the cluster node, the issue shows up as messages similar to the following:
...
Jan 25 15:46:14 node1 genunix: [ID 279084 kern.notice] NOTICE: CMM: node reconfiguration #2 completed.
Jan 25 15:46:15 node1 genunix: [ID 884114 kern.notice] NOTICE: clcomm: Adapter e1000g1 constructed
Jan 25 15:46:15 node1 ip: [ID 856290 kern.notice] ip: joining multicasts failed (18) on clprivnet0 - will use link layer broadcasts for multicast
Jan 25 15:46:16 node1 genunix: [ID 884114 kern.notice] NOTICE: clcomm: Adapter e1000g3 constructed
Jan 25 15:47:15 node1 genunix: [ID 604153 kern.notice] NOTICE: clcomm: Path node1:e1000g1 - node2:e1000g1 errors during initiation
Jan 25 15:47:15 node1 genunix: [ID 618107 kern.warning] WARNING: Path node1:e1000g1 - node2:e1000g1 initiation encountered errors, errno = 62. Remote node may be down or unreachable through this path.
Jan 25 15:47:16 node1 genunix: [ID 604153 kern.notice] NOTICE: clcomm: Path node1:e1000g3 - node2:e1000g3 errors during initiation
Jan 25 15:47:16 node1 genunix: [ID 618107 kern.warning] WARNING: Path node1:e1000g3 - node2:e1000g3 initiation encountered errors, errno = 62. Remote node may be down or unreachable through this path.
...
Jan 25 16:33:51 node1 genunix: [ID 224783 kern.notice] NOTICE: clcomm: Path node1:e1000g1 - node2:e1000g1 has been deleted
Jan 25 16:33:51 node1 genunix: [ID 638544 kern.notice] NOTICE: clcomm: Adapter e1000g1 has been disabled
Jan 25 16:33:51 node1 genunix: [ID 224783 kern.notice] NOTICE: clcomm: Path node1:e1000g3 - node2:e1000g3 has been deleted
Jan 25 16:33:51 node1 genunix: [ID 638544 kern.notice] NOTICE: clcomm: Adapter e1000g3 has been disabled
Jan 25 16:33:51 node1 ip: [ID 856290 kern.notice] ip: joining multicasts failed (18) on clprivnet0 - will use link layer broadcasts for multicast
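
To spot this symptom quickly in a long messages file, the warning lines can be filtered with a small script. The following is only a sketch: the regular expression is derived from the excerpt above, and the function name `failed_paths` is made up for this illustration. It does not prove that the VLAN issue is the cause, it only lists the interconnect paths that failed to initiate.

```python
import re

# Pattern derived from the clcomm warning in the log excerpt above.
PATH_ERROR = re.compile(
    r"WARNING: Path (?P<local>\S+) - (?P<remote>\S+) "
    r"initiation encountered errors, errno = (?P<errno>\d+)"
)


def failed_paths(lines):
    """Return (local, remote, errno) for each path initiation warning found."""
    found = []
    for line in lines:
        match = PATH_ERROR.search(line)
        if match:
            found.append(
                (match.group("local"), match.group("remote"), int(match.group("errno")))
            )
    return found


if __name__ == "__main__":
    sample = [
        "Jan 25 15:46:15 node1 genunix: [ID 884114 kern.notice] NOTICE: clcomm: "
        "Adapter e1000g1 constructed",
        "Jan 25 15:47:15 node1 genunix: [ID 618107 kern.warning] WARNING: Path "
        "node1:e1000g1 - node2:e1000g1 initiation encountered errors, errno = 62. "
        "Remote node may be down or unreachable through this path.",
    ]
    for local, remote, errno in failed_paths(sample):
        print(f"{local} -> {remote}: errno {errno}")
```

In practice the lines would come from the system log, for example by passing an open file object for /var/adm/messages to `failed_paths`.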

Update 6.Mar.2009:
Available now:
Alert 1020193.1 Kernel Patches/Changes may Stop Sun Cluster Nodes From Joining the Cluster

Update 26.Jun.2009:
The issue is fixed in the patches
141414-01 or higher SunOS 5.10: kernel patch OR
137104-02 or higher SunOS 5.10_x86: dls patch

Both patches require the 13955[56]-08 kernel update patch, which is included in Solaris 10 5/09 (update 7). When using Solaris 10 5/09 (update 7), Sun Cluster 3.2 requires the Sun Cluster core patch in revision -33 or higher. So the recommended way to get this fixed is to use Solaris 10 5/09 (update 7), that is patch 13955[56]-08 or higher plus 141414-01 (SPARC) or 137104-02 (x86), together with the Sun Cluster 3.2 core patch -33 or higher.
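
The "revision or higher" rule can be checked mechanically once the installed patch IDs are known. The sketch below assumes the IDs have already been collected into a list (for example from the patch listing of the system). The helper names are hypothetical, and the comparison only looks at the same base ID, so it does not know about patches that were superseded by a different patch number.

```python
def parse_patch(patch_id):
    """Split a patch ID such as '139555-08' into ('139555', 8)."""
    base, revision = patch_id.strip().split("-")
    return base, int(revision)


def meets_minimum(installed, minimum):
    """True if a patch with the same base ID and an equal or higher revision is installed."""
    min_base, min_rev = parse_patch(minimum)
    return any(
        base == min_base and rev >= min_rev
        for base, rev in map(parse_patch, installed)
    )


if __name__ == "__main__":
    # Hypothetical SPARC system; the list is made up for illustration.
    installed = ["139555-08", "141414-01", "126106-33"]
    for required in ("139555-08", "141414-01", "126106-33"):
        status = "ok" if meets_minimum(installed, required) else "MISSING"
        print(f"{required}: {status}")
```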


Choose one of the following corrective actions if you cannot install the patch with the fix:
  • Before installing the mentioned kernel update patch, configure VLAN tagging on the Sun interface and on the switch. VLAN-tagged packets are then expected and no longer dropped. This means the interface name changes to e.g. e1000g810000. After changing the configuration to e.g. e1000g810000 it is recommended to reboot the Sun Cluster hosts.

  • If the above-mentioned kernel update patch is already in use, enable QoS (Quality of Service) on the Ethernet switch. The switch must be able to handle priority tagging. Please refer to the switch documentation because each switch is different.

  • Do not install the above-mentioned kernel update patch if a VLAN is used for the Sun Cluster 3.x private interconnect.
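
The interface name e1000g810000 in the first action follows the Solaris 10 naming convention for VLAN-tagged interfaces, where the number after the driver name is VLAN ID * 1000 + device instance. The sketch below only illustrates that arithmetic. The function names are hypothetical and the VLAN ID 810 is simply the one implied by the example name.

```python
import re


def vlan_interface_name(driver, instance, vlan_id):
    """Build the VLAN interface name: driver + (vlan_id * 1000 + instance)."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    if not 0 <= instance <= 999:
        raise ValueError("device instance must be between 0 and 999")
    return f"{driver}{vlan_id * 1000 + instance}"


def split_interface_name(name):
    """Return (driver, instance, vlan_id); vlan_id is 0 for an untagged interface."""
    match = re.match(r"^(.*\D)(\d+)$", name)
    if match is None:
        raise ValueError(f"not an interface name: {name!r}")
    driver, number = match.group(1), int(match.group(2))
    return driver, number % 1000, number // 1000


if __name__ == "__main__":
    print(vlan_interface_name("e1000g", 0, 810))
    print(split_interface_name("e1000g810000"))
    print(split_interface_name("e1000g1"))
```

So e1000g810000 is VLAN 810 on e1000g0, and the same VLAN on e1000g1 would be e1000g810001.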

The mentioned kernel update patch delivers some new features in the GLDv3 architecture. It makes packets 802.1q standard compliant by including priority tagging. Therefore the following Sun Cluster 3.x configurations should not be affected:
  • Sun Cluster 3.x which uses ce, ge, hme, qfe, ipge or ixge network interfaces.
  • Sun Cluster 3.x which has back-to-back connections for the private interconnect.
  • Sun Cluster 3.x on Solaris 8 or Solaris 9.
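
The conditions from this post can be combined into a rough pre-check before installing the kernel update patch. This is only a sketch of the rules as described here: the driver lists are copied from the text, while the function name, the parameters and the three result strings are made up for the illustration.

```python
# Driver lists as named in this post.
AFFECTED_DRIVERS = {"e1000g", "nxge", "bge", "ixgb"}
UNAFFECTED_DRIVERS = {"ce", "ge", "hme", "qfe", "ipge", "ixge"}


def interconnect_risk(driver, uses_switch_with_vlan, solaris_release, has_kernel_patch):
    """Classify one private interconnect adapter according to the rules above."""
    if solaris_release in (8, 9):
        return "not expected to be affected"  # Solaris 8 or Solaris 9
    if not uses_switch_with_vlan:
        return "not expected to be affected"  # e.g. back-to-back connection
    if not has_kernel_patch:
        return "not expected to be affected"  # kernel update patch not installed
    if driver in UNAFFECTED_DRIVERS:
        return "not expected to be affected"
    if driver in AFFECTED_DRIVERS:
        return "possibly affected"
    return "unknown driver, check manually"


if __name__ == "__main__":
    print(interconnect_risk("e1000g", True, 10, True))
    print(interconnect_risk("ce", True, 10, True))
    print(interconnect_risk("nxge", False, 10, True))
```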

Comments:

Hi,
I have configured a single-node cluster on an M4K which is going to use nxge1 and bge1 for the private interconnect, and at the same time I want to configure a Sun Fire V490 and join it to the cluster using ce1 and ce3 as private interconnect. Will this prevent the node from joining the cluster when using back-to-back connections? The kernel patch version is 138888-03 on both systems.

Posted by carina on October 06, 2009 at 09:25 AM CEST #

I have seen this bug only when switches are used for the private interconnect. But for the private interconnect only identical NICs are supported, which means ce to ce, nxge to nxge, bge to bge and so on...

Posted by jschleich on October 08, 2009 at 07:11 AM CEST #

Hi,
I thought that as long as we are using Gigabit Ethernet interfaces that operate at the same speed, such as ce to nxge or bge1, it should be a supported configuration for the cluster interconnect. Please refer to the Cluster Interconnect Speed Requirements in the Sun Cluster manual for more information.

Posted by guest on October 12, 2009 at 01:57 AM CEST #

Yes, you are right. Here is the link:
http://docs.sun.com/app/docs/doc/819-2993/interconnectinstall-chapter?a=view

Posted by jschleich on October 13, 2009 at 06:33 AM CEST #

Hi,
Can you check whether this Sun Cluster 3.2 release supports Oracle 11gR2? The details are shown below:
/etc/cluster/release
Sun Cluster 3.2 for Solaris 10 sparc

Sun Cluster 3.2 Core patch for Solaris 10: 126106-27
Sun Cluster 3.2 HA Oracle patch for Solaris 10: 126047-07

Posted by carina on January 28, 2010 at 04:25 AM CET #

Hi,
Oracle 11gR2 is not yet qualified with Sun Cluster 3.2.

Posted by Juergen Schleich on January 29, 2010 at 10:11 AM CET #

Hi,
Judging from the Sun Cluster 3.2 core patch, this means I can run only Oracle 11g and not Oracle 11gR2, as the latter is not yet qualified.

(from 126106-11)

6626743 placeholder for CLARC/2007/1614 (Qualification of Oracle 11g on Solaris Cluster)
6654698 support for HA-Oracle for Oracle 11g

Posted by guest on January 29, 2010 at 04:03 PM CET #

Hi,

Yes, the mentioned bugs are only for the 11g release and not for 11gR1 or 11gR2. The 11g release can be used with Sun Cluster 3.2 RAC and HA-Oracle (SPARC and x86). The RAC 11gR1 release can be used with SPARC. Support for RAC 11gR1 is not available for x86 Sun Cluster…

Posted by jschleich on February 01, 2010 at 04:40 AM CET #

Hi,
Can I confirm that the 144221-01 core patch of Sun Cluster 3.2 is now able to support HA Oracle 11gR2, based on the following bug ID?

6917107 placeholder for FBCs 1822 and 1823, Oracle 11gR2 support

Is there any issue with upgrading the existing core patch 126106-27 (Sun Cluster 3.2: CORE patch for Solaris 10) to 144221-01, so that Oracle 10gR2 (10.2.0.4) can be upgraded to Oracle 11gR2?

Posted by guest on January 05, 2012 at 04:38 AM CET #

Yes, with 144221-01 you can run HA-Oracle with 11gR2, but you also need the patch 126047-15 or higher. If you are not running 126106-40 or higher you should refer to http://blogs.oracle.com/js/entry/summary_of_install_instructions_for
because 126106-42 is a requirement for 144221-01.
But the recommendation for running 11gR2 on Solaris Cluster 3.2 is to use SC3.2 11/09 (update 3) with at least 144221-02 and 126047-16. Even better, use SC3.3. For 11.2.0.3 64-bit, Solaris 10 9/10 (update 9) is the minimum requirement. Please use Oracle Communities for such questions at https://communities.oracle.com/portal/server.pt/community/oracle_solaris_cluster/393

Posted by Juergen on January 10, 2012 at 05:09 AM CET #
