What is bondib1 used for on SPARC SuperCluster with InfiniBand, Solaris 11 networking & Oracle RAC?

A co-worker asked the following question about a SPARC SuperCluster InfiniBand network:

> on the database nodes the RAC nodes communicate over the cluster_interconnect. This is the
> 192.168.10.0 network on bondib0. (according to ./crs/install/crsconfig_params NETWORKS
> setting) 
> What is bondib1 used for? Is it a HA counterpart in case bondib0 dies?

This is my response:

Summary: In a SPARC SuperCluster installation, bondib0 and bondib1 are the InfiniBand links that are used for the private interconnect (which carries the global cache data blocks and the cluster heartbeat) and for communication with the Exadata storage cells. Because the database is currently idle, bondib1 is only carrying outbound cluster interconnect traffic.

Details:

bondib0 is the cluster_interconnect:

$ oifcfg getif           
bondeth0  10.129.184.0  global  public
bondib0  192.168.10.0  global  cluster_interconnect
ipmpapp0  192.168.30.0  global  public


bondib0 and bondib1 are on 192.168.10.1 and 192.168.10.2, respectively:

# ipadm show-addr | grep bondi
bondib0/v4static  static   ok           192.168.10.1/24
bondib1/v4static  static   ok           192.168.10.2/24
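
Each bond* interface here is actually a Solaris IPMP group (shown in more detail further down). As a side note, not part of the original investigation, ipmpstat's address view ties each data address to its group and shows which underlying interfaces carry its traffic:

# ipmpstat -an       # GROUP ties each data address to its IPMP group;
                     # INBOUND/OUTBOUND list the underlying interfaces
                     # currently carrying traffic for that address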


This private network is also used to communicate with the Exadata Storage Cells. Notice that the network addresses of the Exadata Cell Disks are on the same subnet as the private interconnect:  

SQL> column path format a40
SQL> select path from v$asm_disk;

PATH
----------------------------------------
o/192.168.10.9/DATA_SSC_CD_00_ssc9es01  
o/192.168.10.9/DATA_SSC_CD_01_ssc9es01
...
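
To pull out just the distinct cell IP addresses, a quick sketch (the regular expression and the sqlplus invocation are my additions; it assumes the shell already has the database instance environment set and the OS user can connect as sysdba):

# echo "select distinct regexp_substr(path, '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+') from v\$asm_disk;" | sqlplus -s / as sysdba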

The hostnames tied to these IPs are node1-priv1 and node1-priv2:

# grep 192.168.10 /etc/hosts
192.168.10.1    node1-priv1.us.oracle.com   node1-priv1
192.168.10.2    node1-priv2.us.oracle.com   node1-priv2

For the four-compute-node RAC:

  • Each compute node has two IP addresses on the 192.168.10.0 private network.
  • Each IP address has an active InfiniBand link and a failover InfiniBand link.
  • Thus, the compute nodes are using a total of 8 IP addresses and 16 InfiniBand links for this private network.
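
To verify all eight addresses from a single node, something like the following works (a sketch: it assumes passwordless root ssh between the nodes, and node1 through node4 are stand-ins for the actual compute node hostnames):

# for node in node1 node2 node3 node4; do
>   echo "== $node =="
>   ssh $node 'ipadm show-addr | grep 192.168.10'
> done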

bondib1 isn't being used for the Virtual IP (VIP):

$ srvctl config vip -n node1
VIP exists: /node1-ib-vip/192.168.30.25/192.168.30.0/255.255.255.0/ipmpapp0, hosting node node1
VIP exists: /node1-vip/10.55.184.15/10.55.184.0/255.255.255.0/bondeth0, hosting node node1


bondib1 is on bondib1_0 and fails over to bondib1_1:

# ipmpstat -g
GROUP       GROUPNAME   STATE     FDT       INTERFACES
ipmpapp0    ipmpapp0    ok        --        ipmpapp_0 (ipmpapp_1)
bondeth0    bondeth0    degraded  --        net2 [net5]
bondib1     bondib1     ok        --        bondib1_0 (bondib1_1)
bondib0     bondib0     ok        --        bondib0_0 (bondib0_1)
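
To see which underlying interface within each group is active and which is the standby, the per-interface view can be used (an aside; its output was not captured during the original investigation):

# ipmpstat -i        # the ACTIVE column distinguishes the in-use link
                     # from its standby within each IPMP group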


bondib1_0 goes over net24:

# dladm show-link | grep bond
LINK                CLASS     MTU    STATE    OVER
bondib0_0           part      65520  up       net21
bondib0_1           part      65520  up       net22
bondib1_0           part      65520  up       net24
bondib1_1           part      65520  up       net23


net24 is on IB partition FFFF:

# dladm show-ib
LINK         HCAGUID         PORTGUID        PORT STATE  PKEYS
net24        21280001A1868A  21280001A1868C  2    up     FFFF
net22        21280001CEBBDE  21280001CEBBE0  2    up     FFFF,8503
net23        21280001A1868A  21280001A1868B  1    up     FFFF,8503
net21        21280001CEBBDE  21280001CEBBDF  1    up     FFFF
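
The "part" class seen in the dladm show-link output above refers to IB partition datalinks; dladm can also list them with their partition keys directly (another way to cross-check the PKEYS column, not run during the original investigation):

# dladm show-part    # lists each IB partition datalink with its PKEY and
                     # the physical IB link it runs over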


net24 is on PCI Express Module 9, port 2:

# dladm show-phys -L
LINK              DEVICE       LOC
net21             ibp4         PCI-EM1/PORT1
net22             ibp5         PCI-EM1/PORT2
net23             ibp6         PCI-EM9/PORT1
net24             ibp7         PCI-EM9/PORT2


Outbound traffic on the 192.168.10.0 network will be multiplexed between bondib0 and bondib1:

# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
192.168.10.0         192.168.10.2         U        16    6551834 bondib1  
192.168.10.0         192.168.10.1         U         9    5708924 bondib0  
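
To see which of the two equal routes the kernel will pick for a particular destination, route get can be asked directly (192.168.10.9 is one of the storage cell IPs shown earlier):

# route -n get 192.168.10.9   # the "interface:" line of the output shows
                              # the link selected for this destination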


The database is currently idle, so there is no traffic to the Exadata Storage Cells at the moment, nor any traffic induced by the global cache; only the heartbeat is active. There is more traffic on bondib0 than on bondib1:

# /bin/time snoop -I bondib0 -c 100 > /dev/null
Using device ipnet/bondib0 (promiscuous mode)
100 packets captured

real        4.3
user        0.0
sys         0.0


(100 packets in 4.3 seconds = 23.3 pkts/sec)

# /bin/time snoop -I bondib1 -c 100 > /dev/null
Using device ipnet/bondib1 (promiscuous mode)
100 packets captured

real       13.3
user        0.0
sys         0.0


(100 packets in 13.3 seconds = 7.5 pkts/sec)

About half of the packets on bondib0 are outbound (from this node); the rest arrive from the other nodes in the cluster:

# snoop -I bondib0 -c 100 | awk '{print $1}' | sort | uniq -c
Using device ipnet/bondib0 (promiscuous mode)
100 packets captured
  49 node1-priv1.us.oracle.com
  24 node2-priv1.us.oracle.com
  14 node3-priv1.us.oracle.com
  13 node4-priv1.us.oracle.com

All of the packets on bondib1 are outbound (from this node), but the packet headers show that the source is the IP address associated with bondib0:

# snoop -I bondib1 -c 100 | awk '{print $1}' | sort | uniq -c
Using device ipnet/bondib1 (promiscuous mode)
100 packets captured
 100 node1-priv1.us.oracle.com

The destinations of the outbound bondib1 packets are split evenly between node3 and node4:

# snoop -I bondib1 -c 100 | awk '{print $3}' | sort | uniq -c
Using device ipnet/bondib1 (promiscuous mode)
100 packets captured
  51 node3-priv1.us.oracle.com
  49 node4-priv1.us.oracle.com

Conclusion: In a SPARC SuperCluster installation, bondib0 and bondib1 are the InfiniBand links that are used for the private interconnect (which carries the global cache data blocks and the cluster heartbeat) and for communication with the Exadata storage cells. Because the database was idle during this investigation, bondib1 was only carrying outbound cluster interconnect traffic.
