Tuesday Feb 26, 2013

Solaris on Exalogic - Effect of VNIC over eoib0 & eoib1

There are many reasons for a customer to create VNICs over eoib0 and eoib1 on a compute node running Solaris. Two typical examples are:

1) the compute node needs to connect to a VLAN over the EoIB network

2) containers running on the compute node require 10GbE connectivity
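
As a rough illustration of the first case, a VLAN-tagged VNIC can be created with the -v option of dladm create-vnic. The sketch below wraps that in a small helper. The helper names (run, create_vnic), the DRY_RUN switch, and VLAN ID 125 are assumptions made for illustration, not values taken from this setup.

```shell
# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# create_vnic LINK VNIC [VLAN_ID]
# Creates VNIC over LINK; tags it with VLAN_ID when one is given.
create_vnic() {
    if [ -n "${3:-}" ]; then
        run dladm create-vnic -l "$1" -v "$3" "$2"
    else
        run dladm create-vnic -l "$1" "$2"
    fi
}

# Example (hypothetical VLAN ID 125), reviewing the command first:
#   DRY_RUN=1 create_vnic eoib0 vnic0 125
#   create_vnic eoib0 vnic0 125 && dladm show-vnic vnic0
```

Running with DRY_RUN=1 only prints the dladm command, which makes it easy to check the link name and VLAN ID before anything changes on the node.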

In a previous blog entry we discussed why transitive probe-based failure detection is required; the focus there was on the link between the IB gateway and the customer's 10GbE infrastructure.

In fact, if VNICs are created over eoib0 and eoib1, bond1 may not fail over even if the link between the compute node and the IB gateway goes down!

Here is a simple test to illustrate this scenario:

First of all, let's create a VNIC over eoib0 using the following command:

root@el01cn01:~# dladm create-vnic -l eoib0 vnic0

This is what the IPMP interfaces look like at this point:

root@el01cn01:~# ipmpstat -i
INTERFACE   ACTIVE  GROUP       FLAGS     LINK      PROBE     STATE
eoib0       yes     bond1       --mb---   up        disabled  ok
eoib1       no      bond1       is-----   up        disabled  ok
bond0_0     yes     bond0       --mb---   up        disabled  ok
bond0_1     no      bond0       is-----   up        disabled  ok

Then we take down the link between the compute node and the IB gateway that carries eoib0. This is what we get:

root@el01cn01:~# ipmpstat -i
INTERFACE   ACTIVE  GROUP       FLAGS     LINK      PROBE     STATE
eoib0       yes     bond1       --mb---   up        disabled  ok
eoib1       no      bond1       is-----   up        disabled  ok
bond0_0     no      bond0       -------   down      disabled  failed
bond0_1     yes     bond0       -smb---   up        disabled  ok

Notice that bond0 has failed over but bond1 has not. Even though the LINK status of eoib0 is still up, it has actually lost connectivity to the 10GbE network.

The reason for this behavior is the vnic0 that we created over eoib0: from the operating system's point of view, the link between eoib0 and vnic0 is still up, so no failover of bond1 occurs.
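
One way to spot interfaces exposed to this problem is to filter the ipmpstat -i output for rows where LINK is up while PROBE is disabled, since those interfaces depend on link state alone. The following is a minimal sketch; the function name link_only_interfaces is made up for illustration, and it assumes the seven-column layout shown above.

```shell
# Reads `ipmpstat -i` output on stdin and prints "INTERFACE GROUP" for
# every interface whose LINK is up but whose PROBE column is disabled.
link_only_interfaces() {
    awk '$1 != "INTERFACE" && $5 == "up" && $6 == "disabled" { print $1, $3 }'
}

# Usage on a live system:
#   ipmpstat -i | link_only_interfaces
```

Any interface listed here would keep reporting "ok" after a failure that the local link state does not reflect, which is exactly what happened to eoib0 in the test.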

This is another good reason why probe-based failure detection is required.
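
For reference, on Oracle Solaris 11 transitive probing is switched on through a property of the IPMP SMF service. The sketch below shows those two steps behind a dry-run wrapper; the helper names (run, enable_transitive_probing) and the DRY_RUN switch are assumptions for illustration, and the commands should be checked against the documentation for the Solaris release in use.

```shell
# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Turn on transitive probing for in.mpathd, then refresh the service
# so that the new property value takes effect.
enable_transitive_probing() {
    run svccfg -s svc:/network/ipmp setprop config/transitive-probing=true &&
    run svcadm refresh svc:/network/ipmp:default
}

# Usage:
#   DRY_RUN=1 enable_transitive_probing   # review the commands
#   enable_transitive_probing             # apply them
#   ipmpstat -p                           # then watch the probes
```

Once probing is active, ipmpstat -p and ipmpstat -t can be used to check the probes and their targets.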




About

The primary contributors to this blog come from the Exalogic and Cloud Application Foundation contingent of Oracle's Fusion Middleware Architecture Team, fondly known as the A-Team. As part of the Oracle development organization, the A-Team supports some of Oracle's largest and most strategic customers worldwide. Our mission is to provide deep technical expertise to support the various Oracle field organizations and customers deploying Oracle Fusion Middleware related products, and to collect real-world feedback to continuously improve the products we support. In this blog, our experts and guest experts will focus on Exalogic, WebLogic, Coherence, Tuxedo/mainframe migration, Enterprise Manager and JDK/JRockit performance tuning. It is our way to share some of our experiences with the Oracle community. We hope our followers take away something of value from our experiences. Thank you for visiting and please come back soon.
