
An Oracle blog about Ops Center

Using MPxIO and Veritas in Ops Center

Guest Author

Hey, folks. This is a guest post by Doug Schwabauer about using MPxIO and Veritas in Ops Center. It's well worth the read.

Enterprise Manager Ops Center has a rich feature set for Oracle VM Server for SPARC (LDOMs) technology. You can provision control domains, service domains, and LDOM guests, control the size and properties of the guests, add networks and storage to guests, create server pools, and so on. The foundation for many of these features is called a Storage Library. In Ops Center, a Storage Library can be Fibre Channel based, iSCSI based, or NAS based.

For FC- and iSCSI-based LUNs, Ops Center only recognizes LUNs that are presented to the Solaris host via MPxIO, the built-in multipathing software for Solaris. LUNs presented via other multipathing software, such as Symantec DMP or EMC PowerPath, are not recognized.

Sometimes users do not have the option to use MPxIO, or may choose not to use it for other reasons, such as use cases where SCSI3 Persistent Reservations (SCSI3 PR) are required.

It is possible to have a mix of MPxIO-managed LUNs and other LUNs, and to mix and match the LDOM guests and storage libraries for different purposes.

Below are two such use cases. In both, the user wanted to use Veritas Cluster Server (VCS) to cluster select applications running within LDOMs that are managed with Ops Center. However, the cluster requires I/O fencing via SCSI3 PR. In this scenario, storage devices cannot be presented via MPxIO, since SCSI3 PR requires direct access to storage from inside the guest OS, and MPxIO does not facilitate that capability. The user therefore thought they would have to choose between not using Ops Center and not using VCS.

We found and presented a "middle road" solution where the user is able to do both: use Ops Center for the majority of their LDOM/OVM Server for SPARC environment, but still use Veritas Dynamic Multi-Pathing (DMP) software to manage the disk devices used for the data protected by the cluster.

In both use cases, the hardware is the same:

  • Two T4-2 servers, each with four FC cards and two ports per card
  • Sun StorEdge 6180 FC LUNs presented to both hosts
  • One primary service domain and one alternate service domain (a root complex domain)
  • Each service domain sees two of the four FC cards.

See the following blog posts for more details on setting up Alternate Service Domains:

In our environment, the primary domain owns the cards in Slots 4 and 6, and the alternate domain owns the cards in Slots 1 and 9.

(Refer to the system's Service Manual for system schematics and bus/PCI layouts.)

A user can control which specific cards, and even which ports on a card, use MPxIO and which do not.

You can either enable MPxIO globally and then disable it on certain ports, or disable MPxIO globally and then enable it on certain ports. Either way accomplishes the same thing.

See the Enabling or Disabling Multipathing on a Per-Port Basis document for more information.

For example:

root@example:~# tail /etc/driver/drv/fp.conf
# "target-port-wwn,lun-list"
#
# To prevent LUNs 1 and 2 from being configured for target
# port 510000f010fd92a1 and target port 510000e012079df1, set:
#
# pwwn-lun-blacklist=
# "510000f010fd92a1,1,2",
# "510000e012079df1,1,2";
mpxio-disable="no";      <---------------------- Enable MPxIO globally
name="fp" parent="/pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0" port=0 mpxio-disable="yes";  <--- Disable on port
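For completeness, the inverse configuration (disable MPxIO globally, then re-enable it on selected ports) would look roughly like the following. This is a sketch only, reusing a parent path from the listing above; the exact path depends on your hardware:

```text
mpxio-disable="yes";      # Disable MPxIO globally
name="fp" parent="/pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0" port=0 mpxio-disable="no";  # Enable on this port
```

A reboot (or `stmsboot -u`) is typically needed for fp.conf changes to take effect.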


root@example:~# ls -l /dev/cfg
total 21
.
.
.
lrwxrwxrwx   1 root     root          60 Feb 13 12:51 c3 -> ../../devices/pci@400/pci@1/pci@0/pci@8/SUNW,qlc@0/fp@0,0:fc
lrwxrwxrwx   1 root     root          62 Feb 13 12:51 c4 -> ../../devices/pci@400/pci@1/pci@0/pci@8/SUNW,qlc@0,1/fp@0,0:fc
lrwxrwxrwx   1 root     root          60 Feb 13 12:51 c5 -> ../../devices/pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0/fp@0,0:fc
lrwxrwxrwx   1 root     root          62 Feb 13 12:51 c6 -> ../../devices/pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0,1/fp@0,0:fc
.
.
.

Therefore "c5" on the example host will not be using MPxIO.
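To double-check which controller a given fp.conf parent path maps to, you can match it against the /dev/cfg symlink targets. A minimal sketch, with the listing above embedded as sample data (on a live system you would pipe `ls -l /dev/cfg` instead):

```shell
# Sample /dev/cfg symlink targets, taken from the listing above (stand-in data).
cfg_listing='c3 -> ../../devices/pci@400/pci@1/pci@0/pci@8/SUNW,qlc@0/fp@0,0:fc
c5 -> ../../devices/pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0/fp@0,0:fc'

# The parent path used in the per-port fp.conf entry.
parent='/pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0'

# Find the controller whose device path sits directly under that parent.
port=$(printf '%s\n' "$cfg_listing" | grep "devices${parent}/fp@" | awk '{print $1}')
echo "$port"
```

Here the match is c5, confirming which controller will bypass MPxIO.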

Similar changes were made for the other 3 service domains.

Now, for the guest vdisks that will not use MPxIO, the back-end devices used were just raw /dev/dsk paths; no multipathing software is involved. You will see a mix of these below: the entries with long WWN-based names (c0t60080E...d0) use MPxIO, and the c3t.../c5t... entries do not:

VDS
    NAME             LDOM             VOLUME         OPTIONS          MPGROUP        DEVICE
    primary-vds0     primary          aa-guest2-vol0                  aa-guest2      /dev/rdsk/c0t60080E5000183F120000107754E60374d0s2
                                      quorum1                                        /dev/dsk/c5t20140080E5184632d12s2
                                      quorum2                                        /dev/dsk/c5t20140080E5184632d13s2
                                      quorum3                                        /dev/dsk/c5t20140080E5184632d14s2
                                      clusterdata1                                   /dev/dsk/c5t20140080E5184632d8s2
                                      clusterdata2                                   /dev/dsk/c5t20140080E5184632d7s2
                                      clusterdata3                                   /dev/dsk/c5t20140080E5184632d10s2
                                      aa-guest3-vol0                  aa-guest3      /dev/dsk/c0t60080E5000183F120000138B5522A1C4d0s2

VDS
    NAME             LDOM             VOLUME         OPTIONS          MPGROUP        DEVICE
    alternate-vds0   example-a    aa-guest2-vol0                  aa-guest2      /dev/rdsk/c0t60080E5000183F120000107754E60374d0s2
                                      clusterdata3                                   /dev/dsk/c3t20140080E5184632d10s2
                                      clusterdata2                                   /dev/dsk/c3t20140080E5184632d7s2
                                      clusterdata1                                   /dev/dsk/c3t20140080E5184632d8s2
                                      quorum3                                        /dev/dsk/c3t20140080E5184632d14s2
                                      quorum2                                        /dev/dsk/c3t20140080E5184632d13s2
                                      quorum1                                        /dev/dsk/c3t20140080E5184632d12s2
                                      aa-guest3-vol0                  aa-guest3      /dev/rdsk/c0t60080E5000183F120000138B5522A1C4d0s2
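For reference, vdsdev entries like the quorum devices above are created with `ldm add-vdsdev`. This sketch just prints the commands (device and volume names taken from the primary's listing) so they can be reviewed before being run on a real control domain:

```shell
# Print (don't run) the ldm commands that would create the non-MPxIO
# quorum vdsdevs from the primary's listing: quorum1..3 on d12..d14.
for lun in 12 13 14; do
  n=$((lun - 11))
  cmd="ldm add-vdsdev /dev/dsk/c5t20140080E5184632d${lun}s2 quorum${n}@primary-vds0"
  echo "$cmd"
done
```

The matching `ldm add-vdisk` commands in each guest then attach these volumes as virtual disks.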

Here you can see in Ops Center what the Alternate Domain's virtual disk services look like:

From the guest LDOM's perspective, we see 12 data disks (c1d0 is the boot disk), which are really two paths to six LUNs: one path from the primary domain and one from the alternate:

AVAILABLE DISK SELECTIONS:
       0. c1d0 <SUN-SUN_6180-0784-100.00GB>
          /virtual-devices@100/channel-devices@200/disk@0
       1. c1d1 <SUN-SUN_6180-0784-500.00MB>
          /virtual-devices@100/channel-devices@200/disk@1
       2. c1d2 <SUN-SUN_6180-0784-500.00MB>
          /virtual-devices@100/channel-devices@200/disk@2
       3. c1d3 <SUN-SUN_6180-0784-500.00MB>
          /virtual-devices@100/channel-devices@200/disk@3
       4. c1d4 <SUN-SUN_6180-0784-500.00MB>
          /virtual-devices@100/channel-devices@200/disk@4
       5. c1d5 <SUN-SUN_6180-0784-500.00MB>
          /virtual-devices@100/channel-devices@200/disk@5
       6. c1d6 <SUN-SUN_6180-0784 cyl 51198 alt 2 hd 64 sec 64>
          /virtual-devices@100/channel-devices@200/disk@6
       7. c1d7 <SUN-SUN_6180-0784 cyl 51198 alt 2 hd 64 sec 64>
          /virtual-devices@100/channel-devices@200/disk@7
       8. c1d8 <SUN-SUN_6180-0784 cyl 25598 alt 2 hd 64 sec 64>
          /virtual-devices@100/channel-devices@200/disk@8
       9. c1d9 <SUN-SUN_6180-0784 cyl 25598 alt 2 hd 64 sec 64>
          /virtual-devices@100/channel-devices@200/disk@9
      10. c1d10 <SUN-SUN_6180-0784 cyl 25598 alt 2 hd 64 sec 64>
          /virtual-devices@100/channel-devices@200/disk@a
      11. c1d11 <SUN-SUN_6180-0784 cyl 25598 alt 2 hd 64 sec 64>
          /virtual-devices@100/channel-devices@200/disk@b
      12. c1d12 <SUN-SUN_6180-0784-500.00MB>
          /virtual-devices@100/channel-devices@200/disk@c

Again from Ops Center, you can click on the Storage tab of the guest, and see that the MPxIO-enabled LUN is known to be "Shared" by the hosts in Ops Center, while the other LUNs are not:

At this point, since VCS was going to be installed on the LDOM OS and a cluster built, the Veritas stack, including VxVM and VxDMP, was enabled on the guest LDOMs to correlate the two paths from the primary and alternate domains into a single DMP device.

For example:

root@aa-guest1:~# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
sun6180-0_0  auto:ZFS        -            -            ZFS
sun6180-0_1  auto:cdsdisk    sun6180-0_1  data_dg      online shared
sun6180-0_2  auto:cdsdisk    -            -            online
sun6180-0_3  auto:cdsdisk    -            -            online
sun6180-0_4  auto:cdsdisk    -            -            online
sun6180-0_5  auto:cdsdisk    sun6180-0_5  data_dg      online shared
sun6180-0_6  auto:cdsdisk    sun6180-0_6  data_dg      online shared

root@aa-guest1:~# vxdisk list sun6180-0_6
Device:    sun6180-0_6
devicetag: sun6180-0_6
type:      auto
clusterid: aa-guest
disk:      name=sun6180-0_6 id=1427384525.22.aa-guest1
group:     name=data_dg id=1427489834.14.aa-guest1
info:      format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags:     online ready private autoconfig shared autoimport imported
pubpaths:  block=/dev/vx/dmp/sun6180-0_6s2 char=/dev/vx/rdmp/sun6180-0_6s2
guid:      {a72e068c-d3ce-11e4-b9a0-00144ffe28bc}
udid:      SUN%5FSUN%5F6180%5F60080E5000184108000000004C2CF217%5F60080E5000184632000056B354E6025F
site:      -
version:   3.1
iosize:    min=512 (bytes) max=2048 (blocks)
public:    slice=2 offset=65792 len=104783616 disk_offset=0
private:   slice=2 offset=256 len=65536 disk_offset=0
update:    time=1430930621 seqno=0.15
ssb:       actual_seqno=0.0
headers:   0 240
configs:   count=1 len=48144
logs:      count=1 len=7296
Defined regions:
 config   priv 000048-000239[000192]: copy=01 offset=000000 enabled
 config   priv 000256-048207[047952]: copy=01 offset=000192 enabled
 log      priv 048208-055503[007296]: copy=01 offset=000000 enabled
 lockrgn  priv 055504-055647[000144]: part=00 offset=000000
Multipathing information:
numpaths:   2
c1d11s2         state=enabled   type=secondary
c1d9s2          state=enabled   type=secondary

connectivity: aa-guest1 aa-guest2

root@aa-guest1:~# vxdmpadm getsubpaths dmpnodename=sun6180-0_6
NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME          ENCLR-TYPE   ENCLR-NAME    ATTRS
========================================================================================
c1d11s2      ENABLED(A)  SECONDARY    c1                 SUN6180-     sun6180-0        -
c1d9s2       ENABLED(A)  SECONDARY    c1                 SUN6180-     sun6180-0        -

In this way, the two guests that were going to be clustered together are now ready for VCS installation and configuration.

The second use case differs a little in that both MPxIO and Veritas DMP are used in the primary and alternate domains, and DMP is still used in the guest as well. The advantage is more redundancy and I/O throughput at the service-domain level, because multipathed devices are used for the guest virtual disk services instead of just raw /dev/dsk/c#t#d# paths.

Now the disk services look something like this, where the /dev/vx/dmp vdsdevs are DMP-based and the /dev/rdsk/c0t... ones are MPxIO-based:

VDS
    NAME             LDOM             VOLUME         OPTIONS          MPGROUP        DEVICE
    primary-vds0     primary          aa-guest2-vol0                  aa-guest2      /dev/rdsk/c0t60080E5000183F120000107754E60374d0s2
                                      aa-guest3-vol0                  aa-guest3      /dev/dsk/c0t60080E5000183F120000138B5522A1C4d0s2
                                      quorum1                                        /dev/vx/dmp/sun6180-0_6s2
                                      quorum2                                        /dev/vx/dmp/sun6180-0_7s2
                                      quorum3                                        /dev/vx/dmp/sun6180-0_8s2
                                      clusterdata1                                   /dev/vx/dmp/sun6180-0_12s2
                                      clusterdata2                                   /dev/vx/dmp/sun6180-0_5s2
                                      clusterdata3                                   /dev/vx/dmp/sun6180-0_14s2


VDS
    NAME             LDOM             VOLUME         OPTIONS          MPGROUP        DEVICE
    alternate-vds0   example-a    aa-guest2-vol0                  aa-guest2      /dev/rdsk/c0t60080E5000183F120000107754E60374d0s2
                                      aa-guest3-vol0                  aa-guest3      /dev/rdsk/c0t60080E5000183F120000138B5522A1C4d0s2
                                      quorum1                                        /dev/vx/dmp/sun6180-0_6s2
                                      quorum2                                        /dev/vx/dmp/sun6180-0_7s2
                                      quorum3                                        /dev/vx/dmp/sun6180-0_8s2
                                      clusterdata1                                   /dev/vx/dmp/sun6180-0_12s2
                                      clusterdata2                                   /dev/vx/dmp/sun6180-0_5s2
                                      clusterdata3                                   /dev/vx/dmp/sun6180-0_14s2
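The DMP-backed entries are built the same way, pointing `ldm add-vdsdev` at the VxDMP metanode instead of a single /dev/dsk path. Again a print-only sketch, with names taken from the listing above:

```shell
# Print (don't run) the ldm commands for the DMP-backed quorum vdsdevs:
# quorum1..3 on sun6180-0_6s2..sun6180-0_8s2.
for i in 6 7 8; do
  n=$((i - 5))
  cmd="ldm add-vdsdev /dev/vx/dmp/sun6180-0_${i}s2 quorum${n}@primary-vds0"
  echo "$cmd"
done
```

Because the metanode hides the individual paths, a path failure in the service domain is handled by DMP there, transparently to the guest.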

Again, the advantage here is that two paths to the same LUN are being presented from each service domain, so there is additional redundancy and throughput available. You can see the two paths:

root@example:~# vxdisk list sun6180-0_5
Device:    sun6180-0_5
devicetag: sun6180-0_5
type:      auto
clusterid: aa-guest
disk:      name= id=1427384268.11.aa-guest1
group:     name=data_dg id=1427489834.14.aa-guest1
info:      format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags:     online ready private autoconfig shared autoimport
pubpaths:  block=/dev/vx/dmp/sun6180-0_5s2 char=/dev/vx/rdmp/sun6180-0_5s2
guid:      {0e62396e-d3ce-11e4-b9a0-00144ffe28bc}
udid:      SUN%5FSUN%5F6180%5F60080E5000184108000000004C2CF217%5F60080E5000183F120000107F54E603CA
site:      -
version:   3.1
iosize:    min=512 (bytes) max=2048 (blocks)
public:    slice=2 offset=65792 len=104783616 disk_offset=0
private:   slice=2 offset=256 len=65536 disk_offset=0
update:    time=1430930621 seqno=0.15
ssb:       actual_seqno=0.0
headers:   0 240
configs:   count=1 len=48144
logs:      count=1 len=7296
Defined regions:
 config   priv 000048-000239[000192]: copy=01 offset=000000 enabled
 config   priv 000256-048207[047952]: copy=01 offset=000192 enabled
 log      priv 048208-055503[007296]: copy=01 offset=000000 enabled
 lockrgn  priv 055504-055647[000144]: part=00 offset=000000
Multipathing information:
numpaths:   2
c5t20140080E5184632d7s2 state=enabled   type=primary
c5t20250080E5184632d7s2 state=enabled   type=secondary

The Ops Center view of the virtual disk services is much the same:

Now the cluster can be set up just as it was before. To the guest, the virtual disks have not changed - just the back-end presentation of the LUNs has changed. This was transparent to the guest.


Comments (2)
  • Arkadiy Chapkis Monday, June 22, 2015

    Thanks for the article, it is interesting. However, the setup is too complex.

    > Sometimes users do not have the option to use MPxIO, or may choose to not use MPxIO for other reasons, such as use cases where SCSI3 Persistent Reservations (SCSI3 PR) is required.

    I am not sure why you think MPxIO would not support SCSI3 PGR. We have a working configuration of DMP over MPxIO (DMP sees only one path from devices multipathed with MPxIO) with I/O fencing and SCSI3 PGR. This works OK for both coordinator and data LUNs (as long as the host mode options on the SAN are set properly):

    # vxdisk list hp_p95000_0|egrep "Device:|group:|flags:|path|state"

    Device: hp_p95000_0

    group: name=fendg id=1424281293.20.sndcsqss01-ldm2

    flags: online ready private autoconfig coordinator

    pubpaths: block=/dev/vx/dmp/hp_p95000_0s2 char=/dev/vx/rdmp/hp_p95000_0s2

    Multipathing information:

    numpaths: 1

    c0t60060E80167DE20000017DE200000508d0s2 state=enabled

    # vxfenadm -s all -f hp_p95000_0.txt

    Device Name: /dev/rdsk/c0t60060E80167DE20000017DE200000508d0s2

    Total Number Of Keys: 2

    key[0]:

    [Numeric Format]: 86,70,48,48,68,69,48,49

    [Character Format]: VF00DE01

    * [Node Format]: Cluster ID: 222 Node ID: 1 Node Name: xxx1

    key[1]:

    [Numeric Format]: 86,70,48,48,68,69,48,48

    [Character Format]: VF00DE00

    * [Node Format]: Cluster ID: 222 Node ID: 0 Node Name: xxx2

    # mpathadm show LU /dev/rdsk/c0t60060E80167DE20000017DE200000508d0s2

    Logical Unit: /dev/rdsk/c0t60060E80167DE20000017DE200000508d0s2

    mpath-support: libmpscsi_vhci.so

    Vendor: HP

    Product: OPEN-V -SUN

    Revision: 7006

    Name Type: unknown type

    Name: 60060e80167de20000017de200000508

    Asymmetric: no

    Current Load Balance: round-robin

    Logical Unit Group ID: NA

    Auto Failback: on

    Auto Probing: NA

    Paths:

    Initiator Port Name: 2100000e1e19b120

    Target Port Name: 50060e80167de219

    Override Path: NA

    Path State: OK

    Disabled: no

    Initiator Port Name: 2100000e1e19a9f0

    Target Port Name: 50060e80167de209

    Override Path: NA

    Path State: OK

    Disabled: no

    Target Ports:

    Name: 50060e80167de219

    Relative ID: 0

    Name: 50060e80167de209

    Relative ID: 0

    It is easier to manage the devices with DMP (IMHO), but we had to build a number of local zones, and for zonepath only ZFS is supported. The choice was between building zpools on DMP with native support (flaky) or MPxIO, so we went with the latter. It is in our plans to introduce Ops Center later on.


  • Len Monday, October 19, 2015

    >as long as the host mode options on SAN are set properly.

    Can you please advise the host mode settings? We have Hitachi storage.

    We have presented the 3 x fence LUNs to the guest LDOM (Solaris 10) where SFHA 6.1 is installed. The disks are under MPxIO on the control domain (T5-2/Solaris 11), so it shows only one path on the guest LDOM.

    The problem is, the vxconfigd daemon goes unresponsive when we try to deport the data disk group (8 x 1 TB) on the guest LDOM. This issue doesn't happen if we remove the fence disks.

    I don't see any issue with SCSI reservation on the fence disks.

    Total Number Of Keys: 2

    key[0]:

    [Numeric Format]: 86,70,48,55,68,67,48,49

    [Character Format]: VF07DC01

    * [Node Format]: Cluster ID: 2012 Node ID: 1 Node Name: XXXXXX2

    key[1]:

    [Numeric Format]: 86,70,48,55,68,67,48,48

    [Character Format]: VF07DC00

    * [Node Format]: Cluster ID: 2012 Node ID: 0 Node Name: XXXXXX1

