LDoms VIO Failover - Part Deux

The article on VIO failover provided an overview of how failover support can be added to the virtual IO infrastructure in an LDoms environment. This support allows a client domain to fail over its virtual disk and network from the primary service domain to an alternate (backup) service domain. In this blog, I outline the steps required to configure the primary and alternate service domains, and the client domain, for failover. See here for DETAILED INSTRUCTIONS.
Comments:

Very nice. BTW, the link to the instructions has an unnecessary "/entry" - the correct URL is http://blogs.sun.com/narayan/resource/docs/vio_failover_steps.html

Posted by Liam on July 19, 2007 at 08:33 AM EDT #

Thanks Liam. I have fixed the link in the blog.

Posted by Narayan Venkat on July 19, 2007 at 08:33 AM EDT #

I had managed to build a similar setup to what you've described. However, I found that if the alternate domain did not import any disk services from the primary domain, it would hang or simply not respond to reconfiguration commands. I subsequently found this issue described in the LDoms 1.0 release notes as bug 6544197. The solution I used was to export a disk from the primary domain (I had four at my disposal) to the alternate domain for it to use as its own boot disk. I then farmed out disks from the 3310 storage array attached to the alternate domain to each of the guests (after messing around with the geometry) for use as their own boot devices, and used the remaining disks from the primary domain to build an SVM mirror with the exported disks from the 3310. And thanks for pointing out the timeout parameter!

So, after proving that the concept works, what I wanted to do next was build an Oracle HA cluster using Solaris Cluster by attaching one disk (to be used as the Oracle datastore) from the 3310 to two guest domains. However, the current version of LDoms does not allow this (it reported something like "resource already bound" when I tried to add the same disk to the second guest). Any idea how I can work around this? I've been stuck on this for over a week and nothing comes to mind (other than purchasing some fibre disk and a dual-controller HBA). I'm sure there's got to be some clever and inexpensive solution to this.

Posted by Jeroen on July 19, 2007 at 11:20 PM EDT #

In setting up the virtual disks, how have you achieved filesystem sharing across primary and alternate service domains? Is this an NFS mount from an external system?

Posted by Paul on July 30, 2007 at 10:48 PM EDT #

Paul,

Each service domain exports a virtual disk to the ldom. The ldom 'ldg1' uses Solaris Volume Manager (SVM) to create a mirrored metadevice, where each vdisk is a sub-mirror. So when the 'primary' service domain reboots, 'ldg1' detects a disk error and fails over to the sub-mirror exported by the 'alternate' service domain.
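
For the curious, the SVM setup inside 'ldg1' is roughly the standard root-mirroring procedure. A minimal sketch, assuming the two vdisks show up as c0d0 and c0d1 in the guest (your device names may differ):

# metadb -a -f -c 3 c0d0s7 c0d1s7
# metainit -f d11 1 1 c0d0s0
# metainit d12 1 1 c0d1s0
# metainit d10 -m d11
# metaroot d10
# init 6
# metattach d10 d12

The -f on the first metainit is needed because c0d0s0 is the mounted root, metaroot updates /etc/vfstab and /etc/system, and the metattach (run after the reboot) syncs the sub-mirror backed by the 'alternate' vdisk.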

Posted by Narayan Venkat on August 08, 2007 at 12:31 AM EDT #

Jeroen,

You need to configure both the primary and alternate domains with direct access to IO devices. They cannot share IO devices between them. Exporting a disk or network device from the primary to the alternate will not allow the alternate domain to continue running when the primary domain goes down. In the configuration described here, both service domains have access to physical IO devices via the exclusive PCI-E buses they own. This allows them to start and stop independently of each other. You will need to boot your alternate service domain from a boot disk backed by a physical disk, not from a virtual disk exported by the primary domain.
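
As a rough sketch (run from the control domain; the bus name pci@7c0, the config name split-pci, and the domain name 'alternate' are illustrative), splitting off the second PCI-E bus looks like this:

# ldm remove-io pci@7c0 primary
# ldm add-config split-pci
# shutdown -i6 -g0 -y
# ldm add-io pci@7c0 alternate
# ldm bind alternate
# ldm start alternate

The primary has to be rebooted for the remove-io to take effect before the bus can be added to the alternate domain.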

Posted by Narayan Venkat on August 08, 2007 at 01:01 AM EDT #

Hi,
the idea of VIO failover is really interesting. I think when ZFS boot on SPARC is available, the software RAID-1 will be easier to handle. It would reduce maintenance work after a reboot of a service domain in your example. Think about rebooting both service domains one after another - that wouldn't be a good idea with SDS if you forget to synchronize before the second reboot...
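
A simple guard against that (assuming an SVM mirror such as d10 in the guest) is to check that no sub-mirror is still resyncing before taking down the second service domain, e.g.:

# metastat -c d10
# metastat d10 | grep -i resync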

I also covered your article on my blog, in German, for the German Solaris guys.

Greetings from Bavaria, Otmanix

Posted by Otmanix on August 16, 2007 at 10:12 AM EDT #

[Trackback] The "Logical Domain" (LDoms) virtualization technology, available with a hypervisor on the new UltraSPARC T1/T2 based Sun servers, is relatively new (LDoms 1.0) and still has optimization potential in the area of availability...

Posted by Otmanix Blog on August 16, 2007 at 10:28 AM EDT #

Hi, I am also trying to build LDoms VIO failover, but I am stuck on the virtual switch (vsw0) configuration on the alternate domain.

alternate-vsw0 is configured, and plumbing etc. works. The interface is in the running state, but somehow vsw0 does not function. As soon as I change it to e1000g0, it works fine.

Has anyone faced the same problem?

Posted by Pronto on August 31, 2007 at 09:22 AM EDT #

What I don't get is how the "alternate" service domain provides disk services to LDoms if it has no physical disk of its own. Even though you split the PCI buses and assigned the second bus to the "alternate" service domain, how would that service domain boot? As far as I know, the second PCI bus doesn't own any of the local hard drives, so the only way you could boot it is to have an iSCSI or Fibre Channel HBA in the one PCI-E slot that's on the second bus.

Unless there is a way to somehow assign a local drive to a PCI bus (I doubt that this is possible).

Posted by Vasur on September 07, 2007 at 05:39 AM EDT #

Yes, you are right. That's what I want to test next week.
For a T2000 you need an HBA in PCI-E slot 0, which is on the same bus as the alternate domain, and then use the LUN as a vdisk for the ldom.

Posted by Pronto on September 13, 2007 at 12:12 AM EDT #

Hi Pronto / Vasur

You both are absolutely correct. You cannot use the onboard disks for both domains. You will need to use an HBA in the PCI-E slot to achieve this. I have listed below the disk / PCI-E bus information for both the primary and alternate domains.

Primary domain
==============
# df -k /ldomspool/ldg1/bootdisk.img
Filesystem            kbytes      used     avail  capacity  Mounted on
/dev/dsk/c0t3d0s4   48896488  25226644  23180880       53%  /ldomspool

# ls -l /dev/dsk/c0t3d0s4
lrwxrwxrwx 1 root root 49 Jun 2 06:15 /dev/dsk/c0t3d0s4 -> ../../devices/pci@780/pci@0/pci@9/scsi@0/sd@3,0:e

Alternate domain
================
# df -k /ldomspool/ldg1/bootdisk.img
Filesystem            kbytes      used     avail  capacity  Mounted on
/dev/dsk/c2t0d0s4   48896488   8441218  39966306       18%  /ldomspool

# ls -l /dev/dsk/c2t0d0s4
lrwxrwxrwx 1 root root 65 Jun 2 06:15 /dev/dsk/c2t0d0s4 -> ../../devices/pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2/sd@0,0:e

Both service domains export (as virtual disks) disk image files that live on the physical disks on their respective buses to the LDom ldg1. Solaris Volume Manager is then used in the domain ldg1 to create a mirrored metadevice using the two vdisks.
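
For reference, the corresponding export commands, all run from the control (primary) domain, look roughly like this (the volume, vdisk, and vds service names are illustrative, and the timeout option may not be available in early LDoms 1.0 releases):

# ldm add-vdsdev /ldomspool/ldg1/bootdisk.img ldg1-boot@primary-vds0
# ldm add-vdsdev /ldomspool/ldg1/bootdisk.img ldg1-boot@alternate-vds0
# ldm add-vdisk timeout=30 vdisk1 ldg1-boot@primary-vds0 ldg1
# ldm add-vdisk timeout=30 vdisk2 ldg1-boot@alternate-vds0 ldg1

The first backend is the image file on the primary's disk (served by primary-vds0) and the second is the image file of the same name on the alternate's disk (served by alternate-vds0); ldg1 then mirrors vdisk1 and vdisk2 with SVM as described above.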

Posted by Narayan on September 15, 2007 at 04:37 AM EDT #

Hi Narayan,

Thanks for your post. I supplied the HBAs for my test.
One question came to mind before I create my failover guest domain.

My alternate service domain's OS lies on a disk exported from the primary domain. Should I change this and move the alternate domain's boot disk to a SAN LUN which is on PCI-E slot 0?

BR, Pronto

Posted by Pronto on September 15, 2007 at 04:55 AM EDT #

Hi Pronto

Since a service domain is required to independently service other domains, it should have direct access to IO. Hence the alternate service domain should use the boot disk on the SAN LUN which is on PCI-E slot 0. Otherwise, when the primary domain goes down, the 'alternate' domain will hang and will not be able to act as the backup service domain for ldg1.
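
For example, once the alternate domain owns the bus with the HBA, its OBP boot device can be set from the control domain (the device path below is a placeholder; use the actual path of the SAN LUN as seen by the alternate domain):

# ldm set-var boot-device=<full /pci@7c0/... device path of the SAN LUN> alternate

The same variable can also be set with setenv boot-device at the alternate domain's ok prompt.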

Thanks
-Narayan

Posted by Narayan Venkat on September 17, 2007 at 05:34 AM EDT #

Hi Narayan,

Finally I managed to build my alternate domain (boot disk on a SAN LUN which is on PCI-E slot 0) and my failover guest domain.

Both service domains export a virtual disk and a virtual network device to my guest domain (as you described in http://blogs.sun.com/narayan/resource/docs/vio_failover_steps.html). The only thing which didn't work for me was passing the "timeout=1" parameter to the ldm add-vdisk command.

Anyway, if I reboot a service domain my guest domain stays up and running, but as soon as I take the alternate domain down to the ok prompt (init 0), my guest domain freezes and waits until I boot the alternate domain again.

NIC failover is no problem, but my vdisk failover seems to be lazy. There aren't many logs to show what is happening; the only messages I see are:

NOTICE: ds_cap_send: invalid handle 0x00000000000
NOTICE: md ds_cap_send resp failed (22)

It seems to match a bug (6528758), but I don't know whether it has anything to do with my problem.

If you have any idea, please let me know.

Another question: is it possible to also configure the alternate domain as a terminal server (vntsd) and open virtual consoles to the guest domain through the alternate domain while the primary domain is down?

Greetings from Germany,

Pronto

Posted by Pronto on September 19, 2007 at 01:23 AM EDT #

Hi Narayan,
We have a T2000 server and configured it following the "DETAILED INSTRUCTIONS".
I have a question about the disks part.
We have an HDS LUN which we connected to the primary and alternate domains via two separate HBAs (bus_a, bus_b). We exported that LUN to ldg1 as vdisk1 and vdisk2. We can boot ldg1 from the primary's vdisk1. Is there a way to boot ldg1 from vdisk2, so that we have failover with that LUN?
With Kind Regards
Frits Witjas

Posted by F.Witjas on September 12, 2008 at 06:50 AM EDT #

I am new to this concept, but it seems very interesting.

I have gone through the procedures, but there is one thing I cannot understand. Do we need to install the LDoms 1.1 software on both the primary and alternate service domains to manage the guest domain? If not, how do you manage the guest domain when the primary domain goes down?

Thanks in Advance.
Regards,
G.Mohanraj

Posted by G.Mohanraj on June 20, 2009 at 02:15 AM EDT #

We have created a primary IO domain using the internal disks and then, as advised above, have allocated a SAN LUN for the 2nd IO domain to use as its boot device. Using LDoms 1.2 we created the 2nd IO domain and installed Solaris; however, whenever it tries to boot it panics with:

Fatal error has occured in: PCIe root complex

Any ideas?

Posted by Simon Borthwick on August 01, 2009 at 10:03 PM EDT #
