Configure failover LDom (Oracle VM Server for SPARC) on Solaris Cluster 4.1 by using 'live' migration

This blog shows an example of configuring the 'Oracle Solaris Cluster Data Service for Oracle VM Server for SPARC' on Solaris Cluster 4.1, along with some hints around such a configuration. For this setup, Solaris Cluster 4.1 SRU3 or higher and Oracle VM Server 3.0 or higher are required.
Essentially, this is a summary of:
Oracle Solaris Cluster Data Service for Oracle VM Server for SPARC Guide
Oracle VM Server for SPARC 3.0 Administration Guide
Please check these guides for further restrictions and requirements.

This procedure is specifically for 'live' migration of guest LDoms, which means no shutdown of the OS in the LDom during the failover. In earlier OVM releases this was called 'warm' migration; however, the word 'live' is used in this example. A 'cold' migration means that the OS in the guest LDom is stopped before migration.

Let's start:
The necessary services must be identical on all the potential control domains (primary domains) which run as Solaris Cluster 4.1 nodes. It is expected that Oracle VM Server is already installed.

1) Prepare all primary domains which should manage the failover LDom with the necessary services.
all_primaries# ldm add-vconscon port-range=5000-5100 primary-vcc0 primary
all_primaries# svcadm enable svc:/ldoms/vntsd:default
all_primaries# ldm add-vswitch net-dev=net0 public-vsw1 primary
all_primaries# ldm add-vdiskserver primary-vds0 primary

To verify:
all_primaries# ldm list-bindings primary

2) Set failure-policy on all primary domains:
all_primaries# ldm set-domain failure-policy=reset primary
To verify:
all_primaries# ldm list -o domain primary

3) Create failover guest domain (fgd0) on one primary domain.
Simple example:
primaryA# ldm add-domain fgd0
primaryA# ldm set-vcpu 16 fgd0
primaryA# ldm set-mem 8G fgd0

4) Add public network to failover guest domain:
primaryA# ldm add-vnet public-net0 public-vsw1 fgd0
To verify:
primaryA# ldm list-bindings fgd0

For more details on setting up guest LDoms, refer to the Oracle VM Server for SPARC 3.0 Administration Guide.

5) Set necessary values on failover guest domain fgd0:
primaryA# ldm set-domain master=primary fgd0
primaryA# ldm set-var auto-boot?=false fgd0

To verify run:
primaryA# ldm list -o domain fgd0
auto-boot?=false is a “must have” to prevent data corruption. More details are available in DocID 1585422.1 Solaris Cluster and HA VM Server Agent SUNW.ldom Data Corruption may Occur in a failover Guest Domain when "auto-boot?=true" is set

6) Select boot device for failover guest domain fgd0.
Possible options for the root file system of a domain with 'live' migration are: Solaris Cluster global file system (UFS/SVM), NFS, iSCSI, and SAN LUNs, because all of them are accessible at the same time from both nodes. The recommendation is to use a full raw disk because 'live' migration is expected. The full raw disk can be provided via SAN or iSCSI to all primary domains.
Remember that ZFS as the root file system can ONLY be used for 'cold' migration, because 'live' migration requires both nodes to access the root file system at the same time, which is not possible with ZFS.
Using a Solaris Cluster global file system is an alternative, but the performance is not as good as root on a raw disk.
Details available in DocID 1366967.1 Solaris Cluster Root Filesystem Configurations for a Guest LDom Controlled by a SUNW.ldom Resource

So, root on a raw disk is selected.
Add boot device to fgd0:
all_primaries# ldm add-vdsdev /dev/did/rdsk/d7s2 boot_fgd0@primary-vds0
primaryA# ldm add-vdisk root_fgd0 boot_fgd0@primary-vds0 fgd0

6a) Optional: Configure the MAC addresses of the LDom. The LDom Manager assigns MAC addresses automatically, but the following issues can occur:
* Duplicate MAC addresses if other guest LDoms are down when creating a new LDom.
* The MAC address can change after failover of a LDom.
Assigning your own MAC addresses is possible. This example uses the suggested range 00:14:4F:FC:00:00 – 00:14:4F:FF:FF:FF as described in
Assigning MAC Addresses Automatically or Manually of the Oracle VM Server for SPARC 3.0 Administration Guide.
Identify the currently auto-assigned MAC addresses:
primaryA# ldm list -l fgd0
To see the HOSTID, which is derived from the MAC address, an 'ldm bind fgd0' is necessary. Unbind fgd0 afterwards with 'ldm unbind fgd0'.
MAC: 00:14:4f:fb:50:dc → change to 00:14:4f:fc:50:dc
HOSTID: 0x84fb50dc → change to 0x84fb50dd
public-net: 00:14:4f:fa:01:49 → change to 00:14:4f:fc:01:49
primaryA# ldm set-domain mac-addr=00:14:4f:fc:50:dc fgd0
primaryA# ldm set-domain hostid=0x84fb50dd fgd0
primaryA# ldm set-vnet mac-addr=00:14:4f:fc:01:49 public-net0 fgd0
primaryA# ldm list-constraints fgd0 (this shows assigned MAC now)
For more details, or if it is necessary to change the MAC addresses on an already configured failover guest LDom, refer to DocID 1559415.1 Solaris Cluster HA-LDom Agent do not Preserve hostid and MAC Address Upon Failover
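Before setting the addresses, a quick shell sanity check can confirm that a chosen MAC falls inside the suggested manual range. The helper below is purely illustrative (the function name is mine, not part of any Oracle tool):

```shell
#!/bin/sh
# Check whether a MAC address lies in the manual-assignment range
# 00:14:4F:FC:00:00 - 00:14:4F:FF:FF:FF suggested by the OVM admin guide.
# Illustrative helper, not an Oracle command.
in_manual_mac_range() {
    # strip colons and uppercase the hex digits
    hex=$(printf '%s' "$1" | tr -d ':' | tr 'abcdef' 'ABCDEF')
    case "$hex" in
        00144FFC*|00144FFD*|00144FFE*|00144FFF*) return 0 ;;
        *) return 1 ;;
    esac
}

in_manual_mac_range 00:14:4f:fc:50:dc && echo "in range"
in_manual_mac_range 00:14:4f:fb:50:dc || echo "out of range (auto-assigned)"
```

The check only needs to match the prefix: the first three octets 00:14:4F are fixed, and the fourth octet FC–FF is what distinguishes the manual range from the automatic one.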

7) Bind and start the fgd0
primaryA# ldm bind fgd0
primaryA# ldm start fgd0

8) Login to LDom using console:
primaryA# telnet localhost 5000

9) Install Solaris 10 or Solaris 11 on the LDom by using an install server.
To identify the MAC address of the LDom, run in the console of fgd0:
{0} ok devalias net
{0} ok cd /virtual-devices@100/channel-devices@200/network@0
{0} ok .properties
local-mac-address 00 14 4f fc 01 49
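The space-separated bytes printed by .properties can be turned into the usual colon notation with a trivial one-liner (illustrative only):

```shell
#!/bin/sh
# Convert the OBP .properties byte listing into colon-separated MAC notation.
mac=$(echo "00 14 4f fc 01 49" | tr ' ' ':')
echo "$mac"    # -> 00:14:4f:fc:01:49
```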

For different installation methods, please refer to Installing Oracle Solaris OS on a Guest Domain of the Oracle VM Server for SPARC 3.0 Administration Guide.

10) Install HA-LDom (HA for Oracle VM Server Package) on all primary domain nodes if not already done
all_primaries# pkg info ha-cluster/data-service/ha-ldom
all_primaries# pkg install ha-cluster/data-service/ha-ldom

11) Check that 'cluster' is the first entry in /etc/nsswitch.conf:
all_primaries# svccfg -s name-service/switch listprop config/host
config/host astring "files dns"
all_primaries# svccfg -s name-service/switch listprop config/ipnodes
config/ipnodes astring "files dns"
all_primaries# svccfg -s name-service/switch listprop config/netmask
config/netmask astring files
If not, add it:
all_primaries# svccfg -s name-service/switch setprop config/host = astring: '("cluster files dns")'
all_primaries# svccfg -s name-service/switch setprop config/ipnodes = astring: '("cluster files dns")'
all_primaries# svccfg -s name-service/switch setprop config/netmask = astring: '("cluster files")'

More Details in DocID 1554887.1 Solaris Cluster: HA LDom Migration Fails With "Failed to establish connection with ldmd(1m) on target"
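The essential point of this step is that "cluster" must be the first token of the config/host and config/ipnodes values. A tiny helper (name is mine, purely illustrative) shows the test on plain strings:

```shell
#!/bin/sh
# Illustrative check: is "cluster" the first token of an nsswitch value?
# (On the cluster nodes, the value comes from
#  'svccfg -s name-service/switch listprop config/host' etc.)
cluster_is_first() {
    set -- $1        # word-split the value into positional parameters
    [ "$1" = "cluster" ]
}

cluster_is_first "cluster files dns" && echo "OK"
cluster_is_first "files dns"         || echo "cluster missing - fix with svccfg"
```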

12) Create a resource group for failover LDom fgd0 on the primary domains:
primaryA# clrg create -n primaryA,primaryB fldom-rg

13) Register SUNW.HAStoragePlus if not already done:
primaryA# clrt register SUNW.HAStoragePlus

14) Create HAStoragePlus resource for boot device:
primaryA# clrs create -g fldom-rg -t SUNW.HAStoragePlus -p GlobalDevicePaths=/dev/global/dsk/d7s2 fgd0-has-rs
Using d7s2 (the full-disk slice 2) is a requirement!

15) Enable the LDom resource group on the current node:
primaryA# clrg online -M -n primaryA fldom-rg

16) Register SUNW.ldom
primaryA# clrt register SUNW.ldom

17) Set up the password file for non-interactive 'live' migration on all primary nodes:
all_primaries# vi /var/cluster/.pwfgd0
Add the root password to this file.
all_primaries# chmod 400 /var/cluster/.pwfgd0
* The first line of the file must contain the password
* The password must be plain text
* The password must not exceed 256 characters in length
A newline character at the end of the password, and all lines that follow the first line, are ignored.
These details are from Performing Non-Interactive Migrations of the Oracle VM Server for SPARC 3.0 Administration Guide.
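The rules above can also be followed non-interactively instead of using vi. The sketch below is illustrative (the password 'fu_bar' follows the example later in this post; the demo writes to a temporary location, while on the nodes the target file is /var/cluster/.pwfgd0):

```shell
#!/bin/sh
# Sketch: create a migration password file obeying the rules above.
# Illustrative helper; on the cluster nodes the target file is
# /var/cluster/.pwfgd0 as shown in step 17.
create_pwfile() {
    pw=$1
    file=$2
    # The password must not exceed 256 characters.
    if [ ${#pw} -gt 256 ]; then
        echo "password longer than 256 characters" >&2
        return 1
    fi
    umask 077
    printf '%s\n' "$pw" > "$file"   # first line = password, plain text
    chmod 400 "$file"
}

# Demo against a temporary location (use /var/cluster/.pwfgd0 on the nodes):
create_pwfile 'fu_bar' /tmp/.pwfgd0.demo && echo "created"
```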

Attention: If you are using SUNW.ldom:6 or higher, this kind of password setup will fail. The alternative in 17a) also does not work with SUNW.ldom:6 or higher. For details please refer to my blog
New resource type version for Solaris Cluster SUNW.ldom agent

17a) Alternative: Set up an encrypted password file for non-interactive 'live' migration on all primary nodes:
all_primaries# echo "encrypted" > /var/cluster/.pwfgd0
all_primaries# dd if=/dev/urandom of=/var/cluster/ldom_key bs=16 count=1
all_primaries# chmod 400 /var/cluster/ldom_key
all_primaries# echo fu_bar | /usr/sfw/bin/openssl enc -aes128 -e -pass file:/var/cluster/ldom_key -out /opt/SUNWscxvm/.fgd0_passwd
all_primaries# chmod 400 /opt/SUNWscxvm/.fgd0_passwd

The root password for the failover LDom is "fu_bar", which will be encrypted. All files must be secured using "chmod 400". Neither /var/cluster/ldom_key nor the /opt/SUNWscxvm/.{DOMAIN}_passwd file can be placed in a different location or given a different name.

Verify that the encrypted password can be decrypted:
all_primaries# /usr/sfw/bin/openssl enc -aes128 -d -pass file:/var/cluster/ldom_key -in /opt/SUNWscxvm/.fgd0_passwd
More Details in DocID 1668567.1 Solaris Cluster HA-LDom Fails Doing 'live' Migration with "normal failover will be performed" or "Password cannot be longer than 256 characters" due to Wrong Value in 'Password_file' resource property
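The encrypt/decrypt flow of step 17a can be tried out end to end with a small sketch. Paths here are temporary for illustration; on the cluster nodes the files must be /var/cluster/ldom_key and /opt/SUNWscxvm/.fgd0_passwd, and the openssl binary is /usr/sfw/bin/openssl on Solaris:

```shell
#!/bin/sh
# Round-trip test of the key-file based encryption used in step 17a.
# Temporary paths for illustration; the real file locations are fixed
# (/var/cluster/ldom_key and /opt/SUNWscxvm/.fgd0_passwd).
KEY=/tmp/ldom_key.demo
ENC=/tmp/fgd0_passwd.demo

# Generate a random 16-byte key file and protect it.
dd if=/dev/urandom of="$KEY" bs=16 count=1 2>/dev/null
chmod 400 "$KEY"

# Encrypt the root password with the key file.
echo fu_bar | openssl enc -aes128 -e -pass "file:$KEY" -out "$ENC"
chmod 400 "$ENC"

# Decrypt again; this must print the original password.
openssl enc -aes128 -d -pass "file:$KEY" -in "$ENC"
```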

18) Create SUNW.ldom resource
primaryA# clrs create -g fldom-rg -t SUNW.ldom -p Domain_name=fgd0 -p Password_file=/var/cluster/.pwfgd0 -p resource_dependencies=fgd0-has-rs fgd0-rs

Notice: The domain configuration is retrieved by Solaris Cluster via the “ldm list-constraints -x ldom” command and stored in the CCR. This info is used to create or destroy the domain on the node where the resource group is brought online or offline.

19) Check Migration_type property. It should be MIGRATE for 'live' migration:
primaryA# clrs show -v fgd0-rs | grep Migration_type
If it is not MIGRATE, set it:
primaryA# clrs set -p Migration_type=MIGRATE fgd0-rs

20) To stop/start the SUNW.ldom resource
primaryA# clrs disable fgd0-rs
primaryA# clrs enable fgd0-rs

21) Verify the setup by switching failover LDom to other node and back.
primaryA# clrg switch -n primaryB fldom-rg
primaryA# clrg switch -n primaryA fldom-rg
To monitor the migration process, run 'ldm list -o status fgd0' on the target primary domain.

22) Tune your timeout values depending on your system.
primaryA# clrs set -p STOP_TIMEOUT=1200 fgd0-rs
Details in DocID 1423937.1 Solaris Cluster: HA LDOM Migration Fails With "Migration of domain timed out, the domain state is now shut off"

23) Consider further tuning of timeout values as described in
SPARC: Tuning the HA for Oracle VM Server Fault Monitor of Oracle Solaris Cluster Data Service for Oracle VM Server for SPARC Guide
For less frequent probing, the following settings can be used:
primaryA# clrs set -p Thorough_probe_interval=180 -p Probe_timeout=90 fgd0-rs

Last but not least, running a Solaris Cluster of two or more nodes within a failover LDom is not supported!
BUT with SC4.1 SRU4 or higher you can run a single-node Solaris Cluster within failover LDom.
For details please refer to Application monitoring in Oracle VM for SPARC failover guest domain within Doc ID 1597319.1 Oracle Solaris Cluster Product Update Bulletin October 2013


Fer, we are going to have to implement Solaris Cluster on LDoms at YPFB

Posted by fcalla on April 28, 2014 at 03:00 PM CEST #

Please, can you write this in English without acronyms?

Posted by Juergen on April 28, 2014 at 03:40 PM CEST #

I tried doing this on two SPARC T5s but the guest domain won't start when added to the cluster. I get to create it, start it, and install the guest OS. I can even migrate it from one physical node to another, but when the time comes to put it under cluster control it won't start. That is at step 18 of this page.

My servers are at Oracle Solaris using Solaris Cluster 4.1. Is there anything in these newer Solaris/Solaris Cluster versions that may not be working right?

Posted by Jose Tamayo on July 28, 2014 at 03:49 PM CEST #

There is nothing specific for this setup. The latest qualified LDom (Oracle VM Server for SPARC) software is version 3.1. This info can be found in the Oracle Solaris Cluster 4 Compatibility Guide - Page 45.

Posted by Juergen on July 30, 2014 at 04:52 PM CEST #


Great article. However, between steps 10 and 12, should there be a step for setting up the cluster?
- scinstall
- Cluster Transport
- Quorum device
- Where is the ldom_config file created?

Can you help with this?

I need to get a better understanding.

Posted by guest on June 30, 2016 at 12:40 AM CEST #

This article expects that Solaris Cluster is already installed in the primary domains. Please refer to the
Oracle Solaris Cluster 4.3 Software Installation Guide or How to Install and Configure a Two-Node Cluster for details on installing Solaris Cluster.

Posted by Juergen on June 30, 2016 at 12:36 PM CEST #
