Wednesday Jun 08, 2011

Live Migration in Oracle VM Server for SPARC 2.1

You may have seen the press release announcing that Oracle VM Server for SPARC 2.1 (a.k.a. LDoms) has just been released (you can find the download links here). There is a considerable list of new features (including Dynamic Resource Management, Virtual Device Service Validation and many more), but the key feature for me is Live Migration. It allows an active domain to be migrated without any impact on applications, and users shouldn't even notice that the guest domain is running on a new machine (OK, so I would say that since I'm one of the migration developers...).

It has been possible to migrate an active domain since LDoms 1.1 (released in 2008). However, up until now the domain was suspended while its runtime state was copied from the source machine to the target, which could result in an outage on the order of minutes if the domain had a large amount of memory (the suspend time was pretty much linearly proportional to the guest domain's memory size). With Live Migration we transfer the memory contents while the domain keeps running, and at the same time keep track of the memory that is being modified. We iterate through the memory, transferring modified pages to the target system, until the amount of memory being modified is minimal. Then, at the end, we momentarily suspend the domain, copy the remaining memory and resume the domain on the target. This suspension can take less than a second, but it can take longer if the workload is rapidly modifying a lot of memory pages.
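To make the iterative copy a bit more concrete, here is a toy model of it in Python. It is only a sketch of the general pre-copy idea described above, not the actual migration code: the function name, the parameters and the constant dirty and transfer rates are all assumptions made for illustration.

```python
def simulate_precopy(total_pages, dirty_rate, transfer_rate, threshold, max_rounds=30):
    """Toy model of iterative pre-copy migration (hypothetical, for illustration).

    total_pages   -- pages of guest memory to move
    dirty_rate    -- pages the running guest modifies per second (assumed constant)
    transfer_rate -- pages sent to the target per second (assumed constant)
    threshold     -- stop iterating once this few pages remain to be sent

    Returns (rounds, pages_sent_live, pages_sent_suspended, suspend_seconds).
    """
    if transfer_rate <= 0:
        raise ValueError("transfer_rate must be positive")

    to_send = total_pages      # the first pass copies all of memory
    sent_live = 0
    rounds = 0
    while to_send > threshold and rounds < max_rounds:
        sent_live += to_send
        # Pages modified while this pass was on the wire must be sent again.
        # Integer arithmetic: dirty_rate * (to_send / transfer_rate) seconds.
        dirtied = dirty_rate * to_send // transfer_rate
        to_send = min(total_pages, dirtied)
        rounds += 1

    # The guest is suspended only while the final set of pages is copied.
    suspend_seconds = to_send / transfer_rate
    return rounds, sent_live, to_send, suspend_seconds


if __name__ == "__main__":
    print(simulate_precopy(1_000_000, 10_000, 100_000, 1_000))
```

In this model each pass shrinks the remaining set by the ratio of dirty rate to transfer rate, so the suspension is short when the guest modifies memory much more slowly than the network can carry it. If the guest dirties pages as fast as they can be sent, the loop never converges and gives up after `max_rounds`, which is the "can take longer" case mentioned above.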

One other migration performance enhancement in this release is that multiple network connections between the source and target machines are utilised (based on the number of virtual cpus in the control domains) which improves the throughput of the memory transfer and reduces the overall migration time.

I've found from running experiments that having 16 vcpus in the control domains makes a significant improvement over 8 vcpus (and up to 32 vcpus will help more, beyond which there's no noticeable difference). The other best practice is to ensure that cryptographic units (a.k.a. MAUs) are also assigned to the control domains. The memory contents are protected by SSL when being transferred over the network, and offloading these operations to the T-series hardware makes a big difference.

The migration chapter in the Admin Guide has been revamped to discuss the enhancements.

 [Update: Jeff Savit wrote a great post about his experiences using Live Migration]

Wednesday Jan 20, 2010

LDoms 1.3 has been released

It's now possible to download the LDoms 1.3 software. The LDoms 1.3 Release Notes list all the new features but some of the ones I particularly like include:

  • Greatly improved domain migration speeds
    • Following on from the speed-up in 1.2 due to using the hardware crypto devices, the migration code in 1.3 has been enhanced to compress the memory and use multiple threads to push the data over the network. The speed-up depends on the memory usage of the domain being migrated but Haik has mentioned seeing improvements of over 80%...

  • Support for link-based IPMP
    • The virtual network and virtual switch devices now support link status updates to the network stack. The LDoms 1.3 Admin Guide includes an example of how you'd configure it.

  • Crypto Dynamic Reconfiguration (DR)
    • You can now add and remove the hardware crypto units from domains without rebooting (just like CPUs and VIO devices). This is great since ssh/scp can now use the crypto units, as well as domain migration. In addition, because of this, you can now migrate domains with crypto units.

  • Support for non-interactive migrations
    • Adding a '-p {password file}' option to 'ldm migrate' removes the need to type in a password when doing migrations.
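As a small illustration of the non-interactive option, here is a sketch of how a script might assemble the command line. Only 'ldm migrate' and the '-p' option come from the release notes mentioned above. The helper name, the example domain and host names, and the ordering of the positional arguments (domain first, then an optional user@ prefix on the target host) are my assumptions, so check them against the ldm man page before relying on this.

```python
import shlex


def build_migrate_command(domain, target_host, password_file, user=None):
    """Build an argv list for a non-interactive 'ldm migrate' (hypothetical helper).

    The positional layout (domain, then [user@]target_host) is an assumption;
    only 'ldm migrate' and '-p <password file>' are taken from the post.
    """
    target = f"{user}@{target_host}" if user else target_host
    return ["ldm", "migrate", "-p", password_file, domain, target]


if __name__ == "__main__":
    # Example values are placeholders, not real hosts or paths.
    argv = build_migrate_command("ldg1", "target-host", "/path/to/pwfile", user="root")
    print(shlex.join(argv))
```

Building the command as a list rather than a string means it can be handed straight to something like `subprocess.run` without worrying about shell quoting of the password file path.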

[ Update: Some other posts by Alex, Duncan, Eric and Jeff ]

Friday Oct 02, 2009

Switching between ALOM and ILOM shells

Here are the commands needed to switch between the ALOM and ILOM shells on the T5xx0 SPARC CMT servers. I'm saving this here because I can never easily find it when I go searching Google for it.

sc> userclimode admin default
sc> logout

-> set /SP/users/admin cli_mode=alom
-> exit

And usually when I want to do this, it's because I'm switching on Power Management.

To set the policy property to elastic mode (enable PM):

-> set /SP/powermgmt policy=elastic

To set the policy property to performance mode (disable PM):

-> set /SP/powermgmt policy=performance

Wednesday Sep 09, 2009

LDoms 1.2 in OpenSolaris IPS repos

For anyone running OpenSolaris (either 2009.06 or the latest 2010.02 development builds), you'll be glad to know that I've pushed the LDoms 1.2 software to the /release and /dev repositories. You can get it by running:

pfexec pkg install ldomsmanager

If you were previously running LDoms 1.1 from the IPS repo, the new version will start running once you do a 'svcadm restart ldmd'.

If you were previously running a version of LDoms (1.2 or earlier) that was installed via pkgadd, then do a 'pkgrm SUNWldm' before the pkg install (your existing LDoms configs will be kept).


I have been working in Sun/Oracle since 1998 and I've spent much of that time working on adding Solaris support for various SPARC processors and servers. For the last 6+ years I've been working on what is now known as Oracle VM Server for SPARC (previously called LDoms) – virtualisation support for servers based on UltraSPARC CMT processors.

