Cross-CPU Live Migration in Oracle VM Server for SPARC 2.2

Oracle VM Server for SPARC 2.2 has just been released with important new functionality. One of the enhancements is an improvement to live domain migration. With the new release, it is now possible to migrate running domains between T-series servers with different chip types and clock frequencies. This provides operational flexibility for customers who have a heterogeneous mixture of Oracle SPARC T-series systems, by letting them move running domains between servers regardless of whether they are T2, T2+, T3 or T4-based systems.

Background

Oracle VM Server for SPARC (originally, and still often, called Logical Domains) provides secure live migration, which non-disruptively moves a running guest domain from one T-series server to another. As with most things, this requires prior planning and has several technical requirements:

  1. T-series servers running compatible versions of Oracle VM Server for SPARC and firmware.
  2. Common network accessibility.
  3. Networked storage (that is, no use of internal direct-attach disks within the server).
  4. Guests only use virtual devices - that is, virtual network and disk devices provided by a service domain, as opposed to physical assignment of devices.
  5. Identical CPU chip and clock frequencies.

The last requirement meant that a guest domain could only be live migrated between servers of the same chip type. For example, you could migrate domains between T2-based systems (between T5120 and T5220 servers) or between T4 systems, but you couldn't live migrate a domain between a T2 and a T4 system. Additionally, the clock frequency had to be identical, as shown by the stick-frequency value that the prtconf command displays. Even within a chip type, not all products have the same clock speed.

This restricted customer ability to freely move domains between systems - and quite a few customers have a mixture of T2, T2+, T3 and T4 systems.

Enabling Cross-CPU Migration

Version 2.2 removes this restriction by adding new, cooperative functionality in firmware, the logical domains manager, and Solaris. For example, Solaris has to understand when it is being moved from one processor chip type and speed to another, and adjust accordingly.

In order to use cross-CPU migration, the servers have to be at current firmware levels as described in the Oracle VM Server for SPARC 2.2 Release Notes, and the guest domains have to (at this writing) run Oracle Solaris 11. Finally, the guest domain must be created with the constraint cpu-arch=generic using the command ldm set-domain cpu-arch=generic mydomain, rather than the default of cpu-arch=native.
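
As a rough sketch of that last step, the shell function below wraps the ldm set-domain command quoted above. The LDM variable and the function name set_generic_arch are illustrative names of my own, not part of the product. Setting LDM="echo ldm" prints the command instead of running it. I am assuming the domain is not running when the property is changed, so check the administration guide for the exact rules.

```shell
# Hypothetical helper: LDM and set_generic_arch are illustrative names.
# Set LDM="echo ldm" to print the command rather than execute it.
LDM=${LDM:-ldm}

set_generic_arch() {
    # $1 is the guest domain name, for example mydomain
    [ -n "$1" ] || { echo "usage: set_generic_arch domain" >&2; return 1; }
    # Command taken from the text above: replaces the default
    # cpu-arch=native with the generic type needed for cross-CPU migration.
    $LDM set-domain cpu-arch=generic "$1"
}
```

On a control domain this would be called as set_generic_arch mydomain.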

An Example

In this example, I will migrate a domain between a T5240 and a T5120. First, I'll show the different processor types. The T5240 has the following chip type and clock frequency:

t5240:~# prtconf -pv|grep stick-frequency
    stick-frequency:  457656f0
t5240:~# psrinfo -vp
The physical processor has 8 virtual processors (0-7)
  UltraSPARC-T2+ (chipid 0, clock 1165 MHz)

and the T5120 has these:

t5120:~# prtconf -pv|grep stick-frequency
    stick-frequency:  457646c0
t5120:~# psrinfo -vp
The physical processor has 8 virtual processors (0-7)
  UltraSPARC-T2 (chipid 0, clock 1165 MHz)
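
The stick-frequency values are printed in hexadecimal, so the difference between the two servers is easier to see after converting them to Hz. This is a small sketch using the two values from the output above. The function name stick_hz is my own.

```shell
# Convert a stick-frequency value (hex, as printed by prtconf -pv) to Hz.
# stick_hz is an illustrative name, not a system command.
stick_hz() {
    printf '%d\n' "0x$1"
}

stick_hz 457656f0   # T5240: prints 1165383408
stick_hz 457646c0   # T5120: prints 1165379264
```

Both values round to the 1165 MHz that psrinfo reports, but they are not identical, which is the comparison the old requirement was based on.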

On the T5240, I've created a domain thing1 and migrated it by entering: ldm migrate thing1 t5120.
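
If you want to check a migration before committing to it, the sketch below runs the same ldm migrate command twice, first as a dry run. I am relying on the -n (dry run) option of ldm migrate here, so confirm it against the ldm man page for your release. The LDM variable and the function name migrate_checked are illustrative, and the real command prompts for the target's password.

```shell
# Hypothetical helper: LDM and migrate_checked are illustrative names.
# Set LDM="echo ldm" to print the commands rather than execute them.
LDM=${LDM:-ldm}

migrate_checked() {
    # $1 = domain to migrate, $2 = target control domain host
    [ $# -eq 2 ] || { echo "usage: migrate_checked domain target" >&2; return 1; }
    # Dry run first: assumed to check whether the migration would succeed
    # without moving the domain. Stop here if it fails.
    $LDM migrate -n "$1" "$2" || return 1
    # Real migration, same form as the command in the text above.
    $LDM migrate "$1" "$2"
}
```

For the example in this post, the call would be migrate_checked thing1 t5120.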

Logged into the guest, I was able to continue executing commands and running processes during the migration. The virtinfo command is useful when you want to obtain information about the domain and the system hosting it. I ran that command during the migration and saw the control domain name change when the migration completed.

thing1:~# virtinfo -ct
Domain role: LDoms guest
Control domain: t5240

...a little while later, from the same session...

thing1:~# virtinfo -ct
Domain role: LDoms guest
Control domain: t5120
thing1:~# psrinfo -vp
The physical processor has 8 virtual processors (0-7)
  sun4v-cpu (chipid 0, clock 1165 MHz)

In summary, the domain moved from a T2+ processor (the T5240) to a T2 processor (the T5120) while it continued running. A few moments later, I moved it back again. Note that the processor type reported inside the guest is the generic type sun4v-cpu.
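
To watch for the change from a script rather than by eye, you could pull the control domain name out of the virtinfo output. This is a minimal sketch that assumes the output format shown above. The function name control_domain is my own.

```shell
# Extract the control domain name from virtinfo -ct style output on stdin.
# control_domain is an illustrative name, not a system command.
control_domain() {
    sed -n 's/^Control domain: *//p'
}

# On a guest you would run: virtinfo -ct | control_domain
# Here it is fed the sample output from this post instead:
printf 'Domain role: LDoms guest\nControl domain: t5120\n' | control_domain
# prints t5120
```

Comparing the result before and after the migration shows when the guest has landed on the target server.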

Video demonstration on YouTube

Brand new: a video demonstration showing live migration between a T3 and a T4, and also showing the use of the Oracle VM Manager GUI instead of the command line.

Summary

Oracle VM Server for SPARC 2.2 can now live migrate running guest domains between T-series servers even if they don't have the same chip type or frequency. This enhances operational flexibility, and lets domain migration be used in a wider set of use-cases.

Comments:

Jeff, Thank you for the info -- we've been waiting for this. =-)

My first question: Will Solaris 10 also support cross-CPU migration?
Perhaps a feature for Solaris 10 Update 11? -cheers, CSB

Posted by guest on May 24, 2012 at 09:35 AM MST #

Hi Craig (I'm figuring this must be Craig...) Some Solaris 11 features have been back-ported to Solaris 10, and I wouldn't be surprised if that happened here. No guarantee or date - I am not pre-announcing - but I think it makes sense.

Posted by Jeff on May 24, 2012 at 02:17 PM MST #

Hi Jeff,

First of all, thank you so much for the information shared.
I also have a few questions:

1) Is Live Migration a free feature for OVM SPARC as per OVM x86?
2) On x86 OVM the HA functionality with OCFS2 is free, is the SPARC version for HA also free?

Regards,

Francisco

Posted by Francisco on May 24, 2012 at 03:55 PM MST #

Hi Francisco,

You're quite welcome, and I'm glad you find this helpful. To your questions:

1. Live migration is a free feature of OVM SPARC, just as with OVM x86. It's worth emphasizing that you don't pay extra for OVM SPARC in the first place: it's a built-in capability of the T-series server. Also, live migration is always secure live migration: we encrypt memory contents (using the crypto acceleration of the hardware) before transmitting them over the wire. Not all products on the market do that.

2. Oracle VM Server for SPARC does not currently have that kind of HA, which is an auto-restart capability, though it might in future. Today, people provide redundancy within a server by using OVM SPARC features to avoid single points of failure, especially using multiple service domains for multipath redundant I/O. For true "no outage" continuous availability, customers use clustering at the Solaris OS, database, or app server level.

I hope that's helpful.

regards, Jeff

Posted by Jeff on May 24, 2012 at 04:34 PM MST #

Thank you for this post. As a huge user of LDOM virtualization (we now run WebLogic and Oracle DBs on LDOMs), this feature will be most useful. However, the big question is - when would you port this to Solaris 10?

Murali

Posted by Murali on May 25, 2012 at 07:14 AM MST #

Murali - I can't pre-announce in public (I try to stick to the rules!) but please work with your account team to get this information under NDA. Craig - you can do the same! :-)

Posted by Jeff on May 25, 2012 at 07:22 AM MST #

Thanks for this post, really interesting.

Is there a performance impact on the guest domain by changing the cpu-arch parameter to generic ?

Thx,
Rgds,

Posted by guest on May 31, 2012 at 01:39 AM MST #

I think for application binaries it should be the same, but you won't want to compile (using Studio) using instructions that are specific to T4 (eg: fused arithmetic) or use 2GB page sizes since they won't be available on non-T4 systems. In the case of binaries, it would be the same in a non-domain, non-migration context: you never want to specify at compile time use of features not present on all the chips you plan to run the binary on. I have to learn more about how that works in practice (are the extra instructions available on a T4 when in generic mode? I think they aren't)

Posted by guest on May 31, 2012 at 02:52 PM MST #

Hi

Thanks for the wonderful post. I have been deploying LDOMs since version 1.02 and I am quite happy with the improvements to LDOMs over all these years.
I am curious whether the Oracle VM Manager GUI comes free with Oracle VM Server for SPARC 2.2. Until now, my understanding has been that Ops Center has to be purchased in order to get GUI management for an LDOM environment.

Best Regards

Posted by Thiva on June 17, 2012 at 09:00 PM MST #

Hi Thiva,

I'm glad you like the posts and LDoms! To your question: the Oracle VM Manager GUI is not yet available for SPARC, but you can see it being used pre-release in the video demo in the body of this blog post. My unofficial understanding is that it will not be an extra-cost feature when it is released. As for Ops Center: it is a GUI that can manage LDoms today (as well as doing many other things), and it is available at no cost with Oracle servers. You might want to look at it, as it performs many system management functions beyond what the OVM Manager does.

Posted by guest on June 18, 2012 at 09:06 AM MST #
