LDOMs or Containers, that is the question....

An often asked question: do I put my application in a container (zone) or an LDOM? My question in reply is: why the or? The two technologies are not mutually exclusive, and in practice their combination can yield some very interesting results. So if it is not an or, under what circumstances would I apply each of the technologies? And does it matter if I substitute LDOMs with VMware, Xen, VirtualBox or Dynamic System Domains? In this context all virtual machine technologies are similar enough to treat them as a class, so we will generalize to zones vs virtual machines for the rest of this discussion.

First, to the question of zones. All applications in Solaris 10 and later should be deployed in zones, with the following exceptions:
  • The restricted set of privileges in a zone will not allow the application to operate correctly
  • The application interacts with the kernel in an intimate fashion (reads or writes kernel data)
  • The application loads or unloads kernel modules
  • There is a higher level virtualization or abstraction technology in use that would obviate any benefits from deploying the application in a zone
Presented a different way, if the security model allows the application to run and you aren't diminishing the benefits of a zone, deploy in a zone.

Some examples of applications that have difficulty with the restrictive privileges would be security monitoring and auditing, hardware monitoring, storage (volume) management software, specialized file systems, some forms of application monitoring, and intrusive debugging and inspection tools that use kernel facilities such as the DTrace FBT provider. With the introduction of configurable zone privileges in Solaris 10 11/06, the number of applications that fit into this category should be small, highly specialized, and not the type of application that you would want to deploy in a zone anyway.

For the higher level abstraction exclusion, think of something at the application layer that tries to hide the underlying platform. The best example would be Oracle RAC. RAC abstracts the details of the platform so that it can provide continuously operating database services. It also has the characteristic that it is itself a consolidation platform with some notion of resource controls. Given the complexity associated with RAC, it would not be a good idea to consolidate non-RAC workloads on a RAC cluster. And since zones are all about consolidation, RAC would trump zones in this case.

There are other examples such as load balancers and transaction monitors. These are typically deployed on smaller horizontally scalable servers to provide greater bandwidth or increased service availability. Although they do not provide consolidation services, their sophisticated availability features might not interact well with the non-global zone's restrictive security model. High availability frameworks such as Sun Cluster do work well with zones. Zones abstract applications in such a way that service failover configurations can be significantly simplified.



Unless your application falls under one of these exceptions, it should be deployed in a zone.

What about virtual machines? This type of abstraction happens at a much lower level, in this case hardware resources (processors, memory, I/O). In contrast, zones abstract user space objects (processes, network stacks, resource controls). Virtual machines allow greater flexibility in running many types and versions of operating systems and applications, but they also eliminate many opportunities to share resources efficiently.

Where would I use virtual machines? Where you need the diversity of multiple operating systems. This can be different types of operating systems (Windows, Linux, Solaris) or different versions or patch levels of the same operating system. The challenge here is that large sites can have servers at many different patch and update versions, not by design but as a result of inadequate patching and maintenance tools. Enterprise patch management tools (xVM Ops Center), patch managers (PCA), or automated provisioning tools (OpsWare) can help reduce the number of software combinations, and online maintenance using Live Upgrade can reduce the time and effort required to maintain systems.

It is important to understand that zones are not virtual machines. Their differences, and the implications of those differences, are:
  • Zones provide application isolation on a shared kernel
  • Zones share resources very efficiently (shared libraries, system caches, storage)
  • Zones have a configurable and restricted set of privileges
  • Zones allow for easy application of resource controls even in a complex dynamic application environment
  • Virtual machines provide relatively complete isolation between operating systems
  • Virtual machines allow consolidation of many types and versions of operating systems
  • Although virtual machines may allow oversubscription of resources, they provide very few opportunities to share critical resources
  • An operating system running in a virtual machine can still isolate applications using zones.
And it is that last point that carries this conversation a bit further. If the decision between zones and virtual machines isn't an or, under what conditions would it be an and, and what sort of benefit can be expected?

Consider the case of application consolidation. Suppose you have three applications: A, B and C. If they are consolidated without isolation then system maintenance becomes cumbersome as you can only patch or upgrade when all three application owners agree. Even more challenging is the time pressure to certify the newly patched or upgraded environment due to the fact that you have to test three things instead of one. Clearly isolation is a benefit in this case, and it is a persistent property (once isolated, forever isolated).

Isolation using zones alone will be very efficient but there will be times when the common shared kernel will be inconvenient - approaching the problems of the non-isolated case. Isolation using virtual machines is simple and very flexible but comes with a cost that might be unnecessary.

So why not do both? Use zones to isolate the applications and use virtual machines for those times when you cannot support all of the applications with a common version of the operating system. In other words, the isolation is a persistent property and the need for heterogeneous operating systems is temporary and specific. With some small improvements in the patching and upgrade tools, the time frame when you need heterogeneous operating systems can be reduced.

Using our three applications as an example, A, B and C are deployed in separate zones on a single system image, either on bare metal or in a virtual machine. Everything is operating spectacularly until a new OS upgrade becomes available that provides some important new functionality for application A. So application owner A wants to upgrade immediately, application B doesn't care one way or the other, and (naturally) application C has just gone into seasonal lock-down and cannot be altered for the rest of the year.

Using zones and virtual machines provides a unique solution. Provision a new virtual machine with the new operating system software, either on the same platform by reassigning resources (CPU, memory) or on a separate platform. Next clone the zone running application A. Detach the newly cloned zone and migrate it to the new virtual machine. A new feature in Solaris 10 10/08 will automatically upgrade the new zone upon attachment to a server running newer software. Leave the original zone alone for some period of time in the event that an adverse regression appears that would force you to revert to the original version. Eventually the original zone can be reclaimed, but at a time when convenient.
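A rough sketch of that procedure, with made-up zone and path names, on Solaris 10 10/08 or later (attach -u is the update-on-attach feature mentioned above):

    # On the original system: clone application A's zone (the source zone
    # must be halted briefly while the clone is taken)
    zonecfg -z appA export -f /tmp/appA.cfg     # edit the zonepath in the copy first
    zonecfg -z appA-v2 -f /tmp/appA.cfg
    zoneadm -z appA halt
    zoneadm -z appA-v2 clone appA
    zoneadm -z appA boot                        # original keeps running as before

    # Detach the clone and move its zonepath to the new virtual machine
    zoneadm -z appA-v2 detach
    # ... archive and transfer /zones/appA-v2 to the target system ...

    # On the target running the newer OS: register the zone and attach
    # with -u so its packages are updated to the new software level
    zonecfg -z appA-v2 create -a /zones/appA-v2
    zoneadm -z appA-v2 attach -u
    zoneadm -z appA-v2 boot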

Migrate the other two applications at a convenient time using the same procedure. When all of the applications have been migrated and you are comfortable that they have been adequately tested, the old system image can be shut down and any remaining resources can be reclaimed for other purposes. Zones as the sole isolation agent cannot do this and virtual machines by themselves will require more administrative effort and higher resource consumption during the long periods when you don't need different versions of the operating system. Combined you get the best of both features.

A less obvious example is ISV licensing. Consider the case of Oracle. Our friends at Oracle consider the combination of zones and capped resource controls as a hard partition method which allows you to license their software to the size of the resource cap, not the server. If you put Oracle in a zone on a 16 core system with a resource cap of 2 cores, you only pay for 2 cores. They have also made similar considerations for their Xen based Oracle VM product yet have been slow to respond to other virtual machine technologies. Zones to the rescue. If you deploy Oracle in a VM on a 16 core server you pay for all 16 cores. If you put that same application in a zone, in the same VM but cap the zone at 4 cores then you only pay for 4 cores.

Zones are all about isolation and the application of resource controls. Virtual machines are all about heterogeneous operating systems. Use zones to persistently isolate applications. Use virtual machines during the times when a single operating system version is not feasible.

This is only the beginning of the conversation. A new Blueprint based on measured results from some more interesting use cases is clearly needed. Jeff Savit, Jeff Victor and I will be working on this over the next few weeks and I'm sure that we will be blogging with partial results as they become available. As always, questions and suggestions are welcome.
Comments:

I agree with what you've said but I wanted to add in another compelling reason for the right combination of zones, domains, or no virtualization: cost reduction.

Logical domains (and dynamic system domains) go to the heart of reducing physical servers, power, cooling, and floor space. These reductions (in resource consumption and monetary costs) can be significant, but there is another level of savings to be had using Containers to reduce the annual application and software costs.

Oracle is a great example but it works with other applications as well. Systems can be monitored from the global zone, so a 5:1 consolidation to containers can also be a 5:1 consolidation of enterprise monitoring, backup, and security monitoring, to name a few categories.

Without containers, the average dollar savings I saw (given a pool of servers which included older, EOSL assets) was $3K to $5K per server per year in operational expenses. With Containers, it was between $6K and $8K per server per year in OpEx.

It ain't all about money, I know, but a good idea isn't worth much if it doesn't convince the person holding the checkbook.

Posted by Lee Diamante on September 23, 2008 at 11:45 AM CDT #

Hi Bob, can you clarify something?

I understand that Oracle don't officially support their product in Logical Domains. Can you state the official position with Oracle and LDoms? Also, I had a situation at a customer recently where, because we were using LDoms, Oracle insisted that all cores in the server be licensed, as they don't recognise LDoms in their hard partitioning definition, even though the Oracle LDom was only using 2 cores. If Oracle was deployed into a capped resource local zone within an LDom, how would this impact the licensing?

Posted by Jon Whiteoak on September 25, 2008 at 10:32 PM CDT #

Regarding support of Oracle in an LDom guest, both Sun and Oracle are working on certifying Oracle standalone and Oracle RAC on LDoms. The verification process is well underway and going smoothly. Expect to see formal support announced soon.

Posted by John Falkenthal on October 09, 2008 at 04:55 AM CDT #

Thanks for the Oracle on LDOMs information John. Much appreciated. It confirms information I have received through other channels but not with enough confidence for me to post authoritatively on the subject.

And to follow up on the other John's second question, we have to be a bit precise here. Oracle doesn't recognize capped CPUs as a resource control, but does recognize pools (dynamic or static). So if you set up Oracle in a zone and create your own pools or use the dedicated CPU resource controls then you only have to license to the size of the pool (the max in the case of dynamic pools). This is true regardless of whatever other virtualization technologies are in play (VMware, Xen, LDOMs).
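As a sketch of the dedicated CPU variant (the zone name and CPU counts are made up), zoneadm creates and manages the temporary pool for you when the zone boots:

    zonecfg -z oradb
    zonecfg:oradb> add dedicated-cpu
    zonecfg:oradb:dedicated-cpu> set ncpus=2-4    # a range makes the pool dynamic; 4 is the max
    zonecfg:oradb:dedicated-cpu> end
    zonecfg:oradb> exit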

Posted by Bob Netherton on October 09, 2008 at 07:26 AM CDT #

Hi Bob, Great info, thanks. Do you have any recommendations on when to use whole root zones as opposed to sparse root? For example, does a whole root zone run its own kernel? I'm assuming not, so it would still stand that if you needed kernel interaction you'd be better going down the LDom route? What are the advantages/disadvantages of WRZ vs SRZ?

Posted by Tony Dalton on November 24, 2008 at 12:26 AM CST #

Thanks Tony. Good question. I'll give a proper treatment of the subject over the next few days, but the short answers are......

All zones share a common kernel. That's true with sparse and whole root zones. Even non-native zones share the same kernel,
although they may have an additional shim that sits in between libc (or equivalent) and the top of the kernel.

So the difference between a sparse and whole root zone comes down to being able to write in a few special directories. Those would be /usr, /lib, /sbin and /platform. In the case of a sparse root zone these are inherited from the global zone as read-only loopback filesystem mounts. As such they cannot be changed from within the zone. In addition, the packaging and patching system is aware of the read-only nature of these directories so that the utilities do not fail when they encounter the write error. Whole root zones, on the other hand, have their own copies of these directories. Their contents can be modified from within the zone if needed.
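For illustration (zone names are hypothetical), the distinction shows up right in zonecfg: the default template gives you a sparse root zone with those four directories inherited, while create -b starts from a blank configuration and produces a whole root zone.

    # Sparse root: the default template inherits /usr, /lib, /sbin and /platform
    zonecfg -z sparsezone
    zonecfg:sparsezone> create                    # uses the default (sparse) template
    zonecfg:sparsezone> set zonepath=/zones/sparsezone
    zonecfg:sparsezone> info inherit-pkg-dir      # lists the four inherited directories

    # Whole root: start from a blank configuration so nothing is inherited
    zonecfg -z wholezone
    zonecfg:wholezone> create -b
    zonecfg:wholezone> set zonepath=/zones/wholezone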

So let's start with the basic premise that sparse root zones are the preferred type. The rationale is that they take up much less disk
space (perhaps 75MB compared to 4GB of a full Solaris install), produce a smaller memory footprint since shared libraries are in
fact shared across sparse root zones, take less time to create, take a little less time to patch (you still have to mess with all the
stuff in /var/sadm), have a lot less mass to throw around when migrating, and my personal favorite - prevent Trojan Horse attacks
against the common system directories (/usr, /lib, /platform, and /sbin). It will be this last point that you will be coming back to
frequently as you think about it.

So when to use a whole root zone? When you have to write in /usr, /lib, /platform or /sbin. And I want to be a bit more specific about that - it is when you have to write all over them.

For example, take a package that puts a few binaries in /usr/bin, maybe an administrative command in /sbin and some header files
in /usr/include. That doesn't work too well in a sparse root zone, unless you want everyone to have the package - and more
precisely the exact same version of the package. You might also install or remove a set of packages that put or take away things
from /usr.

Now that is not the same thing as an application that installs in /usr/throatwarbler. That doesn't make me give up on a sparse root
zone. I can add a writable loopback mount for /usr/throatwarbler in one zone and not another. It is the writing of little bits all over /usr
that I can't cover with a writable loopback that makes me think whole root zone. This same reasoning can be applied to java
runtime environments, although there are much better ways to do this with environment variables and path settings. Since updating
and patching tools might use a java runtime, I do get a bit nervous changing the one in /usr/java, but sometimes those trade-offs are
worth exploring.
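A sketch of that writable loopback mount, with hypothetical zone and path names:

    zonecfg -z appzone
    zonecfg:appzone> add fs
    zonecfg:appzone:fs> set dir=/usr/throatwarbler
    zonecfg:appzone:fs> set special=/export/appzone/throatwarbler   # backing directory in the global zone
    zonecfg:appzone:fs> set type=lofs
    zonecfg:appzone:fs> end
    zonecfg:appzone> exit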

A whole root zone is also handy for some degree of patch testing. I have to say that - it's in the contract :-) By the time you eliminate
all of the PKG_ALL_ZONES packages and their dependencies, there isn't an awful lot to individually patch, but there are some
cases where this is interesting.

Although we have zone roots on ZFS in Solaris 10 10/08 (aka u6), the features aren't tightly integrated yet. Zone cloning doesn't take
advantage of ZFS cloning, so you still have the large storage footprint when using whole root zones. And migration may make
you untangle the ZFS clones anyway, so this doesn't make me change my mind on the sparse root preference.

One last consideration. If you are a package minimalist (a Terry Riley of software package management) then you might like the
idea of a minimal global zone and then individual whole root zones tailored to the specific software needs of the zone. This is
certainly more complicated and goes against the trend of installing everything except for the obvious exceptions, but there is merit in
the approach. Since I come at this as a system administrator I would tend not to do this, but my good friends that are security
specialists favor this approach.

So sparse root zones and whole root zones only differ by the ability to write in the inherited package directories (by default /usr, /lib,
/sbin, and /platform) and the implications of this limitation. My preference is to start the discussion assuming sparse root and
look for one of the exceptions that would prevent its use.

Did I say short answers? I did get a bit carried away. Hope this helps answer your question. I will post a more proper reply for all the folks that don't make it into the comments.

Thanks again,
Bob

Posted by Bob Netherton on November 24, 2008 at 12:57 PM CST #

Now that it's official that Oracle's product suite will not be supported on any VMware hypervisor, type 1 or 2, the only option is Oracle's respin of RHEL with a Xen-based VMM. Will Oracle support other UNIX implementations? Solaris x86 with the Xen kernel additions? xVM Server?
What about XenServer from Citrix? What about SUSE?
I would hate to think Oracle will only be supporting their "unbreakable linux" and Solaris zones as the supported virtualization technologies.. everyone keeps saying KVM is a better solution than Xen..

Posted by sid wilroy on May 24, 2009 at 12:25 PM CDT #

Sid, those are good questions for our friends at Oracle. I certainly can't comment on their product strategies, so if one of those fine folks wants to offer up some insight, I'd be happy to pass it along. I would expect that after the acquisition, once things start getting sorted out, there will be good answers for questions like yours.

To your point at the end, that everyone is saying KVM is a better solution than Xen - that's not quite correct. Certainly, Red Hat will say that :-) And to a large degree, those distributions that desire to stay close to the core projects may come along with varying degrees of enthusiasm. Clearly we disagree, and we like how we have been able to leverage some pretty spectacular Solaris features in our Xen implementation (ZFS, some DTrace probes, the fault management parts are especially interesting). Stay tuned though, this conversation is far from over.

Posted by Bob Netherton on May 25, 2009 at 03:19 AM CDT #

Hi Bob, I haven't found much information on the scheduling of threads and possible performance considerations when using LDOMs or zones.

I am trying to determine if it would be beneficial to split a server (T series with 2 - 6 core CPUs, for example) into two or more LDOMs to ensure more linear scalability for java applications.

Has anyone found performance benefits when running multiple LDOMs vs. one LDOM and multiple zones? I am mostly concerned about scheduling and the ability to run more than 600 application threads within a single global zone (around 156 application threads per zone).

Posted by Soren Morton on June 15, 2009 at 05:21 AM CDT #

Bob, how current is your comment about the Oracle licensing in a zone vs LDOM? Is that still the case?

Posted by Jim Covington on July 29, 2009 at 09:16 AM CDT #

Jim-
Oracle has certified standalone 10gR2 and above in containers, and 10gR2 and above in LDOMs with LDoms 1.0.3 and above. Be careful with RAC and ZFS though. I can't comment about licensing though. We're playing that game at this time.
Eric

Posted by Eric Crosby on August 20, 2009 at 12:50 AM CDT #

How do I migrate a zone to an LDOM?

Posted by Jim Covington on October 16, 2009 at 08:11 AM CDT #

Hey Jim. I've received the same question from some Sun folks - maybe related to your question. I'm assuming that you are looking for a tool to turn a zone into an LDOM. There is no such tool that I'm aware of, unfortunately. Moving a zone to an LDOM is easy, especially when running Solaris 10 10/08 or later where zone upgrade on attach will catch package differences between architectures.

But for turning a zone into an LDOM, I'm not coming up with anything. You could think of this from a different direction though - application reprovisioning. If your application is easy to reinstall, it may be a simple matter of reinstallation and just moving the data. That would not be too hard. But it does presume that there is no tricky customization that has to be done.

I've read about some clever uses of SMF properties to try to record application specific data for this purpose. It is a fascinating concept - perhaps worth a few blogs.

If I'm missing what you are trying to do, send me some email and we can talk about it. Maybe there is an easy way to accomplish your task.

Posted by Bob Netherton on November 16, 2009 at 12:49 AM CST #

Bob -
Thanks for this blog and for the specificity of the statement:
"So if you set up Oracle in a zone and create your own pools or use the dedicated CPU resource controls then ..."

This is the clearest statement on licensing I've found so far, but it is in a blog and not an official Oracle document and you intersperse "capped" and "dedicated" as well.

I have been variously told by people who ought to know, that I can put a zone in an ldom and use capped-cpu alone to allow Oracle licensing; told that I must use dedicated-cpu vs capped-cpu; and told that I must create a pool and cap the cpus; and told that no use of an LDOM (with or without zones) will allow restriction of CPUs for licensing of Oracle database products.

What I need is a good and official document from Oracle detailing what they really do support. Can you direct me to such?

Regards,
Ken

Posted by guest on June 17, 2011 at 07:25 AM CDT #

Hey Ken,

Thanks for your comment.

It appears the original document has been renamed, so I've updated the blog to point at the new location. For your reference, it is in the Specialty Topics off the main pricing page. The direct link to the partitioning document is http://www.oracle.com/us/corporate/pricing/partitioning-070609.pdf

Some of the terms are used interchangeably, which can be confusing. For the purposes of this document, "capped container" means a Solaris container that is mapped to a processor pool, either by "add dedicated-cpu" in the zone configuration or by the pool attribute (and a corresponding pooladm to set up the pool).
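For completeness, a minimal sketch of the explicit pool variant (pool, pset and zone names are invented); the dedicated-cpu example earlier in the comments achieves the same end with a temporary pool that the system manages for you:

    pooladm -e                                   # enable the resource pools facility
    pooladm -s                                   # save the current configuration to /etc/pooladm.conf
    poolcfg -c 'create pset ora-pset (uint pset.min = 4; uint pset.max = 4)'
    poolcfg -c 'create pool ora-pool'
    poolcfg -c 'associate pool ora-pool (pset ora-pset)'
    pooladm -c                                   # commit and activate the configuration
    zonecfg -z oradb "set pool=ora-pool"         # bind the zone to the pool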

The partitioning document and the companion http://www.oracle.com/technetwork/topics/virtualization/ovm-hardpart-167739.pdf (referenced in the partitioning doc) are the authoritative documents, as far as I understand.

Please let me know if this doesn't answer your question or concerns.

Bob

Posted by Bob Netherton on June 28, 2011 at 05:16 AM CDT #

When creating an LDOM, is it possible to oversubscribe a CPU core?

Thank you,
Phil

Posted by Phil Schlegel on June 04, 2012 at 09:27 AM CDT #

Hi Phil,

Thanks for the question, it is a good one.

The practical answer is no, you can't. At least not in the way you do with IBM, VMware or any of the Xen based products (like Oracle VM for x86). Since the T series of processors tends to be very rich in threads and cores per socket, the hypervisor does direct assignment of memory and CPU to the guests. The implication is that the hypervisor remains compact and fast, and the guests deliver bare-metal performance for all but I/O operations - and in some configurations (see the SPARC SuperCluster) they can do that as well.

Since I don't like blunt "no" responses to questions :-) let's look a bit more at what you are trying to accomplish and see if there is something more creative that we can do.

LDOMs do allow for dynamic reconfiguration of CPU, memory and I/O. This does give an opportunity for an agent based resource manager to do a little bit of load balancing. You probably don't want to be slinging processors between guests too often, but for gentle load balancing, it would be a solution. The primary use case would be multi-time-zone consolidation where the load follows the sun (add CPUs as the sun rises, remove them as it sets).
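As a sketch, with invented domain names and CPU counts, the rebalancing itself is just a couple of ldm commands:

    # Follow the load: take four virtual CPUs from the quiet guest
    # and give them to the busy one
    ldm remove-vcpu 4 batch-ldom
    ldm add-vcpu 4 web-ldom

    # Or set an absolute count on a guest directly
    ldm set-vcpu 24 web-ldom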

Perhaps the better answer is some combination of LDOMs and zones (containers). Carve up a T server into a few LDOMs on core boundaries, and then consolidate applications within those few LDOMs using containers. In that case, we can use the container resource controls (capped-memory, capped-cpu, importance) to provide the oversubscription functionality and take advantage of the LDOMs' high performance design.
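A sketch of the container side of that arrangement (zone name, caps and sizes are made up); note that the per-zone CPU caps can add up to more than the LDOM actually owns, which is where the oversubscription comes in:

    zonecfg -z app1
    zonecfg:app1> add capped-cpu
    zonecfg:app1:capped-cpu> set ncpus=1.5        # fractional CPU caps are allowed
    zonecfg:app1:capped-cpu> end
    zonecfg:app1> add capped-memory
    zonecfg:app1:capped-memory> set physical=4g
    zonecfg:app1:capped-memory> end
    zonecfg:app1> exit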

The objections to this approach tend to fall into "this is too complicated" or "that would be a nightmare to patch" categories. For the former, Ops Center has come a long way in ease of use for virtualized environments, especially in the area of containers. For the latter, Live Upgrade is the answer for Solaris 10 (and IPS has Live Upgrade functionality built in for Solaris 11).

Hope that helps answer your question. If not, please drop me some email so we can chat more.

Posted by Bob Netherton on June 04, 2012 at 10:17 AM CDT #

"Logical domains (and dynamic system domains) go to the heart of reducing physical servers, power, cooling, and floor space"

LDOMs do not allow oversubscription of CPU resources, so that's simply not true.

Posted by guest on October 24, 2012 at 07:56 PM CDT #

Oversubscription of CPUs is just one of many possible features in virtualization technologies. As with most things, it is not a "one size fits all" as there are implications of doing that. It most certainly is not required to reduce physical servers, power, cooling and floor space, especially in a thread (vcpu) rich platform like the T series, as our customers have demonstrated time and time again.

That said, there are times when oversubscription is a useful feature and that is where zones come into play. Within an LDOM, stack applications of a like or compatible lifecycle and use Solaris resource management tools (caps, FSS scheduling class) to allow *and* manage oversubscribed resources efficiently.
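A sketch of the FSS side (zone names and share values are invented); under FSS the shares only matter when the zones are actually competing for CPU, which is what makes the oversubscription manageable:

    dispadmin -d FSS                              # make FSS the default scheduling class
                                                  # (takes effect at the next boot)
    zonecfg -z app1 "set cpu-shares=30"
    zonecfg -z app2 "set cpu-shares=10"           # app1 gets 3x the CPU when both are busy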

If you are worried about the complexity of using two different virtualization technologies, that's where Oracle Enterprise Manager Ops Center comes in. It has a nice graphical management interface which removes most of the complexity, not that it is all that complex in the first place.

Posted by Bob Netherton on October 25, 2012 at 10:46 AM CDT #
