A frequently asked question: do I put my application in a container (zone) or an LDOM? My
question in reply is: why the "or"? The two technologies are not mutually
exclusive, and in practice their combination can yield some very interesting
results. So if it is not an "or", under what circumstances would I apply
each of the technologies? And does it matter if I substitute LDOMs with
VMware, Xen, VirtualBox or Dynamic System Domains? In this context all virtual machine
technologies are similar enough to treat them as a class, so we will generalize to
zones vs. virtual machines for the rest of this discussion.
First to the question of zones. All applications in Solaris 10 and later
should be deployed in zones, with the following exceptions:
- The restricted set of privileges in a zone will not allow the application to operate correctly
- The application interacts with the kernel in an intimate fashion (reads or writes kernel data)
- The application loads or unloads kernel modules
- There is a higher level virtualization or abstraction technology in use that would obviate any
  benefits from deploying the application in a zone
Presented a different way, if the security model allows the application to run and you aren't
diminishing the benefits of a zone, deploy in a zone.
Some examples of applications that have difficulty with the restrictive privileges would be security
monitoring and auditing, hardware monitoring, storage (volume) management software, specialized
file systems, some forms of application monitoring, and intrusive debugging and inspection
tools that use kernel facilities such as the DTrace FBT provider. With the introduction
of configurable zone privileges in Solaris 10 11/06, the applications that fit into this category
should be few in number, highly specialized and not the type of application that you would want
to deploy in a zone.
For the higher level abstraction exclusion, think of something at the application layer that tries to hide the
underlying platform. The best example would be Oracle RAC. RAC abstracts the details of the platform
so that it can provide continuously operating database services. It also has the characteristic that it
is itself a consolidation platform with some notion of resource controls. Given the complexity associated
with RAC, it would not be a good idea to consolidate non-RAC workloads on a RAC cluster. And since zones
are all about consolidation, RAC would trump zones in this case.
There are other examples such as load balancers and transaction monitors. These are typically
deployed on smaller horizontally scalable servers to provide greater bandwidth or increased service
availability. Although they do not provide consolidation
services, their sophisticated availability features might not interact well with the
restrictive security model of a non-global zone. High availability frameworks such as Sun Cluster
do work well with zones. Zones abstract applications in such a way that
service failover configurations can be significantly simplified.
Unless your application falls under one of these exemptions, the application should be deployed
in a zone.
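The rule of thumb above can be written down as a small checklist. This is only a sketch of the reasoning in this post, not a tool that inspects a real system: the `AppProfile` class and its field names are made up for illustration, and you would have to answer each question yourself for your own application.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AppProfile:
    """Hypothetical description of an application (field names are illustrative)."""
    name: str
    runs_with_zone_privileges: bool = True      # works under the zone's restricted privileges
    touches_kernel_data: bool = False           # reads or writes kernel data
    loads_kernel_modules: bool = False          # loads or unloads kernel modules
    has_higher_level_abstraction: bool = False  # e.g. an Oracle RAC cluster


def zone_exceptions(app: AppProfile) -> List[str]:
    """Return the reasons (if any) that argue against deploying the app in a zone."""
    reasons = []
    if not app.runs_with_zone_privileges:
        reasons.append("needs privileges a zone will not grant")
    if app.touches_kernel_data:
        reasons.append("reads or writes kernel data")
    if app.loads_kernel_modules:
        reasons.append("loads or unloads kernel modules")
    if app.has_higher_level_abstraction:
        reasons.append("a higher level abstraction already provides the benefits")
    return reasons


def deploy_in_zone(app: AppProfile) -> bool:
    """Default answer is yes; any exception flips it to no."""
    return not zone_exceptions(app)


if __name__ == "__main__":
    for app in (AppProfile("web server"),
                AppProfile("RAC node", has_higher_level_abstraction=True)):
        print(app.name, deploy_in_zone(app), zone_exceptions(app))
```

The point of the design is that "deploy in a zone" is the default and the exceptions have to be argued for, not the other way around.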
What about virtual machines? This type of abstraction happens at a much lower level, in this
case hardware resources (processors, memory, I/O). In contrast, zones abstract user space
objects (processes, network stacks, resource controls). Virtual machines allow greater flexibility in running
many types and versions of operating systems and applications but also eliminate many opportunities
to share resources efficiently.
Where would I use virtual machines? Where you need the diversity of multiple operating systems.
This can be different types of operating system (Windows, Linux, Solaris) or different
versions or patch levels of the same operating system. The challenge here is that large sites can
have servers at many different patch and update versions, not by design but as a result
of inadequate patching and maintenance tools. Enterprise patch management tools,
patch managers (PCA), or
automated provisioning tools (OpsWare) can help reduce the number
of software combinations, and online maintenance using Live Upgrade can reduce the time and effort
required to maintain systems.
It is important to understand that zones are not virtual machines. Their differences, and the
implications of those differences, are:
- Zones provide application isolation on a shared kernel
- Zones share resources very efficiently (shared libraries, system caches, storage)
- Zones have a configurable and restricted set of privileges
- Zones allow for easy application of resource controls even in a complex dynamic application environment
- Virtual machines provide relatively complete isolation between operating systems
- Virtual machines allow consolidation of many types and versions of operating systems
- Although virtual machines may allow oversubscription of resources, they provide very few opportunities
  to share critical resources
- An operating system running in a virtual machine can still isolate applications using zones.
And it is that last point that carries this conversation a bit further. If the decision between zones and
virtual machines isn't an "or", under what conditions would it be an "and", and what sort of benefit can
be expected?
Consider the case of application consolidation. Suppose you have three applications: A, B and C.
If they are consolidated without isolation then system maintenance becomes cumbersome as you can
only patch or upgrade when all three application owners agree. Even more challenging is the
time pressure to certify the newly patched or upgraded environment due to the fact that you have
to test three things instead of one. Clearly isolation is a benefit in this case, and it is a
persistent property (once isolated, forever isolated).
Isolation using zones alone will be very
efficient but there will be times when the common shared kernel will be inconvenient - approaching
the problems of the non-isolated case. Isolation using virtual machines is simple and very flexible
but comes with a cost that might be unnecessary.
So why not do both?
Use zones to isolate the applications and use virtual machines for those times when you cannot
support all of the applications with a common version of the operating system. In other words
the isolation is a persistent property and the need for heterogeneous operating systems is
temporary and specific. With some small improvements in the patching and upgrade tools,
the time frame when you need heterogeneous operating systems can be reduced.
Using our three applications as an example, A, B and C are deployed in separate zones on
a single system image, either on bare metal or in a virtual machine. Everything is operating
spectacularly until a new OS upgrade is available which provides some important new
functionality for application A. So application owner A wants to upgrade immediately,
application B doesn't care one way or the other, and (naturally) application C has
just gone into seasonal lock-down and cannot be altered for the rest of the year.
Using zones and virtual machines provides a unique solution. Provision a
new virtual machine with the new operating system software, either on the same platform
by reassigning resources (CPU, memory) or on a separate platform. Next clone the zone
running application A. Detach the newly cloned zone and migrate it to the new virtual machine.
A new feature in Solaris 10 10/08 will automatically upgrade the new zone upon attachment
to a server running newer software. Leave the original zone alone for some period of time
in the event that an adverse regression appears that would force you to revert to the
original version. Eventually the original zone can be reclaimed, at whatever time is convenient.
Migrate the other two applications at a convenient time using the same procedure. When
all of the applications have been migrated and you are comfortable that they have been
adequately tested, the old system image can be shut down and any remaining resources
can be reclaimed for other purposes. Zones as the sole isolation agent cannot do this
and virtual machines by themselves will require more administrative effort and higher
resource consumption during the long periods when you don't need different versions of
the operating system. Combined you get the best of both features.
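To make the clone, detach and attach procedure a little more concrete, here is a sketch that only builds the list of commands for one application and runs nothing. The zone names, the configuration file path and the zonepath are hypothetical. The command syntax is my understanding of the Solaris 10 `zonecfg` and `zoneadm` utilities (including the `attach -u` update on attach option), so check it against the man pages for your release. Halting the source zone before cloning is an assumption about what `zoneadm clone` requires, and how the zonepath gets copied to the new virtual machine is deliberately left open.

```python
from typing import List, Tuple


def migration_plan(zone: str, clone: str, zonepath: str) -> List[Tuple[str, str]]:
    """Return (host, command) pairs for moving a clone of `zone` to a new system image.

    "old" is the existing system image, "new" is the freshly provisioned
    virtual machine running the newer OS. All names are illustrative.
    """
    cfg = f"/tmp/{clone}.cfg"  # hypothetical scratch file for the exported configuration
    return [
        ("old", f"zonecfg -z {zone} export -f {cfg}"),
        ("old", f"# edit {cfg}: set zonepath={zonepath} and anything else that must differ"),
        ("old", f"zonecfg -z {clone} -f {cfg}"),
        ("old", f"zoneadm -z {zone} halt"),            # assumption: source must be halted to clone
        ("old", f"zoneadm -z {clone} clone {zone}"),
        ("old", f"zoneadm -z {zone} boot"),            # original zone goes back into service
        ("old", f"zoneadm -z {clone} detach"),
        ("old", f"# copy {zonepath} to the new virtual machine (method is site specific)"),
        ("new", f"zonecfg -z {clone} create -a {zonepath}"),
        ("new", f"zoneadm -z {clone} attach -u"),      # update on attach
        ("new", f"zoneadm -z {clone} boot"),
    ]


if __name__ == "__main__":
    for host, command in migration_plan("appA", "appA-new", "/zones/appA-new"):
        print(f"[{host}] {command}")
```

Keeping the plan as data makes it easy to review before anyone types a command, and the same function can be reused later for applications B and C.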
A less obvious example is ISV licensing. Consider the case of Oracle. Our friends at
Oracle consider the combination
of zones and capped resource controls to be a hard partitioning method, which allows you to
license their software to the size of the resource cap, not the server. If you put Oracle
in a zone on a 16 core system with a resource cap of 2 cores, you only pay for 2 cores.
They have also made similar considerations for their Xen based Oracle VM product, yet have
been slow to respond to other virtual machine technologies. Zones to the rescue. If you
deploy Oracle in a VM on a 16 core server you pay for all 16 cores. If you put that same
application in a zone, in the same VM, but cap the zone at 4 cores, then you only pay for
4 cores.
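The licensing arithmetic in this example is simple enough to write down. The function below is only an illustration of the numbers used in this post. It is not a statement of Oracle's actual licensing terms, which involve details (core factors, editions, contract language) that are ignored here, and the function and parameter names are made up.

```python
from typing import Optional


def licensed_cores(server_cores: int, zone_cap_cores: Optional[int] = None) -> int:
    """Cores to license under the simplified rule described in the text.

    server_cores   -- physical cores in the server
    zone_cap_cores -- CPU cap on the zone running the software, or None if the
                      software is not in a capped zone (bare metal or a plain VM)
    """
    if server_cores <= 0:
        raise ValueError("server_cores must be positive")
    if zone_cap_cores is None:
        # No recognized hard partition: the whole server counts.
        return server_cores
    if zone_cap_cores <= 0:
        raise ValueError("zone_cap_cores must be positive")
    # A cap larger than the server cannot increase the bill.
    return min(zone_cap_cores, server_cores)


if __name__ == "__main__":
    print(licensed_cores(16))                    # in a VM, no capped zone
    print(licensed_cores(16, zone_cap_cores=2))  # capped zone on the server
    print(licensed_cores(16, zone_cap_cores=4))  # capped zone inside the VM
```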
Zones are all about isolation and application of resource controls. Virtual machines
are all about heterogeneous operating systems. Use zones to persistently isolate applications.
Use virtual machines during the times when a single operating system version is not
sufficient.
This is only the beginning of the conversation. A new Blueprint
based on measured results from some more interesting use cases is clearly needed. Jeff Savit,
Jeff Victor and I will be working on this over the next few weeks and I'm sure
that we will be blogging with partial results as they become available. As always, questions and
suggestions are welcome.
While this is inspired by a recent conversation with a customer, I have seen the term "true virtualization" used
quite a bit lately - mostly by people who have just attended a VMware seminar, and to a lesser extent folks from
IBM trying to compare LPARs with Solaris zones. While one must give due credit to the
fine folks at VMware for raising Information Technology (IT) awareness and putting virtualization in the common vocabulary,
they have hardly cornered the market on virtualization, and using the term "true virtualization" may reveal how narrow an
understanding they have of the concept or an unfortunate arrogance that their approach is the only one that matters.
Wikipedia defines virtualization as a technique for hiding the physical characteristics of computing resources from the way in which other systems, applications, or end users interact with those resources. While Wikipedia isn't the final authority, this definition is quite good and we will use it to start our exploration.
So what is true virtualization? Anything that (potentially) hides architectural details from running objects (programs, services, operating
systems, data). No more, no less - end of discussion.
Clearly VMware's virtualization products (ESX, Workstation) do that. They provide virtual machines that emulate the Intel x86
Instruction Set Architecture (ISA) so that operating systems think they are running on real hardware when in fact they are not. This would be classified as the abstraction type of virtualization. So would Xen, albeit with an interesting twist.
In the case of Xen, the guest sees a synthetic ISA based on the x86, with some of the instructions that are difficult to virtualize removed.
This makes porting a rather simple task - none of the user space code needs to be modified, and the privileged code that must change is generally limited to parts of the kernel that actually touch the hardware (virtual memory management, device drivers). In some respects, Xen is less of an abstraction as it does allow the virtual machines to see the architectural details, thus permitting specific optimizations that would be prohibited in the VMware case. And our good friends at Intel and AMD are adding new features to their processors to make virtualization less complicated and higher performing, so the differences in approach between the VMware and Xen hypervisors may well blur over time.
But is this true virtualization? No, it is just one of many types of virtualization.
How about the Java Virtual Machine (JVM)? It is a run time executive that provides a virtualized environment for a completely synthetic ISA (although real pcode implementations have been done, they are largely for embedded systems). This is the magic behind "write once, run anywhere", and in general the approach works very well. So this is another example of virtualization - and also an abstraction type. And given the number of JVMs running around out there - if anyone is going to claim true virtualization, it would be the Java folks. Fortunately their understanding of the computer industry is broad and they are not arrogant - thus they would never suggest such folly.
Sun4v Logical Domains (LDOMs) are a thin hypervisor based partitioning of a radically multithreaded SPARC processor. The guest domains (virtual
machines) run on real hardware but generally have no I/O devices. These guest domains get their I/O over a private channel from a service domain (a special type of domain that owns devices and contains the real device drivers). So I/O is virtualized but all other operations are executed on real hardware. The hypervisor provides resource (CPU and memory) allocation and management and the private channels for I/O (including networking). This too is virtualization, but not like Xen or VMware. This is an example of partitioning. Another example is IBM (Power) LPARs, albeit with a slightly different approach.
Are there other types of virtualization? Of course there are.
Solaris zones are an interesting type of virtualization called OS Virtualization. In this case we interpose the virtualization layer between
the privileged kernel layer and the non-privileged user space. The benefit here is that all user space objects (name space, processes, address spaces) are
completely abstracted and isolated. Unlike the methods previously discussed, the kernel and underlying hardware resources are not artificially
limited, so the full heavy lifting capability of the kernel is available to all zones (subject to other resource management policies). The
trade-off for this capability is that all zones share a common kernel. This has some availability and flexibility limitations that should
be considered in a system design using zones. Non-native (Branded) zones offer some interesting flexibilities that we are just now beginning to
exploit, so the future of this approach is very bright indeed. And if I read my competitors' announcements correctly, even our good friends at IBM are embracing this approach with future releases of AIX. So clearly there is something to this thing called OS Virtualization.
And there are other approaches as well - hybrids of the types we have been discussing. Special purpose libraries that either replace or interpose between common system libraries can provide some very nice virtualization capabilities - some of these transparent to applications, some not. The open source project Wine is a good example of this. User mode Linux and its descendants offer some ability to run an operating system as a user mode program, albeit not particularly efficiently.
QEMU is an interesting general purpose ISA simulator/translator that can be used to host non-native operating systems (such as Windows while running Solaris or Linux). The interesting thing about QEMU is that you can strip out the translation features with a special kernel module (kqemu), and the result is very efficient and nicely performing OS hosting (essentially simulating x86 running on x86). Kernel-based Virtual Machine (KVM) extends the QEMU capability to add yet another style of virtualization to Linux. It is not entirely clear at present whether KVM is really a better idea or just another not invented here (NIH) Linux project. Time will tell, but it would have been nice for the Linux kernel maintainers to take a page from OpenSolaris and embrace an already existing project that had some non-Linux vendor participation (*BSD, Solaris, Plan 9, plus some mainstream Linux distributions). At the very least it is confusing, as most experienced IT professionals will associate KVM with Keyboard Video and Mouse switching products. There are other commercial products such as QuickTransit that use a similar approach (ISA translation).
And there are many many more.
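As a quick recap, the technologies mentioned so far can be grouped by the style of virtualization each was described as using. This is just a summary table in code form. The style labels are the informal ones used in this post, not an industry standard taxonomy, and the variable and function names are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List

# Technology -> style, using the informal labels from this post.
STYLE_BY_TECHNOLOGY: Dict[str, str] = {
    "VMware (ESX, Workstation)": "abstraction",
    "Xen": "abstraction",
    "Java Virtual Machine": "abstraction",
    "LDOMs": "partitioning",
    "IBM LPARs": "partitioning",
    "Solaris zones": "OS virtualization",
    "Wine": "hybrid",
    "User mode Linux": "hybrid",
    "QEMU": "ISA translation",
    "QuickTransit": "ISA translation",
}


def group_by_style(styles: Dict[str, str]) -> Dict[str, List[str]]:
    """Invert the mapping so each style lists the technologies that use it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for technology, style in styles.items():
        groups[style].append(technology)
    return dict(groups)


if __name__ == "__main__":
    for style, technologies in group_by_style(STYLE_BY_TECHNOLOGY).items():
        print(f"{style}: {', '.join(technologies)}")
```

No single row in that table has any special claim to the word "true".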
So clearly the phrase "true virtualization" has no common or useful meaning. Questioning the application or definition of the phrase will likely uncover a predisposition or bias that might be a good starting point to carry on an interesting dialog. And that's always a good idea.
I leave you with one last thought. It is probably human nature to seek out the one uniform solution to all of our problems, the Grand Unification Theory being a great example. But in general, be skeptical of one-size-fits-all approaches - while they may in fact fit all situations, they are generally neither efficient nor flattering. What does this have to do with virtualization? Combining various techniques quite often will yield spectacular results. In other words, don't think VMware vs. Zones - think VMware and Zones. In fact, if you think Solaris, don't even think about zones, just do zones. If you need the additional abstraction to provide flexibility (heterogeneous or multiple version OS support) then use VMware or LDOMs. And zones.
Next time we'll take a look at abstraction style virtualization techniques and see if we can develop a method of predicting the overhead that each technique might impose on a system. Since a good apples to apples benchmark is not likely to ever see the light of day, perhaps some good old fashioned reasoning can help us make sense of what information we can find.
Bob Netherton is a Principal Sales Consultant for the North American Commercial Hardware group, specializing in Solaris, Virtualization and Engineered Systems. Bob is also a contributing author of Solaris 10 Virtualization Essentials.
This blog will contain information about all three, but primarily focused on topics for Solaris system administrators.