News, tips, partners, and perspectives for Oracle’s virtualization offerings

Oracle VM Performance and Tuning - Part 1

Jeff Savit
Product Management Senior Manager

Oracle VM Performance

I've been interested in virtual machine performance for a very long time, and while I've written Best Practices items that included performance, this post starts a series of articles exclusively focused on Oracle VM performance. Topics will include:

  • The performance landscape.
  • An update of a classic problem.
  • How to think about and evaluate performance.
  • Tips / best practices that apply to all Oracle VM products.
  • Tips / best practices specific to Oracle VM Server for x86 and the Private Cloud Appliance (PCA).
  • Tips / best practices specific to Oracle VM Server for SPARC.
  • Examples, lessons learned, summary.

I intend this series to be relatively high-level: essays that present concepts to think about, illustrated by some examples and with links to other resources for details.

The changed VM performance landscape

...then...

There was a time when working on virtual machine performance, or system performance in general, required fine tuning of many system parameters. CPU resources were scarce, with many virtual machines on servers with one or a few CPUs, so we carefully managed CPU scheduling parameters: priorities or weights for VMs, distinction of interactive vs. non-interactive VMs to prioritize access, duration of time slices. Memory was expensive and capacity was limited, so we aggressively over-subscribed memory (some VM systems still do, but it's less of a factor than it once was) and administered memory management: page locking and reservation, working set controls, lifetime of idle pages before displacement.

Since we had to page (sometimes loosely referred to as "swapping", though there is a difference), we created hierarchies of page and swap devices and spread the load over multiple disks. Disks were slow and uncached, so we sometimes individually positioned files to reduce rotational and seek latency. It took a lot of skilled effort to have systems perform well.

...and now...

These items have less impact in modern virtualization systems. Many of these issues have been eliminated or rendered less important by architecture, product design, or the abundance of system resources seen with today's systems. In general, administering plentiful resources for optimal performance is very different from apportioning scarce resources. In particular, Oracle VM products eliminate as much of the effort as possible, and design in best practices and performance choices to fit today's applications and hardware to perform well "out of the box".

Today's servers have lots of CPU capacity, and we don't need to run at 95% CPU busy (though we can, of course) to make them cost-effective, so we don't tweak CPU scheduling parameters to prevent starvation as we once did. Oracle VM Server for x86 lets you set CPU caps and weights as needed, and Oracle VM Server for SPARC dedicates CPU threads or cores directly to guests, so the topic simply evaporates on that platform. Having enough CPU cycles to get the job done is rarely the problem now. Instead, we now tune for scale and to handle the Non-Uniform Memory Access (NUMA) properties of large systems.

Neither Oracle VM platform over-subscribes memory, so we don't have to worry about managing working sets, virtual to real ratios, or paging performance. That's true in today's non-VM environments too, where it's safe to say that if you're swapping, performance has already suffered and you should just add memory. This eliminates an entire category of problematic performance management that often (in the bad old days) resulted in pathological performance issues. Friends don't let friends swap. Instead, what memory tuning remains is around NUMA latency and alignment with CPUs.

There are still performance problems and the need to manage performance remains - or why would I be writing this? Effort has moved to other parts of the technology stack - network performance is much more important than it once was, for instance. Workloads are more demanding, less predictable, and are subject to scale up/down requirements far beyond those of the earlier systems. There still are, and probably always will be, constraints on performance and scale that have to be understood and managed. That said, the goal of Oracle VM is to increasingly have systems "do the right thing" for performance with as little need for administrative effort as possible. I'll illustrate examples of that in this series of posts.

But first, a classic problem

Here's a problem that existed in the old days and persists today in different form, which makes it interesting to think about.

Let's say it was 8am on a Monday morning back when people logged into timeshared systems for their daily work. The first people to get to their desks, coffee in hand, would get really good response time reading their e-mail until the laggards showed up. Response time could degrade as users logged in if utilization of a key system component reached a performance sensitive level. However, suitably sized and tuned systems could handle this or other peaks related to business cycles. In my case it was tied to the open and close of the New York stock exchanges or timing of large derivative transactions.

Now suppose there was a system outage: when the system came back up, all the interrupted users would log in at the same time to resume their work, and performance could be terrible. All the users would place demand on the system at the same time, instead of the normal mix of busy and idle (think time) users, placing pressure on CPU, memory, and I/O, and single-threading any serialized portion of the system. Viewed from the perspective of "getting work done", it could often be faster to have fewer active people logged on, so they could complete their effort and drop to idle state, than have everyone trying at once. You could even see this with plain old batch jobs: if there was excessive multiprogramming, reducing the number of tasks competing for resources could make the entire collection of work run faster. It's the classic congestion effect, like too many cars on the same highway.
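To make the congestion effect concrete, here is a toy model in Python. Everything in it is an assumption made for illustration: the function names, the numbers, and especially the quadratic penalty applied once the combined working sets exceed RAM. It is not a measurement of any real system, only a sketch of why running fewer jobs at once can finish the whole collection sooner.

```python
def efficiency(active_jobs, working_set_gb, ram_gb):
    """Fraction of useful work done per unit of time.

    Assumption: full efficiency while all working sets fit in RAM,
    then a made-up quadratic penalty to stand in for thrashing.
    """
    demand = active_jobs * working_set_gb
    if demand <= ram_gb:
        return 1.0
    return (ram_gb / demand) ** 2


def total_time(jobs, batch_size, work_per_job, working_set_gb, ram_gb):
    """Elapsed time to finish all jobs, running at most batch_size at once.

    Jobs in a batch are identical and share the machine equally,
    so they all finish together before the next batch starts.
    """
    elapsed = 0.0
    remaining = jobs
    while remaining > 0:
        active = min(batch_size, remaining)
        elapsed += active * work_per_job / efficiency(active, working_set_gb, ram_gb)
        remaining -= active
    return elapsed


if __name__ == "__main__":
    # 20 jobs, 10 units of work each, 1 GB working set each, 8 GB of RAM.
    print("all 20 at once:", total_time(20, 20, 10, 1, 8))
    print("at most 8 at once:", total_time(20, 8, 10, 1, 8))
```

With these invented numbers, running all 20 jobs together takes about 1250 time units, while capping the multiprogramming level at 8 finishes the same work in 200. The exact ratio depends entirely on the assumed penalty; the point is only the shape of the effect.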


Systems of the day had clever controls to prevent excessive multiprogramming, usually driven by memory consumption to prevent thrashing on the small RAM capacities available: a subset of users whose working sets fit in RAM would be allowed to run, while other users had to wait. The fortunate users either finished their work and became idle so their memory could be re-used by others, or were evicted after a while to give somebody else a turn (that's where address space swapping comes in).
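The admission idea described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular operating system implemented it: the `admit` function, the user names, and the working set sizes are all hypothetical, and the policy shown (first fit in arrival order) is just one simple choice.

```python
from collections import deque


def admit(waiting, ram_mb):
    """Pick the users allowed to run so that their working sets fit in RAM.

    waiting: iterable of (user, working_set_mb) in arrival order.
    Returns (running, still_waiting). Assumed policy: first fit in
    arrival order, so a small later arrival may run ahead of a large one.
    """
    running = []
    still_waiting = deque()
    free_mb = ram_mb
    for user, working_set_mb in waiting:
        if working_set_mb <= free_mb:
            running.append(user)
            free_mb -= working_set_mb
        else:
            still_waiting.append((user, working_set_mb))
    return running, still_waiting


if __name__ == "__main__":
    queue = [("ann", 300), ("bob", 500), ("cy", 400), ("dee", 200)]
    running, still_waiting = admit(queue, ram_mb=1000)
    print("running:", running)
    print("waiting:", list(still_waiting))
```

When a running user goes idle or is evicted, the controller would call `admit` again on the waiting queue with the memory that was freed.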

That applied to the old timesharing systems, and applies even more to today's virtual machine systems, especially since the CPU, memory, and I/O footprint of starting up a VM is substantial. The problem is so well known that it has a name, the "boot storm" - a Google search of "boot storm" yields over 40 million hits. Let's consider a few reasons this has such a powerful effect:

  • Booting a virtual machine and starting apps may have a big CPU footprint. If VMs share physical CPUs then they compete (well, obviously) for cycles. Not usually a problem for Oracle VM Server for x86, and a non-issue for Oracle VM Server for SPARC where CPUs are dedicated.
  • A booting-up OS touches all its memory. If memory is oversubscribed as in some virtual machine systems, then the working set of active pages is going to exceed real memory so there is a risk of thrashing. Not an issue with Oracle VM since we don't oversubscribe memory.
  • The virtual machine needs to fetch all of its binaries and other OS contents from disk. This is a random read intensive, cache-hostile workload. By cache, I mean memory cache like the ZFS ARC (Adaptive Replacement Cache), or Solid State Disk (SSD) cache such as the ZFS L2ARC.
     
    • OS kernel images and other files needed at boot time are read once and never read again, so no cache re-use and reads must go to rotating media. In fact, that pollutes the cache with content not likely to be needed again for a long time.
    • Unless thin cloning is used, disk locations used by OS images are disjoint so there's no data sharing from media that could provide cache re-use. If thin clones are used on disk, the first VM to read a disk location pays for the I/O wait but promotes the contents to cache so other VMs sharing those locations benefit.
    • If the storage array was also rebooted, then any cached contents it had from before the outage are also discarded.
    • Architects sometimes put virtual machine boot disks on high capacity, low performance storage devices because they don't generate much I/O load. The exception is during a boot storm, when the thundering herd of booting VMs is an intensely I/O bound operation.

How can a boot storm be handled? A number of methods can be used:

  • Throw hardware at it, and treat it as an upper bound on resource consumption of memory, CPU, disk and network I/O.
  • Revisit decisions about what needs to be optimized. For example, during normal operation, boot disk I/O might be infrequent so not worth improving, but could be critical path during boot storms. The infrequent event (booting up) is the hot activity path, and maybe should be on higher performance media, possibly including SSD, but that can be expensive.
  • Control the order in which VMs start up: instead of letting them all start at once, stage VM boot up in order of business importance. That way no system resources are excessively stressed and business functions are restored to operational state sooner - without spending money.

These are well-known approaches to a problem that has been around for a very long time in one form or another.
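The staged start-up approach can be sketched as follows. This is a hedged illustration in Python, not an Oracle VM tool or API: `staged_boot`, the `priority` field, and the `start_vm` callable are all hypothetical names, and in practice `start_vm` would wrap whatever mechanism your environment uses to start a guest and wait until it is up.

```python
from concurrent.futures import ThreadPoolExecutor


def staged_boot(vms, start_vm, max_concurrent=2):
    """Start VMs in order of business importance, a few at a time.

    vms: list of dicts with "name" and "priority" (lower = more important).
    start_vm: hypothetical callable that starts one VM by name and
              returns when that VM is up.
    max_concurrent: upper bound on the number of VMs booting at once.
    """
    ordered = sorted(vms, key=lambda vm: vm["priority"])
    # The pool hands out queued work in submission order, so the most
    # important VMs start first and never more than max_concurrent
    # are booting at the same time.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        futures = [pool.submit(start_vm, vm["name"]) for vm in ordered]
        return [future.result() for future in futures]


if __name__ == "__main__":
    def fake_start(name):
        # Stand-in for a real start command; it only reports the name.
        print("starting", name)
        return name

    inventory = [
        {"name": "reporting", "priority": 3},
        {"name": "database", "priority": 1},
        {"name": "app-server", "priority": 2},
    ]
    staged_boot(inventory, fake_start, max_concurrent=2)
```

The concurrency limit is the knob that trades restart time against peak load on shared storage and CPU; a sensible value depends on the environment and is left as an assumption here.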

Summary

While the landscape has changed, and we no longer tune to the same factors we once managed, the need to manage performance has not disappeared. Further articles in this series will discuss different aspects of Oracle VM performance management. Some starting concepts:

  • Oracle VM on both x86 and SPARC is intended to 'engineer out' many of the administrative tasks formerly needed for virtual machine performance, through product design and features and by incorporating and publicising best practices.
  • Neither version oversubscribes memory, which eliminates many potential performance problems and is the right design for today's workloads. Virtual machines have poor locality of reference and trying to oversubscribe memory can lead to serious problems - including displacing the wrong pages when under load.
  • Oracle VM uses a simple and effective CPU allocation model, especially for SPARC, that is understandable, avoids starvation, and cuts out overhead.
  • There is still need to tune and manage performance, and some classic problems can still happen.

Resources

For additional resources about Oracle VM Server
