Tuesday Jun 21, 2016

Oracle VM Server for SPARC - virtual HBA (vHBA) enhancements

Oracle VM Server for SPARC 3.4 was released in May. This blog entry describes enhancements to the virtual HBA (vHBA) feature, and (bonus!) describes how NPIV can be used to control LUN assignments.[Read More]

Wednesday Jun 01, 2016

Oracle VM Server for SPARC 3.4 - scalability improvements

Oracle VM Server for SPARC 3.4 has been released, with improved capabilities for scale, security, availability, and advanced networking support. This blog entry describes some of the enhancements that increase scalability.
[Read More]

Friday May 06, 2016

Upgrading Oracle VM Manager to version 3.4.1

Last month I blogged on Oracle VM upgrades: one blog entry on upgrading Oracle VM Manager to version 3.3.4, and another on upgrading to the corresponding SPARC OVS agent. Today, I upgraded the same systems to Oracle VM Manager 3.4.1 and ovs-agent 3.4.1.

[A note on release numbers: Oracle VM Manager's release is 3.4.1, and so is the ovs-agent that runs on the SPARC server. The actual hypervisor, Oracle VM Server for SPARC, formerly called logical domains (and frequently still called by the original name), is currently at release 3.3. Close, but not exactly aligned.]

Today's process was exactly the same as before, and proceeded without any excitement. I upgraded the Manager, and then the server component. If anyone asks via comment I'll post the commands I used, as I did last time, but it will look exactly the same except for file names and release numbers.

This was uneventful. I won't say "boring", because people might find it interesting ;) but it proceeded as expected without any excitement. System upgrades should be uneventful! Now my lab systems are at the current releases of Oracle VM Manager, ovs-agent, and logical domains.

Friday Apr 22, 2016

New whitepaper: Optimizing Oracle VM Server for x86 Performance

I am very pleased to announce publication of a new whitepaper, "Optimizing Oracle VM Server for x86 Performance". This article contains material previously posted on this blog, plus additional technical information and features newly introduced with Oracle VM 3.4.

Wednesday Apr 20, 2016

Upgrading Oracle VM Server to version 3.3.4

This blog entry shows the step-by-step procedure I used to upgrade Oracle VM Server from version 3.3.3 to 3.3.4, corresponding to the upgrade performed last week for Oracle VM Manager.[Read More]

Tuesday Apr 12, 2016

Upgrading Oracle VM Manager to version 3.3.4

This blog entry shows the step-by-step procedure I used to upgrade Oracle VM Manager from version 3.3.3 to 3.3.4. No muss, no fuss.[Read More]

Thursday Mar 24, 2016

Oracle VM 3.4.1 and new performance features

Oracle VM 3.4.1 has just been released, with important new features and improved performance and scalability. This blog describes a new feature that can be used to further improve disk device performance on Oracle VM 3.4.1, and also on the recent maintenance release Oracle VM 3.3.4.[Read More]

Friday Mar 18, 2016

Root domains and I/O on SPARC M7

Please see the excellent blog entry on root domains and how they've changed (for the better) on SPARC M7 servers at the blog article Complex Root Domains. The article refers to SR-IOV but doesn't discuss it, in order to focus on root domains, but SR-IOV also remains available on M7 systems for physical I/O with high resource granularity.

Monday Nov 30, 2015

Oracle VM Performance and Tuning - Part 5

The fifth article in this series on Oracle VM performance focusses on Oracle VM Server for x86 domain types, huge pages, and CPU scheduling controls.[Read More]

Tuesday Nov 24, 2015

Oracle VM Performance and Tuning - Part 4

The fourth article in this series on Oracle VM performance focusses on Oracle VM Server for x86 CPU and memory performance, with guidance on how to reduce memory latency and control CPU allocation.
[Read More]

Wednesday Nov 11, 2015

Virtual HBA in Oracle VM Server for SPARC

Oracle VM Server for SPARC 3.3 added an important new feature, virtual HBA (vHBA), which adds flexibility and relieves prior limitations of virtual I/O without sacrificing performance. This blog entry describes this new feature and shows how to use it.
[Read More]

Friday Nov 06, 2015

Oracle VM Performance and Tuning - Part 3

The third article in this series on Oracle VM performance focusses on performance goals for virtual machine and cloud environments, and on performance principles for CPU, memory, and I/O that behave differently in VM environments.[Read More]

Thursday Oct 08, 2015

Oracle VM Server for SPARC Best practices: naming virtual network devices

This blog shows a simple usability best practice to make it easier to identify network resources using 'ldm list-netdev'.[Read More]

Friday Oct 02, 2015

Oracle VM Performance and Tuning - Part 2

This article on Oracle VM performance reviews general performance principles, and follows with a review of Oracle VM architectural features that affect performance. This will be high-level as a basis for more technical detail in subsequent articles.

How to evaluate and measure performance (short version)

First, let's consider ways to not evaluate performance. Performance is often stated in unquantified generalities ("Give good response time!") or complaints ("Response time is terrible today. Fix it!"). That doesn't help understand the performance situation.

Another habit is to look at resource utilization without relevance to delivered performance. For example:

  • High CPU utilization. Is 95% CPU busy bad or good?

    High CPU busy may just mean you're getting your money's worth from your servers. It is only a problem if service level objectives aren't being met due to resource starvation. However, it could be a symptom of a problem (program looping, excessive error handling, etc.) rather than being the problem itself.

  • Low CPU utilization. Is 12% CPU busy bad or good?

    Low CPU may mean the workload is idle, or it's a single threaded app on a server with many CPUs. (Is this an 8 CPU machine where one CPU is 100% busy while the others are idle?)

    Applications are often unable to drive CPU because they are waiting on I/O, or because memory is overcommitted and they are thrashing. Low utilization might be innocent, or it can indicate a bottleneck elsewhere.

So, raw utilization numbers are not good or bad in themselves. They can be clues when used in the right context - "it depends". Another trap is to use average numbers, which can hide peak loads and spikes.
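To make the point about averages concrete, here is a small Python sketch. All of the numbers, and the names `per_cpu_busy` and `samples`, are made up for illustration; they are not measurements from a real system. It shows an 8 CPU machine reporting low average utilization while one CPU is saturated, and a time series whose average hides a spike.

```python
from statistics import mean

# Hypothetical snapshot: percent busy for each of 8 CPUs.
# One single-threaded application saturates CPU 0; the others are idle.
per_cpu_busy = [100, 0, 0, 0, 0, 0, 0, 0]

average_busy = mean(per_cpu_busy)  # what a system-wide summary reports
busiest_cpu = max(per_cpu_busy)    # what the application actually experiences

# Hypothetical time series: percent busy sampled once a minute for 10 minutes.
samples = [10, 12, 9, 11, 98, 97, 10, 12, 11, 10]

average_over_time = mean(samples)  # smooths the two-minute spike away
peak = max(samples)                # the spike that users would have felt

print(f"average across CPUs: {average_busy}%, busiest CPU: {busiest_cpu}%")
print(f"average over time: {average_over_time}%, peak: {peak}%")
```

In both cases the average looks harmless, while the maximum points at where a bottleneck might be.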

There are other popular measurements that don't help, such as microbenchmarks that don't look at all like the actual workload, or measuring how long it takes to run the dd command to write a gigabyte of zeroes when the expected workload is random I/O. Unless the purpose of the system is to use dd to write zeroes, it's of limited utility. Measuring the wrong thing because it's easy is so common that it has its own name, the streetlight effect.

Instead, performance analysts and systems administrators should measure against the requirements of their business users, using service level objectives stated in external terms: meeting a deadline to run a particular task (such as: get payroll out, post and clear stock trades, or close the books at end of quarter), or meeting response times for different types of transaction (load a web page, do a stock quote, transact a purchase or trade). Performance objectives are commonly expressed in the form of response times ("95% of a specified transaction must complete in X seconds at a rate of N transactions per second") or in throughput rates ("handle N payroll records in H hours").
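A response time objective of this form can be checked directly against measured data. The following Python sketch is one way to do it, under stated assumptions: the function names `percentile` and `meets_objective` are hypothetical, the nearest-rank percentile method is my choice (other definitions exist), and the response times are invented sample data.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

def meets_objective(response_times, pct, limit_seconds):
    """True if pct percent of transactions completed within limit_seconds."""
    return percentile(response_times, pct) <= limit_seconds

# Hypothetical response times, in seconds, for 20 transactions.
response_times = [0.8, 0.9, 1.1, 0.7, 1.0, 0.9, 1.2, 0.8, 1.0, 0.9,
                  1.1, 0.8, 0.9, 1.0, 1.3, 0.9, 0.8, 1.0, 1.1, 4.5]

# Objective: 95% of transactions must complete within 2 seconds.
print(meets_objective(response_times, 95, 2.0))
```

With this sample data the objective is met even though one transaction took 4.5 seconds, because the objective is stated for 95% of transactions rather than for all of them. Stating the percentile explicitly is what makes the objective testable.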

In the preceding paragraphs, I've touched on essential performance concepts: throughput, response time, latency, utilization, service levels. We'll use those terms to relate to virtual machine performance.

Oracle VM - architectural review

Some may find the preceding material too abstract and not specific to virtualization, so let's change gears and discuss the architecture of the Oracle VM hypervisors.

The Oracle VM family includes Oracle VM Server for x86 and Oracle VM Server for SPARC, two hypervisors that optionally share Oracle VM Manager as a common management infrastructure. There is also a desktop virtualization product, Oracle VM VirtualBox. VirtualBox is popular for end-user virtual machines on a desktop or laptop, but is out of scope for this series of articles.

Oracle VM Server for x86 and Oracle VM Server for SPARC have architectural similarities and differences. Both use a small hypervisor in conjunction with a privileged virtual machine ("domain") for administration, and for virtual and physical device management. The hypervisor resides in firmware on SPARC, and in software on x86. That contrasts with traditional virtual machine systems that use a monolithic hypervisor kernel for system control and device management.

Oracle VM Server for x86 is based on Xen virtualization technology, and uses a "dom0" (domain 0) as an administrative control point and to provide virtual I/O device services to the guest VMs ("domU"). Oracle VM Server for SPARC uses a small firmware-based hypervisor coupled with a "control domain" that can be compared to dom0 on x86, with the option of having multiple "service domains" for resiliency.

The two products also have similarities and differences in how they handle system resources:

CPU

  • Oracle VM Server for x86: CPUs can be shared, oversubscribed, and timesliced using a share-based scheduler. CPUs can be allocated cores (or CPU threads if hyperthreading is enabled). The number of virtual CPUs in a domain can be changed while the domain is running.

  • Oracle VM Server for SPARC: CPUs are dedicated to each domain with static assignment when the domain is "bound". Domains are given exclusive use of some number of CPU cores or threads, which can be changed while the domain is running.

Memory

  • Oracle VM Server for x86: Memory is dedicated to each domain, no over-subscription. The hypervisor attempts to assign a VM's memory to a single NUMA node, and has CPU affinity rules to try to keep a VM's virtual CPUs near its memory for local latency.

  • Oracle VM Server for SPARC: Memory is dedicated to each domain, no over-subscription. The hypervisor attempts to assign memory on a single NUMA node, and allocate CPUs on the same NUMA node for local latency.

I/O

  • Oracle VM Server for x86: Guest VMs are provided virtual network, console, and disk devices provided by dom0.

  • Oracle VM Server for SPARC: Guest VMs are provided virtual HBA, network, console, and disk devices provided by control domain and optional service domains. VMs can also use physical I/O with direct connection to SR-IOV virtual functions or PCIe buses.

Domain types

  • Oracle VM Server for x86: Guest VMs (domains) may be hardware virtualization (HVM), paravirtualized (PV) or hardware virtualized with PV device drivers.

  • Oracle VM Server for SPARC: Guest VMs (domains) are paravirtualized.

That's a lot of similarity for two products with different origins. When I'm asked for a quick summary, I say that the two products have a common memory model (VM memory is fixed, not overcommitted or swapped - very important), but different CPU models (Oracle VM Server for SPARC uses dedicated CPUs on servers that have lots of them, while x86 uses a more traditional software-based scheduler that time-slices virtual CPUs onto physical CPUs). Both products are aware of NUMA effects and try in different ways to reduce remote memory latency from CPUs. Both have virtual network and virtual disk devices, but the SPARC side has additional options for device backends and non-virtualized I/O. Finally, the x86 side has more domain types, reflecting the wide range of x86 operating systems.
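The difference between the two CPU models can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the actual algorithm used by either hypervisor: the function names, VM names, and share values are all hypothetical, and the share-based function assumes every VM is continuously busy.

```python
def proportional_share(shares, physical_cpus):
    """Share-based model (simplified): when all VMs are busy, CPU capacity
    is divided in proportion to each VM's share value."""
    total = sum(shares.values())
    return {vm: physical_cpus * s / total for vm, s in shares.items()}

def dedicated_assignment(requests, physical_cpus):
    """Dedicated model (simplified): each domain gets exclusive use of the
    CPUs it asks for, so the requests cannot exceed what physically exists."""
    if sum(requests.values()) > physical_cpus:
        raise ValueError("dedicated CPUs cannot be oversubscribed")
    return dict(requests)

# Hypothetical share values for three VMs contending for 4 physical CPUs.
allocation = proportional_share({"vm_a": 256, "vm_b": 256, "vm_c": 512}, 4)
print(allocation)

# Hypothetical dedicated requests on a server with 64 CPU threads.
print(dedicated_assignment({"dom_a": 16, "dom_b": 32}, 64))
```

In the share-based sketch, what a VM receives depends on what the other VMs are doing. In the dedicated sketch, a domain's CPU capacity is the same regardless of its neighbors, at the cost of not being able to oversubscribe.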

That's an introduction to the concepts. The next article (rubbing hands together in anticipation!) will delve more into the technologies and their performance implications.


For general performance analysis, I recommend Brendan Gregg's book Systems Performance: Enterprise and the Cloud. It has excellent content for any performance analyst, as well as details for various versions of Linux and Solaris.

Thursday Aug 06, 2015

Oracle VM Performance and Tuning - Part 1

This blog entry starts a series of articles on virtual machine performance, focussing (obviously) on Oracle VM Server on both x86 and SPARC, though also including general concepts.[Read More]


