Oracle's SPARC T7-1 server using Oracle VM Server for SPARC exhibits low network latency under virtualization. Network latency and bandwidth were measured using the Netperf benchmark.
TCP network latency between two Oracle VM Server for SPARC guests running on separate SPARC T7-1 servers each using SR-IOV is similar to that of two SPARC T7-1 servers without virtualization (native/bare metal).
TCP and UDP network latencies between two Oracle VM Server for SPARC guests running on separate SPARC T7-1 servers, each using assigned I/O, were significantly lower than with the other two I/O configurations (SR-IOV and paravirtual I/O).
TCP and UDP network latencies between two Oracle VM Server for SPARC guests running on separate SPARC T7-1 servers, each using SR-IOV, were significantly lower than with paravirtual I/O.
The following tables show the results for the TCP and UDP Netperf latency and bandwidth tests (single stream). Netperf latency, often called the round-trip time, is measured in microseconds (usec); smaller is better.
System Under Test:
The Netperf 2.6.0 benchmark was used to evaluate native and virtualized (LDoms) network performance. Netperf is a client/server benchmark that measures network performance through a number of independent tests. The tests used here were the omni Request/Response (aka ping-pong) test, with the TCP or UDP protocol, to obtain the Netperf latency measurements, and the TCP stream test for bandwidth. Netperf was run between separate servers connected back-to-back (no network switch) by a 10 GbE network interconnect.
To measure the cost of virtualization, the two servers were configured identically for each test: either native (without virtualization) or as a guest VM. In the virtual environment, several representative methods of connecting the guest to the network hardware (assigned I/O, paravirtualization, SR-IOV) were configured in the same way on each server.
Oracle VM Server for SPARC requires explicit partitioning of guests into Logical Domains of bound CPUs and memory, typically chosen to be local, and does not provide dynamic load balancing between guests on a host.
Oracle VM Server for SPARC guests (LDoms) were assigned 32 virtual CPUs (4 complete processor cores) and 64 GB of memory. The control domain served as the I/O domain (for paravirtualized I/O) and was assigned 4 cores and 64 GB of memory.
Each latency average reported was computed from the inverse of the reported throughput (similar to the transaction rate) of a Netperf Request/Response test run using 20 samples (aka iterations) of 30 second measurements of non-concurrent 1 byte messages.
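As an illustration of the conversion described above, the following minimal sketch turns a Request/Response transaction rate into a round-trip latency in microseconds. The function name and the example rate are hypothetical and are not measured results; the sketch assumes the rate is expressed in transactions per second.

```python
def latency_usec(transactions_per_sec: float) -> float:
    """Return round-trip latency in microseconds for a given
    Request/Response transaction rate (transactions per second).

    Valid only when each transaction is a single, non-overlapping
    request/response pair.
    """
    if transactions_per_sec <= 0:
        raise ValueError("transaction rate must be positive")
    # One transaction is one round trip, so latency is the inverse of the rate.
    return 1_000_000.0 / transactions_per_sec


# Hypothetical rate, for illustration only (not a result from these tests).
example_rate = 10_000.0
print(f"{latency_usec(example_rate):.1f} usec")
```

A rate of 10,000 transactions per second therefore corresponds to a round-trip time of 100 usec.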
To obtain a meaningful average latency from a Netperf Request/Response test, it is important that the transactions consist of single messages, which is Netperf's default. If, for instance, Netperf options for "burst" and "TCP_NODELAY" are turned on, multiple messages can overlap in the transactions and the reported transaction rate or throughput cannot be used to compute the latency.
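The effect of overlapping transactions can be seen in a small idealized model. The sketch below assumes perfect pipelining, where the transaction rate scales linearly with the number of transactions in flight; real systems will deviate from this, and the function name and numbers are hypothetical.

```python
def apparent_latency_usec(true_rtt_usec: float, in_flight: int) -> float:
    """Latency that would be inferred from the inverse transaction rate
    when `in_flight` transactions overlap (idealized, perfect pipelining).
    """
    if true_rtt_usec <= 0 or in_flight < 1:
        raise ValueError("need a positive round-trip time and at least one transaction")
    # With N transactions outstanding, N complete per round-trip time.
    rate_per_sec = in_flight / (true_rtt_usec / 1_000_000.0)
    # Inverting that rate no longer recovers the true round-trip time.
    return 1_000_000.0 / rate_per_sec


true_rtt = 100.0  # hypothetical round-trip time in usec
for n in (1, 2, 4):
    print(n, round(apparent_latency_usec(true_rtt, n), 3))
```

With one transaction in flight the inverse rate matches the true round-trip time. With four in flight, this model reports a quarter of it, which is why overlapped runs cannot be used to compute latency this way.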
All results were obtained with interrupt coalescence (aka interrupt throttling, interrupt blanking) turned on in the physical NIC and, if applicable, for the attachment driver in the guest. Interrupt coalescence is turned on by default on all the platforms used here.
All the results were obtained with large receive offload (LRO) turned off in the physical NIC, and, if applicable, for the attachment driver in the guest, in order to reduce the network latency between the two guests.
The Netperf bandwidth test used send and receive message sizes of 1 MB (1048576 bytes).
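The run parameters described above (20 iterations of 30 seconds with 1-byte requests and responses, and 1 MB stream messages) might map to Netperf command lines roughly as sketched below. This is an assumption, not the authors' actual invocation: it uses the classic TCP_RR, UDP_RR and TCP_STREAM test names rather than the omni test mentioned earlier, the helper names are hypothetical, and the host address is a placeholder.

```python
def rr_command(host: str, protocol: str = "TCP",
               seconds: int = 30, iterations: int = 20) -> list[str]:
    """Build a Netperf request/response command line (assumed mapping)."""
    return [
        "netperf", "-H", host,
        "-t", f"{protocol}_RR",                  # TCP_RR or UDP_RR
        "-l", str(seconds),                      # length of each measurement
        "-i", f"{iterations},{iterations}",      # max,min iterations
        "--",
        "-r", "1,1",                             # 1-byte request and response
    ]


def stream_command(host: str, message_bytes: int = 1024 * 1024) -> list[str]:
    """Build a Netperf TCP stream command line (assumed mapping)."""
    return [
        "netperf", "-H", host,
        "-t", "TCP_STREAM",
        "--",
        "-m", str(message_bytes),                # send message size
        "-M", str(message_bytes),                # receive message size
    ]


placeholder_host = "192.0.2.10"  # documentation address, not a real test host
print(" ".join(rr_command(placeholder_host)))
print(" ".join(rr_command(placeholder_host, protocol="UDP")))
print(" ".join(stream_command(placeholder_host)))
```

Neither command sets the burst or TCP_NODELAY options, so each request/response transaction remains a single message.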
The paravirtual variation of the measurements refers to the use of a paravirtualized network driver in the guest instance. IP traffic consequently is routed across the guest, the virtualization subsystem in the host, a virtual network switch or bridge (depending upon the platform), and the network interface card.
The assigned I/O variation of the measurements refers to the use of the card's driver in the guest instance itself. This is possible when the device is assigned exclusively to the guest. Device assignment results in less (software) routing for IP traffic and consequently less overhead than using paravirtualized drivers, but virtualization can still impose significant overhead. Note also that NICs used in this way cannot be shared amongst guests, and may preclude the use of certain other VM features such as migration. The T7-1 system has four on-board 10 GbE devices, but all of them are connected to the same PCIe branch, making it impossible to configure them as assigned I/O devices. A PCIe 10 GbE NIC, by contrast, can be configured as an assigned I/O device.
In the context of Oracle VM Server for SPARC and these tests, assigned I/O refers to PCI endpoint device assignment, while paravirtualized I/O refers to virtual I/O using a virtual network device (vnet) in the guest connected to a virtual switch (vsw) through the I/O domain to the physical network device (NIC).
Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 25 October 2015.