SPARC T4-1 Server Outperforms Intel (Westmere AES-NI) on IPsec Encryption Tests
By Brian on Sep 29, 2011
Oracle's SPARC T4 processor has significantly greater performance than the Intel Xeon X5690 processor when both are using Oracle Solaris 11 secure IP networking (IPsec). The SPARC T4 processor using IPsec AES-256-CCM mode achieves line speed over a 10 GbE network.
On IPsec, SPARC T4 processor is 23% faster than the 3.46 GHz Intel Xeon X5690 processor (Intel AES-NI).
The SPARC T4 processor is only at 23% utilization when running at its maximum throughput making it 3.6 times more efficient at secure networking than the 3.46 GHz Intel Xeon X5690 processor.
The 3.46 GHz Intel Xeon X5690 processor is nearly fully utilized at its maximum throughput leaving little CPU for application processing.
The SPARC T4 processor using IPsec AES-256-CCM mode achieves line speed over a 10 GbE network.
The SPARC T4 processor approaches line speed with fewer than one-quarter the number of IPsec streams required for the Intel Xeon X5690 processor to achieve its peak throughput. The SPARC T4 processor supports the additional streams with minimal extra CPU utilization.
IPsec provides general purpose networking security which is transparent to applications. This is ideal for supplying the capability to those networking applications that don't have cryptography built-in. IPsec provides for more than Virtual Private Networking (VPN) deployments where the technology is often first encountered.
Performance was measured using the AES-256-CCM cipher in megabits per second (Mb/sec) aggregate over sufficient numbers of TCP/IP streams to achieve line rate threshold (SPARC T4 processor) or drive a peak throughput (Intel Xeon X5690).
|Processor||GHz||AES Decrypt||AES Encrypt|
|B/W (Mb/sec)||CPU Util||Streams||B/W (Mb/sec)||CPU Util||Streams|
|– Peak performance|
|Intel Xeon X5690||3.46||8,000||83%||4,700||81%|
|– Load at which SPARC T4 processor performance crosses 9000 Mb/sec|
|Intel Xeon X5690||3.46||4,700||41%||3,200||47%|
128 GB memory
Oracle Solaris 11
Single 10-Gigabit Ethernet XAUI Adapter
48 GB memory
Oracle Solaris 11
Sun Dual Port 10GbE PCIe 2.0 Networking Card with Intel 82599 10GbE Controller
Driver Systems Configuration:
48 GB memory
Oracle Solaris 11
Dual 10-Gigabit Ethernet Fabric Expansion Module (FEM)
The results here are derived from runs of the Netperf 2.4.5 benchmark. Netperf is a client/server benchmark measuring network performance providing a number of independent tests, including the TCP streaming bandwidth tests used here.
Netperf is, however, a single network stream benchmark and to demonstrate peak network bandwidth over a 10 GbE line under encryption requires many streams.
The Netperf documentation provides an example of using the software to drive multiple streams. The example is not sufficient to develop the workload because it does not scale beyond a single driver node which limits the processing power that can be applied. This subsequently limits how many full bandwidth streams can be supported. We chose to have a single server process on the target system (containing either the SPARC T4 processor or the Intel Xeon processor) and to spawn one or more Netperf client processes each across a cluster of the driver systems. The client processes are managed by the mpirun program of the Oracle Message Passing Toolkit.
Tabular results include aggregate bandwidth and CPU utilization. The aggregate bandwidth is computed by dividing the total traffic of the client processes by the overall runtime. CPU utilization on the target system is the average of that reported by all of the Netperf client processes.
IPsec is configured in the operating system of each participating server transparently to Netperf and applied to the dedicated network connecting the target system to the driver systems.
Key Points and Best Practices
- Line speed is defined as data bandwidth within 10% of theoretical maximum bit rate of network line. For 10 GbE greater than 9000 Mb/sec bandwidth is defined as line speed.
- IPsec provides network security that is configured completely in the operating system and is transparent to the application.
- Peak bandwidths under IPsec are achieved only in aggregate with multiple client network streams to the target server.
- Oracle Solaris receiver fanout must be increased from the default to support the large numbers of streams at quoted peak rates.
ixgbenetwork driver relevant on servers with Intel 82599 10GbE controllers (driver systems and Intel Xeon target system) was limited to only a single receiver queue to maximize utilization of extra fanout.
- IPsec is configured to make a unique security association (SA) for each connection to avoid a bottleneck over the large stream counts.
- Jumbo frames are enabled (MTU of 9000) and network interrupt blanking (sometimes called interrupt coalescence) is disabled.
- The TCP streaming bandwidth tests, which run continuously for minutes and multiple times to determine statistical significance, are configured to use message sizes of 1,048,576 bytes.
- IPsec configuration defines that each SA is established through the use of a preshared key and Internet Key Exchange (IKE).
- IPsec encryption uses the Solaris Cryptographic Framework which applies the appropriate accelerated provider on both the SPARC T4 processor and the Intel Xeon processor.
- There is no need to configure a specific authentication algorithm for IPsec. With the Encapsulated Security Payload (ESP) security protocol and choosing AES-256-CCM for the encryption algorithm, the encapsulation is self-authenticating.
- Oracle Solaris IP Security (IPsec)
- Oracle Solaris Cryptographic Services
- IPsec - Wikipedia
- Netperf official site
- SPARC T4-1 Server
- Sun Fire X4270 M2 Server
Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/26/2011.