Everything you want and need to know about Oracle SPARC systems performance

SPARC T3 Cryptography Performance: Over 1.9x Increase in Throughput over UltraSPARC T2 Plus

Guest Author

In this study, the pk11rsaperf cryptographic microbenchmark program was used to compare the throughput performance of the UltraSPARC T2 Plus and SPARC T3 processors.

  • For standard RSA 1024-bit public-key cryptography, the SPARC T3 showed a 1.93x throughput improvement over the UltraSPARC T2 Plus when performing multiple decryptions of a reference text encrypted with a fixed key pair.
  • The SPARC T3 achieved nearly 80,000 ops/s on the same RSA 1024-bit decryption workload in a 1U server.

As security has taken on unprecedented importance in all facets of the IT industry, organizations are proactively adopting cryptographic mechanisms to protect their business information from unauthorized access and to ensure its confidentiality and integrity during transit and storage.

Cryptographic operations are heavily compute-intensive. They burden the host system with additional CPU cycles and network bandwidth, which significantly degrades the overall throughput of the system and its hosted applications.

Oracle's T-series systems based on the SPARC T3 processor provide the industry's fastest on-chip hardware cryptographic capabilities, accelerating the following algorithms and functions:

  • RSA, DSA
  • Diffie-Hellman (key pair gen, derive)
  • Elliptic Curve (ECDH, ECDSA, key pair gen)
  • MD5, SHA1, SHA256, SHA384, SHA512
  • Hardware RNG

In contrast, the Intel Westmere processor only adds instructions to accelerate AES.

  • "The Intel AES-NI consists of seven instructions. Six of them offer full hardware support for AES. Four instructions support AES encryption and decryption, and the other two instructions support AES key expansion. The seventh aids in carry-less multiplication. The AES instructions have the flexibility to support all usages of AES, including all standard key lengths, standard modes of operation, and even some nonstandard or future variants." Reference
  • Further, Westmere's AES-NI instructions are not hypervisor aware: VM guests do not use the feature when given workloads, and the Java Cryptography Extension does not provide an AES-NI library.

Performance Landscape

PK11 RSA 1024-bit Benchmark Test

Processor            Processes   Threads per Process   Total Threads   Performance (ops/sec)
SPARC T3                     8                    16             128                  79,558
SPARC T3                     2                    64             128                  76,877
SPARC T3                     1                   128             128                  52,660
UltraSPARC T2 Plus           8                     8              64                  41,285
UltraSPARC T2 Plus           2                    32              64                  39,823
UltraSPARC T2 Plus           1                    64              64                  39,856
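
The headline ratios can be rechecked from the figures in the table above. The short Python sketch below does only that arithmetic; the dictionary layout and the helper names `best` and `gain` are illustrative choices, not part of the benchmark suite.

```python
# Throughput figures (ops/sec) copied from the benchmark table.
# Keys are (processes, threads_per_process); this layout is an assumption
# made for illustration only.
RESULTS = {
    "SPARC T3": {(8, 16): 79558, (2, 64): 76877, (1, 128): 52660},
    "UltraSPARC T2 Plus": {(8, 8): 41285, (2, 32): 39823, (1, 64): 39856},
}


def best(processor):
    """Highest ops/sec measured for a processor across its configurations."""
    return max(RESULTS[processor].values())


def gain(new, old):
    """Relative improvement of `new` over `old`, as a fraction."""
    return new / old - 1.0


# Peak-to-peak comparison between the two processors.
speedup = best("SPARC T3") / best("UltraSPARC T2 Plus")

# Effect of splitting 128 threads across 2 processes instead of 1 on the T3.
t3 = RESULTS["SPARC T3"]
two_vs_one = gain(t3[(2, 64)], t3[(1, 128)])

print(f"T3 vs T2 Plus peak speedup: {speedup:.2f}x")
print(f"T3, 2 processes vs 1 process: {two_vs_one:.1%}")
```

The same `gain` helper applied to the UltraSPARC T2 Plus rows shows almost no difference between 1 and 2 processes, which matches the table.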

Results and Configuration Summary

Hardware Configuration:

1 x 1.65 GHz SPARC T3
Sun SPARC Enterprise T5240
2 x 1.6 GHz UltraSPARC T2 Plus (1 blacklisted)

Software Configuration:

Oracle Solaris 10 10/09

Benchmark Description

The RSA/AES-256 Cryptography benchmark suite was internally developed by Sun to measure maximum throughput of RSA private key (sign) operations and AES-256 operations that a system can perform. Multiple processes are used to achieve the maximum throughput.

pk11rsaperf measures the performance of RSA 1024-bit processing as performed by the Solaris Cryptographic Framework via the PKCS#11 API. Different data sizes and varying numbers of concurrent threads can be tested. The metric is ops/sec.
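
To make the ops/sec metric concrete, here is a minimal Python sketch of a throughput harness that runs a fixed operation on a chosen number of concurrent threads and reports operations per second. It is not pk11rsaperf and does not call PKCS#11: the workload is a stand-in 1024-bit modular exponentiation on arbitrary numbers (not a valid RSA key), and the names `one_op` and `measure_ops_per_sec` are hypothetical.

```python
import threading
import time

# Stand-in workload (assumption): one 1024-bit modular exponentiation, the
# core arithmetic of an RSA private-key operation. These are arbitrary
# values chosen for size, NOT a real RSA key pair.
MODULUS = (1 << 1024) - 159
EXPONENT = (1 << 1023) + 12345
BASE = 0xC0FFEE


def one_op():
    """Perform a single stand-in operation."""
    return pow(BASE, EXPONENT, MODULUS)


def measure_ops_per_sec(num_threads, duration_s):
    """Run one_op() on num_threads threads for about duration_s seconds."""
    counts = [0] * num_threads        # one counter per thread, no sharing
    start_gate = threading.Event()    # release all workers together
    stop = threading.Event()          # tell workers to finish

    def worker(index):
        start_gate.wait()
        while not stop.is_set():
            one_op()
            counts[index] += 1

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()

    started = time.perf_counter()
    start_gate.set()
    time.sleep(duration_s)
    stop.set()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - started

    return sum(counts) / elapsed


for n in (1, 2, 4):
    print(f"{n} thread(s): {measure_ops_per_sec(n, 0.3):,.0f} ops/sec")
```

In CPython the global interpreter lock serializes this pure-Python workload, so adding threads here will not raise throughput the way hardware threads do on the processors tested; the sketch only shows how the metric is computed. Spreading the threads across several processes, as the benchmark runs do, is not modeled.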

Key Points and Best Practices

  • When running the SPARC T3 at full capacity, use at least 2 processes (64 threads each) for RSA processing: this yields over 45% more throughput than a single process with 128 threads.

Disclosure Statement

Copyright 2010, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/20/2010.

Comments ( 1 )
  • rick jones Thursday, September 30, 2010

    The benchmark description says that different data sizes can be tested - how "big" is an op here?

    I think the "Total Processes" column is actually "Total Threads" yes? And on the topic of processes vs threads, why the need for processes rather than threads on the T3? Particularly since that requirement doesn't seem to be present (to nearly the same degree) for the T2+.