Tuesday Nov 22, 2011

HOWTO Turn off SPARC T4 or Intel AES-NI crypto acceleration.

Since we released hardware crypto acceleration for SPARC T4 and Intel AES-NI support we have had a common question come up: 'How do I test without the hardware crypto acceleration?'.

Initially this came up just for development use so developers can do unit testing on a machine that has hardware offload but still cover the code paths for a machine that doesn't (our integration and release testing would run on all supported types of hardware anyway).  I've also seen it asked in a customer context too so that we can show that there is a performance gain from the hardware crypto acceleration, (not just the fact that SPARC T4 much faster performing processor than T3) and measure what it is for their application.

With SPARC T2/T3 we could easily disable the hardware crypto offload by running 'cryptoadm disable provider=n2cp/0'.  We can't do that with SPARC T4 or with Intel AES-NI because in both of those classes of processor the encryption doesn't require a device driver instead it is unprivileged user land callable instructions.

Turns out there is away to do this by using features of the Solaris runtime loader (ld.so.1). First I need to expose a little bit of implementation detail about how the Solaris Cryptographic Framework is implemented in Solaris 11.  One of the new Solaris 11 features of the linker/loader is the ability to have a single ELF object that has multiple different implementations of the same functions that are selected at runtime based on the capabilities of the machine.  The alternate to this is having the application coded to call getisax() and make the choice itself.  We use this functionality of the linker/loader when we build the userland libraries for the Solaris Cryptographic Framework (specifically libmd.so, and the unfortunately misnamed due to historical reasons libsoftcrypto.so)

The Solaris linker/loader allows control of a lot of its functionality via environment variables, we can use that to control the version of the cryptographic functions we run.  To do this we simply export the LD_HWCAP environment variable with values that tell ld.so.1 to not select the HWCAP section matching certain features even if isainfo says they are present. 

For SPARC T4 that would be:

export LD_HWCAP="-aes -des -md5 -sha256 -sha512 -mont -mpmul" 

and for Intel systems with AES-NI support:

export LD_HWCAP="-aes"

This will work for consumers of the Solaris Cryptographic Framework that use the Solaris PKCS#11 libraries or use libmd.so interfaces directly.  It also works for the Oracle DB and Java JCE.  However does not work for the default enabled OpenSSL "t4" or "aes-ni" engines (unfortunately) because they do explicit calls to getisax() themselves rather than using multiple ELF cap sections.

However we can still use OpenSSL to demonstrate this by explicitly selecting "pkcs11" engine  using only a single process and thread. 

$ openssl speed -engine pkcs11 -evp aes-128-cbc
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc      54170.81k   187416.00k   489725.70k   805445.63k  1018880.00k

$ LD_HWCAP="-aes" openssl speed -engine pkcs11 -evp aes-128-cbc
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc      29376.37k    58328.13k    79031.55k    86738.26k    89191.77k

We can clearly see the difference this makes in the case where AES offload to the SPARC T4 was disabled. The "t4" engine is faster than the pkcs11 one because there is less overhead (again on a SPARC T4-1 using only a single process/thread - using -multi you will get even bigger numbers).

$ openssl speed -evp aes-128-cbc
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc      85526.61k    89298.84k    91970.30k    92662.78k    92842.67k

Yet another cool feature of the Solaris linker/loader, thanks Rod and Ali.

Note these above openssl speed output is not intended to show the actual performance of any particular benchmark just that there is a significant improvement from using hardware acceleration on SPARC T4. For cryptographic performance benchmarks see the http://blogs.oracle.com/BestPerf/ postings.

About

DarrenMoffat

Search

Categories
Archives
« April 2014
MonTueWedThuFriSatSun
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
17
18
19
20
21
22
23
24
25
26
27
28
29
30
    
       
Today