Tuesday Apr 29, 2014

SO_FLOW_SLA socket option in Solaris 11.2

We have added a new socket option, SO_FLOW_SLA, in Solaris 11.2. It allows an application to create socket level flows and set resource control properties on them using setsockopt(). This socket option requires the PRIV_SYS_FLOW_CONFIG privilege.

The setsockopt(3C) man page has all the details of the programming API. It is simple to use, as shown below -

sock_flow_props_t sprop;

sock = socket(AF_INET, SOCK_STREAM, 0);
sprop.sfp_version = SOCK_FLOW_PROP_VERSION1;
sprop.sfp_mask = SFP_MAXBW;
sprop.sfp_maxbw = 500000000; /* 500 Mbps */
setsockopt(sock, SOL_SOCKET, SO_FLOW_SLA, &sprop, sizeof (sprop));
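
The snippet above leaves out error handling. Here is a slightly fuller sketch, under a few assumptions: that the structure and constants are visible through <sys/socket.h> (the post does not name the header), and that sfp_maxbw is in bits per second, as the 500 Mbps comment suggests. set_flow_maxbw() is a hypothetical helper name, not part of any Solaris API. On systems that do not define SO_FLOW_SLA it simply fails with ENOTSUP.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: cap the bandwidth of an existing socket.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
set_flow_maxbw(int sock, uint64_t bits_per_sec)
{
#ifdef SO_FLOW_SLA
	sock_flow_props_t sprop;

	/* Zero the structure so that unused fields are well defined. */
	(void) memset(&sprop, 0, sizeof (sprop));
	sprop.sfp_version = SOCK_FLOW_PROP_VERSION1;
	sprop.sfp_mask = SFP_MAXBW;
	sprop.sfp_maxbw = bits_per_sec;

	/* Expected to fail if the caller lacks PRIV_SYS_FLOW_CONFIG. */
	return (setsockopt(sock, SOL_SOCKET, SO_FLOW_SLA,
	    &sprop, sizeof (sprop)));
#else
	/* Assumption: no SO_FLOW_SLA means the option is unavailable. */
	(void) sock;
	(void) bits_per_sec;
	errno = ENOTSUP;
	return (-1);
#endif
}
```

A caller would check the return value and report errno, for example with perror(), instead of ignoring the result of setsockopt().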

The flows created using setsockopt(3C) can be observed using flowadm(1M) and flowstat(1M), as well as pfiles(1).

Consider the example of the nc(1)/netcat tool which uses this socket option to implement the -M option.

a. Assume nc -l 80 is running on the remote host, and on the local host we run
# nc -M maxbw=100M 80

And in another window we observe -

# flowadm
24.sys.sock net1 tcp 38769 80 --

# flowadm show-flowprop
24.sys.sock maxbw rw 100 -- --
24.sys.sock priority rw -- medium low,medium,high

# pfiles `pgrep nc`
18827: nc -M maxbw=100M 80
3: S_IFSOCK mode:0666 dev:556,0 ino:5341 uid:0 gid:0 size:0
SO_FLOW_SLA(maxbw: 100.000 mbits/sec)
sockname: AF_INET port: 38769
peername: AF_INET port: 80
congestion control: newreno
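
The example passes the limit as a suffixed string (100M), while sfp_maxbw in the earlier snippet takes a plain number. Here is a minimal sketch of a converter between the two. parse_maxbw() is a hypothetical helper and is not taken from the nc source. It assumes decimal multipliers (K = 10^3, M = 10^6, G = 10^9), which is consistent with 100M showing up as 100.000 mbits/sec above, and it insists on an explicit suffix because this sketch does not want to guess the unit of a bare number.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert "500K", "100M" or "2G" into bits per
 * second. Returns 0 on success, -1 on malformed or overflowing input.
 */
int
parse_maxbw(const char *str, uint64_t *bits_per_sec)
{
	char *end;
	unsigned long long val;
	uint64_t mult;

	if (str == NULL || bits_per_sec == NULL ||
	    !isdigit((unsigned char)str[0]))
		return (-1);

	errno = 0;
	val = strtoull(str, &end, 10);
	if (errno != 0)
		return (-1);

	/* Exactly one suffix character must follow the digits. */
	switch (toupper((unsigned char)*end)) {
	case 'K':
		mult = 1000ULL;
		break;
	case 'M':
		mult = 1000000ULL;
		break;
	case 'G':
		mult = 1000000000ULL;
		break;
	default:
		return (-1);	/* missing or unknown suffix */
	}
	if (end[1] != '\0')
		return (-1);	/* trailing characters, e.g. "10MB" */

	if (val > UINT64_MAX / mult)
		return (-1);	/* result would not fit in 64 bits */

	*bits_per_sec = (uint64_t)val * mult;
	return (0);
}
```

With this helper, "100M" becomes 100000000, which could then be stored in sfp_maxbw before calling setsockopt().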

Priority flows and socket level flows in Solaris 11.2

A brief bit of background first - Crossbow flows have been available since Solaris 11. flowadm(1M) and flowstat(1M) are the admin commands. flowadm(1M) is used to enforce a bandwidth limit on a service by creating a flow and setting the 'maxbw' property on it. flowstat(1M) is used to observe the traffic on the flow. Note that a flow can be created without any property set on it, which is useful for observability.

We have extended Crossbow flows in Solaris 11.2 to support -
1. socket level (a.k.a. fine-grained) flows
2. A new flow property called 'priority'

1. socket level flows

One can now create a flow that corresponds to a listener socket or a fully-connected socket. To give an example, you can do the following in Solaris 11.2 -

a. Specify both local IP and local port attributes in a flow
# flowadm add-flow -l net0 -a transport=tcp,local_ip=,\
local_port=22 sshd-flow

b. Specify the 5-tuple attributes in a flow
# flowadm add-flow -l net0 -a transport=tcp,local_ip=,\
-p maxbw=800M custom-flow

In addition to extending the flowadm command, we have also introduced a new socket option, SO_FLOW_SLA, to allow a privileged socket application to create a socket level flow and set properties on it using setsockopt(). I will talk about that API in my next blog entry.

2. 'priority' flow property

We have also added a new property called 'priority'. From the man page -
Setting the value to 'high' on a flow has the effect that packets
classified to that flow are processed ahead of packets from normal flows
on the same link. A high priority flow may offer a
better latency depending on the availability of system resources.

One could use it for interactive and/or latency sensitive applications like sshd and ntp. For example, one could do
# flowadm set-flowprop -p priority=high sshd-flow
to ensure better latency for ssh traffic even when the system is heavily loaded by other networking traffic.

Thursday Mar 31, 2011

Change is the only constant

I moved from the Solaris security group to the Solaris networking group in late 2009. So far, it has been an exciting experience working on the networking stack. I will talk about some of this work later.

Friday Nov 20, 2009

KSSL is not impacted by SSL/TLS renegotiation vulnerability

... because it does not support client renegotiation or client certificates. One of the few benefits of fewer features :-). See the security blog for information on the impact on other Sun products.

Tuesday Sep 15, 2009

KSSL IPv6 support is in snv_124

Just a heads up that the KSSL IPv6 support RFE went into snv_124. SSL servers that serve IPv6 addresses - web servers like SJS web server and Apache web server, and application servers like Sun Glassfish and IBM Websphere - will now be able to offload the SSL processing to KSSL.

A side benefit of this change is that Apache httpd.conf works even with the directive "Listen <proxy_port>". Previously, one needed to do "Listen<proxy_port>".

Wednesday Jun 03, 2009

T5440 AES crypto performance

The following numbers from a kernel micro benchmark run on a T5440 show that the crypto stack scales nicely in the current build, snv_117. This micro benchmark calls crypto_encrypt() in a loop for CKM_AES_CBC mechanism with a 128-bit key.

#modload saes_scale_atomic (8192 byte input data size, crypto_encrypt() atomic call, in-place)

[Table: # Threads vs. Throughput in MBytes/sec]


So, why the decrease after 64 threads? It turned out to be because of too many thread context switches caused by the threads cv_wait'ing on a CWQ. Incidentally, there are 32 CWQ units on a T5440. I added the following line in n2cp.conf and redid the above tests -


#modload saes_scale_atomic (8192 byte input data size, crypto_encrypt() atomic call, in-place)

[Table: # Threads vs. Throughput in MBytes/sec]


There is a penalty for setting the spinners to 8 though, which is increased CPU consumption. In practice, a workload is unlikely to have more than 64 threads all doing crypto_encrypt() at the same instant. So, the default value of 1 will work fine.

Friday May 29, 2009

Removing that last impediment to scalability!

The following para from a paper by Bryan Cantrill and Jeff Bonwick captures my state of mind this week -

Prepare for the thrill of victory—and the agony of defeat. Making a system scale can be a frustrating pursuit: the system will not scale until all impediments to scalability have been removed, but it is often impossible to know if the current impediment to scalability is the last one. Removing that last impediment is incredibly gratifying: with that change, throughput finally gushes through the system as if through an open sluice.

I will follow up with details in my next post!

Wednesday May 20, 2009

Kernel SSL deep dive presentation

A while back, I gave a deep dive presentation on Kernel SSL to an internal audience.
I am making it available here.

Friday May 15, 2009

ksslcfg(1M) and the -T option on S10

ksslcfg(1M) has a -T option. From the man page -

-T token_label
         When pkcs11 is specified with -f, uses the PKCS#11 token
         specified in token_label. Use cryptoadm list -v to
         display all PKCS#11 tokens available.

and from the Examples section
         # ksslcfg create -f pkcs11 -T "Sun Software PKCS#11 softtoken" \
         -C "Server-Cert" -p /some/directory/password -u webservd \
         -x 8080 www.mysite.com 443

The above example does not work in S10 due to a bug (6507464) that will be fixed. A workaround is to disable metaslot before running the command and enable it afterwards. So, for the above example, do this -

# cryptoadm disable metaslot
# ksslcfg create -f pkcs11 -T "Sun Software PKCS#11 softtoken" ...
# cryptoadm enable metaslot


Tuesday Aug 07, 2007

UltraSPARC T2 (Niagara 2) and Solaris crypto

I am thrilled to see the cryptographic performance of the UltraSPARC T2 (Niagara 2) processor highlighted in today's launch. I would highly recommend listening to the audio clip on this page featuring Lawrence Spracklen, who talks about the crypto features of this processor. I have been working on making the Solaris crypto framework take full advantage of the hardware crypto acceleration. We focused both on reducing the latency and improving the scalability. One interesting fix is 65270071 and the follow-up fix 6533554. Maintaining good scalability while running with 64 threads means we can get the best overall crypto throughput.

Wednesday Jul 12, 2006

Kernel SSL proxy is now in Solaris 10 06/06

One of the new features in Solaris 10 06/06 is a kernel-level SSL proxy server. Kais Belgaied and I keep talking about blogging about this feature. But, for various reasons, I didn't get around to it till now :-). In this post, I will cover existing documentation for this feature.

First of all, due to an unfortunate slip up, the ksslcfg(1M) man page was not delivered in Solaris 10 06/06. But, you can find it here. Please note that this man page is for Solaris Express. The only difference is that one of the CLI options, -h ca_certchain_file, is available only in Solaris Express.

The 'network services' system administration guide also has a section on this feature. This guide covers configuring a Sun Java System Web Server or an Apache web server to use the kernel SSL proxy.

The Sun blueprint article by Ning Sun and Pallab Bhattacharya, here, talks about kernel SSL performance on a T2000 machine. Ning Sun also has an excellent blog entry here.


Friday Sep 30, 2005

Debugging tips for the Solaris crypto framework code

A while back, I wrote a document with various tips for debugging the crypto framework code (mainly in the kernel). I am making it available here since much of the crypto framework code is now in build 22. Please let me know if you have a tip that you would like to see included. Happy debugging!


Friday Aug 05, 2005

How to use crypto API in Solaris kernel code - Part 2

Continuing from my previous post, we look at how to use the crypto API in the asynchronous case. The header file, uts/common/sys/crypto/api.h, defines the crypto_call_req_t structure that needs to be passed in the asynchronous case. I am including a man page style description of this structure here.

As described in the man page, the default behavior is what is called an adaptive asynchronous mode (CRYPTO_ALWAYS_QUEUE flag is clear) as opposed to pure asynchronous mode (CRYPTO_ALWAYS_QUEUE flag is set). kCF consists of various crypto providers, some of them software-based and some hardware-based. For a given mechanism, a software provider is typically capable of doing the operation without needing kCF to block. The default behavior is appropriate for operations like digest or MAC, which do not take too many cycles. It may not be appropriate for operations like encrypt/decrypt, especially for public key ciphers like RSA. The caller needs to determine which behavior suits its case. In the default case, the caller is required to handle a CRYPTO_SUCCESS return value, since the operation may complete before the call returns. Of course, if the CRYPTO_ALWAYS_QUEUE flag is set, this is not the case.

Let us look at the ah_submit_req_inbound() code in uts/common/inet/ip/ipsecah.c.
   2826         AH_INIT_CALLREQ(&call_req);
   2828         ii->ipsec_in_skip_len = skip_len;
   2830         IPSEC_CTX_TMPL(assoc, ipsa_authtmpl, IPSEC_ALG_AUTH, ctx_tmpl);
   2832         /* call KEF to do the MAC operation */
   2833         kef_rc = crypto_mac_verify(&assoc->ipsa_amech,
   2834             &ii->ipsec_in_crypto_data, &assoc->ipsa_kcfauthkey, ctx_tmpl,
   2835             &ii->ipsec_in_crypto_mac, &call_req);
   2837         switch (kef_rc) {
   2838         case CRYPTO_SUCCESS:
   2839                 AH_BUMP_STAT(crypto_sync);
   2840                 return (ah_auth_in_done(ipsec_mp));
   2841         case CRYPTO_QUEUED:
   2842                 /* ah_callback() will be invoked on completion */
   2843                 AH_BUMP_STAT(crypto_async);
   2844                 return (IPSEC_STATUS_PENDING);
   2845         case CRYPTO_INVALID_MAC:
   2846                 AH_BUMP_STAT(crypto_sync);
   2847                 ah_log_bad_auth(ipsec_mp);
   2848                 return (IPSEC_STATUS_FAILED);
   2849         }

The call_req argument to crypto_mac_verify() is the one of interest here [1]. Note that this routine, ah_submit_req_inbound(), can be called from interrupt context while processing an incoming IPSec packet. So, we ensure we won't block by calling crypto_mac_verify() in asynchronous mode. call_req is set on line 2826 with the following macro

    #define AH_INIT_CALLREQ(_cr) {                                          \
            (_cr)->cr_flag = CRYPTO_SKIP_REQID|CRYPTO_RESTRICTED;           \
            if (ipsec_algs_exec_mode[IPSEC_ALG_AUTH] == IPSEC_ALGS_EXEC_ASYNC) \
                    (_cr)->cr_flag |= CRYPTO_ALWAYS_QUEUE;                  \
            (_cr)->cr_callback_arg = ipsec_mp;                              \
            (_cr)->cr_callback_func = ah_kcf_callback;                      \
    }

We specify a callback routine, ah_kcf_callback(), which kCF will call after completing the operation. The callback routine gets two arguments - cr_callback_arg (in this case ipsec_mp) and the status of the crypto operation. The callback routine must adhere to the same restrictions as a driver soft interrupt handler. Note that this code sets the CRYPTO_ALWAYS_QUEUE flag conditionally (by default, IPSEC_ALGS_EXEC_ASYNC is not set). This explains why we check for CRYPTO_SUCCESS on line 2838 after calling crypto_mac_verify().
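
To make the two modes concrete, here is a small userland mock of the control flow. It is not kCF code: every demo_* name is hypothetical, the return codes and flag value are arbitrary stand-ins, and the one-slot "queue" only models the idea that a queued request is completed later through the callback.

```c
#include <stddef.h>

/* Stand-ins for the kCF return codes and flag; values are arbitrary. */
enum { DEMO_SUCCESS = 0, DEMO_QUEUED = 1 };
#define	DEMO_ALWAYS_QUEUE	0x1

typedef void (*demo_callback_t)(void *arg, int status);

typedef struct demo_call_req {
	unsigned int	cr_flag;
	demo_callback_t	cr_callback_func;
	void		*cr_callback_arg;
} demo_call_req_t;

/* One-slot "queue" holding a request that has been deferred. */
static demo_call_req_t *demo_pending;

/*
 * Mock operation. can_run_inline models a provider that can finish
 * the work without blocking (typically a software provider).
 */
int
demo_submit(int can_run_inline, demo_call_req_t *cr)
{
	if (cr == NULL)
		return (DEMO_SUCCESS);	/* synchronous: caller may block */
	if (can_run_inline && !(cr->cr_flag & DEMO_ALWAYS_QUEUE))
		return (DEMO_SUCCESS);	/* adaptive: done, no callback */
	demo_pending = cr;		/* deferred: callback fires later */
	return (DEMO_QUEUED);
}

/* Complete the deferred request, as a worker thread would. */
void
demo_run_queue(void)
{
	if (demo_pending != NULL) {
		demo_call_req_t *cr = demo_pending;

		demo_pending = NULL;
		cr->cr_callback_func(cr->cr_callback_arg, DEMO_SUCCESS);
	}
}

/* Example callback: count successful completions in an int. */
void
demo_count_done(void *arg, int status)
{
	if (status == DEMO_SUCCESS)
		(*(int *)arg)++;
}
```

In this mock, a request with the flag clear and an inline-capable provider returns DEMO_SUCCESS and the caller finishes the work itself, which mirrors the CRYPTO_SUCCESS case in the switch above. With DEMO_ALWAYS_QUEUE set, the same request returns DEMO_QUEUED and the result arrives only when demo_run_queue() invokes the callback.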

This concludes a brief overview of using the crypto API. The best way to get more details is to look at more code. You can find all the consumers of this API by looking for files which include sys/crypto/api.h.

[1] crypto_mac_verify() takes more arguments than crypto_digest() from our previous example. The third argument is the key used for mac'ing. The fourth argument is a context template which is used to precompute things like a key schedule and reuse it several times later.


Tuesday Jun 14, 2005

How to use crypto API in Solaris kernel code

Now that OpenSolaris is a reality, we can finally blog about the source code. I will start with how to do crypto stuff from the kernel code.

Solaris 10 has the kernel crypto framework (kCF), which offers a crypto API for other kernel modules or drivers. IPSec and Kerberos are some of the components that make use of it. This API is not public yet and hence you won't see any man pages (it will very likely become public in Nevada). The complete list of API is in uts/common/sys/crypto/api.h.

Let us start with a simple digest operation. The digest API are
    crypto_digest(), crypto_digest_init(), crypto_digest_update(), crypto_digest_final()

If you are familiar with PKCS #11, these routines follow similar naming conventions, except that crypto_digest() does not need a prior crypto_digest_init() call. Let us look at crypto_digest(). The prototype is

    int crypto_digest(crypto_mechanism_t *mech, crypto_data_t *data, crypto_data_t *digest, crypto_call_req_t *cr);

The structures are defined in uts/common/sys/crypto/common.h. It is best to explain each argument by looking at an actual example. Looking in uts/common/io/cryptmod.c, we find
    rv = crypto_digest(&mech, &d1, &d2, NULL);

The first argument specifies the cryptographic mechanism to be used and its parameters.

    mech.cm_type = digest_type;
    mech.cm_param = 0;
    mech.cm_param_len = 0;

cm_type identifies the type of the mechanism. This field must be set to the value returned by crypto_mech2id(). crypto_mech2id() gets the kCF internal mechanism id assigned for a mechanism name. This call needs to be made only once since the id stays the same till a reboot. For example in this file, a SHA1 mechanism id is obtained

    sha1_hash_mech = crypto_mech2id(SUN_CKM_SHA1);

and it is passed as digest_type to the kef_digest() routine.

cm_param specifies the parameter for a mechanism. It is zero here. But, it needs to be specified for some mechanisms. For example, the IV for a CKM_DES_CBC mechanism is passed in this field. cm_param_len specifies the length of cm_param, in bytes.

The second argument describes the data that is to be digested.

    v1.iov_base = (void *)input;
    v1.iov_len = inlen;
    d1.cd_format = CRYPTO_DATA_RAW;
    d1.cd_offset = 0;
    d1.cd_length = v1.iov_len;
    d1.cd_raw = v1;

cd_format specifies the format of the data, which can be one of CRYPTO_DATA_RAW, CRYPTO_DATA_UIO or CRYPTO_DATA_MBLK. CRYPTO_DATA_RAW format means that the input is an iovec_t, which basically means a pointer to a buffer and its length. CRYPTO_DATA_MBLK format is useful in networking code where mblk_t structures are common. cd_offset specifies an offset from the beginning of the data. The digesting starts at that offset byte. cd_length specifies the length of the data to be used for the digesting. cd_raw is used to specify the iovec_t buffer. As can be guessed, this field is valid only if cd_format is equal to CRYPTO_DATA_RAW.

The third argument describes the output data.

    v2.iov_base = (void *)output;
    v2.iov_len = hashlen;
    d2.cd_format = CRYPTO_DATA_RAW;
    d2.cd_offset = 0;
    d2.cd_length = v2.iov_len;
    d2.cd_raw = v2;

The fourth argument describes the calling conditions. A NULL value, as is the case here, means that the caller is prepared to block till the operation completes. Callers in an interrupt context usually can't block and need to specify a value for this argument, making it an asynchronous interface.
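
The data descriptor fields can be tried out in userland with a small stand-in. The sketch below is not the kernel's crypto_data_t: the demo_* types are hypothetical mirrors of only the fields named in this post, the field types are guesses, and the bounds check is a demo-only sanity check. It shows how cd_offset and cd_length select a window inside a raw buffer.

```c
#include <stddef.h>

/* Hypothetical mirrors of the formats and fields described above. */
typedef enum {
	DEMO_DATA_RAW = 1,
	DEMO_DATA_UIO,
	DEMO_DATA_MBLK
} demo_format_t;

typedef struct demo_iovec {
	void	*iov_base;
	size_t	iov_len;
} demo_iovec_t;

typedef struct demo_crypto_data {
	demo_format_t	cd_format;
	size_t		cd_offset;
	size_t		cd_length;
	demo_iovec_t	cd_raw;		/* valid only for DEMO_DATA_RAW */
} demo_crypto_data_t;

/* Describe a whole buffer, the way d1 and d2 are set up above. */
void
demo_data_init_raw(demo_crypto_data_t *d, void *buf, size_t len)
{
	d->cd_format = DEMO_DATA_RAW;
	d->cd_offset = 0;
	d->cd_length = len;
	d->cd_raw.iov_base = buf;
	d->cd_raw.iov_len = len;
}

/*
 * Return the first byte that would be processed, or NULL if the
 * descriptor is not raw or the window does not fit in the buffer.
 */
const unsigned char *
demo_data_start(const demo_crypto_data_t *d)
{
	if (d->cd_format != DEMO_DATA_RAW)
		return (NULL);
	if (d->cd_offset > d->cd_raw.iov_len ||
	    d->cd_length > d->cd_raw.iov_len - d->cd_offset)
		return (NULL);
	return ((const unsigned char *)d->cd_raw.iov_base + d->cd_offset);
}
```

For instance, to skip a 6-byte header in a 13-byte buffer, one would set cd_offset to 6 and cd_length to 7 after calling demo_data_init_raw().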

One asynchronous example is in uts/common/inet/ip/ipsecah.c. I will talk about it in my next post.


Friday May 20, 2005

A brief history of /dev/random in Solaris

A /dev/random interface for Solaris first appeared as part of the unbundled SUNWski package in Solaris 7. /dev/random in SUNWski is actually implemented as a named pipe which is written to by a daemon process. A named pipe made sense because it was all done in user land. Starting with Solaris 9, /dev/random and /dev/urandom became device nodes, since a kernel-based implementation was done. This is also available as a patch on Solaris 8 (112438-03 for SPARC and 112439-02 for x86).

In Solaris 10, /dev/random supports hardware-based random number generators (RNG). It does so by using the kernel cryptographic framework (kCF). One cool thing about this feature is that existing applications which use /dev/random can get the random numbers from a hardware RNG *without* needing to be modified. A hardware RNG has to be registered with the kCF and implement random number generation routines to be usable by /dev/random. For more details about the kCF interfaces, see http://www.sun.com/bigadmin/features/articles/crypt_framework.html or send an email to solaris-crypto-api@sun.com. Another Solaris 10 enhancement was to make /dev/urandom scale much better on a multi-processor machine. We get near linear scaling for reads on /dev/urandom.
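
From an application's point of view the device is just a file, which is why no change is needed when the kernel starts using a hardware RNG. A minimal sketch of such a consumer follows. read_random_bytes() is a hypothetical helper name, and it uses only the standard open(2)/read(2)/close(2) calls.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: fill buf with len bytes from /dev/urandom.
 * Returns 0 on success, -1 on failure.
 */
int
read_random_bytes(void *buf, size_t len)
{
	unsigned char *p = buf;
	int fd;

	fd = open("/dev/urandom", O_RDONLY);
	if (fd < 0)
		return (-1);

	while (len > 0) {
		ssize_t n = read(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			(void) close(fd);
			return (-1);
		}
		if (n == 0) {			/* unexpected end of file */
			(void) close(fd);
			return (-1);
		}
		p += n;				/* short read: keep going */
		len -= (size_t)n;
	}

	(void) close(fd);
	return (0);
}
```

The same code works whether the bytes come from a software or a hardware generator, since that choice is made inside the kernel.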

I am an engineer in the Solaris kernel networking group.

