Soft Rings (pre-Crossbow or Crossbow Phase0)

Soft rings is a feature that I worked on recently, and the changes were putback into S10 update 2. The feature improves the performance of incoming network traffic by using a worker thread model of packet processing: incoming traffic lands on a soft ring, and a worker thread picks up each packet and delivers it to IP.

Let's step back for a minute and look at the problem we are trying to solve:

The FireEngine architecture introduced a per-CPU synchronization mechanism called the vertical perimeter inside the TCP/IP module. Vertical perimeters are implemented using a serialization queue abstraction called an squeue. A connection is bound to an squeue instance when the connection is initialized, and afterwards all packets for that connection are processed on the same squeue. New incoming connections are bound to the squeue of the CPU that took the interrupt. This gives better cache locality and higher network performance.
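To make the binding rule concrete, here is a minimal sketch in Python. It is only a toy model of the behaviour described above, not the actual kernel code; the names `Squeue`, `SqueueTable` and `deliver` are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Squeue:
    """Toy stand-in for a per-CPU serialization queue."""
    cpu_id: int
    processed: list = field(default_factory=list)


class SqueueTable:
    """Hypothetical model: one squeue per CPU plus connection bindings."""

    def __init__(self, ncpus):
        self.squeues = [Squeue(cpu) for cpu in range(ncpus)]
        self.bindings = {}  # connection tuple -> Squeue

    def deliver(self, conn, interrupt_cpu, payload):
        # The first packet of a connection binds it to the squeue of the
        # CPU that took the interrupt; later packets reuse that binding.
        sq = self.bindings.setdefault(conn, self.squeues[interrupt_cpu])
        sq.processed.append((conn, payload))
        return sq.cpu_id


table = SqueueTable(ncpus=4)
conn = ("10.0.0.1", 40000, "10.0.0.2", 80)
first = table.deliver(conn, interrupt_cpu=2, payload="SYN")
later = table.deliver(conn, interrupt_cpu=0, payload="DATA")
print(first, later)  # 2 2
```

Even though the second packet's interrupt lands on CPU 0, it is still processed on the squeue of CPU 2, which is where the cache locality comes from.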

Now, on systems with slow CPUs (clock speed below 1 GHz), a single CPU cannot handle an incoming load of 1 Gbps. Likewise, even faster CPUs cannot handle the load generated by 10 Gbps NICs. The solution is to fan out the load so that multiple CPUs handle it.

The current way of enabling this, setting ip_squeue_fanout to 1, is suboptimal (one could even say it is broken). With ip_squeue_fanout set to 1, each new incoming TCP connection is assigned a random squeue that could belong to any CPU in the system, and yet the packet could still get processed in the same context. This is bad because what you want is for the other CPU, the one that owns the squeue, to process the packets belonging to its squeue.

This problem is addressed by soft rings. A soft ring is an abstraction that simulates hardware Rx ring functionality in software. Multiple soft rings can be configured on a system (tunable: ip_soft_rings_cnt); by default, 2 soft rings are configured. Incoming traffic is made to land on one of the soft rings, and the soft ring holds a pointer to the squeue to which the packet has to be delivered. A worker thread is created for each soft ring; it picks up packets from the soft ring and delivers them to IP. The worker thread has affinity to the CPU to which the squeue belongs. All this helps in processing the packets efficiently.
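The flow described above can be sketched as a small user-space simulation in Python. This is an illustration under assumptions, not the kernel implementation: `SoftRing`, `pick_ring` and `receive` are hypothetical names, the hash-based ring selection is an assumed policy (how a ring gets chosen is not covered here), and CPU affinity is only recorded because Python threads cannot express it portably.

```python
import queue
import threading
import zlib


class SoftRing:
    """Toy soft ring: a packet queue, a target squeue CPU, one worker."""

    def __init__(self, ring_id, squeue_cpu):
        self.ring_id = ring_id
        self.squeue_cpu = squeue_cpu  # CPU whose squeue this ring feeds
        self.packets = queue.Queue()
        self.delivered = []           # stands in for "delivered to IP"
        self.worker = threading.Thread(target=self._drain, daemon=True)
        self.worker.start()

    def _drain(self):
        # Worker thread: pick packets off the ring and hand them on.
        while True:
            pkt = self.packets.get()
            if pkt is None:           # sentinel: stop the worker
                break
            self.delivered.append(pkt)

    def stop(self):
        self.packets.put(None)
        self.worker.join()


def pick_ring(conn, rings):
    # Assumed policy: a stable hash of the connection tuple.
    return rings[zlib.crc32(repr(conn).encode()) % len(rings)]


def receive(conn, payload, rings):
    """Interrupt-side path: enqueue on a soft ring and return at once."""
    ring = pick_ring(conn, rings)
    ring.packets.put((conn, payload))
    return ring.ring_id


rings = [SoftRing(i, squeue_cpu=i + 1) for i in range(2)]  # default of 2 rings
conn = ("10.0.0.1", 40000, "10.0.0.2", 80)
ring_ids = {receive(conn, seq, rings) for seq in range(5)}
for ring in rings:
    ring.stop()
print(len(ring_ids))  # 1
```

The point of the sketch is the split of work: the receive path only enqueues, while the per-ring worker does the delivery, so the processing happens on the worker's side rather than in the context that received the packet.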

Other considerations:

Fanout based on the hardware/platform:

Consider Niagara processors. A Niagara processor contains multiple cores on a single chip, and each core in turn can run 4 threads. When doing software fanout, care is taken to have the incoming data handled by threads (each thread is counted as a CPU) in the same core that took the interrupt. This helps preserve interrupt-to-CPU/core affinity.

The same applies to AMD dual core processors. It would be optimal if the load can be fanned out to CPUs on the same core to capitalize on the shared L2 cache.
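As a rough illustration of core-local fanout, the sketch below picks soft ring CPUs from the same core as the CPU that took the interrupt. It assumes CPU ids are numbered contiguously within a core and that the interrupted CPU itself is skipped; both are assumptions made for the example, and the function names are hypothetical.

```python
def same_core_cpus(interrupt_cpu, threads_per_core, ncpus):
    """CPU ids that share a core with interrupt_cpu.

    Assumes ids are numbered contiguously per core (hypothetical layout).
    """
    first = (interrupt_cpu // threads_per_core) * threads_per_core
    last = min(first + threads_per_core, ncpus)
    return [cpu for cpu in range(first, last) if cpu != interrupt_cpu]


def soft_ring_cpus(interrupt_cpu, ring_count, threads_per_core, ncpus):
    """Pick up to ring_count CPUs on the interrupted core for the workers."""
    return same_core_cpus(interrupt_cpu, threads_per_core, ncpus)[:ring_count]


# 32 CPUs with 4 threads per core, interrupt taken on CPU 5.
print(soft_ring_cpus(interrupt_cpu=5, ring_count=2, threads_per_core=4, ncpus=32))
# [4, 6]
```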

How to enable soft rings?

You need to have Solaris 10 update 2.

On Niagara platforms (T1000 and T2000s), it is enabled by default.

On other platforms, it can be enabled by setting ip_squeue_fanout to 1.

ip_soft_rings_cnt has a default value of 2. A value of 2 or 3 has been found to give good performance with 1 Gbps NICs on the Niagara platforms.
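For reference, a setting like this would typically go into /etc/system. The fragment below is a sketch that assumes the usual `set module:variable=value` form and that both tunables live in the ip module; please verify against the tunables documentation for your release before relying on it.

```
* /etc/system fragment (assumed syntax; takes effect after a reboot)
set ip:ip_squeue_fanout=1
set ip:ip_soft_rings_cnt=2
```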

Comments:

Hi Gopi Thank you for this blog. I happen to be one of the early users of FireEngine, Niagara and Nevada. Your little write up on soft rings was very helpful and I wanted to let you know that its appreciated. I do follow Sunay Tripathi's blog as well and hope to learn more about FireEngine's performance optimizations, with special interest in UDP. Thanks again. Pankaj

Posted by pankaj on May 08, 2006 at 01:18 PM PDT #

Appreciate your comments. Thanks.

Posted by Gopi on May 09, 2006 at 04:30 PM PDT #

Hi Gopi, this blog is very helpful. I like to see this kind of high level short introduction, it clearly gives an easy-to-understand conceptual model, therefore no need to go deeply inside the code. Thanks again. Alex

Posted by Alpen on September 03, 2006 at 07:18 PM PDT #

Hi Gopi,

Thanks for trying to explain this briefly. However, I still find it difficult to see how the soft rings part works. If a squeue is bound to a specific CPU, and the worker thread delivers the I/O to the right squeue (from a soft ring), we still don't spread the load over different CPUs (cores/threads). Will a squeue, when using soft rings, bind to multiple cores and avoid CPUs which have worker threads for any soft rings?

Regards,
A confused Daniel

Posted by Daniel on September 26, 2007 at 07:24 AM PDT #

Hi Gopi:

Thank you for your explanation of how soft rings work. I have a question: how does the Rx ring fan out its incoming packets to multiple soft rings? Is there a hash table or a tunable that can be configured?

Hongbo

Posted by Hongbo Zou on December 10, 2009 at 04:53 AM PST #

About

rajgopi
