Thursday May 21, 2009

OpenSolaris beats Linux on Memcached!

Following on the heels of our memcached performance tests on the SunFire X2270 (Sun's Nehalem-based server) running OpenSolaris, we ran the same tests on the same server, this time on RHEL5. As mentioned in the post presenting the first memcached results, a 10GbE Intel Oplin card was used to achieve the high throughput rates possible with these servers. Using this card on Linux turned out to involve a bit of work, including driver and kernel rebuilds.

  • With the default ixgbe driver from the Red Hat distribution (version 1.3.30-k2 on kernel 2.6.18), the interface simply hung during the benchmark test.
  • This led us to download the driver from the Intel site (1.3.56.11-2-NAPI) and recompile it. This version works, and we got a maximum throughput of 232K operations/sec on the same Linux kernel (2.6.18). However, this kernel version does not support multiple rings.
  • Kernel version 2.6.29 supports multiple rings but still doesn't include the latest ixgbe driver, which is 1.3.56-2-NAPI. So we downloaded, built and installed this kernel and this driver. This worked well, giving a maximum throughput of 280K operations/sec with some tuning.

Results Comparison

The system running OpenSolaris and memcached 1.3.2 gave us a maximum throughput of 350K ops/sec, as previously reported. The same system running RHEL5 (with kernel 2.6.29) and the same version of memcached reached 280K ops/sec. OpenSolaris outperforms Linux by 25% (350K / 280K = 1.25)!
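
For readers who want to check the arithmetic, here is a minimal sketch that recomputes the gap from the peak figures quoted in this post. The dictionary labels and the helper name percent_faster are made up for illustration; only the three throughput numbers come from the tests.

```python
# Peak throughput figures quoted in this post, in operations per second.
# The labels are illustrative; only the numbers come from the benchmark runs.
RESULTS = {
    "OpenSolaris": 350_000,
    "RHEL5, kernel 2.6.29": 280_000,
    "RHEL5, kernel 2.6.18 (Intel driver)": 232_000,
}


def percent_faster(a: float, b: float) -> float:
    """Return how much faster a is than b, as a percentage of b."""
    return (a / b - 1.0) * 100.0


if __name__ == "__main__":
    baseline = RESULTS["OpenSolaris"]
    for name, ops in RESULTS.items():
        if name == "OpenSolaris":
            continue
        gain = percent_faster(baseline, ops)
        print(f"OpenSolaris vs {name}: {gain:.1f}% faster")
```

Note that the percentage depends on which side is the baseline: measured against OpenSolaris, the 2.6.29 result is 20% lower (280K / 350K = 0.8).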

Linux Tuning

The following Linux tunables were changed to try and get the best performance:

net.ipv4.tcp_timestamps = 0
net.core.wmem_default = 67108864
net.core.wmem_max = 67108864
net.core.optmem_max = 67108864
net.ipv4.tcp_dsack = 0
net.ipv4.tcp_sack = 0
net.ipv4.tcp_window_scaling = 0
net.core.netdev_max_backlog = 300000
net.ipv4.tcp_max_syn_backlog = 200000
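
Before a benchmark run it can be handy to confirm that these values are actually in effect. The sketch below is one way to do that, not part of our test harness: it relies on Linux exposing each dotted sysctl key as a file under /proc/sys, and the names TUNABLES, sysctl_path and find_mismatches are invented for this example. The root parameter exists only so the check can be pointed at a different directory for testing.

```python
from pathlib import Path

# Values copied from the list of tunables above.
TUNABLES = {
    "net.ipv4.tcp_timestamps": "0",
    "net.core.wmem_default": "67108864",
    "net.core.wmem_max": "67108864",
    "net.core.optmem_max": "67108864",
    "net.ipv4.tcp_dsack": "0",
    "net.ipv4.tcp_sack": "0",
    "net.ipv4.tcp_window_scaling": "0",
    "net.core.netdev_max_backlog": "300000",
    "net.ipv4.tcp_max_syn_backlog": "200000",
}


def sysctl_path(key: str, root: str = "/proc/sys") -> Path:
    """Map a dotted sysctl key to its file under root.

    This simple split is fine for the keys above; it would not handle
    keys whose components themselves contain dots.
    """
    return Path(root).joinpath(*key.split("."))


def find_mismatches(expected: dict, root: str = "/proc/sys") -> dict:
    """Return {key: (expected, actual)} for every value that differs.

    actual is None when the file is missing or unreadable.
    """
    mismatches = {}
    for key, want in expected.items():
        try:
            actual = sysctl_path(key, root).read_text().strip()
        except OSError:
            actual = None
        if actual != want:
            mismatches[key] = (want, actual)
    return mismatches


if __name__ == "__main__":
    for key, (want, actual) in find_mismatches(TUNABLES).items():
        print(f"{key}: expected {want}, found {actual}")
```

An empty result means every tunable matches; anything printed is a setting that did not stick or is not available on the running kernel.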
  

Here are the ixgbe-specific settings that were used (2 transmit, 2 receive rings):

RSS=2,2 InterruptThrottleRate=1600,1600

OpenSolaris Tuning

The following settings in /etc/system were used to set the number of MSI-X interrupts:

set ddi_msix_alloc_limit=4
set pcplusmp:apic_intr_policy=1

For the ixgbe interface, 4 transmit and 4 receive rings gave the best performance:

tx_queue_number=4, rx_queue_number=4

Finally, we bound the Crossbow threads to CPUs 12 through 15:

dladm set-linkprop -p cpus=12,13,14,15 ixgbe0

About

I'm a Senior Staff Engineer in the Performance & Applications Engineering Group (PAE). This blog focuses on tips to build, configure, tune and measure performance of popular open source web applications on Solaris.
