Wednesday Sep 02, 2009

Erratic network performance: Spin mutexes vs. Interrupts

I was recently investigating the cause of high variance in network performance between Logical Domains on a SunFire T2000. I was running the iperf benchmark from one LDom guest to two other LDom guests. The rig was configured like this:
# ldm ls
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active     -n-cv-  SP      8     4G       2.1%  1d 1h
oaf381-ld-1      active     -n----  5000    8     6G        13%  1m
oaf381-ld-2      active     sn----  5001    8     6G       0.0%  1m
oaf381-ld-3      active     sn----  5002    8     6G       0.0%  2m
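For context, each guest would have been created with ldm commands along these lines. This is only a sketch from memory; the virtual switch name primary-vsw0 and the vnet name are assumptions, not copied from this rig, and the virtual disk setup is omitted.

# Illustrative recreation of one guest domain; names are assumptions.
ldm add-domain oaf381-ld-1
ldm set-vcpu 8 oaf381-ld-1                     # 8 virtual CPUs, as in the ldm ls output
ldm set-memory 6g oaf381-ld-1                  # 6 GB of memory
ldm add-vnet vnet1 primary-vsw0 oaf381-ld-1    # vnet device behind the primary virtual switch
ldm bind oaf381-ld-1
ldm start oaf381-ld-1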
Sometimes I would see throughput of up to 1360 Mb/s, but on other runs it would drop to as low as 870 Mb/s. Here's a graph of the benchmark results; as you can see, they are very erratic. (You may need to open it in a separate window if your browser scales it).

Looking at mpstat output there seemed to be some sort of connection between a high spin mutex (smtx) count and performance, but it's hard to get a grasp of tens of mpstat outputs at once.
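One way to keep that manageable is to capture exactly one mpstat sample alongside each iperf run, so every result has a single set of per-CPU counters to compare against. A rough sketch of what I mean (file names are illustrative, and the second iperf client to 192.1.44.3 is omitted for brevity):

#!/bin/sh
# Save the iperf result and one mpstat sample covering the same
# 120-second interval, keyed by run number.
run=$1
iperf204 -c 192.1.44.2 -f m -t 120 -N -l 1M -P 100 > iperf.$run.out &
mpstat 120 2 > mpstat.$run.out    # the second report covers the benchmark interval
wait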

For example, here is the mpstat output for a run with a result of 1318 Mb/s:

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0 1490   313    1   66   11    2 4250    0     4    0 100   0   0
  1    0   0 1467   314    0   71    9    4 4678    0     4    0 100   0   0
  2    0   0  486  2207    4 3687    2 1277  187    0    34    0  24   0  76
  3    0   0  192  1048    2 1574    2  526  106    0    21    0  12   0  87
  4    0   0  627  3302    5 6008    9  657  163    0    36    0  30   0  70
  5    0   0  608  3134    6 5597   11  695  159    0    45    0  31   0  68
  6    0   0 3911  6130 4094 4590   31  663  222    0    62    0  44   0  56
  7    0   0 4462  6279 4205 4625   32  666  238    0    50    0  45   0  55

and here is the mpstat output for a run with a result of 882 Mb/s:
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0  666  5338    4 9695    8  523  318    0    29    0  34   0  66
  1    0   0  540  4593    6 8272    9  506  277    0    34    0  31   0  69
  2    0   0  405  3382    6 5795    9  448  202    0    43    0  28   0  72
  3    0   0 1644  2037  112 3208    5  338  124    0    48    0  25   0  75
  4    0   0   84   928    4 1283    1  402   82    0    30    0  14   0  86
  5    0   0   36   503    2  496    0  152   31    0    15    0   5   0  95
  6    0   0 6490  6540 6102   87   23    1 2197    0     5    0 100   0   0
  7    0   0 6485  6547 6107   92   23    2 2336    0     5    0 100   0   0
The best way I found to see the pattern was to graph it. For each benchmark run I found which CPU had the highest smtx count, and plotted that smtx value against the iperf result, using a different colour for each CPU. The graph is below and reveals an unusual pattern:

A few notes:

  • There appear to be four groupings of behaviour.
  • If the highest smtx count is on CPU 6 or 7, the iperf result is low.
  • If the highest smtx count is on CPU 1 or 2, the iperf result is high.
  • The highest smtx count is never on CPU 3.
  • There is a range of results with very low smtx values, so there may be another variable in play as well.
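Extracting the per-run data point for that graph (the CPU with the highest smtx value) is mostly a matter of sorting the mpstat sample on the smtx column. A sketch using nawk, assuming one mpstat sample per run saved as in the earlier script:

# Pick out the CPU with the highest smtx value from one mpstat sample.
# In this mpstat output, column 1 is the CPU id and column 10 is smtx.
nawk 'BEGIN { max = -1 }
      /^ *[0-9]/ { if ($10 > max) { max = $10; cpu = $1 } }
      END { print cpu, max }' mpstat.1.out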

Another data point is that in every run, the same two CPUs (6 and 7) handled the interrupts for the vnet device. Here is the intrstat output, and it is confirmed by the mpstat output above:

      device |      cpu0 %tim      cpu1 %tim      cpu2 %tim      cpu3 %tim
-------------+------------------------------------------------------------
       vdc#0 |         0  0.0         0  0.0         0  0.0         4  0.0
      vnet#0 |         0  0.0         0  0.0         0  0.0         0  0.0
      vnet#1 |         0  0.0         0  0.0         0  0.0         0  0.0

      device |      cpu4 %tim      cpu5 %tim      cpu6 %tim      cpu7 %tim
-------------+------------------------------------------------------------
       vdc#0 |         0  0.0         0  0.0         0  0.0         0  0.0
      vnet#0 |         0  0.0         0  0.0         0  0.0         0  0.0
      vnet#1 |         0  0.0         0  0.0      3973  3.6      3980  3.6
So the first conclusion I could draw was that when the interrupt handling and whatever is generating the spin mutexes land on the same two CPUs, iperf performance suffers badly.
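The obvious next question is what is actually doing the spinning. lockstat(1M) can answer that: run while the benchmark is going, it reports the most contended kernel locks, and with stack traces it shows the code paths spinning on them. Something along these lines, where the window length, stack depth and event count are arbitrary choices:

# Show the 20 hottest lock contention events over a 10 second window,
# with stack traces 8 frames deep, while iperf is running.
lockstat -s 8 -D 20 sleep 10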

I will follow up this blog entry with more analysis and some workarounds.

Notes

I was running with iperf 2.0.4. oaf381-ld-1 is the sender (it runs the iperf client side), and oaf381-ld-2 and oaf381-ld-3 are the receivers (they run the iperf server side). On oaf381-ld-1 it is invoked as:
iperf204 -c 192.1.44.2 -f m -t 120 -N -l 1M -P 100 &
iperf204 -c 192.1.44.3 -f m -t 120 -N -l 1M -P 100 &
and on oaf381-ld-2 and oaf381-ld-3 as:
iperf204 -s -N -f m -l 1M