Faster Firmware

What a difference firmware can make! We take it for granted, and as administrators we probably do not update our system's firmware as often as we should, but I was recently involved in a performance investigation where it made a huge difference.

On a 128 CPU T5240 server, the throughput of an application peaked around 90 processes, but declined as more processes were added, until at 128 processes the throughput was just 25% of its peak value. Classic and severe anti-scaling. The puzzling part was that the usual suspects were innocent. mpstat showed that 99% of the time was usr mode, so no kernel issues; plockstat did not show any contended userland mutexes; cpustat did not show increases in cache misses, TLB misses, or any other counter per process; and a collector/analyzer profile did not show hot atomic functions or CAS operations. It did show a marked increase in the cost of the SPARC save instruction at function entry as process count was raised. Curious.

We eventually upgraded the firmware, and the application scaled nicely up to 128 processes. If you want some advice and do not care about gory details, skip the next two paragraphs :)

It turns out that the hypervisor had a global lock that was limiting scalability, and the lock was eliminated by a firmware upgrade. Normally very little time is spent executing code in hyper-privileged mode on the Sun CMT servers. However, the hypervisor is responsible for maintaining "permanent" VA->PA mappings in the TLB. These mappings are used for the Solaris kernel nucleus, one 4MB mapping for text, and one 4MB mapping for data. Solaris cannot handle an MMU miss for these mappings, so when the processor traps to hypervisor for the miss, the hypervisor finds the mapping, stuffs it into the hardware TLB, and returns from the trap, so Solaris never sees the miss.

The above hypervisor action was protected in the old firmware by a single global lock. The application had a high TLB miss rate exceeding 200K/CPU/sec, so the permanent mappings were being continuously evicted - not an issue if we don't use the kernel much. But, the app also had a deep function stack, so it generated lots of spill/fill traps. These trap to kernel text, which is backed by a permanent mapping, which has been evicted, which causes a hypervisor trap, which hits the global lock. A perfect storm limiting scalability! Spill/fill is a lightweight trap that does not change accounting mode to sys, hence I did not see high sys time; instead, I saw high save time. In hindsight, I could have directly observed the instruction and stall cycles spent in hypervisor mode using:

# cpustat -s -c pic0=Instr_cnt,pic1=Idle_strands,hpriv,nouser 10 10

Should you care? This issue is specific to firmware in the CMT server line, and it depends on your model:

  • T5140,T5240 (2-socket 128-CPU): Definitely verify and upgrade your firmware if needed; get the latest version of patch 136936.
  • T5120,T5220 (1-socket, 64-CPU): I have not observed the scalability bottleneck on this smaller system, but you may get a small performance boost by upgrading; get the latest version of patch 136932.
  • T1000, T2000 (1-socket, 32-CPU) - probably not an issue, the system is too small.
  • T5440 (4-socket, 256-CPU): not an issue, as the first units shipped already contained a later version of the firmware containing the fix.

The CR is: 6669222 lock in mmu_miss can be eliminated to reduce contention
It was fixed in Sun System Firmware version 7.1.3.d.
To show the version of firmware installed on your system, log in to the service processor and verify you have version 7.1.3.d or later:

sc> showhost
Sun System Firmware 7.1.3.d 2008/07/11 08:55
...

To upgrade your firmware:

  1. Go to http://sunsolve.sun.com
  2. lick on Patches and Updates link
  3. Type the patch number in the PatchFinder Form (eg 136936 for the T5140 or T5240)
  4. Push Find Patch button
  5. Click on the "Download Patch" link near the top.
  6. Unzip the download and refer to the Install.info file for instructions

If you have never upgraded firmware before, read the documentation and be careful!

Comments:

[Trackback] Steve Sistare has an excellent write up of a scaling issue that we hit last year. The issue was frustrating because all the tools seemed to indicate a healthy system, but we were just not getting the scaling that we expected. The solution, as Steve w...

Posted by Darryl Gove's blog on March 20, 2009 at 09:03 AM EDT #

the patch for the T5220 is 136934 not 136932

Posted by nick hindley on August 17, 2009 at 03:27 AM EDT #

Post a Comment:
  • HTML Syntax: NOT allowed
About

Steve Sistare

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today