Tuesday Jun 29, 2010

Sun Fire X4170 M2 Sets World Record on SPEC CPU2006 Benchmark

Oracle's Sun Fire X4170 M2 server, equipped with two Intel Xeon X5670 2.93 GHz processors and running the Oracle Solaris 10 operating system, delivered a world record score of 53.5 SPECfp_base2006.

  • The Sun Fire X4170 M2 server using the Oracle Solaris Studio Express 06/10 compiler delivered a world record result of 53.5 SPECfp_base2006.

  • The Sun Fire X4170 M2 server delivered 20% better performance on the SPECfp_base2006 benchmark compared to the IBM 780 POWER7 based system.

  • The Sun Fire X4170 M2 server beat systems from Supermicro (X8DTU-LN4F+), Dell (R710), IBM (x3650 M3) and Bull (R460 F2) on SPECfp_base2006.

Performance Landscape

SPEC CPU2006 Performance Charts: bigger is better. These are selected results; please see www.spec.org for complete results. All results as of 06/28/10.

In the tables below
"Base" = SPECint_base2006, SPECfp_base2006, SPECint_rate_base2006 or SPECfp_rate_base2006
"Peak" = SPECint2006, SPECfp2006, SPECint_rate2006 or SPECfp_rate2006

SPECfp2006 results

System                  Cores/Chips  Processor Type  GHz   Peak  Base
Sun Fire X4170 M2       12/2         Xeon X5670      2.93  57.6  53.5
Sun Fire X2270 M2       12/2         Xeon X5670      2.93  58.6  49.9
Supermicro X8DTU-LN4F+  8/2          Xeon X5677      3.46  48.8  45.9
IBM x3650 M3            8/2          Xeon X5677      3.46  48.9  45.8
Bull R460 F2            8/2          Xeon X5677      3.46  49.3  45.8
Dell R710               8/2          Xeon X5677      3.46  49.3  45.8
Dell R710               12/2         Xeon X5680      3.33  48.5  45.0
IBM 780                 16/2         POWER7          3.94  71.5  44.5
Dell R710               12/2         Xeon X5670      2.93  45.8  42.5
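
The 20% lead over the IBM 780 quoted in the highlights follows directly from the Base column above. A minimal Python sketch of that arithmetic, with scores copied from the table; the helper name `percent_lead` is introduced here for illustration only:

```python
# SPECfp_base2006 scores copied from the Base column of the table above.
base_scores = {
    "Sun Fire X4170 M2": 53.5,
    "Supermicro X8DTU-LN4F+": 45.9,
    "IBM x3650 M3": 45.8,
    "IBM 780 (POWER7)": 44.5,
}


def percent_lead(winner: float, other: float) -> float:
    """Return how much higher `winner` is than `other`, in percent."""
    return (winner / other - 1.0) * 100.0


reference = base_scores["Sun Fire X4170 M2"]
leads = {
    name: percent_lead(reference, score)
    for name, score in base_scores.items()
    if name != "Sun Fire X4170 M2"
}

for name, lead in leads.items():
    print(f"X4170 M2 leads {name} by {lead:.1f}%")
```

The same helper can be applied to any other pair of rows in the table.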

SPECint_rate2006 results

System                   Cores/Chips  Processor Type  GHz   Base Copies  Peak  Base
Dell R815                24/2         Opteron 6176    2.3   24           401   314
Fujitsu BX922 S2         12/2         Xeon X5680      3.33  24           381   354
Dell R710                12/2         Xeon X5680      3.33  24           379   355
Sun Blade X6270 M2       12/2         Xeon X5680      3.33  24           369   337
Sun Fire X4170 M2        12/2         Xeon X5670      2.93  24           353   316
Sun Fire X2270 M2 (S10)  12/2         Xeon X5670      2.93  24           346   311
Sun Fire X2270 M2 (OEL)  12/2         Xeon X5670      2.93  24           342   320

SPECfp_rate2006 results

System                   Cores/Chips  Processor Type  GHz   Base Copies  Peak  Base
Dell R815                24/2         Opteron 6176    2.3   24           323   295
Dell R710                12/2         Xeon X5680      3.33  24           256   248
Fujitsu BX922 S2         12/2         Xeon X5680      3.33  24           256   248
Sun Blade X6270 M2       12/2         Xeon X5680      3.33  24           255   247
Sun Fire X4170 M2        12/2         Xeon X5670      2.93  24           245   234
Sun Fire X2270 M2 (S10)  12/2         Xeon X5670      2.93  24           240   231
Sun Fire X2270 M2 (OEL)  12/2         Xeon X5670      2.93  24           235   226

Results and Configuration Summary

Hardware Configuration:

Sun Fire X4170 M2
2 x 2.93 GHz Intel Xeon X5670
48 GB
Sun Fire X2270 M2
2 x 2.93 GHz Intel Xeon X5670
48 GB
Sun Blade X6270 M2
2 x 3.33 GHz Intel Xeon X5680
48 GB

Software Configuration:

Oracle Solaris 10 10/09
Oracle Solaris Studio Express 6/10
SPEC CPU2006 suite v1.1
MicroQuill SmartHeap Library v8.1

Benchmark Description

SPEC CPU2006 is SPEC's most popular benchmark, with over 8000 results published in the three years since it was introduced. It measures:

  • "Speed" - single copy performance of chip, memory, compiler
  • "Rate" - multiple copy (throughput)

The rate metrics are used for the throughput-oriented systems described on this page. These metrics include:

  • SPECint_rate2006: throughput for 12 integer benchmarks derived from real applications such as perl, gcc, XML processing, and pathfinding
  • SPECfp_rate2006: throughput for 17 floating point benchmarks derived from real applications, including chemistry, physics, genetics, and weather.

There are base variants of both the above metrics that require more conservative compilation. In particular, all benchmarks of a particular programming language must use the same compilation flags.
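
An overall SPEC CPU2006 speed score is the geometric mean of per-benchmark ratios, where each ratio is the reference machine's run time divided by the measured run time. A minimal Python sketch of that calculation follows; the benchmark names and all times are invented for illustration and do not come from any published result:

```python
import math

# Hypothetical (reference_seconds, measured_seconds) pairs; not real SPEC data.
timings = {
    "bench_a": (10000.0, 250.0),
    "bench_b": (9000.0, 300.0),
    "bench_c": (12000.0, 150.0),
}


def speed_ratio(reference_seconds: float, measured_seconds: float) -> float:
    """Per-benchmark speed ratio: reference time over measured time."""
    return reference_seconds / measured_seconds


def geometric_mean(values) -> float:
    """Geometric mean computed in log space to avoid overflow on long lists."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


ratios = {name: speed_ratio(ref, run) for name, (ref, run) in timings.items()}
overall = geometric_mean(ratios.values())
print(f"overall score: {overall:.1f}")
```

Because the mean is geometric, doubling the ratio of any single benchmark raises the overall score by the same factor regardless of which benchmark it is.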

Disclosure Statement

SPEC, SPECint, and SPECfp are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 24 June 2010 and this report. Sun Fire X4170 M2 53.5 SPECfp_base2006.

Monday Jun 28, 2010

Sun Fire X4470 Sets World Record on SPEC CPU2006 Rate Benchmark

Oracle's Sun Fire X4470 server delivered a world record SPECint_rate2006 result for all x86 systems with 4 chips.

  • The Sun Fire X4470 server with four Intel Xeon X7560 processors achieved a SPECint_rate2006 score of 788 and a SPECfp_rate2006 score of 573.

  • The Sun Fire X4470 server delivered better 4-socket x86 system performance on the SPECint_rate2006 benchmark than systems from HP (DL585 G7), Cisco (UCS C460 M1), Dell (R815) and IBM (x3850 X5).

  • The Sun Fire X4470 server delivered better performance on the SPECfp_rate2006 benchmark compared to similar Intel Xeon X7560 processor based systems from Cisco (UCS C460 M1), IBM (x3850 X5), and Fujitsu (RX600 S5).

Performance Landscape

SPEC CPU2006 Performance Charts: bigger is better. These are selected results; please see www.spec.org for complete results. All results as of 06/28/10.

In the tables below
"Base" = SPECint_rate_base2006 or SPECfp_rate_base2006
"Peak" = SPECint_rate2006 or SPECfp_rate2006

SPECint_rate2006 results

System             Cores/Chips  Processor Type  GHz   Base Copies  Peak  Base
Sun Fire X4470     32/4         Xeon X7560      2.26  64           788   724
HP DL585 G7        48/4         Opteron 6176    2.3   48           782   610
Cisco UCS C460 M1  32/4         Xeon X7560      2.26  64           772   723
Dell R815          48/4         Opteron 6174    2.20  48           771   602
IBM x3850 X5       32/4         Xeon X7560      2.26  64           770   720
Sun Fire X4640     48/8         Opteron 8435    2.6   48           730   574

SPECfp_rate2006 results

System             Cores/Chips  Processor Type  GHz   Base Copies  Peak  Base
Dell R815          48/4         Opteron 6174    2.20  48           626   574
HP DL585 G7        48/4         Opteron 6176    2.3   48           619   572
Sun Fire X4470     32/4         Xeon X7560      2.26  64           573   547
Cisco UCS C460 M1  32/4         Xeon X7560      2.26  64           568   549
IBM x3850 X5       32/4         Xeon X7560      2.26  64           560   543
Fujitsu RX600 S5   32/4         Xeon X7560      2.26  64           559   538
Sun Fire X4640     48/8         Opteron 8435    2.6   48           470   434

Results and Configuration Summary

Hardware Configuration:

Sun Fire X4470
4 x 2.26 GHz Intel Xeon X7560
256 GB

Software Configuration:

Oracle Solaris 10 10/09
Oracle Solaris Studio Express 6/10
SPEC CPU2006 suite v1.1
MicroQuill SmartHeap Library v8.1

Benchmark Description

SPEC CPU2006 is SPEC's most popular benchmark, with over 8000 results published in the three years since it was introduced. It measures:

  • "Speed" - single copy performance of chip, memory, compiler
  • "Rate" - multiple copy (throughput)

The rate metrics are used for the throughput-oriented systems described on this page. These metrics include:

  • SPECint_rate2006: throughput for 12 integer benchmarks derived from real applications such as perl, gcc, XML processing, and pathfinding
  • SPECfp_rate2006: throughput for 17 floating point benchmarks derived from real applications, including chemistry, physics, genetics, and weather.

There are base variants of both the above metrics that require more conservative compilation. In particular, all benchmarks of a particular programming language must use the same compilation flags.
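
For the rate metrics, each per-benchmark ratio is also scaled by the number of copies run concurrently, and the overall score is again a geometric mean. The Python sketch below assumes the CPU2006 definition (copies multiplied by reference time over elapsed time); the timings are invented for illustration, and only the copy count of 64 is taken from the Sun Fire X4470 rows above:

```python
import math

COPIES = 64  # base copies listed for the Sun Fire X4470 in the tables above

# Hypothetical (reference_seconds, elapsed_seconds) pairs; not real SPEC data.
timings = [
    (8000.0, 800.0),
    (9000.0, 450.0),
    (10000.0, 2000.0),
]


def rate_ratio(copies: int, reference_seconds: float, elapsed_seconds: float) -> float:
    """Per-benchmark throughput ratio, assuming the CPU2006 rate definition."""
    return copies * reference_seconds / elapsed_seconds


def geometric_mean(values) -> float:
    """Geometric mean computed in log space."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


rate_ratios = [rate_ratio(COPIES, ref, elapsed) for ref, elapsed in timings]
rate_score = geometric_mean(rate_ratios)
print(f"rate score: {rate_score:.0f}")
```

Under this definition, running more copies raises the score only if the elapsed time per benchmark does not grow proportionally.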

Disclosure Statement

SPEC, SPECint, and SPECfp are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 24 June 2010 and this report. Sun Fire X4470 788 SPECint_rate2006.

Friday Jun 12, 2009

OpenSolaris Beats Linux on memcached Sun Fire X2270

OpenSolaris provided 25% better performance on memcached than Linux on the Sun Fire X2270 server. memcached 1.3.2 using OpenSolaris gave a maximum throughput of 352K ops/sec compared to the same server running RHEL5 (with kernel 2.6.29) which produced a result of 281K ops/sec.

memcached is the de facto distributed caching server used to scale many Web 2.0 sites today. As sites grow to support a very large number of users, memcached aids scalability by cutting down on MySQL traffic and improving response times.

  • memcached is a very light-weight server but is known not to scale beyond 4-6 threads. Some scalability improvements have gone into the 1.3 release (still in beta).
  • As customers move to the newer, more powerful Intel Nehalem based systems, it is important that they have the ability to use these systems efficiently using appropriate software and hardware components.

Performance Landscape

memcached performance results: ops/sec (bigger is better)

System          C/C/T   GHz   Processor Type  Memory  Operating System                               Ops/Sec
Sun Fire X2270  2/8/16  2.93  Intel X5570 QC  48GB    OpenSolaris 2009.06                            352K
Sun Fire X2270  2/8/16  2.93  Intel X5570 QC  48GB    RedHat Enterprise Linux 5 (kernel 2.6.29)      281K

C/C/T: Chips, Cores, Threads

Results and Configuration Summary

Sun's results used the following hardware and software components.

Hardware:

    Sun Fire X2270
    2 x Intel X5570 QC 2.93 GHz
    48GB of memory
    10GbE Intel Oplin card

Software:

    OpenSolaris 2009.06
    Linux RedHat 5 (on kernel 2.6.29)

Benchmark Description

memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load. The memcached benchmark was based on Apache Olio - a web2.0 workload.

The benchmark initially populates the server cache with objects of different sizes to simulate the types of data that real sites typically store in memcached:

  • small objects (4-100 bytes) to represent locks and query results
  • medium objects (1-2 KBytes) to represent thumbnails, database rows, resultsets
  • large objects (5-20 KBytes) to represent whole or partially generated pages

The benchmark then runs a mixture of operations (90% gets, 10% sets) and measures the throughput and response times when the system reaches steady-state. The workload is implemented using Faban, an open-source benchmark development framework. It not only speeds benchmark development, but the Faban harness is a great way to queue, monitor and archive runs for analysis.
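
The operation mix and object sizes described above can be sketched as a small request generator. This is not the Faban driver used for the published runs; it is an illustrative Python sketch in which the equal weighting of the three size classes, the 1024-byte KByte, the function name, and the fixed seed are all assumptions:

```python
import random

# Object size ranges in bytes, from the benchmark description above.
# Assumption: 1 KByte is taken as 1024 bytes.
SIZE_CLASSES = {
    "small": (4, 100),
    "medium": (1 * 1024, 2 * 1024),
    "large": (5 * 1024, 20 * 1024),
}
GET_FRACTION = 0.90  # 90% gets, 10% sets


def next_request(rng: random.Random):
    """Return (operation, size_class, size_in_bytes) for one simulated request."""
    operation = "get" if rng.random() < GET_FRACTION else "set"
    # Assumption: the three size classes are equally likely.
    size_class = rng.choice(list(SIZE_CLASSES))
    low, high = SIZE_CLASSES[size_class]
    return operation, size_class, rng.randint(low, high)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
requests = [next_request(rng) for _ in range(10_000)]
get_share = sum(1 for op, _, _ in requests if op == "get") / len(requests)
print(f"observed get share: {get_share:.3f}")
```

A real driver would also issue these requests against a memcached server and record response times once throughput reaches steady state.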

Key Points and Best Practices

OpenSolaris Tuning

The following /etc/system settings were used to set the number of MSI-X interrupts:

  • set ddi_msix_alloc_limit=4
  • set pcplusmp:apic_intr_policy=1

For the ixgbe interface, 4 transmit and 4 receive rings gave the best performance:

  • tx_queue_number=4, rx_queue_number=4

The crossbow threads were bound:

dladm set-linkprop -p cpus=12,13,14,15 ixgbe0

Linux Tuning

Linux was more complicated to tune. The following Linux tunables were changed to try to get the best performance:

  • net.ipv4.tcp_timestamps = 0
  • net.core.wmem_default = 67108864
  • net.core.wmem_max = 67108864
  • net.core.optmem_max = 67108864
  • net.ipv4.tcp_dsack = 0
  • net.ipv4.tcp_sack = 0
  • net.ipv4.tcp_window_scaling = 0
  • net.core.netdev_max_backlog = 300000
  • net.ipv4.tcp_max_syn_backlog = 200000
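
On Linux, tunables like these are typically applied with `sysctl -w` or persisted in /etc/sysctl.conf. As a convenience sketch that was not part of the original test procedure, the following Python snippet renders the list above in sysctl.conf syntax; the helper name `render_sysctl_conf` is hypothetical:

```python
# Tunables and values copied from the list above.
TUNABLES = {
    "net.ipv4.tcp_timestamps": 0,
    "net.core.wmem_default": 67108864,
    "net.core.wmem_max": 67108864,
    "net.core.optmem_max": 67108864,
    "net.ipv4.tcp_dsack": 0,
    "net.ipv4.tcp_sack": 0,
    "net.ipv4.tcp_window_scaling": 0,
    "net.core.netdev_max_backlog": 300000,
    "net.ipv4.tcp_max_syn_backlog": 200000,
}


def render_sysctl_conf(tunables: dict) -> str:
    """Render key/value pairs as 'key = value' lines, one per tunable."""
    return "\n".join(f"{key} = {value}" for key, value in tunables.items()) + "\n"


conf_text = render_sysctl_conf(TUNABLES)
print(conf_text, end="")
```

Keeping the values in one dictionary makes it easy to diff the tuned configuration against a stock system.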

Here are the ixgbe specific settings that were used (2 transmit, 2 receive rings):

  • RSS=2,2 InterruptThrottleRate=1600,1600

Linux Issues

Getting the 10GbE Intel Oplin card to work well on Linux required the following driver and kernel rebuilds:

  • With the default ixgbe driver from the RedHat distribution (version 1.3.30-k2 on kernel 2.6.18), the interface simply hung during the benchmark test.
  • This led us to download the driver from the Intel site (1.3.56.11-2-NAPI) and recompile it. This version works, and we got a maximum throughput of 232K operations/sec on the same Linux kernel (2.6.18). However, this version of the kernel does not support multiple TX rings.
  • Kernel version 2.6.29 supports multiple TX rings but does not ship with version 1.3.56.11-2-NAPI of the ixgbe driver. So we downloaded, built, and installed both this kernel and this driver. This worked well, giving a maximum throughput of 281K operations/sec with some tuning.
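
The throughput figures reported in this article can be put side by side to show how much each step was worth. A short Python sketch of the arithmetic, using only the numbers quoted above; the helper name `percent_gain` is introduced here for illustration:

```python
# Maximum throughput in ops/sec, as reported in this article.
LINUX_2_6_18 = 232_000   # RHEL5, kernel 2.6.18, Intel ixgbe 1.3.56.11-2-NAPI
LINUX_2_6_29 = 281_000   # RHEL5, kernel 2.6.29, same driver, with tuning
OPENSOLARIS = 352_000    # OpenSolaris 2009.06


def percent_gain(new: float, old: float) -> float:
    """Return the improvement of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


kernel_gain = percent_gain(LINUX_2_6_29, LINUX_2_6_18)
opensolaris_vs_linux = percent_gain(OPENSOLARIS, LINUX_2_6_29)

print(f"kernel 2.6.29 over 2.6.18: {kernel_gain:.1f}%")
print(f"OpenSolaris over tuned Linux: {opensolaris_vs_linux:.1f}%")
```

The second figure corresponds to the 25% advantage quoted at the top of the article.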

Disclosure Statement

Sun Fire X2270 server with OpenSolaris 352K ops/sec. Sun Fire X2270 server with RedHat Linux 281K ops/sec. For memcached information, visit http://www.danga.com/memcached. Results as of June 8, 2009.

About

BestPerf is the source of Oracle performance expertise. In this blog, Oracle's Strategic Applications Engineering group explores Oracle's performance results and shares best practices learned from working on Enterprise-wide Applications.
