Thursday Dec 02, 2010

World Record TPC-C Result on Oracle's SPARC Supercluster with T3-4 Servers

Oracle delivered a world-record TPC-C benchmark result, demonstrating the world's fastest database performance, using 27 of Oracle's SPARC T3-4 servers, 138 Sun Storage F5100 Flash Array storage systems, and Oracle Database 11g Release 2 Enterprise Edition with Real Application Clusters (RAC) and Partitioning.

  • The SPARC T3-4 server cluster delivered a world record TPC-C benchmark result of 30,249,688 tpmC and $1.01 $/tpmC (USD) using Oracle Database 11g Release 2 on a configuration available 6/1/2011.

  • The SPARC T3-4 server cluster is 2.9x faster than the performance of the IBM Power 780 (POWER7 3.86 GHz) cluster with IBM DB2 9.7 database and has 27% better price/performance on the TPC-C benchmark. Almost identical price discount levels were applied by Oracle and IBM.

  • The Oracle solution has three times better performance than the IBM configuration and only used twice the power during the run of the TPC-C benchmark.  (Based upon IBM's own claims of energy usage from their August 17, 2010 press release.)

  • The Oracle solution delivered 2.9x the performance in only 71% of the space compared to the IBM TPC-C benchmark result.

  • The SPARC T3-4 server with Sun Storage F5100 Flash Array storage solution demonstrates 3.2x faster response time than IBM Power 780 (POWER7 3.86 GHz) result on the TPC-C benchmark.

  • Oracle used a single-image database, whereas IBM used 96 separate database partitions on their 3-node cluster. That works out to 32 database images per server, instead of running each server as a simple SMP.

  • IBM did not use DB2 Enterprise Database; instead, IBM used "DB2 InfoSphere Warehouse 9.7", which is a data warehouse and data management product and not their flagship OLTP product.

  • The multi-node SPARC T3-4 server cluster is 7.4x faster than the HP Superdome (1.6 GHz Itanium2) solution and has 66% better price/performance on the TPC-C benchmark.

  • The Oracle solution utilized Oracle's Sun FlashFire technology to deliver this result. The Sun Storage F5100 Flash Array storage system was used for database storage.

  • Oracle Database 11g Enterprise Edition Release 2 with Real Application Clusters and Partitioning scales and effectively uses all of the nodes in this configuration to produce the world record TPC-C benchmark performance.

  • This result showed Oracle's integrated hardware and software stacks provide industry leading performance.

Performance Landscape

TPC-C results (sorted by tpmC, bigger is better)

| System | tpmC | Price/tpmC | Avail | Database | Cluster | Racks |
|---|---|---|---|---|---|---|
| 27 x SPARC T3-4 | 30,249,688 | 1.01 USD | 6/1/2011 | Oracle 11g RAC | Y | 15 |
| 3 x IBM Power 780 | 10,366,254 | 1.38 USD | 10/13/10 | DB2 9.7 | Y | 10 |
| HP Integrity Superdome | 4,092,799 | 2.93 USD | 08/06/07 | Oracle 10g R2 | N | 46 |

Avail - Availability date
Racks - Clients, servers, storage, infrastructure
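
The speedup and price/performance percentages quoted in the bullets can be reproduced from the table. The sketch below is one way to do that; it assumes "better price/performance" means the fractional reduction in $/tpmC, which is an interpretation rather than something the post states, and the function names are illustrative.

```python
# Figures copied from the TPC-C table above: (tpmC, USD per tpmC).
results = {
    "27 x SPARC T3-4": (30_249_688, 1.01),
    "3 x IBM Power 780": (10_366_254, 1.38),
    "HP Integrity Superdome": (4_092_799, 2.93),
}


def speedup(system, baseline):
    """Throughput ratio of `system` over `baseline` (tpmC / tpmC)."""
    return results[system][0] / results[baseline][0]


def price_perf_advantage(system, baseline):
    """Fractional reduction in $/tpmC relative to `baseline` (assumed definition)."""
    return 1 - results[system][1] / results[baseline][1]


for rival in ("3 x IBM Power 780", "HP Integrity Superdome"):
    s = speedup("27 x SPARC T3-4", rival)
    p = price_perf_advantage("27 x SPARC T3-4", rival)
    print(f"vs {rival}: {s:.1f}x tpmC, {p:.0%} lower $/tpmC")
```

Under that assumed definition, the ratios round to the 2.9x / 27% and 7.4x / 66% figures quoted above.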

Oracle and IBM TPC-C Response times

| System | tpmC | Response Time (sec), New Order 90th% | Response Time (sec), New Order Average |
|---|---|---|---|
| 27 x SPARC T3-4 | 30,249,688 | 0.750 | 0.352 |
| 3 x IBM Power 780 | 10,366,254 | 2.1 | 1.137 |
| Response Time Ratio - Oracle Better | 2.9x | 2.8x | 3.2x |

Oracle uses Average New Order Response time for comparison between Oracle and IBM.

Graphs of Oracle's and IBM's response times for New-Order can be found in the full disclosure reports on TPC's website TPC-C Official Result Page.

Configuration Summary and Results

Hardware Configuration:

15 racks used to hold

Servers
27 x SPARC T3-4 servers, each with
4 x 1.65 GHz SPARC T3 processors
512 GB memory
3 x 300 GB 10K RPM 2.5" SAS disks

Data Storage
69 x Sun Fire X4270 M2 servers configured as COMSTAR heads, each with
1 x 2.93 GHz Intel Xeon X5670 processor
8 GB memory
9 x 2 TB 7.2K RPM 3.5" SAS disks
2 x Sun Storage F5100 Flash Array storage (1.92 TB each)
1 x Brocade DCX switch

Redo Storage
28 x Sun Fire X4270 M2 servers configured as COMSTAR heads, each with
1 x 2.93 GHz Intel Xeon X5670 processor
8 GB memory
11 x 2 TB 7.2K RPM 3.5" SAS disks
2 x Brocade 5300 switches

Clients
81 x Sun Fire X4170 M2 servers, each with
2 x 2.93 GHz Intel X5670 processors
48 GB memory
2 x 146 GB 10K RPM 2.5" SAS disks

Software Configuration:

Oracle Solaris 10 9/10 (for SPARC T3-4 and Sun Fire X4170 M2)
Oracle Solaris 11 Express (COMSTAR for Sun Fire X4270 M2)
Oracle Database 11g Release 2 Enterprise Edition with Real Application Clusters and Partitioning
Oracle iPlanet Web Server 7.0 U5
Tuxedo CFS-R Tier 1

Results:

System 27 x SPARC T3-4
tpmC 30,249,688
Price/tpmC 1.01 USD
Avail 6/1/2011
Database Oracle Database 11g RAC
Cluster yes
Racks 15
New Order Ave Response 0.352 seconds

Benchmark Description

TPC-C is an OLTP system benchmark. It simulates a complete environment where a population of terminal operators executes transactions against a database. The benchmark is centered around the principal activities (transactions) of an order-entry environment. These transactions include entering and delivering orders, recording payments, checking the status of orders, and monitoring the level of stock at the warehouses.

Key Points and Best Practices

  • Oracle Database 11g Release 2 Enterprise Edition with Real Application Clusters and Partitioning scales easily to this high level of performance.

  • Sun Storage F5100 Flash Array storage provides high performance, very low latency, and very high storage density.

  • COMSTAR (Common Multiprotocol SCSI Target), new in Oracle Solaris 11 Express, is the software framework that enables a Solaris host to serve as a SCSI Target platform. COMSTAR uses a modular approach to break the huge task of handling all the different pieces in a SCSI target subsystem into independent functional modules, which are glued together by the SCSI Target Mode Framework (STMF). The modules implementing functionality at the SCSI level (disk, tape, medium changer, etc.) are not required to know about the underlying transport, and the modules implementing the transport protocol (FC, iSCSI, etc.) are not aware of the SCSI-level functionality of the packets they are transporting. The framework hides the details of allocation, providing execution context, and cleanup of SCSI commands and associated resources, and simplifies the task of writing the SCSI or transport modules.

  • Oracle iPlanet Web Server 7.0 U5 is used in the user tier of the benchmark, with each web server instance supporting more than a quarter-million users while satisfying the stringent response time requirement of the TPC-C benchmark.
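
The separation COMSTAR draws between SCSI-level modules and transport modules can be pictured with a toy model. This is only an illustrative sketch: every class and method name below is hypothetical and does not mirror the real COMSTAR or STMF kernel interfaces.

```python
# Toy model of the layering described above. All names are hypothetical
# and do not correspond to the real COMSTAR/STMF interfaces.

class DiskLogicalUnit:
    """SCSI-level module: interprets commands, knows nothing about transports."""

    def __init__(self, blocks):
        self.blocks = blocks

    def execute(self, command, lba):
        if command == "READ":
            return self.blocks[lba]
        raise ValueError(f"unsupported command: {command}")


class Framework:
    """Stand-in for the framework that routes commands between the layers."""

    def __init__(self):
        self.logical_units = {}

    def register_lu(self, lun, lu):
        self.logical_units[lun] = lu

    def dispatch(self, lun, command, lba):
        # Lookup and routing live here, so neither side
        # needs to know about the other.
        return self.logical_units[lun].execute(command, lba)


class PortProvider:
    """Transport module (FC, iSCSI, ...): forwards requests without parsing them."""

    def __init__(self, name, framework):
        self.name = name
        self.framework = framework

    def receive(self, lun, command, lba):
        return self.framework.dispatch(lun, command, lba)


stmf = Framework()
stmf.register_lu(0, DiskLogicalUnit(blocks=[b"boot", b"data"]))

# Two transports share the same logical unit without any change to it.
fc = PortProvider("fc", stmf)
iscsi = PortProvider("iscsi", stmf)
print(fc.receive(0, "READ", 1), iscsi.receive(0, "READ", 1))
```

Adding a third transport in this model means writing one more `PortProvider` and nothing else, which is the property the modular design is meant to provide.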

Disclosure Statement

TPC Benchmark C, tpmC, and TPC-C are trademarks of the Transaction Processing Performance Council (TPC). 27-node SPARC T3-4 Cluster (4 x 1.65 GHz SPARC T3 processors) with Oracle Database 11g Release 2 Enterprise Edition with Real Application Clusters and Partitioning, 30,249,688 tpmC, $1.01/tpmC, Available 6/1/2011. IBM Power 780 Cluster (3 nodes using 3.86 GHz POWER7 processors) with IBM DB2 InfoSphere Warehouse Ent. Base Ed. 9.7, 10,366,254 tpmC, $1.38 USD/tpmC, available 10/13/2010. HP Integrity Superdome(1.6GHz Itanium2, 64 processors, 128 cores, 256 threads) with Oracle 10g Enterprise Edition, 4,092,799 tpmC, $2.93/tpmC, available 8/06/07. Energy claims based upon IBM calculations and internal measurements. Source: http://www.tpc.org/tpcc, results as of 11/22/2010

Monday Sep 20, 2010

SPARC T3-4 Sets World Record Single Server Result on SPECjEnterprise2010 Benchmark

World Record Single Application Server System Performance

Oracle produced a single application server world record SPECjEnterprise2010 benchmark result using Oracle's SPARC T3-4 server for the application server and Oracle's SPARC T3-2 server for the database server.
  • A SPARC T3-4 server paired with a SPARC T3-2 server delivered a result of 9456.28 SPECjEnterprise2010 EjOPS for the SPECjEnterprise benchmark.

  • The SPARC T3-4 server running at 1.65 GHz demonstrated 32% better performance compared to the IBM Power 750 system result of 7172.93 SPECjEnterprise2010 EjOPS which used four POWER7 chips running at 3.55 GHz.

  • The 4-socket SPARC T3-4 server was 32% faster than a 4-socket IBM Power 750 system proving that IBM's per-core performance is irrelevant when compared to system performance.

  • The SPARC T3-4 server has 5% better computational density than the IBM Power 750 system.

  • The SPARC T3-4 server running SPARC T3 processors at 1.65 GHz demonstrated 84% better performance compared to the IBM x3850 X5 system result of 5140.53 SPECjEnterprise2010 EjOPS using four Intel Xeon chips at 2.27 GHz.

  • The SPARC T3-4 server has 47% better computational density than the IBM x3850 X5 system.

  • This world record result was achieved using Oracle WebLogic 10.3.3 application server and Oracle Database 11g R2.

  • Oracle Fusion Middleware provides a family of complete, integrated, hot pluggable and best-of-breed products known for enabling enterprise customers to create and run agile and intelligent business applications. The Oracle WebLogic Server's ongoing, record-setting Java application server performance demonstrates why so many customers rely on Oracle Fusion Middleware as their foundation for innovation.

  • To obtain this leading result a number of Oracle technologies were used: Oracle Solaris 10, Oracle Solaris Containers, Oracle Java HotSpot VM, Oracle WebLogic, Oracle Database 11gR2, SPARC T3-4 server, and SPARC T3-2 server.

  • The SPARC T3-4 server demonstrated less than 1 second average response times for all SPECjEnterprise2010 transactions and 90% of all transaction times took less than 1 second.

  • The two T-series systems occupied a total of 16 RU of space. This is less than half of the 37 RU of space used in the IBM Power 750 system result of 7172.93 SPECjEnterprise2010 EjOPS.

  • The SPARC T3-4 server result only used 61% of floor space compared to the IBM x3850 X5 system result of 5140.53 SPECjEnterprise2010 EjOPS which requires 26 RU of space.
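
The performance and rack-space comparisons in the bullets above are simple ratios. A minimal sketch that recomputes them from the quoted EjOPS and RU figures follows; the dictionary layout and function names are illustrative choices.

```python
# EjOPS and total rack units (RU) as quoted in the bullets above.
oracle = {"ejops": 9456.28, "ru": 16}
rivals = {
    "IBM Power 750": {"ejops": 7172.93, "ru": 37},
    "IBM x3850 X5": {"ejops": 5140.53, "ru": 26},
}


def performance_advantage(rival):
    """Fraction by which the Oracle EjOPS result exceeds the rival's."""
    return oracle["ejops"] / rival["ejops"] - 1


def space_fraction(rival):
    """Oracle rack space as a fraction of the rival's rack space."""
    return oracle["ru"] / rival["ru"]


for name, rival in rivals.items():
    print(
        f"{name}: {performance_advantage(rival):.1%} more EjOPS, "
        f"{space_fraction(rival):.1%} of the rack space"
    )
```

Note that 16/26 is about 0.615, which the post quotes as 61%.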

Performance Landscape

Complete benchmark results are at the SPEC website, SPECjEnterprise2010 Results.

SPECjEnterprise2010 Performance Chart
as of 9/20/2010
| Submitter | EjOPS* | Application Server | Database Server |
|---|---|---|---|
| Oracle | 9456.28 | 1 x Oracle SPARC T3-4; 4 x 1.65 GHz SPARC T3; Oracle WebLogic 10.3.3 | 1 x Oracle SPARC T3-2; 2 x 1.65 GHz SPARC T3; Oracle 11g DB 11.2.0.1 |
| IBM | 7172.93 | 1 x IBM Power 750 Express; 4 x 3.55 GHz POWER7; WebSphere Application Server V7.0 | 1 x IBM BladeCenter PS702; 2 x 3.0 GHz POWER7; IBM DB2 Universal Database 9.7 |
| IBM | 5140.53 | 1 x IBM x3850 X5; 4 x 2.2 GHz Intel X7560; WebSphere Application Server V7.0 | 1 x IBM x3850 X5; 4 x 2.2 GHz Intel X7560; IBM DB2 Universal Database 9.7 |

* SPECjEnterprise2010 EjOPS, Bigger is better.

Results and Configuration Summary

Application Server:

1 x Oracle SPARC T3-4 server
4 x 1.65 GHz SPARC T3 processors
256 GB memory
4 x 10GbE NIC
Oracle Solaris 10 9/10
Oracle Solaris Containers
Oracle WebLogic 10.3.3 Application Server - Standard Edition
Oracle Fusion Middleware
Oracle Java SE, JDK 6 Update 21

Database Server:

1 x Oracle SPARC T3-2
2 x 1.65 GHz SPARC T3 processors
256 GB memory
2 x 10GbE NIC
2 x Sun Storage 6180 Array
Oracle Solaris 10 9/10
Oracle Database Enterprise Edition Release 11.2.0.1

Benchmark Description

The SPECjEnterprise2010™ benchmark is a full system benchmark which allows performance measurement and characterization of Java EE 5.0 servers and supporting infrastructure such as JVM, Database, CPU, disk and servers.

The workload consists of an end-to-end web-based order processing domain, an RMI and Web Services driven manufacturing domain and a supply chain model utilizing document-based Web Services. The application is a collection of Java classes, Java Servlets, Java Server Pages, Enterprise Java Beans, Java Persistence Entities (POJOs) and Message Driven Beans.

SPECjEnterprise2010 is the third generation of the SPEC organization's J2EE end-to-end industry standard benchmark application. The new SPECjEnterprise2010 benchmark has been re-designed and developed to cover the JEE 5.0 specification's significantly expanded and simplified programming model, highlighting the major features used by developers in the industry today. This provides a real world workload driving the Application Server's implementation of the Java EE specification to its maximum potential and allowing maximum stressing of the underlying hardware and software systems.

SPEC has paid particular attention to making this benchmark as easy as possible to install and run. This has been achieved by utilizing simplification features of the Java EE 5.0 platform such as annotations and sensible defaulting and by the use of the opensource Faban facility for developing and running the benchmark driver.

SPECjEnterprise2010's new design spans Java EE 5.0 including the new EJB 3.0 and WSEE component architecture, Message Driven beans, and features level transactions.

Key Points and Best Practices

  • Eight Oracle WebLogic server instances on the SPARC T3-4 server were hosted in 8 separate Oracle Solaris Containers to demonstrate consolidation of multiple application servers.

  • Each Oracle Solaris container was bound to a separate processor set, each containing 7 cores. This was done to improve performance by using the physical memory closest to the processors, thereby reducing memory access latency. The default processor set was used for network and disk interrupt handling.

  • The Oracle WebLogic application servers were executed in the FX scheduling class to improve performance by reducing the frequency of context switches.

  • The Oracle database processes were run in 2 processor sets using the Oracle Solaris psrset utility and executed in the FX scheduling class. These were done to improve performance by reducing memory access latency and by reducing context switches.

  • The Oracle Log Writer process was run in a separate processor set containing 1 core and run in the RT scheduling class. This was done to ensure that the Log Writer had the most efficient use of CPU resources.
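
The processor-set layout on the application server can be checked with some quick arithmetic. The sketch below assumes each SPARC T3 chip has 16 cores with 8 hardware threads per core, and that the eight processor sets are spread evenly across the four chips; neither detail is stated in this post, so treat both as assumptions.

```python
CHIPS = 4              # from the application server configuration above
CORES_PER_CHIP = 16    # assumption: SPARC T3 core count, not stated in this post
THREADS_PER_CORE = 8   # assumption: SPARC T3 hardware threads per core
CONTAINERS = 8         # one processor set per Oracle Solaris Container
CORES_PER_SET = 7      # from the key points above


def layout():
    """Summarize how cores split between container sets and the default set."""
    total_cores = CHIPS * CORES_PER_CHIP
    cores_in_sets = CONTAINERS * CORES_PER_SET
    sets_per_chip = CONTAINERS // CHIPS  # assumes an even spread across chips
    return {
        "total_cores": total_cores,
        "cores_in_sets": cores_in_sets,
        "default_set_cores": total_cores - cores_in_sets,
        "sets_per_chip": sets_per_chip,
        "default_cores_per_chip": CORES_PER_CHIP - sets_per_chip * CORES_PER_SET,
        "virtual_cpus_per_set": CORES_PER_SET * THREADS_PER_CORE,
    }


for key, value in layout().items():
    print(f"{key}: {value}")
```

Under these assumptions, two 7-core sets fit on each chip, leaving two cores per chip in the default processor set for interrupt handling.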

Disclosure Statement

SPEC is a registered trademark and SPECjEnterprise is a trademark of Standard Performance Evaluation Corporation. Results from www.spec.org as of 9/20/2010. SPARC T3-4 9456.28 SPECjEnterprise2010 EjOPS. IBM Power 750 Express 7,172.93 SPECjEnterprise2010 EjOPS. IBM System x3850 X5 5,140.53 SPECjEnterprise2010 EjOPS.

IBM Power 750 Express (4RU each).
IBM BladeCenter H Chassis (9RU each).
IBM System x3850 X5 (4RU each).
IBM DS4800 Disk System Model 82 (4RU each).
IBM DS4000 EXP810 (3RU each).

http://www-03.ibm.com/systems/power/hardware/750/index.html
http://www-03.ibm.com/systems/x/hardware/enterprise/x3850x5/index.html
http://www-03.ibm.com/systems/bladecenter/hardware/chassis/bladeh/index.html
http://www-900.ibm.com/storage/cn/disk/ds4000/ds4800/TSD01054USEN.pdf
http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-59552&brandind=5000028
http://www-03.ibm.com/systems/storage/disk/ds4000/exp810/

Thursday Jun 03, 2010

Sun SPARC Enterprise T5440 World Record SPECjAppServer2004

Using five of Oracle's Sun SPARC Enterprise T5440 systems for application serving along with one of Oracle's Sun SPARC Enterprise M9000 servers for the database server allowed Oracle to deliver a world record result of 28,648.74 SPECjAppServer2004 JOPS@Standard on the SPECjAppServer2004 benchmark.

This world record result was obtained using the Oracle WebLogic 10.3.3 Application Server, a component of Oracle Fusion Middleware, and Oracle Database 11g Enterprise Edition, running on the Oracle Solaris 10 operating system.

Oracle Performance Advantages
  • This Oracle result is 26% faster than the IBM result of 22,634.13 SPECjAppServer2004 JOPS@Standard. For the application tier of the benchmark, Oracle used five Sun SPARC Enterprise T5440 servers compared to the sixteen IBM BladeCenter HS21 blades used by IBM. For the database tier, Oracle used a Sun SPARC Enterprise M9000 server compared to an IBM System p5 595 used by IBM.

  • The Oracle result is faster than the HP result of 28,463.03 SPECjAppServer2004 JOPS@Standard. For the application tier of the benchmark, Oracle used five Sun SPARC Enterprise T5440 servers compared to the seventeen HP BL870c blade servers used by HP. For the database tier, Oracle used a Sun SPARC Enterprise M9000 server compared to an HP Superdome used by HP.

Oracle's Advantages in Reduced Space and Reduced Number of Servers
  • The five Sun SPARC Enterprise T5440 servers used a total of 20 RU of space to obtain this result which is 26% less than the 27 RU space used by the three blade chassis containing sixteen IBM BladeCenter HS blades.

  • IBM used 3.2 times as many application servers as Oracle (16 blades versus 5 servers).

  • The five Sun SPARC Enterprise T5440 servers occupied 40% of the 50 RU space used by the five blade chassis containing 17 HP BL870c blade servers to obtain this leading result. 

  • HP used 3.4 times as many application servers as Oracle (17 blades versus 5 servers).

Oracle's Storage Advantages:
  • The six Sun Storage F5100 Flash Array storage used in this result occupied 6U of rack space which is 13% of the 44U space used by the database storage in the IBM result. 

  • The database storage in the HP result used 4x EVA8100 storage arrays consuming 112U of space, which is more than 18 times the 6U space used for database storage in the Oracle result.

  • The application server storage in the HP result used an EVA6100 storage array, which consumed 16U of space for JMS logs. Each of the five T5440 servers used internal SSDs for the same function, so no additional external storage was used.
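
The server-count and rack-space comparisons in the bullets above reduce to a few divisions. The following sketch recomputes them from the quoted counts; the table and function names are illustrative.

```python
# Counts and rack units quoted in the bullets above.
app_servers = {"Oracle": 5, "IBM": 16, "HP": 17}
app_tier_ru = {"Oracle": 20, "IBM": 27, "HP": 50}
db_storage_u = {"Oracle": 6, "IBM": 44, "HP": 112}


def ratio(table, vendor, baseline="Oracle"):
    """How many times larger the vendor's figure is than the baseline's."""
    return table[vendor] / table[baseline]


for vendor in ("IBM", "HP"):
    print(
        f"{vendor}: {ratio(app_servers, vendor):.1f}x the application servers, "
        f"{ratio(app_tier_ru, vendor):.2f}x the application-tier rack space, "
        f"{ratio(db_storage_u, vendor):.1f}x the database storage space"
    )
```

Inverting a ratio gives the "fraction of the space" form used in the text: 20/50 is 40% of HP's application-tier space, and 6/44 is roughly 13% of IBM's database storage space.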

Oracle Technologies Utilized:
  • Six of Oracle's Sun Storage F5100 Flash Array storage were used with Oracle 11g Enterprise Edition on the Sun SPARC Enterprise M9000 server to show outstanding database performance in this benchmark. 

  • These results were obtained using Java Platform, Standard Edition JDK 6 Update 20 on the Sun SPARC Enterprise T5440 servers and running the Oracle Solaris 10 10/09 operating system.

  • The five Sun SPARC Enterprise T5440 servers used Oracle Solaris Containers to consolidate eight Oracle WebLogic application server instances on each server to achieve this result.

  • Oracle Fusion Middleware provides a family of complete, integrated, hot pluggable and best-of-breed products known for enabling enterprise customers to create and run agile and intelligent business applications. Oracle WebLogic Server's ongoing, record-setting Java application server performance demonstrates why so many customers rely on Oracle Fusion Middleware as their foundation for innovation.

Oracle has other benchmarks that show that Oracle's "Optimized System Performance" is more important than IBM's "Per-core Performance Focus".

Performance Landscape

SPECjAppServer2004 Performance Chart as of 6/2/2010. Complete benchmark results may be found at the SPEC benchmark website http://www.spec.org. SPECjAppServer2004 JOPS@Standard (bigger is better)

| Submitter | SPECjAppServer2004 JOPS@Standard | J2EE Server | DB Server |
|---|---|---|---|
| Oracle | 28,648.74 | 5x Sun SPARC Enterprise T5440; 1.6 GHz US-T2 Plus; Oracle WebLogic 10.3.3 | 1x Sun SPARC Enterprise M9000; 2.88 GHz SPARC64-VII; Oracle 11g DB 11.1.0.7 |
| HP | 28,463.03 | 17x HP BL870c Server Blade; 1.6 GHz Itanium; Oracle WebLogic 10.3 | 1x HP Superdome; 1.6 GHz Itanium; Oracle 11g DB 11.1.0.7 |
| IBM | 22,634.13 | 16x IBM BladeCenter HS21; 3.32 GHz Intel X5470; WebSphere Application Server V7.0.0.1 | 1x IBM System p5 595; 2.1 GHz POWER5+; IBM DB2 Universal Database 9.5 FP3 |

Results and Configuration Summary

Application Server:
    5x Sun SPARC Enterprise T5440
      4 x 1.6 GHz UltraSPARC T2 Plus
      256 GB memory
      2 x 10GbE NIC
      2 x 32GB SATA SSD
      Oracle Solaris 10 10/09
      Oracle Solaris Containers
      Oracle WebLogic 10.3.3 Application Server - Standard Edition
      Oracle Fusion Middleware
      Java Platform, Standard Edition JDK 6 Update 20

Database Server:

    Sun SPARC Enterprise M9000
      64x 2.88 GHz SPARC64-VII
      2048 GB memory
      6 x Sun Storage F5100 Flash Array
      Oracle Solaris 10 10/09
      Oracle Database Enterprise Edition Release 11.1.0.7

Benchmark Description

SPECjAppServer2004 (Java Application Server) is a multi-tier benchmark for measuring the performance of Java 2 Enterprise Edition (J2EE) technology-based application servers. SPECjAppServer2004 is an end-to-end application which exercises all major J2EE technologies implemented by compliant application servers as follows:
  • The web container, including servlets and JSPs
  • The EJB container
  • EJB2.0 Container Managed Persistence
  • JMS and Message Driven Beans
  • Transaction management
  • Database connectivity
Moreover, SPECjAppServer2004 also heavily exercises all parts of the underlying infrastructure that make up the application environment, including hardware, JVM software, database software, JDBC drivers, and the system network. The primary metric of the SPECjAppServer2004 benchmark is jAppServer Operations Per Second (JOPS) which is calculated by adding the metrics of the Dealership Management Application in the Dealer Domain and the Manufacturing Application in the Manufacturing Domain. There is NO price/performance metric in this benchmark.
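
As a small illustration of how the primary metric is composed, the sketch below adds the two domain metrics. The input values are made up for the example and are not taken from any published result.

```python
def jops(dealer_ops_per_sec, manufacturing_ops_per_sec):
    """JOPS as described above: the sum of the two domain metrics."""
    return dealer_ops_per_sec + manufacturing_ops_per_sec


# Hypothetical per-domain rates, chosen only to show the addition.
example = jops(dealer_ops_per_sec=100.5, manufacturing_ops_per_sec=50.25)
print(f"{example:.2f} JOPS")
```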

Key Points and Best Practices

  • 8x Oracle WebLogic server instances on each Sun SPARC Enterprise T5440 server were hosted in 4x separate Solaris Containers to demonstrate consolidation of multiple application servers.
  • The Oracle WebLogic application servers were executed in the FX scheduling class to improve performance by reducing the frequency of context switches.
  • Enhancements to the JVM had a major impact on performance.
  • Each Sun SPARC Enterprise T5440 used 2x 10GbE NICs for network traffic from the driver systems.

Disclosure Statement

SPECjAppServer2004, 5x Sun SPARC Enterprise T5440 (4 chips, 32 cores) 28648.74 SPECjAppServer2004 JOPS@Standard; 17x HP BL870c (4 chips, 8 cores) 28463.03 SPECjAppServer2004 JOPS@Standard; 16x IBM HS21 (2 chips, 8 cores) 22634.13 SPECjAppServer2004 JOPS@Standard; SPEC, SPECjAppServer reg tm of Standard Performance Evaluation Corporation. Results from www.spec.org as of 6/2/2010.

HP C7000 Blade Chassis (10 RU each). 5x Blade Chassis total 50 RU.
HP EVA8100 2C6D Storage Array (112 disks): 2x HSV210-B controllers (2U each) and 8x M5314C Disk Enclosures (3U each) total 28 RU. 4x EVA8100 2C6D total 112 RU.
HP EVA6100 2C4D Storage Array: 2x HSV200-B controllers (2U each) and 4x M5314C Disk Enclosures (3U each) total 16 RU.
http://h20000.www2.hp.com/bc/docs/support/SupportManual/c00816246/c00816246.pdf
http://h18004.www1.hp.com/products/quickspecs/12745_div/12745_div.pdf

IBM BladeCenter H Chassis (9 RU each). 3x Chassis Total 27 RU.
IBM DS4800 Disk System Model 82 (4U each). 6x IBM DS4000 EXP810 (3U each) total 22 RU. 2x TotalStorage DS4800 total 44 RU.
http://www-03.ibm.com/systems/xbc/cog/bc_h_8852/bc_h_8852aag.html
ftp://ftp.software.ibm.com/systems/support/system_x_pdf/59y7294.pdf
ftp://ftp.software.ibm.com/systems/support/bladecenter/gc26779809.pdf

Thursday Nov 19, 2009

SPECmail2009: New World record on T5240 1.6GHz Sun 7310 and ZFS

The Sun SPARC Enterprise T5240 server running the Sun Java Messaging server 7.2 achieved a World Record SPECmail2009 result using Sun Storage 7310 Unified Storage System and ZFS file system.  Sun's OpenStorage platforms enable another world record.

  • World record SPECmail2009 benchmark using the Sun SPARC Enterprise T5240 server (two 1.6GHz UltraSPARC T2 Plus), Sun Communications Suite 7, Solaris 10, and the Sun Storage 7310 Unified Storage System achieved 14,500 SPECmail_Ent2009 users at 69,857 Sessions/Hour.

  • This SPECmail2009 benchmark result clearly demonstrates that the Sun Messaging Server 7.2, Solaris 10 and ZFS solution can support a large, enterprise level IMAP mail server environment as a low cost 'Sun on Sun' solution, delivering the best performance and maximizing data integrity and availability of Sun Open Storage and ZFS.

  • The Sun SPARC Enterprise T5240 server supported 2.4 times more users with a 2.4 times better sessions/hour rate than the Apple Xserv3,1 solution on the SPECmail2009 benchmark.

  • There are no IBM Power6 results on this benchmark.

  • The configuration using Sun OpenStorage outperformed all previous results with traditional direct attached storage and significantly higher number of disk devices.

SPECmail2009 Performance Landscape (ordered by performance)

| System | Performance: Users | Performance: Sessions/hour | Disks | OS | Messaging Server |
|---|---|---|---|---|---|
| Sun SPARC Enterprise T5240; 2 x 1.6GHz UltraSPARC T2 Plus | 14,500 | 69,857 | 58 NAS | Solaris 10 | CommSuite 7.2; Sun JMS 7.2 |
| Sun SPARC Enterprise T5240; 2 x 1.6GHz UltraSPARC T2 Plus | 12,000 | 57,758 | 80 DAS | Solaris 10 | CommSuite 5; Sun JMS 6.3 |
| Sun Fire X4275; 2 x 2.93GHz Xeon X5570 | 8,000 | 38,348 | 44 NAS | Solaris 10 | Sun JMS 6.2 |
| Apple Xserv3,1; 2 x 2.93GHz Xeon X5570 | 6,000 | 28,887 | 82 DAS | MacOS 10.6 | Dovecot 1.1.14; apple 0.5 |
| Sun SPARC Enterprise T5220; 1 x 1.4GHz UltraSPARC T2 | 3,600 | 17,316 | 52 DAS | Solaris 10 | Sun JMS 6.2 |

Complete benchmark results may be found at the SPEC benchmark website http://www.spec.org

Users - SPECmail_Ent2009 Users
Sessions/hour - SPECmail2009 Sessions/hour
NAS - Network Attached Storage
DAS - Direct Attached Storage
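
The 2.4x comparison against the Apple result can be recomputed from the landscape table. The sketch below also derives sessions per user per hour for each entry; that derived figure is only an illustration and is not a SPEC metric.

```python
# (SPECmail_Ent2009 users, SPECmail2009 sessions/hour) from the table above.
results = {
    "Sun SPARC Enterprise T5240 (NAS)": (14_500, 69_857),
    "Sun SPARC Enterprise T5240 (DAS)": (12_000, 57_758),
    "Sun Fire X4275": (8_000, 38_348),
    "Apple Xserv3,1": (6_000, 28_887),
    "Sun SPARC Enterprise T5220": (3_600, 17_316),
}

best_users, best_sessions = results["Sun SPARC Enterprise T5240 (NAS)"]
apple_users, apple_sessions = results["Apple Xserv3,1"]

user_ratio = best_users / apple_users
session_ratio = best_sessions / apple_sessions
print(f"vs Apple Xserv3,1: {user_ratio:.1f}x users, {session_ratio:.1f}x sessions/hour")

# Derived figure (not a SPEC metric): sessions per user per hour.
sessions_per_user = {
    name: sessions / users for name, (users, sessions) in results.items()
}
for name, rate in sessions_per_user.items():
    print(f"{name}: {rate:.2f} sessions per user per hour")
```

The per-user rate is nearly identical across all five results, which is why the user and session ratios track each other so closely.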

Results and Configuration Summary

Hardware Configuration:

    Sun SPARC Enterprise T5240
      2 x 1.6 GHz UltraSPARC T2 Plus processors
      128 GB memory
      2 x 146GB, 10K RPM SAS disks, 4 x 32GB SSDs

External Storage:

    2 x Sun Storage 7310 Unified Storage System, each with
      32 GB of memory
      24 x 1 TB 7200 RPM SATA Drives

Software Configuration:

    Solaris 10
    ZFS
    Sun Java Communications Suite 7 Update 2
      Sun Java System Messaging Server 7.2
      Directory Server 6.3

Benchmark Description

The SPECmail2009 benchmark measures the ability of corporate e-mail systems to meet today's demanding e-mail users over fast corporate local area networks (LAN). The SPECmail2009 benchmark simulates corporate mail server workloads that range from 250 to 10,000 or more users, using industry standard SMTP and IMAP4 protocols. This e-mail server benchmark creates client workloads based on a 40,000 user corporation, and uses folder and message MIME structures that include both traditional office documents and a variety of rich media content. The benchmark also adds support for encrypted network connections using industry standard SSL v3.0 and TLS 1.0 technology. SPECmail2009 replaces all versions of SPECmail2008, first released in August 2008. The results from the two benchmarks are not comparable.

Software on one or more client machines generates a benchmark load for a System Under Test (SUT) and measures the SUT response times. A SUT can be a mail server running on a single system or a cluster of systems.

A SPECmail2009 'run' simulates a 100% load level associated with the specific number of users, as defined in the configuration file. The mail server must maintain a specific Quality of Service (QoS) at the 100% load level to produce a valid benchmark result. If the mail server does maintain the specified QoS at the 100% load level, the performance of the mail server is reported as SPECmail_Ent2009 SMTP and IMAP Users at SPECmail2009 Sessions per hour. The SPECmail_Ent2009 users at SPECmail2009 Sessions per Hour metric reflects the unique workload combination for a SPEC IMAP4 user.

Key Points and Best Practices

  • Each Sun Storage 7310 Unified Storage System was configured with one J4400 JBOD array whose 22 x 1 TB SATA drives formed a mirrored device, and 4 shared volumes were built on that mirrored device. In total, 8 mirrored volumes from the 2 x Sun Storage 7310 systems were mounted on the system under test (SUT) using the NFSv4 protocol to hold the messaging server's mail index and mail message file systems. Four SSDs were used as the SUT internal disks, each configured as a ZFS file system. These four ZFS file systems were used for the messaging server queue, store metadata, and LDAP. The SSDs substantially reduced the store metadata and queue latencies.

  • Each Sun Storage 7310 Unified Storage System was connected to the SUT via a dual 10-Gigabit Ethernet Fiber XFP card.

  • The Sun Storage 7310 Unified Storage System software version is 2009.08.11,1-0.

  • The clients used these Java options: java -d64 -Xms4096m -Xmx4096m -XX:+AggressiveHeap

  • Substantial performance and scalability improvements were observed with Sun Communications Suite 7 Update 2, Java Messaging Server 7.2 and Directory Server 6.2.

  • See the SPEC Report for all OS, network and messaging server tunings.

Disclosure Statement

SPEC, SPECmail reg tm of Standard Performance Evaluation Corporation. Results as of 10/22/09 on www.spec.org. SPECmail2009: Sun SPARC Enterprise T5240, SPECmail_Ent2009 14,500 users at 69,857 SPECmail2009 Sessions/hour. Apple Xserv3,1, SPECmail_Ent2009 6,000 users at 28,887 SPECmail2009 Sessions/hour.

Wednesday Nov 04, 2009

New TPC-C World Record Sun/Oracle

TPC-C Sun SPARC Enterprise T5440 with Oracle RAC World Record Database Result

Sun and Oracle demonstrate the World's fastest database performance. Sun Microsystems using 12 Sun SPARC Enterprise T5440 servers, 60 Sun Storage F5100 Flash arrays and Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning delivered a world-record TPC-C benchmark result.

  • The 12-node Sun SPARC Enterprise T5440 server cluster result delivered a world record TPC-C benchmark result of 7,646,486.7 tpmC and $2.36 $/tpmC (USD) using Oracle 11g R1 on a configuration available 3/19/10.

  • The 12-node Sun SPARC Enterprise T5440 server cluster beats the performance of the IBM Power 595 (5GHz) with IBM DB2 9.5 database by 26% and has 16% better price/performance on the TPC-C benchmark.

  • The complete Oracle/Sun solution has 10.7x better computational density than the IBM configuration (computational density = performance/rack).

  • The complete Oracle/Sun solution used 8 times fewer racks than the IBM configuration.

  • The complete Oracle/Sun solution has 5.9x better power/performance than the IBM configuration.

  • The 12-node Sun SPARC Enterprise T5440 server cluster beats the performance of the HP Superdome (1.6GHz Itanium2) by 87% and has 19% better price/performance on the TPC-C benchmark.

  • The Oracle/Sun solution utilized Sun FlashFire technology to deliver this result. The Sun Storage F5100 flash array was used for database storage.

  • Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning scales and effectively uses all of the nodes in this configuration to produce the world record performance.

  • This result showed Sun and Oracle's integrated hardware and software stacks provide industry-leading performance.

More information on this benchmark will be posted in the next several days.

Performance Landscape

TPC-C results (sorted by tpmC, bigger is better)


| System | tpmC | Price/tpmC | Avail | Database | Cluster | Racks | w/KtpmC |
|---|---|---|---|---|---|---|---|
| 12 x Sun SPARC Enterprise T5440 | 7,646,487 | 2.36 USD | 03/19/10 | Oracle 11g RAC | Y | 9 | 9.6 |
| IBM Power 595 | 6,085,166 | 2.81 USD | 12/10/08 | IBM DB2 9.5 | N | 76 | 56.4 |
| HP Integrity Superdome | 4,092,799 | 2.93 USD | 08/06/07 | Oracle 10g R2 | N | 46 | to be added |

Avail - Availability date
w/KtpmC - Watts per 1000 tpmC
Racks - clients, servers, storage, infrastructure
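
Computational density is defined above as performance per rack, so it and the power comparison can be recomputed from the table. This is a rough sketch using the rounded table values; the variable names are illustrative.

```python
# tpmC, rack count and Watts per 1000 tpmC from the table above.
systems = {
    "Sun": {"tpmc": 7_646_487, "racks": 9, "w_per_ktpmc": 9.6},
    "IBM": {"tpmc": 6_085_166, "racks": 76, "w_per_ktpmc": 56.4},
}


def density(system):
    """Computational density as defined in the post: tpmC per rack."""
    return system["tpmc"] / system["racks"]


density_ratio = density(systems["Sun"]) / density(systems["IBM"])
rack_ratio = systems["IBM"]["racks"] / systems["Sun"]["racks"]
# Lower Watts per 1000 tpmC is better, so the ratio is IBM over Sun.
power_ratio = systems["IBM"]["w_per_ktpmc"] / systems["Sun"]["w_per_ktpmc"]

print(f"density: {density_ratio:.1f}x, racks: {rack_ratio:.1f}x fewer, "
      f"power/performance: {power_ratio:.1f}x")
```

With the rounded table values the density ratio comes out at about 10.6x, close to the 10.7x quoted in the bullets; the small gap presumably reflects unrounded inputs.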

Sun and IBM TPC-C Response times


| System | tpmC | Response Time, New Order 90th% | Response Time, New Order Average |
|---|---|---|---|
| 12 x Sun SPARC Enterprise T5440 | 7,646,487 | 0.170 | 0.168 |
| IBM Power 595 | 6,085,166 | 1.69 | 1.22 |
| Response Time Ratio - Sun Better | | 9.9x | 7.3x |

Sun uses the 7x comparison to highlight the difference in response times between Sun's solution and IBM's. Note, however, that Sun is 10x faster on New Order transactions at the 90th percentile.

It is also interesting to note that none of Sun's response times, average or 90th percentile, for any transaction is over 0.25 seconds, while IBM does not have even one interactive transaction, not even the menu, below 0.50 seconds. Graphs of Sun's and IBM's response times for New-Order can be found in the full disclosure reports on TPC's website TPC-C Official Result Page.

Results and Configuration Summary

Hardware Configuration:

    9 racks used to hold

    Servers:
      12 x Sun SPARC Enterprise T5440
      4 x 1.6 GHz UltraSPARC T2 Plus
      512 GB memory
      10 GbE network for cluster
    Storage:
      60 x Sun Storage F5100 Flash Array
      61 x Sun Fire X4275, Comstar SAS target emulation
      24 x Sun StorageTek 6140 (16 x 300 GB SAS 15K RPM)
      6 x Sun Storage J4400
      3 x 80-port Brocade FC switches
    Clients:
      24 x Sun Fire X4170, each with
      2 x 2.53 GHz X5540
      48 GB memory

Software Configuration:

    Solaris 10 10/09
    OpenSolaris 6/09 (COMSTAR) for Sun Fire X4275
    Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning
    Tuxedo CFS-R Tier 1
    Sun Web Server 7.0 Update 5

Benchmark Description

TPC-C is an OLTP system benchmark. It simulates a complete environment where a population of terminal operators executes transactions against a database. The benchmark is centered around the principal activities (transactions) of an order-entry environment. These transactions include entering and delivering orders, recording payments, checking the status of orders, and monitoring the level of stock at the warehouses.

See Also

Disclosure Statement

TPC Benchmark C, tpmC, and TPC-C are trademarks of the Transaction Processing Performance Council (TPC). 12-node Sun SPARC Enterprise T5440 Cluster (1.6GHz UltraSPARC T2 Plus, 4 processor) with Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning, 7,646,486.7 tpmC, $2.36/tpmC. Available 3/19/10. IBM Power 595 (5GHz Power6, 32 chips, 64 cores, 128 threads) with IBM DB2 9.5, 6,085,166 tpmC, $2.81/tpmC, available 12/10/08. HP Integrity Superdome (1.6GHz Itanium2, 64 processors, 128 cores, 256 threads) with Oracle 10g Enterprise Edition, 4,092,799 tpmC, $2.93/tpmC. Available 8/06/07. Source: www.tpc.org, results as of 11/5/09.

Wednesday Oct 14, 2009

Oracle Open World (OOW) BestPerf Index 14 October 2009

Here is a BestPerf blog index to a variety of benchmarks announced at Oracle Open World and others talked about at the conference.

ORACLEOPENWORLD

CMT Servers

Oct 11, 2009 \* TPC-C World Record Sun - Oracle \*
Oct 13, 2009 Sun T5440 Oracle BI EE Sun T5440 World Record
Oct 13, 2009 SPECweb2005 Sun T5440 World Record, Solaris Containers and Sun Storage F5100
Sep 01, 2009 String Searching - Sun T5240 & T5440 Outperform IBM Cell Broadband Engine
Aug 27, 2009 Sun T5240 Beats 4-Chip IBM Power 570 POWER6 System on SPECjbb2005
Aug 26, 2009 Sun T5220 Sets Single Chip World Record on SPECjbb2005
Aug 12, 2009 SPECmail2009 on Sun T5240 and Sun Java System Messaging Server 6.3
Jul 23, 2009 World Record Performance of Sun CMT Servers
Jul 22, 2009 Why does 1.6 beat 4.7?
Jul 21, 2009 Zeus ZXTM Traffic Manager World Record on Sun T5240
Jul 21, 2009 Sun T5440 World Record SAP-SD 4-Processor Two-tier SAP ERP 6.0 EP4 (Unicode)

SPARC64 Servers

Oct 13, 2009 SAP 2-tier SD Benchmark on Sun M9000/32 SPARC64 VII
Oct 13, 2009 Oracle PeopleSoft Payroll Sun M4000 and Sun Storage F5100 World Record Performance
Oct 12, 2009 Best Practices: M4000 Sun Storage F5100 is a good option for Peoplesoft Payroll
Oct 13, 2009 Oracle Hyperion Sun M5000 and Sun Storage 7410
Oct 13, 2009 SPECcpu2006 Results On MSeries Servers, New SPARC64 VII

X86 Servers

Oct 13, 2009 SAP 2-tier SD-Parallel on Sun Blade X6270 1-node, 2-node and 4-node
Aug 28, 2009 Sun X4270 World Record SAP-SD 2-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode)
Oct 02, 2009 Sun X4270 VMware VMmark benchmark achieves excellent result
Sep 22, 2009 Sun X4270 Virtualized for Two-tier SAP ERP 6.0 EP4 (Unicode) Standard Sales and Distribution Benchmark

HPC Benchmarks

Oct 13, 2009 Halliburton ProMAX Oil & Gas Appl on Sun 6048/X6275 Cluster and Oracle Database
Oct 13, 2009 MCAE ABAQUS faster on Sun F5100 and Sun X4270 - Single Node World Record
Oct 12, 2009 MCAE ANSYS faster on Sun F5100 and Sun X4270
Oct 12, 2009 MCAE MCS/NASTRAN faster on Sun F5100 and Fire X4270
Oct 13, 2009 CP2K Life Sciences, Ab-initio Chem - Sun C48 with Sun Blade X6275 - QDR InfiniBand
Oct 09, 2009 X6275 Cluster Demonstrates Performance and Scalability on WRF 2.5km CONUS Dataset

Specific Storage Benchmarks

Oct 12, 2009 SPC-2 Sun Storage 6180 RAID 5 & RAID 6 Over 70% Better $/Performance than IBM
Oct 12, 2009 SPC-1 Sun Storage 6180 Over 70% Better $/Performance than IBM
Oct 12, 2009 1.6 Million 4K IOPS in 1RU on Sun Storage F5100 Flash Array

Additional CMT Server Benchmarks

Jul 21, 2009 1.6 GHz SPEC CPU2006 - Rate Benchmarks
Jul 21, 2009 Sun Blade T6320 World Record SPECjbb2005 performance
Jul 21, 2009 Sun T5440 SPECjbb2005 Beats IBM POWER6 Chip-to-Chip

Tuesday Oct 13, 2009

SPECweb2005 on Sun SPARC Enterprise T5440 World Record using Solaris Containers and Sun Storage F5100 Flash

The Sun SPARC Enterprise T5440 server with 1.6GHz UltraSPARC T2 Plus processors, Solaris Containers, Sun Flash Open Storage, and Sun Java System Web Server 7.0 Update 5 achieved a World Record SPECweb2005 result.

  • Sun has obtained a World Record SPECweb2005 performance result of 100,209 SPECweb2005 on the Sun SPARC Enterprise T5440, running Solaris 10 10/09, Sun Java System Web Server 7.0 Update 5, and Java Hotspot™ Server VM.

  • This result demonstrates performance leadership of the Sun SPARC Enterprise T5440 server and its scalability, by using Solaris Containers to consolidate multiple web serving environments, and Sun OpenStorage Flash technology to store large datasets for fast data retrieval.

  • The Sun SPARC Enterprise T5440 delivers 21% greater SPECweb2005 performance than the HP DL370 G6 with 3.2GHz Xeon W5580 processors.

  • The Sun SPARC Enterprise T5440 delivers 40% greater SPECweb2005 performance than the HP DL585 G5 with four 3.11 GHz Opteron 8393 SE processors.

  • The Sun SPARC Enterprise T5440 delivers 2x the SPECweb2005 performance of the HP DL580 G5 with four 2.66GHz Xeon X7460 processors.

  • There are no IBM Power6 results on the SPECweb2005 benchmark.

  • This benchmark result clearly demonstrates that the Sun SPARC Enterprise T5440 running Solaris 10 10/09 and Sun Java System Web Server 7.0 Update 5 can support thousands of concurrent web server sessions, and that the Sun solution is an industry leader in web serving.

Performance Landscape

Server       Processor       SPECweb2005  Banking\*  Ecomm\*  Support\*  Webserver       OS
Sun T5440    4x 1.6 T2 Plus  100,209      176,500    133,000  95,000     Java WebServer  Solaris
HP DL370 G6  2x 3.2 W5580    83,073       117,120    142,080  76,352     Rock            RedHat Linux
HP DL585 G5  4x 3.11 O8393   71,629       117,504    123,072  56,320     Rock            RedHat Linux
HP DL580 G5  4x 2.66 X7460   50,013       97,632     69,600   40,800     Rock            RedHat Linux

\* Banking - SPECweb2005-Banking
   Ecomm - SPECweb2005-Ecommerce
   Support - SPECweb2005-Support
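
The comparisons quoted above use the overall SPECweb2005 column. As a rough illustration, the sketch below computes the ratio of the T5440 to each HP system for every column of the table. The function and variable names are hypothetical.

```python
# (overall SPECweb2005, Banking, Ecommerce, Support) from the table above.
scores = {
    "Sun T5440":   (100_209, 176_500, 133_000, 95_000),
    "HP DL370 G6": (83_073, 117_120, 142_080, 76_352),
    "HP DL585 G5": (71_629, 117_504, 123_072, 56_320),
    "HP DL580 G5": (50_013, 97_632, 69_600, 40_800),
}
columns = ("SPECweb2005", "Banking", "Ecomm", "Support")

def advantage(base, other):
    """Per-column ratio of `base` to `other`; values above 1.0 favor `base`."""
    return {c: b / o for c, b, o in zip(columns, scores[base], scores[other])}

for server in list(scores)[1:]:
    ratios = advantage("Sun T5440", server)
    print(server, {c: round(r, 2) for c, r in ratios.items()})
```

The per-workload columns need not all favor the same system: in the table, the HP DL370 G6 posts a higher Ecommerce figure than the T5440 even though its overall SPECweb2005 score is lower.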

Results and Configuration Summary

Hardware Configuration:

  1 Sun SPARC Enterprise T5440 with

  • 4 x UltraSPARC T2 Plus processors (8 cores, 64 threads each), 1.6 GHz
  • 254 GB memory
  • 6 x 4Gb PCI Express 8-Port Host Adapter (SG-XPCIE8SAS-E-Z)
  • 1 x Sun Storage F5100 Flash Array (TA5100RASA4-80AA)
  • 1 x Sun Storage F5100 Flash Array (TA5100RASA4-40AA)

Server Software Configuration:

  • Solaris 10 10/09
  • JAVA System Web Server 7.0 Update 5
  • Java Hotspot™ Server VM

Network configuration:

  • 1 x Arista DCS-7124s 24-port 10GbE switch
  • 1 x Cisco 2970 series (WS-C2970G-24TS-E) switch for the three 1 GbE networks

Back-end Simulator:

  1 Sun Fire X4270 with

  • 2 x 2.93 GHz Intel X5570 Quad core
  • 48GB memory
  • Solaris 10 10/09
  • JSWS 7.0 Update 5
  • Java Hotspot™ Server VM

Clients:

  8 Sun Blade™ T6320

  • 1 x 1.417 GHz UltraSPARC-T2
  • 64 GB memory
  • Solaris 10 5/09
  • Java Hotspot™ Server VM

  8 Sun Blade™ 6270

  • 2 x 2.93 GHz Intel X5570 Quad core
  • 36 GB memory
  • Solaris 10 5/09
  • Java Hotspot™ Server VM

Benchmark Description

SPECweb2005, successor to SPECweb99 and SPECweb99_SSL, is an industry standard benchmark for evaluating Web Server performance developed by SPEC. The benchmark simulates multiple user sessions accessing a Web Server and generating static and dynamic HTTP requests. The major features of SPECweb2005 are:

  • Measures simultaneous user sessions
  • Dynamic content: currently PHP and JSP implementations
  • Page images requested using 2 parallel HTTP connections
  • Multiple, standardized workloads: Banking (HTTPS), E-commerce (HTTP and HTTPS), and Support (HTTP)
  • Simulates browser caching effects
  • File accesses more accurately simulate today's disk access patterns

Key Points and Best Practices

  • The server was divided into four Solaris Containers and a single web server instance was executed in each container.
  • Four processor sets were created (with varying numbers of threads depending on the workload), one for each web server. This was done to reduce memory access latency by using the physical memory closest to the processor. All interrupts were run on the remaining threads.
  • Each web server is executed in the FX scheduling class to improve performance by reducing the frequency of context switches.
  • Two Sun Storage F5100 Flash Arrays (holding the target file set and logs) were shared by the four containers for fast data retrieval.
  • Use of Solaris Containers highlights the consolidation of multiple web serving environments on a single server.
  • Use of the Sun External I/O Expansion Unit and Sun Storage F5100 Flash Arrays highlights the expandability of the server.

    Disclosure Statement

    Sun SPARC Enterprise T5440 (32 cores, 4 chips) 100,209 SPECweb2005, was submitted to SPEC for review on October 13, 2009. HP ProLiant DL370 G6 (8 cores, 2 chips) 83,073 SPECweb2005. HP ProLiant DL585 G5 (16 cores, 4 chips) 71,629 SPECweb2005. HP ProLiant DL580 G5 (24 cores, 4 chips) 50,013 SPECweb2005. SPEC and SPECweb are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of Oct 10, 2009.

    Sunday Oct 11, 2009

    TPC-C World Record Sun - Oracle

    TPC-C Sun SPARC Enterprise T5440 with Oracle RAC World Record Database Result

    Sun and Oracle demonstrate the World's fastest database performance. Sun Microsystems using 12 Sun SPARC Enterprise T5440 servers, 60 Sun Storage F5100 Flash arrays and Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning delivered a world-record TPC-C benchmark result.

    • The 12-node Sun SPARC Enterprise T5440 server cluster delivered a world record TPC-C benchmark result of 7,646,486.7 tpmC and $2.36 $/tpmC (USD) using Oracle 11g R1 on a configuration available 3/19/10.

    • The 12-node Sun SPARC Enterprise T5440 server cluster beats the performance of the IBM Power 595 (5GHz) with IBM DB2 9.5 database by 26% and has 16% better price/performance on the TPC-C benchmark.

    • The complete Oracle/Sun solution delivered 10.7x better computational density than the IBM configuration (computational density = performance/rack).

    • The complete Oracle/Sun solution used 8 times fewer racks than the IBM configuration.

    • The complete Oracle/Sun solution has 5.9x better power/performance than the IBM configuration.

    • The 12-node Sun SPARC Enterprise T5440 server cluster beats the performance of the HP Superdome (1.6GHz Itanium2) by 87% and has 19% better price/performance on the TPC-C benchmark.

    • The Oracle/Sun solution utilized Sun FlashFire technology to deliver this result. The Sun Storage F5100 flash array was used for database storage.

    • Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning scales and effectively uses all of the nodes in this configuration to produce the world record performance.

    • This result showed Sun and Oracle's integrated hardware and software stacks provide industry-leading performance.

    More information on this benchmark will be posted in the next several days.

    Performance Landscape

    TPC-C results (sorted by tpmC, bigger is better)


    System                           tpmC       Price/tpmC  Avail     Database        Cluster  Racks  w/KtpmC
    12 x Sun SPARC Enterprise T5440  7,646,487  2.36 USD    03/19/10  Oracle 11g RAC  Y        9      9.6
    IBM Power 595                    6,085,166  2.81 USD    12/10/08  IBM DB2 9.5     N        76     56.4
    Bull Escala PL6460R              6,085,166  2.81 USD    12/15/08  IBM DB2 9.5     N        71     56.4
    HP Integrity Superdome           4,092,799  2.93 USD    08/06/07  Oracle 10g R2   N        46     to be added

    Avail - Availability date
    w/KtpmC - Watts per 1000 tpmC
    Racks - clients, servers, storage, infrastructure
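
The rack and power comparisons against IBM can be derived from the Racks and w/KtpmC columns. The sketch below is illustrative only (the names are hypothetical) and uses the definition given in the post, computational density = performance/rack.

```python
# tpmC, racks and watts per 1000 tpmC, copied from the table above.
sun = {"tpmC": 7_646_487, "racks": 9, "w_per_ktpmc": 9.6}
ibm = {"tpmC": 6_085_166, "racks": 76, "w_per_ktpmc": 56.4}

def density(system):
    """Computational density as defined in the post: tpmC per rack."""
    return system["tpmC"] / system["racks"]

density_ratio = density(sun) / density(ibm)
rack_ratio = ibm["racks"] / sun["racks"]
power_perf_ratio = ibm["w_per_ktpmc"] / sun["w_per_ktpmc"]

print(f"density: {density_ratio:.1f}x, racks: {rack_ratio:.1f}x fewer, "
      f"power/performance: {power_perf_ratio:.1f}x")
```

With the rounded values in the table, the density ratio comes out near 10.6x, close to the 10.7x quoted above; the rack and power/performance ratios come out at about 8.4x and 5.9x.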

    Results and Configuration Summary

    Hardware Configuration:

      9 racks used to hold

      Servers:
        12 x Sun SPARC Enterprise T5440
        4 x 1.6 GHz UltraSPARC T2 Plus
        512 GB memory
        10 GbE network for cluster
      Storage:
        60 x Sun Storage F5100 Flash Array
        61 x Sun Fire X4275, Comstar SAS target emulation
        24 x Sun StorageTek 6140 (16 x 300 GB SAS 15K RPM)
        6 x Sun Storage J4400
        3 x 80-port Brocade FC switches
      Clients:
        24 x Sun Fire X4170, each with
        2 x 2.53 GHz X5540
        48 GB memory

    Software Configuration:

      Solaris 10 10/09
      OpenSolaris 6/09 (COMSTAR) for Sun Fire X4275
      Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning
      Tuxedo CFS-R Tier 1
      Sun Web Server 7.0 Update 5

    Benchmark Description

    TPC-C is an OLTP system benchmark. It simulates a complete environment where a population of terminal operators executes transactions against a database. The benchmark is centered around the principal activities (transactions) of an order-entry environment. These transactions include entering and delivering orders, recording payments, checking the status of orders, and monitoring the level of stock at the warehouses.

    POSTSCRIPT: Here are some comments on IBM's grasping-at-straws perf/core attacks on the TPC-C result:
    c0t0d0s0 blog: "IBM's Reaction to Sun&Oracle TPC-C"

    See Also

    Disclosure Statement

    TPC Benchmark C, tpmC, and TPC-C are trademarks of the Transaction Processing Performance Council (TPC). 12-node Sun SPARC Enterprise T5440 Cluster (1.6GHz UltraSPARC T2 Plus, 4 processor) with Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning, 7,646,486.7 tpmC, $2.36/tpmC. Available 3/19/10. IBM Power 595 (5GHz Power6, 32 chips, 64 cores, 128 threads) with IBM DB2 9.5, 6,085,166 tpmC, $2.81/tpmC, available 12/10/08. HP Integrity Superdome (1.6GHz Itanium2, 64 processors, 128 cores, 256 threads) with Oracle 10g Enterprise Edition, 4,092,799 tpmC, $2.93/tpmC. Available 8/06/07. Source: www.tpc.org, results as of 10/11/09.

    Tuesday Sep 01, 2009

    String Searching - Sun T5240 & T5440 Outperform IBM Cell Broadband Engine

    Significance of Results

    Sun SPARC Enterprise T5220, T5240 and T5440 servers ran benchmarks using the Aho-Corasick string searching algorithm. String searching, or pattern matching, is important to a variety of commercial, government and HPC applications. One of the core functions needed for text identification algorithms in data repositories is real-time string searching. For this benchmark, the IBM, HP and Sun systems used the Aho-Corasick algorithm for string searching.

    Sun SPARC Enterprise T5440

    • A 1.6 GHz Sun SPARC Enterprise T5440 server could search a book as tall as Mt. Everest (29,208 feet, 861 GB book) in 61 seconds, which corresponds to a string search rate of 14.2 GB/s.

    • A 1.6 GHz Sun SPARC Enterprise T5440 server can search at a rate of 14.2 GB/s, which corresponds to searching a book containing one terabyte of data (34,745 feet high) in only 70 seconds.

    • The 4-chip 1.6 GHz Sun SPARC Enterprise T5440 server performed string searching at a rate of 14.2 GB/s, which is 29.9 times as fast as the 2-chip IBM Cell Broadband Engine DD3 Blade that performed string searching at a rate of 0.475 GB/s.

    • The 4-chip 1.6 GHz Sun SPARC Enterprise T5440 server performed string searching 3.7 times as fast as the 4-chip HP DL-580 (2.93 GHz Xeon QC) server that performed string searching at a rate of 3.87 GB/s. The 1.6 GHz Sun SPARC Enterprise T5440 server has a 1.7 times advantage in delivered power-performance over the HP DL-580 (using a power consumption rate of 830 watts for the HP system as measured on other tests).

    • The 1.6 GHz Sun SPARC Enterprise T5440 server demonstrated a 12% improvement over the 1.4 GHz Sun SPARC Enterprise T5440.

    • The 1.6 GHz Sun SPARC Enterprise T5440 server demonstrated a 2x speedup over the 1.6 GHz Sun SPARC Enterprise T5240 server which demonstrated a 2.3x speedup over the 1.4 GHz Sun SPARC Enterprise T5220 server.

    Sun SPARC Enterprise T5240

    • The 2-chip 1.6 GHz Sun SPARC Enterprise T5240 server performed string searching at a rate of 7.22 GB/s which is 15.4 times as fast as the 2-chip IBM Cell Broadband Engine DD3 Blade that performed string searching at a rate of 0.475 GB/s.

    • The 2-chip 1.6 GHz Sun SPARC Enterprise T5240 server performed string searching 1.9 times as fast as the 4-chip HP DL-580 (2.93 GHz Xeon QC) server that performed string searching at a rate of 3.87 GB/s. The 1.6 GHz Sun SPARC Enterprise T5240 server has a 2.4 times advantage in delivered power-performance over the HP DL-580 (using a power consumption rate of 830 watts for the HP system as measured on other tests).

    • The 1.6 GHz Sun SPARC Enterprise T5240 server demonstrated a 14% speedup over the 1.4 GHz Sun SPARC Enterprise T5240 server.

    Sun SPARC Enterprise T5220

    • The 1-chip 1.4 GHz Sun SPARC Enterprise T5220 server performed string searching at a rate of 3.16 GB/s which is 6.7 times as fast as the 2-chip IBM Cell Broadband Engine DD3 Blade that performed string searching at a rate of 0.475 GB/s.

    Performance Landscape

    System                                         Throughput (GB/sec)  Chips  Cores
    Sun SPARC Enterprise T5440 (1.6 GHz)           14.2                 4      32
    Sun SPARC Enterprise T5440 (1.4 GHz)           12.7                 4      32
    Sun SPARC Enterprise T5240 (1.6 GHz)           7.2                  2      16
    Sun SPARC Enterprise T5240 (1.4 GHz)           6.4                  2      16
    HP DL-580 (2.9 GHz)                            3.9                  4      16
    Sun SPARC Enterprise T5220 (1.4 GHz)           3.2                  1      8
    IBM Cell Broadband Engine DD3 Blade (3.2 GHz)  0.475                2      16
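
The "book" illustrations above are throughput figures turned into elapsed time. A small sketch of that arithmetic follows; the helper names are hypothetical, and one terabyte is assumed to mean 1000 GB, which matches the 70-second figure.

```python
# Throughput in GB/s from the Performance Landscape table above.
throughput_gbps = {
    "T5440 (1.6 GHz)": 14.2,
    "T5440 (1.4 GHz)": 12.7,
    "T5240 (1.6 GHz)": 7.2,
    "T5240 (1.4 GHz)": 6.4,
    "HP DL-580 (2.9 GHz)": 3.9,
    "T5220 (1.4 GHz)": 3.2,
    "IBM Cell DD3 Blade": 0.475,
}

def search_seconds(size_gb, rate_gbps):
    """Time to scan `size_gb` gigabytes at a sustained rate of `rate_gbps`."""
    return size_gb / rate_gbps

def speedup(system, baseline="IBM Cell DD3 Blade"):
    """Throughput of `system` relative to `baseline`."""
    return throughput_gbps[system] / throughput_gbps[baseline]

rate = throughput_gbps["T5440 (1.6 GHz)"]
print(f"861 GB:  {search_seconds(861, rate):.0f} s")
print(f"1000 GB: {search_seconds(1000, rate):.0f} s")
print(f"T5440 vs IBM Cell: {speedup('T5440 (1.6 GHz)'):.1f}x")
```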

    Results and Configuration Summary

    Hardware Configuration:
      Sun SPARC Enterprise T5440 (1.6 GHz)
        4 x 1.6 GHz UltraSPARC T2 Plus processors
        256 GB
      Sun SPARC Enterprise T5440 (1.4 GHz)
        4 x 1.4 GHz UltraSPARC T2 Plus processors
        128 GB
      Sun SPARC Enterprise T5240 (1.6 GHz)
        2 x 1.6 GHz UltraSPARC T2 Plus processors
        64 GB
      Sun SPARC Enterprise T5240 (1.4 GHz)
        2 x 1.4 GHz UltraSPARC T2 Plus processors
        64 GB
      Sun SPARC Enterprise T5220 (1.4 GHz)
        1 x 1.4 GHz UltraSPARC T2 processor
        32 GB

    Software Configuration:

      Sun SPARC Enterprise T5440 (1.6 GHz)
        OpenSolaris 2009.06
        Sun Studio 12 (Sun C 5.9 2007.05)
      Sun SPARC Enterprise T5440 (1.4 GHz)
        Solaris 10 2008.07
        Sun Studio 12 (Sun C 5.9 2007.05)
      Sun SPARC Enterprise T5240 (1.6 GHz)
        OpenSolaris 2009.06
        Sun Studio 12 (Sun C 5.9 2007.05)
      Sun SPARC Enterprise T5240 (1.4 GHz)
        Solaris 10 2008.07
        Sun Studio 12 (Sun C 5.9 2007.05)
      Sun SPARC Enterprise T5220 (1.4 GHz)
        Solaris 10 2008.07
        Sun Studio 12 (Sun C 5.9 2007.05)

    Benchmark Description

    One of the core functions needed for text identification algorithms in data repositories is real-time string searching. This string searching benchmark demonstrates the usefulness of Sun's UltraSPARC T2 and T2 Plus processors for both ease of code creation and speed of code execution. In IEEE Computer, Volume 41, Number 4, pp. 42-50, April 2008, IBM describes a variant of the Aho-Corasick string searching algorithm that uses deterministic finite automata. The algorithm first constructs a graph that represents a dictionary, then walks that graph using successive input characters from a text file. Each "state" in the graph includes a state transition table (STT) that is accessed using the next input character from the text file to determine the address of the next state in the graph. IBM defines an automaton as a two-step loop that: (1) obtains the address of the next state from the STT, and (2) fetches the next state in the graph.
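
To make the two-step loop concrete, here is a minimal Python sketch of an Aho-Corasick automaton in deterministic form, where every state carries a full state transition table. It is an illustration only: it is neither Sun's ANSI C implementation nor IBM's optimized variant, and the function names `build_automaton` and `search` are hypothetical.

```python
from collections import deque

def build_automaton(words, alphabet_size=256):
    """Build an Aho-Corasick DFA over bytes.

    Returns (stt, outputs): stt[state][byte] is the next state, and
    outputs[state] lists the dictionary words that end at that state.
    """
    stt = [[-1] * alphabet_size]          # state 0 is the root
    outputs = [[]]
    # 1. Insert every dictionary word into a trie.
    for word in words:
        state = 0
        for byte in word:
            if stt[state][byte] == -1:
                stt.append([-1] * alphabet_size)
                outputs.append([])
                stt[state][byte] = len(stt) - 1
            state = stt[state][byte]
        outputs[state].append(word)
    # 2. Breadth-first pass: resolve missing transitions through failure
    #    links, so the search loop never has to backtrack.
    fail = [0] * len(stt)
    queue = deque()
    for byte in range(alphabet_size):
        nxt = stt[0][byte]
        if nxt == -1:
            stt[0][byte] = 0
        else:
            queue.append(nxt)
    while queue:
        state = queue.popleft()
        outputs[state] = outputs[state] + outputs[fail[state]]
        for byte in range(alphabet_size):
            nxt = stt[state][byte]
            if nxt == -1:
                stt[state][byte] = stt[fail[state]][byte]
            else:
                fail[nxt] = stt[fail[state]][byte]
                queue.append(nxt)
    return stt, outputs

def search(text, stt, outputs):
    """Yield (end_index, word) for every dictionary hit in `text` (bytes)."""
    state = 0
    for i, byte in enumerate(text):
        state = stt[state][byte]      # (1) get the next state from the STT
        for word in outputs[state]:   # (2) visit that state, report matches
            yield i, word

stt, outputs = build_automaton([b"he", b"she", b"his", b"hers"])
hits = list(search(b"ushers", stt, outputs))
print(hits)
```

Because the table is complete for every state, each input byte costs one table lookup, which is why the size and cache behavior of the dictionary graph matter so much for throughput.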

    IBM reports the performance of its Cell Broadband Engine (CBE) to execute this algorithm to search a 4.4 MB version of the King James Bible using a dictionary of the 20,000 most used words in the English language (average word length of 7.59 characters). Each of the 8 synergistic processing elements (SPEs) of each of the two CBEs executes 16 automata, for a total of 256 automata. All automata and hence all SPEs access a single, shared dictionary.

    IBM describes elaborate optimizations of the Aho-Corasick algorithm, including state shuffling, state replication, alphabet shuffling and state caching. These optimizations were required to: (1) overcome "memory congestion", i.e., contention amongst the SPEs for access to the shared dictionary, and (2) compensate for the limited local storage that is associated with each SPE. These optimizations were necessary to achieve the performance reported for the CBE DD3 Blade.

    IBM does not provide references that indicate where to obtain the dictionary and Bible. IBM reports the algorithmic performance in Gbits/s but does not indicate whether an 8-bit byte is extended to 10 bits as required for network transmission.

    In order to closely approximate the dictionary and Bible that were used by IBM, Sun used a dictionary of 25,143 English words (the OpenSolaris file cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/spell/list) for which the average word length is 7.2 characters, and a 4.6 MB version of the King James Bible (www.patriot.net/users/bmcgin/kjv12.zip). For reporting of results in Gbits/s, the length of a byte is assumed to be 8 bits.

    Key Points and Best Practices

    • Power was measured during execution of the Aho-Corasick algorithm using a WattsUp power meter, and the average rate of power consumption is presented.

    • The Aho-Corasick algorithm as deployed on the IBM Cell Broadband Engine DD3 Blade required substantial optimization and tuning to achieve the reported performance, whereas on the Sun SPARC Enterprise T5220, T5240 or T5440 servers only a basic implementation of the algorithm and a simple compilation were needed.

    • In order to demonstrate the usefulness of Sun's UltraSPARC T2 and T2 Plus processors for both ease of code generation and speed of code execution, Sun implemented the Aho-Corasick algorithm using ANSI C. No optimizations of the algorithm were required to achieve the performance reported for the T5220, T5240 and T5440. The source code was compiled using the -m32 -xO3 and -xopenmp options. The dictionary is represented using a graph that comprises 82 MB. Each core of the T5220, T5240 or T5440 executes 8 automata using one OpenMP thread per automaton. Thus, the T5220 executes 64 total automata, the T5240 executes 128 total automata and the T5440 executes 256 total automata. All automata and hence all cores access a single, shared dictionary. Access to this dictionary is accelerated by the large, shared L2 caches of the Sun SPARC Enterprise T5220, T5240 and T5440.
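
The automata counts quoted above follow from the chip and core counts. The short sketch below recomputes them and, for comparison, the count described for IBM's Cell configuration in the benchmark description. The variable names are illustrative.

```python
AUTOMATA_PER_CORE = 8   # stated above: one OpenMP thread per automaton
CORES_PER_CHIP = 8      # UltraSPARC T2 / T2 Plus, per the table (8 cores per chip)
chips = {"T5220": 1, "T5240": 2, "T5440": 4}

sun_automata = {
    server: n_chips * CORES_PER_CHIP * AUTOMATA_PER_CORE
    for server, n_chips in chips.items()
}

# IBM's configuration as described in the benchmark description:
# 2 Cell processors x 8 SPEs each x 16 automata per SPE.
ibm_automata = 2 * 8 * 16

print(sun_automata)
print(ibm_automata)
```

On these figures the T5440 runs the same number of automata as the two-chip Cell blade, against a single shared dictionary in both cases.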

    See Also

    Thursday Aug 27, 2009

    Sun SPARC Enterprise T5240 with 1.6GHz UltraSPARC T2 Plus Beats 4-Chip IBM Power 570 POWER6 System on SPECjbb2005

    Significance of Results

    A Sun SPARC Enterprise T5240 server equipped with two UltraSPARC T2 Plus processors at 1.6GHz delivered a result of 422782 SPECjbb2005 bops, 26424 SPECjbb2005 bops/JVM. The Sun SPARC Enterprise T5240 consumed an average of 875 Watts of power during the execution of the benchmark.

    • The Sun SPARC Enterprise T5240 server running 2x 1.6 GHz UltraSPARC T2 Plus processors delivered 5% better performance than an IBM Power 570 with 4x 4.7 GHz POWER6 processors as measured by the SPECjbb2005 benchmark.

    • The Sun SPARC Enterprise T5240 server equipped with two UltraSPARC T2 Plus processors at 1.6GHz demonstrated 10% better performance than the Sun SPARC Enterprise T5240 server equipped with two UltraSPARC T2 Plus processors at 1.4GHz.
    • One Sun SPARC Enterprise T5240 (two 1.6GHz UltraSPARC T2 Plus chips, 2RU) has 2.3 times the power/performance of the IBM Power 570 (8RU) that used four 4.7GHz POWER6 chips.
    • The Sun SPARC Enterprise T5240 used OpenSolaris 2009.06 and the Sun JDK 1.6.0_14 Performance Release to obtain this result.

    Performance Landscape

    SPECjbb2005 Performance Chart (ordered by performance), select results presented.

    bops : SPECjbb2005 Business Operations per Second (bigger is better)

    System                      Processors                                      Performance
                                Chips  Cores  Threads  GHz  Type                bops    bops/JVM
    Sun SPARC Enterprise T5240  2      16     128      1.6  UltraSPARC T2 Plus  422782  26424
    IBM Power 570               4      8      16       4.7  POWER6              402923  100731
    Sun SPARC Enterprise T5240  2      16     128      1.4  UltraSPARC T2 Plus  384934  24058

    Complete benchmark results may be found at the SPEC benchmark website http://www.spec.org.

    Results and Configuration Summary

    Hardware Configuration:

      Sun SPARC Enterprise T5240
        2 x 1.6 GHz UltraSPARC T2 Plus processors
        64 GB

    Software Configuration:

      OpenSolaris 2009.06
      Java HotSpot(TM) 32-Bit Server, Version 1.6.0_14 Performance Release

    Benchmark Description

    SPECjbb2005 (Java Business Benchmark) measures the performance of a Java implemented application tier (server-side Java). The benchmark is based on the order processing in a wholesale supplier application. The performance of the user tier and the database tier are not measured in this test. The metrics given are number of SPECjbb2005 bops (Business Operations per Second) and SPECjbb2005 bops/JVM (bops per JVM instance).
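
Assuming bops/JVM is simply the total bops divided by the number of JVM instances, as the metric description suggests, the instance count behind each result can be recovered from the two reported numbers. A small sketch, with hypothetical names:

```python
# (bops, bops/JVM) from the performance chart above.
results = {
    "Sun T5240 (1.6 GHz)": (422782, 26424),
    "IBM Power 570": (402923, 100731),
    "Sun T5240 (1.4 GHz)": (384934, 24058),
}

def jvm_instances(bops, bops_per_jvm):
    """Number of JVM instances implied by the two reported metrics."""
    return round(bops / bops_per_jvm)

for system, (bops, per_jvm) in results.items():
    print(f"{system}: {jvm_instances(bops, per_jvm)} JVMs")
```

For the T5240 this works out to one JVM per core, which is in line with the per-core binding of JVMs noted under Key Points and Best Practices.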

    Key Points and Best Practices

    • Each JVM executed in the FX scheduling class to improve performance by reducing the frequency of context switches.
    • Each JVM was bound to a separate processor set containing 1 core to reduce memory access latency by using the physical memory closest to the processor.

    See Also

    Disclosure Statement

    SPEC and SPECjbb are registered trademarks of the Standard Performance Evaluation Corporation. Results as of 8/25/2009 on http://www.spec.org.
    Sun SPARC T5240 (2 chips, 16 cores) 422782 SPECjbb2005 bops, 26424 SPECjbb2005 bops/JVM; Sun SPARC T5240 (2 chips, 16 cores) 384934 SPECjbb2005 bops, 24058 SPECjbb2005 bops/JVM; IBM Power 570 (4 chips, 8 cores) 402923 SPECjbb2005 bops, 100731 SPECjbb2005 bops/JVM.

    Sun watts were measured on the system during the test.

    IBM p 570 4P (2 building blocks) power specifications calculated as 80% of maximum input power reported 7/8/09 in 'Facts and Features Report': ftp://ftp.software.ibm.com/common/ssi/pm/br/n/psb01628usen/PSB01628USEN.PDF

    Wednesday Aug 26, 2009

    Sun SPARC Enterprise T5220 with 1.6GHz UltraSPARC T2 Sets Single Chip World Record on SPECjbb2005

    Significance of Results

    A Sun SPARC Enterprise T5220 server equipped with one UltraSPARC T2 processor at 1.6GHz delivered a World Record single-chip result of 231464 SPECjbb2005 bops, 28933 SPECjbb2005 bops/JVM. The Sun SPARC Enterprise T5220 consumed an average of 520 Watts of power during the execution of this benchmark.

    • The Sun SPARC Enterprise T5220 server (one 1.6 GHz UltraSPARC T2 chip) demonstrated 3% better performance than the Fujitsu TX100 result of 223691 SPECjbb2005 bops, which used one 3.16 GHz Xeon X3380 processor.
    • The Sun SPARC Enterprise T5220 server (one 1.6 GHz UltraSPARC T2 chip) demonstrated 8% better performance than the IBM x3200 result of 214578 SPECjbb2005 bops, which used one 3.16 GHz Xeon X3380 processor.
    • The Sun SPARC Enterprise T5220 server (one 1.6 GHz UltraSPARC T2 chip) demonstrated 10% better performance than the Fujitsu RX100 result of 211144 SPECjbb2005 bops, which used one 3.16 GHz Xeon X3380 processor.
    • The Sun SPARC Enterprise T5220 server (one 1.6 GHz UltraSPARC T2 chip) demonstrated 19% better performance than the IBM x3350 result of 194256 SPECjbb2005 bops, which used one 3 GHz Xeon X3370 processor.
    • The Sun SPARC Enterprise T5220 server (one 1.6 GHz UltraSPARC T2 chip) demonstrated 2.6x the performance of the IBM p570 result of 88089 SPECjbb2005 bops, which used one 4.7 GHz POWER6 processor.
    • One Sun SPARC Enterprise T5220 (one 1.6GHz UltraSPARC T2 chip, 2RU) has 2.1 times the power/performance of the IBM Power 570 (4RU) that used two 4.7GHz POWER6 chips.
    • The Sun SPARC Enterprise T5220 used OpenSolaris 2009.06 and the Sun JDK 1.6.0_14 Performance Release to obtain this result.

    Performance Landscape

    SPECjbb2005 Performance Chart (ordered by performance)

    bops : SPECjbb2005 Business Operations per Second (bigger is better)

    System                      Processors                                  Performance
                                Chips  Cores  Threads  GHz   Type           bops    bops/JVM
    Sun SPARC Enterprise T5220  1      8      64       1.6   UltraSPARC T2  231464  28933
    Sun Blade T6320             1      8      64       1.6   UltraSPARC T2  229576  28697
    Fujitsu TX100               1      4      4        3.16  Intel Xeon     223691  111846
    IBM x3200 M2                1      4      4        3.16  Intel Xeon     214578  107289
    Fujitsu RX100               1      4      4        3.16  Intel Xeon     211144  105572
    IBM Power 570               2      4      8        4.7   POWER6         205917  102959
    IBM x3350                   1      4      4        3.0   Intel Xeon     194256  97128
    Sun SPARC Enterprise T5220  1      8      64       1.4   UltraSPARC T2  192055  24007
    IBM Power 570               1      2      4        4.7   POWER6         88089   88089
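
Since the record claimed here is a single-chip one, it can help to normalize the chart by chip count. The sketch below does that for a few rows; the helper names are hypothetical and the row selection is arbitrary.

```python
# (chips, bops) for selected rows of the chart above.
rows = {
    "Sun T5220 (1.6 GHz)": (1, 231464),
    "Fujitsu TX100": (1, 223691),
    "IBM Power 570 (2 chips)": (2, 205917),
    "IBM Power 570 (1 chip)": (1, 88089),
}

def bops_per_chip(name):
    """SPECjbb2005 bops divided by the number of chips in the system."""
    chips, bops = rows[name]
    return bops / chips

def lead_pct(name, leader="Sun T5220 (1.6 GHz)"):
    """Percent by which the leader's per-chip bops exceed those of `name`."""
    return (bops_per_chip(leader) / bops_per_chip(name) - 1.0) * 100.0

for name in rows:
    print(f"{name}: {bops_per_chip(name):,.1f} bops/chip")
```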

    Complete benchmark results may be found at the SPEC benchmark website http://www.spec.org.

    Results and Configuration Summary

    Hardware Configuration:

      Sun SPARC Enterprise T5220
        1x 1.6 GHz UltraSPARC T2 processor
        64 GB

    Software Configuration:

      OpenSolaris 2009.06
      Java HotSpot(TM) 32-Bit Server, Version 1.6.0_14 Performance Release

    Benchmark Description

    SPECjbb2005 (Java Business Benchmark) measures the performance of a Java implemented application tier (server-side Java). The benchmark is based on the order processing in a wholesale supplier application. The performance of the user tier and the database tier are not measured in this test. The metrics given are number of SPECjbb2005 bops (Business Operations per Second) and SPECjbb2005 bops/JVM (bops per JVM instance).

    Key Points and Best Practices

    • Each JVM executed in the FX scheduling class to improve performance by reducing the frequency of context switches.
    • Each JVM was bound to a separate processor set containing 1 core to reduce memory access latency by using the physical memory closest to the processor.

    See Also

    Disclosure Statement

    SPEC and SPECjbb are registered trademarks of the Standard Performance Evaluation Corporation. Results as of 8/25/2009 on http://www.spec.org.
    Sun SPARC T5220 231464 SPECjbb2005 bops, 28933 SPECjbb2005 bops/JVM Submitted to SPEC for review; IBM p 570 88089 SPECjbb2005 bops, 88089 SPECjbb2005 bops/JVM; Fujitsu TX100 223691 SPECjbb2005 bops, 111846 SPECjbb2005 bops/JVM; IBM x3350 194256 SPECjbb2005 bops, 97128 SPECjbb2005 bops/JVM; Sun SPARC Enterprise T5120 192055 SPECjbb2005 bops, 24007 SPECjbb2005 bops/JVM.

    Sun watts were measured on the system during the test.

    IBM p 570 2P (1 building block) power specifications calculated as 80% of maximum input power reported 7/8/09 in "Facts and Features Report": ftp://ftp.software.ibm.com/common/ssi/pm/br/n/psb01628usen/PSB01628USEN.PDF

    Wednesday Aug 12, 2009

    SPECmail2009 on Sun SPARC Enterprise T5240 and Sun Java System Messaging Server 6.3

    Significance of Results

    The Sun SPARC Enterprise T5240 server running Sun Java System Messaging Server 6.3 achieved World Record SPECmail2009 results using ZFS.

    • A Sun SPARC Enterprise T5240 server powered by two 1.6 GHz UltraSPARC T2 Plus processors running the Sun Java Communications Suite 5 software along with the Solaris 10 Operating System and using six Sun StorageTek 2540 arrays achieved a new World Record 12000 SPECmail_Ent2009 IMAP4 users at 57,758 Sessions/hour for SPECmail2009.
    • The Sun SPARC Enterprise T5240 server achieved twice the number of users and twice the sessions/hour rate of the Apple Xserv3,1 solution equipped with Intel Nehalem processors.
    • The Sun result was obtained using ~10% fewer disk spindles with the Sun StorageTek 2540 RAID controller direct attach storage solution versus Apple's direct attached storage.
    • This benchmark result demonstrates that the Sun SPARC Enterprise T5240 server together with Sun Java Communication Suite 5 component Sun Java System Messaging Server 6.3, Solaris 10 and ZFS on Sun StorageTek 2540 arrays supports a large, enterprise level IMAP mail server environment. This solution is reliable, low cost, and low power, delivering the best performance and maximizing the data integrity with Sun's ZFS file systems.

    Performance Landscape

    SPECmail2009 (ordered by performance)

    System | Processor Type | GHz | Ch, Co, Th | SPECmail_Ent2009 Users | SPECmail2009 Sessions/hour
    Sun SPARC Enterprise T5240 | UltraSPARC T2 Plus | 1.6 | 2, 16, 128 | 12,000 | 57,758
    Sun Fire X4275 | Xeon X5570 | 2.93 | 2, 8, 16 | 8,000 | 38,348
    Apple Xserv3,1 | Xeon X5570 | 2.93 | 2, 8, 16 | 6,000 | 28,887
    Sun SPARC Enterprise T5220 | UltraSPARC T2 | 1.4 | 1, 8, 64 | 3,600 | 17,316

    Notes:

      Number of SPECmail_Ent2009 users (bigger is better)
      SPECmail2009 Sessions/hour (bigger is better)
      Ch, Co, Th: Chips, Cores, Threads

    Complete benchmark results may be found at the SPEC benchmark website http://www.spec.org
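
    The "twice the users and sessions/hour" comparison can be checked directly from the published figures. The following is a minimal sketch in Python that only re-uses the numbers from the table above; the dictionary layout and the helper name ratio_vs are illustrative and not part of any SPEC tooling.

```python
# (users, sessions/hour) copied from the SPECmail2009 table above.
results = {
    "Sun SPARC Enterprise T5240": (12_000, 57_758),
    "Sun Fire X4275": (8_000, 38_348),
    "Apple Xserv3,1": (6_000, 28_887),
    "Sun SPARC Enterprise T5220": (3_600, 17_316),
}


def ratio_vs(system, baseline):
    """Return (user ratio, sessions/hour ratio) of system relative to baseline."""
    users, sessions = results[system]
    base_users, base_sessions = results[baseline]
    return users / base_users, sessions / base_sessions


user_ratio, session_ratio = ratio_vs("Sun SPARC Enterprise T5240", "Apple Xserv3,1")
print(f"T5240 vs Xserv3,1: {user_ratio:.2f}x users, {session_ratio:.2f}x sessions/hour")

# Sessions per user per hour is nearly constant across the published results.
for name, (users, sessions) in results.items():
    print(f"{name}: {sessions / users:.2f} sessions/hour per user")
```

    The sessions-per-user figure comes out close to 4.8 for every system, which suggests that the user count is the number to compare across results.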

    Results and Configuration Summary

    Hardware Configuration:

      Sun SPARC Enterprise T5240

        2 x 1.6 GHz UltraSPARC T2 Plus processors
        128 GB
        8 x 146GB, 10K RPM SAS disks

      6 x Sun StorageTek 2540 Arrays,

        4 arrays with 12 x 146GB 15K RPM SAS disks
        2 arrays with 12 x 73GB 15K RPM SAS disks

      2 x Sun Fire X4600 benchmark manager, load generator and mail sink

        8 x AMD Opteron 8356 2.7 GHz QC processors
        64 GB
        2 x 73GB 10K RPM SAS disks

      Sun Fire X4240 load generator

        2 x AMD Opteron 2384 2.7 GHz DC processors
        16 GB
        2 x 73GB 10K RPM SAS disks

    Software Configuration:

      Solaris 10
      ZFS
      Sun Java Communication Suite 5
      Sun Java System Messaging Server 6.3

    Benchmark Description

    The SPECmail2009 benchmark measures the ability of corporate e-mail systems to meet today's demanding e-mail users over fast corporate local area networks (LAN). The SPECmail2009 benchmark simulates corporate mail server workloads that range from 250 to 10,000 or more users, using industry standard SMTP and IMAP4 protocols. This e-mail server benchmark creates client workloads based on a 40,000 user corporation, and uses folder and message MIME structures that include both traditional office documents and a variety of rich media content. The benchmark also adds support for encrypted network connections using industry standard SSL v3.0 and TLS 1.0 technology. SPECmail2009 replaces all versions of SPECmail2008, first released in August 2008. The results from the two benchmarks are not comparable.

    Software on one or more client machines generates a benchmark load for a System Under Test (SUT) and measures the SUT response times. A SUT can be a mail server running on a single system or a cluster of systems.

    A SPECmail2009 'run' simulates a 100% load level associated with the specific number of users, as defined in the configuration file. The mail server must maintain a specific Quality of Service (QoS) at the 100% load level to produce a valid benchmark result. If the mail server does maintain the specified QoS at the 100% load level, the performance of the mail server is reported as SPECmail_Ent2009 SMTP and IMAP Users at SPECmail2009 Sessions per hour. The SPECmail_Ent2009 users at SPECmail2009 Sessions per Hour metric reflects the unique workload combination for a SPEC IMAP4 user.

    Key Points and Best Practices

    • Each Sun StorageTek 2540 array was configured with 6 hardware RAID1 volumes, for a total of 36 RAID1 volumes: 24 of size 146GB and 12 of size 73GB. Four ZPOOLs of (6x146GB RAID1 volumes) were mounted as the four primary message stores and ZFS file systems. Four ZPOOLs of (8x73GB RAID1 volumes) were mounted as the four primary message indexes. The hardware RAID1 volumes were created with a 64K stripe size and with read-ahead turned off. The 7x146GB internal drives were used to create four ZPOOLs and ZFS file systems for the LDAP, store metadata, queue and the mailserver log.

    • The clients used these Java options: java -d64 -Xms4096m -Xmx4096m -XX:+AggressiveHeap

    • See the SPEC Report for all OS, network and messaging server tunings.

    See Also

    Disclosure Statement

    SPEC, SPECmail reg tm of Standard Performance Evaluation Corporation. Results as of 08/07/2009 on www.spec.org. SPECmail2009: Sun SPARC Enterprise T5240 (16 cores, 2 chips) SPECmail_Ent2009 12000 users at 57,758 SPECmail2009 Sessions/hour. Apple Xserv3,1 (8 cores, 2 chips) SPECmail_Ent2009 6000 users at 28,887 SPECmail2009 Sessions/hour.

    Thursday Jul 23, 2009

    World Record Performance of Sun CMT Servers

    This week, Sun continues to highlight the record-breaking performance of its latest update to the chip multi-threaded (CMT) Sun SPARC Enterprise server family running Solaris. Some of these benchmarks use a variety of Sun's unique technologies, including ZFS, SSD, various storage products and many more. These benchmarks were blogged about by various members of our team, and the URLs are shown below.

    Messages

    • Sun's CMT is the most powerful CPU regardless of architectural/implementation details (#transistors, #cores, threads, MHz, etc.)!
    • Performance tests show that Sun can outperform IBM Power6 by more than 2x on a variety of benchmarks.
    • Performance tests show Sun's new 1.6GHz CMT systems can be 20% faster than Sun's previous generation 1.4GHz processors, given Sun's continual advancements in both hardware and software.

    Benchmark Results Recently Blogged

    Sun T5440 Oracle BI EE World Record Performance
    http://blogs.sun.com/BestPerf/entry/sun_t5440_oracle_bi_ee

    Sun T5440 World Record SAP-SD 4-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode), Beats IBM POWER6 (note1)
    http://blogs.sun.com/BestPerf/entry/sun_t5440_world_record_sap

    Zeus ZXTM Traffic Manager World Record on Sun T5240
    http://blogs.sun.com/BestPerf/entry/top_performance_on_sun_sparc

    Sun T5440 SPECjbb2005, Sun 1.6GHz T2 Plus chip is 2.3x IBM 4.7GHz POWER6 chip
    http://blogs.sun.com/BestPerf/entry/sun_t5440_specjbb2005_beats_ibm

    New SPECjAppServer2004 Performance on the Sun SPARC Enterprise T5440
    http://blogs.sun.com/BestPerf/entry/new_specjappserver2004_performance_on_sun

    1.6 GHz SPEC CPU2006: World Record 4-chip system, Rate Benchmarks, Beats IBM POWER6
    http://blogs.sun.com/BestPerf/entry/1_6_ghz_spec_cpu2006

    Sun Blade T6320 World Record 1-chip SPECjbb2005 performance, Sun 1.6GHz T2 Plus chip is 2.6x IBM 4.7GHz POWER6 chip
    http://blogs.sun.com/BestPerf/entry/new_specjbb2005_performance_on_the

    Comparison Table

    Benchmark | Sun CMT | Tier | Software | Key Messages
    Oracle BI EE | Sun T5440 | Appl, Database | Oracle 11g, Oracle BIEE, ZFS, Solaris | World Record: T5440; Achieved 28,000 users; Reference
    SAP-SD 2-Tier | Sun T5440 | Appl, Database | SAP ECC 6.0, EP4, Oracle 10g, Solaris | World Record 4-socket: T5440; T5440 Beats 4-socket IBM 550 5GHz Power6 by 26% (note1); T5440 Beats HP DL585 G6 4-socket Opteron (note1); Unicode version
    SPECjAppServer2004 | Sun T5440 | Appl, Database | Oracle WebLogic, Oracle 11g, JDK 1.6.0_14, Solaris | World Record Single System (Appl Tier): T5440; T5440 is 6.4x faster than IBM Power 570 4.7GHz Power6; T5440 is 73% faster than HP DL 580 G5 Xeon 6C; Oracle Fusion Middleware
    Sun T5440 SPECjbb2005 | Sun T5440 | Appl | Java HotSpot, OpenSolaris | 1.6GHz US T2 Plus CPU is 2.3x faster than IBM 4.7GHz Power6 CPU; 1.6GHz US T2 Plus CPU is 21% faster than previous generation 1.4GHz US T2 Plus CPU; Sun T5440 has 2.3x better power/perf than the IBM 570 (8 4.7GHz Power6)
    Sun Blade T6320 SPECjbb2005 | Sun T6320 | Appl | Java HotSpot, OpenSolaris | World Record 1-socket: T6320; 1.6GHz US T2 Plus CPU is 2.6x faster than IBM 4.7GHz Power6 CPU; T6320 is 3% faster than Fujitsu 3.16GHz Xeon QC
    SPEC CPU2006 | Sun T5440, Sun T5240, Sun T5220, Sun T5120, Sun T6320 | all tiers | Sun Studio12, Solaris, ZFS | World Record 4-socket: T5440; 1.6GHz US T2 Plus CPU is 2.6x faster than IBM 4.7GHz Power6 CPU; T6320 is 3% faster than Fujitsu 3.16GHz Xeon QC
    Zeus ZXTM Traffic Manager | Sun T5240 | Web | Zeus ZXTM v5.1r1, Solaris | World Record: T5240; T5240 Beats f5 BIG-IP VIPRION by 34%, 2.6x better $/perf; T5240 Beats f5 BIG-IP 8800 by 91%, 2.7x better $/perf; T5240 Beats Citrix 12000 by 2.2x, 3.3x better $/perf; No IBM result

    Virtualization

    Sun's announcement also included updated virtualization software (LDOMs 1.1). Downloads are available to existing SPARC Enterprise server customers at: http://www.sun.com/servers/coolthreads/ldoms/index.jsp. Also look at the blog posting "LDoms for Dummies" at http://blogs.sun.com/PierreReynes/entry/ldoms_for_dummies

    Try & Buy Program

    Sun is also offering free 60-day trials on Sun CMT servers with a very popular Try and Buy program: http://www.sun.com/tryandbuy.

    Benchmark Performance Disclosure Statements (the URLs listed above go into more detail on each of these benchmarks)

    Note1: 4-processor world record on the 2-tier SAP SD Standard Application Benchmark with 4720 SD User, as of July 23, 2009, IBM System 550 (4 processors, 8 cores, 16 threads) 3,752 SAP SD Users, 4x 5 GHz Power6, 64 GB memory, DB2 9.5, AIX 6.1, Cert# 2009023. T5440 beats HP new 4-socket Opteron Servers (HPDL585 G6 with 4665 SD User and HP BL685c G6 with 4422 SD User)

    Two-tier SAP Sales and Distribution (SD) standard SAP ERP 6.0 2005/EP4 (Unicode) application benchmarks as of 07/21/09: Sun SPARC Enterprise T5440 Server (4 processors, 32 cores, 256 threads) 4,720 SAP SD Users, 4x 1.6 GHz UltraSPARC T2 Plus, 256 GB memory, Oracle10g, Solaris10, Cert# 2009026. HP ProLiant DL585 G6 (4 processors, 24 cores, 24 threads) 4,665 SAP SD Users, 4x 2.8 GHz AMD Opteron Processor 8439 SE, 64 GB memory, SQL Server 2008, Windows Server 2008 Enterprise Edition, Cert# 2009025. HP ProLiant BL685c G6 (4 processors, 24 cores, 24 threads) 4,422 SAP SD Users, 4x 2.6 GHz AMD Opteron Processor 8435, 64 GB memory, SQL Server 2008, Windows Server 2008 Enterprise Edition, Cert# 2009021. IBM System 550 (4 processors, 8 cores, 16 threads) 3,752 SAP SD Users, 4x 5 GHz Power6, 64 GB memory, DB2 9.5, AIX 6.1, Cert# 2009023. HP ProLiant DL585 G5 (4 processors, 16 cores, 16 threads) 3,430 SAP SD Users, 4x 3.1 GHz AMD Opteron Processor 8393 SE, 64 GB memory, SQL Server 2008, Windows Server 2008 Enterprise Edition, Cert# 2009008. HP ProLiant BL685 G6 (4 processors, 16 cores, 16 threads) 3,118 SAP SD Users, 4x 2.9 GHz AMD Opteron Processor 8389, 64 GB memory, SQL Server 2008, Windows Server 2008 Enterprise Edition, Cert# 2009007. NEC Express5800 (4 processors, 24 cores, 24 threads) 2,957 SAP SD Users, 4x 2.66 GHz Intel Xeon Processor X7460, 64 GB memory, SQL Server 2008, Windows Server 2008 Enterprise Edition, Cert# 2009018. Dell PowerEdge M905 (4 processors, 16 cores, 16 threads) 2,129 SAP SD Users, 4x 2.7 GHz AMD Opteron Processor 8384, 96 GB memory, SQL Server 2005, Windows Server 2003 Enterprise Edition, Cert# 2009017. Sun Fire X4600M2 (8 processors, 32 cores, 32 threads) 7,825 SAP SD Users, 8x 2.7 GHz AMD Opteron 8384, 128 GB memory, MaxDB 7.6, Solaris 10, Cert# 2008070. IBM System x3650 M2 (2 Processors, 8 Cores, 16 Threads) 5,100 SAP SD users,2x 2.93 Ghz Intel Xeon X5570, DB2 9.5, Windows Server 2003 Enterprise Edition, Cert# 2008079. 
    HP ProLiant DL380 G6 (2 processors, 8 cores, 16 threads) 4,995 SAP SD Users, 2x 2.93 GHz Intel Xeon x5570, 48 GB memory, SQL Server 2005, Windows Server 2003 Enterprise Edition, Cert# 2008071. SAP, R/3, reg TM of SAP AG in Germany and other countries. More info www.sap.com/benchmark.

    Oracle Business Intelligence Enterprise Edition benchmark, see http://www.oracle.com/solutions/business_intelligence/resource-library-whitepapers.html for more. Results as of 7/20/09.

    Zeus is TM of Zeus Technology Limited. Results as of 7/21/2009 on http://www.zeus.com/news/press_articles/zeus-price-performance-press-release.html?gclid=CLn4jLuuk5cCFQsQagod7gTkJA.

    SPEC, SPECint, SPECfp reg tm of Standard Performance Evaluation Corporation. Competitive results from www.spec.org as of 16 July 2009. Sun's new results quoted on this page have been submitted to SPEC. Sun Blade T6320 89.2 SPECint_rate_base2006, 96.7 SPECint_rate2006, 64.1 SPECfp_rate_base2006, 68.5 SPECfp_rate2006; Sun SPARC Enterprise T5220/T5120 89.1 SPECint_rate_base2006, 97.0 SPECint_rate2006, 64.1 SPECfp_rate_base2006, 68.5 SPECfp_rate2006; Sun SPARC Enterprise T5240 172 SPECint_rate_base2006, 183 SPECint_rate2006, 124 SPECfp_rate_base2006, 133 SPECfp_rate2006; Sun SPARC Enterprise T5440 338 SPECint_rate_base2006, 360 SPECint_rate2006, 254 SPECfp_rate_base2006, 270 SPECfp_rate2006; Sun Blade T6320 76.4 SPECint_rate_base2006, 85.5 SPECint_rate2006, 58.1 SPECfp_rate_base2006, 62.3 SPECfp_rate2006; Sun SPARC Enterprise T5220/T5120 76.2 SPECint_rate_base2006, 83.9 SPECint_rate2006, 57.9 SPECfp_rate_base2006, 62.3 SPECfp_rate2006; Sun SPARC Enterprise T5240 142 SPECint_rate_base2006, 157 SPECint_rate2006, 111 SPECfp_rate_base2006, 119 SPECfp_rate2006; Sun SPARC Enterprise T5440 270 SPECint_rate_base2006, 301 SPECint_rate2006, 212 SPECfp_rate_base2006, 230 SPECfp_rate2006; IBM p 570 53.2 SPECint_rate_base2006, 60.9 SPECint_rate2006, 51.5 SPECfp_rate_base2006, 58.0 SPECfp_rate2006; IBM Power 520 102 SPECint_rate_base2006, 124 SPECint_rate2006, 88.7 SPECfp_rate_base2006, 105 SPECfp_rate2006; IBM Power 550 215 SPECint_rate_base2006, 263 SPECint_rate2006, 188 SPECfp_rate_base2006, 222 SPECfp_rate2006; HP Integrity BL870c 114 SPECint_rate_base2006; HP Integrity rx7640 87.4 SPECfp_rate_base2006, 90.8 SPECfp_rate2006.

    SPEC, SPECjbb reg tm of Standard Performance Evaluation Corporation. Results as of 7/17/2009 on http://www.spec.org. SPECjbb2005, Sun Blade T6320 229576 SPECjbb2005 bops, 28697 SPECjbb2005 bops/JVM; IBM p 570 88089 SPECjbb2005 bops, 88089 SPECjbb2005 bops/JVM; Fujitsu TX100 223691 SPECjbb2005 bops, 111846 SPECjbb2005 bops/JVM; IBM x3350 194256 SPECjbb2005 bops, 97128 SPECjbb2005 bops/JVM; Sun SPARC Enterprise T5120 192055 SPECjbb2005 bops, 24007 SPECjbb2005 bops/JVM.

    SPECjAppServer2004, Sun SPARC Enterprise T5440 (4 chips, 32 cores) 7661.16 SPECjAppServer2004 JOPS@Standard; HP DL580 G5 (4 chips, 24 cores) 4410.07 SPECjAppServer2004 JOPS@Standard; HP DL580 G5 (4 chips, 16 cores) 3339.94 SPECjAppServer2004 JOPS@Standard; Two Dell PowerEdge 2950 (4 chips, 16 cores) 4794.33 SPECjAppServer2004 JOPS@Standard; Dell PowerEdge R610 (2 chips, 8 cores) 3975.13 SPECjAppServer2004 JOPS@Standard; Two Dell PowerEdge R610 (4 chips, 16 cores) 7311.50 SPECjAppServer2004 JOPS@Standard; IBM Power 570 (2 chips, 4 cores) 1197.51 SPECjAppServer2004 JOPS@Standard; SPEC, SPECjAppServer reg tm of Standard Performance Evaluation Corporation. Results from http://www.spec.org as of 7/20/09.

    SPECjbb2005 Sun SPARC Enterprise T5440 (4 chips, 32 cores) 841380 SPECjbb2005 bops, 26293 SPECjbb2005 bops/JVM. Results submitted to SPEC. HP DL585 G5 (4 chips, 24 cores) 937207 SPECjbb2005 bops, 234302 SPECjbb2005 bops/JVM. IBM Power 570 (8 chips, 16 cores) 798752 SPECjbb2005 bops, 99844 SPECjbb2005 bops/JVM. Sun SPARC Enterprise T5440 (4 chips, 32 cores) 692736 SPECjbb2005 bops, 21648 SPECjbb2005 bops/JVM. SPEC, SPECjbb reg tm of Standard Performance Evaluation Corporation. Results from www.spec.org as of 7/20/09.

    IBM p 570 8P 4.7GHz (4 building blocks) power specifications calculated as 80% of maximum input power reported 7/8/09 in “Facts and Features Report”: ftp://ftp.software.ibm.com/common/ssi/pm/br/n/psb01628usen/PSB01628USEN.PDF

    Wednesday Jul 22, 2009

    Why does 1.6 beat 4.7?

    Sun has upgraded the UltraSPARC T2 and UltraSPARC T2 Plus processors to 1.6 GHz. As described in some detail in yesterday's post, new results show SPEC CPU2006 performance improvements vs. previous systems that often exceed the clock speed improvement.  The scaling can be attributed to both memory system improvements and software improvements, such as the Sun Studio 12 Update 1 compiler.

    A MHz improvement within a product line is often useful.  If yesterday's chip runs at speed n and today's at n*1.12 then, intuitively, sure, I'll take today's.

    Comparing MHz across product lines is often counter-intuitive.  Consider that Sun's new systems provide:

    • up to 68% more throughput than the 4.7 GHz POWER6+ [1], and
    • up to 3x the throughput of the Itanium 9150N [2].

    The comparisons are particularly striking when one takes into account the cache size advantage for both the POWER6+ and the Itanium 9150N, and the MHz advantage for the POWER6+:

    Processor | GHz | Number of hw cache levels | Size of last cache (per chip) | SPECint_rate_base2006
    UltraSPARC T2, UltraSPARC T2 Plus | 1.6 | 2 | 4 MB | 1 chip: 89; 2 chips: 171; 4 chips: 338
    POWER6+ | 4.7 | 3 | 32 MB | Best 2 chip result: 102. UltraSPARC T2 Plus delivers 68% more integer throughput [1]
    Itanium 9150N | 1.6 | 3 | 24 MB | Best 4 chip result: 114. UltraSPARC T2 Plus delivers 3x the integer throughput. [2]

    These are per-chip results, not per-core or per-thread. Sun's CMT processors are designed for overall system throughput: how much work can the overall system get done.  
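
    The "68% more" and "3x" figures follow from the scores quoted above, comparing systems with the same number of chips. A small Python sketch of that arithmetic follows; the variable names are illustrative, and the scores are the rounded values from the table.

```python
# SPECint_rate_base2006 scores quoted in the table above.
t2_plus = {1: 89, 2: 171, 4: 338}  # number of chips -> score
power6_plus_2chip = 102            # best 2-chip POWER6+ result [1]
itanium_9150n_4chip = 114          # best 4-chip Itanium 9150N result [2]

# Compare like with like: 2 chips vs 2 chips, 4 chips vs 4 chips.
vs_power = t2_plus[2] / power6_plus_2chip
vs_itanium = t2_plus[4] / itanium_9150n_4chip

print(f"vs POWER6+ (2 chips): {(vs_power - 1) * 100:.0f}% more throughput")
print(f"vs Itanium 9150N (4 chips): {vs_itanium:.1f}x the throughput")

# How the Sun score scales as chips are added, relative to the 1-chip score.
for chips, score in t2_plus.items():
    print(f"{chips} chip(s): {score / t2_plus[1]:.2f}x the 1-chip score")
```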

    A mystery: With comparatively smaller caches and modest clock rates, why do the Sun CMT processors win?

    The performance hole: Memory latency. From the point of view of a CPU chip, the big performance problem is that memory latency is inordinately long compared to chip cycle times.

    A hardware designer can attempt to cover up that latency with very large caches, as in the POWER6+ and Itanium, and this works well when running a small number of modest-sized applications. Large caches become less helpful, though, as workloads become more complex.

    MHz isn't everything. In fact, MHz hardly counts at all when the problem is memory latency. Suppose the hot part of an application looks like this:

      loop:
           computational instruction
           computational instruction
           computational instruction
           memory access instruction
           branch to loop
    

    For an application that looks like this, the computational instructions may complete in only a few cycles, while the memory access instruction may easily require on the order of 100ns - which, for a 1 GHz chip, is on the order of 100 cycles. If the processor speed is increased by a factor of 4, but memory speed is not, then memory is still 100ns away, and when measured in cycles, it is now 400 cycles distant. The overall loop hardly speeds up at all.
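
    To put rough numbers on this, here is a minimal Python sketch of the loop above. It assumes (purely for illustration) that the three computational instructions and the branch each take one cycle, and that the memory access always costs 100ns regardless of clock rate; the function name loop_time_ns is made up for this example.

```python
def loop_time_ns(clock_ghz, compute_cycles=4, memory_latency_ns=100.0):
    """Time for one loop iteration: on-chip cycles plus one main-memory access.

    compute_cycles=4 assumes one cycle each for the three computational
    instructions and the branch (an illustrative assumption).
    """
    cycle_ns = 1.0 / clock_ghz
    return compute_cycles * cycle_ns + memory_latency_ns


slow = loop_time_ns(1.0)  # 1 GHz chip
fast = loop_time_ns(4.0)  # 4 GHz chip, same memory
speedup = slow / fast

for ghz in (1.0, 4.0):
    # Memory latency expressed in cycles grows with the clock rate.
    print(f"{ghz:.0f} GHz: {loop_time_ns(ghz):.0f} ns per iteration, "
          f"memory is {100.0 * ghz:.0f} cycles away")
print(f"4x the clock rate gives a {speedup:.2f}x speedup")
```

    Under these assumptions the iteration time drops only from 104ns to 101ns, a speedup of about 3% for a 4x increase in clock rate.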

    Lest the reader think I am making this up - consider page 8 of this IBM talk from April, 2008 regarding the POWER6:

    [Image: latencies]

    The IBM POWER systems have some impressive performance characteristics - if your application is tiny enough to fit in its first or second level cache. But memory latency is not impressive. If your workload requires multiple concurrent threads accessing a large memory space, Sun's CMT approach just might be a better fit.

    Operating System Overhead: A context switch from one process to another is mediated by operating system services. The OS parks context from the process that is currently running - typically saving dozens of program registers and other context (such as virtual address space information); decides which process to run next (which may require access to several OS data structures); and loads the context for the new process (registers, virtual address context, etc.). If the system is running many processes, then caches are unlikely to be helpful during this context switch, and thousands of cycles may be spent on main memory accesses.

    Design for throughput: Sun's CMT approach handles the complexity of real-world applications by allowing up to 64 processes to be simultaneously on-chip. When a long-latency stall occurs, such as an access to main memory, the chip switches to executing instructions on behalf of other, non-stalled threads, thus improving overall system throughput. No operating system intervention is required as resources are shared among the processes on the chip.
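
    The effect of switching among hardware threads can be illustrated with a toy model. The Python sketch below is not a description of the actual UltraSPARC pipeline: it assumes a single pipeline shared by some number of hardware threads, each alternating a short burst of computation with a long memory stall, and zero-cost switching between threads. The function name and the 4ns/100ns figures are assumptions chosen to match the loop example earlier in this post.

```python
def pipeline_utilization(threads, compute_ns=4.0, stall_ns=100.0):
    """Fraction of time a shared pipeline does useful work (toy model).

    Each thread alternates compute_ns of work with a stall_ns memory stall.
    While one thread is stalled, the hardware runs another ready thread at
    no switching cost, so utilization grows linearly with the thread count
    until the pipeline is saturated.
    """
    return min(1.0, threads * compute_ns / (compute_ns + stall_ns))


for n in (1, 2, 4, 8, 16, 32):
    print(f"{n:2d} thread(s): pipeline busy {pipeline_utilization(n):.0%} of the time")
```

    In this model a single thread keeps the pipeline busy less than 4% of the time, and each additional thread fills in stall time that would otherwise be wasted, without any operating system context switch.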

    [1] http://www.spec.org/cpu2006/results/res2009q2/cpu2006-20090427-07263.html
    [2] http://www.spec.org/cpu2006/results/res2009q2/cpu2006-20090522-07485.html

    Competitive results retrieved from www.spec.org   20 July 2009.  Sun's CMT results have been submitted to SPEC.  SPEC, SPECfp, SPECint are registered trademarks of the Standard Performance Evaluation Corporation.

    Tuesday Jul 21, 2009

    Zeus ZXTM Traffic Manager World Record on Sun T5240

    Significance of Results

    The Sun SPARC Enterprise T5240 server equipped with two UltraSPARC T2 Plus processors running at 1.6 GHz delivered World Record ZXTM HTTPThroughput results.

    • Sun SPARC Enterprise T5240 (2 UltraSPARC T2 Plus 1.6GHz) delivers an HTTPThroughput of 13.4 Gbit/sec and a price-performance of 5.5K $/Gb/sec, which is 34% better performance and 2.6x better price-performance than an f5 BIG-IP VIPRION (Chassis + 1 blade).
    • Sun SPARC Enterprise T5240 (2 UltraSPARC T2 Plus 1.6GHz) delivers an HTTPThroughput of 13.4 Gbit/sec and a price-performance of 5.5K $/Gb/sec, which is 91% better performance and 2.7x better price-performance than an f5 BIG-IP 8800.
    • Sun SPARC Enterprise T5240 (2 UltraSPARC T2 Plus 1.6GHz) delivers an HTTPThroughput of 13.4 Gbit/sec and a price-performance of 5.5K $/Gb/sec, which is 3.3x better price-performance than a Citrix 12000.
    • Sun's UltraSPARC T2+ processor includes support for common bulk ciphers, secure hash operations and both prime and binary field Elliptic Curve Cryptography. The UltraSPARC T2 processor supports RC4, DES, 3DES, AES-128, AES-192, AES-256, MD5, SHA-1, SHA-256.

    Performance Landscape

    Zeus ZXTM HTTPThroughput Chart (ordered by performance)

    System | Gb/sec | $ (HW+SW) | $/perf ($/Gb/sec)
    Sun SPARC Enterprise T5240 (2x 1.6GHz US T2 Plus) | 13.4 | $74K | 5.5K
    f5 BIG-IP VIPRION | 10.0 | $141K | 14.1K
    Sun SPARC Enterprise T5140 (2x 1.2GHz US T2 Plus) | 9.1 | $55K | 6.1K
    f5 BIG-IP 8800 | 7.0 | $105K | 15.1K
    f5 BIG-IP 6900 | 6.0 | $71K | 11.8K
    Citrix 12000 | 6.0 | $110K | 18.3K
    Sun SPARC Enterprise T5120 (1x 1.2GHz US T2) | 5.9 | $46K | 7.8K
    Citrix 10010 | 4.8 | $85K | 17.7K

    Performance graph of f5, Citrix and previous Sun results at: http://www.zeus.com/news/press_articles/zeus-price-performance-press-release.html?gclid=CLn4jLuuk5cCFQsQagod7gTkJA.
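
    The price-performance column is simply price divided by throughput, and the comparisons in the bullets above are ratios of those values. The Python sketch below recomputes them from the rounded chart figures; the dictionary and the helper name price_perf are illustrative. Because the chart prices are rounded to the nearest $1K, a recomputed value can differ slightly from the listed one (for example 15.0K rather than 15.1K for the BIG-IP 8800).

```python
# (Gb/sec, price in $K for HW+SW) as listed in the chart above.
systems = {
    "Sun SPARC Enterprise T5240": (13.4, 74),
    "f5 BIG-IP VIPRION": (10.0, 141),
    "f5 BIG-IP 8800": (7.0, 105),
    "Citrix 12000": (6.0, 110),
}


def price_perf(name):
    """Price-performance in $K per Gb/sec (lower is better)."""
    gbps, price_k = systems[name]
    return price_k / gbps


sun = "Sun SPARC Enterprise T5240"
for rival in ("f5 BIG-IP VIPRION", "f5 BIG-IP 8800", "Citrix 12000"):
    perf_gain = systems[sun][0] / systems[rival][0]
    pp_gain = price_perf(rival) / price_perf(sun)
    print(f"vs {rival}: {perf_gain:.2f}x the throughput, "
          f"{pp_gain:.1f}x better price-performance")
```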

    Results and Configuration Summary

    Hardware Configuration:
      Sun SPARC Enterprise T5240 with
      • 2x 1.6GHz UltraSPARC T2 Plus
      • 16 GB memory
      • 1 internal 146GB 10K SAS drive
      • 2x Sun 10GbE Xaui Card - (SESX7XA1Z)
      • 2 x Dual 10GbE SFP+ PCIe ( X1109a-z ) with 1 X1109a-z per card

    Software Configuration:

      Solaris
      Zeus ZXTM version 5.1r1

    Benchmark Description

    The benchmark tests HTTP throughput for persistent HTTP connections. Large-file bandwidth (Gbit/s) is measured by fetching large files. Load is applied using ZeusBench, a benchmarking tool included in ZXTM 5.1r1 that is also used for Zeus internal performance testing and as a load generation tool. Multiple clients request 100MB files over HTTP via the ZXTM load balancer.

    See Also

    Performance on the Zeus Website

    Disclosure Statement

    Zeus is TM of Zeus Technology Limited. Results as of 7/21/2009 on http://www.zeus.com/news/press_articles/zeus-price-performance-press-release.html?gclid=CLn4jLuuk5cCFQsQagod7gTkJA.
    About

    BestPerf is the source of Oracle performance expertise. In this blog, Oracle's Strategic Applications Engineering group explores Oracle's performance results and shares best practices learned from working on Enterprise-wide Applications.
