Monday Sep 20, 2010

Schlumberger's ECLIPSE 300 Performance Throughput On Sun Fire X2270 Cluster with Sun Storage 7410

Oracle's Sun Storage 7410 system, attached via QDR InfiniBand to a cluster of eight of Oracle's Sun Fire X2270 servers, was used to evaluate multiple job throughput of Schlumberger's Linux-64 ECLIPSE 300 compositional reservoir simulator processing their standard 2 Million Cell benchmark model with 8 rank parallelism (MM8 job).

  • The Sun Storage 7410 system showed little difference in performance (2%) compared to running the MM8 job with dedicated local disk.

  • When running 8 concurrent jobs on 8 different nodes, all writing to the Sun Storage 7410 system, performance degraded only slightly (5%) compared to a single MM8 job running on dedicated local disk.

Experiments were also run to vary how jobs were scheduled on the cluster. Rather than running with the default compact mode, tests were run distributing a single job across multiple nodes. Performance improvements were measured when changing from the default compact scheduling scheme (1 job to 1 node) to a distributed scheduling scheme (1 job to multiple nodes).

  • When running at 75% of the cluster capacity, distributed scheduling outperformed compact scheduling by up to 34%. Even when running at 100% of the cluster capacity, distributed scheduling was still slightly faster than compact scheduling.

  • When combining workloads, using distributed scheduling allowed two MM8 jobs to finish 19% faster than the reference time and a concurrent PSTM workload to finish 2% faster.

The Oracle Solaris Studio Performance Analyzer and Sun Storage 7410 system analytics were used to identify a 3D Prestack Kirchhoff Time Migration (PSTM) as a potential candidate for consolidating with ECLIPSE. Both scheduling schemes are compared while running various job mixes of these two applications using the Sun Storage 7410 system for I/O.

These experiments showed a potential opportunity for consolidating applications using Oracle Grid Engine resource scheduling and Oracle Virtual Machine templates.

Performance Landscape

Results are presented below on a variety of experiments run using the 2009.2 ECLIPSE 300 2 Million Cell Performance Benchmark (MM8). The compute nodes are a cluster of Sun Fire X2270 servers connected with QDR InfiniBand. First, some definitions used in the tables below:

Local HDD: Each job runs on a single node using its dedicated direct-attached storage.
NFSoIB: One node hosts its local disk for NFS mounting to other nodes over InfiniBand.
IB 7410: Sun Storage 7410 system over QDR InfiniBand.
Compact Scheduling: All 8 MM8 MPI processes run on a single node.
Distributed Scheduling: Allocate the 8 MM8 MPI processes across all available nodes.
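For illustration, here is a minimal Python sketch (not part of the benchmark kit) of how the two placement schemes above differ. The node names and job counts are assumptions for the example only; in a real run the resulting host list would be handed to the MPI launcher through its machine file mechanism.

    # Sketch: compact vs. distributed placement of an 8-rank MM8 job.
    # Node names are hypothetical; this only illustrates the two schemes above.

    NODES = [f"node{i:02d}" for i in range(1, 9)]   # an 8-node cluster
    RANKS_PER_JOB = 8

    def compact_placement(job_index):
        """All 8 MPI ranks of one job land on a single node."""
        node = NODES[job_index % len(NODES)]
        return [node] * RANKS_PER_JOB

    def distributed_placement(nodes_used=8):
        """Ranks of one job are spread round-robin across nodes_used nodes."""
        return [NODES[r % nodes_used] for r in range(RANKS_PER_JOB)]

    if __name__ == "__main__":
        print("compact:               ", compact_placement(0))      # 8 ranks on node01
        print("distributed (8 nodes): ", distributed_placement())   # 1 rank per node
        print("distributed (4 nodes): ", distributed_placement(4))  # 2 ranks per node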

First Test

The first test compares the performance of a single MM8 job on a single node using local storage to running a number of jobs across the cluster, showing the effect of different storage solutions.

Compact Scheduling
Multiple Job Throughput Results Relative to Single Job
2009.2 ECLIPSE 300 MM8 2 Million Cell Performance Benchmark

Cluster Load    Number of     Local HDD              NFSoIB                 IB 7410
                MM8 Jobs      Relative Throughput    Relative Throughput    Relative Throughput
13%             1             1.00                   1.00*                  0.98
25%             2             0.98                   0.97                   0.98
50%             4             0.98                   0.96                   0.97
75%             6             0.98                   0.95                   0.95
100%            8             0.98                   0.95                   0.95

* Performance measured on the node hosting the local disk that is NFS-exported to the other nodes in the cluster.

Second Test

This next test uses the Sun Storage 7410 system and compares the performance of running the MM8 job on 1 node using compact scheduling to running multiple jobs with compact scheduling and to running multiple jobs with distributed scheduling. The tests are run on an 8 node cluster, so each distributed job has only 1 MPI process per node.

Comparing Compact and Distributed Scheduling
Multiple Job Throughput Results Relative to Single Job
2009.2 ECLIPSE 300 MM8 2 Million Cell Performance Benchmark

Cluster Load    Number of     Compact Scheduling     Distributed Scheduling*
                MM8 Jobs      Relative Throughput    Relative Throughput
13%             1             1.00                   1.34
25%             2             1.00                   1.32
50%             4             0.99                   1.25
75%             6             0.97                   1.10
100%            8             0.97                   0.98

* Each distributed job has 1 MPI process per node.

Third Test

This next test uses the Sun Storage 7410 system and compares the performance of running the MM8 job on 1 node using compact scheduling to running multiple jobs with compact scheduling and to running multiple jobs with distributed scheduling. This test only uses 4 nodes, so each distributed job has two MPI processes per node.

Comparing Compact and Distributed Scheduling on 4 Nodes
Multiple Job Throughput Results Relative to Single Job
2009.2 ECLIPSE 300 MM8 2 Million Cell Performance Benchmark

Cluster Load    Number of     Compact Scheduling     Distributed Scheduling*
                MM8 Jobs      Relative Throughput    Relative Throughput
25%             1             1.00                   1.39
50%             2             1.00                   1.28
100%            4             1.00                   1.00

* Each distributed job has two MPI processes per node.

Fourth Test

The last test involves running two different applications on the 4 node cluster. It compares the performance of running the cluster fully loaded and changing how the applications are run, either compact or distributed. The comparisons are made against the individual application running the compact strategy (as few nodes as possible). It shows that appropriately mixing jobs can give better job performance than running just one kind of application on a single cluster.

Multiple Job, Multiple Application Throughput Results
Comparing Scheduling Strategies
2009.2 ECLIPSE 300 MM8 2 Million Cell and 3D Kirchhoff Time Migration (PSTM)

Number of   Number of   ECLIPSE Compact        ECLIPSE Distributed    PSTM Distributed       PSTM Compact           Cluster
PSTM Jobs   MM8 Jobs    (1 node x 8            (4 nodes x 2           (4 nodes x 4           (2 nodes x 8           Load
                        processes per job)     processes per job)     processes per job)     processes per job)
0           1           1.00                   1.40                   -                      -                      25%
0           2           1.00                   1.27                   -                      -                      50%
0           4           0.99                   0.98                   -                      -                      100%
1           2           -                      1.19                   1.02                   -                      100%
2           0           -                      -                      1.07                   0.96                   100%
1           0           -                      -                      1.08                   1.00                   50%

(A "-" entry means that application/scheduling combination was not part of that job mix.)

Results and Configuration Summary

Hardware Configuration:

8 x Sun Fire X2270 servers, each with
2 x 2.93 GHz Intel Xeon X5570 processors
24 GB memory (6 x 4 GB memory at 1333 MHz)
1 x 500 GB SATA
Sun Storage 7410 system, 24 TB total, QDR InfiniBand
4 x 2.3 GHz AMD Opteron 8356 processors
128 GB memory
2 internal 233 GB SAS drives (466 GB total)
2 internal 93 GB read-optimized SSDs (186 GB total)
1 Sun Storage J4400 with 22 x 1 TB SATA drives and 2 x 18 GB write-optimized SSDs,
configured as either 20 TB RAID-Z2 (double parity) data with 2-way striped write-optimized SSD, or
11 TB mirrored data with mirrored write-optimized SSD
QDR InfiniBand Switch

Software Configuration:

SUSE Linux Enterprise Server 10 SP 2
Scali MPI Connect 5.6.6
GNU C 4.1.2 compiler
2009.2 ECLIPSE 300
ECLIPSE license daemon flexlm v11.3.0.0
3D Kirchhoff Time Migration

Benchmark Description

The benchmark is a home-grown study in resource usage options when running the Schlumberger ECLIPSE 300 Compositional reservoir simulator with 8 rank parallelism (MM8) to process Schlumberger's standard 2 Million Cell benchmark model. Schlumberger pre-built executables were used to process a 260x327x73 (2 Million Cell) sub-grid with 6,206,460 total grid cells and model 7 different compositional components within a reservoir. No source code modifications or executable rebuilds were conducted.

The ECLIPSE 300 MM8 job uses 8 MPI processes. It can run within a single node (compact) or across multiple nodes of a cluster (distributed). By using the MM8 job, it is possible to compare the performance of running each job on a separate node using local disk to using a shared network attached storage solution. The benchmark tests study the effect of increasing the number of MM8 jobs in a throughput model.

The first test compares the performance of running 1, 2, 4, 6 and 8 jobs on a cluster of 8 nodes using local disk, NFSoIB disk, and the Sun Storage 7410 system connected via InfiniBand. Results are compared against the time it takes to run 1 job with local disk. This test shows what performance impact there is when loading down a cluster.

The second test compares different methods of scheduling jobs on a cluster. The compact method involves putting all 8 MPI processes for a job on the same node. The distributed method involves using 1 MPI process per node. The results compare the performance against 1 job on one node.

The third test is similar to the second test, but uses only 4 nodes in the cluster, so when running distributed, there are 2 MPI processes per node.

The fourth test compares the compact and distributed scheduling methods on 4 nodes while running 2 MM8 jobs and one 16-way parallel 3D Prestack Kirchhoff Time Migration (PSTM).

Key Points and Best Practices

  • ECLIPSE is very sensitive to memory bandwidth and needs to be run with 1333 MHz or faster memory. In order to maintain 1333 MHz memory, the maximum memory configuration for the processors used in this benchmark is 24 GB. BIOS upgrades now allow 1333 MHz memory for up to 48 GB of memory. Additional nodes can be used to handle data sets that require more memory than is available per node. Allocating at least 20% of memory per node for I/O caching helps application performance.

  • If allocating an 8-way parallel job (MM8) to a single node, it is best to use an ECLIPSE license for that particular node to avoid any additional network overhead of sharing a global license with all the nodes in a cluster.

  • Understanding the ECLIPSE MM8 I/O access patterns is essential to optimizing a shared storage solution. The analytics available on the Sun Storage 7410 system provide valuable I/O characterization information even without source code access. A single MM8 job run shows an initial read and write load related to reading the input grid, parsing Petrel ASCII input parameter files, and creating an initial solution grid and runtime specifications. This is followed by a very long running simulation that writes data and restart files and generates reports to the 7410. Due to the nature of the small block I/O, the mirrored configuration of the 7410 outperformed the RAID-Z2 configuration.

    A single MM8 job reads, processes, and writes approximately 240 MB of grid and property data in the first 36 seconds of execution. The actual read and write of the grid data, which is intermixed with this first stage of processing, is done at a rate of 240 MB/sec to the 7410 for each of the two operations.

    Then it calculates and reports the well connections, writing an average of 260 KB/sec at 32 operations/sec (roughly 32 x 8 KB writes/sec). However, the actual size of each I/O operation varies between 2 and 100 KB, and there are peaks every 20 seconds. The write cache operates at an average of 8 accesses/sec at approximately 61 KB/sec (about 8 x 8 KB writes/sec). As the number of concurrent jobs increases, the interconnect traffic and random I/O operations per second to the 7410 increase. (A short arithmetic sketch follows this list.)

  • MM8 multiple job startup time is reduced on shared file systems, if each job uses separate input files.
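As a quick arithmetic check of the well-connection write rates quoted above, the average I/O sizes implied by those rates work out as follows (a sketch using only the figures from this post):

    # Back-of-the-envelope check of the MM8 write rates quoted in the key points above.

    report_rate_kb_s, report_ops_s = 260, 32   # reporting phase: ~260 KB/sec at 32 ops/sec
    cache_rate_kb_s, cache_ops_s = 61, 8       # write cache: ~61 KB/sec at 8 accesses/sec

    print(f"average reporting write: {report_rate_kb_s / report_ops_s:.1f} KB")  # ~8.1 KB, roughly 32 x 8 KB/sec
    print(f"average cached write:    {cache_rate_kb_s / cache_ops_s:.1f} KB")    # ~7.6 KB, roughly 8 x 8 KB/sec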

See Also

Disclosure Statement

Copyright 2010, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/20/2010.

Monday Aug 23, 2010

Repriced: SPC-1 Sun Storage 6180 Array (8Gb) 1.9x Better Than IBM DS5020 in Price-Performance

Results are presented on Oracle's Sun Storage 6180 array with 8Gb connectivity for the SPC-1 benchmark.
  • The Sun Storage 6180 array is more than 1.9 times better in price-performance compared to the IBM DS5020 system as measured by the SPC-1 benchmark.

  • The Sun Storage 6180 array delivers 50% more SPC-1 IOPS than the previous generation Sun Storage 6140 array and IBM DS4700 on the SPC-1 benchmark.

  • The Sun Storage 6180 array is more than 3.1 times better in price-performance compared to the NetApp FAS3040 system as measured by the SPC-1 benchmark.

  • The Sun Storage 6180 array betters the Hitachi 2100 system by 34% in price-performance on the SPC-1 benchmark.

  • The Sun Storage 6180 array has 16% better IOPS/disk drive performance than the Hitachi 2100 on the SPC-1 benchmark.

Performance Landscape

Select results for the SPC-1 benchmark comparing competitive systems (ordered by performance), data as of August 6th, 2010 from the Storage Performance Council website.

Sponsor   System           SPC-1 IOPS   $/SPC-1 IOPS   ASU Capacity (GB)   TSC Price   Data Protection Level   Results Identifier
Hitachi   HDS 2100         31,498.58    $5.85          3,967.500           $187,321    Mirroring               A00076
NetApp    FAS3040          30,992.39    $13.58         12,586.586          $420,800    RAID6                   A00062
Oracle    SS6180 (8Gb)     26,090.03    $4.37          5,145.060           $114,042    Mirroring               A00084
IBM       DS5020 (8Gb)     26,090.03    $8.46          5,145.060           $220,778    Mirroring               A00081
Fujitsu   DX80             19,492.86    $3.45          5,355.400           $67,296     Mirroring               A00082
Oracle    STK6140 (4Gb)    17,395.53    $4.93          1,963.269           $85,823     Mirroring               A00048
IBM       DS4700 (4Gb)     17,195.84    $11.67         1,963.270           $200,666    Mirroring               A00046

SPC-1 IOPS = the Performance Metric
$/SPC-1 IOPS = the Price-Performance Metric
ASU Capacity = the Capacity Metric
Data Protection = Data Protection Metric
TSC Price = Total Cost of Ownership Metric
Results Identifier = A unique identification of the result
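The price-performance metric is simply TSC Price divided by SPC-1 IOPS. As a quick check of the 1.9x claim above, here is a small sketch using the Sun and IBM rows from the table:

    # Price-performance = TSC Price / SPC-1 IOPS, using rows from the table above.

    systems = {
        "Sun SS6180 (8Gb)": (114_042, 26_090.03),   # (TSC Price in USD, SPC-1 IOPS)
        "IBM DS5020 (8Gb)": (220_778, 26_090.03),
    }

    price_perf = {name: price / iops for name, (price, iops) in systems.items()}
    for name, value in price_perf.items():
        print(f"{name}: ${value:.2f} per SPC-1 IOPS")      # $4.37 and $8.46, matching the table

    ratio = price_perf["IBM DS5020 (8Gb)"] / price_perf["Sun SS6180 (8Gb)"]
    print(f"IBM/Sun price-performance ratio: {ratio:.1f}x")  # ~1.9x, as stated above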

Complete SPC-1 benchmark results may be found at http://www.storageperformance.org.

Results and Configuration Summary

Storage Configuration:

80 x 146.8GB 15K RPM drives
4 x Qlogic QLE 2560 HBA

Server Configuration:

IBM system x3850 M2

Software Configuration:

MS Windows 2003 Server SP2
SPC-1 benchmark kit

Benchmark Description

SPC Benchmark-1 (SPC-1) is the first industry standard storage benchmark and is the most comprehensive performance analysis environment ever constructed for storage subsystems. The I/O workload in SPC-1 is characterized by predominantly random I/O operations, as typified by multi-user OLTP, database, and email server environments. SPC-1 uses a highly efficient multi-threaded workload generator to thoroughly analyze direct attach or network storage subsystems. The SPC-1 benchmark enables companies to rapidly produce valid performance and price-performance results using a variety of host platforms and storage network topologies.

SPC-1 is built to:

  • Provide a level playing field for test sponsors.
  • Produce results that are powerful and yet simple to use.
  • Provide value for engineers as well as IT consumers and solution integrators.
  • Be easy to run, easy to audit/verify, and easy to use to report official results.

Key Points and Best Practices

See Also

Disclosure Statement

SPC-1, SPC-1 IOPS, $/SPC-1 IOPS reg tm of Storage Performance Council (SPC). More info www.storageperformance.org, results as of 8/6/2010. Sun Storage 6180 array 26,090.03 SPC-1 IOPS, ASU Capacity 5,145.060GB, $/SPC-1 IOPS $4.37, Data Protection Mirroring, Cost $114,042, Ident. A00084.

Repriced: SPC-2 (RAID 5 & 6 Results) Sun Storage 6180 Array (8Gb) Outperforms IBM DS5020 by up to 64% in Price-Performance

Results are presented on Oracle's Sun Storage 6180 array with 8 Gb connectivity for the SPC-2 benchmark using RAID 5 and RAID 6.
  • The Sun Storage 6180 array outperforms the IBM DS5020 system by 62% in price-performance for SPC-2 benchmark using RAID 5 data protection.

  • The Sun Storage 6180 array outperforms the IBM DS5020 system by 64% in price-performance for SPC-2 benchmark using RAID 6 data protection.

  • The Sun Storage 6180 array is over 50% faster than the previous generation systems, the Sun Storage 6140 array and IBM DS4700, on the SPC-2 benchmark using RAID 5 data protection.

Performance Landscape

Select results from Oracle and IBM competitive systems for the SPC-2 benchmark (in performance order), data as of August 7th, 2010 from the Storage Performance Council website.

Sponsor   System    SPC-2 MBPS   $/SPC-2 MBPS   ASU Capacity (GB)   TSC Price   Data Protection Level   Results Identifier
Oracle    SS6180    1,286.74     $56.88         3,504.693           $73,190     RAID 6                  B00044
IBM       DS5020    1,286.74     $93.26         3,504.693           $120,002    RAID 6                  B00042
Oracle    SS6180    1,244.89     $50.40         3,504.693           $62,747     RAID 5                  B00043
IBM       DS5020    1,244.89     $81.73         3,504.693           $101,742    RAID 5                  B00041
IBM       DS4700    823.62       $106.73        1,748.874           $87,903     RAID 5                  B00028
Oracle    ST6140    790.67       $67.82         1,675.037           $53,622     RAID 5                  B00017
Oracle    ST2540    735.62       $37.32         2,177.548           $27,451     RAID 5                  B00021
Oracle    ST2530    672.05       $26.15         1,451.699           $17,572     RAID 5                  B00026

SPC-2 MBPS = the Performance Metric
$/SPC-2 MBPS = the Price-Performance Metric
ASU Capacity = the Capacity Metric
Data Protection = Data Protection Metric
TSC Price = Total Cost of Ownership Metric
Results Identifier = A unique identification of the result

Complete SPC-2 benchmark results may be found at http://www.storageperformance.org.

Results and Configuration Summary

Storage Configuration:

Sun Storage 6180 array with 4GB cache
30 x 146.8GB 15K RPM drives (for RAID 5)
36 x 146.8GB 15K RPM drives (for RAID 6)
4 x PCIe 8 Gb single port HBA

Server Configuration:

IBM system x3850 M2

Software Configuration:

Microsoft Windows 2003 Server SP2
SPC-2 benchmark kit

Benchmark Description

The SPC Benchmark-2™ (SPC-2) is a series of related benchmark performance tests that simulate the sequential component of demands placed upon on-line, non-volatile storage in server class computer systems. SPC-2 provides measurements in support of real world environments characterized by:
  • Large numbers of concurrent sequential transfers.
  • Demanding data rate requirements, including requirements for real time processing.
  • Diverse application techniques for sequential processing.
  • Substantial storage capacity requirements.
  • Data persistence requirements to ensure preservation of data without corruption or loss.

Key Points and Best Practices

  • This benchmark was performed using RAID 5 and RAID 6 protection.
  • The controller stripe size was set to 512k.
  • No volume manager was used.

See Also

Disclosure Statement

SPC-2, SPC-2 MBPS, $/SPC-2 MBPS are registered trademarks of the Storage Performance Council (SPC). More info www.storageperformance.org, results as of 8/9/2010. Sun Storage 6180 Array 1,286.74 SPC-2 MBPS, $/SPC-2 MBPS $56.88, ASU Capacity 3,504.693 GB, Protect RAID 6, Cost $73,190, Ident. B00044. Sun Storage 6180 Array 1,244.89 SPC-2 MBPS, $/SPC-2 MBPS $50.40, ASU Capacity 3,504.693 GB, Protect RAID 5, Cost $62,747, Ident. B00043.

Thursday Nov 19, 2009

SPECmail2009: New World record on T5240 1.6GHz Sun 7310 and ZFS

The Sun SPARC Enterprise T5240 server running the Sun Java System Messaging Server 7.2 achieved a world record SPECmail2009 result using the Sun Storage 7310 Unified Storage System and the ZFS file system. Sun's OpenStorage platforms enable another world record.

  • World record SPECmail2009 benchmark using the Sun SPARC Enterprise T5240 server (two 1.6GHz UltraSPARC T2 Plus), Sun Communications Suite 7, Solaris 10, and the Sun Storage 7310 Unified Storage System achieved 14,500 SPECmail_Ent2009 users at 69,857 Sessions/Hour.

  • This SPECmail2009 benchmark result clearly demonstrates that the Sun Messaging Server 7.2, Solaris 10 and ZFS solution can support a large, enterprise level IMAP mail server environment as a low cost 'Sun on Sun' solution, delivering the best performance and maximizing data integrity and availability of Sun Open Storage and ZFS.

  • The Sun SPARC Enterprise T5240 server supported 2.4 times more users with a 2.4 times better sessions/hour rate than the Apple Xserv3,1 solution on the SPECmail2009 benchmark (see the sketch after this list).

  • There are no IBM Power6 results on this benchmark.

  • The configuration using Sun OpenStorage outperformed all previous results that used traditional direct attached storage and significantly higher numbers of disk devices.
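As a quick check of the 2.4x claims above, here is a small sketch using the user counts and session rates from the landscape table below:

    # Ratio check versus the Apple Xserv3,1 result, using SPECmail2009 figures quoted below.

    t5240_users, t5240_sessions = 14_500, 69_857
    apple_users, apple_sessions = 6_000, 28_887

    print(f"users ratio:         {t5240_users / apple_users:.1f}x")        # ~2.4x
    print(f"sessions/hour ratio: {t5240_sessions / apple_sessions:.1f}x")  # ~2.4x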

SPECmail2009 Performance Landscape (ordered by performance)

System (Processors)                                          Users    Sessions/hour   Disks    OS           Messaging Server
Sun SPARC Enterprise T5240 (2 x 1.6GHz UltraSPARC T2 Plus)   14,500   69,857          58 NAS   Solaris 10   CommSuite 7.2 / Sun JMS 7.2
Sun SPARC Enterprise T5240 (2 x 1.6GHz UltraSPARC T2 Plus)   12,000   57,758          80 DAS   Solaris 10   CommSuite 5 / Sun JMS 6.3
Sun Fire X4275 (2 x 2.93GHz Xeon X5570)                      8,000    38,348          44 NAS   Solaris 10   Sun JMS 6.2
Apple Xserv3,1 (2 x 2.93GHz Xeon X5570)                      6,000    28,887          82 DAS   MacOS 10.6   Dovecot 1.1.14 apple 0.5
Sun SPARC Enterprise T5220 (1 x 1.4GHz UltraSPARC T2)        3,600    17,316          52 DAS   Solaris 10   Sun JMS 6.2

Complete benchmark results may be found at the SPEC benchmark website http://www.spec.org

Users - SPECmail_Ent2009 Users
Sessions/hour - SPECmail2009 Sessions/hour
NAS - Network Attached Storage
DAS - Direct Attached Storage

Results and Configuration Summary

Hardware Configuration:

    Sun SPARC Enterprise T5240
      2 x 1.6 GHz UltraSPARC T2 Plus processors
      128 GB memory
      2 x 146GB, 10K RPM SAS disks, 4 x 32GB SSDs

External Storage:

    2 x Sun Storage 7310 Unified Storage System, each with
      32 GB of memory
      24 x 1 TB 7200 RPM SATA Drives

Software Configuration:

    Solaris 10
    ZFS
    Sun Java Communications Suite 7 Update 2
      Sun Java System Messaging Server 7.2
      Directory Server 6.3

Benchmark Description

The SPECmail2009 benchmark measures the ability of corporate e-mail systems to meet the demands of today's e-mail users over fast corporate local area networks (LANs). The SPECmail2009 benchmark simulates corporate mail server workloads that range from 250 to 10,000 or more users, using the industry standard SMTP and IMAP4 protocols. This e-mail server benchmark creates client workloads based on a 40,000 user corporation, and uses folder and message MIME structures that include both traditional office documents and a variety of rich media content. The benchmark also adds support for encrypted network connections using industry standard SSL v3.0 and TLS 1.0 technology. SPECmail2009 replaces all versions of SPECmail2008, first released in August 2008. The results from the two benchmarks are not comparable.

Software on one or more client machines generates a benchmark load for a System Under Test (SUT) and measures the SUT response times. A SUT can be a mail server running on a single system or a cluster of systems.

A SPECmail2009 'run' simulates a 100% load level associated with the specific number of users, as defined in the configuration file. The mail server must maintain a specific Quality of Service (QoS) at the 100% load level to produce a valid benchmark result. If the mail server does maintain the specified QoS at the 100% load level, the performance of the mail server is reported as SPECmail_Ent2009 SMTP and IMAP Users at SPECmail2009 Sessions per hour. The SPECmail_Ent2009 users at SPECmail2009 Sessions per Hour metric reflects the unique workload combination for a SPEC IMAP4 user.

Key Points and Best Practices

  • Each Sun Storage 7310 Unified Storage System was configured with one J4400 JBOD array (22 x 1 TB SATA drives) as a mirrored pool, with 4 shared volumes built on the mirror. The 8 mirrored volumes from the 2 Sun Storage 7310 systems were mounted on the system under test (SUT) over the NFSv4 protocol and used for the messaging mail index and mail message file systems. Four SSDs were used as SUT internal disks, each configured as a ZFS file system; these ZFS file systems were used for the messaging server queue, store metadata, and LDAP. The SSDs substantially reduced the store metadata and queue latencies.

  • Each Sun Storage 7310 Unified Storage System was connected to the SUT via a dual 10-Gigabit Ethernet Fiber XFP card.

  • The Sun Storage 7310 Unified Storage System software version is 2009.08.11,1-0.

  • The clients used these Java options: java -d64 -Xms4096m -Xmx4096m -XX:+AggressiveHeap

  • Substantial performance improvement and scalability were observed with Sun Communications Suite 7 Update 2, Java Messaging Server 7.2, and Directory Server 6.2.

  • See the SPEC Report for all OS, network and messaging server tunings.

See Also

Disclosure Statement

SPEC, SPECmail reg tm of Standard Performance Evaluation Corporation. Results as of 10/22/09 on www.spec.org. SPECmail2009: Sun SPARC Enterprise T5240, SPECmail_Ent2009 14,500 users at 69,857 SPECmail2009 Sessions/hour. Apple Xserv3,1, SPECmail_Ent2009 6,000 users at 28,887 SPECmail2009 Sessions/hour.

Wednesday Nov 18, 2009

Sun Flash Accelerator F20 PCIe Card Achieves 100K 4K IOPS and 1.1 GB/sec

Part of the Sun FlashFire family, the Sun Flash Accelerator F20 PCIe Card is a low-profile x8 PCIe card with 4 Solid State Disks-on-Modules (DOMs) delivering over 101K IOPS (4K IO) and 1.1 GB/sec throughput (1M reads).

The Sun F20 card is designed to accelerate IO-intensive applications, such as databases, at a fraction of the power, space, and cost of traditional hard disk drives. It is based on enterprise-class SLC flash technology, with advanced wear-leveling, integrated backup protection, solid state robustness, and 3M hours MTBF reliability.

  • The Sun Flash Accelerator F20 PCIe Card demonstrates breakthrough performance of 101K IOPS for 4K random reads.
  • The Sun Flash Accelerator F20 PCIe Card can also perform 88K IOPS for 4K random writes.
  • The Sun Flash Accelerator F20 PCIe Card has unprecedented throughput of 1.1 GB/sec.
  • The Sun Flash Accelerator F20 PCIe Card (low-profile x8 size) has the IOPS performance of over 550 SAS drives or 1,100 SATA drives (see the sketch below).
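The drive-equivalence bullet can be reproduced with typical rule-of-thumb per-drive IOPS figures; the per-drive values below are assumptions for illustration, not numbers from the original measurements:

    # Rough check of the "over 550 SAS drives or 1,100 SATA drives" equivalence.
    # Per-drive IOPS figures are assumed rule-of-thumb values, not measured here.

    f20_read_iops = 101_000
    sas_15k_iops = 180        # assumed random IOPS for one 15K RPM SAS drive
    sata_7200_iops = 90       # assumed random IOPS for one 7200 RPM SATA drive

    print(f"SAS drive equivalent:  ~{f20_read_iops / sas_15k_iops:.0f} drives")    # ~561, i.e. over 550
    print(f"SATA drive equivalent: ~{f20_read_iops / sata_7200_iops:.0f} drives")  # ~1122, i.e. over 1,100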

Performance Landscape

Bandwidth and IOPS Measurements

Test                                       4 DOMs       2 DOMs       1 DOM
Random 4K Read                             101K IOPS    68K IOPS     35K IOPS
Maximum Delivered Random 4K Write          88K IOPS     44K IOPS     22K IOPS
Maximum Delivered 50-50 4K Read/Write      54K IOPS     27K IOPS     13K IOPS
Sequential Read (1M)                       1.1 GB/sec   547 MB/sec   273 MB/sec
Maximum Delivered Sequential Write (1M)    567 MB/sec   243 MB/sec   125 MB/sec

Sustained Random 4K Write*                 37K IOPS     18K IOPS     10K IOPS
Sustained 50/50 4K Read/Write*             34K IOPS     17K IOPS     8.6K IOPS

(*) Maximum Delivered values measured over a 1 minute period. Sustained write performance differs from maximum delivered performance. Over time, wear-leveling and erase operations are required and impact write performance levels.

Latency Measurements

The Sun Flash Accelerator F20 PCIe Card is tuned for 4 KB or larger IO sizes; the write service time for IOs smaller than 4 KB can be 10 times longer than shown in the table below. It should also be noted that the service times shown below include both the latency and the time to transfer the data; the transfer time becomes the dominant portion of the service time for IOs over 64 KB in size. A short worked example follows the latency table.

Transfer Size   Read Service Time (ms)   Write Service Time (ms)
4 KB            0.32                     0.22
8 KB            0.34                     0.24
16 KB           0.37                     0.27
32 KB           0.43                     0.33
64 KB           0.54                     0.46
128 KB          0.49                     1.30
256 KB          1.31                     2.15
512 KB          2.25                     2.25

- Latencies are application latencies measured with the vdbench tool.
- Please note that the FlashFire F20 card is a 4 KB sector device. Doing IOs of less than 4 KB in size, or not aligned on 4 KB boundaries, can result in significant performance degradation on write operations.
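To illustrate how the data transfer comes to dominate the read service time for large IOs, the sketch below estimates the transfer portion, assuming a large IO is streamed by a single DOM at roughly its 273 MB/sec sequential-read rate (an assumption for illustration; the service times are from the table above):

    # Estimate the data-transfer portion of the measured read service times above,
    # assuming one DOM streams the transfer at ~273 MB/sec (illustrative assumption).

    dom_read_mb_s = 273.0
    measured_read_ms = {64: 0.54, 128: 0.49, 256: 1.31, 512: 2.25}  # KB -> ms, from the table

    for size_kb, service_ms in measured_read_ms.items():
        transfer_ms = (size_kb / 1024.0) / dom_read_mb_s * 1000.0
        print(f"{size_kb:>3} KB: transfer ~{transfer_ms:.2f} ms of {service_ms:.2f} ms measured")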

Results and Configuration Summary

Storage:

    Sun Flash Accelerator F20 PCIe Card
      4 x 24-GB Solid State Disks-on-Modules (DOMs)

Servers:

    1 x Sun Fire X4170

Software:

    OpenSolaris 2009.06 or Solaris 10 10/09 (MPT driver enhancements)
    Vdbench 5.0
    Required Flash Array Patches SPARC, ses/sgen patch 138128-01 or later & mpt patch 141736-05
    Required Flash Array Patches x86, ses/sgen patch 138129-01 or later & mpt patch 141737-05

Benchmark Description

Sun measured a wide variety of IO performance metrics on the Sun Flash Accelerator F20 PCIe Card using Vdbench 5.0 measuring 100% Random Read, 100% Random Write, 100% Sequential Read, 100% Sequential Write, and 50-50 read/write. This demonstrates the maximum performance and throughput of the storage system.

The vdbench profile f20-parmfile.txt was used for the bandwidth and IOPS measurements, and the profile f20-latency.txt was used for the latency measurements.

Vdbench is publicly available for download at: http://vdbench.org

Key Points and Best Practices

  • Drive each Flash Module with 32 outstanding IOs, as specified in the benchmark profile referenced above.
  • SPARC platforms align with the 4K boundary size set by the Flash Modules. x86/Windows platforms don't necessarily have this alignment built in and can show lower performance.

See Also

Disclosure Statement

Sun Flash Accelerator F20 PCIe Card delivered 100K 4K read IOPS and 1.1 GB/sec sequential read. Vdbench 5.0 (http://vdbench.org) was used for the test. Results as of September 14, 2009.

Wednesday Nov 04, 2009

New TPC-C World Record Sun/Oracle

TPC-C Sun SPARC Enterprise T5440 with Oracle RAC World Record Database Result

Sun and Oracle demonstrate the World's fastest database performance. Sun Microsystems using 12 Sun SPARC Enterprise T5440 servers, 60 Sun Storage F5100 Flash arrays and Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning delivered a world-record TPC-C benchmark result.

  • The 12-node Sun SPARC Enterprise T5440 server cluster result delivered a world record TPC-C benchmark result of 7,646,486.7 tpmC and $2.36/tpmC (USD) using Oracle 11g R1 on a configuration available 3/19/10.

  • The 12-node Sun SPARC Enterprise T5440 server cluster beats the performance of the IBM Power 595 (5GHz) with IBM DB2 9.5 database by 26% and has 16% better price/performance on the TPC-C benchmark.

  • The complete Oracle/Sun solution used 10.7x better computational density than the IBM configuration (computational density = performance/rack).

  • The complete Oracle/Sun solution used 8 times fewer racks than the IBM configuration.

  • The complete Oracle/Sun solution has 5.9x better power/performance than the IBM configuration.

  • The 12-node Sun SPARC Enterprise T5440 server cluster beats the performance of the HP Superdome (1.6GHz Itanium2) by 87% and has 19% better price/performance on the TPC-C benchmark.

  • The Oracle/Sun solution utilized Sun FlashFire technology to deliver this result. The Sun Storage F5100 flash array was used for database storage.

  • Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning scales and effectively uses all of the nodes in this configuration to produce the world record performance.

  • This result showed Sun and Oracle's integrated hardware and software stacks provide industry-leading performance.

More information on this benchmark will be posted in the next several days.

Performance Landscape

TPC-C results (sorted by tpmC, bigger is better)


System                            tpmC        Price/tpmC   Avail      Database         Cluster   Racks   w/KtpmC
12 x Sun SPARC Enterprise T5440   7,646,487   2.36 USD     03/19/10   Oracle 11g RAC   Y         9       9.6
IBM Power 595                     6,085,166   2.81 USD     12/10/08   IBM DB2 9.5      N         76      56.4
HP Integrity Superdome            4,092,799   2.93 USD     08/06/07   Oracle 10g R2    N         46      to be added

Avail - Availability date
w/KtpmC - Watts per 1000 tpmC
Racks - clients, servers, storage, infrastructure
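The rack, density, and power ratios called out in the bullets above follow directly from the table (a quick sketch using the rounded values shown here):

    # Sanity-check the rack, density, and power/performance ratios from the table above.

    sun = {"tpmC": 7_646_487, "racks": 9, "watts_per_ktpmc": 9.6}
    ibm = {"tpmC": 6_085_166, "racks": 76, "watts_per_ktpmc": 56.4}

    print(f"racks:      {ibm['racks'] / sun['racks']:.1f}x fewer")                       # ~8.4x
    density_ratio = (sun["tpmC"] / sun["racks"]) / (ibm["tpmC"] / ibm["racks"])
    print(f"tpmC/rack:  {density_ratio:.1f}x better")                                    # ~10.6x with these rounded figures
    print(f"power/perf: {ibm['watts_per_ktpmc'] / sun['watts_per_ktpmc']:.1f}x better")  # ~5.9x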

Sun and IBM TPC-C Response times


System                            tpmC        New Order 90th%     New Order Average
                                              Response Time       Response Time
12 x Sun SPARC Enterprise T5440   7,646,487   0.170               0.168
IBM Power 595                     6,085,166   1.69                1.22
Response Time Ratio - Sun Better              9.9x                7.3x

Sun uses the 7x (average response time) comparison to highlight the difference in response times between Sun's solution and IBM's, although note that Sun is nearly 10x faster on New Order transactions that finish within the 90th percentile.

It is also interesting to note that none of Sun's response times, average or 90th percentile, for any transaction is over 0.25 seconds, while IBM does not have even one interactive transaction, not even the menu, below 0.50 seconds. Graphs of Sun's and IBM's response times for New-Order can be found in the full disclosure reports on TPC's website, on the TPC-C Official Result Page.

Results and Configuration Summary

Hardware Configuration:

    9 racks used to hold

    Servers:
      12 x Sun SPARC Enterprise T5440
      4 x 1.6 GHz UltraSPARC T2 Plus
      512 GB memory
      10 GbE network for cluster
    Storage:
      60 x Sun Storage F5100 Flash Array
      61 x Sun Fire X4275, Comstar SAS target emulation
      24 x Sun StorageTek 6140 (16 x 300 GB SAS 15K RPM)
      6 x Sun Storage J4400
      3 x 80-port Brocade FC switches
    Clients:
      24 x Sun Fire X4170, each with
      2 x 2.53 GHz X5540
      48 GB memory

Software Configuration:

    Solaris 10 10/09
    OpenSolaris 6/09 (COMSTAR) for Sun Fire X4275
    Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning
    Tuxedo CFS-R Tier 1
    Sun Web Server 7.0 Update 5

Benchmark Description

TPC-C is an OLTP system benchmark. It simulates a complete environment where a population of terminal operators executes transactions against a database. The benchmark is centered around the principal activities (transactions) of an order-entry environment. These transactions include entering and delivering orders, recording payments, checking the status of orders, and monitoring the level of stock at the warehouses.

See Also

Disclosure Statement

TPC Benchmark C, tpmC, and TPC-C are trademarks of the Transaction Performance Processing Council (TPC). 12-node Sun SPARC Enterprise T5440 Cluster (1.6GHz UltraSPARC T2 Plus, 4 processor) with Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning, 7,646,486.7 tpmC, $2.36/tpmC. Available 3/19/10. IBM Power 595 (5GHz Power6, 32 chips, 64 cores, 128 threads) with IBM DB2 9.5, 6,085,166 tpmC, $2.81/tpmC, available 12/10/08. HP Integrity Superdome(1.6GHz Itanium2, 64 processors, 128 cores, 256 threads) with Oracle 10g Enterprise Edition, 4,092,799 tpmC, $2.93/tpmC. Available 8/06/07. Source: www.tpc.org, results as of 11/5/09.

Wednesday Oct 28, 2009

SPC-2 Sun Storage 6780 Array RAID 5 & RAID 6 51% better $/performance than IBM DS5300

Significance of Results

Results on the Sun Storage 6780 Array with 8Gb connectivity are presented for the SPC-2 benchmark using RAID 5 and RAID 6.
  • The Sun Storage 6780 array outperforms the IBM DS5300 by 51% in price performance for SPC-2 benchmark using RAID 5 data protection.

  • The Sun Storage 6780 array outperforms the IBM DS5300 by 51% in price performance for SPC-2 benchmark using RAID 6 data protection.

  • The Sun Storage 6780 Array has 62% better performance than the Fujitsu 800/1100 and delivers a price performance advantage of 5.6x as measured by the SPC-2 benchmark.

  • The Sun Storage 6780 array with 8Gb connectivity improved performance by 36% over the 4Gb connected solution as measured by the SPC-2 benchmark.

Performance Landscape

SPC-2 Performance Chart (in increasing price-performance order)

Sponsor   System         SPC-2 MBPS   $/SPC-2 MBPS   ASU Capacity (GB)   TSC Price   Data Protection Level   Date       Results Identifier
Sun       SS6780 (8Gb)   5,634.17     $44.88         16,383.186          $252,873    RAID 5                  10/27/09   B00047
IBM       DS5300 (8Gb)   5,634.17     $67.75         16,383.186          $381,720    RAID 5                  10/21/09   B00045
Sun       SS6780 (8Gb)   5,543.88     $45.61         14,042.731          $252,873    RAID 6                  10/27/09   B00048
IBM       DS5300 (8Gb)   5,543.88     $68.85         14,042.731          $381,720    RAID 6                  10/21/09   B00046
Sun       SS6780 (4Gb)   4,818.43     $53.61         16,383.186          $258,329    RAID 5                  02/03/09   B00039
IBM       DS5300 (4Gb)   4,818.43     $93.80         16,383.186          $451,986    RAID 5                  09/25/08   B00037
Sun       SS6780 (4Gb)   4,675.50     $55.25         14,042.731          $258,329    RAID 6                  02/03/09   B00040
IBM       DS5300 (4Gb)   4,675.50     $96.67         14,042.731          $451,986    RAID 6                  09/25/08   B00038
Fujitsu   800/1100       3,480.68     $238.93        4,569.845           $831,649    Mirroring               03/08/07   B00019

SPC-2 MBPS = the Performance Metric
$/SPC-2 MBPS = the Price/Performance Metric
ASU Capacity = the Capacity Metric
Data Protection = Data Protection Metric
TSC Price = Total Cost of Ownership Metric
Results Identifier = A unique identification of the result

Complete SPC-2 benchmark results may be found at http://www.storageperformance.org.

Results and Configuration Summary

Storage Configuration:

    8 x CM200 trays, each with 16 x 146GB 15K RPM drives
    8 x Qlogic 8Gb HBA

Server Configuration:

    4 x IBM x3650
      2 x 2.93 GHz Intel X5570
      5 GB memory

Software Configuration:

    Microsoft Windows Server 2003 Enterprise Edition (32-bit) with SP2
    SPC-2 benchmark kit

Benchmark Description

The SPC Benchmark-2™ (SPC-2) is a series of related benchmark performance tests that simulate the sequential component of demands placed upon on-line, non-volatile storage in server class computer systems. SPC-2 provides measurements in support of real world environments characterized by:
  • Large numbers of concurrent sequential transfers.
  • Demanding data rate requirements, including requirements for real time processing.
  • Diverse application techniques for sequential processing.
  • Substantial storage capacity requirements.
  • Data persistence requirements to ensure preservation of data without corruption or loss.

Key Points and Best Practices

  • This benchmark was performed using RAID 5 and RAID 6 protection.
  • The controller stripe size was set to 512k.
  • No volume manager was used.

See Also


Disclosure Statement

SPC-2, SPC-2 MBPS, $/SPC-2 MBPS are registered trademarks of the Storage Performance Council (SPC). More info www.storageperformance.org. Sun Storage 6780 Array 5,634.17 SPC-2 MBPS, $/SPC-2 MBPS $44.88, ASU Capacity 16,383.186 GB, Protect RAID 5, Cost $252,873.00, Ident. B00047. Sun Storage 6780 Array 5,543.88 SPC-2 MBPS, $/SPC-2 MBPS $45.61, ASU Capacity 14,042.731 GB, Protect RAID 6, Cost $252,873.00, Ident. B00048.


Tuesday Oct 13, 2009

Oracle Hyperion Sun M5000 and Sun Storage 7410

The Sun SPARC Enterprise M5000 server with SPARC64 VII processors (configured with 4 CPUs) and the Sun Storage 7410 Unified Storage System achieved exceptional performance for Oracle Hyperion Essbase 11.1.1.3 with an Oracle 11g database, handling hundreds of GB of data and a 15-dimension database with millions of members, running on the free and open Solaris 10 OS. Oracle Hyperion is a component of Oracle Fusion Middleware.

  • The Sun Storage 7410 Unified Storage System provides more than 20% improvement out of the box compared to a mid-size fiber channel disk array for default aggregation and user based aggregation.

  • The Sun SPARC Enterprise M5000 server with Sun Storage 7410 Unified Storage System and Oracle Hyperion Essbase 11.1.1.3 running on Solaris 10 OS provides < 1sec query response times for 20K users in a 15 dimension database.

  • The Sun Storage 7410 Unified Storage System and Oracle Hyperion Essbase provide the best combination for large Essbase databases, leveraging ZFS and taking advantage of high bandwidth for faster loads and aggregation.

  • Oracle Fusion Middleware provides a family of complete, integrated, hot-pluggable and best-of-breed products known for enabling enterprise customers to create and run agile and intelligent business applications. Oracle Hyperion's performance demonstrates why so many customers rely on Oracle Fusion Middleware as their foundation for innovation.

Performance Landscape

System          Processor                OS        Storage               Dataload   Def. Agg   User Agg
Sun SE M5000    4 x 2.4GHz SPARC64 VII   Solaris   Sun Storage 7410      120 min    448 min    17.5 min
Sun SE M5000    4 x 2.4GHz SPARC64 VII   Solaris   Sun StorageTek 6140   128 min    526 min    24.7 min

Results and Configuration Summary

Hardware Configuration:

    1 x Sun SPARC Enterprise M5000 (2.4 GHz/32GB)
    1 x Sun StorageTek 6140 (32 x 146GB)
    1 x Sun Storage 7410 (24TB disk)

Software Configuration:

    Solaris 10 5/09
    Installer V 11.1.1.3
    Oracle Hyperion Essbase Client v 11.1.1.3
    Oracle Hyperion Essbase v 11.1.1.3
    Oracle Hyperion Essbase Administration Services 64-bit
    Oracle Weblogic 9.2MP3 -- 64 bit
    Oracle Fusion Middleware
    Sun's JDK 1.5 Update 19 -- 64-bit
    Oracle RDBMS 11.1.0.7 64-bit
    HP's Mercury Interactive QuickTest Professional 9.0

Benchmark Description

Oracle Hyperion is an OLAP-based analytics application used to analyze highly dimensional, detailed business needs and plans, such as "what-if" analysis to look into the future, multi-user scenario modeling, planning, and customer buying patterns.

The objective of the benchmark is to collect data for the following Oracle Hyperion Essbase benchmark key performance indicators (KPI):
  • Database build time: Time elapsed to build a database including outline and data load.

  • Database Aggregation build time: Time elapsed to build aggregation.

  • Analytic Query Time: With user load increasing from 500, 1000, 2000, 10000, 20000 users track the time required to process each query and hence track average analytic query time.

  • Analytic Queries per minute: Number of queries handled by the Essbase server per minute. Also track resource usage, i.e. CPU and memory.

The benchmark is based on the data set used by Product Assurance for 2005 Essbase 7.x testing.

    40 flat files of 1.2 GB each, 49.4 GB in total
    10 million rows per file, 400 million rows total
    28 columns of data per row
    49.4 GB total size of 40 files
    Database outline has 15 dimensions (five of them are attribute dimensions)
    Customer dimension has 13.3 million members

Key Points and Best Practices

  • The Sun Storage 7410 was configured with iSCSI.

See Also

Disclosure Statement

Oracle Hyperion Enterprise, www.oracle.com/solutions/mid/oracle-hyperion-enterprise.html, results 10/13/2009.

Monday Oct 12, 2009

SPC-2 Sun Storage 6180 Array RAID 5 & RAID 6 Over 70% Better Price Performance than IBM

Significance of Results

Results on the Sun Storage 6180 Array with 8Gb connectivity are presented for the SPC-2 benchmark using RAID 5 and RAID 6.
  • The Sun Storage 6180 Array outperforms the IBM DS5020 by 77% in price performance for SPC-2 benchmark using RAID 5 data protection.

  • The Sun Storage 6180 Array outperforms the IBM DS5020 by 91% in price performance for SPC-2 benchmark using RAID 6 data protection.

  • The Sun Storage 6180 Array is 50% faster than the previous generation, the Sun Storage 6140 Array and IBM DS4700 on the SPC-2 benchmark using RAID 5 data protection.

Performance Landscape

SPC-2 Performance Chart (in increasing price-performance order)

Sponsor   System    SPC-2 MBPS   $/SPC-2 MBPS   ASU Capacity (GB)   TSC Price   Data Protection Level   Date       Results Identifier
Sun       SS6180    1,286.74     $45.47         3,504.693           $58,512     RAID 6                  10/08/09   B00044
IBM       DS5020    1,286.74     $87.04         3,504.693           $112,002    RAID 6                  10/08/09   B00042
Sun       SS6180    1,244.89     $42.53         3,504.693           $52,951     RAID 5                  10/08/09   B00043
IBM       DS5020    1,244.89     $75.30         3,504.693           $93,742     RAID 5                  10/08/09   B00041
Sun       J4400     887.44       $25.63         23,965.918          $22,742     Unprotected             08/15/08   B00034
IBM       DS4700    823.62       $106.73        1,748.874           $87,903     RAID 5                  04/01/08   B00028
Sun       ST6140    790.67       $67.82         1,675.037           $53,622     RAID 5                  02/13/07   B00017
Sun       ST2540    735.62       $37.32         2,177.548           $27,451     RAID 5                  04/10/07   B00021
IBM       DS3400    731.25       $34.36         1,165.933           $25,123     RAID 5                  02/27/08   B00027
Sun       ST2530    672.05       $26.15         1,451.699           $17,572     RAID 5                  08/16/07   B00026
Sun       J4200     548.80       $22.92         11,995.295          $12,580     Unprotected             07/10/08   B00033

SPC-2 MBPS = the Performance Metric
$/SPC-2 MBPS = the Price/Performance Metric
ASU Capacity = the Capacity Metric
Data Protection = Data Protection Metric
TSC Price = Total Cost of Ownership Metric
Results Identifier = A unique identification of the result

Complete SPC-2 benchmark results may be found at http://www.storageperformance.org.

Results and Configuration Summary

Storage Configuration:

    30 x 146.8GB 15K RPM drives (for RAID 5)
    36 x 146.8GB 15K RPM drives (for RAID 6)
    4 x Qlogic HBA

Server Configuration:

    IBM system x3850 M2

Software Configuration:

    MS Win 2003 Server SP2
    SPC-2 benchmark kit

Benchmark Description

The SPC Benchmark-2™ (SPC-2) is a series of related benchmark performance tests that simulate the sequential component of demands placed upon on-line, non-volatile storage in server class computer systems. SPC-2 provides measurements in support of real world environments characterized by:
  • Large numbers of concurrent sequential transfers.
  • Demanding data rate requirements, including requirements for real time processing.
  • Diverse application techniques for sequential processing.
  • Substantial storage capacity requirements.
  • Data persistence requirements to ensure preservation of data without corruption or loss.

Key Points and Best Practices

  • This benchmark was performed using RAID 5 and RAID 6 protection.
  • The controller stripe size was set to 512k.
  • No volume manager was used.

See Also

Disclosure Statement

SPC-2, SPC-2 MBPS, $/SPC-2 MBPS are registered trademarks of the Storage Performance Council (SPC). More info www.storageperformance.org. Sun Storage 6180 Array 1,286.74 SPC-2 MBPS, $/SPC-2 MBPS $45.47, ASU Capacity 3,504.693 GB, Protect RAID 6, Cost $58,512.00, Ident. B00044. Sun Storage 6180 Array 1,244.89 SPC-2 MBPS, $/SPC-2 MBPS $42.53, ASU Capacity 3,504.693 GB, Protect RAID 5, Cost $52,951.00, Ident. B00043.

SPC-1 Sun Storage 6180 Array Over 70% Better Price Performance than IBM

Significance of Results

Results on the Sun Storage 6180 Array with 8Gb connectivity are presented for the SPC-1 benchmark.
  • The Sun Storage 6180 Array outperforms the IBM DS5020 by 72% in price performance on the SPC-1 benchmark.

  • The Sun Storage 6180 Array is 50% faster than the previous generation, Sun Storage 6140 Array and IBM DS4700 on the SPC-1 benchmark.

  • The Sun Storage 6180 Array betters the HDS 2100 by 27% in price performance on the SPC-1 benchmark.

  • The Sun Storage 6180 Array has 16% better IOPS/Drive performance than the HDS 2100 on the SPC-1 benchmark.

Performance Landscape

SPC-1 Performance Chart (in increasing price-performance order)

Sponsor   System          SPC-1 IOPS   $/SPC-1 IOPS   ASU Capacity (GB)   TSC Price   Data Protection Level   Date       Results Identifier
HDS       AMS 2300        42,502.61    $6.96          7,955.000           $295,740    Mirroring               3/24/09    A00077
HDS       AMS 2100        31,498.58    $5.85          3,967.500           $187,321    Mirroring               3/24/09    A00076
Sun       SS6180 (8Gb)    26,090.03    $4.70          5,145.060           $122,623    Mirroring               10/09/09   A00084
IBM       DS5020 (8Gb)    26,090.03    $8.08          5,145.060           $210,782    Mirroring               8/25/09    A00081
Fujitsu   DX80            19,492.86    $3.45          5,355.400           $67,296     Mirroring               9/14/09    A00082
Sun       STK6140 (4Gb)   17,395.53    $4.93          1,963.269           $85,823     Mirroring               10/16/06   A00048
IBM       DS4700 (4Gb)    17,195.84    $11.67         1,963.270           $200,666    Mirroring               8/21/06    A00046

SPC-1 IOPS = the Performance Metric
$/SPC-1 IOPS = the Price/Performance Metric
ASU Capacity = the Capacity Metric
Data Protection = Data Protection Metric
TSC Price = Total Cost of Ownership Metric
Results Identifier = A unique identification of the result

Complete SPC-1 benchmark results may be found at http://www.storageperformance.org.

Results and Configuration Summary

Storage Configuration:

    80 x 146.8GB 15K RPM drives
    8 Qlogic HBA

Server Configuration:

    IBM system x3850 M2

Software Configuration:

    MS Windows 2003 Server SP2
    SPC-1 benchmark kit

Benchmark Description

SPC Benchmark-1 (SPC-1) is the first industry standard storage benchmark and is the most comprehensive performance analysis environment ever constructed for storage subsystems. The I/O workload in SPC-1 is characterized by predominantly random I/O operations, as typified by multi-user OLTP, database, and email server environments. SPC-1 uses a highly efficient multi-threaded workload generator to thoroughly analyze direct attach or network storage subsystems. The SPC-1 benchmark enables companies to rapidly produce valid performance and price/performance results using a variety of host platforms and storage network topologies.

SPC-1 is built to:

  • Provide a level playing field for test sponsors.
  • Produce results that are powerful and yet simple to use.
  • Provide value for engineers as well as IT consumers and solution integrators.
  • Be easy to run, easy to audit/verify, and easy to use to report official results.

Key Points and Best Practices

See Also

Disclosure Statement

SPC-1, SPC-1 IOPS, $/SPC-1 IOPS reg tm of Storage Performance Council (SPC). More info www.storageperformance.org. Sun Storage 6180 Array 26,090.03 SPC-1 IOPS, ASU Capacity 5,145.060GB, $/SPC-1 IOPS $4.70, Data Protection Mirroring, Cost $122,623, Ident. A00084.


Sunday Oct 11, 2009

1.6 Million 4K IOPS in 1RU on Sun Storage F5100 Flash Array

The Sun Storage F5100 Flash Array is a high performance high density solid state flash array delivering over 1.6M IOPS (4K IO) and 12.8GB/sec throughput (1M reads). The Flash Array is designed to accelerate IO-intensive applications, such as databases, at a fraction of the power, space, and cost of traditional hard disk drives. It is based on enterprise-class SLC flash technology, with advanced wear-leveling, integrated backup protection, solid state robustness, and 3M hours MTBF reliability.

  • The Sun Storage F5100 Flash Array demonstrates breakthrough performance of 1.6M IOPS for 4K random reads
  • The Sun Storage F5100 Flash Array can also perform 1.2M IOPS for 4K random writes
  • The Sun Storage F5100 Flash Array has unprecedented throughput of 12.8 GB/sec.

Performance Landscape

Results were obtained using four hosts.

Bandwidth and IOPS Measurements

Test                                       80 Flash Modules   40 Flash Modules   20 Flash Modules   1 Flash Module
Random 4K Read                             1,591K IOPS        796K IOPS          397K IOPS          21K IOPS
Maximum Delivered Random 4K Write          1,217K IOPS        610K IOPS          304K IOPS          15K IOPS
Maximum Delivered 50-50 4K Read/Write      850K IOPS          426K IOPS          213K IOPS          11K IOPS
Sequential Read (1M)                       12.8 GB/sec        6.4 GB/sec         3.2 GB/sec         265 MB/sec
Maximum Delivered Sequential Write (1M)    9.7 GB/sec         4.8 GB/sec         2.4 GB/sec         118 MB/sec

Sustained Random 4K Write*                 172K IOPS          -                  -                  9K IOPS

(*) Maximum Delivered values measured over a 1 minute period. Sustained write performance measured over a 1 hour period and differs from maximum delivered performance. Over time, wear-leveling and erase operations are required and impact write performance levels.

Latency Measurements

The Sun Storage F5100 Flash Array is tuned for 4 KB or larger IO sizes; the write service time for IOs smaller than 4 KB can be 10 times longer than shown in the table below. It should also be noted that the service times shown below include both the latency and the time to transfer the data; the transfer time becomes the dominant portion of the service time for IOs over 64 KB in size.

Transfer Size   Read Service Time (ms)   Write Service Time (ms)
4 KB            0.41                     0.28
8 KB            0.42                     0.35
16 KB           0.45                     0.72
32 KB           0.51                     0.77
64 KB           0.63                     1.52
128 KB          0.87                     2.99
256 KB          1.34                     6.03
512 KB          2.29                     12.14
1024 KB         4.19                     23.79

- Latencies are application latencies measured with the vdbench tool.
- Please note that the F5100 Flash Array is a 4 KB sector device. Doing IOs of less than 4 KB in size, or not aligned on 4 KB boundaries, can result in significant performance degradation on write operations.

Results and Configuration Summary

Storage:

    Sun Storage F5100 Flash Array
      80 Flash Modules
      16 ports
      4 domains (20 Flash Modules per domain)
      CAM zoning - 5 Flash Modules per port

Servers:

    4 x Sun SPARC Enterprise T5240
    4 HBAs per server (16 total), firmware version 01.27.03.00-IT

Software:

    OpenSolaris 2009.06 or Solaris 10 10/09 (MPT driver enhancements)
    Vdbench 5.0
    Required Flash Array Patches SPARC, ses/sgen patch 138128-01 or later & mpt patch 141736-05
    Required Flash Array Patches x86, ses/sgen patch 138129-01 or later & mpt patch 141737-05

Benchmark Description

Sun measured a wide variety of IO performance metrics on the Sun Storage F5100 Flash Array using Vdbench 5.0 measuring 100% Random Read, 100% Random Write, 100% Sequential Read, 100% Sequential Write, and 50-50 read/write. This demonstrates the maximum performance and throughput of the storage system.

The vdbench profile parmfile.txt was used for these measurements.

Vdbench is publicly available for download at: http://vdbench.org

Key Points and Best Practices

  • Drive each Flash Module with 32 outstanding IOs, as specified in the benchmark profile referenced above.
  • LSI HBA firmware level should be at Phase 15 maxq.
  • For the LSI HBAs, either use single-port HBAs or use only 1 port per HBA.
  • SPARC platforms align with the 4K boundary size set by the Flash Array. x86/Windows platforms don't necessarily have this alignment built in and can show lower performance.

See Also

Disclosure Statement

Sun Storage F5100 Flash Array delivered 1.6M 4K read IOPS and 12.8 GB/sec sequential read. Vdbench 5.0 (http://vdbench.org) was used for the test. Results as of September 12, 2009.

TPC-C World Record Sun - Oracle

TPC-C Sun SPARC Enterprise T5440 with Oracle RAC World Record Database Result

Sun and Oracle demonstrate the World's fastest database performance. Sun Microsystems using 12 Sun SPARC Enterprise T5440 servers, 60 Sun Storage F5100 Flash arrays and Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning delivered a world-record TPC-C benchmark result.

  • The 12-node Sun SPARC Enterprise T5440 server cluster result delivered a world record TPC-C benchmark result of 7,646,486.7 tpmC and $2.36/tpmC (USD) using Oracle 11g R1 on a configuration available 3/19/10.

  • The 12-node Sun SPARC Enterprise T5440 server cluster beats the performance of the IBM Power 595 (5GHz) with IBM DB2 9.5 database by 26% and has 16% better price/performance on the TPC-C benchmark.

  • The complete Oracle/Sun solution used 10.7x better computational density than the IBM configuration (computational density = performance/rack).

  • The complete Oracle/Sun solution used 8 times fewer racks than the IBM configuration.

  • The complete Oracle/Sun solution has 5.9x better power/performance than the IBM configuration.

  • The 12-node Sun SPARC Enterprise T5440 server cluster beats the performance of the HP Superdome (1.6GHz Itanium2) by 87% and has 19% better price/performance on the TPC-C benchmark.

  • The Oracle/Sun solution utilized Sun FlashFire technology to deliver this result. The Sun Storage F5100 flash array was used for database storage.

  • Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning scales and effectively uses all of the nodes in this configuration to produce the world record performance.

  • This result showed Sun and Oracle's integrated hardware and software stacks provide industry-leading performance.

More information on this benchmark will be posted in the next several days.

Performance Landscape

TPC-C results (sorted by tpmC, bigger is better)


System                            tpmC        Price/tpmC   Avail      Database         Cluster   Racks   w/KtpmC
12 x Sun SPARC Enterprise T5440   7,646,487   2.36 USD     03/19/10   Oracle 11g RAC   Y         9       9.6
IBM Power 595                     6,085,166   2.81 USD     12/10/08   IBM DB2 9.5      N         76      56.4
Bull Escala PL6460R               6,085,166   2.81 USD     12/15/08   IBM DB2 9.5      N         71      56.4
HP Integrity Superdome            4,092,799   2.93 USD     08/06/07   Oracle 10g R2    N         46      to be added

Avail - Availability date
w/KtpmC - Watts per 1000 tpmC
Racks - clients, servers, storage, infrastructure

Results and Configuration Summary

Hardware Configuration:

    9 racks used to hold

    Servers:
      12 x Sun SPARC Enterprise T5440
      4 x 1.6 GHz UltraSPARC T2 Plus
      512 GB memory
      10 GbE network for cluster
    Storage:
      60 x Sun Storage F5100 Flash Array
      61 x Sun Fire X4275, Comstar SAS target emulation
      24 x Sun StorageTek 6140 (16 x 300 GB SAS 15K RPM)
      6 x Sun Storage J4400
      3 x 80-port Brocade FC switches
    Clients:
      24 x Sun Fire X4170, each with
      2 x 2.53 GHz X5540
      48 GB memory

Software Configuration:

    Solaris 10 10/09
    OpenSolaris 6/09 (COMSTAR) for Sun Fire X4275
    Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning
    Tuxedo CFS-R Tier 1
    Sun Web Server 7.0 Update 5

Benchmark Description

TPC-C is an OLTP system benchmark. It simulates a complete environment where a population of terminal operators executes transactions against a database. The benchmark is centered around the principal activities (transactions) of an order-entry environment. These transactions include entering and delivering orders, recording payments, checking the status of orders, and monitoring the level of stock at the warehouses.

POSTSCRIPT: Here are some comments on IBM's grasping-at-straws perf/core attacks on the TPC-C result:
c0t0d0s0 blog: "IBM's Reaction to Sun&Oracle TPC-C"

See Also

Disclosure Statement

TPC Benchmark C, tpmC, and TPC-C are trademarks of the Transaction Performance Processing Council (TPC). 12-node Sun SPARC Enterprise T5440 Cluster (1.6GHz UltraSPARC T2 Plus, 4 processor) with Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning, 7,646,486.7 tpmC, $2.36/tpmC. Available 3/19/10. IBM Power 595 (5GHz Power6, 32 chips, 64 cores, 128 threads) with IBM DB2 9.5, 6,085,166 tpmC, $2.81/tpmC, available 12/10/08. HP Integrity Superdome(1.6GHz Itanium2, 64 processors, 128 cores, 256 threads) with Oracle 10g Enterprise Edition, 4,092,799 tpmC, $2.93/tpmC. Available 8/06/07. Source: www.tpc.org, results as of 10/11/09.

Thursday Jun 25, 2009

Sun SSD Server Platform Bandwidth and IOPS (Speeds & Feeds)

The Sun SSD (32 GB SATA 2.5" SSD) is the world's first enterprise-quality, open-standard Flash design. Built to an industry-standard JEDEC form factor, the module is being made available to developers and the OpenSolaris Storage community to foster Flash innovation. The Sun SSD delivers unprecedented IO performance, saves on power, space, and cooling, and will enable new levels of server optimization and datacenter efficiencies.

  • The Sun SSD demonstrated performance of 98K 4K random read IOPS on a Sun Fire X4450 server running the Solaris operating system.

Performance Landscape

Solaris 10 Results

Test X4450 Result T5240 Result
Random Read (4K) 98.4K IOPS 71.5K IOPS
Random Write (4K) 31.8K IOPS 14.4K IOPS
50-50 Read/Write (4K) 14.9K IOPS 15.7K IOPS
Sequential Read 764 MB/sec 1012 MB/sec
Sequential Write 376 MB/sec 531 MB/sec
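
As a rough consistency check on the 4K random-read figures (our own arithmetic, not part of the published result), IOPS can be converted to bandwidth by multiplying by the transfer size:

    # Convert the 4K random-read IOPS above into approximate bandwidth.
    BLOCK_BYTES = 4 * 1024   # 4K transfer size used in the random tests
    for server, iops in (("X4450", 98_400), ("T5240", 71_500)):
        mb_per_sec = iops * BLOCK_BYTES / 1_000_000
        print(f"{server}: {iops:,} IOPS x 4 KiB ~= {mb_per_sec:.0f} MB/sec")
    # ~403 MB/sec and ~293 MB/sec respectively, below the 512K sequential-read
    # numbers, as expected for small-block random IO.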

Results and Configuration Summary

Storage:

    4 x Sun SSD
    32 GB SATA 2.5" SSD (24 GB usable)
    2.5in drive form factor

Servers:

    Sun SPARC Enterprise T5240 - 4 internal drive slots used (LSI driver)
    Sun Fire X4450 - 4 internal drive slots used (LSI driver)

Software:

    OpenSolaris 2009.06 or Solaris 10 10/09 (MPT driver enhancements)
    Vdbench 5.0

Benchmark Description

Sun measured a wide variety of IO performance metrics on the Sun SSD using Vdbench 5.0, covering 100% random read, 100% random write, 100% sequential read, 100% sequential write, and 50-50 mixed read/write workloads. This demonstrates the maximum performance and throughput of the storage system.

Vdbench profile:

    wd=wm_80dr,sd=sd*,readpct=0,rhpct=0,seekpct=100
    wd=ws_80dr,sd=sd*,readpct=0,rhpct=0,seekpct=0
    wd=rm_80dr,sd=(sd1-sd80),readpct=100,rhpct=0,seekpct=100
    wd=rs_80dr,sd=(sd1-sd80),readpct=100,rhpct=0,seekpct=0
    wd=rwm_80dr,sd=sd*,readpct=50,rhpct=0,seekpct=100
    rd=default
    ###Random Read and writes tests varying transfer size
    rd=default,el=30m,in=6,forx=(4K),forth=(32),io=max,pause=20
    rd=run1_rm_80dr,wd=rm_80dr
    rd=run2_wm_80dr,wd=wm_80dr
    rd=run3_rwm_80dr,wd=rwm_80dr
    ###Sequential read and Write tests varying transfer size
    rd=default,el=30m,in=6,forx=(512k),forth=(32),io=max,pause=20
    rd=run4_rs_80dr,wd=rs_80dr
    rd=run5_ws_80dr,wd=ws_80dr
Vdbench is publicly available for download at: http://vdbench.org

Key Points and Best Practices

  • All measurements were done with the internal HBA and not the internal RAID.

See Also

Disclosure Statement

Sun SSD delivered 71.5K 4K read IOPS and 1012 MB/sec sequential read. Vdbench 5.0 (http://vdbench.org) was used for the test. Results as of June 17, 2009.

Wednesday Jun 17, 2009

Performance of Sun 7410 and 7310 Unified Storage Array Line

Roch (rhymes with Spock) Bourbonnais posted more data showing the performance of Sun's OpenStorage products. Some of his basic conclusions are:

  • The Sun Storage 7410 Unified Storage Array delivers 1 GB/sec of throughput.
  • The Sun Storage 7310 Unified Storage Array delivers over 500 MB/sec on streaming writes for backup and imaging applications.
  • The Sun Storage 7410 Unified Storage Array delivers over 22,000 8K synchronous writes per second, combining strong database performance with the ease of deployment of network-attached storage and the economic benefits of inexpensive SATA disks.
  • The Sun Storage 7410 Unified Storage Array delivers over 36,000 random 8K reads per second from a 400 GB working set for responsive mail applications. This corresponds to an enterprise of 100,000 people, with every employee accessing new data every 3.6 seconds, consolidated on a single server (a back-of-the-envelope check follows below).
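
A back-of-the-envelope check of that last claim, assuming (our assumption) that each "access to new data" maps to a single 8K random read:

    # Rough check of the mail-consolidation claim above.
    employees = 100_000
    seconds_between_accesses = 3.6
    required_iops = employees / seconds_between_accesses    # ~27,800 reads/sec
    measured_iops = 36_000
    print(f"required: ~{required_iops:,.0f} reads/sec, measured: {measured_iops:,}, "
          f"headroom: ~{measured_iops / required_iops:.2f}x")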

You can read more about it at: http://blogs.sun.com/roch/entry/compared_performance_of_sun_7000

Thursday Jun 11, 2009

SAS Grid Computing 9.2 utilizing the Sun Storage 7410 Unified Storage System

Sun has demonstrated the first large scale grid validation for the SAS Grid Computing 9.2 benchmark. This workload showed both the strength of Solaris 10, using containers for ease of deployment, and the value of the Sun Storage 7410 Unified Storage System with Fishworks Analytics in analyzing, tuning, and delivering performance in complex, data-intensive, multi-node SAS grid environments.

To model the real world, the Grid Endurance Test uses large data sizes and complex data processing, so the results reflect real customer scenarios. These benchmark results represent significant engineering effort, collaboration, and coordination between SAS and Sun, and they illustrate the commitment of the two companies to provide the best solutions for the most demanding data integration requirements.

  • A combination of 7 Sun Fire X2200 M2 servers utilizing Solaris 10 and a Sun Storage 7410 Unified Storage System showed continued performance improvement as the node count increased from 2 to 7 nodes for the Grid Endurance Test.
  • SAS environments are often complex. Ease of deployment, configuration, use, and ability to observe application IO characteristics (hotspots, trouble areas) are critical for production environments. The power of Fishworks Analytics combined with the reliability of ZFS is a perfect fit for these types of applications.
  • The Sun Storage 7410 Unified Storage System (exporting via NFS) satisfied the performance needs, with throughput peaking at over 900 MB/sec (near 10GbE line speed) in this multi-node environment.
  • Solaris 10 Containers were used to create agile and flexible deployment environments. Container deployments were trivially migrated (within minutes) as HW resources became available (Grid expanded).
  • This result is the only large-scale grid validation for SAS Grid Computing 9.2, and the first and most timely qualification of OpenStorage for SAS.
  • The tests showed delivered throughput of over 100 MB/sec through a client's 1Gb connection.

Configuration

The test grid consisted of 8 Sun Fire X2200 M2 servers, 1 configured as the grid manager and 7 as the actual grid nodes. Each node had a 1GbE connection through a Brocade FastIron 1GbE/10GbE switch. The 7410 had a 10GbE connection to the switch and sat as the back-end storage, providing the common shared file system that SAS Grid Computing requires across all nodes. A storage appliance like the 7410 serves as an easy-to-set-up and easy-to-maintain solution that satisfies the bandwidth required by the grid. Our particular 7410 consisted of 46 700GB 7200RPM SATA drives, 36GB of write-optimized SSDs, and 300GB of read-optimized SSDs.


About the Test

The workload is a batch mixture.  CPU bound workloads are numerically intensive tests, some using tables varying in row count from  9,000 to almost 200,000.  The tables have up to 297 variables, and are processed with both stepwise linear regression and stepwise logistic regression.   Other computational tests use GLM (General Linear Model).  IO intensive jobs vary as well.  One particular test reads raw data from multiple files, then generates 2 SAS data sets, one containing over 5 million records, the 2nd over 12 million.  Another IO intensive job creates a 50 million record SAS data set, then subsequently does lookups against it and finally sorts it into a dimension table.   Finally, other jobs are both compute and IO intensive.

 The SAS IO pattern for all these jobs is almost always sequential, for read, write, and mixed access, as can be viewed via Fishworks Analytics further below.  The typical block size for IO is 32KB. 

Governing the batch is the SAS Grid Manager Scheduler, Platform LSF. It determines when to add a job to a node based on the number of open job slots (user defined) and a point-in-time sample of how busy the node actually is. From run to run, jobs end up scheduled randomly, making runs less predictable. Inevitably, multiple IO-intensive jobs will get scheduled on the same node, throttling the 1Gb connection and creating a bottleneck while other nodes do little to no IO. Often this is unavoidable due to the great variety in behavior a SAS program can go through during its lifecycle. For example, a program can start out as CPU intensive and be scheduled on a node processing an IO-intensive job. This is the desired behavior and the correct decision based on that point in time. However, the initially CPU-intensive job can then turn IO intensive as it proceeds through its lifecycle.


Results of scaling up node count

Below is a chart of results scaling from 2 to 7 nodes.  The metric is total run time from when the 1st job is scheduled, until the last job is completed.

Scaling of 400 Analytics Batch Workload
Number of Nodes Time to Completion
2 6hr 19min
3 4hr 35min
4 3hr 30min
5 3hr 12min
6 2hr 54min
7 2hr 44min

One may note that time to completion is not linear as node count scales upwards.   To a large extent this is due to the nature of the workload as explained above regarding 1Gb connections getting saturated.  If this were a highly tuned benchmark with jobs placed with high precision, we certainly could have improved run time.  However, we did not do this in order to keep the batch as realistic as possible.  On the positive side, we do continue to see improved run times up to the 7th node.
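
To put numbers on "not linear", the table above can be turned into speedup and parallel-efficiency figures relative to the 2-node run. The calculation below is ours, using only the times listed in the table.

    # Speedup and efficiency for the Grid Endurance Test, baseline = 2 nodes.
    runs_minutes = {2: 6*60 + 19, 3: 4*60 + 35, 4: 3*60 + 30,
                    5: 3*60 + 12, 6: 2*60 + 54, 7: 2*60 + 44}
    base_nodes = 2
    base_time = runs_minutes[base_nodes]
    for nodes, minutes in sorted(runs_minutes.items()):
        speedup = base_time / minutes                 # e.g. ~2.3x at 7 nodes
        efficiency = speedup / (nodes / base_nodes)   # ~66% at 7 nodes, reflecting the 1Gb bottlenecks
        print(f"{nodes} nodes: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")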

The Fishworks Analytics displays below show several performance statistics with varying numbers of nodes, with more nodes on the left and fewer on the right. The first two graphs show file operations per second, and the third shows network bytes per second. The 7410 provides over 900 MB/sec in the seven-node test. More information about the interpretation of the Fishworks data for these tests will be provided in a later white paper.

An impressive detail in the Fishworks Analytics capture above: throughput of 763 MB/sec was achieved during the sample period. That wasn't the top end of what the 7410 could provide. For the tests summarized in the table above, the 7-node run peaked at over 900 MB/sec through a single 10GbE connection. Clearly the 7410 can sustain a fairly high level of IO.

It is also important to note that while we did try to emulate a real-world scenario with varying types of jobs and well over 1 TB of data being manipulated during the batch, this is a benchmark. The workload tries to encompass a large variety of job behavior. Your scenario may differ considerably from what was run here. Along with scheduling issues, we were certainly seeing signs of pushing this 7410 configuration near its limits (with the SAS IO pattern and data set sizes), which also affected the ability to achieve linear scaling. But many grid environments run workloads that aren't very IO intensive and tend to be more CPU bound with minimal IO requirements. In that scenario one could expect to see excellent node scaling well beyond what was demonstrated by this batch. To demonstrate this, the batch was rerun without the IO-intensive jobs. The remaining jobs do require some IO, but tend to be restricted to 25 MB/sec or less per process, and only for the purpose of initially reading a data set or writing results.

  • 3 nodes ran in 120 minutes
  • 7 nodes ran in 58 minutes
That is very nice, near-linear scaling: roughly a 2.1x speedup on 2.3x the nodes, especially considering the lag that can occur when scheduling batch jobs. The point of this exercise: know your workload. In this case, the 7410 solution on the back end was more than up to the demands these 350+ jobs put on it, and there was still room to grow and scale out more nodes, further reducing overall run time.


Tuning(?)

The question mark is actually appropriate. For the achieved results, after configuring a RAID1 share on the 7410, only one parameter made a significant difference. During the IO-intensive periods, single 1Gb client throughput was observed at 120 MB/sec simplex and 180 MB/sec duplex, producing well over 100,000 interrupts a second. Jumbo frames were enabled on the 7410 and the clients, reducing interrupts by almost 75% and reducing IO-intensive job run time by an average of 12%. Many other NFS, Solaris, and TCP/IP tunings were tried, with no meaningful improvement in microbenchmarks or in the actual batch. A nice, relatively simple (for a grid) setup.
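
A rough model of why jumbo frames helped so much, assuming (our simplification) full-sized TCP segments and roughly one interrupt per packet; real NICs coalesce interrupts, so treat these as upper bounds:

    # Approximate packet rates at the observed ~180 MB/sec duplex client throughput.
    throughput_bytes_per_sec = 180 * 1_000_000
    tcp_payload = {"MTU 1500": 1448, "MTU 9000": 8948}   # payload per full segment (52 bytes of IP/TCP headers)
    pps = {mtu: throughput_bytes_per_sec / size for mtu, size in tcp_payload.items()}
    for mtu, rate in pps.items():
        print(f"{mtu}: ~{rate:,.0f} packets/sec")        # ~124K vs ~20K
    print(f"theoretical reduction: ~{1 - pps['MTU 9000'] / pps['MTU 1500']:.0%}")
    # ~84% fewer packets in theory; the observed ~75% drop in interrupts is in the same ballpark.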

Not a direct tuning, but an application change worth mentioning came from the visibility that Analytics provides. Early on, during the load phase of the benchmark, the IO rate was less than spectacular. What should have taken about 4.5 hours was going to take almost a day. Drilling down through Analytics showed us that hundreds of thousands of file opens/closes were occurring that the development team had been unaware of. That was quickly fixed, and the data loader ran at the expected rates.


Okay - Really no other tuning?  How about 10GbE!

Alright, so there was something else we tried that was outside the test results achieved above. The X2200 we were using is an 8-core box. Even when maxing out the 1Gb testing with multiple IO-bound jobs, there were still CPU resources left over. Considering that a higher core count with more memory is becoming the standard for a "client", it makes sense to utilize all those resources. In the case where a node would be scheduled with multiple IO jobs, we wanted to see whether 10GbE could push up client throughput. Through our testing, two things helped improve performance.

The first was to turn off interrupt blanking. With blanking disabled, packets are processed as they arrive rather than when an interrupt is issued. Doing this resulted in a ~15% increase in duplex throughput. Caveat: there is a reason interrupt blanking exists, and it isn't to slow down your network throughput. Tune this only if you have a decent amount of idle CPU, as disabling interrupt blanking will consume it. The other change that significantly increased throughput through the 10GbE NIC was to use multiple NFS client processes, which we achieved through zones. By adding a second zone, throughput through the single 10GbE interface increased ~30%. The final duplex numbers (also peak throughput) were:

  • 288MB/s no tuning
  • 337MB/s interrupt blanking disabled
  • 430MB/s 2 NFS client processes + interrupt blanking disabled


Conclusion - what does this show?

  • SAS Grid Computing, which requires a shared file system across all nodes, fits very nicely on the 7410 storage appliance. The workload continued to scale as nodes were added.
  • The 7410 can provide very solid throughput, peaking at over 900 MB/sec (near 10GbE line speed) with the configuration tested.
  • The 7410 is easy to set up and gives an incredible depth of knowledge about the IO your application does, which can lead to optimization.
  • Know your workload: in many cases the 7410 storage appliance can be a great fit at a relatively inexpensive price while providing the benefits described above (and others not described).
  • 10GbE client networking can help if your 1GbE IO pipeline is a bottleneck and there is a reasonable amount of free CPU headroom.


Additional Reading

Sun.com on SAS Grid Computing and the Sun Storage 7410 Unified Storage Array

Description of SAS Grid Computing

