Monday Nov 25, 2013

World Record Single System TPC-H @10000GB Benchmark on SPARC T5-4

Oracle's SPARC T5-4 server delivered world record single server performance of 377,594 QphH@10000GB with price/performance of $4.65/QphH@10000GB USD on the TPC-H @10000GB benchmark. This result shows that the 4-chip SPARC T5-4 server is significantly faster than the 8-chip server results from HP (Intel x86 based).

  • The SPARC T5-4 server with four SPARC T5 processors is 2.4 times faster than the HP ProLiant DL980 G7 server with eight x86 processors.

  • The SPARC T5-4 server delivered 4.8 times better performance per chip and 3.0 times better performance per core than the HP ProLiant DL980 G7 server.

  • The SPARC T5-4 server has 28% better price/performance than the HP ProLiant DL980 G7 server (for the price/QphH metric).

  • The SPARC T5-4 server with 2 TB memory is 2.4 times faster than the HP ProLiant DL980 G7 server with 4 TB memory (for the composite metric).

  • The SPARC T5-4 server took 9 hours, 37 minutes, 54 seconds for data loading while the HP ProLiant DL980 G7 server took 8.3 times longer.

  • The SPARC T5-4 server accomplished the refresh function in around a minute, while the HP ProLiant DL980 G7 server took up to 7.1 times longer to do the same function.

This result demonstrates a complete data warehouse solution that shows the performance of both individual and concurrent query processing streams, faster loading, and refresh of the data during business operations. The SPARC T5-4 server delivers superior performance and cost efficiency when compared to the HP result.

Performance Landscape

The table lists the leading TPC-H @10000GB results for non-clustered systems.

TPC-H @10000GB, Non-Clustered Systems
| System | Processor | P/C/T – Memory | Composite (QphH) | $/perf ($/QphH) | Power (QppH) | Throughput (QthH) | Database | Available |
|---|---|---|---|---|---|---|---|---|
| SPARC T5-4 | 3.6 GHz SPARC T5 | 4/64/512 – 2048 GB | 377,594.3 | $4.65 | 342,714.1 | 416,024.4 | Oracle 11g R2 | 11/25/13 |
| HP ProLiant DL980 G7 | 2.4 GHz Intel Xeon E7-4870 | 8/80/160 – 4096 GB | 158,108.3 | $6.49 | 185,473.6 | 134,780.5 | SQL Server 2012 | 04/15/13 |

P/C/T = Processors, Cores, Threads
QphH = the Composite Metric (bigger is better)
$/QphH = the Price/Performance metric in USD (smaller is better)
QppH = the Power Numerical Quantity (bigger is better)
QthH = the Throughput Numerical Quantity (bigger is better)
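
The headline ratios quoted for this result can be rechecked from the table. The following is a minimal sketch of that arithmetic; the figures are copied from the table above, while the dictionary layout and variable names are illustrative assumptions.

```python
# Figures copied from the TPC-H @10000GB table; the dict layout and
# variable names are illustrative only.
t5 = {"qphh": 377_594.3, "price_perf": 4.65, "chips": 4, "cores": 64}
hp = {"qphh": 158_108.3, "price_perf": 6.49, "chips": 8, "cores": 80}

# Composite (QphH) ratio, overall and normalized per chip and per core.
speedup = t5["qphh"] / hp["qphh"]
per_chip = (t5["qphh"] / t5["chips"]) / (hp["qphh"] / hp["chips"])
per_core = (t5["qphh"] / t5["cores"]) / (hp["qphh"] / hp["cores"])

# $/QphH is "smaller is better", so the advantage is the fractional
# reduction relative to the HP figure.
price_advantage = 1 - t5["price_perf"] / hp["price_perf"]

print(f"composite: {speedup:.1f}x")
print(f"per chip:  {per_chip:.1f}x")
print(f"per core:  {per_core:.1f}x")
print(f"price/performance advantage: {price_advantage:.0%}")
```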

The following table lists data load times and average refresh function times.

TPC-H @10000GB, Non-Clustered Systems
Database Load & Database Refresh
| System | Processor | Data Loading (h:m:s) | T5 Advan | RF1 (sec) | T5 Advan | RF2 (sec) | T5 Advan |
|---|---|---|---|---|---|---|---|
| SPARC T5-4 | 3.6 GHz SPARC T5 | 09:37:54 | 8.3x | 58.8 | 7.1x | 62.1 | 6.4x |
| HP ProLiant DL980 G7 | 2.4 GHz Intel Xeon E7-4870 | 79:28:23 | 1.0x | 416.4 | 1.0x | 394.9 | 1.0x |

Data Loading = database load time
RF1 = throughput average first refresh transaction
RF2 = throughput average second refresh transaction
T5 Advan = the SPARC T5-4 server advantage, the ratio of the HP ProLiant DL980 G7 time to the SPARC T5-4 server time
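
As a rough check, the advantage ratios in this table follow from converting the load times to seconds and dividing. The sketch below assumes the h:m:s values and refresh times shown in the table; the helper name `to_seconds` is hypothetical.

```python
def to_seconds(hms: str) -> int:
    """Convert an h:m:s string such as '09:37:54' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# Values copied from the load and refresh table.
t5_load = to_seconds("09:37:54")
hp_load = to_seconds("79:28:23")

load_ratio = hp_load / t5_load   # data loading
rf1_ratio = 416.4 / 58.8         # first refresh function
rf2_ratio = 394.9 / 62.1         # second refresh function

print(f"load: {load_ratio:.1f}x  RF1: {rf1_ratio:.1f}x  RF2: {rf2_ratio:.1f}x")
```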

Complete benchmark results can be found at the TPC benchmark website, http://www.tpc.org.

Configuration Summary and Results

Server Under Test:

SPARC T5-4 server
4 x SPARC T5 processors (3.6 GHz, total of 64 cores, 512 threads)
2 TB memory
2 x internal SAS (2 x 300 GB) disk drives
12 x 16 Gb FC HBA

External Storage:

24 x Sun Server X4-2L servers configured as COMSTAR nodes, each with
2 x 2.5 GHz Intel Xeon E5-2609 v2 processors
4 x Sun Flash Accelerator F80 PCIe Cards, 800 GB each
6 x 4 TB 7.2K RPM 3.5" SAS disks
1 x 8 Gb dual port HBA

2 x 48 port Brocade 6510 Fibre Channel Switches

Software Configuration:

Oracle Solaris 11.1
Oracle Database 11g Release 2 Enterprise Edition

Audited Results:

Database Size: 10000 GB (Scale Factor 10000)
TPC-H Composite: 377,594.3 QphH@10000GB
Price/performance: $4.65/QphH@10000GB USD
Available: 11/25/2013
Total 3 year Cost: $1,755,709 USD
TPC-H Power: 342,714.1
TPC-H Throughput: 416,024.4
Database Load Time: 9:37:54

Benchmark Description

The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB, 10000GB, 30000GB and 100000GB) are not allowed by the TPC.

TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system.

The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single and multiple user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years maintenance to the QphH. A secondary metric is the storage efficiency, which is the ratio of total configured disk space in GB to the scale factor.
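
To make the relationship between the reported metrics concrete, here is a minimal sketch. It relies on the TPC-H specification's definition of the composite metric as the geometric mean of the power and throughput metrics, which this post does not spell out; the function names are hypothetical, and the inputs are the audited figures listed above.

```python
import math

def composite_qphh(power: float, throughput: float) -> float:
    """Composite QphH as the geometric mean of the power (QppH) and
    throughput (QthH) metrics, per the TPC-H specification."""
    return math.sqrt(power * throughput)

def price_performance(total_cost_usd: float, qphh: float) -> float:
    """Price/performance: total 3 year cost divided by the composite metric."""
    return total_cost_usd / qphh

# Audited figures for the SPARC T5-4 @10000GB result.
qphh = composite_qphh(342_714.1, 416_024.4)
cost_per_qphh = price_performance(1_755_709, 377_594.3)

print(f"composite: {qphh:,.1f} QphH")
print(f"price/performance: ${cost_per_qphh:.2f}/QphH")
```

Because the composite is a geometric mean, a system cannot reach a high QphH by excelling at only the single-stream power test or only the multi-stream throughput test.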

Key Points and Best Practices

  • COMSTAR (Common Multiprotocol SCSI Target) is the software framework that enables an Oracle Solaris host to serve as a SCSI Target platform. COMSTAR uses a modular approach to break the huge task of handling all the different pieces in a SCSI target subsystem into independent functional modules, which are glued together by the SCSI Target Mode Framework (STMF). The modules implementing functionality at the SCSI level (disk, tape, medium changer, etc.) are not required to know about the underlying transport, and the modules implementing the transport protocol (FC, iSCSI, etc.) are not aware of the SCSI-level functionality of the packets they are transporting. The framework hides the details of allocation, execution context, and cleanup of SCSI commands and associated resources, which simplifies the task of writing the SCSI or transport modules.

  • The SPARC T5-4 server achieved a peak IO rate of 37 GB/sec from the Oracle database configured with this storage.

  • Twelve COMSTAR nodes were mirrored to another twelve COMSTAR nodes on which all of the Oracle database files were placed. IO performance was high and balanced across all the nodes.

  • Oracle Solaris 11.1 required very little system tuning.

  • Some vendors try to make the point that storage ratios are of customer concern. However, storage ratio size has more to do with disk layout and the increasing capacities of disks – so this is not an important metric when comparing systems.

  • The SPARC T5-4 server and Oracle Solaris efficiently managed the system load of nearly two thousand Oracle Database parallel processes.

Disclosure Statement

TPC Benchmark, TPC-H, QphH, QthH, QppH are trademarks of the Transaction Processing Performance Council (TPC). Results as of 11/25/13, prices are in USD. SPARC T5-4 www.tpc.org/3293; HP ProLiant DL980 G7 www.tpc.org/3285.

Wednesday Sep 25, 2013

SPARC T5 Encryption Performance Tops Intel E5-2600 v2 Processor

The cryptography benchmark suite was developed by Oracle to measure security performance on important AES security modes. Oracle's SPARC T5 processor with its security software in silicon is faster than x86 servers that have the AES-NI instructions. In this test, the performance of on-processor encryption operations is measured (32 KB encryptions). Multiple threads are used to measure each processor's maximum throughput. The SPARC T5 processor shows dramatically faster encryption.

  • A SPARC T5 processor running Oracle Solaris 11.1 is 2.7 times faster executing AES-CFB 256-bit key encryption (in cache) than the Intel E5-2697 v2 processor (with AES-NI) running Oracle Linux 6.3. AES-CFB encryption is used by Oracle Database for Transparent Data Encryption (TDE) which provides security for database storage.

  • On the AES-CFB 128-bit key encryption, the SPARC T5 processor is 2.5 times faster than the Intel E5-2697 v2 processor (with AES-NI) running Oracle Linux 6.3 for in-cache encryption. AES-CFB mode is used by Oracle Database for Transparent Data Encryption (TDE) which provides security for database storage.

  • The IBM POWER7+ has three hardware security units for 8-core processors, but IBM has not publicly shown any measured performance results on AES-CFB or other encryption modes.

Performance Landscape

Presented below are results for running encryption using the AES cipher with the CFB, CBC, CCM and GCM modes for key sizes of 128, 192 and 256. Decryption performance was similar and is not presented. Results are presented as MB/sec (10**6).

Encryption Performance – AES-CFB

Performance is presented for in-cache AES-CFB128 mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run).

AES-CFB
Microbenchmark Performance (MB/sec)
Processor GHz Chips Performance Software Environment
AES-256-CFB
SPARC T5 3.60 2 54,396 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2697 v2 2.70 2 19,960 Oracle Linux 6.3, IPP/AES-NI
Intel E5-2690 2.90 2 12,823 Oracle Linux 6.3, IPP/AES-NI
AES-192-CFB
SPARC T5 3.60 2 61,000 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2697 v2 2.70 2 23,217 Oracle Linux 6.3, IPP/AES-NI
Intel E5-2690 2.90 2 14,928 Oracle Linux 6.3, IPP/AES-NI
AES-128-CFB
SPARC T5 3.60 2 68,695 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2697 v2 2.70 2 27,740 Oracle Linux 6.3, IPP/AES-NI
Intel E5-2690 2.90 2 17,824 Oracle Linux 6.3, IPP/AES-NI

Encryption Performance – AES-GCM

Performance is presented for in-cache AES-GCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run).

AES-GCM
Microbenchmark Performance (MB/sec)
Processor GHz Chips Performance Software Environment
AES-256-GCM
SPARC T5 3.60 2 34,101 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2697 v2 2.70 2 15,338 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2690 2.90 2 13,520 Oracle Linux 6.3, IPP/AES-NI
AES-192-GCM
SPARC T5 3.60 2 36,852 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2697 v2 2.70 2 15,768 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2690 2.90 2 14,159 Oracle Linux 6.3, IPP/AES-NI
AES-128-GCM
SPARC T5 3.60 2 39,003 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2697 v2 2.70 2 16,405 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2690 2.90 2 14,877 Oracle Linux 6.3, IPP/AES-NI

Encryption Performance – AES-CCM

Performance is presented for in-cache AES-CCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run).

AES-CCM
Microbenchmark Performance (MB/sec)
Processor GHz Chips Performance Software Environment
AES-256-CCM
SPARC T5 3.60 2 29,431 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2697 v2 2.70 2 19,447 Oracle Linux 6.3, IPP/AES-NI
Intel E5-2690 2.90 2 12,493 Oracle Linux 6.3, IPP/AES-NI
AES-192-CCM
SPARC T5 3.60 2 33,715 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2697 v2 2.70 2 22,634 Oracle Linux 6.3, IPP/AES-NI
Intel E5-2690 2.90 2 14,507 Oracle Linux 6.3, IPP/AES-NI
AES-128-CCM
SPARC T5 3.60 2 39,188 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2697 v2 2.70 2 26,951 Oracle Linux 6.3, IPP/AES-NI
Intel E5-2690 2.90 2 17,256 Oracle Linux 6.3, IPP/AES-NI

Encryption Performance – AES-CBC

Performance is presented for in-cache AES-CBC mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run).

AES-CBC
Microbenchmark Performance (MB/sec)
Processor GHz Chips Performance Software Environment
AES-256-CBC
SPARC T5 3.60 2 56,933 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2697 v2 2.70 2 19,962 Oracle Linux 6.3, IPP/AES-NI
Intel E5-2690 2.90 2 12,822 Oracle Linux 6.3, IPP/AES-NI
AES-192-CBC
SPARC T5 3.60 2 63,767 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2697 v2 2.70 2 23,224 Oracle Linux 6.3, IPP/AES-NI
Intel E5-2690 2.90 2 14,915 Oracle Linux 6.3, IPP/AES-NI
AES-128-CBC
SPARC T5 3.60 2 72,508 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2697 v2 2.70 2 27,733 Oracle Linux 6.3, IPP/AES-NI
Intel E5-2690 2.90 2 17,823 Oracle Linux 6.3, IPP/AES-NI

Configuration Summary

SPARC T5-2 server
2 x SPARC T5 processor, 3.6 GHz
512 GB memory
Oracle Solaris 11.1 SRU 4.2

Sun Server X4-2L server
2 x E5-2697 v2 processors, 2.70 GHz
256 GB memory
Oracle Linux 6.3

Sun Server X3-2 server
2 x E5-2690 processors, 2.90 GHz
128 GB memory
Oracle Linux 6.3

Benchmark Description

The benchmark measures cryptographic capabilities in terms of general low-level encryption, in-cache (32 KB encryptions) and on-chip using various ciphers, including AES-128-CFB, AES-192-CFB, AES-256-CFB, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CCM, AES-192-CCM, AES-256-CCM, AES-128-GCM, AES-192-GCM and AES-256-GCM.

The benchmark results were obtained using tests created by Oracle which use various application interfaces to perform the various ciphers. They were run using optimized libraries for each platform to obtain the best possible performance.
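
Oracle's test code is not shown here, so the following is only a sketch of how a multi-threaded, in-cache throughput measurement of this general shape might be structured. Everything in it is an assumption: the function names are hypothetical, 32 KB is taken to mean 32 * 1024 bytes, and SHA-256 from the Python standard library stands in for the AES modes purely so the harness is runnable without a crypto library.

```python
import hashlib
import os
import time
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 32 * 1024  # assumed meaning of "32 KB"

def stand_in_cipher(block: bytes) -> bytes:
    # Placeholder workload: SHA-256, NOT AES. A real test would call the
    # platform's optimized AES library here instead.
    return hashlib.sha256(block).digest()

def worker(block: bytes, duration_s: float) -> int:
    """Process the same block repeatedly until the deadline; return bytes processed."""
    deadline = time.perf_counter() + duration_s
    processed = 0
    while time.perf_counter() < deadline:
        stand_in_cipher(block)
        processed += len(block)
    return processed

def measure_mb_per_sec(threads: int, duration_s: float = 0.2) -> float:
    """Aggregate throughput across threads in MB/sec, with MB = 10**6 bytes."""
    block = os.urandom(BLOCK_SIZE)  # one buffer reused for every iteration
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        totals = list(pool.map(lambda _: worker(block, duration_s), range(threads)))
    elapsed = time.perf_counter() - start
    return sum(totals) / elapsed / 1e6

if __name__ == "__main__":
    for n in (1, 2, 4):
        print(f"{n} thread(s): {measure_mb_per_sec(n):,.0f} MB/sec")
```

Reusing one small buffer keeps the working set in cache, which is why a test of this shape measures the processor's crypto throughput rather than memory or IO bandwidth.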

Disclosure Statement

Copyright 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/23/2013.

Wednesday Jun 12, 2013

SPARC T5-4 Produces World Record Single Server TPC-H @3000GB Benchmark Result

Oracle's SPARC T5-4 server delivered world record single server performance of 409,721 QphH@3000GB with price/performance of $3.94/QphH@3000GB on the TPC-H @3000GB benchmark. This result shows that the 4-chip SPARC T5-4 server is significantly faster than the 8-chip server results from IBM (POWER7 based) and HP (Intel x86 based).

This result demonstrates a complete data warehouse solution that shows the performance of both individual and concurrent query processing streams, faster loading, and refresh of the data during business operations. The SPARC T5-4 server delivers superior performance and cost efficiency when compared to the IBM POWER7 result.

  • The SPARC T5-4 server with four SPARC T5 processors is 2.1 times faster than the IBM Power 780 server with eight POWER7 processors and 2.5 times faster than the HP ProLiant DL980 G7 server with eight x86 processors on the TPC-H @3000GB benchmark. The SPARC T5-4 server also delivered better performance per core than these eight processor systems from IBM and HP.

  • The SPARC T5-4 server with four SPARC T5 processors is 2.1 times faster than the IBM Power 780 server with eight POWER7 processors on the TPC-H @3000GB benchmark.

  • The SPARC T5-4 server has 38% better price/performance ($/QphH@3000GB) than the IBM Power 780 server on the TPC-H @3000GB benchmark.

  • The SPARC T5-4 server took 2 hours, 6 minutes, 4 seconds for data loading while the IBM Power 780 server took 2.8 times longer.

  • The SPARC T5-4 server executed the first refresh function (RF1) in 19.4 seconds, while the IBM Power 780 server took 7.6 times longer.

  • The SPARC T5-4 server with four SPARC T5 processors is 2.5 times faster than the HP ProLiant DL980 G7 server with the same number of cores on the TPC-H @3000GB benchmark.

  • The SPARC T5-4 server took 2 hours, 6 minutes, 4 seconds for data loading while the HP ProLiant DL980 G7 server took 4.1 times longer.

  • The SPARC T5-4 server executed the first refresh function (RF1) in 19.4 seconds, while the HP ProLiant DL980 G7 server took 8.9 times longer.

  • The SPARC T5-4 server delivered 6% better performance than the SPARC Enterprise M9000-64 server and 2.1 times better than the SPARC Enterprise M9000-32 server on the TPC-H @3000GB benchmark.

Performance Landscape

The table lists the leading TPC-H @3000GB results for non-clustered systems.

TPC-H @3000GB, Non-Clustered Systems
| System | Processor | P/C/T – Memory | Composite (QphH) | $/perf ($/QphH) | Power (QppH) | Throughput (QthH) | Database | Available |
|---|---|---|---|---|---|---|---|---|
| SPARC T5-4 | 3.6 GHz SPARC T5 | 4/64/512 – 2048 GB | 409,721.8 | $3.94 | 345,762.7 | 485,512.1 | Oracle 11g R2 | 09/24/13 |
| SPARC Enterprise M9000 | 3.0 GHz SPARC64 VII+ | 64/256/256 – 1024 GB | 386,478.3 | $18.19 | 316,835.8 | 471,428.6 | Oracle 11g R2 | 09/22/11 |
| SPARC T4-4 | 3.0 GHz SPARC T4 | 4/32/256 – 1024 GB | 205,792.0 | $4.10 | 190,325.1 | 222,515.9 | Oracle 11g R2 | 05/31/12 |
| SPARC Enterprise M9000 | 2.88 GHz SPARC64 VII | 32/128/256 – 512 GB | 198,907.5 | $15.27 | 182,350.7 | 216,967.7 | Oracle 11g R2 | 12/09/10 |
| IBM Power 780 | 4.1 GHz POWER7 | 8/32/128 – 1024 GB | 192,001.1 | $6.37 | 210,368.4 | 175,237.4 | Sybase 15.4 | 11/30/11 |
| HP ProLiant DL980 G7 | 2.27 GHz Intel Xeon X7560 | 8/64/128 – 512 GB | 162,601.7 | $2.68 | 185,297.7 | 142,685.6 | SQL Server 2008 | 10/13/10 |

P/C/T = Processors, Cores, Threads
QphH = the Composite Metric (bigger is better)
$/QphH = the Price/Performance metric in USD (smaller is better)
QppH = the Power Numerical Quantity
QthH = the Throughput Numerical Quantity
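
The per-core comparison against the IBM and HP systems can be derived from the P/C/T column. This is a small sketch using the composite and core counts from the table above; the labels and the `results` layout are illustrative assumptions.

```python
# (system label, composite QphH, cores) copied from the @3000GB table.
results = [
    ("SPARC T5-4", 409_721.8, 64),
    ("SPARC Enterprise M9000 (SPARC64 VII+)", 386_478.3, 256),
    ("SPARC T4-4", 205_792.0, 32),
    ("SPARC Enterprise M9000 (SPARC64 VII)", 198_907.5, 128),
    ("IBM Power 780", 192_001.1, 32),
    ("HP ProLiant DL980 G7", 162_601.7, 64),
]

# Composite throughput divided by core count for each system.
per_core = {name: qphh / cores for name, qphh, cores in results}

for name, value in sorted(per_core.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:40s} {value:8.1f} QphH per core")
```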

The following table lists data load times and refresh function times during the power run.

TPC-H @3000GB, Non-Clustered Systems
Database Load & Database Refresh
| System | Processor | Data Loading (h:m:s) | T5 Advan | RF1 (sec) | T5 Advan | RF2 (sec) | T5 Advan |
|---|---|---|---|---|---|---|---|
| SPARC T5-4 | 3.6 GHz SPARC T5 | 02:06:04 | 1.0x | 19.4 | 1.0x | 22.4 | 1.0x |
| IBM Power 780 | 4.1 GHz POWER7 | 05:51:50 | 2.8x | 147.3 | 7.6x | 133.2 | 5.9x |
| HP ProLiant DL980 G7 | 2.27 GHz Intel Xeon X7560 | 08:35:17 | 4.1x | 173.0 | 8.9x | 126.3 | 5.6x |

Data Loading = database load time
RF1 = power test first refresh transaction
RF2 = power test second refresh transaction
T5 Advan = the ratio of time to T5 time

Complete benchmark results can be found at the TPC benchmark website, http://www.tpc.org.

Configuration Summary and Results

Hardware Configuration:

SPARC T5-4 server
4 x SPARC T5 processors (3.6 GHz, total of 64 cores, 512 threads)
2 TB memory
2 x internal SAS (2 x 300 GB) disk drives

External Storage:

12 x Sun Storage 2540-M2 array with Sun Storage 2501-M2 expansion trays, each with
24 x 15K RPM 300 GB drives, 2 controllers, 2 GB cache
2 x Brocade 6510 Fibre Channel Switches (48 x 16 Gb/s ports each)

Software Configuration:

Oracle Solaris 11.1
Oracle Database 11g Release 2 Enterprise Edition

Audited Results:

Database Size: 3000 GB (Scale Factor 3000)
TPC-H Composite: 409,721.8 QphH@3000GB
Price/performance: $3.94/QphH@3000GB
Available: 09/24/2013
Total 3 year Cost: $1,610,564
TPC-H Power: 345,762.7
TPC-H Throughput: 485,512.1
Database Load Time: 2:06:04

Benchmark Description

The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB, 10000GB, 30000GB and 100000GB) are not allowed by the TPC.

TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system.

The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single and multiple user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years maintenance to the QphH. A secondary metric is the storage efficiency, which is the ratio of total configured disk space in GB to the scale factor.

Key Points and Best Practices

  • Twelve of Oracle's Sun Storage 2540-M2 arrays with Sun Storage 2501-M2 expansion trays were used for the benchmark. Each contains 24 15K RPM drives and is connected to a single dual port 16Gb FC HBA using 2 ports through a Brocade 6510 Fibre Channel switch.

  • The SPARC T5-4 server achieved a peak IO rate of 33 GB/sec from the Oracle database configured with this storage.

  • Oracle Solaris 11.1 required very little system tuning.

  • Some vendors try to make the point that storage ratios are of customer concern. However, storage ratio size has more to do with disk layout and the increasing capacities of disks – so this is not an important metric when comparing systems.

  • The SPARC T5-4 server and Oracle Solaris efficiently managed the system load of two thousand Oracle Database parallel processes.

  • Six Sun Storage 2540-M2/2501-M2 arrays were mirrored to another six Sun Storage 2540-M2/2501-M2 arrays on which all of the Oracle database files were placed. IO performance was high and balanced across all the arrays.

  • The TPC-H Refresh Function (RF) simulates the periodic refresh of a data warehouse by adding new sales data and deleting old sales data. Parallel DML (parallel insert and delete in this case) and database log performance are key for this function, and the SPARC T5-4 server outperformed both the IBM POWER7 server and the HP ProLiant DL980 G7 server. (See the RF columns above.)

Disclosure Statement

TPC-H, QphH, $/QphH are trademarks of Transaction Processing Performance Council (TPC). For more information, see www.tpc.org, results as of 6/7/13. Prices are in USD. SPARC T5-4 www.tpc.org/3288; SPARC T4-4 www.tpc.org/3278; SPARC Enterprise M9000 www.tpc.org/3262; SPARC Enterprise M9000 www.tpc.org/3258; IBM Power 780 www.tpc.org/3277; HP ProLiant DL980 www.tpc.org/3285. 

Friday Mar 29, 2013

SPARC T5 System Performance for Encryption Microbenchmark

The cryptography benchmark suite was internally developed by Oracle to measure the maximum throughput of in-memory, on-chip encryption operations that a system can perform. Multiple threads are used to achieve the maximum throughput. Systems powered by Oracle's SPARC T5 processor show outstanding performance on the tested encryption operations, beating Intel processor based systems.

  • A SPARC T5 processor running Oracle Solaris 11.1 is 2.4x to 4.4x faster on AES 256-bit key encryption than the Intel E5-2690 processor when running in-memory encryption of 32 KB blocks using the CFB128, CBC, CCM and GCM modes with all hardware threads in use.

  • AES CFB mode is used by Oracle Database 11g for Transparent Data Encryption (TDE), which provides security for database storage.

Performance Landscape

Presented below are results for running encryption using the AES cipher with the CFB, CBC, CCM and GCM modes for key sizes of 128, 192 and 256. Decryption performance was similar and is not presented. Results are presented as MB/sec (10**6).

Encryption Performance – AES-CFB

Performance is presented for in-memory AES-CFB128 mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run).

AES-CFB
Microbenchmark Performance (MB/sec)
Processor GHz Chips Performance Software Environment
AES-256-CFB
SPARC T5 3.60 2 54,396 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2690 2.90 2 12,823 IPP/AES-NI
AES-192-CFB
SPARC T5 3.60 2 61,000 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2690 2.90 2 14,928 IPP/AES-NI
AES-128-CFB
SPARC T5 3.60 2 68,695 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2690 2.90 2 17,824 IPP/AES-NI

Encryption Performance – AES-CBC

Performance is presented for in-memory AES-CBC mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run).

AES-CBC
Microbenchmark Performance (MB/sec)
Processor GHz Chips Performance Software Environment
AES-256-CBC
SPARC T5 3.60 2 56,933 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2690 2.90 2 12,822 IPP/AES-NI
AES-192-CBC
SPARC T5 3.60 2 63,767 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2690 2.90 2 14,915 IPP/AES-NI
AES-128-CBC
SPARC T5 3.60 2 72,508 Oracle Solaris 11.1, libsoftcrypto + libumem
SPARC T4 2.85 2 31,085 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel X5690 3.47 2 20,721 IPP/AES-NI
Intel E5-2690 2.90 2 17,823 IPP/AES-NI

Encryption Performance – AES-CCM

Performance is presented for in-memory AES-CCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run).

AES-CCM
Microbenchmark Performance (MB/sec)
Processor GHz Chips Performance Software Environment
AES-256-CCM
SPARC T5 3.60 2 29,431 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2690 2.90 2 12,493 IPP/AES-NI
AES-192-CCM
SPARC T5 3.60 2 33,715 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2690 2.90 2 14,507 IPP/AES-NI
AES-128-CCM
SPARC T5 3.60 2 39,188 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2690 2.90 2 17,256 IPP/AES-NI

Encryption Performance – AES-GCM

Performance is presented for in-memory AES-GCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run).

AES-GCM
Microbenchmark Performance (MB/sec)
Processor GHz Chips Performance Software Environment
AES-256-GCM
SPARC T5 3.60 2 34,101 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2690 2.90 2 13,520 IPP/AES-NI
AES-192-GCM
SPARC T5 3.60 2 36,852 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2690 2.90 2 14,159 IPP/AES-NI
AES-128-GCM
SPARC T5 3.60 2 39,003 Oracle Solaris 11.1, libsoftcrypto + libumem
Intel E5-2690 2.90 2 14,877 IPP/AES-NI
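
The 2.4x to 4.4x range quoted for 256-bit keys can be reproduced from the four AES-256 rows above. A minimal sketch, with the figures copied from the tables and the dictionary layout chosen only for illustration; both systems were measured with 2 chips, so the system ratio is also the per-processor ratio.

```python
# AES-256 throughput in MB/sec: (SPARC T5, Intel E5-2690), both with 2 chips.
aes256_mb_per_sec = {
    "CFB": (54_396, 12_823),
    "CBC": (56_933, 12_822),
    "CCM": (29_431, 12_493),
    "GCM": (34_101, 13_520),
}

# Speedup of the SPARC T5 over the Intel E5-2690 for each mode.
speedups = {mode: t5 / intel for mode, (t5, intel) in aes256_mb_per_sec.items()}
low, high = min(speedups.values()), max(speedups.values())

for mode, ratio in speedups.items():
    print(f"AES-256-{mode}: {ratio:.1f}x")
print(f"range: {low:.1f}x to {high:.1f}x")
```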

Configuration Summary

SPARC T5-2 server
2 x SPARC T5 processor, 3.6 GHz
512 GB memory
Oracle Solaris 11.1 SRU 4.2

Sun Server X3-2 server
2 x E5-2690 processors, 2.90 GHz
128 GB memory

Benchmark Description

The benchmark measures cryptographic capabilities in terms of general low-level encryption, in-memory and on-chip using various ciphers, including AES-128-CFB, AES-192-CFB, AES-256-CFB, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CCM, AES-192-CCM, AES-256-CCM, AES-128-GCM, AES-192-GCM and AES-256-GCM.

The benchmark results were obtained using tests created by Oracle which use various application interfaces to perform the various ciphers. They were run using optimized libraries for each platform to obtain the best possible performance.

Disclosure Statement

Copyright 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 3/26/2013.

Tuesday Mar 26, 2013

SPARC T5-2 Achieves ZFS File System Encryption Benchmark World Record

Oracle continues to lead in enterprise security. Oracle's SPARC T5 processors combined with the Oracle Solaris ZFS file system demonstrate faster file system encryption than equivalent x86 systems using the Intel Xeon Processor E5-2600 Sequence chips which have AES-NI security instructions.

Encryption is the process where data is encoded for privacy and a key is needed by the data owner to access the encoded data.

  • The SPARC T5-2 server is 3.4x faster than a 2 processor Intel Xeon E5-2690 server running Oracle Solaris 11.1 that uses the AES-NI GCM security instructions for creating encrypted files.

  • The SPARC T5-2 server is 2.2x faster than a 2 processor Intel Xeon E5-2690 server running Oracle Solaris 11.1 that uses the AES-NI CCM security instructions for creating encrypted files.

  • The SPARC T5-2 server consumes a significantly smaller percentage of system resources than a 2 processor Intel Xeon E5-2690 server.

Performance Landscape

Below are results running two different ciphers for ZFS encryption. Results are presented for runs without any cipher, labeled clear, and a variety of different key lengths. The results represent the maximum delivered values measured for 3 concurrent sequential write operations using 1M blocks. Performance is measured in MB/sec (bigger is better). System utilization is reported as %CPU as measured by iostat (smaller is better).

The results for the x86 server were obtained using Oracle Solaris 11.1 with performance bug fixes.

Encryption Using AES-GCM Ciphers

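GCM Encryption: 3 Concurrent Sequential Writes

| System | Clear MB/sec | Clear %CPU | AES-256-GCM MB/sec | AES-256-GCM %CPU | AES-192-GCM MB/sec | AES-192-GCM %CPU | AES-128-GCM MB/sec | AES-128-GCM %CPU |
|---|---|---|---|---|---|---|---|---|
| SPARC T5-2 server | 3,918 | 7 | 3,653 | 14 | 3,676 | 15 | 3,628 | 14 |
| SPARC T4-2 server | 2,912 | 11 | 2,662 | 31 | 2,663 | 30 | 2,779 | 31 |
| 2-Socket Intel Xeon E5-2690 | 3,969 | 42 | 1,062 | 58 | 1,067 | 58 | 1,076 | 57 |
| SPARC T5-2 vs x86 server | 1.0x | | 3.4x | | 3.4x | | 3.4x | |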

Encryption Using AES-CCM Ciphers

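CCM Encryption: 3 Concurrent Sequential Writes

| System | Clear MB/sec | Clear %CPU | AES-256-CCM MB/sec | AES-256-CCM %CPU | AES-192-CCM MB/sec | AES-192-CCM %CPU | AES-128-CCM MB/sec | AES-128-CCM %CPU |
|---|---|---|---|---|---|---|---|---|
| SPARC T5-2 server | 3,862 | 7 | 3,665 | 15 | 3,622 | 14 | 3,707 | 12 |
| SPARC T4-2 server | 2,945 | 11 | 2,471 | 26 | 2,801 | 26 | 2,442 | 25 |
| 2-Socket Intel Xeon E5-2690 | 3,868 | 42 | 1,566 | 64 | 1,632 | 63 | 1,689 | 66 |
| SPARC T5-2 vs x86 server | 1.0x | | 2.3x | | 2.2x | | 2.2x | |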
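
Another way to read these two tables is to ask how much of the unencrypted ("clear") write rate each system retains once encryption is turned on. The sketch below does this for the AES-256 columns only; the figures come from the tables above, and the data layout and names are illustrative assumptions.

```python
# MB/sec as (clear, AES-256 encrypted) for each cipher mode.
writes = {
    "SPARC T5-2": {"GCM": (3_918, 3_653), "CCM": (3_862, 3_665)},
    "SPARC T4-2": {"GCM": (2_912, 2_662), "CCM": (2_945, 2_471)},
    "2-Socket Intel Xeon E5-2690": {"GCM": (3_969, 1_062), "CCM": (3_868, 1_566)},
}

# Fraction of the clear write rate that remains with encryption enabled.
retained = {
    system: {mode: enc / clear for mode, (clear, enc) in modes.items()}
    for system, modes in writes.items()
}

for system, modes in retained.items():
    summary = ", ".join(f"AES-256-{mode}: {frac:.0%}" for mode, frac in modes.items())
    print(f"{system:30s} {summary}")
```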

Configuration Summary

Storage Configuration:

Sun Storage 6780 array
4 CSM2 trays, each with 16 83GB 15K RPM drives
8x 8 Gb/sec Fibre Channel ports per host
R0 Write cache enabled, controller mirroring off for peak write bandwidth
8 Drive R0 512K stripe pools mirrored via ZFS to storage

Sun Storage 6580 array
9 CSM2 trays, each with 16 136GB 15K RPM drives
8x 4 Gb/sec Fibre Channel ports per host
R0 Write cache enabled, controller mirroring off for peak write bandwidth
4 Drive R0 512K stripe pools mirrored via ZFS to storage

Server Configuration:

SPARC T5-2 server
2 x SPARC T5 3.6 GHz processors
512 GB memory
Oracle Solaris 11.1

SPARC T4-2 server
2 x SPARC T4 2.85 GHz processors
256 GB memory
Oracle Solaris 11.1

Sun Server X3-2L server
2 x Intel Xeon E5-2690, 2.90 GHz processors
128 GB memory
Oracle Solaris 11.1

Switch Configuration:

Brocade 5300 FC switch

Benchmark Description

This benchmark evaluates secure file system performance by measuring the rate at which encrypted data can be written. The Vdbench tool was used to generate the IO load. The test performed 3 concurrent sequential write operations using 1M blocks to 3 separate files.

Key Points and Best Practices

  • ZFS encryption is integrated with the ZFS command set. Like other ZFS operations, encryption operations such as key changes and re-key are performed online.

  • Data is encrypted using AES (Advanced Encryption Standard) with key lengths of 256, 192, and 128 in the CCM and GCM operation modes.

  • The flexibility of encrypting specific file systems is a key feature.

  • ZFS encryption is inheritable to descendent file systems. Key management can be delegated through ZFS delegated administration.

  • ZFS encryption uses the Oracle Solaris Cryptographic Framework which gives it access to SPARC T5 and Intel Xeon E5-2690 processor hardware acceleration or to optimized software implementations of the encryption algorithms automatically.

  • On modern computers with multiple threads per core, simple statistics like %utilization measured in tools like iostat and vmstat are not "hard" indications of the resources that might be available for other processing. For example, 90% idle may not mean that 10 times the work can be done. So drawing numerical conclusions must be done carefully.

See Also

Disclosure Statement

Copyright 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of March 26, 2013.

Friday Feb 08, 2013

Improved Oracle Solaris 10 1/13 Secure Copy Performance for High Latency Networks

With Oracle Solaris 10 1/13, the performance of secure copy or scp is significantly improved for high latency networks.

  • With a TCP receive window size of up to 1 MB enabled, Oracle Solaris 10 1/13 completes transfers up to 8 times faster than the previous Oracle Solaris 10 8/11 over the latency range of 50 to 200 msec.

  • The default TCP receive window size of 48 KB delivered similar performance in both Oracle Solaris 10 1/13 and Oracle Solaris 10 8/11.

  • In this study, settings above 1 MB for the TCP receive window size delivered similar performance to the 1 MB results.

  • The tuning of the TCP receive window has been available in Oracle Solaris for some time. This improved performance is available with Oracle Solaris 10 1/13 and Oracle Solaris 11.

Performance Landscape

[Figure: T4-4_SSH_SCP.png]

[Figure: X4170M2_SSH_SCP.png]

Configuration Summary

Test Systems:

SPARC T4-4 server
4 x SPARC T4 processor 3.0 GHz
1 TB memory
Oracle Solaris 10 1/13
Oracle Solaris 10 8/11

Sun Fire X4170 M2
2 x Intel Xeon X5675 3.06 GHz
48 GB memory
Oracle Solaris 10 1/13
Oracle Solaris 10 8/11

Driver System:

Sun Fire X4170 M2
2 x Intel Xeon X5675 3.06 GHz
48 GB memory
Oracle Solaris 10

Router / Programmable Delay System:

Sun Fire X4170 M2
2 x Intel Xeon X5675 3.06 GHz
48 GB memory
Oracle Solaris 10

Switch in between the router and the 2 test systems

Cisco Linksys SR2024C

Benchmark Description

This benchmark measures scp performance between two systems with variable router delays in the network between them. A file size of 48 MB was used while measuring the effects of varying the latency (network delays) and varying the TCP receive window size.
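
Why the receive window matters at high latency can be sketched with the usual window-limited bound: a single TCP connection can deliver at most one receive window per round trip. The following is a minimal illustration, not part of the study. It assumes the quoted latency is the round-trip time, and it ignores slow start, SSH protocol overhead, and CPU cost. The function and constant names are hypothetical.

```python
# Idealized window-limited TCP throughput: at most one receive window per round trip.
# Assumption: the latency values are treated as round-trip times.
LINK_BYTES_PER_SEC = 1_000_000_000 / 8   # 1 GbE private network used in the study
FILE_BYTES = 48 * 1024 * 1024            # 48 MB test file

def max_throughput(window_bytes, rtt_sec, link=LINK_BYTES_PER_SEC):
    """Upper bound on throughput (bytes/sec) for one TCP connection."""
    return min(window_bytes / rtt_sec, link)

def min_transfer_time(file_bytes, window_bytes, rtt_sec):
    """Lower bound on transfer time in seconds for the given window and RTT."""
    return file_bytes / max_throughput(window_bytes, rtt_sec)

if __name__ == "__main__":
    for rtt_ms in (50, 100, 200):
        rtt = rtt_ms / 1000
        t_default = min_transfer_time(FILE_BYTES, 48 * 1024, rtt)     # 48 KB default window
        t_large = min_transfer_time(FILE_BYTES, 1024 * 1024, rtt)     # 1 MB window
        print(f"{rtt_ms:3d} ms: 48 KB window >= {t_default:6.1f} s, "
              f"1 MB window >= {t_large:5.1f} s")
```

These are bounds, not predictions. The ratio of the two window sizes is about 21, while the study reports transfers up to 8 times faster, so other costs clearly contribute to the measured times.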

Key Points and Best Practices

  • The WAN emulator (a.k.a. hxbt) runs on the router to introduce the delays. After the emulator was configured, network function and characteristics were verified using Netperf latency and bandwidth tests between the driver and the test system.

  • Transfers performed over 1 GbE private, dedicated network.

  • Files were transferred to and from /tmp (i.e. in memory) on the test systems to minimize effect of filesystem performance and variability on the measurements.

  • TCP receive windows larger than the default can be enabled using the system-wide parameter tcp_recv_hiwat (e.g. to enable 1024 KB windows using this method, use the command: ndd -set /dev/tcp tcp_recv_hiwat 1048576). To make this change persistent, the command must be added to the system startup scripts.

  • After the TCP receive buffer size is increased, sshd on the target system must be restarted before any benefit can be observed (e.g. with the command /usr/sbin/svcadm restart svc:/network/ssh:default).

  • Note that tcp_recv_hiwat is a system-wide variable that adjusts the entire TCP stack. Care, therefore, must be taken to make sure that changes do not adversely affect your environment.

  • Geographically distant servers can be affected by connection latencies of the kind presented here.

See Also

Disclosure Statement

Copyright 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 2/08/2013.

Wednesday Nov 30, 2011

SPARC T4-4 Beats 8-CPU IBM POWER7 on TPC-H @3000GB Benchmark

Oracle's SPARC T4-4 server delivered a world record TPC-H @3000GB benchmark result for systems with four processors. This result beats eight processor results from IBM (POWER7) and HP (x86). The SPARC T4-4 server also delivered better performance per core than these eight processor systems from IBM and HP. Comparisons below are based upon system to system comparisons, highlighting Oracle's complete software and hardware solution.

This database world record result used Oracle's Sun Storage 2540-M2 arrays (rotating disk) connected to a SPARC T4-4 server running Oracle Solaris 11 and Oracle Database 11g Release 2 demonstrating the power of Oracle's integrated hardware and software solution.

  • The SPARC T4-4 server based configuration achieved a TPC-H scale factor 3000 world record for four processor systems of 205,792 QphH@3000GB with price/performance of $4.10/QphH@3000GB.

  • The SPARC T4-4 server with four SPARC T4 processors (total of 32 cores) is 7% faster than the IBM Power 780 server with eight POWER7 processors (total of 32 cores) on the TPC-H @3000GB benchmark.

  • The SPARC T4-4 server is 36% better in price performance compared to the IBM Power 780 server on the TPC-H @3000GB Benchmark.

  • The SPARC T4-4 server is 29% faster than the IBM Power 780 for data loading.

  • The SPARC T4-4 server is up to 3.4 times faster than the IBM Power 780 server for the Refresh Function.

  • The SPARC T4-4 server with four SPARC T4 processors is 27% faster than the HP ProLiant DL980 G7 server with eight x86 processors on the TPC-H @3000GB benchmark.

  • The SPARC T4-4 server is 52% faster than the HP ProLiant DL980 G7 server for data loading.

  • The SPARC T4-4 server is up to 3.2 times faster than the HP ProLiant DL980 G7 for the Refresh Function.

  • The SPARC T4-4 server achieved a peak IO rate from the Oracle database of 17 GB/sec. This rate was independent of the storage used, as demonstrated by the TPC-H @3000GB benchmark which used twelve Sun Storage 2540-M2 arrays (rotating disk) and the TPC-H @1000GB benchmark which used four Sun Storage F5100 Flash Array devices (flash storage). [*]

  • The SPARC T4-4 server showed linear scaling from TPC-H @1000GB to TPC-H @3000GB. This demonstrates that the SPARC T4-4 server can handle the increasingly larger databases required of DSS systems. [*]

  • The SPARC T4-4 server benchmark results demonstrate a complete solution of building Decision Support Systems including data loading, business questions and refreshing data. Each phase usually has a time constraint and the SPARC T4-4 server shows superior performance during each phase.

[*] The TPC believes that comparisons of results published with different scale factors are misleading and discourages such comparisons.

Performance Landscape

The table lists the leading TPC-H @3000GB results for non-clustered systems.

TPC-H @3000GB, Non-Clustered Systems
System | Processor | P/C/T – Memory | Composite (QphH) | $/perf ($/QphH) | Power (QppH) | Throughput (QthH) | Database | Available
SPARC Enterprise M9000 | 3.0 GHz SPARC64 VII+ | 64/256/256 – 1024 GB | 386,478.3 | $18.19 | 316,835.8 | 471,428.6 | Oracle 11g R2 | 09/22/11
SPARC T4-4 | 3.0 GHz SPARC T4 | 4/32/256 – 1024 GB | 205,792.0 | $4.10 | 190,325.1 | 222,515.9 | Oracle 11g R2 | 05/31/12
SPARC Enterprise M9000 | 2.88 GHz SPARC64 VII | 32/128/256 – 512 GB | 198,907.5 | $15.27 | 182,350.7 | 216,967.7 | Oracle 11g R2 | 12/09/10
IBM Power 780 | 4.1 GHz POWER7 | 8/32/128 – 1024 GB | 192,001.1 | $6.37 | 210,368.4 | 175,237.4 | Sybase 15.4 | 11/30/11
HP ProLiant DL980 G7 | 2.27 GHz Intel Xeon X7560 | 8/64/128 – 512 GB | 162,601.7 | $2.68 | 185,297.7 | 142,685.6 | SQL Server 2008 | 10/13/10

P/C/T = Processors, Cores, Threads
QphH = the Composite Metric (bigger is better)
$/QphH = the Price/Performance metric in USD (smaller is better)
QppH = the Power Numerical Quantity
QthH = the Throughput Numerical Quantity
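
The percentage comparisons quoted for this result follow directly from the composite and price/performance columns. A minimal sketch of that arithmetic, using values copied from the table above (the helper names are illustrative only):

```python
# Composite QphH@3000GB and $/QphH values copied from the table above.
T4_QPHH, T4_PRICE = 205_792.0, 4.10
IBM_QPHH, IBM_PRICE = 192_001.1, 6.37
HP_QPHH = 162_601.7

def percent_faster(qphh_a, qphh_b):
    """How much higher A's composite score is than B's, in percent."""
    return (qphh_a / qphh_b - 1) * 100

def percent_better_price(price_a, price_b):
    """How much lower A's $/QphH is than B's, in percent."""
    return (1 - price_a / price_b) * 100

if __name__ == "__main__":
    print(f"vs IBM Power 780: {percent_faster(T4_QPHH, IBM_QPHH):.0f}% faster")
    print(f"vs HP DL980 G7:   {percent_faster(T4_QPHH, HP_QPHH):.0f}% faster")
    print(f"vs IBM Power 780: {percent_better_price(T4_PRICE, IBM_PRICE):.0f}% better $/QphH")
```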

The following table lists data load times and refresh function times during the power run.

TPC-H @3000GB, Non-Clustered Systems
Database Load & Database Refresh
System | Processor | Data Loading (h:m:s) | T4 Advan | RF1 (sec) | T4 Advan | RF2 (sec) | T4 Advan
SPARC T4-4 | 3.0 GHz SPARC T4 | 04:08:29 | 1.0x | 67.1 | 1.0x | 39.5 | 1.0x
IBM Power 780 | 4.1 GHz POWER7 | 05:51:50 | 1.5x | 147.3 | 2.2x | 133.2 | 3.4x
HP ProLiant DL980 G7 | 2.27 GHz Intel Xeon X7560 | 08:35:17 | 2.1x | 173.0 | 2.6x | 126.3 | 3.2x

Data Loading = database load time
RF1 = power test first refresh transaction
RF2 = power test second refresh transaction
T4 Advan = the ratio of time to T4 time

Complete benchmark results can be found at the TPC benchmark website http://www.tpc.org.

Configuration Summary and Results

Hardware Configuration:

SPARC T4-4 server
4 x SPARC T4 3.0 GHz processors (total of 32 cores, 256 threads)
1024 GB memory
8 x internal SAS (8 x 300 GB) disk drives

External Storage:

12 x Sun Storage 2540-M2 array storage, each with
12 x 15K RPM 300 GB drives, 2 controllers, 2 GB cache

Software Configuration:

Oracle Solaris 11 11/11
Oracle Database 11g Release 2 Enterprise Edition

Audited Results:

Database Size: 3000 GB (Scale Factor 3000)
TPC-H Composite: 205,792.0 QphH@3000GB
Price/performance: $4.10/QphH@3000GB
Available: 05/31/2012
Total 3 year Cost: $843,656
TPC-H Power: 190,325.1
TPC-H Throughput: 222,515.9
Database Load Time: 4:08:29

Benchmark Description

The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB, 10000GB, 30000GB and 100000GB) are not allowed by the TPC.

TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system.

The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single and multiple user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years maintenance to the QphH. A secondary metric is the storage efficiency, which is the ratio of total configured disk space in GB to the scale factor.
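
As a worked example of these metrics: the TPC-H specification defines the composite as the geometric mean of the power (single-stream) and throughput (multi-stream) metrics, and price/performance as total cost divided by the composite. The sketch below checks this against the audited SPARC T4-4 figures listed in this post. The function names are illustrative only.

```python
import math

def composite_qphh(power, throughput):
    """TPC-H composite metric: geometric mean of the power and throughput metrics."""
    return math.sqrt(power * throughput)

def price_performance(total_cost_usd, qphh):
    """Price/performance in $/QphH: total 3-year cost divided by the composite."""
    return total_cost_usd / qphh

if __name__ == "__main__":
    # Audited values for the SPARC T4-4 @3000GB result.
    qphh = composite_qphh(190_325.1, 222_515.9)
    print(f"Composite: {qphh:,.1f} QphH@3000GB")
    print(f"Price/performance: ${price_performance(843_656, qphh):.2f}/QphH@3000GB")
```

Because the composite is a geometric mean, a system cannot reach a high score by excelling in only one of the two modes.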

Key Points and Best Practices

  • Twelve Sun Storage 2540-M2 arrays were used for the benchmark. Each Sun Storage 2540-M2 array contains 12 15K RPM drives and is connected to a single dual port 8Gb FC HBA using 2 ports. Each Sun Storage 2540-M2 array showed 1.5 GB/sec for sequential read operations and showed linear scaling, achieving 18 GB/sec with twelve Sun Storage 2540-M2 arrays. These were stand alone IO tests.

  • The peak IO rate measured from the Oracle database was 17 GB/sec.

  • Oracle Solaris 11 11/11 required very little system tuning.

  • Some vendors try to make the point that storage ratios are of customer concern. However, storage ratio size has more to do with disk layout and the increasing capacities of disks – so this is not an important metric in which to compare systems.

  • The SPARC T4-4 server and Oracle Solaris efficiently managed the system load of over one thousand Oracle Database parallel processes.

  • Six Sun Storage 2540-M2 arrays were mirrored to another six Sun Storage 2540-M2 arrays on which all of the Oracle database files were placed. IO performance was high and balanced across all the arrays.

  • The TPC-H Refresh Function (RF) simulates the periodic refresh portion of a data warehouse by adding new sales data and deleting old sales data. Parallel DML (parallel insert and delete in this case) and database log performance are key for this function, and the SPARC T4-4 server outperformed both the IBM POWER7 server and the HP ProLiant DL980 G7 server. (See the RF columns above.)

See Also

Disclosure Statement

TPC-H, QphH, $/QphH are trademarks of Transaction Processing Performance Council (TPC). For more information, see www.tpc.org. SPARC T4-4 205,792.0 QphH@3000GB, $4.10/QphH@3000GB, available 5/31/12, 4 processors, 32 cores, 256 threads; IBM Power 780 192,001.1 QphH@3000GB, $6.37/QphH@3000GB, available 11/30/11, 8 processors, 32 cores, 128 threads; HP ProLiant DL980 G7 162,601.7 QphH@3000GB, $2.68/QphH@3000GB, available 10/13/10, 8 processors, 64 cores, 128 threads.

Monday Oct 03, 2011

SPARC T4-4 Beats IBM POWER7 and HP Itanium on TPC-H @1000GB Benchmark

Oracle's SPARC T4-4 server configured with SPARC-T4 processors, Oracle's Sun Storage F5100 Flash Array storage, Oracle Solaris, and Oracle Database 11g Release 2 achieved a TPC-H benchmark performance result of 201,487 QphH@1000GB with price/performance of $4.60/QphH@1000GB.

  • The SPARC T4-4 server benchmark results demonstrate a complete solution of building Decision Support Systems including data loading, business questions and refreshing data. Each phase usually has a time constraint and the SPARC T4-4 server shows superior performance during each phase.

  • The SPARC T4-4 server is 22% faster than the 8-socket IBM POWER7 server with the same number of cores. The SPARC T4-4 server has over twice the performance per socket compared to the IBM POWER7 server.

  • The SPARC T4-4 server achieves 33% better price/performance than the IBM POWER7 server.

  • The SPARC T4-4 server is up to 4 times faster than the IBM POWER7 server for the Refresh Function.

  • The SPARC T4-4 server is 44% faster than the HP Superdome 2 server. The SPARC T4-4 server has 5.7x the performance per socket of the HP Superdome 2 server.

  • The SPARC T4-4 server is 62% better on price/performance than the HP Itanium server.

  • The SPARC T4-4 server is up to 3.7 times faster than the HP Itanium server for the Refresh Function.

  • The SPARC T4-4 server delivers nearly the same performance as Oracle's SPARC Enterprise M8000 server, but with 52% better price/performance on the TPC-H @1000GB benchmark.

  • Oracle used Storage Redundancy Level 3 as defined by the TPC-H 2.14.2 specification which is the strictest level.

  • This TPC-H result demonstrates that the SPARC T4-4 server can deliver the performance while running the increasingly larger databases required of DSS systems. The server measured more than 16 GB/sec of IO throughput through Oracle Database 11g Release 2 software while maintaining a high CPU load.

Performance Landscape

The table below lists published non-cluster results from comparable enterprise class systems from Oracle, IBM and HP. Each system was configured with 512 GB of memory.

TPC-H @1000GB

System | CPU type | Proc/Core/Thread | Composite (QphH) | $/perf ($/QphH) | Power (QppH) | Throughput (QthH) | Database | Available
SPARC Enterprise M8000 | 3 GHz SPARC64 VII+ | 16 / 64 / 128 | 209,533.6 | $9.53 | 177,845.9 | 246,867.2 | Oracle 11g | 09/22/11
SPARC T4-4 | 3 GHz SPARC-T4 | 4 / 32 / 256 | 201,487.0 | $4.60 | 181,760.6 | 223,354.2 | Oracle 11g | 10/30/11
IBM Power 780 | 4.14 GHz POWER7 | 8 / 32 / 128 | 164,747.2 | $6.85 | 170,206.4 | 159,463.1 | Sybase | 03/31/11
HP Superdome 2 | 1.73 GHz Intel Itanium 9350 | 16 / 64 / 64 | 140,181.1 | $12.15 | 139,181.0 | 141,188.3 | Oracle 11g | 10/20/10

QphH = the Composite Metric (bigger is better)
$/QphH = the Price/Performance metric (smaller is better)
QppH = the Power Numerical Quantity
QthH = the Throughput Numerical Quantity
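
The per-socket comparisons quoted for this result can be reproduced by dividing each composite score by the processor count from the Proc/Core/Thread column. A minimal sketch with values copied from the table above (helper names are illustrative only):

```python
# (composite QphH@1000GB, processor sockets) copied from the table above.
systems = {
    "SPARC T4-4": (201_487.0, 4),
    "IBM Power 780": (164_747.2, 8),
    "HP Superdome 2": (140_181.1, 16),
}

def per_socket(name):
    """Composite QphH divided by the number of processor sockets."""
    qphh, sockets = systems[name]
    return qphh / sockets

def socket_advantage(a, b):
    """Ratio of per-socket performance of system a to system b."""
    return per_socket(a) / per_socket(b)

if __name__ == "__main__":
    for other in ("IBM Power 780", "HP Superdome 2"):
        print(f"SPARC T4-4 vs {other}: "
              f"{socket_advantage('SPARC T4-4', other):.1f}x per socket")
```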

Complete benchmark results can be found at the TPC benchmark website http://www.tpc.org.

Configuration Summary and Results

Hardware Configuration:

SPARC T4-4 server
4 x SPARC-T4 3.0 GHz processors (total of 32 cores, 256 threads)
512 GB memory
8 x internal SAS (8 x 300 GB) disk drives

External Storage:

4 x Sun Storage F5100 Flash Array storage, each with
80 x 24 GB Flash Modules

Software Configuration:

Oracle Solaris 10 8/11
Oracle Database 11g Release 2 Enterprise Edition

Audited Results:

Database Size: 1000 GB (Scale Factor 1000)
TPC-H Composite: 201,487 QphH@1000GB
Price/performance: $4.60/QphH@1000GB
Available: 10/30/2011
Total 3 Year Cost: $925,525
TPC-H Power: 181,760.6
TPC-H Throughput: 223,354.2
Database Load Time: 1:22:39

Benchmark Description

The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB and 10000GB) are not allowed by the TPC.

TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system.

The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single and multi user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years maintenance to the QphH.

Key Points and Best Practices

  • Four Sun Storage F5100 Flash Array devices were used for the benchmark. Each F5100 device contains 80 flash modules (FMODs). Twenty (20) FMODs from each F5100 device were connected to a single SAS 6 Gb HBA. A single F5100 device showed 4.16 GB/sec for sequential read and demonstrated linear scaling of 16.62 GB/sec with 4 x F5100 devices.

  • The IO rate from the Oracle database was over 16 GB/sec.

  • Oracle Solaris 10 8/11 required very little system tuning.

  • The SPARC T4-4 server and Oracle Solaris efficiently managed the system load of over one thousand Oracle parallel processes.

  • The Oracle database files for tables and indexes were managed by Oracle Automatic Storage Manager (ASM) with 4M stripe. Two F5100 devices were mirrored to another 2 F5100 devices under ASM. IO performance was high and balanced across all the FMODs.

  • The Oracle redo log files were mirrored across the F5100 devices using Oracle Solaris Volume Manager with 128K stripe.

  • Parallel degree on tables and indexes was set to 128. This setting worked the best for performance.

  • The TPC-H Refresh Function simulates the periodic refresh portion of a data warehouse by adding new sales data and deleting old sales data. Parallel DML (parallel insert and delete in this case) and database log performance are key for this function, and the SPARC T4-4 server outperformed both the HP Superdome 2 and IBM POWER7 servers.
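
The "linear scaling" claim in the first key point can be quantified as the measured combined rate divided by the ideal rate, which is the single-device rate times the device count. A small sketch using the F5100 figures quoted above (the function name is illustrative only):

```python
def scaling_efficiency(single_rate, combined_rate, units):
    """Measured combined rate as a fraction of perfect linear scaling."""
    return combined_rate / (single_rate * units)

if __name__ == "__main__":
    # One F5100: 4.16 GB/sec sequential read; four F5100 devices: 16.62 GB/sec.
    eff = scaling_efficiency(4.16, 16.62, 4)
    print(f"Scaling efficiency across 4 devices: {eff:.1%}")
```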

See Also

Disclosure Statement

TPC-H, QphH, $/QphH are trademarks of Transaction Processing Performance Council (TPC). For more information, see www.tpc.org. SPARC T4-4 201,487 QphH@1000GB, $4.60/QphH@1000GB, avail 10/30/2011, 4 processors, 32 cores, 256 threads; SPARC Enterprise M8000 209,533.6 QphH@1000GB, $9.53/QphH@1000GB, avail 09/22/11, 16 processors, 64 cores, 128 threads; IBM Power 780 164,747.2 QphH@1000GB, $6.85/QphH@1000GB, avail 03/31/11, 8 processors, 32 cores, 128 threads; HP Integrity Superdome 2 140,181.1 QphH@1000GB, $12.15/QphH@1000GB, avail 10/20/10, 16 processors, 64 cores, 64 threads.

Friday Sep 30, 2011

SPARC T4-2 Server Beats Intel (Westmere AES-NI) on ZFS Encryption Tests

Oracle continues to lead in enterprise security. Oracle's SPARC T4 processors combined with Oracle's Solaris ZFS file system demonstrate faster file system encryption than equivalent systems based on the Intel Xeon Processor 5600 Sequence chips which use AES-NI security instructions.

Encryption is the process where data is encoded for privacy and a key is needed by the data owner to access the encoded data. The benefits of using ZFS encryption are:

  • The SPARC T4 processor is 3.5x to 5.2x faster than the Intel Xeon Processor X5670 that has the AES-NI security instructions in creating encrypted files.

  • ZFS encryption is integrated with the ZFS command set. Like other ZFS operations, encryption operations such as key changes and re-key are performed online.

  • Data is encrypted using AES (Advanced Encryption Standard) with key lengths of 256, 192, and 128 in the CCM and GCM operation modes.

  • The flexibility of encrypting specific file systems is a key feature.

  • ZFS encryption is inheritable to descendent file systems. Key management can be delegated through ZFS delegated administration.

  • ZFS encryption uses the Oracle Solaris Cryptographic Framework which gives it access to SPARC T4 processor and Intel Xeon X5670 processor (Intel AES-NI) hardware acceleration or to optimized software implementations of the encryption algorithms automatically.

Performance Landscape

Below are results running two different ciphers for ZFS encryption. Results are presented for runs without any cipher, labeled clear, and a variety of different key lengths.

Encryption Using AES-CCM Ciphers

MB/sec – 5 File Create* Encryption
Clear AES-256-CCM AES-192-CCM AES-128-CCM
SPARC T4-2 server 3,803 3,167 3,335 3,225
SPARC T3-2 server 2,286 1,554 1,561 1,594
2-Socket 2.93 GHz Xeon X5670 3,325 750 764 773

Speedup T4-2 vs X5670 1.1x 4.2x 4.4x 4.2x
Speedup T4-2 vs T3-2 1.7x 2.0x 2.1x 2.0x

Encryption Using AES-GCM Ciphers

MB/sec – 5 File Create* Encryption
Clear AES-256-GCM AES-192-GCM AES-128-GCM
SPARC T4-2 server 3,618 3,929 3,164 2,613
SPARC T3-2 server 2,278 1,451 1,455 1,449
2-Socket 2.93 GHz Xeon X5670 3,299 749 748 753

Speedup T4-2 vs X5670 1.1x 5.2x 4.2x 3.5x
Speedup T4-2 vs T3-2 1.6x 2.7x 2.2x 1.8x

(*) Maximum Delivered values measured over 5 concurrent mkfile operations.
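
The speedup rows in both tables are simple ratios of the MB/sec figures, rounded to one decimal place. A minimal sketch that recomputes them from the values above (the dictionary layout and function name are illustrative only):

```python
# MB/sec from the two tables above; columns are Clear, AES-256, AES-192, AES-128.
ccm = {
    "T4-2": [3803, 3167, 3335, 3225],
    "T3-2": [2286, 1554, 1561, 1594],
    "X5670": [3325, 750, 764, 773],
}
gcm = {
    "T4-2": [3618, 3929, 3164, 2613],
    "T3-2": [2278, 1451, 1455, 1449],
    "X5670": [3299, 749, 748, 753],
}

def speedups(table, fast, slow):
    """Column-by-column throughput ratio of one system over another."""
    return [round(f / s, 1) for f, s in zip(table[fast], table[slow])]

if __name__ == "__main__":
    for label, table in (("CCM", ccm), ("GCM", gcm)):
        print(label, "T4-2 vs X5670:", speedups(table, "T4-2", "X5670"))
        print(label, "T4-2 vs T3-2: ", speedups(table, "T4-2", "T3-2"))
```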

Configuration Summary

Storage Configuration:

Sun Storage 6780 array
16 x 15K RPM drives
Raid 0 pool
Write back cache enable
Controller cache mirroring disabled for maximum bandwidth for test
Eight 8 Gb/sec ports per host

Server Configuration:

SPARC T4-2 server
2 x SPARC T4 2.85 GHz processors
256 GB memory
Oracle Solaris 11

SPARC T3-2 server
2 x SPARC T3 1.6 GHz processors
Oracle Solaris 11 Express 2010.11

Sun Fire X4270 M2 server
2 x Intel Xeon X5670, 2.93 GHz processors
Oracle Solaris 11

Benchmark Description

The benchmark ran the UNIX command mkfile (1M). Mkfile is a simple single-threaded program that creates a file of a specified size. The script ran 5 mkfile operations in the background and recorded the peak bandwidth observed during the test.
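
The shape of this test can be approximated with a short script. The following is only an analogue of the method, not the script used for the published results. It writes zero-filled files from five threads and reports the average aggregate rate over the run, whereas the benchmark recorded the peak. All names are hypothetical. To exercise encryption, the target directory would need to sit on an encrypted ZFS file system.

```python
import os
import tempfile
import threading
import time

def write_file(path, size_bytes, block_bytes=1024 * 1024):
    """Create a file of size_bytes filled with zeros, written block by block."""
    block = bytes(block_bytes)
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            chunk = block if remaining >= block_bytes else block[:remaining]
            f.write(chunk)
            remaining -= len(chunk)
        f.flush()
        os.fsync(f.fileno())  # make sure the data reaches the file system

def concurrent_create(directory, files=5, size_bytes=64 * 1024 * 1024):
    """Write `files` files concurrently; return the average aggregate MB/sec."""
    paths = [os.path.join(directory, f"file{i}") for i in range(files)]
    threads = [threading.Thread(target=write_file, args=(p, size_bytes)) for p in paths]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return files * size_bytes / (1024 * 1024) / elapsed

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(f"{concurrent_create(d):.0f} MB/sec aggregate")
```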

See Also

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of December 16, 2011.

SPARC T4 Processor Beats Intel (Westmere AES-NI) on AES Encryption Tests

The cryptography benchmark suite was internally developed by Oracle to measure the maximum throughput of in-memory, on-chip encryption operations that a system can perform. Multiple threads are used to achieve the maximum throughput.

  • Oracle's SPARC T4 processor running Oracle Solaris 11 is 1.5x faster on AES 256-bit key CFB mode encryption than the Intel Xeon X5690 processor running Oracle Linux 6.1 for in-memory encryption of 32 KB blocks.

  • The SPARC T4 processor running Oracle Solaris 11 is 1.7x faster on AES 256-bit key CBC mode encryption than the Intel Xeon X5690 processor running Oracle Linux 6.1 for in-memory encryption of 32 KB blocks.

  • The SPARC T4 processor running Oracle Solaris 11 is 3.6x faster on AES 256-bit key CCM mode encryption than the Intel Xeon X5690 processor running Oracle Linux 6.1 for in-memory encryption with authentication of 32 KB blocks.

  • The SPARC T4 processor running Oracle Solaris 11 is 1.4x faster on AES 256-bit key GCM mode encryption than the Intel Xeon X5690 processor running Oracle Linux 6.1 for in-memory encryption with authentication of 32 KB blocks.

  • The SPARC T4 processor running Oracle Solaris 11 is 9% faster on single-threaded AES 256-bit key CFB mode encryption than the Intel Xeon X5690 processor running Oracle Linux 6.1 for in-memory encryption of 32 KB blocks.

  • The SPARC T4 processor running Oracle Solaris 11 is 1.8x faster on AES 256-bit key CFB mode encryption than the SPARC T3 running Solaris 11 Express.

  • AES CFB mode is used by the Oracle Database 11g for Transparent Data Encryption (TDE) which provides security to database storage.
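
The multi-threaded ratios quoted in the bullets above compare the SPARC T4 result with the Intel Xeon X5690 result under Oracle Linux 6.1 with IPP/AES-NI. A minimal sketch of that arithmetic, using the AES-256 peak throughput values from the Performance Landscape tables of this post (the dictionary names are illustrative only):

```python
# Peak multi-threaded AES-256 throughput in MB/sec, from this post's tables.
t4_solaris = {"CFB": 10_963, "CBC": 11_588, "CCM": 5_850, "GCM": 6_871}
x5690_linux = {"CFB": 7_526, "CBC": 6_704, "CCM": 1_613, "GCM": 4_794}

def t4_ratio(mode):
    """SPARC T4 throughput divided by X5690 (Oracle Linux, IPP/AES-NI) throughput."""
    return t4_solaris[mode] / x5690_linux[mode]

if __name__ == "__main__":
    for mode in ("CFB", "CBC", "CCM", "GCM"):
        print(f"AES-256-{mode}: {t4_ratio(mode):.1f}x")
```

Note that for CBC and CCM the tables show the X5690 scoring higher under Oracle Solaris 11 with libsoftcrypto than under Oracle Linux, so the ratio depends on which X5690 row is used as the baseline.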

Performance Landscape

Encryption Performance – AES-CFB

Performance is presented for in-memory AES-CFB128 mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run).

AES-256-CFB
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 10,963 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 7,526 Oracle Linux 6.1, IPP/AES-NI
SPARC T3 1.65 32 6,023 Oracle Solaris 11 Express, libpkcs11
Intel X5690 3.47 12 2,894 Oracle Solaris 11, libsoftcrypto
SPARC T4 2.85 1 712 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 653 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 1 425 Oracle Solaris 11, libsoftcrypto
SPARC T3 1.65 1 331 Oracle Solaris 11 Express, libpkcs11

AES-192-CFB
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 12,451 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 8,677 Oracle Linux 6.1, IPP/AES-NI
SPARC T3 1.65 32 6,175 Oracle Solaris 11 Express, libpkcs11
Intel X5690 3.47 12 2,976 Oracle Solaris 11, libsoftcrypto
SPARC T4 2.85 1 816 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 752 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 1 461 Oracle Solaris 11, libsoftcrypto
SPARC T3 1.65 1 371 Oracle Solaris 11 Express, libpkcs11

AES-128-CFB
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 14,388 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 10,214 Oracle Linux 6.1, IPP/AES-NI
SPARC T3 1.65 32 6,390 Oracle Solaris 11 Express, libpkcs11
Intel X5690 3.47 12 3,115 Oracle Solaris 11, libsoftcrypto
SPARC T4 2.85 1 953 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 886 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 1 509 Oracle Solaris 11, libsoftcrypto
SPARC T3 1.65 1 395 Oracle Solaris 11 Express, libpkcs11

Encryption Performance – AES-CBC

Performance is presented for in-memory AES-CBC mode encryption. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption was performed on 32 KB of pseudo-random data (same data for each run).

AES-256-CBC
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 11,588 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 7,171 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 6,704 Oracle Linux 6.1, IPP/AES-NI
SPARC T3 1.65 32 5,980 Oracle Solaris 11 Express, libpkcs11
SPARC T4 2.85 1 748 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 592 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 1 569 Oracle Solaris 11, libsoftcrypto
SPARC T3 1.65 1 336 Oracle Solaris 11 Express, libpkcs11

AES-192-CBC
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 13,216 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 8,211 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 7,588 Oracle Linux 6.1, IPP/AES-NI
SPARC T3 1.65 32 6,333 Oracle Solaris 11 Express, libpkcs11
SPARC T4 2.85 1 862 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 672 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 1 643 Oracle Solaris 11, libsoftcrypto
SPARC T3 1.65 1 358 Oracle Solaris 11 Express, libpkcs11

AES-128-CBC
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 15,323 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 9,785 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 8,746 Oracle Linux 6.1, IPP/AES-NI
SPARC T3 1.65 32 6,347 Oracle Solaris 11 Express, libpkcs11
SPARC T4 2.85 1 1,017 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 781 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 1 739 Oracle Solaris 11, libsoftcrypto
SPARC T3 1.65 1 434 Oracle Solaris 11 Express, libpkcs11

Encryption Performance – AES-CCM

Performance is presented for in-memory AES-CCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run).

AES-256-CCM
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 5,850 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 1,860 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 1,613 Oracle Linux 6.1, IPP/AES-NI
SPARC T4 2.85 1 480 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 258 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 190 Oracle Linux 6.1, IPP/AES-NI

AES-192-CCM
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 6,709 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 1,930 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 1,715 Oracle Linux 6.1, IPP/AES-NI
SPARC T4 2.85 1 565 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 293 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 206 Oracle Linux 6.1, IPP/AES-NI

AES-128-CCM
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 7,856 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 2,031 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 1,838 Oracle Linux 6.1, IPP/AES-NI
SPARC T4 2.85 1 664 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 321 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 225 Oracle Linux 6.1, IPP/AES-NI

Encryption Performance – AES-GCM

Performance is presented for in-memory AES-GCM mode encryption with authentication. Multiple key sizes of 256-bit, 192-bit and 128-bit are presented. The encryption/authentication was performed on 32 KB of pseudo-random data (same data for each run).

AES-256-GCM
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 6,871 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 4,794 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 12 1,685 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 691 Oracle Linux 6.1, IPP/AES-NI
SPARC T4 2.85 1 571 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 253 Oracle Solaris 11, libsoftcrypto

AES-192-GCM
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 7,450 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 5,054 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 12 1,724 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 727 Oracle Linux 6.1, IPP/AES-NI
SPARC T4 2.85 1 618 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 268 Oracle Solaris 11, libsoftcrypto

AES-128-GCM
Microbenchmark Performance (MB/sec)
Processor GHz Th Performance Software Environment
SPARC T4 2.85 64 7,987 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 12 5,315 Oracle Linux 6.1, IPP/AES-NI
Intel X5690 3.47 12 1,781 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 765 Oracle Linux 6.1, IPP/AES-NI
SPARC T4 2.85 1 655 Oracle Solaris 11, libsoftcrypto
Intel X5690 3.47 1 281 Oracle Solaris 11, libsoftcrypto

Configuration Summary

SPARC T4-1 server
1 x SPARC T4 processor, 2.85 GHz
128 GB memory
Oracle Solaris 11

SPARC T3-1 server
1 x SPARC T3 processor, 1.65 GHz
128 GB memory
Oracle Solaris 11 Express

Sun Fire X4270 M2 server
2 x Intel Xeon X5690, 3.47 GHz
Hyper-Threading enabled
Turbo Boost enabled
24 GB memory
Oracle Linux 6.1

Sun Fire X4270 M2 server
2 x Intel Xeon X5690, 3.47 GHz
Hyper-Threading enabled
Turbo Boost enabled
24 GB memory
Oracle Solaris 11 Express

Benchmark Description

The benchmark measures cryptographic capabilities in terms of general low-level encryption, in-memory and on-chip using various ciphers, including AES-128-CFB, AES-192-CFB, AES-256-CFB, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CCM, AES-192-CCM, AES-256-CCM, AES-128-GCM, AES-192-GCM and AES-256-GCM.

The benchmark results were obtained using tests created by Oracle which use various application interfaces to perform the various ciphers. They were run using optimized libraries for each platform to obtain the best possible performance.

See Also

Disclosure Statement

Copyright 2012, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 1/13/2012.

Thursday Sep 29, 2011

SPARC T4-1 Server Outperforms Intel (Westmere AES-NI) on IPsec Encryption Tests

Oracle's SPARC T4 processor has significantly greater performance than the Intel Xeon X5690 processor when both are using Oracle Solaris 11 secure IP networking (IPsec). The SPARC T4 processor using IPsec AES-256-CCM mode achieves line speed over a 10 GbE network.

  • On IPsec, SPARC T4 processor is 23% faster than the 3.46 GHz Intel Xeon X5690 processor (Intel AES-NI).

  • The SPARC T4 processor is only at 23% utilization when running at its maximum throughput making it 3.6 times more efficient at secure networking than the 3.46 GHz Intel Xeon X5690 processor.

  • The 3.46 GHz Intel Xeon X5690 processor is nearly fully utilized at its maximum throughput leaving little CPU for application processing.

  • The SPARC T4 processor using IPsec AES-256-CCM mode achieves line speed over a 10 GbE network.

  • The SPARC T4 processor approaches line speed with fewer than one-quarter the number of IPsec streams required for the Intel Xeon X5690 processor to achieve its peak throughput. The SPARC T4 processor supports the additional streams with minimal extra CPU utilization.

IPsec provides general purpose networking security which is transparent to applications. This is ideal for supplying the capability to those networking applications that don't have cryptography built-in. IPsec provides for more than Virtual Private Networking (VPN) deployments where the technology is often first encountered.

Performance Landscape

Performance was measured using the AES-256-CCM cipher in megabits per second (Mb/sec) aggregate over sufficient numbers of TCP/IP streams to achieve line rate threshold (SPARC T4 processor) or drive a peak throughput (Intel Xeon X5690).

Processor GHz AES Decrypt AES Encrypt
B/W (Mb/sec) CPU Util Streams B/W (Mb/sec) CPU Util Streams
– Peak performance
SPARC T4 2.85 9,800 23% 96 9,800 20% 78
Intel Xeon X5690 3.46 8,000 83% 4,700 81%
– Load at which SPARC T4 processor performance crosses 9000 Mb/sec
SPARC T4 2.85 9,300 19% 17 9,200 15% 17
Intel Xeon X5690 3.46 4,700 41% 3,200 47%
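
The headline ratios quoted in the bullets above can be reproduced from the peak-performance decrypt rows of this table. The following is a minimal sketch of that arithmetic, not the method used by the benchmark team. The function names are illustrative, and reading "3.6 times more efficient" as the ratio of the two CPU utilizations at peak is an assumption.

```python
def percent_faster(a: float, b: float) -> float:
    """Return how much larger a is than b, as a percentage."""
    return (a / b - 1.0) * 100.0


def utilization_ratio(util_a: float, util_b: float) -> float:
    """Return how many times lower util_a is than util_b (assumed efficiency measure)."""
    return util_b / util_a


# Peak decrypt figures copied from the table above.
t4_bw_mbps, t4_cpu_pct = 9800, 23
x5690_bw_mbps, x5690_cpu_pct = 8000, 83

speedup = percent_faster(t4_bw_mbps, x5690_bw_mbps)
ratio = utilization_ratio(t4_cpu_pct, x5690_cpu_pct)

print(f"SPARC T4 is {speedup:.1f}% faster at {ratio:.1f}x lower CPU utilization")
```

The bandwidth advantage works out to about 22.5%, which the post reports as 23%.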

Configuration Summary

SPARC Configuration:

SPARC T4-1 server
1 x SPARC T4 processor 2.85 GHz
128 GB memory
Oracle Solaris 11
Single 10-Gigabit Ethernet XAUI Adapter

Intel Configuration:

Sun Fire X4270 M2
1 x Intel Xeon X5690 3.46 GHz, Hyper-Threading and Turbo Boost active
48 GB memory
Oracle Solaris 11
Sun Dual Port 10GbE PCIe 2.0 Networking Card with Intel 82599 10GbE Controller

Driver Systems Configuration:

2 x Sun Blade 6000 chassis each with
1 x Sun Blade 6000 Virtualized Ethernet Switched Network Express Module 10GbE (NEM)
10 x Sun Blade X6270 M2 server modules each with
2 x Intel Xeon X5680 3.33 GHz, Hyper-Threading and Turbo Boost active
48 GB memory
Oracle Solaris 11
Dual 10-Gigabit Ethernet Fabric Expansion Module (FEM)

Benchmark Configuration:

Netperf 2.4.5 network benchmark adapted for testing bandwidth of multiple streams in aggregate.

Benchmark Description

The results here are derived from runs of the Netperf 2.4.5 benchmark. Netperf is a client/server benchmark measuring network performance providing a number of independent tests, including the TCP streaming bandwidth tests used here.

Netperf is, however, a single network stream benchmark, and demonstrating peak network bandwidth over a 10 GbE line under encryption requires many streams.

The Netperf documentation provides an example of using the software to drive multiple streams. The example is not sufficient to develop the workload because it does not scale beyond a single driver node, which limits the processing power that can be applied. This in turn limits how many full bandwidth streams can be supported. We chose to have a single server process on the target system (containing either the SPARC T4 processor or the Intel Xeon processor) and to spawn one or more Netperf client processes on each of the driver systems in the cluster. The client processes are managed by the mpirun program of the Oracle Message Passing Toolkit.

Tabular results include aggregate bandwidth and CPU utilization. The aggregate bandwidth is computed by dividing the total traffic of the client processes by the overall runtime. CPU utilization on the target system is the average of that reported by all of the Netperf client processes.
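
The two reported quantities can be sketched as follows. This is a hedged illustration only: the function names and the sample per-client values are hypothetical, and it assumes that 1 Mb is 10^6 bits and that each client reports its byte count and the target's CPU utilization.

```python
from statistics import mean


def aggregate_bandwidth_mbps(bytes_per_client, runtime_seconds):
    """Total traffic of all client processes divided by the overall runtime, in Mb/sec."""
    total_bits = sum(bytes_per_client) * 8
    return total_bits / runtime_seconds / 1_000_000


def target_cpu_utilization(reported_percentages):
    """Average of the target-system CPU utilization reported by each client."""
    return mean(reported_percentages)


# Hypothetical sample: four clients, each moving 36 GB during a 120 second run.
client_bytes = [36_000_000_000] * 4
runtime = 120.0
bw = aggregate_bandwidth_mbps(client_bytes, runtime)
cpu = target_cpu_utilization([22.0, 24.0, 23.0, 23.0])

print(f"{bw:.0f} Mb/sec aggregate at {cpu:.0f}% CPU")
```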

IPsec is configured in the operating system of each participating server transparently to Netperf and applied to the dedicated network connecting the target system to the driver systems.

Key Points and Best Practices

  • Line speed is defined as data bandwidth within 10% of the theoretical maximum bit rate of the network line. For 10 GbE, bandwidth greater than 9000 Mb/sec is defined as line speed.

  • IPsec provides network security that is configured completely in the operating system and is transparent to the application.

  • Peak bandwidths under IPsec are achieved only in aggregate with multiple client network streams to the target server.

  • Oracle Solaris receiver fanout must be increased from the default to support the large numbers of streams at quoted peak rates.

  • The ixgbe network driver relevant on servers with Intel 82599 10GbE controllers (driver systems and Intel Xeon target system) was limited to only a single receiver queue to maximize utilization of extra fanout.

  • IPsec is configured to make a unique security association (SA) for each connection to avoid a bottleneck over the large stream counts.

  • Jumbo frames are enabled (MTU of 9000) and network interrupt blanking (sometimes called interrupt coalescence) is disabled.

  • The TCP streaming bandwidth tests, which run continuously for minutes and multiple times to determine statistical significance, are configured to use message sizes of 1,048,576 bytes.

  • IPsec configuration defines that each SA is established through the use of a preshared key and Internet Key Exchange (IKE).

  • IPsec encryption uses the Solaris Cryptographic Framework which applies the appropriate accelerated provider on both the SPARC T4 processor and the Intel Xeon processor.

  • There is no need to configure a specific authentication algorithm for IPsec. With the Encapsulated Security Payload (ESP) security protocol and choosing AES-256-CCM for the encryption algorithm, the encapsulation is self-authenticating.
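
The line-speed definition in the first bullet can be written as a small predicate. This is a sketch under the stated definition; the function name and default parameters are illustrative, and the sample bandwidths are the decrypt values from the Performance Landscape table.

```python
def is_line_speed(bandwidth_mbps: float,
                  link_rate_mbps: float = 10_000,
                  tolerance: float = 0.10) -> bool:
    """True when bandwidth is within `tolerance` of the link's theoretical bit rate."""
    return bandwidth_mbps > link_rate_mbps * (1.0 - tolerance)


# Decrypt bandwidths (Mb/sec) taken from the table earlier in this post.
for label, mbps in [("SPARC T4 peak", 9800),
                    ("SPARC T4 at 17 streams", 9300),
                    ("Intel Xeon X5690 peak", 8000)]:
    verdict = "line speed" if is_line_speed(mbps) else "below line speed"
    print(f"{label}: {mbps} Mb/sec -> {verdict}")
```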

See Also

Disclosure Statement

Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 9/26/2011.

Friday Jun 03, 2011

SPARC Enterprise M8000 with Oracle 11g Beats IBM POWER7 on TPC-H @1000GB Benchmark

Oracle's SPARC Enterprise M8000 server configured with SPARC64 VII+ processors, Oracle's Sun Storage F5100 Flash Array storage, Oracle Solaris, and Oracle Database 11g Release 2 achieved a TPC-H performance result of 209,533 QphH@1000GB with price/performance of $9.53/QphH@1000GB.

Oracle's SPARC server surpasses the performance of the IBM POWER7 server on the 1 TB TPC-H decision support benchmark.

Oracle focuses on the performance of the complete hardware and software stack. Implementation details such as the number of cores or the number of threads obscure the important metric of delivered system performance. The SPARC Enterprise M8000 server delivers higher performance than the IBM Power 780 even though the SPARC64 VII+ processor core is 1.6x slower than the POWER7 processor core.

  • The SPARC Enterprise M8000 server is 27% faster than the IBM Power 780. IBM's reputed single-thread performance leadership does not provide benefit for throughput.

  • Oracle beats IBM Power with better performance. This shows that Oracle's focus on integrated system design provides more customer value than IBM's focus on per core performance.

  • The SPARC Enterprise M8000 server is up to 3.8 times faster than the IBM Power 780 for Refresh Function. Again, IBM's reputed single-thread performance leadership does not provide benefit for this important function.

  • The SPARC Enterprise M8000 server is 49% faster than the HP Superdome 2 (1.73 GHz Itanium 9350).

  • The SPARC Enterprise M8000 server has 22% better price/performance than the HP Superdome 2 (1.73 GHz Itanium 9350).

  • The SPARC Enterprise M8000 server is 2 times faster than the HP Superdome 2 (1.73 GHz Itanium 9350) for Refresh Function.

  • Oracle used Storage Redundancy Level 3 as defined by the TPC-H 2.14.0 specification which is the highest level.

  • One should focus on the performance of the complete hardware and software stack since server implementation details such as the number of cores or the number of threads obscure the important metric of delivered system performance.

  • This TPC-H result demonstrates that the SPARC Enterprise M8000 server can handle the increasingly large databases required of DSS systems. The server delivered more than 16 GB/sec of IO throughput through Oracle Database 11g Release 2 software while maintaining high CPU load.

Performance Landscape

The table below lists published results from comparable enterprise class systems from Oracle, HP and IBM. Each system was configured with 512 GB of memory.

TPC-H @1000GB

System | CPU type | Proc/Core/Thread | Composite (QphH) | $/perf ($/QphH) | Power (QppH) | Throughput (QthH) | Database | Available
SPARC Enterprise M8000 | 3 GHz SPARC64 VII+ | 16 / 64 / 128 | 209,533.6 | $9.53 | 177,845.9 | 246,867.2 | Oracle 11g | 09/22/11
IBM Power 780 | 4.14 GHz POWER7 | 8 / 32 / 128 | 164,747.2 | $6.85 | 170,206.4 | 159,463.1 | Sybase | 03/31/11
HP SuperDome 2 | 1.73 GHz Intel Itanium 9350 | 16 / 64 / 64 | 140,181.1 | $12.15 | 139,181.0 | 141,188.3 | Oracle 11g | 10/20/10

QphH = the Composite Metric (bigger is better)
$/QphH = the Price/Performance metric (smaller is better)
QppH = the Power Numerical Quantity
QthH = the Throughput Numerical Quantity

Complete benchmark results found at the TPC benchmark website http://www.tpc.org.

Configuration Summary and Results

Server:

SPARC Enterprise M8000 server
16 x SPARC64 VII+ 3.0 GHz processors (total of 64 cores, 128 threads)
512 GB memory
12 x internal SAS (12 x 300 GB) disk drives

External Storage:

4 x Sun Storage F5100 Flash Array device, each with
80 x 24 GB Flash Modules

Software:

Oracle Solaris 10 8/11
Oracle Database 11g Release 2 Enterprise Edition

Audited Results:

Database Size: 1000 GB (Scale Factor 1000)
TPC-H Composite: 209,533.6 QphH@1000GB
Price/performance: $9.53/QphH@1000GB
Available: 09/22/2011
Total 3 year Cost: $1,995,715
TPC-H Power: 177,845.9
TPC-H Throughput: 246,867.2
Database Load Time: 1:27:12

Benchmark Description

The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB and 10000GB) are not allowed by the TPC.

TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system.

The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single-user and multi-user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years maintenance to QphH.

Key Points and Best Practices

  • Four Sun Storage F5100 Flash Array devices were used for the benchmark. Each F5100 device contains 80 Flash Modules (FMODs). Twenty (20) FMODs from each F5100 device were connected to a single SAS 6 Gb HBA. A single F5100 device showed 4.16 GB/sec for sequential read and demonstrated linear scaling of 16.62 GB/sec with 4 x F5100 devices.
  • The IO rate from the Oracle database was over 16 GB/sec.
  • Oracle Solaris 10 8/11 required very little system tuning.
  • The SPARC Enterprise M8000 server and Oracle Solaris efficiently managed the system load of over one thousand Oracle parallel processes.
  • The Oracle database files were mirrored under Solaris Volume Manager (SVM). Two F5100 arrays were mirrored to another 2 F5100 arrays. IO performance was good and balanced across all the FMODs. Because of the SVM mirror one of the durability tests, the disk/controller failure test, was transparent to the Oracle database.

See Also

Disclosure Statement

SPARC Enterprise M8000 209,533.6 QphH@1000GB, $9.53/QphH@1000GB, avail 09/22/11, IBM Power 780 164,747.2 QphH@1000GB, $6.85/QphH@1000GB, avail 03/31/11, HP Integrity Superdome 2 140,181.1 QphH@1000GB, $12.15/QphH@1000GB avail 10/20/10, TPC-H, QphH, $/QphH tm of Transaction Processing Performance Council (TPC). More info www.tpc.org.

Friday Mar 25, 2011

SPARC Enterprise M9000 with Oracle Database 11g Delivers World Record Single Server TPC-H @3000GB Result

Oracle's SPARC Enterprise M9000 server delivers single-system TPC-H @3000GB world record performance. The SPARC Enterprise M9000 server along with Oracle's Sun Storage 6180 arrays and running Oracle Database 11g Release 2 on the Oracle Solaris operating system proves the power of Oracle's integrated solution.

  • The SPARC Enterprise M9000 server configured with SPARC64 VII+ processors, Sun Storage 6180 arrays and running Oracle Solaris 10 combined with Oracle Database 11g Release 2 achieved World Record TPC-H performance of 386,478.3 QphH@3000GB for non-clustered systems.

  • The SPARC Enterprise M9000 server running the Oracle Database 11g Release 2 software is 2.5 times faster than the IBM p595 (POWER6) server which ran with Sybase IQ v.15.1 database software.

  • The SPARC Enterprise M9000 server is 3.4 times faster than the IBM p595 server for data loading.

  • The SPARC Enterprise M9000 server is 3.5 times faster than the IBM p595 server for Refresh Function.

  • The SPARC Enterprise M9000 server configured with Sun Storage 6180 arrays shows linear scaling up to the maximum delivered IO performance of 48.3 GB/sec as measured by vdbench.

  • The SPARC Enterprise M9000 server running the Oracle Database 11g Release 2 software is 2.4 times faster than the HP ProLiant DL980 server which used Microsoft SQL Server 2008 R2 Enterprise Edition software.

  • The SPARC Enterprise M9000 server is 2.9 times faster than the HP ProLiant DL980 server for data loading.

  • The SPARC Enterprise M9000 server is 4 times faster than the HP ProLiant DL980 server for Refresh Function.

  • A 1.94x improvement was delivered by the SPARC Enterprise M9000 server result using 64 SPARC64 VII+ processors compared to the previous Sun SPARC Enterprise M9000 server result which used 32 SPARC64 VII processors.

  • Oracle's TPC-H result shows that the SPARC Enterprise M9000 server can handle the increasingly large databases required of DSS systems. The IO rate as measured by the Oracle database is over 40 GB/sec.

  • Oracle used Storage Redundancy Level 3 as defined by the TPC-H 2.14.0 specification which is the highest level.

Performance Landscape

TPC-H @3000GB, Non-Clustered Systems

System | CPU type | Memory | Composite (QphH) | $/perf ($/QphH) | Power (QppH) | Throughput (QthH) | Database | Available
SPARC Enterprise M9000 | 3 GHz SPARC64 VII+ | 1024 GB | 386,478.3 | $18.19 | 316,835.8 | 471,428.6 | Oracle 11g | 09/22/11
Sun SPARC Enterprise M9000 | 2.88 GHz SPARC64 VII | 512 GB | 198,907.5 | $15.27 | 182,350.7 | 216,967.7 | Oracle 11g | 12/09/10
HP ProLiant DL980 G7 | 2.27 GHz Intel Xeon X7560 | 512 GB | 162,601.7 | $2.68 | 185,297.7 | 142,601.7 | SQL Server | 10/13/10
IBM Power 595 | 5.0 GHz POWER6 | 512 GB | 156,537.3 | $20.60 | 142,790.7 | 171,607.4 | Sybase | 11/24/09

QphH = the Composite Metric (bigger is better)
$/QphH = the Price/Performance metric (smaller is better)
QppH = the Power Numerical Quantity
QthH = the Throughput Numerical Quantity

Complete benchmark results found at the TPC benchmark website http://www.tpc.org.

Configuration Summary and Results

Server:

SPARC Enterprise M9000
64 x SPARC64 VII+ 3.0 GHz processors
1024 GB memory
4 x internal SAS (4 x 146 GB)

External Storage:

32 x Sun Storage 6180 arrays (each with 16 x 600 GB)

Software:

Oracle Solaris 10 9/10
Oracle Database 11g Release 2 Enterprise Edition

Audited Results:

Database Size: 3000 GB (Scale Factor 3000)
TPC-H Composite: 386,478.3 QphH@3000GB
Price/performance: $18.19/QphH@3000GB
Available: 09/22/2011
Total 3 year Cost: $7,030,009
TPC-H Power: 316,835.8
TPC-H Throughput: 471,428.6
Database Load Time: 2:59:01

Benchmark Description

The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB and 10000GB) are not allowed by the TPC.

TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system.

The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single-user and multi-user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years maintenance to QphH.
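
As a worked illustration of how the reported metrics relate, the sketch below recomputes the composite and price/performance figures from the audited results listed above. The geometric-mean relationship between QppH and QthH comes from the TPC-H specification rather than from this post, and the function names are illustrative.

```python
import math


def composite_qphh(power_qpph: float, throughput_qthh: float) -> float:
    """Composite metric: geometric mean of the Power and Throughput quantities."""
    return math.sqrt(power_qpph * throughput_qthh)


def price_performance(total_cost_usd: float, qphh: float) -> float:
    """Price/performance: total 3 year cost divided by the composite metric."""
    return total_cost_usd / qphh


# Audited SPARC Enterprise M9000 values from this post.
qphh = composite_qphh(316_835.8, 471_428.6)
ppq = price_performance(7_030_009, 386_478.3)

print(f"{qphh:,.1f} QphH@3000GB, ${ppq:.2f}/QphH@3000GB")
```

Both values agree with the audited 386,478.3 QphH@3000GB and $18.19/QphH@3000GB to the precision shown.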

Key Points and Best Practices

  • The Sun Storage 6180 array showed linear scalability of 48.3 GB/sec Sequential Read with thirty-two Sun Storage 6180 arrays. Scaling could continue if there are more arrays available.
  • Oracle Solaris 10 9/10 required very little system tuning.
  • The optimal Sun Storage 6180 array configuration for the benchmark was to set up 1 disk per volume instead of multiple disks per volume and let Oracle Automatic Storage Management (ASM) mirror. Presenting as many volumes as possible to Oracle database gave the highest scan rate.

  • The storage was managed by ASM with 4 MB stripe size. 1 MB is the default stripe size but 4 MB works better for large databases.

  • All the Oracle database files, except TEMP tablespace, were mirrored under ASM. 16 x Sun Storage 6180 arrays (256 disks) were mirrored to another 16 x Sun Storage 6180 arrays using ASM. IO performance was good and balanced across all the disks. With the ASM mirror the benchmark passed the ACID (Atomicity, Consistency, Isolation and Durability) test.

  • Oracle database tables were 256-way partitioned. The parallel degree for each table was set to 256 to match the number of available cores. This setting worked the best for performance.

  • Oracle Database 11g Release 2 feature Automatic Parallel Degree Policy was set to AUTO for the benchmark. This enabled automatic degree of parallelism, statement queuing and in-memory parallel execution.

See Also

Disclosure Statement

SPARC Enterprise M9000 386,478.3 QphH@3000GB, $18.19/QphH@3000GB, avail 09/22/11, IBM Power 595 156,537.3 QphH@3000GB, $20.60/QphH@3000GB, avail 11/24/09, HP ProLiant DL980 G7 162,601.7 QphH@3000GB, $2.68/QphH@3000GB avail 10/13/10, TPC-H, QphH, $/QphH tm of Transaction Processing Performance Council (TPC). More info www.tpc.org.

Monday Oct 11, 2010

Sun SPARC Enterprise M9000 Server Delivers World Record Non-Clustered TPC-H @3000GB Performance

Oracle's Sun SPARC Enterprise M9000 server delivered single-system TPC-H @3000GB world record performance. The Sun SPARC Enterprise M9000 server, running Oracle Database 11g Release 2 on the Oracle Solaris operating system, proves the power of Oracle's integrated solution.

  • Oracle beats IBM Power with better performance and price/performance (3 Year TCO). This shows that Oracle's focus on integrated system design provides more customer value than IBM's focus on "per core performance"!

  • The Sun SPARC Enterprise M9000 server is 27% faster than the IBM Power 595.

  • The Sun SPARC Enterprise M9000 server is 22% faster than the HP ProLiant DL980 G7.

  • The Sun SPARC Enterprise M9000 server has 26% lower (better) price/performance than the IBM Power 595.

  • The Sun SPARC Enterprise M9000 server is 2.7 times faster than the IBM Power 595 for data loading.

  • The Sun SPARC Enterprise M9000 server is 2.3 times faster than the HP ProLiant DL980 for data loading.

  • The Sun SPARC Enterprise M9000 server is 2.6 times faster than the IBM p595 for Refresh Function.

  • The Sun SPARC Enterprise M9000 server is 3 times faster than the HP ProLiant DL980 for Refresh Function.

  • Oracle used Storage Redundancy Level 3 as defined by the TPC-H 2.12.0 specification, which is the highest level. IBM is the only other vendor to secure the storage to this level.

  • One should focus on the performance of the complete hardware and software stack since server implementation details such as the number of cores or the number of threads will obscure the important metrics of delivered system performance and system price/performance.

  • The Sun SPARC Enterprise M9000 server configured with SPARC64 VII processors, Sun Storage 6180 arrays, and running Oracle Solaris 10 operating system combined with Oracle Database 11g Release 2 achieved World Record TPC-H performance of 198,907.5 QphH@3000GB for non-clustered systems.

  • The Sun SPARC Enterprise M9000 server is over three times faster than the HP Itanium2 Superdome.

  • The Sun Storage 6180 array configuration (a total of 16 6180 arrays) in this benchmark delivered IO performance of over 21 GB/sec Sequential Read performance as measured by the vdbench tool.

  • This TPC-H result demonstrates that the Sun SPARC Enterprise M9000 server can handle the increasingly large databases required of DSS systems. The server delivered more than 18 GB/sec of real IO throughput as measured by the Oracle Database 11g Release 2 software.

  • Both Oracle and IBM had the same level of hardware discounting as allowed by TPC rules to provide an effective comparison of price/performance.

  • IBM has not shown any delivered I/O performance results for the high-end IBM POWER7 systems. In addition, they have not delivered any commercial benchmarks (TPC-C, TPC-H, etc.) which have heavy I/O demands.

Performance Landscape

TPC-H @3000GB, Non-Clustered Systems

System | CPU type | Memory | Composite (QphH) | $/perf ($/QphH) | Power (QppH) | Throughput (QthH) | Database | Available
Sun SPARC Enterprise M9000 | 2.88GHz SPARC64 VII | 512GB | 198,907.5 | $15.27 | 182,350.7 | 216,967.7 | Oracle | 12/09/10
HP ProLiant DL980 G7 | 2.27GHz Intel Xeon X7560 | 512GB | 162,601.7 | $2.68 | 185,297.7 | 142,601.7 | SQL Server | 10/13/10
IBM Power 595 | 5.0GHz POWER6 | 512GB | 156,537.3 | $20.60 | 142,790.7 | 171,607.4 | Sybase | 11/24/09
Unisys ES7000 7600R | 2.6GHz Intel Xeon | 1024GB | 102,778.2 | $21.05 | 120,254.8 | 87,841.4 | SQL Server | 05/06/10
HP Integrity Superdome | 1.6GHz Intel Itanium | 256GB | 60,359.3 | $32.60 | 80,838.3 | 45,068.3 | SQL Server | 05/21/07

QphH = the Composite Metric (bigger is better)
$/QphH = the Price/Performance metric (smaller is better)
QppH = the Power Numerical Quantity
QthH = the Throughput Numerical Quantity

Complete benchmark results found at the TPC benchmark website http://www.tpc.org.

Configuration Summary and Results

Server:

Sun SPARC Enterprise M9000
32 x SPARC64 VII 2.88 GHz processors
512 GB memory
4 x internal SAS (4 x 300 GB)

External Storage:

16 x Sun Storage 6180 arrays (each with 16 x 300 GB)

Software:

Operating System: Oracle Solaris 10 10/09
Database: Oracle Database 11g Release 2 Enterprise Edition

Audited Results:

Database Size: 3000 GB (Scale Factor 3000)
TPC-H Composite: 198,907.5 QphH@3000GB
Price/performance: $15.27/QphH@3000GB
Available: 12/09/2010
Total 3 year Cost: $3,037,900
TPC-H Power: 182,350.7
TPC-H Throughput: 216,967.7
Database Load Time: 3:40:11

Benchmark Description

The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. These queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H Database sizes (100GB, 300GB, 1000GB, 3000GB and 10000GB) are not allowed by the TPC.

TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system.

The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single-user and multi-user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years maintenance to QphH.

Key Points and Best Practices

  • The Sun Storage 6180 array showed good scalability and these sixteen 6180 arrays showed over 21 GB/sec Sequential Read performance as measured by the vdbench tool.
  • Oracle Solaris 10 10/09 required little system tuning.
  • The optimal 6180 configuration for the benchmark was to set up 1 disk per volume instead of multiple disks per volume and let Oracle Solaris Volume Manager (SVM) mirror. Presenting as many volumes as possible to Oracle database gave the highest scan rate.

  • The storage was managed by SVM with 1MB stripe size to match with Oracle's database IO size. The default 16K stripe size is just too small for this DSS benchmark.

  • All the Oracle files, except TEMP tablespace, were mirrored under SVM. Eight 6180 arrays (128 disks) were mirrored to another 8 6180 arrays using 128-way stripe. IO performance was good and balanced across all the disks with a round robin order. Read performance was the same with mirror or without mirror. With the SVM mirror the benchmark passed the ACID (Atomicity, Consistency, Isolation and Durability) test.

  • Oracle tables were 128-way partitioned and parallel degree for each table was set to 128 because the system had 128 cores. This setting worked the best for performance.

  • CPU usage during the Power run was not very high. Because the parallel degree was set to 128 for the tables and indexes, most of the queries used 128 vcpus while the system had 256 vcpus.

See Also

Disclosure Statement

Sun SPARC Enterprise M9000 198,907.5 QphH@3000GB, $15.27/QphH@3000GB, avail 12/09/10, IBM Power 595 156,537.3 QphH@3000GB, $20.60/QphH@3000GB, avail 11/24/09, HP Integrity Superdome 60,359.3 QphH@3000GB, $32.60/QphH@3000GB avail 06/18/07, TPC-H, QphH, $/QphH tm of Transaction Processing Performance Council (TPC). More info www.tpc.org.

Thursday Sep 23, 2010

Sun Storage F5100 Flash Array with PCI-Express SAS-2 HBAs Achieves Over 17 GB/sec Read

Oracle's Sun Storage F5100 Flash Array storage is a high-performance, high-density, solid-state flash array delivering 17 GB/sec sequential read throughput performance (1 MB reads) and 10 GB/sec write sequential throughput performance (1 MB writes).
  • Using PCI-Express SAS-2 HBAs reduces the required slot count by 50% compared to PCI-Express SAS-1 HBAs.

  • The Sun Storage F5100 Flash Array storage using 8 PCI-Express SAS-2 HBAs showed a 33% aggregate, sequential read bandwidth improvement over using 16 PCI-Express SAS-1 HBAs.

  • The Sun Storage F5100 Flash Array storage using 8 PCI-Express SAS-2 HBAs showed a 6% aggregate, sequential write bandwidth improvement over using 16 PCI-Express SAS-1 HBAs.

  • Each SAS port of the Sun Storage F5100 Flash Array storage delivered over 1 GB/sec sequential read performance.

  • Performance data is also presented utilizing smaller numbers of FMODs in the full configuration, demonstrating near perfect scaling from 20 to 80 FMODs.

The Sun Storage F5100 Flash Array storage is designed to accelerate IO-intensive applications, such as databases, at a fraction of the power, space, and cost of traditional hard disk drives. It is based on enterprise-class SLC flash technology, with advanced wear-leveling, integrated backup protection, solid state robustness, and 3M hours MTBF reliability.

Performance Landscape

Results for the PCI-Express SAS-2 HBAs were obtained using four hosts, each configured with 2 HBAs.

Results for the PCI-Express SAS-1 HBAs were obtained using four hosts, each configured with 4 HBAs.

Bandwidth Measurements

Sequential Read (Aggregate GB/sec) for 1 MB Transfers
HBA Configuration FMODs
1 20 40 80
8 SAS-2 HBAs 0.26 4.3 8.5 17.0
16 SAS-1 HBAs 0.26 3.2 6.4 12.8
Sequential Write (Aggregate GB/sec) for 1 MB Transfers
HBA Configuration FMODs
1 20 40 80
8 SAS-2 HBAs 0.14 2.7 5.2 10.3
16 SAS-1 HBAs 0.12 2.4 4.8 9.7
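
The improvement and scaling claims made earlier in this post can be checked against the table values. The following sketch is illustrative only: the helper names are hypothetical, and defining scaling efficiency as the bandwidth ratio divided by the FMOD-count ratio is an assumption about what "near perfect scaling" means here.

```python
def improvement_pct(new: float, old: float) -> float:
    """Percentage improvement of `new` over `old`."""
    return (new / old - 1.0) * 100.0


def scaling_efficiency(bw_small: float, n_small: int,
                       bw_large: float, n_large: int) -> float:
    """Bandwidth growth relative to FMOD count growth (1.0 means perfectly linear)."""
    return (bw_large / bw_small) / (n_large / n_small)


# Aggregate GB/sec at 80 FMODs, copied from the tables above.
read_gain = improvement_pct(17.0, 12.8)    # 8 SAS-2 HBAs vs 16 SAS-1 HBAs, read
write_gain = improvement_pct(10.3, 9.7)    # 8 SAS-2 HBAs vs 16 SAS-1 HBAs, write

# SAS-2 sequential read scaling from 20 to 80 FMODs.
read_scaling = scaling_efficiency(4.3, 20, 17.0, 80)

print(f"read +{read_gain:.0f}%, write +{write_gain:.0f}%, "
      f"read scaling efficiency {read_scaling:.2f}")
```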

Results and Configuration Summary

Storage Configuration:

Sun Storage F5100 Flash Array
80 Flash Modules
16 ports
4 domains (20 Flash Modules per domain)
CAM zoning - 5 Flash Modules per port

Server Configuration:

4 x Sun Fire X4270 servers, each with
16 GB memory
2 x 2.93 GHz Quad-core Intel Xeon X5570 processors
2 x PCI-Express SAS-2 External HBAs, firmware version SW1.1-RC5

Software Configuration:

OpenSolaris 2009.06 or Oracle Solaris 10 10/09
Vdbench 5.0

Benchmark Description

Two IO performance metrics on the Sun Storage F5100 Flash Array storage using Vdbench 5.0 were measured: 100% Sequential Read and 100% Sequential Write. This demonstrates the maximum performance and throughput of the storage system.

Vdbench is publicly available for download at: http://vdbench.org

Key Points and Best Practices

  • Please note that the Sun Storage F5100 Flash Array storage is a 4KB sector device. Doing IOs of less than 4KB in size, or IOs not aligned on 4KB boundaries, can impact performance on write operations.
  • Drive each Flash Module with 8 outstanding IOs.
  • Both ports of each LSI PCI-Express SAS-2 HBA were used.
  • SPARC platforms will align with the 4K boundary size set by the Flash Array. x86/Windows platforms don't necessarily have this alignment built in and can show lower performance.
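
The 4 KB alignment guidance in the first bullet can be expressed as a simple check on an IO request. This is a minimal sketch assuming byte offsets and lengths; the constant and function names are hypothetical and not part of Vdbench or any Oracle tool.

```python
SECTOR_BYTES = 4096  # the F5100 is described above as a 4KB sector device


def is_sector_aligned(offset_bytes: int, length_bytes: int,
                      sector: int = SECTOR_BYTES) -> bool:
    """True when the IO starts on a sector boundary and covers whole sectors."""
    return (length_bytes > 0
            and offset_bytes % sector == 0
            and length_bytes % sector == 0)


def align_down(offset_bytes: int, sector: int = SECTOR_BYTES) -> int:
    """Round an offset down to the nearest sector boundary."""
    return offset_bytes - offset_bytes % sector


def align_up(length_bytes: int, sector: int = SECTOR_BYTES) -> int:
    """Round a length up to a whole number of sectors."""
    return -(-length_bytes // sector) * sector


print(is_sector_aligned(0, 1024 * 1024))   # a 1 MB transfer at offset 0
print(is_sector_aligned(512, 4096))        # misaligned start
print(align_down(5000), align_up(5000))
```

A 1 MB transfer starting at offset 0, as used in the bandwidth measurements, satisfies the check, while a 4 KB write starting at byte 512 does not.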

See Also

Disclosure Statement

The Sun Storage F5100 Flash Array storage delivered 17.0 GB/sec sequential read and 10.3 GB/sec sequential write. Vdbench 5.0 (http://vdbench.org) was used for the test. Results as of September 20, 2010.

About

BestPerf is the source of Oracle performance expertise. In this blog, Oracle's Strategic Applications Engineering group explores Oracle's performance results and shares best practices learned from working on Enterprise-wide Applications.
