Everything you want and need to know about Oracle SPARC systems performance

In-Memory Database: SPARC T7-1 Faster Than x86 E5 v3

Brian Whitney
Principal Software Engineer

Fast analytics on large databases are critical to transforming key business processes. Oracle's SPARC M7 processors are specifically designed to accelerate in-memory analytics using Oracle Database 12c Enterprise Edition utilizing the In-Memory option. The SPARC M7 processor outperforms an x86 E5 v3 chip by up to 10.8x on analytics queries. In order to test real world deep analysis on the SPARC M7 processor a scenario with over 2,300 analytical queries was run against a real cardinality database (RCDB) star schema.  This benchmark was audited by Enterprise Strategy Group (ESG). ESG is an IT research, analyst, strategy, and validation firm focused on the global IT community.

The SPARC M7 processor does this by using Data Accelerator co-processor (DAX). DAX is not a SIMD instruction but rather an actual co-processor that offloads in-memory queries which frees the cores up for other processing. The DAX has direct access to the memory bus and can execute scans at near full memory bandwidth. Oracle makes the DAX API available to other applications, so this kind of acceleration not just for the Oracle database, it is open.

  • The SPARC M7 processor delivers up to a 10.8x Query Per Minute speedup per chip over the Intel Xeon Processor E5-2699 v3 when executing analytical queries using the In-Memory option of Oracle Database 12c.

  • Oracle's SPARC T7-1 server delivers up to a 5.4x Query Per Minute speedup over the 2-chip x86 E5 v3 server when executing analytical queries using the In-Memory option of Oracle Database 12c.

  • The SPARC T7-1 server delivers over 143 GB/sec of memory bandwidth which is up to 7x more than the 2-chip x86 E5 v3 server when the Oracle Database 12c is executing the same analytical queries against the RCDB.

  • The SPARC T7-1 server scanned over 48 billion rows per second through the database.

  • The SPARC T7-1 server compresses the on-disk RCDB star schema by around 6x when using the Memcompress For Query High setting (more information following below) and by nearly 10x compared to a standard data warehouse row format version of the same database.

Performance Landscape

The table below compares the SPARC T7-1 server and 2-chip x86 E5 v3 server.  The x86 E5 v3 server single chip compares are from actual measurements against a single chip configuration.

The number of cores is per chip, multiply by number of chips to get system total.

RCDB Performance Chart
2,304 Queries
System Elapsed
Queries Per
DB Memory
1 x SPARC M7 (32core)
381 363 5.4x 10.8x 143 GB/sec
x86 E5 v3 server
2 x Intel E5-2699 v3 (2x 18core)
2059 67 1.0x 2.0x 20 GB/sec
x86 E5 v3 server
1 x Intel E5-2699 v3 (18core)
4096 34 0.5x 1.0x 10 GB/sec


Fused Decompress + Scan

The In-Memory feature of Oracle Database 12c puts tables in columnar format. There are different levels of compression that can be applied. One of these is Oracle Zip (OZIP) which is used with the "MEMCOMPRESS FOR QUERY HIGH" setting. Typically when compression is applied to data, in order to operate on it, the data must be:

(1) Decompressed
(2) Written back to memory in uncompressed form
(3) Scanned and the results returned.

When OZIP is applied to the data inside of an In-Memory Columnar Unit (or IMCU, an N sized chunk of rows), the DAX is able to take this data in its compressed format and operate (scan) directly upon it, returning results in a single step. This not only saves on compute power by not having the CPU do the decompression step, but also on memory bandwidth as the uncompressed data is not put back into memory. Only the results are returned. To illustrate this, a microbenchmark was used which measured the amount of rows that could be scanned per second.

SAE hpk-uperf


This performance test was run on a Scale Factor 1750 database, which represents a 1.75 TB row format data warehouse. The database is then transformed into a star schema which ends up around 1.1 TB in size. The star schema is then loaded in memory with a setting of "MEMCOMPRESS FOR QUERY HIGH", which focuses on performance with somewhat more aggressive compression. This memory area is a separate part of the System Global Area (SGA) which is defined by the database initialization parameter "inmemory_size". See below for an example. Here is a breakdown of each table in memory with compression ratios.

Column Name Original Size
In Memory
Size (Bytes)
LINEORDER 1,103,524,528,128 178,586,451,968 6.2x
DATE 11,534,336 1,179,648 9.8x
PART 11,534,336 1,179,648 9.8x
SUPPLIER 11,534,336 1,179,648 9.8x
CUSTOMER 11,534,336 1,179,648 9.8x

Configuration Summary

SPARC Server:

1 X SPARC T7-1 server
1 X SPARC M7 processor
512 GB memory
Oracle Solaris 11.3
Oracle Database 12c Enterprise Edition Release

x86 Server:

1 X Oracle Server X5-2L
2 X Intel Xeon Processor E5-2699 v3
512 GB memory
Oracle Linux 6 Update 5 (3.8.13-16.2.1.el6uek.x86_64)
Oracle Database 12c Enterprise Edition Release

Benchmark Description

The real cardinality database (RCDB) benchmark was created to showcase the potential speedup one may see moving from on disk, row format data warehouse/Star Schema, to utilizing Oracle Database 12c's In-Memory feature for analytical queries.

The workload consists of 2,304 unique queries asking questions such as "In 2014, what was the total revenue of single item orders", or "In August 2013, how many orders exceeded a total price of $50". Questions like these can help a company see where to focus for further revenue growth or identify weaknesses in their offerings.

RCDB scale factor 1750 represents a 1.75 TB data warehouse. It is transformed into a star schema of 1.1 TB, and then becomes 179 GB in size when loaded in memory. It consists of 1 fact table, and 4 dimension tables with over 10.5 billion rows. There are 56 columns with most cardinalities varying between 5 and 2,000, a primary key being an example of something outside this range.

One problem with many industry standard generated databases is that as they have grown in size the cardinalities for the generated columns have become exceedingly unrealistic. For instance one industry standard benchmark uses a schema where at scale factor 1 TB it calls for the number of parts to be SF * 800,000. A 1 TB database that calls for 800 million unique parts is not very realistic. Therefore RCDB attempts to take some of these unrealistic cardinalities and size them to be more representative of at least a section of customer data. Obviously one cannot encompass every database in one schema, this is just an example.

We carefully scaled each system so that the optimal number of users was run on each system under test so that we did not create artificial bottlenecks. Each user ran an equal number of queries and the same queries were run on each system, allowing for a fair comparison of the results.

Key Points and Best Practices

  • This benchmark utilized the SPARC M7 processor's co-processor DAX for query acceleration.
  • All SPARC T7-1 server results were run with out of the box tuning for Oracle Solaris.

  • All Oracle Server X5-2L system results were run with out of the box tunings for Oracle Linux except for the setting in /etc/sysctl.conf to get large pages for the Oracle Database:

    • vm.nr_hugepages=64520
  • To create an in memory area, the following was added to the init.ora:

    • inmemory_size = 200g
  • An example of how to set a table to be in memory is below:


See Also

Disclosure Statement

Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 10/25/2015.

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.