X

Everything you want and need to know about Oracle SPARC systems performance

In-Memory Database: SPARC S7-2 Performance

Brian Whitney
Principal Software Engineer

Fast analytics on large databases are critical to transforming key business processes. Oracle's SPARC S7 processors are specifically designed to accelerate in-memory analytics using Oracle Database 12c Enterprise Edition and its In-Memory option. The SPARC S7 processor outperforms an x86 E5-2699 v4 chip by up to 2.8x on analytics queries where all queries were run in-memory. In order to test real world deep analysis on the SPARC S7 processor, a scenario with over 2,300 analytical queries was run against a real cardinality database (RCDB) star schema.

The SPARC S7 processor does this by using its Data Analytics Accelerator (DAX). DAX is not a SIMD instruction, but rather an actual co-processor that offloads in-memory queries to free up the cores for other processing. The DAX has direct access to the memory bus and can execute scans at near full memory bandwidth. This kind of acceleration is not just for the Oracle database. Oracle makes the DAX API available to other applications.

  • Oracle's SPARC S7-2 server delivers up to a 2.8x Queries Per Minute speedup over a 2-chip x86 E5-2699 v4 server when executing analytical queries using the In-Memory option of Oracle Database 12c.

  • The SPARC S7-2 server scanned over 36 billion rows per second through the database.

  • Oracle Database In-Memory compresses the on-disk RCDB star schema by about 6x when using the Memcompress For Query High setting (more information following below) and by nearly 10x compared to a standard data warehouse row format version of the same database.

This result shows the potential licensing cost advantage the SPARC S7 processor has because of DAX when compared to x86 based solutions. Because the Oracle Database core multipliers are the same, the licensing cost of the database gives the SPARC S7-2 server a 7.7x advantage.

Performance Landscape

RCDB Performance

All of the following results were run as part of this benchmark effort.

RCDB Performance Chart
2,304 Queries
Comparison Point 2-Chip
x86 E5-2699 v4
SPARC S7-2 SPARC
Advantage
Total Cores 44 16 2.75x
Elapsed Time 1885 sec 675 sec 2.8x
Throughput 73 qpm 205 qpm 2.8x
Required Database Licenses 22 8 2.75x
 
SPARC S7-2 Server Licensing Advantage 7.7x

Total Cores – Total processor cores in the system
Elapsed Time – Run time of test
Throughput – Number of queries per minute processed
Required Database License – Number of database licenses, using 0.5 multiplier, see OLSA page for more
Licensing Advantage – To show how DAX can help, this shows the relative throughput per licenses between the presented systems

Compression

This performance test was run on a Scale Factor 1750 database, which represents a 1.75 TB row format data warehouse. The database is then transformed into a star schema which ends up around 1.1 TB in size. The star schema is then loaded in memory with a setting of "MEMCOMPRESS FOR QUERY HIGH", which focuses on performance with somewhat more aggressive compression. This memory area is a separate part of the System Global Area (SGA) which is defined by the database initialization parameter "inmemory_size". See below for an example. The LINEORDER fact table, which comprises nearly the entire database size, is listed below in memory with its compression ratio.

 
Column Name Original Size
(Bytes)
In Memory
Size (Bytes)
Compression
Ratio
LINEORDER 1,101,950,197,760 178,583,568,384 6.2x

Configuration Summary

SPARC Server:

1 x SPARC S7-2 server with
2 x SPARC S7 processors, 4.27 GHz
512 GB memory
Oracle Solaris 11.3
Oracle Database 12c Enterprise Edition Release 12.1.0.2
Oracle Database In-Memory

x86 Server:

1 x Oracle Server X6-2L system with
2 x Intel Xeon Processor E5-2699 v4, 2.2 GHz
512 GB memory
Oracle Linux 7.2 (3.8.13-98.7.1.el7uek.x86_64)
Oracle Database 12c Enterprise Edition Release 12.1.0.2
Oracle Database In-Memory

Benchmark Description

The real cardinality database (RCDB) benchmark was created to showcase the potential speedup one may see moving from on disk, row format data warehouse/Star Schema, to utilizing the Oracle Database 12c In-Memory feature for analytical queries.  All tests presented are run in-memory.

The workload consists of 2,304 unique queries asking questions such as "In 2014, what was the total revenue of single item orders?" or "In August 2013, how many orders exceeded a total price of $50?" Questions like these can help a company see where to focus for further revenue growth or identify weaknesses in their offerings.

RCDB scale factor 1750 represents a 1.75 TB data warehouse. It is transformed into a star schema of 1.1 TB and then becomes 179 GB in size when loaded in memory. It consists of 1 fact table and 4 dimension tables with over 10.5 billion rows. There are 56 columns with most cardinalities varying between 5 and 2,000. A primary key is an example of something outside this range.

One problem with many industry standard generated databases is that as they have grown in size, the cardinalities for the generated columns have become exceedingly unrealistic. For instance, one industry standard benchmark uses a schema where at scale factor 1 TB, it calls for the number of parts to be SF * 800,000. A 1 TB database that calls for 800 million unique parts is not very realistic. Therefore RCDB attempts to take some of these unrealistic cardinalities and size them to be more representative of, at least, a section of customer data. Obviously, one cannot encompass every database in one schema. This is just an example.

We carefully scaled each system so that the optimal number of users was run on each system under test so that we did not create artificial bottlenecks. Each user ran an equal portion of the total batch job (2304 queries) and the same queries were run on each system, allowing for a fair comparison of the results.

Key Points and Best Practices

  • This benchmark utilized the SPARC S7 processor's DAX for query acceleration.
  • The batch accesses many columns that have been Run Length Encoded (RLE). The decompression of these columns can be done at a much faster rate when offloaded to DAX.

  • Some columns are also OZIP compressed. DAX can scan and return results directly on an OZIP'd column, resulting in reduced computation and bandwidth.

  • The batch scans large chunks of data for every query. These scans are offloaded to DAX, resulting in reduced computation and freeing up cores for other work.

  • All SPARC S7-2 server results were run with out-of-the-box tuning for Oracle Solaris.

  • All Oracle Server X6-2L system results were run with out of the box tunings for Oracle Linux, except for the setting in /etc/sysctl.conf to get large pages for the Oracle Database:

    • vm.nr_hugepages=64520
  • To create an in-memory area, the following was added to the init.ora:

    • inmemory_size = 200g
  • An example of how to set a table to be in-memory is below:

    • ALTER TABLE CUSTOMER INMEMORY MEMCOMPRESS FOR QUERY HIGH

     

See Also

Disclosure Statement

Copyright 2016, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of June 29, 2016.

The previous information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.