Reproducible Performance Benchmarks for Oracle Cloud Infrastructure Compute Instances

March 25, 2020 | 7 minute read
Sanjay Pillai
Director Product Management

Updated with data from April 2021

Enterprise workloads are often performance sensitive. At Oracle, we designed our cloud to deliver consistently high performance. As we improve our cloud over time, we will at a minimum measure performance annually and share that data with our customers, so that you can understand what to expect. In this post, we show test results from two common benchmarks and present the methodology to re-create our results.

We measured our commonly used Compute instance shapes by using two well-known and reliable benchmarking suites: UnixBench on Linux and LINPACK on Windows. For each OS, we ran the tests at two levels of workload complexity: with UnixBench, by varying the concurrency of the tests; with LINPACK, by varying the number of equations solved.

Compute Instance Performance on Linux OS

We used UnixBench to test the performance of Linux instances. UnixBench is a set of tests that measure the performance of various aspects of a UNIX or Linux based system. The result is an aggregate score that measures the overall system performance as opposed to any one individual component.

The following table shows the aggregate UnixBench performance for each of the tested Compute instance shapes with single-threaded and concurrent (multi-threaded) test configurations. Higher mean/median results indicate better performance, and standard deviation is provided to show consistency (smaller is better).

Table 1A: Aggregate UnixBench Performance for Tested Compute Shapes - Intel-based instances

Shape            Test Concurrency   Mean   Median   Standard Deviation
VM.Standard2.1          1            533     537            18
VM.Standard2.1          2            746     748            22
VM.Standard2.2          1            593     598            15
VM.Standard2.2          4           1410    1413            18
VM.Standard2.4          1            623     627            14
VM.Standard2.4          8           2339    2343            27
VM.Standard2.8          1            637     640            13
VM.Standard2.8         16           3832    3842            50

Table 1B: Aggregate UnixBench Performance for Tested Compute Shapes - AMD-based instances

Shape                                Test Concurrency   Mean   Median   Standard Deviation
VM.Standard.E3.Flex 1 OCPU, 16 GB           1           1430    1434            32
VM.Standard.E3.Flex 1 OCPU, 16 GB           2           2168    2175            41
VM.Standard.E3.Flex 2 OCPU, 32 GB           1           1542    1574            67
VM.Standard.E3.Flex 2 OCPU, 32 GB           4           3376    3724           510
VM.Standard.E3.Flex 4 OCPU, 64 GB           1           1598    1600            56
VM.Standard.E3.Flex 4 OCPU, 64 GB           8           5160    4947           703
VM.Standard.E3.Flex 8 OCPU, 128 GB          1           1584    1584            36
VM.Standard.E3.Flex 8 OCPU, 128 GB         16           7302    7302           424

What we saw is relatively consistent single-threaded performance across these instances, and a roughly proportional increase in performance when the tests run in a concurrent configuration. Customers can expect to see similar performance for single-process workloads regardless of instance type, and proportional increases for concurrent workloads as the size of the Compute instance increases.

The following graph plots these results:

We performed these tests in different regions and found them to be highly consistent across all tested regions. Here are the results by region:

Compute Instance Performance on Windows OS

We used LINPACK to test the performance of Windows instances. The LINPACK benchmark measures how quickly a system can solve a dense system of linear equations. The results show the average number of floating-point operations per second that an instance can perform, measured in gigaFLOPS (GFLOPS). We ran it with a small workload of 2,000 equations and with a larger workload of 40,000 equations.

The following table shows the aggregate LINPACK performance for each of the tested Compute instance shapes. A higher number indicates better performance, and a lower standard deviation indicates reduced variation between each test. Note that since we used a LINPACK binary distributed by Intel, this data is only from Intel based instances.

Table 2: Aggregate LINPACK Performance for Tested Compute Shapes - Intel-based instances

Shape            Test Size (equations)   Mean (GFLOPS)   Median (GFLOPS)   Standard Deviation
VM.Standard2.1           2,000                 19               19                  2
VM.Standard2.1          40,000                 48               48                  3
VM.Standard2.2           2,000                 50               55                 10
VM.Standard2.2          40,000                101              102                  5
VM.Standard2.4           2,000                 87               84                 15
VM.Standard2.4          40,000                199              203                 11
VM.Standard2.8           2,000                184              184                 19
VM.Standard2.8          40,000                349              331                 41

The following chart shows a summary of the average scores for the small and large test sizes by instance type. Performance increases as the size of the instance increases, with a steeper increase for the tests with the larger number of equations. For example, the mean 40,000-equation result grows from 48 GFLOPS on VM.Standard2.1 to 349 GFLOPS on VM.Standard2.8, roughly seven times the throughput for eight times the cores. You can expect the floating-point performance of Intel Xeon based virtual machine Compute instances to scale roughly linearly with a shape’s core count.

Similar to the Linux data, the Windows results are fairly consistent across the tested regions. The following graph shows minimal variation between regions:

Testing Methodology

We intended our testing to be easily reproducible. We used standard open source benchmarks and a straightforward testing methodology.

Our performance tests used the following parameters:

  • Commonly used Compute instance shapes:
        * VM.Standard2.1
        * VM.Standard2.2
        * VM.Standard2.4
        * VM.Standard2.8
        * VM.Standard.E3.Flex (1, 2, 4, and 8 OCPUs), for UnixBench only
  • Standard boot volume sizes (50 GB for Linux and 256 GB for Windows), with the default performance settings
  • A current version of UnixBench, compiled locally, was used for Linux; a pre-compiled LINPACK binary distributed by Intel was used for Windows
  • Thirty instances were run in each availability domain; no more than ten instances were run simultaneously at any time
  • Current platform images for Oracle Linux 7.9 and Windows Server 2016 were chosen for the benchmark

To reproduce these results, you need the following items:

  • Access to an Oracle Cloud Infrastructure account with the following permissions:
    • Create and delete dynamic groups, and the associated access policy
    • Create and delete Object Storage buckets
    • Create and delete virtual cloud networks (VCNs) and associated objects
    • Create and delete Compute instances
    • (Optional) Create and delete streams
  • Access to a system that can run Python 3. The benchmark code has been tested on OS X and Oracle Linux 7.7. Although you can run the code on your personal workstation, we recommend that you use a Compute VM instance to run the tests.
  • A working installation of the Oracle Cloud Infrastructure Python SDK. (A quick check that the SDK can authenticate is sketched after the note below.)
  • The code to run the benchmark tests. Download the code, extract the files, and then customize the files for your environment.

Note: The code to run the benchmark is provided only as an example. Although the code creates a working setup, Oracle Cloud Infrastructure doesn’t support this code.
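
Before you download the benchmark code, it can be worth confirming that your SDK installation can reach your tenancy. The following minimal sketch is not part of the benchmark code; it assumes a standard ~/.oci/config file with a DEFAULT profile.

    # check_oci_sdk.py - minimal sanity check for the OCI Python SDK (not part of the benchmark code)
    import oci

    config = oci.config.from_file()        # reads ~/.oci/config, DEFAULT profile
    oci.config.validate_config(config)     # raises an exception if required keys are missing

    identity = oci.identity.IdentityClient(config)
    user = identity.get_user(config["user"]).data
    print(f"Authenticated as: {user.name}")

    # List the regions that the tenancy is subscribed to; this is useful later when
    # trimming the 'regions' parameter in config.py.
    for subscription in identity.list_region_subscriptions(config["tenancy"]).data:
        print(subscription.region_name)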

Customizing the Benchmark Tests for Your Environment

The config.py file shows you the relevant customizations. Although you can edit the file directly, we recommend that you create an override file that contains your specific changes and then activate it by setting the OVERRIDES environment variable before you run any scripts.

At a minimum, modify the following parameters in the config.py file (an example override file is sketched after this list):

  • homeregion
  • compartment
  • cname
  • tenancy
  • launches: Don't set this value too large, because the tests could cause your tenancy to reach its Compute service limits for the shape that you're testing.
  • limit: Don't set this value too large, because the tests could cause your tenancy to reach its Compute service limits for the shape that you're testing.
  • public_key: Although a readable file name is required, the file contents aren't validated. If you want to prevent cloud-init from configuring Linux instances for SSH access, provide a dummy file.
  • regions: The test code launches instances in all commercial availability domains, which is excessive for most users. Include only the regions where you want to launch instances.
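
For illustration, here is what an override file might look like. The parameter names come from the list above; every value shown is a placeholder, and the exact format that config.py expects for each parameter depends on the downloaded code, so treat this as a sketch rather than a working configuration.

    # my_overrides.py - example override values (all values are placeholders; adjust for your tenancy)
    homeregion = "us-ashburn-1"                          # your tenancy's home region
    tenancy = "ocid1.tenancy.oc1..<unique_id>"           # tenancy OCID
    compartment = "ocid1.compartment.oc1..<unique_id>"   # compartment to launch instances in
    cname = "benchmark"                                  # name prefix for created objects (assumption)
    launches = 5    # number of instances to launch; keep small to stay under Compute service limits
    limit = 2       # cap on the test scale; keep small to stay under Compute service limits
    public_key = "/home/opc/.ssh/id_rsa.pub"             # must be a readable file; contents aren't validated
    regions = ["us-ashburn-1", "us-phoenix-1"]           # only the regions you want to launch in

Point the OVERRIDES environment variable at this file before running any of the scripts; the driver sketch later in this post does exactly that.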

Launching Instances Using the Example Benchmark Tests

If you decide to launch instances with these scripts, you typically run them in this order (a small driver sketch follows the list):

  1. setupenv.py: Creates the objects that are needed by the global runtime environment
  2. launchinstance.py: Creates the regional objects, launches the instances, and waits for the instances to terminate
  3. download_data.py: Stores the run artifacts in a directory that you choose
  4. teardown.py: Terminates (or reterminates, if necessary) any instances that were launched and removes the regional objects
  5. teardownenv.py: Removes the global objects and associated artifacts that setupenv.py created
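
As a sketch only, the following driver runs those scripts in order with the OVERRIDES environment variable set. The script names come from the list above; the file layout and OVERRIDES handling are assumptions, so adapt the paths to the code you downloaded.

    # run_benchmark.py - hedged sketch of driving the example scripts in sequence
    import os
    import subprocess
    import sys

    os.environ["OVERRIDES"] = "my_overrides.py"   # point the scripts at your override file (assumption)

    steps = [
        "setupenv.py",        # create the global runtime objects
        "launchinstance.py",  # create regional objects, launch instances, wait for them to terminate
        "download_data.py",   # store the run artifacts in a local directory
        "teardown.py",        # terminate any remaining instances and remove regional objects
        "teardownenv.py",     # remove the global objects created by setupenv.py
    ]

    for script in steps:
        print(f"Running {script} ...")
        subprocess.run([sys.executable, script], check=True)  # stop immediately if any step fails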

We’re excited to help you understand the performance that you can expect from Oracle Cloud implementations, and this post describes a transparent methodology for measuring the performance that our Compute instances deliver.

We want you to experience the features and enterprise-grade capabilities that Oracle Cloud Infrastructure offers. It’s easy to try them out with our 30-day trial and our Always Free tier of services. For more information, see the Oracle Cloud Infrastructure Getting Started guide, Compute service overview, and Compute FAQ.

Sanjay Pillai

Director Product Management, OCI Compute Service

