Apache Spark—Lightning fast on GraalVM Enterprise

May 5, 2020 | 6 minute read
Photo by israel palacio on Unsplash

Apache Spark and Oracle GraalVM Enterprise

Apache Spark is an open-source unified analytics engine for large-scale data processing, machine learning, streaming, and graph processing. It is written in Java and Scala and also provides bindings for a number of other programming languages, including Python, R, and SQL.

Oracle GraalVM Enterprise is a high-performance runtime that provides significant improvements in application performance and efficiency. Built on Oracle Java SE, it can run applications written in Java and other JVM-based languages (e.g., Scala, Groovy, Kotlin, Clojure), JavaScript, Python, Ruby, R, and LLVM-based languages such as C and C++.

Because Apache Spark excels at processing large amounts of data, performance is critical. Apache Spark offers many sophisticated performance tuning options and techniques, but one of the easiest ways to improve overall performance is to run it on GraalVM Enterprise. The Renaissance benchmark suite’s Apache Spark benchmarks show an average speedup of 1.6x, with some benchmarks running almost 5x faster. Let's take a look at the types of speedups you can expect when running Apache Spark on GraalVM Enterprise.

Benefits of running Apache Spark on GraalVM Enterprise

Apache Spark execution times are greatly reduced when running on GraalVM Enterprise for several reasons. First, the high level of abstraction found in Scala code leaves a great deal of room for optimization, and GraalVM Enterprise’s advanced compiler takes advantage of it. Thanks to optimizations such as path duplication [1], optimization-driven inlining [2], and advanced speculations [3], GraalVM Enterprise’s just-in-time compiler improves the peak performance of Apache Spark applications, reducing execution time.

Second, GraalVM Enterprise’s optimizing compiler reduces garbage collection overhead. Optimizations such as partial escape analysis [4] can greatly reduce the number of objects allocated in an application, thereby reducing both the memory footprint of the application and the CPU load incurred by the garbage collector. This is important for Apache Spark because typical Big Data workloads tend to allocate a large number of objects.

Finally, GraalVM Enterprise can run multiple languages without the inter-language call cost that is normally paid when two different languages, such as Python and Java, call each other’s functions. This is especially beneficial in the context of Apache Spark and data processing, since Apache Spark is implemented in Scala and many useful data-processing utilities are implemented in analytics languages such as Python and R.
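To make the second point more concrete, here is a minimal Java sketch of the kind of code shape that partial escape analysis [4] targets. All class and method names are hypothetical and are not taken from Apache Spark or GraalVM. The key object is allocated on every call, but it only escapes the method on the slow path, so a compiler that performs partial escape analysis may be able to avoid the allocation on the fast path. Whether it actually does so depends on inlining and profiling decisions, so treat this as an illustration rather than a guarantee.

```java
// Illustrative sketch only; all names here are hypothetical.
final class PartialEscapeSketch {

    // Small immutable key, the kind of short-lived object that
    // high-level code allocates freely.
    static final class Key {
        final int id;
        final String name;

        Key(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return id == other.id && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            return 31 * id + name.hashCode();
        }
    }

    private static Key lastKey;
    private static String lastValue;
    static int computeCalls; // counts slow-path executions

    static String lookup(int id, String name) {
        Key key = new Key(id, name);   // allocated on every call in the source
        if (key.equals(lastKey)) {
            return lastValue;          // fast path: key never leaves this method
        }
        lastKey = key;                 // slow path: key escapes into a static field
        lastValue = compute(id, name);
        return lastValue;
    }

    private static String compute(int id, String name) {
        computeCalls++;
        return name + "#" + id;
    }

    public static void main(String[] args) {
        System.out.println(lookup(1, "als"));   // miss: takes the slow path
        System.out.println(lookup(1, "als"));   // hit: takes the fast path
        System.out.println("slow-path calls: " + computeCalls);
    }
}
```

In a workload where most lookups hit the fast path, removing that allocation means fewer objects for the garbage collector to track, which is the effect described above.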

Performance Comparison

Let’s compare Apache Spark’s performance using OpenJDK and GraalVM Enterprise on eight benchmarks from the Renaissance suite [5]: als, chi-square, dec-tree, gauss-mix, log-regression, movie-lens, naive-bayes, and page-rank. These benchmarks represent typical Big Data processing and machine-learning algorithms, and they use Apache Spark’s Resilient Distributed Datasets (RDDs) and its Machine Learning Library (MLlib).

Peak Performance

Peak performance is the performance an application reaches after the JVM has warmed up and the just-in-time compiler has generated fully optimized machine code for it. The following graphs compare the peak performance of OpenJDK and GraalVM Enterprise on JDK 8 and JDK 11. In each graph, OpenJDK is the baseline and the y-axis shows the normalized running time, so lower is better.

Figure 1: Apache Spark benchmarks—GraalVM Enterprise (JDK 8) vs. OpenJDK 8

Figure 1 shows peak performance on JDK 8. GraalVM Enterprise achieves higher peak performance than OpenJDK 8 on all of these benchmarks. Performance improvements with GraalVM Enterprise typically range from 10% to 50%, and the mean speedup is 1.6x. In some cases, such as naive-bayes, GraalVM Enterprise outperforms OpenJDK 8 by almost 5x.

Figure 2: Apache Spark benchmarks—GraalVM Enterprise (JDK 11) vs. OpenJDK 11

Figure 2 shows the peak performance comparison on JDK 11 for the same set of benchmarks. Again, GraalVM Enterprise offers significant performance improvements over OpenJDK 11, with a mean speedup of around 1.5x.
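The terms used in this section (warm-up, peak performance, normalized running time, speedup) can be illustrated with a small Java sketch. It is not the Renaissance harness and was not used to produce the figures. The workload, the method names, and the choice of averaging the last few repetitions are all assumptions made for illustration.

```java
// Illustrative sketch only; all names here are hypothetical.
final class PeakPerformanceSketch {

    // Stand-in workload; a real measurement would run a benchmark iteration.
    static long workload(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += ((long) i * i) % 7;
        }
        return sum;
    }

    // Times each repetition separately so that warm-up and steady state
    // can be told apart.
    static long[] timeRepetitions(int repetitions, int n) {
        long[] nanos = new long[repetitions];
        long sink = 0;
        for (int r = 0; r < repetitions; r++) {
            long start = System.nanoTime();
            sink += workload(n);
            nanos[r] = System.nanoTime() - start;
        }
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // keeps the computed result observable
        }
        return nanos;
    }

    // Mean of the last `count` samples, used here as a rough proxy for
    // peak performance.
    static double meanOfLast(long[] samples, int count) {
        double total = 0;
        for (int i = samples.length - count; i < samples.length; i++) {
            total += samples[i];
        }
        return total / count;
    }

    // Running time relative to a baseline: below 1.0 means faster than baseline.
    static double normalizedTime(double candidate, double baseline) {
        return candidate / baseline;
    }

    // Speedup is the reciprocal of the normalized running time.
    static double speedup(double normalizedTime) {
        return 1.0 / normalizedTime;
    }

    public static void main(String[] args) {
        long[] nanos = timeRepetitions(50, 5_000_000);
        System.out.printf("first repetition: %d ns%n", nanos[0]);
        System.out.printf("mean of last 10:  %.0f ns%n", meanOfLast(nanos, 10));
    }
}
```

Read this way, a bar at 0.625 on a normalized-time axis corresponds to a speedup of 1.6x, because 1 / 0.625 = 1.6.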

Memory consumption and ergonomics

Thanks to advanced memory-allocation optimizations, such as partial escape analysis and scalar replacement [4], GraalVM Enterprise can optimize away many object allocations. Consequently, an application running on GraalVM Enterprise spends less time on automatic memory management and garbage collection. From a practical perspective, this means that an application running on GraalVM Enterprise can achieve better performance with less memory, which is particularly important in cloud environments.

Figure 3: Performance at different memory levels on naive-bayes (JDK 8)

To illustrate this, consider Figure 3, which compares performance on the naive-bayes benchmark on JDK 8 for different maximum JVM heap sizes (controlled with the -Xmx option). The y-axis shows the execution time in milliseconds, so lower is better. On this benchmark, GraalVM Enterprise achieves better performance than OpenJDK regardless of the available memory. For example, GraalVM Enterprise achieves the same performance with only 0.75 GB as OpenJDK 8 does with 4 GB of memory.
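When repeating this kind of experiment, it can help to confirm that the -Xmx setting actually reached the JVM. The following Java sketch prints the maximum heap size that the running JVM reports through the standard Runtime.maxMemory() method. The class and helper names are hypothetical, and the reported value can be slightly lower than the -Xmx setting depending on the garbage collector in use.

```java
// Illustrative sketch only; the class and helper names are hypothetical.
final class HeapSizeSketch {

    private static final double BYTES_PER_GIB = 1024.0 * 1024.0 * 1024.0;

    // Converts a byte count to gibibytes for easier reading.
    static double bytesToGiB(long bytes) {
        return bytes / BYTES_PER_GIB;
    }

    public static void main(String[] args) {
        // Maximum amount of memory the JVM will attempt to use for the heap.
        long maxHeapBytes = Runtime.getRuntime().maxMemory();
        System.out.printf("Max heap: %.2f GiB%n", bytesToGiB(maxHeapBytes));
    }
}
```

For example, on JDK 11 or later the file can be run directly with a command such as `java -Xmx4g HeapSizeSketch.java`, and the printed value should be close to 4 GiB.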

Figure 4 compares the application execution time of OpenJDK 11 and GraalVM Enterprise for different memory sizes. The default garbage collector on JDK 11 is the G1 collector [6], which generally requires more memory than the Parallel Collector [7], the default on JDK 8. We can observe a similar effect on JDK 11 with the G1 collector: GraalVM Enterprise achieves the same performance with 6 GB of memory as OpenJDK 11 does with 10 GB.

Figure 4: Performance at different memory levels on naive-bayes (JDK 11)

A similar effect can be observed on the chi-square benchmark, as shown in Figure 5, and on the log-regression benchmark, as shown in Figure 6. In both cases, GraalVM Enterprise achieves the same performance with less memory.

Figure 5: Performance at different memory levels on chi-square (JDK 8)

Figure 6: Performance at different memory levels on log-regression (JDK 8)

Conclusion

As you can see, for a range of benchmarks Apache Spark consumes less memory and runs faster on GraalVM Enterprise than on OpenJDK. If you’re running Apache Spark in production and would like to use fewer compute resources while improving throughput, it’s as easy as switching to GraalVM Enterprise. To try it out for yourself, simply install GraalVM Enterprise, set your JAVA_HOME, and go! Download GraalVM Enterprise today: http://oracle.com/graalvm.
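After setting JAVA_HOME, a quick way to check which JVM is actually being used is to print a few standard system properties. The sketch below is hypothetical helper code, not part of GraalVM or Apache Spark. The check for the string "GraalVM" is a heuristic based on an assumption about how GraalVM builds identify themselves, so compare the printed values against your installation rather than relying on it.

```java
// Illustrative sketch only; the class and method names are hypothetical.
final class JvmIdentitySketch {

    // Standard system properties that describe the running JVM.
    // java.vendor.version is not set on JDK 8 and prints as null there.
    private static final String[] KEYS = {
        "java.home", "java.version", "java.vm.name",
        "java.vm.vendor", "java.vendor.version"
    };

    static String describe() {
        StringBuilder sb = new StringBuilder();
        for (String key : KEYS) {
            sb.append(key).append('=')
              .append(System.getProperty(key))
              .append(System.lineSeparator());
        }
        return sb.toString();
    }

    // Heuristic (assumption): treats the JVM as GraalVM if any of the
    // given strings mentions "GraalVM".
    static boolean mentionsGraalVm(String... values) {
        for (String value : values) {
            if (value != null && value.contains("GraalVM")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.print(describe());
        boolean graal = mentionsGraalVm(
            System.getProperty("java.vm.name"),
            System.getProperty("java.vendor.version"));
        System.out.println("Mentions GraalVM: " + graal);
    }
}
```

On JDK 11 or later, this can be run with the java launcher found under JAVA_HOME, for example `$JAVA_HOME/bin/java JvmIdentitySketch.java`. On JDK 8, compile it with javac first.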

In a future post, we’ll explore GraalVM Enterprise’s multi-language support, which enables calling methods in data-analytics languages, such as R and Python, from within Spark’s data-processing operations.

References

[1] Dominance-based duplication simulation (DBDS): code duplication to enable compiler optimizations, David Leopoldseder, Lukas Stadler, Thomas Würthinger, Josef Eisl, Doug Simon, Hanspeter Mössenböck
[2] An Optimization-Driven Incremental Inline Substitution Algorithm for Just-in-Time Compilers, Aleksandar Prokopec, Gilles Duboscq, David Leopoldseder, Thomas Würthinger
[3] Speculation without regret: reducing deoptimization meta-data in the Graal compiler, Gilles Duboscq, Thomas Würthinger, Hanspeter Mössenböck
[4] Partial Escape Analysis and Scalar Replacement for Java, Lukas Stadler, Thomas Würthinger, Hanspeter Mössenböck
[5] Renaissance Benchmark Suite, renaissance.dev
[6] Getting Started with the G1 Garbage Collector, https://www.oracle.com/technetwork/tutorials/tutorials-1876574.html
[7] The Parallel Collector, https://docs.oracle.com/javase/9/gctuning/parallel-collector1.htm

Shaun Smith

Shaun Smith leads product management of Oracle Labs' GraalVM.

