Facebook is using GraalVM to accelerate its Spark workloads and reduce memory & CPU usage. Keep reading to learn about their migration story, performance improvement results, and future plans.
Technology behind Facebook
Java is used at Facebook in a few key areas, such as big data (Spark, Presto, etc.), backend services, and mobile. Before moving to GraalVM, the team used Oracle JDK and OpenJDK for Java 8 and Java 11.
At such scale, any performance improvements bring significant value — they improve user experience and reduce infrastructure costs. That is why the engineering team is always looking for ways to improve the performance of their applications, and decided to evaluate GraalVM to determine if it was a faster Java runtime.
- Performance. GraalVM offers advanced optimizations, such as partial escape analysis and improved inlining heuristics. Thanks to these, many Java/JVM applications see performance gains out of the box just by switching to GraalVM. As the Facebook team also observed, GraalVM shows significant year-over-year improvements over C2 on benchmarks such as SPECjvm2008 and DaCapo.
- Maintainability. The GraalVM compiler is written from scratch in Java, in a modular and extensible way. This makes it easy to maintain and to improve incrementally, which was important for Facebook, since the team considers GraalVM a long-term investment.
- Community. The GraalVM project has a vibrant open-source community, with many organizations and individuals contributing to the project and shaping its roadmap. It’s also easy to find help and support in the community.
The Facebook team used GraalVM as a replacement for OpenJDK. In this scenario, migration to GraalVM is very easy — it’s just a matter of switching the runtime, with no changes required to the application code. Such a transition makes applications run faster thanks to GraalVM’s advanced performance optimizations, without any manual tuning.
Apache Spark is a unified analytics engine for big data processing, with built-in modules for streaming, SQL, machine learning, and graph processing. It’s quite fast at processing data out of the box, yet many teams are looking for ways to optimize its performance even further. One of the easiest ways to do so is to run Spark workloads on GraalVM. Thanks to a set of specific compiler optimizations, which we’ll talk about more in a bit, GraalVM can significantly speed up Spark workloads. The Renaissance benchmark suite’s Apache Spark benchmarks show an average speedup of 1.1x for GraalVM Community Edition and 1.42x for GraalVM Enterprise Edition, with some benchmarks running up to 4.84x faster.
For Facebook, Spark is the largest SQL query engine in their data warehouse, running on aggregated compute storage clusters. Because of the huge amount of data, efficiency and cost reduction are big priorities.
They began their evaluation in early 2020. As initial benchmarks showed good results, the team rolled out GraalVM to production and kept monitoring performance and reliability.
Performance-wise, they observed about a 10% reduction in CPU usage, and this CPU reduction has been consistent ever since the rollout.
How GraalVM Accelerates Spark Workloads
Some of the optimizations that contributed most to the Spark performance improvement are:
- Polymorphic inlining. Traditional inlining only works if the compiler can determine the exact method that a call site targets. GraalVM enables inlining beyond this point by collecting additional profiling information that allows calls to abstract methods to be inlined as well.
- Partial escape analysis. The idea of partial escape analysis is to remove unnecessary object allocations by performing scalar replacement in branches where the object does not escape, while materializing the object on the heap only in branches where it does escape. This reduces both the memory footprint of the application and the CPU load incurred by garbage collection. Such an optimization is even more important in data-heavy applications like Spark; in particular, the Facebook team observed that GraalVM reduced CPU consumption in some hot methods by 5x.
- Advanced speculative optimizations. GraalVM produces faster machine code by taking advantage of dynamic runtime feedback. By speculating that certain parts of the program will not be run during the program’s execution, the GraalVM compiler is able to specialize the code and make it more efficient. For Spark, this optimization works particularly well by eliminating branches (such as long if-then-else chains), simplifying the control flow, reducing the number of dynamic checks in loop bodies, and establishing aliasing constraints, which enables further optimizations.
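To make the first optimization concrete, here is a minimal Java sketch of the kind of call site polymorphic inlining helps with: a virtual call whose profiled receivers are a small set of concrete types. All names below are hypothetical illustrations, not code from Spark or Facebook.

```java
// Illustrative sketch: a polymorphic call site with two hot receiver types.
interface Shape {
    double area();
}

final class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

final class Circle implements Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}

public class InliningDemo {
    // s.area() is a polymorphic call: the receiver may be a Square or a
    // Circle. Runtime profiling records which types actually occur here,
    // so the compiler can inline both implementations behind cheap type
    // guards instead of emitting a virtual dispatch on every iteration.
    static double totalArea(Shape[] shapes) {
        double total = 0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Square(2), new Circle(1) };
        System.out.println(totalArea(shapes));
    }
}
```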
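The allocation pattern that partial escape analysis targets can be sketched as follows. This is a hedged illustration with hypothetical names, not code from Spark: the object escapes only on a rarely taken branch, so on the hot path the compiler can replace the allocation with plain scalars.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of an allocation that partial escape analysis
// can eliminate on the hot path.
public class EscapeDemo {
    static final List<int[]> survivors = new ArrayList<>();

    static int distanceSquared(int x, int y, boolean keep) {
        // A fresh array is allocated on every call...
        int[] point = { x, y };
        if (keep) {
            // ...but it escapes only on this (rarely taken) branch,
            // where the compiler must materialize it on the heap.
            survivors.add(point);
        }
        // On the non-escaping path, scalar replacement turns the array
        // back into the plain locals x and y: no heap allocation at all,
        // and less work for the garbage collector.
        return point[0] * point[0] + point[1] * point[1];
    }

    public static void main(String[] args) {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += distanceSquared(i % 100, i % 7, i % 250_000 == 0);
        }
        System.out.println(sum + " (escaped: " + survivors.size() + ")");
    }
}
```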
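Finally, a small sketch of the branch elimination enabled by speculative optimization. The method and values are hypothetical; the point is the shape of the code, where a guard branch is never taken in practice.

```java
// Illustrative sketch of a branch that speculative optimization can remove.
public class SpeculationDemo {
    static int doublePositive(int value) {
        if (value < 0) {
            // If profiling shows this branch is never taken, the compiler
            // can speculate it away: it compiles only the fast path below
            // and installs a deoptimization point here. Should a negative
            // value ever appear, execution falls back to the interpreter
            // and the method is recompiled with the branch included.
            throw new IllegalArgumentException("negative: " + value);
        }
        return value * 2;
    }

    public static void main(String[] args) {
        int total = 0;
        for (int i = 0; i < 100_000; i++) {
            total += doublePositive(i % 10);
        }
        System.out.println(total);
    }
}
```

Removing such never-taken branches simplifies the control flow, which in turn lets the compiler hoist checks out of loop bodies and apply further optimizations, as described above.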
As a result of their evaluation, the Facebook team migrated most of their CPU-intensive big data services to GraalVM. They also observed a more than 5% CPU and GC pause improvement for Presto after switching to GraalVM. Next, the team plans to roll out GraalVM to other memory-bound services to benefit from the escape analysis optimizations. The team also plans to contribute back to the project and community.
They are also exploring opportunities to use other GraalVM features, such as Native Image and the Truffle framework.
GraalVM can significantly speed up many Java and Scala workloads thanks to its advanced compiler optimizations. In particular, Spark workloads can expect a speedup of around 10–42% just by switching to GraalVM as their JDK distribution.
Interestingly, a similar journey and similar observations were shared by engineers from another popular social media platform, Twitter. After moving their Scala workloads to GraalVM, they observed significant performance improvements thanks to the GraalVM compiler, such as a 19.9% reduction in P99 latency. For platforms like Twitter or Facebook, such performance improvements are multiplied even further by the scale of the platform.
To get started with GraalVM for your applications, go to graalvm.org/docs/getting-started/.