This blog was written in collaboration with Seema Mehta, senior principal product marketing manager at Ampere Computing.
Apache Cassandra is an open source NoSQL distributed database trusted by thousands of companies, including Apple, eBay, and Netflix. Apache Cassandra’s high performance, linear scalability, and proven fault tolerance on commodity hardware or cloud infrastructure make it a strong platform for mission-critical data.
The Ampere Altra processors are complete system-on-chip (SOC) solutions built for cloud native applications. Ampere Altra’s innovative architecture delivers high performance, linear scalability, and amazing energy efficiency. Ampere Altra allows workloads to run in a predictable way with minimal variance under increasing loads. This predictability enables industry-leading price-performance and a smaller footprint for real-world workloads, such as Cassandra.
Oracle Cloud Infrastructure (OCI) offers Ampere Altra compute shapes on the cloud native Ampere A1 platform. You can deploy the Ampere A1 platform as bare metal servers or flexible virtual machine (VM) shapes, giving customers full control of their entire cloud stack. The Ampere A1 VM shapes provide flexible sizing from 1–80 cores and 1–64 GB of memory per core with several key benefits, such as deterministic performance, linear scalability, and a secure architecture with the best price performance in the market.
In this blog, we compare OCI A1 Compute shapes powered by Ampere Altra processors to OCI E4 shapes powered by AMD EPYC 7763 processors running Cassandra, while measuring the throughput, latencies, and cost of running the Cassandra workload on each of these shapes.
OCI A1 Compute shapes powered by Arm-based Ampere Altra processors are designed to deliver exceptional performance for cloud native applications like Cassandra. Running Cassandra on Ampere Altra processors offers the following benefits:
Cloud native: Designed from the ground up for cloud customers, OCI A1 shapes are ideal for cloud native use cases, such as Cassandra.
Scalable: With a scale-out architecture, OCI A1 shapes powered by Ampere Altra processors combine a high core count with compelling single-threaded performance. All cores run at a consistent frequency of 3.0 GHz, which delivers greater performance at the socket level.
Power-efficient: Industry-leading energy efficiency allows OCI A1 shapes to reach competitive levels of raw performance while consuming much less power than the x86 offerings.
Price-performance: Utilizing Ampere’s low-power design and OCI’s high-performance infrastructure, Ampere A1 shapes offer the best price-performance in the cloud.
For benchmark testing, we ran Cassandra on the following configuration:
Client: 1 OCI A1 VM, 32 OCPU, 32-GB RAM.
OCI A1: 1 OCI Ampere A1 VM, 64 OCPUs (64 threads), 512-GB RAM, 2x 2,048-TB disks with RAID 0 for data, 2x 2,048-TB disks with RAID 0 for commit log.
OCI E4: 1 OCI AMD EPYC Milan E4 VM, 32 OCPUs (64 threads), 512-GB RAM, 2x 2,048-TB disks with RAID 0 for data, 2x 2,048-TB disks with RAID 0 for commit log.
Operating system: Oracle Linux 9.
We used Cassandra version 4.0.6 and Java Development Kit (JDK) version 15. We recommend compiling Cassandra with JDK 15 or newer (itself built with GCC 10.2 or newer and the right flags), because newer Java versions have made significant progress toward generating optimized code that can improve performance for AArch64 applications. We used G1GC as the Java garbage collector, with appropriate memory and thread settings for the Java virtual machine (JVM).
We performed this test using cassandra-stress as the load generator for benchmarking Cassandra. Each test was configured to run for three minutes with multiple threads and multiple clients.
Because throughput measured under a specified service level agreement (SLA) is a realistic indicator of performance, we set the SLA at a 99th percentile latency (p.99) of 5 milliseconds. This limit ensured that 99 percent of the requests had a response time of at most 5 ms.
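As a rough illustration of what this SLA check means, the following Python sketch computes a nearest-rank 99th percentile over a list of latency samples and compares it with the 5 ms limit. The function names and the sample data are hypothetical and are not taken from the benchmark harness; cassandra-stress reports its own percentiles, which may be computed differently.

```python
def p99(latencies_ms):
    """Return the 99th percentile latency using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Nearest rank (1-based) is ceil(0.99 * n), computed with integers.
    rank = -(-99 * len(ordered) // 100)
    return ordered[rank - 1]


def meets_sla(latencies_ms, limit_ms=5.0):
    """True when the p.99 latency is at most the SLA limit."""
    return p99(latencies_ms) <= limit_ms


# Hypothetical samples: 990 fast requests and 10 slow outliers.
samples = [2.0] * 990 + [9.0] * 10
print(p99(samples), meets_sla(samples))
```

With these samples, the slowest 1 percent of requests exceed 5 ms, yet the SLA still holds, because only the 99th percentile is constrained.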
The test ran for three minutes, with warmup, using a mix of 90% writes and 10% reads. This mix represents a critical usage pattern for Cassandra, which is optimized for write operations. The test initially used an appropriate number of clients and threads to load one instance of Cassandra, while ensuring that the p.99 latency was at most 5 ms.
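The exact cassandra-stress invocation isn't given in this post. As a hedged sketch, a 90/10 write/read mix running for three minutes could be expressed as follows. The node address and thread count are placeholder assumptions, and the helper function is hypothetical.

```python
import shlex


def build_stress_command(node, threads, write_parts=9, read_parts=1, duration="3m"):
    """Build a cassandra-stress argument list for a mixed read/write workload."""
    return [
        "cassandra-stress",
        "mixed",
        # 9 parts write to 1 part read gives the 90% / 10% mix.
        f"ratio(write={write_parts},read={read_parts})",
        f"duration={duration}",
        "-rate", f"threads={threads}",
        "-node", node,
    ]


# Placeholder values, not the ones used in the benchmark.
cmd = build_stress_command(node="10.0.0.10", threads=64)
print(shlex.join(cmd))
```

Passing the list directly to a process launcher avoids shell quoting problems with the parentheses in the ratio argument.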
Next, we successively increased the number of Cassandra instances until one or more instances violated the p.99 latency SLA. We used the aggregate throughput of all instances as the primary performance metric. We ran the test three times and observed minimal run-to-run variation.
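The scaling procedure can be sketched in Python as a simple loop. This is an illustration under stated assumptions, not the harness we used: the `measure` callback, the `fake_measure` model, and the instance cap are hypothetical, and the sketch assumes that the reported figure is the aggregate throughput of the last configuration in which every instance met the SLA.

```python
def find_max_throughput(measure, sla_ms=5, max_instances=16):
    """Add instances one at a time until any instance breaks the p.99 SLA.

    measure(n) must return one (ops_per_sec, p99_ms) tuple per instance.
    Returns (instances, aggregate_ops) for the last compliant configuration.
    """
    best = (0, 0)
    for n in range(1, max_instances + 1):
        results = measure(n)
        if any(p99_ms > sla_ms for _, p99_ms in results):
            break  # at least one instance violated the SLA
        best = (n, sum(ops for ops, _ in results))
    return best


def fake_measure(n):
    # Hypothetical model: each instance sustains 10,000 ops/s and
    # p.99 latency grows by 1 ms for every instance added.
    return [(10_000, 2 + n)] * n


print(find_max_throughput(fake_measure))
```

In this toy model, three instances sit exactly at the 5 ms limit and a fourth pushes latency past it, so the loop stops and reports the three-instance aggregate.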
For throughput, OCI Ampere A1 demonstrated 9–17% better throughput than OCI E4. In the following graphic, higher is better.
For latency, OCI Ampere A1 demonstrated up to 40% lower latency than OCI E4 powered by third-generation AMD EPYC processors. In the graphic, lower is better.
For total cost of ownership (TCO), we calculated the annual cost of a Cassandra database running on OCI E4 powered by AMD EPYC to be $24M per 1 PB of data. In comparison, the annual cost of running the workload on Ampere A1 was $17M per 1 PB of data. Customers can save $7M per PB of data annually by running their Cassandra workloads on A1, which is roughly a 30% lower TCO.
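The savings figure follows directly from the two annual costs. A short Python check of the arithmetic, using only the numbers quoted above (the variable names are ours):

```python
e4_cost_musd = 24  # annual cost per PB on OCI E4, in millions of USD
a1_cost_musd = 17  # annual cost per PB on OCI A1, in millions of USD

savings_musd = e4_cost_musd - a1_cost_musd
savings_pct = savings_musd / e4_cost_musd * 100

print(f"Savings: ${savings_musd}M per PB per year ({savings_pct:.1f}% lower TCO)")
```

The exact ratio is 7/24, or about 29.2%, which the text rounds to 30%.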
Distributed NoSQL databases, such as Cassandra, manage large volumes of data with ease and scalability and are popular in cloud deployments. Our tests showed that OCI Ampere A1 instances powered by Ampere Altra processors provide higher throughput, lower latencies, and lower TCO for running Cassandra database workloads.
With Cassandra running on OCI Ampere A1, we observed up to 17% higher throughput, up to 40% lower latency, and about 30% lower TCO compared with OCI E4 powered by third-generation AMD EPYC processors.
For cloud application developers, choosing Ampere Altra-based VMs on Oracle Cloud Infrastructure means higher performance and better price-performance, while reducing your carbon footprint.
For more information, see the following resources: