In my previous post, "Yes, Database Performance Matters," I talked about the people I met at Collaborate, and how nearly everyone agreed that Oracle Exadata performance is impressive. Every now and then, though, I run into someone who concedes the performance is impressive, but believes they can achieve the same thing with a build-your-own solution. On that one, I have to disagree...
There are a great many performance-enhancing features, not just bolted on, but deeply engineered into Exadata. Some have a larger impact than others, but collectively they are the secret sauce that lets Exadata deliver extreme performance. Let's start with its scale-out architecture. As you add compute servers and storage servers, you grow the overall CPU, IO, storage, and network capacity of the machine. As you grow a machine from the smallest 1/8th rack to the largest multi-rack configuration, performance scales linearly.
Key to scaling compute nodes is Oracle Real Application Clusters (RAC), which allows a single database workload to scale across multiple servers. While RAC is not unique to Exadata, many performance enhancements have been made to RAC's communication protocols specifically for Exadata, making Exadata the most efficient platform for scaling RAC across server nodes.
Servers are connected by a high-bandwidth, low-latency 40 Gb per second InfiniBand network. Exadata runs specialized database networking protocols using Remote Direct Memory Access (RDMA) to take full advantage of this infrastructure, providing much lower latency and higher bandwidth than would be possible in a build-your-own environment. Exadata also understands the importance of the traffic on the network, and can prioritize critical packets. This, of course, has a direct impact on the overall performance of the databases running on the machine.
It’s common knowledge that IO is often the bottleneck in a database system. Exadata has impressive IO capabilities. I’m not going to overwhelm you with numbers, but if you are curious, check out the Exadata data sheet for a full set of specifications. More interesting is how Exadata provides extreme IO. The most obvious technique is to use plenty of flash memory. Exadata storage cells can be fully loaded with NVMe flash, providing extreme IOPS and throughput for any database read or write operation. This flash is placed directly on the PCI bus, not behind bottlenecking storage controllers. Perhaps surprisingly, most customers do not opt for all-flash storage. Rather, they choose a lesser (read that as less expensive) flash configuration backed by high-capacity HDDs. The flash provides an intelligent cache, buffering most latency-sensitive IO operations. The net result is the storage economics of HDDs, with the effective performance of NVMe flash.
You might be wondering how flash can be a differentiator for Exadata. After all, many vendors sell all-flash arrays, or front-end caches in front of HDDs. The key is understanding the database workload. Only Exadata understands the difference between a latency-sensitive write of a commit record to a redo log, and an asynchronous database file update. Exadata knows to cache database blocks that are very likely to be read or updated repeatedly, but not to cache IO from a database backup or large table scan that will never be re-read. Exadata provides special handling for log writes, using a unique algorithm that reduces the latency of these critical writes and avoids the latency spikes common in other flash solutions. Exadata can even store cached data in an optimized columnar format, speeding analytical operations that need only access a subset of columns. These features require the storage server to work in concert with the database server, something no generic storage array can do.
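To make the idea concrete, here is a toy sketch (in Python, my own simplification, not Oracle code) of an IO-type-aware cache admission policy. The IO kinds and fields are hypothetical; the point is that the storage tier can use database context, which a generic array never sees, to decide what deserves flash.

```python
# Toy sketch (not Oracle's implementation): admit IO to the flash cache
# based on what the database says the IO *is*, not just its size or offset.

def should_cache(io):
    """Decide whether an IO belongs in the flash cache.

    `io` is a dict with hypothetical fields:
      'kind': 'redo_write', 'backup', 'full_scan', 'block_read', 'block_write'
    """
    # Redo log writes are latency-critical commits: serve them from flash.
    if io["kind"] == "redo_write":
        return True
    # Backups and large table scans are read once; caching them would only
    # evict blocks that will actually be re-read.
    if io["kind"] in ("backup", "full_scan"):
        return False
    # Ordinary block reads/writes are likely to be touched again soon.
    return io["kind"] in ("block_read", "block_write")
```

A generic cache, lacking the `kind` signal, would treat a backup stream and a hot index block identically.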
Flash is fast, but there is only so much you can solve with flash. You still need to get the data from the storage to the database instance, and storage interconnect technologies have not kept up with the rapid rise in the database server’s ability to consume data. To eliminate the interconnect as a potential bottleneck, Exadata uses its unique Smart Scan technology to offload data-intensive SQL operations from the database servers directly to the storage servers. This parallel data filtering and processing dramatically reduces the amount of data that needs to be returned to the database servers, correspondingly increasing the overall effective IO and processing capabilities of the system.
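The core idea of offloaded filtering can be sketched in a few lines of Python. This is a toy model, not the real Smart Scan protocol: the storage side applies the query's predicate and column projection before shipping anything back, so the interconnect carries only matching data.

```python
# Toy sketch: filter and project at the storage tier, so only relevant
# rows and columns cross the interconnect to the database server.

def smart_scan(rows, predicate, columns):
    """Return only the requested columns of rows matching the predicate."""
    return [{c: row[c] for c in columns} for row in rows if predicate(row)]

# Hypothetical data: a four-row chunk of an orders table on a storage cell.
rows = [
    {"id": 1, "amount": 50,  "region": "EU"},
    {"id": 2, "amount": 900, "region": "US"},
    {"id": 3, "amount": 20,  "region": "US"},
    {"id": 4, "amount": 700, "region": "EU"},
]

# Only 2 of 4 rows, and 1 of 3 columns, are returned to the database server.
hits = smart_scan(rows, lambda r: r["amount"] > 100, ["id"])
```

Without offload, all four full rows would travel the interconnect and be filtered on the database server's CPU instead.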
Exadata’s intelligent storage further improves processing by tracking summary information for data stored in regions of each storage cell. Using this information, the storage cell can determine whether relevant data may even exist in a region of storage, avoiding unnecessarily reading and filtering that data. These fast in-memory lookups eliminate large numbers of slow HDD IO operations, dramatically speeding database operations.
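The region-summary technique above can be illustrated with a small sketch (my own simplification, not Oracle's actual implementation): keep a tiny min/max summary per storage region, and skip any region whose summary proves no row can match the predicate.

```python
# Toy sketch of region pruning: in-memory (min, max) summaries let the
# storage cell avoid reading regions that cannot contain matching data.

def build_region_index(regions, column):
    # One small (min, max) pair per region, kept in memory.
    return [(min(row[column] for row in region),
             max(row[column] for row in region))
            for region in regions]

def scan_with_index(regions, index, column, lo, hi):
    """Return rows with lo <= value <= hi, reading only candidate regions."""
    matches, regions_read = [], 0
    for region, (rmin, rmax) in zip(regions, index):
        if rmax < lo or rmin > hi:
            continue  # summary rules this region out: skip the slow read
        regions_read += 1
        matches.extend(row for row in region if lo <= row[column] <= hi)
    return matches, regions_read

# Hypothetical data: three regions of two rows each.
regions = [[{"v": 1}, {"v": 5}],
           [{"v": 10}, {"v": 20}],
           [{"v": 100}, {"v": 200}]]
index = build_region_index(regions, "v")

# The query range only overlaps the middle region; the other two are pruned.
matches, regions_read = scan_with_index(regions, index, "v", 9, 15)
```

In the real system the pruned reads are slow HDD IOs, which is where the dramatic speedup comes from.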
While you can run the Oracle database on many different platforms, not all features are available on all platforms. When run on Exadata, Oracle Database supports Hybrid Columnar Compression (HCC), which stores data in an optimized combination of row and columnar methods, yielding the compression benefits of columnar storage while avoiding the performance issues typically associated with it. While compression reduces disk IO, it traditionally hurts performance, as substantial CPU is consumed by decompression. Exadata offloads that work to the storage cells, and once you account for the savings in IO, most analytic workloads run faster with HCC than without.
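Why does columnar organization compress so well? A toy illustration (this is not HCC itself, just the underlying intuition): run-length encoding a column directly exploits value locality that row-major storage hides, because each row is unique even when individual columns are highly repetitive.

```python
# Toy illustration: the same data compresses far better column-by-column
# than row-by-row, because similar values sit next to each other.

def rle(values):
    """Run-length encode a sequence into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs

# Hypothetical table: 1000 orders. Every row is unique, but the country
# column contains long runs of identical values.
rows = [(f"order-{i}", "US" if i < 500 else "DE") for i in range(1000)]

row_runs = rle(rows)                    # 1000 runs: rows never repeat
country_runs = rle(r[1] for r in rows)  # 2 runs: 500x "US", 500x "DE"
```

HCC's actual formats are more sophisticated, but the same principle applies: columnar layout exposes redundancy that row storage interleaves away.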
Perhaps there is no better testimonial to Exadata’s performance than real-world examples. Four of the top five banks, telcos, and retailers run on Exadata. For example, Target consolidated databases from over 350 systems onto Exadata. They now enjoy a 300% performance improvement and 5x faster batch and SQL processing. This has enabled them to extend their ship-from-store option for Target.com to over 1,000 stores, allowing customers to get their orders sooner than before.
I’ve really just breezed over 10 years of performance advancements. Those interested can find more detail in the Exadata data sheet. Hopefully, you can see why it would be impossible to get the same performance from a self-built Exadata-like system. When it comes to database performance, only deep engineering can deliver extreme results.
This is the third blog in a series of blog posts celebrating the 10th anniversary of the introduction of Oracle Exadata. Our next post, "Oracle Exadata Availability," will focus on high availability.
Bob Thome is a Vice President at Oracle responsible for product management for Database Engineered Systems and Cloud Services, including Exadata, Exadata Cloud Service, Exadata Cloud at Customer, RAC on OCI-C, VM DB (RAC and SI) on OCI, and Oracle Database Appliance. He has over 30 years of experience working in the Information Technology industry. With experience in both hardware and software companies, he has managed databases, clusters, systems, and support services. He has been at Oracle for 20 years, where he has been responsible for high availability, information integration, clustering, and storage management technologies for the database. For the past several years, he has directed product management for Oracle Database Engineered Systems and related database cloud technologies, including Oracle Exadata, Oracle Exadata Cloud Service, Oracle Exadata Cloud at Customer, Oracle Database Appliance, and Oracle Database Cloud Service.