Persistent Memory Magic in Exadata uses Intel Optane Persistent Memory (PMEM), a memory technology that software can use in the same way it uses DRAM, but whose contents can be made persistent like Flash or disk storage. There are some critical aspects of exactly how data gets persisted into PMEM, and some specific operations inside the Oracle Database that benefit from the performance gains it brings. The challenge for the Exadata Development team was to realize the performance benefits of PMEM while still ensuring data integrity and database availability. The magic of PMEM in Exadata is a story of how Oracle Developers took advantage of its speed and data persistence capabilities through deep integration into the Oracle Database software.
Oracle Database uses RDMA over Converged Ethernet with PMEM on Exadata
While PMEM is a critical piece of the technology puzzle, the Exadata Development team determined that RDMA was another critical component that needed to be used in conjunction with PMEM to deliver the fastest database performance yet. Oracle Database uses Remote Direct Memory Access over Converged Ethernet (RoCE) to access data cached in Intel Optane Persistent Memory (PMEM) on the storage side of Exadata. Converged Ethernet is the networking layer, but the ability to use Remote Direct Memory Access (RDMA) over Converged Ethernet allows Exadata to realize the full performance potential of PMEM.
The Oracle Database software drives how PMEM is used in Exadata. Exadata software and Oracle Database software use PMEM as a storage-side cache, and software makes PMEM redundant to protect data. This high degree of integration between the Oracle Database software, Exadata software, and Exadata hardware is the latest example that shows what makes Exadata uniquely different from other architectures used in Cloud and on-premises environments. To understand how PMEM is used inside of Exadata, we first need to look at how PMEM would work in Exadata without RDMA.
Oracle Database software normally accesses storage over the internal Exadata network, using layers of code running in Exadata Database Servers and inside of Exadata Storage Servers to handle communication back and forth. PMEM is so fast (in the microsecond range) that the layers of code in the I/O path have become the barrier to improving performance beyond a certain threshold. We get excellent performance in Exadata with PMEM alone, but those layers of code slow the response time and block further improvements.
Exadata reached the point where tens of microseconds was the new challenge
It’s important to understand that the magic of PMEM in Exadata begins with how data is cached in the storage tier and what data gets cached. Exadata intelligently caches warm data in Flash, and the hottest data is cached in PMEM based on deep integration with Oracle Database caching and data access patterns. The hottest blocks of data are the ones behind the highest read IOPS (Input/Output Operations Per Second) incurred by the database, and they are accessed through single block read operations. These can be seen in Oracle Automatic Workload Repository (AWR) reports under the wait event “db file sequential read” for non-Exadata systems, and “cell single block physical read” on Exadata systems. Databases experiencing high read IOPS rates will show high values for these specific events.
Exadata caches warm data in Flash and the hottest data in PMEM
By caching the Oracle Database blocks that are specifically involved in high numbers of I/O operations, Exadata makes more effective use of the available PMEM capacity. Were it not for this integration with the Oracle Database software and data caching algorithms, a much larger amount of PMEM would be required. In other words, the way Exadata uses PMEM increases the effective capacity of PMEM by at least 10X, or one order of magnitude. Conventional storage subsystems (or server-side storage) simply cannot distinguish the database wait events attributed to reading specific database blocks, so more space would be required to achieve the same performance results. The end-to-end read latency using PMEM as a storage-side cache in Exadata is extremely fast at around 100 microseconds. This is excellent performance, but Exadata goes beyond this by using Remote Direct Memory Access (RDMA) in conjunction with PMEM.
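To make the warm-versus-hottest idea concrete, here is a minimal toy sketch in Python. It is not how Exadata decides what to cache: the class name `TieredCache`, the slot counts, and the rule of ranking blocks purely by single block read count are all assumptions made for illustration.

```python
from collections import Counter


class TieredCache:
    """Toy model: the most-read blocks map to 'pmem', the next to 'flash'."""

    def __init__(self, pmem_slots, flash_slots):
        self.pmem_slots = pmem_slots      # hypothetical PMEM capacity, in blocks
        self.flash_slots = flash_slots    # hypothetical Flash capacity, in blocks
        self.reads = Counter()            # single block read count per block id

    def read(self, block_id):
        """Return the tier that would serve this read, then count the read."""
        tier = self.tier_of(block_id)
        self.reads[block_id] += 1
        return tier

    def tier_of(self, block_id):
        """Rank blocks by read count (ties broken by block id) and pick a tier."""
        if block_id not in self.reads:
            return "disk"
        ranked = sorted(self.reads, key=lambda b: (-self.reads[b], b))
        rank = ranked.index(block_id)
        if rank < self.pmem_slots:
            return "pmem"
        if rank < self.pmem_slots + self.flash_slots:
            return "flash"
        return "disk"


cache = TieredCache(pmem_slots=1, flash_slots=1)
for block in [7, 7, 7, 3, 3, 9]:
    cache.read(block)

# Block 7 is read most often, block 3 next, block 9 least.
print(cache.tier_of(7), cache.tier_of(3), cache.tier_of(9))  # pmem flash disk
```

The point of the toy is only that a cache which knows which blocks drive the most single block reads can get by with a very small top tier.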
Oracle Database on Exadata uses RDMA to bypass layers of code
This new feature combining RDMA and PMEM is called the Exadata Persistent Memory Data Accelerator, which makes single block reads (“db file sequential read” or “cell single block physical read”) even faster. RDMA allows the C/C++ source code of the Oracle Database to use PMEM residing in Exadata storage cells as if that memory were part of the database server itself. RDMA calls in the Oracle Database software access PMEM directly, without intervening layers of software, context switches, or network communication layers. Single block read latency is reduced dramatically, dropping from the 100-microsecond range to less than 19 microseconds.
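The two latency figures above are easier to compare with a little arithmetic. The sketch below takes 100 microseconds and 19 microseconds as inputs and assumes, purely for illustration, a single session issuing reads strictly one after another; the function names are hypothetical, and real systems issue many reads concurrently.

```python
def speedup(before_us, after_us):
    """Ratio of the old latency to the new latency."""
    return before_us / after_us


def serial_reads_per_second(latency_us):
    """Reads per second for one session issuing reads back to back."""
    return 1_000_000 / latency_us


before_us = 100.0  # approximate latency with PMEM cache alone (from the text)
after_us = 19.0    # upper bound quoted for PMEM with RDMA (from the text)

print(f"speedup: at least {speedup(before_us, after_us):.2f}x")
print(f"serial reads/s before: {serial_reads_per_second(before_us):.0f}")
print(f"serial reads/s after:  {serial_reads_per_second(after_us):.0f}")
```

Because the text says "less than 19 microseconds", the ratio of roughly 5.3x is a lower bound under these assumptions.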
While performance is excellent on Exadata, protection against failures is equally important. Exadata protects data stored in PMEM by writing to Flash in all configurations, and to disk as well in High Capacity (HC) Exadata systems. Exadata also protects data by writing multiple copies in unison to multiple storage servers: two copies in Exadata NORMAL redundancy configurations, and three copies with HIGH redundancy. Exadata also automatically detects failures and repairs data.
Exadata protects data by writing multiple copies in unison to Flash and Disk
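As a rough illustration of the redundancy levels described above, the following sketch places two copies (NORMAL) or three copies (HIGH) of a block on distinct storage servers. The copy counts come from the text; the function `place_copies`, the server names, and the simple rotating placement rule are assumptions for the example and do not describe Exadata's actual placement logic.

```python
# Number of copies per redundancy level, as stated in the text.
COPIES = {"NORMAL": 2, "HIGH": 3}


def place_copies(block_id, servers, redundancy):
    """Pick distinct storage servers for each copy of a block.

    The rotating start position is a hypothetical placement rule that only
    guarantees each copy lands on a different server.
    """
    n = COPIES[redundancy]
    if n > len(servers):
        raise ValueError("not enough storage servers for this redundancy level")
    start = block_id % len(servers)
    return [servers[(start + i) % len(servers)] for i in range(n)]


servers = ["cell1", "cell2", "cell3"]  # hypothetical storage server names
print(place_copies(4, servers, "NORMAL"))  # ['cell2', 'cell3']
print(place_copies(4, servers, "HIGH"))    # ['cell2', 'cell3', 'cell1']
```

With every copy on a different server, losing any single storage server still leaves at least one copy available under NORMAL and two under HIGH.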
Using PMEM on the storage side of Exadata allows these systems to scale to much larger capacities than server-side PMEM can achieve, while also providing redundancy and therefore much greater availability of databases and business applications. Only Exadata integrates software and hardware to deliver unmatched database performance and capacity, while also protecting against failure. These features of Exadata work automatically, don’t require additional software licenses, and are built into the latest Exadata systems by default, with no specific purchasing or configuration decisions necessary. Storage-side Persistent Memory is just one example of the advanced level of integration between Oracle Database software, Exadata software, and the cutting-edge hardware in Exadata.
Be sure to see the Persistent Memory Primer for more about PMEM.