The previous blog post pointed out that it takes more than just having redundant components to get the world-class HA that Oracle Exadata provides.
In this blog post, I will dive deeper into the features that Exadata brings to the compute nodes at the component level.
The compute nodes run the operating system (bare-metal or virtualized) and, ultimately, the databases.

A compute node consists of various components:
- Two or optionally four NVMe flash drives containing the operating system or hypervisor and VMs, the Oracle binaries, etc.
- Up to 3 TB of RAM
- Two CPUs
- A Host Channel Adapter, which provides connectivity to the RDMA network fabric
We will go through these highlighted components one by one to show you how Exadata makes a difference.
Let’s start with the basics.
What happens if one of the mirrored flash drives fails?
The built-in RAID has you covered. Moreover, you can replace an NVMe flash drive without turning off the system; the NVMe flash drives are hot-pluggable.
The Management Service (MS) running on the compute node (and also on the Exadata Storage Server, but more on that later) will raise an alert on the failure and notify you upon replacement.
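On Exadata, both the mirroring and the alerting are handled for you. Purely as a generic illustration of what a degraded boot-drive mirror looks like, the sketch below parses /proc/mdstat on a Linux host that uses software RAID (mdadm); it is not how MS monitors the drives internally.

```python
# Illustration only: spot a degraded Linux software-RAID mirror by
# parsing /proc/mdstat. On Exadata, the Management Service monitors the
# drives and raises the alerts for you.
from pathlib import Path

def degraded_arrays(mdstat_path: str = "/proc/mdstat") -> list[str]:
    """Return the names of md arrays that are missing a member."""
    degraded, current = [], None
    for line in Path(mdstat_path).read_text().splitlines():
        if line.startswith("md"):
            current = line.split()[0]            # e.g. "md0"
        # Status lines look like "[2/2] [UU]"; an underscore marks a
        # failed or missing member, e.g. "[2/1] [U_]".
        elif current and "[" in line and "_" in line.split("[")[-1]:
            degraded.append(current)
    return degraded

if __name__ == "__main__":
    bad = degraded_arrays()
    print(f"Degraded arrays: {bad}" if bad else "All mirrors healthy")
```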
A dead device is the most straightforward case; it is harder when a sick device shows issues intermittently rather than continuously.
What happens when a flash drive turns sick, or a filesystem fills up or gets corrupted? How does Exadata solve these problematic issues?
Exadata has software engineering across many layers to prevent, detect and repair such problems. For example, the Cluster Synchronization Services (CSS) clusterware process is multithreaded: two threads are responsible for writing out diagnostic data to the filesystem, while other threads handle the core responsibilities of CSS, node monitoring and group membership. These threads mind their own business and do not interfere with each other.
The flash drives have predictive failure functionality and are failed proactively when certain thresholds are met.
The Management Service will not only alert on space issues but also tell you who is filling up the space, and it will clean up after itself.
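MS does this out of the box; just to illustrate the idea of knowing who is filling up the space, here is a minimal sketch that reports filesystem usage and the largest directories under a mount point. The /u01 path is only an example, and this is in no way how MS is implemented.

```python
# Illustration only: report filesystem usage and the biggest directories
# under a mount point. On Exadata, the Management Service does this kind
# of monitoring and cleanup automatically.
import os
import shutil

def top_space_consumers(mount: str = "/u01", top_n: int = 5) -> None:
    usage = shutil.disk_usage(mount)
    print(f"{mount}: {usage.used / usage.total:.0%} used")
    sizes: dict[str, int] = {}
    for root, _dirs, files in os.walk(mount):
        for name in files:
            try:
                sizes[root] = sizes.get(root, 0) + os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # a file may disappear while we scan
    for path, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)[:top_n]:
        print(f"{size / 2**30:6.1f} GiB  {path}")

if __name__ == "__main__":
    top_space_consumers()
```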
What about memory? Does Exadata reboot, just as my non-Exadata system does, when encountering uncorrectable errors?
No: only the process hit by the memory corruption or uncorrectable error is affected; it receives a SIGBUS and is terminated.
On top of that, the Management Service will send out an alert.
The built-in Integrated Lights Out Manager (ILOM) keeps track of the correctable and uncorrectable errors and flags memory modules for replacement based on predefined thresholds.
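ILOM keeps its own counters in firmware; to get a feel for what correctable versus uncorrectable error counts look like, the sketch below reads the counters that the Linux EDAC subsystem exposes in sysfs on a generic server. The alerting threshold is made up for the example and is not ILOM's rule.

```python
# Sketch: read correctable (CE) and uncorrectable (UE) memory error
# counters from the Linux EDAC sysfs interface. ILOM tracks these in
# firmware with its own thresholds; this only illustrates the concept.
from pathlib import Path

EDAC_ROOT = Path("/sys/devices/system/edac/mc")

def memory_error_counts() -> dict[str, tuple[int, int]]:
    """Return {memory_controller: (correctable, uncorrectable)}."""
    counts = {}
    for mc in sorted(EDAC_ROOT.glob("mc[0-9]*")):
        ce = int((mc / "ce_count").read_text())
        ue = int((mc / "ue_count").read_text())
        counts[mc.name] = (ce, ue)
    return counts

if __name__ == "__main__":
    for mc, (ce, ue) in memory_error_counts().items():
        flag = "  <-- investigate" if ue > 0 or ce > 100 else ""  # illustrative threshold
        print(f"{mc}: correctable={ce} uncorrectable={ue}{flag}")
```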
How does Exadata make sure that memory is used optimally?
Again, the tight integration between hardware and software makes the difference here.
The Management Service alerts when overall memory usage is too high, and it does the same for individual processes.
Oracle Databases created on Exadata using Oracle Exadata Deployment Assistant (OEDA) have hugepages enforced by default.
Hugepages are reserved at the OS level, and the database is required to use them. The use of hugepages not only improves performance but also tremendously improves availability.
To ensure that customers remember to enable hugepages, Exachk checks for this; it also checks for oversubscription and many other best practices.
Exachk, as mentioned in the introductory blog post, is part of the Autonomous Health Framework (AHF) and bundles the best practices for Exadata.
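As a simplified illustration of the kind of check Exachk automates, the sketch below reads the hugepage counters from /proc/meminfo; the pass/warn logic is mine and much cruder than Exachk's actual rules.

```python
# Sketch: verify that hugepages are reserved and actually in use, in the
# spirit of the Exachk hugepages check. The verdicts are illustrative.
def hugepage_status(meminfo: str = "/proc/meminfo") -> dict[str, int]:
    wanted = ("HugePages_Total", "HugePages_Free", "Hugepagesize")
    values = {}
    with open(meminfo) as f:
        for line in f:
            key, rest = line.split(":", 1)
            if key in wanted:
                values[key] = int(rest.split()[0])  # Hugepagesize is in kB
    return values

if __name__ == "__main__":
    hp = hugepage_status()
    if hp.get("HugePages_Total", 0) == 0:
        print("FAIL: no hugepages reserved on this host")
    elif hp["HugePages_Free"] == hp["HugePages_Total"]:
        print("WARN: hugepages reserved, but nothing appears to use them")
    else:
        used = hp["HugePages_Total"] - hp["HugePages_Free"]
        print(f"OK: {used} of {hp['HugePages_Total']} hugepages in use "
              f"({hp['Hugepagesize']} kB each)")
```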
How do I make sure that the database serving my application uses the appropriate number of cores, and how can I change this easily if necessary?
Exadata offers Capacity on Demand to reduce the number of active cores on your Exadata database servers and lower the initial software licensing cost. Resource Manager and instance caging can be used to avoid CPU oversubscription. Exadata also offers dynamic CPU scaling, which lets you modify the number of vCPUs assigned to virtual machines on the fly.
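To make the oversubscription point concrete, here is a minimal sketch that compares the sum of per-instance CPU_COUNT settings against the CPUs visible on the host. The instance figures are hypothetical; in practice you would query each instance, or simply let Exachk report it.

```python
# Sketch: flag CPU oversubscription by comparing the sum of per-instance
# CPU_COUNT values (hypothetical numbers below) with the CPUs on the host.
import os

def check_oversubscription(instance_cpu_counts: dict[str, int]) -> None:
    host_cpus = os.cpu_count() or 0
    total = sum(instance_cpu_counts.values())
    for name, cpus in sorted(instance_cpu_counts.items()):
        print(f"  {name}: CPU_COUNT={cpus}")
    verdict = "OVERSUBSCRIBED" if total > host_cpus else "within capacity"
    print(f"Total CPU_COUNT {total} vs {host_cpus} host CPUs: {verdict}")

if __name__ == "__main__":
    # Hypothetical caged instances on one compute node
    check_oversubscription({"FINDB1": 16, "HRDB1": 8, "DWHDB1": 24})
```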
The previous sections talk about individual components, but how does Exadata handle failures at the system level, and how does it handle major system software updates?
Cluster resources are restarted automatically upon failure.
Traditional systems rely on software and TCP timeouts to detect nodes that are unreachable; this can lead to unexpected results when systems are under heavy load.
On Exadata, Instant Failure Detection (IFD) sends frequent RDMA heartbeat messages; since RDMA is handled in hardware, this keeps working even when systems are heavily loaded. The result is node eviction in seconds instead of minutes, avoiding split-brain situations.
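IFD itself lives in the RDMA layer and needs no application code. Just to illustrate the general idea of a short, fixed heartbeat deadline, here is a conceptual user-space sketch; the port and timeout values are invented for the example and have nothing to do with Exadata's internals.

```python
# Conceptual sketch of heartbeat-based failure detection with a short,
# fixed deadline. Exadata's Instant Failure Detection uses RDMA messages
# answered in hardware, so it keeps working under heavy software load;
# this user-space version only illustrates the idea.
import socket
import time

HEARTBEAT_PORT = 7777      # hypothetical port
DEADLINE_SECONDS = 2.0     # illustrative deadline, not Exadata's value

def monitor_peer(peer_ip: str) -> None:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", HEARTBEAT_PORT))
    sock.settimeout(0.5)
    last_seen = time.monotonic()
    while True:
        try:
            _data, addr = sock.recvfrom(64)
            if addr[0] == peer_ip:
                last_seen = time.monotonic()
        except socket.timeout:
            pass
        if time.monotonic() - last_seen > DEADLINE_SECONDS:
            print(f"{peer_ip} missed heartbeats for {DEADLINE_SECONDS}s: evict it")
            break

if __name__ == "__main__":
    monitor_peer("192.0.2.10")  # documentation-range example address
```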
If you have ever tried an in-place upgrade from one major Linux version to the next, you know how difficult it is and how much testing it takes to get right. Exadata systems ship with Oracle Enterprise Linux (OEL); we know exactly what is installed on Exadata systems, and this allows us to provide seamless upgrades to future major versions. Exadata System Software 23.1, for example, upgrades to Oracle Enterprise Linux 8.7 and the UEK6 kernel, all without reinstallation. The UEK6 kernel is highly optimized for Exadata, as you can read in full detail in the following blog post.
As a general best practice, and I cannot repeat this enough, it is essential to create custom database services. Your application must connect to a service to use the high-availability features: FAN, draining and Application Continuity. This cannot be the default database service or the default PDB service. On Exadata, RHPhelper coordinates the draining of connections.
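For example, with the python-oracledb driver an application connects to a named service rather than to the default database or PDB service; the host, port and service name below are placeholders.

```python
# Sketch: connect to a custom database service (never the default
# database or PDB service) so that FAN, draining and Application
# Continuity can work for the session. All names are placeholders.
import oracledb

connection = oracledb.connect(
    user="app_user",
    password="app_password",
    dsn="exadb-scan.example.com:1521/sales_app_svc.example.com",
)

cursor = connection.cursor()
cursor.execute("select sys_context('userenv', 'service_name') from dual")
print("Connected to service:", cursor.fetchone()[0])
cursor.close()
connection.close()
```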
Please make sure to follow the guidelines as indicated in the MAA Application Checklist and in our High Availability Overview and Best Practices.
I hope to see you back for another blog post in this series.
References and further reading:
https://www.oracle.com/a/tech/docs/application-checklist-for-continuous-availability-for-maa.pdf
https://www.oracle.com/database/technologies/rac/ahf.html
Blog posts in this series:
More than Just Redundant Hardware: Exadata MAA and HA Explained – Introduction
More than Just Redundant Hardware: Exadata MAA and HA Explained – Part I, the Compute Node
More than Just Redundant Hardware: Exadata MAA and HA Explained – Part II, the Exadata Storage Cell
More than Just Redundant Hardware: Exadata MAA and HA Explained – Part III, RoCE Fabric / Human Error
