Quick Overview of Data Reloading in MySQL HeatWave After Reboot

In MySQL HeatWave, the in-memory cluster (RAPID engine) stores data in volatile memory, so a reboot of the DB System or HeatWave cluster (e.g., due to maintenance, upgrades, or failures) typically requires reloading data to restore full functionality. Prior to MySQL 9.2.0, reloading involved manually or automatically scanning the underlying InnoDB tables (or Object Storage for Lakehouse tables), which could be time-consuming for large datasets.

Starting with MySQL 9.2.0, HeatWave introduced significant improvements for faster recovery by persisting additional metadata in the HeatWave Storage Layer (backed by OCI Object Storage on Oracle Cloud Infrastructure or equivalent on AWS). This enables automatic data recovery from the Storage Layer instead of a full reload from InnoDB, reducing recovery time dramatically—often from hours to minutes, depending on dataset size and cluster configuration. Data changes made while the cluster was offline are also incorporated during recovery.

Improvements in Version 9.4.2

MySQL HeatWave 9.4.2 (released September 23, 2025, as a General Availability Innovation release) builds on the 9.2.0 foundation without introducing new mechanisms for post-reboot reloading. Instead, it inherits and benefits from the enhanced recovery process, while incorporating cumulative performance optimizations from intermediate releases (e.g., 9.3.0 and 9.4.0–9.4.1). Here’s how it achieves faster reloading:

Automatic Recovery from HeatWave Storage Layer:

  • How It Works: Upon reboot, HeatWave first attempts to recover data directly from the persisted Storage Layer metadata. This includes sharded data distribution, table schemas, and columnar formats stored durably. If recovery succeeds, no full reload from InnoDB is needed—HeatWave simply rehydrates the in-memory structures.
  • Speed Improvement: Recovery is near-instantaneous for metadata and partial data, as it avoids re-scanning InnoDB indexes or rows. For large tables (e.g., >1TB), this can reduce reload time by 80–95% compared to pre-9.2.0 versions.
  • Fallback: If Storage Layer recovery fails (e.g., due to incompatibility after a major upgrade), it falls back to reloading from InnoDB or Object Storage, but this is rare in 9.4.2 due to improved compatibility checks.
  • Scope: Applies to both Standalone (non-HA) and High Availability (HA) DB Systems. In HA setups (enhanced in 9.3.0), recovery propagates across replicas automatically.
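
To see which path each table took after a restart, the recovery columns of performance_schema.rpd_tables can be inspected. The query below is a minimal sketch, not a definitive procedure. It assumes that the RECOVERY_SOURCE, RECOVERY_START_TIMESTAMP, and RECOVERY_END_TIMESTAMP columns are present in your release and that table names resolve through performance_schema.rpd_table_id; check both against the documentation for your version.

```sql
-- Per-table recovery source and duration after a restart.
-- Assumption: the RECOVERY_* columns exist in rpd_tables in this release.
SELECT
    n.SCHEMA_NAME,
    n.TABLE_NAME,
    t.RECOVERY_SOURCE,              -- where the last recovery read its data from
    t.RECOVERY_START_TIMESTAMP,
    t.RECOVERY_END_TIMESTAMP,
    TIMESTAMPDIFF(SECOND,
                  t.RECOVERY_START_TIMESTAMP,
                  t.RECOVERY_END_TIMESTAMP) AS recovery_seconds
FROM performance_schema.rpd_tables AS t
JOIN performance_schema.rpd_table_id AS n
  ON n.ID = t.ID
ORDER BY recovery_seconds DESC;
```

Tables whose RECOVERY_SOURCE points at MySQL (InnoDB) rather than object storage likely went through the fallback path, so they are the first place to look when recovery takes longer than expected.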

Auto-Reload After Maintenance or Upgrades:

  • Integrated with OCI/AWS maintenance workflows: After a planned restart (e.g., for patching), HeatWave 9.4.2 automatically triggers reload/recovery without manual intervention. This was refined in 9.2.0+ to include change propagation deltas.
  • Change Propagation Integration: Post-recovery, ongoing DML changes (INSERT/UPDATE/DELETE) from InnoDB are automatically propagated to the cluster in real-time, ensuring data consistency without additional reloads.
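
A quick way to confirm that change propagation resumed after recovery is to read the corresponding status variable. This is a small sketch that assumes the rapid_change_propagation_status variable is exposed in your release:

```sql
-- Cluster-wide change propagation state; ON is expected on a healthy cluster.
SHOW GLOBAL STATUS LIKE 'rapid_change_propagation_status';
```

If the value is OFF, changes made in InnoDB are not reaching the cluster, and affected tables may need to be reloaded once the underlying problem is resolved.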

Optimizations Inherited from 9.4.x Series:

  • Statistics Cache (from 9.4.1): Introduces an InnoDB statistics cache (up to 65,536 entries) that speeds up query planning during recovery, indirectly accelerating the initial post-reboot query offload to HeatWave.
  • Enhanced Bulk Load (from 9.2.0 and 9.4.0): If fallback to InnoDB reload is needed, 9.4.2 uses parallelized CSV/Parquet parsing and multi-threaded loading (via LOAD DATA or Auto Parallel Load), supporting up to 8x faster ingestion for tables without primary keys or with VECTOR types.
  • Guided Load Control (from 9.4.1): Allows explicit enabling/disabling of optimized loading paths (GUIDED ON/OFF in load statements), giving finer control during recovery for complex schemas.
  • Advanced Cardinality Estimation (ACE) Auto-Training (from 9.4.2): A new system variable (heatwave_ace_auto_train) enables background rebuilding of query statistics models for loaded tables, ensuring faster optimization post-reboot without manual ANALYZE TABLE.

When Reloading Is Triggered

  • DB System Restart: Full cluster reboot; data lost from memory but recoverable from Storage Layer.
  • Cluster Resize/Stop-Start: Elastic scaling (supported up to 64 nodes in 9.1.0+) requires reload, but 9.4.2 uses optimized parallel threads.
  • Node Failure: Automatic failover with Storage Layer recovery (monitored every 60 seconds).
  • Upgrade Incompatibility: If data format mismatches (e.g., pre-9.2.0 data), manual unload/reload may be needed, but 9.4.2 minimizes this via better versioning.
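
Before looking at individual tables, it can help to confirm that the cluster itself is back after any of the events above. The statements below are a sketch and assume that the rapid_service_status variable and the performance_schema.rpd_nodes table are available in your release:

```sql
-- Overall HeatWave service state (ONLINE when the cluster is usable).
SHOW GLOBAL STATUS LIKE 'rapid_service_status';

-- Per-node state and memory use; every node should report an available status.
SELECT ID, STATUS, MEMORY_USAGE, MEMORY_TOTAL
FROM performance_schema.rpd_nodes
ORDER BY ID;
```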

Verification and Best Practices

  • Check Recovery Status: Post-reboot, query Performance Schema and check the loading status:

    USE performance_schema;
    SELECT ID, LOAD_STATUS, LOAD_PROGRESS, LOAD_START_TIMESTAMP, LOAD_END_TIMESTAMP
    FROM performance_schema.rpd_tables
    ORDER BY LOAD_PROGRESS DESC, LOAD_START_TIMESTAMP ASC;

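    LOAD_STATUS values to look for:

  • AVAIL_RPDGSTABSTATE: The table is loaded (or recovered) and available for queries.
  • LOADING_RPDGSTABSTATE: The table is still being loaded.
  • INRECOVERY_RPDGSTABSTATE: The table is being recovered.
  • NOLOAD_RPDGSTABSTATE: The table has not been loaded yet.
  • UNAVAIL_RPDGSTABSTATE: The table is unavailable.
  • RECOVERYFAILED_RPDGSTABSTATE (new in 9.2.2): Recovery failed for a Lakehouse table; retry with a manual load.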
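
On systems with many tables, a summary per state is easier to read than the full listing. The following sketch groups tables by LOAD_STATUS and then lists only those that are not yet available; it assumes table names are resolved through performance_schema.rpd_table_id.

```sql
-- How many tables are in each load state, and the least-advanced progress per state.
SELECT LOAD_STATUS,
       COUNT(*)           AS table_count,
       MIN(LOAD_PROGRESS) AS min_progress
FROM performance_schema.rpd_tables
GROUP BY LOAD_STATUS;

-- Tables that are not yet available for offloaded queries.
SELECT n.SCHEMA_NAME, n.TABLE_NAME, t.LOAD_STATUS, t.LOAD_PROGRESS
FROM performance_schema.rpd_tables AS t
JOIN performance_schema.rpd_table_id AS n
  ON n.ID = t.ID
WHERE t.LOAD_STATUS <> 'AVAIL_RPDGSTABSTATE'
ORDER BY t.LOAD_PROGRESS;
```

An empty result from the second query suggests that recovery has finished for every table HeatWave knows about.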

  • Monitor with Events: On OCI, set up notifications for HeatWaveClusterRestarted events to track recovery time.

Tips for Faster Recovery:

  • Use Auto Parallel Load for initial loads: CALL sys.heatwave_load(JSON_ARRAY('schema_name'), NULL); optimizes threads automatically.
  • Keep data compression enabled (the rapid_compression session variable, which is on by default) to reduce the Storage Layer footprint.
  • For Lakehouse tables, configure event-based incremental loads (9.4.1+) to minimize full reloads.
  • Test with EXPLAIN post-recovery to confirm offload: Look for Using secondary engine RAPID.
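
The load and offload checks from the tips above can be sketched as follows. The names schema_name and table_name are placeholders, and the dryrun mode is shown on the assumption that your release accepts the options as a JSON object; consult the sys.heatwave_load documentation for the exact option format in your version.

```sql
-- Dry run: report what Auto Parallel Load would do without loading anything.
CALL sys.heatwave_load(JSON_ARRAY('schema_name'), JSON_OBJECT('mode', 'dryrun'));

-- Actual load with default options.
CALL sys.heatwave_load(JSON_ARRAY('schema_name'), NULL);

-- Confirm offload: the Extra column should contain "Using secondary engine RAPID".
EXPLAIN SELECT COUNT(*) FROM schema_name.table_name;
```

Running the dry run first is a cheap way to catch unsupported columns or memory shortfalls before committing to a full load.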

Limitations

  • Recovery assumes tables were loaded pre-reboot; new tables require a manual ALTER TABLE ... SECONDARY_LOAD.
  • In SUPER_READ_ONLY mode (post-upgrade), recovery pauses—disable it temporarily.
  • For AWS deployments, recovery sources from S3 if Storage Layer fails, which may be slightly slower than OCI Object Storage.
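
For a table that was never loaded, and therefore is not covered by recovery, the manual load looks roughly like this. It is a sketch with placeholder names (schema_name, table_name), and the last statement only reads the read-only flag; it does not change it.

```sql
-- Mark the table for HeatWave, then load it into the cluster.
ALTER TABLE schema_name.table_name SECONDARY_ENGINE = RAPID;
ALTER TABLE schema_name.table_name SECONDARY_LOAD;

-- Check whether the server is currently in super read-only mode (1 = enabled).
SELECT @@GLOBAL.super_read_only;
```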

This mechanism in 9.4.2 ensures minimal downtime for analytical workloads, making HeatWave more resilient for production use. For full details, refer to the MySQL HeatWave 9.4.2 Release Notes and HeatWave User Guide on Reloading Data.