Choosing The Correct Shape When Migrating MySQL From On-Premise to HeatWave in OCI

If you want to migrate your on-premise MySQL database to HeatWave (MySQL) on Oracle Cloud Infrastructure (OCI), choosing the correct shape is essential for a successful migration. A “shape” in OCI defines the compute resources (CPU’s, memory, storage, etc.) which will be allocated to your HeatWave instance – for both the OLTP (DB System) and OLAP (HeatWave Cluster) services.

Understanding Your On-Premise Workload

Data Size: Estimate the total size of your database, including tables, indexes, and any anticipated growth. You will want to select the DB System’s storage size and again, add room for growth. When creating the DB System, be sure to enable the automatic storage expansion to avoid lost of service when the allocated storage space is consumed. When deciding on the size of the HeatWave Cluster, it is optimized for in-memory analytics, so the data you plan to query with HeatWave must fit into the cluster’s memory. Here is a query you can use to get the size of your current MySQL database(s):

SELECT
        TABLE_SCHEMA AS `DatabaseName`,
        TABLE_NAME AS `TableName`,
        ROUND(SUM(COALESCE(data_length, 0) + COALESCE(index_length, 0)) / 1024 / 1024, 2) AS `TotalSizeMB`,
        ROUND(SUM(COALESCE(data_free, 0)) / 1024 / 1024, 2) AS `FreeSpaceMB`
    FROM
        information_schema.TABLES
    WHERE
        TABLE_SCHEMA NOT IN ('information_schema', 'performance_schema', 'mysql', 'sys')
    GROUP BY
        TABLE_SCHEMA, TABLE_NAME
    ORDER BY
        TotalSizeMB DESC;

And you can use this query to get the size of your current MySQL indexes:

SELECT
        TABLE_SCHEMA AS `Database`,
        TABLE_NAME AS `Table`,
        ROUND(SUM(INDEX_LENGTH) / 1024 / 1024, 2) AS `Index Size (MB)`
    FROM
        information_schema.TABLES
    WHERE
        TABLE_SCHEMA NOT IN ('information_schema', 'performance_schema', 'mysql', 'sys')
    GROUP BY
        TABLE_SCHEMA, TABLE_NAME
    ORDER BY
        `Index Size (MB)` DESC;

Or, if you want a simple query to give you the total sizes of the databases and indexes:

SELECT
        ROUND(SUM(COALESCE(data_length, 0) + COALESCE(index_length, 0)) / 1024 / 1024 / 1024, 2) AS `TotalDatabaseSizeGB`,
        ROUND(SUM(COALESCE(index_length, 0)) / 1024 / 1024 / 1024, 2) AS `TotalIndexSizeGB`
    FROM
        information_schema.TABLES
    WHERE
        TABLE_SCHEMA NOT IN ('information_schema', 'performance_schema', 'mysql', 'sys');

Workload Type: Determine if your workload is primarily transactional (OLTP), analytical (OLAP), or a mix of both. HeatWave excels at accelerating analytical queries, so focus on the analytics portion if you’re adding a HeatWave Cluster.

HeatWave Data (OLAP) Size Requirements: Identify the complexity and frequency of your analytical queries, and which tables you will need to load to help determine the size of the HeatWave Cluster.

Current Resource Usage: Analyze your on-premise system’s CPU, memory, and I/O usage under peak load to establish a baseline.

Max Connections and Queries

There are a few queries you can use to determine the maximum number of concurrent connections and queries on your on-premise MySQL server. These will be helpful if you don’t have a database monitoring tool.

Max connections: The simplest way to check the maximum number of concurrent connections since the server was last restarted is to query the max_used_connections server status variable. This variable tracks the peak number of simultaneous connections.

SHOW STATUS WHERE Variable_name = 'Max_used_connections';
    +----------------------+-------+
    | Variable_name        | Value |
    +----------------------+-------+
    | Max_used_connections | 173   |
    +----------------------+-------+

The variable max_used_connections reflects the maximum since the last server restart (uptime). This value includes the peak number of connections over the entire uptime – and you can see how long the MySQL instance has been running with:

SELECT
        Variable_name,
        Variable_value AS uptime_seconds,
        FLOOR(Variable_value / 86400) AS uptime_days,
        ROUND(Variable_value / 86400, 2) AS uptime_days_decimal
    FROM performance_schema.global_status
    WHERE Variable_name = 'Uptime';

For HeatWave, there are several user-configurable, shape-dependent system variables, and many have different values for each HeatWave shape. Check this page to see the max_connections for the different shapes to be sure you select a shape that will support your required number of max_connections. You can see your current max_connections via the following query, but remember this isn’t the same as the value of the max_used_connections. The max_used_connections variable should be less than max_connections:

mysql> SHOW VARIABLES LIKE 'max_connections';
    +-----------------+-------+
    | Variable_name   | Value |
    +-----------------+-------+
    | max_connections | 250   |
    +-----------------+-------+
    1 row in set (0.00 sec)

Note: the number of connections doesn’t equal the number of queries per second. (A connection may be open and doing nothing, so you may want to adjust the wait_timeout variable from it’s default value of eight hours). If you do not currently use a database monitoring tool, you can calculate the number of queries over a given period of time. You will want to run these queries during a peak workload:

SHOW GLOBAL STATUS LIKE 'queries';SELECT SLEEP (10);SHOW GLOBAL STATUS LIKE 'queries';
    +---------------+--------+
    | Variable_name |  Value |
    +---------------+--------+
    | Queries       |  43000 |
    +---------------+--------+
    1 row in set (0.00 sec)
    +------------+
    | sleep (10) |
    +------------+
    |          0 |
    +------------+
    1 row in set (5.01 sec)
    +---------------+--------+
    | Variable_name |  Value |
    +---------------+--------+
    | Queries       |  45000 |
    +---------------+--------+
    1 row in set (0.00 sec)

Subtract the first value (43000) from the second value (45000) and divide by the number of sleep seconds (10) to get an approximate number of queries per second (200).

HeatWave Shapes

In OCI, HeatWave supports two main categories of shapes:

ECPU Shapes: An ECPU is architecture-agnostic (supporting AMD or Intel processors) and all ECPU shapes support HeatWave clusters. ECPU’s are the standard as of March, 2025, and the older OCPU shapes were deprecated in late 2024. You may attach any ECPU shape to any HeatWave Cluster shape.

HeatWave Cluster Shapes: These shapes define the resources available for the HeatWave cluster nodes, which are separate resources from the DB system shape. Three shapes are available: HeatWave.Free is a single 16GB node and can’t be expanded. HeatWave.32GB supports a maximum of 16 nodes, and each node has 32GB of memory. HeatWave.512GB (which is recommended) is 512GB in size and supports HeatWave Lakehouse and HeatWave GenAI (in certain regions).

Match Shapes to Your Requirements

For the DB System (OLTP) workload, choose an ECPU-based shape for the MySQL DB System that supports your transactional workload (including desired innodb_buffer_pool size) and allows a HeatWave Cluster to be attached. A HeatWave cluster provides a distributed, scalable, shared-nothing, in-memory, hybrid columnar, query processing engine. It can accelerate analytic queries, query external data stored in object storage, and perform machine learning. HeatWave cluster can be added to a MySQL DB system. Here are two examples:

A small transactional workload might use a shape with 4-8 ECPUs.
A high-availability (HA) setup with mixed OLTP/OLAP might need 16+ ECPUs.

Remember that MySQL performance relies more on RAM than the number of CPU’s or cores. MySQL uses RAM for the InnoDB Buffer Pool, which is an area in main memory where InnoDB caches table and index data as it is accessed. The buffer pool permits frequently-used data to be accessed directly from memory, which speeds up processing. On dedicated servers, up to 80% of physical memory is often assigned to the buffer pool.

To see the size of your current buffer pool in gigabytes:

SELECT @@innodb_buffer_pool_size/1024/1024/1024;

And remember, you can have multiple innodb_buffer_pool_instances and each instance has a innodb_buffer_pool_chunk_size. For example, I have a buffer pool size of 1GB, with eight buffer pool chunks, so each chunk is 128MB:

mysql> SELECT @@innodb_buffer_pool_size/1024/1024/1024;
    +------------------------------------------+
    | @@innodb_buffer_pool_size/1024/1024/1024 |
    +------------------------------------------+
    |                           1.000000000000 |
    +------------------------------------------+
    mysql> select @@innodb_buffer_pool_instances;
    +--------------------------------+
    | @@innodb_buffer_pool_instances |
    +--------------------------------+
    |                              8 |
    +--------------------------------+
    mysql> select @@innodb_buffer_pool_chunk_size/1024/1024;
    +-------------------------------------------+
    | @@innodb_buffer_pool_chunk_size/1024/1024 |
    +-------------------------------------------+
    |                              128.00000000 |
    +-------------------------------------------+

The size of the HeatWave innodb_buffer_pool is dependent on the amount of RAM provided by the shape, and this amount can not be configured manually. Check the Default User Variables page and search for innodb_buffer_pool_size to make sure you select a shape that meets your innodb_buffer_pool needs.

For the HeatWave Cluster shape, the selection is much simpler – you want a shape based upon your analytical data size and performance needs:

Use HeatWave.32GB for datasets under 50 GB.
Use HeatWave.512GB for larger datasets, scaling the number of nodes as needed.

Note: if you want to use HeatWave Lakehouse or HeatWave GenAI, both require the HeatWave.512GB shape.

Consider HeatWave Cluster Requirements

Remember that HeatWave compresses data as it is loaded, which permits HeatWave nodes to store more data. More data per node reduces costs by minimizing the size of the HeatWave Cluster required to store the data.

While data compression results in a smaller HeatWave Cluster, decompression operations that occur as data is accessed affect performance to a small degree. Specifically, decompression operations have a minor effect on query runtimes, on the rate at which queries are offloaded to HeatWave during change propagation, and on recovery time from Object Storage.

If data storage size (and cost) is not a concern, disable data compression by setting the rapid_compression session variable to OFF before loading data:

 mysql>SET SESSION rapid_compression=OFF;

The default option is AUTO which automatically chooses the best compression algorithm for each column.

HeatWave requires the data to be loaded into memory for analytics, so the total memory across all HeatWave nodes must exceed your analytical dataset size.

Example: A 1TB dataset with HeatWave.512GB (512 GB per node) requires at least 2 nodes (1 TB ÷ 512 GB ≈ 2) if you aren’t using data compression.

Add nodes for redundancy, performance, or larger datasets. The minimum is one node, but the maximum depends on your tenancy limits and workload.

Use the Node Count Estimator

To determine the appropriate HeatWave Cluster size for a workload, you can estimate the required cluster size. Cluster size estimates are generated by the HeatWave Auto Provisioning feature, which uses machine learning models to predict the number of required nodes based on node shape and data sampling.

Perform a cluster size estimate:

When adding a HeatWave Cluster to a MySQL DB System, to determine the number of nodes required for the data you intend to load.
Periodically, to ensure that you have an appropriate number of HeatWave nodes for your data. Over time, data size may increase or decrease, so it is important to monitor the size of your data by performing cluster size estimates.
When encountering out-of-memory errors while running queries. In this case, the HeatWave Cluster may not have sufficient memory capacity.
When the data growth rate is high.
When the transaction rate (the rate of updates and inserts) is high.

Account for Additional Features

High Availability (HA): If you need HA, choose a shape with enough resources to support three MySQL instances (primary + two secondaries). The HeatWave cluster attaches to the primary instance.

HeatWave Lakehouse: If you plan to query data in Object Storage, ensure your shape supports it (e.g., HeatWave.512GB).

HeatWave GenAI: Allows you to communicate with unstructured data in HeatWave using natural-language queries, requires the HeatWave.512GB shape and is available only in certain OCI regions.

Growth: Factor in future data growth and scalability. ECPU shapes allow flexibility to scale-up later, but a reboot is required.

Test and Validate

Proof of Concept (PoC): As with any migration, you will want to do a proof of concept. Provision a small test DB System and HeatWave cluster in OCI with a subset of your data. Then test your queries to verify performance.

Adjust as needed: If performance lags, scale up the DB System shape (more ECPUs) or add HeatWave nodes. You can change shapes post-migration with a restart.

Plan the Migration

Export your on-premise MySQL database – preferably using MySQL Shell’s Dump Instance utility (but mysqldump is still an option, even though it is much slower than Shell.). MySQL Shell’s instance dump utility util.dumpInstance() and schema dump utility util.dumpSchemas() (introduced in MySQL Shell 8.0.21), support the export of all schemas or a selected schema from an on-premise MySQL instance into an Oracle Cloud Infrastructure Object Storage bucket or a set of local files. MySQL Shell also supports dumping MySQL data to S3-compatible buckets, such as Amazon Web Services (AWS) S3.

Import the dump into a provisioned DB System with an ECPU shape that supports your OLTP workload. Then create your HeatWave Cluster. A HeatWave Cluster must be associated with an active DB System, and a DB System can have only one HeatWave Cluster, but the Cluster can have multiple nodes. Before you create a HeatWave Cluster, ensure that you have created a DB System and that the DB System does not already have a HeatWave Cluster.

After you create the HeatWave cluster, you can load your analytical tables into the HeatWave Cluster for query acceleration. For MySQL versions 9.1.0 and later, use Auto Parallel Load. The loading of data into HeatWave can be classified into three types: Initial Bulk Load, Incremental Bulk Load, and Change Propagation.

Initial Bulk Load: Performed when loading data into HeatWave for the first time, or when reloading data. The best time to perform an initial bulk load is during off-peak hours, as bulk load operations can affect OLTP performance on the MySQL DB System.

Incremental Bulk Load: Performed when there is a substantial amount of data to load into tables that are already loaded in HeatWave. An incremental bulk load involves these steps:

Performing a SECONDARY_UNLOAD operation to unload a table from HeatWave. See “Unloading Tables”.
Importing data into the table on the MySQL DB System node.
Performing a SECONDARY_LOAD operation to reload the table into HeatWave. See “Loading Data to HeatWave MySQL”.

Depending on the amount of data, an incremental bulk load may be a faster method of loading new data than waiting for change propagation to occur. It also provides greater control over when new data is loaded. As with initial build loads, the best time to perform an incremental bulk load is during off-peak hours, as bulk load operations can affect OLTP performance on the MySQL DB System.

Prerequisites: Before loading data into the HeatWave Cluster, ensure that you have met the following prerequisites:

The data you want to load must be available on the MySQL DB System. For information about importing data into a MySQL DB System, refer to the following instructions:

For HeatWave on OCI, see Importing and Exporting Databases in the HeatWave on OCI Service Guide.
For HeatWave on AWS, see Importing Data in the HeatWave on AWS Service Guide.
For HeatWave for Azure, see Importing Data to HeatWave in the HeatWave for Azure Service Guide.

The MySQL DB System only supports the InnoDB storage engine. The tables you intend to load must be InnoDB tables. You can manually convert tables to InnoDB using the following ALTER TABLE statement:

mysql>ALTER TABLE tbl_name ENGINE=InnoDB;

The tables you intend to load must be defined with a primary key. You can add a primary key using the following syntax:

mysql>ALTER TABLE tbl_name ADD PRIMARY KEY (column);

If you have numerous tables without primary keys, you can automatically add invisible primary keys when you dump your instance or tables via MySQL Shell (search for create_invisible_pks). See the Prerequisites page for more information before loading your data.

Change Propagation: After tables are loaded into HeatWave, data changes are automatically propagated from InnoDB tables on the MySQL DB System to their counterpart tables in HeatWave. See “Change Propagation”. The eliminates the need for extract, transform and load processes for running your OLAP queries against your OLTP data.

Migration Without Downtime

If you want to do a live switchover from on-premise to HeatWave, you will want to setup MySQL Replication, by creating a replication channel for your HeatWave instance. See “Managing a Replication Channel“. You can replicate from your on-premise MySQL database to HeatWave on OCI by going over the public Internet, but an OCI FastConnect connection provides an easy way to create a secure, dedicated, private connection between your data center and Oracle Cloud Infrastructure.

Example Scenario:

On-Premise Setup: 500 GB database, 200 GB for analytics, moderate OLTP, and complex OLAP queries.
DB System Shape: MySQL.VM.Standard.E4 with 16 ECPUs (handles OLTP and supports HeatWave).
HeatWave Cluster: HeatWave.512GB with 1 node (512 GB > 200 GB), but consider 2 nodes for better performance.
Validation: Test with a PoC, then adjust nodes or shape if needed.

Final Tips:

Start with a conservative shape and scale up as needed – Oracle Cloud allows shape changes with minimal downtime.
For datasets >10 TB or specific needs (e.g., MySQL.256 shape), contact Oracle Support or your sales representative due to limited hardware availability.
Review the latest Oracle HeatWave Service Guide for updated shape details, as options evolve.

By aligning your workload’s data size, query demands, and resource needs with the available shapes, you’ll ensure a smooth migration and optimal performance in HeatWave on OCI. Let me know if you need help with specific calculations or migration steps.

Choosing The Correct Shape When Migrating MySQL From On-Premise to HeatWave in OCI

Understanding Your On-Premise Workload

Max Connections and Queries

HeatWave Shapes

Match Shapes to Your Requirements

Consider HeatWave Cluster Requirements

Use the Node Count Estimator

Account for Additional Features

Test and Validate

Plan the Migration

Migration Without Downtime

Example Scenario:

Final Tips:

Tony Darnell

MySQL 9.3! Thank you for your contributions!

Anomaly Detection with HeatWave AutoML

Choosing The Correct Shape When Migrating MySQL From On-Premise to HeatWave in OCI

Understanding Your On-Premise Workload

Max Connections and Queries

HeatWave Shapes

Match Shapes to Your Requirements

Consider HeatWave Cluster Requirements

Use the Node Count Estimator

Account for Additional Features

Test and Validate

Plan the Migration

Migration Without Downtime

Example Scenario:

Final Tips:

Authors

Tony Darnell

MySQL 9.3! Thank you for your contributions!

Anomaly Detection with HeatWave AutoML