In many organizations, Oracle databases power critical systems; such as healthcare patient management, billing, and customer profiles, while MySQL drives e-commerce, content management, web-based, or healthcare applications. Bridging the two often means moving not just data, but live transactions, with precision.

That’s where Oracle GoldenGate comes in.

Setting up replication between Oracle and MySQL starts with instantiation, capturing a consistent snapshot of the Oracle data to initialize MySQL, while preparing for real-time synchronization.

Whether replicating healthcare information, syncing e-commerce data across regions, or migrating legacy apps, instantiation is the first critical step.

In this article we shall demonstrate the Oracle precise instantiation method to create a heterogeneous (Oracle to MySQL Database) replication implementation without taking an application or database outage.

Oracle to MySQL Database Initial Load
Image 1. Oracle to MySQL Database Instantiation

Prerequisites

Before starting the instantiation process, the source and target Oracle GoldenGate Hubs (OGG Hubs) and the source and target database environments need to be configured to enable replication. Not executing these steps often leads to setup failures, inconsistent data, or unsupported configuration errors.

  • Convert Oracle tables to MySQL : Oracle GoldenGate does not provide database conversion tools. To convert your Oracle tables, view, stored procedure, and functions to MySQL you will need to select a utility to perform this task. There are numerous utilities available, such as Oracle SQL Developer and SQLines.
  • Install OpenSSL on target OGG Hub : Oracle GoldenGate 23ai for MySQL requires that OpenSSL 1.1 are present on the Oracle GoldenGate server prior to creating a deployment. See the Oracle GoldenGate documentation for details.
  • GoldenGate Installation : Install the latest patch release of Oracle GoldenGate 23ai (Microservices Architecture) for Oracle and MySQL. See the Oracle GoldenGate documentation for details regarding software installation and patching.
  • Create Deployments for Oracle and MySQL GoldenGate : Create the source and target Oracle GoldenGate deployments. See the Oracle GoldenGate documentation for more details.
  • Database version : Ensure your source (Oracle) and target (MySQL) databases are up to date with patch releases.
  • Database Configuration :

Pre-Instantiation Steps

When setting up Oracle GoldenGate for replication, taking time to understand your table metadata isn’t just good practice, it’s essential for performance and reliability.

Oracle GoldenGate’s Initial Load Extract process is single-threaded and performs a fetch from each table (processed in alphabetical order). While this approach is straightforward, it can be inefficient for environments with large tables or high table counts. All rows are selected and streamed downstream to the target, which can quickly become time-consuming without the right preparation.

To optimize this, it’s important to:

  • Capture the row count and size of each source table being replicated: Knowing how much data needs to move helps you estimate the time and resources required for instantiation.
  • Review target table key constraints: Primary keys, unique constraints, and foreign keys impact Oracle GoldenGate’s ability to replicate changes accurately and efficiently.
    • Tables without keys may require special handling for change data replication.
    • Foreign keys should be disabled for instantiation to prevent data apply errors.
  • Check the source side (Oracle) database UNDO tablespace and UNDO_RETENTION setting.
    • The default setting for Oracle is UNDO_MANAGEMENT AUTO, so the database optimizes UNDO tablespace.
    • The default setting for Oracle is UNDO_RETENTION 900 (seconds).
    • If the undo tablespace does not have enough space to honor the UNDO_RETENTION period, Oracle may reuse unexpired undo space, which can lead to “snapshot too old” errors for queries that require older undo information, and may result in Initial Load Extract abend.
    • If sufficient UNDO_RETENTION and/or UNDO tablespace cannot be allocated, precise instantiation cannot be used.

Understanding this metadata upfront allows you to make informed decisions, identify potential bottlenecks, and tailor your instantiation strategy to your environment’s scale and complexity.

To generate this overview, we used the SQL query shown in the following, which retrieves source table size.

— Gather fresh statistics for the source schema
EXEC DBMS_STATS.GATHER_SCHEMA_STATS(‘[schema]’);

— Now print table size and row counts
COLUMN TABLE_NAME FORMAT a25
COLUMN NUM_ROWS FORMAT 9999999999
COLUMN SIZE_MB FORMAT 9,999,999.999

SELECT
    t.table_name,
    s.num_rows,
    ROUND(SUM(d.bytes) / 1024 / 1024, 2) AS size_mb
FROM
    all_tables t
JOIN
    all_tab_statistics s ON t.owner = s.owner AND t.table_name = s.table_name
JOIN
    dba_segments d ON t.owner = d.owner AND t.table_name = d.segment_name
WHERE
    t.owner = UPPER(‘[schema]’)
    AND d.segment_type = ‘TABLE’
GROUP BY
    t.table_name, s.num_rows
ORDER BY
    size_mb DESC;
Source Table Size
Image 2. Source table size

To list the target table constraints, use the following MySQL query:

SELECT
    table_name,
    constraint_name,
    constraint_type
FROM
    information_schema.TABLE_CONSTRAINTS
WHERE
    constraint_schema = ‘[schema]’;
Target table constraints
Image 3. Target table constraints

Source Side Instantiation Steps of GoldenGate for Oracle

In this section, we will complete all the required steps on the source side of the replication setup.

Add Trandata

After creating the source Oracle database connection, the first step is to enable supplemental logging using ADD TRANDATA (per table) or ADD SCHEMATRANDATA (for all tables in a schema).

  • The recommended best practice is ADD SCHEMATRANDATA.

When replicating to heterogeneous targets (such as Oracle to MySQL), the recommended best practice is to enable TRANDATA for all columns as this ensures that Oracle GoldenGate captures all column-level changes, even for updates that do not modify primary or unique key columns. This improves replication accuracy.

Add Trandata
Image 4. Source Trandata

Create Primary Change Data Capture (CDC) Extract

Create the Integrated Extract and record the registration SCN. Here are the Integrated Extract parameters used for this demonstration:

EXTRACT IECC
USERIDALIAS dc1west DOMAIN OracleGoldenGate
EXTTRAIL ec

— Every 60 minutes, report the count of
— records processed since Extract start to
— the process report file.
— Rate reports the number of operations
— per second.
— The report file also contains DELTA
— throughput which is the number of records
— since the last report divided by the time
— since the last report.
REPORTCOUNT EVERY 60 MINUTES, RATE

— Extract generates a warning message to
— report file for long running transactions
— In this case any transaction which is open
— more than 1 hour.
— CHECKINTERVAL is the frequency at which
— Oracle GoldenGate checks for long
— transactions.
WARNLONGTRANS 1H, CHECKINTERVAL 10M

— Capture  changes for all tables in the
— TPCC schema.
TABLE PDB_WEST.TPCC.*;

Capturing the registration SCN is crucial for ensuring a precise and consistent instantiation point during replication setup. To find the start, or registered SCN, run the query below on source Oracle Database and make note of the returned value.

COLUMN CAPTURE_NAME FORMAT a25
COLUMN START_SCN FORMAT 9999999999
SELECT capture_name, start_scn
FROM dba_capture
WHERE capture_name = ‘OGG$CAP_[CDC EXTRACT NAME]’;
Integrated Extract start SCN
Image 5. Integrated Extract start SCN

To achieve a consistent and reliable initial load using Oracle GoldenGate, it’s crucial to identify a valid SCN (System Change Number) from which the Extract can begin. This SCN must be recorded after ensuring no open transactions exist in the source Oracle database.

Why? Because the CDC (Change Data Capture) Extract only tracks transactions that start after its registration. Any open transactions initiated before registration won’t be captured, even if they commit later, leading to potential data loss.

Use the following query to list all open transactions on the source database after registering the Extract:

set linesize 150
set wrap off
COLUMN SQL_TEXT FORMAT a30
COLUMN TSTATUS FORMAT a7
COLUMN USERNAME FORMAT a15
COLUMN OSUSER FORMAT a15
COLUMN SID FORMAT 9999
COLUMN SERIAL# FORMAT 999999
SELECT
    T.START_SCN,
    T.STATUS AS TSTATUS,
    TO_CHAR(T.START_DATE, ‘MM/DD/YY HH24:MI:SS’) AS START_TIME,
    S.SID,
    S.SERIAL#,
    S.INST_ID,
    S.USERNAME,
    S.OSUSER,
    S.STATUS AS SSTATUS,
    TO_CHAR(S.LOGON_TIME, ‘MM/DD/YY HH24:MI:SS’) AS LOGON_TIME,
    S.SQL_ID,
    Q.SQL_TEXT
FROM
    gv$transaction T
JOIN
    gv$session S ON T.ses_addr = S.saddr AND T.inst_id = S.inst_id
LEFT JOIN
    gv$sql Q ON S.SQL_ID = Q.SQL_ID AND S.inst_id = Q.inst_id
UNION ALL
— Fallback row when there are no open transactions
SELECT
    D.CURRENT_SCN,
    ‘CURRENT’,
    TO_CHAR(CURRENT_DATE, ‘MM/DD/YY HH24:MI:SS’) AS START_TIME,
    NULL, NULL, NULL,
    ‘SYS’, NULL, NULL,
    TO_CHAR(CURRENT_DATE, ‘MM/DD/YY HH24:MI:SS’) AS LOGON_TIME,
    NULL, NULL
FROM
    v$database D
ORDER BY 1;
Current open transactions
Image 6. Current open transactions

After running the query to check for open transactions, compare each transaction’s START_SCN with the registration SCN of your CDC Extract. In our case, the Extract was registered at SCN 17658088.

As all active transactions have START_SCN > 17658088, it is safe to pick the most current database SCN value as the starting SCN for your Initial Load Extract. This ensures there’s no data loss between the Initial Load and Integrated Extracts.

If your output shows any open transaction with START_SCN < 17658088, you have two options:

  • Wait until those transactions complete (and disappear from the query).
  • Identify and terminate unwanted or long-running sessions.
Current SCN
Image 7. Current SCN

Create Initial Load Extract

Create the Initial Load Extract, using the SQLPREDICATE option to the TABLE parameter and specifying the SCN selected from the above step.

EXTRACT ILTPCC
USERIDALIAS dc1west DOMAIN OracleGoldenGate
EXTFILE ic, MEGABYTES 2000, PURGE

— Every 60 seconds, report the count of
— records processed since Extract start to
— the process report file.
— Rate reports the number of operations
— per second.
— The report file also contains DELTA
— throughput which is the number of records
— since the last report divided by the time
— since the last report.
REPORTCOUNT EVERY 60 SECONDS, RATE

TABLE PDB_WEST.TPCC.*, SQLPREDICATE “AS OF SCN 18104068”;

Large Table Considerations

For large tables with high row counts, it is recommended to parallelize the Initial Load Extract to improve performance and reduce overall load time. This can be achieved by assigning different tables or schemas to separate Initial Load Extracts.

When dealing with a single large table, you can further enhance efficiency by splitting the table across multiple Extracts using the @RANGE function, enabling parallel processing of different segments of the table:

TABLE [pdb].[schema].[table name] , FILTER (@RANGE ([n], [n])), SQLPREDICATE “AS OF SCN scn”;

For more information on @RANGE function, refer to the GoldenGate 23ai Parameters and Functions Reference Guide.

Start Extracts

Start the Initial Load Extract. When it has completed reading all data from all tables, it will automatically stop.

To start the Integrated Extract, select Start with Options.

Start Integrated Extract with options
Image 8. Integrated Extract start with options

Select After CSN as the start option and enter the SCN used in the Initial Load Extract “AS OF SCN” setting.

Start after CSN
Image 9. Set starting SCN of Integrated Extract

Create Distribution Path

Create and start two Distribution Paths from the source Oracle GoldenGate server to the target Oracle GoldenGate server, as specified in the Oracle GoldenGate documentation, one for the Instantiation Trail data and another for the CDC Trail data.

Create Initial Load Replicat

Before proceeding, it’s important to understand how Oracle GoldenGate handles Initial Load Trail data and how it impacts downstream processing.

The Initial Load Extract reads data directly from the source tables, meaning no transactional metadata (like BEGIN/COMMIT records) is written to the Initial Load Trail files.

Oracle GoldenGate does not use a dedicated Initial Load Replicat. Instead, the CDC Replicats are responsible for applying the data captured during the Initial Load.

Since CDC Replicats rely on transaction boundaries in the trail files to group operations into transactions, Oracle GoldenGate generates a synthetic (fake) transaction ID for the Initial Load data. This ID is derived from: The trail file sequence number, The relative byte address (RBA), The timestamp of the first record in the trail

A synthetic transaction is considered complete under one of the following conditions:

  • End-of-file (EOF) is reached in each trail being processed, or
  • The number of records read by Replicat reaches the value set by the MAXTRANSOPS parameter (default: 10,000,000 records)

Because the entire Initial Load data may be treated as a single large transaction, it’s critical to adjust the Replicat configuration, especially when using Parallel Replicat. If not properly tuned:

  • Replicat may appear to be hung or stalled,
  • No statistics will be reported, and
  • No checkpoints will be recorded until the synthetic transaction is committed.

To avoid these issues, we shall enable the setting SPLIT_TRANS_RECS in our Parallel Replicat. If you are not using Parallel Replicat, review and adjust the MAXTRANSOPS setting.

  • MAXTRANSOPS defines a hard limit on the number of records per transaction.
  • SPLIT_TRANS_RECS is specific to Parallel Replicat and is used to break large transactions into smaller, concurrent chunks, improving checkpointing and performance.
REPLICAT ilcc
USERIDALIAS ggadmin_tpcc DOMAIN OracleGoldenGate

— Set minimum number of Appliers to 1 and
— spawn up to 6 to improve initial load
— performance.
— When everything has been applied, only 1 — Applier will be running, signalling
— initial load is complete.
MIN_APPLY_PARALLELISM 1
MAX_APPLY_PARALLELISM 6

— Every 60 seconds, report the count of
— records processed since Replicat starts
— to the process report file.
— Rate reports the number of operations per
— second.
— The report file also contains DELTA
— throughput which is the number of records
— since the last report divided by the time
— since the last report.
REPORTCOUNT EVERY 60 SECONDS, RATE

— Large transactions should be broken into
— pieces of specified size and applied
— concurrently.
SPLIT_TRANS_RECS 10000
MAP PDB_WEST.TPCC.*, TARGET tpcc.*;

Start the Replicat. Note, the Replicat will not stop when it has completed applying the Initial Load Data. There are a couple of ways to determine if instantiation is complete.

Check Replicat Checkpoint vs. Trail File Size

On the GoldenGate Hub Server, list the trail files used for initial load and compare the last trail sequence number and size with Replicat’s checkpoint sequence number and RBA (from the Web UI).

Compare Extract and Replicat Statistics

Review the Initial Load Extract report to see how many rows were fetched per table. Then, check Replicat statistics to verify the same row counts. Matching numbers indicate that the initial load completed successfully.

If the record counts reported by Initial Load Extract and Replicat match, instantiation is complete. Stop the Initial Load Replicat.

Create CDC Replicat

Once instantiation is complete, create and start a CDC (Change Data Capture) Replicat to apply both queued and real-time changes.

REPLICAT PRCC
USERIDALIAS ggadmin_tpcc DOMAIN OracleGoldenGate

— Every 60 minutes, report the count of
— records processed since Replicat starts
— to the process report file.
— Rate reports the number of operations per second.
— The report file also contains DELTA
— throughput which is the number of records
— since the last report divided by the time
— since the last report.
REPORTCOUNT EVERY 60 MINUTES, RATE

— Large transactions should be broken into
— pieces of specified size and applied
— concurrently.
SPLIT_TRANS_RECS 10000

MAP PDB_WEST.TPCC.*, target tpcc.*;

Post Instantiation Cleanup

We recommend the following to clean up the Oracle GoldenGate installations post instantiation:

  • Delete the Replicat used to apply instantiation data.
  • Stop and delete the Distribution Path used for transmitting instantiation data.
  • Delete the Initial Load Extract.
  • Create a Trail Purge Task in the Administration Service WebUI for each Oracle GoldenGate Instance to ensure disk space is freed when Oracle GoldenGate Trails are consumed.

Conclusion

In this article we presented detailed steps to perform precise instantiation of an MySQL target database from a source Oracle database using Oracle GoldenGate. To learn more about Oracle GoldenGate functionality, you may refer to: