In the previous blog post, I discussed using RMAN for physical instantiation of large databases. In this post, I’ll focus on another approach: using Oracle GoldenGate 23ai Initial Load Extract for instantiation.
This method is particularly useful when:

  • You’re working with a subset of the database, such as a set of small tables.
  • You’re operating in a heterogeneous environment.

Unlike RMAN, GoldenGate Initial Load provides a logical instantiation method, allowing you to fine-tune the process at both the object and record level.
The GoldenGate Initial Load Extract works by reading data directly from the source tables and writing it into an EXTFILE.  The process directly connects to the database and fetches the changes. You can think of it as running an INSERT INTO <extfile> SELECT * FROM <table_name> and exporting the results to an EXTFILE used by GoldenGate.

The diagram below outlines the overall workflow, which I’ll explain step-by-step in the following sections. It highlights two distinct phases: the Initial Load replication (shown in blue) and the continuous online change synchronization (shown in orange).
In summary, you set up (but do not start) continuous replication before running the Initial Load, and then start the continuous replication from the point in time when the Initial Load was initiated.

Let’s step into the details:
Before adding any components, it’s essential to prepare the source tables for GoldenGate Replication. This is done using the ADD TRANDATA or ADD SCHEMATRANDATA command, which enables supplemental logging for the scheduling columns required for replication. I consider this a prerequisite step—typically performed alongside other setup tasks like creating the GoldenGate administrator user and configuring the heartbeat and checkpoint tables.

AdminClient>
CONNECT https://north:9231 DEPLOYMENT north AS ggma PASSWORD *** !

ALTER CREDENTIALSTORE ADD USER ggadmin@dbnorth
  ALIAS ggnorth DOMAIN OracleGoldenGate PASSWORD ***

DBLOGIN USERIDALIAS ggnorth DOMAIN OracleGoldenGate
ADD SCHEMATRANDATA hr
ADD HEARTBEATTABLE


Just like with the source database, you may need to set up some prerequisites on the target database.

AdminClient>
CONNECT https://south:9231 DEPLOYMENT south AS ggma PASSWORD *** !

ALTER CREDENTIALSTORE ADD USER ggadmin@dbsouth
  ALIAS ggsouth DOMAIN OracleGoldenGate PASSWORD ***

DBLOGIN USERIDALIAS ggsouth DOMAIN OracleGoldenGate
ADD HEARTBEATTABLE
ADD CHECKPOINTTABLE ggadmin.ggs_checkpoint


1. Add and Start the Extract for Continuous Replication
Start by adding and launching the Extract process at the source system that will handle ongoing replication.  Using the following simplified Extract parameter file, you can add the Extract using the WEB-UI, AdminClient, or directly use the REST API calls that runs underneath:

EXTRACT extn
USERIDALIAS ggnorth
EXTTRAIL north/ea
TABLE hr.*;

 

AdminClient>
CONNECT https://north:9231 DEPLOYMENT north AS ggma PASSWORD *** !

ADD EXTRACT extn, INTEGRATED TRANLOGS, BEGIN NOW
REGISTER EXTRACT extn, database
ADD EXTTRAIL north/ea, EXTRAXT extn


Keep in mind that Extract only captures complete transactions. Any transaction that began before the Extract was registered will be ignored. Therefore, you must wait for such transactions to either commit or roll back before proceeding.

SELECT t.start_scn, t.status tstatus, t.start_date,
       s.sid ,s.serial#, s.inst_id, s.username, s.osuser,
       s.status sstatus, s.logon_time

FROM gv\$transaction t
INNER JOIN gv$session s
  ON s.saddr = t.ses_addr
UNION ALL
SELECT current_scn, ‘current’, current_date,
    null, null, null, ‘sys’, null, null, null
  FROM v$database
  ORDER BY 1;


BTW:  There is even a REST API call available that you can use to retrieve information of active transactions.

2. Perform the Initial Load
Next, set up the Initial Load Extract. To ensure you capture changes from a precise point in time, you can retrieve the current SCN (System Change Number) from the database and use it with a WHERE ... AS OF SCN predicate in your load query.

SELECT current_scn FROM v$database;
CURRENT_SCN

———–
168298293

The Initial Load Extract Parameter file looks like the following:

EXTRACT     exti

USERIDALIAS ggnorth
EXTFILE     north/il MEGABYTES 2000 PURGE
TABLE       hr.*, SQLPREDICATE “AS OF SCN 168298293“;


Note that the Initial Load uses an EXTFILE—not an EXTTRAIL. You can also control the size of the EXTFILE through parameters in the configuration file. Additionally, the PURGE option allows you to overwrite existing EXTFILE files if they already exist—such as files left behind from a previous, possibly failed, run. A key setting is the SQLPREDICATE, which specifies the exact point in time—using SCN (System Change Number)—from which data should be extracted. When using the AS OF SCN clause, ensure the database retains enough UNDO to support the operation.

Adding and starting the Initial Load Extract is straightforward:

AdminClient>
CONNECT https://north:9231 DEPLOYMENT depl_north AS ggma PASSWORD *** !

DBLOGIN USERIDALIAS ggnorth DOMAIN OracleGoldenGate
ADD   EXTRACT exti SOURCEISTABLE
START EXTRACT exti
INFO ALL, TASKS

 

The SOURCEISTABLE indicates that you are adding an Initial Load Extract. Note that the Initial Load Extract is managed as a task. Therefore, you check the state of the task with INFO ALL, TASKS. Once the task is finished, it automatically stops, and you want to check the statistic section in the report file to proof that all records were written to the EXTFILE.


3. Route Initial Load Data to the Target
The data extracted during the Initial Load is written to an EXTFILE. This file is then processed by a DistPath, which routes the data to the target and writes it into the Trail File. Once this step is complete, the process can be stopped.

AdminClient>
CONNECT https://north:9231 DEPLOYMENT depl_north AS ggma PASSWORD *** !

ADD DISTPATH dpi
   SOURCE trail://north:9232/services/v2/sources?trail=north/il
   TARGET   wss://south:9233/services/v2/targets?trail=north/da !

START DISTPATH dpi


You stop the DistPath one all changes from the Initial Load (visible as INSERT operations) have successfully been routed to the target system. The changes are written into an EXTTRAIL da file. Once you stop the DistPath from the Initial Load, you can seamlessly apply data from the continuous Extract and DistPath with the same EXTTRAIL da file.

4. Start the Extract for Continuous Replication
During the Initial Load Extract process in Oracle GoldenGate, it’s crucial to capture the exact System Change Number (SCN) from which the data was extracted. This is achieved by utilizing the AS OF SCN clause within the SQLPREDICATE, ensuring a consistent snapshot of the source data at a specific point in time. To commence continuous replication seamlessly from this precise point—thereby avoiding data loss or duplication—you initiate the Extract process for continous Replication using the AFTERCSN parameter. This instructs the Extract  to begin processing transactions that occurred after the specified SCN, effectively skipping any transactions that were part of the initial load.

AdminClient>
CONNECT https://north:9231 DEPLOYMENT depl_north AS ggma PASSWORD *** !
START EXTRACT extn
AFTERCSN 168298293 
INFO ALL


5. Add and Start the DistPath for Continuous Replication
Now it’s time to start the DistPath for continuous replication. 

AdminClient>
CONNECT https://north:9231 DEPLOYMENT depl_north AS ggma PASSWORD *** !

ADD DISTPATH dpns
  SOURCE trail://source:9232/services/v2/sources?trail=north/ea
  TARGET   wss://target:9233/services/v2/targets?trail=north/da !

START DISTPATH dpns


6. Add and Start the Replicat Process at the target system.
Finally, start the Replicat process to apply changes from the Trail File to the target database. From Replicat’s perspective, there is only one EXTTRAIL—even though it was initially populated by the Initial Load and subsequently updated with changes from continuous replication.

You may first observe INSERT operations generated by the Initial Load. After that, Replicat will process all ongoing DML operations—INSERTs, UPDATEs, and DELETEs—captured by the continuous replication. Below is a simplified Replicat parameter file along with the commands to add the Replicat.

REPLICAT reps
USERIDALIAS ggsouth DOMAIN OracleGoldenGate
MAP hr.*, TARGET hr.*;

I am choosing a Parallel Replicat, are described as followed:

AdminClient>
CONNECT https://south:9231 DEPLOYMENT depl_south AS ggma PASSWORD *** !

ADD REPLICAT reps,
  PARALLEL,
  EXTTRAIL north/da, CHECKPOINTTABLE ggadmin.ggs_checkpointtable
START REPLICAT reps
INFO ALL



Summary and next steps
The previous section detailed how to use an Initial Load Extract to instantiate the target database. It’s important to note that this approach is a logical instantiation method.

While the example used an environment where GoldenGate was installed locally on the database servers—using a DistPath—a similar approach is also available for mid-tier or hub-based deployments that do not use DistPaths. Also, there would be the option to filter within the DIstPath or Replicat. However, it is more efficient to use the SCN filter within Extract. This avoids writing unnecessary data into the Trail File. 

If your goal is to replicate only a subset of tables, you can specify those tables directly in the Initial Load Extract parameter file. For horizontal and/or vertical partitioning, you can apply filtering through the SQLPREDICATE clause or project only specific columns intended for the target database. Your continuous replication process must align with these selections to ensure consistency.

Additionally, the required tables must already exist in the target database. You are responsible for creating any dependent objects, such as indexes, constraints, and other necessary schema components.

Finally, keep in mind that the Initial Load Extract is not limited to small datasets. It can be effectively used with large databases as well, particularly when parallelized using multiple Initial Load Extracts. This highlights the significant flexibility and capability offered by logical instantiation methods.