Using OCI GoldenGate for Snowflake Initial Load and Real-time Data Sync

April 4, 2023 | 12 minute read
Deniz Sendil
Senior Principal Product Manager
Text Size 100%:

Oracle Cloud Infrastructure GoldenGate is a fully managed, native cloud service that moves data in real-time, at scale. On top of core GoldenGate for Big Data capabilities, it brings additional advantages of a fully managed service like auto-scaling, flexible/ data throughput based licensing, improved user experience. 

OCI GoldenGate now supports Snowflake as target. You can use OCI GoldenGate for running initial load into Snowflake and for syncing cdc replication with initial load. In this blog, I'll explain how to configure the initial load and sync it with cdc replication. 

Before starting, we need to make sure that target Snowflake tables are created before instantiating the initial load. For this blog, I'm using a schema called SRC_OCIGGLL in my source Oracle Database and a schema called SRCMIRROR_OCIGGLL in target Snowflake. I'll use two different OCI GoldenGate deployment types: OCI GoldenGate Oracle (for capturing data from Oracle Database) and OCI GoldenGate Big Data (for replicating data into Snowflake). Also, I'll use two different connections: Oracle Autonomous Database (as my source) and Snowflake (as target).

You can check Create Deployments document if you don't have a deployment yet and Create Connections document if you don't have a connection.

A few details about Snowflake connection:

  • OCI GoldenGate supports both user/ password authentication and key pair authentication for Snowflake. You'll need to provide different details based on the auth type selected.
  • You can add &CLIENT_SESSION_KEEP_ALIVE=true to your Snowflake JDBC URL. CLIENT_SESSION_KEEP_ALIVE helps with keeping the session active as long as the connection is active. This is especially helpful for testing. If there are no activities, Snowflake disconnects the client and you may see an error message like "Authentication token has expired.  The user must authenticate again.". Adding &CLIENT_SESSION_KEEP_ALIVE=true to your Snowflake JDBC URL will solve the problem.

Once you create the tables in Snowflake, deployments and connections; you can continue with the following steps:

  1. Create a change data capture extract (from source Oracle Database).
  2. Get SCN from the source database.
  3. Create an initial load extract (from source Oracle Database).
  4. Create a distribution path for cdc extract.
  5. Create a distribution path for initial load extract.
  6. Create a replicat for the initial load (targeting Snowflake).
  7. Create a replcat for the change data capture replication (targeting Snowflake).

Let's start!

Create a CDC Extract

We're going to configure a cdc extract in the OCI GoldenGate Deployment. You can check Create Deployments document if you don't have a deployment yet and Create Connections document if you don't have a connection for your source Oracle Database.

Click Add Extract and select Integrated Extract.

OCI GoldenGate

OCI GoldenGate

Provide the Extract Options

OCI GoldenGate

In the Parameter File provide source table selection with Table and click Create & Run

OCI GoldenGate

Get SCN

As explained in Alex Lima's great blog post, we need to check for the SCN of the initial position of the open long transaction before creating the initial load extract. If no existing transaction exists, we use the CURRENT SCN.  Log in to the source database and run the following SQL:

***

-- Query for active transactions
--
Select T.START_SCN, T.STATUS TSTATUS, T.START_DATE,
       S.SID, S.SERIAL#, S.INST_ID, S.USERNAME, S.OSUSER, S.STATUS SSTATUS, S.LOGON_TIME
  From gv$transaction T
 Inner
 Join gv$session S
 on S.SADDR = T.SES_ADDR
Union All
--
-- Query for current status
--
Select current_scn, 'CURRENT', CURRENT_DATE,
       NULL, NULL, NULL, 'SYS', NULL, NULL, NULL
 from v$database
Order by 1;

***

Note: Capture the ACTIVE SCN from the TSTATUS column. If there is no ACTIVE SCN in the results, capture the CURRENT SCN from TSTATUS.

OCI GoldenGate

We will use the current CURRENT START SCN 39265611731373.

Create an Initial Load Extract as of SCN 

Click Add Extract and select Initial Load Extract.

OCI GoldenGate

 

OCI GoldenGate

Provide a name for the extract and click Next.

OCI GoldenGate

Provide the parameter file details and click Create & Run. You can get the useridalias & domain details from Configuration section.

Sample Parameter File:

EXTRACT InitLoad
EXTFILE IN
USERIDALIAS ATP2KFK DOMAIN OracleGoldenGate
TABLE SRC_OCIGGLL.*; SQLPREDICATE "AS OF SCN 39265611731373";

OCI GoldenGate

Click on Initial Load Extract Details and select report. You will the statistics of the initial load extract process.

OCI GoldenGate

OCI GoldenGate

Creating Distribution Paths

As we have two different extracts creating two different files, we need to create two different distribution paths to send these files into our target OCI GoldenGate Big Data Deployment. In Distribution Paths, we're going to use WSS protocol which will create a secure communication channel. We need Domain and Alias to set up wss protocol.

In your target GoldenGate for Big Data console, go to Administrator and click Add New User.

OCI GoldenGate

Provide Username, select Role as Operator, select Type as Password, provide your Password and click Submit. 

OCI GoldenGate

Now, go back to your source GoldenGate deployment. Go to Configuration and click Add Credential

Provide Credential Domain and Credential Alias. For User ID & Password, provide the Username and Password that you defined in the previous step. Click Submit

OCI GoldenGate

OCI GoldenGate

Still in the same OCI GoldenGate for Oracle deployment, click on Distribution Service and Add Path.

OCI GoldenGate

Provide a Path Name. In the Trail Name field, provide the EXTFILE name that you used in your initial load extract. IN for my case. This is good enough, you don't need to select Source, please leave empty. Select Target Authentication Method as UserID Alias. Select Target Protocol as WSS. For Target, provide the Console URL of target OCI GoldenGate Big Data deployment without https. Port is 443. Trail Name is the name of the file that will be created in target deployment. I will provide as IN. Provide Domain & Alias that you configured in the previous step (distribution, distribution). Scroll down and click Create and Run

OCI GoldenGate

When your Distribution Service starts successfully, it will be shown in a running state in the console.

OCI GoldenGate

Following the same steps, you can create a second distribution path for the file created by CDC extract.

Creating Initial Load Replicat

Navigate to your target OCI GoldenGate Big Data Deployment console and click Receiver Service. You'll see both distribution processes that you created in the previous section. These files are in our target deployment now and we can use them for creating replicats. 

For Snowflake, I need to create a Snowflake Connection and assign to the OCI GoldenGate Big Data deployment. If you need to create a Snowflake connection, you can follow the steps in this document

In OCI GoldenGate Big Data Deployment console, click Add Replicat, select Classic Replicat and click Next

Next, provide Replicat Options. Trail Name is the name of the trail file that is used for replication. I will provide IN as the file I created with initial load extract. Also, you need to select Target as Snowflake. Once selected, Available Aliases will be listed, and you will need to choose an alias. Once an alias is selected, the Replicat will be using the Credential details from the Connection that is selected. If you're using external staging, click Enable External Stage and provide external staging details. I'm using Snowflake internal staging. 

OCI GoldenGate

In the Parameter File, you can either specify source to target mapping or leave it as-is with a wildcard selection. I'll run the replicat for a single table and I'll use as MAP SRC_OCIGGLL.SRC_CUSTOMER, TARGET SRCMIRROR_OCIGGLL.SRC_CUSTOMER;

For the initial load process, we need to set INSERTALLRECORDS parameter. Using this property directs the replicat process to use bulk insert operations to load operation data into the target table and will increase the initial load performance. If you dont use INSERTALLRECORDS in initial load, you may encounter performance issues. 

Also, If Coordinated Replicat is selected as the Replicat Type, an additional parameter needs to be provided: TARGETDB LIBFILE libggjava.so SET property=/u02/Deployment/etc/conf/ogg/your_replicat_name.properties. I'm not adding this setting as I'm using Classic Replicat.

OCI GoldenGate

Last step is the Properties File configuration. In this screen, you will notice that some of the properties are pre-configured and populated. These are the basic and bare minimum properties needed for Snowflake replication. We don't need to make any changes here. 

Click Create and Run.

OCI GoldenGate

When replicat starts successfully, you'll see it in a running state. If you go to replicat details/ statistics, you can see the replication statistics. 

OCI GoldenGate

Creating CDC Replicat

You can create the CDC replicat with the trail file that was generated by the CDC Extract that you created in the first step. You just need to CREATE the replicat, please DO NOT RUN it. 

In OCI GoldenGate Big Data Deployment console, click Add Replicat, select Classic Replicat and click Next

Next, provide Replicat Options. Trail Name is the name of the trail file that is used for replication. I will provide lk as the file I created with cdc extract. Also, you need to select Target as Snowflake. Once selected, Available Aliases will be listed, and you will need to choose an alias. Once an alias is selected, the Replicat will be using the Credential details from the Connection that is selected. If you're using external staging, click Enable External Stage and provide external staging details. I'm using Snowflake internal staging. 

OCI GoldenGate

In the Parameter File, you can either specify source to target mapping or leave it as-is with a wildcard selection. I'll run the replicat for a single table and I'll use as MAP SRC_OCIGGLL.SRC_CUSTOMER, TARGET SRCMIRROR_OCIGGLL.SRC_CUSTOMER;

OCI GoldenGate

Last step is the Properties File configuration. In this screen, you will notice that some of the properties are pre-configured and populated. These are the basic and bare minimum properties needed for Snowflake replication. We don't need to make any changes here. 

Click Create. Do not click Create and Run.

OCI GoldenGate

OCI GoldenGate

On CDC replicat, click on Action and select Start with Options

OCI GoldenGate

Select Start Point as After CSN and provide the SCN that you noted in Get SCN section. Click Start.

OCI GoldenGate

CDC Replicat is up & running now.

OCI GoldenGate

OCI GoldenGate

 

 

 

 

 

 

 

 

 

 

Deniz Sendil

Senior Principal Product Manager

Deniz Sendil is a Senior Principle Product Manager in Oracle GoldenGate product group with a focus on GoldenGate for Big Data. Deniz has over 17 years of experience in data management, analytics and data integration domains. 


Previous Post

Executing and Scheduling Data Science Notebooks in OCI

David Allan | 8 min read

Next Post


Executing and Scheduling Data Science Notebooks in OCI via Data Science

David Allan | 7 min read