Today, we will learn about implementing Slowly Changing Dimensions (SCD) in Oracle Cloud Infrastructure (OCI) Data Integration. A Slowly Changing Dimension (SCD) is a dimension that stores and manages both current and historical data over time in a data warehouse. It is considered and implemented as one of the most critical ETL tasks in tracking the history of dimension records.
OCI Data Integration can be used to define, deploy, and load most types of SCDs. The three broadly used SCD types are:
- Type 1 SCDs – Overwriting: In a Type 1 SCD the new data overwrites the existing data. Thus the existing data is lost as it is not stored anywhere else. This is the default type of dimension you create. You do not need to specify any additional information to create a Type 1 SCD. In Oracle Cloud Infrastructure Data Integration, this can be achieved with the help of “Integration Strategy” as Merge or Overwrite.
- Type 2 SCDs – Creating another dimension record: A Type 2 SCD retains the full history of values. When the value of a chosen attribute changes, the current record is closed. A new record is created with the changed data values and this new record becomes the current record. Each record contains the effective time and expiration time to identify the time period between which the record was active. In OCI Data Integration, this can be achieved with the help of “Integration Strategy” as Merge.
- Type 3 SCDs – Creating a current value field: A Type 3 SCD stores two versions of values for certain selected level attributes. Each record stores the previous value and the current value of the selected attribute. When the value of any of the selected attributes changes, the current value is stored as the old value and the new value becomes the current value.
In this blog, we are going to walk through a complete end-to-end SCD Type 2 implementation.
Pre-Requisites –
- Based on the previous blogs on Oracle Cloud Infrastructure Data Integration and Oracle Documentation, you have set up a Workspace, Projects, Applications, and understood the Creation and Execution of a Data Flow using Integration Task.
- Corresponding Data Assets i.e. sources and targets have been defined in the Workspace.
Let’s Start the Implementation!
There are several ways to implement SCD Type-2. Here, with the help of full-outer and left-outer joins we will identify the new and old records and use conditional statements such as Case/When to implement an SCD Type-2 process. In this example, using a single data flow, we are going to cater to the full and incremental data load to populate the SCD Type -2 table. Below is the list of tables used in the data flow with the structure of the target table and the corresponding mapping with SCD behavior.
List of source and target tables used –
- Employees
- Departments
- Locations
- Countries
- Regions
- Jobs
- Emp_Tgt [Target Table]
The tables used in this example are from the HR schema and can be referred from Oracle livesql
Target Table “Emp_Tgt” Mapping –

DDL for EMP_TGT –
CREATE TABLE EMP_TGT ( ENAME VARCHAR2(100), EMPNO NUMBER(10,0), JOB VARCHAR2(50), MGR NUMBER(4,0), HIREDATE DATE, SAL NUMBER(10,2), COMM NUMBER(7,2), DEPTNO NUMBER(10,0), DEPARTMENT_NAME VARCHAR2(200), STATE_PROVINCE VARCHAR2(200), COUNTRY_NAME VARCHAR2(200), REGION_NAME VARCHAR2(200), SEQ_NO VARCHAR2(500), START_DATE DATE, END_DATE DATE, ACTIVE_FLAG VARCHAR2(1), PRIMARY KEY (SEQ_NO) );
Below is the data flow created for building a Type 2 slowly changing dimension –

With the help of the left outer join and full outer join, we have identified the updated, inserted, and changed records based on the primary key, SCD Type 2 column. Here, the left outer join is used to get only the target data matching with the source along with additional records from the source instead of bringing the complete target data for reprocessing.

With the help of full outer join, we will be comparing the incoming records with the target records on basis of SCD type 2 columns. Here, COMMISSION_PCT and SALARY are the SCD type-2 columns in the target. We have used NVL conditions to handle NULL values coming from the source.
Join Condition -
NVL(JOIN_7_1.EMPLOYEES.EMPLOYEE_ID,-99) = NVL(JOIN_7_2.EMP_TGT.EMPNO,-99)
AND NVL(JOIN_7_1.EMPLOYEES.COMMISSION_PCT,-999) = NVL(JOIN_7_2.EMP_TGT.COMM,-999)
AND NVL(JOIN_7_1.EMPLOYEES.SALARY,-222) = NVL(JOIN_7_2.EMP_TGT.SAL,-222)

In the expression operator i.e. EXPRESSION_1, we have derived the unique ID (generating unique identifier using UUID()), active flag, effective start date, and effective end date.
-
SEQ_NO: CASE WHEN EXPRESSION_1.EMPLOYEES.EMPLOYEE_ID IS NOT NULL AND EXPRESSION_1.EMP_TGT.EMPNO IS NULL THEN UUID() ELSE EXPRESSION_1.EMP_TGT.SEQ_NO END -
ACTIVE_FLAG: CASE WHEN EXPRESSION_1.EMPLOYEES.EMPLOYEE_ID IS NOT NULL AND EXPRESSION_1.EMP_TGT.EMPNO IS NULL THEN 'Y' WHEN EXPRESSION_1.EMPLOYEES.EMPLOYEE_ID IS NOT NULL AND EXPRESSION_1.EMP_TGT.EMPNO IS NOT NULL THEN 'Y' WHEN EXPRESSION_1.EMPLOYEES.EMPLOYEE_ID IS NULL AND EXPRESSION_1.EMP_TGT.EMPNO IS NOT NULL THEN 'N' ELSE 'N' END -
START_DATE: CASE WHEN EXPRESSION_1.EMPLOYEES.EMPLOYEE_ID IS NOT NULL AND EXPRESSION_1.EMP_TGT.EMPNO IS NULL THEN CURRENT_DATE ELSE EXPRESSION_1.EMP_TGT.START_DATE END -
END_DATE: CASE WHEN EXPRESSION_1.EMPLOYEES.EMPLOYEE_ID IS NOT NULL AND EXPRESSION_1.EMP_TGT.EMPNO IS NULL THEN TO_DATE('31-12-9999','dd-MM-yyyy') WHEN EXPRESSION_1.EMPLOYEES.EMPLOYEE_ID IS NULL AND EXPRESSION_1.EMP_TGT.EMPNO IS NOT NULL THEN CURRENT_DATE ELSE EXPRESSION_1.EMP_TGT.END_DATE END

Once the flow is completed, save the data flow, create an Integration Task, publish to an application, and run the corresponding task for implementing SCD type-2.

The 1st run inserts all the records from the source table into the target assuming that there are no records in the target table. In the 1st run, for all the records, ACTIVE_FLAG = ‘Y’, START_DATE = <<current date>>, and END_DATE = ’31-DEC-9999′.

In the 2nd run, changed records coming from the source are identified with the target and SCD type-2 logic is applied to those records. Below are the screenshots which showcase the changes that happened in the target during 2nd run based on the changes in the source.


In the 3rd run – further changes were performed in the source table Employee where salary and commision_pct of certain employees were changed. After a successful run of the Integration Task, new rows got inserted for the changed records.


This concludes an overview of implementing SCD Type-2 with Oracle Cloud Infrastructure Data Integration.
Have you seen some of our other blogs? Check out all the blogs related to Oracle Cloud Infrastructure Data Integration. To learn more, check out some Oracle Cloud Infrastructure Data Integration Tutorials and Documentation.
