Moving History From One Data Source to a New One

February 13, 2020 | 4 minute read
Text Size 100%:

If you come to a situation where you have an existing datasource that has been running for a while and accumulating history and you need to break it up because it has grown too large, or your projects require different ETL times one thing you can do is use an ETL filter to break up the projects that go into each datasource and datasource 1 which has all the projects currently can get a filter for the projects to run in this ETL (and now filter out the projects you want to go into a different datasource and ETL run time - lets call it datasource 4).  Filtering out datasource 1 is done.  You have created a new staretl and installation for datasource 4 and have run the ETL with this new filter.  Your projects are now separate and you have new ETL run times, great... but what about the history you had captured for the datasource 4 projects that still exists in datasource 1.  You don't want to lose that.  If you are an OnPrem customer and have access to the database level you can move and adjust this

It is recommended to make a backup of your database before moving forward. 

There are a few items you need to be aware of:
    1- what are my history tables? 
        You need to find all your slowly changing dimension tables (tables that end with _HD)
        select * from all_tables
        where owner = 'STARUSER'
        and table_name like '%_HD'

        You need to find your historical snapshot fact tables (tables that contain %_hist%)
        select * from all_tables
        where owner = 'STARUSER'
        and table_name like '%HIST%'
        You need to find your historical non snapshot fact table (tables that end in _HF)
        select * from all_tables
        where owner = 'STARUSER'
        and table_name like '%_HF'

    2- WID's
        WIDs are unique identifiers, generated from a sequence.  
        Some are calculated -ex. eps_object_id *100 + (datasource_id - 1)

    3- Sequences
        Connect to your STARUSER in SQL Developer or a similar tool and look at the Sequences section.  You will see a Sequence for each of these tables. 
        You want to make sure the Sequence number is greater than the max ROW_WID for the data you will be reinserting later on.  (Tables to be investigated are the same as in item #1 above).
Next,  create temporary tables with the contents of those projects for each of the HD, HF, and History Fact tables. (Ex.  Create table TEMP_W_PROJECT_HISTORY_F as SELECT * FROM W_PROJECT_HISTORY_F where datasource_id =1 and project_object_id in (range of project object id's for new datasource 4).

Create a backup table of this temp table incase you run into any issues during editing of the data.  (Ex. Create table BCK_W_PROJECT_HISTORY_F as SELECT * FROM TEMP_W_PROJECT_HISTORY_F);
Now that you have your data backed up and a copy you can work with you begin adjusting the data. 

**You only need to create backups and temp tables for the data you are using.  If you are not using Unifier as a datasource you do not need to do these tables.  You can check to see if those tables are empty and then ignore them if so.

In the TEMP_ copy of the table you will need to adjust

    1- Datasource_id.  

         Change the datasource_id to be the new datasource_id (example datasource_id = 4)

     2- WID's
                  For most of the tables the ROW_WID comes from the sequence.  For other WID's,  for example EPS_WID, it is a calcuated formula (eps_object_id * 100 + (datasource_id -1)).  

For the example of EPS_WID you need to change final # in the EPS_WID to match your new datasource.  For this example if the EPS_WID in W_PROJECT_HISTORY_F on the data I am moving to datasource_id 4 is 35800 I would need to change this to be EPS_WID + 3 (Datasource id 4 -1).  New value would be 35803.  EPS_WID occurs in W_PROJECT_HISTORY_F and other Historical fact tables.   In those tables most columns (Activity_Wid, WBS_Wid, Project_Wid, etc. come from Sequence and wouldn't need to be adjusted). 

Day_Wid would never need to be adjusted.
For HD tables,  the ROW_WID comes from Sequence.  Most HDs don't include other WIDs, except for ones like W_PROJECT_HD which again includes the EPS_WID which would need adjusting as described in the History tables above.

Because there may be different tables based on the version you are on it is best to run the above queries to find the tables that are present for the version you are on to make sure none are         missed and to do the analysis on which WID's would need updating.  The above pattern, EPS_WID,  should help you detect and validate which WIDs would need adjusting.  

Now delete this data that is targeted for datasource 4 from those tables in your HD, HF, and Historical Fact tables where it is included as part of datasource 1.
Once you have adjusted the data in the temp table to reflect your new datasource_id you can reinsert it back into the corresponding HD, HF, or Historical Fact table.

In ETL_PARAMETER table you may have options for datasource 1 that you would want reflected in datasource 4, such options as disabling Relationships or disabling Resource Limit data captures. You can view these settings from the Analytics Admin App or you can look and compare the settings from datasource 1 to datasource 4. 

Once the data is back into the STAR schema for the new datasource run an ETL for datasource 4 to make sure all runs smoothly and security is populated and any new captures are created.   From here you can log into OBI to make sure you can see the data under the proper datasource.  

Moving history from one datasource into a new datasource is a somewhat extensive process where you will need to check many tables, and go table by table to determine which WIDs need adjusting, which ones are ok, and which need sequence adjusting but these are the main areas to focus on to allow you to do so and begin a fresh datasource with the history you have accumulated in the original datasource.

Jeff Mcdaniel

Previous Post

How Technology Sparks Storytelling in Architecture

Corie Cheeseman | 4 min read

Next Post

How technology is helping global industrial manufacturers drive significant savings

Janet Poses | 4 min read