Wednesday Sep 26, 2012

How to automate a monitoring system for ETL runs

Upon completion of the Primavera ETL process there are a few ways to determine if the process finished successfully.  First, in the <installation directory>\log folder,  there is a staretlprocess.log and staretl.html files. These files will give the output results of the ETL run. The staretl.html file will give a detailed summary of each step of the process, its run time, and its status. The .log file, based on the logging level set in the Configuration tool, can give extensive information about the ETL process. The log file can be used as a validation for process completion. 

To automate the monitoring of these log files, perform the following steps:

1. Write a custom application to parse through the log file and search for [ERROR] . In most cases,  a major [ERROR] could cause the ETL process to fail. Searching the log and finding this value is worthy of an alert.

2. Determine the total number of steps in the ETL process, and validate that the log file recorded and entry for the final step.  For example validate that your log file contains an entry for Step 39/39 (could be different based on the version you are running). If there is no Step 39/39, then either the process is taking longer than expected or it didn't make it to the end.  Either way this would be a good cause for an alert.

3. Check the last line in the log file. The last line of the log file should contain an indication that the ETL run completed successfully. For example, the last line of a log file will say (results could be different based on Reporting Database versions):   [INFO] (Message) Finished Writing Report

4. You could write an Ant script to execute the ETL process and have it set to - failonerror="true" - and from there send results to an external tool to monitor the jobs, send to email, or send to database.

With each ETL run, the log file appends to the existing log file by default. Because of this behavior, I would recommend renaming the existing log files before running a new ETL process. By doing this,  only log entries for the currently running ETL process is recorded in the new log files. Based on these log entries, alerts can be setup to notify the administrator or DBA.

Another way to determine if the ETL process has completed successfully is to monitor the etl_processmaster table.  Depending on the Reporting Database version this could be in the Stage or Star databases. As of Reporting Database 2.2 and higher this would be in the Star database.  The etl_processmaster table records entries for the ETL run along with a Start and Finish time.  If the ETl process has failed the Finish date should be null. This table can be queried at a time when ETL process is expected to be finished and if null send an alert.  These are just some options. There are additional ways this can be accomplished based around these two areas - log files or database.

Here is an additional query to gather more information about your ETL run (connect as Staruser):

,duration duration from ( select (e.endtime - b.starttime) * 1440 duration, to_char(b.starttime, 'hh24:mi:ss') starttime, to_char(e.endtime, 'hh24:mi:ss') endtime, 
b.PROCESSNAME, instr(b.PROCESSNAME, ']') loc, b.infotype test_script from ( select processid, infodate starttime, PROCESSNAME, INFOMSG, INFOTYPE from etl_processinfo 
where processid = (select max(PROCESSID) from etl_processinfo) and infotype = 'BEGIN' ) b 
inner Join ( select processid, infodate endtime, PROCESSNAME, INFOMSG, INFOTYPE from etl_processinfo 
where processid = (select max(PROCESSID) from etl_processinfo) and infotype = 'END' ) e on b.processid = e.processid 
and b.PROCESSNAME = e.PROCESSNAME order by b.starttime)

Wednesday Sep 19, 2012

Controlling what data populates STAR

Beginning with the Primavera Reporting Database 2.2\P6 Analytics 1.2 release, the first release that supported the P6 Extended Schema, a new ability was added to filter which projects could be included during an ETL run. In previous releases, all projects were included in an ETL run. Additionally, all projects with the option to enable publication are included in the ETL run by default.

Because the reporting needs for P6 Extended Schema are different from those of STAR, you can define a filter that will limit the data that is included in the STAR schema. For example, your STAR schema can be filter to only include all projects in a specific Portfolio, or all projects with a project code assignment of 'For Analytics.'  Any criteria that can be defined in a Where clause and added to a view can be used to filter the projects included in the STAR schema. I highly suggest this approach when dealing with large databases. Unnecessary projects could cause the Extract portion of the ETL process to take longer. A table in STAR called etl_projectlist is the key for what projects are targeted during the ETL process.

To setup the filter, perform the following steps:

1. Connect to your Primavera P6 Project Management Database as Pxrptuser (extended schema owner) and create a new view:

create or replace view star_project_view
select PROJECTOBJECTID objectid
from projectportfolio pp, projectprojectportfolio ppp
where pp.objectid = ppp.PROJECTPORTFOLIOOBJECTID
and = 'STAR Projects'

--The main field that MUST be selected in the view is the projectobjectid. Selecting any other field besides the projectobjectid will cause the view to be invalid and will not work. Any Where clause can be used, but projectobjectid is the key.

2. In your STAR installation directory go the \res folder and edit the file.  Here you will define the view to be used.  Add the following line or update if exists:


3. When running the  staretl.cmd or process the database link to Pxrtpuser will be accessed and this view will be used to populate the etl_projectlist table  with the appropriate projectobjectids as defined in the view created in step 1 above.

As of Primavera Analytics 15.1 Unifier was added as a possible source.  This same filtering criteria based on projectobjectid through a view still applies and would need to be applied to the Unifier schema and the file so it can be used by the ETL process,  just like it does for a P6 source.


Provide new information on Primavera Analytics and Data Warehouse


September 2012 »