Friday Aug 08, 2014

Change to Logging for New Web Configuration Utility

Since P6 Reporting Database 2.0 and P6 Analytics 1.0, the ETL process log has been appended to on each ETL run; in 3.2, staretlprocess.log contains all ETL runs. We received customer feedback asking for a separate log for each ETL run, and this was implemented with the web configuration utility added in 3.3. In the new configuration tool you can schedule your ETL runs and view the results and logs from the web configuration utility. The logs are separated out so you can see the log for each individual run, and you also have the option to clean up old logs in your system directory for a given time period.

The existing (non-web-based) configuration utility will still log by appending each run to the same log, if that is how you choose to have your ETL process logged. However, you should choose one method, either the existing configuration utility or the web-based configuration utility, and not use both. The new web-based configuration utility can build up a queue of ETL jobs and avoid collisions of any kind, which is a major advantage for those using multiple data sources: you can queue up multiple runs and it will handle the execution. The existing configuration utility has no queue mechanism for ETL runs; they are controlled by cron jobs or kicked off manually, so collisions are possible there, and you cannot have more than one ETL process running at a time. Also, because of the different logging methods, you should not run ETLs from both methods, since the web configuration utility will not know about the logs for ETL runs kicked off from the STAR installation directory. Each log from the new queue method has a time stamp and a unique name for easy identification.

Increase to Default Values of Codes and UDFs in P6 Analytics

In P6 Analytics 3.3, the default number of UDFs for every subject area (Project, Activity, etc.) and type (Date, Text, etc.) was increased from 20 to 40. This includes the RPD changes. When choosing your UDFs from the configuration utility these changes are automatically reflected in the schema, and no changes are needed in the RPD until you exceed 40. We also increased the default number of Codes from 20 to 50. The same applies as with UDFs: all schema and RPD changes are automatically reflected up to 50. If you add more than 50 Codes or 40 UDFs, the appropriate RPD changes will need to be made to add the additional values, but all schema-side changes are handled by the ETL process.

Monday Jul 28, 2014

Deleted data showing in History?

In the Star ETL process there are two main types of updates: the current data (dimensions and facts) and the historical data (_hd, _hf, history).

The current data is overwritten with every ETL update, which keeps your data as up to date as the last ETL. If you worked on a project, deleted a project, updated tasks, deleted tasks, or scheduled, all of that is reflected. In the historical tables, changes that have been captured will always remain. For example, if you deleted a project, it will still appear in your list of projects on the history side. Once the data is captured it is always in the database, so you truly have a historical perspective.

In P6 Analytics, when viewing historical data you may want to filter this deleted data out. You can do so by adding a code to these values, setting a value of 'Deleted', and then applying that filter. If you truly want to see just the most recent data, the Primavera - Activity subject area may be the subject area you want to report on; the historical subject areas are more for trending and seeing changes over time. In a future blog we will discuss how to add a delete_flag to allow for easier filtering in P6 Analytics. The main takeaway from this entry is that once the ETL is run, the data is captured as is and always available in the database if it needs to be reviewed. There was high demand for being able to see all data, even if deleted, to show the total historical picture of the life of a project.
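
If you'd like a quick way to spot this deleted data outside of OBI, here is a rough sketch that can be run directly against the STAR schema. It assumes w_project_hd holds the historical project rows and that project_object_id and project_name are the relevant columns; check your own table definitions before relying on it.

-- Sketch: projects captured in history but no longer in the current
-- dimension, i.e. likely deleted in P6. Column names are assumptions.
select distinct hd.project_object_id, hd.project_name
from w_project_hd hd
where not exists (select 1 from w_project_d d
                  where d.project_object_id = hd.project_object_id);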

Wednesday Jun 25, 2014

P6 Links appearing as text instead of HTML hyperlinks

With the latest version of the P6 Analytics catalog, the P6 Link fields (the fields used to launch P6 Action Links directly from an Oracle Business Intelligence analysis) appear as plain text rather than as hyperlinks. This is because the default data type for these fields is text instead of HTML. There are a few options for correcting this behavior so the P6 Link fields appear as hyperlinks in analyses. The first option is to change the P6 Link field on an analysis-by-analysis basis; this is very time consuming and not very practical, especially if these fields are used in many analyses. The second option (recommended) is to change the data type to HTML and save this as the system-wide default for the field. This option is also a bit tedious because you have to do it for most of the subject areas, but once it is done, there is nothing else to do.

Follow these steps to change the data type for these fields:

1. Create a new analysis that contains ALL of the P6 Links for a specific Subject Area

2. For each link, choose Column Properties.

3. Select the Data Format tab, and check the "Override Default Data Format" option. Change the "Treat Text as" to HTML from the drop down box.

4. After Step 3 is complete, at the bottom right of the Properties window, there is a button to the left of the "Ok" button called "Save as Default". Click this button, and choose "Save as the system-wide default for <whichever Link you are changing the properties for>".

5. Click "Ok" on the Properties Window to save this setting as the default for this field system wide. Repeat steps 2-5 for the remaining fields in this Subject Area. Now anytime you add this field to an analysis, it should appear as a hyperlink.

6. Repeat Steps 1-5 for each subject area.

Friday Jun 20, 2014

Star Schema Table Descriptions

In the P6 Reporting Database STAR schema, several acronyms appear in the table names. In this blog we will give a high-level description of some of these tables and what the acronyms represent. Each description includes key words you can use to search for more information about the specific data warehousing terminology.


Staging Tables

_DS, STG, _%S 

These tables are used during the ETL process and are the first stop for data being pulled from the P6 Extended Schema. They are overwritten during each ETL run.


Dimensions
_D

These are the foundation dimension tables used with P6 Analytics. They are repopulated during each ETL run. For more information, search on dimensions and facts; at a high level, a dimension contains information about a specific object. For example, the Project dimension (w_project_d) contains information about projects - project name, description, dates, etc.
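
As a minimal sketch of what querying a dimension looks like, the following could be run directly against the STAR schema. The column names project_object_id and project_name are assumptions based on the description above, so verify them against your own w_project_d definition.

-- Sketch: browse the Project dimension (column names assumed).
select project_object_id, project_name
from w_project_d
order by project_name;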


Facts

_F

These _F tables are the foundation fact tables used with P6 Analytics. They are also repopulated during each ETL run. For more information, search on data warehouse facts. An example is w_activity_spread_f, a fact table with activity spread data that contains fields like costs and units. The exception is the fact tables with History in the name: these are NOT repopulated each ETL run. They capture historical records at the pre-defined intervals, and history can be captured at the Project, WBS, or Activity level.
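
Facts are normally reported through their dimensions. As a rough sketch only (not the exact Star column list), the join below assumes the spread fact carries a project_wid matching w_project_d.row_wid, as described in the WID entry later in this blog, and uses a hypothetical cost column named actual_cost.

-- Sketch: total a cost measure by project. actual_cost is a hypothetical column name.
select p.project_name, sum(f.actual_cost) as total_cost
from w_activity_spread_f f
join w_project_d p on p.row_wid = f.project_wid
group by p.project_name;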


Historical Dimensions

_HD

Historical dimensions are our Slowly Changing Dimension (SCD) tables. SCDs capture changes over time, and these tables are NOT repopulated during each ETL run. The historical values are saved and stored to show when each row was valid. For example, in w_project_hd, say you changed the project name: there will be a row representing the original project name along with the period start and finish range for when that was the project name. On the next ETL run a new row is inserted with the new project name and a period start date, so you can determine which record is the most recent as of any given date.
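
To see how such history might be read, here is a minimal sketch that assumes the _HD table carries period start and end columns. The names effective_start_date and effective_end_date are placeholders for illustration, not the actual Star column names, so adjust them to your schema.

-- Sketch: name changes for one project over time (date column names are placeholders).
select project_object_id, project_name, effective_start_date, effective_end_date
from w_project_hd
where project_object_id = 4527
order by effective_start_date;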


Historical Facts

_HF

Historical fact tables were added as part of the Slowly Changing Dimension implementation and store historical fact data in a similar way. Like the _HD tables, these rows are NOT repopulated during each ETL. They also behave like the _HD tables in that a row represents each change captured for that fact row, for example a change to a cost on an activity. These differences are captured during ETL runs and carry similar period start and end dates so you can determine when each value was the current record. _HF and _HD tables are not automatically populated for all project data; projects must be opted in by setting them to Activity level history. Please plan accordingly before doing so and check the P6 Reporting Database Planning and Sizing guide, as this will have a long-term effect on the size of your Star schema.


Internal Usage 

ETL_

Tables prefixed with ETL_ are used for processing events, such as expanding out and populating dynamic codes, or for capturing process-related information. Generally these are not used for reporting, as most serve an internal purpose. Examples are the ETL_PROCESSMASTER table, which captures a row and result for every ETL run, and ETL_PARAMETERS, which captures the settings supplied from the configuration setup. Other tables contain metadata and mappings.
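
For example, a quick look at recent ETL runs can be taken straight from ETL_PROCESSMASTER; the columns below are the same ones used in the monitoring example later in this blog.

select processid, processstartdate, processenddate, processtype
from etl_processmaster
order by processstartdate desc;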

Thursday May 29, 2014

Partitioning Strategies for P6 Reporting Database

Prior to P6 Reporting Database version 3.2 SP1, range partitioning was used, applied only to the history tables. The ranges were defined during installation, and additional ranges would need to be added once your dates entered the final defined range.

As of P6 Reporting Database version 3.2 SP1, interval partitioning was implemented and applied to the existing history tables as well as the Slowly Changing Dimension tables. One of the major advantages of interval partitioning is that there is no more manual addition of ranges: partitions are created automatically for the defined interval when data inserted into the table exceeds the existing partitions. In 3.2 SP1 there are steps on how to update your partitioning; for all versions after 3.2 SP1, interval partitioning is the only partitioning option used. When upgrading it is important to be aware of these changes. Here is a link with more information on partitioning - the types and the advantages.

http://docs.oracle.com/cd/E11882_01/server.112/e25523/partition.htm
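
For illustration only (this is not the actual Star DDL), an interval-partitioned table in Oracle looks something like the sketch below; Oracle automatically creates a new monthly partition the first time a row lands beyond the existing partitions.

-- Sketch of interval partitioning; table and column names are illustrative only.
create table w_sample_history_f (
  project_object_id number,
  snapshot_date     date,
  actual_cost       number
)
partition by range (snapshot_date)
interval (numtoyminterval(1, 'MONTH'))
(partition p_first values less than (to_date('2014-01-01', 'YYYY-MM-DD')));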

Friday May 09, 2014

Indicator type UDFs Part 2

In a previous blog, we covered how to bring indicator type UDFs (User Defined Fields) into the STAR data warehouse by manually adding them as text type UDFs in the staretl.properties file and adding a script into the ETL process. In this post we’ll cover how to use conditional formatting in Oracle Business Intelligence to display these indicator UDFs in analyses.

First, create a new analysis in Oracle Business Intelligence. In this example we're using the Primavera - Activity subject area and a simple selection of two indicator-type Project UDFs, Schedule Status and Overall Status.

Since the indicator UDFs are stored as text in the STAR schema, the initial output from Oracle Business Intelligence is not what we want; we'll need to apply some conditional formatting in order to display the indicators correctly.


To accomplish this, we first need to add some placeholder columns that we can use to display the indicator images. You can add a second occurrence of each indicator column and rename them so you know which column will contain the text and which will be used to display the actual indicator image.

Next, go to the Schedule Status column, click on the menu drop down and select Edit Formula.

We don't want to display the text along with the indicator image, so select the Custom Headings option, then substitute a blank value (' ') for the actual column name in the Column Formula section.

Now we can take care of the conditional formatting. For the Schedule Status column, click on the drop down and select Column Properties.

Then go to the Conditional Format tab, click Add Condition, and select the column you want to base the formatting on, in this case Schedule Status Text.

Select the condition and click OK. From the Edit Format window click on Image.

Select an image to be displayed based on the Schedule Status Text, click OK.

Enter any additional format changes (in this case we've selected center horizontal and vertical alignment for the column), then click OK.

Repeat the conditional formatting steps for the other values of the indicator, and then repeat the process for any other indicators in the analysis, in this example we also selected Overall Status.

Once finished, the only remaining step is to hide the text columns from the analysis so only the indicators will be displayed. From the Results tab, right-click on the heading for the text columns and select Hide Column from the drop down list.

Once the text columns have been hidden you will be left only with the indicator columns and the images that were selected in the conditional formatting steps.

Tuesday Apr 22, 2014

ETL Scheduling in a Multiple Data Source Environment

A multiple data source environment is one where the Star schema is populated by either multiple P6 instances or a single P6 instance that has been split into unique sources of data. You could split a single P6 instance based on specific criteria: maybe you want a group of projects under one EPS to be updated in Star only on Fridays, while another group of projects needs to be updated daily, so you split them into separate ETLs. See previous blogs for more information on filtering and multiple data sources.

In this blog we are going to cover how the ETLs are executed. If you have two data sources, you need to run two separate ETL processes at different times: ETL #1 must run first and complete before ETL #2 can be started. You do NOT want both ETL processes executing at the same time. This can be accomplished with a batch process or another queueing mechanism that makes sure ETL #1 completes before executing ETL #2.
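
If you are rolling your own batch or queueing mechanism, one simple guard is to check ETL_PROCESSMASTER before starting the next run. This sketch assumes a run that is still in progress has no processenddate yet; verify that assumption against your own ETL_PROCESSMASTER rows before relying on it.

-- Sketch: returns 0 when no ETL appears to be in progress
-- (assumes an in-flight run has a NULL processenddate).
select count(*) as running_etls
from etl_processmaster
where processenddate is null;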

If ETLs were run at the same time you could see data issues, because they share staging tables. While the data in the facts and dimensions is contained in rows that are unique to each data source, the staging tables are not. If both ETLs were running at the same time, the staging data could be clobbered, and that clobbered data might then be pulled into the rows for an existing data source.

To help control this problem, a new web configuration utility was created in P6 Reporting Database and P6 Analytics 3.3. Now there is a queuing mechanism to prevent ETLs from running at the same time.

You can set up a separate tab for each ETL and define its schedule. The ETLs will then queue up and be displayed on the home tab, where running and queued ETLs are shown and can also be stopped or removed from the queue. The main takeaway is that in multiple data source environments the ETLs are sequential, not parallel.

Monday Apr 07, 2014

Handling Codes and UDFs in a Multiple Data Source Environment


In a single data source environment codes and UDFs are quite easy: log into the configuration utility, select your codes and UDFs, run your ETL process, and as long as you don't exceed the default number of columns in your RPD the codes and UDFs show up in OBI. In the configuration utility, the list of codes and UDFs presented to you is populated by reading directly from the P6 Extended Schema you provided connection information for. When you make your selections this list is written to your .properties file in the <rdb installation>\res directory. During the ETL process these selections are read from the .properties file and inserted into staging tables and eventually dimension tables in the STAR schema.

One thing to note about staging tables is that there is only one set: in a multiple data source environment the staging tables are shared, meaning they are overwritten during each ETL run. This is the main reason why, in a multiple data source environment, ETL processes cannot be run at the same time; one ETL process has to finish before the next can start. Which leads us to how to handle codes and UDFs in a multiple data source environment. Say you make your code selection for data source 1 and run your ETL process, then make your code selection for data source 2 and run its ETL process. The selection you made for data source 2 is now your final selection; the values and order you chose for data source 1 have been overwritten.

Accommodating multiple data sources for codes and UDFs requires a little coordination. Let's say you have 10 codes from your 1st data source and 10 codes from your 2nd data source that you would like to use in P6 Analytics in OBI. Run the configuration utility for data source 1 and choose your 10 codes using slots 1-10. Go to your \res folder, make a copy of staretl.properties, and save it to a different location (ds1staretl.properties); we're going to come back later and use this staretl.properties file for data source 1. Now for data source 2, log into the configuration utility. Choose your codes; slots 1-10 don't really matter, but choose slots 11-20 as the ones you want to represent this data source in OBI. Make a copy of this staretl.properties and save it in a different location (ds2staretl.properties). You are saving these backups so you can easily rebuild the codes list at a later time if runSubstitution or the configuration utility is executed.

Now you have two properties files that together contain all the codes you are going to use; we just need to combine them into one file. Go to data source 1's properties file and change the lines for codes 11-20 to match codes 11-20 from data source 2's properties file. Find code 11 in data source 2's properties file, copy its three lines, and paste them where code 11 is in data source 1's properties file (or, if there is no code 11 yet, just add it below the other codes). Do this for the rest of the codes from data source 2 until data source 1's properties file contains all the codes you want from data source 2, as sketched below.
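
Purely as a hypothetical illustration (these are NOT the real staretl.properties key names, which vary by version; treat every key below as a made-up placeholder), a merged codes section might end up looking something like this, with slots 1-10 from data source 1 and slots 11-20 pasted in from data source 2:

# slots 1-10: codes chosen for data source 1 (key names are hypothetical placeholders)
activitycode.1=Phase
activitycode.1.id=101
activitycode.1.column=ACTIVITY_CODE_1
# slots 11-20: codes copied in from data source 2's properties file
activitycode.11=Region
activitycode.11.id=845
activitycode.11.column=ACTIVITY_CODE_11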

Now copy the whole codes section from data source 1's properties file and overwrite the codes section in data source 2's. Do this for each type of code (project, activity, and resource). Together this section is the 'master copy' of your codes. Do not run the configuration utility again on either data source 1 or data source 2 after you have put this list together, or it will overwrite it; if the 'master copy' is overwritten you will need to coordinate and create it again. So be careful about when and where the configuration utility is run after this point.

If you have the same codes in data source 1 and data source 2, this process is much easier: you only need to make the 'master copy' in one of the data sources and then add it to the other properties files. If the data sources are truly unique and some codes don't exist in one of the other data sources, you must follow the steps above. Do not worry that when the ETL process runs for data source 1 it might not have the codes for data source 2 that are now defined in the properties file; this simply adds them to the staging tables and mappings for OBI to use with the data associated with data source 2. When you add such a code alongside data from data source 1 it will not return any results, just like any other project or activity that is not associated with a chosen code.



Wednesday Mar 26, 2014

Updated Burn Down Whitepapers

An effort was made to explain in detail how information flows from P6 into P6 Analytics, specifically the Primavera - Burn Down Subject Area within Oracle Business Intelligence. The Primavera - Burn Down subject area captures data at a specific point in time (user defined), then uses that point in time capture to compare how a project should progress over time if everything goes "according to plan".  P6 Analytics has the ability to capture which activities are progressing according to plan and which ones are ahead or behind schedule by comparison to the original point in time capture of the project. The Primavera - Burn Down Subject Area captures units and costs, as well as activity status counts (Not Started, In Progress, Completed, etc...).

For details on the exact measures that P6 Analytics displays in Oracle Business Intelligence, and how those measures are calculated and stored, we have created three new whitepapers. The three whitepapers detail how units and activity counts are calculated and stored on a daily basis. In addition, we have a whitepaper that details how the overall flow of data moves through the data capture process (ETL), and what the expected results will be in Oracle Business Intelligence within the Primavera - Burn Down Subject Area of P6 Analytics.

To download the three New whitepapers, follow the links below:

Primavera - Burn Down Counts Details

Primavera - Burn Down Units Details

Primavera - Burn Down Data Flow Details

Monday Mar 17, 2014

What's a WID? Where is it used?

In the Primavera PMDB tables, most primary key columns are identified by the specific table's _ID, for example task_id on TASK and proj_id on PROJECT; this identifies the unique row. In the early versions of our STAR schema we had object_ids as well: W_PROJECT_D (the Project dimension) had project_object_id, W_ACTIVITY_D (the Activity dimension) had activity_object_id, and fact tables had a combination of all the associated object ids - project, task, etc. The object ids still exist, but with the introduction of multiple data sources in P6 Reporting Database 3.0, an object_id was no longer unique enough. Take the Project dimension for example: project_object_id is unique within a single data source, but once you have a second data source that rule no longer holds - you could have a project_object_id of 4527 in data source 1 and in data source 2. To make sure this scenario never occurs, row_wids were added. The row_wid is calculated from the specific object id multiplied by 100 plus the data source id, so if this project were from data source 2 its row_wid would be 452702. This allows for up to 100 data sources, although that many is unlikely and not recommended. The row_wid is then used to join other tables: the row_wid from w_activity_d is referenced as activity_wid in other tables, and similarly for project_wid and eps_wid.
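
As a quick sanity check of that formula, the arithmetic from the example above can be verified directly in SQL (4527 is the example project_object_id and 2 is its data source id):

-- row_wid = object_id * 100 + data source id
select 4527 * 100 + 2 as row_wid from dual;
-- returns 452702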



The STAR schema is not the only place these wids are used; they are also used in the RPD for joining between dimensions and facts. In a multiple data source environment the wids make sure the data being returned in OBI is correct and unique to each data source. If you are doing manual manipulation of the RPD, be aware that the wids may be the fields you want to join on. Consider your environment, its needs, and its future: if you foresee expanding to more than one data source, plan ahead.



Monday Feb 24, 2014

Before you upgrade...what to expect (P6 Reporting Database)

If you are upgrading to P6 Reporting Database 3.1 or higher, you'll want to allow adequate time for the upgrade process. In this blog we will focus on upgrading the Star schema. The first major item to be aware of is that an ETL process is initiated during the upgrade. If you have a job scheduled to run your ETL process at a given time, make sure you allow enough time for the upgrade to finish, or cancel the job until after the upgrade has completed. Having two ETLs running at the same time, or overlapping at any point, could be problematic for your STAR schema and the integrity of its data.

Generally, running an upgrade takes longer than a traditional daily ETL. During the upgrade process a set of BCK (backup) tables is created: backups are taken of all history (_history_f) and slowly changing dimension (_hd and _hf) tables, as well as the ETL_CALCULATIONS table that contains the metadata calculations. The schema is recreated and the data is then reinserted into the new tables. After the backups are complete, the ETL process is run to populate the remaining dimension and fact tables.
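
If you want to confirm the backup tables exist during or after this step, a data dictionary query along these lines can help; the %BCK% pattern is an assumption based on the naming described above, so adjust it to match what your upgrade actually creates.

-- Sketch: list backup tables (the BCK naming pattern is an assumption).
select table_name
from user_tables
where table_name like '%BCK%'
order by table_name;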

Before running an upgrade, make sure to back up your STAR schema. You'll want to make sure this data, especially your history, is preserved in case of any kind of failure during the upgrade.

Wednesday Feb 19, 2014

P6 Extended Schema - advanced techniques and additional information

With the P6 Extended Schema being the lifeline of data for the P6 Reporting Database and P6 Analytics, it is a good idea to understand how it works and some techniques that can make your usage of P6 Analytics easier and more efficient. I want to call your attention to a recently released white paper that contains some advanced techniques and new information on the P6 Extended Schema.

http://www.oracle.com/webfolder/technetwork/tutorials/primavera/OnlineLearning/WhitePapers/P6_Extended_Schema_Advanced_Techniques.pdf

Highlights include:

Top Five Things You Need to Know about the P6 Extended Schema
Starting Over
Publishing too often
Dependencies
I Want to Build My Own Queue
and more...

Wednesday Jan 22, 2014

Report on ETL Status of a Multiple Datasource Environment

Do you have a multiple data source Reporting Database environment? In a scenario where you have multiple data sources and multiple ETL processes running, you may want to keep track of those ETL runs in an automated fashion. In an earlier blog we discussed how to set up your own ETL monitoring system; in this blog we will discuss how to use OBI to monitor your ETL runs.

Currently we have an ETL_PROCESSMASTER table in the STAR schema that tracks every ETL run - the start time, the finish time, and whether it was successful. However, there is no column for the data source id. You can get this information by calling a function in the STAR schema, get_source_db_id.

If you would like an OBI report to show this information, here is a way to do it. First, let's talk about how this can be accomplished in OBI. The data in ETL_PROCESSMASTER is not exposed in any of the subject areas because it is not intended for general user consumption; the ETL_ tables in STAR are for internal tracking and usage during the ETL process. While the table can hold useful information, its main intention is not exposure in the subject areas of P6 Analytics, where the focus is on your schedule and resource data, not the internal workings of the ETL process.

But a report can be created outside of the subject areas by using OBI's ability to query a data source directly. In this type of analysis you do not use any of the subject areas; you use an OBI direct database request and formulate your own SQL. Once the SQL has been written, this analysis can be treated like any other, and an administrator could use it to oversee the ETL process for all of the organization's data sources.

We need to add a new column to the ETL_PROCESSMASTER table; we will call it DSID. You can either run this SQL before running the ETL process:

ALTER TABLE etl_processmaster ADD (dsid   number);
commit;

Or another way is to add a script to the USER_SCRIPTS folder (located in the \star\scripts directory). All contents of the USER_SCRIPTS folder are executed after the ETL process has completed.

Create a script called addcolumn.sql with the following contents:

DECLARE
  v_new_col1 number := 0;
BEGIN
  Select count(*) into v_new_col1 from user_tab_cols where column_name = 'DSID' and table_name = 'ETL_PROCESSMASTER';
  if (v_new_col1 = 0) then
      execute immediate 'ALTER TABLE etl_processmaster ADD (DSID number)';
  end if;
  commit;
END;
/
EXIT


This script checks whether the DSID column exists; if it does not, the column is added, and if it does, the step is skipped. Now you have the new column. Next we need to populate the data source id for the last (most recent) ETL run.

In the USER_SCRIPTS folder add another new script called datasource.sql:

set serveroutput on size 100000
-- update datasourceid 
declare
  v_src_id number;
BEGIN
  v_src_id := get_source_db_id;

update ETL_PROCESSMASTER set dsid = v_src_id where processstartdate =  (select max(processstartdate) from etl_processmaster); 
commit;
end;
/
Exit


This script finds the data source id of this ETL run and updates the new DSID column on the most recent ETL run row. Now you have all the data in place; you just need a simple query to extract it, such as:

select processid, processstartdate, processenddate, processtype, DSID from etl_processmaster;

Thursday Jan 09, 2014

How to find resources without any assignments?

In P6 Analytics, if you create an analysis and use a value from a dimension - the Resource dimension, for example - you will see all resources regardless of whether they have any assignments. However, when you join to another dimension or a fact, you may no longer see a resource in your analysis if it doesn't have at least one fact row. In the OBI RPD, joins are constructed around the facts; dimensions are joined through the fact tables (see the RPD Physical Layer diagram).

For this reason, if you are looking for resources without assignments you may need another method to view this data. The data is all there; one of the great things about the P6 Reporting Database is that the data exists, we just need to access it. This is true for many areas - slowly changing dimensions, history, etc.

In OBI you can use a direct database request to query the STAR schema directly and get your desired result.
(New > Create Direct Database Request) and choose:

Connection Pool: Oracle Primavera P6 Data Warehouse Connection Pool 

Add the following SQL: 

select r.resource_name, nvl(count(distinct ra.activity_object_id),0) num_assignments
from w_resource_d r left outer join w_resource_assignment_d ra
on r.resource_object_id = ra.resource_object_id
where r.resource_name is not null
group by r.resource_name
order by 1;

This will give you an output of all resources and the number of assignments that occur in the STAR schema.  You can add additional filtering or ordering to satisfy your requirements.