Oracle BI Server Modeling, Part 3- Conformed Logical Dimensions
By user741803 on Mar 28, 2011
One of the primary business benefits of the Common Enterprise Information Model (CEIM) is enabling goal-based visibility and performance management across the enterprise. Business managers need to compare actual performance side-by-side with targets, such as actual sales compared to the sales forecast. The ratio of these two metrics– actual sales as a percent of forecast– can be compared to a threshold to create automatic stoplights and alerts. To measure and improve effectiveness, managers also need to compare measures across processes, such as a ratio of product shipments to inventory levels (inventory turns). This post lays out the basic concept for tying data from these far-flung sources together so they can be interactively analyzed as a whole, automatically manufacturing queries and handling the exception cases for the business users.
In Part 1 of this series, I briefly introduced that basic concept: conformed logical dimensions in the business model layer. In Part 2, I reviewed general dimensional concepts as applied to a single measure. In this post, I will describe the dimensionality of the business model as a whole, and how the conformed dimensions enable business performance visibility.
The concept of grain is a key to understanding conformed dimensions. This post shows how individual measures have differing grain within the overall conformed dimensions of the business model, and how grain is presented in the end user’s analytical experience. Later, this concept will be essential to understanding all the posts covering the logical-physical mapping patterns.
Consider a CEIM business model for analyzing a company’s sales. The business model acts like a single cube, with a single set of measures and a single set of dimensions that apply to them. This single, virtual cube abstracts and simplifies a broad set of physical data sources of differing shapes, technologies, and levels of detail. Even though the data may come from many places, the conformed dimensions allow the CEIM developer to wire them together, creating the end user’s experience of a single, easy-to-understand data source. Conformed logical dimensions are the common, consistent basis for comparing and combining measures, no matter where the fact data came from.
An illustration of the Sample Application’s “1 – Sample App” business model is shown below. Each dimension is listed as a row header on the left, and has a hierarchy consisting of a set of entities called levels shown left to right. For simplicity, some Sample Application dimensions and hierarchies are not shown.
Figure 1. Business Model Grain Example
The “grain” of this cube is the combination of the most detailed levels it supports, Day x Product x Office x Sales Rep x Customer, highlighted in red. At query time, measures can always be aggregated to higher levels, such as Year x Brand x Company, which is often referred to as the “query grain.” A query-time operation that artificially creates a finer-grained result than the source data is usually called an “allocation.”
The “total” level is special. This is not an entity the CEIM developer models, but is automatically created with just a single, implicit “total” member instead. When a dimension is not specified in the LSQL for a query, the total is used instead. This is similar to the behavior of a relational star, where all members of a dimension are aggregated in the results if that dimension is not specified in the query. There are also more complex scenarios involving MOLAP data sources.
Each dimension also has a detail level, which contains all the leaf-level data members for the dimension, such as a member for each day. In between, there are optionally more levels, such as the “H6 Customer” dimension’s “Segment” level.
* “H5 Sales Rep” is a parent-child (value) hierarchy. The detail entity’s data members have a recursive parent-child relationship, rather than a separate metadata entity (type) for each level. For the purpose of thinking about grain, consider this to be a single level for now.
Measures of Differing Grain
Each business process has its own grain, and a process often breaks down further into a value chain where each link may have a different grain [Kimball]. Consider the simplified sales orders in the table below, where the “1- Revenue” and “2- Billed Quantity” measures have a grain of Day x Product x Office x Sales Rep x Customer. If you took a look at the same company’s manufacturing process, you would see a somewhat different set of dimensions: time and product would be shared, and plants, suppliers and parts would be added, but offices, sales reps and customers would not be present.
Figure 2. Measure Grain Example
The operational data from a business process, such as measures 1 and 2 above, usually have a very fine grain. This is because the data comes from the transactions of the business operations themselves. In the sales example, an order transaction has a timestamp of the day the order was made, the exact product SKU, the sales office and sales rep responsible for the sale, and the exact customer information down to the ship-to and bill-to addresses.
Information created by the managers of that business process, on the other hand, normally has a much coarser grain. In the sales example, the “8- Quota Amount” and “5- Target Revenue” measures are defined at the month level, not the day. The quota is specific to the sales rep, but the target is for the organization and doesn’t cover the sales rep at all. Despite the grain mismatch, managers and employees need to be able to compare these measures, and even build ratio measures combining the two. After all, comparison of actual performance (measures 1 and 2) to targets, plans, quotas, forecasts and other “predictive” measures is essential to managing the performance of the business process. Comparison is also essential for consuming information, since it provides the context for determining whether a number is good, bad or okay. Even when there is no explicit comparison goal measure created by the managers, the dashboarding best practice is to use a generic comparison, such as the same measure in the same period in the prior year.
For these reasons, the Oracle BI EE business model virtual cube allows its measures to have differing grains, and the BI server automatically handles the resulting boundary conditions in the millions of queries it manufactures. The overall grain of the business model is the combination of the most detailed levels of all of the dimensions and measures in it. However, any typical measure will support only some of the dimensions of the business model, and may support any individual dimension at a coarser level than the finest supported by the overall business model.
To illustrate this concept, let me walk through a set of reports showing this set of measures, one dimension at a time. The product dimension is probably the simplest in the table above. Each of the measures supports it at the “Product Detail” level, except measure “10- Full Time Empl (Beg Period)”. This is clearly visible in the pivot table below:
Figure 3. Report of Measures by Product
As you can see, because all these measures use the same, conformed product dimension, the BI server was able to automatically stitch them together into a single result set, which presentation services displays in a single report. If they came from the same fact table, that would be easy, but they come from four different fact tables in the Sample Application. Even if they came from different source databases, the conformed dimensions would allow them to be stitched into the same report (see Business Model “3 – Sample Federated” in the Sample Application if you want a federated example). It is important to notice that conformance means that these sources can be mapped to a common structure – the same levels – and also the same data members.
Notice that the special “Total Products” level does indeed display the total value aggregating all the data for that measure 10, but the detail levels display null because there is no data stored below that dimension’s total grain.
You may be wondering why there is a null intersection of measure “8- Quota Amount” with the product dimension member “Tungsten E Plasma TV.” This can’t be because of the grain of the measure, since the grain table shows a lowest supported level of “Prod. Detail.” The reason for the null is “sparcity,” meaning there simply is no quota data stored for that product for that measure. This lack of data is because the business did not create a quota for this product.
Next, consider the same measures with respect to the time dimension. Figure 2 shows that measures 1, 2 and 10 have a day grain, while 5 and 8 have a month grain. In the report below, you can see that as I expanded the time dimension down to the week and day levels, the values for the month grain measures automatically become null:
Figure 4. Report of Measures by Time
The nulls in the year 2007 and for 12/31/2010 are due to sparcity, because there were no sales during those periods. The nulls in measure 10 are a different kind of sparcity. This kind of measure is known as a “snapshot” or “balance,” measuring the number of employees at a point in time, rather than the hiring and firing “flow” transactions. The values are valid at a particular day in time, but are only recorded once per month.
Many of the most important business metrics are ratios. Ratios provide most of the measurements of business process effectiveness, efficiency, variances, averages and rates, which are essential to monitoring and improving business execution. For example, sales margins, revenue per employee, inventory turns, shipments growth year over year, 12-month trailing average revenue, customer churn rate, and product option attach rate are all ratios. In most cases, the operands come from different tables or data sources, and the grains of the operands in these derived measures usually mismatch. The CEIM enables you to declare the behavior and mappings for these measures, so the BI server will plan and execute the queries and post-processing automatically.
A simple example is the average selling price of the units sold, “1- Revenue” / “2- Billed Quantity”. It is already populated in the Sample Application as the derived measure, “20 Actual Unit Price:”
Figure 5. Simple Ratio Metric With Matching Grain
These operands happen to come from the same physical fact table in the Sample Application, so the BI server can simply write SQL that does all the work.
However, it is more common for the operands to come from different sources, and to have mismatched grain. The Sample Application’s measure “26 Target Completion (Rev Pct)” is the percentage of the revenue target achieved in the sales process. In the business model, it is defined as a derived measure, the ratio of “1– Revenue” / “5- Target Revenue”. Looking at Figure 2, you can see that in the “H0 Time” dimension, measure 1 has a day grain, while measure 5 has a month grain. In the ratio, the BI server must default to the grain of the coarser operand, as shown in Figure 6 below:
Figure 6. Report of Grain Mismatched Measures by Time
The conformed dimension mappings are essential in the execution of this report, since the operands come from different physical source tables. The BI server can stitch the two result sets together on the conformed business keys, the time dimension member data values. In other words, the 93.5% for year “2008” can only be calculated because the revenue (16,500,000) and target revenue (17,650,000) are both able to conform their physically stored data values to the “2008” value. As the CEIM developer, you will create the mappings from these two physical dimension tables to the conformed logical dimension, so the BI server can automatically write the query of the tables and stitch the results together.
Level Based Measures, Share and Contribution
For some business measures, the calculation requires one of the operand measures to have a fixed grain for one of the dimensions, while it is variable for the other operand. Common examples include market share, and contribution to the overall organizational results. An example of the latter is the contribution each customer or customer segment makes to the attainment of the overall quota for the organization, as illustrated in Figure 7, below:
Figure 7. Report of Level-based Measure and Share or Contribution
To calculate this contribution, the revenue of each member (segment or customer) must be divided by the quota for the entire organization. The BI server enables this with its “level-based measure” feature. In this example, the quota measure is pinned at the “total” level in its definition, so its result for every member row is the same as the pinned total level, even though it has no data physically populated for those levels. That allows the ratio to use this constant total value as its denominator, so Abigail Hardy’s revenue of 43,330 can be calculated as contributing 0.09% to the 50,330,000 quota for the organization. In this case, the grain of the ratio result is the grain of the revenue measure operand.
The power of a level-based measure becomes even clearer when a second dimension is added to the table. Figure 8, below, shows “1-Revenue” compared to a level-based version of itself, “162 Rev (Sum for All Products)”. In the yearly subtotals, you can clearly see that the sum for all products for the year is equal to the level-based value for the year, as expected. This allows you to show the contribution of the product to the total on a year by year basis.
Figure 8. Report of Level-based Measure and Contribution by Two Dimensions
Conforming logical dimensions are one of the fundamental concepts a CEIM developer must apply. They provide the basis for the BI server to automatically create reporting result sets containing measures aggregated from different sets of tables, or even federated from different data sources. They also enable derived measure calculations, which deliver essential performance management metrics and cross-business-process visibility.
Grain is also a central concept in the mappings from the physical layer to the logical layer. Since reports and business measures are often derived from measures that have differing grain, the BI server is designed to handle the missing detail from the coarse sources automatically. While this post presents the ins and outs of grain in the context of the end user’s analytical experience, the grain concept will also be essential to understanding all the posts that cover the structure of the business model and the mappings.
In the following posts, I plan to describe the unique modeling theory for the business model layer, and then begin working through modeling patterns for the various mapping cases.
The examples in this blog series are based on the Oracle BI EE 11g Sample Application, build 825, which includes the DDL and data for samples in relational and multidimensional databases. You can download it from Oracle Technology Network (OTN). Follow the instructions to install at least the “core” functionality. In order to see your LSQL, physical SQL and MDX in nqQuery.log, be sure to enable logging for your user.
The logical columns used in this post can be found in the following places: Subject Area A: All dimension columns, and measures 1, 2, and 5 (in Base Facts), 26 (in Calculated Facts / Simple Calculations), and 162 (in Calculated Facts / Level-based Measures). Subject Area B: measure 8. Subject Area C: measure 10.
Kimball, Ralph, and Ross, Margy. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition), pp. 33, 68. Wiley, 2002. ISBN-10: 0471200247, ISBN-13: 978-0471200246