Oracle BI Server Modeling 2- Dimensional Schema Shapes
By user741803 on Mar 24, 2011
A past marketing tag line for Oracle BI EE was, “An information modeling approach.” In my experience, thinking in terms of information modeling rather than just SQL is one of the keys to developing the Common Enterprise Information Model (CEIM) successfully. When I’m designing a query factory, I need to think in terms of the models provided in the data sources, the simplified model I want to present to the users, and the schema shape transformations from one to the other. The data shape changes over and over throughout the query cycle, from the source schema, to the physical result sets, to the logical table sources, to the business model, to the Logical SQL (LSQL) result set, to the visualization. This is sort of like a*(b + c) = a*b + a*c in algebra, where the different shapes are good for different purposes, but give the same final result. Denormalized, normalized, star, snowflake, multicube, hypercube – the same information can be presented in all of these shapes.
The purpose of this post is to help you become familiar with dimensional shape-changing so you can understand the modeling patterns in the following posts. This will be important for understanding how to design the mappings in the CEIM – being able to recognize the dimensional structures within a larger physical schema, and then map them to the dimensional business model. There is also a bit of review of dimensional concepts and terminology.
For context, let me first compare the use of tabular reports to pivot tables. Tabular reports are important for delivery of pre-specified information in a standard format as part of a business process. This approach is not based on information modeling for mass-producing SQL, but on hand-crafting parameterized SQL queries individually in a tool such as Oracle BI Publisher. The query result is typically presented in a highly formatted, predefined view:
At the other extreme, ad hoc analysis in Oracle BI EE is for interactively examining some business data to discover new insights, answer a question, or find a root cause. As I discussed in the last post, Designing a Query Factory, the business model layer of the CEIM makes this easier because it acts like a dimensional cube. An advantage of the dimensional business model is that it presents a set of measures to you as the end user, and lets you analyze those measures using a set of dimensions. You can instantly create many different views of the same data and navigate toward the answer to your question, using the simplified dimensional model that matches your understanding of the business. You do this by including or excluding dimensions, filtering on their attributes, drilling down to lower levels of detail, and so on. The pivot also helps you make sense of data that has dozens of dimensions on a screen that has just two dimensions. You can think of the pivot table below as a projection of a cube containing Quota Amount as its measure, and edges for Time, Product and Company that break that cube down into intersections containing the data values for Quota Amount.
Next, you as the end user can view this data in other ways. By pivoting all the dimensions to the row headers, a fully denormalized shape is displayed below. Note that this is still the same cube, and the data value at each intersection is still the same. In other words, the quota amount for the intersection of 2008 x BizTech x Genmind is $2,966,000 no matter how you pivot the table:
Each view helps the user see different relationships, comparisons and trends.
Dimensions with hierarchies let users drill into the details of the data, which helps them discover the areas of the business driving the overall numbers. An example of a drilldown of a simple level-based hierarchy is shown below:
The product hierarchy in this example has three levels shown, “P4 Brand”, “P3 LOB”, and “P2 Product Type”. The drill down operation consists of simply adding another level to the display, which causes more detailed intersections with the measure “8 – Quota Amount” to be displayed. Note that since I didn’t filter as I drilled, the total remains the same. This drill down behavior of Oracle BI EE is controlled by the hierarchy definitions in the CEIM. In later posts, I will discuss more complex hierarchy structures, such as multiple, unbalanced, skip level, and parent-child (value) hierarchies.
Another analytical feature of a dimension is that each member, such as “V5x Flip Phone,” can have attributes in addition to its name. These are used for filtering and to provide extra information. In the example below, size and color attributes are provided for each of the products.
An attribute always belongs to a level of the hierarchy. The color and size attributes above apply to the “P1 Product” level. Other attributes could also be created at other levels, such as an “Offering Type” that equals “Services” for the Services LOB, and “Product” for the other LOBs. When designing the CEIM, you specify which hierarchy level each attribute belongs to.
Now that I’ve shown you several examples of how the shape of a cube can be projected onto a variety of visualizations for a user, I’d like to cover some shapes of the same data that exist deeper in the system. First of all, consider that a relational BI result set is always a denormalized table, so it can always be thought of as a dimensional structure. Here is the result set the first four pivots above were all rendered from, and you can see it consists of dimension columns and a fact column:
The SQL SELECT statement that requested this denormalized result set is also dimensional, in that it specifies an aggregated measure [sum(VALUE)] and the dimension columns that give it context [Company, Brand, Per_Name_Year]. The source tables have a star shape, including a bridge table, so the SQL SELECT statement changes the dimensional shape from a star-with-bridge to denormalized:
sum(T42445.VALUE) as c1,
T42413.Company as c2,
T42409.Brand as c3,
T42405.Per_Name_Year as c4,
T42409.Brand_Key as c5
SAMP_PRODUCTS_D T42409 /* D10 Product (Dynamic Table) */ ,
SAMP_OFFICES_D T42413 /* D31 Offices (Quotas) */ ,
SAMP_CHANNELS_D T42415 /* D42 Channels (Quotas) */ ,
SAMP_BRIDGE_TABLE T42417 /* D44 Chnls Offs Bridge (Quotas) */ ,
SAMP_TIME_MTH_D T42405 /* D02 Time Month Grain */ ,
SAMP_QUOTAS_F T42445 /* F60 Facts Quotas */
( T42405.Mth_Key = T42445.MTH_KEY and T42409.Prod_Key = T42445.PROD_KEY and T42413.Office_Key = T42417.Office_key and T42415.Chnl_Key = T42417.Chnl_Key and T42415.Chnl_Key = T42445.CHNL_KEY )
group by T42405.Per_Name_Year, T42409.Brand, T42409.Brand_Key, T42413.Company
The dimensional business model schema makes ad hoc analysis easier for end users, among many other advantages it provides in BI. The CEIM’s layers of mappings for mass-producing queries need to change the schema shape from whatever is provided in the source to the dimensional shape of the business model. This post describes basic terminology and capabilities, but some following posts will show you how to map a variety of source database schema shapes– header-detail, supertype-subtype and other normalized structures, snowflake, flat file, hypercube, multicube, etc. Before I get to that, though, I will continue with the basics of how dimensionality and grain apply to the business model, and then describe the unique “theory” or principles for modeling in the business model layer.
The examples in this blog series are based on the Oracle BI EE 11g Sample Application, build 825 or higher, which includes the DDL and data for samples in relational and multidimensional databases. You can download it from Oracle Technology Network (OTN). Follow the instructions to install at least the “core” functionality. In order to see your LSQL, physical SQL and MDX in nqQuery.log, be sure to enable logging for your user.
This post used subject area B, except that the attributes example used subject area A.