Retail Data Warehouse Choice

Inmon%20vs%20Kimball.JPGAnybody can be a retailer, but only a select few can be profitable retailers. What separates the two groups is usually the quality of business decisions made. Some are lucky enough to make the right decisions based on gut and experience, but most rely on analyzing data, such as sales figures, product mix, velocity, price elasticity, promotional uplift, market basket analysis, etc. To turn data into information, retailers need reliable business intelligence. This is then dissected into two parts: asking the right questions and having the right data easily accessible. This post is focused on the second part, and specifically the data warehouse.

The design approach of an enterprise data warehouse (EDW) usually falls into two camps. The Inmon paradigm states that the data warehouse holds all historical information, typically in third-normal form (3NF). Data marts source their information from the data warehouse but store the aggregates in stars. Although slow, its possible to query against the 3NF when an appropriate star is not available. Since its a big task to get the data warehouse in place before the marts can be created, this is considered a top-down approach. But once its in place, is fairly easy to add new stars to access new queries.

The Kimball crowd prefers to define their requirements, then build stars to house all the transactional data as roll-ups and aggregates. This, of course, is based on the premise that retailers are able to proactively and accurately articulate their BI requirements, which will be true for core requirements but false for less obvious, emerging requirements. Since its possible to create lots of independent marts not dependent on a central data warehouse, it is considered a bottom-up approach. As each mart is completed, the business has access to that information. But adding new marts impacts the ETL all the way back the source systems, so that can be time-consuming.

Since its roots are pure technology, Oracle never really had to side with either camp. It just kept adding more and more functionality to the database that benefited both approaches. But now that Oracle is in the applications business, and especially in the industry applications business, many examples of both approaches are found in its products.

So, which approach is best for the retail industry, and if you could buy one off the shelf, where would you place your bet?

Comments:

A while back I worked for a company that tried to produce a 'black-box' retail data warehouse - the one-size, one-model, fits all approach. Of course, it was doomed - maybe because of the technology choice - Tandem and non-stop SQL, but more probably because most retailers model of the world differs. One organisation has a rich sense of customer and customer segmentation (from customer registration) others only know their shoppers by the basket they buy and where they buy it. Some have deep hierarchies of product, others, maybe one or two layers. Some make lots of transactions per customer (food at a supermarket) other just one or two (fashion).

The key is to have the facts and reference data to support your analysis - and maybe just not from the sales domain - logistics, stock, supplier management are also key areas to include. I tend to favour a enterprise DW store so that the data miners can find insight, but published to stars that align well to the query tools for operational users. Yes, adding a new subject domain is new work from ETL to publish but this is isolated work, we should only be adding new facts and a little reference data, but we will be sharing the bulk of the existing reference data with the present domains

Posted by Pete Scott on November 14, 2008 at 05:40 PM CST #

If I had to vote then I would go Kimball. Although the choice does not have to be 'pure' Kimball or pure Inmon, there can be shades of both in a flexible reporting solution. (for example, Just what is the LOV table?)
Much is made about structure sovling performance issues, particularly in retail where there are millions of records. But technology is evolving and there plenty of Oracle software and now Hardware tools to speed things along irrespective of structure.
However, the biggest single issue cannot merely be which structure is used, but how has it been implemented?
I tend to think that a good BI consultant will not build a Kimball database just to the initial requirements, they must think about potetntial future use. They will also keep up to speed with the functionality within the database. Simple stuff like Indexes, Hints, Partitioning and Materialised Views must not be overlooked.

One final food for thought, the best performing financial reporting system I saw in Investment banking was a SINGLE table(containing every field that may be needed), analysed by Essbase!!

Posted by Adrian Ward on November 15, 2008 at 01:10 AM CST #

I’d certainly expect the “which is best for retail” responses to vary. An Enterprise Data Warehouse in many ways indeed reflects the business with which it supports. Thus, to one of Peter’s points, a one-size-fits-all solution is rarely viable. Just as the EDW’s content mirrors the business, the approach to building it is often a reflection of the state that the business (and IT) is in at the time. The Kimball approach often lends itself to retailers who want a quick start by focusing just on BI/reporting content and supporting data mart stars. Inmon’s approach involves first establishing a long-term 3NF foundation on which stars can be built as BI requirements dictate. Often, retailers will build one or more urgently-needed marts to buy time while they then go back and establish their warehouse. Other times, both a bottom-up and top-down approach are tackled concurrently. In recent years, I developed a data warehouse for a Fortune 200 retailer by starting with an Inmon approach, where we established core sales, inventory and dimensional tables and aggregations with very minimal BI requirements, but then drove the development of the remainder of the warehouse (promotions, customer, etc.) in a Kimballesque fashion. Our base data resided in 3NF, from which we rolled to stars. In hindsight, we would have had a generally “tighter” solution, better performance and less rework had we employed a pure Kimball design/approach, but the inability of the business to proactively drive requirements that were sustainable was prohibitive. As Adrian advocates, a good BI consultant will design a warehouse that meets specific requirements (a la Kimball,) but also includes features that anticipate future requirements (a la Inmon.) I think that that approach for retail is very common. So, with that, I'm wondering which structure is most common, 3NF base data rolling to stars, or pure stars?

Posted by Mark Lawrence on November 17, 2008 at 05:08 AM CST #

Post a Comment:
  • HTML Syntax: NOT allowed
About


David Dorf, Sr Director Technology Strategy for Oracle Retail, shares news and ideas about the retail industry with a focus on innovation and emerging technologies.


OpenWorld


Stay Connected
Blogroll

Search

Archives
« July 2014
SunMonTueWedThuFriSat
  
1
4
5
6
8
10
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
  
       
Today