Retail Data Warehouse Choice
By David Dorf on Nov 14, 2008
Anybody can be a retailer, but only a select few can be profitable retailers. What separates the two groups is usually the quality of business decisions made. Some are lucky enough to make the right decisions based on gut and experience, but most rely on analyzing data, such as sales figures, product mix, velocity, price elasticity, promotional uplift, market basket analysis, etc. To turn data into information, retailers need reliable business intelligence. This is then dissected into two parts: asking the right questions and having the right data easily accessible. This post is focused on the second part, and specifically the data warehouse.
The design approach of an enterprise data warehouse (EDW) usually falls into two camps. The Inmon paradigm states that the data warehouse holds all historical information, typically in third-normal form (3NF). Data marts source their information from the data warehouse but store the aggregates in stars. Although slow, its possible to query against the 3NF when an appropriate star is not available. Since its a big task to get the data warehouse in place before the marts can be created, this is considered a top-down approach. But once its in place, is fairly easy to add new stars to access new queries.
The Kimball crowd prefers to define their requirements, then build stars to house all the transactional data as roll-ups and aggregates. This, of course, is based on the premise that retailers are able to proactively and accurately articulate their BI requirements, which will be true for core requirements but false for less obvious, emerging requirements. Since its possible to create lots of independent marts not dependent on a central data warehouse, it is considered a bottom-up approach. As each mart is completed, the business has access to that information. But adding new marts impacts the ETL all the way back the source systems, so that can be time-consuming.
Since its roots are pure technology, Oracle never really had to side with either camp. It just kept adding more and more functionality to the database that benefited both approaches. But now that Oracle is in the applications business, and especially in the industry applications business, many examples of both approaches are found in its products.
So, which approach is best for the retail industry, and if you could buy one off the shelf, where would you place your bet?