Better Information With Master Data Management
By David Butler on Sep 02, 2010
In our last blog, we showed how every single answer produced by the BI tools on the Data Warehouse star schema was wrong.
What is the average revenue per customer? $47.50
Who is the most valuable customer? Mary Smith
How much did the most valuable customer spend? $50
Who is the number one retailer? Old Navy
What is the maximum revenue for any supplier? $50.00
The answers were wrong because the analytical side of the business is not designed to deal with the following transactional realities:
· Mary Smith married Mr. Evans and changed her name to Mary Evans after she bought the sweater from Old Navy. She is, in fact, the same person who bought the similar sweater from Banana Republic the next day.
· Old Navy and Banana Republic are both subsidiaries of The Gap.
· AI Corp is an alias for Acme, Inc. They are in fact the same supplier.
· VN-Sweater and RF-Sweater are two ids for the same actual item.
Oracle's MDM solution is designed to understand these facts, accurately reflect this reality, and provide the correct dimension information to the Data Warehouse.
We will now itemize the key MDM capabilities supporting BI. We will then rebuild the star schema and re-execute the queries to get the right answers.
In any MDM solution, the data model, or schema, is the foundation upon which all subsequent MDM application functionality is built. As an applications vendor, Oracle had to create technologies that cleansed the master data within our own applications. We used the proven application data models themselves to build applications to master the applications' own data. We then extended these capabilities to the enterprise to run with or without other Oracle applications.
The Oracle model is tailored to map to the way organizations actually do business. The Oracle MDM data models are unique in that they are an OLTP schema that represents a superset of every way master data has been defined by all attached applications. It has the flexibility to accommodate organization and industry specific extensions. It holds all necessary hierarchical information, all attributes needed for duplicate identification, removal and prevention, as well as cross-reference information for all attached operational systems.
In our example, the MDM schema holds customer data in both business-to-business (Old Navy, Banana Republic) and business-to-consumer (Mary Smith, Mary Evans) formats. In addition, it holds the master supplier data (Acme, Inc, AI Corp) and retail product data (VN-Sweater, RF-Sweater). The names and all needed attributes are maintained.
In order to deal with real time changes to master data, such a the marriage of Mary Smith to Mr. Evans, Oracle's MDM solution includes a real time Business Event System. Any change to master data attributes triggers a business event that in turn invokes a workflow process. The workflow process builds appropriate XML payload packages and executes the configured steps for the particular data change.
In our example, the introduction of Mary Evans triggered a 'New Customer' event. This kicked off a workflow to populate Mary's record with all available information. For example, it may have requested address validation from a postal address verification vendor to insure that all addresses are mailable. Standardized addresses also aid in duplicate identification. The workflow may have requested data augmentation for credit ratings, or obtained an AbiliTec ID from Acxiom to assist with duplicate identification. This is done in real time.
Oracle's MDM solution for customer data is the Customer Hub. It comes with a variety of mechanisms for finding duplicate customer records. A primary technique is to configure a rules engine to find potential matches using a large number of customer attributes. In our example, Old Navy has entered Mary Smith as a customer. Her universal ID is established. The Customer Hub manages Old Navy as a source system and records Mary Smith's ID in that system. Mary Evans is similarly managed. This is the base for the MDM cross-reference.
MDM utilizes all available attributes to determine if these are duplicates. Typical match rules will examine addresses, phone numbers, e-mail addresses etc. Additionally, 3rd party data such as an AbiliTec ID from Acxiom may be used. In our example, the system fines that Mary Smith and Mary Evans are indeed duplicates in spite of the different name and address.
Company duplicate identification uses the same general rules engines as the Person duplicate identification. The key difference is that the number and type of attributes available for a company are different. For example, companies can have a DUNs number provided by D&B. In our example, a search on AI Corp produces a match with Acme Inc. Alias information was used by out-of-the-box duplicate identification rules.
Once the Customer Hub identifies Mary Smith and Mary Evans as duplicates, it eliminates the duplicates by merging the multiple records into one. The cross reference is maintained. Where before the merge, there were two customer records each pointing back to one source system, we now have one customer record pointing back to two source systems.
Another key capability of the Customer Hub is its ability to manage the survival of customer attributes in the face of multiple sourcing systems and customer record merges. The MDM Customer Hub maintains the source system priority rankings for each attribute. While all records remain in the MDM data store, only the 'blended' single version of the truth record is seen by applications and viewers.
Oracle's MDM solution for product data is the Product Hub with Oracle Product Data Quality (PDQ) for product data standardization. This standardization enables rapid and parameterized searching and accurate duplicate identification. In our example, Old Navy uses the string: VN PO 50 Blue W 24W 36B 22A. Banana Republic's sweater is identified by: B Wool V Neck Pllver S:36. These records are loaded into the Product Hub schema through PDQ's Data Lens. Attributes such as style, color, and size are populated as well as catalog codes. An English description is generated as well as other appropriate languages as needed. In our example, we see that both products are V-Neck Pullover blue wool sweaters and that they actually have the same ID code. They are in fact the same product and now the MDM system recognizes them as such.
Hierarchy information is critical for proper aggregation and roll-ups. Oracle's Customer Hub maintains any number of simultaneous hierarchies used by the operational applications. These include Dunn & Bradstreet hierarchies with out-of-the-box connectivity to D&B for both batch and real time information access. In our example, D&B provides the hierarchy information for Old Navy and Banana Republic. It turns out that they are both subsidiaries of The Gap.
MDM has identified the customer duplicates; maintained the cross reference back to the sourcing systems across a merge; developed the single golden customer record utilizing survivorship rules; found the two products to be identical; learned that the two retailers belong to one corporate hierarchy; and found through good duplicate identification techniques that Acme, Inc. and AI Corp are in fact two names for the same vendor. If we deliver this updated cross reference and dimension data to the data warehouse, we get the following star schema.
Re-running the same query now get the correct answers:
What is the average revenue per customer? $95
Who is the most valuable customer? Mary Evans
How much did the most valuable customer spend? $95
Who is the number one retailer? The Gap
What is the Max revenue for any supplier? $95
We see that better information has been provided through Master Data Management. MDM fixed the data quality problem at its source and delivered quality dimensions to the analytics. No other technology on the market is designed to accomplish this essential task.
There are three legs to a complete Business Intelligence solution: 1) the Data Warehouse for holding the operational history; 2) the Enterprise Master Data Management solution for insuring that quality data under those operational applications and hierarchies are supplied to the Data Warehouse; and 3) the BI applications themselves that utilize the DW and MDM data to get clean authoritative information to everyone in the organization that needs it. Without MDM, the solution falls over. Poor decisions based on inaccurate data drive less than optimal performance. Compliance becomes difficult and risks increase.
Oracle MDM provides clean consolidated accurate master data seamlessly propagated throughout the enterprise. This data reflects the actual operations of the organization. It insures that this is the data the BI tools use. It is the glue between the operational and analytical sides of the business. Oracle MDM enables organizations to get a single view of the enterprise for the first time since the application landscape fragmented back in the 1970s. This can save companies millions of dollars a year, dramatically increase operating efficiencies, improve customer loyalty and support sound corporate governance.