By rbewtra-Oracle on Oct 16, 2014
By: John Siegman – Applications Sales Manager for Master Data Management and Data Quality, & Murad Fatehali – Senior Director with Oracle’s Insight team leading the Integration practice in North America.
Big data, business intelligence, analytics, data governance, data relationship management: the list of data-oriented topics goes on. Lots of people are talking about all of these, yet very few talk about poor data quality, let alone do anything about it. Why is that?
We think it’s because Data Quality suffers from the “Do I have to?” (DIHT) syndrome. Anyone with kids, or anyone who was a kid, will recognize this syndrome: “Clean up your room.” “Do I have to?” Dealing with poor quality data is not glamorous and it doesn’t get the headlines. Installing business intelligence systems or setting up data governance gets the headlines and makes careers. But what good is better reporting and structure if the underlying data is junk?
Recently we were in a half-day planning session with an existing customer. The customer wanted to know what they could do better using the software they had already purchased as Phase 1 of the project, and what they would need to acquire to do things better in Phase 2. Reviews like this are critically important for knowledge transfer and for reaffirming goals, because people change on both sides of the customer/vendor relationship. The customer provided access to numerous departments across their company for interviews and focus groups. All of this information was gathered, reviewed, and summarized, and suggestions were made. Excel spreadsheets and PowerPoints ensued. Even though the Aberdeen Group and others have shown significant performance increases in established ERP and other business systems through the use of Data Quality and Master Data Management, no emphasis was put on data quality as a way of improving the customer’s processes and results with their existing software, because the customer never directly said they had a data issue. Very few customers ever admit this, because poor data is just standard operating procedure.
What is it about data quality that makes it the option of last resort, the go-to when all else fails? It has to be the belief that the data in underlying systems and sources is good by default. I mean, really, who would keep bad data around? Well, pretty much everyone, because if you don’t know it’s bad, you end up keeping it around.
Let’s admit it, DQ is not glamorous. There are no DQ-er of the year awards. People in DQ don’t typically have their names on a parking spot right up front in the corporate lot. And besides not being glamorous, it’s hard. Very rarely do we see someone “own” data quality; after all, since bad data affects multiple people across multiple functions, no one really has the right incentives to drive data quality improvements when the resulting benefits accrue to multiple constituencies. Nobody really wants to spend their functional budget fixing enterprise-wide data problems. Some of the earliest DQ-adopting companies have teams of people, representing a cross-section of processes and functions, who spend their days manually inspecting data and building internal systems to meet their specific data quality needs. They are very effective at what they do, but not as efficient as they could be; a coordinated effort would make the whole greater than the sum of its parts. Also, most of the data knowledge is in their heads, which is hard to replicate and subject to loss through job changes, retirement, or the proverbial run-in with a bus.
So, given the underwhelming desire to fix poor data, how do you get the powers that be in your company to see the light? In our last article, “Data Quality, Is it worth it? How do you know?”, we examined the value of data quality based on units of measure that were meaningful to the given organization. To paraphrase Field of Dreams: if you build the ROI, they will come. The first step in building the ROI is understanding how poor your data is and what impact that has on your organization. Typically that starts with a Data Quality Health Check.
A DQ Health Check takes a sample of your data and examines it along several dimensions to determine its quality level: Consistency, Completeness, Accuracy, and Validity. Together these measures attempt to answer the question, “Is your data fit for purpose?” Consistency looks at whether the data within a variable conforms to its allowed values. For example, if the variable in question only allows Ys and Ns, any Ms and Ts will lower the consistency rating. Completeness is just that: how complete is the data in your database? Using the previous example, if Ys and Ns are present only 20% of the time, your data for that variable is fairly incomplete. Accuracy looks at a number of things, but mostly whether the data falls within the bounds of expectations. And Validity looks at usefulness. For example, telephone numbers are typically 10 digits; phone numbers without area codes or with letters, while complete and possibly consistent, are not valid.
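To make those measures concrete, here is a minimal sketch of how a health check might score a small sample of customer records. The field names (opt_in, phone), the allowed values, and the records themselves are hypothetical, not part of any particular product; a real health check would profile much larger samples and would also score accuracy against a trusted reference source, which is omitted here.

```python
import re

# Hypothetical sample of customer records for illustration only.
records = [
    {"opt_in": "Y", "phone": "3125551234"},
    {"opt_in": "M", "phone": "5551234"},      # inconsistent flag, phone missing area code
    {"opt_in": None, "phone": "312555ABCD"},  # missing flag, letters in phone
    {"opt_in": "N", "phone": "3125559876"},
]

ALLOWED_FLAGS = {"Y", "N"}               # the only values opt_in should hold
PHONE_PATTERN = re.compile(r"^\d{10}$")  # 10-digit phone numbers only

def completeness(values):
    """Share of records where the value is actually populated."""
    return sum(v not in (None, "") for v in values) / len(values)

def consistency(values, allowed):
    """Share of populated values that fall within the allowed set."""
    populated = [v for v in values if v not in (None, "")]
    return sum(v in allowed for v in populated) / len(populated) if populated else 0.0

def validity(values, pattern):
    """Share of populated values that match the expected format."""
    populated = [v for v in values if v not in (None, "")]
    return sum(bool(pattern.match(v)) for v in populated) / len(populated) if populated else 0.0

flags = [r["opt_in"] for r in records]
phones = [r["phone"] for r in records]

print(f"opt_in completeness: {completeness(flags):.0%}")
print(f"opt_in consistency:  {consistency(flags, ALLOWED_FLAGS):.0%}")
print(f"phone validity:      {validity(phones, PHONE_PATTERN):.0%}")
```

Even a rough scorecard like this, run over a representative sample, gives you the numbers you need to start building the ROI case described above.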
In another recent customer engagement, we looked at customer records for data anomalies, specifically in consistency, completeness, accuracy, and validity. We found that fixing these records resulted in improvements not only in marketing (campaign effectiveness) but also in service (customer experience), collections (lower receivables), and reporting. In today’s data-rich, integrated, system-driven processes, improving data quality in one part of the organization (whether customer data, supplier data, or financial data) benefits multiple organizational functions and processes.
So while data quality will never be glamorous for the individuals doing it, a little insight that demonstrates a strong ROI for DQ can move the conversation from “Do I have to?” to “Let’s do this.”