Data Quality - Effectiveness Versus Accuracy
By Rob Reynolds on Oct 16, 2008
Dealing with data quality is a common task when implementing a BI Solution or Data Warehouse. Often we find that data quality is really a complex, intertwined set of issues resulting from operational systems and processes that cannot be changed to meet the needs of the BI Solution.
Avoiding data quality issues requires thinking about "quality" a bit differently than in the traditional manufacturing sense. Discovering and correcting poor material in a manufacturing process is a binary decision: the material is right, or it isn't. The key mindset for data quality is to focus on data "usefulness," or data "effectiveness." Not every application of information requires the same level of accuracy. If the information can be used to make the intended decisions, then it is considered effective. It is quite likely that a certain percentage of your information can meet many requirements without being perfect (e.g., marketing data). It is also likely that certain key information must be correct in every way (e.g., financial data).
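To make the idea concrete, here is a minimal sketch in Python. It assumes each use of the data carries its own required accuracy level and that "accuracy" is simply the fraction of records passing a validity check. The use names, the numeric levels, the sample records, and the helper functions are all hypothetical and chosen only for illustration.

```python
# Hypothetical minimum accuracy each use of the data requires.
# These values are illustrative assumptions, not recommendations.
REQUIRED_ACCURACY = {
    "marketing_campaign": 0.90,   # imperfect data is still useful here
    "financial_reporting": 1.00,  # must be correct in every way
}


def accuracy(records, is_valid):
    """Return the fraction of records that pass the validity check."""
    if not records:
        return 0.0
    return sum(1 for r in records if is_valid(r)) / len(records)


def is_effective(records, is_valid, use):
    """Return True if the data is accurate enough for the given use."""
    return accuracy(records, is_valid) >= REQUIRED_ACCURACY[use]


def has_postal_code(record):
    """Hypothetical validity rule: the postal code must not be empty."""
    return bool(record["postal_code"])


# Made-up sample: 19 records with a postal code, 1 without.
customers = [{"postal_code": "10001"}] * 19 + [{"postal_code": ""}]

print(is_effective(customers, has_postal_code, "marketing_campaign"))
print(is_effective(customers, has_postal_code, "financial_reporting"))
```

The same data set is judged differently depending on the decision it supports, which is the point of measuring effectiveness rather than a single pass/fail notion of quality.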
First and foremost, BI & DW solutions are for making decisions. In many organizations, thresholds of acceptance can be created, and data can be "good enough" to make certain decisions without being "perfect." Data effectiveness for decision making can be accomplished at a lower level of "quality." In my next post, I will explore techniques for defining accuracy thresholds and for ongoing monitoring of data quality against those thresholds.