Continuing the Data Quality Conversation
By Rob Reynolds on Nov 04, 2008
A colleague of mine just reminded me of a great post from last week on the Rittman Mead blog by Peter Scott.
I saw the post last week, shortly after writing about defining accuracy thresholds. Peter makes some great points, and his main theme is spot on: regardless of what you do about data quality, there must be transparency for the users. His suggestion of a "data quality dashboard" is a good one, and a concept I have used in the past.
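To make the dashboard idea concrete, here is a minimal sketch in Python of how such a metric might be fed. The table names, rules, and queries are hypothetical, and the connection is any standard DB-API connection; the point is simply that each monitored table gets a validation query, and the resulting accuracy percentage is what the dashboard displays.

```python
# Minimal sketch of a data quality dashboard feed (hypothetical schema and rules).
# For each monitored table, run a query that counts rows violating a rule,
# then publish the accuracy percentage for the dashboard to display.

from dataclasses import dataclass

@dataclass
class QualityCheck:
    table: str        # table being monitored
    rule: str         # human-readable description of the rule
    failing_sql: str  # query returning the count of rows violating the rule
    total_sql: str    # query returning the total row count

CHECKS = [
    QualityCheck(
        table="customers",
        rule="every customer has a postal code",
        failing_sql="SELECT COUNT(*) FROM customers WHERE postal_code IS NULL",
        total_sql="SELECT COUNT(*) FROM customers",
    ),
]

def run_checks(conn):
    """Return (table, rule, accuracy_pct) tuples for the dashboard."""
    results = []
    cur = conn.cursor()
    for check in CHECKS:
        cur.execute(check.failing_sql)
        failing = cur.fetchone()[0]
        cur.execute(check.total_sql)
        total = cur.fetchone()[0]
        accuracy = 100.0 if total == 0 else 100.0 * (total - failing) / total
        results.append((check.table, check.rule, accuracy))
    return results
```

Publishing these numbers alongside the reports themselves is what creates the transparency Peter is advocating.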
I'm a firm believer that a data warehouse should match the data in the source system. If you have quality issues in the source, you should fix them in the source. Having transparency into these issues is often the catalyst for getting them fixed.
Even where accuracy thresholds are set below 100%, there should be ongoing processes that attempt to improve the quality of that data. As mentioned previously, the main reason we accept data that is not completely accurate is usually timeliness: decisions need to be made before every issue can be resolved. Given more time for additional processing, auto-correction, or human intervention, most data issues can and should eventually be resolved.
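As an illustration of that ongoing process (a sketch only, with hypothetical function names throughout), records that fail validation can be run through automatic correction first and, failing that, routed to a person for review rather than silently discarded:

```python
# Sketch of an ongoing remediation loop (all names hypothetical):
# try automatic correction first, then escalate to human review, so records
# below the accuracy threshold are eventually resolved rather than dropped.

def remediate(records, auto_correct, queue_for_review):
    """Attempt auto-correction; escalate anything still unresolved."""
    corrected, escalated = [], []
    for record in records:
        fixed = auto_correct(record)   # returns a corrected record or None
        if fixed is not None:
            corrected.append(fixed)
        else:
            queue_for_review(record)   # hand off to a data steward
            escalated.append(record)
    return corrected, escalated
```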
By the way, I highly recommend adding the Rittman Mead blog to your RSS feed. They are a great Oracle partner, and their blog always has timely, hands-on information and best practices for using Oracle BI software.