Getting your feet wet with data profiling: freeware tool

While a lot of what we do on the OWB blog is about teaching OWB users to get more out of the tool, we also want to stimulate Oracle database customers' thinking about their data in general. (We think OWB data quality and its integration with ETL compare favorably with anything else out there, so the more people think about the problem, the more they will gravitate towards OWB as a solution.)

Dylan Jones over at dataqualitypro.com is clearly thinking about some of the same things we've been thinking about in Warehouse Builder for some time. He's introduced a free tool for doing basic pattern analysis of data in an Oracle database, which in some ways provides insights into your data similar to those you get from basic profiling with Oracle Warehouse Builder. You can find out more about his freeware tool here.
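To give a feel for what "basic pattern analysis" means, here is a minimal Python sketch. It is not Dylan's tool and not how OWB profiling is implemented; it only illustrates the general idea of reducing each value to a character-class pattern and counting how often each pattern occurs. The function names (`value_pattern`, `profile_patterns`) and the sample phone numbers are made up for illustration, and a plain list stands in for a column you would normally read from the database.

```python
from collections import Counter


def value_pattern(value):
    """Reduce a value to a pattern: letters -> 'A', digits -> '9', anything else kept as-is."""
    if value is None:
        return "<NULL>"
    symbols = []
    for ch in str(value):
        if ch.isalpha():
            symbols.append("A")
        elif ch.isdigit():
            symbols.append("9")
        else:
            symbols.append(ch)
    return "".join(symbols)


def profile_patterns(values):
    """Return (pattern, count) pairs, most frequent pattern first."""
    return Counter(value_pattern(v) for v in values).most_common()


# Hypothetical column contents, standing in for rows fetched from a table.
phone_numbers = ["555-1234", "555-9876", "5551234", None, "55A-1234"]

for pattern, count in profile_patterns(phone_numbers):
    print(f"{pattern:10} {count}")
```

A dominant pattern with a handful of stragglers is usually the interesting result: the stragglers are candidates for a data rule or a cleansing step.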

If you start getting interesting insights using a tool like this, you will find more value in trying out the OWB data profiling and data quality features against your database.

Many OWB customers actually use OWB as a standalone data quality tool, mostly ignoring the ETL and data integration features when they get started. Then, once they have insight into their data, they create data rules to enforce what they discover, and introduce data cleansing and data auditing on their sources and targets with OWB to catch and resolve any bad data coming in.

In general, you can use OWB for data quality measurement, enforcement, and even data cleansing without disturbing your existing ETL logic. Data profiling and data quality auditing are non-intrusive by nature. For data cleansing, you can create OWB mappings that either cleanse your data in place or copy cleansed data to a temporary location, from which you can reload it with your existing ETL method.

Anyway, we wanted to share this thought with you, and give a tip of the hat to Dylan's community, which is growing fast and generating a lot of fresh, thought-provoking content.
