
The Holistic Data Quality Framework – Version 1.0

May 5, 2012

By Jay Zaidi

Holistic Data Quality (HDQ) is a term I coined for monitoring and measuring the quality of data across the enterprise, rather than within departmental silos (see http://www.dataversity.net/holistic-data-quality-a-new-paradigm-in-enterprise-data-quality-management/ for details of the HDQ concept and rationale). Implementing HDQ at the enterprise level is challenging, given the complexity of a typical data ecosystem, data-related politics, data ownership issues and budgetary constraints. Rolling out HDQ requires a strategic approach to enterprise data management, a long-term vision and the requisite investment in people, process, technology and data capabilities. Firms that have taken this approach have benefited greatly: they now have a strong foundation in data quality tools and processes that will continue to bear fruit.

Two fundamental aspects of HDQ must be understood before embarking on this exercise:

– The consistent definition of data quality requirements and measurements
– The deployment of one or more enterprise-level tools to measure the quality of data

People/Process/Technology/Data components of the framework

1. Data Requirements – Dimensions of data quality
2. Data Requirements – Systems of record and trusted sources per element
3. Data Requirements – Data quality rules associated with each data element and data quality dimension
4. Process – State of the data (as it flows across the information supply chain)
5. Process/Timing – DQ execution timeline/events
6. Measurements – Thresholds and tolerances
7. Measurements – Data quality-related statistics (rolled up at summary level and detail level)
8. Measurements – Automated alerts, based on thresholds and tolerances
9. Data Certification – Rules for certifying a single record and groups of related records
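
To make items 3, 6, 8 and 9 concrete, here is a minimal Python sketch of how a rule tied to a data element and a data quality dimension might be evaluated against a threshold and tolerance, and how a single record might be certified. It is an illustration only, not part of the framework itself: the `Rule` class, the `evaluate` and `certify_record` functions, the "ok"/"warning"/"alert" statuses, the sample loan records and the threshold values are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """Hypothetical rule: one check for one data element and one dimension."""
    element: str                    # data element the rule applies to
    dimension: str                  # data quality dimension, e.g. "completeness"
    check: Callable[[dict], bool]   # returns True when a record passes
    threshold: float                # target pass rate (0.0 to 1.0)
    tolerance: float                # allowed shortfall below the threshold


def evaluate(rule: Rule, records: list) -> dict:
    """Measure the pass rate for one rule and classify it (assumed statuses)."""
    total = len(records)
    passed = sum(1 for record in records if rule.check(record))
    pass_rate = passed / total if total else 1.0
    if pass_rate >= rule.threshold:
        status = "ok"
    elif pass_rate >= rule.threshold - rule.tolerance:
        status = "warning"   # below threshold but within tolerance
    else:
        status = "alert"     # outside tolerance: would trigger an automated alert
    return {
        "element": rule.element,
        "dimension": rule.dimension,
        "pass_rate": pass_rate,
        "status": status,
    }


def certify_record(record: dict, rules: list) -> bool:
    """Assumed certification policy: a record is certified if it passes every rule."""
    return all(rule.check(record) for rule in rules)


# Made-up sample data for illustration.
RECORDS = [
    {"loan_id": "A1", "balance": 1000.0},
    {"loan_id": "A2", "balance": None},
    {"loan_id": None, "balance": -5.0},
    {"loan_id": "A4", "balance": 250.0},
]

RULES = [
    Rule("loan_id", "completeness", lambda r: r.get("loan_id") is not None, 0.95, 0.05),
    Rule("balance", "completeness", lambda r: r.get("balance") is not None, 0.75, 0.05),
    Rule("balance", "validity",
         lambda r: r.get("balance") is not None and r["balance"] >= 0, 0.70, 0.25),
]

if __name__ == "__main__":
    for rule in RULES:
        print(evaluate(rule, RECORDS))
    print([certify_record(record, RULES) for record in RECORDS])
```

Certifying a group of related records could follow the same pattern, for example by requiring every record in the group to be certified, but the appropriate policy depends on the business rules of the organization.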

Enablers

1. Data quality tool(s)/rules engine
2. Metadata tool (Data Dictionaries, Glossaries, Data Lineage, etc.)
3. Continuous Improvement Process for Quality (Six Sigma)
4. Data quality methodology
5. Issue Management/Root cause analysis
6. Business Intelligence tool(s)
7. Data Quality Data Mart/Store
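
As a rough illustration of how a Data Quality Data Mart/Store could support the summary-level and detail-level statistics mentioned earlier, the sketch below keeps detail-level rule results in a table and rolls them up by data quality dimension. It uses an in-memory SQLite database from the Python standard library as a stand-in for a real data mart. The `dq_result` table, its columns, the function names and the sample figures are hypothetical.

```python
import sqlite3


def build_mart(results):
    """Load detail-level results into an in-memory table (stand-in for a data mart)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dq_result ("
        "element TEXT, dimension TEXT, checked INTEGER, passed INTEGER)"
    )
    conn.executemany("INSERT INTO dq_result VALUES (?, ?, ?, ?)", results)
    conn.commit()
    return conn


def rollup_by_dimension(conn):
    """Summary level: pass rate per data quality dimension across all elements."""
    rows = conn.execute(
        "SELECT dimension, SUM(checked), SUM(passed) "
        "FROM dq_result GROUP BY dimension ORDER BY dimension"
    ).fetchall()
    return {
        dimension: (passed / checked if checked else None)
        for dimension, checked, passed in rows
    }


# Made-up detail rows: (element, dimension, records checked, records passed).
SAMPLE_RESULTS = [
    ("loan_id", "completeness", 4, 3),
    ("balance", "completeness", 4, 3),
    ("balance", "validity", 4, 2),
]

if __name__ == "__main__":
    mart = build_mart(SAMPLE_RESULTS)
    print(rollup_by_dimension(mart))
```

A Business Intelligence tool would typically sit on top of such a store to present these rollups, with the detail rows available for drill-down during root cause analysis.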
