Data Quality (DQ), as defined in the DAMA International Data Management Body of Knowledge (DMBOK), “refers to both the characteristics associated with … and to the processes used to measure or improve the quality of data.”
Data is considered high quality to the degree that it is fit for the purposes to which data consumers want to apply it, meaning it meets their explicit and implicit business requirements. Since expectations about Data Quality are not always stated or known, an ongoing discussion is needed. Data Quality depends on context and on the data consumer’s needs.
Data Quality often has the following dimensions:
- Uniqueness/Deduplication
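As a rough illustration of how a dimension such as uniqueness might be measured, the sketch below computes the share of distinct values in a column. The function name `uniqueness_ratio`, the sample email values, and the decision to trim whitespace and ignore case before comparing are all assumptions made for this example; they do not come from any of the cited frameworks.

```python
def uniqueness_ratio(values):
    """Return the fraction of non-empty values that are distinct after normalization."""
    # Hypothetical normalization: trim whitespace and ignore letter case.
    normalized = [v.strip().lower() for v in values if v and v.strip()]
    if not normalized:
        return 1.0  # Assumption: a column with no values has no duplicates.
    return len(set(normalized)) / len(normalized)


# Illustrative sample data, not taken from a real system.
emails = ["ann@example.com", "Ann@Example.com ", "bob@example.com", ""]
ratio = uniqueness_ratio(emails)
print(f"uniqueness: {ratio:.2f}")
```

A ratio below 1.0 suggests duplicate entries that may be candidates for deduplication. What counts as "the same" value depends on the data consumer's needs, so the normalization step would differ from one context to another.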
Other Definitions of Data Quality Include:
- “Fit for a purpose. Meets the requirements of its authors, users and administrators.” (adapted from Martin Eppler) (Peter Aiken)
- “Synonymous with Information Quality.” (Peter Aiken)
- “Reliance on accuracy, consistency and completeness of data to be useful across the enterprise.” (Michelle Knight, DATAVERSITY®)
- Tools and processes used for: (Gartner)
  - Parsing and standardization
  - Generalized “cleansing”
- Strong-Wang framework: (Wang and Strong, MIT and DAMA DMBOK)
  - Intrinsic DQ
  - Contextual DQ
    - Appropriate amount of data
  - Representational DQ
    - Ease of understanding
    - Representational consistency
    - Concise representation
  - Accessibility DQ
    - Access security
A Few Uses of Data Quality Are:
- Increasing the value of organizational data and the opportunities to use it.
- Reducing the risk and cost associated with poor-quality data.
- Improving organizational efficiency and productivity.
- Protecting and enhancing the organization’s reputation.
- Data Profiling.
- Data Standardization.
- Data Monitoring.
- Data Cleansing.
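The four activities listed last (profiling, standardization, monitoring, and cleansing) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular tool: the sample records, the field names, the country mapping, and the function names `profile`, `standardize_country`, `cleanse`, and `monitor` are all hypothetical.

```python
from collections import Counter

# Hypothetical sample records; field names are illustrative only.
records = [
    {"id": 1, "country": "us", "email": "ann@example.com"},
    {"id": 2, "country": "US ", "email": None},
    {"id": 3, "country": "U.S.", "email": "bob@example.com"},
    {"id": 3, "country": "U.S.", "email": "bob@example.com"},
]


def profile(rows, field):
    """Profiling: count missing values and the frequency of each present value."""
    values = [r.get(field) for r in rows]
    missing = sum(v is None for v in values)
    frequencies = Counter(v for v in values if v is not None)
    return {"missing": missing, "frequencies": frequencies}


def standardize_country(value):
    """Standardization: map spelling variants to one code (assumed mapping)."""
    cleaned = value.strip().upper().replace(".", "")
    return {"US": "US", "USA": "US"}.get(cleaned, cleaned)


def cleanse(rows):
    """Cleansing: standardize the country field and drop exact duplicate records."""
    seen, result = set(), []
    for r in rows:
        fixed = {**r, "country": standardize_country(r["country"])}
        # Sort by field name so the key does not depend on dict ordering.
        key = tuple(sorted(fixed.items(), key=lambda kv: kv[0]))
        if key not in seen:
            seen.add(key)
            result.append(fixed)
    return result


def monitor(rows, field, max_missing_rate):
    """Monitoring: check that the missing-value rate stays within a threshold."""
    if not rows:
        return True  # Assumption: nothing to flag when there are no records.
    rate = profile(rows, field)["missing"] / len(rows)
    return rate <= max_missing_rate


clean = cleanse(records)
print(profile(records, "country")["frequencies"])
print(len(records), "records before cleansing,", len(clean), "after")
print("email completeness acceptable:", monitor(clean, "email", 0.5))
```

In practice, the threshold passed to `monitor` and the rules inside `standardize_country` would be agreed with the data consumers, since fitness for purpose depends on their requirements.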