Data Quality (DQ) is “the planning, implementation, and control of activities that apply quality management techniques to data, in order to assure it is fit for consumption and meet the needs of data consumers.”
Since expectations about Data Quality are not always verbalized and known, an ongoing discussion is needed. Data Quality depends on context and the data consumer’s requirements.
A Short List of Data Quality Dimensions Includes:
- Uniqueness/Deduplication
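As a rough illustration of the uniqueness dimension, the sketch below measures how many distinct values a key field holds relative to the number of records, then removes repeats. The function names, the `email` key, and the sample records are hypothetical, chosen only for this example; real deduplication usually also needs fuzzy matching, which is not shown here.

```python
def uniqueness_ratio(records, key):
    """Distinct values of `key` divided by the number of records (1.0 if empty)."""
    values = [record[key] for record in records]
    if not values:
        return 1.0
    return len(set(values)) / len(values)


def deduplicate(records, key):
    """Keep only the first record seen for each value of `key`."""
    seen = set()
    unique = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


# Hypothetical sample data: two records share the same email address.
customers = [
    {"id": 1, "email": "ann@example.com"},
    {"id": 2, "email": "bob@example.com"},
    {"id": 3, "email": "ann@example.com"},
]

ratio = uniqueness_ratio(customers, "email")   # 2 distinct values over 3 records
deduped = deduplicate(customers, "email")      # keeps the records with id 1 and 2
```

A ratio below 1.0 signals that at least one key value is repeated. What counts as an acceptable ratio depends on the context and on the data consumer's requirements.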
Other Definitions of Data Quality Include:
- “Fit for a purpose. Meets the requirements of its authors, users and administrators.” (adapted from Martin Eppler) (Dr. Peter Aiken)
- “Reliance on accuracy, consistency and completeness of data to be useful across the enterprise.” (Michelle Knight)
- Tools and processes used for: (Gartner)
  - Parsing and standardization
  - Generalized “cleansing”
- Strong-Wang framework: (Wang and Strong, MIT, and DAMA DMBOK)
  - Intrinsic DQ:
  - Contextual DQ:
    - Appropriate amount of data
  - Representational DQ:
    - Ease of understanding
    - Representational consistency
    - Concise representation
  - Accessibility DQ:
    - Access security
A Few Uses of Data Quality Are:
- Increasing the value of organizational data and the opportunities to use it.
- Reducing the risk and cost associated with poor-quality data.
- Improving organizational efficiency and productivity.
- Protecting and enhancing the organization’s reputation.
- Data profiling.
- Data standardization.
- Data monitoring.
- Data cleansing.
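The activities named above (profiling, standardization, cleansing, and monitoring) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: every function name, the `country` column, and the 25% missing-value threshold are hypothetical and do not come from any particular Data Quality tool.

```python
def profile(rows, column):
    """Data profiling: count total, missing, and distinct values in a column."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v not in (None, "")]
    return {
        "total": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }


def standardize(rows, column):
    """Data standardization: trim whitespace and lower-case a text column."""
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left unchanged
        if isinstance(new_row.get(column), str):
            new_row[column] = new_row[column].strip().lower()
        result.append(new_row)
    return result


def cleanse(rows, column):
    """Data cleansing: drop rows whose value in the column is missing."""
    return [row for row in rows if row.get(column) not in (None, "")]


def monitor(rows, column, max_missing_rate):
    """Data monitoring: True when the missing-value rate is within the limit."""
    stats = profile(rows, column)
    if stats["total"] == 0:
        return True
    return stats["missing"] / stats["total"] <= max_missing_rate


# Hypothetical sample data with inconsistent formatting and missing values.
rows = [
    {"name": "Ann", "country": " US "},
    {"name": "Bob", "country": "us"},
    {"name": "Cho", "country": ""},
    {"name": "Dee", "country": None},
]

before = profile(rows, "country")
cleaned = cleanse(standardize(rows, "country"), "country")
after = profile(cleaned, "country")
passed_before = monitor(rows, "country", max_missing_rate=0.25)
passed_after = monitor(cleaned, "country", max_missing_rate=0.25)
```

In this sample, profiling first reports two missing values and two distinct spellings of the same country. After standardization and cleansing, one distinct value and no missing values remain, so the monitoring check passes. In practice such checks would run on a schedule, with thresholds agreed with data consumers.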