Data Quality and Data Governance have been ongoing concerns among global data technology vendors and enterprises for some time. Privacy and security issues are no longer the headaches of large enterprises alone; Big Data, Hadoop, Cloud Computing platforms, the Internet of Things (IoT), Platform-as-a-Service (PaaS) offerings, and social data have all added to rising Data Quality and Data Governance concerns. As the democratization and user autonomy of Business Intelligence (BI) and Analytics solutions continue to spread, how are small, medium, and large enterprises tackling quality and governance issues within their respective business infrastructures?
In recent times, many of the new technologies and trends mentioned above have promoted the democratization of business analytics on the one hand, while raising serious doubts about the quality and governance of data on the other. Thanks to these technologies, business data is no longer trapped in isolated silos, nor are BI and Advanced Analytics tools out of reach for small and medium businesses. Big Data, Hadoop, Cloud Computing, IoT, social media for business, and PaaS jointly contribute to a more democratic environment for BI and Analytics, enabling inclusive, real-time data analysis across a wide variety of platforms.
So how have these technologies affected Data Quality and Data Governance?
Why Data Quality?
All product and customer-centric data, whether in an ERP or CRM system, must be accurate, consistent, and complete to be useful across the enterprise. However, the on-the-ground reality is that Data Quality remains an unresolved challenge. According to Gartner, poor Data Quality will cost the average organization upwards of $8.2 million a year. Working with product or customer data from disparate sources without considering Data Quality can therefore lead to disastrous results. Gartner further states that Data Governance and Master Data Management (MDM) will play a major role in ensuring positive business outcomes. Data Quality and MDM are closely interlinked; to understand this relationship, review the DATAVERSITY® White Paper titled Emerging Trends in Metadata Management.
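As a rough illustration of two of the quality dimensions named above, the sketch below scores completeness and flags inconsistencies in a handful of customer records. The field names (`customer_id`, `email`, `country`) and both helper functions are hypothetical, chosen only for the example; they are not taken from Gartner or from any particular MDM product.

```python
from typing import Any, Iterable, Mapping

# Hypothetical schema: the fields every customer record is expected to carry.
REQUIRED_FIELDS = ("customer_id", "email", "country")


def completeness(records: Iterable[Mapping[str, Any]],
                 required: tuple = REQUIRED_FIELDS) -> float:
    """Share of required fields that are present and non-empty across all records."""
    records = list(records)
    if not records:
        return 1.0  # nothing to check, so nothing is missing
    total = len(records) * len(required)
    filled = sum(
        1
        for record in records
        for field in required
        if record.get(field) not in (None, "")
    )
    return filled / total


def inconsistent_keys(records: Iterable[Mapping[str, Any]],
                      key: str = "customer_id",
                      field: str = "country") -> set:
    """Return key values that appear with more than one distinct value for `field`."""
    seen: dict = {}
    for record in records:
        k, v = record.get(key), record.get(field)
        if k is None or v in (None, ""):
            continue  # missing values are a completeness problem, not a conflict
        seen.setdefault(k, set()).add(v)
    return {k for k, values in seen.items() if len(values) > 1}
```

Run against records merged from two source systems, a low completeness score or a non-empty set of conflicting keys is a signal that the data needs attention before it feeds any analysis.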
Why is a Data Governance Strategy Important?
The practice of Data Governance ensures that the data used for analysis is clean, error-free, compliant, secure, and in a usable state. Thus, it is imperative that members of an Enterprise Data Governance team are drawn from various business units, regulatory functions, and the IT department. These days, many companies recruit a Data Governance Officer (DGO) to lead the overall governance effort. With the ever-growing volume and complexity of Big Data, enterprises now have to pay close attention to operating policies, standards, rules, roles, processes, and procedures to execute consistent Data Governance tasks across an enterprise.
The article titled Why You Should Already Have a Data Governance Strategy encapsulates the Data Governance responsibilities into five core functions:
- Data Availability
- Data Consistency
- Data Selection
- Consistent Analytics, Metrics, and Reporting
- Data Compliance
The price of non-compliance can be huge for a business. One example is the European Union's General Data Protection Regulation (GDPR), which takes effect in May 2018 and sets the maximum penalty for a non-compliant business at €20 million (around $22 million) or four percent of total annual turnover, whichever is higher.
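The penalty ceiling can be worked through as simple arithmetic. The sketch below assumes the regulation's upper tier of EUR 20 million (roughly the $22 million cited above) or four percent of annual worldwide turnover, whichever is higher; the function name is hypothetical, and the result is only a ceiling, not a prediction of any actual fine.

```python
FIXED_CAP_EUR = 20_000_000      # upper-tier fixed cap (assumption stated above)
TURNOVER_PERCENT = 4            # upper-tier share of annual worldwide turnover


def max_gdpr_fine_eur(annual_turnover_eur: float) -> float:
    """Upper bound of a GDPR fine: the higher of the fixed cap and 4% of turnover."""
    if annual_turnover_eur < 0:
        raise ValueError("turnover cannot be negative")
    turnover_based = annual_turnover_eur * TURNOVER_PERCENT / 100
    return max(FIXED_CAP_EUR, turnover_based)
```

For a business with EUR 100 million in turnover, four percent is EUR 4 million, so the fixed cap of EUR 20 million applies; at EUR 1 billion in turnover, the turnover-based ceiling of EUR 40 million takes over.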
In the Bloomberg blog titled Understanding Data Governance, the common technical and business challenges of Data Governance are summed up as data sources, the setting of data quality parameters, customer orientation, and comprehensive analytics. Review the DATAVERSITY® White Paper Navigating the Data Governance Landscape for a deeper discussion of this issue.
Newer technologies affect the data architecture, policies, and procedures of enterprises in many ways; their potential effects on Data Quality and Data Governance cannot be overstated.
Whenever the topic of Big Data comes up, the rising concern among businesses is whether the existing talent pool is sufficient for handling the challenges of high-speed, high-volume, multi-source data. Big Data does not just imply a set of tools and techniques; data-enabled decision making requires an enterprise-wide change in mindset. The idea of Big Data has to be embraced first before any positive change can come about.
Another big challenge that Big Data has imposed on Data Governance is the required level of Data Quality to meet the enterprise Data Management goals. Also, given the complex nature of Big Data, an enterprise has to develop a strong Data Governance Framework to define roles, responsibilities, ownership, accountabilities, and any other related issues to enable consistent data handling policies across the business. The article titled Big Data: Changing the Way Businesses Operate focuses on the security requirements of Big Data and hints that a business IT framework must have a scalable security infrastructure to deal with the growing volumes of data, which may also be hierarchical in terms of classification and availability.
Internet of Things
Gartner has predicted that IoT business revenue will exceed $300 billion by 2020, and it is easy to imagine why its impact will be felt in the Big Data universe. This game-changing business model will force business owners and operators to upgrade their existing Data Management infrastructure. With the rise of smart devices and real-time data collection methods, setting consistent Data Governance policies can be a nightmare for enterprises.
The general security and privacy issues may include ownership, accountability, role-based access, and maintaining Data Quality and Data Governance. Businesses with ERP, BPM, or CRM systems have the additional responsibility of ensuring that critical data is always available within the supply-chain pipelines for instant decision making. With IoT devices, real-time data can affect decisions, overall management policies, and even on-the-spot announcements. Data ownership, availability, and accessibility therefore have to be clearly defined in Data Governance manuals to control breaches. Another problem may be the quality of data from interconnected devices: a malfunction or spurious data from one source can easily corrupt the entire system. Thus, to gain the maximum benefit from an Internet of Things infrastructure, the Data Governance policies must be properly validated to reduce risks and increase profitability. You may find more information in The Internet of Things of Enterprise Data.
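One way to keep spurious data from a single device out of downstream systems is to validate readings at the point of ingestion. The following is a minimal sketch under assumed conditions: the `Reading` type, the `split_readings` function, and the plain range check are all hypothetical, and a real deployment would likely need per-device rules and richer checks.

```python
import math
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Reading:
    """A single measurement reported by a device (hypothetical structure)."""
    device_id: str
    value: float


def split_readings(readings: Iterable[Reading],
                   low: float,
                   high: float) -> Tuple[List[Reading], List[Reading]]:
    """Separate plausible readings from those to quarantine for review.

    A reading is accepted only if its value is a finite number within
    the inclusive range [low, high]; everything else is quarantined.
    """
    accepted: List[Reading] = []
    quarantined: List[Reading] = []
    for reading in readings:
        if math.isfinite(reading.value) and low <= reading.value <= high:
            accepted.append(reading)
        else:
            quarantined.append(reading)
    return accepted, quarantined
```

Quarantining rather than discarding keeps the suspect readings available, so the owner of the device can be held accountable and the malfunction investigated.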
Four Big Data Governance Tasks to Prep for the Internet of Things talks about the impact of IoT on the global healthcare business. Data Governance is critical in healthcare as most data are related to patient care systems like EMRs and EHRs. The current challenge facing the healthcare industry is the absence of an integrated data technology platform for providing a single view of related healthcare data. A 2014 HIMSS Analytics survey indicates that about 40% of participating organizations admitted the absence of a formal EHR Data Governance policy.
As Cloud-based Data Analytics platforms range from private to hybrid systems, Data Governance can be a critical concern, especially if classified or sensitive business data is involved. In use cases where businesses have to comply with regulatory requirements, a private Cloud is often the only choice. For businesses using IoT devices on Cloud platforms, Data Governance can mean structuring strong policies to control data ownership, control rights, accountabilities, and role-based access. The biggest advantages of a well-implemented Data Governance Framework are increased efficiency and higher-quality Analytics.
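Role-based access, mentioned above, can be pictured as a lookup from roles to permitted actions. The sketch below is illustrative only: the role names, the action names, and the `is_allowed` helper are assumptions made for the example, not part of any specific Cloud platform's access model.

```python
# Hypothetical mapping of governance roles to the actions each may perform.
ROLE_PERMISSIONS = {
    "data_owner": frozenset({"read", "write", "grant"}),
    "data_steward": frozenset({"read", "write"}),
    "analyst": frozenset({"read"}),
}


def is_allowed(role: str, action: str) -> bool:
    """Return True only if the role is known and explicitly permits the action."""
    return action in ROLE_PERMISSIONS.get(role, frozenset())
```

Denying by default, so that an unknown role receives no permissions at all, is the safer choice when classified or sensitive data is involved.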
Social Media for Business
As indicated by the article titled Mission Impossible Data Governance Process Takes on Big Data, the biggest challenge of social data is that it is unstructured, and only a small portion of it is of actual value to business operators. Thus, the Data Governance policy in such a situation has to enable good decision making about data selection and selective data control. If such data is stored in Hadoop systems or NoSQL databases instead of Data Warehouses, that may involve a steep learning curve for all concerned stakeholders.
To avoid building a costly and high-maintenance Data Management infrastructure in-house, many businesses are now opting for the Platform-as-a-Service (PaaS) model, which is generally a Cloud-based solution. The biggest benefits of a hosted Data Management platform are scalability, low cost, zero maintenance, and managed services with full compliance. The Impact of Internet of Things on Big Data discusses how IoT affects Big Data solutions on the Cloud.
A Final Word
For those of you who wish to participate in the success stories of Data Quality and Data Governance, take a look at the EDW Agenda at a Glance. The Enterprise Data World 2017 Conference will take place in Atlanta, Georgia between the 2nd and 7th of April, 2017.