The numerous points of overlap between Data Management and Data Governance frequently obscure how these terms are used and which aspects of refining data each one covers.
Adding to the confusion is the fact that for every reference to Data Management and Data Governance, there is seemingly another for Information Management and Information Governance. There are also varying subsets of these terms such as Enterprise Information Management, Master Data Management, Metadata Management, Lifecycle Management, and myriad others.
Finally, there are a number of different theories (and theoreticians) about the roles that governance and management of data and information should play in the enterprise: How should they work independently? How should they work together? Is a ‘top down’ or ‘bottom up’ approach most effective?
In helping to clarify these terms and their relationships, this article will adhere to definitions and points of distinction from internationally recognized data-centric organizations, while adding more discussion points to the fray.
Data Management Encompasses Data Governance
Before distinguishing the terms data and information, it is best to start with the well-supported notion that governance is part of the overall management of data. Specifically, Data Management consists of several different realms; one of the most notable is Data Governance. This viewpoint is substantiated by the Data Management Maturity (DMM) model formulated by the Capability Maturity Model Integration (CMMI) Institute. The DMM has six categories for effective Data Management, including one for Data Governance. The Data Management Association (DAMA) likewise indicates in its Data Management Body of Knowledge (DMBOK) that Data Governance is a part of Data Management. In providing its definition of Enterprise Information Management, Gartner states that EIM is, “an integrative discipline for structuring, describing, and governing information assets across organizational and technological boundaries.” This definition not only underscores the closeness between the management and governance of data/information, but also reiterates the point that the management of data encapsulates its governance.
Distinguishing Governance from Management
After denoting that Data Governance is a part of Data Management, the next question involves identifying what exactly Data Management is. Governance is readily defined as the roles, responsibilities and processes that ensure accountability for and ownership of data assets in an orderly, sustainable manner over time. Different sources have any number of variations of this basic definition. Data Management, however, is more broadly defined and pertains to any aspect (and the collective sum) of ingesting or utilizing data in a repeatable process over time. Simply having and tending to a data warehouse (data warehousing), for example, is an aspect of Data Management. Defining who has access to it and how, as well as implementing various standards for Metadata and stewardship of such a repository in a long-term, sustainable way, are facets of governance. A second, broader definition holds that most subjects of articles or blogs found on DATAVERSITY concern some facet of Data Management; some of these are specific to Data Governance. A third, even broader definition is that Data Management represents any process or tool for ensuring that an organization has data; transforming that data into usable information is the job of Data Governance.
Distinguishing Information from Data
The third definition of Data Management alludes to the distinction between information and data, which functions as another point of ambiguity in determining the relationships and differences between the roles of management and governance. All information is data; not all data is information. Information is those data that are readily applied to business processes and which generate value in some form. To arrive at information, data must typically undergo a rigorous governance process in which useful data are separated from those which are not, and a number of key measures are implemented to make useful data trusted and used as information. The very point of data (and arguably of both Data Management and Data Governance) is to create and use information. Gartner’s glossary, for instance, does not expressly acknowledge the terms Data Management or Data Governance alone, and chooses instead to focus on variations of Information Governance and Information Management.
What Data Governance Encompasses: Roles
There are a limited number of roles related to formal Data Governance processes. Those typically include upper-level management personnel who have prioritized governance programs and facilitate funding. They also include a Governance Council made up of individuals in upper-level management and across business units who are tasked with assigning the responsibilities and roles of governance for specific, business-imperative processes. There are also Data Stewards who ensure that governance is adhered to and is actually helping to achieve business objectives. Additionally, there are ‘citizen’ stewards, those who are not specifically denoted as Data Stewards, yet who still take active roles in governance processes related to their business domain.
One of the most critical facets of Data Governance and its relationship to Data Management is that effective governance requires more than IT involvement, contrary to the common perception that it is solely an IT concern. Specifically, the business must become actively involved with governance the way it is with other aspects of Data Management (such as self-service analytics) in order for either discipline to benefit from that involvement. In most cases, the specific areas of governance pertain directly to the business, which is why the perception that governance merely involves IT is dated and should be abandoned.
What Data Governance Encompasses: Areas
The numerous areas of Data Governance incorporate some aspect of:
- Metadata: Metadata requirements call for consistent definitions of data elements and terms, which are oftentimes focused on a business glossary.
- Business Glossaries: It is critical for organizations to generate requisite definitions of business terms and their context—if not across the entire enterprise, then certainly for respective business units.
- Lifecycle Management: Lifecycle Management pertains to how long data is stored and where, as well as to how its uses might change over time. Certain lifecycle management factors are influenced by regulatory requirements.
- Data Quality: Specific measures for Data Quality include the process that data goes through in order for business units to trust it. Data Quality is so critical that some consider it distinct from governance, which it significantly enhances.
- Reference Data Management: Reference Data provides context to data, especially when considered in conjunction with Metadata; its governance is oftentimes overlooked because of the slow rate at which it changes.
Although all of the aforementioned areas are specific aspects of Data Management that Data Governance formalizes and accounts for, it is pivotal to note that all of an organization’s data should ideally adhere to governance conventions.
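As a rough illustration of how the areas listed above can attach to a single data element, the following Python sketch bundles a glossary definition, a named steward, a retention rule, and a set of reference values into one record. It is a minimal sketch, not a prescribed design: the class GovernedElement, its field names, the steward label, and the sample values are all hypothetical and do not come from the DMM, the DMBOK, or any particular tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class GovernedElement:
    """Hypothetical governance record for one data element."""
    name: str                  # Metadata: consistent element name
    definition: str            # Business Glossary: agreed business meaning
    steward: str               # accountable Data Steward (illustrative label)
    retention_days: int        # Lifecycle Management: how long data is kept
    allowed_values: frozenset  # Reference Data: permitted values, may be empty

    def is_expired(self, created: date, today: date) -> bool:
        """Lifecycle check: True once the record is older than the retention period."""
        return today - created > timedelta(days=self.retention_days)

    def quality_issues(self, values):
        """Data Quality check: return the values not found in the reference data."""
        if not self.allowed_values:
            return []  # no reference data defined, so nothing to check against
        return [v for v in values if v not in self.allowed_values]


# Illustrative element; every value here is an assumption made for the example.
country = GovernedElement(
    name="country_code",
    definition="Two-letter code of the customer's billing country",
    steward="finance-data-steward",
    retention_days=365 * 7,
    allowed_values=frozenset({"US", "DE", "JP"}),
)

print(country.quality_issues(["US", "XX", "DE"]))              # ['XX']
print(country.is_expired(date(2015, 1, 1), date(2024, 1, 1)))  # True
```

Keeping the definition, owner, retention rule, and reference values in one place mirrors the point that governance formalizes these areas together rather than treating each in isolation; a real program would hold this information in a catalog or glossary tool rather than in application code.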
Data Modeling is another critical facet of Data Management that depends on Data Governance, and operates as a nexus point of sorts between these two disciplines. One can argue that for governance protocols to extend across an entire organization and its business units, the organization benefits from applying modeling in a uniform way. Semantics are valuable for allowing a uniformity of modeling (especially when utilizing Big Data). One of the most effective means of ensuring governance throughout the enterprise is by utilizing modeling techniques that directly correlate to objectives in various areas of governance such as data lineage and Data Quality. This effect of modeling is particularly useful when incorporating unstructured data. Plus, modeling reinforces the structure and formality associated with governance.
The Critical Difference
Examples of other aspects of Data Management are found in the five other categories of the DMM, which include Data Management Strategy, Data Quality, data operations (lifecycle management), platforms and architecture (such as integration and architectural standards), and supporting processes (which focus on process and risk management among other factors). Again, the closeness of Data Governance and Data Management is underscored by the fact that Data Quality is frequently viewed in conjunction with governance or perhaps even considered one of its outcomes. Perhaps the best way of contextualizing these two realms is to understand Data Governance as responsible for formalizing any number of processes within Data Management, which itself is concerned with providing a set of tools and methods to ensure that enterprises actually have data to govern. Although Data Governance is a part of Data Management, the latter certainly needs the former to provision trusted information to inform pivotal business processes.