Digital libraries or information repositories such as Wikipedia are driven by a carefully governed Data Architecture, which provides anytime, anywhere, any topic access to these vast information resources. Data Governance policies related to Data Quality, ownership, stewardship, roles, and responsibilities are executed through the use of “metadata,” which provides context to data. Without metadata, even the best Data Governance plan will fail to execute. Thus, Metadata Management forms the core ingredient of any Data Governance program.
Anthony Algmin, Founder and CEO of Algmin Data Leadership, presented Governance through Glossaries, Dictionaries, and Catalogs at the DATAVERSITY® Enterprise Data Governance Online event, where he compared the old model of a hardcover encyclopedia with the newer, more efficient model of Wikipedia to explain how Data Governance drives the efficiency of information access and retrieval in a digital library.
Algmin’s experience includes many years of hands-on technology, management consulting, and executive leadership—along with a deep interest in data-driven disruptions. Algmin felt there was a strong need to move forward from the old-world encyclopedias to the new-world Wikipedia in a busy world of information seekers.
After the Death of Encyclopedias
According to Algmin, hardcover encyclopedias provided private access to vast information pools, but were costly to maintain and often incomplete. Then came the era of the internet, which revolutionized the scope and scale of “information access and retrieval” in the digital space. The digital library information architecture is driven by metadata, which in Algmin’s words can be considered as “anything that provides context to data. And glossaries, dictionaries, and catalogs are a great place to put a lot of that context.”
Algmin said he believes that, “in fact, glossaries, dictionaries, and catalogs are the bedrock, the foundation of good governance,” and that organizations need to keep changing and evolving them, along with Data Governance, as the organizations themselves evolve.
He continued to explain that:
“We’re just operating in an old model inside our organizations with something that really looks an awful lot to me like an encyclopedia of the old world, not a Wikipedia of the new world. What I want to take a moment to do is talk about for a moment what really matters before we get into the details with catalogs, dictionaries, and glossaries.”
He then spoke about data value and its importance. “I think this is a term, until you have a very specific definition, everyone has a sense of what we’re talking about, but may not all be aligned.”
Data Value Matters
Algmin’s definition of data value is “the realized difference between what you do with data analytics compared to what you would do without it.” Algmin thinks that data value ought to be measured in terms of:
- Increased revenue
- Reduced cost
- Quality of risk management
Constant measuring of data “usage, results, and impacts” will help “maintain relevancy.”
According to Algmin, terms like “Data Management” and “Data Governance” are great:
“But the value doesn’t exist until the business improves, and that is what we really need to be calibrating against, regardless of whether we’re talking about Data Quality or dictionaries and catalogs, or governance itself.”
He said it all comes back to the change these activities produce in the business: the difference between the results you get by doing them and the results you would have had without them. Such metrics are actually quite easy to measure, he commented.
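Algmin's definition of data value can be sketched as a simple comparison between an outcome achieved with data analytics and a baseline without it. The example below is only an illustration: the `Outcome` class, its field names, the use of an expected risk loss as a stand-in for "quality of risk management," and all of the figures are assumptions made for this sketch, not something presented in the talk.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """Business results for one period (hypothetical fields and units)."""
    revenue: float
    cost: float
    expected_risk_loss: float  # assumed proxy for quality of risk management

    def net(self) -> float:
        """Net result after costs and expected risk losses."""
        return self.revenue - self.cost - self.expected_risk_loss


def data_value(with_data: Outcome, without_data: Outcome) -> dict:
    """Split the realized difference into the three measures from the talk."""
    parts = {
        "increased_revenue": with_data.revenue - without_data.revenue,
        "reduced_cost": without_data.cost - with_data.cost,
        "reduced_risk_loss": (
            without_data.expected_risk_loss - with_data.expected_risk_loss
        ),
    }
    parts["total"] = sum(parts.values())
    return parts


# Made-up numbers purely for illustration.
baseline = Outcome(revenue=1000.0, cost=700.0, expected_risk_loss=50.0)
actual = Outcome(revenue=1100.0, cost=650.0, expected_risk_loss=30.0)

value = data_value(actual, baseline)
print(value)
```

The hard part in practice is estimating the baseline, since the "without data" outcome is never observed directly. The arithmetic itself stays this simple.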
Data Glossaries, Dictionaries, and Catalogs Defined
- Data Glossary: The data glossary is more of a business context with less detail.
- Data Dictionary: “A data dictionary gets a lot more detailed, with less of the contextual surroundings. You’ve got a glossary of kind of look up information, but it’s usually just kind of a term, and here’s some words.” Just like in a regular dictionary, you’re going to have pronunciation, figures of speech, different variants, and “you’re going to have where its origins were from.”
- Data Catalog: A data catalog “merges these dictionaries and catalogs, and really tries to articulate the relationships between the items. The catalog, to some extent, is a super set of the glossary and dictionary, but it tends to be focused a little bit more on the relationship.”
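One way to picture the three definitions above is as a small data model in which the catalog holds glossary terms and dictionary entries and records the relationships between them. This is a minimal sketch under stated assumptions: every class, field, and sample value here is hypothetical and chosen for illustration, and it does not represent any particular catalog product or Algmin's own design.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryTerm:
    """Business context with little technical detail."""
    name: str
    business_definition: str


@dataclass
class DictionaryEntry:
    """Technical detail for one data element, with little business context."""
    element: str  # hypothetical identifier, e.g. "crm.customers.customer_id"
    data_type: str
    source_system: str


@dataclass
class Catalog:
    """Superset of glossary and dictionary, focused on relationships."""
    terms: dict = field(default_factory=dict)
    entries: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # (term name, element) pairs

    def add_term(self, term: GlossaryTerm) -> None:
        self.terms[term.name] = term

    def add_entry(self, entry: DictionaryEntry) -> None:
        self.entries[entry.element] = entry

    def link(self, term_name: str, element: str) -> None:
        """Relate a business term to a technical data element."""
        if term_name not in self.terms or element not in self.entries:
            raise KeyError("both the term and the element must be registered")
        self.links.append((term_name, element))

    def elements_for(self, term_name: str) -> list:
        """Return the dictionary entries related to a glossary term."""
        return [self.entries[e] for t, e in self.links if t == term_name]


catalog = Catalog()
catalog.add_term(GlossaryTerm("Customer", "A party that has bought from us."))
catalog.add_entry(DictionaryEntry("crm.customers.customer_id", "INTEGER", "CRM"))
catalog.link("Customer", "crm.customers.customer_id")

related = catalog.elements_for("Customer")
print([entry.element for entry in related])
```

Keeping the links as a separate list reflects the idea that the catalog's distinctive contribution is the relationships, while the glossary and dictionary items can still be read on their own.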
Algmin thinks what is lacking is “momentum drivers,” which could be the processes or the machinery that drive people. He commented that Data Governance organizations often have big ambitions but lack resources, leaving them stuck “between expectations and incentivization.” These disparities, he said:
“Unravel the system that we’re trying to build, and so much about data is about building systems that create momentum and can grow over time as opposed to systems that constantly need additional resourcing from the outside.”
The Role of the Data Governance Library
The primary role of the Data Governance library is to provide “scale and the ability to build greater capabilities and sustainable systems that can grow over time, without us being a central bottleneck.” The library holds catalogs, people, dictionaries, glossaries, and other such items. He called it the “missing link,” one that provides the needed empowerment and incentivization so that “the system feeds itself.”
Who are the Target Beneficiaries?
- Data professionals, such as technologists, data stewards, and data analysts
- Data consumers, who use data for business benefit
Algmin remarked that:
“The data consumers are the ones that are really saying, ‘I really just wish I had a better understanding of the universe around me so that I could take the optimal actions.’ Well, we can drive that behavior. That’s how we can really make a tremendous difference.”
Pragmatism over Perfection
Algmin repeatedly warned against chasing “perfection.” He suggested that “pragmatism over the pursuit of perfection” is the key to Data Governance success because:
- Value creation happens only when glossaries, dictionaries, and catalogs are put to actual use
- People are provided a platform for easy sharing of business insights
- Realized benefits overcome usage costs
Planning for the Future
- Who will be the future beneficiaries of the Data Governance program?
- How to minimize friction?
- Can the “unpredictable” be accommodated?
“There’s a lot of costs that aren’t necessarily traditional costs that we need to overcome to create these tools, to create viable Data Governance. Who cares about these tools? Imagine the future.”
It’s necessary to understand why you are building such capabilities, who is expected to use the tools, where the data value exists within them, how they will fit into regular operations, and what “positive outcomes” they should produce. Otherwise, they will just be disruptive and not used effectively.
Theory without Execution Creates No Value – The CDO
He concluded his talk by stating that:
“I firmly believe, and I know I’m a minority in this, but I firmly believe that the Chief Data Officer should be part of the IT organization. And the reason is that if not part of the IT organization, why does the IT organization exist?”
The CDO deals with technology and with information systems, and is really about moving the understanding of data throughout the organization, so the CDO is an IT function, he said. That doesn’t mean that the CDO is not part of the business: “it’s connected deeply throughout the rest of the business, and it should be.”
For Algmin, the biggest issue here is that IT organizations are held too far away from the business, rather than being directly involved in business decisions as a partner. “We need to learn how to work together in a multi-directional manner.” It’s all about utilizing the collective skill sets of everyone to change the business for the better.
Check out Enterprise Data Governance Online at http://datagovernanceonline.com/