Metadata is of critical interest in today’s data-driven business environment, and there is a growing awareness of the need for both business and technical users to embrace it, says Donna Burbank, Managing Director of Global Data Strategy and author of “Data Modeling for the Business” and “Data Modeling Made Simple.” Speaking at DATAVERSITY®’s Data Architecture Online, Burbank talked about practical metadata management and the role metadata plays in Data Governance and Data Strategy.
Many people assume that metadata is something very technical and complex, but its purpose is simple to understand: Metadata provides context for business and technical users by providing the “who, what, where, when, and why” of data.
Metadata provides answers to these questions:
- Who owns the data? Who is the data steward?
- Who is using the data?
- Who is regulating the data in terms of HIPAA requirements, etc.?
- What are the data technical structures?
- What are the business rules?
- What is its purpose?
- What are commonly used abbreviations and acronyms?
- What privacy and security policies (such as GDPR) apply to this data?
- What deletion and retention rules apply?
- Where is it stored? Where is the backup?
- Where is the source of the data? Where did the data come from?
- When was it updated?
- Why are we storing this?
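As a rough illustration of how these questions might translate into a record that tools and people can share, here is a minimal sketch. The class name, field names, and sample values are all hypothetical and were not part of Burbank's presentation; a real repository or catalog would define its own schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class DatasetMetadata:
    """Hypothetical record grouping the who/what/where/when/why of one dataset."""
    name: str
    # Who
    owner: str
    steward: str
    regulations: List[str] = field(default_factory=list)
    # What
    purpose: str = ""
    retention_days: Optional[int] = None
    # Where
    storage_location: str = ""
    source_system: str = ""
    # When
    last_updated: Optional[date] = None
    # Why
    business_reason: str = ""

    def unanswered(self) -> List[str]:
        """Return the names of fields that are still empty."""
        return [k for k, v in vars(self).items() if v in ("", None, [])]


# Illustrative values only.
record = DatasetMetadata(
    name="patient_visits",
    owner="Clinical Operations",
    steward="J. Doe",
    regulations=["HIPAA"],
    purpose="Track outpatient visits",
    storage_location="warehouse.clinical.patient_visits",
    last_updated=date(2024, 1, 15),
)
missing = record.unanswered()
print(missing)
```

Listing the unanswered fields is one simple way to see which questions still need an owner before the record is published.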
Start Small with Metadata Management
With so many new tools available to easily scan a lot of metadata, it’s tempting to get excited about scanning everything and going for volume. It’s better to start small, Burbank said, get buy-in from people who really understand, and curate a small set of metadata. Before jumping in, understand why this metadata is important and what the business drivers are so that it can be organized and presented from the business perspective. Holistically, all of this is important, she said, but when it’s presented, information needs to be prioritized and focused in terms of the audience.
Business vs. Technical Metadata
The following are examples of types of business and technical metadata:
Business metadata:
- Definitions and glossary: What do we mean by “customer”? “Product”? Is “month” a fiscal month or calendar month?
- Data Stewardship
- Privacy level
- Security level
- Acronyms and abbreviations
- Business rules
- KPIs and metrics: business rules, KPI definitions
Technical metadata:
- Column structure of a database table
- Data type and length (e.g., VARCHAR)
- Standard abbreviations (e.g., CUSTOMER = CUST)
- Keys (primary, foreign, alternate, etc.)
- Validation rules
- Data movement rules
Data Governance and Architecture: Part of a Wider Data Strategy
A successful Data Strategy links business goals with technology solutions, and Burbank presented her framework for a wider Data Strategy, noting that metadata is one of the foundations of that strategy. “But metadata doesn’t live alone,” she said. Metadata ties together the people, the processes, and the company culture with Data Governance.
Metadata Management Is Hot
Burbank has been an integral part of DATAVERSITY’s annual survey, “Trends in Data Management.” In the most recent survey, over 80% of respondents stated that metadata is as important as, if not more important than, in the past. She has also had a growing number of clients request help with metadata strategy, which she sees as an exciting new development.
Who Uses Metadata?
A wide variety of roles creates and consumes metadata across the organization, both in business and IT.
“Anybody who’s touching data will be touching metadata,” said Burbank. Survey respondents identified business users as the largest group of metadata consumers, and she believes that’s because they can benefit most from data in context.
Metadata is used in every department throughout the enterprise. A developer needs to know what else might be affected if a particular field is changed. A business user needs to understand the definition of “regional” sales and how “market” is different from “region.” The data warehouse architect needs to understand source-to-target mappings for the data warehouse, and HR can use the data dictionary to give new hires an orientation to business terminology.
Metadata Management Is No Longer Optional
Companies in the banking and finance sector have never seen lineage and data definitions as optional, by the nature of their business, but other market sectors have been able to get away without them. Finance department reports must show where the money comes from, and everything about its history and location must be abundantly clear.
While working with a new client, Burbank was explaining to a group of executives the importance of metadata and why lineage and definitions need to be documented. Their finance manager was shocked to find that none of this was already required for all departments. The increased interest in metadata is a sign that more businesses are understanding the critical role metadata plays in success.
Data Governance: A Critical Enabler for Metadata Management
Everyone wants to use metadata, but not everyone wants to do the hard work of creating it, Burbank said. Data Governance creates the roles, policies, procedures, and organizational structures to facilitate metadata management.
Multiple roles work together to create business and technical metadata, and Data Governance assigns ownership and roles for its creation and maintenance. The finance department, for example, is responsible for creating and managing the structure for entering employee travel expenses. The data architect is responsible for data lineage and, along with the DBA, is responsible for naming standards.
Part of Everyone’s Day Job
It’s essential to be very specific when defining metadata and who manages each piece, but Burbank cautioned against relegating all metadata to one team or department. Not only is it impossible to understand glossary terms for every department without proper context, it’s just too much for one team to do: “It takes a village – metadata has to be part of everyone’s day job.”
Capturing and Storing Business Metadata
Much of a company’s history and business metadata is often held by one or two employees who are the only ones who remember how and why things are done a certain way. An employee who “just knows” that the component number field is actually the part number field should not be the only source for that information. It is important to capture this metadata in an electronic format for sharing with others. “What’s obvious to one person is never obvious to another person, so get it out of people’s heads and put it in a glossary, a metadata repository, or a data catalog,” said Burbank.
Capturing and Storing Technical Metadata
Human input is typically necessary when designing and creating systems for metadata management, but the discovery of technical metadata can often be automated. Technical metadata can be used “top-down,” i.e., to design and implement systems, as well as “bottom-up,” to discover metadata embedded in existing systems.
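To make the “bottom-up” idea concrete, here is a minimal sketch that reads column names, types, and keys back out of an existing database. It uses Python's built-in `sqlite3` module and SQLite's `PRAGMA table_info` purely as a stand-in; the `cust` table and the `discover_columns` helper are hypothetical, and commercial scanners work against many more platforms.

```python
import sqlite3


def discover_columns(conn, table):
    """Read technical metadata for one table from SQLite's own catalog.

    PRAGMA table_info returns rows of
    (cid, name, type, notnull, dflt_value, pk).
    The table name is assumed to come from a trusted source.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [
        {
            "column": r[1],
            "type": r[2],
            "nullable": not r[3],
            "primary_key": bool(r[5]),
        }
        for r in rows
    ]


# Hypothetical existing system whose structure we want to document.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cust ("
    "cust_id INTEGER PRIMARY KEY, "
    "cust_nm VARCHAR(50) NOT NULL, "
    "region TEXT)"
)

columns = discover_columns(conn, "cust")
for col in columns:
    print(col)
```

The scan recovers structure only. That `cust_nm` means “customer name” is business metadata that still has to come from a person.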
Data Lineage: Reporting and Data Warehouse Example
Metadata exists in a number of tools and data stores that are used to generate the final figure on a given report – metadata can help show that path. Burbank showed how metadata can be used to support figures in a sales report, giving the user the confidence to say, for example, “We had total sales per customer of $1.5 million this quarter,” knowing that the definitions for “customer,” the type of currency (U.S. dollars, etc.), and unit of time (calendar quarter) are all documented and understood.
Automated tools can pull needed data to populate the report, and metadata can provide information about where the data came from, even in cloud or big data environments. Many tools can scan in physical, logical, or conceptual data models as well.
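One simple way to picture lineage is as a graph that links each report figure back to the fields it is derived from. The sketch below assumes a hand-written mapping with hypothetical system and field names; it is not drawn from Burbank's slides, and real lineage tools typically build this graph by scanning ETL jobs and models.

```python
# Hypothetical lineage: each target maps to the sources it is derived from.
LINEAGE = {
    "sales_report.total_sales_per_customer": [
        "dw.fact_sales.amount_usd",
        "dw.dim_customer.customer_id",
    ],
    "dw.fact_sales.amount_usd": ["crm.orders.amount"],
    "dw.dim_customer.customer_id": ["crm.customers.cust_id"],
}


def upstream(node, lineage=LINEAGE):
    """Return every source that feeds `node`, nearest first (breadth-first)."""
    seen, queue = [], list(lineage.get(node, []))
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(lineage.get(current, []))
    return seen


def downstream(node, lineage=LINEAGE):
    """Return every target affected if `node` changes (impact analysis)."""
    affected, frontier = [], [node]
    while frontier:
        current = frontier.pop(0)
        for target, sources in lineage.items():
            if current in sources and target not in affected:
                affected.append(target)
                frontier.append(target)
    return affected


sources = upstream("sales_report.total_sales_per_customer")
impact = downstream("crm.orders.amount")
print(sources)
print(impact)
```

Walking the graph upstream answers “where did this figure come from?”, while walking it downstream answers the developer's question of what else might be affected if a field is changed.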
While some large and/or highly regulated organizations need a very sophisticated tool for data lineage, not every company does. “Think of your use case, do an inventory of where your data is and that will help define what you need for your metadata tool because there’s a lot and you might only need pieces,” Burbank advised.
Start small, with something that could be a quick win, such as a business glossary. For example, it’s perfectly acceptable to start with a simple webpage or SharePoint site for financial terms, so users can look up “credit default swap,” and understand the agreed-upon definition for their company. The buy-in from those initial users can help with expansion down the road.
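A starter glossary does not need much structure. The sketch below is one hypothetical shape for it: the `Glossary` and `GlossaryEntry` names, the steward value, and the sample definition are illustrative assumptions, not an agreed-upon company definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlossaryEntry:
    term: str
    definition: str
    steward: str  # who maintains the definition


class Glossary:
    """Minimal in-memory business glossary with forgiving lookups."""

    def __init__(self):
        self._entries = {}

    @staticmethod
    def _key(term):
        # Ignore case and extra whitespace so "Credit  Default Swap" still matches.
        return " ".join(term.lower().split())

    def add(self, entry):
        self._entries[self._key(entry.term)] = entry

    def lookup(self, term):
        # Returns None when the term has not been defined yet.
        return self._entries.get(self._key(term))


glossary = Glossary()
glossary.add(
    GlossaryEntry(
        term="credit default swap",
        # Illustrative wording only; each company agrees on its own definition.
        definition="A contract in which the seller compensates the buyer "
                   "if a borrower defaults.",
        steward="Finance",
    )
)

entry = glossary.lookup("  Credit   Default Swap ")
print(entry.steward if entry else "not defined")
```

Recording a steward alongside each term keeps the ownership question visible from the start, even in a glossary this small.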
Architectural Options for Metadata Management
Common architectural options for metadata management within and between organizations include a central, enterprise-wide metadata repository to facilitate publication and sharing, tool-specific or purpose-specific repositories (ETL tool, BI tool, data dictionary, etc.), and metadata exchange and registry for information sharing and standards. These can be used together within the same organization, Burbank said: “There is no ‘one-size-fits-all’ approach.”
Scale Tools to Organizational Need
With the increased interest in metadata, there is now a multitude of tools available to support and manage it. Burbank recommended a bit of caution when shopping: “Because it’s hot. Everyone says they do it.” Whether working with vendors who’ve been working with metadata solutions for many years, or with a company new to the space, “Just be really clear on your use cases so that when you buy a tool, it’s really going to match your needs and scale accordingly.”