Selection Criteria for Business Glossary Tools

by Sunil Soares

A business glossary is a repository that brings together common definitions of key terms across business and IT. The most ubiquitous business glossary tool is Microsoft Excel. Indeed, many organizations start their data governance journey by documenting key business terms in spreadsheets that they then upload to Microsoft SharePoint. However, this becomes unwieldy once the glossary grows into the hundreds of business terms. At this point, organizations start to look at business glossary software tools.

There are a number of business glossary tools that are part of broader data governance suites from different vendors. These offerings include Adaptive Business Glossary Manager, Collibra Business Semantics Glossary, IBM InfoSphere Business Glossary, Informatica Business Glossary, SAS Business Data Network and the Business Glossary feature of Sybase PowerDesigner. I will not provide an evaluation of each vendor’s offering. Rather, I will propose a set of criteria for selecting a business glossary tool.

1. Ease-of-Use

Business glossaries are meant to be used by business users. It goes without saying that business users will not use the glossary if it is hard to create, edit and delete business terms. There are several ease-of-use features but one is particularly interesting. Consider that you have two Dodd-Frank-related terms in banking:

a. “Depository Institution Holding Company” means a bank holding company or a savings and loan holding company that is organized in the United States, including any bank or savings and loan holding company that is owned or controlled by a foreign organization, but does not include the foreign organization.
b. “Savings and Loan Holding Company” means any company that directly or indirectly controls a savings association or that controls any other company that is a savings and loan holding company.

As you will note, the second term is embedded in the definition of the first term. As a result, the two terms need to be linked. This can be done manually, but can become a real challenge when dealing with thousands of business terms in the glossary. Any tool that can auto-link these terms will offer tremendous advantages in terms of productivity of the data stewards.
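
The auto-linking idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how any particular vendor implements it: the `auto_link` function and the dictionary layout are hypothetical, and the matching is a simple case-insensitive whole-phrase search over each definition.

```python
import re

def auto_link(glossary):
    """Return {term: [other glossary terms mentioned in its definition]}.

    `glossary` is a hypothetical {term: definition} dictionary.
    """
    links = {}
    for term, definition in glossary.items():
        mentioned = []
        for other in glossary:
            if other == term:
                continue  # a term is not linked to itself
            # Case-insensitive, whole-phrase match of the other term's name
            pattern = r"\b" + re.escape(other) + r"\b"
            if re.search(pattern, definition, flags=re.IGNORECASE):
                mentioned.append(other)
        links[term] = mentioned
    return links

glossary = {
    "Depository Institution Holding Company": (
        "a bank holding company or a savings and loan holding company that "
        "is organized in the United States, including any bank or savings "
        "and loan holding company that is owned or controlled by a foreign "
        "organization, but does not include the foreign organization"
    ),
    "Savings and Loan Holding Company": (
        "any company that directly or indirectly controls a savings "
        "association or that controls any other company that is a savings "
        "and loan holding company"
    ),
}

links = auto_link(glossary)
for term, linked in links.items():
    print(term, "->", linked)
```

A real tool would also need to handle plurals, synonyms and overlapping term names, which is where the productivity gain over manual linking becomes significant.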

2. Cost

Cost is another important factor to consider, especially for entry-level data governance programs. Obviously, the marginal cost of Microsoft Excel is zero for many organizations, which is why it is the tool of choice for many business glossary implementations. In addition, several data governance tool suite vendors will offer their business glossaries at a low price to provide an end-to-end solution.

3. Support for Cloud

I don’t see a lot of software vendors supporting business glossary tools in the cloud. However, I believe that business glossaries are tailor-made for the cloud because they contain only a limited amount of sensitive data such as Personally Identifiable Information (PII) and Protected Health Information (PHI).

4. Integration with Technical Metadata

Business glossaries should be part of a broader metadata initiative that supports data lineage and impact analysis. Metadata architects should be able to easily link business terms to the associated technical metadata. So a term called “customer number” should be linked to CUST_NUM. This means that the business glossary is supported by a metadata repository that can ingest metadata from a heterogeneous environment consisting of data modeling tools, ETL jobs, business intelligence reports and other technical artifacts.
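
As a rough sketch of what such a link looks like as data, the Python below maps business terms to technical assets and then inverts the mapping for a basic impact analysis. Apart from CUST_NUM, every schema, table and column name here is made up for illustration, and the functions are hypothetical rather than any repository's actual API.

```python
from collections import defaultdict

# Hypothetical links from business terms to technical metadata.
# Only the column name CUST_NUM comes from the example above.
term_links = {
    "customer number": ["CRM.CUSTOMER.CUST_NUM", "DW.DIM_CUSTOMER.CUST_NUM"],
    "net sales": ["DW.FACT_SALES.NET_SALES_AMT"],
}

def build_reverse_index(term_links):
    """Invert {term: [assets]} into {asset: [terms]}."""
    index = defaultdict(list)
    for term, assets in term_links.items():
        for asset in assets:
            index[asset].append(term)
    return dict(index)

def impacted_terms(asset, term_links):
    """Impact analysis: which business terms depend on a technical asset?"""
    return build_reverse_index(term_links).get(asset, [])

print(impacted_terms("CRM.CUSTOMER.CUST_NUM", term_links))
```

The reverse lookup is what lets a metadata architect answer "which business terms are affected if this column changes?"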

5. Linkage to Data Policies and Rules

Business glossaries are rapidly emerging as the repository for data governance policies and rules in addition to business terms. The tool should therefore store data policies and rules and let you link those artifacts to the associated business terms. For example, you should be able to create a term called “minor” and link it to a rule that states that “a minor must have a guardian.”

6. Easy Search from Reports

It is one thing to create a business glossary, and quite another thing to ensure that it is actually leveraged by business users. Many business glossary tools offer a desktop widget that links the business glossary to the report or application. For example, you can highlight a term in Cognos or MicroStrategy, and then press Shift+F5 to pull up the definition from the business glossary. This is a great selling feature with business users because they have definitions at their fingertips.

7. Stewardship

A business glossary should allow administrators to assign business terms, categories of terms, data policies and data rules to stewards. This enables a federated approach to data governance by allowing the business to own and manage key business terms. The business glossary tool should provide support for a data stewardship dashboard. The dashboard provides visibility in terms of the number of business terms assigned to each data steward as well as the number that have been approved or are still in the pending approval state.
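
The counts behind such a dashboard are simple to compute. The Python sketch below assumes a hypothetical list of (term, steward, status) assignments and tallies, per steward, how many terms are assigned, approved or pending approval. The steward names and statuses are illustrative only.

```python
from collections import Counter, defaultdict

# Hypothetical assignments: (business term, steward, approval status)
assignments = [
    ("net sales", "finance steward", "approved"),
    ("gross margin", "finance steward", "pending approval"),
    ("employee identifier", "hr steward", "approved"),
    ("customer number", "sales steward", "pending approval"),
    ("industry classification", "sales steward", "pending approval"),
]

def stewardship_dashboard(assignments):
    """Return {steward: {"assigned": n, <status>: n, ...}}."""
    summary = defaultdict(Counter)
    for _term, steward, status in assignments:
        summary[steward]["assigned"] += 1
        summary[steward][status] += 1
    return {steward: dict(counts) for steward, counts in summary.items()}

dashboard = stewardship_dashboard(assignments)
for steward, counts in dashboard.items():
    print(steward, counts)
```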

8. Workflow

Business glossary tools should support simple or complex workflows to allow multiple parties to participate in the creation of a term. For example, the workflow for the term “net sales” in retail might involve multiple approvals by marketing, merchandising and finance. These workflows should be fully configurable. The data stewards will instantly become more productive as the software tool will shield them from routine tasks and dealing with multiple email trails.
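
To make the “net sales” example concrete, here is a minimal Python sketch of a multi-party approval workflow. The `TermWorkflow` class is hypothetical, and it assumes the simplest possible policy: a term is approved once every listed party has signed off, in any order. Real tools typically add ordering, escalation and notifications on top of this.

```python
class TermWorkflow:
    """Tracks approvals for one business term (illustrative only)."""

    def __init__(self, term, approvers):
        self.term = term
        self.pending = list(approvers)  # parties that still need to approve
        self.approved_by = []           # parties that have approved so far

    @property
    def status(self):
        return "approved" if not self.pending else "pending approval"

    def approve(self, party):
        if party not in self.pending:
            raise ValueError(f"{party} has no pending approval for {self.term!r}")
        self.pending.remove(party)
        self.approved_by.append(party)


workflow = TermWorkflow("net sales", ["marketing", "merchandising", "finance"])
workflow.approve("marketing")
workflow.approve("finance")
print(workflow.status, workflow.pending)

workflow.approve("merchandising")
print(workflow.status)
```

Making the list of approvers a parameter is what keeps the workflow configurable per term or per category.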

9. Custom Attributes

Business glossary tools should allow the creation of custom attributes. For example, a data governance team might want to create a custom attribute called “Sensitive (Y/N)” to flag certain terms such as Social Security Number in the U.S.

10. Linkage to Reference Data

Business glossary tools should also provide a mechanism to link a business term to the associated reference data. For example, you might want to link a business term called “industry classification” with the list of allowable values for North American Industry Classification System (NAICS) codes from the U.S. Census Bureau.

11. Linkage to Data Quality Tools

A business glossary tool should support the linkage of business terms and rules to the actual data rules in the data quality tool. For example, the business glossary may include the definition for the term “employee identifier.” It may also include a data rule that “employee identifier is a six-digit alphanumeric code that is unique across employees.” Both the business term and the data rule are assigned to a human resources data steward in the business glossary tool. The data quality tool will then implement a set of data rules to identify exceptions to the rules in the business glossary. The metadata architect should be able to link the business term and data rule in the business glossary with the technical implementation of the data rule in the data quality tool. This provides end-to-end governance over critical terms and policies from the business to the technical implementation.
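
The technical side of the “employee identifier” rule might look something like the Python sketch below. It is an assumption-laden illustration, not the output of any data quality product: it reads the rule as “exactly six alphanumeric characters, unique across employees,” and the function name and sample identifiers are made up.

```python
import re
from collections import Counter

# Assumed reading of the rule: exactly six letters or digits.
ID_PATTERN = re.compile(r"[A-Za-z0-9]{6}")

def find_exceptions(employee_ids):
    """Return (identifier, reason) pairs for values that break the rule."""
    counts = Counter(employee_ids)
    exceptions = []
    for emp_id in employee_ids:
        if not ID_PATTERN.fullmatch(emp_id):
            exceptions.append((emp_id, "not six alphanumeric characters"))
        elif counts[emp_id] > 1:
            exceptions.append((emp_id, "duplicate"))
    return exceptions

sample_ids = ["A12345", "B2", "A12345", "C9X8Y7"]  # made-up sample data
exceptions = find_exceptions(sample_ids)
for emp_id, reason in exceptions:
    print(emp_id, "->", reason)
```

Linking this check back to the glossary entry is what lets the human resources data steward see which technical rule enforces the definition they own.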

12. Linkage to Master Data Management

Many MDM hubs have hundreds of data rules that become opaque over time. For example, an MDM hub may implement a data rule that “source address should contain at least one address line and either a postal code or a city.” The data steward should also document this data rule in the business glossary. The metadata architect should then link the data rules in the glossary and the MDM hub to provide end-to-end governance from the business to the technical implementation.
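
The address rule quoted above is easy to state in code, which is partly why such rules pile up unnoticed inside a hub. The following Python sketch is one possible reading of it. The field names (`address_lines`, `postal_code`, `city`) are assumptions for illustration and do not come from any specific MDM product.

```python
def address_is_valid(address):
    """Check: at least one address line AND (a postal code OR a city).

    `address` is a hypothetical dict with optional keys
    "address_lines" (list of str), "postal_code" (str) and "city" (str).
    """
    lines = address.get("address_lines") or []
    has_line = any((line or "").strip() for line in lines)
    has_postal_code = bool((address.get("postal_code") or "").strip())
    has_city = bool((address.get("city") or "").strip())
    return has_line and (has_postal_code or has_city)

print(address_is_valid({"address_lines": ["1 Main Street"], "city": "Springfield"}))
print(address_is_valid({"address_lines": ["1 Main Street"]}))
```

Documenting the same rule in plain language in the glossary, and linking the two, keeps the hub's logic from becoming opaque over time.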

Many organizations are positioning the business glossary as the centerpiece of their data governance programs. Every data governance program requires people and process. However, a robust tooling infrastructure will make business and IT stewards more productive. Hopefully, these thoughts will help. Comments appreciated!

Sunil Soares

Sunil Soares is the founder and managing partner of Information Asset, LLC, a consulting firm that specializes in helping organizations build out their information governance programs. Prior to this role, Sunil was the Director of Information Governance at IBM, and worked with clients across six continents and multiple industries. Sunil has published a book called The IBM Data Governance Unified Process that details the fourteen steps and almost one hundred sub-steps to implement an Information Governance program. The book is currently in its second print and has also been translated into Chinese. Sunil’s second book Selling Information Governance to the Business: Best Practices by Industry and Job Function reviews the best way to approach Information Governance by industry and function. Sunil’s third book called Big Data Governance will review the importance of information governance for different types of big data such as social media, machine-to-machine, big transaction data, biometrics, and human generated data. 

  4 comments for “Selection Criteria for Business Glossary Tools”

  1. June 26, 2013 at 7:52 am

    Dear Sunil,

    Thanks for putting a clear list of requirements out there. It shows exactly how software for a Business Glossary is not just a filing cabinet for a spreadsheet that went through a whole process in meetings, mails and other communication. The whole process has to be supported, so you can enable adoption, ownership and engagement.

    Kind regards,

    Stan

  2. June 27, 2013 at 9:28 am

    Sunil, I could not agree more with your comments, alignment points and business value call-outs in your blog. Thank you for creating a very succinct value proposition for Glossary creation. You also call out nicely how companies can build out their Glossary over time.

    While not specifically mentioned, I also see a value play in following this path to establishing a semantic model through an ontological representation of those terms and their associations. As I know you are well aware, improvements have been made around the creation and management of Ontologies. Your approach takes you right to that doorstep.

    My question then is, do you know if any of the metadata solutions you mentioned enable the representation of the metadata collected in a machine executable model?

    • June 28, 2013 at 3:05 am

      Hi Joe – we know quite a bit about semantics & ontology, and making things machine executable at Collibra. Let me know if you would like to learn more.

      A first glance at the fit with the requirements above is shown at http://inside.collibra.com/?p=845 (second video).

  3. Richard Ordowich
    July 22, 2013 at 8:05 am

    Before embarking on the creation of a business glossary, data dictionary or metadata repositories there are some critical factors to consider.

    1. Who will use the glossary, for what purposes and how frequently (daily, weekly, monthly)?
    2. What are the policies, and practices for naming, definitions, and semantics?
    3. Who will administer the glossary?
    4. Who is authorized for CRUD (Create, Read, Update, Delete) of the glossary?
    5. A glossary should be constructed on the basis of a controlled vocabulary. Who will create and maintain the controlled vocabulary?
    6. What are the authoritative sources for the contents of the glossary?
    7. What are the quality metrics for the glossary?
    8. What are the usage policies for the glossary? Who is mandated to use it and under what circumstances?
    9. What business benefits will be derived from the glossary?

    After considering and responding to these questions, then a decision can be made as to whether to proceed with the creation of a glossary and if so, which platform would be suitable. Creating and maintaining a glossary is a long and tedious process and requires preplanning.

    Another consideration is that data dictionaries, business glossaries and metadata repositories are subject to neglect. How sustainable will your glossary be?
