
A Grand Unified Theory of Metadata

February 17, 2014  /  8 Comments

by Ian Rowlands

There’s a wonderful idea in particle physics called a “Grand Unified Theory.” These theories unify the interactions of the basic forces – and if one is ever proved true, then all these basic physical forces will turn out to be aspects of the same force. (OK, yes, physics geeks, I’ve really over-simplified it; please forgive me.)

I’d like a Grand Unified Theory of metadata please! Since time immemorial (or at least, for a very long time) two information worlds have coexisted. “Data” and “Content” have been separated by an invisible, but impenetrable, barrier. I have been happy to see a few brave souls recently seeking to bridge the two by using Business Glossaries to link to items in both worlds to provide a two-dimensional view of interesting topics.

A new, and much bigger, planet has appeared in the information cosmos. “Big Data” is suddenly filling the sky … and I fear another chasm opening between the information worlds. It’s time to tie everything together before the forces of chaos are irrevocably strengthened.

Many people have, I think, toyed with ideas that fit with this notion. In my heart of hearts I think I believe (was that certain enough for you?) that the right way to do this is to define a business ontology. The classes (“kinds of things”) would then be instanced as business terms, logical definitions and physical instantiations, with characteristics such as valid values being aspects of an extended model.

With this kind of notion in mind, data elements (or columns) and documents would be addressable as physical instantiations, and it would be possible to take a completely rounded view of a class. In such an environment the idea that the data is “Big” or “Little” (or whatever the opposite of “Big” is in this context) would be irrelevant.
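To make the idea a little more concrete, here is a minimal sketch in Python of how one class in such an ontology might carry its business term, logical definition, valid values and physical instantiations. Every name in it (OntologyClass, PhysicalInstantiation, Kind, and the sample locations) is hypothetical and invented for illustration; it is one possible reading of the model, not a reference design.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    """Hypothetical categories of physical instantiation."""
    COLUMN = "column"        # conventional structured data
    DOCUMENT = "document"    # content
    BIG_DATA = "big data"    # e.g. a field in a large event store


@dataclass(frozen=True)
class PhysicalInstantiation:
    kind: Kind
    location: str  # illustrative address only, not a real system


@dataclass
class OntologyClass:
    """One "kind of thing" in the business ontology (assumed shape)."""
    name: str
    business_term: str
    logical_definition: str
    valid_values: tuple = ()                      # one "aspect" of the class
    instantiations: list = field(default_factory=list)

    def everything(self):
        """All physical instantiations, regardless of kind."""
        return list(self.instantiations)

    def by_kind(self, kind):
        """Only the instantiations of one kind."""
        return [p for p in self.instantiations if p.kind is kind]


# Made-up example data for a single class.
customer = OntologyClass(
    name="Customer",
    business_term="Customer",
    logical_definition="A party that has bought, or may buy, a product.",
    valid_values=("active", "lapsed"),
)
customer.instantiations += [
    PhysicalInstantiation(Kind.COLUMN, "crm.customer.customer_id"),
    PhysicalInstantiation(Kind.DOCUMENT, "contracts/acme-master-agreement.pdf"),
    PhysicalInstantiation(Kind.BIG_DATA, "clickstream/events.customer_ref"),
]

for item in customer.everything():
    print(customer.business_term, "->", item.kind.value, "at", item.location)
```

The point of the sketch is that everything() does not care whether an instantiation is a column, a document or a Big Data field: the class, not the storage technology, is the unit of navigation.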

“Well”, I hear you say, “that’s a wonderful speculative theory for a Friday afternoon after a couple of beers … but what’s the point?” The point is that there’s one thing our business community has been asking for, and another that it’s going to ask for fairly soon, and this kind of idea might help:

First, people keep asking for the ability to see “everything” related to a particular topic – “I want to see everything about my customer” or “I want to know all about that product” – and “everything” means data, content, and Big Data.

Second, people are going to be asking (actually, I’ve had one or two ask already) for one way to govern content, Big Data and conventional data. It is going to be seen as very silly if there isn’t one way to manage all of this stuff – and a way that ties back to business understanding of information.

My ontology/aspects model might be a bit elaborate, and I’m certainly open to better offers, but a Grand Unified Theory of metadata would be the base for addressing those two requests – so what about it?

About the author

Ian Rowlands is ASG’s Product Marketing Manager (Data Intelligence). He heads product marketing for Metadata Management and is also tasked with providing content across ASG’s entire portfolio. Ian has also served as Vice President of ASG’s metadata product management and development teams. Before ASG, Ian served as Director of Indirect Channels for Viasoft, a leading Enterprise Application Management vendor that was later acquired by ASG, where he managed relationships with distributor partners outside North America. He has worked extensively in metadata management, IT systems and financial management, and has presented at conferences worldwide, including DAMA and CMG.

  • Richard Ordowich

    Although ontologies are very useful, I would scale back the goal from representing a business ontology to representing object ontologies. I have yet to see what a business ontology looks like, or any practical application of one, but there are practical ontological representations of product, party and other business objects.

    But ontologies are not sufficient to design and define metadata. Semantics, syntax, taxonomies and other data literacy factors need to be applied as well.

    I would also scale back describing ontology as a grand unifying theory of metadata. Ontologies are yet another viewpoint for metadata. Important and useful, but not particularly grand or unifying by themselves. I suggest Data Literacy as the grand unifying theory of data and metadata.

    • Ian

      Richard, thanks for joining the conversation!
      An object ontology seems to be either a superset of the business ontology, or another set of concepts that need to be framed.
      The business ontology may not be the integrating set of concepts, but if not, we need something at a higher level of abstraction. Candidates welcome; I don’t claim to have solved the problem!
      Data literacy really won’t do for the GUM, however. It might be the domain in which discussions of GUM occur, but it doesn’t provide the common structured framework for information assets that I’m fumbling towards …

  • Ian,

    Thanks for posting the article about “The Grand Unified Theory of Metadata”. I like the sound of that.

    In fact I believe I made a pretty good start at an overall approach to metadata back in 2006 with my book, “Data Model Patterns: A Metadata Map”. In it, I used my version of the Zachman “Framework for Enterprise Architecture” to organize an essential model of most of the cells in the framework.

    After a brief discussion of the different definitions of the term that were floating around at that time, I came up with:

    “Metadata are the data that describe the structure and workings of an organization’s use of information, and which describe the systems it uses to manage that information.”

    The advantage of taking the Framework approach is that the top three rows are about “business metadata” (what people see), and the bottom three rows are about “technical metadata” (what the computer sees). Note also that by addressing all six columns of the Framework, you have data not only describing data but also describing activities (functions and processes), people and organizations, locations of business, timing, and motivation.

    This is a pretty complete picture.

    You are right about big data. That changes the technology, but it does not change the structure of the underlying problem.

    I welcome hearing from you.

    • Ian

      David, thanks so much for jumping in! I’m a big fan of the (Zachman) framework too. In practice, though, I’ve run into two obstacles: first, the matrix turns out to be very sparse, and second, for some reason (which I haven’t pinned down) I haven’t seen much in the way of working implementations, especially across the data/content boundary. To be honest, I’d like some suggestions for something a bit more prescriptive than descriptive.
      It would be interesting to classify information assets, put them into their cells and define the relationships, and then see how the navigation can be formalized.
      I love your definition of metadata. I often scale the formality back even further, to something like “the supporting material that makes it possible to get full value out of information assets”.

  • Ian is right. Document and content are problematic, especially from an operational perspective. One problem we have in Data Governance is convincing business that the data governance function resides with them, rather than IT. Once we achieve that, the next problem we have is getting them to accept governance over all data – not just the part they happen to be familiar with. And that part tends to be what they can see and understand, which is usually document and content. This makes for broken governance frameworks where business embraces responsibility to manage and govern documents and content, but throws back the rest to IT. Worse yet, they inadvertently make decisions about content and data in the absence of understanding how it affects all other data functions. In part, that’s because we are effective at describing or demonstrating the interrelationship of data as it manifests in structured and unstructured objects.

    We need a conceptual framework with descriptive and prescriptive support (probably one rooted in Zachman) that can break through that barrier, so that governance of data can be applied uniformly across all operational dimensions. We will never get there if we can’t describe data in its various forms, aggregations, uses, and objectifications as one and the same thing deserving of comprehensive and uniform treatment.

  • Correction to above: In part, that’s because we are NOT effective at describing or demonstrating the interrelationship of data as it manifests in structured and unstructured objects.

  • Michael Brackett


    I wholeheartedly agree! However, as I’ve written before, I don’t believe that the term ‘meta-data’ can survive the use, misuse, and abuse of the past years. As a resolution, I just published Data Resource Data: A Comprehensive Data Resource Understanding (Technics Publications) that describes the complete data resource architecture for documenting and understanding an organization’s data resource, including support for formal data resource integration. Perhaps that’s a way to achieve your objective.

    Best regards,

    Michael Brackett

  • Andries van Renssen

    Ian, it sounds a bit arrogant to claim to have cracked this nut, but nevertheless I would invite you to investigate whether the Gellish ontology is what you are looking for. See http://www.gellish.net/ and my books ‘Formal English’ (PhD, 2005) and ‘Semantic Modeling in Formal English’ (2014).

    I extended the ontology with a syntax, which created ‘the Gellish language’ (and, for the English-speaking world, Gellish Formal English). In other words, the Gellish Expression Format (syntax) together with the Gellish ontology defines a formalized language (syntax and semantics), without modeling knowledge about particular subject areas (business domains), except for a dictionary-taxonomy (subtype-supertype hierarchy) of concepts, which acts as the (extensible) dictionary-taxonomy of the language (including also phrases and definitions of kinds of relations).
