It’s no secret that I am pro-MDM (not the “throw everything in a hub” type of MDM, but the discipline of identifying and managing the “who, what, and where” entities that are important to the enterprise). However, it may not be as well known that I am also pro-Semantic Web. I still see just as much promise in its ability to integrate disparate data sets as I did when a co-worker and I founded a W3C interest group a few years ago. When the opportunity to attend a talk about the Semantic Web presented itself, I jumped at the chance to hear someone else’s perspective on using semantics to tackle data issues.
Normally, an informational talk is not especially contentious, but this time the speaker launched into a rant about how the Semantic Web has eliminated the need for MDM and how MDM was a failure, just like universal standards. As I listened, I was sure that I heard that 1980s classic song “Video Killed the Radio Star” playing softly in the background, mocking me with the words “Semantics Killed the MDM Hub”, but I wasn’t going down without a fight.
The fact that I have a background in both the Semantic Web and MDM made holding my tongue through the presentation quite difficult, but somehow I managed to persevere. Anyone with a background in MDM and Data Management would have easily recognized that those disciplines were being unfairly criticized and that the Semantic Web was being oversold (it wasn’t exactly subtle, but it also wasn’t completely surprising - the speaker was from one of the big Semantic Web tool companies). Among the most troubling statements were:
Modeling isn’t necessary: There were several negative statements made about modeling (and over-modeling). While I agree that there is a point when something has been modeled to death, we need to distinguish a failure to use a technique wisely from a failure of the technique itself. Ontologies have models (and, believe it or not, they look amazingly similar to conceptual data models). With the Semantic Web, you still have to know the “things” and the identifying attributes of those “things”. The W3C group I co-founded did have to create a model to understand how patient data would link together. Check out figure 1 of our Translational Medicine ontology. The speaker made a quiet side-comment about “models other than ontologies” (which many attendees may have missed), but it rang loud and clear in my ear. The bottom line is that modeling is still needed, and even the Semantic Web is not exempt from some degree of modeling. It seems to me that the real issue here is how extensible the model is.
Universal standards are not valuable: The last time I checked, things like XML, UML, and guidance developed by the W3C were in fact standards... universal standards, no less! I would hardly call them useless. Even the concepts for Semantic Web interoperability revolve around standards (why else would you need a URI?). I agree that creating standards for the sake of standards does not add value, but we need to be careful about diminishing the value of standards that serve to unite information and information technology disciplines.
Master Data is a failure: The speaker implied that master data and data integration were competing schools of thought. This could not be more wrong. They are, in fact, COMPLEMENTARY. Linking data requires sources (e.g. the Linked Data project, whose linked data cloud contains several pieces of master data). Sources need some sort of identity (i.e. a “key”) to link one thing to another. Master Data Management seeks to provide identities for the core entities that are used as nodes in the ontology. (After bashing MDM, the speaker did, in fact, comment that the classes on the left of his screen were similar to MDM entities. Hmmmm…. interesting….) The speaker also questioned whether it is ever possible to have a “single version of the truth” because truth is in the eyes of the person using the data. I am not sure where the speaker got his information on MDM (he made a vague reference to a conversation with an analyst), but this “single version of the truth” doesn’t apply to all data or all uses of data. Having a “single version of the truth” means that you have a trusted source for the identity of a thing - the same way that the Semantic Web needs a URI to identify a thing. Master data management can combine all of the locally created identities into something that can be filtered of “noise” and used for more formal or regulated processes.
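To make the identity point a little more concrete, here is a minimal Python sketch of how a master identity can play the same role as a URI when linking locally created keys. Everything in it is an assumption for illustration: the `example.org` URIs, the source system names, the keys, and the hard-coded cross-reference table (in a real MDM program, that table would come out of a matching and stewardship process, not a dictionary literal).

```python
# Hypothetical base URI for master identities (illustrative only).
MASTER_BASE = "http://example.org/master/customer/"

# Cross-reference of locally created identities to a master key:
# (source system, local key) -> master key. Hard-coded here as an assumption;
# real matching logic is out of scope for this sketch.
CROSS_REFERENCE = {
    ("crm", "C-1001"): "42",
    ("billing", "88-A"): "42",
    ("support", "u777"): "43",
}


def master_uri(source, local_key):
    """Return the master URI for a local identity, or None if it is unmatched."""
    master_key = CROSS_REFERENCE.get((source, local_key))
    if master_key is None:
        return None
    return MASTER_BASE + master_key


def same_as_links():
    """Return (local URI, predicate, master URI) links in an owl:sameAs style."""
    links = []
    for (source, local_key), master_key in sorted(CROSS_REFERENCE.items()):
        local_uri = f"http://example.org/{source}/{local_key}"
        links.append((local_uri, "owl:sameAs", MASTER_BASE + master_key))
    return links


if __name__ == "__main__":
    # Two different systems resolve to the same trusted identity.
    print(master_uri("crm", "C-1001") == master_uri("billing", "88-A"))
    for link in same_as_links():
        print(link)
```

The point of the sketch is only that the two ideas fit together: the cross-reference gives each local record a trusted identity, and that identity is exactly what a linked-data graph needs in order to connect one source to another.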
Did I enlighten the speaker with all of these points? Unfortunately, no. The talk was running late and the question and answer period was cut short. But if I’m lucky, maybe the speaker will read this blog.
Thanks for listening to my rant. Please share your thoughts on this topic!
NOTE: Opinions voiced in this blog are those of the author, not her employer.