At first glance, Master Data Management (MDM) systems and the content they contain may seem unrelated, or even diametrically opposed, to Big Data systems. Some of the considerable differences between Master Data and Big Data include:
- Volume: Master Data sets are much smaller than Big Data sets. A central attraction of Big Data is its enormous volume; arguably, a central attraction of Master Data is the opposite.
- Structure: Master Data tends to contain structured data, while the majority of Big Data is either unstructured or semi-structured.
- Relationship to the enterprise: Typically, MDM systems contain an organization’s most trusted data, which tends to be internal, while Big Data platforms house enormous amounts of external data from any number of cloud, social media, mobile, and other sources beyond the enterprise’s firewall. As indicated by Gartner, “MDM is more oriented around internal, enterprise-centric data; in an environment the organization feels it has a chance to effect change, and so formal information governance.”
Despite these differences, there are numerous ways in which Master Data Management can enhance Big Data applications, and vice versa. The basic paradigm for the relationship between these two types of data is that Big Data offers context while Master Data offers trust. Each of these virtues can inform the other. According to Forrester, MDM can be:
“…a hub for context in customer experience – sitting between systems of record and systems of engagement to translate, manage and evolve dynamically the full fidelity of customer identity through interactions directly or as viewed through indirect business processes and supporting activities.”
Input: Providing Context to MDM
Organizations can expand their Master Data Management with Big Data by applying the context of data from the external world to their trusted internal data. In this respect, MDM can not only take advantage of relatively new sources of (Big) Data, but also help provide the proverbial 360-degree, comprehensive view of customers.
Although there are numerous domains for MDM, the customer domain is perhaps the one most readily enhanced by Big Data. Mobile, social, and cloud data can provide numerous points of reference about a customer and his or her experience with an organization’s products, and these can greatly inform the data traditionally stored in MDM, such as customer interactions and relevant transactional data. Thus, Big Data can enrich Master Data with the context that is Big Data’s chief benefit, leading to greater customer understanding. Taken far enough, this approach results in Big Data actually being aggregated in an MDM hub. Additionally, it is possible to position one’s MDM in the cloud and enable applications to access it as part of a Service Oriented Architecture.
Input: Facilitating Big Data Context to MDM
The challenge with applying Big Data to MDM systems lies in distinguishing the unstructured data that relates to Master Data from the data that does not. A few options exist for this purpose. First, vendors have recently implemented NoSQL offerings to this end. Second, some vendors are working on Hadoop-based integration, although the sheer difference in data quantities between Master Data and Big Data generally rules out Hadoop as a means of integrating relevant data.
A third alternative is to deploy analytics options (such as those specializing in sentiment data and incorporating Natural Language Processing (NLP) and other semantic technologies) to first ascertain which data bear on the relevant MDM fields. Aside from recently released MDM solutions that utilize NoSQL methods, it is typically not advantageous to merely add Big Data to an MDM hub without first filtering it. The analytics approach can provide that preliminary filter so that organizations can discern which Big Data can add context to their Master Data.
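The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration, not a real analytics product: simple keyword matching stands in for the NLP and sentiment tooling the text mentions, and the names `MASTER_TERMS`, `relevant_fields`, and `filter_for_mdm`, along with the sample customer and product values, are hypothetical.

```python
import re

# Hypothetical trusted values exported from an MDM hub, grouped by MDM field.
MASTER_TERMS = {
    "customer": {"Acme Corp", "Globex"},
    "product": {"Widget Pro"},
}

def relevant_fields(text, master_terms=MASTER_TERMS):
    """Return the MDM fields (and matched values) mentioned in `text`."""
    hits = {}
    for field, values in master_terms.items():
        for value in values:
            # Whole-word, case-insensitive match on the master value.
            pattern = r"\b" + re.escape(value) + r"\b"
            if re.search(pattern, text, re.IGNORECASE):
                hits.setdefault(field, []).append(value)
    return hits

def filter_for_mdm(records):
    """Keep only the records that mention at least one master value."""
    kept = []
    for text in records:
        hits = relevant_fields(text)
        if hits:
            kept.append((text, hits))
    return kept

posts = [
    "Loving my new widget pro from ACME Corp!",
    "Traffic was terrible today.",
]
kept = filter_for_mdm(posts)  # only the first post survives the filter
```

Only records that survive such a filter would be candidates for loading into the hub; everything else stays in the Big Data platform.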
Output: Providing Trust for Big Data
The degree of governance that is bestowed upon Master Data and regulated within MDM systems is designed for ready incorporation into any variety of applications, including those for Big Data. Organizations can leverage their Master Data to gauge the trustworthiness of Big Data, and of whatever governance mechanisms are in place at the application level. For instance, combining Master Data with Big Data sets can enable organizations to identify the names of customers and products in their Big Data. In this way, Master Data can influence a number of operational systems, including those that pertain to Big Data and those that do not. As one source puts it, MDM can feed Big Data by “providing the data model backbone to bind the Big Data facts.” Viewed from this perspective, Master Data Management is a critical prerequisite for Big Data Governance, particularly when one considers the various facets of governance that are a part of any competitive MDM system. Those include aspects of:
- Lifecycle Management
- Data Quality (Deduplication)
- Data Cleansing
- Metadata Management
- Reference Data Management
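As a rough illustration of how Master Data can "bind" Big Data facts, the sketch below resolves customer names found in external records to master IDs and reports how many records resolved. It assumes a hypothetical master table (`MASTER_CUSTOMERS`) and a very crude cleansing rule (`normalize`); a real MDM system would apply far richer matching and data quality logic.

```python
# Hypothetical master records keyed by master ID (values are illustrative).
MASTER_CUSTOMERS = {
    "C001": "Acme Corp",
    "C002": "Globex",
}

def normalize(name):
    """Crude cleansing: lowercase, drop punctuation, collapse whitespace."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())

def bind_facts(facts, master=MASTER_CUSTOMERS):
    """Attach a master ID to each fact whose customer name resolves."""
    index = {normalize(name): master_id for master_id, name in master.items()}
    bound, unbound = [], []
    for fact in facts:
        master_id = index.get(normalize(fact["customer"]))
        if master_id is None:
            unbound.append(fact)  # no trusted record backs this fact
        else:
            bound.append({**fact, "master_id": master_id})
    return bound, unbound

def match_rate(bound, unbound):
    """Share of facts that resolved to a master record (0.0 if no facts)."""
    total = len(bound) + len(unbound)
    return len(bound) / total if total else 0.0

facts = [
    {"customer": "ACME Corp.", "event": "tweet"},
    {"customer": " globex ", "event": "click"},
    {"customer": "Initech", "event": "click"},
]
bound, unbound = bind_facts(facts)
```

The match rate is one simple, assumed proxy for trust: facts that resolve to a governed master record can be treated with more confidence than those that do not.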
NoSQL and Graphs
Some of the more exciting recent developments within the MDM space include the incorporation of such hubs with NoSQL technologies. Vendors have experimented with key-value stores, document stores and, more prominently, graph databases as a means of highlighting the sort of relationships that are an integral aspect of any MDM solution or program. Graph databases in particular are designed to illustrate relationships between customers, products, points of interaction, and other data elements. They tend to do so in a semantic way that is attuned to how humans process and view relationships. This human aspect of graphs is enhanced with NLP.
Options for incorporating graphs with MDM include utilizing hubs that are based on these stores (Pitney Bowes Spectrum MDM) and creating a customized, tailored solution in which one leverages MDM with any variety of graph database providers (such as Neo Technology and its seminal Neo4j). Additionally, third-party governance vendors (Global IDs) can provide the means for companies to build their own graphs. Graph stores were originally created as a way of accommodating Big Data sets. Their incorporation with MDM solutions helps facilitate both the integration of Big Data with Master Data and the use of Master Data to inform Big Data.
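To make the graph idea concrete, here is a small in-memory sketch in plain Python. It does not use Neo4j or any vendor API; the `MasterGraph` class, the `BOUGHT` and `MENTIONED_IN` labels, and the node identifiers are all hypothetical, chosen only to show how relationships between customers, products, and interactions can be stored and traversed.

```python
from collections import defaultdict

class MasterGraph:
    """Tiny in-memory graph: nodes are strings, edges carry a label."""

    def __init__(self):
        self._edges = defaultdict(list)  # node -> [(label, neighbor), ...]

    def relate(self, source, label, target):
        """Record a directed, labeled relationship."""
        self._edges[source].append((label, target))

    def neighbors(self, node, label=None):
        """Targets reachable from `node`, optionally filtered by label."""
        return [
            target
            for edge_label, target in self._edges.get(node, [])
            if label is None or edge_label == label
        ]

    def also_bought(self, customer):
        """Products bought by customers who share a purchase with `customer`."""
        mine = set(self.neighbors(customer, "BOUGHT"))
        found = set()
        for other in list(self._edges):
            if other == customer:
                continue
            theirs = set(self.neighbors(other, "BOUGHT"))
            if mine & theirs:
                found |= theirs - mine
        return sorted(found)

graph = MasterGraph()
graph.relate("customer:C001", "BOUGHT", "product:P1")
graph.relate("customer:C002", "BOUGHT", "product:P1")
graph.relate("customer:C002", "BOUGHT", "product:P2")
graph.relate("customer:C001", "MENTIONED_IN", "tweet:42")  # external Big Data fact
```

Here the customer and product nodes play the role of Master Data, while the tweet node stands in for an external Big Data fact attached to a trusted master record.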
Expanding Master Data Value
Master Data has been primarily credited with helping enterprises establish the proverbial single version of the truth. However, by augmenting Master Data with Big Data, organizations are also able to greatly expand the view of whatever domain they have selected. Additionally, they can derive a more comprehensive understanding of the truth based on any host of relevant external factors. Thus, the true value in supplementing these types of data with one another is that users are able to modernize their MDM hubs to incorporate the latest sources of data. Such sources add a critical component of context to Master Data, which in turn lends a greater degree of trust to Big Data.
Perhaps the final frontier in relating these data sets and their attendant technologies to one another lies in the implementation of MDM hubs that are based on NoSQL options or Hadoop. The maturity of such solutions can provide the source for even more innovation in the form of exploratory analysis, in which organizations can parse Big Data through their MDM systems and incorporate it or use it for other applications as needed. Such a possibility could provide a refined form of Data Discovery to assist with other discovery options.