By Ravi Shankar
Food for Thought
At the grocery store the other day, I marveled at the staggering number of brands that were assembled under a single roof, and I thought about the incredible journey that brought each of them together, from diverse factories and farms. This ample selection could never have been assembled without a complex system of distributors, wholesalers, and other intermediaries, and this got me thinking about data.
As consumers of data, how do we get the information we need? Is there a supply chain for data? Are there intermediaries?
The Retail and Data Supply Chains
Distributors have been part of the retail supply chain for at least the last two centuries. Consumers like variety but do not want to travel to many different stores to get it, so retailers carry a variety of brands. Retailers, in turn, do not want to negotiate a separate contract with every manufacturer, so distributors became the necessary intermediary between manufacturers and retailers. Though retailers are always seeking to reduce or eliminate the cost of the “middle man,” distributors save consumers considerable time and money, because no one has to travel to individual manufacturers to procure individual products.
Business users, who consume data on a regular basis, also need an intermediary, a middle man, which in this case is technology. Why? Because every day they work with many applications that hold disparate information. To gain a holistic view of the data, such as a complete view of a customer with all of that customer’s orders and transactions, they need to piece together data from these various systems, which differ in format and level of quality. More often than not, they have to enlist the help of IT, which designs and constructs a “road” to that data, and the business user must wait until the road is completed before traveling down it to reach the data. This is not a sustainable scenario, because business users cannot get the data themselves and are overly dependent on IT. IT becomes burdened with data requests and turns into a bottleneck, business users have to wait, and the vicious cycle continues.
The Data Intermediary
What they need is a data intermediary, one that provides access to all of the necessary enterprise data without any worry about which systems the data comes from or what format it is stored in. Data virtualization is a technology well suited to playing this role. It establishes an intelligent access layer that sits between data consumers and data sources. The sources can be of any type, from rigidly structured to completely unstructured; they can be databases, flat files, data warehouses, or cloud-based repositories; and they can be modern systems or outdated legacy ones. Consumers can access these sources through any number of applications and web services.
The virtualization layer itself contains no data. Like a trusted intermediary, it holds the necessary metadata for accessing the various sources, including the credentials, or the “keys” to the “doors” that lead to the data. Source data stays exactly where it is, and the virtualization layer provides consumers with a real-time view of it, as well as a real-time aggregate of all of the sources combined, supporting queries across the entire set.
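To make the idea concrete, here is a minimal sketch in Python of a layer that stores only the means of reaching each source and reads the data at query time. It is an illustration under stated assumptions, not how any particular data virtualization product works: the `VirtualLayer` class, its `register`, `query`, and `customer_view` methods, and the sample customer and order records are all hypothetical. An in-memory SQLite table stands in for a database, and a list of CSV lines stands in for a flat file.

```python
import csv
import sqlite3


class VirtualLayer:
    """Hypothetical access layer: keeps a fetch function per source, never the rows."""

    def __init__(self):
        self._sources = {}  # source name -> zero-argument callable that returns rows

    def register(self, name, fetch):
        # Only the "key to the door" is stored here, not the data behind it.
        self._sources[name] = fetch

    def query(self, name):
        # Rows are read from the source at call time, so they reflect its current state.
        return self._sources[name]()

    def customer_view(self, customer_id):
        # Combine two sources into one view of a customer and that customer's orders.
        customers = [c for c in self.query("customers") if c["id"] == customer_id]
        orders = [o for o in self.query("orders") if o["customer_id"] == customer_id]
        return {"customer": customers[0] if customers else None, "orders": orders}


# Source 1 (assumed sample data): a database table of customers.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
db.execute("INSERT INTO customers VALUES (1, 'Ada')")


def fetch_customers():
    rows = db.execute("SELECT id, name FROM customers").fetchall()
    return [{"id": r[0], "name": r[1]} for r in rows]


# Source 2 (assumed sample data): a flat file of orders, held here as CSV lines.
orders_file = ["customer_id,item\n", "1,coffee\n"]


def fetch_orders():
    return [
        {"customer_id": int(r["customer_id"]), "item": r["item"]}
        for r in csv.DictReader(orders_file)
    ]


layer = VirtualLayer()
layer.register("customers", fetch_customers)
layer.register("orders", fetch_orders)

view = layer.customer_view(1)

# A new order lands in the source; the layer holds no copy that could go stale.
orders_file.append("1,bread\n")
fresh_view = layer.customer_view(1)
```

The second call to `customer_view` picks up the new order without any refresh step, because the layer never copied the rows in the first place. A real product would add query optimization, security, and caching on top of this basic shape.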
The benefits, as you might imagine, are substantial. As long as all consuming applications point to the virtualization layer, they will always have the most complete, up-to-date view available. This puts data consumers into a “grocery store” scenario in relation to data. They can access data in a self-service manner, without needing IT, so they are more productive. IT stakeholders are also more productive, as they can easily provide data access on an enterprise level rather than putting the effort into constructing individual “roads” to select parts of the data.
What was valuable in the age-old retail industry – using the distributor as a trusted intermediary – is still valuable today, and it also applies to the data world. We would do well to borrow this concept if we want to ensure our continued success in data delivery.