Click to learn more about author Ravi Shankar.
Have you come across this phenomenon in the enterprise software market? When a new category gains prominence, ancillary technologies whose capabilities differ from those of the products that created the category encroach on the mainstream and cause confusion. Case in point: Data Virtualization vs. Copy Data Virtualization.
Data Virtualization is a special kind of Data Integration technology. Most Data Integration solutions copy data from different sources and warehouse the integrated data in a new repository designed specifically for that purpose. In contrast, Data Virtualization solutions create integrated views of the data across multiple sources, without moving the data to a new location.
In that way, Data Virtualization creates virtual access to data. Because Data Virtualization enables real-time access to disparate data sources, it enables a wide range of Enterprise Data Architectures such as Logical Data Warehouses, which integrate data across physically dispersed data sources, or hybrid data hubs, which provide seamless access across Cloud and on-premises systems.
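To make the idea of an integrated view without data movement concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how any particular Data Virtualization product works: the two in-memory SQLite databases stand in for separate source systems, and the table names and the `customer_totals` function are hypothetical.

```python
import sqlite3

# Two independent "source systems" (hypothetical stand-ins), each its own database.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

billing = sqlite3.connect(":memory:")
billing.execute("CREATE TABLE invoices (customer_id INTEGER, amount REAL)")
billing.executemany(
    "INSERT INTO invoices VALUES (?, ?)", [(1, 100.0), (1, 50.0), (2, 75.0)]
)


def customer_totals():
    """Virtual view: query both sources at request time and combine the results.

    Nothing is copied into a third store; each call reads the live sources.
    """
    names = dict(crm.execute("SELECT id, name FROM customers"))
    totals = billing.execute(
        "SELECT customer_id, SUM(amount) FROM invoices GROUP BY customer_id"
    )
    return {names[customer_id]: total for customer_id, total in totals}


before = customer_totals()
# A change in a source system is visible on the next call, with no reload step.
billing.execute("INSERT INTO invoices VALUES (2, 25.0)")
after = customer_totals()
```

Because the view is computed on demand, `after` reflects the new invoice immediately, whereas a copy-based integration would show it only after the next load into the warehouse.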
Copy Data Virtualization performs a much more specific function: it virtualizes redundant data copies across an organization to reduce storage footprints. A couple of companies in this space use Copy Data Virtualization but do not necessarily market it as such. Adding to the potential confusion, Copy Data Virtualization has sometimes been referred to simply as Data Virtualization. These concepts can be difficult for less tech-minded customers to understand, yet they could have a considerable impact on their organizations.
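By way of contrast, the storage-saving idea behind Copy Data Virtualization can be sketched as follows. This is a deliberately simplified toy, assuming whole-object deduplication by content hash; the `CopyStore` class and the copy names are hypothetical, and real products may work quite differently in detail.

```python
import hashlib


class CopyStore:
    """Toy store in which identical copies share one physical payload."""

    def __init__(self):
        self.blocks = {}  # content digest -> bytes, stored once
        self.copies = {}  # logical copy name -> content digest

    def save(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        # Keep the payload only if this content has not been seen before.
        self.blocks.setdefault(digest, data)
        self.copies[name] = digest

    def read(self, name):
        return self.blocks[self.copies[name]]

    def physical_bytes(self):
        return sum(len(block) for block in self.blocks.values())


store = CopyStore()
snapshot = b"x" * 1000
for name in ("prod-backup", "test-copy", "analytics-copy"):
    store.save(name, snapshot)
```

Three logical copies exist, yet only one 1,000-byte payload is physically stored. Note what the sketch does not do: it integrates nothing across different sources, which is exactly the job of Data Virtualization.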
Rick van der Lans, the founder of R20 Consultancy and an authority on Data Virtualization, addressed this confusion directly: “It has always been a pity and confusing to the market that Primary Data used the generally accepted term of Data Virtualization to refer to another concept. Just check out how Data Virtualization is defined, and you’ll find that it’s not in line.”
This is not the first time this confusion has been addressed. Recently, John Myers, Managing Research Director at Enterprise Management Associates, explained the difference in his blog post “Virtualizing Data Storage, Or Virtualizing Data Access,” and two and a half years ago, an ex-colleague of mine, Jesus Barrasa, wrote on the same topic. Yet the confusion persists.
Unlike the many technologies that come and go through the years, Data Virtualization is an established, seasoned technology, one that Gartner places in the most mature phase of its Data Management Hype Cycle (Source: Gartner, Inc., Gartner Reveals the 2017 Hype Cycle for Data Management, 28 September 2017). In addition, Gartner says that “organizations with data virtualization capabilities will spend 40% less on building and managing data integration processes for connecting distributed data assets” (Source: Gartner, Inc., Predicts 2017: Data Distribution and Complexity Drive Information Infrastructure Modernization, Ted Friedman et al., 29 November 2016).
As we delve deeper into enterprise technology, it is important to bear in mind the difference between Data Virtualization and Copy Data Virtualization.