Fundamentals of Data Virtualization

Organizations are increasingly employing an innovative technology called “data virtualization” (DV) to tackle high volumes of data from varied sources. Data virtualization is widely used in enterprise resource planning (ERP), customer relationship management (CRM), and sales force automation (SFA) systems to collect and aggregate multi-source data.

From multi-source data acquisition to advanced analytics, this technology seems to offer a one-stop solution. The biggest benefit of DV, as perceived by businesses today, is that it adds an access layer on top of traditional data warehouses, giving users fast and reliable access to data.

What Is Data Virtualization?

You can think of data virtualization as a sort of TV guide that lists the content of many different channels in one place. In simple terms, DV places an additional data access layer between the data sources and the user, so the user can reach the data faster without needing to know where it is stored. Two examples of DV are the “virtual data warehouse” and the “virtual data lake.”
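The “TV guide” idea can be sketched in a few lines of Python. This is a minimal illustration, not a description of any vendor's product: the `VirtualLayer` class, the source names `crm.customers` and `erp.orders`, and the sample rows are all hypothetical. The point it shows is that the layer stores only references to its sources, so a query always reads whatever the source holds at that moment.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


class VirtualLayer:
    """Hypothetical access layer: keeps references to sources, never copies of their data."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[[], Iterable[Row]]] = {}

    def register(self, name: str, fetch: Callable[[], Iterable[Row]]) -> None:
        # Store only the function that knows how to read the source.
        self._sources[name] = fetch

    def catalog(self) -> List[str]:
        # The "TV guide": one place that lists everything available.
        return sorted(self._sources)

    def query(self, name: str, **filters: object) -> List[Row]:
        # Read the source at query time and keep rows matching every filter.
        rows = self._sources[name]()
        return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]


# In-memory lists standing in for real CRM and ERP systems (assumed data).
crm_rows = [{"customer_id": 1, "name": "Acme"}, {"customer_id": 2, "name": "Globex"}]
erp_rows = [{"customer_id": 1, "order_total": 250.0}]

layer = VirtualLayer()
layer.register("crm.customers", lambda: crm_rows)
layer.register("erp.orders", lambda: erp_rows)
```

Calling `layer.catalog()` lists both sources, and `layer.query("crm.customers", customer_id=2)` returns the matching row. Because nothing is copied, a row appended to `crm_rows` afterwards is visible to the next query.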

Initially viewed as a workaround for ETL, this technology now offers fast data access, data integration, data cleaning, and analytics tools for BI users. DV enables established technologies like cloud, big data, and advanced analytics platforms to work in tandem to produce superior Data Management solutions that traditional data warehouses failed to achieve.

Data Management in the Age of Data Virtualization

Through data virtualization platforms, vendors are offering a one-stop solution for data collection, management, and data services delivery. A significant strength claimed for DV is reliable and secure access to real-time data. This benefit is helping to drive a quickly expanding DV market.

While ETL still handles high-volume data movement, DV promises very fast access to data. Current usage trends show that businesses are using DV and ETL side by side.

So, what are the most visible benefits of managing enterprise data with DV?

  • Fast access to secure data
  • Reduction of data replication
  • Reusability of unified data service

Data Virtualization for Big Data

Gartner predicted that through 2020, 60% of all big data projects would fail. Data virtualization, while not able to solve all big-data-related issues, can substantially simplify processes and make big data projects easier to handle. For starters, this technology can make big data available and ready for use on BI platforms.

One of the primary challenges of big data is the “volume, variety, and velocity” of data residing in traditional data warehouses. A logical data warehouse is an enterprise-wide solution for data acquisition and data organization, where data resides in structured, unstructured, batch, or real-time forms. Data virtualization can substantially reduce the data integration effort while preserving performance.

The whole point of such a data virtualization architecture is to keep “live data” in the warehouse and “dead data” in Cloudera-type repositories, and then to combine this multi-source data through a logical data warehouse. The article “Data Virtualization vs. Copy Data Virtualization” explains that although users often confuse these two concepts, there is a marked difference between them.
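The split between live and archived data can be illustrated with a small routing function. This is a sketch under stated assumptions: the cutoff date, the `warehouse` and `archive` lists, and the `query_sales` function are hypothetical stand-ins, and a real logical data warehouse would push the query down to each store rather than filter in Python.

```python
from datetime import date
from typing import Dict, List

Row = Dict[str, object]

# Assumed policy: rows dated before the cutoff live in the archive.
CUTOFF = date(2020, 1, 1)

# In-memory stand-ins for the warehouse ("live") and the archive ("dead") stores.
warehouse: List[Row] = [
    {"day": date(2020, 3, 1), "sales": 120},
    {"day": date(2020, 1, 15), "sales": 80},
]
archive: List[Row] = [
    {"day": date(2019, 11, 5), "sales": 60},
    {"day": date(2018, 6, 30), "sales": 40},
]


def query_sales(start: date, end: date) -> List[Row]:
    """Return rows with start <= day <= end, reading only the stores that can hold them."""
    rows: List[Row] = []
    if start < CUTOFF:
        # The requested range reaches back into archived data.
        rows += [r for r in archive if start <= r["day"] <= end]
    if end >= CUTOFF:
        # The requested range includes live data.
        rows += [r for r in warehouse if start <= r["day"] <= end]
    return sorted(rows, key=lambda r: r["day"])
```

A caller asks for a date range and receives one combined, date-ordered result. Whether the rows came from the warehouse, the archive, or both is decided inside the function and is invisible to the caller.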

Data needs to be accessed much faster than is currently possible in advanced analytics or BI platforms. Thus, it is hoped that DV will be increasingly used for multi-source data integration across enterprises, and “unified data views” will enable users to get accurate information when necessary.

How Does DV Reshape the Traditional BI Landscape?

With a presentation layer and federated data, DV delivers both fast access to multi-source data and a unified view of data. The encapsulated view of data makes it easy for BI users to create instant dashboards with valuable insights. DV prevents data loss or inconsistencies, especially in cases where the data is derived from streaming sources.
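A unified view over federated data can be sketched with Python's built-in `sqlite3` module. This is only an illustration: two separate SQLite databases stand in for a CRM and an ERP system, and the table names, the view name `customer_revenue`, and the sample rows are assumptions. A temporary view is used because SQLite lets a temporary view reference tables in more than one attached database.

```python
import sqlite3

# The main database stands in for a CRM store.
conn = sqlite3.connect(":memory:")
# A second, separate in-memory database stands in for an ERP store.
conn.execute("ATTACH DATABASE ':memory:' AS erp")

conn.execute("CREATE TABLE main.customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE erp.orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany("INSERT INTO main.customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])
conn.executemany(
    "INSERT INTO erp.orders VALUES (?, ?, ?)",
    [(10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0)],
)

# The unified view: stores no rows itself and joins both sources at query time.
conn.execute(
    """
    CREATE TEMP VIEW customer_revenue AS
    SELECT c.customer_id, c.name, SUM(o.total) AS revenue
    FROM main.customers AS c
    JOIN erp.orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
    """
)


def revenue_report(connection: sqlite3.Connection) -> list:
    """Query the view the way a BI dashboard might, without knowing where the data lives."""
    return connection.execute(
        "SELECT name, revenue FROM customer_revenue ORDER BY name"
    ).fetchall()
```

The report function touches only the view. A new order inserted into `erp.orders` appears in the next call to `revenue_report(conn)` with no reload step, which is the behavior a BI user expects from a virtual layer.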

Here are some benefits the typical BI user sees when moving away from the data warehouse to DV architecture:

  • Increased access speed for real-time data
  • Reduced data storage requirements
  • Reduced risk of data loss or inconsistencies
  • Lower workload on the system
  • Enhanced data governance through DV policies

Some of the noted disadvantages include complexities in change management, the need for a superior Data Governance model, and risk of impacting system response time. Notwithstanding its limitations, DV is ideal for agile BI and big data BI.

Data Virtualization Use Cases

Use Case 1: Today, a virtual data warehouse (VDW) is often the preferred technology because setting up a VDW is much quicker than setting up a traditional data warehouse. This solution is ideal for big data analytics or cloud-based BI platforms.

Use Case 2: In a virtual data lake, the instantaneous consolidation of data, regardless of source, is very useful. This type of data access benefits a wide range of business users.

Cisco experts believe that the real challenge in a networked data world is not a lack of data but rather the storage of data across diverse types of repositories, and that no amount of technology investment is worthwhile unless the resulting information adds value. Cisco has offered a DV solution that it claims can convert “data stores into valuable information.” Its experts believe DV has a future in combating IoT-generated data silos across the enterprise.
