Denodo’s Impact: Data Virtualization and its Intersection with the Big Data Fabric


The Data Fabric market is estimated to grow from $653 million this year to $2 billion by 2022, a CAGR of 26.6%, according to a new report published by MarketsandMarkets. The report cites three driving forces: the increasing volume and variety of business data, the emerging need for business agility and accessibility, and the growing demand for real-time streaming analytics. Data Virtualization is playing a central role in this process.
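As a quick sanity check on those figures, the growth arithmetic can be sketched in a few lines of Python. The five-year horizon is an assumption made here for illustration; the summary above gives only the end year. The variable names are likewise illustrative.

```python
# Figures quoted from the report summary above, in millions of US dollars.
start_musd = 653.0
end_musd = 2000.0      # "$2 billion" is presumably a rounded figure
stated_cagr = 0.266

# ASSUMPTION: a five-year horizon; the article does not state the start year.
years = 5

# Compound the starting value forward at the stated growth rate.
projected_musd = start_musd * (1 + stated_cagr) ** years

# Back out the growth rate implied by the rounded end value.
implied_cagr = (end_musd / start_musd) ** (1 / years) - 1

print(f"Projected market size: ${projected_musd:,.0f}M")
print(f"CAGR implied by the rounded $2B figure: {implied_cagr:.1%}")
```

Under that assumption the stated rate compounds to slightly more than $2 billion, which is consistent with the headline number having been rounded down.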

Among the major players in this space is Data Virtualization vendor Denodo, according to that report. Forrester last year also cited Denodo as a strong performer in Big Data Fabric offerings and strategies, with the research firm stating that its “mature Data Virtualization technology broadens its coverage to support Big Data Fabric use cases.”

Here’s how Data Virtualization technology supports Big Data Fabric use cases like Big Data and Real-Time Analytics. Business users may want to know, for example, which customers bought which of ten products in the last six months, which of them bought warranties along with those products, and which purchased online. Answering that question can be quite a challenge, both in pulling together information from disparate data sources and in the latency involved. A Big Data Fabric acts as an abstraction layer that brings together data from these multiple systems, in multiple formats, and exposes it to consumers as a single virtual repository. The data is not physically stored in the fabric.

According to Forrester:

“Big Data Fabric is accelerating the delivery of insights by automating key processes for increased agility while giving business users more autonomy in the data preparation process. Enterprises use it to support many use cases, such as enabling 360-degree and multidimensional views of the customer, internet-of-things (IoT) and real-time analytics, offloading data warehouses, fraud detection, integrated analytics, and risk analytics.”

Data Virtualization in the Denodo Platform enables that virtual repository. The platform features broad connectivity to structured, unstructured, and non-traditional data sources. To answer a query, it determines which underlying systems hold the relevant data, connects to them, gathers and combines the data, and delivers it in real time to the business user in pass-through fashion. The business user doesn’t have to worry about where to go to get the data, and the latency of moving data to a central store goes away because nothing is moved there.
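The general pattern described above (query each source at request time, combine the results, store nothing centrally) can be illustrated with a minimal Python sketch. This is not Denodo’s API or implementation. The two in-memory SQLite databases simply stand in for independent source systems, and every table, column, and function name is hypothetical.

```python
import sqlite3


def make_source(schema_sql, insert_sql, rows):
    """Create an independent in-memory SQLite database standing in for one source system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema_sql)
    conn.executemany(insert_sql, rows)
    return conn


# Two hypothetical, unrelated systems: a sales database and a warranty database.
sales_db = make_source(
    "CREATE TABLE orders (customer TEXT, product TEXT, channel TEXT)",
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("alice", "laptop", "online"), ("bob", "laptop", "store"), ("carol", "phone", "online")],
)
warranty_db = make_source(
    "CREATE TABLE warranties (customer TEXT, product TEXT)",
    "INSERT INTO warranties VALUES (?, ?)",
    [("alice", "laptop"), ("bob", "laptop")],
)


def online_buyers_with_warranty():
    """Virtual view: query each source on demand and join the results in memory."""
    online = sales_db.execute(
        "SELECT customer, product FROM orders WHERE channel = 'online'"
    ).fetchall()
    covered = set(
        warranty_db.execute("SELECT customer, product FROM warranties").fetchall()
    )
    # Nothing is written to a central store; the combined rows exist only for this call.
    return [row for row in online if row in covered]


result = online_buyers_with_warranty()
print(result)
```

The caller sees one function that answers the business question and never needs to know that two separate systems were involved, which is the sense in which the layer behaves like a single virtual repository.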

A Look at Denodo’s Evolution

Through its Data Virtualization layer, the Denodo Platform enables a logical Data Warehouse architecture for Business Intelligence (BI) and Analytics. In these architectures, data is distributed across several specialized data stores such as Data Warehouses, Hadoop clusters, and Cloud Databases, and a common infrastructure allows unified querying, administration, and Metadata Management.

“Denodo was built with logical architecture in mind, not physical,” says Lakshmi Randall, Director of Product Marketing at the company.

“That means from the start we took into consideration so many factors of working with disparate data sources and making data practical to use, and taking performance into consideration.”

Last year saw a major release in the form of Denodo Platform 6.0, Randall says, with new key capabilities including dynamic query optimization. Rather than taking a static approach to optimization, “our goal is to dynamically optimize the queries during runtime depending on the characteristics of the underlying data sources,” she says, whether Hadoop or a Microsoft or Oracle ecosystem or a SaaS app or even just an Excel file.
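To make the idea of runtime optimization concrete, here is a toy Python sketch in which a join strategy is chosen from statistics about each source rather than being fixed in advance. The rules, threshold, class, and source names are all assumptions made for illustration; the article does not describe how Denodo’s optimizer actually decides.

```python
from dataclasses import dataclass


@dataclass
class SourceStats:
    """Runtime facts about one source (hypothetical fields, for illustration only)."""
    name: str
    estimated_rows: int
    supports_filter_pushdown: bool


def choose_join_strategy(left, right, small_threshold=10_000):
    """Pick a join strategy from source characteristics known at query time."""
    small, large = sorted((left, right), key=lambda s: s.estimated_rows)
    if small.estimated_rows <= small_threshold and large.supports_filter_pushdown:
        # Ship the few keys from the small side so the large source filters itself.
        return (
            f"semi-join: fetch keys from {small.name}, "
            f"push them as a filter to {large.name}"
        )
    # Otherwise pull both sides and join them in the virtual layer.
    return f"hash join: fetch {left.name} and {right.name}, join in the virtual layer"


crm = SourceStats("crm_saas", 2_000, supports_filter_pushdown=False)
lake = SourceStats("hadoop_clickstream", 500_000_000, supports_filter_pushdown=True)
spreadsheet = SourceStats("excel_targets", 300, supports_filter_pushdown=False)

plan_a = choose_join_strategy(crm, lake)
plan_b = choose_join_strategy(crm, spreadsheet)
print(plan_a)
print(plan_b)
```

The same logical query yields different plans depending on which sources are involved, which is the contrast with a static approach.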

The release also introduced self-service data discovery and search. “This web-based information self-service tool helps end users understand and explore data in a virtual sandbox environment,” she says. Being able to explore the data and understand its inherent relationships prior to consumption helps improve Data Quality and decision making.

Another key capability introduced in 6.0 was Data Virtualization in the Cloud, with availability on AWS Marketplace. Denodo can also run on other Cloud platforms such as Microsoft Azure. Customers can therefore run the Denodo Platform on-premises, in the Cloud, or both, and integrate data across on-premises and Cloud data sources.

Expected in the third quarter is version 7.0, where a key function will be enhancing the in-memory fabric. “We’re including in-memory fabric functionality within the platform to further enhance the in-memory capabilities we already have,” Randall says. She calls this “the icing on the dynamic query optimization cake.” That is, you first want to reduce the size of the data before it’s moved in-memory, which is the magic of the optimization. Once the data is reduced, it can be streamed into the in-memory layer for caching, for additional processing, or to optimize access to slower data sources.
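The "reduce first, then move" ordering can be shown with a small Python sketch. It is a generic illustration of pushing an aggregation down to the source, not a depiction of Denodo 7.0. The SQLite table, the plain dictionary used as an in-memory layer, and all names are hypothetical.

```python
import sqlite3

# A hypothetical source system holding 1,000 detailed sales rows.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east" if i % 2 == 0 else "west", i) for i in range(1000)],
)

# Naive approach: move every row out of the source, then aggregate afterwards.
raw_rows = source.execute("SELECT region, amount FROM sales").fetchall()
totals_naive = {}
for region, amount in raw_rows:
    totals_naive[region] = totals_naive.get(region, 0) + amount

# Reduced approach: the source aggregates, so only one row per region is moved.
reduced_rows = source.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"
).fetchall()

# A plain dict stands in for the in-memory layer that caches the reduced result.
in_memory_cache = dict(reduced_rows)

print(f"rows moved without reduction: {len(raw_rows)}")
print(f"rows moved with reduction:    {len(reduced_rows)}")
print(f"same answer either way:       {in_memory_cache == totals_naive}")
```

Both paths produce the same totals, but the second moves far fewer rows into memory, which is why the reduction step is worth doing before the in-memory step.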

The upcoming version also will enhance self-service collaboration features “where Metadata can be enriched with tagging, annotation, and comments resulting in improved Data Stewardship,” she says.

Building Awareness

“Denodo Platform brings connectivity to diverse data ecosystems within an organization,” Randall says. Today Denodo is being used by major enterprise customers for large-scale projects and Data Architectures. Healthcare company Vizient and 3D design, engineering and entertainment software provider Autodesk, for instance, are using the Denodo Platform as a Big Data Fabric, she says.

Randall believes there is still a need to build awareness of what Data Virtualization is, though that awareness is starting to grow. As the number of data systems within organizations expands, Data Virtualization is viewed as a means of hiding complexity and unifying disparate systems and data assets in an efficient, agile, and iterative manner, she says.

“The hybrid data ecosystem is becoming a stronger trend, resulting in increased use of Data Virtualization,” she says.

Photo Credit: Titima Ongkantong/


