Click to learn more about author Ravi Shankar.
With Big Data, anyone with a modest budget can store, manage, and process vast amounts of data. The problem is that many companies store data from different systems in different formats, creating Big Data silos that result in large datasets that need to be integrated manually. Aside from inducing carpal tunnel syndrome, such silos erode Big Data investments and threaten to derail the initiative altogether.
Forrester saw this coming years ago, and proposed a solution dubbed “Big Data Fabric” in March 2016. Evolving from the “information fabric” concept that was developed through 2013, Big Data Fabric comprises six layers: data ingestion, processing and persistence, orchestration, data discovery, Data Management and intelligence, and data access. Working together, these layers provide seamless, real-time integration of the data in a Big Data system, regardless of data silos.
How Does It Work?
Let’s drill down into each of the six layers, to see how they work together:
- Data Ingestion:
As the first layer in the Fabric, the data ingestion layer needs to be savvy with all of the potential kinds of data, be it structured or unstructured, such as data from devices, sensors, logs, clickstreams, applications, and cloud sources, in addition to databases.
- Processing and Persistence:
Next, the data needs to be processed and persisted, and this is where Hadoop, Spark, and other Cloud-based processing systems come into play.
- Orchestration:
Here, the data needs to be transformed and cleaned, as needed, to integrate with other data. The only effective way to do this is in a dedicated orchestration layer, because ad hoc transformation is costly and potentially endless.
- Data Discovery:
This is the most important layer in the Fabric, because it directly addresses the silo problem. In the data discovery layer, companies employ data modeling, data preparation, data curation, and data virtualization. Data virtualization is critical, as it creates virtual views of the data that can be accessed by consumers in real-time. This means that analysts can query across two “silos” as if they were part of the same dataset.
- Data Management and Intelligence:
This layer sits above the other five layers, securing the data and enforcing governance. This is also where companies can apply global structures such as Metadata Management, search, and lineage control.
- Data Access:
Finally, at the opposite end from the ingestion layer, we have the data access layer, which delivers the data to analysts either directly or through a series of applications, tools, and dashboards.
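To make the data virtualization idea from the data discovery layer more concrete, here is a minimal Python sketch of a virtual view that joins two silos at query time. Everything in it is an illustrative assumption rather than part of Forrester's model or any specific product: the sample data, the two silo formats (a CSV export and JSON lines), and the function names `read_customers`, `read_clicks`, `customer_clicks_view`, and `pages_visited_by` are all hypothetical.

```python
import csv
import io
import json

# Silo 1: customer records exported as CSV (hypothetical sample data).
CRM_CSV = """customer_id,name
1,Ada
2,Grace
"""

# Silo 2: clickstream events stored as JSON lines (hypothetical sample data).
CLICKS_JSONL = """{"customer_id": 1, "page": "/pricing"}
{"customer_id": 2, "page": "/docs"}
{"customer_id": 1, "page": "/signup"}
"""


def read_customers():
    """Read customer rows from the CSV silo each time it is called."""
    for row in csv.DictReader(io.StringIO(CRM_CSV)):
        yield {"customer_id": int(row["customer_id"]), "name": row["name"]}


def read_clicks():
    """Read click events from the JSON-lines silo each time it is called."""
    for line in CLICKS_JSONL.splitlines():
        if line.strip():
            yield json.loads(line)


def customer_clicks_view():
    """Virtual view: joins both silos on customer_id at query time.

    A small in-memory lookup is built per query, but no combined
    dataset is persisted; every call re-reads the underlying sources.
    """
    names = {c["customer_id"]: c["name"] for c in read_customers()}
    for event in read_clicks():
        yield {"name": names.get(event["customer_id"]), "page": event["page"]}


def pages_visited_by(name):
    """Example consumer query that treats the view as a single dataset."""
    return [row["page"] for row in customer_clicks_view() if row["name"] == name]


if __name__ == "__main__":
    print(pages_visited_by("Ada"))  # prints ['/pricing', '/signup']
```

The consumer calls `pages_visited_by` without knowing that the names and the page visits live in different stores and formats. A production data virtualization platform would add query pushdown, caching, and security on top of this basic pattern.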
Benefits of Big Data Fabric
With its many layers, Big Data Fabric offers many potential benefits and enables companies to:
- Effectively integrate Big Data assets with on-premises and Cloud data sources, for a complete view of enterprise-wide information.
- Gain real-time access to up-to-the-minute data, via the data virtualization component of the data discovery layer.
- Easily onboard new Big Data systems and retire legacy systems, while keeping business systems running continuously. Layers and abstraction protect business users at the top of the stack from any changes at the bottom of the stack.
- Use fewer resources, especially since, with data virtualization, very little data needs to be replicated.
Establishing a Big Data Fabric might seem daunting, but the best way to proceed is to learn from others who have successfully built Big Data Fabrics. To learn more about Forrester’s perspective, you can also access a free Big Data Fabric Wave report here.
The accessibility of vast amounts of multi-structured data, coupled with competitive pressure to glean actionable insights from data in real-time, is driving enterprises to seek Big Data Fabric platforms to fuel their Big Data initiatives. If you want to learn how to get the most from your Big Data investments, register now for the next Fast Data Strategy conference, and you’ll be well on your way to weaving your own Big Data Fabric, and reaping all the benefits.