In the hyper-connected world of the cloud and the Internet of Things (IoT), every computing device is linked to others through a complex network. This poses a serious challenge to future Data Management, as the ultimate goal of Data Management is sharing business data across disparate platforms and technologies.
The terms “data fabric” and “data mesh” are often used interchangeably to indicate data-access architecture in a hyper-connected Data Management world. The data fabric is more of an architectural approach to data access, whereas the data mesh attempts to connect data processes and users.
Let’s begin with the thoughts of industry experts. In May 2019, KDnuggets predicted the sudden rise of data fabric in the data world in a post titled What’s Going to Happen this Year in the Data World?
James Serra’s and Mark Beyer’s Views of Data Fabric and Data Mesh
James Serra, previously big data and data warehousing solution architect at Microsoft and currently Data Platform Architecture Lead at EY, shared his views on data fabric and data mesh:
“A data fabric and a data mesh both provide an architecture to access data across multiple technologies and platforms, but a data fabric is technology-centric, while a data mesh focuses on organizational change.”
This insightful quote by Serra signals that data fabric is more about managing data technologies (integration architecture), while data mesh is more about managing people and processes. That’s a good start in understanding the two different approaches to Data Management.
According to Mark Beyer, a Gartner Analyst:
“The emerging design concept called ‘data fabric’ can be a robust solution to ever-present data management challenges, such as the high-cost and low-value data integration cycles, frequent maintenance of earlier integrations, the rising demand for real-time and event-driven data sharing and more.”
Now that industry experts have confirmed that data fabric is all about data integration technology, and data mesh is all about organizational Data Management, let’s see how business data is handled and managed differently in the data fabric vs the data mesh worlds.
Two Views of Data: Data as a Byproduct vs Data as a Product
According to Gartner, data fabric is an abstract concept “integrating data with connected data processes.” In data fabric, data is treated more as a byproduct of superior data-integration technologies, where the means to an end makes all the difference. In data mesh, data is created in a silo and treated as a “product,” a critical asset in the enterprise Data Management process.
Two Approaches to Data Management: Data Integration vs Data Ownership
At the fundamental level, the ultimate goal of a data fabric is to provide value-added data integration across multi-cloud, hybrid cloud, on-premises, and stand-alone hosted systems. The ultimate goal of a data mesh is to offer Data Management via controlled, domain-specific datasets.
A Data Integration.info article indicates that the “amount of data created or replicated in 2020 reached 64.2 zettabytes.” Now is the time to think about decentralized Data Management and that’s where data mesh comes in. Decentralized Data Management is a primary way that global businesses will scale their operations around value-driven outcomes.
Two Approaches to Data Stores: Centralized vs Decentralized
In data fabric, data access is centralized (a single point of control), for example through a high-speed server cluster for network and high-performance resource sharing.
On the other hand, in a data mesh, the data is stored within each of the units (domains) within a company. In a distributed data mesh, each node has local storage and computation power and no single point of control (SPOC) is necessary for operation. In a data mesh environment, original data remains within domains; copies of datasets are generated for specific use cases.
Data mesh is particularly useful for hybrid cloud networks, where data connectivity models and data security are unavoidable challenges.
Two Approaches to Data Access: via APIs vs via Controlled Datasets
In data fabric, data is made available via objective-based APIs or via data stores where API support does not exist. In a data mesh, data is copied into specific datasets for specific use-cases, but under the complete control of the business unit or domain that owns the data.
As an example, say a user needs to build a dashboard that compares quarterly sales versus quarterly inventory data. In the data fabric environment, the sales and inventory data will be ingested first to the respective system’s data store. Then an API will be built to join the data sets and expose them to the dashboard.
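The data fabric flow in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two in-memory dictionaries stand in for each system's data store after ingestion, the function `sales_vs_inventory` stands in for the API, and all names and figures are hypothetical.

```python
from typing import Dict, List

# Hypothetical per-system data stores after ingestion (figures are made up).
SALES_STORE: Dict[str, float] = {"2023-Q1": 120.0, "2023-Q2": 150.0}
INVENTORY_STORE: Dict[str, float] = {
    "2023-Q1": 80.0,
    "2023-Q2": 60.0,
    "2023-Q3": 90.0,
}


def sales_vs_inventory() -> List[dict]:
    """Hypothetical API: join both stores on quarter and expose one view."""
    # Only quarters present in both stores can be compared.
    shared_quarters = sorted(SALES_STORE.keys() & INVENTORY_STORE.keys())
    return [
        {"quarter": q, "sales": SALES_STORE[q], "inventory": INVENTORY_STORE[q]}
        for q in shared_quarters
    ]


# The dashboard calls the single access point instead of each store.
fabric_rows = sales_vs_inventory()
for row in fabric_rows:
    print(row)
```

The point of the sketch is that the join lives behind one centrally managed access point, so the dashboard never touches the underlying stores directly.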
In a data mesh environment, the sales data will be copied from the department data store to a shared location. Likewise, the inventory data will be copied from the department data store to the same shared location. The dashboard owner will then build a joined table to make the datasets work in the same workbook.
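The data mesh flow described here could look roughly like the following Python sketch. It rests on assumptions made for illustration: the `Domain` class, the `publish_copy` method, the `shared_location` dictionary, and the sample figures are all hypothetical and do not come from any particular data mesh product.

```python
import copy
from typing import Dict, List


class Domain:
    """A hypothetical business unit that owns its original data."""

    def __init__(self, name: str, records: List[dict]) -> None:
        self.name = name
        self._records = records  # original data stays inside the domain

    def publish_copy(self) -> List[dict]:
        # Consumers receive a copy; the domain keeps control of the original.
        return copy.deepcopy(self._records)


sales = Domain("sales", [
    {"quarter": "2023-Q1", "sales": 120.0},
    {"quarter": "2023-Q2", "sales": 150.0},
])
inventory = Domain("inventory", [
    {"quarter": "2023-Q1", "inventory": 80.0},
    {"quarter": "2023-Q2", "inventory": 60.0},
])

# Shared location: holds the copies each domain chose to publish.
shared_location: Dict[str, List[dict]] = {
    sales.name: sales.publish_copy(),
    inventory.name: inventory.publish_copy(),
}


def build_joined_table(shared: Dict[str, List[dict]]) -> List[dict]:
    """Dashboard owner's join of the two copied datasets on quarter."""
    inventory_by_quarter = {r["quarter"]: r["inventory"] for r in shared["inventory"]}
    return [
        {**r, "inventory": inventory_by_quarter[r["quarter"]]}
        for r in shared["sales"]
        if r["quarter"] in inventory_by_quarter
    ]


mesh_rows = build_joined_table(shared_location)
for row in mesh_rows:
    print(row)
```

Here the join is the dashboard owner's responsibility, and changes made to the shared copies do not reach the originals held by each domain.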
That being said, let’s take a look at the real business world today. Data fabric versus data mesh? No, not really. It’s data fabric with data mesh that seems to be offering a comprehensive Data Management solution.
Use Cases: Tipping in Favor of Data Fabric
Data fabric has kept its promises of single-point data access, mitigation of data quality and insufficient storage issues, compliance, and superior handling of security threats. As a result, it is the preferred Data Management technology in the global business environment today.
Thus, data fabric is currently applied to a wide variety of use cases. Data mesh is still at an untapped stage, mostly providing additional strength to data fabric in multi-cloud setups.
Data Mesh as a Complementary Technology for Data Fabric
Data mesh, through its single method of connectivity, can promote high data availability and reliability in a hybrid cloud environment. The data is guaranteed to be highly available, easily discoverable, secure, and interoperable with the applications that depend on accessing it. In this context, you may want to review this Forbes Council Post, authored by Joe Gleinser.
Something for the Readers to Explore
Can data mesh survive without data fabric?
While data fabric has become the preferred network architecture for business data centers, data mesh has been quietly tracking network performance for years now, and intervening whenever changes occur.
Data mesh has served as the vigilant troubleshooter of enterprise networks — working overtime to resolve network problems even before they happen. Data mesh works independently, so it does not necessarily need to rely on data fabric. It runs on its own software-defined network (SDN) platform. Here is a piece of news: many users feel that data mesh is a much better technology compared to data fabric for data integration. What do you think?
Data fabric and data mesh both offer powerful solutions for collecting and consolidating business data from disparate sources for enhanced decision-making. However, data mesh is still maturing; it is more suitable for applications that do not require high performance or reliability. Data fabric and data mesh, for best results, should be used as complementary technologies.
A thought-provoking article, Data Architecture: Complex vs. Complicated, discusses the need for “adaptable Data Management architectures” in a hyper-connected world of remote hosts and sensors flowing with non-stop data.