Click to learn more about author Flip Filipowski.
Data silos aren’t a new IT obstacle — they’ve been around since the dawn of internet businesses. Companies in the 21st century generally tend to innovate efficiently in the name of survival. With incredible strides in data technologies and Data Management, why haven’t data silos been kicked to the curb and the real value of data cross-enterprise been unlocked?
The one IT disease that plagues every enterprise, and one that seems to remain immune to the industry’s best elimination efforts, is the resilient data silo. In its implied analogy, data can be thought of as pieces of grain enclosed within the walls of a farm silo, impervious to the outer elements.
Data silos restrict the free flow of information across business units, impeding a company’s ability to gain rich analytical insights, collaborate around data or make well-informed decisions. A silo typically starts out as a simple, innocent database built quickly to serve a single purpose for a particular team. Silos then multiply, fueled by a lack of standardized data formatting and poor consolidation efforts.
At enterprise scale, data silos paint an ugly picture: disjointed, disparate, and uncoordinated data leads to poorly informed decision making, faulty information governance, misaligned teams, and limited collaboration. Affecting the bottom line in more ways than one, data silos are a growing threat that must be eradicated before an organization meets a self-imposed demise. But what is the answer?
Impact on Broader Ecosystems
Normally data silos are thought of as a company-level issue, but they also manifest in broader industry ecosystems and damage processes there. Health care is one example: patient data is difficult and resource-intensive to share and access.
Two additional trends make addressing data silos urgent today: the escalating volume of data (doubling every two years) and industry’s increasing need to share that data.
As the data economy heats up, and more emerging technologies and architectures rely on a heavier need to treat data as a versatile, cross-boundary asset, data silos — in their current form — have got to go.
The data silo problem transcends any single category. It is created and exacerbated by a combination of flawed philosophy, ecosystem and technology. The philosophy is a one-database-to-one-application school of thought. The ecosystem is limited by data sharing that is opaque, painstakingly administrative, or both. And the technology problem lies in proprietary SaaS vendors, a lack of standards, or schemas that differ by just one field.
A few contributors spread across the above categories include:
- Data Politics: The urgency to hold proprietorship over data is well-intentioned — you wouldn’t want to be the dev team responsible for leakage — but creates a framework centered wholly around data “protection,” versus “data versatility.” This becomes an issue as departments turn to replicating data instead of integrating data.
- Compliance: Well, of course you want to isolate data in highly-regulated environments, right? Security and compliance broadly contribute to fears that opening up data across your company could in turn open up data to the rest of the world. Without the right technology and guidance, this could very well be true.
- Nearsighted Data Architecture: Upon receiving a new mission, dev teams want to build apps quickly, forgoing the drive to build within the broader context of data versatility. At scale, this looks like multiple independent data sets (some of which are nearly identical in schema and data) completely siloed off, each serving a one-to-one relationship with a single application.
- Culture: A company’s growth naturally begets a more formal organizational structure of specialized teams. And while this is necessary for the next stage of productive business, we start to see barriers forming around business units. Each team becomes its own “island” of data, generating, storing and using only the information relevant to its specific goals. The flow of information becomes more of a trickle, and soon cultural, communication and information barriers take root.
- Vendor Lock-in: SaaS companies understand the value of data to a fault: in restricting a user’s complete access to their data, they are preserving their livelihood by keeping the customer on their cloud platform versus a competitor’s. A company today may have multiple sets of data that are held hostage in another’s cloud, unable to shift to a competitor or build its own application.
- Lack of Universal Standards, Different Underlying Schemas: Data lying in different formats and different schemas is difficult to integrate. This is especially true in data lakes, where harmonization and normalization can take an immense amount of resources to accomplish. Furthermore, in current data silo consolidation efforts, we find that differences in schema decisions cause integration slowdown.
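To make the schema point in the list above concrete, here is a minimal Python sketch of two departmental data sets that describe the same customer under slightly different field names. Everything in it is a hypothetical illustration: the records, the field names, the `FIELD_MAPS` table and the helper functions are assumptions, not part of any particular product or standard.

```python
from typing import Any

# Hypothetical rows from two siloed databases; the schemas differ only
# in field names (all names below are illustrative assumptions).
SALES_ROWS = [{"cust_id": 1, "email": "a@example.com", "region": "EU"}]
SUPPORT_ROWS = [{"customer_id": 1, "email_address": "a@example.com", "tier": "gold"}]

# Per-source mapping from local field names to one shared schema.
FIELD_MAPS = {
    "sales": {"cust_id": "customer_id", "email": "email"},
    "support": {"customer_id": "customer_id", "email_address": "email"},
}


def harmonize(source: str, row: dict[str, Any]) -> dict[str, Any]:
    """Rename a row's fields to the shared schema; unmapped fields pass through."""
    mapping = FIELD_MAPS[source]
    return {mapping.get(key, key): value for key, value in row.items()}


def merge_by_customer(*batches: tuple[str, list[dict[str, Any]]]) -> dict[Any, dict[str, Any]]:
    """Combine harmonized rows from several sources into one record per customer."""
    merged: dict[Any, dict[str, Any]] = {}
    for source, rows in batches:
        for row in rows:
            record = harmonize(source, row)
            merged.setdefault(record["customer_id"], {}).update(record)
    return merged


unified = merge_by_customer(("sales", SALES_ROWS), ("support", SUPPORT_ROWS))
print(unified)
```

Even in this toy case, someone has to write and maintain a mapping for every source. With dozens of silos and no agreed standard, that mapping work is where integration slows down.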
In Data Management terms, segregating data can generate critical IT issues, including redundancy, tunnel vision, incomplete retention policies and frameworks, disjointed codebases, inconsistent data and a lack of context. These issues pose major threats to the longevity and success of a company, irrespective of department.
Companies are collaborating around data at an accelerating rate. The need to leverage data across enterprise or even industry borders has become imperative. This increasing need for data collaboration, combined with emerging technologies like AI that need access to high-quality, broad sets of data, leaves us with an immense-but-necessary task at hand: changing data management with versatility as a first-class feature.
Can Enterprise Blockchain Eliminate the Data Silo?
Right out of the box, today’s databases are self-imposed silos: centralized repositories serving one master. Blockchain takes the opposite approach. Its architecture is native to a decentralized and democratic environment of trusted data exchange. The ability to decentralize a data set across a network of participants provides a transparent, unified and trusted repository of information, and establishes the framework for next-generation platforms where data is owned by users, not by centralized entities.
Key takeaways from a blockchain implementation include interoperability, trustless data sharing, transparency and decentralized/democratic ownership. But can enterprise blockchains eliminate the data silo? Not without help.
While opening up a repository directly to relevant stakeholders brings immense benefits from deeper analytics to trusted data sharing, there are real issues that must be addressed in order for enterprise blockchain to make a widespread impact:
- Security, Compliance and Privacy: Not everything should be completely public and immutable.
- Data Management: Blockchains today are transaction-oriented and have little place for that pesky metadata. In the enterprise, data and metadata management is a central component for most applications and workflows. Blockchain schemas are rigid, and many cannot be changed at all because the ledger is immutable. Data mechanisms must evolve to accommodate highly variable user rates.
- Standards: In order to efficiently share and collaborate around data in a wide and complex ecosystem, data needs to be formatted consistently.
- Application Centric Versus Data Centric: In today’s application-centric world, data sharing, security and other mechanisms of Data Management are handled through APIs and other application features rather than at the data tier itself.
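The immutability mentioned in the list above can be illustrated with a minimal hash-chained ledger in Python. This is only a sketch under stated assumptions: it runs in a single process, has no network, consensus or access control, and the function names and record fields are hypothetical. It shows the one property that makes stored records hard to change: each block's hash covers the previous block's hash.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, payload, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    body = json.dumps(
        {"index": index, "payload": payload, "prev": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a new block that is linked to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append(
        {
            "index": index,
            "payload": payload,
            "prev": prev_hash,
            "hash": block_hash(index, payload, prev_hash),
        }
    )


def is_valid(chain):
    """Recompute every hash and check that each block links to its predecessor."""
    expected_prev = GENESIS_PREV
    for position, block in enumerate(chain):
        if block["index"] != position or block["prev"] != expected_prev:
            return False
        if block["hash"] != block_hash(block["index"], block["payload"], block["prev"]):
            return False
        expected_prev = block["hash"]
    return True


ledger = []
append_block(ledger, {"record": "shipment-1", "status": "sent"})
append_block(ledger, {"record": "shipment-1", "status": "received"})

valid_before = is_valid(ledger)           # chain as originally written
ledger[0]["payload"]["status"] = "lost"   # edit an already-written record
valid_after = is_valid(ledger)            # the stored hash no longer matches

print(valid_before, valid_after)
```

The same linkage that makes tampering detectable also makes legitimate changes hard: correcting a record, erasing it for compliance reasons, or altering its schema would invalidate every later block. That is why the security, privacy and data management concerns above need answers at the data tier.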
The Path Forward
Today, we tend to build around the problem of data silos with incremental changes, adding massive data lakes to consolidate repositories, and deploying webs of APIs to manipulate data back into a usable asset.
The path forward in disbanding data silos involves making the innovations directly at the data tier — changing how data is managed and challenging its current infrastructure. Adopting a forward-thinking approach to Data Governance and Data Management is a resource-saving practice for every company — and it starts with the right approach and tools. Consolidation efforts are helpful for combining legacy data, but companies looking to build a sustainable, scalable Data Management model need to build from the ground up — with data shareability in mind.