Before COVID, data scaling, performance, and data services were largely manageable within the corporate data center. But the last several years have seen enterprises decentralize their data and IT systems, while employees now work remotely across locations and geographies. This massive shift to remote work has created new challenges in managing data access and protection across a highly distributed workforce.
Organizations need solutions to bridge the new gap between where data is stored and how remote workers, applications, and computers can access it.
These circumstances are driving tectonic shifts in three familiar paradigms as companies advance their digital transformation strategies. Understanding these shifts is crucial as organizations adapt their approach to today’s distributed data and workforce challenges.
Scale Requirements: Act Locally, Scale Globally
In the pre-COVID world, data owners calculated storage system scalability by SSD/HDD storage density within a single rack in a datacenter or by the number of nodes in a single namespace for scale-out systems.
This is not to say that capacity management and storage efficiency are now easy: The typical enterprise IT organization manages over 10 different types of storage technologies, including multiple clouds and cloud regions. But in the data center, the tools and measurements were relatively straightforward.
In a decentralized storage environment, scale is about more than just capacity management and data center space efficiency. IT must also consider which storage types (block, file, or object) their systems support, the variety of cloud storage offerings, the number of supported regions, and which cloud services to integrate.
Most importantly, IT needs to ensure that users and applications can act locally without burdening IT administrators with manual processes and proprietary vendor lock-in, which hamper agility and add costs.
A New Measure of Performance Success
In the past, data systems’ performance was measured by raw IOPS and I/O throughput between compute, applications, and local storage within the data center.
Today, IT must still account for the traditional performance metrics of IOPS and throughput, but it carries the added burden of delivering high performance and low latency in remote environments. When users need specific data for a work project, they must be able to access those files seamlessly without having to copy them.
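The relationship between these metrics is simple to express: latency is measured per request, while IOPS and throughput are derived from request count, block size, and elapsed time. A minimal sketch (all names are hypothetical; the `fetch` callable stands in for a local disk read or a remote object GET):

```python
import time
import statistics

def measure_read(fetch, requests, block_size):
    """Time a series of reads and derive latency, IOPS, and throughput.

    fetch: a callable performing one read of block_size bytes
    (hypothetical stand-in for any local or remote storage call).
    """
    latencies = []
    start = time.perf_counter()
    for _ in range(requests):
        t0 = time.perf_counter()
        fetch(block_size)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "p50_latency_ms": statistics.median(latencies) * 1000,
        "iops": requests / elapsed,
        "throughput_mbps": requests * block_size / elapsed / 1e6,
    }

# Example: a stand-in "remote" read that sleeps to simulate network latency.
stats = measure_read(lambda n: time.sleep(0.002), requests=50, block_size=4096)
```

The point of the sketch is that the same workload can show healthy IOPS in the data center yet unacceptable per-request latency once the `fetch` call crosses a WAN, which is why remote environments need latency tracked as a first-class metric.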
For example, while consumer demand for home entertainment increased during shutdowns, live-action production came to a near halt. Demand for computer-generated visual effects and animation sequences grew exponentially, yet creative and production teams were working from home. The successful studios were able to support their artists with decentralized data access and collaboration tools, and these studios typically doubled the size of their businesses in just one year.
Successful business performance requires this level of support – enabling distributed teams to collaborate as efficiently in remote settings as in local ones. The shared data environment should seamlessly synchronize changes so teams can collaboratively design, analyze, and produce results – even if they’re on opposite sides of the globe.
Data Service Requirements
Historically, data center managers used each storage vendor’s data protection, compliance, and disaster recovery solutions that were built into the storage software. This was an efficient solution for a single-vendor storage system. But in practice, data centers are multi-vendor environments whose data services are often incompatible with each other.
The immediate solution was third-party processes that worked across multi-vendor environments to protect data and meet compliance requirements, such as point solutions, third-party tiering software, stubs, symbolic links, gateways, or other proprietary techniques.
But most of these solutions are viable only within a single data center, and managing decentralized data across users, applications, and the data lifecycle grew increasingly difficult. That difficulty led to limited choices, added costs, and reduced end-user productivity.
Today, IT needs to deploy data services that operate efficiently across a globally distributed environment. These services should let IT teams identify data sets and their locations, limit copy proliferation, control access, set data protection controls, and more.
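As one illustration of identifying data sets and limiting copy proliferation, a data service can fingerprint files by content hash and flag identical copies scattered across locations. A minimal sketch, with a hypothetical inventory layout standing in for whatever catalog a real data service maintains:

```python
import hashlib
from collections import defaultdict

def find_duplicates(inventory):
    """Group file copies by content hash across storage locations.

    inventory: iterable of (location, path, content_bytes) tuples.
    Returns {hash: [(location, path), ...]} for any content stored
    in more than one place.
    """
    by_hash = defaultdict(list)
    for location, path, content in inventory:
        digest = hashlib.sha256(content).hexdigest()
        by_hash[digest].append((location, path))
    return {h: copies for h, copies in by_hash.items() if len(copies) > 1}

# Example: the same report stored in two regions, plus one unique file.
dupes = find_duplicates([
    ("us-east", "/reports/q1.pdf", b"quarterly results"),
    ("eu-west", "/backup/q1.pdf", b"quarterly results"),
    ("us-east", "/reports/q2.pdf", b"new results"),
])
```

Real data services operate on catalogs and object metadata rather than raw bytes, but the principle is the same: a global view of content, not per-silo file listings, is what makes copy control possible across regions.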
For example, many DevOps teams struggle to maintain development timelines in the face of different time zones and without efficient shared data access. Successful DevOps use data services that support remote teams and distributed code, enabling efficient build and deployment cycles.
Bringing It All Together
We are rapidly approaching a reality where enterprises need to replace inefficient, manual data management with cost-effective data services that serve multiple data and workforce locations. Organizations that succeed in transforming their global data management strategies will keep pace with a changing world and create a significant competitive advantage.