Why AI Forces Data Management to Up Its Game

By Rich Gadomski

The Information Age has flooded the modern enterprise with data. Demand for enterprise storage capacity will only increase in the years ahead. By the end of this decade, new enterprise storage capacity shipments are forecast to be 15 ZB per year, with the active installed base exceeding 45 ZB.

Where Is This Growth Coming From? 

Business and IT leaders well understand the challenges of massive, double-digit data growth. Whether it is from the edge, in the public cloud, or on-premises, more devices and applications are generating data in increasing quantities. There are more channels for communication than ever. Email, social media, and chat only serve to multiply the amount of storage needed. 

The surge in storage demand is compounded by the need to copy and replicate data for protection and to retain data for longer periods. There is even fear in some quarters about the wisdom of deleting any corporate data. Annual data volume growth rates of 40% have become the norm in some industries.

Now factor in artificial intelligence (AI). Whereas before, analytics and even recent AI applications were focused on small, tightly defined data stores, advances in computing have changed the game. AI has arrived in prime time. The door has been opened to the large-scale analysis of near-infinite amounts of data.

Hence, AI and machine learning (ML) workloads are permeating nearly every facet of the workplace. They are becoming a central aspect of the decision-making process. Market research indicates 35% of organizations have already invested in AI, and 44% plan to invest in it within the next year. The democratization of AI is upon us. 

Take the case of ChatGPT. That application from OpenAI was trained on text drawn from large portions of the World Wide Web. We are talking here about web-scale volumes of data available for analysis. Thus, AI is destined to push data storage and data management to new horizons. We need to be ready.

Data Management for an AI/Big Data World 

Drawing on recent market reports and feedback from Data Management teams, we can safely assume that tactical measures, such as buying more storage, are no longer sustainable. Organizations have been doing that for two decades. It may have helped some of them cope with storage growth. But it no longer works to continually buy expensive storage arrays, whether disk or flash, or yet more Network Attached Storage (NAS) filers or hyperconverged infrastructure (HCI) boxes. Every three or four years, all that gear must be replaced with bigger, better, faster storage equipment. With so much storage growth, organizations never reach the point where storage stops being a constant challenge.

The combination of massive capacity growth and democratized AI makes it imperative to implement effective Data Management from the edge to the cloud. A strong foundation for artificial intelligence necessitates well-organized data stores and workflows. Many current AI projects are faltering due to a lack of data availability and poor Data Management.

Skilled Data Management, then, has become a key factor in truly realizing the potential of AI. But it also plays a vital role in containing storage costs, hardening data security and cyber resiliency, verifying legal compliance, and enhancing customer experiences, decision-making, and even brand reputation. Proficient Data Management advances a data-driven culture, where organizations treat data as a strategic asset for big decisions and everyday actions.

Active Archives Facilitate Effective Data Management 

As an industry, how can we supply such immense quantities of storage? As consumers of enterprise storage, how can we pay for it? How can we as a society provide the necessary power and control the associated carbon footprint? How can we protect it all in this never-ending age of cybercrime? What’s needed is a modern strategy to manage the growth and volume of data. 

A central part of that solution is the active archive. In an environment where data deletion is increasingly undesirable, better data placement is needed to optimize efficiency and contain costs. Processes must be in place to simplify the gathering and storing of data. Without such processes, the coming AI data tsunami will overwhelm organizations.

Active archiving solves data growth challenges in various ways. It injects an intelligent Data Management layer to place data where it belongs for cost or performance. If, for example, only 25% of organizational data is hot and needs to be available with minimum latency, why place the other 75% in expensive top-tier storage such as all-flash arrays? Active archiving differentiates the importance levels of different types of data and shunts all cold data (representing around 75% of all storage) into an easily accessible archive that can serve AI applications with low latency. 
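The placement logic described above can be sketched in a few lines of Python. This is a minimal illustration and not how any particular active archive product works: the 90-day cold threshold, the tier names, and the relative cost figures are all assumptions chosen to mirror the 25%/75% example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: data untouched for longer than this is treated as cold.
COLD_AFTER = timedelta(days=90)

# Assumption: illustrative relative cost per TB per month, not real pricing.
COST_PER_TB = {"flash": 20.0, "archive": 2.0}


@dataclass
class DataSet:
    name: str
    size_tb: float
    last_access: datetime


def choose_tier(ds: DataSet, now: datetime) -> str:
    """Return 'flash' for recently used (hot) data and 'archive' for cold data."""
    return "archive" if now - ds.last_access > COLD_AFTER else "flash"


def monthly_cost(datasets, now, tiered=True):
    """Total monthly cost, either tiered by access age or with everything on flash."""
    total = 0.0
    for ds in datasets:
        tier = choose_tier(ds, now) if tiered else "flash"
        total += ds.size_tb * COST_PER_TB[tier]
    return total


now = datetime(2024, 1, 1)
datasets = [
    DataSet("orders-current", 25.0, now - timedelta(days=3)),    # hot: 25 TB
    DataSet("logs-2019", 45.0, now - timedelta(days=900)),       # cold
    DataSet("video-archive", 30.0, now - timedelta(days=400)),   # cold
]

all_flash = monthly_cost(datasets, now, tiered=False)
tiered = monthly_cost(datasets, now, tiered=True)
print(f"all flash: {all_flash:.0f}, tiered: {tiered:.0f}")
```

With these made-up figures, 25 TB stays on flash and 75 TB moves to the archive tier. A real policy would likely weigh more than last-access time, for example retention rules or how soon an AI workload is expected to need the data.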

An active archive brings other benefits. It is adaptable to any storage architecture, media, or protocol. It safeguards data from threats and risks via built-in security measures that add extra layers of protection against ransomware. It positions organizations to cost-effectively manage data growth while laying a foundation to profit from tomorrow’s opportunities.

Using metadata and global namespaces, the Data Management layer makes data accessible, searchable, and retrievable on whatever storage platform or media it may reside. It adds automation to facilitate tiering of data to long-term storage as well as cleansing data and alerting on anomalous conditions. 
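To make the idea of a metadata-driven global namespace more concrete, here is a small Python sketch. The `Catalog` and `Entry` classes, the backend labels, and the example paths are all hypothetical and not taken from any specific product. The sketch shows only the core idea: applications use one logical path, while the catalog tracks where the data physically lives.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    logical_path: str        # the path users and applications see
    backend: str             # hypothetical label such as "flash", "tape", or "cloud"
    physical_location: str   # where the bytes actually live on that backend
    tags: dict = field(default_factory=dict)  # searchable metadata


class Catalog:
    """A toy global namespace: logical paths mapped to physical locations."""

    def __init__(self):
        self._entries = {}

    def register(self, entry: Entry) -> None:
        self._entries[entry.logical_path] = entry

    def resolve(self, logical_path: str) -> Entry:
        """Look up where a logical path currently lives."""
        return self._entries[logical_path]

    def search(self, **tags) -> list:
        """Return logical paths whose metadata matches every given tag."""
        return sorted(
            e.logical_path
            for e in self._entries.values()
            if all(e.tags.get(k) == v for k, v in tags.items())
        )

    def migrate(self, logical_path: str, backend: str, physical_location: str) -> None:
        """Move data to another tier; the logical path stays the same."""
        entry = self._entries[logical_path]
        entry.backend = backend
        entry.physical_location = physical_location


catalog = Catalog()
catalog.register(Entry("/projects/genomics/run-001", "flash", "array-a:/vol7/run-001",
                       {"project": "genomics", "year": 2021}))
catalog.register(Entry("/projects/genomics/run-002", "flash", "array-a:/vol7/run-002",
                       {"project": "genomics", "year": 2023}))

# Tier the older run to long-term storage without changing how it is addressed.
catalog.migrate("/projects/genomics/run-001", "tape", "library-1:/cart-0042/run-001")

print(catalog.resolve("/projects/genomics/run-001").backend)
print(catalog.search(project="genomics"))
```

After the migration, a search by project still returns both runs under their original logical paths, which is what lets applications keep working when data is tiered to different media.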

AI/Storage Symbiosis 

This isn’t just a case of storage and Data Management playing catch-up with the potential of AI. As Data Management becomes more effective, it directly improves the capabilities of AI. Suddenly, AI applications and data scientists can do far more and gain access to much larger pools of data. That alone will greatly impact the quality of decision-making. But it will also have a reciprocal impact on Data Management. As AI matures and more AI capabilities are injected into the Data Management software layer, the analysis, reporting, and automation capabilities of Data Management will only be enhanced.