AI Has the Potential to Bring Cloud Sprawl Back Under Control

By Sterling Wilson

Enterprises continue to produce an explosive volume of data each year – so much so that companies admit up to 60% of their data goes unused, and one-third of enterprises report feeling overwhelmed by rising data quantities. Cloud-based data storage has made it easier than ever to hold this vast amount of data, and for companies facing market pressure to undergo digital transformation, the cloud can appear to be a natural fit to supplement small IT teams.

An unintended consequence of the rapid expansion of cloud environments is cloud sprawl – i.e., an organization’s cloud environment becoming so large and/or unmonitored that it is no longer being adequately controlled by the institution. This opens the door to many risks, including operational inefficiencies, rogue IT, and cybersecurity vulnerabilities like ransomware and data theft. 

However, the rise of artificial intelligence (AI) in the enterprise provides an opportunity to tackle data growth and rein in cloud sprawl without placing unnecessary strain on already overworked IT teams.

Bringing Runaway Cloud Under Control

Following the cloud rush of the past decade, companies are struggling to catch up and gain a complete understanding of their data security posture: where all their data is held in the cloud, when data is downloaded and by whom, and whether that person is who they claim to be or is using stolen credentials. Exacerbating the problem is the tech industry's skills shortage, which is even more acute among cloud specialists, who need expertise not only in security but in cloud and compliance as well. Multi-cloud environments can get especially confusing because there are often no systems in place to monitor data infrastructure across multiple cloud providers.

AI can help to wrangle all of that by leveraging metadata: the data about the data. For every file, there is also information on its size, when it was created, where it lives, and so on. AI can scan and aggregate this metadata from different sources to highlight anomalies and identify trends, producing reports on where all data is stored and where companies may be wasting money – for example, flagging when an organization is unnecessarily storing data in the cloud and paying excess ingress and egress fees. AI can discover and study these kinds of patterns far more effectively than enterprises have been able to previously.
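As an illustration, the core of such a metadata-driven scan can be sketched in a few lines of Python. The records, field names, and thresholds below are hypothetical, not taken from any particular tool:

```python
import statistics

# Hypothetical metadata records, as a scanner might aggregate them
# from several cloud buckets and on-prem shares.
files = [
    {"path": "s3://bucket-a/logs/2021.tar",  "size_mb": 120,  "last_accessed": "2021-03-01"},
    {"path": "s3://bucket-a/logs/2022.tar",  "size_mb": 135,  "last_accessed": "2022-02-11"},
    {"path": "s3://bucket-b/export/full.db", "size_mb": 9800, "last_accessed": "2020-07-19"},
    {"path": "s3://bucket-a/media/clip.mp4", "size_mb": 150,  "last_accessed": "2024-01-05"},
]

def flag_size_anomalies(records, z_threshold=1.4):
    """Flag files whose size deviates sharply from the fleet average."""
    sizes = [r["size_mb"] for r in records]
    mean, stdev = statistics.mean(sizes), statistics.stdev(sizes)
    return [r for r in records
            if stdev and abs(r["size_mb"] - mean) / stdev > z_threshold]

def cold_data(records, cutoff="2022-01-01"):
    """Files untouched since the cutoff date (ISO date strings compare
    lexicographically) – candidates for cheaper tiers or deletion."""
    return [r for r in records if r["last_accessed"] < cutoff]
```

Grouping the same records by storage location would yield the "where is everything stored" report described above; a real deployment would pull metadata through each provider's inventory APIs rather than hard-coded lists, and would use far richer anomaly models than a z-score.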

Businesses can use this information to analyze cloud performance and find opportunities to reduce redundancies. In the end, it will arm IT decision-makers with the information they need to make smarter choices about how and when they use the cloud. 

The Role of Hybrid Cloud

Using AI, organizations will be able to answer the three most important questions related to their data security posture: Where is my data? How much is it costing me to store? And how am I going to recover in the event of a cyberattack or outage? Traditional methods of data mapping are often error-prone, time-consuming, and ultimately can't keep pace with complex cloud environments. Leveraging AI, organizations can automatically scan data across diverse storage locations, whether in the cloud or on-premises. At the same time, AI can quickly analyze past storage usage and predict future needs, cutting unnecessary costs and optimizing storage allocation, while also analyzing historical backup and disaster recovery plans and predicting potential future threats.
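The usage-prediction idea can be illustrated with a deliberately simple trend extrapolation – a least-squares line fitted to past monthly consumption, using made-up numbers. A production system would use far richer models and real telemetry; this only shows the shape of the approach:

```python
def forecast_storage(monthly_tb, months_ahead=3):
    """Fit a least-squares trend line to past usage and extrapolate forward."""
    n = len(monthly_tb)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(monthly_tb) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, monthly_tb))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    # Project the fitted line months_ahead steps past the last observation.
    return [intercept + slope * (n - 1 + m) for m in range(1, months_ahead + 1)]

# Twelve months of observed usage in terabytes (illustrative numbers).
usage = [40, 42, 45, 47, 50, 52, 55, 58, 60, 63, 66, 68]
```

Feeding the projection into a capacity plan – or comparing it against committed cloud spend – is where the cost savings the paragraph describes would come from.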

Gaining increased visibility into cloud environments through AI-based tools will inform decisions about where to put data in the future and, ultimately, lead to the repatriation of some data to on-premises infrastructure.

Data storage infrastructure works best in a hybrid cloud environment, with some copies of the data in the cloud and others on-premises in a separate location. On-prem object storage used for backup supplements the cloud by keeping a segregated copy of data, providing an additional layer of security that is also faster to recover. Cloud-first companies that have been through a ransomware attack know that recovering data from the cloud to anywhere else afterward can take a massive amount of time that businesses cannot spare. Business outages can cost companies up to $1 million an hour, not to mention the loss of reputation and customer loyalty.

In addition to reducing Recovery Point Objective (RPO) and Recovery Time Objective (RTO), keeping a copy of data on-prem is in line with the 3-2-1 methodology, a widely accepted best practice stating that companies should keep three copies of their data on two different types of storage media, with one copy offsite. As companies continue to find new use cases for AI, visibility into cloud environments will improve, and companies will find a better balance between their cloud and on-prem data storage solutions. This will lead to more informed, efficient, and resilient organizations.
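The 3-2-1 rule lends itself to a mechanical check. The copy descriptions below are illustrative, modeling the hybrid layout described above:

```python
def satisfies_3_2_1(copies):
    """3-2-1 rule: at least three copies, on at least two distinct
    media types, with at least one copy held offsite."""
    media_types = {c["media"] for c in copies}
    has_offsite = any(c["offsite"] for c in copies)
    return len(copies) >= 3 and len(media_types) >= 2 and has_offsite

# A hybrid layout: a cloud copy (offsite), plus on-prem object
# storage and local disk.
backup_copies = [
    {"media": "cloud-object",   "offsite": True},
    {"media": "on-prem-object", "offsite": False},
    {"media": "local-disk",     "offsite": False},
]
```

Dropping any one of the three copies, or collapsing them onto a single media type, would fail the check – which is exactly the kind of gap an automated audit of backup posture should surface.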