Challenges of Data Governance in a Multi-Cloud World

Today, it would be hard to find a business that uses no cloud services at all. Despite centralized “no-cloud” policies in some enterprises, individual departments, work groups, and units are increasingly subscribing to services for data storage and backup, media services, CRM, hosted analytics, and more. Even developers are using Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) for testing and deploying applications.

One fact is clear: the cloud is here to stay and will prosper. Businesses that want hosted, cost-efficient data centers with improved agility will therefore have to learn to deal with the associated risk factors. Increasingly, businesses are adopting private-public cloud, hybrid cloud, community cloud (resources shared by several businesses), and other flavors of mixed cloud environments. These cloud infrastructures are no longer simple and standalone; they are often designed, developed, and deployed over a complex network of cloud servers, depending on organizational size, needs, and budget.

Principles and Best Practices for Data Governance in the Cloud states that Data Governance is an essential part of making data work for your business. To make the data really work, the governance framework has to address policies and procedures for data collection, ingestion, storage, cataloging, deployment, backup, and periodic removal.

According to McKinsey Digital, the International Data Corporation (IDC) predicted that “corporate spending on third-party-managed and public-cloud environments will grow from $28 billion in 2011 to more than $70 billion in 2015, and over $100 beyond going into the 2020s.” This 2013 prediction came true when a majority of North American companies began to shift about three-quarters of their applications for substantial cost savings. The McKinsey survey showed that many executives stated that by replacing their in-house applications with cloud-based, Software-as-a-Service (SaaS) models, they managed to save about 60 to 70 percent of operating costs.

Security Boulevard points to a Forrester Research report finding that about 86 percent of surveyed global decision-makers have already adopted a multi-cloud strategy, and 60 percent are shifting to the public cloud. In a multi-cloud world, the enterprise customer deals with more than one cloud environment and more than one service provider.

Cloud Data Governance — A Necessity for Enterprise Data Management

Medium argues that as multi-cloud is fast becoming a de facto infrastructure standard among enterprises, Data Governance in the multi-cloud environment is a necessity. According to McKinsey, data residing on cloud platforms are usually “scaled, shared, and automated.” Business users are easily attracted to cloud platforms for the agility and availability of resources on a limited budget.

The popular cloud options in large businesses now are private-public cloud combinations, hybrid cloud setups, and other multi-cloud environments, chosen to reduce risk and increase computing power. While access to the public cloud sounds like a good proposition for its apparent benefits, the common concern among C-suite executives is data security, particularly the risk of a data breach.

Moreover, reputable businesses do not want to take services from cloud service providers who do not guarantee regulatory compliance. Thus, businesses are left with a balancing act between the apparent benefits of multi-cloud and the increasing data risks associated with cloud platforms.

Cloud Computing Challenges: Navigating the Multi-Cloud Landscape quotes Jay Chapel, the founder and CEO of ParkMyCloud, as saying that in a cloud computing world, data “security is either the highest or the second highest priority.”

The five critical steps of a Data Governance strategy in a cloud environment include identifying data “value” for storage and security purposes; determining the risks associated with data; determining data “location” for creating adequate safeguards and for selecting the appropriate method of data transfer; developing a solid hierarchy of data access in terms of its varied users; and creating a clear set of data quality policies to govern both incoming and outgoing data. The linked source describes each of these steps in detail.
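
As a rough illustration, the five steps can be recorded as fields on a per-data-set record, so that gaps in the governance plan become easy to spot. The sketch below is only one possible shape: the class names, the value tiers, and the role and location strings are all hypothetical and are not taken from the source.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class DataValue(IntEnum):
    """Hypothetical value tiers; a real program would define its own."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class GovernanceRecord:
    """One data set's answers to the five governance steps (illustrative only)."""
    name: str
    value: DataValue                                    # step 1: data "value"
    risks: list = field(default_factory=list)           # step 2: known risks
    location: str = ""                                  # step 3: provider/region
    allowed_roles: set = field(default_factory=set)     # step 4: access hierarchy
    quality_rules: list = field(default_factory=list)   # step 5: quality policies

    def can_access(self, role: str) -> bool:
        """Allow access only to roles explicitly listed for this data set."""
        return role in self.allowed_roles

    def missing_steps(self) -> list:
        """Name the steps that have not been filled in yet."""
        checks = {
            "risks": self.risks,
            "location": self.location,
            "access": self.allowed_roles,
            "quality": self.quality_rules,
        }
        return [step for step, answer in checks.items() if not answer]
```

A record created without any quality rules would report "quality" from `missing_steps()`, which gives a data owner a simple checklist before the data set is moved to another cloud.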

Data Management Risks in the Multi-Cloud Environment

The traditional contracts that worked in typical telecom network services to mitigate security breaches or other types of noncompliance events have failed to deliver the goods for the cloud. Highly scaled, shared, and automated IT platforms, such as the cloud, can hide the geographic location of data from both the customer and the service provider. This can give rise to regulatory violations. Contracting for the cloud is still in its infancy, and until some litigation sheds light on regulatory issues and sets precedents for future cases, cloud data breach issues will remain unresolved.

Moreover, data aggregation will increase the potential data risk, as more valuable data will occupy a common storage location. On the flip side, multi-cloud environments offer more transparency through event logging, and enterprise-wide solutions via automation tools. Fixes, once identified, can be deployed quickly across cloud networks. In recent years, risk management strategies specifically for the cloud have emerged, and these now need to be tested in multi-cloud environments.

An IBM blog post describes a typical multi-cloud environment, where enterprises are seeking uninterrupted data-management services without losing security and control. In this multi-cloud mode, vendor lock-in does not exist; enterprise customers are free to choose applications and services across multiple cloud infrastructures to meet their daily business needs.

The multi-cloud governance environment is today’s operational reality in high-profile sectors like healthcare, banking, and insurance. These businesses were never built with the cloud in mind, and hence retrofitting a cloud or multi-cloud infrastructure onto a well-established business model is that much more difficult. The reality of the multi-cloud is that it often encompasses fragmented technologies sitting on top of a complex architecture, which makes IT governance and compliance a nightmare.

Data Governance for Multi-Cloud

Data Governance in the Cloud offers a reminder that while most business owners get bogged down with the technical nitty-gritty of cloud migration, the real challenge is Data Governance, as it determines the success of a cloud migration. The major pain points of a cloud migration are data access, compliance, and security. For enterprises, the challenge is knowing where the data resides in a multi-cloud environment.
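
To make the "where does the data reside" question concrete, here is a minimal sketch of a location inventory with a residency check. It assumes the organization can already list each stored copy with its provider and region; the `Copy` type, the provider names, and the region strings are hypothetical, and a data set with no residency rule is deliberately flagged rather than allowed.

```python
from collections import defaultdict
from typing import NamedTuple


class Copy(NamedTuple):
    """One stored copy of a data set (hypothetical inventory entry)."""
    dataset: str
    provider: str
    region: str


def locations_by_dataset(copies):
    """Index every (provider, region) pair where each data set is stored."""
    index = defaultdict(set)
    for c in copies:
        index[c.dataset].add((c.provider, c.region))
    return dict(index)


def residency_violations(copies, allowed_regions):
    """Return copies held in a region not allowed for their data set.

    Data sets missing from allowed_regions are treated as violations,
    so an unclassified data set is never silently accepted.
    """
    return [
        c for c in copies
        if c.region not in allowed_regions.get(c.dataset, set())
    ]
```

Given an inventory in which a patient data set is allowed only in one region but also has a copy with a second provider elsewhere, `residency_violations` would return just that second copy.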

A Computer Weekly feature article titled Managing Data in a Multi-Cloud World points out that one of the major hurdles faced by multi-cloud service providers is designing a “seamlessly integrated Data Management strategy” for making varied cloud services across cloud systems work together.

Data Governance Best Practices for the Multi-Cloud

In the current, cloud-first world, organizations have realized that strong Data Governance policies must be developed, especially if data is constantly moving around in a multi-cloud environment. The data best practices, which have been tried and tested in a regular cloud environment, are now mapped for the multi-cloud setup:

  • Identifying and allocating responsibilities to data owners, who become appointed managers for specific types of data sets by virtue of their subject matter expertise. These data owners are the creators and implementers of data policies in a multi-cloud setup.
  • Setting data lifecycles to ensure useless and redundant data is cleaned out for efficiency and cost advantages.
  • Improving metadata management through consistent data-management policies implemented across an organization.
  • Ensuring duplicate data sets are uniformly implemented in a multi-cloud environment by preserving data-security information with the data set.
  • Ensuring replica data sets are tracked and audited from time to time to remove redundancies across different cloud systems.
  • Setting clear data integration policies to document the types of data that can be combined, the processes required for such integrations, and the associated security issues for the combined data.
  • Developing firm policies about preserving “original data” before they are analyzed and transformed. The value of preserving original data cannot be overstated in a large organization, where different groups of users may need to revisit original data sets.
  • As smart data analysis often involves the creation and use of “data models,” these models have to be properly archived and managed for the long term, especially if the models are moved from one cloud system to another. Thus, model management must also be addressed in a comprehensive multi-cloud environment.
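
Two of the practices above, data lifecycles and keeping security information consistent across replicas, lend themselves to a simple automated audit. The following is a sketch under stated assumptions: it presumes each replica is catalogued with a creation date and a security label, and the `Replica` type, label values, and retention period are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Replica:
    """One catalogued copy of a data set in one cloud (hypothetical)."""
    dataset: str
    cloud: str
    created: date
    security_label: str


def expired(replicas, retention_days, today):
    """Return replicas older than the retention period (lifecycle cleanup)."""
    cutoff = today - timedelta(days=retention_days)
    return [r for r in replicas if r.created < cutoff]


def label_mismatches(replicas):
    """Return data sets whose replicas carry different security labels."""
    labels = {}
    for r in replicas:
        labels.setdefault(r.dataset, set()).add(r.security_label)
    return sorted(name for name, seen in labels.items() if len(seen) > 1)
```

Running both checks on a schedule would give data owners a list of replicas to retire and a list of data sets whose security information drifted when they were copied between clouds.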

A Forbes Council post has proposed the idea of an operational data layer to address the data gap between on-premises and cloud-hosted data.
