More than 15 years ago, Amazon launched Amazon Web Services. Just two years later, more than 100 applications had been built on top of the platform. Fast-forward to today, and you know where this story winds up – nearly every enterprise company in the world deploys applications in the cloud in some form or fashion. Yet, challenges remain, particularly for companies that maintain a large on-premises footprint. How best to migrate critical business data and metadata to the cloud to support ongoing operations and analytics remains elusive for many.
Despite increased awareness of the strategic value of cloud migration in recent years, many businesses are still taking counter-intuitive and unnecessarily costly approaches. Businesses tend to break their approach into pieces: “I’ll just move new information to the cloud, and not worry about the current data that lives on-prem,” or “I’ll think about governance and security later.” While this approach might help limit immediate concerns about budget and create clarity around scope, it can be cumbersome in the long term and delay the potential return on investment of a cloud migration. Plus, if you leave some of your data on-premises, you ultimately limit its use for analytics in your modern data stack. The reality is, to get the most out of your modern data stack, you need a clear understanding of the use cases that are pulling you to the cloud and the data you will need to be successful.
Embrace an Agile Method
If you ask business leaders why they aren’t making a full move from on-premises to the cloud, most will cite concerns over governance. While it’s true that governance is critical for securing data and ensuring proper use, the need to truly embrace agile data governance goes beyond that. Data has the power to keep business running and thriving amid periods of disruption, such as the one the economy is facing right now. Modern businesses simply cannot afford to get left behind because data is stuck on governance issues.
Investing upfront in agile data governance, or reinvesting in an existing process, prevents bottlenecks in data and analysis and enables you to use more modern tools that hasten return on investment. Plus, it fosters collaboration among data teams and allows enterprises to capture knowledge as they work. In a cloud migration specifically, this makes it easier for data producers to understand why the business wants to move to the cloud and what data-driven initiatives it wishes to run in its modern data stack. Having this knowledge enables data engineers to create a prioritized backlog of on-prem data assets and queue them up for migration.
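As a rough illustration of what a prioritized backlog could look like, here is a minimal Python sketch. The `DataAsset` fields, the scoring rule (number of priority use cases served, then query volume), and the sample asset names are all assumptions made for this example, not a prescribed method; a real team would likely weigh other signals too.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataAsset:
    """Hypothetical record describing one on-prem data asset."""
    name: str
    use_cases: frozenset[str]  # initiatives that need this asset
    monthly_queries: int       # rough usage signal (assumed to be available)


def build_backlog(assets, priority_use_cases):
    """Order assets for migration.

    Assets serving more priority use cases come first; ties are broken
    by query volume. Assets serving no priority use case go last.
    """
    def score(asset):
        served = len(asset.use_cases & priority_use_cases)
        return (served, asset.monthly_queries)

    return sorted(assets, key=score, reverse=True)


assets = [
    DataAsset("orders", frozenset({"churn_model", "revenue_dashboard"}), 900),
    DataAsset("legacy_logs", frozenset(), 5000),
    DataAsset("customers", frozenset({"churn_model"}), 1200),
]
backlog = build_backlog(assets, frozenset({"churn_model", "revenue_dashboard"}))
for position, asset in enumerate(backlog, start=1):
    print(position, asset.name)
```

Note that the heavily queried `legacy_logs` asset still lands last here, because in this sketch the use cases pulling you to the cloud outrank raw usage.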
Get Analytics in Order
Whether you’re starting fresh on a cloud migration process or trying to level up something half-baked, organization and consistency are key. Ask big questions to establish metrics that will guide the immediate process and define what future success looks like. Then structure your data into a consistent architectural style to ensure smoother sailing.
You may want to organize based on the type of architecture you already have in place. Over time, you can think about layering data models. For example, maybe your data is arranged by business unit, but in the future you want to consolidate around entities such as customers, products, and orders. Maybe you use a star schema today but want to layer on big wide tables for easier analytics in the future. No matter what you choose, applying an architectural style consistently will ensure the platform is usable not just for data producers but for data consumers.
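To make the idea of layering a wide table on top of a star schema concrete, here is a small sketch using Python's built-in `sqlite3` module. The table names (`fact_orders`, `dim_customer`, `dim_product`), the columns, and the sample rows are invented for illustration; in a real cloud warehouse the wide layer might be a materialized table rather than a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star schema: one fact table keyed to two dimension tables (hypothetical).
CREATE TABLE dim_customer (
    customer_id INTEGER PRIMARY KEY, customer_name TEXT, region TEXT);
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE fact_orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    quantity INTEGER,
    amount REAL);

INSERT INTO dim_customer VALUES (1, 'Acme', 'EMEA'), (2, 'Globex', 'AMER');
INSERT INTO dim_product VALUES (10, 'Widget', 'Hardware'), (11, 'Gizmo', 'Hardware');
INSERT INTO fact_orders VALUES
    (100, 1, 10, 3, 30.0), (101, 2, 11, 1, 25.0), (102, 1, 11, 2, 50.0);

-- Wide layer: the joins are done once, so consumers query a single object.
CREATE VIEW wide_orders AS
SELECT f.order_id, c.customer_name, c.region,
       p.product_name, p.category, f.quantity, f.amount
FROM fact_orders AS f
JOIN dim_customer AS c ON c.customer_id = f.customer_id
JOIN dim_product AS p ON p.product_id = f.product_id;
""")

# A data consumer can now aggregate without knowing the join keys.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM wide_orders GROUP BY region ORDER BY region"
).fetchall()
print(rows)
```

The star schema stays in place for data producers, while the wide layer gives consumers a simpler surface for analytics.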
Use the Right Tools for the Process
Even the best approach won’t be fully successful without investment in the right tools. Of course, this area has gotten and will continue to get more challenging for many businesses as recession and inflation concerns put pressure on budgets. However, this new reality doesn’t need to restrict cloud migration. Understanding the process end-to-end will help you prioritize the right tools, increase efficiency, and create real business value.
Many of these choices will depend on your use cases. As your budget expands and you scale your migration, data governance platforms and tools for data quality, profiling, lineage, and more can be brought online as they become strategic for your priorities. For example, if you are trying to identify complex dependencies and the most heavily used assets, lineage would be key. Or, if you are trying to keep track of the data that you have and ensure it also shows up in the new environment, metadata inventory and comparison analysis are obvious priorities. Regardless of your short- and long-term goals, a data catalog is the glue that holds metadata together, ensuring it is discoverable and searchable, can be analyzed, and empowers self-service.
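The metadata inventory and comparison idea can be sketched in a few lines of Python. This example assumes each environment's inventory has already been exported as a dictionary mapping table names to column names and types; that format, the `compare_inventories` function, and the sample tables are hypothetical, and a data catalog would typically supply this information in its own format.

```python
def compare_inventories(source, target):
    """Compare two {table: {column: type}} inventories.

    Reports tables missing from the target, plus columns that are
    missing or whose declared type changed in tables present in both.
    """
    missing_tables = sorted(set(source) - set(target))
    column_issues = {}
    for table in sorted(set(source) & set(target)):
        src_cols, tgt_cols = source[table], target[table]
        missing_cols = sorted(set(src_cols) - set(tgt_cols))
        type_changes = {
            col: (src_cols[col], tgt_cols[col])
            for col in src_cols
            if col in tgt_cols and src_cols[col] != tgt_cols[col]
        }
        if missing_cols or type_changes:
            column_issues[table] = {
                "missing_columns": missing_cols,
                "type_changes": type_changes,
            }
    return {"missing_tables": missing_tables, "column_issues": column_issues}


# Hypothetical inventories: on-prem source vs. cloud target.
on_prem = {
    "customers": {"id": "INTEGER", "email": "VARCHAR", "created": "DATE"},
    "orders": {"id": "INTEGER", "amount": "DECIMAL"},
    "invoices": {"id": "INTEGER"},
}
cloud = {
    "customers": {"id": "INTEGER", "email": "VARCHAR"},
    "orders": {"id": "INTEGER", "amount": "FLOAT"},
}
report = compare_inventories(on_prem, cloud)
print(report)
```

A report like this gives the migration team a concrete checklist of what has not yet arrived in the new environment, rather than relying on spot checks.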
Leading the Data People
As any data leader knows, one of the most challenging parts of any migration process is getting the right stakeholders involved at the right time. To be truly successful, all stakeholders should be involved in the cloud migration and should work with tangible analysis, not just hypotheticals. Picking analytics use cases that align with tasks consumers actually need to accomplish, and setting clear deadlines, can help you measure value and prevent you from boiling the ocean on the first go.
Plus, one of the benefits of building your cloud migration foundation on a data catalog is that it enables coordinated, consistent, and centralized work among parties. Data consumers can work with data in real time to evaluate how successful models are at answering questions. Stewards can document business glossaries and metrics definitions alongside the data. Because all this, and more, happens on one platform, coordination is simpler and knowledge debt is less likely to build up in the future.
Ultimately, it’s never too late to engage in a cloud migration with the right agile data governance approach, analytics methods, tools, and people process. With the need for data growing exponentially, engaging in the process, even if it is a piecemeal approach, will pay dividends.