
Open Source, Cloud, and Data: How to Plan Ahead

By Matt Yonkovit


For most Data Management projects today, open source will be included in some form or another. From the databases that store the data, to the tools that manage and protect it, to the analytics projects used to interrogate or display it, open source plays a vital role.

However, that role is changing. Developers are using more open source in their applications and in how they handle data, but the way they consume those tools is evolving. Rather than building their own IT and application pipelines from scratch, developers now have a range of implementation options: complete cloud services available as one-click installs, managed services, and deployments on their own cloud instances. Developers have more choice than ever.

This choice means that adopting open source is easier. However, it undercuts one of the big value propositions for companies in the open source community that provide paid support and services. When the cloud can automate deployment and make things so simple, why would you pay for expertise separately from the cloud service you run on?

Using Cloud in the Right Way

The first point to make is that there is no “one size fits all” approach to open source and cloud. For developers, the ability to get the services they need up and running successfully in the cloud is a huge selling point. However, it is important to consider some of the trade-offs that exist in these decisions first, rather than just settling on the easiest option.

Let’s look at your choices in more detail.

Implementing your own database instance in the cloud can help you get moving quickly; for example, you don't need to buy hardware. However, you'll be responsible for managing the installation, choosing the size of the instance you buy, and deciding how you store data over time. This can lead to unforeseen costs, such as having to upgrade your instance size because you didn't estimate your capacity needs accurately at the start. It can also lead to expensive migrations: resize your instance a couple of times in a year, and you could end up paying close to double what you estimated.
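To make that concrete, here is a back-of-the-envelope sketch in Python. The hourly rates are purely illustrative placeholders, not real cloud pricing; the point is how quickly one mid-year resize distorts an annual estimate:

```python
# Rough cost projection for a self-managed cloud database instance.
# The hourly rates below are illustrative placeholders, not real pricing.

HOURS_PER_MONTH = 730  # average hours in a month

def monthly_cost(hourly_rate: float) -> float:
    """On-demand cost of running one instance for a month."""
    return hourly_rate * HOURS_PER_MONTH

# Plan: a mid-range instance at $0.20/hour for a full year.
planned = monthly_cost(0.20) * 12

# Reality: capacity was underestimated, so after two months the instance
# is resized to one twice as large at $0.40/hour.
actual = monthly_cost(0.20) * 2 + monthly_cost(0.40) * 10

print(f"Planned annual spend: ${planned:,.0f}")  # $1,752
print(f"Actual annual spend:  ${actual:,.0f}")   # $3,212 -- roughly 1.8x the plan
```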

Planning your cloud instance choice ahead can, therefore, help you manage your costs over time. Having more control over your instance and approach can also be great if you want to tweak for performance.

Another alternative is the managed database services provided by your cloud provider. Coming under the catchy title of "database as a service," or DBaaS, these offerings mean you can get started quickly and don't have to worry about the underlying infrastructure. Instead, the cloud provider, or the company you work with, handles all of that on your behalf. This should let you spend less time on database administration tasks and more time on areas that matter, like how your application works and adding new functionality.
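To illustrate how little setup is involved, here is a minimal sketch assuming AWS RDS and the boto3 Python SDK; the instance name, class, and credentials are placeholders, and other providers' DBaaS APIs look broadly similar:

```python
import boto3

rds = boto3.client("rds")

# One API call asks the provider for a managed MySQL instance; the provider
# takes care of the underlying servers, storage, and operating system.
rds.create_db_instance(
    DBInstanceIdentifier="app-db",       # placeholder instance name
    Engine="mysql",
    DBInstanceClass="db.t3.medium",      # placeholder instance class
    AllocatedStorage=100,                # storage in GiB
    MasterUsername="admin",
    MasterUserPassword="change-me",      # use a secrets manager in practice
)
```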

The challenge with DBaaS is that not all DBaaS offerings are the same — some are “full service” and include all the management tasks like backup and recovery management, security planning, and access control. Others will focus on giving you a running database instance, but those additional tasks remain your responsibility. Whatever you pick, make sure you understand what is included and — equally — what is not.
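One practical way to check is to inspect what the service actually configured rather than assuming defaults. Sticking with the AWS RDS and boto3 assumption from the sketch above:

```python
import boto3

rds = boto3.client("rds")

# Pull the settings the service actually applied to the instance.
db = rds.describe_db_instances(DBInstanceIdentifier="app-db")["DBInstances"][0]

# A retention period of 0 means automated backups are disabled --
# in other words, backup management is still your job.
print("Backup retention (days):", db["BackupRetentionPeriod"])
print("Storage encrypted:      ", db["StorageEncrypted"])
print("Multi-AZ failover:      ", db["MultiAZ"])
```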

DBaaS offerings concentrate on making it as easy as possible to get up and running. What they don't prioritize is the ability to change or tune your database instance for performance. That trade-off can be a good thing: if you are a developer, you want those nitty-gritty jobs taken off your plate so you can focus on code. But it can also cost more over time than managing the database yourself and generating savings through better performance. DBaaS offerings are designed to suit the majority of potential customers rather than to provide full autonomy.
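As an example of that constraint (again a sketch against AWS RDS via boto3; the parameter group name is a placeholder), tuning on a managed service goes through whatever parameter mechanism the provider exposes, not through direct access to the server's configuration files:

```python
import boto3

rds = boto3.client("rds")

# On a managed service you don't edit my.cnf directly; you change settings
# through the provider's API, and only the settings it chooses to expose.
rds.modify_db_parameter_group(
    DBParameterGroupName="app-db-params",  # placeholder parameter group
    Parameters=[{
        "ParameterName": "max_connections",
        "ParameterValue": "500",
        "ApplyMethod": "pending-reboot",   # some changes require a restart
    }],
)
```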

Collective or Individual Needs

IT trends tend to be cyclical. Computing started out heavily centralized on mainframes, moved out to minicomputers, swung back toward centralization with client-server and three-tier applications, and has now moved to a more distributed model built around cloud and microservices designs. The sheer scale at which new applications will have to run means that distributed computing models are the future.

The pattern tends to be that companies need more customization around their applications or IT in order to compete, and then automation and standardization come back in to reduce cost. This frees up budgets until companies want to improve how competitive they are and differentiate what they do. That drives more customization, and the cycle begins again.

While the cloud makes it easier for everyone to implement applications faster, the need to customize doesn't go away. In fact, this is where open source has the potential to win in the shift to the cloud. Microservices and containerization rely on the cloud to abstract away the infrastructure underneath. As long as everything is working well, that abstraction delivers value. But when things go wrong, or performance isn't what you expect, you need real insight into the stack to fix it. This is where open source knowledge and understanding are vital.

Over time, the shift to the cloud will see some elements of Data Management and databases become commoditized. Simply running a database well is not the competitive advantage it once was. Instead, the emphasis will be on getting access to deeper knowledge around how to make the most of the cloud and how to pick the right approach. This is where open source can provide value in the future, helping companies differentiate and thrive.
