By Margaret Scovern.
Humans are arguably the most adaptable creatures on the planet: on both a micro and macro scale, we evolve to meet new challenges and circumstances all the time. It makes sense, then, that we want the tools we build and use every day to mimic that flexibility. If we can adapt to environmental changes, take on new challenges, and stay relevant, shouldn't we expect the same of our technology platforms?
But when it comes to implementing a Data Pipeline, data technologists struggle to build those attributes into the design, resulting in a pipeline that may be effective at the time of its creation but unsustainable and less relevant in the future.
The key to solving this problem is having flexibility in the Data Pipeline itself. The world of data is constantly changing and expanding, and without that flexibility in a Data Pipeline design, businesses are stymied in what they can truly achieve with their data.
The Truth: Rigidity Only Limits You
Sacrificing adaptability in your Data Pipeline limits your options, and data technologists know better than anyone how quickly the field changes: new data sources appear, businesses want to combine data in innovative ways, and the amount of data collected continues to grow.
Some businesses are pushing the envelope in this regard; Domino’s Pizza, for example, is combining data from Amazon Echo, SMS, Pebble, Android, and Twitter to measure customer information more consistently. Imagine managing complexity like that without a decoupled, flexible pipeline — it would create a huge headache for the company’s infrastructure, engineering, and analytics teams.
A way to avoid these headaches is to prioritize scalability and adaptability. By using an effective scalable and adaptable pattern to build your Data Pipeline, you can preserve flexibility while choosing the tooling that maps to your organization's competencies and financial goals. So, let's look at a few things you can keep in mind while pursuing that goal.
Three Factors to Consider When Adopting a Flexible Data Pipeline
- The Data Pipeline Pattern
You may have heard that ETL (extract, transform, load) is dead. With the advancement of Cloud Services and the low cost of Cloud Storage, the siloed approach of extracting, transforming, and loading data looks inflexible and costly. In truth, ETL isn't dead so much as outgrown: more flexible patterns and approaches now provide more robust results.
For example, a design pattern that includes ingest, model, enhance, transform, and deliver provides much more flexibility in its decoupling of Data Pipeline activities. Think about purchasing a pre-made meal at the supermarket. Let’s say you decide on three different pre-made meals to purchase. Now, think of the ingredients that make up those three meals. Rather than having three pre-made meals, you really have the ingredients that could make, say, nine different meals. When the ingredients are decoupled, you have much more flexibility in the combinations and output.
Yes, choosing a more decoupled pattern for your Data Pipeline makes things more complex, but the flexibility and business agility you achieve as a result are worth it.
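To make the pre-made-meal analogy concrete, here is a minimal, hypothetical sketch of the ingest, model, enhance, transform, and deliver pattern. Each stage is an independent function, so the same ingested and modeled data can be recombined with different downstream steps for different consumers. All names, schemas, and data here are illustrative assumptions, not part of any specific product or the author's implementation.

```python
# Hypothetical decoupled pipeline: each stage is independent and reusable,
# and a "pipeline" is just a composition of stages.

def ingest(raw_records):
    """Pull raw records in as-is; no transformation yet."""
    return list(raw_records)

def model(records):
    """Map raw records onto a common schema."""
    return [{"id": r["id"], "value": r["val"]} for r in records]

def enhance(records, lookup):
    """Enrich records with reference data (here, a channel label)."""
    return [{**r, "label": lookup.get(r["id"], "unknown")} for r in records]

def transform(records):
    """Shape records for one specific consumer (here, doubling values)."""
    return [{**r, "value": r["value"] * 2} for r in records]

def deliver(records, sink):
    """Hand records to a destination (here, an in-memory list)."""
    sink.extend(records)
    return sink

# Illustrative data: two raw events and a lookup table.
raw = [{"id": 1, "val": 10}, {"id": 2, "val": 20}]
lookup = {1: "store", 2: "web"}

sink = []
deliver(transform(enhance(model(ingest(raw)), lookup)), sink)
print(sink)
```

Because the stages are decoupled, a second consumer could reuse `ingest`, `model`, and `enhance` unchanged and swap in a different `transform` and `deliver`, just as decoupled ingredients can be recombined into many meals.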
- The Skill Set and Size of the Data Pipeline Team
Assess your infrastructure, engineering, and Data Architecture teams. Their skills and traits will often define how you should proceed. To continue with the meal analogy: You wouldn't want to attempt a béarnaise sauce recipe without some experience in the finesse of emulsifying egg yolks and butter over heat. The same holds true for your infrastructure, engineering, and Data Architecture teams. You cannot ask your teammates to design and optimize a decoupled Data Pipeline and expect quality results if some team members lack relevant experience or knowledge of the decoupled approach.
Meet with team members and get a clear understanding of the skills, pace, and needs of each team and how they will complement and impact each other. Uncover their backgrounds and future ambitions so you can determine how they can best be leveraged to bring value to the implementation of your agile Data Pipeline. Ensure the discovery process is iterative and inclusive as you design and build the pipeline; that approach also builds compatibility and coherence across teams.
- The Speed at Which Your Organization Can Adapt
Depending on your company’s size, structure, and culture, creating a more flexible platform for data could be relatively easy or much more challenging and time-consuming.
One more time with the meal analogy: Imagine that your organization is like your family. Is your family used to eating spicy, hot food, or would hot spices create a disruption? Do you need to consider gluten-free recipes and ingredients? These differentiators should guide your dinner decisions.
Similarly, put time into analyzing your work environment's differentiators. The competitive landscape, your organizational structure, and your company's past experiences will all affect the rate at which you can implement a flexible Data Pipeline. To get a clearer picture and adequately prepare, ask a few questions: How do we usually go about a technology transformation? Who would be an effective executive sponsor of this initiative? What is the quantifiable value to the team members of making a change? Why is this approach good for our company? You'll have a clearer idea of how the project will go once you understand the expectations of the key stakeholders.
More Flexibility, More Potential
By building a decoupled, flexible Data Pipeline, you can adapt to change. In doing so, you will end up reducing costs, increasing efficiency, and allowing your organization to react quickly to new opportunities. With a growing number of cloud solutions for managing these needs, the time is now to think about how a flexible Data Pipeline can make all the difference to your business in the years ahead.