By Greg Nist.
When I think about data orchestration, I always reflect on the key root word: orchestra.
I’ve had the wonderful opportunity to watch a live symphony several times. I always leave amazed at the complexity and diversity of the moving parts and how they are seamlessly integrated, governed by the music itself and the direction of the conductor.
In many ways, the symphony is a great analogy to apply to modern computer systems. These systems require effective integration of distinctly different parts, precise scheduling, data movement, and overarching governance.
Data orchestration is critical for organizations so they can effectively use their data to deliver innovative services that run their businesses. It’s also important so they can accurately see their business through a lens that provides an integrated view for reporting and analytics. All of this occurs while ensuring that they meet strict compliance requirements and are responsible stewards of sensitive data.
So, in a fast-paced world of ever-changing data, how can an organization effectively manage the orchestration, or movement, of that data? In the past, data orchestration was largely the job of developers writing code in a language like Java to move data between systems. Each time a new data movement was needed, the IT team wrote more custom code and verified that security and governance aligned with policy.
Writing code to manage data orchestration is time-consuming. The custom code that is created is often difficult to reuse, and it requires a staff with advanced and specialized skills. These limitations have motivated modern organizations to evolve and embrace new technologies to manage data movement, provide interfaces with commonly used enterprise systems, and emphasize security and governance – all without writing code.
One such technology that fits this objective is Apache NiFi.
Apache NiFi is an open-source software project designed to automate the flow of data between software systems. NiFi provides a web interface in which users build a dataflow that is then run by the Flow Controller. Think of the Flow Controller as the conductor of the orchestra. It manages the scheduling, execution, and movement of data between the different Processors that make up the overall flow.
A NiFi Processor is like one of the sections of the orchestra such as the strings, woodwinds, or brass. A Processor is designed to perform a specialized task. For example, an organization might need to develop services that require them to integrate data from multiple systems. A NiFi flow could include Processors that get data out of a legacy system like a relational database management system (RDBMS), convert the data to JSON, and then put it into a flexible NoSQL database that will serve as their integrated data hub. There are many Processors available in the NiFi ecosystem, each designed for interfacing with commonly used enterprise systems or data formats.
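To make the three steps concrete, here is a minimal Python sketch of what such a flow does conceptually. It is not NiFi code and uses no NiFi API; in NiFi these steps would be configured Processors rather than written functions. An in-memory SQLite database stands in for the legacy RDBMS, a plain dictionary stands in for the NoSQL data hub, and the table, column, and function names are all hypothetical.

```python
import json
import sqlite3


def build_legacy_db():
    """Create a tiny in-memory database so the sketch is self-contained."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
        [(1, "Ada", "London"), (2, "Grace", "New York")],
    )
    conn.commit()
    return conn


def extract_rows(conn):
    """Step 1: get rows out of the (stand-in) legacy relational database."""
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    cursor = conn.execute("SELECT id, name, city FROM customers ORDER BY id")
    return [dict(row) for row in cursor.fetchall()]


def to_json_documents(rows):
    """Step 2: convert each row to a JSON document."""
    return [json.dumps(row, sort_keys=True) for row in rows]


def load_documents(documents, hub):
    """Step 3: put each JSON document into the (stand-in) data hub, keyed by id."""
    for document in documents:
        record = json.loads(document)
        hub[record["id"]] = record
    return hub


if __name__ == "__main__":
    rows = extract_rows(build_legacy_db())
    hub = load_documents(to_json_documents(rows), {})
    print(hub)
```

Each function does one specialized job and knows nothing about the others, which is the property that lets a Processor be reused across many different flows.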
Each Processor is configured to connect and behave in a certain way. This configuration is the sheet music that tells the Processor exactly what it should do when called upon by the Flow Controller. For example, a Processor could be configured to run a specific SQL query to return data from an RDBMS.
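The split between a general-purpose Processor and its configuration can be sketched in a few lines of Python. This is an illustration only, not NiFi's API: the `QueryProcessor` class, the `QueryConfig` settings, and the sample `orders` table are hypothetical, and SQLite stands in for the RDBMS.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryConfig:
    """Hypothetical settings: the 'sheet music' the processor follows."""
    sql_query: str        # the query to run when triggered
    max_rows: int = 100   # upper bound on rows returned per run


class QueryProcessor:
    """Generic behavior; everything specific comes from the configuration."""

    def __init__(self, config):
        self.config = config

    def on_trigger(self, conn):
        """Run the configured query and return at most max_rows rows."""
        cursor = conn.execute(self.config.sql_query)
        return cursor.fetchmany(self.config.max_rows)


def build_orders_db():
    """Create a small in-memory table so the sketch is self-contained."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany(
        "INSERT INTO orders (id, status) VALUES (?, ?)",
        [(1, "open"), (2, "shipped"), (3, "open")],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_orders_db()
    # Same processor class, two configurations, two different behaviors.
    open_orders = QueryProcessor(
        QueryConfig("SELECT id FROM orders WHERE status = 'open' ORDER BY id")
    )
    first_order = QueryProcessor(
        QueryConfig("SELECT id FROM orders ORDER BY id", max_rows=1)
    )
    print(open_orders.on_trigger(conn))
    print(first_order.on_trigger(conn))
```

Changing what the processor does means changing its settings, not its code, which is the idea behind configuring Processors through an interface.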
A Connection in the NiFi Flow Controller is configured to define where the output of a specific Processor should go next. For example, the data coming back from the SQL query against the RDBMS might get passed to a Processor that then converts that data to JSON. That JSON data might then get sent to another Processor where it is loaded into another system.
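The routing role of a Connection can be pictured as a queue sitting between two processing steps. The Python sketch below is a toy model under that assumption, not NiFi's implementation: the step functions, the `run_flow` loop, and the sample rows are all hypothetical, and a list stands in for the target system.

```python
import json
from collections import deque


def build_example_flow():
    """Return three stand-in processors, their connections, and a target store."""
    store = []  # stand-in for the system that receives the JSON

    def query_step(_trigger):
        # Pretend these rows came back from a SQL query.
        return [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]

    def to_json_step(row):
        return [json.dumps(row, sort_keys=True)]

    def load_step(document):
        store.append(document)
        return []  # end of the flow: nothing to pass on

    processors = {"query": query_step, "to_json": to_json_step, "load": load_step}
    # Each connection says where one processor's output goes next.
    connections = {"query": "to_json", "to_json": "load"}
    return processors, connections, store


def run_flow(processors, connections, start, seed=None):
    """Move items along the connections until every queue is empty."""
    queues = {name: deque() for name in processors}
    queues[start].append(seed)
    while any(queues.values()):
        for name, process in processors.items():
            target = connections.get(name)
            while queues[name]:
                item = queues[name].popleft()
                for output in process(item):
                    if target is not None:
                        queues[target].append(output)


if __name__ == "__main__":
    processors, connections, store = build_example_flow()
    run_flow(processors, connections, start="query")
    print(store)
```

Because the routing lives in `connections` rather than inside the steps, the same steps can be rewired into a different order or pointed at a different destination without being modified.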
Designed for the Enterprise
NiFi is not “bleeding-edge” technology: its development began in 2006 at the U.S. National Security Agency, and it was released as open source through the Apache Software Foundation in 2014. Over the years, it has evolved as more Processors have been developed and added to the ecosystem. For example, there are now Processors built to interface not only with legacy systems such as RDBMSs but also with modern NoSQL databases and other emerging technologies. And from the beginning, NiFi was built with enterprise needs in mind. It is designed for scale, leverages encryption, thoroughly tracks data provenance, is configurable via an interface, and is extensible when needed for custom requirements. All of these capabilities make NiFi an excellent technology to consider when looking at modern data orchestration tooling. And for the modern enterprise needing to extract actionable insights from its data at ever-increasing speed, modern data orchestration is harmony in action, much like a good symphony.