Polyglot Persistence is a process for storing data in the best database available, no matter the data model and data storage technology. This process is based on the understanding that different data stores will handle certain types of data better than others. Polyglot Persistence is, unfortunately, not available for easy “downloading,” but must be designed for the unique Data Architecture of each individual enterprise. This “storage philosophy” is a recent development and still needs to evolve technologically.
The word “polyglot” combines the Greek roots “poly” (many) and “glot” (tongue or language), and entered English during the 17th century. A polyglot is someone who speaks and writes in multiple languages. The term “Polyglot Programming” appeared in 2006 to describe the understanding that certain programming languages are excellent at solving specific problems while others are not, and that programs should therefore include multiple languages.
The “persistence” part of Polyglot Persistence refers to data that is saved in a safe way or location so that it outlives the program that created it. A Polyglot Persistence application communicates with multiple types of databases and uses the most appropriate database to store and process each kind of data.
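As a rough illustration of this idea, the sketch below routes each record to a store chosen by its kind. The `InMemoryStore` class, the `ROUTES` table, and the record kinds are all hypothetical stand-ins kept in memory; a real system would wrap actual database clients behind a similar lookup.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Stand-in for a real database client (hypothetical, for illustration)."""
    name: str
    rows: dict = field(default_factory=dict)

    def save(self, key, value):
        self.rows[key] = value

    def load(self, key):
        return self.rows[key]


# Hypothetical stores, each standing in for a different storage technology.
relational = InMemoryStore("relational")
document = InMemoryStore("document")
key_value = InMemoryStore("key-value")

# Assumed routing table: which kind of data goes to which store.
ROUTES = {
    "order": relational,    # transactional data
    "product": document,    # flexible, nested catalog entries
    "session": key_value,   # short-lived lookups by key
}


def store_for(kind):
    """Return the store assigned to this kind of data."""
    try:
        return ROUTES[kind]
    except KeyError:
        raise ValueError(f"no store configured for kind {kind!r}") from None


def save(kind, key, value):
    store_for(kind).save(key, value)


def load(kind, key):
    return store_for(kind).load(key)


save("order", 1001, {"total": 59.90})
save("session", "abc123", {"user": "ana"})
print(store_for("order").name)    # relational
print(load("session", "abc123"))  # {'user': 'ana'}
```

The routing table is the one place where the “which database for which data” decision is recorded, so changing that decision later does not touch the calling code.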
Polyglot Persistence came about as an extension of NoSQL. The rise of NoSQL led to software capable of interacting with multiple databases, which in turn supported the idea of Polyglot Persistence. A NoSQL database is not required for the operation of Polyglot Persistence (although they do work well together). Polyglot Persistence can be applied to SQL, NoSQL, or hybrid database systems.
Many larger businesses are already using multiple data storage technologies for different kinds of data. As the use of business computers expanded to include older and newer computers, with a variety of relational databases, Data Warehouses, and other equipment, a hybrid data storage environment developed, bringing with it storage problems. Adding to the chaos, much of the data currently being used is “non-relational,” and cannot easily be processed by the relational systems that are currently prevalent.
The belief that one or two databases can fulfill all of a business’s needs is outdated. This approach is inefficient, a little archaic, and may waste money. There are a number of options available that support both NoSQL and SQL data stores, including ones optimized for both analytic and operational workloads, as well as both open-source databases and commercial database products.
The Benefits of Polyglot Persistence
Polyglot Persistence provides support for multiple database models and is able to utilize the best data model for the job. The Polyglot Persistence philosophy comes with its own “technical” strengths and weaknesses. Polyglot Persistence provides some key benefits:
- Simplifies Operations: Different databases that are left to coordinate with one another ad hoc make for complicated operations and cause fragmentation. Polyglot Persistence simplifies operations by deliberately selecting the best component for each situation, helping to eliminate fragmentation.
- Faster Response Time: The application can leverage the strengths of each database it uses, improving response times.
- Efficiency: Elasticsearch ranks search results by “relevance” as a core feature, which is not what MongoDB is primarily designed for. A Polyglot Persistence application would “automatically” assign relevance-oriented processing to the search engine.
The Disadvantages of Polyglot Persistence
Utilizing a Polyglot Persistence model can be both difficult and expensive. Specialists often need to be brought in to integrate the different databases. These expenses should be given serious consideration when evaluating long-term goals. Other problems to consider are:
- Permanent IT staff will need training on the “new” systems.
- Maintenance and repairs can be time-consuming, because running tests is difficult. If data is sharded across many databases, testing the data layers can become complicated. Debugging is, of course, also quite time-consuming.
- Making sure a system with multiple components is fault-tolerant is difficult, to say the least. Integrating multiple databases requires significant operational and engineering expense. A business will need to have experts for each database technology. For the program to remain operational, all of the databases must be up and running, so the system as a whole is only as fault-tolerant as its weakest database.
- A Polyglot Persistence model with separate databases solves specific problems, but can also become an operational nightmare. The operation of multiple data silos can cause just as many difficulties as it resolves, starting with operational complexity.
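The fault-tolerance concern can be made concrete with simple arithmetic. If an application needs every database to be up, and failures are assumed to be independent, the combined availability is the product of the individual availabilities. The figures below are illustrative assumptions, not measurements.

```python
from math import prod


def combined_availability(availabilities):
    """Availability of a system that needs every component up,
    assuming component failures are independent of one another."""
    return prod(availabilities)


# Assumed example: three databases, each available 99.9% of the time.
single = 0.999
combined = combined_availability([single] * 3)

hours_per_year = 24 * 365
print(f"combined availability: {combined:.4%}")
print(f"expected downtime, one database: {(1 - single) * hours_per_year:.1f} h/year")
print(f"expected downtime, all three:    {(1 - combined) * hours_per_year:.1f} h/year")
```

Under these assumptions, expected downtime rises from roughly 8.8 hours per year for a single database to roughly 26.3 hours per year when all three must be running.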
Big Data and Polyglot Persistence
A number of factors should be considered when deciding to move to Polyglot Persistence, including the collection and use of Big Data. If Big Data is used in promoting a business or product, changes may have to be made to the system. As a system grows and changes, there is no one-size-fits-all data storage solution. Developers will need to assess the requirements carefully and choose the best-suited approaches for data storage and access. Polyglot Persistence is a blended database solution that developers can use to maximize efficiency, particularly in terms of storage.
Factors to Consider When Employing Polyglot Persistence
When using two or more different kinds of databases, where to store the data becomes a decision. A bad decision could result in having to rework the system, including the time needed to migrate data from one database to another. The person or team writing the program should be able to provide useful advice.
Applications also have to deal with the increasing complexity of the Data Architecture, because they must communicate with two or more potentially very different data stores. When done correctly, it should be possible to separate the “persistence layer” (the group of files that communicates with the data stores) from the rest of the application, freeing the rest of the application to do its work. The more data stores available, the greater the potential for increasing the complexity of the data persistence layer. The application will need to know:
- Which database is used for specific data sets?
- How to interface with each of the databases?
- How to respond to different kinds of errors coming from each database?
- How to respond to queries for information from across databases?
- How to “mock out” different databases for testing?
Working out these issues could result in large amounts of new application code. This new code can add to the complexity and may introduce bugs. A new architecture, with databases chosen using a Polyglot Persistence model, may be able to avoid many of the problems associated with this type of programming.
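To make the persistence-layer questions listed above more concrete, here is a minimal sketch, assuming two stores: one for users and one for orders. All class and method names (`UserStore`, `OrderStore`, `PersistenceLayer`, `StoreError`, and the fakes) are hypothetical. The in-memory fakes play the role of “mocked out” databases, and each store translates its own errors into one shared error type.

```python
from typing import Protocol


class StoreError(Exception):
    """Single error type the application sees, whichever store failed."""


class UserStore(Protocol):
    def get_user(self, user_id: int) -> dict: ...


class OrderStore(Protocol):
    def orders_for(self, user_id: int) -> list: ...


class FakeUserStore:
    """In-memory mock standing in for, say, a relational database."""

    def __init__(self, users):
        self._users = users

    def get_user(self, user_id):
        try:
            return self._users[user_id]
        except KeyError as exc:
            # Translate the store-specific error into the shared type.
            raise StoreError(f"user {user_id} not found") from exc


class FakeOrderStore:
    """In-memory mock standing in for, say, a document database."""

    def __init__(self, orders):
        self._orders = orders

    def orders_for(self, user_id):
        return [o for o in self._orders if o["user_id"] == user_id]


class PersistenceLayer:
    """Knows which store holds which data; the rest of the app does not."""

    def __init__(self, users: UserStore, orders: OrderStore):
        self._users = users
        self._orders = orders

    def user_with_orders(self, user_id: int) -> dict:
        # Cross-database query: fetch from each store, combine in the app.
        user = self._users.get_user(user_id)
        return {**user, "orders": self._orders.orders_for(user_id)}


layer = PersistenceLayer(
    FakeUserStore({1: {"id": 1, "name": "Ana"}}),
    FakeOrderStore([{"user_id": 1, "sku": "A-7"}, {"user_id": 2, "sku": "B-3"}]),
)
print(layer.user_with_orders(1))
# {'id': 1, 'name': 'Ana', 'orders': [{'user_id': 1, 'sku': 'A-7'}]}
```

Because `PersistenceLayer` depends only on the two small interfaces, a test can pass in the fakes while production code passes in wrappers around real database clients.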
Questions to Ask
- How will training for the new system be provided?
- Is there a “local” expert available to help get it up and running?
- Can this person mentor your staff?
- Who is available for repair and support when a problem occurs?
The Future of Polyglot Persistence
Currently, the Polyglot Persistence model is under development in various enterprises that require such a Data Architecture model, with some IT departments investing considerable time and money into making it work. While the idea of using exactly the right database for each specific type of data is appealing, this approach forces developers to learn about a seemingly endless number of databases. Polyglot Persistence needs to become easier to develop and easier to maintain for the model to ultimately work. As long as it is difficult to initiate and support, it will continue to be time- and labor-intensive. If it doesn’t evolve, it may be left on the sidelines, though the benefits could eventually outweigh the disadvantages.