The myriad NoSQL database offerings in the marketplace today are testament to the fact that such systems, along with their diverse architectures, are here to stay. That was not the case only a few years ago, when such “trends” were still the purview of industry thought leaders, innovative startups, and apocalyptic harbingers proclaiming the end of the SQL-relational database world. Data Management and its many associated technologies and practices – from Data Governance to Data Modeling, Enterprise Information Management to Customer Relationship Management – have entered a new age. There is no denying that any longer. Big Data, however one chooses to define it, has changed everything. The growth of NoSQL platforms, and the consequences of adopting them at the enterprise level, will push the envelope on how data is collected, stored, and analyzed far into the future.
In a recent DATAVERSITY® interview with Dr. Vladimir Bacvanski, the co-founder of SciSpike, we discussed such industry-wide changes and where NoSQL is going in 2016. Dr. Bacvanski commented:
“The main three trends right now happening in the NoSQL and Big Data space are the development of microservices in the Cloud, which lead to polyglot databases; the growth of the Internet of Things that is pushing many companies into Big Data (Spark is increasingly important for this trend, since it is providing both batch and streaming potential to many companies); and the final thing is the fact that Big Data and NoSQL have become the new normal. There is now a recognition that, in certain areas, NoSQL is better than relational systems, while in others relational databases are better.”
- Microservices and Polyglot Persistence
The term “microservice architecture,” or just “microservices” for short, has been discussed since the early 2000s, but it really started to gain industry popularity around 2013-14. Martin Fowler and James Lewis wrote a Microservices Resource Guide in mid-2014, and their work (along with that of many others) has helped push the swift growth of microservices to the point that they are becoming a necessity for any enterprise that wants low-cost, agile, massively scalable software development and Data Management options outside of traditional, monolithic databases. According to Dr. Bacvanski, microservices allow for “really nice modularity. They are typically deployed on many different machines, so the application is broken into many small services that run in the Cloud (either public or private), with so many benefits.” A short list of those benefits includes:
- The ability to use many data models for different services, rather than one single data model that the data must be forced into
- Small services and databases, which allow for powerful distributed and massively scalable architectures
- A natural platform for deployment in the Cloud
- The possibility of Agile Software Development. A single developer can rewrite the code for a microservice in two weeks. Compare that to gargantuan databases with millions of lines of code.
- The ability to plan services around modular business needs and single operations, which aids faster deployment and, in theory, lowers development costs
One case study for the many benefits of running microservices is an eCommerce site, Dr. Bacvanski said. Rather than employing a huge relational database, or even a massive single NoSQL system, an eCommerce company can use microservices to create greater speed and flexibility across all of its systems. It can serve its product catalog with a document database. “Since descriptions of different products vary, you would want to accommodate that,” he said. It could then implement a key-value store for its user sessions, while a graph store would probably be best for its internal supply chains and the relationships between producers, suppliers, and so on, because graph stores are built to handle such relationships. And finally, for click analytics it could dump all that data into Hadoop or Spark and store it for later analysis.
The separation of all these processes into distinct data stores and services is also referred to as polyglot persistence, an impressive-sounding term for a concept with complex ramifications for data professionals. Where once a data modeler, data architect, software developer, system administrator, or any other person working within the data stream (including business-minded professionals) only had to worry about one type of system – the RDBMS – they now have to learn multiple platforms.
“It is no longer just a ‘this or that’ option,” remarked Dr. Bacvanski. “Instead, there are many options and new challenges with polyglot persistence and microservices. You now have many more databases to deal with, much more training to be done. But, the overall execution and performance tend to be much better. So for most companies, the payoffs are well worth the time and effort put into implementing such options.”
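The eCommerce example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every class and method name is hypothetical, and plain in-memory structures stand in for the real document, key-value, and graph stores and for the Hadoop or Spark cluster. The point is simply that each small service owns the data model that suits it.

```python
from collections import defaultdict


class CatalogService:
    """Document-style store: each product is a free-form dict (hypothetical)."""

    def __init__(self):
        self._docs = {}

    def put(self, product_id, document):
        # Products may carry different fields; no shared schema is enforced.
        self._docs[product_id] = dict(document)

    def get(self, product_id):
        return self._docs.get(product_id)


class SessionService:
    """Key-value store: opaque session data looked up by session id."""

    def __init__(self):
        self._kv = {}

    def set(self, session_id, value):
        self._kv[session_id] = value

    def get(self, session_id):
        return self._kv.get(session_id)


class SupplyChainService:
    """Graph-style store: directed 'supplies' edges between parties."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add_supplier(self, supplier, customer):
        self._edges[supplier].add(customer)

    def downstream(self, start):
        """Return every party reachable from start by following edges."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in self._edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


class ClickLogService:
    """Append-only event log, standing in for data dumped to Hadoop or Spark."""

    def __init__(self):
        self._events = []

    def record(self, user, product_id):
        self._events.append((user, product_id))

    def clicks_per_product(self):
        """A toy batch analysis run over the stored events."""
        counts = defaultdict(int)
        for _, product_id in self._events:
            counts[product_id] += 1
        return dict(counts)
```

In this sketch a book and a television can live side by side in `CatalogService` with entirely different fields, while `SupplyChainService.downstream` answers a relationship question that would require recursive joins in a relational schema. The trade-off Dr. Bacvanski describes is visible too: four services mean four interfaces to learn and operate.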
- The Internet of Things Forces the Issue
The rapid growth of the Internet of Things (IoT) has been nothing short of remarkable. “These systems produce tremendous amounts of data,” said Dr. Bacvanski. “More than anything we’ve seen so far, and it’s still small compared to the amount of data these sensors can produce.” Enterprises across a multitude of industries, including healthcare, manufacturing, energy, transportation and travel, and retail, are just beginning to comprehend the benefits of collecting and analyzing the sensor data available to them. Many are still working out what they can do with such sensors, and in the meantime they are installing them everywhere.
Only a few years ago, many companies claimed they didn’t need to worry about Big Data or NoSQL because their industries didn’t necessitate such systems. Dr. Bacvanski gave the example of auto insurance: “They told us the only events that happen for them is that people buy cars, they collide, they have insurance claims, we investigate those claims, we pay them or we don’t. There are not that many things that change.”
These companies are beginning to see the benefits of IoT, of sensors, and of tracking exactly what their customers are doing. They’ve figured out that they can put sensors in cars and get reports on what kind of driver each customer is: how they accelerate, how they brake, how fast they go, and what sort of risky driving they do. This type of data collection and analysis allows them to get a much clearer picture of what is going on and possibly adjust their rates. “So now, all of a sudden, these companies are in the Big Data business,” he said. “The old argument that only huge companies like Yahoo! or Google needed Big Data – this was a common argument – is no longer valid. Everyone has to deal with Big Data.”
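To make the insurance example concrete, here is a minimal sketch of how in-car sensor readings might feed a rate adjustment. Everything in it is an assumption for illustration: the `Reading` fields, the thresholds, the surcharge cap, and the linear pricing rule are hypothetical and do not describe any insurer’s actual model.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    speed_kmh: float   # vehicle speed at the time of the reading
    accel_ms2: float   # longitudinal acceleration; negative means braking


# Hypothetical thresholds, chosen only for illustration.
SPEED_LIMIT_KMH = 120.0
HARD_BRAKE_MS2 = -4.0
HARD_ACCEL_MS2 = 3.0


def risky_fraction(readings):
    """Return the share of readings that count as risky (0.0 if none)."""
    if not readings:
        return 0.0
    risky = sum(
        1
        for r in readings
        if r.speed_kmh > SPEED_LIMIT_KMH
        or r.accel_ms2 <= HARD_BRAKE_MS2
        or r.accel_ms2 >= HARD_ACCEL_MS2
    )
    return risky / len(readings)


def adjusted_premium(base_premium, readings, max_surcharge=0.25):
    """Scale the premium linearly with the risky share, up to max_surcharge."""
    return base_premium * (1.0 + max_surcharge * risky_fraction(readings))
```

With four readings of which one is speeding and one is a hard brake, `risky_fraction` is 0.5 and a base premium of 1000 becomes 1125. A real deployment would receive such readings continuously from every insured vehicle, which is what moves the insurer into Big Data territory.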
The same is true in retail and healthcare. Brick-and-mortar and online retail stores both want to maximize their customers’ experiences to drive more sales. If a customer walks into a store, traditional retailers want to be able to track their movements, buying practices, and time spent at various points in the store in real time. They want to be able to recommend that someone buy one product while they’re looking at another. The same is true of the online experience.
In healthcare, physicians and nurses want to be able to track a given patient’s vital signs throughout the day, rather than checking them only when the patient comes in for a visit. They want to collect all that data to give better recommendations to that one patient, but also to analyze millions of patients with similar symptoms and so develop more comprehensive health plans for all their patients.
Even smaller enterprises want to leverage their social media, clickstream, and other types of unstructured data, including data collected from real or virtual sensors. The amounts might not be at the petabyte scales of some of the giants in the Big Data sphere, but there are nonetheless many benefits to moving to such systems.
“Even if they don’t have much data to necessitate Hadoop or Spark,” Dr. Bacvanski said, “applying these technologies can lead to cheaper IT. Instead of buying a $2 million system, you can just dump this data onto a smaller cluster and see significant decreases in data and analytics costs.”
- NoSQL and Big Data are Here to Stay
The adoption of NoSQL systems has been slow for many, but these systems have gained so much traction that fighting the tide is no longer possible, or in most instances, no longer a wise business decision. “The growing acceptance of these systems has become significant,” said Dr. Bacvanski. “From the beginning it wasn’t ‘data people’ pushing for these technologies in most companies, but instead it was software developers.” Developers require agility, and so there have long been internal struggles between data people and software people. Such struggles are well-documented in conference presentations, white papers, and blogs. Data people want consistency; they want their data assets safe and reliable. Software developers want agility; they want their projects done quickly and in iterations (at least in Agile camps).
“They would like to have free hands to change the databases, change the schema, do whatever they want with the data,” he said. “That led, in many organizations, to developers adopting NoSQL stores just to move faster.”
Such changes have, in terms of data and software development history, happened quite suddenly. NoSQL systems inevitably led to the creation of many silos within organizations, where there are developers on one side just “picking up a database and completing a project on their own,” while data people push back against such rapid changes on the other side. Such silos have been created in companies that lack strong Data Governance. “There is a conflict between Data Governance people and developers in terms of adopting such technologies,” he remarked. “Their points of view don’t often match.” Or we should say, their points of view didn’t use to match. They are now beginning to meet somewhere in the middle.
Luckily, the early antagonism caused by the disparate adoption of NoSQL systems into the larger Enterprise Information Management structures of many organizations has lessened, and in some cases, that antagonism has become a point of congruence between both sides. Such changes are due to many factors:
- Better education on the distinctive value of NoSQL stores
- Better understanding of the necessity of teamwork in terms of Data Governance and software development
- The creation of tools that provide SQL-like capabilities within NoSQL platforms
- Numerous case studies that have clearly demonstrated the benefits of NoSQL adoption
- A clear understanding that relational systems are still important and have certain operations that will continue to be needed going forward
“The data side has realized that they need to take care of these new systems somehow,” said Dr. Bacvanski. “And the developer side understands that these new systems certainly need better Data Governance. The benefits are finally much clearer to everyone.” Such claims are certainly not true everywhere. Many companies are still dealing with these struggles and will continue to do so far into the future.
NoSQL, Big Data, the Internet of Things, microservices, Hadoop, Spark, and an ever-growing assortment of new trends and technologies are here to stay. “They are no longer viewed as antagonists to relational systems, but as systems with special, unique capabilities,” said Dr. Bacvanski. “There is a peaceful co-existence forming, but it will still take some time. 2016 will be a big year for such changes.”
Come meet Dr. Vladimir Bacvanski at the Enterprise Data World 2016 Conference.
He will be giving a Tutorial on Thursday, April 21st at 1:15 pm titled:
“Building a Modern Big Data Enterprise: Hadoop, Spark, and Beyond”