As 2012 transitions into 2013, the multifaceted world of NoSQL continues to expand. Current platforms and databases gain market share and renown, while new products emerge, staking out their place in this constantly changing environment.
Those interested in a primer on all things making up the world of "not only" relational databases need to check out the recently published DATAVERSITY™ series on the NoSQL movement. This article takes a look at what to expect for NoSQL in the coming year.
Information technology analyst firm Gartner predicts that 2013 will be a big year for NoSQL. According to Gartner, "organizations need to focus on non-traditional data types and external data sources. Hadoop and NoSQL gain momentum."
Many NoSQL applications run on the open source distributed framework known as Hadoop, one of the major products out of the Apache Software Foundation, whose work leads off this profile of NoSQL in the coming year.
Apache Software Foundation Incubates NoSQL
The Apache Software Foundation began its existence in 1999, rapidly becoming one of the leading lights in open source software development. Its development activity encompasses many products important to the NoSQL movement, including Hadoop, HBase, Cassandra, and Accumulo. Some less well-known Apache products look to make more of an impact in 2013.
An ORM (object relational mapping) framework suitable for NoSQL and Big Data, Apache Gora graduated from incubator status last year, becoming a Top Level Project at Apache. Support for Hadoop is included, and Gora handles persistence to HBase and Cassandra as well as relational and flat-file sources. Gora provides a Java-based API for developer access.
Still in incubator status, Apache Giraph is a graph processing framework rather than a graph database, and it looks to be a promising open source project for the future. Giraph's job framework works in concert with Hadoop, leveraging the latter's distributed processing capabilities. Giraph may well become an Apache Top Level Project in 2013.
Next year's ApacheCon conference takes place February 26-28 in Portland, Oregon. Activities are also scheduled before and after the main conference for parties interested in checking out the vibrant open source software community. ApacheCon looks to be the place to get a sense of where products like Cassandra and Hadoop are headed in 2013.
Key-Value Innovations for 2013
Continuing the 2013 trend toward Cloud-based Database as a Service (DaaS) offerings, it was recently announced that the key-value memory cache memcached is coming to the Cloud as part of Garantia Data's Memcached Cloud. Garantia Data is also offering a similar Cloud-based platform for another key-value database, Redis. Both new products are available on Windows Azure and Amazon Web Services.
Couchbase Server 2 Spans Key-value and Document Databases
Couchbase includes elements from both key-value and document databases. The recently released Couchbase Server 2 combines the persistence of documents marked up in JSON with speedy key-value memory caching. The company saw its genesis through the merging of the minds behind memcached as well as CouchDB.
Couchbase hopes Server 2 will make inroads in the marketplace in 2013. The product includes a commercially licensed edition with a robust API that supports most popular client languages, including Java and C#. There is also an open source community edition suitable for evaluating the product, although it is not recommended for use in production environments.
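The combination described above, persisted JSON documents fronted by a key-value memory cache, can be illustrated with a minimal sketch. This is not Couchbase's API; the `DocumentCache` class, its method names, and the `user::42` key are hypothetical, and a plain dictionary stands in for durable storage.

```python
import json


class DocumentCache:
    """Hypothetical sketch: a key-value memory cache in front of a JSON document store."""

    def __init__(self):
        self._cache = {}  # in-memory key-value layer for fast reads
        self._disk = {}   # assumption: stands in for durable storage of serialized JSON

    def set(self, key, document):
        # Write-through: persist the serialized document, then cache the parsed form.
        self._disk[key] = json.dumps(document)
        self._cache[key] = document

    def get(self, key):
        # Serve from memory when possible; otherwise fall back to the persisted copy.
        if key in self._cache:
            return self._cache[key]
        raw = self._disk.get(key)
        if raw is None:
            return None
        document = json.loads(raw)
        self._cache[key] = document  # repopulate the cache for later reads
        return document

    def evict(self, key):
        # Dropping a cache entry loses no data; the document remains persisted.
        self._cache.pop(key, None)


store = DocumentCache()
store.set("user::42", {"name": "Ada", "langs": ["Java", "C#"]})
store.evict("user::42")
print(store.get("user::42")["name"])
```

Reads hit memory first and fall back to the stored document only on a cache miss, which is the general idea behind pairing a cache with a document store.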
MongoDB Also Hitting the Cloud
In yet another example of how the DaaS trend is touching the world of NoSQL in 2013, the popular document database MongoDB is now available through a Cloud-based service, the result of a collaboration between MongoDB developer 10gen and infrastructure company SoftLayer.
Announced earlier this month, the production-class instances of MongoDB leverage the global-class Cloud infrastructure of SoftLayer. 10gen CTO Eliot Horowitz commented on their new marriage in the Cloud:
“Our aim is for MongoDB to be available in combined offerings so that the growing number of organizations that want to deploy in the cloud have an easy and scalable solution. We look forward to working with SoftLayer to bring this new service to market as a convenient and effective way to deploy big data workloads.”
Google's Bigtable Continues to Resonate Throughout NoSQL
Google's Bigtable remains one of the most influential tabular databases in the industry, but it is not available for commercial use except through the Google App Engine Platform as a Service offering. The continued growth in popularity of Hadoop and HBase has a lot to do with what Google brought to the (big) table. HBase is a tabular database inspired by Bigtable, and Hadoop offers an open source take on the MapReduce processing model that Google also pioneered, though MapReduce is a separate system from Bigtable itself.
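For readers unfamiliar with the map reduce model mentioned above, the following is a minimal single-process sketch of the idea using a word count. It does not use Hadoop's API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are illustrative assumptions, and a real framework would distribute each phase across many machines.

```python
from collections import defaultdict


def map_phase(line):
    # Emit a (word, 1) pair for every word in one input record.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Group emitted values by key, as a framework would between the two phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Combine all values for one key into a single result.
    return key, sum(values)


def word_count(lines):
    # Run map over every record, group by key, then reduce each group.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data big table", "big table"])
print(counts)
```

Because each map call and each reduce call depends only on its own input, the work can in principle be spread across a cluster, which is what makes the model attractive for large data sets.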
This month's article in Wired about CouchDB developer Damien Katz does an excellent job chronicling how the entire NoSQL movement ties back to Google's innovations with Bigtable, as well as to an earlier "NoSQL" database, IBM’s Lotus Notes.
There is little doubt that, as the database world enters 2013 and beyond, Google Bigtable's wide-ranging influence will continue to be felt. All of the different database types in the NoSQL family (document, key-value, tabular, and graph) display at least a measure of inspiration from Google's innovations from earlier in the last decade.
InfiniteGraph Goes to College
Objectivity, developer of the graph database InfiniteGraph, recently announced a partnership with Germany's Ilmenau University of Technology. InfiniteGraph will be used in courses at the university, in addition to its research programs. Students will gain insight into graph databases and their applications in real-time analytics.
Ilmenau Professor Dr. Kai-Uwe Sattler commented on this new relationship:
"At Ilmenau our goal is to introduce students from around the world to the very cutting edge of technology and skill them in its practical uses. Understanding and discovering the information within Big Data is something that every institution struggles with from research to course curriculum. InfiniteGraph brings a new level or research and teaching capabilities that expand our ability to explore the reaches of current technologies and methodologies in graph data management and analysis."
As a new generation of students grows up with NoSQL databases, the opportunity for the growth of the technology remains promising for 2013 and beyond.
Neo4j Version 1.8.1 is the Most Stable Yet
Neo Technology, creator of the graph database Neo4j, is starting off 2013 with its most stable iteration yet, version 1.8.1. This new version offers better performance for those using Neo's Cypher query language. In some cases, queries take one-third of the time they took in earlier versions.
Neo4j 1.9 has also reached milestone release status. In addition to even more Cypher optimizations, the Neo4j console underwent some interface improvements, allowing in-browser filtering and paging of result sets. Another enhancement allows for the upgrading of High Availability Clusters without any downtime.
These additional features and productivity improvements should please Neo4j users throughout 2013. Those interested in trying out either the 1.9.M02 milestone release or the stable version, 1.8.1, need to visit the Neo4j download page.
The Rise of Data Virtualization Continues
As NoSQL continues to proliferate in 2013, data consumers risk confusion as they try to make sense of information arriving from a wide array of disparate relational and non-relational sources. Data Virtualization is a technology that presents those sources through a unified view, helping users derive valuable information from business data.
Suresh Chandrasekaran, Senior VP at Denodo, comments on Data Virtualization as it relates to NoSQL in the coming year:
"The chimera of a homogeneous data landscape or a consolidated single version of truth as in an Enterprise-wide Data Warehouse has started to fade in the aspirations of companies as unattainable and too expensive. In its place, companies have embraced diversity of fit-for-purpose data stores including NoSQL, Big Data and multi-structured sources and combined them into agile, real-time data services using Data Virtualization tools like Denodo."
Expect Data Virtualization to increase in importance in the next few years as the use of Big Data and NoSQL continues to grow. It remains a vital piece of technology for those who depend on information.
2013 is definitely looking to be another year where NoSQL further cements its place in the data industry. What was once merely a marketing buzzword is now becoming a core piece of technology for enterprises of all sizes.