Neo4j 2.0: The Ascendency of Graph Databases

by Jelani Harper

There’s a reason that Neo Technology, purveyor of the graph database Neo4j, has added some of the most well-known and established companies (Hewlett-Packard, eBay, Cisco) to its list of licensed, paying customers since 2011.

It’s closely linked to the expanding community of Neo4j users, which recently closed out 2013 with 50,000 “new instance” activations per month, a rate that is three times higher than that for 2012.

It has a good deal to do with the horizontal nature of graph databases, which are showing up in a striking number of vertical industries and expediting processes that are difficult and time-consuming for relational technologies.

Quite simply, graph databases are gaining in popularity and, thanks to the recent release of Neo4j 2.0, usability as well:

“Graph databases have traditionally been used only for social media,” said Neo Technology CEO Emil Eifrem, who founded the company in 2000. “It turns out that this ability to process connections between data elements is a completely horizontal concern. That’s something that you need in every single industry out there.”

More Than Social Media

Although Neo4j and other graph databases can accelerate numerous database processes, from Business Intelligence to content management, there are presently three core horizontal areas in which all industries benefit from their capabilities (excluding social media, in which the relationships between people/users are readily identified and modified via graphs):

  • Master Data Management (MDM): Whether focused on customers, products, or multiple domains, MDM solutions greatly benefit from graph databases, which readily map supply chains and relationships between domains to speed up data access and improve relevance.
  • Software/Network and Data Center Management: Determining the potential effects of failures in a network and its software and hardware components has never been easier than with graph databases, which can map (even visually, in the case of Neo4j 2.0) the relationships between components and greatly accelerate what is a tedious process with relational databases.
  • Geographic Data Management: Again, the mapping potential of graph databases can quickly delineate routes and relationships between locations.
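
To make the network-management case concrete, here is a minimal sketch of impact analysis as a graph traversal. It is written in plain Python rather than against Neo4j, and the component names and the `DEPENDENTS` map are hypothetical; the point is only that "what breaks if this fails?" becomes a walk along relationships rather than a series of table joins.

```python
from collections import deque

# Hypothetical dependency graph: an edge A -> B means "B depends on A",
# so a failure of A can propagate to B.
DEPENDENTS = {
    "switch-1": ["server-a", "server-b"],
    "server-a": ["db-primary"],
    "server-b": ["web-app"],
    "db-primary": ["web-app", "reporting"],
    "web-app": [],
    "reporting": [],
}


def impacted_by(failed, dependents):
    """Return every component reachable from `failed`, excluding itself."""
    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for nxt in dependents.get(current, []):
            # Skip components already visited (and the failed one itself,
            # in case the graph contains a cycle back to it).
            if nxt not in seen and nxt != failed:
                seen.add(nxt)
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    print(sorted(impacted_by("server-a", DEPENDENTS)))
```

A graph database performs this kind of walk natively over stored relationships; the breadth-first search here only stands in for that idea.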

Graph databases can also greatly enhance CRM, fraud detection, recommendations, resource optimization, and other enterprise database functions. The fact that they are employable for uses other than social media should not downplay this particular application, since the technology behind graph databases and NoSQL is largely responsible for the explosion of Big Data and the accurate gauging of sentiment data via the Internet.

Use Cases

According to DB-Engines, a site dedicated to technology analysis, in recent months the popularity of graph databases has pushed Neo Technology to the forefront of the NoSQL movement and exceeds that of relational SQL technologies as well. The January 2014 popularity score of Neo4j, which is based partly on aggregates of website mentions, technical discussions, and job offers referring to a system, exceeds the scores of all other graph databases combined.

“The problems that Mongo solves are really important, but they’re not the problems that Neo4j solves,” Eifrem said. “The problems that we solve you can’t solve with Mongo and vice versa. The same is true for Cassandra and other databases.”

After acquiring the London-based delivery service Shutl, eBay was suddenly confronted with the task of routing, in real time, a network of carriers to deliver products to customers within 90 minutes of placing an order for its newfound eBay Now imprint. Scaling requirements included those for consumer-to-consumer delivery and the calculation of simultaneous, multiple routes. The company replaced its legacy MySQL solution with Neo4j, which significantly increased the speed at which routing is conducted, in part because substantially less code is required with the latter. The result is better code quality and a greatly reduced time to market.
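
As an illustration of the kind of route calculation described above, the following is a minimal sketch of a shortest-path search (Dijkstra's algorithm) in plain Python. It is not eBay's or Shutl's implementation, which the article does not describe; the `ROADS` map, its place names, and its travel times in minutes are invented for the example.

```python
import heapq

# Hypothetical road network: ROADS[a][b] is the travel time in minutes
# from location a to location b.
ROADS = {
    "depot": {"a": 4, "b": 2},
    "a": {"customer": 5},
    "b": {"a": 1, "customer": 8},
    "customer": {},
}


def shortest_route(graph, start, goal):
    """Return (total_minutes, path) for the fastest route, or None."""
    queue = [(0, start, [start])]   # priority queue ordered by cost so far
    best = {}                       # cheapest known cost per settled node
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in best and best[node] <= cost:
            continue                # already reached this node more cheaply
        best[node] = cost
        for nxt, minutes in graph.get(node, {}).items():
            heapq.heappush(queue, (cost + minutes, nxt, path + [nxt]))
    return None                     # goal is unreachable from start


if __name__ == "__main__":
    print(shortest_route(ROADS, "depot", "customer"))
```

In this toy network the direct-looking route through "a" takes 9 minutes, while the detour through "b" and then "a" takes 8, which is the sort of answer that falls out naturally when the data is already stored as connected points.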

Within the healthcare industry, San Francisco-based Zephyr Health uses Neo4j as a platform that integrates various data sources, issues queries, and serves as a customer interface. Zephyr Health offers an analytics app for life science patients to facilitate customer engagement and research and development. Neo4j provides the means by which customers can query and issue feedback on any number of topics, a process that requires mining cloud data and maintaining ID relevance at high speed and extreme scale.

The Newest Version

Neo4j 2.0 is set to build on these and other use cases that require rapid connections between data points by making the database easier to use. The most recent version makes a number of improvements to its native query language, Cypher, that significantly increase its functionality and ease of use:

  • User Interface: Cypher is responsible for the revamping of Neo4j’s browser, which maintains a simplicity designed for developers. The new browser graphically represents query visualizations and provides a native environment to leverage the expressive productivity of Cypher. Additional features include tabular representations, reference topics, and help guides.
  • Labels-Schema: A key development within the 2.0 version is the option for developers to implement labels and schema. The former adds a further means of attaching metadata to nodes, which enables greater specificity and complements the properties that describe relationships and nodes. While labels create subsets of nodes, they also enable schema – which differs from that of relational databases because Neo4j’s schema is optional. Whereas it may be difficult to determine schema when initially using Neo4j with new or unknown data sets, prolonged usage is actually enhanced by a specific schema and its uniformity.
  • Uniqueness Constraints and Indexing: Cypher’s ability to create labels gives version 2.0 the means of easily indexing nodes – which was not possible with the previous version. Labels also allow for uniqueness constraints (in which values for data must be unique) and accessible means of looking up nodes. Uniqueness constraints considerably aid in enforcing certain schema.
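
The relationship between labels, lookups, and uniqueness constraints can be sketched with a toy in-memory model. This is plain Python, not Neo4j's API: the `TinyGraph` class and its method names are hypothetical and exist only to show the behavior described above, namely that a label groups nodes, and a constraint on a labeled property both rejects duplicates and gives a direct way to look a node up.

```python
class TinyGraph:
    """Toy model of labeled nodes with optional uniqueness constraints."""

    def __init__(self):
        self.nodes = []      # node id -> {"labels": set, "props": dict}
        self.by_label = {}   # label -> list of node ids
        self.unique = {}     # (label, prop) -> {value: node id}

    def add_unique_constraint(self, label, prop):
        """Require `prop` to be unique among nodes carrying `label`."""
        values = {}
        for node_id in self.by_label.get(label, []):
            props = self.nodes[node_id]["props"]
            if prop in props:
                if props[prop] in values:
                    raise ValueError("existing data violates constraint")
                values[props[prop]] = node_id
        self.unique[(label, prop)] = values

    def create_node(self, labels, **props):
        """Add a node, rejecting it if it breaks any constraint."""
        touched = []
        for (label, prop), values in self.unique.items():
            if label in labels and prop in props:
                if props[prop] in values:
                    raise ValueError(f"duplicate {label}.{prop}")
                touched.append(values)
        node_id = len(self.nodes)
        self.nodes.append({"labels": set(labels), "props": dict(props)})
        for label in labels:
            self.by_label.setdefault(label, []).append(node_id)
        for (label, prop), values in self.unique.items():
            if label in labels and prop in props:
                values[props[prop]] = node_id
        return node_id

    def find(self, label, prop, value):
        """Look up one node id by label and property value, or None."""
        if (label, prop) in self.unique:
            return self.unique[(label, prop)].get(value)  # direct lookup
        for node_id in self.by_label.get(label, []):      # scan the label
            if self.nodes[node_id]["props"].get(prop) == value:
                return node_id
        return None
```

Because the schema is optional in this model too, nodes can be created before any constraint exists; adding a constraint later simply checks the data already carrying that label.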

Although one of the benefits of Neo4j is that the database can be controlled with any language that is compatible with the Java Virtual Machine (JVM), the additions to Cypher make it significantly less necessary to use other languages. According to Eifrem:

“Cypher is a language with a 90-10 orientation where 90 percent of the things you want to do is super simple and the remaining 10 percent is doable but maybe harder and more awkward to express. If you have operations that fit into the 90 percent, then Cypher is 10 times more productive to work in versus Java, Ruby or whatever program language you might otherwise choose.”

Ubiquitous Graphing

With the ascendancy of Neo4j over the graph database marketplace, the release of 2.0 (nearly 10 years after the release of 1.0) is largely intended to help catapult graph databases, and Neo4j, into mainstream adoption. The horizontal nature of graph databases and their solidification in and expansion beyond the realm of social media will further spur this trend. The company plans to release a free, interactive online training course in the near future in which users can learn the inner operations of the database at their own pace. The following comments from Eifrem sum up the potential of graph databases:

“We believe that graph databases are going to be available anywhere you have a software system at the end of this decade. Your toaster is going to have a graph database at the end of this decade, and we want to be the supplier for this movement.”
