Graph databases have had it pretty good the past year or so. Some highlights since the end of 2013 include:
- Gartner for the first time allowed graph databases to be included in its 2014 Magic Quadrant for Operational Database Management Systems (ODBMS).
- Graph database management systems saw a more than 250 percent increase in popularity between January 2013 and January 2014, outperforming other NoSQL database categories, according to DB-Engines.
- Forrester Research reported that graph databases will reach over 25 percent of all enterprises by 2017.
- Big names got in on the action, with Google releasing Cayley, an open source graph database, and companies like Oracle, SAP, and IBM casting somewhat wider nets into this ocean. Meanwhile, players that had already built their street credibility in the space, including Ontotext and Neo Technology, put out new releases or updated recent ones.
Neo Technology, in fact, has had its own good year on a number of counts. For one thing, it was included in Gartner’s Magic Quadrant for Operational Database Management Systems, and the research firm also named it to its list of Cool Vendors in DBMS, 2014. DB-Engines dubbed it the leading system in the Graph DBMS category of database management systems, ranking 23 out of 210 systems. InfoWorld recognized it with a 2015 Technology of the Year award, too. And $20 million in Series C funding came its way as well.
The vendor also announced early in January that it had just closed its strongest year yet, reflected in its highest number of new customer acquisitions and 120 percent annual recurring revenue growth over 2013. A major product release in 2013, Neo4j 2.0, which hit 500,000 downloads (including an update late last year for speed and scalability), also helped propel the company forward. Neo Technology cites among its customers Walmart, which is looking at data from the graph perspective and running collaborative filtering algorithms in real time to provide personalized recommendation experiences for users. Also in the mix are eBay, Earthlink, UBS, CenturyLink, Pitney Bowes, Cisco, Medium, CrunchBase, Polyvore, Elementum and Zephyr Health. (See our story Graph Databases have Impact on Healthcare Sector here, for a look at how Zephyr is leveraging graph databases as the healthcare and life sciences industry begins to take greater advantage of the technology.)
Neo Technology CEO Emil Eifrem says the growing market for graph databases is in part a response to internal pressures that build up every day at organizations – pressures that require them to get more value out of the relationships in high-volume and high-velocity datasets.
“One thing we see that is tangibly happening is that datasets people work with of course get bigger, and also we see clearly that connection between entities, so the connected datasets, are exploding in size,” he says, regarding internal pressures faced by many businesses. “That huge explosion in the datasets that people handle exerts a huge force of pressure on existing infrastructure.”
Just take the case of IT operations/network management in enterprises and telcos, brimming with data flowing through from various IT infrastructure components – switches, routers, hubs, and so on. It’s critical that IT leaders understand how these entities all relate to each other so that they can make sense of the data and take appropriate action. For instance, if alerts on a monitoring dashboard light up for dozens of servers, do they have an efficient way of knowing that all that hardware has failed at the same time because it’s all connected to a firewall that is connected to a power supply unit that went down and took the firewall with it? Or do they have to plough through mountains of individual reports and chase down a bunch of dead ends before the relationships become clear?
That was a tricky enough issue to deal with a decade ago, when an enterprise data center for a global business might have had 1,000 servers. But how much more difficult is that problem to deal with today, when that same business has grown to encompass thousands of physical and virtual servers? “Imagine,” Eifrem says, “if you are the operations person trying to manage what probably amounts to 50 billion connections.” If you can in real time see the dependencies between that power supply, that firewall, and those servers, the root cause of the problem is more immediately obvious, as is what needs to be done to fix it.
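To make the root-cause idea concrete, here is a minimal Python sketch of the scenario described above. It is only an illustration, not Neo4j's implementation or API: the topology, the component names, and the helper functions are all hypothetical. Each component lists what it depends on, and the code looks for the dependency that every alerting server shares and that sits furthest upstream.

```python
from collections import deque

# Hypothetical topology: each component maps to the components it depends on.
DEPENDS_ON = {
    "server-1": ["firewall-a"],
    "server-2": ["firewall-a"],
    "server-3": ["firewall-a"],
    "firewall-a": ["psu-7"],
    "psu-7": [],
}


def upstream(component, depends_on):
    """Return every component reachable by following dependency edges."""
    seen = set()
    queue = deque([component])
    while queue:
        current = queue.popleft()
        for dep in depends_on.get(current, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def shared_dependencies(alerting, depends_on):
    """Components that every alerting node depends on, directly or not."""
    sets = [upstream(c, depends_on) for c in alerting]
    return set.intersection(*sets) if sets else set()


def root_candidates(alerting, depends_on):
    """Shared dependencies that depend on no other shared dependency."""
    shared = shared_dependencies(alerting, depends_on)
    return {c for c in shared if not (upstream(c, depends_on) & shared)}


alerts = ["server-1", "server-2", "server-3"]
print(root_candidates(alerts, DEPENDS_ON))
```

With the sample data, the three alerting servers share both the firewall and the power supply, and the power supply is reported as the candidate root cause because nothing further upstream is shared. A graph database expresses the same question as a traversal over stored relationships rather than as hand-written loops.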
Another use case that enterprises have to deal with internally relates to Master Data Management, such as modeling organizational hierarchies. That world has changed a great deal in the last couple of decades; instead of single reporting lines to a manager, today’s workplace is a more networked environment, with individuals taking on multiple roles and projects, and moving seamlessly around the business. “If you were writing software for internal HR systems 20 years ago, your dataset would be a lot smaller in the number of connections than it would be today, even if your business has the same number of people,” he says.
The other force bringing attention to graph databases is competitive pressure, he believes. “Graphs,” says Eifrem, “are eating the world.” There’s a reason for that, which is that products that use relationships and connections in data typically can drive a more nuanced view of the world, and that can lead to better value for the business and its customers.
Neo Technology has seen that occur for its own customers, including in the online dating realm where social connections play a role. While other matchup sites using relational database technology might be able to look out a couple of hops in the social graph to make a potential love connection, a graph database can look dozens of hops out to recommend the best potential partners. All things otherwise being equal, he says, the site that provides the best matching algorithms that can leverage all those connective data tissues wins.
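What "hops" means here can be shown with a short Python sketch. It is an assumption-laden toy, not how any dating site or Neo4j itself works: the `FRIENDS` data and the `within_hops` function are hypothetical. The function walks outward from one person breadth-first and stops at a chosen number of hops, recording how far away each reachable person is.

```python
from collections import deque

# Hypothetical social graph: each person maps to their direct connections.
FRIENDS = {
    "ann": ["bob"],
    "bob": ["ann", "cat"],
    "cat": ["bob", "dan"],
    "dan": ["cat"],
}


def within_hops(start, graph, max_hops):
    """Map each person reachable in at most max_hops to their hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in graph.get(person, []):
            if friend not in distance:
                distance[friend] = distance[person] + 1
                queue.append(friend)
    del distance[start]  # the starting person is not a match for themselves
    return distance


print(within_hops("ann", FRIENDS, 2))
print(within_hops("ann", FRIENDS, 3))
```

Raising the hop limit widens the pool of candidates. In a relational schema each extra hop typically means another self-join on a connections table, which is why deep traversals are the case graph stores are built for.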
Graph Databases Reach Out
Neo4j stores data in nodes connected by directed, typed relationships with properties on both – also known as a property graph; users are able to efficiently store, handle, and query highly connected elements in their data models. But even as graph databases have extended their reach to support use cases like the ones above, and been picked up by some big, brand name companies, Eifrem wants to see the technology adoption lifecycle “go beyond the ‘alpha geeks’ and drive to the late majority market.”
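For readers new to the term, the property graph model can be sketched in a few lines of Python. This is a conceptual illustration only and says nothing about how Neo4j stores data internally; the class names, method names, and sample data are all hypothetical. Nodes carry properties, and each relationship has a direction, a type, and properties of its own.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: int      # id of the node the relationship leaves
    end: int        # id of the node the relationship points to
    rel_type: str   # the relationship's type, e.g. "REPORTS_TO"
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory property graph (hypothetical, for illustration)."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, properties)
        return self.nodes[node_id]

    def relate(self, start, rel_type, end, **properties):
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must exist before relating them")
        rel = Relationship(start, end, rel_type, properties)
        self.relationships.append(rel)
        return rel

    def outgoing(self, node_id, rel_type=None):
        """Relationships leaving node_id, optionally filtered by type."""
        return [
            r for r in self.relationships
            if r.start == node_id and (rel_type is None or r.rel_type == rel_type)
        ]


graph = PropertyGraph()
graph.add_node(1, name="Alice", role="engineer")
graph.add_node(2, name="Bob", role="manager")
graph.add_node(3, name="Project X")
graph.relate(1, "REPORTS_TO", 2, since=2012)
graph.relate(1, "WORKS_ON", 3, allocation=0.5)

for rel in graph.outgoing(1):
    print(rel.rel_type, graph.nodes[rel.end].properties["name"], rel.properties)
```

The sample mirrors the organizational-hierarchy case mentioned earlier: one person can both report to a manager and work on a project, and each of those connections carries its own data.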
That’s going to require making the technology easier to use for that community, even though the company’s launch of Version 2.0 took many steps in that direction. It was “a super, super-important release for us,” says Eifrem. The new version made labels a part of the data model so that users could tag and index data to more effectively understand relationships between datasets. It included enhancements to its Cypher declarative query language that is used to develop Neo4j graph applications, along with an interactive browser and query environment with a visual interface for data discovery. Six months after its release, Neo4j’s user community size doubled, he says, “all because of the ease of use features.”
Still, he sees incremental work to continue “to polish up the surface, documentation and really importantly, the integration with frameworks, with tooling, and just be part of the normal technology they use,” he says. There’s room to get Neo4j (and presumably other graph database technology) better integrated into the well-used Microsoft stack, for instance:
“There’s no drop-down in Visual Studio for Neo4j to click, click, and off you go,” he says by way of example, nor is there a way to automatically authenticate Neo4j users in Active Directory. “Those are the types of things we think need to happen with graph databases, not to get to [Forrester’s predicted] 25 percent [enterprise penetration] by 2017 mark but to get to 100 percent,” he says.
While the 2.1 and 2.2 updates to Neo4j that followed the 2.0 release were more “inside-the-engine” focused, emphasizing data import, performance, and scalability, Eifrem says he’s looking forward to updates in 2015 that will again put an emphasis on ease of use. For example, the company will zero in on making it as convenient as possible to write Cypher applications in your programming language of choice. “We think this is going to be a significant step up in terms of usability,” he says.
“Now we are getting to scale. If we roll out these kinds of improvements to significantly reduce friction, that means tens of thousands of new users. That rolls into hundreds or thousands of new enterprises using us, which is pretty exciting.”
There will be challenges in getting the word out to the world about how important graph databases can be, he admits. So 2015 also will be a year of working to “make sure the world understands how valuable it is to take existing applications and re-imagine them from a graph perspective to gain a big advantage, or build completely new products and services based on graphs.”