Enterprises Move to Graph Databases

By Jennifer Zaino  /  January 31, 2017

Graph databases continue to move into mainstream enterprise operations, which helps explain why big-name vendors have planted their flags in the space and why one leader in the arena, Neo4j, is enjoying strong growth among large business customers. As of January, the vendor continues to hold the top spot in its category in the DB-Engines Ranking, which orders database systems by popularity.

In December, Bloor Research published an update on the graph database world, noting that Oracle and Teradata “are both examples of major companies that have implemented graph capabilities on top of existing relational technology.” It also points out that companies including IBM and Informatica are using third-party graph technology in their products, while the market itself is consolidating through acquisitions such as DataStax’s purchase of Aurelius, the developer of the Titan graph database.

“Despite all of this attention the market is dominated by Neo4J and OntoText (GraphDB), which are graph and RDF (or triplestore) database providers respectively,” analyst Philip Howard writes, explaining that the difference between a true graph product and a triplestore is that the former enables users to traverse a graph without needing an index and the latter doesn’t. “These are the longest established vendors in this space (both founded in 2000) so they have a longevity and experience that other suppliers cannot yet match,” Howard notes.
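The distinction Howard draws can be illustrated with a small sketch. It is a simplified illustration only and is not how Neo4j or GraphDB are actually implemented: the `Node` class, the function names, and the sample data are all hypothetical. In the first style each node holds direct references to its neighbours, so following an edge needs no lookup; in the second, triples sit in a flat store and every hop goes through an index keyed by subject.

```python
from collections import defaultdict


class Node:
    """A graph node that stores direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []  # direct references: following an edge needs no index


def traverse_native(start, hops):
    """Follow direct references outward for a fixed number of hops."""
    frontier = {start}
    for _ in range(hops):
        frontier = {n for node in frontier for n in node.neighbours}
    return {n.name for n in frontier}


def build_triple_index(triples):
    """Index (subject, predicate, object) triples by subject."""
    index = defaultdict(list)
    for subject, _predicate, obj in triples:
        index[subject].append(obj)
    return index


def traverse_indexed(index, start, hops):
    """Each hop is an index lookup rather than a pointer dereference."""
    frontier = {start}
    for _ in range(hops):
        frontier = {obj for subject in frontier for obj in index.get(subject, [])}
    return frontier


# Hypothetical sample data: alice -> bob -> carol
alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
alice.neighbours.append(bob)
bob.neighbours.append(carol)

triples = [("alice", "knows", "bob"), ("bob", "knows", "carol")]
index = build_triple_index(triples)

native_result = traverse_native(alice, 2)
indexed_result = traverse_indexed(index, "alice", 2)
```

Both traversals reach the same node two hops out; the difference lies only in how each hop is resolved.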

Neo4j is clearly invested in keeping the upper hand: VP of Global Marketing Utpal Bhatt points to its recent $36 million funding infusion as one event that will help it maintain its top spot. It plans to use the cash both to expand its engineering function, developing more capabilities that take advantage of the relationships within data, and to further fund sales and marketing efforts. Both steps will be helpful as the company pursues what is already its fastest-growing customer segment, which Bhatt says is enterprises with more than $1 billion in revenue. Such clients now represent more than half of Neo4j’s customers.

Quite a change from 2015, when Neo4j CEO Emil Eifrem told DATAVERSITY® that time still needed to be spent on making “sure the world understands how valuable it is to take existing applications and re-imagine them from a graph perspective to gain a big advantage, or build completely new products and services based on graphs.”

The Native Advantage

Bhatt thinks it’s great that new players – and big players – are now part of the graph databases category. “It’s a strong endorsement of how the space is truly becoming mainstream and impactful,” he says, “and we’ve only scratched the surface.”

Increased competition comes along with the category endorsement, of course. Bhatt explains that many of the competing vendors that come from the traditional database world take a non-native approach to their graph database offerings. That is, they build a graph layer atop their relational database technology. While this has benefits for companies that want to work with a single database vendor as much as possible, and is an appropriate option for lightweight graph use cases, he says, “It’s not well-suited for what we call core graph apps.”

For Neo4j, the graph is home turf, and as companies work with more and more connected data:

“The native approach will beat the non-native approach in terms of intuitiveness, performance – especially real-time performance – and the ability to meet changing requirements and new different data types without having to go through extensive schema changes,” he says.

His take is that even companies where relational databases still rule will want to use a native graph database for certain use cases, “to meet the scale and performance requirements of applications that rely on highly connected data.” In the future, he expects enterprises to standardize on both relational and NoSQL stacks. Between the relational and non-relational worlds, he expects that NoSQL graph databases are in the growth sweet spot.

A World of Data Connections

A telling indicator of that is the change Neo4j has seen over the last year or so, as customers went from bringing its graph database in for a very specific application – such as building a real-time recognition engine that tied customer, product, and social sentiment together – to leveraging their connected data for wider product or service use. Once the initial project goes live and meets with success, they find they need much of the same data, plus more data sets, for additional efforts. Rather than stack on another database or app, they take advantage of what they already have on hand to combine all these data pieces for new connected data applications.

“At the end of the day it’s all connected, and when they do that they have even richer data sets to work with,” he says. “Graphs now enable the initial use case that brought in a few different sets of data sources, and now that expands and that expansion is where people bring new types of data into the same database, which results in a new class of apps.”

By serving as a way to tie in a lot of different enterprise data sources that otherwise would be in silos, graph databases are getting recognition by enterprises for being a fast and effective way to modernize their infrastructure, he says.

Bhatt points to Neo4j customer Adidas as an example of a company that initially wanted to connect different and disconnected data – product, market, social media, brand content, and other source information – to provide relevant data and product recommendations to web site visitors. In 2015 it worked with Neo4j to build a Shared Metadata Service based on the vendor’s graph database, leveraging a data model that connected multiple data domains, from product specifications to contracted athletes, to help build a more personalized online shopping experience. Today, Adidas continues to connect additional sources and clients, adding new entities and relationships as needed without having to implement schema changes, to improve its ability to deliver the right content to online shoppers at the right time.
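The idea of adding new entities and relationships without schema changes can be sketched with a toy in-memory property graph. This is a minimal sketch under stated assumptions, not Adidas’s actual data model or Neo4j’s API: the `PropertyGraph` class, the labels, and the relationship types are all hypothetical.

```python
class PropertyGraph:
    """A toy property graph: nodes carry a label and free-form properties."""

    def __init__(self):
        self.nodes = {}  # node id -> {"label": str, "props": dict}
        self.edges = []  # (source id, relationship type, target id)

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, "props": props}

    def add_edge(self, source, rel_type, target):
        self.edges.append((source, rel_type, target))

    def related(self, source, rel_type):
        """Return the targets reachable from `source` via `rel_type`."""
        return [t for s, r, t in self.edges if s == source and r == rel_type]


graph = PropertyGraph()

# Initial domains (hypothetical): products and brand content.
graph.add_node("shoe-1", "Product", name="Running Shoe")
graph.add_node("article-1", "Content", title="Training tips")
graph.add_edge("article-1", "MENTIONS", "shoe-1")

# Later, a new entity type and a new relationship type are added.
# Nothing is declared up front and no existing data is migrated.
graph.add_node("athlete-1", "Athlete", sport="marathon")
graph.add_edge("athlete-1", "ENDORSES", "shoe-1")

endorsed = graph.related("athlete-1", "ENDORSES")
labels = sorted({node["label"] for node in graph.nodes.values()})
```

In a relational design, the same change would typically mean creating a new table and a join table before any rows could be inserted.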

“We truly offer a solution that appeals to the time to value argument – one that offers the least amount of business disruption and yet you can modernize applications by getting more value from the data you already own – something you couldn’t do before because it was stored in such a way that made it somewhat unusable as a combined data set,” Bhatt says.

More New Use Cases

Once a first graph project goes out the door, he says, an enterprise builds up experience working with graph data and comes to understand that adding new types of data and relationships is trivial, and things take off quickly. Monsanto, for example, started off with a single Neo4j graph project and now has over 120 applications built off that single instance, he says.

New use cases for Neo4j’s graph database technology are emerging all the time, Bhatt says, and many of them were showcased at GraphConnect San Francisco last fall. They went well beyond real-time recommendations and fraud detection, he says: NASA discussed its knowledge graph, its biggest data repository in use by active missions, for example, while Marriott presented a dynamic pricing engine that offers almost real-time pricing updates, based on a series of factors, across thousands of its properties.

The work Neo4j has done over the last couple of years focusing on scale, performance and enterprise-readiness has a lot to do with all this. For instance, its last release introduced causal clustering – “basically a kind of clustering that is better suited for connected data than for looking at a world of a collection of data,” he says.

With causal clustering, businesses can create a highly distributed cluster of Neo4j instances with a guarantee that what someone writes on one node will never differ from what someone reads on another because of network latency or other issues. “That’s an example of innovating that helps us with scaling,” he says. The company also has focused on security features to help enterprises feel comfortable using its solution with their mission-critical apps.
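The guarantee described above, that a read on one instance reflects an earlier write made on another, can be sketched with a toy replication model. This is an illustrative sketch only and not Neo4j’s implementation: the `Leader` and `Replica` classes are hypothetical, and the bookmark here is assumed to be simply a position in the leader’s write log that the client passes along with its read.

```python
class Leader:
    """Accepts writes and keeps them in an ordered log."""

    def __init__(self):
        self.log = []  # ordered list of (key, value) writes

    def write(self, key, value):
        self.log.append((key, value))
        return len(self.log)  # bookmark: position of this write in the log


class Replica:
    """Serves reads from a copy of the data that may lag behind the leader."""

    def __init__(self, leader):
        self.leader = leader
        self.applied = 0  # number of log entries applied so far
        self.data = {}

    def catch_up(self, upto):
        """Apply log entries until at least `upto` writes are applied."""
        while self.applied < upto:
            key, value = self.leader.log[self.applied]
            self.data[key] = value
            self.applied += 1

    def read(self, key, bookmark=0):
        # A real system would wait for replication to arrive; this toy
        # version pulls the missing entries from the leader directly.
        self.catch_up(bookmark)
        return self.data.get(key)


leader = Leader()
replica = Replica(leader)

bookmark = leader.write("price", 120)

stale = replica.read("price")            # no bookmark: the replica has not caught up
fresh = replica.read("price", bookmark)  # bookmark forces the replica to catch up first
```

Without the bookmark the lagging replica returns nothing for the key; with it, the read reflects the earlier write.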

About the author

Jennifer Zaino is a New York-based freelance writer specializing in business and technology journalism. She has been an executive editor at leading technology publications, including InformationWeek, where she spearheaded an award-winning news section, and Network Computing, where she helped develop online content strategies including review exclusives and analyst reports. Her freelance credentials include being a regular contributor of original content to The Semantic Web Blog; acting as a contributing writer to RFID Journal; and serving as executive editor at the Smart Architect Smart Enterprise Exchange group. Her work also has appeared in publications and on web sites including EdTech (K-12 and Higher Ed), Ingram Micro Channel Advisor, The CMO Site, and Federal Computer Week.

  • Thierry Caminel

    The difference between a graph database (like Neo4j) and an RDF triplestore is far more than a matter of indexing. Triplestores can also benefit from a standardized query language (SPARQL) and modeling languages (OWL), numerous interlinked datasets (Linked Open Data) and domain models (like the Financial Industry Business Ontology), inference and reasoning capabilities, …. Index-free adjacency definitely has some advantages, but they have to be balanced against other factors.
