by Angela Guess
Philip Howard has written an article for IT Analysis regarding graph databases and NoSQL databases. He explains, “Strictly speaking, a graph database is a NoSQL database but this is a case where strictly speaking is not very useful. There are two things that tend to typify NoSQL databases in people’s minds: the first being that Hadoop and its allies are optimised to run on low cost clusters of commodity hardware and the second is that it uses MapReduce to parallelise processing across this cluster. This works because these NoSQL databases are effectively doing either statistical analysis or search and there is only a limited shipment of data across the network. This isn’t the case with graph databases, especially where you are looking for patterns of relationships for analytic purposes.”
Howard continues, “The point to understand about graph databases, especially when it comes to analytics, is that the more nodes you have in your graph then the richer the environment becomes and the more information you can get out of it. How much more is a matter for debate: Metcalfe’s Law (which is actually no more than a hypothesis) suggests that growth in value of a network is approximately proportional to the square of the number of nodes (actually n × (n − 1)). However, this has been disputed, not least because some connections (relationships) between nodes are more valuable than others. Other researchers have suggested that n log(n) would be a more appropriate figure. The answer is probably somewhere in between but there seems no doubt that the more information you can collect then the more value you can extract. So, at least for analytics, graph-based data is a big data problem.”
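
To make the gap between the two estimates Howard mentions concrete, here is a minimal Python sketch that tabulates both for a few graph sizes. The function names and the sample sizes are illustrative assumptions, not anything from Howard's article, and the quote does not specify a logarithm base, so the natural log is assumed here. Both figures are unitless proxies for "value", not measurements.

```python
import math


def metcalfe_value(n: int) -> int:
    """Metcalfe-style proxy from the quote: n x (n - 1), the count of ordered node pairs."""
    return n * (n - 1)


def nlogn_value(n: int) -> float:
    """Alternative proxy from the quote: n * log(n). Natural log is an assumption."""
    return n * math.log(n) if n > 0 else 0.0


def compare(sizes):
    """Return (n, n(n-1), n*log(n)) for each graph size in `sizes`."""
    return [(n, metcalfe_value(n), nlogn_value(n)) for n in sizes]


if __name__ == "__main__":
    # Hypothetical node counts, chosen only to show how fast the estimates diverge.
    for n, quadratic, loglinear in compare([10, 100, 1_000, 10_000]):
        print(f"{n:>6} nodes  n(n-1) = {quadratic:>12,}  n*log(n) = {loglinear:>10,.0f}")
```

Whichever estimate is closer to reality, both grow faster than the node count itself, which is the point behind treating graph analytics as a big data problem.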