The day may come when you no longer need a team of developers to write data-driven or data-aware apps that can themselves be described in just a few words. Ideally, that would let companies spend far less money on, and speed up, a long-winded process that runs from understanding requirements, through discovering data sources and normalizing results, to coordinating data across front-end and back-end teams.
That day isn’t here yet, but SemantiNet is trying to move things a step closer to that point. This month the company introduced an open-ended alpha API whose centerpiece is the idea of the data flow graph.
Its purpose is to enable easy querying of a collection of Web Services, Wikipedia, Linked Data and the unstructured web, culminating in “one-liner” search bar apps, including mashups, built in minutes. Some examples: drawing out from DBpedia objects within a 50-kilometer radius of the Eiffel Tower that are somehow related to Napoleon and displaying the results as video; breaking down the revenue of the world’s major car companies listed in Wikipedia, providing a pie chart of that data, and mashing into the results the age of the companies in a table, their locations pinpointed on a map, and company snapshots called out in a graph; or finding out which pizza places close to your current location have lunch deals on. For good measure, throw in some tweets and analyze them for sentiment, just to make sure that we’re talking about tasty pizza.
Or check out some of the output at left for the Keith Richards Guitar Gallery app, built by combining three things: the unique DBpedia URI for Keith Richards; a fuzzy matching of the free-form text “instrument” with a predicate of dbpedia:Keith_Richards to get a list of the instruments he played; and a rendering of this list of instruments using a SemantiNet template called videolist.html.
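To make the three steps concrete, here is a minimal Python sketch of the general idea, not SemantiNet's actual API. The `FACTS` dictionary, the function names and the sample values are all hypothetical stand-ins: a real app would fetch the predicates from DBpedia, and a plain text renderer takes the place of the videolist.html template.

```python
from difflib import get_close_matches

# Hypothetical sample data: predicates and values attached to one entity.
# A real app would fetch these from DBpedia for dbpedia:Keith_Richards.
FACTS = {
    "instrument": ["Guitar", "Bass guitar", "Piano"],
    "birthPlace": ["Dartford"],
    "associatedBand": ["The Rolling Stones"],
}


def match_predicate(text, predicates, cutoff=0.6):
    """Return the predicate name closest to the free-form text, or None."""
    lowered = {name.lower(): name for name in predicates}
    hits = get_close_matches(text.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None


def render_list(title, items):
    """Stand-in for a display template: a title line plus one line per item."""
    return "\n".join([title] + [f"- {item}" for item in items])


def gallery(free_text):
    """Resolve the free-form text to a predicate, then render its values."""
    predicate = match_predicate(free_text, FACTS)
    if predicate is None:
        return "No matching predicate"
    return render_list(f"Keith Richards: {predicate}", FACTS[predicate])


if __name__ == "__main__":
    # The plural form still resolves to the "instrument" predicate.
    print(gallery("instruments"))
```

The fuzzy step here uses simple string similarity from the standard library; the matching described in the article may well be semantic rather than character-based.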
“If applications can be described as a data flow graph, what happens if we lend the machine a computer program to automatically write applications that are interesting and valuable,” posits Semantinet VP of R&D Sagie Davidovich.
“If we know what the output of applications should be and what available sources are currently exposed as APIs or databases, and we have a semantic abstraction layer over data sources [that eliminates the need for conversion, normalization and transformation when accessing the data sources], basically the app can write itself autonomously. This is where the exciting part starts.”
Davidovich imagines a world, then, where applications are described in terms of the data flow graph, starting from user input or a real-time feed, and where a reasoning engine creates, executes and manages that data flow. Information is refined and manipulated to increase relevance before being fed to an API. From there, the results are filtered, sorted and aggregated, directed to a second API, augmented by data from another data source, and displayed using a rich library of data-aware UI elements.
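A rough way to picture such a flow is as a small graph of named steps, each fed by the outputs of the steps upstream of it. The Python sketch below illustrates that idea only. The node names, the hard-coded "API" results and the `run_graph` helper are assumptions made for the example and are not part of SemantiNet's platform.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_graph(nodes, edges, inputs):
    """Run each node after its upstream nodes and collect every result.

    nodes:  step name -> function
    edges:  step name -> list of upstream step names (its arguments, in order)
    inputs: values supplied from outside the graph, such as the user's query
    """
    results = dict(inputs)
    for name in TopologicalSorter(edges).static_order():
        if name in results:  # externally supplied input, nothing to compute
            continue
        args = [results[dep] for dep in edges.get(name, [])]
        results[name] = nodes[name](*args)
    return results


def refine(query):
    """Normalize the raw user input before it reaches an API."""
    return query.strip().lower()


def search_places(term):
    """Hypothetical stand-in for a call to a places API."""
    if "pizza" not in term:
        return []
    return [
        {"name": "Slice House", "rating": 4.1, "deal": True},
        {"name": "Luigi's", "rating": 4.6, "deal": True},
        {"name": "Crust & Co", "rating": 4.8, "deal": False},
    ]


def keep_deals(places):
    """Filter to places with a lunch deal, best rated first."""
    with_deals = [p for p in places if p["deal"]]
    return sorted(with_deals, key=lambda p: p["rating"], reverse=True)


def render(places):
    """Stand-in for a data-aware UI element: one display string per place."""
    return [f"{p['name']} ({p['rating']})" for p in places]


NODES = {"refined": refine, "places": search_places,
         "deals": keep_deals, "view": render}
EDGES = {"refined": ["query"], "places": ["refined"],
         "deals": ["places"], "view": ["deals"]}

if __name__ == "__main__":
    print(run_graph(NODES, EDGES, {"query": "  Pizza "})["view"])
```

In this sketch the graph is written by hand. The scenario Davidovich describes goes further, with a reasoning engine choosing and wiring up the steps itself.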
“This is the DNA of the application. You stop talking about codes and classes but instead talk about the data flow graph and the application becomes data-aware,” he says. “When you take a multitude of data sources, and lay semantic abstraction over them, a new paradigm emerges. You can communicate and exchange information on a higher level.”
Part of the secret sauce driving this is the semantic parsing under the technology’s hood for understanding the meaning of web pages, along with a huge ontology based on Wikipedia, Linked Data, Freebase and, as Davidovich puts it, “almost every possible interesting data source we could find.” That knowledge is used to resolve ambiguities over extracted entities. “We have a very unique way of compressing this information in order to create a high-performance graph store for this ontology, and it lets us run very complex and heavy queries with great performance,” he says.
Semantinet sees this as appropriate for any organization that works with or publishes content, on the web or within its own borders. That means pretty much everyone, not only those you might think of as traditional publishers. The company is starting to build a community around the platform and to grow its knowledge from some early use cases, and it expects that community to have widespread appeal. Says Davidovich, “Almost every query you can imagine you can express in one line using this paradigm.”