Six months ago, Ontodia’s NYCFacets walked away with the win at New York City’s BigApps 3.0 conference. In the months since, the Smart Open Data Exchange that catalogs all the NYC-related data sources (which we first covered here) has been busy expanding its team, moving into the NYU-Poly hosted incubator, and getting ready to launch its Smart City platform for general use next year.
A preview of that platform will take place at the upcoming Semantic Technology & Business Conference in NYC. “We are going to our original mission of really creating that data exchange using semantic technology,” says Ontodia co-founder Joel Natividad. The company is putting the focus not on raw data or on learning new technologies, but on being a linked answers marketplace – converting raw data to answers rather than just linking raw data.
NYCFacets stores more than 1 million metadata facts from nearly 1,000 data sets in the New York City Open Data Catalog. Federating data opens the door to creating an answer space for questions of interest to the community that require multiple sources to answer. Ontodia is creating a Q&A interface so that users can pose their questions in plain English, with certain parameters. As one example, Natividad notes, users might ask for a list of all 311 noise complaints in their neighborhood since the spring and also for the number of street fairs held during that time, which would draw on the NYC 311 data set and on a list of street fairs and similar events. “That requires us to federate those things and maybe there’s a correlation. Maybe noise complaints go up when there is a street fair,” he says.
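To make the street fair example concrete, here is a minimal Python sketch of that kind of federation. It is not Ontodia's implementation: the records, field names, neighborhoods, and the `federated_answer` function are all hypothetical, and the real 311 and street fair data sets will have different schemas.

```python
from datetime import date

# Hypothetical sample records, invented for illustration only.
NOISE_COMPLAINTS = [
    {"neighborhood": "Chelsea", "date": date(2012, 4, 14)},
    {"neighborhood": "Chelsea", "date": date(2012, 4, 14)},
    {"neighborhood": "Chelsea", "date": date(2012, 4, 20)},
    {"neighborhood": "Harlem", "date": date(2012, 4, 14)},
    {"neighborhood": "Chelsea", "date": date(2012, 2, 1)},
]
STREET_FAIRS = [
    {"neighborhood": "Chelsea", "date": date(2012, 4, 14)},
    {"neighborhood": "Harlem", "date": date(2012, 5, 5)},
]


def federated_answer(neighborhood, since):
    """Combine two sources into one answer for a neighborhood and start date."""
    complaints = [
        c for c in NOISE_COMPLAINTS
        if c["neighborhood"] == neighborhood and c["date"] >= since
    ]
    # Distinct days on which a street fair took place in the neighborhood.
    fair_days = {
        f["date"] for f in STREET_FAIRS
        if f["neighborhood"] == neighborhood and f["date"] >= since
    }
    on_fair_days = sum(1 for c in complaints if c["date"] in fair_days)
    return {
        "noise_complaints": len(complaints),
        "street_fair_days": len(fair_days),
        "complaints_on_fair_days": on_fair_days,
    }


answer = federated_answer("Chelsea", date(2012, 3, 20))
print(answer)
```

Neither source alone can say how many complaints fell on fair days. That figure only appears once the two are joined on neighborhood and date, which is the point of federating them.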
Most data marketplaces today take the tack of asking people to download raw data and do the work themselves. For its alternative serve-up-the-answers approach, Ontodia is mapping all the city data sets to a city ontology it is creating, aligned with Schema.org. “The benefit of that is we basically outsource the whole description thing. …We just stand on the shoulders of giants and use that….We will expand it as need be but using that as our starting point.” Schema.org also will support RDFa Lite and, says Natividad, “because we are Semantic Web geeks, since schema.org does RDFa Lite, we will use that to mark up our pages.”
The Ontodia team will process inquiries and create the SPARQL queries to be answered by its system, concentrating first on what the community, through crowd-knowing feedback, thinks are the highest-value queries to answer.
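As a rough illustration of what turning an inquiry into a SPARQL query could look like, the Python sketch below fills a query template from two parameters. Everything in it is an assumption: the `ex:` vocabulary, the `build_noise_query` helper, and the shape of the data are invented here, and Ontodia's actual ontology and queries are not described in this article.

```python
from datetime import date


def build_noise_query(neighborhood, since_iso):
    """Build a SPARQL SELECT for noise complaints (hypothetical vocabulary)."""
    # Validate the date so only a well-formed xsd:date literal is emitted.
    since = date.fromisoformat(since_iso).isoformat()
    # Escape backslashes and double quotes inside the string literal.
    safe = neighborhood.replace("\\", "\\\\").replace('"', '\\"')
    return f"""PREFIX ex: <http://example.org/city#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?complaint ?date WHERE {{
  ?complaint a ex:NoiseComplaint ;
             ex:neighborhood "{safe}" ;
             ex:date ?date .
  FILTER (?date >= "{since}"^^xsd:date)
}}
ORDER BY ?date"""


query = build_noise_query("Chelsea", "2012-03-20")
print(query)
```

The resulting string could be sent to any SPARQL endpoint that holds data in this made-up vocabulary. The sketch covers only the templating step, not the plain-English parsing that would come before it.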
In addition to providing current, point-in-time answers, Ontodia also looks forward to providing a time series of how that answer changes over time. “Right now a lot of data the city exposes is only current data,” he says. “As we suck it down to answer these things this will not only archive historical raw data but also the answers we create so you can see patterns over time, and those answers themselves become data assets against which you can ask higher level things.” The answers, apart from being viewable on the web site, will have associated answer identifiers, which individuals can use downstream via Ontodia’s API.
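The idea of archived answers keyed by identifier can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Ontodia's API: the `AnswerArchive` class, the identifier `noise-complaints-chelsea`, and the counts are all invented for illustration.

```python
from datetime import date


class AnswerArchive:
    """Keeps every recorded value of an answer, keyed by an answer identifier."""

    def __init__(self):
        self._series = {}  # answer_id -> list of (as_of_date, value)

    def record(self, answer_id, as_of, value):
        self._series.setdefault(answer_id, []).append((as_of, value))

    def history(self, answer_id):
        # Return the time series sorted by date, oldest first.
        return sorted(self._series.get(answer_id, []), key=lambda p: p[0])

    def latest(self, answer_id):
        series = self.history(answer_id)
        return series[-1][1] if series else None


archive = AnswerArchive()
# Hypothetical monthly counts, recorded out of order on purpose.
archive.record("noise-complaints-chelsea", date(2012, 5, 1), 57)
archive.record("noise-complaints-chelsea", date(2012, 4, 1), 42)

history = archive.history("noise-complaints-chelsea")
latest = archive.latest("noise-complaints-chelsea")
print(history, latest)
```

Because each snapshot is kept rather than overwritten, the history itself becomes a data set that later questions can be asked against, which is the pattern Natividad describes.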
Natividad says this first phase of Ontodia’s efforts will let it demonstrate, and let people see, the power of Linked Data without getting stuck in the weeds of unfamiliar technology. “We want to show the utility of this technology and why only semantic technology can do this at scale.”
The bigger picture is to be the Bloomberg of Linked Data. “There is a confluence of events happening. Big Data and Open Data phenomenon, and data science phenomenon, and the common thing is data. We want to basically be the utility in the middle where they get high quality, high resolution Linked Data,” he says, noting that a lot of data scientists in both the public and private sector spend a lot of time massaging data. “Part of the mission is in semantic terms to be the high-resolution DBpedia of Smart Cities.” New York is just the first in what should be an expanding slate of cities.
Revelytix was Ontodia’s partner in a lot of its work, Natividad notes. “A lot of their technology accelerated what we are doing,” he says. There’s still building going on, and standing up the community is the first order of business. But Ontodia does have some revenue models in mind. “We will expose all answers we have to the general public but for hi-res, near real-time or historical data, or all 1 million results instead of just the top 100, then we can see users having some kind of arrangement with us,” he says. “We’ll expose to the public as a way of advertising the value of what we enable and hopefully drive adoption.”
You can register for SemTech in NY here.