The U.K. is moving ahead with plans to introduce more transparency and accountability into the public agenda, through efforts such as the data.gov.uk initiative to make public data more easily available. Often, governmental agencies and semi-governmental bodies get on board with the open data movement by exporting information from databases or spreadsheets into CSV files and posting them in that format on their websites.
But much more can be accomplished if they move toward Linked Data: expressing their data in RDF and using dereferenceable URIs to identify the things in those databases and spreadsheets. Ultimately, their information can then be meshed with other Linked Data sets in applications that will, it is hoped, be useful to citizens.
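To make the step from CSV to Linked Data concrete, here is a minimal sketch in Python that turns spreadsheet-style rows into RDF triples in N-Triples syntax. It is an illustration only, not PublishMyData's actual method: the base URI `http://example.org/`, the `id/` and `def/` URI patterns, the function names and the sample data are all assumptions made up for this example.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical base URI (an assumption for this sketch). A real publisher
# would use a domain it controls so each URI can be dereferenced over HTTP.
BASE = "http://example.org/"


def escape_literal(value: str) -> str:
    """Escape a string for use inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text: str, id_column: str, resource_type: str) -> list[str]:
    """Give each CSV row a subject URI and emit one triple per non-empty column."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The identifier column becomes part of the URI that names the thing.
        subject = f"<{BASE}id/{resource_type}/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            # Skip the identifier itself, surplus unnamed fields and empty cells.
            if column is None or column == id_column or not value:
                continue
            predicate = f"<{BASE}def/{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


# Made-up sample data standing in for an exported spreadsheet.
SAMPLE = "code,name,population\nE001,Northtown,12000\nE002,Southville,\n"

for line in csv_to_ntriples(SAMPLE, id_column="code", resource_type="area"):
    print(line)
```

Each row ends up identified by its own URI rather than by its position in a file, which is what lets another data set refer to the same thing. A production conversion would typically also reuse shared vocabularies and typed values instead of plain string literals.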
That, however, represents a technological hurdle for many of these organizations – one that PublishMyData would like to help them through with what it likens to a content management system that’s geared up for Linked Data. Its hosted service will translate these organizations’ information into Linked Data and look after all the infrastructure issues that go along with it, such as managing triple stores.
“In some ways what’s more interesting than the transparency and accountability angles are the more positive benefits in terms of efficiency organizations can get” from embracing Linked Data, says Bill Roberts, founder of Swirrl, the Semantic Web and Linked Data consultancy behind the PublishMyData publishing service. “A lot of times they are the biggest users of their own data, but with information silo issues no one knows what’s there.” Adding it to the Linked Data ecosystem can change that – and open up the door to enhancing the services they provide to citizens over the web, at lower costs. “When there’s good interaction on the web site people don’t have to call up so often or go and ask questions of staff too much,” he says. “So you spend a small amount on doing the online stuff and you can save on ways to help citizens.”
PublishMyData’s business model is to work for the data owners, but Roberts figures it’s in those owners’ best interest to actively help push that data out to the software development community at large, whose members may themselves have a hand in driving agency efficiencies. He mentions FixMyStreet as an early example of how open data, albeit not Linked Data, has made it easier for U.K. residents to report problems on their streets without having to know which local council should handle the issue, thanks to its use of geolocation and other identifying data. “If more government data were available as Linked Data, that kind of application becomes easier to build,” says Roberts.
Smart as those software developers are, both within and outside of government, some challenges remain in getting them to create Linked Data apps that mash up different data sets of interest into a coherent whole that the average end user can take advantage of. Linked Data still hasn’t taken the world by storm, so developers could use help, Roberts thinks, in the form of interfaces that make it easier to consume Linked Data in other apps and tutorials that help them get started. To that end, PublishMyData has just published such a tutorial on how to consume Linked Data as part of a mashed-up map app. “Our objective is to be the most developer-friendly of Linked Data providers,” says Roberts. “Linked data only is useful if people can find it, understand it and use it.”
Others, such as Structured Dynamics CEO Mike Bergman in this article, have commented on how Linked Data has been held back by issues including poor quality and shallow context. Roberts concurs there are some issues. “We do worry about how do we ensure good quality and how do you make convenient access points for people, and do that cost efficiently,” he says. Indeed, Linked Data’s underlying technology is a few decades of development behind relational databases in terms of robustness, speed and ease of deployment.
Another issue revolves around the complications of distributed data publishing. “There are technology issues about how much you keep local copies of things for convenient access vs. keeping up to date with the original data source, and we’re tackling that in a small way,” he says. “Some of the problem is solved by looking first at data sets that don’t change much—data that is valuable but static.” A little more difficult is using APIs to feed in active updates from original data sources to Linked Data data sets, and PublishMyData is doing some experimentation there.
The service is currently in beta. Roberts says it’s being trialed with a few different groups, and he hopes to have some of their live data sets represented on the site within the next few months.