Earlier this month 16 out of 42 papers were accepted for the upcoming Linked Data on the Web (LDOW) 2012 Workshop in Lyon, France in April.
What might be discerned from the tenor of the submissions is something of a shift in focus in the Linked Data space, according to workshop chair Dr. Michael Hausenblas, Linked Data Research Centre, DERI, NUI Galway, Ireland. Other organizing committee members include Tim Berners-Lee, Christian Bizer and Tom Heath. “In 2008 to 2010 it was more like we were establishing the field, getting people to talk about what they do in terms of publishing and best practice around Linked Data, Open Linked Data and Linked Enterprise Data,” says Hausenblas. Now, with the web of Linked Data having grown to about 32 billion RDF triples last year, “we’re moving more towards the consumption – publishing is a necessary precondition but not an end in itself.”
The opportunity now is to show that Linked Data makes it possible to do things more easily, more cost-effectively, or in ways that competing technologies cannot match. Submitted and accepted papers address issues ranging from mobilizing Linked Data access to assessing and automating its quality, including what happens when links break. “The web consumer is human and sees when a link is broken,” Hausenblas says. “But software that consumes Linked Data can’t deal easily with a 404.”
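To make the 404 point concrete, here is a minimal Python sketch of how a Linked Data consumer might probe the URIs it depends on before trusting them. It is an illustration only, not a method from any of the workshop papers: the function names `http_status` and `find_broken_links` are hypothetical, and the choice of a HEAD request with an RDF `Accept` header is an assumption about how a given server behaves.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, Optional


def http_status(uri: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status of a HEAD request, or None if the host is unreachable."""
    request = urllib.request.Request(
        uri,
        method="HEAD",
        # Assumption: the server honours content negotiation for RDF formats.
        headers={"Accept": "text/turtle, application/rdf+xml"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive as exceptions; keep the status code.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failures, refused connections, timeouts and similar.
        return None


def find_broken_links(
    uris: Iterable[str],
    fetch: Callable[[str], Optional[int]] = http_status,
) -> Dict[str, str]:
    """Map each URI that cannot be dereferenced to a short reason."""
    broken: Dict[str, str] = {}
    for uri in uris:
        status = fetch(uri)
        if status is None:
            broken[uri] = "unreachable"
        elif status >= 400:
            broken[uri] = f"HTTP {status}"
    return broken
```

The `fetch` parameter is injectable so the check can be exercised without network access, for instance by passing a lookup over canned status codes.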
In terms of making Linked Data more usable, he points to one paper that addresses discovery and integration issues using the Vocabulary of Interlinked Datasets (VoID), an RDF-based schema for describing linked datasets. Those who leverage the de facto standard can optimize processes. For example, a federated query can be restricted to the data sources whose VoID descriptions show that they actually contribute to the answer, Hausenblas notes. Some 30 percent of the Linked Open Data cloud conforms to this, and more and more tools can generate it automatically to a certain extent, he adds.
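The source-selection idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the approach of the paper mentioned: the `VoidDescription` class and `select_sources` function are hypothetical, and each dataset's VoID description is assumed to have been reduced beforehand to its SPARQL endpoint (`void:sparqlEndpoint`) and the set of properties it declares (for example through `void:propertyPartition`).

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class VoidDescription:
    """Hypothetical in-memory summary of one dataset's VoID description."""
    dataset: str
    sparql_endpoint: str          # taken from void:sparqlEndpoint
    properties: FrozenSet[str]    # properties the dataset says it uses


def select_sources(
    query_predicates: Iterable[str],
    descriptions: Iterable[VoidDescription],
) -> List[str]:
    """Return endpoints whose description mentions at least one query predicate."""
    wanted = set(query_predicates)
    return [d.sparql_endpoint for d in descriptions if wanted & d.properties]


FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
DC_TITLE = "http://purl.org/dc/terms/title"

# Example data: both datasets and endpoints are made up for illustration.
descriptions = [
    VoidDescription("people", "http://example.org/people/sparql",
                    frozenset({FOAF_NAME})),
    VoidDescription("books", "http://example.org/books/sparql",
                    frozenset({DC_TITLE})),
]

# A query that only asks for foaf:name needs to contact one endpoint.
endpoints = select_sources([FOAF_NAME], descriptions)
```

A federated query engine would then send sub-queries only to the selected endpoints, skipping sources that cannot contribute to the answer.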
Beyond the workshop, Hausenblas had some other Linked Data thoughts to share. As Rob Gonzalez of Cambridge Semantics has recently discussed here, there’s a case to be made for semantic technologies to have a bigger role in the NoSQL conversation – and Hausenblas offers that in general the NoSQL community can learn a lot from what the Linked Data world has been doing for some years now. “You find this more and more that there is a mutual understanding of systems overlap. We are a NoSQL graph database happy to process and consume Linked Data and RDF.” Go here to read Hausenblas’ article on a number of NoSQL systems with regard to their Linked Data processing capabilities. For example, regarding Google’s BigQuery, Hausenblas himself created bigquery-linkeddata for loading RDF/N-Triples content into Google Storage as well as exposing an endpoint for querying the data in BigQuery's SELECT syntax. Others discussed include the Apache Cassandra project, MongoDB, Apache Hadoop, and Sindice (see our story on that one here).
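As a rough illustration of what moving N-Triples into a tabular store involves, the Python sketch below flattens triples into three-column CSV rows. It does not reproduce how bigquery-linkeddata works; the function names and the column layout are assumptions, and the regular expression covers only simple, well-formed N-Triples lines rather than the full grammar.

```python
import csv
import io
import re
from typing import Iterable, Iterator, Tuple

# Simplified pattern: subject (IRI or blank node), predicate (IRI),
# object (anything up to the final terminating dot).
TRIPLE = re.compile(r"^\s*(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$")


def parse_ntriples(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (subject, predicate, object) for each non-empty, non-comment line."""
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError(f"line {number} is not a simple N-Triples statement")
        yield match.group(1), match.group(2), match.group(3)


def ntriples_to_csv(lines: Iterable[str]) -> str:
    """Serialize triples as CSV with a subject,predicate,object header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["subject", "predicate", "object"])
    writer.writerows(parse_ntriples(lines))
    return buffer.getvalue()
```

Once the data is in this shape, a question such as "all objects for a given predicate" becomes an ordinary SELECT over three columns.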
He also notes that the W3C is working on a Linked Data Platform working group that will take into consideration an IBM proposal for a collection of best practices and a simple approach for Linked Data (see here). In its proposal IBM wrote, “We believe that Linked Data has the potential to solve some important problems that have frustrated the IT industry for many years, or at least to make significant advances in that direction. But this potential will be realized only if we can establish and communicate a much richer body of knowledge about how to exploit these technologies. In some cases, there also are gaps in the Linked Data standards that need to be addressed.”