After a great day yesterday I was eager to discover what today’s program had to offer. Unfortunately I had to set off before the end for the airport, where I am now writing this. However, I caught most of the day, and here are my few thoughts and recollections.
Today’s Keynote was in the form of a panel discussing Semantics in the Automotive Industry with Martin [GoodRelations] Hepp, John Kendall Streit of Tribal DDB, William Greenly of AQKA, and François-Paul Servant from Renault. They discussed their experiences in pioneering the use of Linked Data / Semantic Web technologies and approaches in the automotive domain.
Martin Hepp’s introduction highlights:
Back in 1908, Ford had one car in one option form, the Model T. Now in 2012, a single manufacturer can have 10^25 possible option combinations, of which only 10^20 are actually available because of engineering and legal restrictions. This means that only 1 in 100,000 possible combinations is available: a needle in a haystack problem. The industry needs these options to be communicated up the value chain, to a wide and varied audience, with no communication set up in advance. One answer is to embed small data packets in web pages: RDFa using ontologies such as GoodRelations.
Industries that benefit most from distributing data for consumption this way are those with many options and variations. The automotive industry is high on the list of use cases for this approach, hence this panel.
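The needle-in-a-haystack ratio quoted in the talk is easy to check. The short Python sketch below just redoes that arithmetic; the two figures are the ones reported above, and the variable names are my own.

```python
from fractions import Fraction

# Figures as quoted in the panel introduction.
possible = 10**25   # option combinations one manufacturer could in principle offer
buildable = 10**20  # combinations actually available after engineering/legal limits

# Exact share of possible combinations that can really be built.
share = Fraction(buildable, possible)

# "1 in N" form of the same ratio.
one_in = possible // buildable

print(f"buildable share: {share} (1 in {one_in:,})")
```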
John Kendall Streit: Worked with VW to implement semantic technology. The initial requirement was to improve search; however, the key benefits will be the ones that are now possible but not yet implemented or even thought of, e.g. booking a test drive in the exact model you want, by phone, in minutes.
William Greenly: Has been in Semantic Web for a while, with great involvement in automotive initiatives, but can’t talk about it. However, he can say that there is significant interest and work in the automotive industry.
François-Paul Servant: Renault – many uses internally, now starting to publish data on the web – also talking to devices.
William Greenly: Looking forward to SPARQL 1.1 endpoints to enable querying of model configurations to identify [current] possible options that can be built.
John Kendall Streit: Now that there is a semantic platform in place, applications are evolving: mobile, in showrooms, web sites etc. VW is exposing the data in RDFa, and there could soon be a public SPARQL endpoint.
- Question on coordination for cross-industry standards of description: You can’t match all schemas. The pragmatic approach is to agree initially on a schema/ontology for generic features such as fuel consumption, and to use each manufacturer’s own for their detailed variations such as models/derivatives/wiper blades etc. Eventually others may well start to map between the different manufacturers’ ontologies; there are not that many.
- Would this information be available for used cars? If the data is public, of course, and this is already being done.
My takeaway from the panel: Here is an industry that pragmatically picked up a technology that can not only solve its problems but can also enable it to take innovative steps, not only for individual company competitive advantage but also to move the industry forward in its dealings with its supply/value chain and customers. However, they are also looking more broadly and openly, for instance at making data publicly available, which will enhance the used car market.
Next up for me was a session entitled Corporate Semantic Web – The Semantic Web Meets the Enterprise from Adrian Paschke of Freie Universität Berlin. Corporate Semantic Web can bring together data from all areas, from back office to front office. This wasn’t just a technical presentation of how you might adopt the technology within the enterprise, but Adrian did start off by talking about techniques. These include automatic extraction of existing semantic and non-semantic data into Linked Data, with ontology learning. Next came semantic enhancement, either manually with semantic text editing or automatically with text mining, leading to the reuse of knowledge (faster search, faster knowledge transfer, efficiencies etc.). You can then build semantic archives and enrich data from heterogeneous corporate, and external, sources.
This foundation then enables applications for corporate semantic engineering, semantic corporate search and collaboration. Adrian moved on to discuss some of the issues around encouraging in-house adoption. The enterprise will almost certainly have established practices, processes and infrastructure that can act as a barrier to such adoption. Integration into this environment is not only a technical challenge: decision makers at all levels need to be aware of, and therefore convinced of, the costs and benefits of introducing Semantic Enterprise technologies. The corporate semantic web is not that different from the open semantic web, except that you may have much more data.
After a very necessary, on this second day, coffee infusion I moved on to Ontology Modeling – Relationships Matter from Frank Coyle. Relationships matter: in entity relationship diagrams it is the entities that matter, and the lines between them are little more than connectors. In object orientation it is the objects. In the Semantic Web it is the relationships that matter. In what might have been expected to be a dry presentation, Frank gave us a great insight into how, using the predicate [property] element of the triple with the help of OWL & RDFS, we can deduce relationships between entities. This was all made more interesting by his topical use of Frederick II (2012 is the 300th anniversary of his birth), and his relationships with his parents, children, where his body is entombed, and who he corresponded with.
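To make the idea of deducing relationships from predicates a little more concrete, here is a minimal Python sketch. It is not Frank’s example and it does not use a real reasoner: it hand-rolls three rules in the spirit of rdfs:subPropertyOf, owl:inverseOf and owl:SymmetricProperty. The property names (hasFather, hasParent, hasChild, correspondedWith) and the decision to treat correspondence as symmetric are my own assumptions for illustration.

```python
# Asserted facts as (subject, predicate, object) triples.
# Identifiers are illustrative, not taken from any published ontology.
facts = {
    ("FrederickII", "hasFather", "FrederickWilliamI"),
    ("FrederickII", "hasMother", "SophiaDorothea"),
    ("FrederickII", "correspondedWith", "Voltaire"),
}

# Schema-level statements about the predicates themselves (assumed).
sub_property_of = {"hasFather": "hasParent", "hasMother": "hasParent"}
inverse_of = {"hasParent": "hasChild", "hasChild": "hasParent"}
symmetric = {"correspondedWith"}


def infer(triples):
    """Apply the three rules repeatedly until no new triples appear."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p in sub_property_of:   # a father is also a parent
                new.add((s, sub_property_of[p], o))
            if p in inverse_of:        # parent of X means X is a child
                new.add((o, inverse_of[p], s))
            if p in symmetric:         # correspondence runs both ways
                new.add((o, p, s))
        if new <= inferred:            # fixed point reached
            return inferred
        inferred |= new


closure = infer(facts)
for triple in sorted(closure - facts):
    print(triple)
```

None of the printed triples were stated directly; they all follow from what was said about the predicates, which is the sense in which the relationships carry the knowledge.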
After a good lunch came a late, but eagerly awaited, entry to the program from Denny Vrandečić: Wikipedia’s Next Big Thing: The Wikidata Project. Wikipedia’s importance to the Web of Data is represented, if by nothing else, by the size of DBpedia and the links surrounding it on the Linked Open Data Cloud diagram. So any announcement from Wikipedia with the word data in it was obviously greeted with anticipation. The headline pitch from Denny was that Wikidata will provide an infrastructure and stable URLs to store and access data for use in Wikipedia articles, as well as for any other use. Of course that raised as many questions as it answered. Denny, who announced that he will be working for the Wikimedia Foundation in Berlin from March, took us through an explanation of the successes of Wikipedia but also highlighted issues such as information not being replicated across languages, and the duplication of lists and lists of lists. Wikidata is an initiative to collect together the data elements of the world’s information so that it can eventually be used to automatically create those lists and fill those info-boxes in any language’s version of Wikipedia.
Turning the DBpedia model, which extracts information from info-boxes and publishes it as data, on its head, Wikidata will provide an editable environment for data that will then populate the info-box. Denny listed the objectives of the Wikidata project to be:
- Provide a database of the world’s knowledge that anyone can edit
- Collect references and quotes for millions of data items
- Engage a sustainable community that collects data from everywhere in a machine-readable way
- Increase the quality and lower the maintenance costs of Wikipedia and related projects
- Deliver software and community best practices enabling others to engage in projects of data collection and provisioning
Wikidata Phase 1 should be complete in the summer. It includes creating one Wikidata page for each Wikipedia entity, which lists its representations in each language. Those individual language versions will then pull the language links from Wikidata.
The beginnings of an interesting project from Wikimedia that could radically influence the data landscape. As this was such an interesting development I have posted a more in-depth report on the session.
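As a rough picture of what Phase 1 means in practice, the Python sketch below models one central record per entity and has each language edition derive its links from it. This is only my reading of the description in the talk: the record layout, the identifier item-berlin and the function name are hypothetical and say nothing about how the project will actually store or serve the data.

```python
# One central record per entity, listing its article title in each language.
# Structure and identifier are hypothetical, for illustration only.
central_store = {
    "item-berlin": {"de": "Berlin", "en": "Berlin", "fr": "Berlin"},
}


def language_links(item_id, own_language):
    """Links a given language edition would show to the other editions."""
    titles = central_store[item_id]
    return {lang: title for lang, title in titles.items() if lang != own_language}


before = language_links("item-berlin", "en")

# A new language version is recorded once, centrally ...
central_store["item-berlin"]["it"] = "Berlino"

# ... and every other edition picks it up without being edited itself.
after = language_links("item-berlin", "en")
print(before)
print(after)
```

The point of the design is that the links are maintained in one place rather than copied into every language version of every article.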
The last session I had the chance to visit, before having to head off to the airport for a joyous EasyJet flight back to the UK, was close to my heart, as I come from a library background. Felix Ostrowski took us through an overview of Current LOD Trends and Developments in the German Library Ecosystem. Having been associated with similar initiatives with the British Library, I was interested to understand what synergies there might be. Felix’s presentation was a wide-ranging history, not just concentrating on technology and data but also covering advancements in library policy and legal initiatives, ever present in the rich metadata world of libraries.
A great couple of days in Berlin, which continued to confirm my opinion that this Semantic and Linked Data stuff is becoming more and more relevant and useful to the business community who can make it real. It is events such as this one that are starting to establish the connections between those of us who understand and are enthusiastic and those who can benefit and take it forward. Here’s looking forward to the next Semantic Tech and Business Conference: June in San Francisco.
Richard Wallis is Founder of Data Liberate.