As has been our tradition these last few years, The Semantic Web Blog is stepping back to let experts in the field and in related spaces give us their take on the highlights of 2014. Here is Part 1:
Vladimir Alexiev, Ontotext Data and Ontology Management Lead:
Strong interest and commitment by the Cultural Heritage community to LOD (see this story); emergence of a Linguistic LD cloud; much enlarged and improved DBpedia data sets; new European Community research instruments in Horizon 2020 that promote commercial innovation and startups.
Phil Archer, W3C Activity Lead:
Easy. JSON-LD. I know I’m cheating as it was developed a good while ago, but it became a W3C Rec in January 2014 and it is a game changer for the Semantic Web. The publishing industry likes it, schema.org likes it, even developers who come out in a serious rash at the mention of triples like it because they can ignore the LD (Linked Data) part if they want.
It’s also helping the other ongoing growth story, which is schema.org. If JSON-LD isn’t already the format of choice for annotating new pages, it’s becoming so; you can put it all in a block at the top and not have to integrate it within the HTML. That’s a lot easier for some people to work with. And schema.org is expanding its scope too. The work on actions ties in with Activity Streams (and therefore Cards [containers for content]). It’s all part of the ongoing move from desktop to mobile, from Web pages to tasks and services. Since that’s all dependent on links, and those links can be encoded in JSON-LD, the potential for the growth of the Linked Data side of the Semantic Web is substantial.
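To make the "block at the top" idea concrete, here is a minimal Python sketch that builds a schema.org description as JSON-LD and wraps it in the script element that carries it in a page. The helper name `jsonld_script_block` and the sample headline, date, and author are illustrative assumptions, not taken from any real site or library.

```python
import json


def jsonld_script_block(data: dict) -> str:
    """Wrap a JSON-LD document in the <script> element that embeds it in HTML.

    Hypothetical helper for illustration; not part of any library.
    """
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so the payload cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Sample values are made up for the example.
article = {
    "@context": "http://schema.org",
    "@type": "BlogPosting",
    "headline": "Example headline",
    "datePublished": "2014-12-29",
    "author": {"@type": "Person", "name": "Example Author"},
}

block = jsonld_script_block(article)
print(block)
```

The annotation lives in one self-contained block, so the visible HTML of the page does not need to change, which is the convenience described above.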
John Breslin, Insight Centre for Data Analytics, National University of Ireland Galway Senior Lecturer:
Full disclosure: I am an advisor to Aylien. I think the work that it is doing in terms of putting semantically-powered text analytics into the hands of both technical and non-technical users is fantastic. Techies can make use of their API and get data back in the format they need, from a variety of angles (sentiment, categories, concepts, etc.). Non-techies can use the Google Sheets interface and easily analyze both social media and longer content like articles, with the results pushed into a form they are comfortable with and used to using (spreadsheets).
Brandon Burroughs, Clarabridge Product Manager:
The types of feedback loops people are incorporating into their products continue to expand and the quality continues to improve. There is a lot of free information your users generate by using the software/app. People have found, and continue to find, innovative ways to use this.
Siri (allegedly) uses the feedback when you correct it to get a better understanding of your voice and the domain in which you live. Similarly, Apple has started using your iMessage conversations to predict what word you’ll use next. While this isn’t a new concept, what is new is the context in which they derive these best guesses. It is specific to the person you are talking to. The words it predicts for your coworker aren’t going to be the same as the words it predicts for your college roommate.
Jeff Catlin, Lexalytics CEO/Founder:
2014 should definitely be called the year of Matrix Math, or perhaps in a more user-friendly description… “The year big data got some big smarts.” The huge volume of content out there has allowed more companies to build what might be termed a poor man’s Watson, where systems have begun to understand how to categorize content, measure sentiment and even determine purchase intention without heavy user configuration and tuning.
2014 also saw the continued shift from on-premises licensed software to cloud-based text services. This transformation was already under way, but as 2014 draws to a close it’s worth noting that most deals are now signed on cloud services because of the ease of deployment, ability to scale and faster rollout of new features.
I was pleased to see the major developments around schema.org that happened over the year, proof that the search engines are committed to making the web a more semantically-rich space. I’m thinking in particular of the addition of Actions, Role and most recently Breadcrumbs (a very common type relevant for pretty much any site). Worth noting also is the addition of examples in RDFa and JSON-LD, formats which had been supported by schema.org since 2013, but were lacking concrete examples up until May.
Finally I was also pleased to see the keynote from [Google’s] R.V. Guha and the presentations from Dan Brickley at SemTech 2014, which both featured Drupal among other platforms which are adopting schema.org.
Bob Ducharme, TopQuadrant Director of Digital Media Solutions and author of O’Reilly’s Learning SPARQL:
As RDF-based tools have become more popular for data integration work, more people have noticed the need for a standardized way to express and maintain validation constraints as a means toward ensuring greater data quality, so we were happy to see the W3C charter the Data Shapes Working Group this year. It looks like the result of this Working Group, which we’re participating in, will be simpler and easier to use than OWL while covering more commonly-needed data constraint use cases than OWL can: for example, that an activity’s start date must be before its end date.
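As a rough illustration of the kind of constraint mentioned here, the following Python sketch checks that an activity's start date precedes its end date. It is plain Python over a dictionary, not the Working Group's constraint language; the function name `check_start_before_end` and the `startDate`/`endDate` keys are assumptions made for the example.

```python
from datetime import date


def check_start_before_end(activity: dict) -> list[str]:
    """Return a list of violation messages; an empty list means the record is valid.

    Hypothetical validator; the field names are assumed for illustration.
    """
    errors = []
    start = activity.get("startDate")
    end = activity.get("endDate")
    if start is None or end is None:
        errors.append("missing startDate or endDate")
    elif not start < end:
        errors.append(f"startDate {start} is not before endDate {end}")
    return errors


valid = {"startDate": date(2014, 3, 1), "endDate": date(2014, 3, 5)}
invalid = {"startDate": date(2014, 3, 5), "endDate": date(2014, 3, 1)}

print(check_start_before_end(valid))
print(check_start_before_end(invalid))
```

The check compares two values on the same record, which is the sort of cross-property rule that a declarative, standardized constraint language would let data publishers state once and share across tools.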
I’d characterize the most important 2014 developments as relating to awareness — for instance, of the power of deep learning for text (and image) modeling and information extraction but also of the value, for essential text-understanding functions such as disambiguation and classification, of long-established assets that include taxonomies and syntactic and semantic networks, built through hybrid human-machine approaches.
Cognitive computing? There’s something there, or rather some things, plural. IBM and a number of smaller innovators are trying to define a new market based on a set of advanced learning, inference, and interaction technologies. We’ll see if the term catches on.
Join us tomorrow for further insights in Part 2 of this year in review.