Are you hearing the term “Semantic Web” as often as you may have in the past? There’s no denying the importance of the technologies, standards, concepts, and collaborations that define the Semantic Web proper and all that is affiliated with it or grown out of it. These range from a dependency on RDF/OWL triple stores for the Financial Industry Business Ontology (FIBO) that harmonizes data across repositories as a common language for the financial services industry, to the use of schema.org by websites to mark up their pages to power richer search results.
But if anything, the terms “Semantic Web” or “Semantic Web technologies” are receiving less attention, points out Amit Sheth, educator, researcher, and entrepreneur whose roles include being the executive director of Kno.e.sis—the Ohio Center of Excellence in Knowledge-enabled Computing.
As we head into 2017, DATAVERSITY® wanted to follow up on the state of the Semantic Web and Semantic technologies (both standards-body related and not). In addition to Sheth, Michael Bergman, co-founder of knowledge-based Artificial Intelligence startup Cognonto (see our recent article here) and CEO of Structured Dynamics, and David Wood, CTO of 3 Round Stones, Director of Technology at Ephox + TinyMCE and author of books including Linking Enterprise Data, also participated.
A Look Back into the Past
Looking back on the past year, interviewees broached the question of whether the Semantic Web and some related technologies, like Linked Data, were diminishing in stature, but only as ends in and of themselves.
Sheth notes that his expectations have not changed over the last few years regarding slow progress in broader adoption of Semantic Web standards and the technical challenges that hinder Linked Data usage:
“One key challenge that continues to hinder more rapid adoption of Semantic Web and Linked Data is the lack of robust yet very easy-to-use tools when dealing with large and diverse data, [tools] that can do what tools like Weka did for Machine Learning,” he says.
On top of that, simply adding more Linked Data to the Linked Data Cloud doesn’t necessarily produce advantages.
“Indifferent quality, limited interlinking, and limited expressiveness of mappings between related data hinder broader adoption—while a few datasets that are extracted from actively maintained repositories (e.g., DBpedia from Wikipedia) and highly curated data continue to have the lion’s share of applications,” he says.
In fact, Sheth singles out Cognonto – which is integrating six large public knowledge bases (including DBpedia and Wikidata) to benefit Machine Learning applications – as a notable highlight. As it happens, Sheth and Cognonto co-founder Bergman share similar thoughts about the state of the Semantic Web as a goal in and of itself. “If you were to look at Google Trends, ‘Semantic Web’ has been dropping like a rock in terms of popularity, and I think the same is true of ‘Linked Data,’” Bergman says. Linked Data, unfortunately, places too much burden on publishers for staging information, he explains, and at the consumer level, “there’s just no ability to distinguish the quality of mapping and quality of data that’s out there.” That’s not to say that the technology is the problem or even that the idea of linking data is the problem, he says, “but it needs to be done in a vetted way.”
That said, both Sheth and Wood make a case that we not forget just how mainstream Semantic technologies and techniques – not necessarily using Semantic Web standards – have become, at least for certain requirements. “The world has seen faster adoption of Semantic techniques, especially involving the building and uses of knowledge graphs,” Sheth says. Adds Wood, “Google, Facebook, Yandex, Baidu, Bing have all embedded Semantic technologies into their core businesses. The amount of Web content that uses microdata/RDFa/schema.org information to enhance search results reached double digits.”
Wood also directs attention to IBM’s launch of Watson on the Web via AlchemyLanguage, which returns concept information as DBpedia/yago/freemix URIs, including a Linked Data API. “These are rapidly finding their ways into consumer products,” he says. “The average developer barely realizes it, but semantics are now just about everywhere.”
Semantic Web Technologies Reach into the Future
It seems that Semantic technologies and techniques, then, are by no means irrelevant to a very exciting future – one, in fact, that’s already underway.
Our participants’ responses bear that out. Bergman believes that the transition taking place requires Semantic technologies as an essential enabler, albeit one that is not sufficient on its own. “The market interest and justification is in solving problems, and the Semantic Web is a part of that, contributing to what is being done in knowledge applications,” he says, including Cognonto’s own. The role of the Semantic Web and Semantic technologies is essential, if more subsidiary and contributory, to the development of true knowledge-based Artificial Intelligence, where what things mean and how people understand what they mean is critical, he says.
“If you go back to where the Semantic Web was placed in the pantheon of computer science-type topics decades ago, it always was kind of seen as a bit of a branch of AI, and I think we are now seeing that being affirmed,” Bergman says.
Sheth concurs. Artificial Intelligence is a far bigger field with many more followers and practitioners who now recognize that background knowledge (both broad-based and domain-specific) is increasingly key to further improving Machine Learning and NLP, he says. “In other words, AI, with its much larger footprint in research and practice has realized that knowledge will propel machine understanding of (diverse) content,” Sheth says. “In this context, the core value proposition of the Semantic Web is being co-opted by or swallowed by the bigger area of AI.”
Sheth adds that he believes there will be increasing emphasis on developing knowledge graphs and using them for “top-down” (emphasizing the use of models or background knowledge) or “middle-out” processing in conjunction with “bottom-up” data processing (that is, learning from data). He expects the future to see more and more companies investing in developing their own knowledge graphs as an investment in intellectual property. “A good example is the Google Knowledge Graph, which has grown from a fairly modest size based on Freebase to one that is much larger,” he says.
By extracting the right subset of a bigger knowledge graph for a particular purpose, he expects there will be more progress in this direction, which to date has required significant human involvement and costs. Even so, “the pace of progress will be moderate…one reason is the lack of skilled personnel in Semantic Web and knowledge-enhanced computing topics.” But tools and applications ranging from search to chatbots to knowledge discovery “are waiting to exploit such purpose-built knowledge graphs by using them in conjunction with Machine Learning and NLP,” Sheth says.
Also on his list of things to come: deeper and broader information extraction from a wider variety of textual as well as multimodal content that exploits Semantics, especially knowledge-enhanced Machine Learning and NLP. And, “in addition to extracting entities, relationships of limited types, and sentiments and emotion, we will develop deeper understanding through more types of subjectivity and semantic or domain-specific extraction.” An example of the latter would be for clinical text, revolving around identifying more phenotype-specific relationships, intentions for clinicians, or consumer search for health content, severity of disease, and so on.
And if you haven’t heard of singleton property yet, it’s a good time to get familiar with the approach for making statements about statements in RDF without the use of RDF reification: “I expect further progress in the near term, followed by broader adoption,” Sheth says. (You can see the full text of Sheth’s responses to our queries here.)
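For readers who have not met the idea, the following is a minimal sketch of how a singleton property can carry a statement about a statement. It uses plain Python tuples in place of a real triple store, and every name in it (the `ex:` identifiers, the `ex:source` metadata property, the `#1` suffix scheme, the helper functions) is a hypothetical placeholder chosen for illustration. The linking property `rdf:singletonPropertyOf` is the name used in the singleton property proposal; it is not part of the standard RDF vocabulary.

```python
# Illustrative sketch only: plain Python tuples stand in for RDF triples.
# All identifiers below are hypothetical placeholders, not real vocabulary terms,
# except rdf:singletonPropertyOf, which is the name used in the proposal.
from itertools import count

SINGLETON_OF = "rdf:singletonPropertyOf"

_counter = count(1)


def assert_with_metadata(graph, subject, predicate, obj, metadata):
    """Add one statement plus metadata about that statement, without reification."""
    # Mint a property that is unique to this single statement, e.g. "ex:coFounded#1".
    singleton = f"{predicate}#{next(_counter)}"
    graph.add((subject, singleton, obj))             # the statement itself
    graph.add((singleton, SINGLETON_OF, predicate))  # ties it back to the generic property
    for meta_predicate, meta_value in metadata.items():
        # Statements about the statement hang off the singleton property.
        graph.add((singleton, meta_predicate, meta_value))
    return singleton


def statements(graph, predicate):
    """Yield (subject, object, singleton) for every statement made with `predicate`."""
    singletons = {s for (s, p, o) in graph if p == SINGLETON_OF and o == predicate}
    for s, p, o in graph:
        if p in singletons:
            yield s, o, p


graph = set()
sp = assert_with_metadata(
    graph,
    "ex:Bergman", "ex:coFounded", "ex:Cognonto",
    {"ex:source": "ex:SomeArticle"},  # hypothetical provenance value
)
```

Here one statement with one piece of provenance takes three triples. Standard RDF reification would instead describe the statement through a separate node typed as `rdf:Statement` with `rdf:subject`, `rdf:predicate`, and `rdf:object` triples before any metadata is attached.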
Wood is excited about researchers working on Social Linked Data (pronounced “Solid”) and its prospects for enabling “a cool future where people can control their own data,” via decentralized identity, authentication, access control and generic and RESTful data API. “I don’t expect great uptake yet, but I am hopeful that in time people will again care about privacy and control,” he says. Personally, in the coming year he hopes to participate in projects to facilitate scholarly research and publishing using Semantic technologies, noting that he’s already working on productizing enhancements to Ephox’s general-purpose rich text editors that use Semantic technologies behind the scenes. “In neither case should the end user care, any more than users currently need to care, which language a software product is written in,” Wood says.
Never doubt one thing: the Semantic Web, and Semantic technologies and techniques, have propelled the industry forward and will continue to do so. “Sometimes when you are embedded in something it doesn’t seem like things are going that fast,” says Bergman. “But when you look back you see that we are in the midst of an unbelievable revolution here.”