
Get In On CrowdSourcing An Open Knowledge Graph API

August 14, 2012

Last week The Semantic Web Blog continued its coverage of Google’s Knowledge Graph with the news of its worldwide launch for English-language users. This week we’ve learned about a paper submitted to the 1st International Workshop on Knowledge Extraction and Consolidation from Social Media (KECSM2012), which takes place in November in Boston, about work underway on the topic of crowd-sourcing an Open Knowledge Graph API.

The paper, currently pending review, was authored by Thomas Steiner of the Universitat Politècnica de Catalunya in Barcelona and Stefan Mirea of Jacobs University, Bremen, Germany. It proposes that the crowd step in where Google has so far failed to tread: creating an interface to the Knowledge Graph, which holds more than 500 million objects – landmarks, celebrities, cities, sports teams, buildings, movies, celestial objects, works of art, and more – and 3.5 billion facts about and relationships between them. There is no publicly available list of all those objects, and, say the authors, even if there were, “it would not be practicable (nor allowed by the terms and conditions of Google) to crawl it.” Hence, the crowd-sourcing approach.

“With SEKI@home, which stands for Search for Embedded Knowledge Items, we propose a browser extension-based approach to crowd-source such an API. As people with the extension installed search on Google.com, the extension sends extracted anonymous Knowledge Graph facts from Search Engine Results Pages (SERPs) to a centralized, publicly accessible triple store, and thus over time creates a SPARQL-queryable Open Knowledge Graph,” the paper explains. The authors have implemented and made available a prototype browser extension tailored to the Google Knowledge Graph.
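To make the idea of turning extracted facts into triples more concrete, here is a minimal Python sketch. It is not the authors' extension code. The namespace IRI, the entity IRI, the function names, and the rule for converting a plaintext term into a predicate name are all assumptions made for illustration; the article only tells us that facts end up in a triple store that can be queried with SPARQL.

```python
from typing import Dict, List
from urllib.parse import quote

# Hypothetical namespace: the article only gives the prefix "okg", not its IRI.
OKG_NS = "http://example.org/okg/"


def term_to_predicate(term: str) -> str:
    """Turn a plaintext term such as 'Full name' into a predicate IRI.

    Joining words with underscores is an assumed convention, not the paper's.
    """
    local_name = "_".join(term.strip().split())
    return OKG_NS + quote(local_name)


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes, and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def facts_to_ntriples(subject_iri: str, facts: Dict[str, str]) -> List[str]:
    """Serialize one entity's extracted facts as N-Triples lines."""
    return [
        f'<{subject_iri}> <{term_to_predicate(term)}> "{escape_literal(value)}" .'
        for term, value in facts.items()
    ]


# Placeholder entity and value, not real Knowledge Graph data.
lines = facts_to_ntriples(
    "http://example.org/entity/1",
    {"Full name": "Example Person"},
)
print("\n".join(lines))
```

Lines in this form could be loaded into most triple stores, which is what would make the collected facts queryable with SPARQL.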

But the Knowledge Graph is just one example of what SEKI@home makes possible. The bigger-picture idea is that the crowd-sourcing model can make any closed knowledge base openly and programmatically accessible. The authors say 15 users have tested SEKI@home so far, installing the browser extension and then browsing the Knowledge Graph by following links starting here. According to the paper, the users haven’t noticed any difference in their computing experience while the extension works in the background, sending extracted Knowledge Graph facts back to the RDF triple store “at full pace.”

Steiner and Mirea comment that it’s the strings, not the things, that get displayed to search engine users via the Knowledge Graph, and address that by modeling “the plaintext Knowledge Graph terms (or predicates) like ‘Born’, ‘Full name’, ‘Height’, ‘Spouse’, etc. in an informal Knowledge Graph ontology under the namespace okg (for Open Knowledge Graph) …. This ontology has already been partially mapped to common Linked Data vocabularies. One example is okg:Description, which directly maps to dbpprop:shortDescription from DBpedia.”
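The mapping the authors describe can be pictured as a simple lookup table from okg terms to terms in other vocabularies. The Python sketch below is one assumed way to represent it, not the project's actual implementation. The only mapping taken from the paper is okg:Description to dbpprop:shortDescription; the okg namespace IRI and the helper names are hypothetical.

```python
from typing import Dict

PREFIXES: Dict[str, str] = {
    # Hypothetical IRI: the article names the prefix but not its namespace.
    "okg": "http://example.org/okg/",
    # DBpedia's property namespace.
    "dbpprop": "http://dbpedia.org/property/",
}

# The single mapping quoted in the article; a fuller table would grow over time.
TERM_MAPPINGS: Dict[str, str] = {
    "okg:Description": "dbpprop:shortDescription",
}


def map_term(curie: str) -> str:
    """Return the mapped term if one is known, otherwise the original term."""
    return TERM_MAPPINGS.get(curie, curie)


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'okg:Born' into a full IRI."""
    prefix, local_name = curie.split(":", 1)
    return PREFIXES[prefix] + local_name


print(expand(map_term("okg:Description")))
# prints http://dbpedia.org/property/shortDescription
print(expand(map_term("okg:Born")))
# prints http://example.org/okg/Born (no mapping known, so the okg term is kept)
```

Falling back to the original okg term when no mapping exists reflects the article's point that the ontology is only partially mapped so far.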

Close to 400 Knowledge Graph terms had been collected at the time the paper was submitted, but linking them to other Linked Data vocabularies is expected to be a never-ending task. The project’s next steps are to provide a more comprehensive mapping of Knowledge Graph terms to other Linked Data vocabularies and to explore applying the SEKI@home approach to other closed knowledge bases. “Videos from video portals like YouTube or Vimeo can be semantically enriched …. We plan to apply SEKI@home to semantic video enrichment by splitting the computational heavy annotation task, and store the extracted facts centrally in a triple store to allow for open SPARQL access,” they write.

To test out the unofficial crowd-sourced Open Knowledge Graph API yourself, go here.

About the author

Jennifer Zaino is a New York-based freelance writer specializing in business and technology journalism. She has been an executive editor at leading technology publications, including InformationWeek, where she spearheaded an award-winning news section, and Network Computing, where she helped develop online content strategies including review exclusives and analyst reports. Her freelance credentials include being a regular contributor of original content to The Semantic Web Blog; acting as a contributing writer to RFID Journal; and serving as executive editor at the Smart Architect Smart Enterprise Exchange group. Her work also has appeared in publications and on web sites including EdTech (K-12 and Higher Ed), Ingram Micro Channel Advisor, The CMO Site, and Federal Computer Week.
