Wikidata: People And Bots Busy Filling The System In Phase One

By Jennifer Zaino  /  December 17, 2012

Ever heard of the Finnish television series Matkaoppaat? It’s a program about tour guides abroad – something of a reality show that looks like it has already spawned copycat programs with more on the way in other countries.

But of more interest to readers of The Semantic Web Blog is that just a couple of days ago, the series was added as item Q1000000 to Wikidata, on the heels of other recent entries like the English town Newton-le-Willows (item ID Q750000) and American alpine skier Tim Jitloff (ID Q500000). They’re following in the footsteps of earlier items like Dutch Wikipedia (ID Q10000), which was added just four days after Wikidata was launched on Oct. 30.

“Right now the system is launched (since end of October) and people and bots are filling it,” says Wikidata project director Denny Vrandecic, of the Wikimedia Foundation’s effort to create a free knowledge base about the world that can be read and edited by humans and machines alike.

It’s all part of the phase one deployment, even as phase two development is underway in parallel. As reported here previously, phase one is about centralizing the links between the different language versions of Wikipedia in one place, while in phase two editors will be able to add and use data that Wikidata collects in a structured format. Phase two is also known as the infobox phase, because its goal is to fill Wikipedia infoboxes with data from Wikidata.

Phase three is meant to bring the automatic creation of lists and charts based on the data in Wikidata, with inline queries in the Wikipedias supported in several formats.

That’s down the road a bit, but the starter pistol has been fired. “In January we plan to be deployed on the Wikipedias,” Vrandecic says, in language-by-language editions, starting with Hungarian. Hebrew will follow, then probably Italian, English and others. “This means, those languages will be able to use Wikidata to replace their language links — the links that link the different language versions of a Wikipedia article together,” he says.

“All we cover now is creating items in Wikidata which represent the topics of Wikipedia articles, and we collect the links to those Wikipedia articles in different languages,” he explains. So for all these topics – like Q65 for Los Angeles – stable identifiers are in place. “Then we know the links to the different language versions of this article, and when it is deployed to the Wikipedias, the Wikipedia can access this information.”
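As a rough illustration of what Vrandecic describes, a phase-one item can be pictured as a stable identifier plus a table of links to the matching Wikipedia article in each language. The sketch below is a hypothetical in-memory model in Python, not Wikidata's actual schema or API: the class name `Item`, the `sitelinks` field and the helper `link_for` are all assumed for illustration, and the article titles are sample values.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Item:
    """Hypothetical phase-one item: a stable ID plus per-language article titles."""
    item_id: str
    # Maps a language code (e.g. "en") to the article title in that Wikipedia.
    sitelinks: Dict[str, str] = field(default_factory=dict)

    def link_for(self, lang: str) -> Optional[str]:
        """Return the article URL for one language, or None if no link is recorded."""
        title = self.sitelinks.get(lang)
        if title is None:
            return None
        return f"https://{lang}.wikipedia.org/wiki/{title.replace(' ', '_')}"


# Sample data (assumed): one item, three language links.
los_angeles = Item("Q65", {"en": "Los Angeles", "de": "Los Angeles", "hu": "Los Angeles"})

print(los_angeles.item_id, los_angeles.link_for("hu"))
# prints: Q65 https://hu.wikipedia.org/wiki/Los_Angeles
```

The point of the sketch is only that every language edition can look up the same item by its ID and read the links from one place.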

Right now the links are stored redundantly in each of the different languages, he notes, but “Wikidata will change that by centralizing this information.”
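A back-of-the-envelope count suggests why centralizing matters. Assume a topic has an article in n language editions and each article lists links to all the others: the redundant scheme then keeps n * (n - 1) links for that topic, while a central item keeps n. The function names below are hypothetical, and the figures are plain arithmetic under that assumption, not measurements from Wikipedia.

```python
def redundant_link_count(n_languages: int) -> int:
    """Links stored when each of n articles lists the other n - 1 editions."""
    return n_languages * (n_languages - 1)


def centralized_link_count(n_languages: int) -> int:
    """Links stored when one central item holds a single link per edition."""
    return n_languages


for n in (3, 50, 200):
    print(n, redundant_link_count(n), centralized_link_count(n))
# prints:
# 3 6 3
# 50 2450 50
# 200 39800 200
```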

About the author

Jennifer Zaino is a New York-based freelance writer specializing in business and technology journalism. She has been an executive editor at leading technology publications, including InformationWeek, where she spearheaded an award-winning news section, and Network Computing, where she helped develop online content strategies including review exclusives and analyst reports. Her freelance credentials include being a regular contributor of original content to The Semantic Web Blog; acting as a contributing writer to RFID Journal; and serving as executive editor at the Smart Architect Smart Enterprise Exchange group. Her work also has appeared in publications and on web sites including EdTech (K-12 and Higher Ed), Ingram Micro Channel Advisor, The CMO Site, and Federal Computer Week.
