Bringing the Semantic Web and Linked Data to All

By Jennifer Zaino  /  March 5, 2015

The Semantic Web and Linked Data promise so much to so many in the world. They provide the foundation for identifying and classifying content, while exposing, sharing, and connecting pieces of data on the Web via the W3C’s Resource Description Framework (RDF) standard. The vision revolves around increasing the Web’s usefulness and people’s access to knowledge.

But as it happens, the ability to profit – intellectually, monetarily, or both – from a growing web of structured and shareable data is still not available to a large segment of the human population. Only around 40 percent of the world's population has an Internet connection today, according to Internet Live Stats, part of the Real Time Statistics Project. That adds up to just a little over three billion Internet users in a world of some 7.1 billion people.

That other 60 percent of the population, many of them in developing countries and rural settings, would also benefit from being able to share structured data – about market prices, government issues, health, education, and so on. But they cannot rely on the same methods and means of access as those primarily served by today's Semantic Web technologies, according to Christophe Guéret. Guéret is a research associate at Data Archiving and Networked Services (DANS), which promotes sustained access to digital research data and encourages researchers to archive and reuse data sustainably. He has focused much of his own work on the design of decentralized, interconnected knowledge systems and their social implications. Along with other researchers and open data advocates, Guéret is involved with a movement to downscale the Semantic Web: the World Wide Semantic Web group, a community that believes in enabling access to open, linked, and semantically rich data for everyone across the globe.

Focus on Infrastructure, Relevancy, and Interfaces

“Instead of only looking at scaling of technologies to accommodate more users and more data, we also need to look at very specific needs and characteristics of some smaller communities that do need to share data but cannot rely on the same tools,” Guéret believes. The World Wide Semantic Web group's activities center on improving access to structured, semantically enriched, and linked data along three dimensions: infrastructure, relevancy, and interfaces.

“We are working on making an infrastructure for data sharing happen in these three different aspects,” he says. “That’s where we are now.”

Given that many people lack access to powerful hardware and stable Internet connections, there is a need to design systems that work with limited electricity, run offline or on locally connected intranets, and use small hardware that is easy to replace, he says. On the infrastructure front, one piece of the World Wide Semantic Web's effort is SemanticXO, a data management stack that facilitates asynchronous data sharing across devices and aims to make every XO (the low-cost, rugged laptop created by the One Laptop Per Child, or OLPC, project) Linked Data ready. SemanticXO gives every activity access to an API for storing and retrieving structured information, and it uses the lightweight RedStore RDF triple store to hold resource descriptions across mesh network nodes.
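
The article does not show SemanticXO's actual API, but the core idea of letting every activity store and retrieve structured descriptions can be sketched with a generic RDF library. The snippet below is only an illustration using Python's rdflib against a local in-memory graph; the namespace, class names, and file path are invented, and a real XO deployment would write into RedStore and share across the mesh instead.

```python
# Illustrative sketch only (not the SemanticXO API): an "activity" stores a few
# facts as RDF triples in a local graph, persists them, and queries them back.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/xo/")  # hypothetical namespace for the example
g = Graph()

# Store structured information produced by an activity.
entry = EX["journal-entry-1"]
g.add((entry, RDF.type, EX.JournalEntry))
g.add((entry, EX.author, Literal("student-42")))
g.add((entry, EX.text, Literal("Measured rainfall: 12 mm")))

# Persist locally so another device can pick the file up later (e.g. over a mesh).
g.serialize(destination="journal.ttl", format="turtle")

# Retrieve the stored descriptions with a SPARQL query.
results = g.query("""
    PREFIX ex: <http://example.org/xo/>
    SELECT ?entry ?text WHERE { ?entry a ex:JournalEntry ; ex:text ?text . }
""")
for row in results:
    print(row.entry, row.text)
```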

Another outgrowth of the effort, and a piece of the World Wide Semantic Web's vision, is the Entity Registry System (ERS), which associates data with uniquely identified entities and is designed for publishing and consuming Linked Open Data in decentralized and offline settings. In essence, “the system was developed to do 5-star Linked Data without using the Web as a platform,” Guéret says. Users can “collaboratively build the global knowledge graph without having one place to store it.”
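
ERS's internals are not described in the article, but the underlying idea can be illustrated with a hypothetical sketch: statements about globally identified entities are kept per contributor, and two offline replicas reconcile by simple set union whenever they meet, which is what allows a global knowledge graph to be built without a central store. The class, identifiers, and values below are invented for the example and are not the real ERS code.

```python
# Hypothetical sketch of a decentralized entity registry: each replica keeps
# statements about entities, and merging two replicas is a set union, so
# devices can sync in any order, whenever connectivity appears.
from collections import defaultdict

class EntityRegistry:
    def __init__(self, contributor_id):
        self.contributor_id = contributor_id
        # entity identifier -> set of (property, value, contributor) statements
        self.statements = defaultdict(set)

    def add(self, entity, prop, value):
        """Record a statement about an entity, tagged with its contributor."""
        self.statements[entity].add((prop, value, self.contributor_id))

    def merge(self, other):
        """Union in another replica's statements (idempotent and order-free)."""
        for entity, stmts in other.statements.items():
            self.statements[entity] |= stmts

# Two offline devices describe the same entity independently...
a = EntityRegistry("laptop-A")
a.add("urn:uuid:farm-17", "crop", "shea")

b = EntityRegistry("laptop-B")
b.add("urn:uuid:farm-17", "price", "1200 XOF/kg")

# ...and reconcile whenever they come into contact.
a.merge(b)
print(a.statements["urn:uuid:farm-17"])
```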

When it comes to the relevancy of semantic data, World Wide Semantic Web researchers aim to match the level of relevancy – global or local – to the needs of those using the data, so that they get the right information at the right time. “We have to assess what is meaningful,” Guéret says. If, for example, there is only a window of a couple of seconds to share crop data with a node on a truck traveling between farms in remote African villages, as the vehicle passes within radio range of a laptop holding the information, “you want to share what is most important and urgent first.”
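
The "most important and urgent first" idea amounts to prioritized, opportunistic transfer: when a brief contact window opens, send as much high-priority data as fits. The sketch below is an assumption-laden illustration, not code from the project; the urgency scores, byte budget, and messages are all invented.

```python
# Hypothetical sketch: drain a priority queue of messages into a short contact
# window, most urgent first, until the window's byte budget runs out.
import heapq

def plan_transfer(messages, byte_budget):
    """messages: list of (urgency, size_bytes, payload); higher urgency goes first."""
    # heapq is a min-heap, so negate urgency to pop the most urgent item first.
    heap = [(-urgency, size, payload) for urgency, size, payload in messages]
    heapq.heapify(heap)
    sent = []
    while heap and byte_budget > 0:
        _neg_urgency, size, payload = heapq.heappop(heap)
        if size <= byte_budget:
            sent.append(payload)
            byte_budget -= size
    return sent

crop_updates = [
    (9, 120, "URGENT: pest outbreak in millet field"),
    (5, 300, "weekly shea butter price update"),
    (2, 800, "full harvest log, January"),
]
# Only the two most urgent updates fit into a 500-byte window.
print(plan_transfer(crop_updates, byte_budget=500))
```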

Users also need to get data in the right size, since datasets may have to be downscaled to fit device constraints and other limitations. In this vein, SampLD is an ongoing World Wide Semantic Web activity aimed at reducing a set of triples to its most useful subset in order to save resources. The key is to sample large Linked Data sets so that they remain representative but still work on limited hardware.
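
SampLD's actual ranking pipeline is not spelled out here; the general idea of keeping a representative subset of triples can be sketched with a simple network heuristic. The snippet below is only an illustration in that spirit: it scores each triple by how well connected its subject and object are and keeps the top fraction; the example triples and the keep_fraction parameter are invented.

```python
# Illustrative sketch of graph-aware triple sampling: favor triples whose
# subject and object nodes are the most connected in the dataset.
from collections import Counter

def sample_triples(triples, keep_fraction=0.5):
    """Keep the triples whose subjects/objects are the best-connected nodes."""
    degree = Counter()
    for s, _p, o in triples:
        degree[s] += 1
        degree[o] += 1
    scored = sorted(triples, key=lambda t: degree[t[0]] + degree[t[2]], reverse=True)
    k = max(1, int(len(scored) * keep_fraction))
    return scored[:k]

triples = [
    ("ex:Mali", "ex:hasRegion", "ex:Tominian"),
    ("ex:Tominian", "ex:produces", "ex:Shea"),
    ("ex:Tominian", "ex:produces", "ex:Honey"),
    ("ex:Honey", "ex:unitPrice", "ex:Price_1"),
]
for t in sample_triples(triples):
    print(t)
```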

Regarding interfaces, Guéret points to work spearheaded by Victor de Boer, a researcher and developer at VU University Amsterdam who is also engaged in World Wide Semantic Web projects. Through Project Voices, funded by the European Union's 7th Framework Programme, de Boer is tackling the problem of making Linked Data accessible to local people who often have limited computer skills and low literacy, and who speak local languages not generally supported by speech systems.

Project Voices takes as its starting point the idea that, rather than interacting with Linked Data through text, as is most often done today, a voice-based interface is the way to realize the full potential of mobile ICT and the Web in developing economies. The project develops text-to-speech and speech-recognition components for under-resourced local languages, draws on content relevant to local participants, and integrates local community radio with ICT to improve the quality and volume of radio content broadcast and shared.

“We want to make the Semantic Web work for context when you don’t have big hardware, and for context when you don’t have traditional Web interfaces,” Guéret says.

Making Good Progress

Guéret says a flagship example that pulls together many components of the work done so far is RadioMarche, a voice-based market information system designed in co-creation with villagers in the Tominian region of Mali. In that country, only 1.8 percent of the population has Internet access, only 10 percent has access to the electricity network, and only 26.2 percent is literate. With the help of voice-based interfaces, local information points, and a blend of ICT-based and community radio-based approaches to information dissemination, RadioMarche improved on the paper-based market information system that farmers were using to gather pricing information for products such as shea butter and honey. Collecting pricing information from farmers across these villages used to take two days; now it takes two minutes, he says. Not only that, but it is helping to improve farmers' incomes.

Among other projects in the works is one focused on connecting relevant, locally produced data about conditions such as rainfall with global meteorological Linked Data in the cloud, with the goal of improving rainfall forecasting for a precise target area. Additionally, researchers are trying to speed up the prototyping of systems right in the field, so that farmers and other potential users can quickly see how the use cases they describe to researchers might be realized, Guéret says.

Ultimately, building a truly world wide – and downscaled – Semantic Web is being undertaken primarily in the service of improving the lives of billions of people around the globe. “We listen to them, to what they need and want, and we try to answer that,” says Guéret.
