The Digital Enterprise Research Institute (DERI) is kicking off a project with Fujitsu Laboratories Ltd. in Japan to build a large-scale RDF store in the cloud capable of processing hundreds of billions of triples. The idea, says DERI research fellow Dr. Michael Hausenblas, “is to build up a platform that allows you to process and convert any kind of data,” from relational databases and LDAP-style, record-based directory data to streaming sources such as sensors and even the Twitter firehose.
The project has defined eight different potential enterprise use cases for such a platform, ranging from knowledge-sharing in health care and life science to dashboards in financial services informed by XBRL data. “Once the platform is there we will implement at least a couple of these use cases on business requirements, and essentially we are going to see which are the most promising for business units,” Hausenblas says.
“Essentially it’s quite challenging because on one hand we really have to do some deep thinking and foundational research there regarding scalability and performance, but also what we do has to be very relevant to business units.” At the end of the three-year project, the goal for Fujitsu is to have solid technology, and scenarios based on it, that make a convincing argument to potential enterprise customers that the cloud platform is more than a proof-of-concept and actually responds to their business needs. Those needs could well be limited to linking a company's own data, with no connection to a Linked Open Data model.
For instance, a company could have customer information spread across a swath of internal systems. “If you really want to connect and understand at the end of the day how to sell more product or build a better product, linked data and the cloud are a natural fit,” he says.
Fujitsu’s recent research efforts have focused on cloud and server technologies. “They have this huge cloud infrastructure and of course they also already have identified Big Data as an important trend,” says Hausenblas of DERI’s new partner. Big data questions revolve around volume (data sets likely too large for traditional database software tools to support), variety and velocity, and also around the data's tendency to be unstructured. DERI’s Linked Data expertise will support the project’s efforts to make sense of Big Data for its enterprise scenarios: structuring and inter-relating data and creating interoperable datasets, using URIs to identify the subjects being described, in a highly scalable and distributed fashion. “Linked Data is involved with answering at least two of the three big data questions” – that is, volume and variety – says Hausenblas.
Or, as Fujitsu describes it in a white paper it released in March, called Linked Data: Connecting and Exploiting Big Data:
Effectively, linked data can be used as a broker, mapping and interconnecting, indexing and feeding real-time information from a variety of sources. We can infer relationships from big data analysis that might otherwise have been discarded and then, potentially we end up running further analysis on the linked data to derive even more insight.
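To make the idea of interconnecting sources concrete, here is a minimal sketch in plain Python of how records from two separate internal systems might be expressed as subject-predicate-object triples that share a URI for the same customer. It is an illustration only, not the DERI/Fujitsu platform: the `http://example.com/` namespace, the record fields, and the helper functions are all hypothetical, and a real deployment would use an RDF store rather than an in-memory set.

```python
from collections import defaultdict

# Hypothetical namespace; nothing here comes from the DERI/Fujitsu project.
BASE = "http://example.com/"


def uri(kind, ident):
    """Mint a URI for an entity of the given kind (assumed naming scheme)."""
    return f"{BASE}{kind}/{ident}"


def crm_triples(rows):
    """Turn rows from an imagined CRM system into (subject, predicate, object) triples."""
    for row in rows:
        yield (uri("customer", row["id"]), BASE + "name", row["name"])


def billing_triples(rows):
    """Turn rows from an imagined billing system into triples about the same customers."""
    for row in rows:
        subject = uri("customer", row["customer_id"])
        yield (subject, BASE + "placedOrder", uri("order", row["order_id"]))


def describe(triples, subject):
    """Collect every predicate/object pair known about one subject URI."""
    found = defaultdict(list)
    for s, p, o in triples:
        if s == subject:
            found[p].append(o)
    return dict(found)


# Two systems that know the same customer under different column names.
crm = [{"id": "42", "name": "Acme Ltd."}]
billing = [{"customer_id": "42", "order_id": "A-1001"}]

# Because both sources use the same subject URI, merging is a set union.
graph = set(crm_triples(crm)) | set(billing_triples(billing))
profile = describe(graph, uri("customer", "42"))
```

The point of the sketch is that once both systems agree on a URI for the customer, combining their data needs no schema mapping beyond the triple conversion itself; `profile` then holds both the name from the CRM and the order from billing.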
According to the press release, the investment in the research program with DERI focused on Networked Knowledge will be significant. “Big data will be the foundation for enabling … a [Human Centric Intelligent Society], and we at Fujitsu view as essential the data processing of big data – in other words, the gathering, semantic analysis, and categorization of big data,” said Tatsuo Tomita, President of Fujitsu Laboratories Ltd., in a statement in the release. “This joint research collaboration with DERI featuring large-scale research resources in the field of Semantic Web offers new R&D opportunities and represents a step forward toward the realization of a Human Centric Intelligent Society envisioned by Fujitsu.”