
Under the Hood: A Closer Look at Information Workbench

May 31, 2012

fluid Operations’ Information Workbench is part of the semantic infrastructure supporting the BBC’s revolutionary coverage of the 2012 Olympic Games. Below is a conversation with Michael Schmidt, fluid Operations Senior Architect for Research & Development, in advance of his 2012 Semantic Technology and Business Conference presentation. This conversation is a supplement to the series “Dynamic Semantic Publishing for Beginners.”

Q. Is the Information Workbench a response to the need for more robust applications to help process “Big Data”? How is it different from other popular tools?

A. Dealing with Big Data involves a number of different challenges, including increasing volume (amount of data), complexity (of schemas and structures), and variety (range of data types, sources).

However, most Big Data solutions available on the market today focus on volume only, in particular supporting vertical scalability (greater operating capacity, efficiency, and speed). This means that such solutions mainly address the analysis of large volumes of similarly structured data sets. Yet technologies that process similarly structured data more quickly and efficiently do not, on their own, solve the Big Data problem.

A major problem is that of horizontal scale. Looking at the wealth of data that is published in open data initiatives or stored in enterprise-internal systems and databases, one is faced with a massive number of data sources, with a high degree of variety and heterogeneity in coverage, data models, and structure. There is a great deal of opportunity and potential for technologies that solve these problems and enable users to analyze this wealth of structured and unstructured data on demand.

Motivated by these “big data” opportunities, we designed the Information Workbench to enable clients to use semantic technologies and Linked Data to instantly integrate, analyze, and correlate large amounts of internal or publicly available data sources, regardless of their origin, data types, schemas or structure.
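To make the idea of integrating differently structured sources more concrete, here is a minimal sketch in Python. It is not the Information Workbench's API. The record layouts, the `http://example.org/` namespace, and every function name are hypothetical, chosen only to show how a flat record and a nested record can be mapped onto the same subject–predicate–object (triple) model and then queried together.

```python
from collections import defaultdict

# Hypothetical records from two differently structured sources (assumptions).
crm_rows = [{"id": "42", "full_name": "Ada Example", "city": "Berlin"}]
hr_docs = [{"employee": {"ref": "42", "dept": "R&D"}}]

BASE = "http://example.org/"  # placeholder namespace, not a real vocabulary


def crm_to_triples(row):
    """Map a flat CRM-style row to (subject, predicate, object) triples."""
    subject = BASE + "person/" + row["id"]
    return [
        (subject, BASE + "name", row["full_name"]),
        (subject, BASE + "city", row["city"]),
    ]


def hr_to_triples(doc):
    """Map a nested HR-style document to triples using the same subject URIs."""
    subject = BASE + "person/" + doc["employee"]["ref"]
    return [(subject, BASE + "department", doc["employee"]["dept"])]


def integrate(*triple_lists):
    """Merge triples from any number of sources into one set-based graph."""
    graph = set()
    for triples in triple_lists:
        graph.update(triples)
    return graph


def describe(graph, subject):
    """Collect everything known about one subject, grouped by predicate."""
    properties = defaultdict(set)
    for s, p, o in graph:
        if s == subject:
            properties[p].add(o)
    return dict(properties)


graph = integrate(
    [t for row in crm_rows for t in crm_to_triples(row)],
    [t for doc in hr_docs for t in hr_to_triples(doc)],
)
```

Calling `describe(graph, BASE + "person/42")` returns the name and city from the first source together with the department from the second, because both mappings produce the same subject identifier. A production system would rely on shared ontologies and a triple store rather than a Python set.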

The Information Workbench (IWB for short) was designed as a self-service platform, making it easy for clients to develop Linked Data applications and perform enterprise analytics regardless of their prior experience with semantic technologies or languages.

For more information about how the IWB makes use of ontologies and reasoning to resolve the problems of data complexity and data variety, please take a look at the following presentation: http://www.slideshare.net/phaase/on-demand-access-to-big-data-through-semantic-technologies

Q. What is your target market for IWB? Is it larger enterprises, Small to Medium Enterprises (SMEs), or all of the above? What industries stand to benefit most from its use?

A. The Information Workbench delivers customized solutions for different application areas with different types of customers:

  • Data Center Management and Semantic Master Data Management: Here the focus has been mainly on larger enterprises, but SMEs can also benefit from the solution and use the Information Workbench to improve internal IT operations processes.
  • Life Science: The Information Workbench can be applied in both large and medium-sized enterprises, for instance to support collaborative, data-intensive drug discovery processes.
  • Media and Publishing Industry: Here we typically deal with large customers.

Q. Can you give some concrete examples of how Information Workbench has been used in the publishing industry to improve the use and presentation of news and data?

A. For the BBC, the Information Workbench supports data creation and maintenance as well as the authoring and publishing process. It supports the editorial process, from authoring and curation to publishing of ontology and instance data (data describing specific instances, e.g. a specific athlete at the Olympics), following an editorial workflow. The solution integrates and interlinks semantically enriched data in a central place, and approved content is then available for automatic publication on the website. The platform seamlessly integrates into already existing editorial processes and automates the creation and delivery of semantically enriched content.

The Information Workbench supports the following specific authoring and publishing steps at the BBC:

  • Automated generation of content from both structured and unstructured data and metadata.
  • Content enrichment using metadata.
  • User-friendly forms and auto-suggestion lists for easy editing of semantic metadata.
  • Support for multi-level internal approval processes.
  • Semantic metadata processing for automated publication.
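As a rough illustration of how a multi-level approval step can gate automated publication, the following Python sketch models a content item as a small state machine. It is an assumption-laden toy, not the BBC's or fluid Operations' implementation: the state names, the `ContentItem` class, and the default of two required approvals are all hypothetical.

```python
from enum import Enum


class State(Enum):
    """Hypothetical editorial states; the real workflow may differ."""
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    PUBLISHED = "published"


class ContentItem:
    """A piece of content that needs several distinct approvals to publish."""

    def __init__(self, title, required_approvals=2):
        self.title = title
        self.required_approvals = required_approvals  # assumed approval depth
        self.approvers = []
        self.state = State.DRAFT

    def submit(self):
        """Move a draft into review."""
        if self.state is not State.DRAFT:
            raise ValueError("only drafts can be submitted")
        self.state = State.IN_REVIEW

    def approve(self, reviewer):
        """Record one reviewer's approval; repeat approvals count once."""
        if self.state is not State.IN_REVIEW:
            raise ValueError("item is not in review")
        if reviewer not in self.approvers:
            self.approvers.append(reviewer)
        if len(self.approvers) >= self.required_approvals:
            self.state = State.APPROVED

    def publish(self):
        """Publish only content that has passed every approval level."""
        if self.state is not State.APPROVED:
            raise ValueError("only approved content can be published")
        self.state = State.PUBLISHED
```

In this sketch an item stays in review until the configured number of distinct reviewers has approved it, and `publish()` refuses anything that has not reached the approved state, which mirrors the idea that only approved content becomes available for automatic publication.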

For other customers the Information Workbench has been implemented to support use cases in the area of data-driven journalism. This means that the tool supports the aggregation of data from different sources, the enrichment of data and the analysis of aggregated data. For example, we are currently working on a project with a customer who is using the Information Workbench for the aggregation, enrichment and analysis of data from disparate datasets about soccer players, such as statistics, background information, etc. Using the Information Workbench, journalists have access to a self-service user interface for the analysis and presentation of structured data gathered from various sources.

Q. A frequent criticism of semantic applications and tools is the poor quality of the user interface/experience. How important to fluid Operations is the user interface of IWB?

A. The Information Workbench delivers an end-user-friendly interface to Linked Data and can be used by anyone, not only users with technical skills. The Information Workbench was built as a self-service platform and delivers a living, widget-based user interface which configures itself to display the information most relevant to the user. Widgets are components of the user interface that display information in different ways, e.g. geo-mappings to display locations mentioned in the data, heatmaps, tag clouds, Twitter live feeds, etc. This concept is driven by a Semantic Wiki approach, where we allow widgets to be embedded into Semantic Wiki pages using an intuitive, declarative syntax (offering wizards for non-expert users to create and modify widgets).

In addition, the Information Workbench provides further interactive views on data:

  • A table view that displays structured information in a tabular format and allows users to add or change information using ontology-based auto-suggestion lists.
  • An interactive, customizable graph view that displays information objects together with their linked objects and data, so that previously unknown relationships can be discovered very quickly.
  • A Pivot Viewer that visualizes results and hits and provides interactive filtering, grouping and sorting capabilities for the visualized data.

Q. What kinds of staff/expertise are needed to set up and use Information Workbench?

A. The Information Workbench runs on commodity hardware (physical or virtualized) and, in principle, on all platforms. It is implemented in Java and uses common web technologies. We provide an easy-to-use installer for Windows, and installation guides for the other platforms.

At SemTechBiz 2012, fluid Operations’ Lead Architect Peter Haase will speak alongside BBC Senior Technical Architect Jem Rayfield and Ontotext’s Borislav Popov. Stop by the fluid Operations booth for a demonstration of Information Workbench.

Kristen Milhollin is a writer, mother, champion of good causes, and semantic web enthusiast.  She is also the Project Lead for GoodSpeaks.org.
