The OceanLink Project is bringing semantic technology to the geosciences domain, and it is doing so without forcing that community to become experts in semtech in order to realize value from its implementation. Project lead Tom Narock of Marymount University recently took part in a webinar on how semantics is being implemented to integrate ocean science data repositories, library holdings, conference abstracts, and funded research awards. He noted that this effort is “tackling a particular problem in ocean sciences, but [can be part of a] more general change for researchers in discovering and integrating interdisciplinary resources, [when you] need to do federated and complex searches of available resources.”
The project is interested in using more formal, stronger semantics (working with OWL, RDF, and reasoners), but it also acknowledges that a steeper learning curve comes with the territory. How can that be balanced against what the community is able to implement and use? The answer: “In addition to exposing our data using semantic technologies, a big part of OceanLink is building cyber infrastructure that will help lessen the burden on our end users.”
The project is using Linked Open Data to join together data from multiple repositories, such as the Biological and Chemical Oceanography Data Management Office (BCO-DMO) and Rolling Deck to Repository, a central repository for research vessels. Given the popularity of relational databases in geosciences, and the continued interest in using them there, Narock said the project will use tools to map existing relational databases to Linked Open Data, as well as republish metadata with LOD tools and methodologies. “Moving forward with Linked Open Data does not imply that you have to do away with your existing infrastructure,” he noted.
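To make the idea of mapping a relational database to Linked Open Data more concrete, here is a minimal Python sketch. It is not the tooling the OceanLink Project uses, which the webinar summary does not name. The `cruise` table, its columns, the sample row, and the `http://example.org/oceanlink-sketch/` namespace are all invented for illustration. The existing table stays in place, and each row is simply republished as RDF triples in N-Triples syntax.

```python
import sqlite3
from urllib.parse import quote

# Hypothetical namespace; not an OceanLink URI scheme.
BASE = "http://example.org/oceanlink-sketch/"


def escape_literal(value: str) -> str:
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(conn: sqlite3.Connection) -> list[str]:
    """Turn every row of the (hypothetical) cruise table into N-Triples lines."""
    triples = []
    query = "SELECT cruise_id, vessel, chief_scientist FROM cruise ORDER BY cruise_id"
    for cruise_id, vessel, chief in conn.execute(query):
        # The primary key becomes the subject URI; each column becomes a predicate.
        subject = f"<{BASE}cruise/{quote(cruise_id, safe='')}>"
        triples.append(f'{subject} <{BASE}vocab/vessel> "{escape_literal(vessel)}" .')
        triples.append(
            f'{subject} <{BASE}vocab/chiefScientist> "{escape_literal(chief)}" .'
        )
    return triples


# Made-up relational data standing in for an existing repository database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cruise (cruise_id TEXT PRIMARY KEY, vessel TEXT, chief_scientist TEXT)"
)
conn.execute("INSERT INTO cruise VALUES (?, ?, ?)", ("C001", "R/V Example", "A. Researcher"))

for line in rows_to_ntriples(conn):
    print(line)
```

A production mapping would more likely be declared in a dedicated mapping language than hand-written, but the principle is the same: the relational store remains the system of record and the triples are a view over it.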
To ease the challenge of querying multiple Linked Open Data sources, potentially federating a query across multiple data sets, work is underway to build a hub, “a piece of infrastructure that will provide a facet-based search interface,” he explained, with all the conveniences of one-stop searching, drop-down menus, checkboxes, and the like. “It’s very user-friendly so you don’t need to become familiar with the semantics behind it,” he said. Not only that, but the hub will periodically poll the LOD sources and attempt to automatically identify the links between data sets, and perform co-entity resolution.
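The article does not describe how the hub will identify links, so the following Python sketch shows only one simple and commonly used approach: normalizing labels and pairing entities whose normalized labels match. The URIs, names, and the `normalize` and `find_links` helpers are hypothetical, and real entity resolution would need much more than string matching (it would not, for example, match "Alex Researcher" against "Researcher, Alex").

```python
import re
from collections import defaultdict


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(cleaned.split())


def find_links(dataset_a: dict[str, str], dataset_b: dict[str, str]) -> list[tuple[str, str]]:
    """Return (uri_a, uri_b) pairs whose labels normalize to the same key."""
    index = defaultdict(list)
    for uri, label in dataset_b.items():
        index[normalize(label)].append(uri)

    links = []
    for uri, label in dataset_a.items():
        for match in index.get(normalize(label), []):
            links.append((uri, match))
    return links


# Invented URI-to-label maps standing in for two polled LOD sources.
repo_a = {"http://example.org/a/person/1": "Researcher, Alex"}
repo_b = {
    "http://example.org/b/people/77": "researcher  alex",
    "http://example.org/b/people/78": "Someone Else",
}

for left, right in find_links(repo_a, repo_b):
    print(left, "matches", right)
```

In Linked Open Data practice, pairs found this way are often published as `owl:sameAs` statements so that a search over one data set can follow the link into the other.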
“So we’re going to automatically identify the links across the Linked Open Data sets, make those available for search and we’re going to provide one common interface for users to submit their queries to, and then this hub will federate the search out, aggregate the results, and provide the results back out to the user,” Narock said.
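The fan-out-and-aggregate pattern Narock describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the hub's actual design: the two sources are in-memory lists rather than remote Linked Open Data endpoints, and the record fields, source names, and the `make_source` and `federated_search` helpers are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Invented records standing in for two remote repositories.
SOURCE_A = [
    {"id": "cruise/C001", "label": "Cruise C001", "person": "A. Researcher"},
    {"id": "dataset/D9", "label": "Dataset D9", "person": "B. Scientist"},
]
SOURCE_B = [
    {"id": "cruise/C001", "label": "Cruise C001", "person": "A. Researcher"},
    {"id": "award/A5", "label": "Award A5", "person": "A. Researcher"},
]


def make_source(name, records):
    """Wrap a record list as a search function, tagging each hit with its source."""
    def search(field, value):
        return [dict(r, source=name) for r in records if r.get(field) == value]
    return search


def federated_search(sources, field, value):
    """Send one query to every source, then merge hits that share an id."""
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        batches = list(pool.map(lambda search: search(field, value), sources))

    merged = {}
    for batch in batches:
        for rec in batch:
            entry = merged.setdefault(
                rec["id"], {"id": rec["id"], "label": rec["label"], "sources": []}
            )
            entry["sources"].append(rec["source"])
    return list(merged.values())


sources = [make_source("repo-a", SOURCE_A), make_source("repo-b", SOURCE_B)]
for hit in federated_search(sources, "person", "A. Researcher"):
    print(hit["label"], "found in", ", ".join(hit["sources"]))
```

The user submits one query and sees one result list; which repositories answered is recorded per hit rather than exposed as separate searches.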
The initial version of the facet-based search portal is being worked on now, and the team hopes it will be live by summer. Users can come in and search on facets such as cruise, people, or organization, or by particular funded project or ocean science program. “Linked Open Data and semantics will be utilized to point you to related data sources, to highlight links between data sets, to point you to publications and presentations, and identify all the supporting materials that go along with whatever it is the user is searching for,” he said.
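For readers unfamiliar with faceted search, the mechanics behind drop-down menus and checkboxes can be sketched as follows. The facet names come from the article (cruise, people, organization); the records and the `facet_counts` and `apply_filters` helpers are made up for illustration and say nothing about how the OceanLink portal is built.

```python
from collections import Counter

# Invented records; only the facet names are taken from the article.
RECORDS = [
    {"title": "Dataset 1", "cruise": "C001", "people": "A. Researcher", "organization": "Org X"},
    {"title": "Dataset 2", "cruise": "C001", "people": "B. Scientist", "organization": "Org Y"},
    {"title": "Abstract 3", "cruise": "C002", "people": "A. Researcher", "organization": "Org X"},
]


def facet_counts(records, facet):
    """Count how many records carry each value of a facet (for the menu labels)."""
    return Counter(r[facet] for r in records if facet in r)


def apply_filters(records, selections):
    """Keep only the records that match every selected facet value."""
    return [r for r in records if all(r.get(f) == v for f, v in selections.items())]


print(facet_counts(RECORDS, "cruise"))
selected = apply_filters(RECORDS, {"cruise": "C001", "organization": "Org X"})
print([r["title"] for r in selected])
```

Each checkbox a user ticks adds one entry to `selections`, so narrowing a search never requires writing a query by hand.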
The OceanLink Project also is eager for the community’s feedback. “We know data is messy, … and we are soliciting the geosciences community to help us with feedback to build a better system,” he noted. It also is looking to branch its work out to other geosciences communities, such as ecology.