Opportunities are opening up in the library sector, both for the institutions themselves and for providers whose solutions and services can expand in that direction.
These vistas will be explored in a session hosted by Kevin Ford, digital project coordinator at the Library of Congress, at next week’s Semantic Technology & Business conference in San Jose. The door is being opened by the Bibliographic Framework Initiative (BIBFRAME), which the Library of Congress launched a few years ago. Libraries will be moving from the MARC standards, their lingua franca for representing and communicating bibliographic and related information in machine-readable form, to BIBFRAME, which models bibliographic data in RDF using semantic technologies.
For libraries, the transition will result “in bringing our data hopefully out of its silo and make it more comprehensible – or at least more approachable – to a greater number of people,” Ford says. Moving to a more general data model, rather than the very specific MARC record data model, should also create a more competitive market for libraries’ business, and so more opportunities for semantic tech vendors to cater to the sector with solutions built atop that model. “This is a pretty significant shift in the sector,” Ford says. The Library of Congress has itself developed a number of tools to help libraries make the move, including tools that convert MARC records to BIBFRAME resources and a frontend BIBFRAME editor that can be plugged into any backend system. But vendors can take things further, he believes, from cataloguing modules to tools that not only help users find materials in libraries but also publish data in a way that leading search engines can understand.
“We need to get our data out to where users are, and they are coming from Google, Yahoo, Bing and Facebook,” says Ford. With semantically described data that search engines and social networks can understand and potentially act on, users may in future be able not only to find out from a search where to buy a book, but also to check it out of their library, for instance. As it is, well over 50 percent of hits to some libraries’ websites come via search engines, he says, especially if they’ve implemented schema.org.
Alignment in the Library Sector
As Ford describes it, schema.org and BIBFRAME must work together to improve description and discovery in the bibliographic environment. There are some conceptual misalignments today between how schema.org does things and how the library world is looking to do things: “Right now you look at schema.org and you can describe a book and that is fantastic,” he explains, but it also aggregates elements of the Functional Requirements for Bibliographic Records (FRBR) that the library world separates into individual pieces. In schema.org, he says, a thing called a book can be associated with an author, a title and an ISBN. BIBFRAME, on the other hand, supports the library world’s view of author and title being “elements of the work, the conceptual essence of the thing, whereas [a book’s] ISBN is an attribute of a specific manifestation of that work,” Ford explains.
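To make the distinction concrete, here is a minimal sketch in Python that uses plain (subject, predicate, object) tuples in place of a real RDF store. It contrasts a flat, schema.org-style description of a book with a split description in which title and author sit on the work and the ISBN sits on a specific manifestation (what BIBFRAME calls an instance). Every identifier in it is an illustrative assumption: the "ex:" names are placeholders rather than actual BIBFRAME vocabulary terms, the ISBN is a dummy value, and the flatten() helper is hypothetical, not something the Library of Congress or OCLC has published.

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library is used.
# All "ex:" identifiers are illustrative placeholders, not verified vocabulary terms.

# Flat, schema.org-style description: a single "book" node carries everything.
flat_book = [
    ("ex:book1", "rdf:type", "schema:Book"),
    ("ex:book1", "schema:name", "Moby-Dick"),
    ("ex:book1", "schema:author", "Herman Melville"),
    ("ex:book1", "schema:isbn", "9780000000002"),  # dummy ISBN for illustration
]

# Split, library-style description: title and author belong to the work,
# while the ISBN belongs to one specific manifestation (instance) of it.
split_book = [
    ("ex:work1", "rdf:type", "ex:Work"),
    ("ex:work1", "ex:title", "Moby-Dick"),
    ("ex:work1", "ex:author", "Herman Melville"),
    ("ex:instance1", "rdf:type", "ex:Instance"),
    ("ex:instance1", "ex:instanceOf", "ex:work1"),
    ("ex:instance1", "ex:isbn", "9780000000002"),  # dummy ISBN for illustration
]


def flatten(triples):
    """Fold each instance and its work into one flat, schema.org-style record."""
    # Index every subject's properties so works can be looked up from instances.
    by_subject = {}
    for subject, predicate, obj in triples:
        by_subject.setdefault(subject, {})[predicate] = obj

    books = []
    for props in by_subject.values():
        if props.get("rdf:type") != "ex:Instance":
            continue
        # Follow the instance-to-work link to pick up title and author.
        work = by_subject.get(props.get("ex:instanceOf"), {})
        books.append({
            "schema:name": work.get("ex:title"),
            "schema:author": work.get("ex:author"),
            "schema:isbn": props.get("ex:isbn"),
        })
    return books


print(flatten(split_book))
```

Going from the split form to the flat form is a simple join, as the sketch shows. Going the other way is harder, because a flat record does not say which properties belong to the work and which to the manifestation, and that is the kind of gap the alignment work has to address.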
But with BIBFRAME data stored in RDF and triples, “it makes it significantly easier to bring [schema.org and BIBFRAME] together.” In fact, Ford points to the OCLC and the work it has done since the beginning of the BIBFRAME initiative, including analyzing it with respect to schema.org, as having played an important role in bridging the two worlds. “Schema.org is not to be ignored but at the same time the needs of libraries are so great and the data becomes so granular, that it’s going to take a little massaging,” Ford says, noting that he expects the parties to take up the issues again over the next couple of months. (Also speaking at SemTechBiz will be Richard Wallis, OCLC technology evangelist and chair of the Schema Bib Extend W3C Community Group, who will discuss Extending Schema.org: A View From the Bibliographic World.)
The transition from MARC to BIBFRAME will be a multi-year process, Ford notes, pointing out that there are well over 1 billion MARC records on the planet. “Almost every library in the US has a MARC-based system,” he says, “and many, many international systems can talk MARC or are MARC-based.” The scope of MARC’s reach means there are opportunities for vendors not only to help libraries with solutions as they make the transition, but also to train staff whose world has been the MARC system.
It’s still early days, though, and the effort has not even officially begun. But in fiscal 2015, the Library of Congress will plan a pilot cataloguing project, likely to be implemented in the third quarter, to test some additional ideas with BIBFRAME, Ford says. “We are going to have to build a system where we can store and search and retrieve BIBFRAME data. We have the editor but now we need to tie that to a backend system ourselves,” he says. More than a dozen internal cataloguers will also be retrained to catalogue material using the BIBFRAME editor, vocabulary and data model.
If you would like to learn more about the use of Linked Open Data in Libraries, Archives, and Museums, consider attending the LODLAM Training Day next Tuesday, August 19 in San Jose, California.