
Industries Dive into Schema.Org Extensions and Semantic Web Technologies

By Jennifer Zaino  /  August 10, 2017

The schema.org Global Semantic Vocabulary expands its reach through means including hosted/reviewed and external extensions. The former are managed and published as part of the schema.org project, while the latter live elsewhere in the Semantic Web, typically managed by third parties with their own processes and collaboration mechanisms.

Examples of extensions in various stages of proposal, development or use are auto.schema.org for the automotive industry; bib.schema.org for bibliographic resources and the library sector; iot.schema.org for the Internet of Things (IoT); health-lifesci.schema.org; and, fibo.schema.org (a pending name) for the financial sector.

Closely involved with some of these efforts is Dr. Mirek Sopek, who founded digital solutions agency MakoLab S.A. and led its Semantic Web-oriented R&D division. He also is the president of Chemical Semantics Inc., which applies the Semantic Web to computational chemistry.

Schema.org transcends all domains and language barriers, currently containing in its core more than 2,000 terms (753 types, 1,200 properties, and 220 enumerations), he explained at the recent Enterprise Data World 2017 Conference. It covers common entities, actions, and the relationships among them. Equally critical, he believes, is that schema.org offers an extremely convenient framework to build and deliver industrial ontologies, a subclass of domain ontologies, to represent concepts used in any given industry.

That said, some domain concepts may be adopted into the schema.org core when they become important beyond their specific vertical. For example, in the financial services industry the EDM Council proposed adding to the core basic terminology from FIBO (Financial Industry Business Ontology) around financial products, such as BankAccount and LoanOrCredit, because of how applicable they are to the public and to companies overall. In May, the council said it would provide schema.org with its FIBO content for defining financial concepts to create a shared global financial language so that website administrators could easily tag their content to generate more intelligent search results.
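To make the idea of tagging financial content concrete, here is a minimal sketch in Python that builds a JSON-LD description of a loan using the LoanOrCredit type mentioned above. The helper name `loan_jsonld` and all sample values are invented for illustration, and the choice of properties (`amount`, `annualPercentageRate`, `loanTerm`) is one plausible selection from the schema.org core rather than anything the EDM Council prescribes.

```python
import json


def loan_jsonld(name, amount, currency, apr_percent, term_months):
    """Build a schema.org LoanOrCredit description as a JSON-LD dict.

    Hypothetical helper: the argument names and the property selection
    are illustrative assumptions, not part of FIBO or schema.org itself.
    """
    return {
        "@context": "https://schema.org",
        "@type": "LoanOrCredit",
        "name": name,
        "amount": {
            "@type": "MonetaryAmount",
            "value": amount,
            "currency": currency,
        },
        "annualPercentageRate": apr_percent,
        "loanTerm": {
            "@type": "QuantitativeValue",
            "value": term_months,
            "unitCode": "MON",  # UN/CEFACT common code for months
        },
    }


# Invented sample values, for illustration only.
example = loan_jsonld("Example Auto Loan", 20000, "USD", 4.5, 60)
print(json.dumps(example, indent=2))
```

A site administrator would typically embed the printed JSON in a `<script type="application/ld+json">` element on the page that describes the product.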

The State of Extensions

Sopek noted that a few classes and properties from the automotive domain extension also were taken into the core because of their value beyond the auto industry. But as an industrial extension, auto.schema.org was published as a hosted extension last year, with proof of concept being the next step.

Auto.schema.org, the first phase of the General Automotive Ontology that builds upon ontologies used by MakoLab for semantic search insights, is aimed at supporting richer automotive data description to search engine users. It contains about 300 classes and about 40 properties, Sopek said, and efforts are underway to convince more companies to use it, including manufacturers and dealers who want to capture the attention of consumers searching online for vehicles. Toyota, Renault, and Nissan are currently involved.

Carinsearch.org gives a perspective on what benefits can be derived from leveraging the schema extension on web sites to promote smarter search. Car search results can be displayed in similar ways with more meaningful data: In response to a search for “compact car with good acceleration,” for example, relevant search results will pop up that specify the body type, the km/h acceleration, top speed, and other factors. There’s a “unique opportunity for car manufacturers who are able to provide detailed, clear, compelling information on their websites through search, to capture audience attention and begin the process of swaying their decision,” Carinsearch states.
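As a rough illustration of why this markup helps, the sketch below describes a car with the `bodyType`, `accelerationTime`, and `speed` properties and then answers a query like "compact car with good acceleration" from the structured data alone. The listing values, the `matches_query` helper, and the nine-second threshold are all assumptions made for the example; a real search engine's ranking is far more involved.

```python
import json

# Hypothetical listing; every value here is invented for illustration.
car = {
    "@context": "https://schema.org",
    "@type": "Car",
    "name": "Example Compact",
    "bodyType": "Compact",
    "accelerationTime": {
        "@type": "QuantitativeValue",
        "name": "0..100 km/h",  # reference speeds carried in the name
        "value": 8.2,
        "unitCode": "SEC",  # seconds
    },
    "speed": {
        "@type": "QuantitativeValue",
        "maxValue": 190,  # top speed
        "unitCode": "KMH",  # kilometres per hour
    },
}


def matches_query(listing, body_type, max_seconds):
    """Return True if the listing has the body type and accelerates fast enough."""
    accel = listing.get("accelerationTime", {}).get("value")
    return (
        listing.get("bodyType", "").lower() == body_type.lower()
        and accel is not None
        and accel <= max_seconds
    )


print(json.dumps(car, indent=2))
print(matches_query(car, "compact", 9.0))
```

Because body type and acceleration are explicit fields rather than free text, the match does not depend on how the manufacturer happened to word the page.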

Sopek also pointed to the additional potential for the auto industry to take advantage of adding extension and core schema.org entities into tools like Google Analytics. “Marketers can ask important questions, like what car color to use in display campaigns,” he said, based on frequency of visits to pages that feature cars in different colors.

Taking the Leap

Many industry ontologies, Sopek said, can qualify for the schema.org extension transformation. And he hopes to see more people convinced of the extensibility of schema.org, so that they build on it and convert their individual industry ontologies to web-scale ontologies.

In this vein, another effort he took note of was the work being done to extend the GS1 Web Vocabulary to the semantic web. “I strongly believe it will change e-commerce,” he said.

“Imagine if you work with Amazon or eBay and your products are described deeper using the GS1 extension. Why couldn’t Google use that and then others too? So, there is a process that could lead to a widespread use of these extensions.”

Eric Kauz, GS1 Director Data Systems, discussed the industry standards organization’s work, which revolves mainly around retail and supply chain, healthcare, and transport and logistics, at EDW. Its standards include GDSN (Global Data Synchronization Network), a network of interoperable product data pools enabling collaborating users to securely synchronize Master Data based on GS1 standards, and the GPC product classification used in that network. “Our goal was to take attributes defined in our global data dictionary and migrate to build a vocabulary out of that,” he said.

GS1 wanted to make its existing standards available in a machine-interpretable format for the web and to provide an extension to schema.org. Last year it published an initial release of its Web Vocabulary as the first schema.org external extension.

Focusing on consumer-facing properties for clothing, shoes, food, beverage, and tobacco, as well as attributes common to all products, the vocabulary is one aspect of the GS1 SmartSearch standard. GS1 says the standard lets businesses benefit from better search results that help consumers find the products and information they need; greater visibility of their products in online searches; improved, accurate online product information; and shared product information via consumer-facing mobile devices and websites, all of which ultimately drive sales.

The extension was an opportunity to introduce more attribute depth in product classes, such as AllergenDetails, Kauz said. AllergenDetails is an important attribute in Europe, for instance, where the EU 1169 regulation requires providing nutritional information in electronic format. “We had a lot of allergen details in our Global Data Dictionary already, so we just built that out,” he said. Key was not just adding properties for the sake of doing so, but to “look where we were strong” and had something to offer to enhance schema.org capabilities. It also builds upon some existing schema properties, like Warranty, with some additional information.

The preference for the GS1 community, though, “is to use schema.org properties where available, and where not, use GS1,” Kauz said. He recommends that other extension efforts also be as consistent with schema.org as possible, for instance by not defining the same property with a different definition and data type. There are some challenges, of course, like keeping up with the moving target of schema.org additions and checking them for conflicts.
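The "schema.org first, GS1 otherwise" rule can be sketched as a simple lookup. In the Python below, the GS1 namespace URL, the hand-picked set of schema.org property names, the `hasAllergen` property name, and the sample product are all assumptions for illustration; none of them has been checked against the published GS1 Web Vocabulary.

```python
SCHEMA_ORG = "https://schema.org/"
GS1_VOC = "https://gs1.org/voc/"  # assumed namespace for the GS1 Web Vocabulary

# Hypothetical, hand-picked subset of schema.org properties. A real tool
# would load the current schema.org release instead of hard-coding names.
SCHEMA_ORG_PROPERTIES = {"name", "description", "brand", "color", "gtin13"}


def resolve_property(local_name):
    """Prefer the schema.org term when one exists; otherwise fall back to GS1."""
    if local_name in SCHEMA_ORG_PROPERTIES:
        return SCHEMA_ORG + local_name
    return GS1_VOC + local_name


def expand(product):
    """Rewrite a product's short property names as full vocabulary URLs."""
    return {resolve_property(key): value for key, value in product.items()}


# Invented product; "hasAllergen" is an illustrative property name only.
product = {
    "name": "Example Granola Bar",
    "brand": "Example Foods",
    "hasAllergen": "peanuts",
}

for url, value in expand(product).items():
    print(url, "->", value)
```

Keeping the schema.org list in one place also gives a natural spot to re-check for conflicts whenever schema.org adds new terms, which is the maintenance challenge Kauz describes.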

Creating the extension led to increased visibility of the GS1 Web Vocabulary and use of its language, he noted, which was one positive outcome. There is also potential integration with Google tools, like structured data testing tools, he said.

Kauz said that Tesco and some other organizations, including coupon retailers, are starting to implement small pilots of the extension. Europe and Brazil are promising areas, as that’s where the uptake in the GS1 Web Vocabulary has been. “The U.S. is getting there but there aren’t a lot of implementations yet,” he said.


Photo Credit: alphaspirit/Shutterstock.com

About the author

Jennifer Zaino is a New York-based freelance writer specializing in business and technology journalism. She has been an executive editor at leading technology publications, including InformationWeek, where she spearheaded an award-winning news section, and Network Computing, where she helped develop online content strategies including review exclusives and analyst reports. Her freelance credentials include being a regular contributor of original content to The Semantic Web Blog; acting as a contributing writer to RFID Journal; and serving as executive editor at the Smart Architect Smart Enterprise Exchange group. Her work also has appeared in publications and on web sites including EdTech (K-12 and Higher Ed), Ingram Micro Channel Advisor, The CMO Site, and Federal Computer Week.
