Simply put, RDFa is another syntax for RDF. The interesting aspect of RDFa is that it is embedded in HTML. This means that you can state what things on your HTML page actually mean. For example, you can specify that a certain piece of text is the title of a blog post, the name of a product, or the price of that product. This is becoming commonly known as “adding semantic markup”.
Historically, RDFa was specified only for XHTML. Currently, RDFa 1.1 is specified for XHTML and HTML5. Additionally, RDFa 1.1 works for any XML-based language such as SVG. Recently, RDFa Lite was introduced as “a small subset of RDFa consisting of a few attributes that may be applied to most simple to moderate structured data markup tasks.”
It is important to note that RDFa is not the only way to add semantics to your webpages. Microdata and Microformats are other options, and I will discuss them later on. As a reminder, you can publish your data as Linked Data through RDFa: inside your markup, you can link to other URIs, and others can link to your HTML+RDFa webpages.
Why publish RDFa?
The web is evolving from human-only consumption to human-machine consumption. Have you ever needed data from a website, and the only way to extract it was to write a regex-based scraper in Perl specific to that website? And if the HTML of the page changed, your scraper would fail and you would have to tweak it again. By adding RDFa to your webpage, those days are gone! RDFa processors can easily extract all the structured data from a webpage with the click of a button. This means that data can be easily shared and reused, without having to maintain the raw structured data in a separate file in a separate format.
Sharing and reuse of data is not the only advantage. Search engines have started to realize that if publishers add semantics to their webpages, it becomes much easier to figure out what a page actually means. Yahoo was a pioneer in this area, starting with SearchMonkey. Google followed with Rich Snippets. Recently, Google, Yahoo, Bing and other search engines got together and created Schema.org, a recommendation for publishers on how to semantically mark up their webpages. By adding the semantic markup, search engines can understand what the page is about and offer more relevant search results to users. Google Recipe is an example of what can be done when structured data is published on the web. Additionally, third-party tools can also take advantage of semantic markup on webpages.
How to publish RDFa?
There are several ways to add RDFa markup to your HTML. You could add it manually in your HTML code. For example, take this plain HTML:
<p> My name is Juan Sequeda. Please visit my homepage! </p>
With RDFa, this could look like:
<p vocab="http://schema.org/" typeof="Person"> My name is <span property="name">Juan Sequeda</span>. Please visit <a property="url" href="http://www.juansequeda.org/">my homepage</a>. </p>
You may have noticed that I used schema.org to add the semantic markup. Schema.org is a vocabulary for stating what things are about. It tries to cover a lot of domains, from TV episodes all the way to bars! Schema.org offers a complete list of the things that can be stated with the vocabulary. You can also use other vocabularies, such as GoodRelations for e-commerce or FOAF for social networks.
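To make the markup above a bit more concrete, here is a minimal sketch of how a program could read those attributes back out. It uses only Python's standard html.parser module. The class name TinyRDFaExtractor and the function extract are made up for this illustration, and this is not a conforming RDFa processor: it assumes a single vocab, no nested property elements, and only the vocab, typeof, property and href attributes. For real use you would want an actual RDFa library.

```python
from html.parser import HTMLParser

# The RDFa snippet from the example above.
HTML = (
    '<p vocab="http://schema.org/" typeof="Person"> My name is '
    '<span property="name">Juan Sequeda</span>. Please visit '
    '<a property="url" href="http://www.juansequeda.org/">my homepage</a>. </p>'
)


class TinyRDFaExtractor(HTMLParser):
    """Hypothetical helper: collects (property IRI, value) pairs.

    Assumption: flat markup only (no property elements nested inside
    other property elements), which is all the example needs.
    """

    def __init__(self):
        super().__init__()
        self.vocab = ""
        self.item_type = None
        self.pairs = []
        self._pending = None  # property whose text content is being collected
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "vocab" in attrs:
            self.vocab = attrs["vocab"]
        if "typeof" in attrs:
            # Resolve the type name against the vocabulary IRI.
            self.item_type = self.vocab + attrs["typeof"]
        prop = attrs.get("property")
        if prop is None:
            return
        if "href" in attrs:
            # For links, the value is the link target, not the link text.
            self.pairs.append((self.vocab + prop, attrs["href"]))
        else:
            self._pending = self.vocab + prop
            self._text = []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._pending is not None:
            self.pairs.append((self._pending, "".join(self._text).strip()))
            self._pending = None


def extract(html):
    """Return (type IRI, list of (property IRI, value)) for the snippet."""
    parser = TinyRDFaExtractor()
    parser.feed(html)
    parser.close()
    return parser.item_type, parser.pairs


item_type, pairs = extract(HTML)
print(item_type)
for prop, value in pairs:
    print(prop, "->", value)
```

The point of the sketch is that the extraction logic depends only on the RDFa attributes, not on where the text sits in the page layout, which is what makes it more robust than a site-specific scraper.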
If you don’t want to write the semantic markup by hand, you can use one of several tools. For example, Drupal 7 already includes support for RDFa. Another option is to use the free extensions that are available for popular e-commerce software, such as MSemantic for the Magento e-commerce platform.
RDFa, Microdata and Microformats
RDFa is not the only way to semantically mark up your webpage. Microdata and Microformats are two other ways to accomplish the same goal. For example, Schema.org currently supports Microdata and has recently announced that it will support RDFa. Each approach has its own pros and cons. Jeni Tennison has extensively compared Microdata and RDFa and states that “the choices developers make between microdata and RDFa will, I suspect, be largely dictated by what their consumers/toolsets/publishers will support.” She also points out the features that are better supported by each approach:
- multilingual embedded data is better supported in RDF than microdata+JSON
- explicit datatypes for values can be provided by RDFa but not microdata
- resources with multiple types are a lot easier to describe in RDFa
- property values that include markup are a lot easier to write in RDFa
- mixed vocabulary use is a bit easier in RDFa than in microdata
- HTML5 link relations may be misinterpreted by RDFa processors
- properties with list values are much easier to support in microdata
- common content adopted by multiple entities is much easier in microdata
Jeni is also the editor of the HTML Data Guide, which aims to help publishers and consumers to use RDFa and Microdata correctly.
Wrapping it up
The web is not just about HTML anymore. The web is now about data. Publishing structured data on the web is what the Semantic Web is all about. Adding semantic markup with RDFa will allow us to start doing things that were not possible before.