The media industry has had a complicated relationship with the Web, and that’s putting it kindly. While other sectors quickly found ways to take advantage of that new thing called the Internet – to sell goods, accelerate supply chains, and build deeper customer relationships – established content providers spent years trying to figure it out. Many are still tussling with big issues, such as whether to charge for access to content.
Given the Web’s impact on their business models and revenues, you can forgive publishers for wishing the darn Internet would just stand still for a few minutes and let them catch their breath and catch up. Since that isn’t about to happen, the thing to do is to make peace with the changes, many of them driven by Semantic Web technologies, and figure out fast how to profit from them.
They’ll have an opportunity to do just that at the upcoming Semantic Web Media Summit in New York City, where speakers will include Michael Dunn, VP and CTO at Hearst Interactive Media, addressing why media companies should be interested in this critical part of the Web 3.0 world.
Dunn sees a number of reasons for using Semantic Web technologies to structure the wealth of content that publishers produce. Improving its discoverability via search and social is one, of course, but structure matters for internal operations, too. Add to that the tie-in with online advertising, so that content can be better monetized.
You might prefer to think of all these aspects as part of the bigger picture of being able to harness new opportunities at the speed of innovation. Think, he urges, of IPv6 (Internet Protocol version 6), which supports some 340 undecillion addresses. “All of a sudden there will be new, additional IP space and innovative companies will create new devices or new services or new aggregation methodologies with new ways to demand content,” he says. “You want to be ready for that innovation. You want your content to be more like data so that you can just enable it for whatever you need.”
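For a sense of scale, the “340 undecillion” figure follows from IPv6’s 128-bit address length. A quick back-of-the-envelope sketch in Python (the variable names are illustrative only) shows the arithmetic:

```python
# IPv6 addresses are 128 bits long; IPv4 addresses are 32 bits long.
ipv6_total = 2 ** 128
ipv4_total = 2 ** 32

# One undecillion is 10**36 on the short scale.
UNDECILLION = 10 ** 36

# Whole undecillions in the IPv6 address space.
ipv6_in_undecillions = ipv6_total // UNDECILLION

# How many times larger the IPv6 space is than the IPv4 space.
growth_factor = ipv6_total // ipv4_total

print(f"IPv6 addresses: {ipv6_total:,}")
print(f"Roughly {ipv6_in_undecillions} undecillion")
print(f"That is 2**96 times the IPv4 space: {growth_factor == 2 ** 96}")
```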
You don’t need to wait for IPv6 to run into the problem of not being able to meet content demands efficiently. That problem exists today. Consider a diversified publisher such as Hearst, whose holdings range from newspapers and magazines to TV and cable channels, and the potential it has to get more leverage out of existing content across titles or venues, or to use what’s already there to jump into a new space.
“As new markets come up in existing or new industries, we don’t really have a method of seeing the full spectrum of content that’s at our disposal,” Dunn says. That leads to recreating content or purchasing it from a service provider – maybe multiple times – or simply missing out on opportunities. “So the initial draw of semantic technology was as a structured way to contextualize or categorize our content and put it into a format so that not only our internal tools and workflow could understand the breadth of content we have, but so that it’s in a structured, leveragable format for external environments we work within.”
If content is properly structured so that it can be queried, he says, you can easily find out whether you have exactly the content you need and format it for the new opportunity.
Bigger publishing organizations would be wise to start experimenting now with semantic technologies through focused trials, especially in the face of competition from smaller or more specialized entities that can more easily jump into structured-content opportunities. At Hearst, plans are underway to trial rNews, a set of specifications and best practices for using RDFa to embed news-specific metadata (headlines, bylines, publication dates and so on) into HTML documents, to see whether it improves visibility across search engines, social platforms and aggregation sites, he says.
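To make the idea of querying structured content concrete, here is a minimal sketch in Python. It is not Hearst’s system: the `ContentItem` fields, the catalog entries and the `find_content` helper are all hypothetical, chosen only to show how consistent metadata lets one question be asked across every title at once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentItem:
    """One piece of content described by structured metadata (hypothetical fields)."""
    title: str
    brand: str
    medium: str
    topics: frozenset


# Hypothetical catalog spanning several brands and media types.
CATALOG = [
    ContentItem("Grilling basics", "Example Magazine", "article", frozenset({"food", "summer"})),
    ContentItem("Weeknight pasta", "Example Daily", "article", frozenset({"food"})),
    ContentItem("Backyard BBQ special", "Example TV", "video", frozenset({"food", "summer"})),
    ContentItem("Road trip guide", "Example Magazine", "article", frozenset({"travel", "summer"})),
]


def find_content(catalog, topic, medium=None):
    """Return items tagged with `topic`, optionally restricted to one medium."""
    return [
        item for item in catalog
        if topic in item.topics and (medium is None or item.medium == medium)
    ]


# What do we already own about food, in any medium and under any brand?
food_items = find_content(CATALOG, "food")
# And which of the summer content is video?
summer_videos = find_content(CATALOG, "summer", medium="video")

print([item.title for item in food_items])
print([item.title for item in summer_videos])
```

A production setup would more likely keep this metadata in a shared store and query it with a language such as SPARQL; the in-memory list here only stands in for that.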
“If we improve structure via metadata, in a standardized way as part of rNews, in the end the gain is we are going to improve the viability of our content in those platforms we already interact with,” Dunn says. “In today’s world those entities scrape our content or get an XML feed. If we put it into a standardized, RDFa specific structure that has news-defined metadata, are we going to see better results? That is a very straightforward, very concrete thing.”
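For readers who have not seen RDFa, the sketch below uses Python to render the kind of HTML fragment that rNews-style markup involves. Treat it as an illustration under assumptions rather than a reference implementation: the namespace URI, the `NewsItem` class and the property names (`headline`, `creator`, `datePublished`, `articleBody`) should be checked against the IPTC rNews specification, and the `render_article` helper and the sample values are hypothetical.

```python
from html import escape

# Assumed rNews vocabulary namespace; verify against the IPTC specification.
RNEWS_NS = "http://iptc.org/std/rNews/2011-10-07#"


def render_article(headline: str, byline: str, published: str, body: str) -> str:
    """Return an HTML fragment whose RDFa attributes label each piece of news metadata."""
    return (
        f'<div xmlns:rnews="{RNEWS_NS}" typeof="rnews:NewsItem">\n'
        f'  <h1 property="rnews:headline">{escape(headline)}</h1>\n'
        f'  <p>By <span property="rnews:creator">{escape(byline)}</span></p>\n'
        f'  <span property="rnews:datePublished" content="{escape(published)}">'
        f'{escape(published)}</span>\n'
        f'  <div property="rnews:articleBody">{escape(body)}</div>\n'
        f'</div>'
    )


# Hypothetical sample values, for illustration only.
fragment = render_article(
    headline="Publishers & the Semantic Web",
    byline="A. Reporter",
    published="2011-01-01",
    body="Example body text.",
)
print(fragment)
```

The page looks the same to a human reader; the difference is that a search engine or aggregator no longer has to guess which string is the headline and which is the byline.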
The end game of semantic-infused content? Maximum control over it, API-enabled access to it, the ability to build analytics and metrics around what you do with it, and the chance to meet the voluminous content needs and opportunities that exist now and that will increase. “It’s not necessarily easy for publishers to begin to think of content as part of their platform,” he says.
But if they start thinking of it that way now, maybe the publishing industry’s transition to the next stage of the web won’t be quite as arduous as it has previously been.