Parse.ly Brings A Dash of Semantics To Online Publishers

January 24, 2012

Online publishers and other content providers have a new analytics tool to help them understand what their readers care about and use that information to better connect those readers to relevant and compelling content on their sites. Launching today is Dash, based on the predictive content analytics platform Parse.ly. The technology crawls every article page for Parse.ly’s publisher-partners and analyzes the text, in real time and at scale, to identify relevant topics and group related content together. Behind this lies natural language processing technology, which uses language cues hidden inside the text to determine its affiliated topics. To date, Dash has extracted over 350,000 unique topics from the URLs it has crawled during private beta, yielding a healthy taxonomy of the topics users are consuming across the web.

“Most analytics tools, like Google Analytics and Omniture, are one-size-fits-all,” says Sachin Kamdar, co-founder and CEO. Dash indexes a publisher’s posts and cross-references them against corresponding authors, topics, sections and referral sources, and these can be mashed together in various ways to help publishers understand not just that something is moving up or down, but why. It provides visualizations tying in metadata and a post’s lifecycle status so that, for instance, publishers can instantly make the connection that a post published a few weeks ago on a certain topic is moving up in traffic again, thanks to Facebook.

Dash delivers analytics both for the local site, so that you can see, for instance, top performing posts for a specific topic during a certain date range, and for webwide trends, so that publishers can gauge the topics global audiences are gravitating to. “The combination of the two helps you decide what to write about from a planning and promotional perspective,” Kamdar says. That covers not just who should be assigned a topic gaining steam based on past traffic trends for their coverage, but also what audiences to reach out to when a topic resonates across the web.

Dash has processed 4 billion pages so far with a beta pool of publishers that includes The Next Web, Atlantic Media Group, US News, Press Enterprise, Wet Paint, Daily Caller and Apartment Therapy, and has crawled data from over 4 million URLs. It’s on track to do about 700 million page views per month across its publisher network.

In addition to leveraging metadata such as authors, published dates and the sections an article appears in, the NLP capabilities of the tool identify multiple topics inside each story, says co-founder and CTO Andrew Montalenti. So, publishers can learn things like the fact that, at a recent point in time, the number one driver of traffic for stories about Newt Gingrich as a topic was stories with South Carolina as a co-topic. “So what this tells me is that, if I were an editor, if I write about Newt Gingrich now I should be writing about Newt Gingrich in South Carolina,” he says. The co-topics feature is a linchpin of Dash’s search aids, too: publishers looking at a topic such as “Barack Obama” can go to co-topics and see how things either directly or tangentially related to the topic have performed, as they consider next coverage steps.
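The co-topic idea can be illustrated with a small aggregation. The following is a minimal Python sketch, not Parse.ly’s implementation: the post records, field names (`topics`, `views`) and traffic numbers are all invented for illustration, and the `co_topic_traffic` helper is hypothetical.

```python
from collections import defaultdict

# Hypothetical post records; titles, topics and view counts are made up.
posts = [
    {"title": "Gingrich surges in South Carolina",
     "topics": {"Newt Gingrich", "South Carolina"}, "views": 1200},
    {"title": "Gingrich and Romney spar in debate",
     "topics": {"Newt Gingrich", "Mitt Romney"}, "views": 400},
    {"title": "South Carolina primary preview",
     "topics": {"Newt Gingrich", "South Carolina", "Mitt Romney"}, "views": 300},
    {"title": "Obama delivers State of the Union",
     "topics": {"Barack Obama"}, "views": 900},
]


def co_topic_traffic(posts, topic):
    """Sum page views per co-topic across all posts tagged with `topic`."""
    totals = defaultdict(int)
    for post in posts:
        if topic in post["topics"]:
            # Every other topic on the same post counts as a co-topic.
            for other in post["topics"] - {topic}:
                totals[other] += post["views"]
    # Highest-traffic co-topics first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = co_topic_traffic(posts, "Newt Gingrich")
for co_topic, views in ranking:
    print(f"{co_topic}: {views} views")
```

With this sample data, “South Carolina” ranks first among the co-topics for “Newt Gingrich,” which is the kind of signal an editor would use when planning the next story.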

Montalenti says he’s excited by other semantic-related efforts underway that can help Dash bring even more value to its customers – the International Press Telecommunications Council’s rNews standard among them. He’s encouraging more of the publishers Parse.ly works with to consider adopting the standard for using RDFa to annotate news-specific metadata in HTML documents. “It’s not just providing a way to more consistently extract metadata from publications, but for Parse.ly to give publishers a better way to integrate with semantic standards for SEO and CMS (content management system) workaround benefits,” he says. Schema.org this past fall added support for the rNews standard, so that online publishers can implement it using an approach supported by the Google, Bing and Yahoo! search engines. “Part of the educational challenge is to make publishers realize how important microdata standards like that are for concrete benefits like better display in SEO.”
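To show why such annotations make metadata extraction more consistent, here is a minimal Python sketch that reads schema.org `NewsArticle` properties from microdata attributes using only the standard library. It is an illustration under stated assumptions, not Parse.ly’s crawler: the sample markup is hypothetical, the “Analytics” section value is invented, and the `ItempropCollector` class is a simplified helper that ignores nested items.

```python
from html.parser import HTMLParser

# Hypothetical article markup annotated with schema.org microdata.
SAMPLE = """
<article itemscope itemtype="http://schema.org/NewsArticle">
  <h1 itemprop="headline">Parse.ly Brings A Dash of Semantics To Online Publishers</h1>
  <span itemprop="articleSection">Analytics</span>
  <meta itemprop="datePublished" content="2012-01-24">
</article>
"""


class ItempropCollector(HTMLParser):
    """Collect itemprop values from `content` attributes or element text."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._current = None  # itemprop whose text is being captured

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        if name is None:
            return
        if "content" in attrs:
            # <meta itemprop="..." content="..."> carries its value inline.
            self.props[name] = attrs["content"]
        else:
            self._current = name
            self.props[name] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.props[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: stop capturing at the first closing tag.
        self._current = None


collector = ItempropCollector()
collector.feed(SAMPLE)
metadata = {name: value.strip() for name, value in collector.props.items()}
print(metadata)
```

Because the property names are standardized, the same few lines work across any publisher that adopts the markup, with no per-site scraping rules.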

The publishing community does get semantics as far as it relates to things such as the Facebook Open Graph protocol, where they see the value pretty immediately in the Likes they get. “Dash is positioned to bring the same sort of immediate impact into publishing organizations, where by picking up all the semantic data will show how powerful it can be from a tracking, planning and promotional perspective,” Kamdar says.

At launch the company is starting to make its API available to its private publisher-partners, as a tool for their technical staff rather than their editorial teams. They can use it to access information that lets them integrate trending topic and author tracking, along with other functionality, deeply into their websites, so they can better engage their users using their own data.

Pricing for Dash begins at $499/month and varies based on the desired feature set of the publisher. The company was incubated at DreamIT Ventures, and is financed by Blumberg Capital, ff Venture Capital, Scott Becker, Don Hutchinson, Jeffrey Greenblatt and Jon Axelrod.

About the author

Jennifer Zaino is a New York-based freelance writer specializing in business and technology journalism. She has been an executive editor at leading technology publications, including InformationWeek, where she spearheaded an award-winning news section, and Network Computing, where she helped develop online content strategies including review exclusives and analyst reports. Her freelance credentials include being a regular contributor of original content to The Semantic Web Blog; acting as a contributing writer to RFID Journal; and serving as executive editor at the Smart Architect Smart Enterprise Exchange group. Her work also has appeared in publications and on web sites including EdTech (K-12 and Higher Ed), Ingram Micro Channel Advisor, The CMO Site, and Federal Computer Week.
