Just over a year ago, The Semantic Web Blog introduced our readers to Gravity in this article. The project, spearheaded by former MySpace execs, is focused on building the Interest Graph. The team’s been pretty quiet about development efforts since that time, until just this month, when it announced Gravity Labs to let the public in on a little more about its underlying Interest Graph infrastructure and to showcase the platform. It also announced that it was open-sourcing some of the “plumbing” code it came up with during development, while understandably keeping its core IT, ontology and algorithms under wraps.
The announcement noted that the internally named Gravity Interest Service, which personalizes content at scale and in real time, went live at production scale six months ago. So far the technology has created over 400 million user interest graphs; served over 13 million pieces of personalized content per day; personalized the daily Internet experience of tens of millions of users per month; and processed over 25 million inbound interest signals per day, the company says. At this rate, it expects to be handling 10 times all of these numbers in under six months.
The Semantic Web Blog once again caught up with Gravity CTO Jim Benedetto to talk some more about the Interest Graph, a term he acknowledges gets thrown around quite a bit these days, with a lot of web sites claiming they’ve got the goods. But, he says, “what they effectively are saying is that buried deep within the data of our logs or deep in the data of how our users interact with our site, we know there are interest indicators there. But a lot of them are not doing much with their data.” Interest Graphs, he says, aren’t owned; rather, interest data resides in individual places and across the web at large, and sites need the Gravity platform to help unlock that data to create dynamic and personalized experiences for users.
“The local sports site I go to will always have a better sports graph on me than the finance site I go to,” he explains. “So this is not a winner-take-all type of situation, and we realized that early on. So we built the underlying platform or infrastructure that lets these sites unlock their Interest Graphs, which then effectively unlocks the Interest Graph across the entire web.”
Gravity’s platform is live now on partner sites including the Wall Street Journal and TechCrunch, and Benedetto says 15 other partners will be going live with it over the coming months. Major news publishers and content providers clearly can benefit from providing their audience content that is more personalized to their interests, but so too can daily deal e-commerce sites. “These have very similar features and attributes as news articles,” he says. “Short half-lives, to be targeted and personalized right away. By the time there are ten thousand clicks on a daily deal to build a collaborative filtering matrix, the deal might have been sold out.”
In our last article, Benedetto talked to The Semantic Web Blog about what sets its technology apart when it comes to personalization. For instance, Gravity isn’t based on explicitly asking users about their interests, but uses implicit NLP (natural language processing) and machine learning to look at people’s actions and behaviors and determine what really draws them or is primed to catch their eye. Inferencing is an important part of that picture. In the year since, Benedetto says the team has been working on relevancy as key to a successful Interest Graph, which can build on all the work that has gone into ensuring the factual accuracy of public ontologies such as DBpedia and Freebase. A core focus has been “being able to build both machine-learning processes that derive relevancy and can infer relevancy from behavioral actions and publicly available data across the web in standard ontology forms in our own web crawl, and building algorithms to leverage that derived data,” he says.
As an example, he points to Kobe Bryant: there are a hundred or more facts about him that exist in ontologies, such as who he’s played for, in what years, his Olympics experience or his Gatorade sponsorship. “All those facts are accurate,” Benedetto says, “but what is relevant to a human? If I gave you a list where someone said they liked Kobe Bryant, what would that person be interested in? A human being could say he’s not interested in his work for Gatorade or performance in the Olympics, even though both are true.” What’s more relevant is getting computers to infer from the fact that someone likes Kobe Bryant that that person also likely is interested in basketball, just as another human could draw that conclusion. And, he says, Gravity’s work tuning for relevancy has resulted in dramatic improvements across its partner sites in areas such as session length, click-through rates, and return visits, improvements that are “an order of magnitude higher than in previous technologies.”
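To make the accuracy-versus-relevancy distinction concrete, here is a minimal sketch in Python. It is not Gravity’s algorithm, which the company keeps under wraps; the `EDGES` list, the relevance weights, the `infer_interests` function and the 0.5 threshold are all hypothetical, invented purely for illustration. Every fact in the list is “true,” but only the ones with a high relevance weight are treated as inferred interests.

```python
from collections import defaultdict

# Hypothetical (entity, related topic, relevance) triples. All of these facts
# are accurate, but the relevance weights are made up for this illustration.
EDGES = [
    ("Kobe Bryant", "Basketball", 0.9),
    ("Kobe Bryant", "Los Angeles Lakers", 0.8),
    ("Kobe Bryant", "Olympics", 0.2),
    ("Kobe Bryant", "Gatorade", 0.05),
]


def infer_interests(liked, edges, threshold=0.5):
    """Return topics whose relevance to any liked entity meets the threshold."""
    scores = defaultdict(float)
    for entity, topic, relevance in edges:
        if entity in liked:
            # Keep the strongest relevance seen for each topic.
            scores[topic] = max(scores[topic], relevance)
    return {topic: score for topic, score in scores.items() if score >= threshold}


inferred = infer_interests({"Kobe Bryant"}, EDGES)
# Most relevant topics first; low-relevance facts are dropped.
print(sorted(inferred, key=inferred.get, reverse=True))
```

In this toy setup, someone who likes Kobe Bryant is inferred to care about basketball and the Lakers, while the equally factual Gatorade and Olympics links fall below the threshold. In a real system the weights would presumably be learned from behavior rather than hand-assigned.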
Moving forward, the challenge “is paramount to accomplishing the goal of personalizing the Internet across all domains and letting each machine learn and understand interests and intent based on relevancy in a non-specific fashion.” Gravity is also accounting for content virality and importance in the Facebook and Twitter realm, to ensure that people don’t miss something important just because it isn’t correlated with their Interest Graph. “We use that data set to help build serendipity in our equations, to stop the filter bubble effect,” he says. So, even if someone’s Interest Graph profile doesn’t itself indicate a particular affinity for news about weather events, when a tsunami occurs and the traction it gets in the social realm indicates it to be an important happening, that information can become part of personalized results.
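One simple way to picture “serendipity in our equations” is a weighted blend of a personal interest score and a social virality score. The sketch below assumes exactly that; the `blended_score` function, the `serendipity_weight` parameter and the sample scores are hypothetical and are not taken from Gravity’s actual equations.

```python
def blended_score(interest, virality, serendipity_weight=0.3):
    """Mix a personal interest score with a social virality score (both 0..1)."""
    return (1 - serendipity_weight) * interest + serendipity_weight * virality


# Hypothetical articles with made-up scores for a single user.
articles = [
    {"title": "Lakers trade rumors", "interest": 0.9, "virality": 0.2},
    {"title": "Tsunami warning issued", "interest": 0.05, "virality": 1.0},
    {"title": "Local garden show", "interest": 0.1, "virality": 0.05},
]

ranked = sorted(
    articles,
    key=lambda a: blended_score(a["interest"], a["virality"]),
    reverse=True,
)

for article in ranked:
    print(article["title"])
```

With these numbers the tsunami story has the lowest interest score of the three, yet its virality lifts it above the garden show and into the user’s results. Raising `serendipity_weight` would push socially important stories even higher, at the cost of weaker personalization.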
Gravity’s goal of being the plumbing of the Interest Graph – the infrastructure behind it – isn’t at odds with its becoming as known to consumers as Facebook or Twitter is, Benedetto says. “We hope the end user will know us. We want to be prominent on every site that we help to personalize. We want the user to know Gravity makes the individual web experience better,” he says. “We don’t necessarily believe that a consumer site is necessary for us to be well known by users. Anywhere they go they can interact with Gravity across any site on the web, and it becomes better and more personalized to them.”