New and updated APIs for Infochimps take some next steps toward expanding developers’ access to structured data and possibly influencing sentiment analytics.
One of the two new API calls maps Infochimps’ IP-Census data to data from MaxMind’s GeoLite commercial IP-geolocation database. Infochimps says it returns census-level demographics based on the ZIP code associated with a given IP address, which helps in use cases such as targeting Twitter users with online ads based on their demographic profiles or filtering users by demographics for research. MaxMind’s API for its GeoLite data isn’t RESTful, says Infochimps COO Joseph Kelly, while “our API, because it’s RESTful, means 99 percent of the population out there can use it…. That’s one aspect to bringing that type of data into a broader audience by making it more easily accessible.”
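To make the idea of that join concrete, here is a minimal sketch of chaining an IP-to-ZIP lookup with a ZIP-to-demographics table. It is not Infochimps’ or MaxMind’s actual API: the function names, the field names, the IP ranges (taken from the reserved documentation block 192.0.2.0/24) and every figure in the tables are invented for illustration.

```python
import ipaddress

# Hypothetical stand-in for an IP-geolocation database: network -> ZIP code.
IP_RANGES = [
    (ipaddress.ip_network("192.0.2.0/25"), "90210"),
    (ipaddress.ip_network("192.0.2.128/25"), "79901"),
]

# Hypothetical census-style aggregates keyed by ZIP code (all values invented).
CENSUS_BY_ZIP = {
    "90210": {"median_household_income": 100000, "population": 20000},
    "79901": {"median_household_income": 20000, "population": 10000},
}


def zip_for_ip(ip):
    """Return the ZIP code whose network contains the address, or None."""
    addr = ipaddress.ip_address(ip)
    for network, zip_code in IP_RANGES:
        if addr in network:
            return zip_code
    return None


def demographics_for_ip(ip):
    """Join the two tables: IP -> ZIP -> area-level demographics."""
    zip_code = zip_for_ip(ip)
    if zip_code is None:
        return None
    profile = CENSUS_BY_ZIP.get(zip_code)
    if profile is None:
        return None
    # The result describes an area, not an individual user.
    return {"zip": zip_code, **profile}
```

Calling `demographics_for_ip("192.0.2.10")` returns the invented profile for ZIP 90210, while an address outside both ranges returns `None`.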
He also thinks users would be unlikely to find the mix of census and geolocation data in such a comprehensive format elsewhere. “It’s interesting that this is a merging of two different data sets, one from a commercial supplier and one from a free and public resource. So from our own point of view of what we want Infochimps to be for people, that is a key part of our platform: look, find, mash together and create value for everyone,” Kelly says. As he sees it, that is in line with the semantic web’s goal of creating a connected Internet. “When you have data in RDF triples it’s the easiest stuff to mash together,” he says. “That’s about connecting knowledge and creating value, too.”
Infochimps is also serving up another API call, Strong Links, which returns a list of the users with whom a given user communicates the most. And it has refreshed several API datasets: Trstrank, which now has a new field called Trstquotient that can be used as a spam indicator; Conversations, which now gives a full summary of interactions between users; and Influencer Metrics, which has added fields that help organizations get a finer read on a user’s influence footprint in the Twitter world (such as the number of tweets a user sees in a day and the ratio of the number of times a user has been retweeted to the number of tweets the user has sent). Those measures could matter to marketers and advertisers in their brand evangelism and outreach efforts.
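As a simple illustration of the retweet ratio mentioned above, the calculation might look like the sketch below. The function and parameter names are hypothetical, not fields from the Influencer Metrics API, and returning zero for a user with no tweets is an assumption made for the example.

```python
def retweet_ratio(times_retweeted, tweets_sent):
    """Ratio of times a user was retweeted to the number of tweets sent."""
    if tweets_sent == 0:
        # Assumption: treat a user with no tweets as having a ratio of zero.
        return 0.0
    return times_retweeted / tweets_sent
```

A user who has sent 200 tweets and been retweeted 50 times would score 0.25.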
From Quant to Qual
Those are all quantitative results, but Kelly says there is potential to bring qualitative results into the picture in some way, as well. Imagine, for example, companies developing against Infochimps’ existing API calls to understand who the main influencers are in discussions about their product, and also developing against APIs to understand the tenor of those influencers’ posts. “We don’t do text analysis right now,” Kelly says. But Infochimps is in discussions with a possible partner, whose name Kelly can’t disclose, “who has best of breed NLP where their software can understand text, and extract entities and sentiments from content…. We would be really excited to work with this partner to offer APIs for that, but it’s still unclear on what that product would look like.”
Probably the closest thing Infochimps does to anything like textual analysis is its Wordbag API, Kelly says, which lets you build a word cloud for a Twitter user in any app. It captures the user’s 100 most characteristic words, and that’s gotten a refresh, too. You can try out the Wordbag API in an example app Infochimps has created called Twwhat here. (The Semantic Web Blog word cloud revealed by Twwhat is pictured at left.)
Kelly says things are still on track for data set refreshes to go weekly by the end of August. And he says Infochimps is keeping an eye on whether any of its latest efforts, such as the IP-Census data mapping to MaxMind’s GeoLite, are raising privacy hackles. He points out that the API call uses public census data and IP addresses that are also public, and just takes things one step further. “To display different offers to someone you think is in Beverly Hills vs. El Paso, we think it serves that kind of use – you’re still just serving an ad to someone you think is in a certain neighborhood, but you don’t know enough about that single person,” he says. “We take privacy concerns very seriously but this is coarse-grained enough that we think it’s ok.”