Swingly stepped up to the plate this morning. The new service, which we first mentioned here, takes text on the web – from tweets to NY Times articles to large structured web databases – and builds up what it says is the world’s first web-scale answer engine, with north of 100 billion Q&A pairs so far. At launch, its focus is on answering factual questions: it uses NLP to extract and index factual information from documents, applies semantic inference techniques to recognize links between questions and their answers, and builds a PageRank-style graph to quickly identify authoritative content. The company sees its work not only – and perhaps not even mostly – as a standalone answer engine, but as a complement to existing social Q&A services, among other outlets.
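For readers unfamiliar with the idea, here is a minimal sketch of how a PageRank-style score can be computed over a link graph. It is not Swingly’s implementation, which the company has not published; the function name `pagerank`, the damping factor of 0.85, and the tiny example graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Illustrative power-iteration PageRank (not Swingly's actual method).

    links maps each node to the list of nodes it points to.
    Returns a dict of node -> score; scores sum to roughly 1.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly among the nodes it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing links spreads its rank to everyone.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: sources "a" and "b" both cite "c"; "c" cites "a".
scores = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
most_authoritative = max(scores, key=scores.get)
```

In this toy graph, the source cited by the most others ends up with the highest score, which is the intuition behind using such a graph to surface authoritative content.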
With its roots in the NLP work done by government contractor Language Computer Corp., the privately funded Swingly spin-off tries to understand the user’s query intent, according to founder and managing partner Andy Hickl. “Then we take that plus what we understand about the query and the words or names mentioned in it, and we try to find the answers that are most authoritative to satisfy the user’s needs,” he says.
What sets its ability to do this apart from others with similar goals, he says, is “the amount of semantics we can bring in from the text,” which adds up to annotating about 10,000 different types, from TV shows to business execs, and making those annotations available to its search model. “So if you have that level of semantics you can piece things together to get fine-grained relationships, like that between a baseball player and his manager.”
Hickl’s hoping that social Q&A sites will be eager to latch on to the success in auto-generating Q&As that he expects Swingly to achieve. The thinking is that such sites won’t necessarily have specialists who’ve generated every possible question-and-answer pairing in the subjects they cover – not to mention the subjects their editorial references or community contributors may not dip into at all. “Having a machine system successful in this way lets these filter Q&A sites have more to say,” he says. And if they’re compensating for expertise, there’s potential to cut costs, too, and for the sites to flag current answers and offer bounties to improve them.
“Then social Q&A by a Facebook or an Answers.com is on an equal footing with traditional document search models. It can fulfill the promise of micro-search.” He’s also looking to “democratize” the knowledge out there so that users have an easier time finding information that may be personally relevant only to a small subset of people and that isn’t likely to be found in a structured database or community site.
Though he isn’t prepared to disclose anything yet, he says Swingly has seen “tremendous momentum” on this front from relevant parties in the social and social Q&A space. There’s also partnership interest, he says, in opportunities to promote someone’s knowledge resource up high as an answer on the main site – say, a source for the best barbecue sites in a city. “You don’t want to skew the objectivity or quality of knowledge, so a sponsored link would be given the same kind of ranking model for finding your answers,” he says.
Swingly was designed with cost-effectiveness in mind, miniaturizing its technology to run in half a gigabyte of RAM, Hickl says. It uses the 80legs service for web crawling and for building its semantic database and turning it into its index. The service is committed to spidering as much as it can on its budget, including community Q&A sites, and that’s where almost all the 120 billion Q&A pairs so far hail from. But it can also, in a matter of seconds, process the first 40 pages it gets via search APIs from Google or Bing, do a deep dive into their links to build an index, join that to its existing index, and then search the aggregate to find answers. “Since we spider current data and get it from search engines, we’re more responsive to information that may not have made it into the silos yet,” he says. “We have an infrastructure that helps us get data faster than communities do.”
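The build-join-search flow Hickl describes can be pictured with a toy inverted index. The sketch below is only an illustration under simplifying assumptions: plain whitespace tokenization stands in for Swingly’s NLP pipeline, and the function names (`build_index`, `merge_indexes`, `search`) and sample documents are hypothetical, not anything the company has disclosed.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def merge_indexes(base, fresh):
    """Join a freshly built index to an existing one without mutating either."""
    merged = defaultdict(set)
    for index in (base, fresh):
        for token, doc_ids in index.items():
            merged[token] |= doc_ids
    return merged


def search(index, query):
    """Return ids of documents that contain every token in the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    return set.intersection(*(index.get(token, set()) for token in tokens))


# Hypothetical data: one page from the standing crawl, one fetched just now.
crawled = {"crawl-1": "who manages the texas rangers"}
fresh = {"live-1": "rangers name new manager today"}

combined = merge_indexes(build_index(crawled), build_index(fresh))
hits = search(combined, "rangers manager")  # only the fresh page matches both words
```

The point of the join step is that a query can be answered from pages fetched seconds ago as well as from the standing crawl, which is the responsiveness Hickl is claiming.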
The site is also designed to give some social weight to answers. “We co-opted the Open Graph like modality and added likes to our index,” says Hickl. “So if lots of people like an answer it may get boosted even if it’s not the semantic answer. I would like someday to incorporate trustworthiness of sites into this, too.”
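One simple way to picture this kind of boost is to add a dampened function of the like count to an answer’s semantic score. The formula, the `like_weight` parameter, and the sample answers below are assumptions for illustration only; Swingly has not said how likes are actually weighted.

```python
import math


def boosted_score(semantic_score, likes, like_weight=0.1):
    """Combine a semantic relevance score with a dampened like count.

    log1p keeps a heavily liked answer from swamping everything else.
    The formula and default weight are illustrative assumptions.
    """
    return semantic_score + like_weight * math.log1p(likes)


def rank_answers(answers, like_weight=0.1):
    """Sort answers from highest to lowest boosted score."""
    return sorted(
        answers,
        key=lambda a: boosted_score(a["semantic_score"], a["likes"], like_weight),
        reverse=True,
    )


# Hypothetical answers: "A" is the better semantic match, "B" is widely liked.
answers = [
    {"id": "A", "semantic_score": 0.80, "likes": 0},
    {"id": "B", "semantic_score": 0.70, "likes": 50},
]
ranked = rank_answers(answers)  # "B" is boosted ahead of "A"
```

With these made-up numbers the well-liked answer overtakes the stronger semantic match, mirroring the behavior Hickl describes; setting `like_weight` to 0 falls back to purely semantic ranking.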
Another opportunity he sees for the technology is within enterprises, applying Swingly’s indexing to corporate knowledge bases and turning them into in-house FAQs. “Micro search also lends itself to the mobile space,” he says. “Take that vertical data, create a FAQ and wrap it into the mobile search experience and you can make sure people are never far from the corporate knowledge base.”
Hickl acknowledges there are still some “clunker” answers that can come by way of NLP, but says that’s changing rapidly. “We want to show we can embed this as a way to access knowledge,” he says. “I think the right way to envision this is that it could stand on its own but there is a lot of merit in seeing a combination of this technology with social Q&A services or even traditional search services, and that’s what we’re looking at aggressively in terms of trying to find partners to be force multipliers for this.”
• Swingly will expedite invitations and fast track access to the service to the first 500 people who enter the code “SemanticWeb.”
• Don’t forget to propose your startup for our Semantic Web Impact Awards. The deadline is Sept. 15.