What can semantic search do for your enterprise? One example comes from the recently launched Searchbox online semantic search engine, from the company of the same name (formerly known as salsaDev).
One of the vendor’s biggest customers is the European Commission, according to Nicolas Gamard, CEO of the Switzerland-based company. That early adopter of Searchbox is using the technology to improve search related to its public grants funding, which has amounted to tens of billions of dollars since 2007. Before deploying Searchbox, both researchers and the Commission’s own commissioners struggled to search across 15 different repositories as they looked for previously funded projects and partnership possibilities across the continent, for example. Wading through a research grant PDF document of some 150 to 600 pages was another time-consuming issue, he says.
“It was like a full-time job just to look at all the different data sources. Things were not formatted in the same way – they used different terms and structures,” Gamard says.
Today, Searchbox powers a single web application for the European Commission, where all such content is interlinked. “So, if a researcher is looking at a grant, we suggest all the related relevant research grants, partnership opportunities across Europe, all previously funded projects, and all the information he or she needs,” says Gamard. “That’s done automatically so that, within a single look, within 5 minutes you can have identified all the research opportunities right for you.”
The story is no different in the enterprise, where data also lives in disconnected silos, usually in different structures, which makes integrating all the sources a painful activity.
Searchbox isn’t aiming to replace full-text enterprise search engines like the Google Search Appliance; it wants to complement and enhance such solutions with its semantic capabilities. In fact, it integrates the Apache Lucene/Solr open source search engine with its own semantic indexing, searching, categorization and analytics. “The added value is not on full-text capability but on semantics,” says Gamard. Searchbox can integrate with Google’s technology too, but Lucid Imagination, the commercial company for Apache Lucene/Solr enterprise search technology, is its core business partner because of that company’s open source model and similar focus on the mid-range space.
The addition of semantic indexing with Searchbox isn’t trivial, he says. “It adds a lot of value in the enterprise space to interlink content, and save time for users. It doesn’t matter if you spend 15 minutes of your private time searching a public web index, but that same 15 minutes in the enterprise is basically money lost by the company,” he says. One number that has made the rounds is that Fortune 1000 companies lose $2.5 billion annually because of an inability to locate and retrieve information. “You need a higher level of relevancy and performance to further help the user. And we see semantic indexing with the features it proposes as a good complement to full-text search engines.”
On the front end, the software is fully integrated with the Liferay open-source content repository portal. Aiming to be a turnkey semantic solution for helping organizations work with any information relevant to the enterprise – whether public on the web or private in a content management system – the company also plans to build more connectors to enterprise content management systems. It’s also participating in the Apache Stanbol project for integrating semantic services into traditional content management systems.
Searchbox will also be one of the application plug-ins in the soon-to-be-launched Liferay marketplace. Liferay is based in the U.S., and Searchbox’s inclusion in the marketplace could help the company gain more traction stateside.
“You ask if enterprises have an issue with search and in 99 percent of the cases they will say yes,” says Gamard. “They don’t know if the problem can be solved or not, but at least it’s an area of strong interest. …I would say they are pretty open to improvements because they know it has a strong impact on users if they can quickly save time. Imagine a company intranet with 3,000 documents and now, instead of spending 1 hour to find something, they spend 5 minutes. That’s really beneficial for users.”