Searchbox Wants To Help You Build Your Enterprise's Specialized Search Engine

July 10, 2013

Searchbox is taking its enterprise semantic search technology in a new direction. The offering, which The Semantic Web Blog covered in an earlier story, is now packaged as a software-as-a-service (SaaS) solution and is based on Apache Solr, the open source enterprise search platform from the Apache Lucene project, rather than on proprietary technology.

“We completely changed the technology stack for keyword search and integrated our semantic technology into Solr,” says chief product officer Jonathan Rey. On top of Apache Solr, he says, the company developed a search application framework that IT managers, CIOs, and developers can leverage to provide a richer experience to end users.

“There is no such thing as ‘standard enterprise search.’ Searchbox is a platform onto which companies can build a specialized search engine,” Rey says.

With Searchbox, companies index their information sources into its Solr backend using its Connector framework, the standard Solr API with a client library, or custom data import handlers. From there, they can configure the search experience.
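
As a rough illustration of the "standard Solr API" route, the sketch below builds a JSON update request for a Solr core using only Python's standard library. The host, the core name `articles` and the document fields are hypothetical, and this is plain Solr indexing rather than Searchbox's Connector framework, whose API the article does not describe.

```python
import json
import urllib.request


def build_update_request(solr_url, core, docs, commit=True):
    """Build (but do not send) a POST request to Solr's JSON update handler."""
    url = f"{solr_url.rstrip('/')}/{core}/update"
    if commit:
        # Ask Solr to commit so the documents become searchable right away.
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical documents; field names must match the target core's schema.
docs = [
    {"id": "doc-1", "title": "Smoking cessation in adolescents"},
    {"id": "doc-2", "title": "Enterprise search with Solr"},
]
request = build_update_request("http://localhost:8983/solr", "articles", docs)
print(request.get_method(), request.full_url)

# To actually index, send the request to a running Solr instance:
# urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the example runnable without a Solr server; a client library would typically wrap the same HTTP call.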

Companies can base multiple database search projects on the same source code. “Just change settings to change the look and feel of the search engine, and [change] the data that is being used,” says Rey. Companies can choose the search fields and presets they want to display, define user filters and facets and create visualization templates for the data. “We transform search requests, send them to Solr and transform [results] into something useable for the end client,” he says.
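
The request and response transformation Rey describes might look roughly like the following sketch. The settings dictionary, field names and output shape are assumptions made for illustration and are not Searchbox's actual framework; only the Solr query parameters (`q`, `fq`, `fl`, `rows`, `facet`, `facet.field`, `wt`) and the default flat facet layout in Solr's JSON response are standard Solr behavior.

```python
from urllib.parse import urlencode

# Hypothetical per-project settings: change these, not the code, to get a
# different search engine on top of the same Solr backend.
SETTINGS = {
    "display_fields": ["id", "title"],
    "facet_fields": ["journal"],
    "rows": 10,
}


def build_select_query(user_query, settings, filters=None):
    """Translate a user's search request into a Solr /select query string."""
    params = [
        ("q", user_query),
        ("fl", ",".join(settings["display_fields"])),
        ("rows", str(settings["rows"])),
        ("wt", "json"),
    ]
    for field, value in (filters or {}).items():
        # One filter query per user-selected filter.
        params.append(("fq", f'{field}:"{value}"'))
    if settings["facet_fields"]:
        params.append(("facet", "true"))
        params.extend(("facet.field", name) for name in settings["facet_fields"])
    return urlencode(params)


def simplify_response(solr_json):
    """Reshape a Solr JSON response into something a front end can render."""
    raw_facets = solr_json.get("facet_counts", {}).get("facet_fields", {})
    facets = {}
    for field, flat in raw_facets.items():
        # Solr's default JSON layout is a flat list: [value, count, value, count, ...]
        facets[field] = dict(zip(flat[0::2], flat[1::2]))
    return {
        "total": solr_json["response"]["numFound"],
        "results": solr_json["response"]["docs"],
        "facets": facets,
    }


query_string = build_select_query("cigarette smoking teen", SETTINGS, {"year": "2012"})

# A hand-written stand-in for what Solr might return.
sample_response = {
    "response": {"numFound": 2, "docs": [{"id": "1", "title": "A"}, {"id": "2", "title": "B"}]},
    "facet_counts": {"facet_fields": {"journal": ["Lancet", 1, "BMJ", 1]}},
}
view = simplify_response(sample_response)
print(query_string)
print(view["facets"])
```

Keeping the fields and facets in a settings structure is one way to reuse the same code across several search projects, in the spirit of the "just change settings" approach described above.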

Anyone with a Solr installation can also obtain a key to try Searchbox’s search plug-ins for a month; the plug-ins are available as part of its hosted solution, too. They include what Rey calls its “key semantic technology,” Searchbox Sense, for adding conceptual search and related content capabilities. Also in the set are three plug-ins that leverage natural language processing algorithms: the Searchbox Snippet text summary plug-in, the Searchbox Tagger for extracting keywords and tags from text, and Searchbox Language Detection.

A PubMed Searchbox demo shows what an enterprise could build on Searchbox. It enhances search with features such as a sort-by function, clickable tags, and range facets with a histogram. Click on a result for a search on “cigarette smoking teen,” for example, and you’ll have the opportunity to discover documents that are semantically close to the one you’ve chosen. “Our application framework embeds the features the search engine has – it does it all for the end client,” says Rey.
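
A range facet with a histogram, like the one in the demo, could be driven by Solr's standard range-faceting parameters. The sketch below is only an assumption about how such a feature might be wired up: the `year` field, the bucket sizes and the text rendering are hypothetical, while the `facet.range*` parameters and the flat `counts` list are standard Solr.

```python
def range_facet_params(field, start, end, gap):
    """Solr query parameters that request bucketed counts over a numeric field."""
    return {
        "facet": "true",
        "facet.range": field,
        "facet.range.start": str(start),
        "facet.range.end": str(end),
        "facet.range.gap": str(gap),
    }


def histogram_rows(counts, width=20):
    """Turn Solr's flat [label, count, label, count, ...] list into text bars."""
    pairs = list(zip(counts[0::2], counts[1::2]))
    peak = max((n for _, n in pairs), default=0)
    rows = []
    for label, n in pairs:
        # Scale each bar relative to the fullest bucket.
        bar = "#" * (round(width * n / peak) if peak else 0)
        rows.append(f"{label} {bar} {n}")
    return rows


params = range_facet_params("year", 2000, 2015, 5)

# Hand-written stand-in for facet_counts.facet_ranges.year.counts in a response.
sample_counts = ["2000", 5, "2005", 10, "2010", 0]
rows = histogram_rows(sample_counts, width=10)
for row in rows:
    print(row)
```

A real front end would draw the bars graphically and turn a click on a bar into a filter query on that range.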

Another Searchbox-powered effort, the Opportunity-Finder prototype (discussed in the earlier story), was upgraded this spring. It provides a platform in Europe for accessing public funding information, such as individual research grants from the 7th Framework Program and research partnership opportunities. It now features live search (which updates the result set in real time), query completion, spell-checking and query correction, as well as specific search presets. Additional data is currently being indexed as well.

Rey says to expect the company’s next offering, Solr hosting as a service, in the next couple of weeks or so. “It will be based on the same technology we use but it won’t have the entire front end to build their own search engine,” he says. Why this effort? “Eventually people will use our Solr server and realize they need a search engine on top of it, and they will upgrade their account,” he says.

About the author

Jennifer Zaino is a New York-based freelance writer specializing in business and technology journalism. She has been an executive editor at leading technology publications, including InformationWeek, where she spearheaded an award-winning news section, and Network Computing, where she helped develop online content strategies including review exclusives and analyst reports. Her freelance credentials include being a regular contributor of original content to The Semantic Web Blog; acting as a contributing writer to RFID Journal; and serving as executive editor at the Smart Architect Smart Enterprise Exchange group. Her work also has appeared in publications and on web sites including EdTech (K-12 and Higher Ed), Ingram Micro Channel Advisor, The CMO Site, and Federal Computer Week.
