Dow Jones & Company’s Factiva information service has long been distinguished by the semantic tools it applies to its content to surface relevant search information. Last week the company announced what it says is one of the most significant investments it’s made in the Factiva product suite, licensing new search technology from MarkLogic Corp.
The arrangement is positioned as providing standardized search technology across the Dow Jones digital network, including Factiva, WSJ.com and Dow Jones Financial Services products. Investing in a single underlying search technology for the company’s multiple businesses and products means that “one powerful, unified search platform will service the search needs of our consumer and enterprise customers around the world,” says Georgene Huang, head of Factiva. “Any improvements or customizations we build atop this infrastructure will be scalable and efficiently accessible to all.” That will make it easier to coordinate development, products and content, she says.
The new search technology, Huang says, complements Dow Jones’s continuing investment in Factiva’s core metadata and taxonomy strengths.
“The new implementation will provide better ability to link, tag and accurately retrieve structured and unstructured content,” she explains. “Additionally, it will allow for dynamic taxonomy evolution and coding of content.” For example, a user can find the same person (Barack Obama and President Obama) or company (BP and British Petroleum) without specifying every variation of the name being sought. This holds even when the “synonym” is in any of the other 28 languages that Factiva supports, she notes.
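The name-variant matching described above can be pictured with a small sketch. It does not reflect how Factiva or MarkLogic implement the feature; the alias table and the function names `expand_query` and `search` are hypothetical, chosen only to illustrate the idea of expanding one search term into all of its known variants before matching.

```python
import re
from typing import Dict, List, Set

# Hypothetical alias table (an assumption for illustration):
# canonical entity -> every surface form it is known by.
ALIASES: Dict[str, Set[str]] = {
    "Barack Obama": {"Barack Obama", "President Obama"},
    "BP": {"BP", "British Petroleum"},
}


def expand_query(term: str) -> Set[str]:
    """Return all known variants of a term, or the term itself if unknown."""
    needle = term.casefold()
    for variants in ALIASES.values():
        if any(v.casefold() == needle for v in variants):
            return set(variants)
    return {term}


def search(term: str, docs: List[str]) -> List[str]:
    """Return documents that mention any variant of the term as whole words."""
    patterns = [
        re.compile(r"\b" + re.escape(v) + r"\b", re.IGNORECASE)
        for v in expand_query(term)
    ]
    return [d for d in docs if any(p.search(d) for p in patterns)]


docs = [
    "British Petroleum reported earnings.",
    "President Obama spoke today.",
    "Weather update.",
]
bp_hits = search("BP", docs)            # matches the British Petroleum story
obama_hits = search("barack obama", docs)  # matches the President Obama story
```

The user types one form of the name, and the expansion step supplies the rest, which is the behavior the article describes.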
Another highlight, she says, is the improved ability to use the proximity of search elements within content. For example, users can now find company “X” where the word “merger” appears no more than 10 words away, or within the same paragraph, and where revenue exceeds 100M.
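As a rough illustration of proximity matching, the sketch below checks whether two single-word terms fall within a given number of words of each other, or within the same paragraph. It is an assumption-laden toy and not MarkLogic's implementation: the helper names `within_words` and `same_paragraph` are hypothetical, paragraphs are assumed to be separated by blank lines, and the structured revenue filter from the example is left out.

```python
import re
from typing import List


def tokens(text: str) -> List[str]:
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.casefold())


def within_words(text: str, a: str, b: str, max_gap: int = 10) -> bool:
    """True if single-word terms a and b occur at most max_gap words apart."""
    words = tokens(text)
    pos_a = [i for i, w in enumerate(words) if w == a.casefold()]
    pos_b = [i for i, w in enumerate(words) if w == b.casefold()]
    return any(abs(i - j) <= max_gap for i in pos_a for j in pos_b)


def same_paragraph(text: str, a: str, b: str) -> bool:
    """True if both terms occur in one blank-line-delimited paragraph."""
    for para in re.split(r"\n\s*\n", text):
        words = set(tokens(para))
        if a.casefold() in words and b.casefold() in words:
            return True
    return False


article = "Acme announced a merger today.\n\nUnrelated paragraph about Globex."
near = within_words(article, "Acme", "merger")         # 3 words apart
far = within_words(article, "Acme", "Globex", 3)       # 8 words apart
together = same_paragraph(article, "Acme", "merger")   # same paragraph
apart = same_paragraph(article, "Acme", "Globex")      # different paragraphs
```

A production engine would answer such queries from a positional index rather than scanning each document, but the matching rule is the same.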
An industrial-strength search engine’s capabilities matter to the user experience on many fronts beyond these. For one thing, without such an engine, users might experience growing latency when retrieving very precise search results across a content archive that spans more than 50 years and is still growing. Also, clients can choose a personalized search that customizes results by their search history and interests, by similar user groups, or generalized based on the current task they need to perform.
“It will also narrow the result set and improve direct access to relevant content within the structured or unstructured data,” she says. “It will be easier to detect related sources of content, which may save the user time which is otherwise wasted in going through duplicate information.”
Factiva today not only has 36,000 licensed publications among its aggregated news content, but also gives its users access to Twitter commentary, blogs, YouTube and message board content. MarkLogic’s robust technology, she says, will facilitate Factiva’s processing of that tremendous content store.
MarkLogic supports near-instantaneous alerting on customer-selected topics in a world where real-time breaking information increasingly comes from non-traditional publications, she points out. Says Huang, “This search upgrade is critical to supporting all types of content, especially the exponential growth of user-generated news.”