
OpenText Takes Next Steps In Automatic Content Classification

By Jennifer Zaino  /  July 17, 2014

OpenText yesterday made its secure file sharing and synchronization product, Tempo Box, available for free to customers using its OpenText Content Suite enterprise information management tool.

“A lot of our customers have major concerns about employees sharing documents with cloud tools like Dropbox,” says Lubor Ptacek, vp of strategic marketing. Employees want their documents to be available, synced and shareable across all their devices, but using such services can create security and compliance problems. Companies can deploy Tempo Box on top of their existing infrastructure at no charge, both for all internal employees and for any external parties they need to share content with. That gives them a seamless and cost-effective way to share files in the cloud without compromising security, records management requirements or storage optimization, he says – “the things that enterprise customers care about, especially those operating in regulated environments.”

Among those capabilities is applying automatic content classification, which is usually required for records management reasons – for example, helping companies determine if a document is an employee record they must keep for five years or a tax record they have to hold for seven years. That under-the-hood classification engine is an outgrowth of OpenText’s acquisition a few years back of text mining, analytics and search company Nstein. Since the acquisition, says Ptacek, the company has been looking at ways to apply the technology to specific business problems and make it part of its applications.
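To make the retention example concrete, here is a minimal Python sketch of what happens once a document has been assigned a record class. It is not OpenText's implementation: the class names, the `RETENTION_YEARS` table and the `retention_end` function are assumptions made for illustration, and only the five-year and seven-year periods come from the example above. The classification step itself is not shown.

```python
from datetime import date

# Hypothetical retention schedule; the two periods mirror the article's example.
RETENTION_YEARS = {
    "employee_record": 5,
    "tax_record": 7,
}


def retention_end(record_class: str, created: date) -> date:
    """Return the date until which a record of this class must be kept."""
    years = RETENTION_YEARS[record_class]
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # A Feb 29 creation date has no counterpart in a non-leap year,
        # so fall back to Feb 28 of the target year.
        return created.replace(year=created.year + years, day=28)


if __name__ == "__main__":
    print(retention_end("tax_record", date(2014, 7, 17)))
```

The point of the sketch is that the retention rule is a simple lookup once the class is known, which is why the accuracy of the classification step matters so much.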

It has launched, for example, its Semantic Navigation product to target website content and automatically recommend related articles to visitors, to keep them at the company’s website. Its Auto-Classification for Records Management product helps organizations deal with retention issues, litigation risks and storage and eDiscovery costs, removing the burden on business users to manually identify records and apply classifications to content including unstructured information.

Ptacek says that companies using automatic classification can achieve an 80 to 90 percent accuracy rate, well above the 60 to 65 percent they can expect to see even if they manage to get their employees to abide by manual practices.

InfoFusion also uses OpenText’s content classification engine. The product replaces a variety of individual information applications, along with their associated indexes, connectors, hardware and support, with a common information management platform. “It helps you plug into multiple repositories and navigate access to their content, otherwise for every search query you’ll have a lot of results,” he says. The classification engine automatically and dynamically groups content based on patterns discovered in result sets. It can, for example, use extracted entities to bring together all documents from connected repositories that contain a certain individual’s name, or group documents by size or other parameters. “That helps people to do something meaningful with content from a search-based paradigm,” he says.
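The entity-based grouping described above can be sketched in a few lines of Python. This is an illustration only, not InfoFusion's API: the result format, the `entities` field, the sample documents and the `group_by_entity` function are all hypothetical, and the sketch assumes entity extraction has already been done.

```python
from collections import defaultdict


def group_by_entity(results):
    """Group document ids by each person name extracted from them."""
    groups = defaultdict(list)
    for doc in results:
        for name in doc["entities"]:
            groups[name].append(doc["id"])
    return dict(groups)


# Hypothetical search results drawn from three connected repositories.
results = [
    {"id": "crm-17", "repository": "crm", "entities": ["Ada Lovelace"]},
    {"id": "mail-3", "repository": "email", "entities": ["Ada Lovelace", "Alan Turing"]},
    {"id": "wiki-9", "repository": "wiki", "entities": ["Alan Turing"]},
]

if __name__ == "__main__":
    for name, doc_ids in group_by_entity(results).items():
        print(name, doc_ids)
```

A document that mentions two people appears in both groups, so one long list of hits becomes a small set of groups a user can navigate.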

Expect more specific product announcements around InfoFusion later this year. Ptacek says clients are very interested in initiatives such as voice of the customer. They want, for example, to use the technology to apply sentiment metrics to content and commentary pulled in from many different online sources. “If we aggregate all that together, they ask can we apply the engine to measure the voice of the customer, and the answer is an absolute yes,” says Ptacek.

About the author

Jennifer Zaino is a New York-based freelance writer specializing in business and technology journalism. She has been an executive editor at leading technology publications, including InformationWeek, where she spearheaded an award-winning news section, and Network Computing, where she helped develop online content strategies including review exclusives and analyst reports. Her freelance credentials include being a regular contributor of original content to The Semantic Web Blog; acting as a contributing writer to RFID Journal; and serving as executive editor at the Smart Architect Smart Enterprise Exchange group. Her work also has appeared in publications and on web sites including EdTech (K-12 and Higher Ed), Ingram Micro Channel Advisor, The CMO Site, and Federal Computer Week.
