Tom Reamy, Chief Knowledge Architect of the KAPS Group, has coined a new term: enterprise content categorization, or ECC. ECC “is an element of the text analytics field (along with text mining, ontologies, sentiment analysis)” whose mission is to reveal the semantic infrastructure of an organization.
According to Reamy, the core capabilities and features of ECC are categorization, entity extraction, fact and event extraction, summarization, and clustering. Taxonomy, controlled vocabulary, metadata and ontologies also fit in as “related semantic elements.”
Reamy believes that many enterprises currently “don’t have a clear idea of what content and/or content structure they have.” He defines semantic infrastructure along three dimensions: content and content structure, technology, and the team of people responsible for both. Reamy argues that “enterprise content categorization would lead to cost savings in search and improved content management” by “adding structure to unstructured content – through categorization, noun phrase extraction, ontologies – and using faceted navigation.”
Image: Courtesy of Flickr/Bo47