Semantic Wikis in the Enterprise – Sanjiva Nath

I consider myself first a fan and evangelist of semantic wikis, and second a provider of technologies and solutions that offer those capabilities to enterprises.  After several years of following the technologies in this area, and on the eve of the Semantic Wiki session at the 2009 Semantic Technology conference next week, I wanted to offer some perspectives on semantic wikis and what enterprises can hope to achieve by deploying them.

The capabilities discussed here are not tied to a single product or solution; rather, they are based on the convergence of the technologies and the needs of enterprises in various sectors (government, health sciences, education, technology, etc.).

Semantic wikis are among the earliest implementations of Semantic Web technologies.  I remember one of the first articles I came across on the topic, based on the work of Adam Souzis (Building a Semantic Wiki).
Adam's vision was very consistent with what we had been seeking ourselves, i.e. "capture and represent informal, human-authored content in a semantically rich manner", although Rhizome, his wiki, was mostly a concept prototype.

Since then, we have tracked a number of efforts, principally from ontoprise/University of Karlsruhe (Semantic MediaWiki, aka SMW) and Salzburg Research (IkeWiki).  One of the foremost differentiators was whether the entire technology (including the wiki features) was built from the ground up (IkeWiki) or offered as an extension to a popular wiki (SMW).  Last year, we also decided to follow the latter approach and extend Atlassian's popular wiki Confluence with semantic capabilities by delivering our own solution (Wikidsmart).

All these offerings strive to solve a common set of problems for wiki users related to content accessibility, maintainability, reliability, searchability and ease of integration.

Wow!  When you think about the hundreds or even thousands of wiki instances in mid-size to large organizations, holding hundreds of thousands of pages of content, these are very serious problem areas.  It's a wonder that anyone is able to make reasonable use of wiki content.

And yes, semantic wikis are capable of addressing these problems, although their focus and implementations may vary.

In our view, here are some of the key features of ‘semantic wikis’ that enterprises can and must expect:

1.    Domain-specific Definitions:  This refers to a mechanism for categorizing and annotating (in other words, ‘semanticizing’) content appropriate to the needs of a domain, preferably via formal definitions of metamodels and ontologies. This is generally accomplished by:
a.    allowing the users to create their own vocabularies and attributes based on the content they create, 
b.    providing a defined set of metamodels and ontologies for that domain that can be referenced to structure that content to maintain consistency.  This approach has the potential to provide a powerful structure for content annotation.
2.    Easy annotation of content: The strength and popularity of the wiki in the enterprise is mostly due to the fact that it allows for easy and collaborative creation of free-form unstructured content.  Without impacting this, there needs to be an easy mechanism that allows users to add some structure or meaning (via categorization or annotation of specific sections/paragraphs) to make the content more accessible and machine-processable.  This is accomplished by:
a.    allowing the users to annotate the content on the fly,
b.    providing semantic templates and forms for semi-structured content entry,
c.    asynchronously processing content to extract relevant meaning, using domain-specific NLP algorithms (semantic tagging).
3.    Federated Content Repositories and Search: One of the key problems that enterprises face is the fragmentation of content across multiple wiki instances (often dozens or even hundreds).  How does one access and unify content across all of them?  The semantics of content must not be limited to or specific to a single wiki instance.
4.    Integration and Interoperability with Applications:  If the wiki is a source of information, it isn’t the ONLY source.  Users often attempt to integrate outputs of other applications and tools into the wiki to maintain some unification of information, although this is normally accomplished crudely using scripts or widgets.  This unification needs to be achieved in the semantic repository, which can provide an easy mechanism for other applications both to contribute to and to access the same content and context.
5.    Wiki as an Information Portal:  The wiki need not serve as the source of its own content.  Using simple SPARQL queries, for example, information in the wiki can be ‘dynamically’ rendered.  Furthermore, this information comes from potentially federated sources in a manner that allows for the wiki to serve as ‘The Information Portal’ for all users across the enterprise.
6.    Content Filtering and Role-based Access:  Most enterprise wikis support role-based access to pages.  This becomes complicated once content is semi-structured, more granular, and available both across the wiki and in other applications through simple query/access mechanisms.  Applying access privileges at the concept and attribute level then becomes a critical need across departments.
7.    Content Classification: Beyond simple categorization of content, supporting multiple and complex hierarchies, particularly those that can be inferred from the underlying ontologies, allows for more powerful and flexible categorization and accessibility of content.
8.    Smart Search: Search isn’t effective if it is limited to keyword matches in wiki content.  How about being able to search for specific content categories (e.g. Process, Role, Person, Project, Milestone, etc.)?  Furthermore, given the existence of metamodels that drive content annotations, this can become even more flexible by providing the ability to search for specific properties:  Person:lastName, Document:author, Project:team, Process:rolePlayed, etc.  This level of precision is only practically achievable using these technologies.
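
To make the difference between keyword search and the category and property search described above a little more concrete, here is a minimal sketch in Python.  It is only an illustration, not how any particular product works: the page names, the values and the three helper functions are all hypothetical, and a real semantic wiki would typically run a SPARQL query against a triple store rather than loop over an in-memory list.

```python
# Hypothetical annotations stored as (page, property, value) triples.
# The "Type:property" names follow the examples in the list above;
# the pages and values are invented for illustration only.
TRIPLES = [
    ("wiki/JaneSmith", "rdf:type", "Person"),
    ("wiki/JaneSmith", "Person:lastName", "Smith"),
    ("wiki/ReleasePlan", "rdf:type", "Document"),
    ("wiki/ReleasePlan", "Document:author", "Smith"),
    ("wiki/Apollo", "rdf:type", "Project"),
    ("wiki/Apollo", "Project:team", "Smith, Lee"),
]


def keyword_search(term):
    """Plain keyword match: every page with the term in any value."""
    term = term.lower()
    return sorted({page for page, _, value in TRIPLES if term in value.lower()})


def property_search(prop, value):
    """Match only pages whose given property has exactly this value."""
    return sorted({page for page, p, v in TRIPLES if p == prop and v == value})


def category_search(category):
    """Match pages annotated with the given content category."""
    return property_search("rdf:type", category)


if __name__ == "__main__":
    # Keyword search returns all three pages that mention "Smith".
    print(keyword_search("Smith"))
    # Property search narrows this to the person whose last name is Smith.
    print(property_search("Person:lastName", "Smith"))
    # Category search lists only the pages typed as Document.
    print(category_search("Document"))
```

The point of the sketch is simply that once annotations carry a property name, a query can ask "who is the author?" rather than "which pages contain this word?".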

Finally, the semantic extensions need not impose the choice of a wiki.  A variety of wikis are popular in enterprises, and even within a single organization it is not uncommon to find a number of different wiki flavors.

Looking at the list above, there are two key takeaways.  First, most or all of these capabilities are currently available or feasible, although not all of them are necessarily offered in any single product or solution.  So it is prudent to focus on what is most relevant for your needs.

Second, if you look at all of the capabilities above, a wiki starts to look like a fairly powerful content management system – all through the application of semantic technologies.

-Sanjiva Nath
CEO zAgile Inc.