— PETER SWEENEY, ANNE HUNT
How do we roll out the Semantic Web? Paradoxically, the fast track may involve getting help from billions of people who know nothing about the Semantic Web and have no interest in it. Much of the effort in semantic representation has been focused on annotating existing content. Unfortunately, creating a shared semantic layer over existing content is proving to be a daunting task. We need alternative strategies, and more importantly, many non-technical contributors. How can we better devise solutions to put those billions of semantic generators to work on building the Semantic Web?
Challenges with Content-Directed Approaches to Semantic Representation
Much of the effort in semantic representation has been focused on annotating existing content. Human experts create semantic representations to organize existing assets such as databases, documents, and social media. In so doing, they seek to represent objective, shared knowledge. Machines operate against these knowledge models and execute tasks on behalf of consumers. Under the content-directed approach, knowledge models are prepared in advance of the machine-executed tasks, often with limited direct input from the consumers for whom those tasks are executed.
As an example, consider a real estate application. Under a content-directed approach, knowledge engineers would provide some knowledge model to organize the real estate properties in a database. To be effective, they would attempt to anticipate the needs of the consumers and semantically annotate the property listings accordingly. Finally, machines are tasked with searching through these listings based on queries submitted by consumers to find a property that meets their needs.
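The content-directed workflow in this example can be sketched in a few lines. This is only an illustration under stated assumptions: the `AnnotatedListing` schema, its field names, and the `search` function are hypothetical stand-ins for whatever model a knowledge engineer might define, not part of any real system described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedListing:
    """A listing annotated in advance against an engineer-defined schema (hypothetical)."""
    listing_id: str
    bedrooms: int
    price: int
    features: frozenset  # only features the engineers anticipated


def search(listings, min_bedrooms=0, max_price=None, required_features=()):
    """Return listings whose pre-built annotations satisfy the consumer's query."""
    required = frozenset(required_features)
    return [
        item
        for item in listings
        if item.bedrooms >= min_bedrooms
        and (max_price is None or item.price <= max_price)
        and required <= item.features  # every required feature must be annotated
    ]


# Illustrative data only.
catalog = [
    AnnotatedListing("A1", 3, 450_000, frozenset({"garage"})),
    AnnotatedListing("B2", 2, 300_000, frozenset({"garage", "pool"})),
]
matches = search(catalog, min_bedrooms=3)
unanticipated = search(catalog, required_features=["home office"])
```

In this sketch a query can only succeed on features the schema already carries. A consumer who asks for something the engineers did not anticipate (here, "home office") gets no results, even if a suitable property exists.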
Creating this type of semantic layer over existing content is proving to be a daunting task. Shared semantic representations are expensive to create because of the sheer glut of online content and the compounding volume of data needed to make that content machine-readable.
Further, when we move beyond existing content assets to include knowledge generally, much of it subjective, the scope of the problem expands. There is a need to incorporate personal semantics, representing the interpretations and viewpoints of the consumers of the content. Not only is it proving intractable to annotate an ever-increasing body of content, but applying an objectively defined semantic network to that content also misses the subjective, personal aspect that is vital in consumer-facing applications.
Lastly, compared with the models, standards, and protocols that preceded it, the Semantic Web has placed a much higher barrier to entry on producers, a stark change from the very accessible document-centric model of the World Wide Web. We need far more contributors of semantic data than we are seeing under content-directed approaches.
Given these challenges, we need alternative strategies, and more importantly, many non-technical contributors. What if we turned the tables, democratizing and personalizing the effort, letting every individual consumer create their truth as they see it?
Thought Networking: A Personalized, Consumer-Directed Approach
Thought networking is a term we coined at Primal to describe a consumer-directed approach to building semantic networks that leverages consumers in a novel way. Under this framework, semantic networks are constructed in real time, based directly on consumer interactions. This approach complements content-directed approaches and moves semantic representation into the realm of personal thoughts.
Under this consumer-directed approach, systems enable consumers to create semantic representations that reflect their intentions. We refer to these consumer-directed representations as “thought networks” and the dynamic activity of creating them as “thought networking.” Rather than describing the content, thought networks represent aspects of the consumers’ tasks and their expectations about the content required for those tasks. The major difference here is that the content organization is discovered through the expectations of consumers, rather than being imposed by knowledge engineers in advance.
Note that this consumer-directed approach is not a Web 2.0-style collaborative approach. While Web 2.0 collaborative processes are obviously consumer-driven, they are often framed within the task of annotating and sharing content as opposed to annotating the mental models of the consumers themselves. One of the key user interaction challenges of consumer-directed semantic networking is that tactics are needed to shelter consumers from its complexity. This is a key differentiator from existing Web 2.0-style approaches, where consumers are collaborating directly and with full knowledge of the activity. Here, we must leverage implicit and indirect means to relate consumer activity to aspects of the semantic representation.
Once the mental models of the consumers are created and organized within semantic networks, they can be used to direct software agents: collating content and building documents, traversing the Web to find related information, or interfacing with social networks to connect like-minded individuals. Software agents interact with web services such as search engines to accomplish tasks such as searching, filtering, harvesting, and organizing information.
The resultant thought networks may also be used to impose a semantic structure on the unstructured content found on the Web. As these software agents interact with the semantic data and the Web, formerly unstructured sources may be annotated with the semantics provided by consumers.
Returning to the real estate application example introduced above, within a consumer-directed approach, the system would begin by creating a mental model of each consumer’s requirements, for example, a representation of their “dream home.” Once the consumer specifies their dream home, the representation is used as a lens through which the property listings can be evaluated. Because the consumer’s input has a formal structure, the property listings can remain in a largely unstructured form. However, in the process of filtering the listings through these mental models, each property could be annotated with the semantics identified by consumers.
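A minimal sketch of this "lens" idea follows. It is an illustration under loud assumptions, not Primal's implementation: the `dream_home` model, the `Listing` class, and the `apply_lens` function are all hypothetical, and simple whole-word phrase matching stands in for whatever text analysis a real system would use.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Listing:
    """An unstructured listing: free text plus annotations gained as a by-product."""
    listing_id: str
    text: str
    annotations: set = field(default_factory=set)


# Hypothetical consumer model: each concept the consumer cares about,
# mapped to phrases that may signal it in free text.
dream_home = {
    "walkable": ["walk to", "steps from", "walkable"],
    "home office": ["office", "den", "study"],
    "garden": ["garden", "backyard", "yard"],
}


def mentions(text, phrase):
    """True if the phrase appears in the text as whole words (case-insensitive)."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE) is not None


def apply_lens(model, listings):
    """Rank listings by how many model concepts they match, annotating as we go."""
    scored = []
    for listing in listings:
        matched = {
            concept
            for concept, phrases in model.items()
            if any(mentions(listing.text, phrase) for phrase in phrases)
        }
        listing.annotations |= matched  # consumer semantics attach to the content
        scored.append((len(matched), listing))
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep input order
    return [listing for _, listing in scored]


# Illustrative data only.
listings = [
    Listing("C3", "Rural lot, no services."),
    Listing("B2", "Downtown condo with a den and city views."),
    Listing("A1", "Bright bungalow with a large backyard, steps from the market."),
]
ranked = apply_lens(dream_home, listings)
```

The listings themselves carry no schema in this sketch. The only formal structure is the consumer's model, and the annotations left on each listing are a side effect of evaluating it.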
In other words, semantic webs evolve as a by-product of this consumer-directed process, avoiding the knowledge modeling bottleneck at the start. Technically, the problem shifts from the semantic representation of existing content to the semantic representation of abstract thought.
Technical Feasibility and Testing
To the question of technical feasibility, there are many existing technologies that would support the consumer-directed approach: NLP to translate natural language; Web 2.0-style workflows such as semantic wikis; ontology editors; and many others. While various technologies have an affinity with a consumer-directed approach, they should not be mistaken for implementations of one. For example, NLP may be used to transform a natural language statement into a formal semantic representation. Under a content-directed model, it may be used as a gateway into the exploration of existing knowledge models, while under a consumer-directed model, it may be used in the creation of knowledge models. Similarly, tools such as ontology editors may be used by knowledge engineers or simplified for use by lay people, depending on the overarching implementation strategy.
Critically, however, the consumer-directed approach requires a break with past approaches, putting the focus of the knowledge modeling unambiguously on consumers’ wants instead of producers’ wares.
At Primal, we have a compelling proof of concept for this consumer-directed approach that leverages a new technical capability called semantic synthesis. Semantic synthesis places thoughts in meaningful, task-oriented contexts. In real time, the technology enables consumers to create, manage, and deploy semantic networks without requiring knowledge of the underlying technical structure. Consumers input their thoughts through free text or by indicating concepts of interest from a set of suggestions that the system provides. Primal synthesizes (or fuses) thoughts as primitive concepts into higher-order constructs to form ideas, arguments, and perspectives. In a dynamic process, the thoughts of consumers are represented semantically and connected to online content. Text analysis is used to categorize and annotate unstructured online content within expansive knowledge models generated by the consumers.
We are examining the productivity of this consumer-directed approach through alpha testing of our thought networking services at www.primal.com. We are demonstrating a significant production of semantic data with relatively minimal effort from consumers. We are also testing various product designs to enable thought networking through alternate user interactions and channels. We plan to publish derivatives of these semantic representations as Linked Data through our Primal Developer API and services.
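To make the idea of publishing such representations as Linked Data concrete, the sketch below serializes a tiny thought network as N-Triples using the W3C SKOS vocabulary. Everything specific here is an assumption for illustration: the `http://example.org/thought/` namespace, the `to_ntriples` function, and the choice of SKOS are hypothetical and say nothing about how the Primal Developer API actually exposes its data.

```python
from urllib.parse import quote

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/thought/"  # hypothetical namespace, not a real endpoint


def concept_uri(label):
    """Build a URI for a concept label (lowercased, spaces replaced by hyphens)."""
    return BASE + quote(label.lower().replace(" ", "-"))


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(network):
    """Serialize {broader concept: [narrower concepts]} as N-Triples text."""
    labels = set(network) | {c for children in network.values() for c in children}
    lines = []
    for label in sorted(labels):
        uri = concept_uri(label)
        lines.append(f"<{uri}> <{RDF_TYPE}> <{SKOS}Concept> .")
        lines.append(f'<{uri}> <{SKOS}prefLabel> "{escape_literal(label)}"@en .')
    for parent in sorted(network):
        for child in network[parent]:
            lines.append(
                f"<{concept_uri(child)}> <{SKOS}broader> <{concept_uri(parent)}> ."
            )
    return "\n".join(lines) + "\n"


# Illustrative thought network only.
print(to_ntriples({"dream home": ["garden", "home office"]}))
```

Each concept becomes a `skos:Concept` with a preferred label, and each narrower concept points to its parent with `skos:broader`, so the output can be loaded by any RDF tool that reads N-Triples.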
Benefits for Building the Semantic Web
Much of the effort in this field has focused on the complexity and intricacies of making existing content machine-readable. Creating semantic representations of content is valuable, but only one side of the coin. We also need to create knowledge models for people as the consumers of content, to give a voice to their perspectives. But most importantly, we need to recognize that every person is a semantic generator; collectively, we are the ultimate meaning-making machines. How can we better devise solutions to put those billions of semantic generators to work on building the Semantic Web?
Consumer-directed semantic networking can help by providing many hands on the problem, a wealth of knowledge models, and semantically annotated content. In terms of scalability, this approach may seem counterintuitive: our content assets are very tangible, while the thoughts and intentions of consumers are abstract and seemingly boundless. On closer inspection, however, the problem becomes tractable. It is far easier for a consumer to provide a snapshot of their knowledge directed to a specific task than for a producer to try to anticipate all the possible perspectives on their content.
Obviously, on a global scale, the Semantic Web remains an extremely large undertaking, but the proposed consumer-directed approach offers an important scalability benefit. The factors of production (the consumers driving this process) move in lockstep with consumption: the same people who consume content also generate its semantics. Further, as consumers begin to capture their mental models and tasks in this formal way, there is the potential for individual reuse. We can reapply aspects of the same mental models to many tasks.