Exchanging information between government parties requires a consistent, reusable and repeatable approach to specifying data exchanges as structured electronic business documents built from components. At the Ministry of Justice in The Netherlands, hereafter referred to as “The MoJ,” a new approach is underway to construct XML Schemas from OWL Ontology Models.
The MoJ is challenged to handle the complexity of electronic message exchange. With ten central information systems at the government level, specialized information systems for the criminal chain, the juvenile chain and immigration services, and over twenty organizations, communication is a big undertaking. As a principal member of the Central Information Systems of the Dutch government, the MoJ is pioneering new approaches to business documents and message design with an emphasis on semantic checking, model-based generation of schemas and reuse of business components.
A good understanding of which language is used within the organization is crucial to both human and machine-to-machine communication. Strong motivation for the use of ontologies arose from the failure of traditional approaches to deliver a reusable component-based solution. Approaches based on Object Models and XML Schemas have lacked sufficient semantic consistency for transformation to message building blocks. A conceptual model in OWL can represent the richness of the language spoken and capture the knowledge within a certain domain. In the world of electronic message exchange, where XML documents often have un-named hierarchical structures, this richness is lost. However, by using schemas that are based on OWL models, the richness can be recovered by translating XML back to OWL. For electronic messages, it is important to reconcile the rich semantics of the OWL world with the representational needs of documents (messages) in the XML world.
The MoJ, in collaboration with TopQuadrant Inc., has undertaken a project to implement new approaches to electronic message design. Inspired by the U.S. National Information Exchange Model (NIEM) and the NASA Constellation Program’s approach to Data Architecture, the MoJ wanted to take full advantage of OWL to express different conceptual models and relate them to each other.
Law is a complex world in its own right. Expressing law enforcement in electronic messages is even more difficult. Through a “Projection, Qualification and Transformation (PQT)” method, the conceptual world and the implementation world of electronic messaging have been bridged.
This new solution replaced a previous system that focused only on message implementation and ignored the conceptual world on which the messages were based. Issues with the old approach stemmed from poor design and misunderstood concepts. Concepts were duplicated, with entities sharing a name but carrying different meanings. To prevent this semantic chaos, the new solution uses OWL models.
To bridge the conceptual world and the implementation-driven world of electronic messaging, an additional standard is needed to provide a foundation. The United Nations Centre for Trade Facilitation and Electronic Business (UN/CEFACT) “Core Component Technical Specification (CCTS)” standard was chosen. CCTS describes an electronic message in logical terms. The project also used the UN/CEFACT Naming and Design Rules (NDR) for XML documents.
CCTS is a standard for defining technology-independent building blocks to support electronic messaging. CCTS supplies a clear separation between reusable templates (Core Components) and business-specific implementations of them. CCTS works with a selection paradigm, in contrast to one of addition and restriction. Elements are defined only once. For instance, the concept person has several attributes, such as first name, last name, birthday, hobby, religion, address, driver’s license, and social security number. Not all these attributes are necessary to report a traffic violation, so a business-specific implementation selects only the ones it needs.
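The selection paradigm can be illustrated with a minimal Python sketch. It is not the MoJ implementation: the attribute names come from the person example above, while `PERSON_CORE_COMPONENT`, `select_properties` and the subset chosen for a traffic violation are hypothetical names and choices made only for illustration.

```python
# Hypothetical core component: every attribute of "person" is defined once.
PERSON_CORE_COMPONENT = (
    "first_name", "last_name", "birthday", "hobby", "religion",
    "address", "drivers_license", "social_security_number",
)


def select_properties(core_component, wanted):
    """Derive a business-specific view by selecting from a core component.

    Selection only: a property that the core component does not define
    is rejected rather than added.
    """
    unknown = sorted(name for name in wanted if name not in core_component)
    if unknown:
        raise ValueError(f"not defined in the core component: {unknown}")
    # Keep the order in which the core component declares its properties.
    return tuple(name for name in core_component if name in wanted)


# Assumed subset for a traffic-violation report (illustrative only).
traffic_violation_person = select_properties(
    PERSON_CORE_COMPONENT,
    {"first_name", "last_name", "address", "drivers_license"},
)
```

Because the derived view can only pick from what the core component defines, each attribute keeps a single definition no matter how many messages reuse it.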
The CCTS standard bridges the conceptual and the implementation-oriented worlds of electronic message exchange. To take full advantage of this bridge, an approach has been developed to transform an OWL model into CCTS and CCTS into XML Schema. In this new approach, the W3C standard OWL is used to represent the conceptual models of legal domains, the UN/CEFACT core components, the business documents (data exchange definitions) and the controlled vocabularies.
From “Rich” Ontologies to “Precise” Messages
Ontology models are “rich” in the sense that they capture precise semantics about subject areas of interest. Electronic messages are, by necessity, concise descriptions of situations and affairs. Take the example of a vehicle crash. The phenomena of the crash and the relevant legal domains of policies and procedures can be thought of as “ontologies of” the world. These kinds of ontologies can include policies and procedures that dictate what an electronic message needs to say as opposed to how it says it. In the MoJ project, “ontologies of” a world are referred to as “rich ontologies”, rich in the sense that they dive deep into the nature of what makes up a domain of discourse or state of affairs. “Ontologies about” a subject area, on the other hand, are shallow: they are concerned with documenting or reporting some aspects of a state of affairs.
Using the crash example, we move from “of-hood” to “about-hood” of the crash. The “ontologies of” a crash specify those aspects of a crash that ground and contextualize articles of law and legal statutes and obligations that pertain to conducting a legal case. The “ontologies about” the crash are specifications for the electronic messages needed to conduct the legal case. The crash example is used in Figure 1.
Figure 1: From Richness to Conciseness – the Crash Example
Through projections, selected aspects of OWL conceptual models become UN/CEFACT CCTS building blocks. Qualification of the projections adds metadata for the transformations and the attributes that will be required in the CCTS models. The “Rich” Ontologies of legal domains and contexts of law are transformed into CCTS-based Ontologies using the SPARQL Inferencing Notation (SPIN). Details of SPIN can be found at http://www.spinrdf.org.
Using the CCTS Ontology Models, Information Analysts tailor and compose components to specify the business documents that make up the electronic messages. An Adobe Flex-based User Interface is used to construct the message exchange schemas. Ontologies are queried and updated using TopQuadrant’s TopBraid Live SDK.
The final step is a transformation to XML Schemas. This is done by first generating XML SchemaPlus (XSP) from the OWL Models. Developed by the NASA Ames Research Center, XSP is a specification language for XML Schemas that ensures sufficient semantics are retained from OWL models in the XML world, and that XML Naming and Design Rules are enforced. XSP captures the necessary semantics of the OWL model for round-tripping XML documents back into OWL models.
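To give a feel for model-driven schema generation, the following sketch emits a plain XML Schema complex type from a list of component properties using only the Python standard library. It deliberately does not attempt to reproduce XSP, whose syntax is not described here; the function `complex_type`, the type name `PersonType` and the property names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)  # serialize with the conventional xs: prefix


def complex_type(name, properties):
    """Build an xs:schema holding one xs:complexType with a sequence of elements.

    `properties` is a list of (element name, XSD type) pairs; in a real
    pipeline these would be read from the model rather than written by hand.
    """
    schema = ET.Element(f"{{{XS}}}schema")
    ctype = ET.SubElement(schema, f"{{{XS}}}complexType", {"name": name})
    sequence = ET.SubElement(ctype, f"{{{XS}}}sequence")
    for prop_name, xsd_type in properties:
        ET.SubElement(
            sequence, f"{{{XS}}}element", {"name": prop_name, "type": xsd_type}
        )
    return schema


# Hypothetical component, named in UpperCamelCase purely as an example.
schema = complex_type(
    "PersonType", [("FirstName", "xs:string"), ("BirthDate", "xs:date")]
)
xsd_text = ET.tostring(schema, encoding="unicode")
```

Generating the schema from a single model, rather than editing it by hand, is what lets naming rules be applied uniformly to every message.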
Figure 2 shows an overview of the workflow of the Ontology-Based CCTS Approach.
Figure 2: The Netherlands Ministry of Justice Ontology-Based Approach to Designing Business Documents
As an example of the MoJ Metadata Workbench, the screenshot in Figure 3 shows how a business document is being constructed from components. Other screen layouts deal with the need for qualified datatypes, codelists, metadata properties and annotations.
Figure 3: Constructing a Business Document in the MoJ Metadata Workbench
A screenshot of XSP Generation in the Metadata Workbench is shown in Figure 4.
Figure 4: Generating XML SchemaPlus from the Metadata Workbench
Ontology-Driven Approach to XML Message Schemas
For effective use, ontologies need to be organized. A key decision for the MoJ solution was to have an Ontology Architecture that isolates OWL models that are used for inferencing from those used for transformations. Figure 5 depicts the relationships between the named graphs that comprise the OWL models. Note that the figure refers to “OWL DL” and “OWL FULL”. At the time of the project, OWL 2 was not yet a standard. The current models are compliant with the relevant OWL 2 profiles.
Figure 5: Ontology Architecture of the MoJ Metadata Workbench
Bridging the OWL 2.0 and CCTS Worlds
To fulfill the needs of canonical data models, electronic business documents, and business intelligence, a conceptual, a logical and a physical model must be in place. Such a three-layer approach is common practice in the design of information systems, and electronic messaging is no different. Before the MoJ implemented an ontology-based approach, bridging the conceptual models and the implementation models was a tedious and error-prone manual task. Figure 6 illustrates the three levels of models.
Figure 6: The Three Layers of Models Needed for Electronic Message Design
Reuse of the concepts in the implementation world is essential. For the purpose of maintainability and understanding, a concept is defined only once. In the conceptual world it is important to know which concepts live in the domain, what their meaning and purpose are, and which relations hold between concepts.
The conceptual model, partitioned also in several layers, is based on OWL, the W3C standard that is used to describe concepts and their relationships. OWL is used in a practical way to bridge the semantic gap between organizations. Each organization is free to define its own model but compliance with standard definitions that are common across organizations is required. Awareness of the existence of alternative definitions and the explicit description of the differences is needed for semantic interoperability.
With the use of atomic building blocks and a supporting syntax, conceptual models will be designed to stand the test of time. OWL brings flexibility and consistency where it is needed. Projections are needed because not all the information described in the conceptual world is needed in an electronic message. With simple transformation rules and annotations, the conceptual OWL model is transformed into a CCTS compliant model.
Within the conceptual world, an Ontologist is free to decide how to model the OWL ontology. However, the Ontologist must be aware of the implementation-driven world as well to make good design decisions. Clean and simple transformation rules are in place to guide the Ontologist, and the need for awareness of the CCTS model has been kept to a minimum. In most cases, existing OWL ontologies can be incorporated directly without the need to refactor them. An important prerequisite is that the OWL ontology is in its canonical form before transformation to CCTS takes place.
Conceptual models are modular, maintainable and under version control. Several models can be combined together to form a base-lined Ontology. The base-lined ontology forms the input for the creation process of a CCTS ontology.
A conceptual model in OWL does not express implementation considerations. In terms of modeling styles, the main difference between OWL and CCTS can be seen as similar to comparing object-oriented versus component-oriented software engineering. OWL has constructs for inheritance. In CCTS this becomes composition of parent properties directly in the child.
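The move from inheritance to composition can be sketched as a flattening step that copies inherited properties into the child. This is a simplified illustration and not the project's transformation rules; the classes `Person` and `Driver`, their properties and the function `flatten` are hypothetical.

```python
# Hypothetical class hierarchy (child -> parent) and the properties
# each class declares itself.
PARENT = {"Driver": "Person", "Person": None}
DECLARED = {
    "Person": ["first_name", "last_name"],
    "Driver": ["drivers_license"],
}


def flatten(cls):
    """Return every property of `cls`, with inherited ones copied in first."""
    # Walk up the hierarchy to collect the chain child -> ... -> root.
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = PARENT[cls]
    # Compose from the root down so parent properties precede the child's own.
    properties = []
    for name in reversed(chain):
        for prop in DECLARED.get(name, []):
            if prop not in properties:
                properties.append(prop)
    return properties


driver_component = flatten("Driver")
```

After flattening, the component for `Driver` is self-contained: it lists the parent properties directly and no longer depends on a subclass relation.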
CCTS is implementation-driven and reflects XML Schema in many ways. CCTS specifies business documents through the composition of reusable generic components. From a template called a “Core Component,” an implementation building block, called a “Business Information Entity” is derived.
One of the strengths of OWL is the usage of namespaces. Each concept is uniquely identified by its namespace and local name. For clearer understanding, prefixes are used to distinguish namespaces. Taking advantage of this strength is an aspect of our solution that does not change or violate the CCTS Standard. The Naming and Design Rules give flexibility over the usage of different namespaces. Because the same name can carry different meanings in different domains, namespaces are an essential architectural principle in the solution.
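As a small illustration of how namespaces keep identically named concepts apart, consider the sketch below. The prefixes, the `example.org` namespace URIs and the concept name `Case` are invented for the example and are not taken from the MoJ models.

```python
from typing import NamedTuple


class QName(NamedTuple):
    """A concept identified by its namespace and local name."""
    namespace: str
    local_name: str


# Hypothetical prefix table; the URIs are placeholders.
PREFIXES = {
    "traffic": "http://example.org/traffic#",
    "immigration": "http://example.org/immigration#",
}


def expand(prefixed_name):
    """Expand 'prefix:LocalName' into a fully qualified name."""
    prefix, local_name = prefixed_name.split(":", 1)
    return QName(PREFIXES[prefix], local_name)


traffic_case = expand("traffic:Case")
immigration_case = expand("immigration:Case")
# Same local name, different namespaces: two distinct concepts.
same_concept = traffic_case == immigration_case
```

Comparing the qualified names rather than the local names is what prevents two domains from silently sharing a definition they do not agree on.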
To bridge the gap between OWL and CCTS, a meta-model of CCTS was specified in OWL. The meta-model is implemented according to the Core Component Technical Specification (CCTS) V2.01 standard. For the transformation to XML, the UN/CEFACT Naming and Design Rules (NDR) are used, with essential enhancements made to fit the specific needs of the Ministry of Justice. The NDR are also important for giving the XML Schema a well-structured form.
The meta-model supports reasoning over electronic messages. The CCTS model is realized as both OWL DL-compliant and OWL FULL models. The OWL FULL part is needed to hold additional information important for the XML Schema generation process.
With the three-layer approach, the MoJ has bridged conceptual OWL Models and implementation-level XML electronic message schema design. CCTS provided the standard to define business documents and OWL fulfilled the need for more expressive power than XML.
Looking to the Future
From the business perspective, the benefits for the MoJ are higher-quality messages and easier support for the evolution and extensibility of the electronic messages. Reuse of concepts is guaranteed by the automated transformation from OWL to CCTS.
Looking to the future, when XML instances are available, translation to OWL enables the power of reasoning to be available. We envision applications that can infer new information, perform “smart” queries and generate comprehensive reports. Through workflow controls, sophisticated content-based routing of electronic messages becomes possible. These are the next steps that the MoJ will be considering.
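A rough sketch of what translating an XML instance back into graph form might look like is given below. It is a naive element-to-triple mapping using only the standard library, not the XSP-based round-tripping used in the project; the base URI, the function `xml_to_triples` and the sample message are all assumptions.

```python
import xml.etree.ElementTree as ET

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def xml_to_triples(xml_text, base="http://example.org/msg#"):
    """Map each child of the root to a typed subject with one triple per field.

    Only two levels are handled (record, then leaf fields); deeper nesting
    would need a recursive walk.
    """
    root = ET.fromstring(xml_text)
    triples = []
    for index, record in enumerate(root, start=1):
        subject = f"{base}{record.tag}_{index}"
        triples.append((subject, RDF_TYPE, f"{base}{record.tag}"))
        for field in record:
            value = (field.text or "").strip()
            triples.append((subject, f"{base}{field.tag}", value))
    return triples


# Invented sample message, for illustration only.
SAMPLE = (
    "<Report><Person><FirstName>Jan</FirstName>"
    "<LastName>Jansen</LastName></Person></Report>"
)
triples = xml_to_triples(SAMPLE)
```

Once message content is available as triples, it can be loaded into a triple store and combined with the conceptual models for querying and inference.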
In addition to developing web-based user interfaces, we also extended the TopBraid Composer desktop modeling tool with plug-ins to support the creation of Core Components and Business Information Entities. TopBraid Composer is based on the open source Eclipse Integrated Development Environment, making it open to extensions and customizations.
1. CCTS, “Core Component Technical Specification”, UN/CEFACT, http://www.unece.org/cefact/codesfortrade/CCTS_index.htm
2. NIEM, “National Information Exchange Model”, http://www.niem.gov/
3. SPIN, “SPARQL Inferencing Notation”, http://www.spinrdf.org
4. TopBraid Live, http://www.topquadrant.com/products/TB_Live.html
5. XSP, “XML SchemaPlus”, http://www.xspl.us