How do you define “basic services”? What’s the difference between them and “essential services”? What is meant by terms like “natural capital”, “raw material” or “essential medicines”? How do these all fit into an ontology?
The truth is that there often aren’t universally accepted or precise definitions for terms like these – or even simpler ones, like “forest” – as they relate to their use in the United Nations Environment Programme’s (UNEP) Sustainable Development Goals Project. The Sustainable Development Goals (SDGs) – successors to the UN’s Millennium Development Goals, which expired last year – are a set of 17 goals and 169 targets to be achieved by 2030 to promote human prosperity worldwide while protecting the environment and addressing climate change.
For example, within the goal of ending poverty in all its forms everywhere, targets include reducing at least by half the proportion of men, women, and children who live in poverty in all its dimensions (according to national definitions); implementing nationally appropriate social protection systems and measures for all; and reducing poor people’s exposure and vulnerability to climate-related extreme events and other economic, social, and environmental shocks and disasters. Each goal’s targets will have one or more indicators, which are linked to specific data points that UN statisticians and the general public can monitor to assess progress on that issue.
For the project to come together in a way that ensures data quality and transparency and sets the stage to enable more accurate analysis and measurement of progress, the Sustainable Development Goals Interface Ontology (SDGIO) is being developed to represent the various facets of the SDGs.
An ontology is a structured set of terms and logical relations that represents not only the data, but what the data is about, says Mark Jensen, who is working on the UNEP ontology effort as a consultant. Jensen is pursuing his Ph.D. in the Department of Biomedical Informatics in the Jacobs School of Medicine and Biomedical Sciences at the University at Buffalo.
“An important distinction that well-made ontologies maintain is one between the information stored in databases and the entities out in the world that are described, measured or designated by that information,” he says.
The SDGIO project employs an approach to creating ontologies that is grounded in ontological realism, developed in large part by Barry Smith. Smith, who brought the UN effort to Jensen’s attention, is SUNY Distinguished Professor in the Department of Philosophy and Director of the National Center for Ontological Research at the university.
One use of ontologies in the UN project is to make it possible to intelligently tag incoming data to ensure that users are able to discover and better understand the information they are seeking, even when target and indicator data points intersect across domains and when there are inconsistencies in how member states define data points like “basic services” or “safe access” or, yes, even “forest.” For example, some definitions of “forest” include palm tree plantations, while others do not. That can affect data on forest acreage and, subsequently, any analysis that occurs when data is aggregated across countries and regions with varying conceptions of what qualifies as a forest.
“As ontologists, we create semantic models that represent knowledge in particular domains of inquiry,” says Jensen. A UNEP ontology will provide a model to represent knowledge that’s relevant to the SDGs with more precision and better consistency, and that will provide a better way of integrating information used in monitoring the status of how various targets and goals are being addressed around the world, he explains.
Steps to the Ontology
Jensen is working closely on the SDGIO project with Pier Luigi Buttigieg, a post-doctoral researcher at the Alfred Wegener Institute for Polar and Marine Research. The lead of the Environment Ontology project (ENVO), Buttigieg was invited to participate in a UNEP-led meeting on integrated measures for global monitoring of the SDG process in 2014, Jensen relates. The value of the ontology in promoting integration and interoperability was recognized at that time by UNEP Chief Scientist Jacqueline McGlade – who is directing the SDG project with Ludgarde Coppens, UNEP head of the SDG Information and Knowledge Management Unit – and other representatives. This, Jensen says, resulted in the formation of a team of ontologists to create the SDGIO. “Pier continues to play a leading role in SDGIO’s evolution and is enhancing ENVO to help meet its aims,” says Jensen.
The aim of creating a better, more uniform approach to representing the data doesn’t mean changing the way member states conceptualize their own understandings of certain terms. But it does mean creating a way to represent the diversity of definitions and make that diversity of usage more transparent to people looking for data that is relevant to indicators. In addition, there needs to be a way to show how disparate data is linked together through a variety of common themes that cut across multiple goals and targets.
An ontology enters the picture because it offers a precise way to define terms in hierarchical fashion and to establish relationships and formalized links between the lower-level terms in that hierarchy, Jensen says. A top-level general definition for the term “forest,” then, can encapsulate all its different conceptualizations, with the variations between definitions represented as different lower-tier species or versions of what qualifies as a forest. A great deal of unpacking of the semantics, or meaning, behind each indicator is also required: making clear the links between various data and indicators is what gives users the consistency they need to measure progress toward targets. For instance, an indicator about the proportion of a population living in households with access to basic services has to account for all kinds of data that could relate to it, including what qualifies as a household and how that differs from living in a slum or informal settlement.
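The hierarchy described for “forest” can be pictured with a small Python sketch. It is only an illustration of the general idea, not SDGIO’s actual implementation (which is in OWL): the `Term` class, the two lower-tier definitions, the countries, and the hectare figures are all invented for the example.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Term:
    """A hypothetical ontology term with at most one parent term."""
    label: str
    definition: str
    parent: Optional["Term"] = None

    def ancestors(self) -> Iterator["Term"]:
        # Walk up the hierarchy from this term toward the top-level term.
        node = self.parent
        while node is not None:
            yield node
            node = node.parent

    def is_a(self, other: "Term") -> bool:
        # True if this term is `other` itself or one of its descendants.
        return other is self or any(a is other for a in self.ancestors())


# Top-level term plus two invented lower-tier variants of the definition.
forest = Term("forest", "general term covering every conception of forest")
forest_incl = Term("forest (plantations included)",
                   "assumed definition that counts palm plantations", forest)
forest_excl = Term("forest (plantations excluded)",
                   "assumed definition that leaves palm plantations out", forest)

# Made-up records, each tagged with the definition its source used.
records = [
    {"country": "A", "hectares": 100, "term": forest_incl},
    {"country": "B", "hectares": 250, "term": forest_excl},
]


def total_hectares(rows, term: Term) -> int:
    # Sum only the rows whose tag falls under the requested term.
    return sum(r["hectares"] for r in rows if r["term"].is_a(term))


print(total_hectares(records, forest))       # every conception of forest
print(total_hectares(records, forest_excl))  # one specific definition only
```

Querying the top-level term aggregates across every national definition, while querying a lower-tier term keeps the figures comparable, and in both cases the tag on each record makes the definition in play visible.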
Jensen says the team will be finalizing the first phase of the ontology this spring, which will be implemented on the portal UNEPLive:
“An important aspect of the workflow is to elicit the feedback of relevant domain experts to guide their efforts in refining the semantics to better reflect the various domains surrounding the SDGs, such as legal entities, social policies, economic systems, equity and human rights, ecosystems and biodiversity, infrastructure, public health and education,” Jensen says.
This first phase includes the discovery and creation of all the needed terms and definitions and their formal implementation in OWL as an ontology. The team will reuse existing ontologies developed by other groups, including ENVO (for the environment), ChEBI (for chemical entities), OBI (for biomedical investigations), and PCO (for populations and communities), and the complete semantics will grow over time as new ontologies are formed to address the many domains for which work on SDGIO has revealed a need. For example, no ontology yet exists for human rights or financial measures, Jensen explains:
“SDGIO is truly an ‘interface.’ Not only is there a need to interface the data with the goals, targets and indicators, but also to interface the growing community of domain-specific ontologies that are and will be relevant to the SDGs.”
UN statisticians will tag the data using a few terms from the ontology (in addition to metadata such as provenance, geographic location, and so on). The ontology’s value for helping to enhance the discoverability of data should become clear in a few months as users go into the portal and type something like “access to basic services” or simply “basic services.” “We want to provide the ability for them to find data related to basic services in all contexts, not just for one particular indicator,” Jensen says.
Alternatively, users might go in looking for data related to “essential services” but be served up data tagged as “basic services” because the two categories overlap very closely and often the distinction is hard to maintain. “We make the links between these terms and formalize that in the ontology so that if you search for one you can also find information tagged with the other,” he says. Researchers can drill down to determine whether information aligned with the other data tag fits their assessment requirements and use it or not, as they see fit.
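A rough sketch of how such linked terms could widen a search is shown below in Python. It is a toy model under stated assumptions, not the portal’s actual search code: the `link` and `search` functions, the dataset titles, and the tags are all hypothetical.

```python
from collections import defaultdict

# Symmetric "related term" links, standing in for formal ontology relations.
related = defaultdict(set)


def link(a: str, b: str) -> None:
    # Record that two terms overlap closely enough to cross-reference.
    related[a].add(b)
    related[b].add(a)


link("basic services", "essential services")

# Invented datasets, each tagged with a few ontology terms.
datasets = [
    {"title": "Household water access survey", "tags": {"basic services"}},
    {"title": "Emergency health coverage", "tags": {"essential services"}},
    {"title": "Forest acreage report", "tags": {"forest"}},
]


def search(term: str):
    # Return (title, match type) pairs so users can see why each item appears.
    results = []
    for d in datasets:
        if term in d["tags"]:
            results.append((d["title"], "direct"))
        elif d["tags"] & related.get(term, set()):
            results.append((d["title"], "related"))
    return results


for title, match in search("essential services"):
    print(f"{match:8} {title}")
```

Labeling each hit as direct or related leaves the judgment with the researcher, who can decide whether data tagged with the neighboring term fits the assessment at hand.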
Users also will be able to leverage the ontology for visualizing terms and the relationships between them. Along with those definitions they’ll find editor notes and comments about variations in usage. Says Jensen, “Hopefully they’ll utilize not just the discovery/search feature the ontology facilitates but also pay attention to the mapping and semantics it affords and the extra annotations we can add.”