Quality Control for Ontologies Using SPARQL


By David Schaengold - Posted on 28 July 2010

There are a number of existing tools that check ontologies for OWL DL compliance. To our knowledge, all of these tools include code written especially for checking RDF against the OWL DL ruleset. Recently, we have taken some preliminary steps towards creating a generalized, SPARQL-based validation/quality control strategy that requires no purpose-built code, only the ability to direct SPARQL queries at a graph. Our initial effort resulted in three sets of validation queries, each of which was designed to return violations of a ruleset:

1. The OWL DL ruleset. For instance, the following graph pattern binds to ?instance every instance that violates an owl:maxCardinality 1 restriction by having more than one distinct value asserted for the restricted property:

 

?instance rdf:type ?class .
?class rdfs:subClassOf ?restriction .
?restriction rdf:type owl:Restriction .
?restriction owl:onProperty ?property .
?restriction owl:maxCardinality 1 .
?instance ?property ?value .
?instance ?property ?value2 .
FILTER(?value != ?value2)
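
To make the matching logic concrete, here is a minimal Python sketch that applies the same rule to an in-memory list of (subject, predicate, object) tuples. It is not how a triple store evaluates the pattern; the function name, the prefixed-name strings, and the sample data are all assumptions made for illustration. Like the pattern, it only looks at directly asserted types and direct rdfs:subClassOf links.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
RESTRICTION = "owl:Restriction"
ON_PROPERTY = "owl:onProperty"
MAX_CARD = "owl:maxCardinality"


def max_cardinality_one_violations(triples):
    """Return instances with more than one distinct value for a property
    that one of their asserted classes restricts to owl:maxCardinality 1."""
    triples = set(triples)
    # Properties limited to at most one value, keyed by restriction node.
    restricted = {
        r: prop
        for r, p, prop in triples
        if p == ON_PROPERTY
        and (r, RDF_TYPE, RESTRICTION) in triples
        and (r, MAX_CARD, 1) in triples
    }
    violations = set()
    for cls, p, r in triples:
        if p != SUBCLASS_OF or r not in restricted:
            continue
        prop = restricted[r]
        for inst, q, c in triples:
            if q == RDF_TYPE and c == cls:
                # Distinct values play the role of FILTER(?value != ?value2).
                values = {v for s, k, v in triples if s == inst and k == prop}
                if len(values) > 1:
                    violations.add(inst)
    return violations


# Hypothetical sample graph: ex:alice has two birth dates, ex:bob has one.
graph = [
    ("ex:Person", SUBCLASS_OF, "_:r1"),
    ("_:r1", RDF_TYPE, RESTRICTION),
    ("_:r1", ON_PROPERTY, "ex:birthDate"),
    ("_:r1", MAX_CARD, 1),
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:birthDate", "1980-01-01"),
    ("ex:alice", "ex:birthDate", "1981-02-02"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:bob", "ex:birthDate", "1975-05-05"),
]

print(sorted(max_cardinality_one_violations(graph)))  # ['ex:alice']
```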

 

2. OWL under closed-world assumptions. See Mike Lang, Jr.'s recent post on integrity constraints for an explanation of this concept. As an example, the following graph pattern binds to ?subject every resource that is used as the subject of a property with a declared domain but is not typed into the domain class. Under standard (open-world) OWL assumptions, this situation would simply produce the inference that the subject is an instance of the domain class. Under closed-world assumptions, what is not asserted is assumed to be false, so the graph is treated as incorrect.

 

?subject ?predicate ?object .
?predicate rdfs:domain ?domain .
OPTIONAL { ?subject rdf:type ?class . FILTER(?class = ?domain) }
FILTER(!bound(?class))
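
As a rough illustration of the closed-world rule described above, the following Python sketch flags every subject that uses a property with a declared domain without being explicitly typed into that domain class. It works on plain (subject, predicate, object) tuples; the function name and the sample data are assumptions for the example, and no inference is performed, which is exactly the closed-world point.

```python
def domain_violations(triples):
    """Return (subject, property, domain) for each use of a property whose
    subject is not explicitly typed into the property's declared domain."""
    triples = set(triples)
    # Collect the declared domain classes of each property.
    domains = {}
    for s, p, o in triples:
        if p == "rdfs:domain":
            domains.setdefault(s, set()).add(o)
    violations = set()
    for s, p, o in triples:
        for domain in domains.get(p, ()):
            # Closed world: a missing rdf:type assertion counts as false.
            if (s, "rdf:type", domain) not in triples:
                violations.add((s, p, domain))
    return violations


# Hypothetical sample graph: ex:bob has a different type, ex:carol has none.
graph = [
    ("ex:worksFor", "rdfs:domain", "ex:Employee"),
    ("ex:alice", "rdf:type", "ex:Employee"),
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "rdf:type", "ex:Contractor"),
    ("ex:bob", "ex:worksFor", "ex:acme"),
    ("ex:carol", "ex:worksFor", "ex:acme"),
]

for violation in sorted(domain_violations(graph)):
    print(violation)
# ('ex:bob', 'ex:worksFor', 'ex:Employee')
# ('ex:carol', 'ex:worksFor', 'ex:Employee')
```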

 

3. The Revelytix RDB Mapping Ontology specification.

You can see the queries in action at the validation wiki of our public Mapping Ontology community on Knoodl.com. The text of the queries is available here.

Validating ontologies using queries has several advantages over validating using code. 

  • Because SPARQL is a W3C recommendation and completely non-proprietary, the queries can be used in any context where it is possible to direct SPARQL at a graph (cloud or desktop, accessible to the public or behind a firewall, etc.)
  • The queries can be easily altered, parameterized, or extended to suit particular use-cases
  • The queries can be managed, or changed if necessary, by ontologists alone, without the aid of developers
  • The processing work can be outsourced to triple stores
  • The queries are granular; if a particular use-case does not require the validation of a particular rule, it can be easily removed from the query
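
A small sketch of what parameterized, granular queries might look like in practice, using only the Python standard library. The rule registry, the function names, and the endpoint URL are hypothetical; the only protocol detail relied on is that a SPARQL endpoint accepts the query text in the `query` parameter of a GET request. The property IRI is substituted without any escaping, so this assumes trusted input.

```python
from string import Template
from urllib.parse import urlencode

PREFIXES = """PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
"""

# One entry per rule: dropping a rule means deleting its entry.
RULES = {
    "max_cardinality": Template("""SELECT DISTINCT ?instance WHERE {
  ?instance rdf:type ?class .
  ?class rdfs:subClassOf ?restriction .
  ?restriction rdf:type owl:Restriction .
  ?restriction owl:onProperty <$property> .
  ?restriction owl:maxCardinality 1 .
  ?instance <$property> ?value .
  ?instance <$property> ?value2 .
  FILTER(?value != ?value2)
}"""),
}


def build_query(rule, **params):
    """Fill a rule template with its parameters and prepend the prefixes."""
    return PREFIXES + RULES[rule].substitute(**params)


def endpoint_url(endpoint, query):
    """Build the GET URL that sends a query to a SPARQL endpoint."""
    return endpoint + "?" + urlencode({"query": query})


# Restrict the check to one (hypothetical) property and endpoint.
query = build_query("max_cardinality", property="http://example.org/birthDate")
url = endpoint_url("http://localhost:3030/ds/sparql", query)
print(query)
```

The resulting URL can be fetched with any HTTP client, so the triple store does the processing and the checking side needs no OWL-specific code.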

The queries are written in SPARQL 1.0, which lacks features such as aggregates and an explicit negation operator, and so they do not exhaustively validate any of these three rulesets. We expect that with support for SPARQL 1.1, an exhaustive validation will be possible.