How does 'is_a' work with the search() method?

kuoks
Hi Jiba,

System Information: MacOS 10.15.7; Python 3.8.2; owlready2 0.30

I am a novice to ontologies and the owlready2 package, but I have spent a week or so exploring it. I am trying to use the Python anytree package with the SWEET ontology to build a tree representation of the ontology.

Basically, my strategy is to start at the head of the ontology as the root node of the tree, find and create tree nodes for its children, assign them the root node as their parent node, then proceed recursively down the chain. However, I noticed the following (code and returns below):

>>> from owlready2 import *
>>> onto = get_ontology('data/owlapi.xrdf').load() # I have downloaded the SWEET xrdf file to a local drive.
>>> print(list(Thing.subclasses()))
[phen.Phenomena, proc.Process, prop.Property, realm.Realm, state.State, repr.Representation, human.HumanActivity, matr.Substance, sosa.Sensor, matrParticle.Nucleon, phenCryo.PhysicalProcess, procChemical.Sorption, reprSciModel.Sample, schema.org.Organization]

and

>>> onto.search(iri = '*GeologicFea*')
[realmGeol.GeologicFeature]
>>> onto.search(iri = '*GeologicFea*')[0].is_a
[owl.Thing, rela.hasRealm.only(realm.Geosphere)]

So, apparently, 'Thing' is a parent of 'realmGeol.GeologicFeature', but the latter does not appear in the subclasses of 'Thing'! I therefore tried 'onto.search(is_a = Thing)' instead, hoping to find 'rela.hasRealm.only(realm.Geosphere)' as a 'subclass' of 'Thing' and construct it as a child node on the tree. But the search simply hung (with CPU usage around 100%), and worse still, once the search had started, I couldn't interrupt it. I had to force quit the Python shell or kill the terminal it was running in. Out of curiosity, I tried 'onto.search(subclass_of = Thing)'. That hangs too! I also asked a friend of mine (who is more familiar with ontologies than I am) to try it on his platform, with the same results.

My question is: Did I misunderstand the usage of the search method and use it incorrectly? Or, is there possibly a bug?

Thanks,
Kuo

Re: How does 'is_a' work with the search() method?

Jiba
Administrator
Hi,

There was actually a bug in Thing.subclasses(): it missed GeologicFeature because GeologicFeature has another parent besides Thing, and that parent is not a named class but a construct.

Finding subclasses of Thing, i.e. the top-level classes, is actually tricky because some ontologies do not declare top-level classes as subclasses of Thing, and some others declare non-top-level classes as subclasses of Thing.

I fixed this bug in the development version of Owlready on Bitbucket. You may also use the following SPARQL query to work around this problem:

print([x[0] for x in list(default_world.sparql("""SELECT ?x { ?x a owl:Class. FILTER(ISIRI(?x)). FILTER NOT EXISTS { ?x rdfs:subClassOf ?y. FILTER(ISIRI(?y) && ?y != owl:Thing) } }"""))])
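
The filter in the query above can be paraphrased in plain Python, which may make it easier to see why GeologicFeature counts as top-level. This is only a toy model, not Owlready2 code: the `parents` dictionary, the `Restriction` stand-in and the `top_level_classes` helper are all hypothetical, named classes are plain strings, and anonymous constructs are `Restriction` objects.

```python
from dataclasses import dataclass

THING = "owl.Thing"


@dataclass(frozen=True)
class Restriction:
    """Hypothetical stand-in for an anonymous construct such as hasRealm.only(Geosphere)."""
    description: str


def top_level_classes(parents):
    """Return the named classes that have no named parent other than Thing.

    `parents` maps each named class (a string) to its list of direct parents.
    A parent is either a string (named class) or a Restriction (construct).
    """
    result = []
    for cls, direct_parents in parents.items():
        # Mirrors FILTER(ISIRI(?y) && ?y != owl:Thing): keep only named, non-Thing parents.
        named = [p for p in direct_parents if isinstance(p, str) and p != THING]
        if not named:  # mirrors FILTER NOT EXISTS
            result.append(cls)
    return result


# Toy data loosely modelled on the thread; not loaded from SWEET.
parents = {
    "phen.Phenomena": [THING],
    "realmGeol.GeologicFeature": [THING, Restriction("rela.hasRealm.only(realm.Geosphere)")],
    "realm.Geosphere": ["realm.Realm"],
    "realm.Realm": [],
}

print(top_level_classes(parents))
```

In this model a construct parent is simply ignored, so a class whose only other parent is a restriction still ends up at the top level, as does a class that declares no parent at all.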

onto.search(subclass_of = Thing) also includes descendant classes, so it is more complex and takes longer. Here, it never terminates; possibly there is a cycle somewhere?
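
If a cycle is indeed the cause, a tree builder that walks the hierarchy itself should remember which classes are already on the current path. A minimal sketch, assuming a hypothetical `children_of` callable that returns the direct subclasses of a class (with Owlready2 this might be something like `lambda c: c.subclasses()`, which I have not tested here); the toy graph is made up:

```python
def build_tree(root, children_of, _path=None):
    """Return a nested (node, [subtrees]) tuple, cutting edges that would close a cycle."""
    path = set() if _path is None else _path
    path.add(root)
    subtrees = []
    for child in children_of(root):
        if child in path:
            continue  # child is an ancestor of itself on this path: skip it
        subtrees.append(build_tree(child, children_of, path))
    path.discard(root)  # leaving this branch, so root may appear again elsewhere
    return (root, subtrees)


# Hypothetical hierarchy: C is listed under both A and B, and C -> A closes a cycle.
graph = {"Thing": ["A", "B"], "A": ["C"], "B": ["C"], "C": ["A"]}
tree = build_tree("Thing", lambda c: graph.get(c, []))
print(tree)
```

Tracking only the current path, rather than every class seen so far, lets a class with several parents appear under each of them while still guaranteeing termination. The nested tuples could be turned into anytree nodes in a second pass.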

Jiba

Re: How does 'is_a' work with the search() method?

kuoks
Hi Jiba,

Many thanks for the fast response.

I am sorry to report that, when I tried the SPARQL query, I got the following error:

>>> print([x[0] for x in list(default_world.sparql("""SELECT ?x { ?x a owl:Class. FILTER(ISIRI(?x)). FILTER NOT EXISTS { ?x rdfs:subClassOf ?y. FILTER(ISIRI(?y) && ?y != owl:Thing) } }"""))])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/kskuo/GitHub/NASA-IMPACT/kg_scripts/venv/lib/python3.8/site-packages/owlready2/sparql/main.py", line 281, in execute
    for l in PreparedQuery.execute(self, params):
  File "/Users/kskuo/GitHub/NASA-IMPACT/kg_scripts/venv/lib/python3.8/site-packages/owlready2/sparql/main.py", line 277, in execute
    return self.world.graph.execute(self.sql, sql_params)
sqlite3.OperationalError: near "ANDq2": syntax error

My more experienced friend helped me and got something similar to work as follows, after loading the SWEET ontology with owlready2 (ver. 0.30):

>>> import rdflib
>>> graph = default_world.as_rdflib_graph()
>>> r = list( graph.query("""
... PREFIX owl: <http://www.w3.org/2002/07/owl#>
... PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
... SELECT ?x { ?x a owl:Class. FILTER(ISIRI(?x)). FILTER NOT EXISTS { ?x rdfs:subClassOf ?y. FILTER(ISIRI(?y) && ?y != owl:Thing) } }
... """))
>>> r
[(rdflib.term.URIRef('http://sweetontology.net/phen/Phenomena'),), (rdflib.term.URIRef('http://sweetontology.net/proc/Process'),), (rdflib.term.URIRef('http://sweetontology.net/prop/Property'),), (rdflib.term.URIRef('http://sweetontology.net/realm/Realm'),), (rdflib.term.URIRef('http://sweetontology.net/state/State'),), (rdflib.term.URIRef('http://sweetontology.net/repr/Representation'),), (rdflib.term.URIRef('http://sweetontology.net/human/HumanActivity'),), (rdflib.term.URIRef('http://sweetontology.net/matr/Substance'),), (rdflib.term.URIRef('http://www.w3.org/ns/sosa/Sensor'),), (rdflib.term.URIRef('http://sweetontology.net/matrParticle/Nucleon'),), (rdflib.term.URIRef('http://sweetontology.net/phenCryo/PhysicalProcess'),), (rdflib.term.URIRef('http://sweetontology.net/realmGeol/GeologicFeature'),), (rdflib.term.URIRef('http://sweetontology.net/procChemical/Sorption'),), (rdflib.term.URIRef('http://sweetontology.net/reprSciModel/Sample'),), (rdflib.term.URIRef('https://schema.org/Organization'),)]

About searching for subclasses, i.e. 'onto.search(subclass_of = Thing)', it was not the first thing I tried. I tried 'onto.search(is_a = Thing)' first, as stated in my initial post. That hung too. Is it due to the same reason, i.e. going through all descendants of 'Thing'?

Thanks,
Kuo

Re: How does 'is_a' work with the search() method?

Jiba
Administrator
Hi,

I apologize, there is also a bug in the (new) SPARQL engine; I fixed the bug last week in the development version, and I tested the query with that version.

onto.search(is_a = ...) is the union of rdf:type and rdfs:subClassOf. It also goes through descendants, as you guessed.
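
As a rough illustration of that behaviour, here is a toy model over plain triples. It is not how Owlready2 implements search(); the `search_is_a` helper and the sample triples are hypothetical and only show what "union of rdf:type and rdfs:subClassOf, through descendants" means.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def search_is_a(triples, target):
    """Return every subject linked to `target` by a chain of rdf:type / rdfs:subClassOf edges."""
    # Index: object -> subjects pointing at it through either predicate.
    incoming = {}
    for s, p, o in triples:
        if p in (RDF_TYPE, SUBCLASS_OF):
            incoming.setdefault(o, []).append(s)

    found = []
    seen = {target}  # also protects against cycles
    stack = [target]
    while stack:
        current = stack.pop()
        for subject in incoming.get(current, []):
            if subject not in seen:
                seen.add(subject)
                found.append(subject)
                stack.append(subject)
    return found


# Made-up triples; the label triple is ignored because its predicate is neither of the two.
triples = [
    ("Realm", SUBCLASS_OF, "Thing"),
    ("Geosphere", SUBCLASS_OF, "Realm"),
    ("earth_crust", RDF_TYPE, "Geosphere"),
    ("Realm", "rdfs:label", "realm"),
]

print(sorted(search_is_a(triples, "Thing")))
```

Starting from Thing, such a search has to reach every class and every individual in the graph, which is consistent with it being much slower than listing direct subclasses.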

Jiba