I am in the early stages of using Owlready. I created my ontology in Protege, and I use Owlready and Python to load the ontology and add some instances, relationships, etc. So far so good! I then wanted to use the SQLite backend, so I tried to follow the documentation. This is what I have:
------
from owlready2 import *
onto_path.append('/users/mattfitz/Desktop')
default_world.set_backend(filename = "/Users/mattfitz/workplace/databases/myriaddata.sqlite3")
onto = get_ontology('myriadont.owl')
onto.load()
------
That seems to work ok, but when I do this:
------
emp = onto.Employee('mattfitz1', alias='mattfitz', jobLevel=7)
------
I end up with:
---------------------------------------------------------------------------
IntegrityError                            Traceback (most recent call last)
<ipython-input-6-ea1847d4080e> in <module>()
----> 1 emp = onto.Employee('mattfitz1', alias='mattfitz', jobLevel=7)

~/anaconda3/lib/python3.6/site-packages/owlready2/namespace.py in __getattr__(self, attr)
     65   def __repr__(self): return """%s.get_namespace("%s")""" % (self.ontology, self.base_iri)
     66
---> 67   def __getattr__(self, attr): return self.world["%s%s" % (self.base_iri, attr)] #return self[attr]
     68   def __getitem__(self, name): return self.world["%s%s" % (self.base_iri, name)]
     69

~/anaconda3/lib/python3.6/site-packages/owlready2/namespace.py in __getitem__(self, iri)
    326
    327   def __getitem__(self, iri):
--> 328     return self._get_by_storid(self.abbreviate(iri), iri)
    329
    330   def _get_by_storid(self, storid, full_iri = None, main_type = None, main_onto = None, trace = None):

~/anaconda3/lib/python3.6/site-packages/owlready2/triplelite.py in abbreviate_dict(self, iri)
    131       storid = self.abbreviate_d[iri] = _int_base_62(self.current_resource)
    132       self.unabbreviate_d[storid] = iri
--> 133       self.execute("INSERT INTO resources VALUES (?,?)", (storid, iri))
    134       return storid
    135

IntegrityError: UNIQUE constraint failed: resources.storid
-----
Any ideas what I have done wrong? This works totally fine if I go the in-memory route.

Cheers,
-Matt |
Administrator
|
Hello,
This is surprising, because the in-memory approach also uses SQLite (in memory). Are you sure that an entity named "mattfitz1" does not already exist? If so, could you send me your ontology so that I can try it?

Best regards,
Jiba |
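For readers unfamiliar with this class of error, here is a minimal sketch of how SQLite raises an IntegrityError when a value is inserted twice into a column that must be unique. The two-column `resources` table is only modelled on the INSERT statement visible in the traceback; the schema, the `register` helper and the example IRIs are assumptions for illustration, not Owlready's actual code.

```python
import sqlite3

# Hypothetical table modelled on the INSERT seen in the traceback.
# The real Owlready schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resources (storid TEXT PRIMARY KEY, iri TEXT)")

def register(storid, iri):
    """Insert a (storid, iri) pair; return False if the storid is already used."""
    try:
        conn.execute("INSERT INTO resources VALUES (?,?)", (storid, iri))
        return True
    except sqlite3.IntegrityError as err:
        # SQLite refuses the second row because storid must be unique.
        print("rejected:", err)
        return False

first = register("a1", "http://example.org/onto#mattfitz1")
second = register("a1", "http://example.org/onto#other")
row_count = conn.execute("SELECT COUNT(*) FROM resources").fetchone()[0]
```

In this sketch the second insert is refused because the identifier is reused for a different IRI, which is one way such a constraint can fail; the thread itself never pins down the cause.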
Ok, I seem to have solved this. I am not sure what I did, but once I fleshed out my script the error went away. I do have a couple of questions, however. Like others, I am reading a CSV into a Pandas data frame and calling apply() on it, which calls a function for every row that does something like this:

    onto.Employee(alias, alias=alias, jobLevel=level, jobTitle=family, isManager=ismanager,
                  isTech=istech, timeZone=tz, hasManager=onto.Employee(manager, alias=manager))

This appears to work quite well. If the Employee being related (via hasManager) already exists, a new instance is not created (as expected: same class, same URI, so the existing instance is used). However, a <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/> triple gets added to the pre-existing instance every time onto.Employee() is called, which in effect pollutes the backend quad-store. I guess you could reproduce it by calling this three times:

    onto.Employee("manager")
    onto.Employee("manager")
    onto.Employee("manager")

There will be a single instance, with 3 occurrences of <rdf:type ...> within it. This doesn't seem normal. Or am I doing something wrong?

Cheers,
-M |
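The pollution described above can be illustrated with a toy triple store written in plain Python. This is not how Owlready stores triples; `ToyStore`, `add_naive` and `add_once` are hypothetical names used only to show the difference between appending a triple unconditionally and checking for it first.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
NAMED_INDIVIDUAL = "http://www.w3.org/2002/07/owl#NamedIndividual"

class ToyStore:
    """A deliberately simple triple store backed by a list."""

    def __init__(self):
        self.triples = []  # a list happily keeps duplicates

    def add_naive(self, s, p, o):
        # Always appends, so repeated calls pile up identical triples.
        self.triples.append((s, p, o))

    def add_once(self, s, p, o):
        # Appends only when the exact triple is not stored yet.
        if (s, p, o) not in self.triples:
            self.triples.append((s, p, o))

naive = ToyStore()
guarded = ToyStore()
for _ in range(3):  # mimics calling onto.Employee("manager") three times
    naive.add_naive("#manager", RDF_TYPE, NAMED_INDIVIDUAL)
    guarded.add_once("#manager", RDF_TYPE, NAMED_INDIVIDUAL)

naive_count = len(naive.triples)
guarded_count = len(guarded.triples)
```

The naive store ends up with one copy of the type triple per call, while the guarded store keeps a single copy, which is the behaviour one would expect from repeated creation of the same individual.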
Administrator
|
Hello,
I've tried to reproduce your previous problem (the unique constraint error), but in vain. For entity creation, the existing entity is returned if it already exists. However, you are right, there is a "quadstore pollution" problem... I think the redundant triples have no impact, but they do pollute the quadstore. I'm going to investigate this problem and search for a solution.

Thank you,
Jiba |
Administrator
|
Hello again,
I've now fixed the "quadstore pollution" problem in the development version of Owlready (on Bitbucket). Additionally, multiple calls to the class with the same name now return the same Python object.

Best regards,
Jiba |
I installed 0.10 but I am still seeing the redundant triples in the quad-store!
|
Administrator
|
Hello,
Are you sure you are using the Bitbucket version of Owlready? I've tested with the script below and the ontology you sent me, and it works well. I've also added some test cases in test/regtest.py.

Best regards,
Jean-Baptiste Lamy
MCF HDR, Laboratoire LIMICS, Université Paris 13

---8<--------------
from owlready2 import *
onto_path.append("/tmp")
onto = get_ontology("file:///tmp/myriadont.owl").load()
with onto:
    e1 = onto.Employee("manager")
    e2 = onto.Employee("manager")
    e3 = onto.Employee("manager")
onto.graph.dump()
|
Yeah, I am! I have since written a small script and confirmed that you have fixed this. It must be something else in the way I am calling the creation. It's within a Pandas apply() function, so obviously there is some tight iteration going on. I thought about a thread issue, but I'm pretty sure everything is single-threaded?
|
Administrator
|
Hello,
> I thought about a thread issue, but pretty sure everything is single-threaded?

Yes, everything is single-threaded in Owlready (and thus one should not use multiple threads with it, unless taking care of synchronization).

Best regards,
Jean-Baptiste Lamy
MCF HDR, Laboratoire LIMICS, Université Paris 13 |
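As a generic illustration of what "taking care of synchronization" can mean, the sketch below guards a shared, non-thread-safe structure with a lock so that the check-then-insert step cannot interleave between threads. The `store` list is only a stand-in for a shared resource; nothing here uses or reflects Owlready's internals, and `add_employees` is a hypothetical helper.

```python
import threading

store = []                      # stand-in for a shared, non-thread-safe store
store_lock = threading.Lock()   # serializes every access to the store

def add_employees(prefix, count):
    """Add names prefix0..prefix(count-1), skipping ones already present."""
    for i in range(count):
        name = "%s%d" % (prefix, i)
        with store_lock:
            # The membership test and the append run as one atomic step.
            if name not in store:
                store.append(name)

# Four threads all try to add the same 100 names.
threads = [threading.Thread(target=add_employees, args=("worker", 100))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

unique_names = len(set(store))
total_names = len(store)
```

Because every thread holds the lock while it checks and inserts, each name is stored once even though four threads attempt to add it.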
Hello again,
The code below will replicate the issue I am seeing (namely redundant <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/> triples):

    from owlready2 import *
    onto_path.append('/tmp')
    onto = get_ontology('myriadont2.owl').load()
    with onto:
        mgr = onto.Employee("worker5")
        for x in range(10):
            emp = ("worker%s" % x)
            e = onto.Employee(emp, alias=emp, jobLevel=x, timeZone="PST", isManager=False, isTech=True,
                              jobTitle="SDE", hasManager=mgr)
    onto.save(file='/tmp/onttest.owl')

Any ideas if it's something I am doing wrong or a bug? :)

-M |
Administrator
|
Hello,
I tested this script with the ontology you sent me, but I get only 10 NamedIndividuals (thus only one for the manager). Are you sure you are using the development version of Owlready? Could you try to run the regtest.py module in the source, with the test cases? Some of them are related to your problem.

Jiba |
I am absolutely positive that I am using the latest development version. I understand that I am not getting duplicate instances; the problem is the duplicated NamedIndividual triples associated with an instance. For example, here are two instances generated by the sample script I provided previously:
<Employee rdf:about="#worker8">
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  <rdf:type rdf:resource="#Employee"/>
  <alias rdf:datatype="http://www.w3.org/2001/XMLSchema#string">worker8</alias>
  <jobLevel rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">8</jobLevel>
  <timeZone rdf:datatype="http://www.w3.org/2001/XMLSchema#string">PST</timeZone>
  <isManager rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">false</isManager>
  <isTech rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</isTech>
  <jobTitle rdf:datatype="http://www.w3.org/2001/XMLSchema#string">SDE</jobTitle>
  <hasManager rdf:resource="#worker5"/>
</Employee>

<Employee rdf:about="#worker9">
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  <alias rdf:datatype="http://www.w3.org/2001/XMLSchema#string">worker9</alias>
  <jobLevel rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">9</jobLevel>
  <timeZone rdf:datatype="http://www.w3.org/2001/XMLSchema#string">PST</timeZone>
  <isManager rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">false</isManager>
  <isTech rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</isTech>
  <jobTitle rdf:datatype="http://www.w3.org/2001/XMLSchema#string">SDE</jobTitle>
  <hasManager rdf:resource="#worker5"/>
</Employee>

Are you saying this is not what you're seeing when you run the same script?

Cheers,
-Matt |
Administrator
|
Hi,
I do not get duplicated triples if the individuals do not exist in the original ontology. However, I do get them if the original ontology already includes these individuals. I have fixed this problem in the development version; could you try again?

Best regards,
Jiba |
Hey Jiba,
Yep - that fixed it! Thanks very much for that. One more question: once I populate my quad-store, what's the usual way of working with it? My script currently does:

    default_world.set_backend([path to sqlite db])
    onto = get_ontology([ontology file])
    onto.load()

I'm thinking that onto.load() blows away the contents of the database each time? So, I assume that to leverage the database I would do:

    default_world.set_backend([path to sqlite db])
    onto = default_world.get_ontology()
    onto.load()

Is this correct? Thanks very much again for resolving that triple issue!

Cheers,
-Matt |
Administrator
|
Hi,
When the quadstore is stored on disk, onto.load() checks whether the ontology has been changed and, if so, removes it from the quadstore and reloads it. So it can blow away the contents you added previously, if the OWL file has been modified.

Best regards,
Jiba |
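The reload-on-change behaviour described above can be sketched generically as a comparison between a file's modification time and a timestamp remembered from the last load. The thread does not say how Owlready detects the change, so the mtime comparison, the `needs_reload` helper and the temporary file are all assumptions made for illustration.

```python
import os
import tempfile

def needs_reload(path, stored_mtime):
    """Return True if the file was never loaded or changed since stored_mtime."""
    return stored_mtime is None or os.path.getmtime(path) > stored_mtime

# Create a throwaway file standing in for the OWL file.
with tempfile.NamedTemporaryFile("w", suffix=".owl", delete=False) as f:
    f.write("<rdf:RDF/>")
    path = f.name

first_check = needs_reload(path, None)      # never loaded before
stored = os.path.getmtime(path)             # remember the time of the "load"
second_check = needs_reload(path, stored)   # file untouched since the load

# Simulate a later edit by pushing the modification time forward.
os.utime(path, (stored + 10, stored + 10))
third_check = needs_reload(path, stored)    # file now looks newer

os.remove(path)
```

Under this model, anything added to the stored copy after the last load would be discarded once the source file looks newer, which matches the caution given in the reply.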