Unique constraint error with SQLite

mafi
I am in the preliminary stages of using Owlready.  I created my ontology in Protege, and use Owlready and Python to load the ontology and add some instances, relationships, etc.  So far so good!  I then wanted to use the SQLite backend, so I tried to follow the documentation.  This is what I have:


------
from owlready2 import *

onto_path.append('/users/mattfitz/Desktop')
default_world.set_backend(filename = "/Users/mattfitz/workplace/databases/myriaddata.sqlite3")
onto = get_ontology('myriadont.owl')
onto.load()
------


That seems to work ok, but when I do this:

------
emp = onto.Employee('mattfitz1', alias='mattfitz', jobLevel=7)
------

I end up with:

---------------------------------------------------------------------------
IntegrityError                            Traceback (most recent call last)
<ipython-input-6-ea1847d4080e> in <module>()
----> 1 emp = onto.Employee('mattfitz1', alias='mattfitz', jobLevel=7)

~/anaconda3/lib/python3.6/site-packages/owlready2/namespace.py in __getattr__(self, attr)
     65   def __repr__(self): return """%s.get_namespace("%s")""" % (self.ontology, self.base_iri)
     66
---> 67   def __getattr__(self, attr): return self.world["%s%s" % (self.base_iri, attr)] #return self[attr]
     68   def __getitem__(self, name): return self.world["%s%s" % (self.base_iri, name)]
     69

~/anaconda3/lib/python3.6/site-packages/owlready2/namespace.py in __getitem__(self, iri)
    326
    327   def __getitem__(self, iri):
--> 328     return self._get_by_storid(self.abbreviate(iri), iri)
    329
    330   def _get_by_storid(self, storid, full_iri = None, main_type = None, main_onto = None, trace = None):

~/anaconda3/lib/python3.6/site-packages/owlready2/triplelite.py in abbreviate_dict(self, iri)
    131       storid = self.abbreviate_d[iri] = _int_base_62(self.current_resource)
    132       self.unabbreviate_d[storid] = iri
--> 133       self.execute("INSERT INTO resources VALUES (?,?)", (storid, iri))
    134     return storid
    135

IntegrityError: UNIQUE constraint failed: resources.storid
 
-----

Any ideas what I have done wrong?  This works totally fine if I go the in-memory route.

Cheers,
  -Matt
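
For readers unfamiliar with this error, here is a minimal sketch of how SQLite raises it, using only Python's standard sqlite3 module. The two-column resources table is an assumption loosely modelled on the INSERT statement in the traceback, not Owlready's actual schema, and the helper insert_resource and the example.org IRIs are made up for illustration.

```python
import sqlite3

def insert_resource(db, storid, iri):
    """Insert one (storid, iri) row; return the error message if SQLite rejects it."""
    try:
        db.execute("INSERT INTO resources VALUES (?,?)", (storid, iri))
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

# Hypothetical schema: storid is the unique key, as the error message suggests.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE resources (storid TEXT PRIMARY KEY, iri TEXT)")

first = insert_resource(db, "a1", "http://example.org/onto#mattfitz1")
# Reusing an already-stored storid for a different IRI violates the constraint.
second = insert_resource(db, "a1", "http://example.org/onto#other")
print(first, "|", second)
```

The point is only that the error means a storid that is already present in the table was about to be inserted again. It does not by itself say why the abbreviation counter and the table got out of step.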
Re: Unique constraint error with SQLite

Jiba
Administrator
Hello,

This is surprising, because the in-memory approach also uses SQLite (in memory).

Are you sure that an entity named "mattfitz1" does not already exist?

If so, could you send me your ontology so that I can try it?

Best regards,
Jiba
Re: Unique constraint error with SQLite

mafi
Ok, I seem to have solved this.  I'm not sure what I did, but once I started to flesh out my script it went away.  I do have a couple of questions, however.  Similar to others, I am reading a CSV into a pandas DataFrame and calling apply() on it, which calls a function for every row that does something like this:

onto.Employee(alias, alias=alias, jobLevel=level, jobTitle=family, isManager=ismanager, \
                      isTech=istech, timeZone = tz, hasManager=onto.Employee(manager, alias=manager))

This appears to work quite well.  If the Employee being related (via hasManager) already exists, a new instance is not created, which is as expected: same class, same URI, so the existing instance is used.  However, an extra <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/> gets added to the pre-existing instance every time onto.Employee() is called, which in effect pollutes the backend quad-store.  I guess you could reproduce it by doing:

onto.Employee("manager")
onto.Employee("manager")
onto.Employee("manager")

After these three calls there will be a single instance, with 3 occurrences of <rdf:type ...> within it.

This doesn't seem normal.  Or am I doing something wrong?

Cheers,
 -M
Re: Unique constraint error with SQLite

Jiba
Administrator
Hello,

I've tried to reproduce your previous problem (the unique constraint error), but in vain.

For entity creation, the existing entity is returned if it already exists.

However, you are right, there is a "quadstore pollution" problem. I think the redundant triples have no impact, but they do pollute the quadstore. I'm going to investigate this problem and search for a solution.

Thank you,
Jiba
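
The pollution discussed above can be modelled in a few lines. This is only a sketch under stated assumptions: a plain Python list stands in for the quadstore, and add_triple and add_triple_once are hypothetical helpers, not Owlready functions. It shows why repeated creation calls can pile up identical rdf:type triples, and why the extra copies do not change what the graph says.

```python
NAMED_INDIVIDUAL = "http://www.w3.org/2002/07/owl#NamedIndividual"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def add_triple(store, s, p, o):
    """Naive insert: appends even when the triple is already stored."""
    store.append((s, p, o))

def add_triple_once(store, s, p, o):
    """Guarded insert: skips triples that are already present."""
    if (s, p, o) not in store:
        store.append((s, p, o))

polluted, clean = [], []
for _ in range(3):  # mirrors calling onto.Employee("manager") three times
    add_triple(polluted, "#manager", RDF_TYPE, NAMED_INDIVIDUAL)
    add_triple_once(clean, "#manager", RDF_TYPE, NAMED_INDIVIDUAL)

# An RDF graph is a set of triples, so both stores describe the same graph.
print(len(polluted), len(clean), set(polluted) == set(clean))
```

The naive store ends up with three rows and the guarded one with a single row, yet both reduce to the same set of triples.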
Re: Unique constraint error with SQLite

Jiba
Administrator
In reply to this post by mafi
Hello again,

I've now fixed the "quadstore pollution" problem in the development version of Owlready (on Bitbucket).

Additionally, multiple calls to the class with the same name now return the same Python object.

Best regards,
Jiba
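
As an illustration of what "the same Python object" means here, the sketch below caches instances by name in a toy class. Individual and its _by_name registry are hypothetical and written only for this example; this is not how Owlready implements it.

```python
class Individual:
    """Toy stand-in for an ontology individual, cached by name."""
    _by_name = {}  # name -> existing instance

    def __new__(cls, name, **props):
        inst = cls._by_name.get(name)
        if inst is None:
            # First call with this name: create and register the instance.
            inst = super().__new__(cls)
            inst.name = name
            inst.props = {}
            cls._by_name[name] = inst
        return inst

    def __init__(self, name, **props):
        # Runs on every call, including calls that return a cached instance,
        # so it only merges in the new property values.
        self.props.update(props)

e1 = Individual("manager")
e2 = Individual("manager", alias="manager")
print(e1 is e2, e1.props)
```

Because both names refer to one object, a property set through the second call is visible through the first.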
Re: Unique constraint error with SQLite

mafi
I installed 0.10 but I am still seeing the redundant triples in the quad-store!
Re: Unique constraint error with SQLite

Jiba
Administrator
Hello,

Are you sure you are using the Bitbucket version of Owlready? I've tested with the script below and the ontology you sent me, and it works well.
I've also added some test cases in test/regtest.py.

Best regards,
Jean-Baptiste Lamy
MCF HDR, Laboratoire LIMICS, Université Paris 13


---8<--------------

from owlready2 import *

onto_path.append("/tmp")
onto = get_ontology("file:///tmp/myriadont.owl").load()

with onto:
  e1 = onto.Employee("manager")
  e2 = onto.Employee("manager")
  e3 = onto.Employee("manager")
 
onto.graph.dump()
Re: Unique constraint error with SQLite

mafi
Yeah, I am!  I have since written a small script and confirmed that you have fixed this.  It must be something else in the way I am calling the creation.  It’s within a pandas apply() function, so obviously there is some tight iteration going on.  I thought about a thread issue, but I'm pretty sure everything is single-threaded?
Re: Unique constraint error with SQLite

Jiba
Administrator
Hello,

> I thought about a thread issue, but I'm pretty sure everything is single-threaded?

Yes, everything is single-threaded in Owlready (and thus one should not use multiple threads with it, unless one takes care of synchronization).

Best regards,
Jean-Baptiste Lamy
MCF HDR, Laboratoire LIMICS, Université Paris 13
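
One common way to take care of synchronization is to serialize every access to the shared store with a single lock. The sketch below uses only the standard threading module. LockedStore is a hypothetical toy store, and whether a lock like this is sufficient around real Owlready calls is an assumption that this thread does not confirm.

```python
import threading

class LockedStore:
    """Toy shared store whose writes are serialized by a single lock."""
    def __init__(self):
        self._lock = threading.Lock()
        self.triples = []

    def add(self, triple):
        # Only one thread at a time may run the check-then-append sequence.
        with self._lock:
            if triple not in self.triples:
                self.triples.append(triple)

store = LockedStore()

def worker(start):
    # Overlapping ranges make the threads compete for the same triples.
    for i in range(start, start + 50):
        store.add(("#worker%d" % (i % 60), "rdf:type", "#Employee"))

threads = [threading.Thread(target=worker, args=(n * 10,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(store.triples), len(set(store.triples)))
```

Holding the lock across both the membership test and the append is what prevents two threads from inserting the same triple.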
Re: Unique constraint error with SQLite

mafi
Hello again,

The code below will replicate the issues I am seeing (namely redundant <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/> triples):

from owlready2 import *

onto_path.append('/tmp')
onto = get_ontology('myriadont2.owl').load()


with onto:
  mgr = onto.Employee("worker5")

  for x in range(10):
    emp = "worker%s" % x
    e = onto.Employee(emp, alias=emp, jobLevel=x, timeZone="PST", isManager=False, isTech=True,
                      jobTitle="SDE", hasManager=mgr)

onto.save(file='/tmp/onttest.owl')


Any ideas if it's something I am doing wrong or a bug? :)

-M
Re: Unique constraint error with SQLite

Jiba
Administrator
Hello,

I tested this script with the ontology you sent me, but I get only 10 NamedIndividuals (thus only one for the manager).

Are you sure you are using the development version of Owlready? Could you try to run the regtest.py module in the source, with the test cases? Some of them are related to your problem.

Jiba
Re: Unique constraint error with SQLite

mafi
I am absolutely positive that I am using the latest development version.  I understand I am not getting duplicate instances; the problem is the duplicated NamedIndividual triples associated with an instance.  For example, here are two instances generated by the sample script I provided previously:

<Employee rdf:about="#worker8">
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  <rdf:type rdf:resource="#Employee"/>
  <alias rdf:datatype="http://www.w3.org/2001/XMLSchema#string">worker8</alias>
  <jobLevel rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">8</jobLevel>
  <timeZone rdf:datatype="http://www.w3.org/2001/XMLSchema#string">PST</timeZone>
  <isManager rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">false</isManager>
  <isTech rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</isTech>
  <jobTitle rdf:datatype="http://www.w3.org/2001/XMLSchema#string">SDE</jobTitle>
  <hasManager rdf:resource="#worker5"/>
</Employee>

<Employee rdf:about="#worker9">
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  <alias rdf:datatype="http://www.w3.org/2001/XMLSchema#string">worker9</alias>
  <jobLevel rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">9</jobLevel>
  <timeZone rdf:datatype="http://www.w3.org/2001/XMLSchema#string">PST</timeZone>
  <isManager rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">false</isManager>
  <isTech rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</isTech>
  <jobTitle rdf:datatype="http://www.w3.org/2001/XMLSchema#string">SDE</jobTitle>
  <hasManager rdf:resource="#worker5"/>
</Employee>


Are you saying this is not what you're seeing when you run the same script?

Cheers,
  -Matt
Re: Unique constraint error with SQLite

Jiba
Administrator
Hi,

I do not get duplicated triples if the individuals do not exist in the original ontology. However, I do get them if the original ontology already includes these individuals.

I fixed this problem in the development version, could you try again?

Best regards,
Jiba
Re: Unique constraint error with SQLite

mafi
Hey Jiba,

Yep - that fixed it!   Thanks very much for that.  One more question: once I populate my quad-store, what's the usual way of working with it?

My script currently does a:
default_world.set_backend(filename = [path to sqlite db])
onto = get_ontology([ontology file])
onto.load()

I'm thinking that onto.load() blows away the contents of the database each time?   So, I assume to leverage the database I would do:
default_world.set_backend(filename = [path to sqlite db])
onto = default_world.get_ontology()
onto.load()

Is this correct?

Thanks very much again for resolving that triple issue!

Cheers,
 -Matt
Re: Unique constraint error with SQLite

Jiba
Administrator
Hi,

When the quadstore is stored on disk, onto.load() checks whether the ontology has changed and, if so, removes it from the quadstore and reloads it.

So it can blow away the contents you added previously, if the OWL file has been modified.

Best regards,
Jiba
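
The reload-on-change behaviour described above can be sketched with a toy cache. How Owlready actually detects that the OWL file has changed is not stated in this thread, so the modification-time comparison below is an assumption, and CachedStore is a hypothetical class written only for this example.

```python
import os
import tempfile

class CachedStore:
    """Toy persistent store: remembers each file's content and modification time."""
    def __init__(self):
        self.loaded = {}  # path -> (mtime_ns, list of lines)

    def load(self, path):
        mtime = os.stat(path).st_mtime_ns
        cached = self.loaded.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]       # unchanged file: keep stored content, additions included
        with open(path) as f:      # new or modified file: discard and re-read
            lines = f.read().splitlines()
        self.loaded[path] = (mtime, lines)
        return lines

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "onto.txt")
    with open(path, "w") as f:
        f.write("Employee\n")
    store = CachedStore()
    store.load(path).append("worker1")   # content added after loading
    kept = list(store.load(path))        # file untouched: the addition survives
    with open(path, "w") as f:
        f.write("Employee\nManager\n")
    st = os.stat(path)
    # Force a clearly newer modification time, in case the clock is coarse.
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
    reloaded = list(store.load(path))    # file changed: the addition is gone
print(kept, reloaded)
```

As long as the source file is untouched, the added entry survives a second load. Once the file is modified, the stored copy is replaced and the addition is lost, which is the risk described for a modified OWL file.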