After going through the Parallelism page of the documentation and the Flask example, I still have problems with the quadstore: I easily run into the database lock error.
So let me explain what I'm doing: I wrote a small ontology (using the owlready2 classes), stored it in an OWL file and saved the default_world in a sqlite3 file. The generic script I'm using to access that ontology is:
>from owlready2 import *
>my_world = World(filename = "./ontology.sqlite3", exclusive = False)
>onto = my_world.get_ontology("...").load()
>with onto:
>    # For example, an insert or read operation
>    my_world.save()
My main goal is simply to have an .sqlite3 file as the main database for triple storage. More specifically, I want to store a bunch of class instances and their respective relations in one database.
But when I try to use the sqlite3 file in a Django project, or simply have a couple of Jupyter notebooks accessing the same sqlite3 file, I instantly get the database lock error. Based on the documentation, I thought that by simply adding the "exclusive = False" flag I wouldn't have lock issues.
Can someone explain to me what the issue is, or what I am doing wrong? Honestly, I didn't really understand the small Flask example in the documentation, which is why I'm seeking some help to figure out what the issue is.
Administrator
Hi,
I just improved the parallelism system in the development version of Owlready on Bitbucket. It should now support multiprocessing, and also distinct programs sharing the same quadstore (which was not the case previously).
You can get the database lock error if the database is never released. Owlready locks the database when you start a "with onto:" block, and releases it when you call World.save(). So you need to call World.save() regularly.
In addition, it seems that releasing and then relocking the database very quickly may not permit other programs to take the lock. For example:
>with onto:
>    class Drug(Thing): pass
>
>for i in range(10000):
>    with onto:
>        onto.Drug(label = ["xxx"])
>    default_world.save()
This program releases the lock after each creation of an instance, but then relocks the database in the next iteration. This problem can be solved by adding a small sleep after World.save():
>import time
>
>for i in range(10000):
>    with onto:
>        onto.Drug(label = ["xxx"])
>    default_world.save()
>    time.sleep(0.0001)
Jiba
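To see the lock-and-release behaviour described above in isolation, here is a minimal sketch that uses only Python's standard sqlite3 module, not Owlready itself. It assumes that Owlready's lock behaves roughly like an SQLite write transaction (taken at "with onto:", released at World.save()); that mapping is an assumption made for illustration, and the function name demo_lock and the table name quads are hypothetical.

```python
import os
import sqlite3
import tempfile


def demo_lock(release_first: bool) -> str:
    """Return "ok" if a second connection could write, else "locked"."""
    path = os.path.join(tempfile.mkdtemp(), "demo.sqlite3")
    # isolation_level=None means autocommit, so BEGIN/COMMIT are explicit.
    # timeout=0.1 makes the second writer give up quickly instead of waiting.
    writer_a = sqlite3.connect(path, timeout=0.1, isolation_level=None)
    writer_b = sqlite3.connect(path, timeout=0.1, isolation_level=None)
    writer_a.execute("CREATE TABLE quads (s, p, o)")
    try:
        # Writer A takes the write lock (assumed analogue of "with onto:").
        writer_a.execute("BEGIN IMMEDIATE")
        writer_a.execute("INSERT INTO quads VALUES (1, 2, 3)")
        if release_first:
            # Committing releases the lock (assumed analogue of World.save()).
            writer_a.execute("COMMIT")
        try:
            # Writer B now tries to take the write lock itself.
            writer_b.execute("BEGIN IMMEDIATE")
            writer_b.execute("INSERT INTO quads VALUES (4, 5, 6)")
            writer_b.execute("COMMIT")
            return "ok"
        except sqlite3.OperationalError:
            # SQLite reports the conflict as "database is locked".
            return "locked"
    finally:
        writer_a.close()
        writer_b.close()
```

Calling demo_lock(False) leaves the first writer's transaction open, so the second writer fails; demo_lock(True) commits first, so the second writer succeeds. This is why a program that never reaches World.save() can block every other program using the same file.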
Thank you Jiba.
I'm going to install the newest version of the package and try the sleep-after-save approach. I know that I'm repeating myself, but thank you for the update and the explanation.
Update and related issue
First of all, I have to thank you Jiba. With the recent Owlready update and the explanation provided in this thread, I was able to further develop my project, which is almost completed. But now I'm facing a related issue (I'm sorry for the long post, but I think it should explain what I'm doing).
I'm developing a Django project which uses Owlready in two main entry points:
The first entry point is in the models. The idea is: text -> cleaning process -> entity recognition -> relation extraction -> Owlready -> store the updated World and the remaining data in another db.
The second entry point is in the views: simple queries, such as get_properties/get_inverse_properties, to be rendered by the template.
In the models, I was able to implement everything exactly as explained (default_world.set_backend(filename = "path_to_sqlFile.sqlite3", exclusive = False)), with everything inside "with" blocks, saving the world and sleeping. So, for example, with a form I can add a new document and automatically extract and store the respective relations in an Owlready world.
But when I repeat the same process in the views (default_world.set_backend...), I get the following error:
>    if exists: raise ValueError("Cannot save existent quadstore in '%s': File already exists! Use a new filename for saving quadstore or, for opening an already existent quadstore, do not create any triple before calling set_backend() (including creating an empty ontology or loading a module that does so)." % filename)
>ValueError: Cannot save existent quadstore in 'redacted': File already exists! Use a new filename for saving quadstore or, for opening an already existent quadstore, do not create any triple before calling set_backend() (including creating an empty ontology or loading a module that does so).
For testing, I can simply do the following (only in the views; the models still use set_backend):
>my_world = World(filename ="...", exclusive = False)
>onto = my_world.get_ontology("...")
With this I'm able to navigate through the rendered pages, but if I try to add new documents, I get the database lock error. I know that the error is related to non-stored triples, but I make sure that the last step in the models is to run the reasoner, to derive classes according to object/data properties, and to save the world.
Owlready is imported in both scripts as follows:
>import time
>from owlready2 import *
>onto_path.append("path to main dir")
>default_world.set_backend(filename ="main_world.sqlite3", exclusive = False)
>onto = default_world.get_ontology("base_iri")
>with onto:
>    onto.load()
>default_world.save()
>time.sleep(0.0001)
So I believe that I'm not adding new triples before loading the ontology.
Again, I'm sorry for the extremely long post, but I'm seeking some help to figure out what the issue might be, because I'm pretty sure that I'm storing the triples correctly before loading the ontology.
Administrator
Hi,
The ValueError "Cannot save existent quadstore..." means that some triples are already asserted in the previous (in-memory) quadstore. Beware that just creating an empty ontology already creates 1 RDF triple (stating that the ontology IRI is an OWL ontology). I suggest you use the following command to print the quadstore content just before calling set_backend(), to help you figure out which triples are asserted:
>default_world.graph.dump()
The database lock error means that another process or program modified the database and has not called World.save() yet.
If you are using Owlready in a server setting, you should also consider the option of a single-process server: it is easier and, in some situations, the time gained with multiple processors is actually smaller than the time spent in synchronization.
Jiba
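One way to picture the single-process option is to funnel every quadstore operation through one dedicated worker thread, so that no two callers ever touch the store at the same time. The sketch below is only an illustration using the standard library: the class name SingleWriter and its submit method are hypothetical, nothing here is part of Owlready, and in a real server each submitted job would contain the "with onto:" block followed by World.save().

```python
import queue
import threading


class SingleWriter:
    """Run every submitted job on one worker thread, in submission order."""

    def __init__(self):
        self._jobs = queue.Queue()
        # Daemon thread: it will not keep the process alive at shutdown.
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            job, done, box = self._jobs.get()
            try:
                box["result"] = job()
            except Exception as exc:
                # Hand the exception back to the caller instead of
                # killing the worker thread.
                box["error"] = exc
            finally:
                done.set()

    def submit(self, job):
        """Queue a zero-argument callable and block until it has run."""
        done = threading.Event()
        box = {}
        self._jobs.put((job, done, box))
        done.wait()
        if "error" in box:
            raise box["error"]
        return box["result"]
```

Request handlers would then call something like writer.submit(add_document_job) rather than opening the quadstore themselves. The trade-off is the one mentioned above: requests are serialized, but no cross-process synchronization is needed.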