Slow on search_one with properties

Bart
I find searching for instances using properties and the class is very slow, and gets slower as the ontology grows.

I'm using a random ID for each instance I insert, so the label for an instance is something like "<class>_<random_int>" (it's a security requirement).

This means that when a new instance is being processed, I don't have its ID (it was randomly generated the first time this instance was encountered). I do, however, have some attributes. For example, if I have the "name" value, I can create a new instance and later look it up using that "name" value.

# Create a new instance.
import numpy
rand_id = str(numpy.random.randint(10000))
instance = MyClass(rand_id)
instance.hasName.append("My unique name")
print(f"ID = {instance}")
>> ID = 8567

# Some time later, I can load the instance by the name given.
properties = {'hasName': "My unique name"}
instance = namespace.search_one(is_a=MyClass, **properties)
print(f"ID = {instance}")
>> ID = 8567


The problem is that this search by properties is very slow on a large ontology.
Is there a way to index by properties, or otherwise make this faster?

Thanks

Re: Slow on search_one with properties

Jiba
Administrator
Hi,

Here are some suggestions for improving performance:

 * Use "type=MyClass" instead of "is_a=MyClass". is_a is actually a shortcut for both RDF type and RDFS subClassOf, but if you have only instances (and no subclasses), type will be faster.

 * If the name is unique across all individuals, whatever their class, you may obtain faster results with namespace.search_one(**properties), i.e. without checking the class.

 * You may also try to use the Owlready native SPARQL engine, with default_world.sparql(), which is sometimes more optimized than search().

Jiba