No module named pymedtermino2 and destroy RDF triples

Hamico
Hi Jiba,
How can I export the UMLS archive to sqlite3? I get the error “No module named ‘owlready2.pymedtermino2’”. I also want to delete a triple (subject, predicate, object) with a SPARQL DELETE query, but I get an error with version 0.16. Could you write an example of code for this?
Many thanks.
Hamico

Re: No module named pymedtermino2 and destroy RDF triples

Jiba
Administrator
Hi,

I forgot to install the pymedtermino2 module in setup.py. I fixed setup.py in the development version, so you should now be able to import pymedtermino2.

Here is an example of a SPARQL DELETE command:

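    from owlready2 import *

    # self.new_world() is presumably a test-suite helper; a standalone script can create a World directly.
    world = World()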
    o = world.get_ontology("http://www.semanticweb.org/onto.owl")
    g = world.as_rdflib_graph()
    g.bind("onto", "http://www.semanticweb.org/onto.owl#")

    r = g.update("""
    DELETE {
    onto:ma_pizza
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
    onto:Pizza .
    } WHERE {}""")

Jiba

Re: No module named pymedtermino2 and destroy RDF triples

Hamico
Many thanks.

Hamico

Re: No module named pymedtermino2 and destroy RDF triples

Hamico
Hi Jiba,
I tried again to run the Python script:

from owlready2 import *
from owlready2.pymedtermino2 import *
from owlready2.pymedtermino2.umls import *
default_world.set_backend(filename = "pym.sqlite3")
import_umls("umls-2018AB-full.zip", terminologies = ["ICD10", "SNOMEDCT_US", "CUI"])
default_world.save()

I get another error:

* Owlready2 * Warning: optimized Cython parser module 'owlready2_optimized' is not available, defaulting to slower Python implementation
Importing UMLS from 2018AB-full/2018ab-1-meta.nlm...
Traceback (most recent call last):
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\zipfile.py", line 1171, in _RealGetContents
    endrec = _EndRecData(fp)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\zipfile.py", line 241, in _EndRecData
    fpin.seek(0, 2)
io.UnsupportedOperation: seek

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "Script5.py", line 5, in <module>
    import_umls("umls-2018AB-full.zip", terminologies = ["ICD10", "SNOMEDCT_US", "CUI"])
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\owlready2-0.17-py3.6.egg\owlready2\pymedtermino2\umls.py", line 666, in import_umls
    with zipfile.ZipFile(umls_zip.open(filename), "r") as umls_inner_zip:
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\zipfile.py", line 1108, in __init__
    self._RealGetContents()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\zipfile.py", line 1173, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file



How can I resolve this?
Many thanks.

Hamico

Re: No module named pymedtermino2 and destroy RDF triples

Jiba
Administrator
Hi,

It seems that the Zip file cannot be opened. UMLS is quite complex because it is a Zip file that contains other compressed files.

Can you try again after unpacking the UMLS Zip file? Unpack it and pass the UMLS directory instead of the Zip filename (this is also accepted by PyMedTermino2).

Jiba
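
The traceback above fails inside `zipfile.ZipFile(umls_zip.open(filename), "r")` with `io.UnsupportedOperation: seek`: on Python 3.6 the stream returned by `ZipFile.open()` is not seekable, and `zipfile` needs to seek to find the archive directory. As a minimal sketch of one workaround, independent of PyMedTermino2, the inner archive can be read into memory first. `open_inner_zip`, `build_demo_archive`, `list_inner_files` and the file name `demo-meta.nlm` are hypothetical names for illustration, and buffering in memory is probably impractical for archives as large as the real UMLS ones, in which case unpacking to disk as suggested above is the safer route.

```python
import io
import zipfile


def open_inner_zip(outer_zip, inner_name):
    """Open a Zip archive stored inside another Zip archive.

    The inner archive is read fully into memory so that zipfile gets a
    seekable file object. Hypothetical helper, not part of PyMedTermino2.
    """
    data = outer_zip.read(inner_name)
    return zipfile.ZipFile(io.BytesIO(data), "r")


def build_demo_archive():
    """Build a tiny nested archive in memory, standing in for the UMLS Zip."""
    inner_buffer = io.BytesIO()
    with zipfile.ZipFile(inner_buffer, "w") as inner:
        inner.writestr("META/MRRANK.RRF", "dummy|row|\n")
    outer_buffer = io.BytesIO()
    with zipfile.ZipFile(outer_buffer, "w") as outer:
        outer.writestr("demo-meta.nlm", inner_buffer.getvalue())
    outer_buffer.seek(0)
    return outer_buffer


def list_inner_files(outer_file, inner_name):
    """Return the member names of the archive nested under inner_name."""
    with zipfile.ZipFile(outer_file, "r") as outer:
        with open_inner_zip(outer, inner_name) as inner:
            return inner.namelist()
```

For example, `list_inner_files(build_demo_archive(), "demo-meta.nlm")` lists the members of the nested demo archive.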

Re: No module named pymedtermino2 and destroy RDF triples

Hamico
Hi Jiba,
I get another error at line 661 of the code:
with zipfile.ZipFile(umls_zip_filename, "r") as umls_zip
The error is:
PermissionError: [Errno 13] Permission denied: '2018AB-full'
Is there another solution to the problem?
Many thanks.
Hamico

Re: No module named pymedtermino2 and destroy RDF triples

Jiba
Administrator
Hi,

This looks like a permission problem in your filesystem. Are you sure you can access the contents of the 2018AB-full directory?

Jiba
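
Another possible explanation, not confirmed at this point in the thread, is that the directory itself is being opened as if it were a Zip file. Opening a directory as a regular file typically raises `PermissionError` on Windows and `IsADirectoryError` on Linux, whatever the directory's permissions are. A minimal sketch that reproduces this with the standard library only (`error_when_opening_directory_as_zip` is a hypothetical helper):

```python
import tempfile
import zipfile


def error_when_opening_directory_as_zip(path):
    """Return the name of the OSError raised when path is opened as a Zip file.

    Returns None if the path could be opened as a Zip archive after all.
    """
    try:
        with zipfile.ZipFile(path, "r"):
            return None
    except OSError as error:
        return type(error).__name__


# Try it on a freshly created, fully accessible directory.
with tempfile.TemporaryDirectory() as directory:
    print(error_when_opening_directory_as_zip(directory))
```

If this prints an error name for a directory you own, the message says more about how the path is being opened than about filesystem permissions.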

Re: No module named pymedtermino2 and destroy RDF triples

Hamico
Hi Jiba,
I tried running the Python script as root and changing the permissions of the directory. Now I get the following error:
IsADirectoryError: [Errno 21] Is a directory: '2018AB-full'
at line 661 of umls.py and at line 1009 of __init__

What should I do?

Many thanks.
Hamico

Re: No module named pymedtermino2 and destroy RDF triples

Jiba
Administrator
Hi,

You need to put a "/" at the end of the path, so that PyMedTermino understands that you are using a directory instead of a Zip file, e.g. "2018AB-full/".

Jiba
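
Before calling `import_umls()`, a script can check for itself whether a path is a directory or a Zip archive, without depending on a trailing "/". The following is a minimal sketch using only the standard library; `classify_umls_source` is a hypothetical helper and says nothing about how PyMedTermino2 performs its own detection.

```python
import os
import zipfile


def classify_umls_source(path):
    """Return "directory", "zip" or "unknown" for the given path."""
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path) and zipfile.is_zipfile(path):
        return "zip"
    return "unknown"


if __name__ == "__main__":
    # "2018AB-full" is the directory name used in this thread.
    print(classify_umls_source("2018AB-full"))
```

`os.path.isdir()` gives the same answer with or without the trailing separator, so the check behaves the same way on Windows and Linux.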

Re: No module named pymedtermino2 and destroy RDF triples

Hamico
Hi Jiba,
It works. The only problem is that the resulting sqlite3 file is only 100 KB and the Python script runs for about 2 seconds.

What can you tell me about this?
Many thanks.
Hamico

Re: No module named pymedtermino2 and destroy RDF triples

Jiba
Administrator
Hi,

It seems that the extraction process ran, but nothing was extracted...

Do you see the following lines in the terminal? It should display the names of the files parsed (MRRANK.RRF, etc.).


Importing UMLS from /home/jiba/telechargements/base_med/2018AB-full/2018AB/META/...
  Parsing MRRANK.RRF as MRRANK...
  Parsing MRCONSO.RRF as MRCONSO...
  Parsing MRDEF.RRF as MRDEF...
  Parsing MRREL.RRF as MRREL...
  Parsing MRSAT.RRF as MRSAT...
Breaking ORIG cycles...
    SNOMEDCT_US : 0 cycles found:
    ICD10 : 0 cycles found:
    SRC : 0 cycles found:
Finalizing only properties and restrictions...
Finalizing CUI - ORIG mapping...
Indexing...
FTS Indexing...
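
Besides reading the log, the result can be checked by counting the rows in each table of the SQLite file. The sketch below uses only the standard `sqlite3` module and makes no assumption about the table names Owlready2 uses; `table_row_counts` is a hypothetical helper, and `pym.sqlite3` is the filename from the script earlier in the thread. Note that `sqlite3.connect()` creates the file if it does not exist, so it should be pointed at an existing quadstore.

```python
import sqlite3


def table_row_counts(filename):
    """Return {table_name: row_count} for every table in a SQLite file."""
    connection = sqlite3.connect(filename)
    try:
        names = [
            row[0]
            for row in connection.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier so unusual table names are handled safely.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = connection.execute(
                "SELECT COUNT(*) FROM " + quoted
            ).fetchone()[0]
        return counts
    finally:
        connection.close()


if __name__ == "__main__":
    print(table_row_counts("pym.sqlite3"))
```

Tables that are all empty or nearly empty after the import would be consistent with no terminology data having been loaded.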

Re: No module named pymedtermino2 and destroy RDF triples

Hamico
Hi Jiba,
My output is the following:

* Owlready2 * Warning: optimized Cython parser module 'owlready2_optimized' is not available, defaulting to slower Python implementation
Importing UMLS from 2018AB-full/...
Breaking ORIG cycles...
Finalizing only properties and restrictions...
Finalizing CUI - ORIG mapping...
Indexing...
FTS Indexing...

What do you think about this?

Many thanks.
Hamico

Re: No module named pymedtermino2 and destroy RDF triples

Jiba
Administrator
Hi,

It means that PyMedTermino found no *.RRF files, and thus it imported no data.

Could you:

1) verify that you have the following files in the UMLS directory (the one passed to import_umls()): MRRANK.RRF, MRCONSO.RRF

2) retry with the updated development version, after removing the "/" at the end of the path? (I have improved the directory detection, because I'm not sure that adding a "/" at the end is a good option under Windows.)

Jiba
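
The first check can be automated. The sketch below walks a directory tree and reports which sub-directories contain both MRRANK.RRF and MRCONSO.RRF. As an assumption for illustration, it also accepts names with an extra suffix such as MRCONSO.RRF.aa. `has_required_rrf` and `find_rrf_directories` are hypothetical helpers, not part of PyMedTermino2.

```python
import os

REQUIRED = ("MRRANK.RRF", "MRCONSO.RRF")


def has_required_rrf(filenames):
    """Return True if every required RRF file is present in filenames.

    A file counts if its name is exactly the required name, or the required
    name followed by "." and a suffix (e.g. "MRCONSO.RRF.aa").
    """
    return all(
        any(name == req or name.startswith(req + ".") for name in filenames)
        for req in REQUIRED
    )


def find_rrf_directories(root):
    """Return the sorted list of directories under root holding the RRF files."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if has_required_rrf(filenames):
            matches.append(dirpath)
    return sorted(matches)


if __name__ == "__main__":
    # "2018AB-full" is the directory name used in this thread.
    print(find_rrf_directories("2018AB-full"))
```

An empty list means no directory under the given root holds the files, so the path passed to `import_umls()` needs to point elsewhere or the inner archives still need unpacking.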

Re: No module named pymedtermino2 and destroy RDF triples

Hamico
Hi Jiba,
I have written an email to you.

Best regards,
Hamico

Re: No module named pymedtermino2 and destroy RDF triples

Jiba
Administrator
Hi,

I think the *.RRF files are in an inner compressed directory in the UMLS Zip file (probably 2018ab-1-meta.nlm and 2018ab-2-meta.nlm). Could you try to uncompress them? You should obtain a 2018AB/META/ directory with these files.

Jiba

Re: No module named pymedtermino2 and destroy RDF triples

Hamico
Hi Jean-Baptiste,
I have unpacked the first .nlm Zip file and found the *.RRF files. The output is the same. What should I do?

Many thanks.
Best regards,
Hamico

Re: No module named pymedtermino2 and destroy RDF triples

Jiba
Administrator
Hi,

I've just added additional prints in the development version; could you try it and post the output here?

Jiba

Re: No module named pymedtermino2 and destroy RDF triples

Hamico
Hi Jiba,
My output with the new development version is the following:

* Owlready2 * Warning: optimized Cython parser module 'owlready2_optimized' is not available, defaulting to slower Python implementation
Importing UMLS from directory 2018AB-full...
Files found in this directory: 2018AB, 2018AB.CHK, 2018AB.MD5, 2018ab-1-meta.nlm, 2018ab-2-meta.nlm, 2018ab-otherks.nlm, Copyright_Notice.txt, README.txt, mmsys.zip
WARNING: file MRCONSO.RRF not found!
Breaking ORIG cycles...
Finalizing only properties and restrictions...
Finalizing CUI - ORIG mapping...
Indexing...
FTS Indexing...

In the META directory I have the zip files *.RRF.aa and *.RRF.ab. What should I do?

Many thanks.
Best regards,
Hamico

Re: No module named pymedtermino2 and destroy RDF triples

Jiba
Administrator
Hi,

The *.RRF.aa, *.RRF.ab, etc. files are the RRF files (they are split into several files because some Windows file systems do not support files larger than 4 GB). PyMedTermino2 can read these files "as is".

You should pass to import_umls() the directory that contains these files.

Jiba
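
PyMedTermino2 reads the split files itself, so nothing more is needed for the import. For inspecting the pieces by hand, here is a minimal sketch that iterates over the lines of `NAME.RRF.aa`, `NAME.RRF.ab`, etc. as if they were a single file. It assumes the parts are plain text encoded in UTF-8 and may be cut at arbitrary byte positions; `iter_split_lines` is a hypothetical helper.

```python
import glob
import os


def iter_split_lines(directory, base_name, encoding="utf-8"):
    """Yield text lines from base_name, or from its parts base_name.aa, .ab, ...

    Parts are read in sorted order. A line that straddles two parts is
    joined before being yielded.
    """
    single = os.path.join(directory, base_name)
    if os.path.isfile(single):
        parts = [single]
    else:
        pattern = os.path.join(glob.escape(directory), base_name + ".*")
        parts = sorted(glob.glob(pattern))
    pending = b""
    for part in parts:
        with open(part, "rb") as stream:
            for raw in stream:
                raw = pending + raw
                if raw.endswith(b"\n"):
                    pending = b""
                    yield raw.decode(encoding).rstrip("\r\n")
                else:
                    # Incomplete line: it continues in the next part.
                    pending = raw
    if pending:
        yield pending.decode(encoding)
```

For example, `next(iter_split_lines("2018AB-full/2018AB/META", "MRCONSO.RRF"))` would show the first record, where the path is an assumption based on the directory names mentioned in this thread.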

Re: No module named pymedtermino2 and destroy RDF triples

Hamico
Hi Jiba,
I have written an email to you.

Best regards,
Hamico