Error when importing all terminologies

Kim Tang
Hello Jiba,

I would like to make use of the MedMentions dataset by gathering all synonyms for each CUI in the dataset.
MedMentions was created in 2017, so I am using the umls-2017AA-full release.

Based on the PyMedTermino2 installation instructions, I set up PyMedTermino2 with the terminologies "ICD10", "SNOMEDCT_US", "CUI" as shown in the documentation (https://owlready2.readthedocs.io/en/latest/pymedtermino2.html#installation).

Unfortunately, errors occurred when I tried to query several CUIs.

For instance, the CUI "C0847557" cannot be found; the lookup returns None:
CUI["C0847557"] # None

For a handful of the concepts for which the query failed, I checked which terminologies they belong to:

Terminologies with some concepts for which querying failed:
- ICPC2P (C0847557)
- NCI (C1708520)
- CPM (C1254354)
- CHV · AOD (C0683579, C0680954)
- ICD10AM (C0845989)
- CHV · MDR · MDRSPA · MDRDUT · MDRFRE · MDRGER · MDRITA · MDRJPN (C0858354)
- MEDCIN (C3646020)

My assumption is that only CUIs of concepts occurring in one of the specified terminologies are imported (e.g. if only SNOMED-CT is given as a terminology, then only CUIs linking to concepts in SNOMED-CT are stored and retrievable). This was not clear to me from the documentation, but it would make sense.

That would mean I need to import all terminologies, since I don't know beforehand to which terminology a CUI belongs.
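One possible way to find out which terminologies a set of CUIs belongs to, without importing everything, is to scan the MRCONSO.RRF file of the UMLS release. The following is only a minimal sketch: it assumes MRCONSO.RRF has already been extracted from the release, and it relies on the standard pipe-delimited layout in which the CUI is the first field and the source abbreviation (SAB) is the twelfth. The function name `terminologies_for_cuis` is hypothetical and not part of PyMedTermino2.

```python
from collections import defaultdict

CUI_FIELD = 0   # first MRCONSO.RRF field: concept unique identifier
SAB_FIELD = 11  # twelfth MRCONSO.RRF field: source abbreviation


def terminologies_for_cuis(mrconso_lines, wanted_cuis):
    """Map each wanted CUI to the sorted list of source terminologies (SABs).

    Hypothetical helper: `mrconso_lines` is any iterable of MRCONSO.RRF
    lines, e.g. an open file handle.
    """
    wanted = set(wanted_cuis)
    sources = defaultdict(set)
    for line in mrconso_lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= SAB_FIELD:
            continue  # skip malformed or truncated lines
        cui = fields[CUI_FIELD]
        if cui in wanted:
            sources[cui].add(fields[SAB_FIELD])
    # CUIs that never appear in the file are simply absent from the result
    return {cui: sorted(sabs) for cui, sabs in sources.items()}
```

With an open file handle on MRCONSO.RRF as `mrconso_lines`, the union of the returned abbreviations could then be used to decide which names to pass in the terminologies parameter of import_umls, together with "CUI".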

I went ahead and omitted the terminologies parameter, as the instructions describe for loading all terminologies. Unfortunately, after "import_umls" had run for a while processing the UMLS zip (the created file is 31.9 GB), it failed with the following error:





How can I fix this issue and load PyMedTermino2 properly with all terminologies?

Kind regards
Kim Tang
Re: Error when importing all terminologies

Kim Tang
Okay, so I gathered synonyms for a lot of CUIs and was wondering why fewer error messages (exceptions) occurred than when I gathered the labels for each CUI.

Turns out the CUIs are apparently stored in PyMedTermino2 even if the respective terminology is not imported, but the information seems incomplete.

None of the concepts listed above has a label, but they may have accessible synonyms that can be queried:



These synonyms don't stem from any imported terminology though (ICD10, SNOMED, CUI).
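To make the difference between labels and synonyms visible for many CUIs at once, a small helper along the following lines might be used. This is a sketch under assumptions, not the code from the original post: it assumes `cui_terminology` is the object obtained as `CUI` in the PyMedTermino2 documentation, that a failed lookup returns None as described above, and that concepts expose `label` and `synonyms` as lists. The helper name `collect_names` is hypothetical.

```python
def collect_names(cui_terminology, cui_ids):
    """Return labels and synonyms per CUI, or None if the CUI is not found.

    Hypothetical helper; only uses indexing on the terminology and the
    `label` / `synonyms` attributes of the returned concept.
    """
    names = {}
    for cui_id in cui_ids:
        concept = cui_terminology[cui_id]
        if concept is None:
            names[cui_id] = None  # lookup failed, nothing stored for this CUI
            continue
        # Convert to plain strings so the result is easy to print or serialize
        labels = [str(s) for s in getattr(concept, "label", [])]
        synonyms = [str(s) for s in getattr(concept, "synonyms", [])]
        names[cui_id] = {"labels": labels, "synonyms": synonyms}
    return names
```

Entries with an empty "labels" list but a non-empty "synonyms" list correspond to the situation described here.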

I still don't understand why this is the case, but perhaps it helps when looking into it, in case it turns out to be an error.

Any help is appreciated! :)
Re: Error when importing all terminologies

Jiba
Administrator
Actually, the CUIs imported are those from the imported terminologies plus those from the UMLS semantic types. The CUIs without a label that you observed were extracted from MRSTY and correspond to semantic types.

If you want to know from where a CUI was extracted, you can use the .originals attribute that returns the list of terms in the original terminology. If it is empty, the CUI comes from semantic types only.
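As an illustration of the check described above, the following sketch sorts a list of CUIs into three groups: not found, extracted from semantic types only (empty .originals), and extracted from an imported terminology. It assumes `cui_terminology` is the `CUI` object from the PyMedTermino2 documentation and that a failed lookup returns None; the function name `split_by_origin` is hypothetical.

```python
def split_by_origin(cui_terminology, cui_ids):
    """Group CUIs by where they were extracted from, using `.originals`.

    Hypothetical helper. Returns a tuple of
    (not_found, semantic_type_only, from_terminology).
    """
    not_found = []
    semantic_type_only = []
    from_terminology = {}
    for cui_id in cui_ids:
        concept = cui_terminology[cui_id]
        if concept is None:
            not_found.append(cui_id)
            continue
        originals = list(concept.originals)
        if originals:
            # Terms of the original terminologies this CUI was extracted from
            from_terminology[cui_id] = originals
        else:
            # Empty list: the CUI comes from semantic types only
            semantic_type_only.append(cui_id)
    return not_found, semantic_type_only, from_terminology
```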

Jiba
Re: Error when importing all terminologies

Kim Tang
Hello Jiba,

thank you for the clarification!

Do you also know why the initial error outlined above occurred when I tried to import all terminologies (by omitting the terminologies parameter)?

Kim Tang wrote
I went ahead and omitted the terminologies parameter, as the instructions describe for loading all terminologies. Unfortunately, after "import_umls" had run for a while processing the UMLS zip (the created file is 31.9 GB), it failed with the following error:





How can I fix this issue and load PyMedTermino2 properly with all terminologies?