UMLS.init_from_nlm_zip can't decode charmap #8
Comments
Hi @elsirdavid
1. Can you confirm that the UMLS zip file isn't corrupted? Test this via the command line
2. Try creating a new conda env using the
Hi @jason-fries (and Happy New Year!!) Thank you for your help.
I couldn't use the md5 command from the command line. I did use the suggested checksum and used other code to get an MD5 hash of the file. The checksum was added inline; the hash is below the list of Python libraries in the environment. The UMLS code seems to have a problem with the declaration of the 'release' variable.
1_Installing_the_UMLS_md5_checksum.pdf
For the creation of a new environment, I used the 'requirements.txt' file as directed by the README. This manages to install some libraries but crashes when collecting scipy (error in preparing metadata regarding pyproject.toml). I installed msgpack and pandas by hand. The results were the same and are below:
Hi @elsirdavid Two issues:
(1) For your MD5 hash check, your provided code
generates a hash of the string literal, not the contents of the UMLS zip file. You'll want to use
to generate a hash of the contents of the zip file. The above code snippet should return
(2) Trove is only tested with Python 3.7.x. From your PDF it looks like your environment is
On my machine installing from the latest trove
Also make certain to wipe your temp directory (
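The original snippets from this comment did not survive in the thread, so the following is only a minimal sketch of the distinction made in point (1), using Python's standard `hashlib`. The helper names `md5_of_file` and `md5_of_string` are hypothetical, and the zip file is assumed to sit in the current working directory. Hashing a string hashes its characters (for example the file name), while hashing the file requires reading its bytes.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file's contents, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so a multi-gigabyte zip is never held in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_of_string(text):
    """Return the hex MD5 digest of the string itself, not of any file it names."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Assumption: the zip from the issue is in the current directory.
    zip_path = Path("umls-2020AB-metathesaurus.zip")
    print("hash of the file name:", md5_of_string(zip_path.name))
    if zip_path.exists():
        print("hash of the file contents:", md5_of_file(zip_path))
```

Only the second value is comparable to a published checksum; the first changes whenever the file is renamed and says nothing about corruption.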
Could you point me to that
Also, thanks for fixing my hash code. The zip file is indeed not corrupted; I do get the right hash, thankfully.
Thank you for the idea of changing branches. I have now tried to use the relevant yml file. The creation fails with the output in the included txt file. I am going to try to install the relevant libraries and Python version by hand.
I ended up installing Python 3.7, msgpack and pandas as the yml file directed, and the resulting notebook is here:
Describe the bug
I can't install the UMLS as directed by the tutorial notebooks. The UMLS object can't be initialized.
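The traceback is not included in this report, so the cause below is an assumption rather than a diagnosis. A "charmap" decode error in Python typically appears when a UTF-8 text file is read with a legacy single-byte encoding such as cp1252, which is a common default for `open()` on Windows. The sketch below reproduces that class of error on a small temporary file; `SAMPLE`, `write_sample` and `try_read` are hypothetical names and none of this is Trove code.

```python
import os
import tempfile

# "\u00c1" (A with acute accent) is stored in UTF-8 as bytes C3 81.
# Byte 0x81 has no mapping in cp1252, so decoding with cp1252 fails.
SAMPLE = "\u00c1cido"


def write_sample(path):
    """Write the sample text to disk as UTF-8."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(SAMPLE)


def try_read(path, encoding):
    """Return (True, text) on success or (False, error message) on a decode failure."""
    try:
        with open(path, "r", encoding=encoding) as fh:
            return True, fh.read()
    except UnicodeDecodeError as err:
        return False, str(err)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")
        write_sample(path)
        print("cp1252:", try_read(path, "cp1252"))
        # ascii() keeps the output printable on consoles that cannot show the accent.
        print("utf-8: ", ascii(try_read(path, "utf-8")))
```

If this is what happens here, passing `encoding="utf-8"` explicitly wherever the extracted text files are opened would be the usual remedy, but that remains to be confirmed against the actual traceback.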
Steps to reproduce the bug
I downloaded the relevant zip file from the provided link (https://download.nlm.nih.gov/umls/kss/2020AB/umls-2020AB-metathesaurus.zip) and placed the file in the same directory as the 1_Installing_the_UMLS.ipynb notebook in the tutorials folder. Then I ran the notebook as provided in the GitHub repository.
Sample code to reproduce the bug
Expected results
A clear and concise description of the expected results.
Actual results
Specify the actual results or traceback.
The libraries and python version are all on the pdf attached
Environment info
datasets version: 1_Installing_the_UMLS_with_decode_error.pdf