
conda install transformers (not working) behaving differently from pip install transformers (working) for CentOS 7.9 #11003

Closed
harrisonbay opened this issue Mar 31, 2021 · 3 comments

harrisonbay commented Mar 31, 2021

In a fresh environment, running conda install pytorch torchvision torchaudio -c pytorch followed by conda install transformers produces a GLIBC_2.18 "not found" error on CentOS 7.9 upon import with python -c "from transformers import AutoTokenizer". I suspect this is similar to #2980, i.e., CentOS 7.9 might simply be incompatible. However, in a different fresh environment, running pip install torch torchvision torchaudio followed by pip install transformers produces no error on the same import.

Environment info (pip-installed)

  • transformers version: 4.4.2
  • Platform: Linux-4.19.182-1.el7.retpoline.x86_64-x86_64-with-glibc2.10
  • Python version: 3.8.8
  • PyTorch version (GPU?): 1.8.1+cu102 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Using GPU in script?: N/A
  • Using distributed or parallel set-up in script?:

Environment info (conda-installed)

In the conda-installed environment, the transformers-cli env command itself fails, so no environment info is available. See attached cli_error_trace.txt.

Who can help

I'm not sure I tagged the right person, since this seems to be a lower-level issue rather than an implementation issue.

  • huggingface/transformers/blob/master/src/transformers/models/auto/tokenization_auto.py @LysandreJik

Information

Model I am using (Bert, XLNet ...):

N/A

To reproduce

This is all done on CentOS 7.9.

Steps to reproduce the good, pip-installed behavior:
  1. conda create --name test python=3.8
  2. conda activate test
  3. pip install torch torchvision torchaudio
  4. pip install transformers
  5. python -c "from transformers import AutoTokenizer"

Steps to reproduce the bad, conda-installed behavior:
  1. conda create --name test2 python=3.8
  2. conda activate test2
  3. conda install pytorch torchvision torchaudio -c pytorch
  4. conda install -c huggingface transformers
  5. python -c "from transformers import AutoTokenizer"
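
Step 5 in both lists can also be run as a small script that reports why the import failed, rather than only printing a traceback. The following is a minimal sketch, not part of the original report: the helper names required_glibc and check_import are hypothetical, and the regular expression assumes the loader message has the same shape as the one in import_error_trace.txt (version `GLIBC_x.y' not found).

```python
import re
from typing import Optional

# Matches the dynamic loader message seen in the import trace, e.g.
#   version `GLIBC_2.18' not found
# (assumption: other glibc versions are reported in the same format)
GLIBC_PATTERN = re.compile(r"version `GLIBC_(\d+(?:\.\d+)*)' not found")


def required_glibc(error_message: str) -> Optional[str]:
    """Return the glibc version named in a 'GLIBC_x.y not found' message, else None."""
    match = GLIBC_PATTERN.search(error_message)
    return match.group(1) if match else None


def check_import() -> str:
    """Attempt the import from step 5 and summarize the outcome as a short string."""
    try:
        from transformers import AutoTokenizer  # noqa: F401
    except ImportError as exc:
        version = required_glibc(str(exc))
        if version is not None:
            return f"import failed: a compiled extension needs glibc >= {version}"
        return f"import failed for another reason: {exc}"
    return "import succeeded"


if __name__ == "__main__":
    print(check_import())
```

Running this in each of the two environments (test and test2) should make it easier to tell a glibc mismatch apart from a missing package or some other import problem.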

Additionally, I have attached the environment.yml files for both environments, along with the trace for the transformers-cli env command and the trace for the import error (both from the conda-installed environment). The two traces look very similar, and the issue appears to lie in the dependencies of tokenizers. The .yml files carry an extra .txt extension because GitHub apparently does not accept .yml uploads.

environment_pip.yml.txt
environment_conda.yml.txt
cli_error_trace.txt
import_error_trace.txt

Expected behavior

I would expect both conda install and pip install to produce a working installation.

@LysandreJik
Member

Hello! From what I'm seeing, the error comes from the tokenizers library instead:

[...]
  File "/homes/gws/hcybay/miniconda3/envs/test2/lib/python3.8/site-packages/transformers-4.4.2-py3.8.egg/transformers/tokenization_utils_fast.py", line 25, in <module>
  File "/homes/gws/hcybay/miniconda3/envs/test2/lib/python3.8/site-packages/tokenizers/__init__.py", line 79, in <module>
    from .tokenizers import (
ImportError: /lib64/libc.so.6: version `GLIBC_2.18' not found (required by /homes/gws/hcybay/miniconda3/envs/test2/lib/python3.8/site-packages/tokenizers/tokenizers.cpython-38-x86_64-linux-gnu.so)

Do you mind opening an issue there? They'll probably be able to help out better.
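
For anyone checking their own machine: the ImportError above says the compiled tokenizers extension requires the GLIBC_2.18 symbol version, while CentOS 7 is generally documented as shipping glibc 2.17. The sketch below compares the running system's glibc against a required version. The function names are hypothetical, and it assumes a glibc-based Linux where os.confstr("CS_GNU_LIBC_VERSION") is available; on other platforms runtime_glibc returns None.

```python
import os
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.17' or 'glibc 2.17' into the tuple (2, 17)."""
    number = text.strip().split()[-1]
    return tuple(int(part) for part in number.split("."))


def runtime_glibc() -> Optional[str]:
    """Return the running system's glibc version (e.g. '2.17'), or None if unknown."""
    try:
        value = os.confstr("CS_GNU_LIBC_VERSION")  # typically 'glibc X.Y'
    except (AttributeError, ValueError, OSError):
        # Not a glibc-based system, or confstr is not supported here.
        return None
    return value.split()[-1] if value else None


def satisfies(required: str, available: str) -> bool:
    """True if the available glibc version is at least the required one."""
    return parse_version(available) >= parse_version(required)


if __name__ == "__main__":
    found = runtime_glibc()
    if found is None:
        print("could not determine the glibc version on this platform")
    else:
        print(f"glibc {found}; satisfies 2.18: {satisfies('2.18', found)}")
```

If this reports a version below 2.18, a binary built against GLIBC_2.18 symbols would be expected to fail at import time in the way shown in the trace.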

@harrisonbay
Author

Sure, and sorry, I didn't know which repository to open it in.

@harrisonbay
Author

harrisonbay commented Mar 31, 2021

Looks like I definitely should've searched the issues there first... huggingface/tokenizers#585
