Transformers: v3.3.0 - Issue with name conflict in transformers & datasets - AttributeError: module 'datasets' has no attribute '__version__'

Created on 29 Sep 2020 · 4 Comments · Source: huggingface/transformers

Version 3.3.0 tries to import the module datasets:
https://github.com/huggingface/transformers/blob/v3.3.0/src/transformers/file_utils.py#L69

However, this causes an import failure if there is a "datasets" folder in the same directory as the script being run.

An example to reproduce the error:

datasets/   <= Folder that contains your own data files
myscript.py

myscript.py with the following content:

import transformers

This produces the following error:

python myscript.py
Traceback (most recent call last):
  File "myscript.py", line 1, in <module>
    import transformers
  File "/home/user/miniconda3/envs/sberttest/lib/python3.7/site-packages/transformers/__init__.py", line 22, in <module>
    from .integrations import (  # isort:skip
  File "/home/user/miniconda3/envs/sberttest/lib/python3.7/site-packages/transformers/integrations.py", line 42, in <module>
    from .trainer_utils import PREFIX_CHECKPOINT_DIR, BestRun  # isort:skip
  File "/home/user/miniconda3/envs/sberttest/lib/python3.7/site-packages/transformers/trainer_utils.py", line 6, in <module>
    from .file_utils import is_tf_available, is_torch_available, is_torch_tpu_available
  File "/home/user/miniconda3/envs/sberttest/lib/python3.7/site-packages/transformers/file_utils.py", line 72, in <module>
    logger.debug(f"Succesfully imported datasets version {datasets.__version__}")
AttributeError: module 'datasets' has no attribute '__version__'

The cause is Python's import logic. The script's directory is on the import path, so the local datasets folder is treated as a module, and transformers imports it as if it were the datasets package. The folder has no __version__ attribute, so the access on line 72 of file_utils.py raises the AttributeError.
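
To illustrate the import behavior described above, here is a minimal sketch that can be run without transformers installed. It creates an empty folder in a temporary directory, puts that directory at the front of sys.path (imitating what happens when you run a script from it), and imports the folder by name. The folder name my_local_datasets_demo is made up for this example so that it cannot clash with an installed package.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical folder name, chosen so it cannot clash with an installed package.
FOLDER_NAME = "my_local_datasets_demo"

with tempfile.TemporaryDirectory() as workdir:
    # A plain data folder: no __init__.py and no Python code at all.
    os.mkdir(os.path.join(workdir, FOLDER_NAME))

    # Running `python myscript.py` puts the script's directory first on sys.path;
    # inserting workdir at position 0 imitates that.
    sys.path.insert(0, workdir)
    try:
        importlib.invalidate_caches()
        # The import succeeds: Python treats the bare folder as a namespace package.
        module = importlib.import_module(FOLDER_NAME)
        has_version = hasattr(module, "__version__")
        module_file = getattr(module, "__file__", None)
    finally:
        # Clean up so the demo leaves no trace in the interpreter state.
        sys.path.remove(workdir)
        sys.modules.pop(FOLDER_NAME, None)

print(has_version)  # False
print(module_file)  # None
```

Because the import itself does not raise ImportError, an "except ImportError" clause around it is never triggered, and the failure only shows up later when __version__ is accessed.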

Since datasets is a very common folder name for storing one's own data files, I can imagine that this name collision will appear frequently. As soon as there is a datasets folder next to your script, you can no longer import transformers.

Solution

I am not sure what the best solution is for this. One quick fix would be to change:
https://github.com/huggingface/transformers/blob/v3.3.0/src/transformers/file_utils.py#L74

to

except:
    _datasets_available = False

This would catch all exceptions, including the AttributeError above, so older scripts that have a datasets/ folder would keep working.

Environment info

  • transformers version: 3.3.0
  • Platform: Linux-4.15.0-39-generic-x86_64-with-debian-buster-sid
  • Python version: 3.7.9
  • PyTorch version (GPU?): 1.6.0 (False)
  • datasets package is not installed

All 4 comments

Indeed we'll fix this and release a patch soon.

The bug has been fixed in #7456 and v3.3.1 is out with this fix. The problem should be solved for now, let us know if that's not the case!

Great, thanks for the quick fix and release of a new version.

It is now working fine in my case :)

I had the same error, although my setup only included a data/ folder. It now works fine with version 3.3.1.
