Version 3.3.0 tries to import the module datasets:
https://github.com/huggingface/transformers/blob/v3.3.0/src/transformers/file_utils.py#L69
However, this can cause undesirable behavior if there is a "datasets" folder in the same folder as your script.
An example to reproduce the error:
datasets/ <= Folder that contains your own data files
myscript.py
myscript.py with the following content:
import transformers
This produces the following error:
python myscript.py
Traceback (most recent call last):
File "myscript.py", line 1, in <module>
import transformers
File "/home/user/miniconda3/envs/sberttest/lib/python3.7/site-packages/transformers/__init__.py", line 22, in <module>
from .integrations import ( # isort:skip
File "/home/user/miniconda3/envs/sberttest/lib/python3.7/site-packages/transformers/integrations.py", line 42, in <module>
from .trainer_utils import PREFIX_CHECKPOINT_DIR, BestRun # isort:skip
File "/home/user/miniconda3/envs/sberttest/lib/python3.7/site-packages/transformers/trainer_utils.py", line 6, in <module>
from .file_utils import is_tf_available, is_torch_available, is_torch_tpu_available
File "/home/user/miniconda3/envs/sberttest/lib/python3.7/site-packages/transformers/file_utils.py", line 72, in <module>
logger.debug(f"Succesfully imported datasets version {datasets.__version__}")
AttributeError: module 'datasets' has no attribute '__version__'
The issue lies in Python's import logic. The datasets folder is treated as a module, and transformers tries to load it. The import itself succeeds, but what gets loaded is the local datasets folder rather than the datasets package, so the access to datasets.__version__ fails.
As datasets is quite a common folder name for holding your own data files, I can imagine that this name collision will appear frequently. As soon as there is a datasets folder, you can no longer import transformers.
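To illustrate the mechanism, here is a minimal sketch that does not touch transformers at all. It assumes no installed package shares the folder's name (the folder name my_local_data_folder is hypothetical, picked to avoid any clash). A plain folder without an __init__.py is then importable as a namespace package, which carries no __version__ attribute:

```python
import importlib
import os
import sys
import tempfile

# Hypothetical folder name, chosen so it cannot clash with an installed package.
FOLDER_NAME = "my_local_data_folder"

with tempfile.TemporaryDirectory() as workdir:
    # An empty folder, standing in for a local "datasets/" data directory.
    os.mkdir(os.path.join(workdir, FOLDER_NAME))
    # Put the parent directory on the import path, as happens for a script's folder.
    sys.path.insert(0, workdir)
    try:
        importlib.invalidate_caches()
        # The import succeeds: the folder is picked up as a namespace package.
        module = importlib.import_module(FOLDER_NAME)
        has_version = hasattr(module, "__version__")
        module_file = getattr(module, "__file__", None)
    finally:
        # Clean up so the demo leaves the interpreter state untouched.
        sys.path.remove(workdir)
        sys.modules.pop(FOLDER_NAME, None)

print("has __version__:", has_version)
print("__file__:", module_file)
```

Because the import raises no ImportError, a guard that only catches ImportError never triggers, and the later attribute access is what fails.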
I am not sure what the best solution is for this. One quick fix would be to change:
https://github.com/huggingface/transformers/blob/v3.3.0/src/transformers/file_utils.py#L74
to
except:
    _datasets_available = False
This would catch all exceptions, so old scripts that have a datasets/ folder would still work.
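As an alternative to a bare except, a narrower guard could catch only the two failures seen here. The following is a sketch under assumptions, not the fix that was merged: the helper name is_package_usable is hypothetical, and it treats a missing __version__ as a sign that something other than the real package was imported.

```python
import importlib


def is_package_usable(name: str) -> bool:
    """Return True only if `name` imports and exposes a __version__ attribute.

    Hypothetical helper for illustration; not part of transformers.
    """
    try:
        module = importlib.import_module(name)
        # A local data folder imported as a namespace package has no __version__,
        # so this attribute access raises AttributeError in that case.
        _ = module.__version__
    except (ImportError, AttributeError):
        return False
    return True


# Hypothetical package name that is assumed not to be installed.
print(is_package_usable("surely_not_installed_pkg_xyz"))
```

Catching only ImportError and AttributeError keeps unrelated errors visible, at the cost of missing other ways a shadowing module could misbehave.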
transformers version: 3.3.0
Indeed, we'll fix this and release a patch soon.
The bug has been fixed in #7456 and v3.3.1 is out with this fix. The problem should be solved for now, let us know if that's not the case!
Great, thanks for the quick fix and release of a new version.
It is now working fine in my case :)
I had the same error, although my setup only included a data/ folder. It now works fine with version 3.3.1.