Describe the bug
Simply running FlairEmbeddings('news-forward') raises a ValueError in Python 3.6.10 (Conda), which is the Python environment included in the most recent NVIDIA PyTorch Docker container. (https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel_20-06.html#rel_20-06)
Here is the error message:
Traceback (most recent call last):
File "fineTune_langModel.py", line 10, in <module>
language_model = FlairEmbeddings('news-forward').lm
File "/opt/conda/lib/python3.6/site-packages/flair/embeddings/token.py", line 578, in __init__
self.lm: LanguageModel = LanguageModel.load_language_model(model)
File "/opt/conda/lib/python3.6/site-packages/flair/models/language_model.py", line 202, in load_language_model
dropout=state["dropout"],
File "/opt/conda/lib/python3.6/site-packages/flair/models/language_model.py", line 63, in __init__
self.to(flair.device)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 465, in to
return self._apply(convert)
File "/opt/conda/lib/python3.6/site-packages/flair/models/language_model.py", line 404, in _apply
for info in torch.__version__.replace("+",".").split('.') if info.isdigit())
ValueError: not enough values to unpack (expected at least 3, got 2)
To Reproduce
Run this code in the container (after pip installing flair):
from flair.embeddings import FlairEmbeddings
FlairEmbeddings('news-forward')
Expected behavior
The call should succeed without errors (in particular, the .lm attribute is intended to be passed to LanguageModelTrainer).
Environment (please complete the following information):
Additional context
Running the code in Python 3.7.7 (Conda) gives no problems.
@tylerlekang thanks for reporting this. Could you print the torch version you get with torch.__version__? Also, can you try updating to Flair 0.5.1?
@alanakbik torch.__version__ reports 1.6.0a0+9907a3e, which matches the version listed at https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel_20-06.html#rel_20-06
Used pip install --upgrade flair to upgrade to 0.5.1. The same error persists:
>>> flair.__version__
'0.5.1'
>>>
>>> from flair.embeddings import FlairEmbeddings
>>> FlairEmbeddings('news-forward')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/conda/lib/python3.6/site-packages/flair/embeddings/token.py", line 586, in __init__
self.lm: LanguageModel = LanguageModel.load_language_model(model)
File "/opt/conda/lib/python3.6/site-packages/flair/models/language_model.py", line 202, in load_language_model
dropout=state["dropout"],
File "/opt/conda/lib/python3.6/site-packages/flair/models/language_model.py", line 63, in __init__
self.to(flair.device)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 465, in to
return self._apply(convert)
File "/opt/conda/lib/python3.6/site-packages/flair/models/language_model.py", line 404, in _apply
for info in torch.__version__.replace("+",".").split('.') if info.isdigit())
ValueError: not enough values to unpack (expected at least 3, got 2)
Did you test with this container? It seems like a common and important container to verify, as it is the official NVIDIA container optimized for PyTorch applications.
Thank you very much for your support! :)
@alanakbik In the models/language_model.py code, the first line of _apply is (starts at line 402):
major, minor, build, *_ = (int(info)
for info in torch.__version__.replace("+",".").split('.') if info.isdigit())
If I simply run torch.__version__.replace("+",".").split('.') in the container (Python 3.6.10), it returns ['1', '6', '0a0', '9907a3e']. Then if I run for i in (int(info) for info in torch.__version__.replace("+",".").split('.') if info.isdigit()): print(i), it prints:
1
6
However, on my local machine running vanilla Python 3.7.7, the torch version is just 1.5.0, so the nonstandard version string may be the problem.
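To spell out the failure (my own minimal sketch, outside the container): the isdigit() filter drops the non-numeric components of NVIDIA's version string, leaving only two values where the unpacking requires at least three:

```python
# Reproduce the parsing failure with NVIDIA's version string.
version = "1.6.0a0+9907a3e"

# '0a0' and '9907a3e' fail isdigit(), so only two components survive.
parts = tuple(int(p) for p in version.replace("+", ".").split(".") if p.isdigit())
print(parts)  # (1, 6)

try:
    major, minor, build, *_ = parts
except ValueError as e:
    print(e)  # not enough values to unpack (expected at least 3, got 2)
```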
I don't know why NVIDIA chose a PyTorch build with letters in the version number, but they did, and this container is supposed to be an easy solution for highly optimized GPU runs on their hardware.
Do you have any workaround ideas?
@alanakbik could I just hardcode the major, minor, build numbers, in my local version of language_model.py if there is no workaround?
major = 1
minor = 6
build = 0
It seems the code just checks that major.minor is >= 1.4? But I don't want to break any other parts of the code.
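For what it's worth, hardcoding as proposed above would satisfy that check; here is a quick sketch of the tuple comparison I assume the code performs (my reading of it, not the literal Flair source):

```python
# Hardcoded values from the proposal above.
major, minor, build = 1, 6, 0

# Python compares tuples element-wise, so this expresses the >= 1.4 check.
print((major, minor) >= (1, 4))  # True -> the new-torch code path is taken
```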
Yes, I guess you could just overwrite torch.__version__ as a quick fix by running this before your script:
import torch
torch.__version__ = '1.5.0'
Meanwhile, I will put in a PR to fix the error.
@alanakbik just wanting to triple-confirm, that shouldn't cause any problems with the rest of the FlairEmbeddings or LanguageModelTrainer codes? Thank you!
It shouldn't cause any problems on the Flair side. We only use the string to determine whether an old version of torch (<1.4.0) is being used, so changing it to another string above 1.4.0 won't change anything.
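A more permanent fix could parse only the leading numeric prefix of the version string. A hedged sketch of such a parse (an assumption on my part, not necessarily what the actual PR does):

```python
import re

def parse_torch_version(version):
    """Parse (major, minor, build) from strings like '1.6.0a0+9907a3e'.

    A missing build component defaults to 0; suffixes such as 'a0' or
    local build hashes like '+9907a3e' are ignored.
    """
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if m is None:
        raise ValueError(f"unparseable torch version: {version!r}")
    return tuple(int(g) if g else 0 for g in m.groups())

print(parse_torch_version("1.6.0a0+9907a3e"))  # (1, 6, 0)
print(parse_torch_version("1.5.0"))            # (1, 5, 0)
print(parse_torch_version("1.4"))              # (1, 4, 0)
```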