Transformers: Can not import DataCollatorForLanguageModeling

Created on 22 Apr 2020 · 3 Comments · Source: huggingface/transformers

🐛 Bug

Information

Model I am using (ALBERT):

Language I am using the model on (Sanskrit, Hindi):

The problem arises when using:

[x] the official example scripts: (give details below)
[ ] my own modified scripts: (give details below)

The tasks I am working on is:

[ ] an official GLUE/SQUaD task: (give the name)
[x] my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

In Google Colab
!python /content/transformers/examples/run_language_modeling.py \
    --train_data_file /content/corpus/train/full.txt \
    --eval_data_file /content/corpus/valid/full_val.txt \
    --model_type albert-base-v2 \
This worked yesterday, but the newly added DataCollatorForLanguageModeling can't be imported.

The error I am getting:

2020-04-22 05:12:25.640328: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
Traceback (most recent call last):
  File "/content/transformers/examples/run_language_modeling.py", line 29, in <module>
    from transformers import (
ImportError: cannot import name 'DataCollatorForLanguageModeling'

So, I checked if it can be imported directly.

from transformers import DataCollatorForLanguageModeling
ERROR
    from transformers import DataCollatorForLanguageModeling
ImportError: cannot import name 'DataCollatorForLanguageModeling'
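
The direct import check above can also be done without triggering a traceback. The following is only a minimal sketch: the helper names `has_name` and `installed_version` are hypothetical and not part of transformers, and it assumes the package exposes a `__version__` attribute. It reports which version is installed and whether that version exports the name the script needs.

```python
import importlib


def has_name(module_name, attr):
    """Return True if the module can be imported and exposes `attr`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)


def installed_version(module_name):
    """Return the module's __version__ string, or None if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    # Hypothetical usage: compare the installed version with what the script expects.
    print("transformers version:", installed_version("transformers"))
    print(
        "exports DataCollatorForLanguageModeling:",
        has_name("transformers", "DataCollatorForLanguageModeling"),
    )
```

If the second line prints False while the example script comes from a fresh clone, the installed package is likely older than the script.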

Expected behavior

Environment info

transformers version: 2.8.0
Platform: Colab
Python version: 3.6.9
PyTorch version (GPU?): 1.4.0
Tensorflow version (GPU?): 2.2.0-rc3
Using GPU in script?: Yes
Using distributed or parallel set-up in script?: No

Source

parmarsuraj99

Most helpful comment

This happens because the pip package hasn't been updated yet. The training script has changed fundamentally and now relies on code that the released package does not include, so you can try installing from source using
git clone https://github.com/huggingface/transformers
cd transformers
pip install .
or
you can use an older version of run_language_modeling.py from a previous commit.
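
After reinstalling from a clone, it can help to confirm which copy of the package Python will actually pick up. The sketch below is an illustration only: `module_origin` is a hypothetical helper name, and it relies on the standard library's `importlib.util.find_spec`, which locates a top-level module without importing it.

```python
import importlib.util


def module_origin(module_name):
    """Return the file Python would load `module_name` from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Hypothetical usage: check where the transformers package resolves to.
    print("transformers would be loaded from:", module_origin("transformers"))
```

If the printed path still points at the old installation, the source install has probably not taken effect in the current environment.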

parmarsuraj99 on 22 Apr 2020
