Flair: Error with TransformerWordEmbeddings

Created on 30 Jun 2020 · 4 Comments · Source: flairNLP/flair

Describe the bug

ImportError                               Traceback (most recent call last)
<ipython-input-4-3bbdce17bb1a> in <module>
      2 embedding_types: List[TokenEmbeddings] = [
      3     WordEmbeddings('de'),
----> 4     TransformerWordEmbeddings('bert-base-german-cased'),
      5 ]
      6 

/opt/conda/envs/gpu/lib/python3.6/site-packages/flair/embeddings/token.py in __init__(self, model, layers, pooling_operation, batch_size, use_scalar_mix, fine_tune)
    819 
    820         # load tokenizer and transformer model
--> 821         self.tokenizer = AutoTokenizer.from_pretrained(model)
    822         config = AutoConfig.from_pretrained(model, output_hidden_states=True)
    823         self.model = AutoModel.from_pretrained(model, config=config)

/opt/conda/envs/gpu/lib/python3.6/site-packages/transformers/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
    204         config = kwargs.pop("config", None)
    205         if not isinstance(config, PretrainedConfig):
--> 206             config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
    207 
    208         if "bert-base-japanese" in str(pretrained_model_name_or_path):

/opt/conda/envs/gpu/lib/python3.6/site-packages/transformers/configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
    201 
    202         """
--> 203         config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
    204 
    205         if "model_type" in config_dict:

/opt/conda/envs/gpu/lib/python3.6/site-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
    236                 proxies=proxies,
    237                 resume_download=resume_download,
--> 238                 local_files_only=local_files_only,
    239             )
    240             # Load config dict

/opt/conda/envs/gpu/lib/python3.6/site-packages/transformers/file_utils.py in cached_path(url_or_filename, cache_dir, force_download, proxies, resume_download, user_agent, extract_compressed_file, force_extract, local_files_only)
    569             resume_download=resume_download,
    570             user_agent=user_agent,
--> 571             local_files_only=local_files_only,
    572         )
    573     elif os.path.exists(url_or_filename):

/opt/conda/envs/gpu/lib/python3.6/site-packages/transformers/file_utils.py in get_from_cache(url, cache_dir, force_download, proxies, etag_timeout, resume_download, user_agent, local_files_only)
    748             logger.info("%s not found in cache or force_download set to True, downloading to %s", url, temp_file.name)
    749 
--> 750             http_get(url, temp_file, proxies=proxies, resume_size=resume_size, user_agent=user_agent)
    751 
    752         logger.info("storing %s in cache at %s", url, cache_path)

/opt/conda/envs/gpu/lib/python3.6/site-packages/transformers/file_utils.py in http_get(url, temp_file, proxies, resume_size, user_agent)
    639         initial=resume_size,
    640         desc="Downloading",
--> 641         disable=bool(logger.getEffectiveLevel() == logging.NOTSET),
    642     )
    643     for chunk in response.iter_content(chunk_size=1024):

/opt/conda/envs/gpu/lib/python3.6/site-packages/tqdm/notebook.py in __init__(self, *args, **kwargs)
    206         total = self.total * unit_scale if self.total else self.total
    207         self.container = self.status_printer(
--> 208             self.fp, total, self.desc, self.ncols)
    209         self.sp = self.display
    210 

/opt/conda/envs/gpu/lib/python3.6/site-packages/tqdm/notebook.py in status_printer(_, total, desc, ncols)
     95         if IProgress is None:  # #187 #451 #558 #872
     96             raise ImportError(
---> 97                 "IProgress not found. Please update jupyter and ipywidgets."
     98                 " See https://ipywidgets.readthedocs.io/en/stable"
     99                 "/user_install.html")

ImportError: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html

To Reproduce

# imports
from flair.data import Corpus
from flair.datasets import CONLL_03_GERMAN, ColumnCorpus 
from flair.embeddings import TokenEmbeddings, WordEmbeddings, StackedEmbeddings, PooledFlairEmbeddings, FlairEmbeddings, TransformerWordEmbeddings
from flair.visual.training_curves import Plotter
from flair.trainers import ModelTrainer
from flair.models import SequenceTagger
from typing import List

columns = {0: 'text', 1: 'ner'}
data_folder = '/workspace'

corpus: Corpus = ColumnCorpus(data_folder, columns,
                              train_file='train.train',
                              dev_file='dev.dev',
                              test_file='test.test')

# define task + tag dict.
tag_type = 'ner'
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)
print(tag_dictionary)

# initialize embeddings
embedding_types: List[TokenEmbeddings] = [
    WordEmbeddings('de'),
    TransformerWordEmbeddings('bert-base-german-cased'),
]

embeddings: StackedEmbeddings = StackedEmbeddings(embeddings=embedding_types)

# initialize sequence tagger
from flair.models import SequenceTagger

tagger: SequenceTagger = SequenceTagger(hidden_size=256,
                                        embeddings=embeddings,
                                        tag_dictionary=tag_dictionary,
                                        tag_type=tag_type)

# initialize trainer
from flair.trainers import ModelTrainer

trainer: ModelTrainer = ModelTrainer(tagger, corpus)

trainer.train('taggers/distil',
              learning_rate=0.1,
              mini_batch_size=32,
              train_with_dev=True,
              max_epochs=150)

Expected behavior
The model gets trained

Environment (please complete the following information):

OS: Linux
Version: tested with both flair 0.5 and 0.4.5

Additional context
This problem occurs only with TransformerWordEmbeddings, not with Flair embeddings.

bug

pascalhuszar

All 4 comments

I also tried

TransformerWordEmbeddings('distilbert-base-german-cased')

and other models

pascalhuszar on 30 Jun 2020

Hm, this code runs for me (using a different corpus). It looks like something is missing from your environment. The error says to update jupyter and ipywidgets; maybe try that?

alanakbik on 3 Jul 2020

It should be an environment problem. I tried updating these packages, but nothing changed. On Google Colab it works, but when I run this notebook on vast.ai, TransformerWordEmbeddings doesn't work.

pascalhuszar on 3 Jul 2020
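
One way to compare the two environments mentioned here (Google Colab and vast.ai) is to print the installed versions of the packages that appear in the traceback and run the same cell in both places. This is only a minimal sketch and was not posted in the thread; the helper name `installed_versions` and the package list are assumptions chosen for illustration.

```python
from typing import Dict, Iterable, Optional

try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Assumption: on older interpreters (the traceback shows Python 3.6)
    # the importlib_metadata backport is installed.
    from importlib_metadata import PackageNotFoundError, version


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Packages that appear in the traceback, plus those named in the error message.
PACKAGES = ["flair", "transformers", "tqdm", "ipywidgets", "notebook"]

for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```

A package reported as "not installed" in one environment but present in the other would be a likely candidate for the difference in behaviour.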

You will need to install ipywidgets (if you are running the code from a notebook).
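
The traceback shows tqdm's notebook progress bar raising the ImportError when it cannot find `IProgress`, which ipywidgets provides. The following is a hedged sketch of checking for the module from inside the notebook and installing it into the same interpreter; the helper names (`is_importable`, `pip_install_command`, `ensure_installed`) are hypothetical, and only the `pip install ipywidgets` step itself comes from the suggestion above.

```python
import importlib.util
import subprocess
import sys
from typing import Callable, List, Optional


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


def pip_install_command(package: str) -> List[str]:
    """Build a pip command that targets the interpreter running this kernel."""
    return [sys.executable, "-m", "pip", "install", package]


def ensure_installed(
    module_name: str,
    package: Optional[str] = None,
    runner: Callable[[List[str]], object] = subprocess.check_call,
) -> bool:
    """Install `package` (defaults to `module_name`) if the module is missing.

    Returns True if an install command was run, False if nothing was needed.
    """
    if is_importable(module_name):
        return False
    runner(pip_install_command(package or module_name))
    return True
```

Calling `ensure_installed("ipywidgets")` in a notebook cell runs pip only when the module is missing. The kernel will probably need a restart afterwards before the progress bar import succeeds.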