Flair: aggregated_embedding not working with CPU

Created on 3 Feb 2020 · 11 Comments · Source: flairNLP/flair

Describe the bug
An error occurs at the end of training for 90 epochs when using word embeddings for another language (Arabic, 'ar').

To Reproduce

    WordEmbeddings('ar'),

    # contextual string embeddings, forward
    PooledFlairEmbeddings('ar-forward', pooling='min'),

    # contextual string embeddings, backward
    PooledFlairEmbeddings('ar-backward', pooling='min'),

Additional context

2020-02-03 16:59:38,777 Reading data from /content/gdrive/My Drive/resources/tasks/conll_03
2020-02-03 16:59:38,780 Train: /content/gdrive/My Drive/resources/tasks/conll_03/train.txt
2020-02-03 16:59:38,782 Dev: /content/gdrive/My Drive/resources/tasks/conll_03/dev.txt
2020-02-03 16:59:38,785 Test: /content/gdrive/My Drive/resources/tasks/conll_03/test.txt
2020-02-03 17:00:24,966 ----------------------------------------------------------------------------------------------------
2020-02-03 17:00:24,970 Model: "SequenceTagger(
  (embeddings): StackedEmbeddings(
    (list_embedding_0): WordEmbeddings('ar')
    (list_embedding_1): PooledFlairEmbeddings(
      (context_embeddings): FlairEmbeddings(
        (lm): LanguageModel(
          (drop): Dropout(p=0.1, inplace=False)
          (encoder): Embedding(7125, 100)
          (rnn): LSTM(100, 2048)
          (decoder): Linear(in_features=2048, out_features=7125, bias=True)
        )
      )
    )
    (list_embedding_2): PooledFlairEmbeddings(
      (context_embeddings): FlairEmbeddings(
        (lm): LanguageModel(
          (drop): Dropout(p=0.1, inplace=False)
          (encoder): Embedding(7125, 100)
          (rnn): LSTM(100, 2048)
          (decoder): Linear(in_features=2048, out_features=7125, bias=True)
        )
      )
    )
  )
  (word_dropout): WordDropout(p=0.05)
  (locked_dropout): LockedDropout(p=0.5)
  (embedding2nn): Linear(in_features=8492, out_features=8492, bias=True)
  (rnn): LSTM(8492, 256, batch_first=True, bidirectional=True)
  (linear): Linear(in_features=512, out_features=12, bias=True)
  (beta): 1.0
  (weights): None
  (weight_tensor) None
)"
2020-02-03 17:00:24,975 ----------------------------------------------------------------------------------------------------
2020-02-03 17:00:24,978 Corpus: "Corpus: 1328 train + 710 dev + 605 test sentences"
2020-02-03 17:00:24,981 ----------------------------------------------------------------------------------------------------
2020-02-03 17:00:24,985 Parameters:
2020-02-03 17:00:24,987  - learning_rate: "0.1"
2020-02-03 17:00:24,990  - mini_batch_size: "32"
2020-02-03 17:00:24,991  - patience: "3"
2020-02-03 17:00:24,994  - anneal_factor: "0.5"
2020-02-03 17:00:24,996  - max_epochs: "90"
2020-02-03 17:00:24,999  - shuffle: "True"
2020-02-03 17:00:25,001  - train_with_dev: "True"
2020-02-03 17:00:25,002  - batch_growth_annealing: "False"
2020-02-03 17:00:25,004 ----------------------------------------------------------------------------------------------------
2020-02-03 17:00:25,007 Model training base path: "/content/gdrive/My Drive/resources/taggers/example-ner"
2020-02-03 17:00:25,011 ----------------------------------------------------------------------------------------------------
2020-02-03 17:00:25,014 Device: cuda:0
2020-02-03 17:00:25,015 ----------------------------------------------------------------------------------------------------
2020-02-03 17:00:25,020 Embeddings storage mode: cpu
/usr/local/lib/python3.6/dist-packages/torch/serialization.py:292: UserWarning: Couldn't retrieve source code for container of type StackedEmbeddings. It won't be checked for correctness upon loading.
  "type " + obj.__name__ + ". It won't be checked "
/usr/local/lib/python3.6/dist-packages/torch/serialization.py:292: UserWarning: Couldn't retrieve source code for container of type WordEmbeddings. It won't be checked for correctness upon loading.
  "type " + obj.__name__ + ". It won't be checked "
/usr/local/lib/python3.6/dist-packages/torch/serialization.py:292: UserWarning: Couldn't retrieve source code for container of type PooledFlairEmbeddings. It won't be checked for correctness upon loading.
  "type " + obj.__name__ + ". It won't be checked "
/usr/local/lib/python3.6/dist-packages/torch/serialization.py:292: UserWarning: Couldn't retrieve source code for container of type FlairEmbeddings. It won't be checked for correctness upon loading.
  "type " + obj.__name__ + ". It won't be checked "
/usr/local/lib/python3.6/dist-packages/torch/serialization.py:292: UserWarning: Couldn't retrieve source code for container of type LanguageModel. It won't be checked for correctness upon loading.
  "type " + obj.__name__ + ". It won't be checked "
/usr/local/lib/python3.6/dist-packages/torch/serialization.py:292: UserWarning: Couldn't retrieve source code for container of type Dropout. It won't be checked for correctness upon loading.
  "type " + obj.__name__ + ". It won't be checked "
/usr/local/lib/python3.6/dist-packages/torch/serialization.py:292: UserWarning: Couldn't retrieve source code for container of type Embedding. It won't be checked for correctness upon loading.
  "type " + obj.__name__ + ". It won't be checked "
/usr/local/lib/python3.6/dist-packages/torch/serialization.py:292: UserWarning: Couldn't retrieve source code for container of type LSTM. It won't be checked for correctness upon loading.
  "type " + obj.__name__ + ". It won't be checked "
/usr/local/lib/python3.6/dist-packages/torch/serialization.py:292: UserWarning: Couldn't retrieve source code for container of type Linear. It won't be checked for correctness upon loading.
  "type " + obj.__name__ + ". It won't be checked "
2020-02-03 17:00:45,361 ----------------------------------------------------------------------------------------------------
2020-02-03 17:00:45,365 Testing using best model ...
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-10-4c66a380caa5> in <module>()
     35               mini_batch_size=32,
     36               max_epochs=90,
---> 37               checkpoint=True)

6 frames
/usr/local/lib/python3.6/dist-packages/flair/embeddings.py in _add_embeddings_internal(self, sentences)
   2155                         else:
   2156                             aggregated_embedding = self.aggregate_op(
-> 2157                                 self.word_embeddings[token.text], local_embedding
   2158                             )
   2159                             if self.pooling == "fade":

RuntimeError: Expected object of device type cuda but got device type cpu for argument #2 'other' in call to _th_min
bug

All 11 comments

The same error is thrown when I run the code below:

from flair.data import Sentence
from flair.models import SequenceTagger

model = SequenceTagger.load('/content/gdrive/My Drive/resources/taggers/example-ner/final-model.pt')
# create example sentence
sentence = Sentence('أحب اللغة العربية')

# predict tags and print
model.predict(sentence)

print(sentence.to_tagged_string())
RuntimeError                              Traceback (most recent call last)
<ipython-input-6-1f4c63ec6a60> in <module>()
      7 
      8 # predict tags and print
----> 9 model.predict(sentence)
     10 
     11 print(sentence.to_tagged_string())

4 frames
/usr/local/lib/python3.6/dist-packages/flair/embeddings.py in _add_embeddings_internal(self, sentences)
   2155                         else:
   2156                             aggregated_embedding = self.aggregate_op(
-> 2157                                 self.word_embeddings[token.text], local_embedding
   2158                             )
   2159                             if self.pooling == "fade":

RuntimeError: Expected object of device type cuda but got device type cpu for argument #2 'other' in call to _th_min

@abeermohamed1 and @mauryaland: I am having trouble reproducing this error. When I train models on GPU or CPU with pooled embeddings, it seems to work in my setups. Could you paste a minimal code example (i.e. the full training code) that produces this error?

I am using a Colab GPU.

I am trying to remember how I overcame this problem.

By changing the Colab runtime accelerator to 'None' instead of GPU.

In the training phase I used the GPU, but the problem happened only in prediction.

@alanakbik The problem happens with the evaluate and predict methods. Training is working fine, it is only at the end during the evaluation that the problem occurs.

Thank you @alanakbik and @mauryaland.
Yes, and when I switch to CPU to avoid this problem, the CPU is not sufficient and the run crashes.
So is there a fix for this problem?

@abeermohamed1 Yes, I proposed a fix here: https://github.com/flairNLP/flair/pull/1417.
It is a really small change, as you can see, but it makes the evaluate and predict methods work on my side.
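
For readers who want to see the kind of mismatch the traceback points at, here is a minimal sketch in plain PyTorch. It is not the code from the pull request above and not flair's implementation; `aggregate_min` and the tensor values are hypothetical names and data chosen for illustration. It assumes only that the 'min' pooling boils down to an element-wise `torch.min` over a stored embedding and a freshly computed one, which may sit on different devices.

```python
import torch


def aggregate_min(stored: torch.Tensor, local: torch.Tensor) -> torch.Tensor:
    """Element-wise minimum of two embeddings (hypothetical helper)."""
    # torch.min(a, b) raises a RuntimeError when a and b are on different
    # devices, so move the second operand to the first operand's device.
    local = local.to(stored.device)
    return torch.min(stored, local)


# Use the GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

stored = torch.tensor([0.5, -1.0, 2.0], device=device)  # e.g. pooled memory
local = torch.tensor([0.25, 0.0, 3.0])                  # created on the CPU

pooled = aggregate_min(stored, local)
print(pooled.cpu().tolist())
```

On a CPU-only machine both tensors are already on the same device and the `.to(...)` call is a no-op, so the sketch runs either way.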

@mauryaland Thank you, you saved my day. Could you please let me know how to install the fix?

@mauryaland Please do let us know how to use your fix.

@ARArun your issue is actually a different one that is specific to ELMoEmbeddings (see #957 #635).

@abeermohamed1 @mauryaland - thanks, I can reproduce the error now!
