Flair: Error loading a sequence tagger model that uses a RoBERTa model trained from scratch

Created on 12 Dec 2020 · 10 comments · Source: flairNLP/flair

Describe the bug

I get an error when loading a trained sequence tagger model that uses a RoBERTa model trained from scratch with Hugging Face.

To Reproduce

I trained the model with this configuration:

from flair.data import Corpus
from flair.datasets import ColumnCorpus
from flair.embeddings import FlairEmbeddings, TokenEmbeddings, StackedEmbeddings, BertEmbeddings, CharacterEmbeddings, BytePairEmbeddings, WordEmbeddings, TransformerWordEmbeddings
from flair.visual.training_curves import Plotter
from torch.optim.lr_scheduler import OneCycleLR
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer
from typing import List
from torch.optim.adam import Adam
import sys
import flair, torch
import gensim

flair.device = torch.device('cuda:0')

columns = {0: 'text', 1: 'tag'}
data_folder = './data/Model_Trainer_Data'
corpus: Corpus = ColumnCorpus(data_folder, columns, train_file=sys.argv[1], dev_file=sys.argv[3], test_file=sys.argv[4], in_memory=False)

tag_type = 'tag'
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)

roberta_model = TransformerWordEmbeddings('/home/user/Documents/mk/flair/embedding_model/roberta/v2c',
                                          fine_tune=True,
                                          pooling_operation='mean',
                                          layers='-1',
                                          allow_long_sentences=True)

embedding_types: List[TokenEmbeddings] = [roberta_model]

embeddings: StackedEmbeddings = StackedEmbeddings(embeddings=embedding_types)

tagger: SequenceTagger = SequenceTagger(hidden_size=256,
                                        embeddings=embeddings,
                                        tag_dictionary=tag_dictionary,
                                        tag_type=tag_type,
                                        use_crf=False,
                                        use_rnn=False)

trainer: ModelTrainer = ModelTrainer(tagger, corpus, optimizer=Adam)

trainer.train('./results/' + sys.argv[2],
              learning_rate=3e-5,  # very low learning rate
              mini_batch_size=64,
              max_epochs=10,  # very few epochs of fine-tuning
              embeddings_storage_mode='gpu',
              train_with_dev=False,
              checkpoint=True)

I loaded and predicted on another machine like this:

from flair.models import SequenceTagger

tagger_model = SequenceTagger.load('../models/w3c_series/w3c_10/best-model.pt')

When loading the model, I got this error message:

2020-12-12 21:14:15,430 loading file ../models/w3c_series/w3c_10/best-model.pt
404 Client Error: Not Found for url: https://huggingface.co//home/user/Documents/mk/flair/embedding_model/roberta/v2c/resolve/main/config.json

Expected behavior
I think the Hugging Face URL is wrong: it is concatenated with the path from the GPU server where I trained (not from my local machine).
I just want to load this model without errors, like the other models that use different embedding models.
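For what it's worth, the 404 URL is consistent with the local path string being treated as a Hub model ID once the directory cannot be found on the loading machine. A minimal sketch of how such a URL would arise (the `hub_config_url` helper is hypothetical, not an actual transformers function):

```python
# When a model identifier does not exist as a local directory, the
# transformers library falls back to treating the string as a Hugging Face
# Hub model ID and builds a download URL from it. Pasting an absolute path
# in as the ID produces the double slash visible in the error above.
def hub_config_url(model_id):
    return f"https://huggingface.co/{model_id}/resolve/main/config.json"

print(hub_config_url("/home/user/Documents/mk/flair/embedding_model/roberta/v2c"))
# -> https://huggingface.co//home/user/Documents/mk/flair/embedding_model/roberta/v2c/resolve/main/config.json
```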

Environment :

  • OS : Linux Mint 19.03
  • Version : latest branch

All 10 comments

Hi @Michael95-m, can you please check whether

import os
os.path.abspath("../models/w3c_series/w3c_10/best-model.pt")

evaluates to your desired model location? I suspect the relative path is causing the issue.
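For background, `os.path.abspath` resolves a relative path against the current working directory, so the same relative string can point at entirely different files on different machines. A small illustration (the `resolve` helper is hypothetical, emulating `abspath` for a given working directory):

```python
import os

def resolve(relative, cwd):
    # Hypothetical helper: emulate os.path.abspath as if the current
    # working directory were `cwd`.
    return os.path.normpath(os.path.join(cwd, relative))

# The same relative path resolves differently depending on where the
# script is launched from:
print(resolve("../models/w3c_10/best-model.pt", "/home/a/project/eval"))
# -> /home/a/project/models/w3c_10/best-model.pt
print(resolve("../models/w3c_10/best-model.pt", "/srv/train/scripts"))
# -> /srv/train/models/w3c_10/best-model.pt
```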

Hi @whoisjones , I still get the same error after changing the path as you suggested:

from flair.models import SequenceTagger
import os

tagger_model = SequenceTagger.load(os.path.abspath('../models/w3c_series/w3c_10/best-model.pt'))

(Note: loading the model with the same code on the GPU server where it was trained causes no errors, but on my local machine it fails. I think this is the same kind of error as the BytePair model-loading issue between versions 0.5 and 0.6.1.)

Hi @Michael95-m, it seems the folder structure of your projects differs (more precisely, the main entry point of your program differs), so evaluating os.path.abspath('../models/w3c_series/w3c_10/best-model.pt') in both settings should give you different absolute paths (at least I assume so).

Simply place these lines in your script and let me know the outcome on both machines:

import os
print(os.path.abspath("../models/w3c_series/w3c_10/best-model.pt"))

Hi @whoisjones , as you suggested, I checked the absolute paths of the model file on both the server and my local machine.

On the server, the absolute path is /home/server/Documents/mk/flair/results/w3c_10/best-model.pt
On my local machine, the absolute path is /home/user/Projects/flair/flair_eval/models/w3c_series/w3c_10/best-model.pt

Hi @Michael95-m, and can you confirm that the respective model is available under both paths? Then I would have to think about another solution to your problem.

Hi @whoisjones , yes, both models are where I said. By the way, there are also other models in both places; they load and predict fine on both machines, but only this model (w3c_10, which was trained with the RoBERTa embeddings) fails to load on my local machine.

Hi, I am having the same problem with a BPE model. The model is on the local machine.

Hi @UrszulaCzerwinska , for the BPE model, there is a workaround I used: point the Flair cache root to your local path, like this:

import os
import flair

model_path = os.path.abspath('./models')
flair.cache_root = model_path  # the BytePair embedding files will be cached in the "models" folder

I hope it helps!

@Michael95-m Thank you, I will check it out!

I still cannot load the BPE model even with @Michael95-m's workaround.
