Flair: _pickle.UnpicklingError: STACK_GLOBAL requires str

Created on 6 Aug 2019  ·  4 comments  ·  Source: flairNLP/flair

OS: Ubuntu 16.04
Software: Latest Flair

I want to run the "WNUT2017" experiment from https://github.com/zalandoresearch/flair/blob/master/resources/docs/EXPERIMENTS.md. Since I cannot download the embeddings using your code, I placed them in a local folder (I don't think this is the cause of my problem).

Problem

Traceback (most recent call last):
  File "1.py", line 17, in <module>
    WordEmbeddings('/data/checan/NER/bilstm_crf/flair/en-fasttext-crawl-300d-1M.vectors.npy'),
  File "/home/chencan/anaconda3/lib/python3.6/site-packages/flair/embeddings.py", line 293, in __init__
    str(embeddings)
  File "/home/chencan/anaconda3/lib/python3.6/site-packages/gensim/models/keyedvectors.py", line 1540, in load
    model = super(WordEmbeddingsKeyedVectors, cls).load(fname_or_handle, **kwargs)
  File "/home/chencan/anaconda3/lib/python3.6/site-packages/gensim/models/keyedvectors.py", line 228, in load
    return super(BaseKeyedVectors, cls).load(fname_or_handle, **kwargs)
  File "/home/chencan/anaconda3/lib/python3.6/site-packages/gensim/utils.py", line 426, in load
    obj = unpickle(fname)
  File "/home/chencan/anaconda3/lib/python3.6/site-packages/gensim/utils.py", line 1384, in unpickle
    return _pickle.load(f, encoding='latin1')
_pickle.UnpicklingError: STACK_GLOBAL requires str

My code looks like this:

from flair.data import Corpus
from flair.data_fetcher import NLPTaskDataFetcher, NLPTask
from flair.embeddings import TokenEmbeddings, WordEmbeddings, StackedEmbeddings, FlairEmbeddings
from typing import List

corpus: Corpus = NLPTaskDataFetcher.load_corpus(NLPTask.WNUT_17, base_path='')
tag_type = 'ner'
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)

embedding_types: List[TokenEmbeddings] = [
    WordEmbeddings('/data/checan/NER/bilstm_crf/flair/en-fasttext-crawl-300d-1M.vectors.npy'),
    WordEmbeddings('/data/checan/NER/bilstm_crf/flair/twitter.gensim.vectors.npy'),
    FlairEmbeddings('/data/checan/NER/bilstm_crf/flair/news-forward-0.4.1.pt'),
    FlairEmbeddings('/data/checan/NER/bilstm_crf/flair/news-backward-0.4.1.pt'),
]
embeddings: StackedEmbeddings = StackedEmbeddings(embeddings=embedding_types)

from flair.models import SequenceTagger
tagger: SequenceTagger = SequenceTagger(hidden_size=256,
                                        embeddings=embeddings,
                                        tag_dictionary=tag_dictionary,
                                        tag_type=tag_type)

from flair.trainers import ModelTrainer
trainer: ModelTrainer = ModelTrainer(tagger, corpus)
trainer.train('resources/taggers/example-ner',
              max_epochs=150)

Label: question


All 4 comments

Hello @GGchen1997 - I think the problem is that you are passing the wrong file to the word embedding. You're loading them like this:

WordEmbeddings('/data/checan/NER/bilstm_crf/flair/en-fasttext-crawl-300d-1M.vectors.npy')

But instead you should load them like this:

WordEmbeddings('/data/checan/NER/bilstm_crf/flair/en-fasttext-crawl-300d-1M')

However, in order for this to work, you need to download two files:

https://s3.eu-central-1.amazonaws.com/alan-nlp/resources/embeddings-v0.3/en-fasttext-news-300d-1M.vectors.npy

and

https://s3.eu-central-1.amazonaws.com/alan-nlp/resources/embeddings-v0.3/en-fasttext-news-300d-1M

For some reason, Gensim splits large embedding files into two files.
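Both files can be fetched in one go with the standard library alone; a minimal sketch (the `fetch_embeddings` helper and the target folder are my own illustration, not part of flair):

```python
import os
import urllib.request

# Both halves of the gensim KeyedVectors dump must end up side by side
# in the same folder, sharing the same name prefix.
BASE_URL = "https://s3.eu-central-1.amazonaws.com/alan-nlp/resources/embeddings-v0.3"
FILES = ["en-fasttext-news-300d-1M", "en-fasttext-news-300d-1M.vectors.npy"]

def fetch_embeddings(target_dir):
    """Download both embedding files into target_dir; return the local paths."""
    os.makedirs(target_dir, exist_ok=True)
    paths = []
    for name in FILES:
        dest = os.path.join(target_dir, name)
        if not os.path.exists(dest):  # skip files that are already present
            urllib.request.urlretrieve(BASE_URL + "/" + name, dest)
        paths.append(dest)
    return paths

# Usage (the folder is an assumption; use whatever folder you pass to
# WordEmbeddings):
# fetch_embeddings("/data/checan/NER/bilstm_crf/flair")
```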


So in order to reproduce your result, should I load only the file "en-fasttext-crawl-300d-1M"? What about the file "en-fasttext-crawl-300d-1M.vectors.npy"; how do we use that one?

Both files need to be downloaded and placed into the same folder next to each other. Then the above line should work.
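In other words, the argument to `WordEmbeddings` (and to gensim's `KeyedVectors.load` underneath) is the shared prefix of the two files, without the `.vectors.npy` suffix. A small sketch of that rule (the `embedding_base_path` helper is mine, for illustration):

```python
def embedding_base_path(path):
    """Strip gensim's '.vectors.npy' suffix so the path names both files."""
    suffix = ".vectors.npy"
    return path[:-len(suffix)] if path.endswith(suffix) else path

base = embedding_base_path(
    "/data/checan/NER/bilstm_crf/flair/en-fasttext-crawl-300d-1M.vectors.npy"
)
print(base)  # /data/checan/NER/bilstm_crf/flair/en-fasttext-crawl-300d-1M
# WordEmbeddings(base) then lets gensim pick up both companion files.
```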

solved
