Flair: _pickle.UnpicklingError: STACK_GLOBAL requires str

Created on 6 Aug 2019  ·  4 comments  ·  Source: flairNLP/flair

OS: Ubuntu 16.04
Software: Latest Flair

I want to run the "WNUT2017" experiment from https://github.com/zalandoresearch/flair/blob/master/resources/docs/EXPERIMENTS.md. Since I cannot download the embeddings using your code, I placed them in a local folder (I don't think this is the cause of my problem).

Problem

Traceback (most recent call last):
  File "1.py", line 17, in <module>
    WordEmbeddings('/data/checan/NER/bilstm_crf/flair/en-fasttext-crawl-300d-1M.vectors.npy'),
  File "/home/chencan/anaconda3/lib/python3.6/site-packages/flair/embeddings.py", line 293, in __init__
    str(embeddings)
  File "/home/chencan/anaconda3/lib/python3.6/site-packages/gensim/models/keyedvectors.py", line 1540, in load
    model = super(WordEmbeddingsKeyedVectors, cls).load(fname_or_handle, **kwargs)
  File "/home/chencan/anaconda3/lib/python3.6/site-packages/gensim/models/keyedvectors.py", line 228, in load
    return super(BaseKeyedVectors, cls).load(fname_or_handle, **kwargs)
  File "/home/chencan/anaconda3/lib/python3.6/site-packages/gensim/utils.py", line 426, in load
    obj = unpickle(fname)
  File "/home/chencan/anaconda3/lib/python3.6/site-packages/gensim/utils.py", line 1384, in unpickle
    return _pickle.load(f, encoding='latin1')
_pickle.UnpicklingError: STACK_GLOBAL requires str

My code looks like this:

from flair.data import Corpus
from flair.data_fetcher import NLPTaskDataFetcher, NLPTask
from flair.embeddings import TokenEmbeddings, WordEmbeddings, StackedEmbeddings, FlairEmbeddings
from typing import List

corpus: Corpus = NLPTaskDataFetcher.load_corpus(NLPTask.WNUT_17, base_path='')
tag_type = 'ner'
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)

embedding_types: List[TokenEmbeddings] = [
    WordEmbeddings('/data/checan/NER/bilstm_crf/flair/en-fasttext-crawl-300d-1M.vectors.npy'),
    WordEmbeddings('/data/checan/NER/bilstm_crf/flair/twitter.gensim.vectors.npy'),
    FlairEmbeddings('/data/checan/NER/bilstm_crf/flair/news-forward-0.4.1.pt'),
    FlairEmbeddings('/data/checan/NER/bilstm_crf/flair/news-backward-0.4.1.pt'),
]
embeddings: StackedEmbeddings = StackedEmbeddings(embeddings=embedding_types)

from flair.models import SequenceTagger
tagger: SequenceTagger = SequenceTagger(hidden_size=256,
                                        embeddings=embeddings,
                                        tag_dictionary=tag_dictionary,
                                        tag_type=tag_type)

from flair.trainers import ModelTrainer
trainer: ModelTrainer = ModelTrainer(tagger, corpus)
trainer.train('resources/taggers/example-ner',
              max_epochs=150)

Label: question


All 4 comments

Hello @GGchen1997 - I think the problem is that you are passing the wrong file to the word embedding. You're loading them like this:

WordEmbeddings('/data/checan/NER/bilstm_crf/flair/en-fasttext-crawl-300d-1M.vectors.npy')

But instead you should load them like this:

WordEmbeddings('/data/checan/NER/bilstm_crf/flair/en-fasttext-crawl-300d-1M')

However, in order for this to work, you need to download two files:

https://s3.eu-central-1.amazonaws.com/alan-nlp/resources/embeddings-v0.3/en-fasttext-news-300d-1M.vectors.npy

and

https://s3.eu-central-1.amazonaws.com/alan-nlp/resources/embeddings-v0.3/en-fasttext-news-300d-1M

For some reason, Gensim splits large embedding files into two files.
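Both files can be fetched in one go with the standard library alone; a minimal sketch (the `fetch_embeddings` helper and the target folder are my own illustration, not part of flair):

```python
import os
import urllib.request

# Both halves of the gensim KeyedVectors dump must end up side by side
# in the same folder, sharing the same name prefix.
BASE_URL = "https://s3.eu-central-1.amazonaws.com/alan-nlp/resources/embeddings-v0.3"
FILES = ["en-fasttext-news-300d-1M", "en-fasttext-news-300d-1M.vectors.npy"]

def fetch_embeddings(target_dir):
    """Download both embedding files into target_dir; return the local paths."""
    os.makedirs(target_dir, exist_ok=True)
    paths = []
    for name in FILES:
        dest = os.path.join(target_dir, name)
        if not os.path.exists(dest):  # skip files that are already present
            urllib.request.urlretrieve(BASE_URL + "/" + name, dest)
        paths.append(dest)
    return paths

# Usage (the folder is an assumption; use whatever folder you pass to
# WordEmbeddings):
# fetch_embeddings("/data/checan/NER/bilstm_crf/flair")
```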


So in order to reproduce your result, should I load only the file "en-fasttext-crawl-300d-1M"? What about the file "en-fasttext-crawl-300d-1M.vectors.npy"; how do we use that one?

Both files need to be downloaded and placed into the same folder next to each other. Then the above line should work.
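In other words, the argument to `WordEmbeddings` (and to gensim's `KeyedVectors.load` underneath) is the shared prefix of the two files, without the `.vectors.npy` suffix. A small sketch of that rule (the `embedding_base_path` helper is mine, for illustration):

```python
def embedding_base_path(path):
    """Strip gensim's '.vectors.npy' suffix so the path names both files."""
    suffix = ".vectors.npy"
    return path[:-len(suffix)] if path.endswith(suffix) else path

base = embedding_base_path(
    "/data/checan/NER/bilstm_crf/flair/en-fasttext-crawl-300d-1M.vectors.npy"
)
print(base)  # /data/checan/NER/bilstm_crf/flair/en-fasttext-crawl-300d-1M
# WordEmbeddings(base) then lets gensim pick up both companion files.
```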

solved
