from gensim.models.wrappers import FastText
fasttext_model = FastText.load_fasttext_format('wiki-news-300d-1M.vec')
print(fasttext_model["TestTest"])
results in: NotImplementedError: Supervised fastText models are not supported
Alternative approach:
from gensim.models import KeyedVectors
fasttext_model = KeyedVectors.load_word2vec_format('wiki-news-300d-1M.vec')
print(fasttext_model["TestTest"])
results in: KeyError("word '%s' not in vocabulary" % word)
I would have expected these issues to be fixed by this update: https://github.com/RaRe-Technologies/gensim/pull/1916. Could you please check?
Linux-4.4.0-124-generic-x86_64-with-Ubuntu-16.04-xenial
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609]
NumPy 1.13.3
SciPy 1.0.1
gensim 3.4.0
FAST_VERSION 1
CC @manneshiva
Has this been fixed?
@quajak
That's the correct behavior: "NotImplementedError: Supervised fastText models are not supported". We really do not support dumps of supervised fastText models.
As for KeyedVectors: it stores only whole words (not ngrams), so the word "TestTest" really is missing from the vocabulary.
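To illustrate the difference between a word-only table and an ngram table, here is a minimal pure-Python sketch. It is not gensim's or fastText's implementation: the helper names `char_ngrams` and `oov_vector` are hypothetical, and the plain dict stands in for fastText's hashed ngram buckets. The 3 to 6 character range follows fastText's default settings.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, wrapped in '<' and '>' markers."""
    wrapped = "<" + word + ">"
    grams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def oov_vector(word, ngram_table, dim):
    """Average the vectors of every n-gram of `word` found in `ngram_table`.

    Simplification: real fastText hashes n-grams into a fixed number of
    buckets, so a lookup never misses; here unknown n-grams are skipped.
    """
    hits = [ngram_table[g] for g in char_ngrams(word) if g in ngram_table]
    if not hits:
        raise KeyError("no known n-grams for %r" % word)
    return [sum(v[k] for v in hits) / len(hits) for k in range(dim)]


# A word-only table (like a .vec file) has no entry for an unseen word...
word_table = {"Test": [0.5, 0.5]}
print("TestTest" in word_table)  # False

# ...while an n-gram table can still build a vector from its pieces.
ngram_table = {"<Te": [1.0, 0.0], "est": [0.0, 1.0]}
print(oov_vector("TestTest", ngram_table, dim=2))
```

With a word-only table the lookup has nothing to fall back on, which is why the KeyedVectors call fails for "TestTest".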
@menshikh-iv but Fasttext should support OOV words -- can you point @quajak to how to load the fasttext model (incl. OOV) properly?
@quajak you should use FastText.load_fasttext_format, but with an unsupervised model.
@quajak here is a more detailed answer (to a similar question): https://github.com/RaRe-Technologies/gensim-data/issues/26#issuecomment-408814033 (I hope it will be helpful for you).
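The advice above could be sketched roughly as follows, assuming gensim 3.x as in the report. The helpers `pick_loader` and `load_vectors` are hypothetical names, and choosing by file extension is an assumption based on fastText's usual output (a `.bin` full model and a `.vec` text file of word vectors). Only the two gensim calls already shown in this thread are used, and they are imported lazily.

```python
import os


def pick_loader(path):
    """Return which loader fits `path`, judged only by its extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".bin":
        # Full fastText model: words plus character n-grams.
        # Only unsupervised models are supported; supervised ones
        # raise NotImplementedError, as reported above.
        return "fasttext"
    if ext == ".vec":
        # Plain-text word vectors: whole words only, no OOV lookup.
        return "keyedvectors"
    raise ValueError("unrecognised extension: %r" % ext)


def load_vectors(path):
    """Load `path` with the matching gensim 3.x call (imports are deferred)."""
    if pick_loader(path) == "fasttext":
        from gensim.models.wrappers import FastText
        return FastText.load_fasttext_format(path)
    from gensim.models import KeyedVectors
    return KeyedVectors.load_word2vec_format(path)


print(pick_loader("wiki-news-300d-1M.vec"))
```

A model loaded through the first branch should be able to answer lookups for unseen words such as "TestTest"; one loaded through the second branch will raise KeyError for them.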
@piskvorky did you find anything good to deal with OOV words?
@shadylpstan see the comments above. We also clarified the fastText loading instructions recently; check out the fastText module docstrings. CC @mpenkov.