Fasttext: Loading pre-trained .bin models for supervised training

Created on 25 Jul 2017 · 2 Comments · Source: facebookresearch/fastText

I'm having trouble loading the pre-trained Wikipedia word vector models in the .bin file format for supervised training.

Specifically,
./fasttext supervised -input fb_1.txt -output fb -pretrainedVectors wiki.id.bin -dim 300
yields:
Dimension of pretrained vectors does not match -dim option
Here's the fb_1.txt

I am using the 300-dimension vectors and have confirmed that the .vec format contains 300 dimensions. Does anyone know how to do this?
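For reference, here is a minimal sketch of one way to confirm the dimension of a .vec file. It relies on the text .vec format starting with a header line that holds the word count and the vector dimension. The helper names `read_vec_header` and `check_dim` are made up for illustration and are not part of fastText, and the tiny file written at the bottom is only a stand-in for a real download such as wiki.id.vec.

```python
import os
import tempfile


def read_vec_header(path):
    """Return (word_count, dim) parsed from the first line of a text .vec file."""
    with open(path, encoding="utf-8") as f:
        fields = f.readline().split()
    if len(fields) != 2:
        raise ValueError(f"unexpected header in {path!r}: {fields}")
    return int(fields[0]), int(fields[1])


def check_dim(path, expected_dim):
    """Return True when the header dimension equals the value passed as -dim."""
    _, dim = read_vec_header(path)
    return dim == expected_dim


if __name__ == "__main__":
    # Stand-in file so the sketch runs without downloading real vectors.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".vec", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write("2 3\nhello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n")
    print(read_vec_header(tmp.name))
    print(check_dim(tmp.name, 300))
    os.remove(tmp.name)
```

Note that this only reads the text .vec format. A .bin file is a binary model, so the same check does not apply to it.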

I'm also having trouble loading .bin word vector models that I trained myself using:
./fasttext skipgram -input fb_1_unlabeled.txt -output fb_1_unsup

The reason I want to do this is that, to my understanding, the .bin model contains sub-word information such as character n-grams and also model parameters to allow training continuation - all of which should help build a better classifier. Am I wrong?

rhezab

All 2 comments

I am new to fastText, but the docs and examples imply that you should pass a .vec file as -pretrainedVectors. I would also expect to be able to pass the embedding produced by fastText itself (unsupervised), but my guess is that supporting it would have complicated the implementation. I hope I am wrong. In any case, I am interested in how the vectors are used for classification: what happens to words that appear in a document but not in the .vec file?

patrickmesana on 27 Jul 2017

Yes, I think you just need to use the .vec file.

The example below trains OK for me:

./fasttext supervised \
  -pretrainedVectors mj/corpus/wiki.zh.vec \
  -input mj/corpus/faq.train \
  -dim 300 \
  -output mj/corpus/faq.model

However, it seems to be taking a long while :)

dcsan on 13 Aug 2017
