fastText: supervised classification with pretrained vectors

Created on 19 Jun 2017 · 6 Comments · Source: facebookresearch/fastText

I am currently trying to train a classifier using pretrained vectors. I am basically using the command:

./fasttext supervised -input <path-to-input-file> -output <path-to-output-file> -pretrainedVectors <path-to-pretrained-vectors-file>

Unfortunately, I always get the following error message:

Read 0M words
Number of words:  13564
Number of labels: 2
Dimension of pretrained vectors does not match -dim option

Even when I explicitly set the dimension to the size of my pretrained vectors (using the "-dim" option), I still get the same error message. Have I misunderstood anything? Thank you.

All 6 comments

Just to clarify: I am using embeddings that I trained with Word2Vec. The embeddings file is in the non-binary (text) format.

What is the dim of your pretrained vector file?

400 dimensions

Is that a .bin file?

The file is not a binary file. It is a text file which is human readable.

I found the mistake: my vectors file was missing the first line of the Word2Vec text format, which states the size of the vocabulary and the number of dimensions. With that header line included, fastText accepts the file.
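
For anyone hitting the same error, here is a minimal sketch of how the missing header line could be added. It assumes the vectors file is plain text with one word per line followed by its space-separated values, and that no word contains whitespace. The function name and the file names in the usage comment are hypothetical, not part of fastText or Word2Vec.

```python
def add_word2vec_header(src_path, dst_path):
    """Copy a headerless text vectors file, prepending '<vocab_size> <dim>'.

    Assumes each non-empty line is: word value_1 value_2 ... value_dim
    Returns (vocab_size, dim).
    """
    vocab_size, dim = 0, None

    # First pass: count the vectors and check that they all have the same length.
    with open(src_path, encoding="utf-8") as src:
        for line in src:
            fields = line.split()
            if not fields:
                continue  # ignore blank lines
            if dim is None:
                dim = len(fields) - 1  # first field is the word itself
            elif len(fields) - 1 != dim:
                raise ValueError(
                    f"inconsistent vector length at entry {vocab_size + 1}"
                )
            vocab_size += 1

    if dim is None:
        raise ValueError("no vectors found in input file")

    # Second pass: write the header, then copy the vectors unchanged.
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(f"{vocab_size} {dim}\n")
        for line in src:
            if line.strip():
                dst.write(line if line.endswith("\n") else line + "\n")

    return vocab_size, dim


# Hypothetical usage:
# vocab_size, dim = add_word2vec_header("vectors.txt", "vectors_with_header.vec")
```

The returned dim should then be the value passed to "-dim" together with "-pretrainedVectors" (400 in my case), since the default dimension is 100.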
