Fasttext: Program outputs: `Model file has wrong file format!'

Created on 5 May 2017 · 15 Comments · Source: facebookresearch/fastText

Obviously you have changed the file format with the recent commits.

Is there a way to port my model which I trained with an older version of the code?

Thanks

onurgu

👍1

All 15 comments

Hey @onurgu,

Thank you for your post. We anticipated this, but as a first step decided that it's not entirely unreasonable to ask users to try retraining their models first. The reasoning was the general performance of the library and the generally short training times. Please let me know if there are significant hurdles in retraining your model, and we might be able to work out something else.

Thank you,
Christian

cpuhrsch on 5 May 2017

Thank you, as you suggested I just restarted the training.

It is not a big problem as one epoch takes 30-60 mins, however people with bigger corpora could have trouble. Nevertheless I believe those would then decide to write the transformation code themselves as I guess it would take less time :)

Cheers

onurgu on 5 May 2017

I just got bitten by this.

You guys are doing a great job with this library, but a bit of warning might have been nice. Or at least a message saying why the model was the wrong format.

jazoom on 14 May 2017

Do you know the latest commit where this doesn't occur?

joshmalina on 16 May 2017

Idea: At least improve the error message saying something like "Model was generated with an older version. Please retrain with current version."

JensRantil on 19 May 2017

👍8

Can somebody give an example of how to retrain a model with "-retrain" parameter?

oldmonk101 on 23 May 2017

@oldmonk101 You might want to check this paper, fastText.zip, to get a more detailed idea of the -retrain parameter.

spate141 on 23 May 2017

👍2

We're aware of the pain this has caused for some of our users and will improve upon it in future releases. I'm closing this task now; please feel free to reopen it if this continues to be an issue for you.

Thanks,
Christian

cpuhrsch on 4 Jul 2017

@cpuhrsch I get this error with the new pre-trained model: lid.176.bin

curl https://s3-us-west-1.amazonaws.com/fasttext-vectors/supervised_models/lid.176.bin -o /root/lid.176.bin 
cd fastText
make clean && make
./fasttext predict-prob /root/lid.176.bin - 2
Model file has wrong file format!

loretoparisi on 10 Oct 2017

@loretoparisi I think it's because of this new "subwords for supervised models" update. The new lid.176.bin model was created with that code.

spate141 on 10 Oct 2017

@spate141 ah yes thanks! I have seen that update! Do I have to update the repo and make the binary only?

loretoparisi on 10 Oct 2017

@spate141 yes and it was that one!!!

__label__en 0.656128 __label__zh 0.0380391

Thank you!

loretoparisi on 10 Oct 2017

👍1

I would like to use the pretrained English word vectors from the fastText website. I ran the following command: ./fasttext print-word-vectors wiki-news-300d-1M-subword.vec < queries.txt and received the `Model file has wrong file format!' error. I tried the command again after changing the vector file extension from vec to txt, but I received the same error. How can I use the pretrained English vectors?

isspek on 5 Dec 2017

👍3

Same issue as @isspek, except I'm getting the error in fastText.load_model, when trying to read in wiki-news-300d-1M.vec

joshi-sh on 19 Mar 2019

👍1

I'm also having the same problem as @joshi-sh. Are there retrained versions of these datasets? If not, how could we go about retraining them? (I'm not very familiar with the training portion of the library.)
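
A note on the .vec questions above: as far as I know, the .vec downloads are plain-text vector files, not binary models, which would explain why print-word-vectors and load_model reject them (both expect a .bin model), and why renaming the file to .txt changes nothing. Below is a minimal sketch of reading the text format directly in Python. It assumes the layout described on the fastText website: a header line with the word count and the dimension, then one line per word with the token followed by its values. The function name read_vec and the sample data are made up for illustration and are not part of fastText.

```python
import io
from typing import Dict, List, TextIO, Tuple


def read_vec(handle: TextIO) -> Tuple[int, Dict[str, List[float]]]:
    """Parse word vectors from the plain-text .vec layout.

    Assumed layout: a header line "<word count> <dimension>", then one
    line per word: the token followed by <dimension> space-separated floats.
    """
    header = handle.readline().split()
    if len(header) != 2:
        raise ValueError("not a .vec file: header must be '<count> <dim>'")
    # int() raises ValueError if the header is not numeric (e.g. a binary file).
    _count, dim = int(header[0]), int(header[1])
    vectors: Dict[str, List[float]] = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return dim, vectors


# Tiny in-memory stand-in for a real file such as wiki-news-300d-1M.vec.
sample = io.StringIO("2 3\nhello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n")
dim, vectors = read_vec(sample)
print(dim, vectors["world"])
```

For a real file, something like open(path, encoding="utf-8", errors="ignore") should work as the handle. Note that this only gives you vectors for words present in the file; getting vectors for out-of-vocabulary words needs a .bin model.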