Obviously you have changed the file format with the recent commits.
Is there a way to port my model which I trained with an older version of the code?
Thanks
Hey @onurgu,
Thank you for your post. We anticipated this but, as a first step, decided that it is not unreasonable to ask users to try retraining their models first. The reasoning was the library's overall performance and its generally short training times. Please let me know if you run into significant hurdles when retraining your model, and we might be able to work out something else.
Thank you,
Christian
Thank you, as you suggested I just restarted the training.
It is not a big problem, as one epoch takes 30-60 minutes; however, people with bigger corpora could have trouble. Then again, I believe they would decide to write the conversion code themselves, as I guess that would take less time than retraining :)
Cheers
I just got bitten by this.
You guys are doing a great job with this library, but a bit of warning might have been nice. Or at least a message saying why the model was the wrong format.
Do you know the latest commit where this doesn't occur?
Idea: At least improve the error message saying something like "Model was generated with an older version. Please retrain with current version."
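A rough sketch of what such a diagnostic could look like, done outside the library in Python. It assumes that a current binary model starts with a 32-bit magic number followed by a 32-bit format version, written little-endian; the two constants below are assumptions and should be checked against `fasttext.cc` in your checkout. The function name `describe_model_file` is hypothetical and is not part of fastText.

```python
import struct

# Assumed header constants; verify both against fasttext.cc before relying on them.
ASSUMED_MAGIC = 793712314
ASSUMED_MAX_VERSION = 12


def describe_model_file(path):
    """Return a human-readable diagnosis of a candidate fastText model file."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8:
        return "File is too short to be a binary model."
    # Two little-endian signed 32-bit integers: magic number, then format version.
    magic, version = struct.unpack("<ii", header)
    if magic != ASSUMED_MAGIC:
        return ("No model header found. The model was probably generated with an "
                "older version, or this is a text .vec file rather than a .bin model.")
    if version > ASSUMED_MAX_VERSION:
        return ("Model format version %d is newer than expected. "
                "Update the repository and rebuild the binary." % version)
    return "Header looks valid (format version %d)." % version
```

Running this on a model before calling `./fasttext` would at least tell you whether retraining or rebuilding is the likely fix.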
Can somebody give an example of how to retrain a model with "-retrain" parameter?
@oldmonk101 You might want to check the fastText.zip paper to get a more detailed idea of the -retrain parameter.
We're aware of the pain this has caused some of our users and will improve on it in future releases. I'm closing this task now; please feel free to reopen it if this continues to be an issue for you.
Thanks,
Christian
@cpuhrsch I get this error with the new pre-trained model: lid.176.bin
curl https://s3-us-west-1.amazonaws.com/fasttext-vectors/supervised_models/lid.176.bin -o /root/lid.176.bin
cd fastText
make clean & make
./fasttext predict-prob /root/lid.176.bin - 2
Model file has wrong file format!
@loretoparisi I think it's because of this new subwords for supervised models update. The new lid.176.bin model was created with this code.
@spate141 ah yes, thanks! I have seen that update! Do I only have to update the repo and rebuild the binary?
@spate141 yes and it was that one!!!
__label__en 0.656128 __label__zh 0.0380391
Thank you!
I would like to use the pretrained English word vectors provided on the fastText website. I ran the following command: ./fasttext print-word-vectors wiki-news-300d-1M-subword.vec < queries.txt , and received a `Model file has wrong file format!` error. I tried the command again after changing the vector file extension from vec to txt, but I received the same error. How can I use the pretrained English vectors?
Same issue as @isspek, except I'm getting the error in fastText.load_model, when trying to read in wiki-news-300d-1M.vec
I'm also having the same problem as @joshi-sh. Are there retrained versions of these datasets? If not, how could we go about retraining them? (I'm not very familiar with the training portion of the library.)
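On the .vec questions above: a .vec file is a plain-text export of the vectors, not a binary model, so commands that load a model (print-word-vectors, load_model) reject it regardless of the file extension. The text format can be read directly instead. Below is a minimal sketch, assuming the usual layout of a first line holding the word count and the dimension, followed by one word and its values per line. `load_vec` is a hypothetical helper, not a fastText function.

```python
import io


def load_vec(stream, limit=None):
    """Parse fastText's text vector format into a dict of word -> list of floats."""
    # Header line: "<number of words> <dimension>"
    count, dim = map(int, stream.readline().split())
    vectors = {}
    for line in stream:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dim:
            raise ValueError("unexpected number of values for word %r" % word)
        vectors[word] = [float(v) for v in values]
        # Optionally stop early; the full files are large.
        if limit is not None and len(vectors) >= limit:
            break
    return vectors


# Tiny in-memory stand-in for a real .vec file.
sample = io.StringIO("2 3\nhello 0.1 0.2 0.3 \nworld 0.4 0.5 0.6 \n")
vectors = load_vec(sample)
print(vectors["world"])  # [0.4, 0.5, 0.6]
```

For a real file, something like `with open("wiki-news-300d-1M.vec", encoding="utf-8") as f: vectors = load_vec(f, limit=100000)` should work under the same assumptions. Note that a text file only covers the words it lists; getting vectors for out-of-vocabulary words from subword information needs a binary .bin model, where one is published.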