Fasttext: Updating the model (online-learning)

Created on 10 Oct 2017 · 13 Comments · Source: facebookresearch/fastText

Hi,
here is a simple question and/or feature request: would it be possible to progressively update a model (trained via skipgram or cbow) by passing new text to fastText?

That would be very useful for online learning on a text stream.

Thanks.
PS: this may be similar to #279. I could not find other open issues on this topic.

mino98

👍9

All 13 comments

Also interested in this. Please let us know if you find a way to avoid retraining from scratch every time. Maybe a dupe of #312?

dcsan on 13 Oct 2017

👍2

Issues #279 and #312 seem to be duplicates of this.
Apparently, @elbamos solved it in #191 but is waiting for permission to share the code.

mino98 on 4 Nov 2017

I’m unlikely to get that permission at this point :( It was not terribly hard to do though, really a day’s work once you’re familiar with the code.

elbamos on 4 Nov 2017

Thanks anyway @elbamos. Could you at least share some pointers?
Also, were you expanding the dictionary when continuing training with new text?

mino98 on 4 Nov 2017

👍3

@elbamos @mino98 this is why it makes little sense to modify this library (which is provided under a license) and not send a PR: the resulting code is proprietary to a company and cannot be released or used because of "another" license. In that case, it is better not to share the implementation at all (and I'm sorry about this, of course)! Hopefully this feature will become a priority some day.

loretoparisi on 9 Nov 2017

Sure @loretoparisi.
If anyone has good starting pointers to share, I might have a go myself.

mino98 on 9 Nov 2017

❤1 👍1

Did anyone find a way around this?

rajivgrover009 on 19 Dec 2017

👍2

Here are some good places to consult when working out this enhancement:

→ loading an existing model
→ the entry point into the training of the embedding (unsupervised training in the docs' lingo)
→ the C++ code ultimately called from there to kick off each epoch of training
→ the actual training is kicked off here, per the training kind indicated for the run, where it diverges between supervised learning (the classification task) and the two flavors of unsupervised learning (word embedding generation), cbow and skipgram.
→ the gory math, which my similar use case requires tinkering with, resides in and around here, by the way.

Give or take some details and some unknowns, all it takes is adding an option to the executable so that it loads an existing model and then carries on learning the word embedding.

That said, online learning is not guaranteed to work by just iterating an existing offline-trained model with the same optimizer and/or same hyperparameters.
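
To make the idea concrete, here is a minimal sketch of "load an existing model, then carry on learning" on new text. It is a toy skip-gram with negative sampling in plain Python, not fastText's code: there are no subword n-grams, no frequency-based negative table, no learning-rate decay, and every name in it (`expand_vocab`, `sgns_step`, `continue_training`, the dict-based model layout) is hypothetical. It only shows the two ingredients discussed in this thread: growing the dictionary for unseen words, and continuing SGD from the existing vectors, here with a smaller learning rate for the second batch as one possible (untuned) choice.

```python
import math
import random


def sigmoid(x):
    # Clamp the score so math.exp cannot overflow.
    x = max(-30.0, min(30.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def expand_vocab(model, tokens, rng):
    """Add unseen words to the existing tables; known words keep their vectors."""
    dim = model["dim"]
    for word in tokens:
        if word not in model["in"]:
            model["in"][word] = [(rng.random() - 0.5) / dim for _ in range(dim)]
            model["out"][word] = [0.0] * dim


def sgns_step(model, center, target, label, lr):
    """One SGD step on a (center, target) pair; label is 1 (observed) or 0 (negative)."""
    v, u = model["in"][center], model["out"][target]
    score = sum(a * b for a, b in zip(v, u))
    g = lr * (label - sigmoid(score))  # gradient of the log-likelihood w.r.t. the score
    for i in range(len(v)):
        v_i = v[i]
        v[i] += g * u[i]
        u[i] += g * v_i


def continue_training(model, tokens, lr=0.025, window=2, negatives=3, epochs=5, seed=0):
    """Continue skip-gram training of `model` (modified in place) on new tokens."""
    rng = random.Random(seed)
    expand_vocab(model, tokens, rng)
    vocab = list(model["in"])
    for _ in range(epochs):
        for pos, center in enumerate(tokens):
            lo, hi = max(0, pos - window), min(len(tokens), pos + window + 1)
            for context in tokens[lo:pos] + tokens[pos + 1:hi]:
                sgns_step(model, center, context, 1, lr)
                for _ in range(negatives):
                    negative = rng.choice(vocab)  # uniform sampling, for simplicity
                    if negative != context:
                        sgns_step(model, center, negative, 0, lr)
    return model


# Initial "offline" training on a first batch of text.
model = {"dim": 8, "in": {}, "out": {}}
continue_training(model, "the cat sat on the mat".split())
cat_before = list(model["in"]["cat"])

# Later: new text arrives; the dictionary grows and training resumes.
continue_training(model, "the dog sat on the rug".split(), lr=0.01)
```

In this sketch, "cat" does not occur in the new text, so its input vector is left untouched by the second call, while "dog" and "rug" get fresh vectors. Whether continued training like this gives good embeddings on a real stream is exactly the open question raised above.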

matanster on 3 Jan 2018

./fasttext supervised -input train.txt -output model -inputModel model.bin -incr

lucky050619 on 24 Apr 2018

👍5 ❤2 🎉2 😄1

@balaram-bhukya that is amazing! It comes from this PR: https://github.com/facebookresearch/fastText/pull/423
Can't wait to see it working.

loretoparisi on 24 Apr 2018

@loretoparisi Me too. Has anyone tried this in Python? Any luck?

lucky050619 on 7 May 2018

👍1

@balaram-bhukya not sure, also there are some issues right now in the PR on the dictionary size...

loretoparisi on 7 May 2018

./fasttext supervised -input train.txt -output model -inputModel model.bin -incr

May I ask how it is going? Did it work? I read the full documentation and did not see -incr there.

Thank you.

alucard001 on 29 Dec 2018
