Fasttext: Advantage of using pre-trained word vectors

Created on 1 Aug 2017 · 4 Comments · Source: facebookresearch/fastText

Can anyone explain the advantage of using pre-trained vectors as input to fastText?
Does the length of the feature vector still equal the vocabulary size, or does it become equal to the vector dimension?

omerarshad

Most helpful comment

@borissmidt any idea how fastText skipgram/cbow vectors can be trained incrementally, i.e. by continuing from an existing pretrained fastText model?

ankitarya10 on 14 Aug 2017

👍13

All 4 comments

Hey omerarshad,
The reason to use a pretrained model is that it can shorten your own training time or give you a better-quality word vector model.
For example, Google has a vector model trained on a Google News corpus containing several billion words. If you want to use that model on a different type of text, it can be useful to retrain it on that text to get a vector model adapted to it.

Another use case is classification: you train a vector model once and then reuse it to train several classification models.
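
To make the "train once, reuse for several classifiers" idea concrete, here is a minimal pure-Python sketch. It is not fastText's own code: the `PRETRAINED` table, its values, and the `sentence_features` helper are all made up for illustration. It only shows the general idea that one shared set of word vectors can turn any text into a fixed-size input, and that in this sketch the input length equals the vector dimension rather than the vocabulary size.

```python
# Hypothetical toy "pretrained" vectors with dim = 3.
# Real pretrained models typically use a few hundred dimensions.
PRETRAINED = {
    "good": [0.9, 0.1, 0.0],
    "bad": [-0.8, 0.2, 0.1],
    "movie": [0.0, 0.7, 0.3],
    "phone": [0.1, 0.0, 0.9],
}
DIM = 3


def sentence_features(text, vectors=PRETRAINED, dim=DIM):
    """Average the vectors of the known words in `text`.

    Words missing from `vectors` are skipped; if no word is known,
    a zero vector of length `dim` is returned.
    """
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return [0.0] * dim
    # zip(*known) groups the values of each dimension across all words
    return [sum(column) / len(known) for column in zip(*known)]


# The same vectors feed two unrelated (hypothetical) classification tasks.
sentiment_input = sentence_features("good movie")
topic_input = sentence_features("bad phone with a cracked screen")

# Both inputs have length DIM, regardless of how many words are known.
print(len(sentiment_input), len(topic_input))
```

Each downstream classifier would then be trained on these fixed-size inputs, so the expensive vector training only has to happen once.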

But there are probably other reasons that I do not know of.

borissmidt on 7 Aug 2017

👍2

Usually the word vectors (embeddings) are trained on a much larger data set than just the sentences you are using to classify. I've heard people call this the "background vector".