fastText: Does text classification use the word embedding part?

Created on 26 Mar 2018 · 7 Comments · Source: facebookresearch/fastText

As the title says, I wonder whether
./fasttext supervised
uses fastText word embeddings such as skip-gram or CBOW when training the classifier.
Thanks :D

ss87021456

Most helpful comment

Hi @Bishnukuet,
With supervised training, the word representation is neither skipgram nor cbow. The word vectors you obtain are tailored to the classification task you are training on.

Consider these tasks:
1) given a word, predict which other words should appear around it
2) given a sentence with a missing word, find the missing word
3) given a sentence, tell which label corresponds to this sentence

Each task has its own model: 1) skipgram, 2) cbow, 3) classification.
For each of them, you get word vectors in the latent space that best fit the task you are working on.

For 1) and 2), two words that appear in the same contexts will have vector representations close to each other. For 3), words that are most discriminative for a given label will be close to each other.

Best regards,
Onur

Celebio on 24 Apr 2019

👍10 🚀2
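
To make the three tasks above concrete, here is a minimal sketch of what a single training example looks like for each of them. It is a simplification and not fastText's actual code: the function names and the fixed `window` parameter are assumptions made for illustration, and the real tool also does things like random window sizes, subsampling, and n-gram features. The only fastText convention relied on is the default `__label__` prefix in supervised training files.

```python
def skipgram_pairs(tokens, window=2):
    """Task 1: (center word, one context word) pairs."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """Task 2: (surrounding words, missing word) examples."""
    examples = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = tokens[lo:i] + tokens[i + 1:hi]
        examples.append((context, target))
    return examples


def supervised_example(line, label_prefix="__label__"):
    """Task 3: (words of the sentence, labels of the sentence)."""
    tokens = line.split()
    labels = [t for t in tokens if t.startswith(label_prefix)]
    words = [t for t in tokens if not t.startswith(label_prefix)]
    return words, labels


tokens = "the cat sat".split()
print(skipgram_pairs(tokens, window=1))
# [('the', 'cat'), ('cat', 'the'), ('cat', 'sat'), ('sat', 'cat')]
print(cbow_examples(tokens, window=1))
# [(['cat'], 'the'), (['the', 'sat'], 'cat'), (['cat'], 'sat')]
print(supervised_example("__label__positive great movie"))
# (['great', 'movie'], ['__label__positive'])
```

In the first two cases the prediction target is another word, so the vectors end up encoding shared context. In the third case the target is a label, so the vectors end up encoding whatever helps separate the labels.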

All 7 comments

Yes, if you include the flag that tells it where the pretrained embedding file is. Search the docs or --help for that kind of fairly standard usage. If you don't provide such a file of ready-made word embeddings, then unless your supervised training data is huge (and possibly even then), your classification results are likely to be poor.

matanster on 1 Apr 2018

Thanks! I figured this out recently.

ss87021456 on 3 Apr 2018

@ss87021456 Can you tell me how you passed a pre-trained word embedding model as input for classification?
Did you use this command: ./fasttext supervised -pretrainedVectors ../cc.gh.300.vec ...

ayushbits on 31 May 2018
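
One detail that is easy to miss with -pretrainedVectors: the -dim value has to match the dimension of the vectors in the .vec file (the default -dim is 100, while the file named above appears to hold 300-dimensional vectors). A .vec text file starts with a header line of the form "word_count dim". The sketch below reads that header and assembles a matching command line. The helper names and the toy file are made up for illustration; the flags -input, -output, -pretrainedVectors and -dim are the ones the fastText CLI provides.

```python
import shlex
import tempfile
from pathlib import Path


def read_vec_header(vec_path):
    """Return (word_count, dim) from the first line of a .vec text file."""
    with open(vec_path, encoding="utf-8") as f:
        count, dim = f.readline().split()
    return int(count), int(dim)


def supervised_command(train_file, output_prefix, vec_path):
    """Build a `fasttext supervised` command whose -dim matches the vectors."""
    _, dim = read_vec_header(vec_path)
    return [
        "./fasttext", "supervised",
        "-input", train_file,
        "-output", output_prefix,
        "-pretrainedVectors", str(vec_path),
        "-dim", str(dim),
    ]


# Demo with a tiny hand-written .vec file (2 words, 3 dimensions).
with tempfile.TemporaryDirectory() as tmp:
    toy_vec = Path(tmp) / "toy.vec"
    toy_vec.write_text(
        "2 3\nhello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n", encoding="utf-8"
    )
    cmd = supervised_command("train.txt", "model", toy_vec)
    # Print the command instead of running it; train.txt is hypothetical.
    print(shlex.join(cmd))
```

The command is only printed here. To actually train, it could be handed to `subprocess.run(cmd, check=True)` once a real training file and vector file exist.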

Hi, could you please tell me which model (Skip-gram or CBOW) is used when using supervised training of FastText?

Bishnukuet on 23 Apr 2019

Hi,
Thanks for the information, it's really good to know.
Could you explain how to get the word embedding for each word after supervised training? I trained a supervised model on my own dataset and ended up with a model.bin. I would like to know how to extract the word embeddings from this model.
Thanks a lot in advance.

Best regards,
Jun
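
For the question above, one possible approach is sketched here, assuming the official `fasttext` Python package is installed and that model.bin is the file produced by supervised training. `load_model`, `get_words` and `get_word_vector` are functions of that package; the wrapper `word_vectors` and the `cosine` helper are illustrative names, not part of fastText. On the command line, `./fasttext print-word-vectors model.bin < queries.txt` should give similar output for the words listed in a (hypothetical) queries.txt.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as sequences of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def word_vectors(model_path, words=None):
    """Map each word to its vector from a trained fastText .bin model.

    If `words` is None, every word in the model's vocabulary is used.
    """
    import fasttext  # imported lazily so the helper above works without it

    model = fasttext.load_model(model_path)
    if words is None:
        words = model.get_words()
    return {w: model.get_word_vector(w).tolist() for w in words}


# Example usage (not executed here; needs a real model.bin):
# vectors = word_vectors("model.bin", ["great", "terrible"])
# print(cosine(vectors["great"], vectors["terrible"]))
```

Since these vectors were learned for classification, nearness between two words reflects how similarly they affect the labels rather than how similar their contexts are.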