From going through the code I have the impression that the fastText embeddings are not updated during training. I also compared the vector of a token in the original fastText file with the one inside a trained model, and they have the same values.
But, just to confirm: are the fastText word embeddings updated during training?
As far as I've understood, the char-LM embeddings are never updated either: they are generated on the fly from the characters of a word and used as-is in the BiLSTM+CRF architecture.
So, the only parameters updated during training are the weights of the BiLSTM+CRF architecture, correct?
Hi @davidsbatista, please excuse the late reply - I guess this one somehow fell through the cracks. I believe this is a duplicate of #632: in short, the embeddings themselves are fixed, but there is a linear layer on top of each embedding that learns an updated representation of each word before it gets passed into the sequence labeling RNN. So the weights of this linear layer and the BiLSTM get updated during training. Hope this clarifies things!
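For anyone reading along, here is a minimal PyTorch sketch of that pattern. This is not Flair's actual implementation; the class and parameter names are made up for illustration. The idea is that the pretrained vectors are loaded frozen, a trainable linear layer reprojects them, and only the reprojection, BiLSTM, and output layers receive gradient updates:

```python
import torch
import torch.nn as nn

class FrozenEmbeddingTagger(nn.Module):
    """Illustrative sketch (not Flair's code): frozen pretrained embeddings,
    a trainable linear reprojection, and a BiLSTM encoder on top."""

    def __init__(self, pretrained_vectors: torch.Tensor, hidden_size: int, num_tags: int):
        super().__init__()
        # freeze=True keeps the pretrained (e.g. fastText) vectors fixed;
        # they receive no gradient updates during training
        self.embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)
        emb_dim = pretrained_vectors.size(1)
        # trainable linear layer that learns an updated representation
        # of each word before it enters the sequence-labeling RNN
        self.reproject = nn.Linear(emb_dim, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, hidden_size, bidirectional=True, batch_first=True)
        self.to_tags = nn.Linear(2 * hidden_size, num_tags)  # CRF layer omitted for brevity

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        embedded = self.embedding(token_ids)    # fixed pretrained vectors
        reprojected = self.reproject(embedded)  # learned transformation
        encoded, _ = self.bilstm(reprojected)
        return self.to_tags(encoded)            # emission scores per token

# Dummy vocabulary of 1000 words with 100-dim vectors, purely for illustration
model = FrozenEmbeddingTagger(torch.randn(1000, 100), hidden_size=128, num_tags=5)
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)  # embedding.weight is absent because it is frozen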
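```

Printing the trainable parameter names shows that `embedding.weight` is absent, which matches the observation above that the fastText vectors inside a trained model are unchanged.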
Hi @alanakbik, it does clarify things, thanks. I also read the related issue; it's much clearer now. I just need to go through the code myself to better understand the forward() function.
Ok great - will close the issue but feel free to reopen if you have more questions / comments.