Did you try using facebookresearch/fastText instead of GloVe? If so, what are the conclusions?
It makes sense that you would have, given that InferSent is a facebookresearch project.
That's a very good point. The only reasons I considered GloVe in InferSent (so far) are that fastText pre-trained embeddings were not available when I started this project, and that I wanted to be comparable to the literature that had been using GloVe for SNLI.
From what I know, fastText embeddings have proven to be better than GloVe embeddings on several downstream tasks, and they also provide embeddings for any OOV word (using character n-grams). So we could expect a gain in performance and better coverage from using them (a quick OOV sketch is included below).
The sentence model here has much more power than the word embeddings though, so I would expect the gain to be relatively small. This is something I will try in the near future; I'll let you know when I know more.
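To illustrate the OOV point above, here is a minimal sketch using the `fasttext` Python package and a locally downloaded subword-aware binary model (the `cc.en.300.bin` filename and the queried word are just examples):

```python
import fasttext

# Assumes a subword-aware binary model (e.g. cc.en.300.bin) has been
# downloaded; the plain .vec text files do NOT carry the n-gram information.
ft = fasttext.load_model('cc.en.300.bin')

# A made-up word that is very unlikely to be in the training vocabulary:
# fastText still returns a 300-d vector built from its character n-grams.
vec = ft.get_word_vector('embeddingz')
print(vec.shape)  # (300,)
```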
So... any updates on using fastText? It is clearly the way forward.
Instead of loading GloVe, is it possible to have a similar function that loads fastText embeddings?
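There is no such function in the repo as far as I know, but the GloVe loader only needs a small tweak: the fastText `.vec` files use the same plain-text format with one extra header line. A rough sketch (the function name and the `crawl-300d-2M.vec` path are placeholders):

```python
import numpy as np

def load_fasttext_vec(path, word_dict):
    """Load fastText .vec embeddings, keeping only the words in `word_dict`.

    Identical to a GloVe loader except that .vec files start with a
    '<num_words> <dim>' header line that has to be skipped.
    """
    word_vec = {}
    with open(path, encoding='utf-8') as f:
        next(f)  # skip the '<num_words> <dim>' header line
        for line in f:
            word, vec = line.rstrip().split(' ', 1)
            if word in word_dict:
                word_vec[word] = np.array(vec.split(), dtype=np.float32)
    return word_vec

# word_vec = load_fasttext_vec('crawl-300d-2M.vec', word_dict)
```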
If anyone has tried using the latest fastText Common Crawl embeddings (600B tokens) in place of the GloVe embeddings, please let us know if you found improved results.
Yes, this will definitely work better with the latest fastText embeddings. See e.g. https://arxiv.org/abs/1804.07983, Table 1: compare BiLSTM-Max F (fastText) and G (GloVe). It's a slightly different setup and uses different hyperparameters than InferSent, but it's the same architecture, so it should give you a decent idea of what kind of performance to expect.
That's great, thanks for clarifying!
The infersent2.pkl model is now trained with the latest fastText common-crawl word embeddings. Note, however, that these embeddings are not trained with character n-grams, as explained in https://github.com/facebookresearch/fastText/issues/428#issuecomment-365046063
A model using the n-gram-aware embeddings should come soon, but in the meantime you still have 2M words in the vocabulary!
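For anyone who wants to try it, here is a rough sketch of loading infersent2.pkl with the common-crawl vectors, following the pattern from the repo README (paths and parameter values are only an example):

```python
import torch
from models import InferSent

# Version 2 corresponds to the model trained on fastText crawl-300d-2M vectors.
params_model = {'bsize': 64, 'word_emb_dim': 300, 'enc_lstm_dim': 2048,
                'pool_type': 'max', 'dpout_model': 0.0, 'version': 2}
model = InferSent(params_model)
model.load_state_dict(torch.load('encoder/infersent2.pkl'))
model.set_w2v_path('fastText/crawl-300d-2M.vec')

# Build the vocabulary (top-K most frequent words) and encode sentences.
model.build_vocab_k_words(K=100000)
embeddings = model.encode(['A man is playing a guitar.'], tokenize=True)
```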
Thanks