Is there any reason why there are no BERT results on the SNLI dataset, but there are for MultiNLI?
Yeah, we didn't report on it because it's not part of the GLUE eval (and also because MultiNLI generally subsumes it). I think BERT-Base gets about 91% and BERT-Large gets about 92% on it, though.
I fine-tuned BERT-Base on SNLI and got 90.686% :)
I used a single GPU (GeForce GTX 1080, 11 GB RAM), and fine-tuning took approximately 5 days.
Awesome results!