BERT: what is synthetic self-training?

Created on 9 Mar 2019  ·  8 Comments  ·  Source: google-research/bert

I noticed that the best model for SQuAD 2.0 uses BERT + a synthetic self-training trick, but I can't find any description of it. Could anyone explain how this trick works?
Thanks!

All 8 comments

I have been wondering the same thing. I can't find any further details on how these methods are used with BERT.
Any details on this would be very helpful.

Same question.

And what about N-Gram Masking? Does anyone know?
Even the single model beats the others lol

Baidu recently open-sourced a model called ERNIE, which masks n-grams instead of 1-grams during pre-training. I think this is similar to BERT + N-Gram Masking.
https://github.com/PaddlePaddle/LARK/tree/develop/ERNIE
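
To make the idea concrete, here is a rough sketch of what masking n-grams rather than single tokens could look like. It is not the code behind the leaderboard entry or ERNIE; the function name `mask_ngrams` and its parameters (`mask_rate`, `max_n`, `seed`) are made up for illustration, and it skips BERT's 80/10/10 replacement rule and any entity or phrase detection.

```python
import random

MASK = "[MASK]"

def mask_ngrams(tokens, mask_rate=0.15, max_n=3, seed=None):
    """Mask contiguous spans of 1..max_n tokens (hypothetical helper).

    Returns (masked_tokens, labels). labels[i] holds the original token at
    masked positions and None elsewhere.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)
    # Number of tokens to mask: about mask_rate of the sequence, at least 1.
    budget = min(len(tokens), max(1, round(len(tokens) * mask_rate)))
    n_masked = 0
    attempts = 0
    while n_masked < budget and attempts < 10 * len(tokens):
        attempts += 1
        # Pick a span length, trimmed so we never exceed the budget.
        n = min(rng.randint(1, max_n), budget - n_masked)
        start = rng.randrange(0, len(tokens) - n + 1)
        span = range(start, start + n)
        # Skip spans that overlap an already masked position.
        if any(labels[i] is not None for i in span):
            continue
        for i in span:
            labels[i] = tokens[i]
            masked[i] = MASK
        n_masked += n
    return masked, labels

if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the lazy dog near the river bank today"
    masked, labels = mask_ngrams(sentence.split(), seed=0)
    print(" ".join(masked))
```

With `max_n=1` this reduces to ordinary single-token masking, so the only difference in the sketch is that neighbouring tokens get hidden together and the model cannot lean on the other half of a phrase to recover them.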

Is synthetic self-training in this repo?

What is N-Gram Masking? Would anyone care to translate the README in Baidu's repo into English?
