Recently Google updated their TF implementation (https://github.com/google-research/bert) with Whole Word Masking models, which mask all the wordpieces of a randomly chosen word at once instead of masking random wordpieces independently, and this results in a performance gain.
Just wondering if this will be implemented here?
Thanks.
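For context, here is a toy sketch of the difference (illustrative only, using the `##` wordpiece continuation convention; this is not Google's actual masking code):

```python
import random

# Toy sketch of whole-word masking: wordpieces starting with "##"
# continue the previous word, so whole-word masking must mask every
# piece of the chosen word together rather than pieces independently.
tokens = ["the", "phil", "##harmon", "##ic", "played", "loud", "##ly"]

def whole_word_spans(pieces):
    """Group wordpiece indices into whole-word spans."""
    spans = []
    for i, tok in enumerate(pieces):
        if tok.startswith("##") and spans:
            spans[-1].append(i)   # continuation piece joins the current word
        else:
            spans.append([i])     # a new word starts here
    return spans

def mask_one_whole_word(pieces):
    """Replace every piece of one randomly chosen word with [MASK]."""
    span = random.choice(whole_word_spans(pieces))
    return ["[MASK]" if i in span else t for i, t in enumerate(pieces)]

print(mask_one_whole_word(tokens))
# e.g. ['the', '[MASK]', '[MASK]', '[MASK]', 'played', 'loud', '##ly']
```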
It's not yet but thanks for the pointer, we can probably add it fairly easily. I'll have a look.
+10000 This would be very helpful!
Hi,
I converted the cased and uncased whole-word-masking models using the command-line tool. If you're interested in adding these to the repository, I've uploaded them to this Kaggle dataset.
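For anyone who wants to reproduce the conversion, it's the documented CLI, roughly like this (the paths are placeholders for wherever you extracted the TF checkpoint):

```bash
export BERT_DIR=/path/to/wwm_uncased_L-24_H-1024_A-16

pytorch_pretrained_bert convert_tf_checkpoint_to_pytorch \
  $BERT_DIR/bert_model.ckpt \
  $BERT_DIR/bert_config.json \
  $BERT_DIR/pytorch_model.bin
```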
Is this resolved? These seem to be available at head, and I don't see anything immediately wrong when I try them...
Yes, they are working fine; I added them to master last week.
They will be advertised in the next release.
When fine-tuned with run_squad they give pretty nice results: exact_match: 86.91, f1: 93.15.
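The fine-tuning invocation followed the usual run_squad example; something along these lines (the hyperparameters shown are indicative, not necessarily the exact run):

```bash
python examples/run_squad.py \
  --bert_model bert-large-uncased-whole-word-masking \
  --do_train \
  --do_predict \
  --do_lower_case \
  --train_file $SQUAD_DIR/train-v1.1.json \
  --predict_file $SQUAD_DIR/dev-v1.1.json \
  --learning_rate 3e-5 \
  --num_train_epochs 2 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir /tmp/wwm_squad/
```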
I've included a version fine-tuned on SQuAD as well.
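Loading them should then be as simple as this (shortcut names assumed to match what landed on master):

```python
from pytorch_pretrained_bert import BertTokenizer, BertForQuestionAnswering

# Shortcut names assumed from master; weights download on first use.
tokenizer = BertTokenizer.from_pretrained("bert-large-uncased-whole-word-masking")
model = BertForQuestionAnswering.from_pretrained(
    "bert-large-uncased-whole-word-masking-finetuned-squad"
)
model.eval()
```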
Does Whole Word Masking support BERT base as well?