Recently Google updated their TF implementation (https://github.com/google-research/bert) with Whole Word Masking models, which mask all the wordpieces of a randomly chosen word at once instead of masking random wordpieces independently, and this results in a performance gain.
Just wondering if this will be implemented here?
Thanks.
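For context, here is a toy sketch of the difference (illustrative only, using the `##` wordpiece continuation convention; this is not Google's actual masking code):

```python
import random

# Toy sketch of whole-word masking: wordpieces starting with "##"
# continue the previous word, so whole-word masking must mask every
# piece of the chosen word together rather than pieces independently.
tokens = ["the", "phil", "##harmon", "##ic", "played", "loud", "##ly"]

def whole_word_spans(pieces):
    """Group wordpiece indices into whole-word spans."""
    spans = []
    for i, tok in enumerate(pieces):
        if tok.startswith("##") and spans:
            spans[-1].append(i)   # continuation piece joins the current word
        else:
            spans.append([i])     # a new word starts here
    return spans

def mask_one_whole_word(pieces):
    """Replace every piece of one randomly chosen word with [MASK]."""
    span = random.choice(whole_word_spans(pieces))
    return ["[MASK]" if i in span else t for i, t in enumerate(pieces)]

print(mask_one_whole_word(tokens))
# e.g. ['the', '[MASK]', '[MASK]', '[MASK]', 'played', 'loud', '##ly']
```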
It's not yet but thanks for the pointer, we can probably add it fairly easily. I'll have a look.
+10000 This would be very helpful!
Hi,
I converted the cased and uncased whole-word-masking models using the command-line tool. If you're interested in adding these to the repository, I've uploaded them to this Kaggle dataset.
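For anyone who wants to reproduce the conversion, it's the documented CLI, roughly like this (the paths are placeholders for wherever you extracted the TF checkpoint):

```bash
export BERT_DIR=/path/to/wwm_uncased_L-24_H-1024_A-16

pytorch_pretrained_bert convert_tf_checkpoint_to_pytorch \
  $BERT_DIR/bert_model.ckpt \
  $BERT_DIR/bert_config.json \
  $BERT_DIR/pytorch_model.bin
```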
Is this resolved? These seem to be available at head, and I don't see anything immediately wrong when I try them...
Yes, they are working fine; I added them to master last week.
They will be advertised in the next release.
When fine-tuned with run_squad they give pretty nice results: exact_match: 86.91, f1: 93.15.
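The fine-tuning invocation followed the usual run_squad example; something along these lines (the hyperparameters shown are indicative, not necessarily the exact run):

```bash
python examples/run_squad.py \
  --bert_model bert-large-uncased-whole-word-masking \
  --do_train \
  --do_predict \
  --do_lower_case \
  --train_file $SQUAD_DIR/train-v1.1.json \
  --predict_file $SQUAD_DIR/dev-v1.1.json \
  --learning_rate 3e-5 \
  --num_train_epochs 2 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir /tmp/wwm_squad/
```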
I've included a version fine-tuned on SQuAD as well.
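Loading them should then be as simple as this (shortcut names assumed to match what landed on master):

```python
from pytorch_pretrained_bert import BertTokenizer, BertForQuestionAnswering

# Shortcut names assumed from master; weights download on first use.
tokenizer = BertTokenizer.from_pretrained("bert-large-uncased-whole-word-masking")
model = BertForQuestionAnswering.from_pretrained(
    "bert-large-uncased-whole-word-masking-finetuned-squad"
)
model.eval()
```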
Does Whole Word Masking support BERT base as well?