Hi,
I think the new SpanBERT model should also be supported in pytorch-transformers
😅
We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text.
The paper can be found here.
Model is currently not released yet, I'll update this issue here whenever the model is available :)
Are we going to get this? :) Thanks :)
Fyi https://github.com/mandarjoshi90/coref#pretrained-coreference-models describes how to obtain the coreference models that should contain SpanBERT.
@ArneBinder Thanks for that hint!
I downloaded the SpanBERT (base) model. Unfortunately, the TF checkpoint conversion throws the following error message:
INFO:pytorch_transformers.modeling_bert:Loading TF weight width_scores/output_weights/Adam_1 with shape [3000, 1]
INFO:pytorch_transformers.modeling_bert:Skipping antecedent_distance_emb
Traceback (most recent call last):
File "/usr/local/bin/pytorch_transformers", line 11, in <module>
load_entry_point('pytorch-transformers', 'console_scripts', 'pytorch_transformers')()
File "/mnt/pytorch-transformers/pytorch_transformers/__main__.py", line 30, in main
convert_tf_checkpoint_to_pytorch(TF_CHECKPOINT, TF_CONFIG, PYTORCH_DUMP_OUTPUT)
File "/mnt/pytorch-transformers/pytorch_transformers/convert_tf_checkpoint_to_pytorch.py", line 36, in convert_tf_checkpoint_to_pytorch
load_tf_weights_in_bert(model, config, tf_checkpoint_path)
File "/mnt/pytorch-transformers/pytorch_transformers/modeling_bert.py", line 111, in load_tf_weights_in_bert
assert pointer.shape == array.shape
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 591, in __getattr__
type(self).__name__, name))
AttributeError: 'BertForPreTraining' object has no attribute 'shape'
I think some variables must be skipped, so a debugging session is unavoidable 😅
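For reference, a minimal sketch of what that skipping might look like — the coref-specific names (`width_scores`, `antecedent_distance_emb`) come from the log above, but treating them (plus the Adam optimizer slots) as the complete skip list is an assumption on my part, not the official fix:

```python
# Sketch only: list the checkpoint variables and drop everything that is not
# part of the BERT encoder before handing the rest to load_tf_weights_in_bert.
# The skip patterns are an assumption based on the names in the log above.
import re
import tensorflow as tf

SKIP_PATTERNS = (
    r"Adam",                     # optimizer slots, e.g. .../Adam_1
    r"global_step",
    r"antecedent_distance_emb",  # coref-specific embedding (seen in the log)
    r"width_scores",             # coref-specific scoring head
)

def bert_only_variables(tf_checkpoint_path):
    """Yield (name, numpy array) for the encoder weights only."""
    for name, _shape in tf.train.list_variables(tf_checkpoint_path):
        if any(re.search(pattern, name) for pattern in SKIP_PATTERNS):
            continue
        yield name, tf.train.load_variable(tf_checkpoint_path, name)
```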
Hi @stefan-it, the SpanBERT authors shared their (~pytorch-transformers-compatible) weights with us, so if you'd be interested we can send them your way so you can experiment with/integrate them here.
Let me know!
@julien-c this would be awesome 🤗 I would really like to do some experiments (mainly NER and PoS tagging) - it would be great if you could share the weights (my mail is [email protected]) - thank you in advance :heart:
Hi @julien-c, I would also like to receive the SpanBERT pytorch-compatible weights for semantic tasks like coref. Could you send them to me too? My mail is [email protected]. Many thanks.
You can have a look here, the official implementation has just been released: https://github.com/facebookresearch/SpanBERT
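Since SpanBERT keeps the plain BERT architecture, the released weights should load with the standard pytorch-transformers classes once converted to HuggingFace format — a quick sketch, where the local directory name is a placeholder of my own:

```python
# Sketch: load SpanBERT like any BERT checkpoint. "./spanbert_hf_base" is a
# placeholder for wherever the converted weights (pytorch_model.bin +
# config.json) live; SpanBERT reuses BERT's cased vocabulary.
import torch
from pytorch_transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertModel.from_pretrained("./spanbert_hf_base")
model.eval()

tokens = tokenizer.tokenize("SpanBERT represents and predicts spans of text.")
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
with torch.no_grad():
    last_hidden_state = model(input_ids)[0]  # (1, seq_len, hidden_size)
```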
Well, two preliminary experiments (SpanBERT base) on CoNLL-2003 show a difference of ~7.8% compared to a BERT (base, cased) model 😱 So maybe this has something to do with the named entity masking 🤔 But I'll investigate that further this weekend...
Update on that: I tried SpanBERT for PoS tagging and the results are pretty close to DistilBERT. Here's one run over the Universal Dependencies v1.2 (a bare-bones setup sketch follows the table):
| Model | Dev | Test
| ---------------------------------------------------------- | --------- | ---------
| RoBERTa (large) | 97.80 | 97.75
| SpanBERT (large) | 96.48 | 96.61
| BERT (large, cased) | 97.35 | 97.20
| DistilBERT (uncased) | 96.64 | 96.70
| Plank et al. (2016)                                        | -         | 95.52
| Yasunaga et al. (2017)                                     | -         | 95.82
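For anyone wanting to reproduce something similar, a minimal token classification setup could look like this — the directory path and the label count (17 UD universal PoS tags) are my assumptions, and the actual training loop is omitted:

```python
# Sketch of a PoS tagging head on top of SpanBERT weights. num_labels=17
# matches the UD universal PoS tag set; "./spanbert_hf_base" is a placeholder.
import torch
from pytorch_transformers import BertForTokenClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForTokenClassification.from_pretrained("./spanbert_hf_base",
                                                   num_labels=17)

tokens = tokenizer.tokenize("SpanBERT tags parts of speech")
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
logits = model(input_ids)[0]           # (1, seq_len, num_labels)
predicted_tag_ids = logits.argmax(-1)  # per-subtoken tag indices
```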
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.