Transformers: About Summarization

Created on 11 Dec 2019 · 9 comments · Source: huggingface/transformers

โ“ Questions & Help


Thank you very much for your wonderful work. I found that some new code for summarization from the "Pretrained Encoders" paper has been added. However, I only see the evaluation part of the code. I want to ask whether you will also add the code for the training part. Thank you very much!

wontfix

All 9 comments

If you want to look at the source code used for training the model, you can check the original GitHub repository; in particular, see the src/train.py, src/train_abstractive.py, or src/train_extractive.py Python scripts.

@TheEdoardo93 Thank you for your reply. I know, but do you plan to integrate the original training code into transformers? It would be more convenient to use your transformers code for training.

At the moment, I think that it is not on the roadmap. Do you have a particular reason for asking to integrate the training algorithm into this library?

@TheEdoardo93 I think this is a good encoder-decoder framework based on BERT. In addition to the summarization task, it can also handle many other generation tasks. If the training code were integrated into this library, it could be used to fine-tune more downstream generation tasks. I think this library currently lacks downstream fine-tuning for NLG tasks, such as query generation, generative reading comprehension, and other summarization tasks.

Thanks for the help. How do I load the checkpoint model_step_20000.pt that was trained with src/train.py, to replace model = BertAbs.from_pretrained("bertabs-finetuned-cnndm")?

Hello! As far as I know, you can't load a PyTorch checkpoint _directly_ into the BertAbs model; you'll indeed get an error. A PyTorch checkpoint typically contains the model state dict. Therefore, you can try the following code for your task:

> import torch
> from transformers import BertTokenizer
> from modeling_bertabs import BertAbs
>
> # Tokenizer used by BertAbs
> tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', do_lower_case=True)
>
> # Build the pretrained architecture, then overwrite its weights with your own checkpoint
> model = BertAbs.from_pretrained('bertabs-finetuned-cnndm')
> model.load_state_dict(torch.load(PATH_TO_PT_CHECKPOINT))

where _PATH_TO_PT_CHECKPOINT_ could be e.g. _./input_checkpoints/model_step_20000.pt_.
N.B.: this code will only work if the architecture of the bertabs-finetuned-cnndm model matches the one saved in the checkpoint you're trying to load; otherwise an error occurs!
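
Note also that a checkpoint written by the original src/train.py may not be a bare state dict; depending on how that script saves it, the weights could be nested under a key such as 'model' (this is an assumption about that repository, not something verified here). A minimal sketch to inspect the file and unwrap the weights before loading:

```python
import torch

# Load on CPU first; map_location avoids GPU/CPU device mismatches.
checkpoint = torch.load(PATH_TO_PT_CHECKPOINT, map_location='cpu')
print(checkpoint.keys())  # inspect what the file actually contains

# If the weights turn out to be nested under a 'model' key (an assumption
# about the original training script), unwrap them before loading.
state_dict = checkpoint['model'] if 'model' in checkpoint else checkpoint
model.load_state_dict(state_dict)
```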

If this code doesn't work as expected, we can work together in order to solve your problem :)

It's important!! Please add it.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@TheEdoardo93 is there any way to load a pretrained model with a different architecture? I used the source library to train a model with a source embedding size of 1024 instead of the 512 used in the pretrained one, since 512 was too small for my data.
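
A possible workaround, sketched here only as an assumption (it is not an answer given in this thread), is to copy over just the checkpoint tensors whose names and shapes still match the pretrained architecture and leave the incompatible ones, such as the 1024-dimensional embeddings, as the model initialises them:

```python
import torch
from modeling_bertabs import BertAbs

# Build the 512-dim pretrained architecture.
model = BertAbs.from_pretrained('bertabs-finetuned-cnndm')

checkpoint = torch.load(PATH_TO_PT_CHECKPOINT, map_location='cpu')
state_dict = checkpoint.get('model', checkpoint)  # unwrap if nested (assumption, see above)

# Keep only parameters whose name and shape match the target model;
# everything else (e.g. the 1024-dim source embeddings) is skipped.
model_state = model.state_dict()
compatible = {k: v for k, v in state_dict.items()
              if k in model_state and v.shape == model_state[k].shape}
model_state.update(compatible)
model.load_state_dict(model_state)
```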
