Transformers: About Summarization

Created on 11 Dec 2019 · 9 comments · Source: huggingface/transformers

โ“ Questions & Help


Thank you very much for your wonderful work. I found that some new code for summarization from the "Pretrained Encoders" paper has been added. However, I only see the evaluation part of the code. I want to ask whether you will also add the code for the training part. Thank you very much!

wontfix

All 9 comments

If you want to look at the source code used for training the model, you can check the original GitHub repository; in particular, see the src/train.py, src/train_abstractive.py, or src/train_extractive.py Python scripts.

@TheEdoardo93 Thank you for your reply. I know, but do you plan to integrate the original training code into transformers? It would be more convenient to use your transformers code for training.

At the moment, I think that it is not on the roadmap. Do you have a particular reason for asking to integrate the training algorithm into this library?

@TheEdoardo93 I think this is a good encoder-decoder framework based on BERT. In addition to the summarization task, it can also handle many other generation tasks. If the training code were integrated into this library, it could be used to fine-tune more downstream generation tasks. I think this library currently lacks downstream fine-tuning for NLG tasks, such as query generation, generative reading comprehension, and other summarization tasks.

Thanks for the help. How do I load the checkpoint model_step_20000.pt that was trained with src/train.py, to replace model = BertAbs.from_pretrained("bertabs-finetuned-cnndm")?

Hello! As far as I know, you can't load a PyTorch checkpoint _directly_ into the BertAbs model; you'll indeed get an error. A PyTorch checkpoint typically contains the model state dict. Therefore, you can try the following code for your task:

> import torch
> from transformers import BertTokenizer
> from modeling_bertabs import BertAbs
>
> # Tokenizer used by BertAbs
> tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', do_lower_case=True)
>
> # Build the pretrained architecture, then overwrite its weights with your own checkpoint
> model = BertAbs.from_pretrained('bertabs-finetuned-cnndm')
> model.load_state_dict(torch.load(PATH_TO_PT_CHECKPOINT))

where _PATH_TO_PT_CHECKPOINT_ could be e.g. _./input_checkpoints/model_step_20000.pt_.
N.B.: this code will only work if the architecture of the bertabs-finetuned-cnndm model matches the one saved in the checkpoint you're trying to load; otherwise an error occurs!
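
Note also that a checkpoint written by the original src/train.py may not be a bare state dict; depending on how that script saves it, the weights could be nested under a key such as 'model' (this is an assumption about that repository, not something verified here). A minimal sketch to inspect the file and unwrap the weights before loading:

```python
import torch

# Load on CPU first; map_location avoids GPU/CPU device mismatches.
checkpoint = torch.load(PATH_TO_PT_CHECKPOINT, map_location='cpu')
print(checkpoint.keys())  # inspect what the file actually contains

# If the weights turn out to be nested under a 'model' key (an assumption
# about the original training script), unwrap them before loading.
state_dict = checkpoint['model'] if 'model' in checkpoint else checkpoint
model.load_state_dict(state_dict)
```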

If this code doesn't work as expected, we can work together in order to solve your problem :)

It's important!! Please add it.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@TheEdoardo93 is there any way to load a pretrained model with a different architecture? I used the source library to train a model with a source embedding size of 1024 instead of the 512 used in the pretrained one, since 512 was too small for my data.
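
A possible workaround, sketched here only as an assumption (it is not an answer given in this thread), is to copy over just the checkpoint tensors whose names and shapes still match the pretrained architecture and leave the incompatible ones, such as the 1024-dimensional embeddings, as the model initialises them:

```python
import torch
from modeling_bertabs import BertAbs

# Build the 512-dim pretrained architecture.
model = BertAbs.from_pretrained('bertabs-finetuned-cnndm')

checkpoint = torch.load(PATH_TO_PT_CHECKPOINT, map_location='cpu')
state_dict = checkpoint.get('model', checkpoint)  # unwrap if nested (assumption, see above)

# Keep only parameters whose name and shape match the target model;
# everything else (e.g. the 1024-dim source embeddings) is skipped.
model_state = model.state_dict()
compatible = {k: v for k, v in state_dict.items()
              if k in model_state and v.shape == model_state[k].shape}
model_state.update(compatible)
model.load_state_dict(model_state)
```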
