Bert: BERT for text summarization

Created on 11 Jan 2019 · 44 Comments · Source: google-research/bert

BERT is designed to solve 11 NLP problems, which include text summarization.

Is there any example of how we can use BERT to summarize a document? An approach would do, and example code would be really great.

Thanks in advance

All 44 comments

I am also interested in seeking a reply to the above question
https://github.com/google-research/bert/issues/352#issue-398233998
Kindly do reply
Thanks

I know the eleven tasks but wanted to know if anyone has used this for abstractive text summarization?

I did extractive summarization: after getting the sentence embeddings, I clustered them and took one sentence from each cluster.
What are the steps for abstractive summarization? Let me give it a try.
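
The clustering approach described above can be sketched roughly as follows. This is a minimal illustration, not the commenter's actual code: the sentence embeddings are assumed to have been computed already (for example from BERT), the toy 2-D vectors stand in for them, and the helper names (`kmeans`, `cluster_summary`, and so on) are made up for this example. It uses a plain k-means with farthest-point seeding and keeps the sentence closest to each centroid.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def farthest_point_seeds(vectors, k):
    """Deterministic seeding: start at the first vector, then keep adding the farthest one."""
    seeds = [vectors[0]]
    while len(seeds) < k:
        seeds.append(max(vectors, key=lambda v: min(squared_distance(v, s) for s in seeds)))
    return [list(s) for s in seeds]

def nearest_centroid(vector, centroids):
    return min(range(len(centroids)), key=lambda c: squared_distance(vector, centroids[c]))

def kmeans(vectors, k, iterations=20):
    centroids = farthest_point_seeds(vectors, k)
    for _ in range(iterations):
        assignments = [nearest_centroid(v, centroids) for v in vectors]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = mean_vector(members)
    assignments = [nearest_centroid(v, centroids) for v in vectors]
    return centroids, assignments

def cluster_summary(sentences, embeddings, k):
    """Pick the sentence nearest to each cluster centroid, returned in document order."""
    centroids, assignments = kmeans(embeddings, k)
    chosen = set()
    for c in range(k):
        members = [i for i, a in enumerate(assignments) if a == c]
        if members:
            chosen.add(min(members, key=lambda i: squared_distance(embeddings[i], centroids[c])))
    return [sentences[i] for i in sorted(chosen)]

# Toy data: the 2-D vectors are placeholders for real sentence embeddings.
sentences = [
    "BERT is a Transformer encoder.",
    "It is pretrained on unlabeled text.",
    "BERT can be fine-tuned for many tasks.",
    "The bucket was private.",
    "Anonymous callers got a 401.",
    "Writing to the bucket failed with a permission error.",
]
embeddings = [[1.0, 0.0], [0.8, 0.2], [0.9, 0.1], [0.0, 1.0], [0.2, 0.9], [0.1, 0.95]]

summary = cluster_summary(sentences, embeddings, k=2)
print(summary)
```

The number of clusters `k` controls the summary length; choosing it is left open here.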

https://github.com/santhoshkolloju/bert_summ
I have replaced the encoder with BERT and kept the Transformer decoder as it is. Let me know if it helps.

I think you are using Google Colab.
I want to run the same notebook on a local machine with a P4000 GPU with 8 GB RAM; it is modest, but I think it suffices for my requirements. However, I am unable to run it there.
Can you tell me how to work around this?
Thanks in advance

What error do you get? By default, Texar places all tensors on the GPU.

### While running this block i.e. the last block

# tx.utils.maybe_create_dir(model_dir)

logging_file = os.path.join(model_dir, 'logging.txt')

# Stray text followed the closing quote in the original paste: uncased_L-12_H-768_A-12/bert_model.ckpt
model_dir = "gs://bert_summ/models/"
logging_file = "logging.txt"
logger = utils.get_logger(logging_file)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    sess.run(tf.tables_initializer())

    # This is the call that fails in the traceback below.
    smry_writer = tf.summary.FileWriter(model_dir, graph=sess.graph)

    if run_mode == 'train_and_evaluate':
        logger.info('Begin running with train_and_evaluate mode')

        if tf.train.latest_checkpoint(model_dir) is not None:
            logger.info('Restore latest checkpoint in %s' % model_dir)
            saver.restore(sess, tf.train.latest_checkpoint(model_dir))

        iterator.initialize_dataset(sess)

        step = 5000
        for epoch in range(max_train_epoch):
            iterator.restart_dataset(sess, 'train')
            step = _train_epoch(sess, epoch, step, smry_writer)

    elif run_mode == 'test':
        logger.info('Begin running with test mode')

        logger.info('Restore latest checkpoint in %s' % model_dir)
        saver.restore(sess, tf.train.latest_checkpoint(model_dir))

        _eval_epoch(sess, 0, mode='test')

    else:
        raise ValueError('Unknown mode: {}'.format(run_mode))

### The error I am getting is:-


PermissionDeniedError Traceback (most recent call last)
in
10 sess.run(tf.tables_initializer())
11
---> 12 smry_writer = tf.summary.FileWriter(model_dir, graph=sess.graph)
13
14 if run_mode == 'train_and_evaluate':

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/summary/writer/writer.py in __init__(self, logdir, graph, max_queue, flush_secs, graph_def, filename_suffix)
350
351 event_writer = EventFileWriter(logdir, max_queue, flush_secs,
--> 352 filename_suffix)
353 super(FileWriter, self).__init__(event_writer, graph, graph_def)
354

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/summary/writer/event_file_writer.py in __init__(self, logdir, max_queue, flush_secs, filename_suffix)
65 self._logdir = logdir
66 if not gfile.IsDirectory(self._logdir):
---> 67 gfile.MakeDirs(self._logdir)
68 self._event_queue = six.moves.queue.Queue(max_queue)
69 self._ev_writer = pywrap_tensorflow.EventsWriter(

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py in recursive_create_dir(dirname)
372 """
373 with errors.raise_exception_on_not_ok_status() as status:
--> 374 pywrap_tensorflow.RecursivelyCreateDir(compat.as_bytes(dirname), status)
375
376

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
517 None, None,
518 compat.as_text(c_api.TF_Message(self.status.status)),
--> 519 c_api.TF_GetCode(self.status.status))
520 # Delete the underlying status object from memory otherwise it stays alive
521 # as there is a reference to status from this from the traceback due to

PermissionDeniedError: Error executing an HTTP request (HTTP response code 401, error code 0, error message ''), response '{
"error": {
"errors": [
{
"domain": "global",
"reason": "required",
"message": "Anonymous caller does not have storage.objects.get access to bert_summ/models/.",
"locationType": "header",
"location": "Authorization"
}
],
"code": 401,
"message": "Anonymous caller does not have storage.objects.get access to bert_summ/models/."
}
}
'
when reading metadata of gs://bert_summ/models/

I received the same error.

The problem is that the notebook writes to my Google Cloud Storage bucket, which you do not have access to. Please change the location to your local filesystem (replace all gs:// file paths with local paths).
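
One way to apply that advice is a small helper that rewrites any `gs://` path to a local directory before it is passed to the notebook code. This is only a sketch: the function name `to_local_path` and the default root `./bert_summ_local` are invented for illustration and are not part of the notebook.

```python
import os

def to_local_path(path, local_root="./bert_summ_local"):
    """Map gs://bucket/key to local_root/bucket/key; return other paths unchanged."""
    prefix = "gs://"
    if not path.startswith(prefix):
        return path
    parts = [p for p in path[len(prefix):].split("/") if p]
    return os.path.join(local_root, *parts)

model_dir = to_local_path("gs://bert_summ/models/")
# os.makedirs(model_dir, exist_ok=True)  # create the directory before using it as a log dir
print(model_dir)
```

The same helper can be applied to any other path in the notebook that points at the bucket.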

Do you have any examples of generated summaries?

I cannot share the results because it is my own data, but the results are good: the model was able to copy rare words as well. Initially I tried fine-tuning both the encoder (BERT) and the decoder, which disturbed the BERT weights. Then I froze the BERT weights and trained only the decoder.
That gave much more readable and grammatically correct sentences.

@santhoshkolloju, can you share your experience?
When I use your code, 'hypotheses' always has the same value for every reference.
For example:
references: ['do', 'n', "'", 't', 'wear', 'rings', 'when', 'working', 'on', 'engine', 'internal', '##s', '.', '[PAD]', '[PAD]', '[PAD]', ...]
hypotheses: ['do', 'n', "'", 't', 'try', 'to', 'do', 'n', "'", 't', 'mix', '.', '', '', '', ...]
references: ['broke', 'the', 'elevators', 'at', 'work', ',', 'basically', 'shot', 'myself', 'in', 'the', 'foot', 'in', 'doing', 'so', 'because', 'all', 'our', 'heavy', 'shit', 'is', 'downstairs', '.', '[PAD]', '[PAD]', '[PAD]', ...]
hypotheses: ['do', 'n', "'", 't', 'try', 'to', 'do', 'n', "'", 't', 'mix', '.', '', '', '', ...]

I used the TIFU dataset from the paper "Abstractive Summarization of Reddit Posts with Multi-level Memory Networks".

There was a problem. Freeze the BERT weights and run again:
call tf.trainable_variables(), exclude all the variables whose names start with "bert", and pass the remaining non-BERT variables to the optimizer.

@santhoshkolloju Sorry for my question...
I tried hard to freeze the weights, but I do not know what to do based on this code.
Could you suggest how to freeze them?
I tried to modify run_pretraining.py to use export_savedmodel and removed all TPU-related code.
That produced a saved_model.pb file, but loading the .pb file failed.

In the notebook I shared, replace the train_op line with the code shown below and run again; it should work.
allvars = tf.trainable_variables()
# Keep only the variables that do not belong to BERT, so the BERT weights stay frozen.
non_bert = [v for v in allvars if 'bert' not in v.name]

train_op = tx.core.get_train_op(
    mle_loss,
    learning_rate=learning_rate,
    variables=non_bert,
    global_step=global_step,
    hparams=opt)
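
The freezing step depends only on the variable names, so the filter can be checked without TensorFlow. The sketch below uses a stand-in `FakeVariable` type and illustrative variable names (the decoder names in particular are made up); in the notebook the list would come from TensorFlow's trainable variables instead.

```python
from collections import namedtuple

# Stand-in for tf.Variable: only the .name attribute matters for the filter.
FakeVariable = namedtuple("FakeVariable", ["name"])

all_vars = [
    FakeVariable("bert/embeddings/word_embeddings:0"),
    FakeVariable("bert/encoder/layer_0/attention/self/query/kernel:0"),
    FakeVariable("decoder/layer_0/self_attention/kernel:0"),
    FakeVariable("decoder/output_layer/bias:0"),
]

def non_bert_variables(variables, frozen_prefix="bert"):
    """Return the variables the optimizer should update; everything under frozen_prefix is skipped."""
    return [v for v in variables if not v.name.startswith(frozen_prefix)]

trainable = non_bert_variables(all_vars)
print([v.name for v in trainable])
```

Matching on the prefix rather than on a substring avoids accidentally freezing a decoder variable that merely has "bert" somewhere in its name.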

Thank you for your advice. I finally got it trained. Freezing the BERT encoder produces much more readable and grammatically correct sentences, but it still cannot summarize well :'( Maybe we need more techniques such as Pointer-Generator, Bottom-Up Summarization, etc. :)

In my case the data is somewhat easy. The model was not copying the sentences verbatim; it was rephrasing them while keeping the same meaning.
Try training for more iterations or passing entity information to the model.

A pointer-generator is useful when you have unknown (out-of-vocabulary) words in the data; with subword tokenization there are hardly any unknowns.

But let me know if you are able to improve on this.
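
To illustrate why subword tokenization leaves few unknown tokens, here is a minimal sketch of greedy longest-match-first splitting in the style of BERT's WordPiece. The tiny vocabulary is invented for the example, and the function is a simplification rather than the tokenizer shipped with BERT.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Split one word into the longest matching vocabulary pieces, left to right."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry the ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk_token]  # no piece matches at this position
        pieces.append(piece)
        start = end
    return pieces

# Toy vocabulary, assumed for illustration only.
vocab = {"internal", "##s", "sum", "##mar", "##ization"}

print(wordpiece_tokenize("internals", vocab))
print(wordpiece_tokenize("summarization", vocab))
print(wordpiece_tokenize("zzz", vocab))
```

With this toy vocabulary the last word falls back to `[UNK]`; a real vocabulary also contains single characters, which is why that fallback is rare in practice.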

I am looking to use a BERT model for abstractive text summarization. I checked out @santhoshkolloju's code and will run it and see. However, it would be really helpful if someone could point me to articles, papers, resources, or code for abstractive summarization with BERT.

Check out this paper: https://arxiv.org/pdf/1902.09243.pdf

They still haven't released their code yet, but I'm currently working on reimplementing it in PyTorch and will make the code public once I'm done with it.

I would be, to put it mildly, extremely interested in this!

I tried summarization on some wiki articles. I split the text into sentences, averaged the CLS vectors of the sentences to get a "whole text CLS vector", and then picked the few sentences most similar to that vector (by cosine similarity). The results were interesting, but not good enough for anything serious (too simple and vague, I guess).
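
The procedure described above can be sketched like this. It is a rough illustration under the assumption that one CLS vector per sentence has already been extracted from BERT; the toy 2-D vectors and the function names are placeholders, not the commenter's code.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def centroid_summary(sentences, cls_vectors, top_n):
    """Rank sentences by similarity to the averaged CLS vector and keep the top_n in document order."""
    count = len(cls_vectors)
    document_vector = [sum(column) / count for column in zip(*cls_vectors)]
    ranked = sorted(
        range(count),
        key=lambda i: cosine_similarity(cls_vectors[i], document_vector),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:top_n])]

# Toy data: the vectors stand in for real per-sentence CLS embeddings.
sentences = ["Sentence A.", "Sentence B.", "Sentence C.", "Sentence D."]
cls_vectors = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0], [0.9, 0.3]]

picked = centroid_summary(sentences, cls_vectors, top_n=2)
print(picked)
```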

For those interested, looks like we have an implementation! https://github.com/nayeon7lee/bert-summarization

it is not complete... :(

It's been almost half a year since BERT was released. Does anybody know where to find a Colab notebook that shows a working summarization example?

Could you share your experience with a BERT encoder + Transformer decoder + pointer-generator? I wonder whether it would summarize well with a pointer-generator. Thanks

Is there anyone alive?

There is this paper

Fine-tune BERT for Extractive Summarization

https://arxiv.org/pdf/1903.10318.pdf

I would love a colab example as well.

@santhoshkolloju I think the result might have something to do with the batch size. I printed the batches generated by the iterator (FeedableDataIterator from Texar), and even though I set batch_size to 32, the size of each generated batch stayed at 4.

Edit:
Okay, I finally understand why the batch size is always 4:
train_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src, max_seq_length_tgt, 4, 'train', "./")
eval_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src, max_seq_length_tgt, 4, 'eval', "./")
test_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src, max_seq_length_tgt, 4, 'test', "./")
These lines in the Colab example are the culprit: the batch size is hard-coded as 4.

@Santosh-Gupta They have their code released here: https://github.com/nlpyang/BertSum, though it's not a colab example

hmm, any idea how to use it to end up with a function like summary_result = BertSum.summarize("Text to be summarized") ?

Not extractive; an abstractive example, please...

So you're looking to generate a summary, not just extract the most important sentences?

I'm looking for both, thanks for the above link. I will look into it right away.

Yes

I tried a BERT encoder plus a similar Transformer decoder for generating summaries, and none of my attempts worked.
I believe there are a lot of tricks for fine-tuning the network that I am not aware of.

It looks like this repo is complete:

https://github.com/nlpyang/BertSum

Also, UNILM gives some great abstractive summarization scores, maybe the best

I found only the UNILM paper. Do you know where we can download the model? Is there any code example?

The authors say that they are preparing a release of the code and pretrained model

I found a useful paper that reports better performance than BERT on text summarization.
paper: https://arxiv.org/pdf/1905.02450.pdf
code: https://github.com/microsoft/MASS
The code is not fully ready yet, though.

Please see our paper using BERT for both extractive and abstractive summarization

https://arxiv.org/abs/1908.08345

With code and models released at https://github.com/nlpyang/PreSumm

How can we make BERT use our own ebooks (.azw, .epub, or .mobi) or PDF files?
And also a specific website, for example success.com?
Thank you

@nlpyang, great! What do the results look like? Could somebody post some examples here, please?
