Allennlp: How to continue training the model after it is out of patience?

Created on 17 Apr 2019 · 4 Comments · Source: allenai/allennlp

System (please complete the following information):

OS: Linux
Python version: 3.6
AllenNLP version: installed via pip install
PyTorch version: 0.4.1

Question

How to continue training the model after it is out of patience?

Training ran out of patience under my initial configuration file, and the best model was saved. I am not satisfied with the result, so I want to continue training with a larger patience. I changed "patience" in the configuration file to a larger number, copied the new configuration file into the serialization_dir, and resumed with the command "allennlp train path_to_conf_file.json -s path_to_serialization_dir --recover". However, training stops after only one more epoch, with a message saying it ran out of patience. I think the reason is that training_state["metric_tracker"] is saved in training_state_epoch_XX.th and is not updated from the new configuration file. Is there an easy way to solve this?

Source

MeiqiGuo

All 4 comments

Yes, it looks like the metric tracker saves the patience as part of its state, so changing the configuration file and recovering won't work: https://github.com/allenai/allennlp/blob/7d34ca3b8f723eca603b3a012e9c17da809dc6d2/allennlp/training/metric_tracker.py#L89

You could try using the fine-tune command instead.

matt-gardner on 17 Apr 2019
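
To illustrate why editing the configuration file has no effect here, the following is a minimal sketch of a patience-based tracker whose saved state includes the patience. ToyMetricTracker is a hypothetical, simplified stand-in written for this example; it is not AllenNLP's actual class, and its method names and state keys are assumptions chosen for readability.

```python
class ToyMetricTracker:
    """Hypothetical, simplified early-stopping tracker (higher metric is better)."""

    def __init__(self, patience):
        self.patience = patience
        self.best_so_far = None
        self.epochs_with_no_improvement = 0

    def add_metric(self, metric):
        # Reset the counter on improvement, otherwise count a stale epoch.
        if self.best_so_far is None or metric > self.best_so_far:
            self.best_so_far = metric
            self.epochs_with_no_improvement = 0
        else:
            self.epochs_with_no_improvement += 1

    def should_stop_early(self):
        return self.epochs_with_no_improvement >= self.patience

    def state_dict(self):
        # The patience is stored alongside the counters.
        return {
            "patience": self.patience,
            "best_so_far": self.best_so_far,
            "epochs_with_no_improvement": self.epochs_with_no_improvement,
        }

    def load_state_dict(self, state):
        # Restoring overwrites whatever patience the constructor received.
        self.patience = state["patience"]
        self.best_so_far = state["best_so_far"]
        self.epochs_with_no_improvement = state["epochs_with_no_improvement"]


# First run: patience 2, and the metric stops improving after the first epoch.
first = ToyMetricTracker(patience=2)
for metric in [0.80, 0.79, 0.78]:
    first.add_metric(metric)
saved = first.state_dict()

# Resumed run: the new configuration asks for patience 20,
# but the restored state puts the old value back.
resumed = ToyMetricTracker(patience=20)
resumed.load_state_dict(saved)
resumed.add_metric(0.77)  # one more epoch without improvement
```

In this sketch the resumed tracker ends up with the old patience and reports that training should stop after the one extra epoch, which matches the behaviour described in the question.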

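It's not ideal, but you could just manually modify the training state:

import torch
# Load the saved training state (use the file for your latest epoch).
state_dict = torch.load('training_state_epoch_2.th')
# Overwrite the patience stored in the metric tracker's saved state.
state_dict['metric_tracker']['patience'] = 20
# Write the patched state back to the same file.
torch.save(state_dict, 'training_state_epoch_2.th')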

joelgrus on 17 Apr 2019

👍3

@joelgrus, not if you want to just do allennlp train. But yeah, if you're writing your own entry point, you could definitely do something like that.

matt-gardner on 17 Apr 2019

@matt-gardner @joelgrus Thank you so much for your answers. They are very helpful.
I tried Joel's approach and it works!

MeiqiGuo on 17 Apr 2019
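
For anyone repeating the manual patch, a slightly more defensive version might look like the sketch below. It picks the training-state file with the highest epoch number, keeps a backup copy, and only then rewrites the patience. The helper names (latest_training_state, with_patience, patch_checkpoint) and the .bak suffix are made up for this example, and the file-name pattern is taken from the question, so it assumes plain integer epoch numbers.

```python
import re
import shutil
from pathlib import Path

# Pattern from the question: training_state_epoch_XX.th (integer epochs only).
_STATE_FILE = re.compile(r"training_state_epoch_(\d+)\.th$")


def latest_training_state(serialization_dir):
    """Return the training-state file with the highest epoch, or None."""
    candidates = []
    for path in Path(serialization_dir).iterdir():
        match = _STATE_FILE.match(path.name)
        if match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    # Compare epochs numerically so that epoch 10 beats epoch 2.
    return max(candidates, key=lambda item: item[0])[1]


def with_patience(training_state, new_patience):
    """Return a copy of the training state with a new patience value."""
    tracker = dict(training_state["metric_tracker"])
    tracker["patience"] = new_patience
    patched = dict(training_state)
    patched["metric_tracker"] = tracker
    return patched


def patch_checkpoint(serialization_dir, new_patience):
    """Back up the newest training state, then rewrite it with a new patience."""
    import torch  # imported here so the helpers above work without PyTorch

    path = latest_training_state(serialization_dir)
    if path is None:
        raise FileNotFoundError(
            f"no training_state_epoch_*.th file in {serialization_dir}"
        )
    # Keep the untouched original next to the patched file.
    shutil.copy2(str(path), str(path) + ".bak")
    state = torch.load(str(path))
    torch.save(with_patience(state, new_patience), str(path))
    return path
```

Calling patch_checkpoint("path_to_serialization_dir", 20) and then rerunning the original "allennlp train ... --recover" command would correspond to the approach confirmed above.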
