Allennlp: How to continue training the model after it is out of patience?

Created on 17 Apr 2019 · 4 Comments · Source: allenai/allennlp

System (please complete the following information):

  • OS: Linux
  • Python version: 3.6
  • AllenNLP version: installed via pip install
  • PyTorch version: 0.4.1

Question

  • How to continue training the model after it is out of patience?

Training ran out of patience under my initial configuration file, and the best model was saved. I am not satisfied with the result, so I want to continue training: I changed "patience" in the configuration file to a larger number and copied the new configuration file into the serialization_dir. I then ran "allennlp train path_to_conf_file.json -s path_to_serialization_dir --recover" to continue training. However, training stops after only one more epoch, with a message saying that it ran out of patience. I think the reason is that training_state["metric_tracker"] is saved in training_state_epoch_XX.th and is not updated from the new configuration file. Is there an easy way to solve this?

All 4 comments

Yes, it looks like the metric tracker saves the patience as part of its state, so this approach won't work: https://github.com/allenai/allennlp/blob/7d34ca3b8f723eca603b3a012e9c17da809dc6d2/allennlp/training/metric_tracker.py#L89

You could try using the fine-tune command instead.
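
To illustrate why the edited configuration has no effect after --recover, here is a minimal sketch. ToyMetricTracker is a hypothetical, simplified stand-in written for this example; it is not AllenNLP's MetricTracker, and it only assumes what the comment above states, namely that the patience is saved and restored together with the rest of the tracker's state.

```python
class ToyMetricTracker:
    """Hypothetical, simplified early-stopping tracker (not AllenNLP's code)."""

    def __init__(self, patience):
        self.patience = patience
        self.best_so_far = None
        self.epochs_with_no_improvement = 0

    def add_metric(self, value):
        # In this toy version a higher metric value is better.
        if self.best_so_far is None or value > self.best_so_far:
            self.best_so_far = value
            self.epochs_with_no_improvement = 0
        else:
            self.epochs_with_no_improvement += 1

    def should_stop_early(self):
        return self.epochs_with_no_improvement >= self.patience

    def state_dict(self):
        # The patience is stored next to the counters.
        return {
            "patience": self.patience,
            "best_so_far": self.best_so_far,
            "epochs_with_no_improvement": self.epochs_with_no_improvement,
        }

    def load_state_dict(self, state):
        # Restoring the state overwrites whatever the constructor received.
        self.patience = state["patience"]
        self.best_so_far = state["best_so_far"]
        self.epochs_with_no_improvement = state["epochs_with_no_improvement"]


# First run: patience of 2, and the metric stops improving.
first_run = ToyMetricTracker(patience=2)
for accuracy in [0.70, 0.69, 0.68]:
    first_run.add_metric(accuracy)
checkpoint = first_run.state_dict()

# Recovered run: the new configuration asks for 20, but the checkpoint wins.
recovered = ToyMetricTracker(patience=20)
recovered.load_state_dict(checkpoint)
print(recovered.patience)             # 2
print(recovered.should_stop_early())  # True

# Editing the saved state before loading it changes the outcome.
checkpoint["patience"] = 20
patched = ToyMetricTracker(patience=20)
patched.load_state_dict(checkpoint)
print(patched.should_stop_early())    # False
```

In this sketch the restored counter is already at the old patience limit, so the recovered run stops almost immediately, which resembles the behaviour described in the question.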

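It's not ideal, but you could just manually modify the training state:

import torch
# Load the saved training state (use the epoch number of your latest checkpoint).
state_dict = torch.load('training_state_epoch_2.th')
# Overwrite the patience stored with the metric tracker.
state_dict['metric_tracker']['patience'] = 20
# Write the modified state back to the same file.
torch.save(state_dict, 'training_state_epoch_2.th')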
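
The same idea can be wrapped in two small helpers so that the epoch number does not have to be typed by hand. This is only a sketch: find_latest_training_state and set_patience are hypothetical names introduced here, the file-name pattern is taken from the file names mentioned in this thread, and it is an assumption that --recover resumes from the highest-numbered epoch file.

```python
import re
from pathlib import Path

# Pattern based on the file names in this thread, e.g. training_state_epoch_2.th.
_STATE_FILE = re.compile(r"training_state_epoch_(\d+)\.th$")


def find_latest_training_state(serialization_dir):
    """Return the training-state file with the highest epoch number, or None."""
    candidates = []
    for path in Path(serialization_dir).iterdir():
        match = _STATE_FILE.match(path.name)
        if match:
            # Compare epochs numerically so that epoch 10 sorts after epoch 2.
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


def set_patience(state_path, patience):
    """Overwrite the saved patience in place, as suggested above."""
    import torch  # imported lazily so the file lookup works without PyTorch

    state_dict = torch.load(str(state_path))
    state_dict["metric_tracker"]["patience"] = patience
    torch.save(state_dict, str(state_path))
```

A call such as set_patience(find_latest_training_state("path_to_serialization_dir"), 20) would then be followed by the usual "allennlp train ... --recover" command. Because the file is modified in place, it may be worth copying it somewhere outside the serialization directory first.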

@joelgrus, not if you want to just do allennlp train. But yeah, if you're writing your own entry point, you could definitely do something like that.

@matt-gardner @joelgrus Thank you so much for your answers. They are very helpful.
I tried Joel's approach and it works!
