Bert: restore parameters from previous checkpoints

Created on 13 Nov 2018 · 4 comments · Source: google-research/bert

For example, I fine-tuned the model for 10000 steps and got intermediate checkpoints such as model-ckpt-9000, model-ckpt-8000, and model-ckpt-7000. How can I evaluate performance using those checkpoints?

I tried --init_checkpoint=/path/to/model-ckpt-8000, but it does not work.

quincyliang

Most helpful comment

@quincyliang I'm guessing the situation you describe is:

1. The checkpoints model-ckpt 9000, 8000, and 7000 are all in one directory, /path/to/model-ckpt-8000;
2. You set --init_checkpoint=/path/to/model-ckpt-8000, and it fails to restore the parameters of the 8000-step checkpoint.

In that case, check the file named checkpoint in /path/to/model-ckpt-8000. It may look like this:

cat /path/to/model-ckpt-8000/checkpoint

model_checkpoint_path: "model.ckpt-9000"
all_model_checkpoint_paths: "model.ckpt-7000"
all_model_checkpoint_paths: "model.ckpt-8000"
all_model_checkpoint_paths: "model.ckpt-9000"

You may need to change model_checkpoint_path to whichever checkpoint you want (9000, 8000, or 7000) so that TF restores the correct one.
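
The manual edit described above can also be scripted. This is only a minimal sketch, assuming the checkpoint file has the plain-text layout shown above; the helper name point_checkpoint_at is hypothetical and is not part of BERT or TensorFlow.

```python
import re
from pathlib import Path


def point_checkpoint_at(model_dir, checkpoint_name):
    """Rewrite model_checkpoint_path in <model_dir>/checkpoint.

    Hypothetical helper: assumes the file contains one
    model_checkpoint_path line plus all_model_checkpoint_paths lines.
    """
    state_file = Path(model_dir) / "checkpoint"
    text = state_file.read_text()

    # Only accept a name the file already lists, to avoid pointing
    # TF at a checkpoint that does not exist.
    listed = re.findall(
        r'^all_model_checkpoint_paths: "(.*)"$', text, flags=re.M
    )
    if checkpoint_name not in listed:
        raise ValueError(f"{checkpoint_name!r} is not listed in {state_file}")

    # Replace only the model_checkpoint_path line; the
    # all_model_checkpoint_paths lines are left untouched.
    new_text, count = re.subn(
        r'^model_checkpoint_path: ".*"$',
        lambda m: f'model_checkpoint_path: "{checkpoint_name}"',
        text,
        count=1,
        flags=re.M,
    )
    if count != 1:
        raise ValueError(f"no model_checkpoint_path line in {state_file}")
    state_file.write_text(new_text)
```

For example, point_checkpoint_at("/path/to/model-ckpt-8000", "model.ckpt-8000") would make the 8000-step checkpoint the one recorded as latest. Keeping a backup copy of the file before rewriting it is a sensible precaution.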

CoSeCant-csc on 13 Nov 2018

👍3

All 4 comments

@CoSeCant-csc you are right. Thanks for your answer.

quincyliang on 13 Nov 2018

The provided answer works.
However, it feels a little hacky.

Is there a cleaner way to do this?
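
One possibility that avoids editing the checkpoint file: tf.estimator.Estimator.evaluate accepts a checkpoint_path argument, so each intermediate checkpoint can be evaluated by passing its prefix explicitly. The sketch below only collects those prefixes. It assumes each checkpoint has a file named <prefix>.index in the directory, and the helper name list_checkpoint_prefixes is hypothetical. The commented usage assumes estimator and eval_input_fn are the objects your evaluation script already builds.

```python
import re
from pathlib import Path


def list_checkpoint_prefixes(model_dir):
    """Return (step, prefix) pairs for the checkpoints in model_dir.

    Hypothetical helper: a checkpoint is recognised by its
    "<prefix>.index" file, where the prefix ends in "-<step>".
    """
    found = []
    for index_file in Path(model_dir).glob("*.index"):
        prefix = index_file.with_suffix("")  # drop the ".index" suffix
        match = re.search(r"-(\d+)$", prefix.name)
        if match:
            found.append((int(match.group(1)), str(prefix)))
    return sorted(found)  # ascending by step


# Usage sketch (assumes `estimator` and `eval_input_fn` already exist):
# for step, prefix in list_checkpoint_prefixes("/path/to/model-ckpt-8000"):
#     result = estimator.evaluate(input_fn=eval_input_fn,
#                                 checkpoint_path=prefix)
#     print(step, result)
```

Because the checkpoint is named in the call, nothing on disk has to change between evaluations.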