For example, I fine-tuned the model for 10000 steps and got intermediate checkpoints such as model-ckpt-9000, model-ckpt-8000, and model-ckpt-7000. How can I evaluate performance using those checkpoints?
I tried --init_checkpoint=/path/to/model-ckpt-8000, but it does not work.
@quincyliang I am guessing that what you describe is: (1) /path/to/model-ckpt-8000; (2) you pass --init_checkpoint=/path/to/model-ckpt-8000 and it fails to restore the parameters of the 8k checkpoint.
In that case, check the file named checkpoint in /path/to/model-ckpt-8000. It may look like:
cat /path/to/model-ckpt-8000/checkpoint
model_checkpoint_path: "model.ckpt-9000"
all_model_checkpoint_paths: "model.ckpt-7000"
all_model_checkpoint_paths: "model.ckpt-8000"
all_model_checkpoint_paths: "model.ckpt-9000"
You may need to change model_checkpoint_path to the checkpoint you want (model.ckpt-9000, model.ckpt-8000, or model.ckpt-7000) so that TF restores the correct one.
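If you would rather not edit the file by hand, the same change can be scripted. This is only a minimal sketch that rewrites the text file shown above. The helper name set_latest_checkpoint and the default prefix "model.ckpt" are my own assumptions, not part of BERT or TensorFlow.

```python
import os
import re


def set_latest_checkpoint(model_dir, step, prefix="model.ckpt"):
    """Point the `checkpoint` file in model_dir at <prefix>-<step>.

    Hypothetical helper: it only edits the plain-text index file and
    never touches the checkpoint data files themselves.
    """
    index_path = os.path.join(model_dir, "checkpoint")
    target = f"{prefix}-{step}"

    with open(index_path) as f:
        lines = f.read().splitlines()

    # Collect the checkpoints that the index file already lists.
    listed = set()
    for line in lines:
        match = re.match(r'all_model_checkpoint_paths: "(.*)"', line)
        if match:
            listed.add(match.group(1))

    # Refuse to point at a checkpoint the index file does not know about.
    if target not in listed:
        raise ValueError(f"{target} is not listed in {index_path}")

    # Rewrite only the model_checkpoint_path line; keep the rest unchanged.
    new_lines = [
        f'model_checkpoint_path: "{target}"'
        if line.startswith("model_checkpoint_path:")
        else line
        for line in lines
    ]
    with open(index_path, "w") as f:
        f.write("\n".join(new_lines) + "\n")
    return target
```

Calling set_latest_checkpoint("/path/to/model-ckpt-8000", 8000) before running evaluation would make model.ckpt-8000 the one that gets restored. Back up the checkpoint file first if you plan to resume training later.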
@CoSeCant-csc you are right. Thanks for your answer.
The provided answer works.
However it feels a little hacky.
Is there a cleaner way to do this?
I simply passed the checkpoint path to the estimator's evaluate function, but I don't know if this is the right way to run the evaluation from a desired checkpoint.
estimator.evaluate(input_fn=val_input_fn, steps=None, checkpoint_path='/content/drive/My Drive/BERT temp files/model.ckpt-800')
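To compare several intermediate checkpoints this way, the call above can be wrapped in a loop. The sketch below is a rough illustration, not an official recipe. The helper names list_checkpoints and evaluate_checkpoints are made up, and it assumes each checkpoint has a matching model.ckpt-<step>.index file in the model directory, which is how TF normally saves them. The estimator and input_fn are whatever you already built for fine-tuning.

```python
import glob
import os
import re


def list_checkpoints(model_dir, prefix="model.ckpt"):
    """Return [(step, checkpoint_path), ...] sorted by step.

    Assumes every checkpoint has a `<prefix>-<step>.index` file.
    """
    pattern = os.path.join(glob.escape(model_dir), prefix + "-*.index")
    found = []
    for index_file in glob.glob(pattern):
        match = re.search(r"-(\d+)\.index$", index_file)
        if match:
            # The checkpoint path is the file name without ".index".
            found.append((int(match.group(1)), index_file[: -len(".index")]))
    return sorted(found)


def evaluate_checkpoints(estimator, input_fn, model_dir):
    """Evaluate every checkpoint in model_dir; return {step: metrics}."""
    results = {}
    for step, ckpt_path in list_checkpoints(model_dir):
        # checkpoint_path overrides the "latest checkpoint" lookup,
        # so the `checkpoint` file does not need to be edited.
        results[step] = estimator.evaluate(
            input_fn=input_fn, steps=None, checkpoint_path=ckpt_path
        )
    return results
```

For example, evaluate_checkpoints(estimator, val_input_fn, "/path/to/model-ckpt-8000") would return one metrics dict per saved step, which makes it easy to pick the best checkpoint.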