Models: Save checkpoints more frequently (object detection)

Created on 24 Nov 2017 · 4 Comments · Source: tensorflow/models

System information

What is the top-level directory of the model you are using: object_detection
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 7
TensorFlow installed from (source or binary): source
TensorFlow version (use command below): 1.3.0
CUDA/cuDNN version: Cuda compilation tools, release 8.0, V8.0.44
GPU model and memory: GTX 1080Ti, 11Gb
Exact command to reproduce:
python train.py --logtostderr --train_dir=training0311/ --pipeline_config_path=training0311\faster_rcnn_inception_resnet_v2_atrous_coco.config
Describe the problem

I would like to save checkpoints more frequently, say every 10 iterations. In trainer.py and slim.learning.train, I added save_interval_secs=10.
One iteration takes about 10 seconds, but checkpoints are not saved every 10 seconds (they are saved every 200 iterations).

Can you help me? Thanks a lot!

All 4 comments

Summaries and checkpoints are two different things. To save a checkpoint every 10 seconds, use slim.learning.train(..., save_interval_secs=10, ...).
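
Note that the question asks for a save every 10 iterations, while save_interval_secs counts wall-clock seconds. The difference can be sketched in plain Python. This is only an illustration and not how TensorFlow or slim implements saving: CheckpointTrigger, run_training, and all of their parameters are hypothetical names made up for this example.

```python
import time


class CheckpointTrigger:
    """Hypothetical helper: decides when a checkpoint is due.

    A save is due every `every_steps` steps, or once `every_secs`
    seconds have passed since the last save, or both.
    """

    def __init__(self, every_steps=None, every_secs=None, clock=time.monotonic):
        if every_steps is None and every_secs is None:
            raise ValueError("set every_steps, every_secs, or both")
        self.every_steps = every_steps
        self.every_secs = every_secs
        self._clock = clock
        self._last_save_time = clock()

    def should_save(self, step):
        """Return True if a checkpoint is due after finishing `step` (1-based)."""
        due = False
        # Step-based rule: fires on every multiple of `every_steps`.
        if self.every_steps is not None and step % self.every_steps == 0:
            due = True
        # Time-based rule: fires once enough wall-clock time has elapsed.
        if self.every_secs is not None:
            if self._clock() - self._last_save_time >= self.every_secs:
                due = True
        if due:
            self._last_save_time = self._clock()
        return due


def run_training(num_steps, trigger, train_step, save_checkpoint):
    """Hypothetical loop: run one step, then save if the trigger fires.

    Returns the list of steps at which a checkpoint was saved.
    """
    saved_at = []
    for step in range(1, num_steps + 1):
        train_step(step)
        if trigger.should_save(step):
            save_checkpoint(step)
            saved_at.append(step)
    return saved_at
```

A time-based trigger can only be checked between steps here, so with slow iterations the actual gap between saves is a whole number of step durations, not exactly every_secs.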

save_interval_secs is also not working.

That is weird, considering it is how I do it. I have never tried to go as low as 10 seconds, but anything between 60 and 1800 seconds has worked for me.
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/slim/python/slim/learning.py
The argument is specified on line 606, and there is no comment there suggesting that it has a lower bound.

Best, Nils

This question is better asked on StackOverflow since it is not a bug or feature request. There is also a larger community that reads questions there. Thanks!
