I was following the directions to train an ssd_mobilenet model, where the detection network is trained from scratch after the backbone has been initialized with the weights of a model trained on ImageNet. I downloaded the following pre-trained MobileNet checkpoint for this experiment:
MobileNet_v1_0.25_128
mobilenet_v1_0.25_128_2017_06_14.tar.gz
https://github.com/tensorflow/models/tree/master/research/slim
However, I get warning messages complaining that the batch norm parameters (and weights) of several layers could not be loaded (see below for the messages). Has anyone else seen this issue, or does anyone have pointers on how I can debug it?
Below are my train_config, adapted from the stock one given in the detection library, and the warning messages about variables that could not be read from the checkpoint:
train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "object_detection/mobilenet_v1_025/mobilenet_v1_0.25_128.ckpt"
  from_detection_checkpoint: false
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}
INFO:tensorflow:Summary name Learning Rate is illegal; using Learning_Rate instead.
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_2_1x1_128/BatchNorm/beta] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_2_1x1_128/BatchNorm/gamma] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_2_1x1_128/BatchNorm/moving_mean] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_2_1x1_128/BatchNorm/moving_variance] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_2_1x1_128/weights] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_3_1x1_64/BatchNorm/beta] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_3_1x1_64/BatchNorm/gamma] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_3_1x1_64/BatchNorm/moving_mean] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_3_1x1_64/BatchNorm/moving_variance] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_3_1x1_64/weights] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_4_1x1_64/BatchNorm/beta] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_4_1x1_64/BatchNorm/gamma] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_4_1x1_64/BatchNorm/moving_mean] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_4_1x1_64/BatchNorm/moving_variance] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_4_1x1_64/weights] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_5_1x1_32/BatchNorm/beta] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_5_1x1_32/BatchNorm/gamma] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_5_1x1_32/BatchNorm/moving_mean] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_5_1x1_32/BatchNorm/moving_variance] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_1_Conv2d_5_1x1_32/weights] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_2_3x3_s2_256/BatchNorm/beta] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_2_3x3_s2_256/BatchNorm/gamma] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_2_3x3_s2_256/BatchNorm/moving_mean] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_2_3x3_s2_256/BatchNorm/moving_variance] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_2_3x3_s2_256/weights] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_3_3x3_s2_128/BatchNorm/beta] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_3_3x3_s2_128/BatchNorm/gamma] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_3_3x3_s2_128/BatchNorm/moving_mean] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_3_3x3_s2_128/BatchNorm/moving_variance] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_3_3x3_s2_128/weights] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_4_3x3_s2_128/BatchNorm/beta] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_4_3x3_s2_128/BatchNorm/gamma] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_4_3x3_s2_128/BatchNorm/moving_mean] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_4_3x3_s2_128/BatchNorm/moving_variance] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_4_3x3_s2_128/weights] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_5_3x3_s2_64/BatchNorm/beta] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_5_3x3_s2_64/BatchNorm/gamma] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_5_3x3_s2_64/BatchNorm/moving_mean] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_5_3x3_s2_64/BatchNorm/moving_variance] not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_13_pointwise_2_Conv2d_5_3x3_s2_64/weights] not available in checkpoint
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
2017-09-28 12:54:07.684949: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-28 12:54:07.684970: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-28 12:54:07.684974: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-28 12:54:07.684977: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-28 12:54:07.684980: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-09-28 12:54:07.783602: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-09-28 12:54:07.783851: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.759
pciBusID 0000:02:00.0
Total memory: 7.92GiB
Free memory: 7.81GiB
2017-09-28 12:54:07.783862: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0
2017-09-28 12:54:07.783866: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0: Y
2017-09-28 12:54:07.783871: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:02:00.0)
2017-09-28 12:54:08.774752: I tensorflow/core/common_runtime/simple_placer.cc:697] Ignoring device specification /device:GPU:0 for node 'prefetch_queue_Dequeue' because the input edge from 'prefetch_queue' is a reference connection and already has a device field set to /device:CPU:0
INFO:tensorflow:Restoring parameters from object_detection/mobilenet_v1_025/mobilenet_v1_0.25_128.ckpt
INFO:tensorflow:Restoring parameters from object_detection/mobilenet_v1_025/mobilenet_v1_0.25_128.ckpt
INFO:tensorflow:Error reported to Coordinator:
[[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@FeatureExtractor/MobilenetV1/Conv2d_0/BatchNorm/gamma"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](FeatureExtractor/MobilenetV1/Conv2d_0/BatchNorm/gamma, save/RestoreV2_1)]]
Caused by op u'save/Assign_1', defined at:
File "object_detection/train.py", line 200, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "object_detection/train.py", line 196, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/home/chiuyu/code2/tensorflow/models/research/object_detection/trainer.py", line 219, in train
init_saver = tf.train.Saver(available_var_map)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1140, in __init__
self.build()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1172, in build
filename=self._filename)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 688, in build
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 155, in restore
self.op.get_shape().is_fully_defined())
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/state_ops.py", line 274, in assign
validate_shape=validate_shape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 43, in assign
use_locking=use_locking, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [16] rhs shape= [8]
[[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@FeatureExtractor/MobilenetV1/Conv2d_0/BatchNorm/gamma"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](FeatureExtractor/MobilenetV1/Conv2d_0/BatchNorm/gamma, save/RestoreV2_1)]]
INFO:tensorflow:Error reported to Coordinator:
[[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@FeatureExtractor/MobilenetV1/Conv2d_0/BatchNorm/gamma"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](FeatureExtractor/MobilenetV1/Conv2d_0/BatchNorm/gamma, save/RestoreV2_1)]]
Caused by op u'save/Assign_1', defined at:
File "object_detection/train.py", line 200, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "object_detection/train.py", line 196, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/home/chiuyu/code2/tensorflow/models/research/object_detection/trainer.py", line 219, in train
init_saver = tf.train.Saver(available_var_map)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1140, in __init__
self.build()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1172, in build
filename=self._filename)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 688, in build
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 155, in restore
self.op.get_shape().is_fully_defined())
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/state_ops.py", line 274, in assign
validate_shape=validate_shape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 43, in assign
use_locking=use_locking, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [16] rhs shape= [8]
[[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@FeatureExtractor/MobilenetV1/Conv2d_0/BatchNorm/gamma"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](FeatureExtractor/MobilenetV1/Conv2d_0/BatchNorm/gamma, save/RestoreV2_1)]]
Traceback (most recent call last):
File "object_detection/train.py", line 200, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "object_detection/train.py", line 196, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/home/chiuyu/code2/tensorflow/models/research/object_detection/trainer.py", line 296, in train
saver=saver)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/slim/python/slim/learning.py", line 738, in train
master, start_standard_services=False, config=session_config) as sess:
File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 964, in managed_session
self.stop(close_summary_writer=close_summary_writer)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 792, in stop
stop_grace_period_secs=self._stop_grace_secs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 953, in managed_session
start_standard_services=start_standard_services)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 708, in prepare_or_wait_for_session
init_feed_dict=self._init_feed_dict, init_fn=self._init_fn)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/session_manager.py", line 281, in prepare_session
init_fn(sess)
File "/home/chiuyu/code2/tensorflow/models/research/object_detection/trainer.py", line 221, in initializer_fn
init_saver.restore(sess, train_config.fine_tune_checkpoint)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1560, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1321, in _do_run
options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [16] rhs shape= [8]
[[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@FeatureExtractor/MobilenetV1/Conv2d_0/BatchNorm/gamma"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](FeatureExtractor/MobilenetV1/Conv2d_0/BatchNorm/gamma, save/RestoreV2_1)]]
Caused by op u'save/Assign_1', defined at:
File "object_detection/train.py", line 200, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "object_detection/train.py", line 196, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/home/chiuyu/code2/tensorflow/models/research/object_detection/trainer.py", line 219, in train
init_saver = tf.train.Saver(available_var_map)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1140, in __init__
self.build()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1172, in build
filename=self._filename)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 688, in build
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 155, in restore
self.op.get_shape().is_fully_defined())
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/state_ops.py", line 274, in assign
validate_shape=validate_shape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 43, in assign
use_locking=use_locking, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [16] rhs shape= [8]
[[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@FeatureExtractor/MobilenetV1/Conv2d_0/BatchNorm/gamma"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](FeatureExtractor/MobilenetV1/Conv2d_0/BatchNorm/gamma, save/RestoreV2_1)]]
ERROR:tensorflow:==================================
Object was never used (type
If you want to mark it as used call its "mark_used()" method.
It was originally created here:
ERROR:tensorflow:==================================
Object was never used (type
If you want to mark it as used call its "mark_used()" method.
It was originally created here:
System information
What is the top-level directory of the model you are using: object_detection
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
TensorFlow installed from (source or binary): Source
TensorFlow version (tf.__version__): '1.3.0'
Bazel version (if compiling from source): 0.5.1
CUDA/cuDNN version: CUDA 8.0, CUDNN 6.0
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
GPU model and memory: GeForce GTX 1080, with Total memory: 7.92GiB , Free memory: 7.81GiB
Exact command to reproduce: I am using the vanilla train.py command from the object detection README:
python2 object_detection/train.py \
--logtostderr \
--train_dir='object_detection/data_mobile025/train' \
--pipeline_config_path='object_detection/data_mobile025/ssd_mobilenet_v1_025.config'
I also tried other pre-trained MobileNet checkpoints for the SSD experiment described above.
MobileNet_v1_0.25_224
MobileNet_v1_0.25_192
MobileNet_v1_0.25_160
MobileNet_v1_0.25_128
https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md
All of these experiments produce the same errors as the previous report.
same problem :S
@andrewghoward Can you comment on this?
I think the problem is related to depth_multiplier in your config file (ssd_mobilenet_v1_xxxx.config).
So a quick workaround is to set it appropriately (i.e., 0.25), because you are testing shallower models that were trained with depth_multiplier 0.25.
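For example, the feature extractor in the model section of the pipeline config would need something like the sketch below. (Just a sketch; if I read the feature extractor right, channel depths are computed as max(int(channels * depth_multiplier), min_depth), so the stock min_depth of 16 may also need lowering or it will clamp the 0.25-scaled depths back up.)
feature_extractor {
  type: 'ssd_mobilenet_v1'
  depth_multiplier: 0.25  # match the 0.25-width slim checkpoint being restored
  min_depth: 8            # at or below 32 * 0.25 = 8 so it does not override the scaled depths
  ...
}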
I'm having the same issue!
:point_up: changing the value of the boolean from_detection_checkpoint to false causes the program to crash with the error:
Traceback (most recent call last):
File "models/research/object_detection/train.py", line 163, in <module>
tf.app.run()
File "/home/.../anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "models/research/object_detection/train.py", line 159, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/home/.../TF_ObjectDetection_API/models/research/object_detection/trainer.py", line 255, in train
init_saver = tf.train.Saver(available_var_map)
File "/home/.../anaconda2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1218, in __init__
self.build()
File "/home/.../anaconda2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1227, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/home/.../anaconda2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1251, in _build
raise ValueError("No variables to save")
ValueError: No variables to save
Ok I figured it out :muscle:
If you want to use a classifier that was trained on ImageNet but train the detector yourself, you should download the models from the TF-Slim (classification) model zoo and not the detection model zoo. E.g. the MobileNet checkpoints are on the slim mobilenet_v1 page linked above.
:fire: Remember to change the depth_multiplier back to its original value if you had changed it.
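A sketch of the relevant train_config lines under that setup (paths are placeholders, and the checkpoint width still has to match the depth_multiplier in the model section, e.g. a 1.0-width checkpoint with depth_multiplier: 1.0):
train_config: {
  ...
  # a slim *classification* checkpoint (ImageNet-trained backbone), not a checkpoint from the detection zoo
  fine_tune_checkpoint: "path/to/mobilenet_v1_1.0_224.ckpt"
  # false: the checkpoint contains only the feature-extractor (classification) variables;
  # true is only for restoring a full detection checkpoint
  from_detection_checkpoint: false
  ...
}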
I am getting a similar error (InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match.)
while trying to retrain the ssd_mobilenet_v1_coco_2018_01_28 model.
I have downloaded the model and altered the config file to reflect the number of objects that I have, but I get the error. I see the solution above of changing the depth_multiplier, and that took me a bit farther I guess, but then I encountered this new error: NotFoundError (see above for traceback): Key FeatureExtractor/MobilenetV1/Conv2d_13_pointwise_1_Conv2d_2_1x1_64/BatchNorm/beta not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT64], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
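One way to see what is and is not inside the checkpoint, and with which shapes, is to dump its variable map and compare the names against the restore warnings/errors. A minimal sketch (the checkpoint path is a placeholder):
import tensorflow as tf

# Placeholder path: point this at the .ckpt prefix passed as fine_tune_checkpoint.
ckpt_path = "path/to/model.ckpt"

# List every variable stored in the checkpoint together with its shape, so the
# names and shapes can be compared against what the graph expects to restore.
reader = tf.train.NewCheckpointReader(ckpt_path)
for name, shape in sorted(reader.get_variable_to_shape_map().items()):
    print("%s %s" % (name, shape))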
@atabakd what are you talking about? What classifier / ImageNet?
It's about object detection and the COCO dataset! So of course models should be downloaded from the detection_model_zoo page: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
Hi There,
We are checking to see if you still need help on this, as this seems to be a considerably old issue. Please update this issue with the latest information, a code snippet to reproduce your issue, and the error you are seeing.
If we don't hear from you in the next 7 days, this issue will be closed automatically. If you don't need help on this issue any more, please consider closing it.