@davidsandberg How would you suggest fine-tuning the logits layer of inception_resnet_v2 on a new set of images (similar to what is explained in the tf-slim example "Fine-tuning a model from an existing checkpoint" here)? Their code example allows you to set a flag called
--checkpoint_exclude_scopes=InceptionV3/Logits,InceptionV3/AuxLogits/Logits that retrains the last layer of the network. I'm trying to figure out how to adapt your code to this use-case.
I have done a similar thing where a model was trained as a classifier and then fine-tuned using triplet loss. The trick is to make sure that the final layer is not restored from the stored model, and also that gradients are gated for all other parameters during training. You can have a look at the corresponding code in facenet_train.py, and more specifically the section
    # Create list with variables to restore
    restore_vars = []
    update_gradient_vars = []
    if args.pretrained_model:
        for var in tf.all_variables():
            if 'Embeddings/' not in var.op.name:
                restore_vars.append(var)
            else:
                update_gradient_vars.append(var)
    else:
        restore_vars = tf.all_variables()
        update_gradient_vars = tf.all_variables()
This is where restore_vars and update_gradient_vars are decided.
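To make the idea concrete without needing TensorFlow, here is a minimal sketch that partitions variable names the same way: everything outside the final layer goes on the restore list, and only the final layer goes on the update list. The helper name split_by_scope and the sample names are assumptions for illustration; the real code works on variable objects rather than strings.

```python
def split_by_scope(var_names, trainable_scope="Embeddings/"):
    """Partition names into (restore, update) lists.

    Names containing `trainable_scope` are treated as the final layer:
    they are not restored from the checkpoint and are the only ones updated.
    """
    restore, update = [], []
    for name in var_names:
        if trainable_scope in name:
            update.append(name)
        else:
            restore.append(name)
    return restore, update


# Hypothetical sample of variable names, for illustration only.
names = [
    "InceptionResnetV2/Conv2d_7b_1x1/weights",
    "Embeddings/weights",
    "Embeddings/biases",
]
restore, update = split_by_scope(names)
print(restore)
print(update)
```

In the training script the two lists would then be used separately: one for whatever restores the checkpoint, the other for the optimizer.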
Thanks. I've modified this block of code to skip over the logits layer in inception-resnet-v2:
    # If fine-tuning model on new task,
    # we must remove the final logits (classifier) layer
    exclusions = []
    if args.checkpoint_exclude_scopes:
        exclusions = [scope.strip()
                      for scope in args.checkpoint_exclude_scopes.split(',')]
    # Create list with variables to restore
    restore_vars = []
    update_gradient_vars = []
    if args.pretrained_model:
        for var in tf.all_variables():
            excluded = False
            for exclusion in exclusions:
                # skip this variable
                if var.op.name.startswith(exclusion):
                    excluded = True
                    break
            if not excluded:
                print(var.op.name)
                if 'Embeddings/' not in var.op.name:
                    restore_vars.append(var)
                else:
                    update_gradient_vars.append(var)
    else:
        restore_vars = tf.all_variables()
        update_gradient_vars = tf.all_variables()
When I print var.op.name, this seems to give me what I want. Here are the last few variable names it prints for the network:
InceptionResnetV2/Block8/Branch_0/Conv2d_1x1/weights
InceptionResnetV2/Block8/Branch_0/Conv2d_1x1/BatchNorm/beta
InceptionResnetV2/Block8/Branch_0/Conv2d_1x1/BatchNorm/moving_mean
InceptionResnetV2/Block8/Branch_0/Conv2d_1x1/BatchNorm/moving_variance
InceptionResnetV2/Block8/Branch_1/Conv2d_0a_1x1/weights
InceptionResnetV2/Block8/Branch_1/Conv2d_0a_1x1/BatchNorm/beta
InceptionResnetV2/Block8/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_mean
InceptionResnetV2/Block8/Branch_1/Conv2d_0a_1x1/BatchNorm/moving_variance
InceptionResnetV2/Block8/Branch_1/Conv2d_0b_1x3/weights
InceptionResnetV2/Block8/Branch_1/Conv2d_0b_1x3/BatchNorm/beta
InceptionResnetV2/Block8/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_mean
InceptionResnetV2/Block8/Branch_1/Conv2d_0b_1x3/BatchNorm/moving_variance
InceptionResnetV2/Block8/Branch_1/Conv2d_0c_3x1/weights
InceptionResnetV2/Block8/Branch_1/Conv2d_0c_3x1/BatchNorm/beta
InceptionResnetV2/Block8/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_mean
InceptionResnetV2/Block8/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_variance
InceptionResnetV2/Block8/Conv2d_1x1/weights
InceptionResnetV2/Block8/Conv2d_1x1/biases
InceptionResnetV2/Conv2d_7b_1x1/weights
InceptionResnetV2/Conv2d_7b_1x1/BatchNorm/beta
InceptionResnetV2/Conv2d_7b_1x1/BatchNorm/moving_mean
InceptionResnetV2/Conv2d_7b_1x1/BatchNorm/moving_variance
Embeddings/weights
Embeddings/biases
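For reference, the exclusion step on its own can be sketched in plain Python as below. The flag value and the sample names are hypothetical, and parse_exclusions and is_excluded are illustrative helpers rather than anything from facenet or tf-slim. Note that a prefix match on the logits scope leaves names such as Embeddings/biases untouched, which matches the printed list above.

```python
def parse_exclusions(flag_value):
    """Split a comma-separated scope flag into a list of stripped prefixes."""
    if not flag_value:
        return []
    return [scope.strip() for scope in flag_value.split(",") if scope.strip()]


def is_excluded(name, exclusions):
    """True if the variable name starts with any excluded scope prefix."""
    return any(name.startswith(prefix) for prefix in exclusions)


# Hypothetical flag value and variable names, for illustration only.
exclusions = parse_exclusions("InceptionResnetV2/Logits, InceptionResnetV2/AuxLogits")
names = [
    "InceptionResnetV2/Conv2d_7b_1x1/weights",
    "InceptionResnetV2/Logits/Logits/weights",
    "Embeddings/biases",
]
kept = [n for n in names if not is_excluded(n, exclusions)]
print(kept)
```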
However, the run keeps crashing due to NotFoundError (see above for traceback): Tensor name "Embeddings/biases" not found in checkpoint files /tmp/checkpoints/inception_resnet_v2.ckpt.
Below is the traceback:
Traceback (most recent call last):
  File "facenet_train.py", line 343, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "facenet_train.py", line 164, in main
    saver.restore(sess, pretrained_model)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1389, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 718, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 916, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 966, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 986, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Tensor name "Embeddings/biases" not found in checkpoint files /tmp/checkpoints/inception_resnet_v2.ckpt
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
[[Node: save/RestoreV2_270/_2223 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_1998_save/RestoreV2_270", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
Caused by op u'save/RestoreV2', defined at:
  File "facenet_train.py", line 343, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "facenet_train.py", line 144, in main
    saver = tf.train.Saver(tf.all_variables(), max_to_keep=3)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1000, in __init__
    self.build()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1030, in build
    restore_sequentially=self._restore_sequentially)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 624, in build
    restore_sequentially, reshape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 361, in _AddRestoreOps
    tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 200, in restore_op
    [spec.tensor.dtype])[0])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 439, in restore_v2
    dtypes=dtypes, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 750, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2238, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1130, in __init__
    self._traceback = _extract_stack()
NotFoundError (see above for traceback): Tensor name "Embeddings/biases" not found in checkpoint files /tmp/checkpoints/inception_resnet_v2.ckpt
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
[[Node: save/RestoreV2_270/_2223 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_1998_save/RestoreV2_270", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
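A possible reading of the traceback: the "Caused by op" part points at the saver created on line 144 with tf.all_variables(), so the restore asks the checkpoint for every variable in the graph, including Embeddings/biases, no matter what restore_vars contains. If that is the case, building the saver used for restoring from restore_vars only (tf.train.Saver accepts a variable list) may avoid the error. The sketch below uses plain Python and hypothetical name lists to show the check; missing_from_checkpoint is an illustrative helper, not part of facenet or TensorFlow.

```python
def missing_from_checkpoint(requested_names, checkpoint_names):
    """Return the requested variable names that the checkpoint does not contain."""
    available = set(checkpoint_names)
    return [name for name in requested_names if name not in available]


# Hypothetical name lists, for illustration only.
checkpoint_names = [
    "InceptionResnetV2/Conv2d_7b_1x1/weights",
    "InceptionResnetV2/Logits/Logits/weights",
]
all_variable_names = [
    "InceptionResnetV2/Conv2d_7b_1x1/weights",
    "Embeddings/weights",
    "Embeddings/biases",
]
# Same filter as the restore_vars logic: leave out the new final layer.
restore_names = [n for n in all_variable_names if "Embeddings/" not in n]

# Restoring every variable would request names the checkpoint lacks.
print(missing_from_checkpoint(all_variable_names, checkpoint_names))
# Restoring only the filtered list requests nothing that is missing.
print(missing_from_checkpoint(restore_names, checkpoint_names))
```

If the first list is non-empty for the real graph and checkpoint, the saver doing the restore is probably being given more variables than the checkpoint holds.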
Closing due to inactivity. Open if needed.