Models: No biases in inception_v4 checkpoint file

Created on 18 Feb 2017 · 4 Comments · Source: tensorflow/models

Hi I'm trying to use the inception_v4 code and checkpoint file as a starting point for my model. I'm not using the slim scripts directly but am importing the inception network and using

slim.assign_from_checkpoint_fn(
    model_path + "inception_v4.ckpt",
    weights,
    ignore_missing_vars=True)

where weights is the list of network variables, excluding the slot variables created by the Adam optimizer. I'm getting a lot of warnings like:

WARNING:tensorflow:Variable InceptionV4/Mixed_6f/Branch_1/Conv2d_0a_1x1/biases missing in checkpoint

It's only the biases that are reported missing. After inspecting the checkpoint file manually with tf.train.NewCheckpointReader, I can confirm that there are in fact no bias variables saved in the checkpoint. Is this intentional, or am I missing something? Why would nets.inception_v4.inception_v4 create a network with biases that aren't saved in the checkpoint file? Any help appreciated!
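For reference, a minimal sketch of this kind of checkpoint inspection (the checkpoint path is a placeholder, and this assumes a TF 1.x environment):

import tensorflow as tf

# Placeholder path -- point this at the downloaded inception_v4.ckpt
checkpoint_path = "inception_v4.ckpt"

reader = tf.train.NewCheckpointReader(checkpoint_path)
var_to_shape = reader.get_variable_to_shape_map()

# Print every variable name stored in the checkpoint, with its shape
for name in sorted(var_to_shape):
    print(name, var_to_shape[name])

# Per this thread, the conv-layer bias variables from the warnings
# should not show up in this list
bias_vars = [name for name in var_to_shape if name.endswith("biases")]
print("bias variables found:", len(bias_vars))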

community support bug

All 4 comments

@sguada, do you have any ideas, and if not, could you take a look? Thanks in advance!

@lucaswiser

Since the convolutional layers use batch normalization, the biases are not needed (this is not a bug). However, I believe your issue is simply that you are not setting the correct arg_scope when loading the model.

I've tested the following and it loads fine (note you can easily replace the saver load with slim.assign_from_checkpoint if you want):

import tensorflow as tf
import tensorflow.contrib.slim as slim
from nets.inception_v4 import inception_v4, inception_v4_arg_scope

# checkpoint_file is the path to the downloaded inception_v4.ckpt
sess = tf.Session()
arg_scope = inception_v4_arg_scope()
input_tensor = tf.placeholder(tf.float32, (None, 299, 299, 3))
with slim.arg_scope(arg_scope):
    logits, end_points = inception_v4(input_tensor, is_training=False)
saver = tf.train.Saver()
saver.restore(sess, checkpoint_file)
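For completeness, a minimal sketch of the slim.assign_from_checkpoint_fn variant mentioned above (checkpoint_file is still a placeholder path to inception_v4.ckpt; with the correct arg_scope, ignore_missing_vars should no longer be needed):

import tensorflow as tf
import tensorflow.contrib.slim as slim
from nets.inception_v4 import inception_v4, inception_v4_arg_scope

input_tensor = tf.placeholder(tf.float32, (None, 299, 299, 3))
with slim.arg_scope(inception_v4_arg_scope()):
    logits, end_points = inception_v4(input_tensor, is_training=False)

# Build a function that restores the model variables, then run it in a session
init_fn = slim.assign_from_checkpoint_fn(
    checkpoint_file,
    slim.get_model_variables("InceptionV4"))

sess = tf.Session()
init_fn(sess)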

Automatically closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!

I'm having a similar problem with the attention_ocr model. I downloaded the pretrained checkpoint file from http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz and followed the inference instructions at https://github.com/tensorflow/models/tree/master/attention_ocr#how-to-use-a-pre-trained-model

This is my code:

CHECKPOINT_PATH = "checkpoints/model.ckpt-399731"
dataset = common_flags.create_dataset(split_name="test")
model = common_flags.create_model(dataset.num_char_classes,
                                    dataset.max_sequence_length,
                                    dataset.num_of_views, dataset.null_code)
endpoints = model.create_base(images_placeholder, labels_one_hot=None)
restore_op = model.create_init_fn_to_restore(CHECKPOINT_PATH , None)
with tf.Session() as sess:
    #sess.run(tf.initialize_all_variables())
    restore_op(sess)
    predictions = sess.run(endpoints.predicted_chars, feed_dict={images_placeholder:image})

It fails with an error:

Checkpoint is missing variable [AttentionOcr_v1/sequence_logit_fn/SQLR/LSTM/attention_decoder/Attention_0/bias]

I compared the list of checkpoint variables with the model variables (a sketch of the comparison follows the list below) and found these ones missing from the checkpoint:

  • AttentionOcr_v1/sequence_logit_fn/SQLR/LSTM/attention_decoder/Attention_0/bias
  • AttentionOcr_v1/sequence_logit_fn/SQLR/LSTM/attention_decoder/lstm_cell/kernel
  • AttentionOcr_v1/sequence_logit_fn/SQLR/LSTM/attention_decoder/AttnOutputProjection/kernel
  • AttentionOcr_v1/sequence_logit_fn/SQLR/LSTM/attention_decoder/AttnOutputProjection/bias
  • AttentionOcr_v1/sequence_logit_fn/SQLR/LSTM/attention_decoder/Attention_0/kernel
  • AttentionOcr_v1/sequence_logit_fn/SQLR/LSTM/attention_decoder/bias
  • AttentionOcr_v1/sequence_logit_fn/SQLR/LSTM/attention_decoder/lstm_cell/bias
  • AttentionOcr_v1/sequence_logit_fn/SQLR/LSTM/attention_decoder/kernel
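A minimal sketch of how such a comparison can be done, assuming the model graph has already been built as in the snippet above (TF 1.x, same CHECKPOINT_PATH):

import tensorflow as tf

reader = tf.train.NewCheckpointReader(CHECKPOINT_PATH)
ckpt_vars = set(reader.get_variable_to_shape_map())

# Use v.op.name so the ":0" suffix is stripped before comparing names
graph_vars = set(v.op.name for v in tf.global_variables())

print("In graph but not in checkpoint:")
for name in sorted(graph_vars - ckpt_vars):
    print(" ", name)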

I'm using TensorFlow version 1.2.0.
