slim/nets/inception_resnet_v2.py
Hi,
I am trying to build the inception_resnet_v2 model and restore the weights from the released checkpoint file (http://download.tensorflow.org/models/inception_resnet_v2_2016_08_30.tar.gz). I am using the following code to build, restore, and classify with the model:
import tensorflow as tf
slim = tf.contrib.slim
from PIL import Image
from inception_resnet_v2 import *
import numpy as np

checkpoint_file = 'inception_resnet_v2_2016_08_30.ckpt'
sample_images = ['dog.jpg', 'panda.jpg']

# Raw [0, 255] pixels are fed straight in -- no preprocessing
# (this turns out to be the bug; see below)
input_tensor = tf.placeholder(tf.float32, shape=(None, 299, 299, 3), name='input_image')

# Load the model
sess = tf.Session()
arg_scope = inception_resnet_v2_arg_scope()
with slim.arg_scope(arg_scope):
    logits, end_points = inception_resnet_v2(input_tensor, is_training=False)
saver = tf.train.Saver()
saver.restore(sess, checkpoint_file)

for image in sample_images:
    im = Image.open(image).resize((299, 299))
    im = np.array(im)
    im = im.reshape(-1, 299, 299, 3)
    predict_values, logit_values = sess.run([end_points['Predictions'], logits], feed_dict={input_tensor: im})
    print(np.max(predict_values), np.max(logit_values))
    print(np.argmax(predict_values), np.argmax(logit_values))
It classifies both the dog and the panda incorrectly as belonging to the same class (class number 918), with very high confidence (~100%).
Is it just these two examples, or are all classes messed up? Could it be that the model is simply not good at this specific class?
Figured out the issue: the image needs to be preprocessed in a certain way for the model to work. Interested folks can refer to http://stackoverflow.com/questions/39582703/using-pre-trained-inception-resnet-v2-with-tensorflow/39597537#39597537 for more details. Since this resolves the problem, this issue can be closed.
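For reference, the preprocessing in question boils down to mapping raw [0, 255] pixel values into the [-1, 1] range the network was trained on. A minimal numpy sketch (the in-graph TensorFlow version appears in the working code below):

    im = np.asarray(im, dtype=np.float32)
    im = (im / 255.0 - 0.5) * 2.0  # maps [0, 255] to [-1, 1]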
Thanks
@deepakrox could you share the complete code? I have a similar issue with the VGG model; I applied preprocessing but the results are still not right. Thanks!
@flyfj Working code attached
import tensorflow as tf
slim = tf.contrib.slim
from PIL import Image
from inception_resnet_v2 import *
import numpy as np

checkpoint_file = 'inception_resnet_v2_2016_08_30.ckpt'
sample_images = ['dog.jpg', 'panda.jpg']

# Placeholder takes raw [0, 255] images; the graph rescales them to [-1, 1].
input_tensor = tf.placeholder(tf.float32, shape=(None, 299, 299, 3), name='input_image')
scaled_input_tensor = tf.scalar_mul((1.0 / 255), input_tensor)
scaled_input_tensor = tf.sub(scaled_input_tensor, 0.5)  # tf.subtract in TF >= 1.0
scaled_input_tensor = tf.mul(scaled_input_tensor, 2.0)  # tf.multiply in TF >= 1.0

# Load the model
sess = tf.Session()
arg_scope = inception_resnet_v2_arg_scope()
with slim.arg_scope(arg_scope):
    logits, end_points = inception_resnet_v2(scaled_input_tensor, is_training=False)
saver = tf.train.Saver()
saver.restore(sess, checkpoint_file)

for image in sample_images:
    im = Image.open(image).resize((299, 299))
    im = np.array(im)
    im = im.reshape(-1, 299, 299, 3)
    predict_values, logit_values = sess.run([end_points['Predictions'], logits], feed_dict={input_tensor: im})
    print(np.max(predict_values), np.max(logit_values))
    print(np.argmax(predict_values), np.argmax(logit_values))
@deepakrox Hi, I used your code to classify the "cropped_panda.jpg" from tutorials/image/imagenet/classify_image.py and got (0.83959323, 8.7706404) (389, 389).
Additionally, I changed the input to:
decode_jpeg_data = tf.placeholder(tf.string)
decode_jpeg = tf.image.decode_jpeg(decode_jpeg_data, channels=3)
if decode_jpeg.dtype != tf.float32:
    # convert_image_dtype also rescales uint8 [0, 255] to float [0, 1]
    decode_jpeg = tf.image.convert_image_dtype(decode_jpeg, dtype=tf.float32)
image = tf.expand_dims(decode_jpeg, 0)
image = tf.image.resize_bilinear(image, [299, 299], align_corners=False)
image = tf.sub(image, 0.5)  # remaining shift/scale from [0, 1] to [-1, 1]
image = tf.mul(image, 2.0)
The outputs are slightly different:
(0.78367269, 8.4608421) (389, 389)
I think the official inception_resnet_v2 model uses slim/preprocess* and tf.image.decode_jpeg, so it may be better to use tf.image.decode_jpeg rather than PIL.Image to stay consistent with the training process.
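For what it's worth, a sketch of that approach, assuming the inception_preprocessing module from slim's preprocessing directory is importable (the import path here reflects the TF models repo layout at the time and may differ in your checkout):

    import tensorflow as tf
    from preprocessing import inception_preprocessing  # from the slim directory of the models repo

    jpeg_data = tf.placeholder(tf.string)
    decoded = tf.image.decode_jpeg(jpeg_data, channels=3)
    # is_training=False takes the eval path: central crop, bilinear resize, scale to [-1, 1]
    processed = inception_preprocessing.preprocess_image(decoded, 299, 299, is_training=False)
    processed = tf.expand_dims(processed, 0)  # add the batch dimension

    # Because preprocess_image already outputs values in [-1, 1], this tensor should be
    # passed to inception_resnet_v2 directly, without the manual /255, -0.5, *2 rescaling.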
@flyfj did you figure out the issue with VGG, and did it work for you? Would it be possible to share the code, so that I can make the same changes in my feature-extraction code? Thank you for the help.
@deepakrox @D-X-Y
Can we directly use the following for preprocessing?

for image in sample_images:
    decoded_jpeg = tf.image.decode_jpeg(image)
    im = preprocess_for_train(decoded_jpeg, 299, 299)
    predict_values, logit_values = sess.run([end_points['Predictions'], logits], feed_dict={input_tensor: im})
    ..... #code as given by @deepakrox
@D-X-Y
Hi, I tried the same code with the panda image and also got 389, but that is not the correct label, is it? I think 169 should be the correct result...
I'm struggling to get a correct prediction with the slim pretrained models. I've tried different models and different preprocessing, but I always get 389, which I think is the wrong result. Does anyone have comments on this? Thanks.
EDIT:
It looks like there are two ground-truth labelings for ImageNet? I think "panda" = 389 is correct for the slim models. But for some other models I got 169 as the correct prediction, and "ILSVRC2012_validation_ground_truth.txt" also gives "panda" the label 169.
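One way to check which labeling a given slim checkpoint uses is to map the predicted index to a human-readable name with the create_readable_names_for_imagenet_labels() helper from slim's datasets/imagenet.py; a sketch, assuming the slim directory is on the Python path:

    from datasets import imagenet  # slim's dataset utilities

    # Slim models output 1001 classes: index 0 is 'background', so the usual
    # 1000-class ImageNet indices are shifted by one relative to other label files.
    names = imagenet.create_readable_names_for_imagenet_labels()
    print(names[389])  # should print the giant panda synset if 389 is the panda class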
I had a similar problem: the model gave the same result with high confidence for different images. But in my case, I had forgotten to add the biases to the "up" tensor in block35, block17 and block8:

up = tf.nn.bias_add(up, bias)

The problem disappeared after correcting this.
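For context, a sketch of where that bias belongs in a manual re-implementation of the residual blocks (the names W, bias, mixed, and scale are illustrative, not from the official inception_resnet_v2.py, which handles biases inside slim.conv2d):

    # 1x1 projection of the concatenated branch outputs back to the input depth
    up = tf.nn.conv2d(mixed, W, strides=[1, 1, 1, 1], padding='SAME')
    up = tf.nn.bias_add(up, bias)  # the step that was missing
    # scaled residual connection, as in block35 / block17 / block8
    net += scale * up
    net = tf.nn.relu(net)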