Models: How print the prediction probabilities in `eval_image_classifier.py`

Created on 1 Apr 2017 · 17 comments · Source: tensorflow/models

I followed the tutorial at https://github.com/tensorflow/models/tree/master/slim to fine-tune the inception_v3 model on pathological images. The image data was converted to and stored as TFRecord.

After fine-tuning, I ran eval_image_classifier.py on the validation set with the new model, but eval_image_classifier.py only prints the accuracy.

What I want is to print the probabilities for every class. Can anyone help me with this?

Thanks in advance for any advice.

Labels: awaiting maintainer, feature

All 17 comments

This seems like it would be a great addition. @nealwu, could you take a look at adding this?

@nealwu, could you please take a look?

@aselle, after reading the slim code, I wrote a Python script, classify_image.py, that can predict for multiple images given either as raw image files or as a TFRecord. The source code is available from my notebook.

The source code is also posted below:

#!/usr/bin/env python

from __future__ import print_function
import sys
sys.path.append('../../tensorflow/models/slim/') # add slim to PYTHONPATH
import tensorflow as tf

tf.app.flags.DEFINE_integer('num_classes', 5, 'The number of classes.')
tf.app.flags.DEFINE_string('infile', None, 'Image file, one image per line.')
tf.app.flags.DEFINE_boolean('tfrecord', False, 'Input file is formatted as TFRecord.')
tf.app.flags.DEFINE_string('outfile', None, 'Output file for prediction probabilities.')
tf.app.flags.DEFINE_string('model_name', 'resnet_v1_50', 'The name of the architecture to evaluate.')
tf.app.flags.DEFINE_string('preprocessing_name', None, 'The name of the preprocessing to use. If left as `None`, then the model_name flag is used.')
tf.app.flags.DEFINE_string('checkpoint_path', 'finetuned_checkpoints/resnet_v1_50/', 'The directory where the model was written to or an absolute path to a checkpoint file.')
tf.app.flags.DEFINE_integer('eval_image_size', None, 'Eval image size.')
FLAGS = tf.app.flags.FLAGS

import numpy as np
import os

from datasets import imagenet
from nets import inception
from nets import resnet_v1
from nets import inception_utils
from nets import resnet_utils
from preprocessing import inception_preprocessing
from nets import nets_factory
from preprocessing import preprocessing_factory

slim = tf.contrib.slim

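# Maps each model_name to the variable scope prefix under which its weights are stored in the checkpoint.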
model_name_to_variables = {
    'inception_v3': 'InceptionV3',
    'inception_v4': 'InceptionV4',
    'resnet_v1_50': 'resnet_v1_50',
    'resnet_v1_152': 'resnet_v1_152',
}

preprocessing_name = FLAGS.preprocessing_name or FLAGS.model_name
eval_image_size = FLAGS.eval_image_size

if FLAGS.tfrecord:
  fls = tf.python_io.tf_record_iterator(path=FLAGS.infile)
else:
  fls = [s.strip() for s in open(FLAGS.infile)]

model_variables = model_name_to_variables.get(FLAGS.model_name)
if model_variables is None:
  tf.logging.error("Unknown model_name provided `%s`." % FLAGS.model_name)
  sys.exit(-1)

if FLAGS.tfrecord:
  tf.logging.warn('Image name is not available in TFRecord file.')

if tf.gfile.IsDirectory(FLAGS.checkpoint_path):
  checkpoint_path = tf.train.latest_checkpoint(FLAGS.checkpoint_path)
else:
  checkpoint_path = FLAGS.checkpoint_path

image_string = tf.placeholder(tf.string) # Entry to the computational graph, e.g. image_string = tf.gfile.FastGFile(image_file).read()

#image = tf.image.decode_image(image_string, channels=3)
image = tf.image.decode_jpeg(image_string, channels=3, try_recover_truncated=True, acceptable_fraction=0.3)  # tolerate truncated/corrupted JPEG files

image_preprocessing_fn = preprocessing_factory.get_preprocessing(preprocessing_name, is_training=False)

network_fn = nets_factory.get_network_fn(FLAGS.model_name, FLAGS.num_classes, is_training=False)

if FLAGS.eval_image_size is None:
  eval_image_size = network_fn.default_image_size

processed_image = image_preprocessing_fn(image, eval_image_size, eval_image_size)

processed_images = tf.expand_dims(processed_image, 0) # Or tf.reshape(processed_image, (1, eval_image_size, eval_image_size, 3))

logits, _ = network_fn(processed_images)

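# logits has shape [1, num_classes]; softmax converts them into per-class probabilities.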
probabilities = tf.nn.softmax(logits)

init_fn = slim.assign_from_checkpoint_fn(checkpoint_path, slim.get_model_variables(model_variables))

sess = tf.Session()
init_fn(sess)

fout = sys.stdout
if FLAGS.outfile is not None:
  fout = open(FLAGS.outfile, 'w')
h = ['image']
h.extend(['class%s' % i for i in range(FLAGS.num_classes)])
h.append('predicted_class')
print('\t'.join(h), file=fout)


for fl in fls:
  image_name = None

  try:
    if FLAGS.tfrecord is False:
      x = tf.gfile.FastGFile(fl, 'rb').read() # You can also use x = open(fl, 'rb').read()
      image_name = os.path.basename(fl)
    else:
      example = tf.train.Example()
      example.ParseFromString(fl)

      # Note: The key of example.features.feature depends on how you generate tfrecord.
      x = example.features.feature['image/encoded'].bytes_list.value[0] # retrieve image string

      image_name = 'TFRecord'

    probs = sess.run(probabilities, feed_dict={image_string:x})
    #np_image, network_input, probs = sess.run([image, processed_image, probabilities], feed_dict={image_string:x})

  except Exception as e:
    tf.logging.warn('Cannot process image file %s: %s' % (fl, e))
    continue

  probs = probs[0]  # strip the batch dimension
  a = [image_name]
  a.extend(probs)
  a.append(np.argmax(probs))
  print('\t'.join([str(e) for e in a]), file=fout)

sess.close()
if fout is not sys.stdout:
  fout.close()
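For reference, a hypothetical invocation (the file names are placeholders; the flags are the ones defined at the top of the script): python classify_image.py --num_classes=5 --infile=image_list.txt --model_name=inception_v3 --checkpoint_path=finetuned_checkpoints/inception_v3/ --outfile=predictions.tsv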

@lixiangchun Any idea how to print the ground truths and plot the confusion matrix using tf.confusion_matrix (using eval_image_classifier.py)?

@anne1994 This works for me:

def _create_local(name, shape, collections=None, validate_shape=True,
                  dtype=tf.float32):
    """Creates a new local variable.
    Args:
      name: The name of the new or existing variable.
      shape: Shape of the new or existing variable.
      collections: A list of collection names to which the Variable will be added.
      validate_shape: Whether to validate the shape of the variable.
      dtype: Data type of the variables.
    Returns:
      The created variable.
    """
    # Make sure local variables are added to tf.GraphKeys.LOCAL_VARIABLES
    collections = list(collections or [])
    collections += [tf.GraphKeys.LOCAL_VARIABLES]
    return variables.Variable(
        initial_value=tf.zeros(shape, dtype=dtype),
        name=name,
        trainable=False,
        collections=collections,
        validate_shape=validate_shape)


# Function to aggregate confusion
def _get_streaming_metrics(prediction, label, num_classes):
    with tf.name_scope("eval"):
        batch_confusion = tf.confusion_matrix(label, prediction,
                                              num_classes=num_classes,
                                              name='batch_confusion')

        confusion = _create_local('confusion_matrix',
                                  shape=[num_classes, num_classes],
                                  dtype=tf.int32)
        # Create the update op for doing a "+=" accumulation on the batch
        confusion_update = confusion.assign(confusion + batch_confusion)
        # Cast counts to float so tf.summary.image renormalizes to [0,255]
        confusion_image = tf.reshape(tf.cast(confusion, tf.float32),
                                     [1, num_classes, num_classes, 1])

    return confusion, confusion_update
# Define the metrics:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'Accuracy': slim.metrics.streaming_accuracy(predictions, labels),
    'Recall_5': slim.metrics.streaming_recall_at_k(
        logits, labels, 5),
    'Mean_absolute': tf.metrics.mean_absolute_error(labels,
                                                    predictions),
    'Confusion_matrix': _get_streaming_metrics(labels, predictions,
                                               dataset.num_classes - FLAGS.labels_offset),
})

[confusion_matrix] = slim.evaluation.evaluate_once(
    master=FLAGS.master,
    checkpoint_path=checkpoint_path,
    logdir=FLAGS.eval_dir,
    num_evals=num_batches,
    eval_op=list(names_to_updates.values()),
    variables_to_restore=variables_to_restore,
    session_config=session_config,
    final_op=[names_to_updates['Confusion_matrix']])
print(confusion_matrix)
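If you want something more readable than the raw count matrix, you can post-process it with numpy after evaluate_once returns; a minimal sketch (plain numpy; row-normalizing into per-class fractions is just one option):

import numpy as np

conf = np.asarray(confusion_matrix, dtype=np.float64)
# With tf.confusion_matrix(labels, predictions), rows are true labels and
# columns are predicted labels.
row_sums = conf.sum(axis=1, keepdims=True)
per_class = np.divide(conf, row_sums, out=np.zeros_like(conf), where=row_sums > 0)
print(np.round(per_class, 3))  # entry (i, j): fraction of class-i examples predicted as class j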

I have the original eval_image_classifier.py with some innocuous additions, and using your code as-is throws this error:

InvalidArgumentError (see above for traceback): tags and values not the same shape: [] != [21,21] (tag 'eval/Confusion_matrix')
 [[Node: eval/Confusion_matrix = ScalarSummary[T=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](eval/Confusion_matrix/tags, eval/confusion_matrix/read)]]

The code is:

def _create_local(name, shape, collections=None, validate_shape=True,
                  dtype=tf.float32):
  """Creates a new local variable.
  Args:
    name: The name of the new or existing variable.
    shape: Shape of the new or existing variable.
    collections: A list of collection names to which the Variable will be added.
    validate_shape: Whether to validate the shape of the variable.
    dtype: Data type of the variables.
  Returns:
    The created variable.
  """
  # Make sure local variables are added to tf.GraphKeys.LOCAL_VARIABLES
  collections = list(collections or [])
  collections += [tf.GraphKeys.LOCAL_VARIABLES]
  return variables.Variable(
      initial_value=tf.zeros(shape, dtype=dtype),
      name=name,
      trainable=False,
      collections=collections,
      validate_shape=validate_shape)


# Function to aggregate confusion
def _get_streaming_metrics(prediction, label, num_classes):
  with tf.name_scope("eval"):
    batch_confusion = tf.confusion_matrix(label, prediction,
                                          num_classes=num_classes,
                                          name='batch_confusion')

    confusion = _create_local('confusion_matrix',
                              shape=[num_classes, num_classes],
                              dtype=tf.int32)
    # Create the update op for doing a "+=" accumulation on the batch
    confusion_update = confusion.assign(confusion + batch_confusion)
    # Cast counts to float so tf.summary.image renormalizes to [0,255]
    confusion_image = tf.reshape(tf.cast(confusion, tf.float32),
                                 [1, num_classes, num_classes, 1])

  return confusion, confusion_update


def main(_):
  if not FLAGS.dataset_dir:
    raise ValueError('You must supply the dataset directory with --dataset_dir')
  tf.logging.set_verbosity(tf.logging.INFO)
  with tf.Graph().as_default():
    tf_global_step = slim.get_or_create_global_step()

    ######################
    # Select the dataset #
    ######################
    dataset = dataset_factory.get_dataset(
        FLAGS.dataset_name, FLAGS.dataset_split_name, FLAGS.dataset_dir)

    ####################
    # Select the model #
    ####################
    network_fn = nets_factory.get_network_fn(
        FLAGS.model_name,
        num_classes=(dataset.num_classes - FLAGS.labels_offset),
        is_training=False)

    ##############################################################
    # Create a dataset provider that loads data from the dataset #
    ##############################################################
    provider = slim.dataset_data_provider.DatasetDataProvider(
        dataset,
        shuffle=False,
        common_queue_capacity=2 * FLAGS.batch_size,
        common_queue_min=FLAGS.batch_size)
    [image, label, filename] = provider.get(['image', 'label', 'filename'])  # added filename
    label -= FLAGS.labels_offset

    #####################################
    # Select the preprocessing function #
    #####################################
    preprocessing_name = FLAGS.preprocessing_name or FLAGS.model_name
    image_preprocessing_fn = preprocessing_factory.get_preprocessing(
        preprocessing_name,
        is_training=False)

    # Gather initial summaries.
    summaries = set(tf.get_collection(tf.GraphKeys.SUMMARIES))

    eval_image_size = FLAGS.eval_image_size or network_fn.default_image_size

    image = image_preprocessing_fn(image, eval_image_size, eval_image_size)

    images, labels, filenames = tf.train.batch(
        [image, label, filename],
        batch_size=FLAGS.batch_size,
        num_threads=FLAGS.num_preprocessing_threads,
        capacity=5 * FLAGS.batch_size)

    ####################
    # Define the model #
    ####################
    logits, _ = network_fn(images)

    if FLAGS.moving_average_decay:
      variable_averages = tf.train.ExponentialMovingAverage(
          FLAGS.moving_average_decay, tf_global_step)
      variables_to_restore = variable_averages.variables_to_restore(
          slim.get_model_variables())
      variables_to_restore[tf_global_step.op.name] = tf_global_step
    else:
      variables_to_restore = slim.get_variables_to_restore()

    probabilities = tf.nn.softmax(logits)  # added for probabilities
    predictions = tf.argmax(logits, 1)

    labels = tf.squeeze(labels)
    # Define the metrics:
    names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
        'Accuracy': slim.metrics.streaming_accuracy(predictions, labels),
        'Recall_5': slim.metrics.streaming_recall_at_k(
            logits, labels, 5),
        'Mean_absolute': tf.metrics.mean_absolute_error(labels,
                                                        predictions),
        'Confusion_matrix': _get_streaming_metrics(labels, predictions,
                                                   dataset.num_classes - FLAGS.labels_offset),
    })

    # Get the mislabeled filenames and print them in the evaluation loop
    mislabeled = tf.not_equal(predictions, labels)
    mislabeled_filenames = tf.boolean_mask(filenames, mislabeled)
    #eval_op = tf.Print(list(names_to_updates.values()), [filenames, mislabeled_filenames, predictions], message="mislabeled_filenames:", summarize=100)
    #eval_op = tf.Print(list(names_to_updates.values()), [filenames, predictions], message="filenames and predictions:", summarize=100)

    # Print the summaries to screen.
    for name, value in names_to_values.items():
      summary_name = 'eval/%s' % name
      op = tf.summary.scalar(summary_name, value, collections=[])
      op = tf.Print(op, [value], summary_name)
      tf.add_to_collection(tf.GraphKeys.SUMMARIES, op)

    if FLAGS.max_num_batches:
      num_batches = FLAGS.max_num_batches
    else:
      # This ensures that we make a single pass over all of the data.
      num_batches = math.ceil(dataset.num_samples / float(FLAGS.batch_size))

    if tf.gfile.IsDirectory(FLAGS.checkpoint_path):
      checkpoint_path = tf.train.latest_checkpoint(FLAGS.checkpoint_path)
    else:
      checkpoint_path = FLAGS.checkpoint_path

    tf.logging.info('Evaluating %s' % checkpoint_path)
    config = tf.ConfigProto(device_count={'GPU': 0})  # mask GPUs visible to the session so it falls back on CPU

    [confusion_matrix] = slim.evaluation.evaluate_once(
        master=FLAGS.master,
        checkpoint_path=checkpoint_path,
        logdir=FLAGS.eval_dir,
        num_evals=num_batches,
        eval_op=[names_to_updates['Confusion_matrix']],
        #eval_op=list(names_to_updates.values()),  # original
        #eval_op=eval_op,
        variables_to_restore=variables_to_restore,
        #eval_interval_secs=FLAGS.eval_interval,
        #session_config=config,  # actual fix for CPU training only
        #final_op=[names_to_updates['Confusion_matrix']]
    )

    print(confusion_matrix)


if __name__ == '__main__':
  tf.app.run()

Would you be so kind as to check my code above? It is the standard code from the slim library plus the additions shown.

I solved it. It only took a minimal change.

(1) If 'variables' is not known, check whether you have imported the variables library.
(2) In my code posted on GitHub, you just need to comment out the summary-printing part at the end:

# Print the summaries to screen.
for name, value in names_to_values.items():
  summary_name = 'eval/%s' % name
  op = tf.summary.scalar(summary_name, value, collections=[])
  op = tf.Print(op, [value], summary_name)
  tf.add_to_collection(tf.GraphKeys.SUMMARIES, op)

(remember to comment out or remove the code above), and the last part of the evaluation call becomes:

[confusion_matrix] = slim.evaluation.evaluate_once(
    master=FLAGS.master,
    checkpoint_path=checkpoint_path,
    logdir=FLAGS.eval_dir,
    num_evals=num_batches,
    #eval_op=eval_op,
    eval_op=list(names_to_updates.values()),
    variables_to_restore=variables_to_restore,
    #session_config=session_config,
    final_op=[names_to_updates['Confusion_matrix']])
print(confusion_matrix)

Hope it helps!
K.

How does one add metrics that require predictions to the training loop (using train_image_classifier.py)? I would like to monitor training with mean absolute error in addition to the loss. Also, slim.evaluation.evaluation_loop appears to do nothing for me, while evaluate_once with eval_image_classifier.py works.

see also https://stackoverflow.com/questions/46781847/how-periodicaly-evaluate-the-performance-of-models-in-tf-slim

@KaneFury Thank you for your solution. Now I have a new problem. For example, I have 100 evaluation examples, split into 4 batches: the first three with batch size 30 and the last with size 10. The first three batches evaluate fine, but when it comes to the last batch (smaller than the previous three), an error occurs.
Added: Sorry about that; the error occurred because I forgot to set --labels_offset=1 when doing evaluation. It has nothing to do with the size of the last batch.

Caused by op 'eval/batch_confusion/assert_less_1/Assert/AssertGuard/Assert', defined at:
  File "eval_image_classifier.py", line 252, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "eval_image_classifier.py", line 209, in main
    dataset.num_classes - FLAGS.labels_offset),
  File "eval_image_classifier.py", line 118, in _get_streaming_metrics
    name='batch_confusion')
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/confusion_matrix.py", line 181, in confusion_matrix
    message='`predictions` out of bound')],
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/check_ops.py", line 401, in assert_less
    return control_flow_ops.Assert(condition, data, summarize=summarize)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/tf_should_use.py", line 175, in wrapped
    return _add_should_use_warning(fn(*args, **kwargs))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 131, in Assert
    condition, no_op, true_assert, name="AssertGuard")
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 296, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 1828, in cond
    orig_res_f, res_f = context_f.BuildCondBranch(false_fn)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 1694, in BuildCondBranch
    original_result = fn()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 129, in true_assert
    condition, data, summarize, name="Assert")
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_logging_ops.py", line 35, in _assert
    summarize=summarize, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): assertion failed: [`predictions` out of bound] [Condition x < y did not hold element-wise:x (eval/batch_confusion/control_dependency_1:0) = ] [199 199 199...] [y (eval/batch_confusion/Cast_2:0) = ] [200]
     [[Node: eval/batch_confusion/assert_less_1/Assert/AssertGuard/Assert = Assert[T=[DT_STRING, DT_STRING, DT_INT64, DT_STRING, DT_INT64], summarize=3, _device="/job:localhost/replica:0/task:0/cpu:0"](eval/batch_confusion/assert_less_1/Assert/AssertGuard/Assert/Switch/_5857, eval/batch_confusion/assert_less_1/Assert/AssertGuard/Assert/data_0, eval/batch_confusion/assert_less_1/Assert/AssertGuard/Assert/data_1, eval/batch_confusion/assert_less_1/Assert/AssertGuard/Assert/Switch_1/_5859, eval/batch_confusion/assert_less_1/Assert/AssertGuard/Assert/data_3, eval/batch_confusion/assert_less_1/Assert/AssertGuard/Assert/Switch_2/_5861)]]

Look forward your help!

Hey,

I would need to see more code to solve this, but as I recall, the batches are loaded continuously with new images, so the last batch gets filled up with other images from the dataset (until it is full). If you want to allow a smaller final batch instead, you need to set allow_smaller_final_batch=True as a parameter for your batch.
Docs:
https://www.tensorflow.org/api_docs/python/tf/train/batch
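A minimal sketch, reusing the tf.train.batch call from the code above (note that downstream ops must then tolerate a variable batch size):

images, labels, filenames = tf.train.batch(
    [image, label, filename],
    batch_size=FLAGS.batch_size,
    num_threads=FLAGS.num_preprocessing_threads,
    capacity=5 * FLAGS.batch_size,
    allow_smaller_final_batch=True)  # the final batch may have fewer than batch_size elements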

Hope it can help,
cheers

Re-assigning to @sguada, who maintains this code.

In case anybody else stumbles over this: the code example above from @vn09 works well, but has a tiny bug in it. The parameters for the labels and the predictions are swapped between the method call:

'Confusion_matrix': _get_streaming_metrics(labels, predictions, [..])

and the definition:

def _get_streaming_metrics(prediction, label, num_classes):
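A minimal fix is to make the two agree, shown here against the call site from the code above:

'Confusion_matrix': _get_streaming_metrics(predictions, labels,
                                           dataset.num_classes - FLAGS.labels_offset),

With the swapped version, the accumulated confusion matrix simply comes out transposed (rows and columns exchanged).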

@KaneFury About the Confusion_matrix, I'm facing the error

  • NameError: name 'variables' is not defined

You already said:
(1) If 'variables' is not known, check if you imported the variables library.

I didn't get it. Do I have to import some module or install something? Could you please help me with this, or maybe send me the full code?
Thank you.

Yes, you need to import that one; variables is a module, not something to install.

@KaneFury @5410612484
About the confusion_matrix and the error
NameError: name 'variables' is not defined:
for newer versions of TensorFlow, I solved the issue by replacing the line

return variables.Variable(
    initial_value=tf.zeros(shape, dtype=dtype),
    name=name,
    trainable=False,
    collections=collections,
    validate_shape=validate_shape)

with

return tf.Variable(
    initial_value=tf.zeros(shape, dtype=dtype),
    name=name,
    trainable=False,
    collections=collections,
    validate_shape=validate_shape)
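If you would rather keep the variables.Variable form, the missing name appears to refer to TensorFlow's internal variables module (an assumption based on where this helper pattern comes from; tensorflow.python.ops is not a public API, so the tf.Variable form above is more future-proof):

from tensorflow.python.ops import variables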

Hi, I have a related question about batch statistics in slim.evaluation.evaluate_once().
Suppose I want to calculate a result_batch (a 2D matrix, e.g. of shape (1, 3)) and append each batch's statistics to a result_final (of shape (500, 3), since batch_size=100 and there are 50000 validation examples), then save it to a file for further analysis. How can I achieve this?

The problem is different from calculating the confusion matrix, since the confusion matrix has a fixed shape, whereas result_final grows in size.

I tried the following:
(1) Add one extra row to result_final to store the current_step, and update this current_step
with another operation; however, when I group these using tf.group(result_update, current_step_update), the returned value of final_result is None. (Here is the pseudocode.)

    def _get_streaming_metrics(channel):
        with tf.name_scope("eval"):
            result = _create_local('result_list', shape=[10, 4], dtype=tf.float32)
            # indices of the current batch's rows; result[-1, -1] stores the running offset
            c_ind = tf.range(result[-1, -1], result[-1, -1] + FLAGS.batch_size)

            # get the batch result
            result_batch = some_function(channel)
            c_update = result[-1, -1].assign(result[-1, -1] + FLAGS.batch_size)
            with tf.control_dependencies([c_update]):
                # pseudocode: write result_batch into the rows of result at c_ind
                result_update = result.assign(result_batch)
            group_update = tf.group(result_update, c_update)

        return result, group_update

(2) I noticed there is a function to get the current evaluation step; however, I don't know how to read its value during evaluation:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/training/evaluation.py#L37

Any suggestion would be appreciated, thanks

Hi, I finally solved the problem with the following code, by simply returning the full result, the current-index variable, and both update operations:

def _get_streaming_metrics(channel, filenames):
    with tf.name_scope("eval"):
        result = _create_local('result_list', shape=[NUM_IMAGES, 3], dtype=tf.float32)
        c = _create_c('c_value', shape=(1,), dtype=tf.int32)  # _create_c: an int32 analogue of _create_local

        result_update = some_function()  # writes the current batch's rows into result
        with tf.control_dependencies([result_update]):
            c_update = c.assign(c + FLAGS.batch_size)  # advance the batch offset

    return result, c, result_update, c_update
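A sketch of how these return values could be wired into the evaluation (assuming the eval_image_classifier.py setup from earlier in this thread; channel, filenames, and the output file name are placeholders):

import numpy as np

result, c, result_update, c_update = _get_streaming_metrics(channel, filenames)
result_final = slim.evaluation.evaluate_once(
    master=FLAGS.master,
    checkpoint_path=checkpoint_path,
    logdir=FLAGS.eval_dir,
    num_evals=num_batches,
    eval_op=[result_update, c_update],  # run both updates on every batch
    variables_to_restore=variables_to_restore,
    final_op=result)  # fetch the accumulated statistics after the last batch
np.savetxt('result_final.tsv', result_final, delimiter='\t')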