Models: Name conflict: tensorflow.contrib.slim vs tf_slim

Created on 30 May 2020 · 3 Comments · Source: tensorflow/models

Prerequisites

  • [x] I am using the latest TensorFlow Model Garden release and TensorFlow 2.
  • [x] I am reporting the issue to the correct repository. (Model Garden official or research directory)
  • [x] I checked to make sure that this issue has not already been filed.

1. The entire URL of the file you are using

https://github.com/tensorflow/models/blob/master/research/slim/README.md
https://github.com/tensorflow/models/blob/master/research/slim/slim_walkthrough.ipynb

2. Describe the bug

  • The sample code in README.md works with TensorFlow 1.15.3 but not with TensorFlow 2.2.0.
  • The notebook slim_walkthrough.ipynb does not work with either TensorFlow 1.15.3 or TensorFlow 2.2.0. In particular, the two statements from tensorflow.contrib import slim and import tf_slim as slim appear in the same notebook, resulting in a name conflict.
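
One way to avoid having both imports in the same notebook is to bind the name slim exactly once, from whichever package is installed. The following is a minimal sketch, not the fix adopted by the maintainers. The helper name import_first_available is hypothetical, and the sketch assumes tf_slim should be preferred when both packages are present.

```python
import importlib


def import_first_available(candidates):
    """Return the first module in `candidates` that imports without error."""
    failures = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # ModuleNotFoundError is a subclass
            failures.append("{}: {}".format(name, exc))
    raise ImportError("no candidate could be imported: " + "; ".join(failures))
```

In the notebook's first cell, a single call such as slim = import_first_available(["tf_slim", "tensorflow.contrib.slim"]) would then replace the two separate import statements, so later cells always see the same module under the name slim.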

3. Steps to reproduce

  • The following sample commands in README.md throw exceptions with TensorFlow 2.2.0.

    $ python train_image_classifier.py \
        --train_dir=${TRAIN_DIR} \
        --dataset_dir=${DATASET_DIR} \
        --dataset_name=flowers \
        --dataset_split_name=train \
        --model_name=inception_v3 \
        --checkpoint_path=${CHECKPOINT_PATH} \
        --checkpoint_exclude_scopes=InceptionV3/Logits,InceptionV3/AuxLogits \
        --trainable_scopes=InceptionV3/Logits,InceptionV3/AuxLogits
    

    $ python eval_image_classifier.py \
        --alsologtostderr \
        --checkpoint_path=${CHECKPOINT_FILE} \
        --dataset_dir=${DATASET_DIR} \
        --dataset_name=imagenet \
        --dataset_split_name=validation \
        --model_name=inception_v3

    The reason is that both train_image_classifier.py and eval_image_classifier.py contain the statement from tensorflow.contrib import quantize as contrib_quantize, which raises ModuleNotFoundError: No module named 'tensorflow.contrib' under TensorFlow 2.2.0.

  • If I run slim_walkthrough.ipynb with TensorFlow 1.15.3, the 6th cell (starting with # The following snippet trains the regression model) emits a UserWarning: Attempting to use a closed FileWriter. The operation will be a noop unless the FileWriter is explicitly reopened. The 9th cell, right after the sentence "Finally, we print the final value of each metric", then raises:

    TypeError                                 Traceback (most recent call last)
    <ipython-input-9-0b551dafa1af> in <module>
         16             num_evals=1, # Single pass over data
         17             eval_op=names_to_update_nodes.values(),
    ---> 18             final_op=names_to_value_nodes.values())
         19 
         20     names_to_values = dict(zip(names_to_value_nodes.keys(), metric_values))
    
    TypeError: 'module' object is not callable
    
  • On the other hand, since slim_walkthrough.ipynb contains the statement from tensorflow.contrib import slim, the notebook is clearly not compatible with TensorFlow 2 as it stands. However, when I run the notebook with TensorFlow 2.2.0, the error is actually raised earlier, in the 6th cell (starting with "# The following snippet trains the regression model"), before this import statement is reached:

    TypeError: An op outside of the function building code is being passed
    a "Graph" tensor. It is possible to have Graph tensors
    leak out of the function building context by including a
    tf.init_scope in your function building code.
    For example, the following function will fail:
      @tf.function
      def has_init_scope():
        my_constant = tf.constant(1.)
        with tf.init_scope():
          added = my_constant * 2
    The graph tensor has name: global_step:0
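
The ModuleNotFoundError in the first bullet above comes from an unconditional import of tensorflow.contrib. The following is a minimal sketch of a guarded import, assuming quantization is optional for a given run. The helper require_quantize is hypothetical and does not exist in train_image_classifier.py or eval_image_classifier.py.

```python
try:
    # Present in TensorFlow 1.x only; tf.contrib was removed in 2.x.
    from tensorflow.contrib import quantize as contrib_quantize
except ImportError:
    contrib_quantize = None


def require_quantize():
    """Return the contrib quantize module or raise a descriptive error."""
    if contrib_quantize is None:
        raise RuntimeError(
            "Quantization needs tensorflow.contrib, which is only "
            "available in TensorFlow 1.x.")
    return contrib_quantize
```

With this shape the scripts would still import under TensorFlow 2, and the failure would surface only if a code path that actually needs quantization calls require_quantize().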
    

4. Expected behavior

  • Apparently, a commit from a couple of days ago aims to make the TensorFlow-Slim Image Classification Model Library compatible with TensorFlow 2. For example, it deleted the ![TensorFlow 2 Not Supported] tag from the READMEs and replaced as many occurrences of tf.contrib.slim with tf-slim as possible. Since the sample code in README.md still works with TensorFlow 1.15.3, this might not count as a bug, but it is at least confusing for an inexperienced TensorFlow user like me 😭

  • I hope the sample notebook slim_walkthrough.ipynb can be made to work with either TensorFlow 1 or 2. Also, for TensorFlow 1, the two conflicting statements from tensorflow.contrib import slim and import tf_slim as slim should be avoided.

5. Additional context


Full log for the 6th cell in the notebook, run with TensorFlow 1.15.3

WARNING:tensorflow:From /home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow_core/python/ops/losses/losses_impl.py:121: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From <ipython-input-6-6bec913f2e40>:16: get_total_loss (from tf_slim.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_total_loss instead.
WARNING:tensorflow:From /home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tf_slim/losses/loss_ops.py:236: get_losses (from tf_slim.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_losses instead.
WARNING:tensorflow:From /home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tf_slim/losses/loss_ops.py:238: get_regularization_losses (from tf_slim.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_regularization_losses instead.
WARNING:tensorflow:From /home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tf_slim/learning.py:734: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path /tmp/regression_model/model.ckpt
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Starting Queues.
INFO:tensorflow:global step 499: loss = 0.4413 (0.001 sec/step)
INFO:tensorflow:global step 999: loss = 0.2760 (0.001 sec/step)
INFO:tensorflow:global step 1499: loss = 0.2333 (0.001 sec/step)
INFO:tensorflow:global step 1999: loss = 0.2453 (0.001 sec/step)
INFO:tensorflow:global step 2499: loss = 0.1999 (0.001 sec/step)
INFO:tensorflow:global_step/sec: 547.203
INFO:tensorflow:global step 2999: loss = 0.1675 (0.001 sec/step)
INFO:tensorflow:global step 3499: loss = 0.1778 (0.001 sec/step)
INFO:tensorflow:global step 3999: loss = 0.2127 (0.001 sec/step)
INFO:tensorflow:global step 4499: loss = 0.1784 (0.001 sec/step)
INFO:tensorflow:global step 4999: loss = 0.1660 (0.001 sec/step)
INFO:tensorflow:Stopping Training.
INFO:tensorflow:Finished training! Saving model to disk.
Finished training. Last batch loss: 0.16604608
Checkpoint saved in /tmp/regression_model/
/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow_core/python/summary/writer/writer.py:386: UserWarning: Attempting to use a closed FileWriter. The operation will be a noop unless the FileWriter is explicitly reopened.
  warnings.warn("Attempting to use a closed FileWriter. "


Full log for the 9th cell in the notebook, run with TensorFlow 1.15.3

WARNING:tensorflow:From <ipython-input-9-0b551dafa1af>:7: streaming_mean_squared_error (from tf_slim.metrics.metric_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.metrics.mean_squared_error. Note that the order of the labels and predictions arguments has been switched.
WARNING:tensorflow:From <ipython-input-9-0b551dafa1af>:8: streaming_mean_absolute_error (from tf_slim.metrics.metric_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.metrics.mean_absolute_error. Note that the order of the labels and predictions arguments has been switched.
INFO:tensorflow:Restoring parameters from /tmp/regression_model/model.ckpt
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting standard services.
INFO:tensorflow:Saving checkpoint to path /tmp/regression_model/model.ckpt
INFO:tensorflow:Starting queue runners.
INFO:tensorflow:Error reported to Coordinator: <class 'TypeError'>, 'module' object is not callable
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-0b551dafa1af> in <module>
     16             num_evals=1, # Single pass over data
     17             eval_op=names_to_update_nodes.values(),
---> 18             final_op=names_to_value_nodes.values())
     19 
     20     names_to_values = dict(zip(names_to_value_nodes.keys(), metric_values))

TypeError: 'module' object is not callable


Full log for the 6th cell in the notebook, run with TensorFlow 2.2.0

WARNING:tensorflow:From <ipython-input-6-6bec913f2e40>:16: get_total_loss (from tf_slim.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_total_loss instead.
WARNING:tensorflow:From /home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tf_slim/losses/loss_ops.py:236: get_losses (from tf_slim.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_losses instead.
WARNING:tensorflow:From /home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tf_slim/losses/loss_ops.py:238: get_regularization_losses (from tf_slim.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_regularization_losses instead.
WARNING:tensorflow:From /home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tf_slim/learning.py:734: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path /tmp/regression_model/model.ckpt
INFO:tensorflow:Error reported to Coordinator: An op outside of the function building code is being passed
a "Graph" tensor. It is possible to have Graph tensors
leak out of the function building context by including a
tf.init_scope in your function building code.
For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: global_step:0
Traceback (most recent call last):
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py", line 470, in read_variable_op
    tld.op_callbacks, resource, "dtype", dtype)
tensorflow.python.eager.core._FallbackException: This function does not handle the case of the path where all inputs are not already EagerTensors.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/coordinator.py", line 297, in stop_on_exception
    yield
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/coordinator.py", line 485, in run
    self.start_loop()
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/supervisor.py", line 1077, in start_loop
    self._last_step = training_util.global_step(self._sess, self._step_counter)
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/training_util.py", line 67, in global_step
    return int(global_step_tensor.numpy())
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 603, in numpy
    return self.read_value().numpy()
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 666, in read_value
    value = self._read_variable_op()
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 645, in _read_variable_op
    result = read_and_set_handle()
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 636, in read_and_set_handle
    self._dtype)
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py", line 475, in read_variable_op
    resource, dtype=dtype, name=name, ctx=_ctx)
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py", line 502, in read_variable_op_eager_fallback
    attrs=_attrs, ctx=ctx, name=name)
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 75, in quick_execute
    raise e
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
TypeError: An op outside of the function building code is being passed
a "Graph" tensor. It is possible to have Graph tensors
leak out of the function building context by including a
tf.init_scope in your function building code.
For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: global_step:0
INFO:tensorflow:Starting Queues.
INFO:tensorflow:Finished training! Saving model to disk.
/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/summary/writer/writer.py:388: UserWarning: Attempting to use a closed FileWriter. The operation will be a noop unless the FileWriter is explicitly reopened.
  warnings.warn("Attempting to use a closed FileWriter. "
---------------------------------------------------------------------------
_FallbackException                        Traceback (most recent call last)
~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py in read_variable_op(resource, dtype, name)
    469         _ctx._context_handle, tld.device_name, "ReadVariableOp", name,
--> 470         tld.op_callbacks, resource, "dtype", dtype)
    471       return _result

_FallbackException: This function does not handle the case of the path where all inputs are not already EagerTensors.

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-6-6bec913f2e40> in <module>
     26         number_of_steps=5000,
     27         save_summaries_secs=5,
---> 28         log_every_n_steps=500)
     29 
     30 print("Finished training. Last batch loss:", final_loss)

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tf_slim/learning.py in train(train_op, logdir, train_step_fn, train_step_kwargs, log_every_n_steps, graph, master, is_chief, global_step, number_of_steps, init_op, init_feed_dict, local_init_op, init_fn, ready_op, summary_op, save_summaries_secs, summary_writer, startup_delay_steps, saver, save_interval_secs, sync_optimizer, session_config, session_wrapper, trace_every_n_steps, ignore_live_threads)
    780               threads,
    781               close_summary_writer=True,
--> 782               ignore_live_threads=ignore_live_threads)
    783 
    784     except errors.AbortedError:

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/supervisor.py in stop(self, threads, close_summary_writer, ignore_live_threads)
    837           threads,
    838           stop_grace_period_secs=self._stop_grace_secs,
--> 839           ignore_live_threads=ignore_live_threads)
    840     finally:
    841       # Close the writer last, in case one of the running threads was using it.

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/coordinator.py in join(self, threads, stop_grace_period_secs, ignore_live_threads)
    387       self._registered_threads = set()
    388       if self._exc_info_to_raise:
--> 389         six.reraise(*self._exc_info_to_raise)
    390       elif stragglers:
    391         if ignore_live_threads:

~/tensorflow-slim/.venv/lib/python3.7/site-packages/six.py in reraise(tp, value, tb)
    701             if value.__traceback__ is not tb:
    702                 raise value.with_traceback(tb)
--> 703             raise value
    704         finally:
    705             value = None

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/coordinator.py in stop_on_exception(self)
    295     """
    296     try:
--> 297       yield
    298     except:  # pylint: disable=bare-except
    299       self.request_stop(ex=sys.exc_info())

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/coordinator.py in run(self)
    483   def run(self):
    484     with self._coord.stop_on_exception():
--> 485       self.start_loop()
    486       if self._timer_interval_secs is None:
    487         # Call back-to-back.

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/supervisor.py in start_loop(self)
   1075   def start_loop(self):
   1076     self._last_time = time.time()
-> 1077     self._last_step = training_util.global_step(self._sess, self._step_counter)
   1078 
   1079   def run_loop(self):

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/training_util.py in global_step(sess, global_step_tensor)
     65   """
     66   if context.executing_eagerly():
---> 67     return int(global_step_tensor.numpy())
     68   return int(sess.run(global_step_tensor))
     69 

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py in numpy(self)
    601   def numpy(self):
    602     if context.executing_eagerly():
--> 603       return self.read_value().numpy()
    604     raise NotImplementedError(
    605         "numpy() is only available when eager execution is enabled.")

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py in read_value(self)
    664     """
    665     with ops.name_scope("Read"):
--> 666       value = self._read_variable_op()
    667     # Return an identity so it can get placed on whatever device the context
    668     # specifies instead of the device where the variable is.

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py in _read_variable_op(self)
    643           result = read_and_set_handle()
    644     else:
--> 645       result = read_and_set_handle()
    646 
    647     if not context.executing_eagerly():

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py in read_and_set_handle()
    634     def read_and_set_handle():
    635       result = gen_resource_variable_ops.read_variable_op(self._handle,
--> 636                                                           self._dtype)
    637       _maybe_set_handle_data(self._dtype, self._handle, result)
    638       return result

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py in read_variable_op(resource, dtype, name)
    473       try:
    474         return read_variable_op_eager_fallback(
--> 475             resource, dtype=dtype, name=name, ctx=_ctx)
    476       except _core._SymbolicException:
    477         pass  # Add nodes to the TensorFlow graph.

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py in read_variable_op_eager_fallback(resource, dtype, name, ctx)
    500   _attrs = ("dtype", dtype)
    501   _result = _execute.execute(b"ReadVariableOp", 1, inputs=_inputs_flat,
--> 502                              attrs=_attrs, ctx=ctx, name=name)
    503   if _execute.must_record_gradient():
    504     _execute.record_gradient(

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     73           "Inputs to eager execution function cannot be Keras symbolic "
     74           "tensors, but found {}".format(keras_symbolic_tensors))
---> 75     raise e
     76   # pylint: enable=protected-access
     77   return tensors

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

TypeError: An op outside of the function building code is being passed
a "Graph" tensor. It is possible to have Graph tensors
leak out of the function building context by including a
tf.init_scope in your function building code.
For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: global_step:0

6. System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Debian GNU/Linux 9 (stretch)
  • Mobile device name if the issue happens on a mobile device: N/A
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): TensorFlow 2.2.0 (optionally 1.15.3)
  • Python version: Python 3.7.6
  • Bazel version (if compiling from source): N/A
  • GCC/Compiler version (if compiling from source): N/A
  • CUDA/cuDNN version: CUDA V10.1.243 (optionally CUDA V10.0.130 for TensorFlow 1.15.3)/cuDNN 7.6.5
  • GPU model and memory: Tesla T4 with 15079MiB memory
research bug

All 3 comments

Thanks for the report. The notebook does need to be fixed. Slim will probably never work in TensorFlow 2.0 eager mode, only in graph mode, so the command-line examples won't work as is. We should restore that header in the README.md with some caveats.
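
For readers who want to try graph mode under TensorFlow 2, the following is a minimal sketch of switching eager execution off through the tf.compat.v1 API. The helper name ensure_graph_mode is hypothetical, and the sketch has not been verified to make the notebook or the scripts run. It only illustrates the eager-versus-graph distinction mentioned in this comment.

```python
def ensure_graph_mode(tf_module):
    """Disable eager execution on TensorFlow 2.x; do nothing on 1.x.

    Returns True if eager execution was disabled by this call.
    """
    major = int(tf_module.__version__.split(".")[0])
    if major < 2:
        return False  # 1.x already defaults to graph mode
    tf_module.compat.v1.disable_eager_execution()
    return True
```

A typical use would be import tensorflow as tf followed immediately by ensure_graph_mode(tf), since disable_eager_execution has to be called before any graphs, ops, or tensors are created.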

What is the best way to fix this? Should we mention that [TensorFlow 2 might not be supported] in the README, or change all instances of from tensorflow.contrib import slim to import tf_slim as slim? Or maybe both?

Slim will probably never work in tensorflow 2.0 eager mode, only in graph mode

Is this being deprecated or is there any other reason it doesn't work?

We will update the README.md and the notebook shortly. Slim is mostly in maintenance mode, but full TF2 support basically requires a very thorough rewrite, and not all concepts of Slim map nicely to TF2.

