Models: Training gets stuck at "Instructions for updating: Use fn_output_signature instead"

Created on 14 Aug 2020 · 14 Comments · Source: tensorflow/models

Prerequisites

Please answer the following questions for yourself before submitting an issue.

[Yes ] I am using the latest TensorFlow Model Garden release and TensorFlow 2.
[ Yes] I am reporting the issue to the correct repository. (Model Garden official or research directory)
[ Yes] I checked to make sure that this issue has not been filed already.

1. The entire URL of the file you are using

python3 model_main_tf2.py --pipeline_config_path=/models/tf2/20200804_faster_rcnn_resnet50/pipeline.config --model_dir=checkpoints/20200804_faster_rcnn_resnet50 --alsologtostderr

2. Describe the bug

Training appears to get stuck at the message "Instructions for updating: Use fn_output_signature instead". It creates only one checkpoint, and after that the training seems to be frozen.

LOG-OUTPUT:

2020-08-14 08:42:33.110709: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-08-14 08:42:34.463497: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-08-14 08:42:34.488920: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:05:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.582GHz coreCount: 28 deviceMemorySize: 10.91GiB deviceMemoryBandwidth: 451.17GiB/s
2020-08-14 08:42:34.489845: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 1 with properties:
pciBusID: 0000:09:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.582GHz coreCount: 28 deviceMemorySize: 10.92GiB deviceMemoryBandwidth: 451.17GiB/s
2020-08-14 08:42:34.489875: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-08-14 08:42:34.491363: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-08-14 08:42:34.492656: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-08-14 08:42:34.492904: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-08-14 08:42:34.494458: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-08-14 08:42:34.495275: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-08-14 08:42:34.495369: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-08-14 08:42:34.495385: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-08-14 08:42:34.495673: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-08-14 08:42:34.519294: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 3597555000 Hz
2020-08-14 08:42:34.519885: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4a7c2d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-14 08:42:34.519919: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-08-14 08:42:34.521380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-14 08:42:34.521405: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]
WARNING:tensorflow:There are non-GPU devices in tf.distribute.Strategy, not using nccl allreduce.
W0814 08:42:34.522547 140702927255360 cross_device_ops.py:1202] There are non-GPU devices in tf.distribute.Strategy, not using nccl allreduce.
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0',)
I0814 08:42:34.522814 140702927255360 mirrored_strategy.py:341] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0',)
INFO:tensorflow:Maybe overwriting train_steps: None
I0814 08:42:34.528091 140702927255360 config_util.py:552] Maybe overwriting train_steps: None
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0814 08:42:34.528242 140702927255360 config_util.py:552] Maybe overwriting use_bfloat16: False
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W0814 08:42:34.570969 140702927255360 dataset_builder.py:83] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /home/bastianbernhardt/safeai/image-detection/imagedetect/models/tf2/research/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE) instead. If sloppy execution is desired, use tf.data.Options.experimental_deterministic.
W0814 08:42:34.574131 140702927255360 deprecation.py:323] From /home/bastianbernhardt/safeai/image-detection/imagedetect/models/tf2/research/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE) instead. If sloppy execution is desired, use tf.data.Options.experimental_deterministic.
WARNING:tensorflow:From /home/bastianbernhardt/safeai/image-detection/imagedetect/models/tf2/research/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.map()
W0814 08:42:34.660002 140702927255360 deprecation.py:323] From /home/bastianbernhardt/safeai/image-detection/imagedetect/models/tf2/research/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.map()
WARNING:tensorflow:From /home/bastianbernhardt/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
W0814 08:42:39.970870 140702927255360 deprecation.py:323] From /home/bastianbernhardt/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
WARNING:tensorflow:From /home/bastianbernhardt/safeai/image-detection/imagedetect/models/tf2/research/object_detection/inputs.py:259: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
W0814 08:42:43.646067 140702927255360 deprecation.py:323] From /home/bastianbernhardt/safeai/image-detection/imagedetect/models/tf2/research/object_detection/inputs.py:259: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /home/bastianbernhardt/safeai/image-detection/imagedetect/models/tf2/research/object_detection/model_lib_v2.py:347: set_learning_phase (from tensorflow.python.keras.backend) is deprecated and will be removed after 2020-10-11.
Instructions for updating:
Simply pass a True/False value to the training argument of the __call__ method of your layer or model.
W0814 08:42:47.524089 140696258332416 deprecation.py:323] From /home/bastianbernhardt/safeai/image-detection/imagedetect/models/tf2/research/object_detection/model_lib_v2.py:347: set_learning_phase (from tensorflow.python.keras.backend) is deprecated and will be removed after 2020-10-11.
Instructions for updating:
Simply pass a True/False value to the training argument of the __call__ method of your layer or model.
INFO:tensorflow:depth of additional conv before box predictor: 0
I0814 08:42:51.386824 140696258332416 convolutional_keras_box_predictor.py:154] depth of additional conv before box predictor: 0
WARNING:tensorflow:From /home/bastianbernhardt/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201: calling crop_and_resize_v1 (from tensorflow.python.ops.image_ops_impl) with box_ind is deprecated and will be removed in a future version.
Instructions for updating:
box_ind is deprecated, use box_indices instead
W0814 08:42:57.259847 140696258332416 deprecation.py:506] From /home/bastianbernhardt/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201: calling crop_and_resize_v1 (from tensorflow.python.ops.image_ops_impl) with box_ind is deprecated and will be removed in a future version.
Instructions for updating:
box_ind is deprecated, use box_indices instead
WARNING:tensorflow:From /home/bastianbernhardt/safeai/image-detection/imagedetect/models/tf2/research/object_detection/utils/model_util.py:57: Tensor.experimental_ref (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use ref() instead.
W0814 08:42:57.738880 140696258332416 deprecation.py:323] From /home/bastianbernhardt/safeai/image-detection/imagedetect/models/tf2/research/object_detection/utils/model_util.py:57: Tensor.experimental_ref (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use ref() instead.
WARNING:tensorflow:From /home/bastianbernhardt/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See tf.nn.softmax_cross_entropy_with_logits_v2.

W0814 08:43:02.355147 140696258332416 deprecation.py:323] From /home/bastianbernhardt/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See tf.nn.softmax_cross_entropy_with_logits_v2.

WARNING:tensorflow:Unresolved object in checkpoint: (root).model._groundtruth_lists
W0814 08:43:26.824852 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._groundtruth_lists
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv
W0814 08:43:26.825030 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor
W0814 08:43:26.825097 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._maxpool_layer
W0814 08:43:26.825157 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._maxpool_layer
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor
W0814 08:43:26.825212 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._batched_prediction_tensor_names
W0814 08:43:26.825268 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._batched_prediction_tensor_names
WARNING:tensorflow:Unresolved object in checkpoint: (root).model.endpoints
W0814 08:43:26.825321 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model.endpoints
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer_with_weights-0
W0814 08:43:26.825373 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer_with_weights-0
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer-1
W0814 08:43:26.825427 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer-1
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer-2
W0814 08:43:26.825478 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer-2
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads
W0814 08:43:26.825530 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._sorted_head_names
W0814 08:43:26.825581 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._sorted_head_names
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._shared_nets
W0814 08:43:26.825633 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._shared_nets
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._box_prediction_head
W0814 08:43:26.825684 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._box_prediction_head
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._class_prediction_head
W0814 08:43:26.825735 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._class_prediction_head
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._third_stage_heads
W0814 08:43:26.825786 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._third_stage_heads
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer_with_weights-0._inbound_nodes
W0814 08:43:26.825849 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer_with_weights-0._inbound_nodes
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer_with_weights-0.kernel
W0814 08:43:26.825902 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer_with_weights-0.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer_with_weights-0.bias
W0814 08:43:26.825955 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer_with_weights-0.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer-1._inbound_nodes
W0814 08:43:26.826007 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer-1._inbound_nodes
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer-2._inbound_nodes
W0814 08:43:26.826058 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor_first_conv.layer-2._inbound_nodes
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.box_encodings
W0814 08:43:26.826109 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.box_encodings
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.class_predictions_with_background
W0814 08:43:26.826160 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.class_predictions_with_background
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._shared_nets.0
W0814 08:43:26.826212 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._shared_nets.0
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._box_prediction_head._box_encoder_layers
W0814 08:43:26.826264 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._box_prediction_head._box_encoder_layers
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._class_prediction_head._class_predictor_layers
W0814 08:43:26.826315 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._class_prediction_head._class_predictor_layers
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.box_encodings.0
W0814 08:43:26.826394 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.box_encodings.0
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.class_predictions_with_background.0
W0814 08:43:26.826448 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.class_predictions_with_background.0
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._box_prediction_head._box_encoder_layers.0
W0814 08:43:26.826501 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._box_prediction_head._box_encoder_layers.0
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._box_prediction_head._box_encoder_layers.1
W0814 08:43:26.826553 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._box_prediction_head._box_encoder_layers.1
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._box_prediction_head._box_encoder_layers.2
W0814 08:43:26.826604 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._box_prediction_head._box_encoder_layers.2
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._class_prediction_head._class_predictor_layers.0
W0814 08:43:26.826656 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._class_prediction_head._class_predictor_layers.0
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._class_prediction_head._class_predictor_layers.1
W0814 08:43:26.826708 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._class_prediction_head._class_predictor_layers.1
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._class_prediction_head._class_predictor_layers.2
W0814 08:43:26.826760 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._class_prediction_head._class_predictor_layers.2
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.box_encodings.0._box_encoder_layers
W0814 08:43:26.826811 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.box_encodings.0._box_encoder_layers
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.class_predictions_with_background.0._class_predictor_layers
W0814 08:43:26.826862 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.class_predictions_with_background.0._class_predictor_layers
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._box_prediction_head._box_encoder_layers.1.kernel
W0814 08:43:26.826914 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._box_prediction_head._box_encoder_layers.1.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._box_prediction_head._box_encoder_layers.1.bias
W0814 08:43:26.826966 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._box_prediction_head._box_encoder_layers.1.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._class_prediction_head._class_predictor_layers.1.kernel
W0814 08:43:26.827017 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._class_prediction_head._class_predictor_layers.1.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._class_prediction_head._class_predictor_layers.1.bias
W0814 08:43:26.827068 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._mask_rcnn_box_predictor._class_prediction_head._class_predictor_layers.1.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.box_encodings.0._box_encoder_layers.0
W0814 08:43:26.827129 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.box_encodings.0._box_encoder_layers.0
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.class_predictions_with_background.0._class_predictor_layers.0
W0814 08:43:26.827180 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.class_predictions_with_background.0._class_predictor_layers.0
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.box_encodings.0._box_encoder_layers.0.kernel
W0814 08:43:26.827233 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.box_encodings.0._box_encoder_layers.0.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.box_encodings.0._box_encoder_layers.0.bias
W0814 08:43:26.827289 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.box_encodings.0._box_encoder_layers.0.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.class_predictions_with_background.0._class_predictor_layers.0.kernel
W0814 08:43:26.827340 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.class_predictions_with_background.0._class_predictor_layers.0.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.class_predictions_with_background.0._class_predictor_layers.0.bias
W0814 08:43:26.827392 140702927255360 util.py:150] Unresolved object in checkpoint: (root).model._first_stage_box_predictor._prediction_heads.class_predictions_with_background.0._class_predictor_layers.0.bias
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.
W0814 08:43:26.827444 140702927255360 util.py:158] A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.
WARNING:tensorflow:From /home/bastianbernhardt/.local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py:574: calling map_fn_v2 (from tensorflow.python.ops.map_fn) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Use fn_output_signature instead
W0814 08:43:40.411885 140696283510528 deprecation.py:506] From /home/bastianbernhardt/.local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py:574: calling map_fn_v2 (from tensorflow.python.ops.map_fn) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Use fn_output_signature instead

3. Steps to reproduce

I followed all the steps from this guide: https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/index.html
Installation worked fine, and now I want to train on my own dataset.
What I have done so far:

  • created tf-record files
  • created labelmap
  • configured the pipeline.config file (from pre-trained-model)
  • every single thing you need to train your own model

I already tried 2 pre-trained models:
ssd_resnet50_v1_fpn_640x640_coco17_tpu-8
faster_rcnn_resnet50_v1_800x1333_coco17_gpu-8

4. Expected behavior

A running training process

5. Additional context

6. System information

Linux
TensorFlow installed from pip 2.3.0
Python version: 3.6.9
CUDA 10.1
GPU model: GeForce GTX 1080 Ti
research support

All 14 comments

@Berbas95

Would it be possible to share a Colab link or code snippet so we can reproduce the issue in our environment? It helps us localize the issue faster. Thanks!

@ravikyram Which code snippet do you need? From the Terminal? From a specific script?

I have the same problem, did you solve it?

@agaosto Hi, it wasn't a problem at all. I hadn't noticed that I was training my network on the CPU (libcudnn wasn't installed, so no GPU was used). It took 25 minutes for the first loss log to appear; I was just impatient. Now I train on the GPU and got my first loss log within 2 minutes.
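
The diagnosis above can be read directly from the log in the issue: TensorFlow warns that it could not load libcudnn.so.7 and then prints "Skipping registering GPU devices...". A minimal sketch for scanning a saved training log for those two messages follows. The function names are hypothetical and the patterns simply mirror the wording of the TF 2.3 log above; other TensorFlow versions may word these messages differently.

```python
import re

# Pattern mirrors the dso_loader warning shown in the log above (TF 2.3 wording).
MISSING_LIB = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_gpu_libraries(log_text):
    """Return the library names the log reports as not loadable."""
    return MISSING_LIB.findall(log_text)


def gpu_was_skipped(log_text):
    """Return True if the log says TensorFlow skipped GPU registration."""
    return "Skipping registering GPU devices" in log_text


# Two lines copied from the log in the issue (timestamps dropped).
log = (
    "W tensorflow/stream_executor/platform/default/dso_loader.cc:59] "
    "Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: "
    "cannot open shared object file: No such file or directory; "
    "LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n"
    "Skipping registering GPU devices...\n"
)

print(missing_gpu_libraries(log))
print(gpu_was_skipped(log))
```

If the first call returns any library name and the second returns True, training is running on the CPU and the long wait before the first loss log is expected.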

I got the same issue. Can you please explain to me why this happens?

Instructions for updating:
Use fn_output_signature instead

Yes, switching to the GPU worked well for me.
Please change your runtime instance to GPU; training will then take about 2 seconds per step.

@Berbas95 Can you explain how you fixed the problem?

First check whether TensorFlow found the GPU. I changed my runtime to GPU, but somehow TensorFlow didn't find it. After restarting the runtime a few times, TensorFlow detected the GPU and then it worked flawlessly.

import tensorflow as tf

# gpu_device_name() returns an empty string when TensorFlow sees no GPU.
gpu_name = tf.test.gpu_device_name()
if gpu_name:
    print('Default GPU Device: {}'.format(gpu_name))
else:
    print("Please install GPU version of TF")
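
On a local machine, the original log shows one reason such a check can fail: libcudnn.so.7 was not found, and the warning lists the LD_LIBRARY_PATH that was searched. A rough sketch for looking through that path is below. The helper name find_library is hypothetical, and the sketch only covers LD_LIBRARY_PATH; the dynamic loader also searches other locations, so an empty result here is a hint rather than proof.

```python
import os


def find_library(prefix, search_path):
    """Return paths of files whose name starts with prefix in a path list.

    search_path uses the platform separator (':' on Linux), like LD_LIBRARY_PATH.
    """
    hits = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries and directories that do not exist.
        if not directory or not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            if name.startswith(prefix):
                hits.append(os.path.join(directory, name))
    return hits


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    found = find_library("libcudnn.so", path)
    print(found if found else "libcudnn not found on LD_LIBRARY_PATH")
```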

https://colab.research.google.com/drive/1kMqMD_BaG_KYOfhHy7IH-FXyL2lCQJxx?authuser=7#scrollTo=_ICeXUkeyyfb

Can anybody help me understand why? I have the same issue: "Instructions for updating: Use fn_output_signature instead". Please :)

Can you tell me how? I changed to GPU but still get the same problem: "Use fn_output_signature instead".

Can you tell me how to resolve the "Instructions for updating" message? I'm using Google Colab.

https://colab.research.google.com/drive/1kMqMD_BaG_KYOfhHy7IH-FXyL2lCQJxx?authuser=7#scrollTo=_ICeXUkeyyfb

But I still have the same problem.

I'm facing the same problem on Colab, even after changing the batch size from 8 to 4.

Note: Training object detection with custom data using SSD RESNET50 640x640

The problem is solved now by changing the batch size to 4 and the image resolution from 640 to 512. (I'm not really sure whether 640 is better in terms of accuracy.)
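
For readers who want to try the same change, here is a minimal sketch that rewrites those two settings in the text of a pipeline.config. It assumes the field names used in the Model Garden sample config for SSD ResNet50 640x640 (batch_size inside train_config, and height and width inside fixed_shape_resizer). The function name shrink_config and the sample text are hypothetical, so check the result against your own config.

```python
import re


def shrink_config(text, old_batch=8, new_batch=4, old_size=640, new_size=512):
    """Return config text with a smaller batch size and input resolution.

    Only fields that currently hold old_batch / old_size are rewritten, so
    other batch_size entries (for example in eval_config) stay untouched.
    """
    text = re.sub(r"\bbatch_size:\s*{}\b".format(old_batch),
                  "batch_size: {}".format(new_batch), text)
    text = re.sub(r"\b(height|width):\s*{}\b".format(old_size),
                  r"\g<1>: {}".format(new_size), text)
    return text


# Hypothetical excerpt in the shape of an SSD pipeline.config.
sample = """
image_resizer {
  fixed_shape_resizer {
    height: 640
    width: 640
  }
}
train_config {
  batch_size: 8
}
eval_config {
  batch_size: 1
}
"""

print(shrink_config(sample))
```

A smaller batch and input size reduce memory use per step, which may be why the run stopped appearing frozen. The thread does not confirm the underlying cause.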
