I am trying to train DeepLab on my custom dataset. All of my training images are 320 x 240 (width x height), so I used the following command to train. Training runs without any problem.
python train.py \
--logtostderr \
--training_number_of_steps=30000 \
--train_split="train" \
--model_variant="mobilenet_v2" \
--atrous_rates=12 \
--atrous_rates=24 \
--atrous_rates=36 \
--output_stride=8 \
--decoder_output_stride=4 \
--train_crop_size=321 \
--train_crop_size=241 \
--train_batch_size=16 \
--dataset="${DATASET}" \
--initialize_last_layer=false \
...
Now, when I run vis.py with --vis_crop_size=321 --vis_crop_size=241, I get the error below.
Note: with --vis_crop_size=321 --vis_crop_size=321 it works, but the accuracy is worse than I expected, given that I only have 4 classes to predict. I suspect the change in input resolution or the padding is the cause.
INFO:tensorflow:Error reported to Coordinator:
[[Node: batch/padding_fifo_queue_enqueue = QueueEnqueueV2[Tcomponents=[DT_INT64, DT_FLOAT, DT_STRING, DT_INT32, DT_UINT8, DT_INT64], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/padding_fifo_queue, Reshape_3/_945, add_2/_947, ParseSingleExample/ParseSingleExample:1, add_3/_949, batch/packed, Reshape_6/_951)]]
INFO:tensorflow:Visualizing batch 1 / 3242
Traceback (most recent call last):
File "vis.py", line 320, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "vis.py", line 306, in main
image_id_offset += FLAGS.vis_batch_size
File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 1005, in managed_session
self.stop(close_summary_writer=close_summary_writer)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 833, in stop
ignore_live_threads=ignore_live_threads)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/queue_runner_impl.py", line 252, in _run
enqueue_callable()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1244, in _single_operation_run
self._call_tf_sessionrun(None, {}, [], target_list, None)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape mismatch in tuple component 1. Expected [321,241,3], got [321,320,3]
[[Node: batch/padding_fifo_queue_enqueue = QueueEnqueueV2[Tcomponents=[DT_INT64, DT_FLOAT, DT_STRING, DT_INT32, DT_UINT8, DT_INT64], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/padding_fifo_queue, Reshape_3/_945, add_2/_947, ParseSingleExample/ParseSingleExample:1, add_3/_949, batch/packed, Reshape_6/_951)]]
vis_crop_size is actually defined as [height, width], so maybe try
--vis_crop_size=241
--vis_crop_size=321
Please update the order of --train_crop_size in the training command as well.
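To see why the swapped order produces the exact shapes in the error message, here is a minimal sketch. It assumes the preprocessing pads each image dimension up to the crop size and never shrinks it; `padded_shape` is a hypothetical helper written for illustration, not a function from the DeepLab code.

```python
def padded_shape(image_hw, crop_hw, channels=3):
    """Shape after padding an image up to the crop size.

    Assumption: each dimension is padded to max(image, crop) and
    nothing is cropped away.
    """
    image_h, image_w = image_hw
    crop_h, crop_w = crop_hw
    return [max(image_h, crop_h), max(image_w, crop_w), channels]


# The 320 x 240 (width x height) images, written as [height, width].
image_hw = (240, 320)

# Flags given as 321 then 241 are read as height=321, width=241.
swapped = padded_shape(image_hw, (321, 241))

# Flags given as 241 then 321 are read as height=241, width=321.
fixed = padded_shape(image_hw, (241, 321))

print(swapped)  # [321, 320, 3]
print(fixed)    # [241, 321, 3]
```

With the swapped order, the padded image is [321, 320, 3] while the batch queue expects [321, 241, 3], which matches the reported mismatch. With the corrected order, the padded image has exactly the crop shape.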
Thanks, @YknZhu! That was silly of me. Problem resolved.
@sumsuddin How did you solve your problem? Maybe it's silly of me too.
@MertAliTombul In my case, as the author said, --vis_crop_size takes its values in [height, width] order. I had mistakenly passed them in [width, height] order. Correcting the order solved my problem.
Also, my actual image size was 240 x 320 (height x width), so I passed --vis_crop_size=241 --vis_crop_size=321 (1 added to each dimension).
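The fix described above can be sketched as a small helper that builds the two flags in [height, width] order. `crop_size_flags` is a hypothetical name used only for illustration, and the +1 simply mirrors what worked in this thread rather than a general rule.

```python
def crop_size_flags(width, height, prefix="vis"):
    """Build the two crop-size flags in [height, width] order.

    Assumption: each dimension gets +1, as done in this thread.
    """
    template = "--{}_crop_size={}"
    return [
        template.format(prefix, height + 1),  # first value is the height
        template.format(prefix, width + 1),   # second value is the width
    ]


# Images that are 320 wide and 240 high.
vis_flags = crop_size_flags(320, 240)
train_flags = crop_size_flags(320, 240, prefix="train")

print(" ".join(vis_flags))
print(" ".join(train_flags))
```

Passing the width and height as named values and letting the helper order them avoids repeating the [width, height] mix-up in both the training and visualization commands.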