Models: vis.py doesn't work

Created on 19 Feb 2020 · 5 Comments · Source: tensorflow/models

System information

What is the top-level directory of the model you are using: ~/cxj/models/research
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
TensorFlow installed from (source or binary): pip install tensorflow-gpu
TensorFlow version (use command below): 1.14.0
Bazel version (if compiling from source):
CUDA/cuDNN version: CUDA 10.0, cuDNN 7.6.5
GPU model and memory: GeForce 1060 16G
Exact command to reproduce:

the train command

python deeplab/train.py \
--logtostderr \
--training_number_of_steps=200 \
--train_split="train" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size=321,321 \
--train_batch_size=2 \
--dataset="mydata" \
--tf_initial_checkpoint='/home/aiyunji/cxj/models/research/deeplab/backbone/deeplabv3_cityscapes_train/model.ckpt' \
--train_logdir='/home/aiyunji/cxj/models/research/deeplab/exp/train_on_train_set/train' \
--dataset_dir='/home/aiyunji/cxj/models/research/deeplab/datasets/lv/tfrecord'

the eval command

python deeplab/eval.py \
--logtostderr \
--eval_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--eval_crop_size=321,321 \
--min_resize_value=321 \
--max_resize_value=321 \
--dataset="mydata" \
--checkpoint_dir="/home/aiyunji/cxj/models/research/deeplab/exp/train_on_train_set/train" \
--eval_logdir="/home/aiyunji/cxj/models/research/deeplab/exp/train_on_train_set/eval" \
--dataset_dir="/home/aiyunji/cxj/models/research/deeplab/datasets/lv/tfrecord" \
--max_number_of_evaluations=1

the vis command

command1

python deeplab/vis.py \
--logtostderr \
--vis_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--vis_crop_size=531,531 \
--dataset="mydata" \
--colormap_type="pascal" \
--checkpoint_dir='/home/aiyunji/cxj/models/research/deeplab/exp/train_on_train_set/train' \
--vis_logdir='/home/aiyunji/cxj/models/research/deeplab/exp/train_on_train_set/vis' \
--dataset_dir='/home/aiyunji/cxj/models/research/deeplab/datasets/lv/tfrecord'

command2

python deeplab/vis.py \
--logtostderr \
--vis_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--vis_crop_size=321,321 \
--min_resize_value=321 \
--max_resize_value=321 \
--dataset="mydata" \
--checkpoint_dir='/home/aiyunji/cxj/models/research/deeplab/exp/train_on_train_set/train' \
--vis_logdir="/home/aiyunji/cxj/models/research/deeplab/exp/train_on_train_set/vis" \
--dataset_dir="/home/aiyunji/cxj/models/research/deeplab/datasets/lv/tfrecord" \
--max_number_of_iterations=1

Describe the problem

model_test.py passes and train.py also works, but eval.py and vis.py both fail.

When I run the eval command without --min_resize_value=321 --max_resize_value=321 --max_number_of_evaluations=1, it fails with the log below.

2020-02-19 15:11:00.387115: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-02-19 15:11:01.176250: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at spacetobatch_op.cc:219 : Invalid argument: padded_shape[1]=29 is not divisible by block_shape[1]=2
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1356, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: padded_shape[1]=29 is not divisible by block_shape[1]=2
    [[{{node xception_65/exit_flow/block2/unit_1/xception_module/separable_conv1_depthwise/depthwise/SpaceToBatchND}}]]
    [[mean_iou/AssignAdd/_4527]]
  (1) Invalid argument: padded_shape[1]=29 is not divisible by block_shape[1]=2
    [[{{node xception_65/exit_flow/block2/unit_1/xception_module/separable_conv1_depthwise/depthwise/SpaceToBatchND}}]]
0 successful operations.
0 derived errors ignored.

After adding --min_resize_value=321 --max_resize_value=321 --max_number_of_evaluations=1, as suggested in another issue, the eval command works and prints the following.

2020-02-19 15:15:39.522904: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
eval/miou_1.0_class_4[0]
eval/miou_1.0_class_3[0.00103092787]
eval/miou_1.0_class_1[nan]
eval/miou_1.0_class_0[0.486860573]
eval/miou_1.0_class_5[nan]
eval/miou_1.0_class_2[0.135669887]
eval/miou_1.0_overall[0.124712273]

I then tried the same flags in the vis command, but that produced a different error:

Traceback (most recent call last):
  File "deeplab/vis.py", line 327, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/aiyunji/.local/lib/python3.5/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/aiyunji/.local/lib/python3.5/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "deeplab/vis.py", line 274, in main
    align_corners=True), 3)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/image_ops_impl.py", line 1182, in resize_images
    skip_resize_if_same=True)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/image_ops_impl.py", line 1048, in _resize_images_common
    new_height_const = size_const_as_shape.dims[0].value
TypeError: 'NoneType' object is not subscriptable
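
For readers unfamiliar with this kind of TypeError: the last frame indexes `dims` on a shape object, and the message suggests `dims` was None at that point, which is what you would expect if the requested size had no statically known shape. The following is only a minimal pure-Python sketch of that mechanism. `UnknownRankShape`, `static_height` and `safe_static_height` are hypothetical stand-ins written for illustration; they are not TensorFlow classes or functions.

```python
class UnknownRankShape:
    """Hypothetical stand-in for a static shape; NOT a TensorFlow class."""

    def __init__(self, dims):
        # A list of ints when the shape is known, None when it is unknown.
        self.dims = dims


def static_height(shape):
    # Same pattern as the failing line: index `dims` without a None check.
    return shape.dims[0]


def safe_static_height(shape):
    # Defensive variant: report "unknown" instead of raising.
    return None if shape.dims is None else shape.dims[0]


if __name__ == "__main__":
    print(safe_static_height(UnknownRankShape([321, 321])))
    print(safe_static_height(UnknownRankShape(None)))
    try:
        static_height(UnknownRankShape(None))
    except TypeError as err:
        print("TypeError:", err)
```

If that is indeed what happens here, the practical question is why the size passed to resize_images has no static shape once the resize flags are set.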

If I omit --min_resize_value=321 --max_resize_value=321 from the vis command, the log looks like this:

INFO:tensorflow:Visualizing batch 2
I0219 15:23:10.167136 140088710698752 vis.py:303] Visualizing batch 2
INFO:tensorflow:Visualizing batch 3
I0219 15:23:10.292280 140088710698752 vis.py:303] Visualizing batch 3
2020-02-19 15:23:10.658753: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at spacetobatch_op.cc:219 : Invalid argument: padded_shape[0]=74 is not divisible by block_shape[0]=18
2020-02-19 15:23:10.658821: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at spacetobatch_op.cc:219 : Invalid argument: padded_shape[0]=62 is not divisible by block_shape[0]=12
2020-02-19 15:23:10.658838: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at spacetobatch_op.cc:219 : Invalid argument: padded_shape[0]=50 is not divisible by block_shape[0]=6
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1356, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: padded_shape[0]=74 is not divisible by block_shape[0]=18
    [[{{node aspp3_depthwise/depthwise/SpaceToBatchND}}]]
    [[ArgMax/_4403]]
  (1) Invalid argument: padded_shape[0]=74 is not divisible by block_shape[0]=18
    [[{{node aspp3_depthwise/depthwise/SpaceToBatchND}}]]
0 successful operations.
0 derived errors ignored.
I think this is the same error I got from the eval command without --min_resize_value=321 --max_resize_value=321.
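
One thing that may be worth checking is the crop size. As far as I understand DeepLab's documented convention, crop sizes should have the form output_stride * k + 1, and for eval.py and vis.py the crop should be at least as large as the largest image in the split. The sketch below checks both conditions. The helper names `is_valid_crop` and `smallest_crop_covering` are made up for illustration, and nothing in this thread confirms that this is the cause of the errors above.

```python
def is_valid_crop(crop, output_stride=16):
    """True if crop has the form output_stride * k + 1 for an integer k."""
    return (crop - 1) % output_stride == 0


def smallest_crop_covering(max_dim, output_stride=16):
    """Smallest size of the form output_stride * k + 1 that is >= max_dim."""
    # Ceiling division of (max_dim - 1) by output_stride, in integer math.
    k = -(-(max_dim - 1) // output_stride)
    return output_stride * k + 1


if __name__ == "__main__":
    # Crop sizes taken from the commands in this issue.
    for crop in (321, 531):
        print(crop, is_valid_crop(crop))
    # Example image side lengths; replace with the real maximum of your data.
    for side in (316, 384, 512):
        print(side, smallest_crop_covering(side))
```

With output_stride=16, 321 satisfies the form but 531 does not (513 does), and an image side of 384 pixels would need a crop of at least 385.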

I would really like to know how to solve this problem and look forward to your reply.

I made several changes in the code.

In train.py:

`
flags.DEFINE_boolean('initialize_last_layer', False,
                     'Initialize the last layer.')

flags.DEFINE_boolean('last_layers_contain_logits_only', True,
                     'Only consider logits as last layers or not.')
`

In train_util.py:

`
# Variables that will not be restored.
exclude_list = ['global_step', 'logits']
if not initialize_last_layer:
  exclude_list.extend(last_layers)
`

I also added my dataset to segmentation_dataset.py:

`
_MYDATA_INFORMATION = DatasetDescriptor(
    splits_to_sizes={
        'train': 40,  # num of samples in images/training
        'val': 4,  # num of samples in images/validation
    },
    num_classes=6,
    ignore_label=255,
)

_DATASETS_INFORMATION = {
    'cityscapes': _CITYSCAPES_INFORMATION,
    'pascal_voc_seg': _PASCAL_VOC_SEG_INFORMATION,
    'ade20k': _ADE20K_INFORMATION,
    'mydata': _MYDATA_INFORMATION,
}
`
Thank you very much!

Source

xixixijie

All 5 comments

I discovered that some of the images are as small as 384×316, while the others are larger. Could that affect this?

xixixijie on 19 Feb 2020

I tried changing the crop size from 321 to 313, but it still didn't work.

xixixijie on 19 Feb 2020

I have exactly the same issue. Were you able to solve it?

advaza on 31 Mar 2020

👍4

I also have the same issue. Did changing the crop size work, @xixixijie?

riti1302 on 18 Aug 2020

Some versions of TensorFlow have a bug in resize_images: it does not accept a tensor (variable) image size, only constants.
Potential solutions are either: