I used a pre-trained model (deeplabv3_pascal_train_aug) to perform semantic segmentation on my own dataset (one label + background, num_classes = 2) by retraining only the last layer. Training runs without error. However, evaluation and visualization both fail with: Invalid argument: padded_shape[X] = Y is not divisible by block_shape[X] = Z. No matter what I do, I keep getting this error.
My dataset consists of images of size [640, 480]. I followed the instructions to build the ground-truth images with labels [background = 0, label_1 = 1], create the tfrecords, etc. I saw in previous issue reports that the crop size selected during evaluation has to cover the full image. I therefore increased eval_crop_size to 641 (641 = k * 16 + 1 > image size) to ensure it is larger than either dimension of the image.
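For reference, a minimal sketch of that crop-size rule as I understand it (the helper function is mine, not from the DeepLab repo): a valid crop size has the form k * output_stride + 1 and must cover the image dimension.

# Minimal sketch of the DeepLab crop-size convention (k * output_stride + 1);
# the helper name is my own, not part of the repo.
def smallest_valid_crop(image_dim, output_stride=16):
    k = -(-(image_dim - 1) // output_stride)  # ceil((dim - 1) / stride)
    return k * output_stride + 1

print(smallest_valid_crop(640))  # 641
print(smallest_valid_crop(480))  # 481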
Running the original local_test.sh works fine.
Error for the visualization call:
...
Caused by op 'xception_65/exit_flow/block2/unit_1/xception_module/separable_conv1_depthwise/depthwise/SpaceToBatchND', defined at:
  File "C:/tensorflow/models/research/deeplab/vis.py", line 312, in <module>
    tf.app.run()
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "C:/tensorflow/models/research/deeplab/vis.py", line 230, in main
    image_pyramid=FLAGS.image_pyramid)
  File "C:\tensorflow\models\research\deeplab\model.py", line 183, in predict_labels
    fine_tune_batch_norm=False)
  File "C:\tensorflow\models\research\deeplab\model.py", line 313, in multi_scale_logits
    nas_training_hyper_parameters=nas_training_hyper_parameters)
  File "C:\tensorflow\models\research\deeplab\model.py", line 553, in _get_logits
    nas_training_hyper_parameters=nas_training_hyper_parameters)
  File "C:\tensorflow\models\research\deeplab\model.py", line 395, in extract_features
    use_bounded_activation=model_options.use_bounded_activation)
  File "C:\tensorflow\models\research\deeplab\core\feature_extractor.py", line 341, in extract_features
    scope=name_scope[model_variant])
  File "C:\tensorflow\models\research\deeplab\core\feature_extractor.py", line 408, in network_fn
    *args, **kwargs)
  File "C:\tensorflow\models\research\deeplab\core\xception.py", line 655, in xception_65
    scope=scope)
  File "C:\tensorflow\models\research\deeplab\core\xception.py", line 464, in xception
    net = stack_blocks_dense(net, blocks, output_stride)
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\contrib\framework\python\ops\arg_scope.py", line 182, in func_with_args
    return func(*args, **current_args)
  File "C:\tensorflow\models\research\deeplab\core\xception.py", line 379, in stack_blocks_dense
    net = block.unit_fn(net, rate=rate, **dict(unit, stride=1))
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\contrib\framework\python\ops\arg_scope.py", line 182, in func_with_args
    return func(*args, **current_args)
  File "C:\tensorflow\models\research\deeplab\core\xception.py", line 293, in xception_module
    scope='separable_conv' + str(i+1))
  File "C:\tensorflow\models\research\deeplab\core\xception.py", line 284, in _separable_conv
    scope=scope)
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\contrib\framework\python\ops\arg_scope.py", line 182, in func_with_args
    return func(*args, **current_args)
  File "C:\tensorflow\models\research\deeplab\core\xception.py", line 185, in separable_conv2d_same
    outputs = _split_separable_conv2d(padding='SAME')
  File "C:\tensorflow\models\research\deeplab\core\xception.py", line 175, in _split_separable_conv2d
    **kwargs)
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\contrib\framework\python\ops\arg_scope.py", line 182, in func_with_args
    return func(*args, **current_args)
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\contrib\layers\python\layers\layers.py", line 2822, in separable_convolution2d
    data_format=data_format)
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\python\ops\nn_impl.py", line 522, in depthwise_conv2d
    op=op)
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 435, in with_space_to_batch
    return new_op(input, None)
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 591, in __call__
    return self.call(inp, filter)
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 574, in _with_space_to_batch_call
    input=inp, block_shape=dilation_rate, paddings=paddings)
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 8648, in space_to_batch_nd
    paddings=paddings, name=name)
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
    op_def=op_def)
  File "C:\ProgramData\Anaconda3\envs\delta\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): padded_shape[0]=127 is not divisible by block_shape[0]=2
  [[node xception_65/exit_flow/block2/unit_1/xception_module/separable_conv1_depthwise/depthwise/SpaceToBatchND (defined at C:\tensorflow\models\research\deeplab\core\xception.py:175) ]]
My code:
Changes in data_generator.py:
_MyDataset = DatasetDescriptor(
    splits_to_sizes={
        'train': 125,     # num of samples in images/training
        'val': 125,       # num of samples in images/validation
        'trainval': 250,  # num of samples in training + validation
    },
    num_classes=2,
    ignore_label=255,
)
_DATASETS_INFORMATION = {
    'cityscapes': _CITYSCAPES_INFORMATION,
    'pascal_voc_seg': _PASCAL_VOC_SEG_INFORMATION,
    'ade20k': _ADE20K_INFORMATION,
    'MyDataset': _MyDataset,
}
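As an aside, a quick way to confirm the ground-truth PNGs really only contain the values declared above (a sketch of my own, with an assumed directory layout, not DeepLab code):

# Hypothetical sanity check, not part of DeepLab: masks should only contain
# 0 (background), 1 (label_1) and 255 (ignore_label).
import glob
import numpy as np
from PIL import Image

ALLOWED = {0, 1, 255}
for path in glob.glob('MyDataset/SegmentationClassRaw/*.png'):  # assumed layout
    found = set(np.unique(np.asarray(Image.open(path))).tolist())
    assert found <= ALLOWED, f'{path} has unexpected labels: {found - ALLOWED}'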
My modified local_test.sh script:
cd ..
# Set up the working environment.
CURRENT_DIR=$(pwd)
WORK_DIR="${CURRENT_DIR}/deeplab"
DATASET_DIR="datasets"
# Set up the working directories.
PASCAL_FOLDER="MyDataset"
EXP_FOLDER="exp/train_on_trainval_set"
PASCAL_DATASET="C:/tensorflow/models/research/deeplab/datasets/MyDataset/tfrecord"
INIT_FOLDER="C:/tensorflow/models/research/deeplab/datasets/pascal_voc_seg/init_models"
EVAL_LOGDIR="${WORK_DIR}/${DATASET_DIR}/${PASCAL_FOLDER}/${EXP_FOLDER}/eval"
TRAIN_LOGDIR="${WORK_DIR}/${DATASET_DIR}/${PASCAL_FOLDER}/${EXP_FOLDER}/train"
VIS_LOGDIR="${WORK_DIR}/${DATASET_DIR}/${PASCAL_FOLDER}/${EXP_FOLDER}/vis"
EXPORT_LOGDIR="${WORK_DIR}/${DATASET_DIR}/${PASCAL_FOLDER}/${EXP_FOLDER}/export"
mkdir -p "${WORK_DIR}/${DATASET_DIR}/${PASCAL_FOLDER}/exp"
mkdir -p "${TRAIN_LOGDIR}"
NUM_ITERATIONS=10
python "${WORK_DIR}"/train.py \
--logtostderr \
--train_split="train" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size=321 \
--train_crop_size=321 \
--train_batch_size=1 \
--training_number_of_steps="${NUM_ITERATIONS}" \
--fine_tune_batch_norm=false \
--initialize_last_layer=false \
--last_layers_contain_logits_only=true \
--dataset="${PASCAL_FOLDER}" \
--tf_initial_checkpoint="${INIT_FOLDER}/deeplabv3_pascal_train_aug/model.ckpt" \
--train_logdir="${TRAIN_LOGDIR}" \
--dataset_dir="${PASCAL_DATASET}"
# Run evaluation.
python "${WORK_DIR}"/eval.py \
--logtostderr \
--eval_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--eval_crop_size=641 \
--eval_crop_size=641 \
--checkpoint_dir="${TRAIN_LOGDIR}" \
--eval_logdir="${EVAL_LOGDIR}" \
--dataset_dir="${PASCAL_DATASET}" \
--dataset="${PASCAL_FOLDER}" \
--max_number_of_evaluations=1
# Visualize the results.
python "${WORK_DIR}"/vis.py \
--logtostderr \
--vis_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--vis_crop_size=641 \
--vis_crop_size=641 \
--checkpoint_dir="${TRAIN_LOGDIR}" \
--vis_logdir="${VIS_LOGDIR}" \
--dataset_dir="${PASCAL_DATASET}" \
--dataset="${PASCAL_FOLDER}" \
--max_number_of_iterations=1
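As a quick sanity check on the crop flags above, something like the following (my own helper; the image paths are an assumption) confirms that the chosen crop covers every image in the split:

# Hypothetical check, not part of DeepLab: eval/vis crop sizes must cover
# every validation image, so no dimension exceeds the crop.
import glob
from PIL import Image

CROP_H, CROP_W = 641, 641  # values passed to --eval_crop_size / --vis_crop_size
for path in glob.glob('MyDataset/JPEGImages/val/*.jpg'):  # assumed layout
    w, h = Image.open(path).size  # PIL returns (width, height)
    assert h <= CROP_H and w <= CROP_W, f'{path} is {w}x{h}, exceeds crop'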
I am getting the same error, but using the ade20k dataset.
InvalidArgumentError (see above for traceback): padded_shape[1]=58 is not divisible by block_shape[1]=6
@codysjackson I have encountered exactly the same error as yours when using the ade20k dataset. Have you fixed it? Thanks!
I have solved this issue thanks to #3939.
I just solved the problem following #6559
What solved my problem was a change in utils/train_utils.py.
I had to explicitly modify the exclude list:
From:
exclude_list = ['global_step']
to:
exclude_list = ['global_step','logits']
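For context, this list feeds the variable-restore call inside get_model_init_fn; roughly like this (paraphrased from memory, not a verbatim copy of train_utils.py):

# Paraphrased from deeplab/utils/train_utils.py (not verbatim): the exclude
# list controls which variables are NOT restored from the initial checkpoint.
import tensorflow as tf  # TF 1.x, where tf.contrib is available

exclude_list = ['global_step', 'logits']  # was: ['global_step']
variables_to_restore = tf.contrib.framework.get_variables_to_restore(
    exclude=exclude_list)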
I think this should be controlled by the arguments in train.py:
--fine_tune_batch_norm=false \
--initialize_last_layer=false \
--last_layers_contain_logits_only=true \
However, it seems that for some reason they were not applied correctly, and the exclude list has to be updated manually. I also changed the crop sizes in evaluation and visualization to 481 and 641 (481 = 30 * 16 + 1 and 641 = 40 * 16 + 1, since my images are 480x640).
Hope this serves for you also!
I couldn't solve the same problem with @Argantonio65's approach alone. After implementing every step as he did, I still had to apply #3695 as well. Only then was the fix complete.
Closing this issue since it's resolved. Thanks all!