Hello Tensorflow team,
This is my system information for the issue I have explained below:
The exact command that fails:
python eval.py \
--eval_crop_size='513,513' \
--logtostderr \
--eval_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--dataset="cells" \
--checkpoint_dir=/mnt/lustre/LOGDIR \
--eval_logdir=/mnt/lustre/LOGDIREVAL \
--dataset_dir=/mnt/lustre/tfrecord
I'm failing to run the eval.py script as above. The error I get is:
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [`labels` out of bound] [Condition x < y did not hold element-wise:] [x (mean_iou/confusion_matrix/control_dependency:0) = ] [3 3 3...] [y (mean_iou/Cast_1:0) = ] [3]
[[node mean_iou/confusion_matrix/assert_less/Assert/AssertGuard/Assert (defined at /jorgeenv/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
Prior to that I have:
1. Created the files in the tfrecord folder using build_voc2012_data.py, for the --dataset_dir argument of both train.py and eval.py.
My original images are 500x333 PNG files. The corresponding masks are 500x333 indexed PNG files. There are three indices, 0, 1 and 2, where 0 is the background. For testing purposes I have two images, one for training and one for validation. I have uploaded an example. In the datasets/data_generator.py script I have therefore added:
_CELLS_INFORMATION = DatasetDescriptor(
    splits_to_sizes={
        'train': 1,
        'trainval': 2,
        'val': 1,
    },
    num_classes=3,
    ignore_label=0
)
_DATASETS_INFORMATION = {
    'cityscapes': _CITYSCAPES_INFORMATION,
    'pascal_voc_seg': _PASCAL_VOC_SEG_INFORMATION,
    'ade20k': _ADE20K_INFORMATION,
    'cells': _CELLS_INFORMATION,
}
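One thing that may be worth checking at this point is whether the pixel values stored in the masks really are limited to 0, 1 and 2. The following is only a minimal sketch, assuming the mask has already been decoded into a NumPy array. The function name and the sample array are hypothetical and are not part of the DeepLab code:

```python
import numpy as np


def unexpected_label_values(mask, num_classes, ignore_label):
    """Return sorted label values that are neither valid classes nor the ignore label."""
    values = np.unique(mask)
    # Valid class indices lie in [0, num_classes); ignore_label is allowed separately.
    return [int(v) for v in values
            if not (0 <= v < num_classes) and v != ignore_label]


# Hypothetical 2x3 array standing in for one decoded annotation image.
sample_mask = np.array([[0, 1, 2], [3, 3, 0]], dtype=np.uint8)
print(unexpected_label_values(sample_mask, num_classes=3, ignore_label=0))
```

An empty list would mean every pixel is either a valid class index or the ignore label. Anything else points at values that the descriptor above does not account for.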
2. Successfully ran the train.py script like this:
python train.py \
--initialize_last_layer=False \
--last_layers_contain_logits_only=False \
--logtostderr \
--dataset="cells" \
--training_number_of_steps=1 \
--train_split="train" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size="513,513" \
--train_batch_size=1 \
--tf_initial_checkpoint=/mnt/lustre/xception/model.ckpt \
--train_logdir=/mnt/lustre/LOGDIR \
--dataset_dir=/mnt/lustre/tfrecord
3. Ran the eval.py script like this, which produces the error:
python eval.py \
--eval_crop_size='513,513' \
--logtostderr \
--eval_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--dataset="cells" \
--checkpoint_dir=/mnt/lustre/LOGDIR \
--eval_logdir=/mnt/lustre/LOGDIREVAL \
--dataset_dir=/mnt/lustre/tfrecord
The error again is:
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [`labels` out of bound] [Condition x < y did not hold element-wise:] [x (mean_iou/confusion_matrix/control_dependency:0) = ] [3 3 3...] [y (mean_iou/Cast_1:0) = ] [3]
[[node mean_iou/confusion_matrix/assert_less/Assert/AssertGuard/Assert (defined at /jorgeenv/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
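The assertion text reports x containing the value 3 and y equal to 3, so at least one label reaching the confusion matrix is not strictly below num_classes. The same bound check can be reproduced outside TensorFlow. This is only a NumPy sketch of a confusion matrix with that check, not the code behind the mean_iou metric, and the function name and sample lists are made up for illustration:

```python
import numpy as np


def confusion_matrix(labels, predictions, num_classes):
    """Count (label, prediction) pairs; rows are labels, columns are predictions."""
    labels = np.asarray(labels, dtype=np.int64).ravel()
    predictions = np.asarray(predictions, dtype=np.int64).ravel()
    # Every value must satisfy 0 <= value < num_classes, as in the failing assertion.
    for name, arr in (("labels", labels), ("predictions", predictions)):
        if arr.size and (arr.min() < 0 or arr.max() >= num_classes):
            raise ValueError("%s out of bound for num_classes=%d" % (name, num_classes))
    # Encode each pair as a single index, then count occurrences.
    flat = labels * num_classes + predictions
    counts = np.bincount(flat, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)


print(confusion_matrix([0, 1, 2, 2], [0, 1, 1, 2], num_classes=3))
try:
    # A label equal to num_classes triggers the bound check.
    confusion_matrix([3, 3, 3], [0, 1, 2], num_classes=3)
except ValueError as err:
    print(err)
```

With three classes only the values 0, 1 and 2 are accepted, so a label of 3 fails the check in the same way as the error above.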
Many Thanks
Jorge

I am facing the same issue too and am waiting for help.
Thanks in advance.
Did you solve the error? I have the same issue
Hi,
I am afraid I didn't solve it. Actually, I ended up using Mask R-CNN:
https://github.com/matterport/Mask_RCNN
Jorge
I got the same error.
python "${WORK_DIR}"/train.py \
--logtostderr \
--train_split="train" \
--model_variant="mobilenet_v2" \
--output_stride=16 \
--train_crop_size="513,513" \
--train_batch_size=4 \
--training_number_of_steps="${NUM_ITERATIONS}" \
--fine_tune_batch_norm=False \
--train_logdir="${TRAIN_LOGDIR}" \
--dataset_dir="${PASCAL_DATASET}" \
--dataset="rare_plane"
python "${WORK_DIR}"/eval.py \
--logtostderr \
--eval_split="trainval" \
--model_variant="mobilenet_v2" \
--output_stride=16 \
--eval_crop_size="513,513" \
--eval_batch_size=4 \
--checkpoint_dir="${TRAIN_LOGDIR}" \
--eval_logdir="${EVAL_LOGDIR}" \
--dataset_dir="${PASCAL_DATASET}" \
--dataset="rare_plane" \
--max_number_of_evaluations=1
Please give a response if someone solves this problem. Thanks.