cd ~/tf_1_10_src/tensorflow/tensorflow/models/research; ~/train_object_detection_v1.sh
Script train_object_detection_v1.sh source:
#!/bin/bash
echo "Object detection script v.1"
echo "Check execution path"
if [[ "$PWD" = "$TFMODELPATH/research" ]]
then
echo "Current working directory is correct."
PROJECT_DIR=/media/nikita/LinuxBD4Tb/SSD_PROJECT
PIPELINE_CONFIG_PATH=$PROJECT_DIR/ssd_inception_v2_coco_nik.config
MODEL_DIR=$PROJECT_DIR/models/model
NUM_TRAIN_STEPS=4000000
SAMPLE_1_OF_N_EVAL_EXAMPLES=1
echo "PIPELINE_CONFIG_PATH=$PIPELINE_CONFIG_PATH"
echo "MODEL_DIR=$MODEL_DIR"
echo "NUM_TRAIN_STEPS=$NUM_TRAIN_STEPS"
echo "SAMPLE_1_OF_N_EVAL_EXAMPLES=$SAMPLE_1_OF_N_EVAL_EXAMPLES"
echo "-----------------"
echo "Start object_detection/model_main.py"
python object_detection/model_main.py \
--pipeline_config_path=${PIPELINE_CONFIG_PATH} \
--model_dir=${MODEL_DIR} \
--num_train_steps=${NUM_TRAIN_STEPS} \
--sample_1_of_n_eval_examples=$SAMPLE_1_OF_N_EVAL_EXAMPLES \
--alsologtostderr
else
echo "Current working directory must be 'tensorflow/models/research/'."
echo "Correct path is '$TFMODELPATH/research'"
echo "Exit."
fi
Dir struct:
./SSD_PROJECT/
├── data
│   ├── mscoco_label_map.pbtxt
│   ├── mscoco_train.record
│   └── mscoco_val.record
├── models
│   └── model
│       ├── eval
│       ├── pipeline.config
│       └── train
└── ssd_inception_v2_coco_nik.config
File ssd_inception_v2_coco_nik.config:
model {
ssd {
num_classes: 90
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
}
}
similarity_calculator {
iou_similarity {
}
}
anchor_generator {
ssd_anchor_generator {
num_layers: 6
min_scale: 0.2
max_scale: 0.95
aspect_ratios: 1.0
aspect_ratios: 2.0
aspect_ratios: 0.5
aspect_ratios: 3.0
aspect_ratios: 0.3333
reduce_boxes_in_lowest_layer: true
}
}
image_resizer {
fixed_shape_resizer {
height: 300
width: 300
}
}
box_predictor {
convolutional_box_predictor {
min_depth: 0
max_depth: 0
num_layers_before_predictor: 0
use_dropout: false
dropout_keep_probability: 0.8
kernel_size: 3
box_code_size: 4
apply_sigmoid_to_scores: false
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
truncated_normal_initializer {
stddev: 0.03
mean: 0.0
}
}
}
}
}
feature_extractor {
type: 'ssd_inception_v2'
min_depth: 16
depth_multiplier: 1.0
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
truncated_normal_initializer {
stddev: 0.03
mean: 0.0
}
}
batch_norm {
train: true,
scale: true,
center: true,
decay: 0.9997,
epsilon: 0.001,
}
}
override_base_feature_extractor_hyperparams: true
}
loss {
classification_loss {
weighted_sigmoid {
}
}
localization_loss {
weighted_smooth_l1 {
}
}
hard_example_miner {
num_hard_examples: 3000
iou_threshold: 0.99
loss_type: CLASSIFICATION
max_negatives_per_positive: 3
min_negatives_per_image: 0
}
classification_weight: 1.0
localization_weight: 1.0
}
normalize_loss_by_num_matches: true
post_processing {
batch_non_max_suppression {
score_threshold: 1e-8
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 100
}
score_converter: SIGMOID
}
}
}
train_config: {
batch_size: 32
optimizer {
rms_prop_optimizer: {
learning_rate: {
exponential_decay_learning_rate {
initial_learning_rate: 0.004
decay_steps: 800720
decay_factor: 0.95
}
}
momentum_optimizer_value: 0.9
decay: 0.9
epsilon: 1.0
}
}
fine_tune_checkpoint: "/home/nikita/tf_1_10_src/pretrained_InceptionV2_ImageNet_CLS2012/inception_v2.ckpt"
from_detection_checkpoint: false
# Note: The below line limits the training process to 200K steps, which we
# empirically found to be sufficient enough to train the pets dataset. This
# effectively bypasses the learning rate schedule (the learning rate will
# never decay). Remove the below line to train indefinitely.
#num_steps: 300000
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
ssd_random_crop {
}
}
}
train_input_reader: {
tf_record_input_reader {
input_path: "/media/nikita/LinuxBD4Tb/SSD_PROJECT/data/mscoco_train.record"
}
label_map_path: "/media/nikita/LinuxBD4Tb/SSD_PROJECT/data/mscoco_label_map.pbtxt"
}
eval_config: {
num_examples: 5000
# Note: The below line limits the evaluation process to 10 evaluations.
# Remove the below line to evaluate indefinitely.
max_evals: 10
}
eval_input_reader: {
tf_record_input_reader {
input_path: "/media/nikita/LinuxBD4Tb/SSD_PROJECT/data/mscoco_val.record"
}
label_map_path: "/media/nikita/LinuxBD4Tb/SSD_PROJECT/data/mscoco_label_map.pbtxt"
shuffle: false
num_readers: 1
}
Error log:
nikita@ubuntulinux:~/tf_1_10_src/tensorflow/tensorflow/models/research$ ~/train_object_detection_v1.sh
Object detection script v.1
Check execution path
Current working directory is correct.
PIPELINE_CONFIG_PATH=/media/nikita/LinuxBD4Tb/SSD_PROJECT/ssd_inception_v2_coco_nik.config
MODEL_DIR=/media/nikita/LinuxBD4Tb/SSD_PROJECT/models/model
NUM_TRAIN_STEPS=4000000
SAMPLE_1_OF_N_EVAL_EXAMPLES=1
-----------------
Start object_detection/model_main.py
/home/nikita/tf_1_10_src/tensorflow/tensorflow/models/research/object_detection/utils/visualization_utils.py:27: UserWarning:
This call to matplotlib.use() has no effect because the backend has already
been chosen; matplotlib.use() must be called *before* pylab, matplotlib.pyplot,
or matplotlib.backends is imported for the first time.
The backend was *originally* set to u'TkAgg' by the following code:
File "object_detection/model_main.py", line 26, in <module>
from object_detection import model_lib
File "/home/nikita/tf_1_10_src/tensorflow/tensorflow/models/research/object_detection/model_lib.py", line 27, in <module>
from object_detection import eval_util
File "/home/nikita/tf_1_10_src/tensorflow/tensorflow/models/research/object_detection/eval_util.py", line 27, in <module>
from object_detection.metrics import coco_evaluation
File "/home/nikita/tf_1_10_src/tensorflow/tensorflow/models/research/object_detection/metrics/coco_evaluation.py", line 20, in <module>
from object_detection.metrics import coco_tools
File "/home/nikita/tf_1_10_src/tensorflow/tensorflow/models/research/object_detection/metrics/coco_tools.py", line 47, in <module>
from pycocotools import coco
File "/home/nikita/tf_1_10_src/tensorflow/tensorflow/models/research/pycocotools/coco.py", line 49, in <module>
import matplotlib.pyplot as plt
File "/usr/local/lib/python2.7/dist-packages/matplotlib/pyplot.py", line 69, in <module>
from matplotlib.backends import pylab_setup
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/__init__.py", line 14, in <module>
line for line in traceback.format_stack()
import matplotlib; matplotlib.use('Agg') # pylint: disable=multiple-statements
WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
W0924 15:58:51.491380 140112981780224 tf_logging.py:125] Forced number of epochs for all eval validations to be 1.
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
W0924 15:58:51.491605 140112981780224 tf_logging.py:125] Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
WARNING:tensorflow:Estimator's model_fn (<function model_fn at 0x7f6e49985c80>) includes params argument, but params are not passed to Estimator.
W0924 15:58:51.491858 140112981780224 tf_logging.py:125] Estimator's model_fn (<function model_fn at 0x7f6e49985c80>) includes params argument, but params are not passed to Estimator.
1) exporter_name=Servo_0; eval_spec_name=0(type <type 'int'>)
Traceback (most recent call last):
File "object_detection/model_main.py", line 109, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "object_detection/model_main.py", line 102, in main
eval_on_train_data=False)
File "/home/nikita/tf_1_10_src/tensorflow/tensorflow/models/research/object_detection/model_lib.py", line 659, in create_train_and_eval_specs
exporters=exporter))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 237, in __new__
raise TypeError('`name` must be string, given: {}'.format(name))
TypeError: `name` must be string, given: 0
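The failure can be reproduced without TensorFlow. The sketch below is only an illustration: `check_name` is a hypothetical stand-in that mimics the behaviour implied by the traceback (a `TypeError` for a non-string name). It is not the code from `tensorflow/python/estimator/training.py`, and it is written for Python 3, where `str` is the only text type.

```python
def check_name(name):
    """Hypothetical stand-in for the name check seen in the traceback."""
    if name is not None and not isinstance(name, str):
        raise TypeError('`name` must be string, given: {}'.format(name))
    return name


def try_names(names):
    """Run the check on each name and record the outcome."""
    results = []
    for name in names:
        try:
            check_name(name)
            results.append((name, 'ok'))
        except TypeError as err:
            results.append((name, str(err)))
    return results


# One eval input function, as in the run above (placeholder callable).
eval_input_fns = [lambda: None]

# Integer names, as produced by range(len(eval_input_fns)): rejected.
print(try_names(range(len(eval_input_fns))))

# The same names converted with str(): accepted.
print(try_names([str(i) for i in range(len(eval_input_fns))]))
```

Note that `'{}_{}'.format(final_exporter_name, eval_spec_name)` accepts an integer without complaint, which is why the exporter name `Servo_0` is built successfully and only the eval spec name is rejected.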
Debug of file object_detection/model_lib.py shows:
File modifications:
eval_specs = []  # line 646
cntr = 0  # added: counter
for eval_spec_name, eval_input_fn in zip(eval_spec_names, eval_input_fns):
  cntr += 1  # added: increment counter
  exporter_name = '{}_{}'.format(final_exporter_name, eval_spec_name)
  # added: print debugging variables
  print("{}) exporter_name={}; eval_spec_name={}(type {})".format(
      cntr, exporter_name, eval_spec_name, type(eval_spec_name)))
  exporter = tf.estimator.FinalExporter(
      name=exporter_name, serving_input_receiver_fn=predict_input_fn)
  eval_specs.append(
      tf.estimator.EvalSpec(
          name=eval_spec_name,
          input_fn=eval_input_fn,
          steps=None,
          exporters=exporter))
Result:
...
1) exporter_name=Servo_0; eval_spec_name=0(type <type 'int'>)
...
Possible solution (IMHO):
Change in object_detection/model_lib.py:
  eval_specs.append(
      tf.estimator.EvalSpec(
          name=eval_spec_name,
          input_fn=eval_input_fn,
          steps=None,
          exporters=exporter))
to:
  eval_specs.append(
      tf.estimator.EvalSpec(
          name=str(eval_spec_name),
          input_fn=eval_input_fn,
          steps=None,
          exporters=exporter))
P.S. I can't understand why object_detection/model_lib.py generates a list of integers at line 644:
if eval_spec_names is None:  # line 643
  eval_spec_names = range(len(eval_input_fns))  # creates integers (line 644)
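`range(len(...))` yields integers in both Python 2 and Python 3, so the default names are never strings. An alternative to wrapping the name at the point of use is to convert the names where they are generated. The sketch below assumes a hypothetical helper, `default_eval_spec_names`, which is not part of model_lib.py; it only shows the idea.

```python
def default_eval_spec_names(eval_input_fns, eval_spec_names=None):
    """Return one string name per eval input function.

    Hypothetical helper for illustration; not part of model_lib.py.
    """
    if eval_spec_names is None:
        # Same default as line 644, which produces integers.
        eval_spec_names = range(len(eval_input_fns))
    # Convert every name to a string, whether defaulted or user-supplied.
    names = [str(name) for name in eval_spec_names]
    if len(names) != len(eval_input_fns):
        raise ValueError('Expected one name per eval input function.')
    return names


# Placeholder callables standing in for the real eval input functions.
eval_input_fns = [lambda: None, lambda: None]

print(default_eval_spec_names(eval_input_fns))
print(default_eval_spec_names(eval_input_fns, ['val', 'train']))
```

Converting at the source means every later use of the names (the exporter name and the eval spec name) sees a string, at the cost of touching the default rather than the single call that failed.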
@pkulzc - I'm facing the same issue with TF 1.11 and Python 3.6. Could you please help us out?
Give this line of change a shot (wrapping eval_spec_name with str):
https://github.com/tensorflow/models/pull/5372/files#diff-76df8ca264a059e8a3003851fe4d7849R653
Upstream sent https://github.com/tensorflow/models/pull/5354 to fix this
Give this line of change a shot (wrapping eval_spec_name with str):
https://github.com/tensorflow/models/pull/5372/files#diff-76df8ca264a059e8a3003851fe4d7849R653
This seems to work for me, thanks!
Hi There,
We are checking to see if you still need help with this, as this seems to be a considerably old issue. Please update this issue with the latest information, a code snippet to reproduce your issue, and the error you are seeing.
If we don't hear from you in the next 7 days, this issue will be closed automatically. If you don't need help on this issue any more, please consider closing this.