Models: No variables to save error using tensorflow object detection API

Created on 20 Jan 2018  ·  48 Comments  ·  Source: tensorflow/models

System information

  • What is the top-level directory of the model you are using: /tensorflow/models/research
  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):

    DISTRIB_ID=Ubuntu
    DISTRIB_RELEASE=16.04
    DISTRIB_CODENAME=xenial
    DISTRIB_DESCRIPTION="Ubuntu 16.04.3 LTS"

  • TensorFlow installed from (source or binary):
    binary

  • TensorFlow version (use command below):
    ('v1.3.0-rc1-7384-gc9437c1', '1.6.0-dev20180119')
  • Bazel version (if compiling from source): Not Applicable
  • CUDA/cuDNN version: Not Applicable
  • GPU model and memory: Not Applicable
  • Exact command to reproduce:

    python object_detection/train.py --logtostderr
    --pipeline_config_path=train_mydata/models/model/ssd_mobilenet_v1_coco.config
    --train_dir=train_mydata/models/train

Describe the problem

Successfully created the necessary files for training on my custom data, as described at https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_locally.md

Executing the above command complained about the absence of tkinter, which was then installed with apt-get install. TensorFlow was updated to the latest nightly build.

Getting 'No variables to save' after running the above training command.

Source code / logs

---Config file

  # SSD with Mobilenet v1 configuration for MSCOCO Dataset.


model {
  ssd {
    num_classes: 1
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
          anchorwise_output: true
        }
      }
      localization_loss {
        weighted_smooth_l1 {
          anchorwise_output: true
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 15
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "/tensorflow/models/research/train_mydata/data/mobilenet_v1_1.0_224.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 300
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "/tensorflow/models/research/train_mydata/data/train.record"
  }
  label_map_path: "/tensorflow/models/research/train_mydata/data/object_label.pbtxt"
}

eval_config: {
  num_examples: 160
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "/tensorflow/models/research/train_mydata/data/test.record"
  }
  label_map_path: "/tensorflow/models/research/train_mydata/data/object_label.pbtxt"
  shuffle: false
  num_readers: 1
  num_epochs: 1
}

Errors

Traceback (most recent call last):
  File "object_detection/train.py", line 163, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 124, in run
    _sys.exit(main(argv))
  File "object_detection/train.py", line 159, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "build/bdist.linux-x86_64/egg/object_detection/trainer.py", line 255, in train
    init_saver = tf.train.Saver(available_var_map)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1288, in __init__
    self.build()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1297, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1322, in _build
    raise ValueError("No variables to save")
ValueError: No variables to save
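
For reference, the traceback shows trainer.py building init_saver = tf.train.Saver(available_var_map); when that variable map comes back empty (no graph variables matched the names in fine_tune_checkpoint), the Saver constructor raises exactly this error. A minimal sketch, assuming TensorFlow 1.x graph mode, that reproduces the failure mode:

import tensorflow as tf

# An empty variable map is what trainer.py ends up with when no graph variable
# names match the names stored in the fine_tune_checkpoint file.
available_var_map = {}

try:
    init_saver = tf.train.Saver(available_var_map)
except ValueError as e:
    print(e)  # prints: No variables to save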

Most helpful comment

Solved for me using the solution from @NinjaWendy for training ssd_mobilenet_v1_fpn_coco with the stock config file that did not contain a from_detection_checkpoint line.

To my config file I added the line:

from_detection_checkpoint: true

Immediately after the fine_tune_checkpoint line. Worked fine.

All 48 comments

Any ideas @jch1 @tombstone ?

Getting the same issue. I was able to bypass the error by using other models (faster_rcnn_inception_v2_coco_2017_11_08).

I'd love to bypass this error by using other models, but those models give another error, for which there is already an issue:

Expected int32, got range(0, 3) of type 'range' instead.

So, I've so far been unsuccessful at using ANY model with TF object detection.

Solution for issue #3443 might help you, @jazoom.

With resnet_v1_101, following #3443 fixed the "Expected int32, got range(0, 3) of type 'range' instead" error, but then the "ValueError: No variables to save" error appeared. resnet_v2_101 has the same errors.

@yuanyuanxiang Have all of your problems been solved? Please help.

@shuli163love This is probably an issue with the pretrained model; just switch to a different one. Even if the model name is the same, it's worth trying a checkpoint with a different release date.

I'm getting the same errors with different models, including one I trained yesterday.
Yesterday I didn't get any errors. I terminated the training script (I'm using the stock train.py from models/research/object_detection, which doesn't stop itself), then presumably did something wrong and started getting these variable errors.
I wrote no custom code and was trying to train Inception v2 and MobileNet v1 & v2 models.

Before raising the variable exception, it prints warnings like these:

WARNING:root:Variable [MobilenetV1/Conv2d_9_depthwise/depthwise_weights/RMSProp] is not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_9_depthwise/depthwise_weights/RMSProp_1] is not available in checkpoint
WARNING:root:Variable [MobilenetV1/Conv2d_9_pointwise/BatchNorm/beta] is not available in checkpoint

which looks strange, because I used different models but MobilenetV1 is mentioned every time.

Speaking of Stack Overflow, there is some custom code being discussed there, and it's hard to figure out how those solutions could be applied to the stock code.
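
As a diagnostic for those "is not available in checkpoint" warnings, one option is to list what the checkpoint actually contains and compare it with the names the training graph expects. A small sketch, assuming TensorFlow 1.x; the checkpoint path is simply the one from this issue's config:

import tensorflow as tf

# Prefix from the config's fine_tune_checkpoint field.
ckpt = "/tensorflow/models/research/train_mydata/data/mobilenet_v1_1.0_224.ckpt"

# Print every variable name and shape stored in the checkpoint file.
for name, shape in tf.train.list_variables(ckpt):
    print(name, shape)

If none of the printed names line up with the scopes the detection graph uses (the warnings above mention MobilenetV1/... variables), every graph variable is skipped, the restore map ends up empty, and that is what triggers "No variables to save".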

I don't know why, but setting this variable to false solved the issue for me:
from_detection_checkpoint: false

@yuanyuanxiang is right. I hit the same problem and finally succeeded using ssd_mobilenet_v1_coco_2017_11_17.

#2025 should help @ddurgaprasad

from_detection_checkpoint: true should be changed to from_detection_checkpoint: false

Did anyone find any other solutions to this problem? I am trying to run train.py using the ssd_mobilenet_v1_fpn model and I am running into the same error:

ValueError: No variables to save.

There is no from_detection_checkpoint line in the ssd_mobilenet_v1_fpn_shared_box_predictor_640x640_coco14_sync.config file, unfortunately. I tried adding it to the config file and ran into the same error.

those "from checkpoint=false" tricks never worked for me....
though, it's always useful to ensure whether your TF version matches the version that was used for exporting the model
models from the zoo was from v. 1.5 (last time i've checked) when the last TF-GPU is 1.8 or so

Just run the model export before training; it should help.

I get the same error with ssd_mobilenet_v1_fpn_coco and ssd_resnet_50_fpn_coco as well. There is also no from_detection_checkpoint in those config files. I'm using Ubuntu 16.04 and TensorFlow 1.9 (GPU version). What can I do to bypass the error?

The opposite happened for me. Setting from_detection_checkpoint: true helped me resolve the error; I had it set to false previously. This is for ssd_resnet101_fpn_ct_coco.config.

@GeorgiaA did you find a solution to the problem? It's similar to what I'm facing too.

@karansomaiah sadly I didn't. I couldn't find any solution to it. I was only testing out various models, so I didn't spend much time fretting over the problem. If you find anything, please let me know.

I was getting the same error for the past few hours. I started again with a clean TensorFlow repository and the custom scripts mentioned in the following blog, and it worked.
https://becominghuman.ai/tensorflow-object-detection-api-tutorial-training-and-evaluating-custom-object-detector-ed2594afcf73

Earlier it was not finding model.ckpt, as there is no file in the folder with exactly that name, but somehow it's working now.
The value of the variable in the config file is:
fine_tune_checkpoint: "path_to_model_dir/model.ckpt"
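
One related detail: "model.ckpt" is a checkpoint prefix rather than a literal file on disk (the data lives in model.ckpt.index and model.ckpt.data-* files), which is why the folder appears to contain no file with that exact name. A quick sketch, assuming TensorFlow 1.x and reusing the illustrative path from the comment above, to verify the prefix resolves:

import tensorflow as tf

prefix = "path_to_model_dir/model.ckpt"
# True if the .index/.data files for this prefix exist on disk.
print(tf.train.checkpoint_exists(prefix))
# Newest checkpoint prefix recorded in the directory's state file, or None.
print(tf.train.latest_checkpoint("path_to_model_dir"))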

Hey @goravkaul thanks for your reply. I did get it working by changing the value of "from_detection_checkpoint" to "true". Thank you for your suggestions.

@GeorgiaA I did get it working by changing the above values in the config file. Let me know if anyone is still facing issues. Also, I was facing this specifically with the FPN models in SSD, so I'm training them with the train.py file in the legacy folder rather than model_main.py.

If you're using multiple GPUs, make sure to change "sync_replicas" to "true" as well.

@ravikantgupta9 Same here. I was using ssd_resnet50_v1_fpn and setting from_detection_checkpoint to true resolved it. I added it separately as it wasn't there in the config file.

I got it working with

from_detection_checkpoint: true

for faster_rcnn_inception_resnet_v2_atrous_oid.config, when pre-training that model.

@karansomaiah @nareshmungpara Hello! I'm DS.

I use the model config "ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync.config" and hit the same error before.

Could you tell me where the "from_detection_checkpoint" option is? I edited the config file "ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync.config" but can't find it.

Thank you, and I wish you a nice day.

Cheers,
DS

I received the same error using Mobilenet_V2. Within the config file there was no from_detection_checkpoint line. I added the following (the addition is the middle line) and it worked:

fine_tune_checkpoint: "/home/Machine_Learning_Verstening/scripts/training/model.ckpt"
from_detection_checkpoint: true
#fine_tune_checkpoint_type: "detection"

Solved for me using the solution from @NinjaWendy for training ssd_mobilenet_v1_fpn_coco with the stock config file that did not contain a from_detection_checkpoint line.

To my config file I added the line:

from_detection_checkpoint: true

Immediately after the fine_tune_checkpoint line. Worked fine.

@NinjaWendy has addressed this thread well, concerning 'detection-checkpoint'.

Training ssd_mobilenet_v1_fpn with a GPU, same problem.
Add from_detection_checkpoint: true after the checkpoint path in pipeline.config. If train.py still raises
ValueError: No variables to save
then set replicas_to_aggregate: 1.

I don't know why, but setting this variable to false solved the issue for me:
from_detection_checkpoint: false

For me it's the opposite: setting this to false raises the "No variables to save" error.

Adding the following line below fine_tune_checkpoint solved it for me:
from_detection_checkpoint: true

train_config: {
  fine_tune_checkpoint: "path/model.ckpt"
  from_detection_checkpoint: true

I have the same error when running the faster_rcnn_resnet152_coco model and get ValueError: No variables to save. I tried changing from_detection_checkpoint: true, but it doesn't work. The solution is to switch to another model, or to the same model with a different release date.

@karansomaiah @nareshmungpara Hello! I'm DS.

I use the model config "ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync.config" and hit the same error before.

Could you tell me where the "from_detection_checkpoint" option is? I edited the config file "ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync.config" but can't find it.

Thank you, and I wish you a nice day.

Cheers,
DS

Just add it manually! You will find it solves the problem.

@karansomaiah did you get it working for the ssd_mobilenet_v1_fpn model? Can you share which lines in the config file you changed? I manually added from_detection_checkpoint: true but it still doesn't work.

Did anyone find any other solutions to this problem? I am trying to run train.py using the ssd_mobilenet_v1_fpn model and I am running into the same error:

ValueError: No variables to save.

There is no from_detection_checkpoint line in the ssd_mobilenet_v1_fpn_shared_box_predictor_640x640_coco14_sync.config file, unfortunately. I tried adding it to the config file and ran into the same error.

Add the line below at line 157 of pipeline.config for the ssd_mobilenet_v1_fpn model:
---> from_detection_checkpoint: true

@karansomaiah did you get it working for the ssd_mobilenet_v1_fpn model? Can you share which lines in the config file you changed? I manually added from_detection_checkpoint: true but it still doesn't work.

Add the line below at line 157 of pipeline.config for the ssd_mobilenet_v1_fpn model:
---> from_detection_checkpoint: true

ValueError: No variables to save
I am facing the "No variables to save" issue. Does anyone have a solution for this?

Same problem. I tried adding "from_detection_checkpoint: false" and it didn't work; I tried adding "from_detection_checkpoint: true" and it started stepping but eventually didn't work either.
Using tensorflow-gpu == 1.14 with the ssd_mobilenet_v2_oid_v4_2018_12_12 model.

Any solution would be great!

I had the same error, and after two days I finally found a solution: look at the comment shown in the YouTube screenshot below. I hope it works for you!

[screenshot of a YouTube comment suggesting to comment out the fine_tune_checkpoint line]

Hey @goravkaul thanks for your reply. I did get it working by changing the value of "from_detection_checkpoint" to "true". Thank you for your suggestions.

@GeorgiaA I did get it working by changing the above values in the config file. Let me know if anyone is still facing issues. Also, I was facing this specifically with the FPN models in SSD, so I'm training them with the train.py file in the legacy folder rather than model_main.py.

If you're using multiple GPUs, make sure to change "sync_replicas" to "true" as well.

Where is sync_replicas?

@nika9774, THANK YOU!!!!!

I had the same error, and after two days I finally found a solution: look at the comment shown in the YouTube screenshot below. I hope it works for you!

[screenshot of a YouTube comment suggesting to comment out the fine_tune_checkpoint line]

Hey! That was my solution too. I was stuck on that problem for two days. My question is why? Why did we have to comment this out to make it work?

@nika9774 thank you!!! Your answer was my solution. My question is: why did we have to comment this line out in order to make it work? Isn't the training process supposed to use this pretrained model?
I'm asking because, as far as I know, the accuracy of our result will be better if we use a pretrained file.

Hey guys, @nika9774's solution will truly fix this issue for you, but it's a short-lived fix that will come back to haunt you as you proceed. The best thing is to understand what's wrong and look for the actual solution.

Check that the name attribute in your label_map.txt matches your XML files, among other things. If you have more than one class, make sure you specify it in your label map and also in your <model>.config file.

I also initially removed it and was able to get past this stage, but in the long run I had other issues to solve and had to uncomment it to finish the whole process.

Do NOT comment out the from_detection_checkpoint property.

@SirPhemmiey I am training right now with fine_tune_checkpoint commented out. I will check the results, then I'll write them here.
My question is: what is the purpose of the fine_tune_checkpoint option?
I've been reading around, and they say training can either be done from scratch or by updating the weights of a pretrained neural network. So basically what I think fine_tune_checkpoint does is point the training at a pretrained checkpoint, so that its weights are updated based on your custom data.
I'm not sure if I'm right...

Yes, you're correct.
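
That understanding can be illustrated with a small sketch in plain TensorFlow 1.x (not the Object Detection API's actual internals): fine-tuning means graph variables are seeded from a pretrained checkpoint instead of random initialization, and training then continues from those weights. All names below are made up for the example:

import os
import tempfile
import tensorflow as tf

ckpt_dir = tempfile.mkdtemp()

# 1) "Pretraining": create a variable and save it to a checkpoint.
with tf.Graph().as_default(), tf.Session() as sess:
    with tf.variable_scope("feat"):
        w = tf.get_variable("w", initializer=tf.ones([2, 2]))
    sess.run(tf.global_variables_initializer())
    prefix = tf.train.Saver().save(sess, os.path.join(ckpt_dir, "model.ckpt"))

# 2) "Fine-tuning": rebuild the graph and seed it from the checkpoint instead
#    of from random values; variable names must match or nothing is restored.
with tf.Graph().as_default(), tf.Session() as sess:
    with tf.variable_scope("feat"):
        w = tf.get_variable("w", shape=[2, 2])
    tf.train.init_from_checkpoint(prefix, {"feat/": "feat/"})
    sess.run(tf.global_variables_initializer())
    print(sess.run(w))  # the ones saved above, not random values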

@SirPhemmiey I found the problem. The config file I was using to train my own model was different from the configuration file of the pretrained model, the one I pointed fine_tune_checkpoint at. So I took that file, applied the proper modifications (paths, batch size, num_classes, ...), and now it's working properly! Thank you for your suggestion. @nika9774 I hope this info helps you.

That's a good one, man!
I'm glad you dug it out. Good luck with your ML project :)

Did anyone manage to solve this problem besides trying the solutions above?
I tried from_detection_checkpoint = False/True and fine_tune_checkpoint_type = "detection", but for several weight-config combinations I'm still facing this same problem. I don't have any more clues for solving it besides going deep into the code (which will certainly cost me a lot of time). Anyone who managed to solve this, please help. It's happening with so many combinations.

Parameters I tried:

# Helper around the legacy train.py. OD_PATH, train_path, pipeline_file_path,
# train_logs_file, pipeline_config, current_logs_dir, weight_file_name and
# pipeline_file_name are defined earlier in the full script.
import subprocess

def start_training(f):
    return subprocess.run(["python", f"{OD_PATH}/train.py",
                                "--train_dir", train_path,
                                "--pipeline_config_path", pipeline_file_path],
                                stdout=f, stderr=f)

with open(train_logs_file, "wb") as f:
    proc_result = start_training(f)
    # Retry with different checkpoint-related settings. This assumes
    # pipeline_config is re-serialized to pipeline_file_path between runs
    # (done elsewhere in the full script).
    if proc_result.returncode != 0:
        pipeline_config.train_config.from_detection_checkpoint = False
        proc_result = start_training(f)
    if proc_result.returncode != 0:
        pipeline_config.train_config.from_detection_checkpoint = True
        proc_result = start_training(f)
    if proc_result.returncode != 0:
        pipeline_config.train_config.fine_tune_checkpoint_type = "detection"
        proc_result = start_training(f)

with open(f"{current_logs_dir}/general-train-results.txt", "a") as f:
    if proc_result.returncode != 0:
        f.write(f"problem during train of model {weight_file_name} using configs {pipeline_file_name}\n")
    else:
        f.write(f"succesfully trained model {weight_file_name} using configs {pipeline_file_name}\n")

Combinations I checked so far:

succesfully trained model faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28 using configs faster_rcnn_inception_resnet_v2_atrous_coco.config
succesfully trained model faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco_2018_01_28 using configs faster_rcnn_inception_resnet_v2_atrous_coco.config
problem during train of model faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28 using configs faster_rcnn_inception_resnet_v2_atrous_cosine_lr_coco.config
problem during train of model faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco_2018_01_28 using configs faster_rcnn_inception_resnet_v2_atrous_cosine_lr_coco.config
problem during train of model faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28 using configs faster_rcnn_inception_resnet_v2_atrous_oid_v4.config
problem during train of model faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco_2018_01_28 using configs faster_rcnn_inception_resnet_v2_atrous_oid_v4.config
problem during train of model faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28 using configs faster_rcnn_inception_resnet_v2_atrous_oid.config
problem during train of model faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco_2018_01_28 using configs faster_rcnn_inception_resnet_v2_atrous_oid.config
succesfully trained model faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28 using configs faster_rcnn_inception_resnet_v2_atrous_pets.config
succesfully trained model faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco_2018_01_28 using configs faster_rcnn_inception_resnet_v2_atrous_pets.config
succesfully trained model faster_rcnn_inception_v2_coco_2018_01_28 using configs faster_rcnn_inception_resnet_v2_atrous_pets.config
succesfully trained model faster_rcnn_inception_v2_coco_2018_01_28 using configs faster_rcnn_inception_v2_coco.config
succesfully trained model faster_rcnn_inception_v2_coco_2018_01_28 using configs faster_rcnn_inception_v2_pets.config
succesfully trained model faster_rcnn_nas_coco_2018_01_28 using configs faster_rcnn_nas_coco.config
succesfully trained model faster_rcnn_nas_lowproposals_coco_2018_01_28 using configs faster_rcnn_nas_coco.config
problem during train of model faster_rcnn_resnet101_coco_2018_01_28 using configs faster_rcnn_resnet101_atrous_coco.config
problem during train of model faster_rcnn_nas_lowproposals_coco_2018_01_28 using configs faster_rcnn_resnet101_atrous_coco.config
problem during train of model faster_rcnn_resnet101_coco_2018_01_28 using configs faster_rcnn_resnet101_ava_v2.1.config
problem during train of model faster_rcnn_nas_lowproposals_coco_2018_01_28 using configs faster_rcnn_resnet101_ava_v2.1.config
succesfully trained model faster_rcnn_resnet101_coco_2018_01_28 using configs faster_rcnn_resnet101_coco.config
problem during train of model faster_rcnn_nas_lowproposals_coco_2018_01_28 using configs faster_rcnn_resnet101_coco.config
problem during train of model faster_rcnn_resnet101_coco_2018_01_28 using configs faster_rcnn_resnet101_fgvc.config
problem during train of model faster_rcnn_nas_lowproposals_coco_2018_01_28 using configs faster_rcnn_resnet101_fgvc.config
succesfully trained model faster_rcnn_resnet101_coco_2018_01_28 using configs faster_rcnn_resnet101_kitti.config
problem during train of model faster_rcnn_nas_lowproposals_coco_2018_01_28 using configs faster_rcnn_resnet101_kitti.config
succesfully trained model faster_rcnn_resnet101_coco_2018_01_28 using configs faster_rcnn_resnet101_pets.config
succesfully trained model faster_rcnn_nas_lowproposals_coco_2018_01_28 using configs faster_rcnn_resnet101_pets.config
succesfully trained model faster_rcnn_resnet101_coco_2018_01_28 using configs faster_rcnn_resnet101_voc07.config
problem during train of model faster_rcnn_nas_lowproposals_coco_2018_01_28 using configs faster_rcnn_resnet101_voc07.config
succesfully trained model faster_rcnn_resnet50_coco_2018_01_28 using configs faster_rcnn_resnet50_coco.config
succesfully trained model faster_rcnn_resnet50_lowproposals_coco_2018_01_28 using configs faster_rcnn_resnet50_coco.config
problem during train of model faster_rcnn_resnet50_coco_2018_01_28 using configs faster_rcnn_resnet50_fgvc.config
problem during train of model faster_rcnn_resnet50_lowproposals_coco_2018_01_28 using configs faster_rcnn_resnet50_fgvc.config
succesfully trained model faster_rcnn_resnet50_coco_2018_01_28 using configs faster_rcnn_resnet50_pets.config
succesfully trained model faster_rcnn_resnet101_coco_2018_01_28 using configs rfcn_resnet101_coco.config
succesfully trained model faster_rcnn_resnet101_lowproposals_coco_2018_01_28 using configs rfcn_resnet101_coco.config
succesfully trained model faster_rcnn_resnet101_coco_2018_01_28 using configs rfcn_resnet101_pets.config
succesfully trained model faster_rcnn_resnet101_lowproposals_coco_2018_01_28 using configs rfcn_resnet101_pets.config
succesfully trained model ssd_inception_v2_coco_2018_01_28 using configs ssd_inception_v2_coco.config
succesfully trained model ssd_inception_v2_coco_2018_01_28 using configs ssd_inception_v2_pets.config
problem during train of model ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync_2018_07_03 using configs ssd_mobilenet_v1_0.75_depth_300x300_coco14_sync.config
problem during train of model ssd_mobilenet_v1_0.75_depth_quantized_300x300_coco14_sync_2018_07_18 using configs ssd_mobilenet_v1_0.75_depth_quantized_300x300_coco14_sync.config
problem during train of model ssd_mobilenet_v1_0.75_depth_quantized_300x300_coco14_sync_2018_07_18 using configs ssd_mobilenet_v1_0.75_depth_quantized_300x300_pets_sync.config
problem during train of model ssd_mobilenet_v1_coco_2018_01_28 using configs ssd_mobilenet_v1_300x300_coco14_sync.config
succesfully trained model ssd_mobilenet_v1_coco_2018_01_28 using configs ssd_mobilenet_v1_coco.config
succesfully trained model ssd_mobilenet_v1_coco_2018_01_28 using configs ssd_mobilenet_v1_pets.config
problem during train of model ssd_mobilenet_v1_quantized_300x300_coco14_sync_2018_07_18 using configs ssd_mobilenet_v1_quantized_300x300_coco14_sync.config
succesfully trained model ssd_mobilenet_v2_coco_2018_03_29 using configs ssd_mobilenet_v2_coco.config
problem during train of model ssd_mobilenet_v2_coco_2018_03_29 using configs ssd_mobilenet_v2_fpnlite_quantized_shared_box_predictor_256x256_depthmultiplier_75_coco14_sync.config
problem during train of model ssd_mobilenet_v2_coco_2018_03_29 using configs ssd_mobilenet_v2_fullyconv_coco.config
problem during train of model ssd_mobilenet_v2_coco_2018_03_29 using configs ssd_mobilenet_v2_oid_v4.config
problem during train of model ssd_mobilenet_v2_coco_2018_03_29 using configs ssd_mobilenet_v2_pets_keras.config
problem during train of model ssd_mobilenet_v1_quantized_300x300_coco14_sync_2018_07_18 using configs ssd_mobilenet_v2_quantized_300x300_coco.config
problem during train of model ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03 using configs ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync.config
problem during train of model ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03 using configs ssdlite_mobilenet_edgetpu_320x320_coco_quant.config
problem during train of model ssd_mobilenet_v2_coco_2018_03_29 using configs ssdlite_mobilenet_edgetpu_320x320_coco.config
succesfully trained model ssdlite_mobilenet_v2_coco_2018_05_09 using configs ssdlite_mobilenet_v2_coco.config

I didn't check whether all these failures are due to the same error, but every log I checked (about 7 or 8) showed it.

If anyone is interested in the script I used to run these combinations, here it is. The combinations file.

Usage:

python train.py --input-images ./data/train/images/ --input-annotations ./data/train/annotations-4/ --num-steps 1 --config-weights-relation-file config-weights-relation.csv

I was getting this error and, like other people, I couldn't find an option to change from_detection_checkpoint to False/True, but I found instead that I had a checkpoint file in my training folder. When I deleted that checkpoint file the error went away, and a new one is created when you start training. Hope this helps someone else!
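
If you want to see what that stale file refers to before deleting it, the "checkpoint" file in the training folder is just a small state file listing checkpoint prefixes. A hedged sketch, assuming TensorFlow 1.x; the train_dir below is simply the one from this issue's command:

import tensorflow as tf

train_dir = "train_mydata/models/train"  # your --train_dir
state = tf.train.get_checkpoint_state(train_dir)
if state:
    # Prefixes the trainer will try to resume from; stale entries from an old
    # run can point at files that no longer exist.
    print(state.model_checkpoint_path)
    print(state.all_model_checkpoint_paths)
else:
    print("no checkpoint state file in", train_dir)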

