Hello all,
I am a newbie in TensorFlow. I am using TensorFlow 1.2 and want to train on my own data, using the pre-trained model ssd_mobilenet_v1_coco. At training time I get this error:
tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file /usr/local/lib/python2.7/dist-packages/tensorflow/models/model/model.ckpt.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
I read some related Stack Overflow posts, which suggested adding these lines:
saver = tf.train.Saver(write_version=saver_pb2.SaverDef.V1)
saver.save(sess, "./model.ckpt", global_step=step)
But I am not sure where to add these lines in saver.py, or whether I need to remove any lines from that file.
Any help solving this problem would be appreciated.
Thanks in advance.
Can you fill out the new issue template? In particular, are you running your own code or code from the repo?
Hi skye,
I followed Dat Tran's object detection blog and downloaded the image dataset and .xml files from his repo. I followed the exact instructions given in the blog, except that I tried to train the model locally. That is where I got the error I mentioned.
Thank You.
I got the same error and am also waiting for some advice.
Actually I am reading that too.
Just specify model.ckpt in the .config file.
model.ckpt is just a renaming of the model.ckpt.data-00000-of-00001 file.
But the problem is that model.ckpt.data-00000-of-00001 is in the old format, and TensorFlow 1.2 object detection requires the new format.
Unfortunately, a new-format model.ckpt.data-00000-of-00001 is not available yet.
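One clarification that keeps coming up in this thread: "model.ckpt" is not a separate file at all, it is the common prefix shared by the .data-NNNNN-of-NNNNN, .index, and .meta files, and the prefix is what restore operations (and fine_tune_checkpoint) expect. A minimal pure-Python sketch of that mapping (the helper name is mine, not a TensorFlow API):

```python
import re

def checkpoint_prefix(filename):
    """Strip the checkpoint suffix (.data-NNNNN-of-NNNNN, .index, or
    .meta) from a TensorFlow checkpoint file name, returning the shared
    prefix that restore operations expect. Hypothetical helper for
    illustration only."""
    return re.sub(r"\.(?:data-\d{5}-of-\d{5}|index|meta)$", "", filename)

print(checkpoint_prefix("model.ckpt.data-00000-of-00001"))  # model.ckpt
print(checkpoint_prefix("model.ckpt-4513245.index"))        # model.ckpt-4513245
```

So passing the .data-00000-of-00001 file directly makes the restore code try to parse a single shard as a whole checkpoint, which is exactly the "not an sstable (bad magic number)" failure above.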
I am not very familiar with the object detection model or blog. Can you post the exact instructions for reproducing this problem? (This is also requested in the issue template.)
Hi,
I am running into the same problem.
I am following this link https://github.com/tensorflow/models/blob/master/object_detection/g3doc/running_locally.md
to retrain the object detection model. I downloaded "rfcn_resnet101_coco" from here, which gave me these files:
- frozen_inference_graph.pb
- graph.pbtxt
- model.ckpt.data-00000-of-00001
- model.ckpt.meta
- model.ckpt.index
As suggested in the documentation, fine_tune_checkpoint should be a path like "/usr/home/username/checkpoint/model.ckpt-#####". But if I set fine_tune_checkpoint=file_path/model.ckpt.data-00000-of-00001, it throws this:
tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file ~/Documents/trained_models/faster_rcnn_resnet101_coco_11_06_2017/model.ckpt.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
I am using TensorFlow 1.3.0 on Ubuntu 16.04 with cuDNN 5.1.
Any help appreciated.
Thanks
Just giving the path fine_tune_checkpoint=file_path/model.ckpt worked for me.
Hi skulhare,
Which tensorflow version and OS are you using?
I already set fine_tune_checkpoint = ${path_to_model.ckpt}, but it did not work for me :(
I am using TensorFlow 1.3.0 on Ubuntu 16.04 with cuDNN 5.1.
Where did you get the model ckpt file from?
@drpngx I have the same issue. I downloaded it from the model zoo (ssd_mobilenet_v1_coco).
@suharshs reassigning to you since it looks like you might have touched the model last?
I haven't touched the SSD models, so I am unfamiliar with this issue :( @iamtodor what is the exact command you are running so we can repro? Thanks
@suharshs Thank you for the attention. Here is my command:
python models/research/object_detection/train.py --logtostderr --pipeline_config_path=ssd_mobilenet/ssd.config --train_dir=data
My ssd.config.
Dir data contains following files:
$ ls data
class.pbtxt pipeline.config test.record train.record
Is there something else I could provide to you?
My files look like this:
$ ls $TRAIN_DIR
model.ckpt-4513245.data-00000-of-00001
model.ckpt-4513245.index
model.ckpt-4513245.meta
What works for me is to just set CKPT_FILE=${TRAIN_DIR}/model.ckpt-4513245
Then
python eval_image_classifier.py --alsologtostderr --checkpoint_path=${CKPT_FILE} --dataset_dir=${DATA_DIR} --dataset_name=mnist --dataset_split_name=test --model_name=lenet
works
Tensorflow-gpu 1.4, Ubuntu 16.04
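The step above (setting CKPT_FILE to the prefix with the highest step number) can be sketched as a small helper. This is a hypothetical illustration of the naming convention only; in real code, tf.train.latest_checkpoint does this job by reading the 'checkpoint' index file in the training directory:

```python
import os
import re

def latest_checkpoint(train_dir):
    """Return the checkpoint prefix with the highest global step in
    train_dir, e.g. '<train_dir>/model.ckpt-4513245', or None if no
    checkpoint files are found. Hypothetical helper for illustration;
    prefer tf.train.latest_checkpoint in practice."""
    steps = set()
    pattern = re.compile(r"model\.ckpt-(\d+)\.(?:data-\d{5}-of-\d{5}|index|meta)$")
    for name in os.listdir(train_dir):
        m = pattern.match(name)
        if m:
            # Collect the global step encoded in each file name.
            steps.add(int(m.group(1)))
    if not steps:
        return None
    return os.path.join(train_dir, "model.ckpt-%d" % max(steps))
```

Note that the returned value is again a prefix, not an existing file name, which is exactly what --checkpoint_path expects.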
@rajvi3105 Please consider using the latest checkpoint files from the detection model zoo and the steps in the blog post we published recently.
Closing this issue; feel free to reopen if you encounter any issues with checkpoint files.
ssd_mobilenet_v1_coco_2018_01_28 doesn't include the checkpoints?
> model.ckpt is just a renaming of the model.ckpt.data-00000-of-00001 file.
> But the problem is that model.ckpt.data-00000-of-00001 is in the old format, and TensorFlow 1.2 object detection requires the new format.
> Unfortunately, a new-format model.ckpt.data-00000-of-00001 is not available yet.
What is the situation now? Is this issue resolved? - Thanks in advance
As skulhare mentioned, you just need to reference the pre-trained model file as "model.ckpt", not "model.ckpt.data-00000-of-00001" (even though the actual file name on disk is "model.ckpt.data-00000-of-00001"). So, for instance, if you're following the object detection tutorial, in the ssd_inception_v2_coco.config file:
fine_tune_checkpoint: "../training_demo/pre-trained-model/ssd_inception_v2_coco_2018_01_28/model.ckpt"
Hi, I am using checkpoint files in this path:
models/conv3d_sep2/
|---> conv3d_sep2-00000005.data-00000-of-00001
|---> conv3d_sep2-00000005.index
|---> conv3d_sep2-00000005.meta
But I am getting this error again.
Why am I getting this error even though the checkpoint files are there? Please help me with any suggestions.
I tried training in Google Colab:
python train.py --train_dir=training/ --pipeline_config_path=ssd_mobilenet_v2_quantized_300x300_coco.config
but I am getting the following error:
Failed to get matching files on /content/models-master/research/object_detection/ssd_inception_v2_coco_2018_01_28/model.ckpt: Not found: /content/models-master/research/object_detection/ssd_inception_v2_coco_2018_01_28; No such file or directory
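This "Failed to get matching files" case is a different failure from the "bad magic number" one: here the directory itself is missing. One way to catch it before launching training is to check that the prefix actually points at checkpoint files on disk. A small sketch (the helper is mine; the .index file is a convenient witness, since every V2 checkpoint has one):

```python
import os

def checkpoint_exists(prefix):
    """fine_tune_checkpoint is a *prefix*, not a file: the files on disk
    carry suffixes like .index and .data-00000-of-00001. Checking for
    the .index file catches both a wrong directory and a wrong prefix.
    Hypothetical helper for illustration only."""
    return os.path.isfile(prefix + ".index")
```

If this returns False for the /content/... path in your config, the download or extraction step in Colab did not put the files where the config expects them.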