Facenet: pre-trained model unable to load because of tensorflow saver format :(

Created on 20 Apr 2017 · 14 Comments · Source: davidsandberg/facenet

When I try to load the model (with TensorFlow 1.0.0 or 1.0.1), I get the following message:

DataLossError (see above for traceback): Unable to open table file ../DATAS/20170216-091149/model-20170216-091149.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[Node: save/RestoreV2_278 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_278/tensor_names, save/RestoreV2_278/shape_and_slices)]]

My configuration:

Ubuntu Linux
Python 3.5
Tensorflow 1.0.0 and 1.0.1 (tried in conda separate envs)

Is there another check to make (protobuf version)?

tbarnier

👍9 🎉1

Most helpful comment

Hi,
I'm not sure which file you are trying to load. I recently tried loading a pretrained model with train_tripletloss.py and it worked fine. I used
--pretrained_model ~/models/20170214-092102/model-20170214-092102.ckpt-80000
Note that there is no file named model-20170214-092102.ckpt-80000 in the 20170214-092102 directory, so TensorFlow adds .data-00000-of-00001 before restoring. These are the files in the directory:

-rwxrwxrwx 1 root root 96689276 feb 14 18:49 model-20170214-092102.ckpt-80000.data-00000-of-00001
-rwxrwxrwx 1 root root    22478 feb 14 18:49 model-20170214-092102.ckpt-80000.index
-rwxrwxrwx 1 root root 19991968 feb 14 09:29 model-20170214-092102.meta

davidsandberg on 7 May 2017

👍12 🎉2 🚀1 ❤1
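The point made in the comment above, that the value passed to --pretrained_model is a prefix shared by the .index and .data-* files rather than a file that exists on disk, can be sketched in a few lines of Python. This is only an illustration: the helper name checkpoint_prefixes is hypothetical and is not part of facenet or TensorFlow.

```python
import os


def checkpoint_prefixes(model_dir):
    """Return the checkpoint prefixes found in model_dir, sorted by name.

    Hypothetical helper: a prefix is taken to be the name of an
    ".index" file with that extension removed.
    """
    prefixes = []
    for name in sorted(os.listdir(model_dir)):
        if name.endswith(".index"):
            # Drop ".index" to get the value to pass as --pretrained_model.
            prefixes.append(os.path.join(model_dir, name[: -len(".index")]))
    return prefixes
```

For the directory listed above, this should return a single path ending in model-20170214-092102.ckpt-80000, even though no file with exactly that name exists.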

All 14 comments

How can I train a model setting the --pretrained_model parameter?

Same issue here.
I am using floydhub to run my training with the following command.

floyd run --gpu --env tensorflow-1.0:py2 --data uGo93wTLTC6yyKb7c4W7Ri 'python facenet/src/facenet_train_classifier.py --data_dir /input/senadores --logs_base_dir /output/facenet_logs --models_base_dir /output/facenet_models --pretrained_model /input/20170216-091149/model.ckpt'

Before running I have uploaded my dataset and the pretrained model. For this I have used the floyd init and floyd upload commands.

Also, I have tried training with the original model name (model-20170216-091149.ckpt-250000.data-00000-of-00001), as well as after renaming it to model.ckpt, just in case.

My training finishes successfully if I don't set the --pretrained_model parameter. I mean, the call below works just fine:

floyd run --gpu --env tensorflow-1.0:py2 --data uGo93wTLTC6yyKb7c4W7Ri 'python facenet/src/facenet_train_classifier.py --data_dir /input/senadores --logs_base_dir /output/facenet_logs --models_base_dir /output/facenet_models'

The only difference between the two training calls is that in the second one I didn't set the --pretrained_model parameter.

Here is the error:

INFO - W tensorflow/core/framework/op_kernel.cc:993] Data loss: Unable to open table file /input/20170216-091149/model.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

INFO - DataLossError (see above for traceback): Unable to open table file /input/20170216-091149/model.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

INFO - tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file /input/20170216-091149/model.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2017-04-28 11:13:00,100 INFO - [[Node: save/RestoreV2 = RestoreV2dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"]]
2017-04-28 11:13:00,101 INFO - [[Node: save/RestoreV2_86/_575 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_968_save/RestoreV2_86", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

So, how can I train a model with the --pretrained_model parameter set?

fabriciosantana on 28 Apr 2017
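A quick way to see what kind of path is being passed to --pretrained_model is to check it against the files on disk. The sketch below is an illustration only: the function name and the returned labels are hypothetical, and the "single-file" case rests on the assumption that TensorFlow 1.x, when it finds no matching .index file, tries to read the path as an older single-file checkpoint, which would explain the "not an sstable" message for a renamed data file.

```python
import os
import re

# Matches the shard suffix seen in this thread, e.g. ".data-00000-of-00001".
_DATA_SHARD = re.compile(r"\.data-\d{5}-of-\d{5}$")


def classify_checkpoint_argument(path):
    """Roughly classify a value intended for --pretrained_model.

    Hypothetical labels:
      "prefix"         - path + ".index" exists, so path is a usable prefix
      "data-shard"     - path ends in ".data-NNNNN-of-NNNNN"
      "companion-file" - path points at the ".index" or ".meta" file
      "single-file"    - path is a plain file with no ".index" next to it
      "missing"        - nothing on disk matches
    """
    if os.path.isfile(path + ".index"):
        return "prefix"
    if _DATA_SHARD.search(path):
        return "data-shard"
    if path.endswith(".index") or path.endswith(".meta"):
        return "companion-file"
    if os.path.isfile(path):
        return "single-file"
    return "missing"
```

Under these assumptions, a data file renamed to model.ckpt would be reported as "single-file", and the full .data-00000-of-00001 name as "data-shard"; only a path reported as "prefix" is expected to restore.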

confirmed with v1.1.0

ShownX on 29 Apr 2017

@davidsandberg
Hi David,
When I try to load the pretrained model, I find there are 3 sets of similar data. I'm confused about which one I should load. Can you give me some idea?

Many thanks.

xmuszq on 5 Nov 2017

If you look at the filenames, I believe that you are seeing checkpoints - the model state at different points during training. Choosing the latest is probably a good idea - so model-20171104-092733.ckpt-151000.

EdwardDixon on 24 Nov 2017
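Picking the latest checkpoint by its step number, as suggested above, can be sketched as follows. This is a minimal illustration that only looks at file names; the function name is hypothetical, and it assumes checkpoints are named like the ones in this thread (something.ckpt-STEP.index). TensorFlow 1.x also provides tf.train.latest_checkpoint, which relies on a "checkpoint" state file being present in the directory.

```python
import re

# Captures the prefix and the training step from names such as
# "model-20171104-092733.ckpt-151000.index".
_INDEX_FILE = re.compile(r"^(?P<prefix>.+\.ckpt-(?P<step>\d+))\.index$")


def latest_checkpoint_prefix(filenames):
    """Return the prefix with the highest step number, or None if none match."""
    best_step, best_prefix = -1, None
    for name in filenames:
        match = _INDEX_FILE.match(name)
        if match and int(match.group("step")) > best_step:
            best_step = int(match.group("step"))
            best_prefix = match.group("prefix")
    return best_prefix
```

The returned value is the prefix to pass as --pretrained_model, without the .index or .data-* suffix.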

Hello, I'm using the 20180402-114759 model, and I got the same issue.
I'm sure that my path is right:
--pretrained_model /home/constantine/DataDisk/models/facenet/20180402-114759/model-20180402-114759.ckpt-275.data-00000-of-00001
and I get:
DataLossError (see above for traceback): Unable to open table file /home/constantine/DataDisk/models/facenet/20180402-114759/model-20180402-114759.ckpt-275.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
[[Node: save/RestoreV2/_907 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_306_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
Does anyone know how to fix this?

StepOITD on 8 May 2018

👍5

@StepOITD Did you find a solution for this issue? I get the same error when I try to export my trained model following the script here.

hamhochoi on 28 Jun 2018

Found a modified version in a PR; it worked.

StepOITD on 2 Jul 2018

👎4 😕1

Hi @StepOITD
I am getting the same error. Please help me figure it out.

sanjayanayak on 19 Jul 2018

Found a modified version in a PR; it worked.

@StepOITD, could you please post how you tackled the issue? I am getting the same error when I have the model files for multiple checkpoints in the same directory.

anubhav0fnu on 8 Oct 2018

@anubhav0fnu I tried the 2018 model version and wanted to restore model-20180408-102900.ckpt-90.data-00000-of-00001. I just passed the argument --pretrained_model model-20180408-102900.ckpt-90, and it works for me. I read on Stack Overflow that the restore function of tf.train.Saver adds "data-00000-of-00001" to the end of the model path automatically.

ndinhtuan on 21 Mar 2019

👍4 ❤1
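The fix described above amounts to dropping the file suffix before passing the path to --pretrained_model. A minimal sketch, assuming the suffixes look like the ones in this thread; the helper name to_checkpoint_prefix is hypothetical and not part of facenet or TensorFlow:

```python
import re

# Suffixes of the files seen in this thread:
# ".data-00000-of-00001" (weights shard) and ".index".
_CHECKPOINT_SUFFIX = re.compile(r"\.(?:data-\d{5}-of-\d{5}|index)$")


def to_checkpoint_prefix(path):
    """Strip a trailing ".data-NNNNN-of-NNNNN" or ".index" from path.

    Paths that are already a bare prefix are returned unchanged.
    """
    return _CHECKPOINT_SUFFIX.sub("", path)
```

For example, to_checkpoint_prefix("model-20180408-102900.ckpt-90.data-00000-of-00001") gives "model-20180408-102900.ckpt-90", which is the value reported to work above.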

As @theiron97 said, you don't need to append "data-00000-of-00001" to the file name. Just provide the file name without that part.

KanchanIIT on 25 Jun 2019

👍2

@anubhav0fnu I tried the 2018 model version and wanted to restore model-20180408-102900.ckpt-90.data-00000-of-00001. I just passed the argument --pretrained_model model-20180408-102900.ckpt-90, and it works for me. I read on Stack Overflow that the restore function of tf.train.Saver adds "data-00000-of-00001" to the end of the model path automatically.

I love you dude <3
It worked for me

Choapinus on 27 Jul 2019

@anubhav0fnu I tried the 2018 model version and wanted to restore model-20180408-102900.ckpt-90.data-00000-of-00001. I just passed the argument --pretrained_model model-20180408-102900.ckpt-90, and it works for me. I read on Stack Overflow that the restore function of tf.train.Saver adds "data-00000-of-00001" to the end of the model path automatically.

Helped me!!! Thank y'all