Facenet: pre-trained model unable to load because of tensorflow saver format :(

Created on 20 Apr 2017 · 14 Comments · Source: davidsandberg/facenet

When I try to load the model (with TensorFlow 1.0.0 or 1.0.1), I get the following message:

DataLossError (see above for traceback): Unable to open table file ../DATAS/20170216-091149/model-20170216-091149.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[Node: save/RestoreV2_278 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_278/tensor_names, save/RestoreV2_278/shape_and_slices)]]

My configuration:

Ubuntu Linux
Python 3.5
TensorFlow 1.0.0 and 1.0.1 (tried in separate conda envs)

Is there another check I should make (protobuf version)?

All 14 comments

How can I train a model setting the --pretrained_model parameter?

Same issue here.
I am using FloydHub to run my training with the following command.

floyd run --gpu --env tensorflow-1.0:py2 --data uGo93wTLTC6yyKb7c4W7Ri 'python facenet/src/facenet_train_classifier.py --data_dir /input/senadores --logs_base_dir /output/facenet_logs --models_base_dir /output/facenet_models --pretrained_model /input/20170216-091149/model.ckpt'

Before running I have uploaded my dataset and the pretrained model. For this I have used the floyd init and floyd upload commands.

I have also tried training with the original model file name (model-20170216-091149.ckpt-250000.data-00000-of-00001), as well as after renaming it to model.ckpt, just in case.

My training finishes successfully if I don't set the --pretrained_model parameter. I mean, the call below works just fine:

floyd run --gpu --env tensorflow-1.0:py2 --data uGo93wTLTC6yyKb7c4W7Ri 'python facenet/src/facenet_train_classifier.py --data_dir /input/senadores --logs_base_dir /output/facenet_logs --models_base_dir /output/facenet_models'

The only difference between the two training calls is that in the second one I didn't set the --pretrained_model parameter.

Here is the error:

INFO - W tensorflow/core/framework/op_kernel.cc:993] Data loss: Unable to open table file /input/20170216-091149/model.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

INFO - DataLossError (see above for traceback): Unable to open table file /input/20170216-091149/model.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

INFO - tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file /input/20170216-091149/model.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2017-04-28 11:13:00,100 INFO - [[Node: save/RestoreV2 = RestoreV2dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"]]
2017-04-28 11:13:00,101 INFO - [[Node: save/RestoreV2_86/_575 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_968_save/RestoreV2_86", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

So, how can I train a model with the --pretrained_model parameter set?

confirmed with v1.1.0

Hi,
I'm not sure which file you are trying to load. I recently tried loading a pretrained model with train_tripletloss.py and it worked out fine. I used
--pretrained_model ~/models/20170214-092102/model-20170214-092102.ckpt-80000
But it should be noted that there is no file named model-20170214-092102.ckpt-80000 in the 20170214-092102 directory, so Tensorflow adds .data-00000-of-00001 before restoring. These are the files in the directory:

-rwxrwxrwx 1 root root 96689276 feb 14 18:49 model-20170214-092102.ckpt-80000.data-00000-of-00001
-rwxrwxrwx 1 root root    22478 feb 14 18:49 model-20170214-092102.ckpt-80000.index
-rwxrwxrwx 1 root root 19991968 feb 14 09:29 model-20170214-092102.meta
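
To make the point above concrete, here is a minimal sketch (not part of facenet) of how the checkpoint prefix could be derived from a directory listing like this one. The helper name `checkpoint_prefixes` is hypothetical, and the sketch assumes the usual V2 checkpoint layout of one `.index` file plus one or more `.data-NNNNN-of-NNNNN` shards per checkpoint. It only inspects file names and does not call TensorFlow.

```python
import re

# Shard suffix used by data files in the V2 checkpoint layout,
# e.g. ".data-00000-of-00001" (assumed layout, see the note above).
_DATA_SUFFIX = re.compile(r"\.data-\d{5}-of-\d{5}$")


def checkpoint_prefixes(filenames):
    """Return prefixes that have both an .index file and at least one data shard.

    Hypothetical helper: the returned prefix is the kind of value that would be
    passed as --pretrained_model, without the shard suffix.
    """
    index_prefixes = {
        name[: -len(".index")] for name in filenames if name.endswith(".index")
    }
    data_prefixes = {
        _DATA_SUFFIX.sub("", name) for name in filenames if _DATA_SUFFIX.search(name)
    }
    return sorted(index_prefixes & data_prefixes)


# File names taken from the listing above.
files = [
    "model-20170214-092102.ckpt-80000.data-00000-of-00001",
    "model-20170214-092102.ckpt-80000.index",
    "model-20170214-092102.meta",
]
print(checkpoint_prefixes(files))
```

The `.meta` file is ignored on purpose: it has a different base name and is not the value to pass when restoring weights.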

@davidsandberg
Hi David,
When I try to load the pretrained model, I find three sets of similar files. I'm not sure which one I should load. Can you give me some idea?

image

Many thanks.

If you look at the filenames, I believe that you are seeing checkpoints - the model state at different points during training. Choosing the latest is probably a good idea - so model-20171104-092733.ckpt-151000.
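
As an illustration of "choosing the latest", the sketch below picks the checkpoint prefix with the highest step number from a list of file names. The helper `latest_prefix` is hypothetical, and every file name in the example other than model-20171104-092733.ckpt-151000 is made up for the demonstration. It assumes step numbers appear after `.ckpt-` and compares them numerically, so step 151000 beats step 99000 even though it would sort earlier as a string.

```python
import re

# Captures "<anything>.ckpt-<step>" from an ".index" file name.
_INDEX_NAME = re.compile(r"^(?P<prefix>.+\.ckpt-(?P<step>\d+))\.index$")


def latest_prefix(filenames):
    """Return the checkpoint prefix with the highest step, or None if none match."""
    best = None  # (step, prefix) of the best candidate seen so far
    for name in filenames:
        match = _INDEX_NAME.match(name)
        if match is None:
            continue
        step = int(match.group("step"))
        if best is None or step > best[0]:
            best = (step, match.group("prefix"))
    return best[1] if best else None


# Hypothetical listing; only the 151000 checkpoint is named in the thread.
files = [
    "model-20171104-092733.ckpt-99000.index",
    "model-20171104-092733.ckpt-150000.index",
    "model-20171104-092733.ckpt-151000.index",
    "model-20171104-092733.meta",
]
print(latest_prefix(files))
```

If the directory also contains a `checkpoint` state file, TensorFlow 1.x's `tf.train.latest_checkpoint(directory)` may be a simpler option, since it reads that file instead of parsing names.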

Hello, I'm using the 20180402-114759 model, and I got the same issue.
I'm sure that my path is right
--pretrained_model /home/constantine/DataDisk/models/facenet/20180402-114759/model-20180402-114759.ckpt-275.data-00000-of-00001
and I get this error:
DataLossError (see above for traceback): Unable to open table file /home/constantine/DataDisk/models/facenet/20180402-114759/model-20180402-114759.ckpt-275.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator? [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]] [[Node: save/RestoreV2/_907 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_306_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
Does anyone know how to fix this?

@StepOITD Did you find a solution for this issue? I get the same error when I try to export my trained model following the script here.

found a modified version in PR, worked

Hi @StepOITD
I am getting the same error. Please help me figure it out.

found a modified version in PR, worked

@StepOITD, could you please post how you solved the issue? I am getting the same error when I have the model files for multiple checkpoints in the same directory.

@anubhav0fnu I tried with the 2018 model version, restoring model-20180408-102900.ckpt-90.data-00000-of-00001. I just passed the argument --pretrained_model model-20180408-102900.ckpt-90, and it works for me. I read a topic on Stack Overflow which said that the restore function of tf.train.Saver adds ".data-00000-of-00001" to the end of the model path automatically.

As @theiron97 said, you don't need to append ".data-00000-of-00001" to the file name. Just provide the file name without that part.
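
For anyone who already has a full data-file path on hand, the fix described above can be sketched as a small normalization step. The helper `to_checkpoint_prefix` is hypothetical and not part of facenet or TensorFlow; it assumes the suffixes to strip are `.index` or `.data-NNNNN-of-NNNNN` and leaves any other path unchanged.

```python
import re

# Suffixes that name one physical file of a checkpoint rather than the checkpoint itself.
_FILE_SUFFIX = re.compile(r"\.(?:data-\d{5}-of-\d{5}|index)$")


def to_checkpoint_prefix(path):
    """Strip a trailing .index or .data-NNNNN-of-NNNNN so only the prefix remains."""
    return _FILE_SUFFIX.sub("", path)


# Path reported earlier in this thread as failing with DataLossError.
bad = (
    "/home/constantine/DataDisk/models/facenet/20180402-114759/"
    "model-20180402-114759.ckpt-275.data-00000-of-00001"
)
print(to_checkpoint_prefix(bad))
```

The printed value is what would then be passed to --pretrained_model. A path that is already a prefix passes through untouched, so the helper is safe to apply unconditionally.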

I love you dude <3
It worked for me

Helped me!!! Thank y'all
