Keras: Loading a TensorFlow checkpoint and turning it into a Keras model

Created on 5 Feb 2017 · 23 Comments · Source: keras-team/keras

Consider the process of loading a TF checkpoint in TF:

import tensorflow as tf

# Open a session, initialize all variables, then overwrite them with the
# values stored in the checkpoint.
self.sess = tf.InteractiveSession()
self.sess.run(tf.initialize_all_variables())
saver = tf.train.Saver(tf.all_variables())
saver.restore(self.sess, tf_ckpt_path)

and then I am able to do self.sess.run(stuff), etc.

The problem is that if I do the above in a Jupyter notebook with Keras, I am not really building a Keras model, but am still using a TF model.

How can I change this model into a Keras model (without building the model from scratch again), such that I can follow the directions in https://github.com/transcranial/keras-js to run the model in Keras.js (which is ultimately what I'd hope to do)? Thanks.

Most helpful comment

Is it possible in 2020?

All 23 comments

+1
Hi @fchollet, I have the same question; it would be nice to be able to convert trained TensorFlow models into Keras models, in order to subsequently export the model weights and architecture.
Please look into this.
Thanks.

+1

+1 :)

+1

And vice versa. From Keras to TensorFlow is also important.


Updated:
(see https://github.com/fchollet/keras/pull/6074)

I wonder if someone can take a look at this, as many people are calling for this feature... Thanks.

+1

Recently, I have run into this problem as well.

+1

I want a little more :-) After I have it in Keras, I want to save it in HDF5 format so I can load it in deeplearning4j (and execute it in a "real world application" with Java...).

How can I change this model into a Keras model (without building the model from scratch again)

Fundamentally, you cannot "turn an arbitrary TensorFlow checkpoint into a Keras model".

What you can do, however, is build an equivalent Keras model then load into this Keras model the weights contained in a TensorFlow checkpoint that corresponds to the saved model. In fact this is how the pre-trained InceptionV3 in Keras was obtained.

For instance, you can take a TensorFlow checkpoint that implements VGG16, then build the same VGG16 model in Keras and load the weights from the TensorFlow checkpoint.

It's not always easy: it involves iterating over the variables in the checkpoint and transferring them to the Keras model using layer.load_weights(weights).
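A minimal sketch of that transfer, assuming a TF 1.x checkpoint whose variables are named `<layer>/weights` and `<layer>/biases` and a Keras model whose layer names match (that naming convention is an assumption about your model, not a general rule; note the sketch uses layer.set_weights, the setter a later comment points to):

import tensorflow as tf

# Sketch only: the checkpoint variable names below are an assumption.
reader = tf.train.NewCheckpointReader(tf_ckpt_path)  # tf_ckpt_path as above

for layer in keras_model.layers:  # keras_model: your equivalent Keras model
    w_key = layer.name + '/weights'
    b_key = layer.name + '/biases'
    if reader.has_tensor(w_key) and reader.has_tensor(b_key):
        # set_weights expects a list of numpy arrays: [weights, biases]
        layer.set_weights([reader.get_tensor(w_key),
                           reader.get_tensor(b_key)])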

From Keras to TensorFlow is also important.

Turning a Keras model into a TensorFlow checkpoint is easy: a Keras model built with the TF backend is already a TF graph, and you can just save the current TF graph to a TF checkpoint the way you normally would. Maybe see https://blog.keras.io/keras-as-a-simplified-interface-to-tensorflow-tutorial.html
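As a minimal sketch (assuming Keras on the TF 1.x backend; VGG16 here is just a stand-in for any Keras model):

import tensorflow as tf
import keras.backend as K
from keras.applications import VGG16

model = VGG16(weights='imagenet')  # a Keras model is already a TF graph

# Grab the TF session Keras is using and save it as a normal TF checkpoint.
sess = K.get_session()
saver = tf.train.Saver()
saver.save(sess, './vgg16.ckpt')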

@fchollet Could you write something about this topic, or add examples, in the Keras documentation? We all think this would be very useful for people coming to Keras from TensorFlow. Thanks!

Say I built a model using Keras, took its tensors (model.inputs and model.outputs), and trained the parameters in TF. Now I want to get that Keras model with the trained weights. What's the best way to do that?

What I'm unsuccessfully trying to do is this (pseudo-operational code):

import tensorflow as tf

with tf.Graph().as_default():
    input_tensors = build_the_input_tensors()  # pseudo helper functions
    model, _ = build_my_favorite_keras_model(input_tensors, **params)
    with tf.Session() as S:
        tf.train.Saver().restore(S, checkpoint)
        print(S.run(model.trainable_weights[0]))  # p1: the restored value
        model.save(modelfile)
        print(S.run(model.trainable_weights[0]))  # p2: the value after save
# the output is
# > p1 = 0.02982 as it should be
# > p2 = 0.05    as it shouldn't be

Interestingly, model.save seems to (re)initialize values before it saves them. How can this be avoided?

I figured it out. model.save calls get_session in tensorflow_backend.py (line 2143), which runs _initialize_variables() unless the _MANUAL_VAR_INIT flag is set. We need to toggle that switch:

import keras.backend as K
K.manual_variable_initialization(True)
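For the example above, a sketch of the working order of operations (the builder functions are the same hypothetical ones as before); the flag must be set before any variables are created:

import tensorflow as tf
import keras.backend as K

K.manual_variable_initialization(True)  # set before building the model

with tf.Graph().as_default():
    input_tensors = build_the_input_tensors()  # hypothetical, as above
    model, _ = build_my_favorite_keras_model(input_tensors, **params)
    with tf.Session() as S:
        K.set_session(S)  # make Keras use this session
        tf.train.Saver().restore(S, checkpoint)
        model.save(modelfile)  # restored weights are saved, not re-initialized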

I have run into the same problem recently.

I am using tf.keras to create the layers for the model and then using TensorFlow to perform the training and save the checkpoints.

If I try to do model.save('test.h5') after the session finishes, I get an error that the graph cannot be modified once training has started. It seems like there should be a way to do this. Would it be possible for tf.train.Saver to save BOTH the Keras model (HDF5 format) and the TF model (.pb format)?

The correct API is layer.set_weights(weights)

Since it was not clear to me, I created a small example of how to load a TensorFlow checkpoint, build an equivalent Keras model, and load the TensorFlow weights into that model. See https://stackoverflow.com/a/53638524/2135504

How can I convert a TensorFlow model to a tf.keras model?

Is this possible in 2019?

Is it possible in 2020?

For instance, you can take a TensorFlow checkpoint that implements VGG16, then build the same VGG16 model in Keras and load the weights from the TensorFlow checkpoint.

It's not always easy: it involves iterating over the variables in the checkpoint and transferring them to the Keras model using layer.load_weights(weights). (emphasis mine)

It isn't always easy, but I did it for the VGGish model and I want to share a few hints on the steps. This is just for iterating over the variables in the checkpoint and transferring them to the equivalent model. I assume you've already built that equivalent model in Keras:

  1. This will show you the names of all the variables in a checkpoint and their weight/bias shapes (adapted from Stack Overflow):
from tensorflow.python import pywrap_tensorflow
import os

checkpoint_path = os.path.join(model_dir, "model.ckpt")
reader = pywrap_tensorflow.NewCheckpointReader(checkpoint_path)
var_to_shape_map = reader.get_variable_to_shape_map()

for key in var_to_shape_map:
    print("tensor_name: ", key)
    print(reader.get_tensor(key).shape) # Remove this if you want to print only variable names

example output:

tensor_name: vggish/conv1/biases
(64,)
tensor_name: vggish/conv1/weights
(3, 3, 1, 64)
tensor_name: vggish/conv2/biases
(128,)
tensor_name: vggish/conv2/weights
(3, 3, 64, 128)
tensor_name: vggish/conv3/conv3_1/biases
(256,)
tensor_name: vggish/conv3/conv3_1/weights
(3, 3, 128, 256)
...

  2. Look at the list of layers and shapes in your Keras model:
model.summary()

example output:

Model: "MWE"


Layer (type) Output Shape Param #

inputs (InputLayer) (None, 96, 64, 1) 0


conv1 (Conv2D) (None, 96, 64, 64) 640


pool1 (MaxPooling2D) (None, 48, 32, 64) 0


conv2 (Conv2D) (None, 48, 32, 128) 73856


pool2 (MaxPooling2D) (None, 24, 16, 128) 0


conv3_1 (Conv2D) (None, 24, 16, 256) 295168
...
  3. Identify equivalent layer names and set weights layer by layer. It helps if your Keras layers are named similarly to the TF checkpoint layers so that you can loop through (see the sketch at the end of this comment). Here I just demonstrate setting weights on one layer: vggish/conv3/conv3_1 in the TF checkpoint file, or conv3_1 as I've named it in Keras:
* Note: it helps to check that the sizes are what you expect. vggish/conv3/conv3_1/weights has 3x3x128x256=294912 parameters, and vggish/conv3/conv3_1/biases has 256. Add them together and it equals 295168, the number of parameters in the equivalent Keras layer. Good stuff.

# Pull the weight and bias tensors for one layer out of the checkpoint...
weights_key = 'vggish/conv3/conv3_1/weights'
bias_key = 'vggish/conv3/conv3_1/biases'
weights = reader.get_tensor(weights_key)
biases = reader.get_tensor(bias_key)
# ...and copy them into the matching Keras layer.
model.get_layer('conv3_1').set_weights([weights, biases])

You can see here that Keras's model.get_layer(<name>).set_weights takes a list of NumPy arrays, weights first and biases second.
Warning: it would make sense to check the shapes of model.get_layer('conv3_1').get_weights() to make sure you are passing weights and biases in the correct order. Be aware, though, that if the layer still has its default initializer, calling get_weights() forces TF to compute a lot of random initial values, which gave me a memory error; setting weights, by contrast, is fast.
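To extend step 3 to all layers, a sketch like the following can work; the name_map dictionary is hand-written for the VGGish checkpoint layout above and is an assumption, not part of any API:

# Map checkpoint scopes to Keras layer names (hand-written; extend as needed).
name_map = {
    'vggish/conv1': 'conv1',
    'vggish/conv2': 'conv2',
    'vggish/conv3/conv3_1': 'conv3_1',
    # ... remaining layers
}

# reader is the NewCheckpointReader from step 1.
for scope, layer_name in name_map.items():
    weights = reader.get_tensor(scope + '/weights')
    biases = reader.get_tensor(scope + '/biases')
    model.get_layer(layer_name).set_weights([weights, biases])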
