I have a couple of questions on serving models in production.
Thanks for any inputs.
In the case of a complex model, loading from JSON can end up taking a while (hundreds of seconds), given that there's a model.compile() happening in there. Now, if you have one model but different versions of it - say, trained on datasets d1, d2, ..., dn - what would be the best way to maintain these in memory? A deep copy doesn't work (recursion depth exceeded, for Theano-backed models), so perhaps a copy.copy() is the best option for now.
If the models are identical and only the weights differ, you could do a deep copy (raise the recursion depth limit to be able to do it) and then set different weights on the copied model.
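A rough sketch of that suggestion (untested, and with hypothetical weight files weights_d1.h5 ... weights_dn.h5; a deepcopy of a compiled model may still fail depending on the Keras version and backend):

    import sys
    import copy

    sys.setrecursionlimit(50000)  # deepcopy of a compiled model recurses deeply

    # `model` is the single architecture, loaded from JSON and compiled once.
    models_by_dataset = {}
    for name in ('d1', 'd2', 'dn'):             # placeholder dataset names
        m = copy.deepcopy(model)                # same architecture, same compilation
        m.load_weights('weights_%s.h5' % name)  # weights trained on that dataset
        models_by_dataset[name] = m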
Are there any thoughts on utilizing TensorFlow Serving for something like this with Keras models? And has any thought been given to integration, or examples of how one might do that, given that TensorFlow Serving supports this kind of use case?
With the TF backend, after building a model you can access the TF
computation graph (model.get_output / model.get_input) and then work in TF
from then on. This is useful for production deployment. Quite a few people
are doing this already.
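A minimal sketch of that idea (assuming the TF backend and a single-input, single-output model; x_batch is just a placeholder for your input array):

    from keras import backend as K

    # Grab the session Keras is using and the model's graph tensors.
    sess = K.get_session()
    input_tensor = model.input    # model.get_input() in older Keras versions
    output_tensor = model.output  # model.get_output() in older Keras versions

    # From here on you are in plain TensorFlow: feed the input tensor, fetch
    # the output tensor, and force the learning phase to "test" so dropout
    # and batch norm behave correctly at inference time.
    preds = sess.run(output_tensor,
                     feed_dict={input_tensor: x_batch,
                                K.learning_phase(): 0})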
What helps me usually (especially if the incoming batch sizes are small) is to just compile the output function.
get_output = K.function([nn.inputs['input_1'].input, nn.inputs['input_2'].input], [nn.outputs['output'].get_output(train=False)])
And then just call get_output. This takes me 4 seconds as opposed to the usual 700.
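For models built with the functional or Sequential API (where the old Graph-style nn.inputs / get_output() don't exist), a rough equivalent might look like this, assuming a single-output model and x_batch as a placeholder input array:

    from keras import backend as K

    # Compile only the forward pass instead of paying for a full model.compile().
    predict_fn = K.function(model.inputs + [K.learning_phase()], model.outputs)

    preds = predict_fn([x_batch, 0])[0]  # 0 = test phase (dropout disabled)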
@fchollet With the TF backend - you mention that people are accessing the TF computation graph (the model's inputs and outputs) for production systems. Do you have any example code for this?
@lemuriandezapada can you explain what you're doing there? I don't see a get_output() on the outputs of my model.
I'm also looking for such examples. @Froskekongen, did you get it working? @fchollet, could you provide any example code? Thanks a lot!
I'm also struggling with exporting my Keras model as an input for TensorFlow Serving.
I found a few examples based on the Keras tutorial (e.g. here).
But it seems that SessionBundle will be deprecated, and TensorFlow Serving is instead using SavedModel from TF 1.0 onwards.
Has anyone had any success exporting a Keras model to a TF SavedModel?
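In case it helps, a sketch of the SavedModel route with the TF 1.x builder API (untested here; export_dir is a placeholder for a new, empty directory, and ideally the learning phase is set before the model is built or loaded):

    import tensorflow as tf
    from keras import backend as K

    K.set_learning_phase(0)  # export in inference mode (dropout / batch norm frozen)
    sess = K.get_session()   # the session that holds the model's variables

    builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
    signature = tf.saved_model.signature_def_utils.predict_signature_def(
        inputs={'input': model.input},
        outputs={'output': model.output})

    builder.add_meta_graph_and_variables(
        sess,
        tags=[tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
                signature})
    builder.save()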
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.