Bert: Serving fine-tuned Model - best solution

Created on 11 Jul 2019 · 7 Comments · Source: google-research/bert

What is the best way to serve a fine-tuned model and return predictions?

All 7 comments

This is a duplicate of issue #679.

Thanks Jay for the response. However, bert-as-service only encodes sentences. I've read their documentation and have it running, and it does not do any prediction.

Although it might not be the most efficient method, I find that wrapping the prediction in a Flask API works quite well. It goes something like this:

First export your model after training:
estimator.export_saved_model(model_dir, serving_input_receiver_fn)

Then load your model in the API using something along the lines of:

from tensorflow.contrib import predictor
predict_fn = predictor.from_saved_model(model_dir)
result = predict_fn(...)

Now you can use predict_fn to serve predictions. I have a rough implementation that I could share if you need it :)
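
To make the two steps above concrete, here is a minimal sketch of the request-handling part, kept separate from Flask and TensorFlow so it can be run with a stub. Several things here are assumptions that do not come from this thread: the feature names input_ids, input_mask and segment_ids (they must match whatever your serving_input_receiver_fn declares), the helper name handle_predict, and the /predict route in the commented wiring. Also note that tensorflow.contrib only exists in TensorFlow 1.x, and that export_saved_model writes into a timestamped subdirectory and returns its path, so that returned path is what from_saved_model should be given.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

# Assumed feature names; replace with the keys your own
# serving_input_receiver_fn declared at export time.
FEATURE_KEYS = ("input_ids", "input_mask", "segment_ids")

PredictFn = Callable[[Dict[str, List[List[int]]]], Dict[str, Any]]


def handle_predict(body: bytes, predict_fn: PredictFn) -> Tuple[Dict[str, Any], int]:
    """Validate a JSON request body and run it through predict_fn.

    Returns (response_dict, http_status) so it can be plugged into any
    web framework.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        return {"error": "request body is not valid JSON"}, 400
    if not isinstance(payload, dict):
        return {"error": "request body must be a JSON object"}, 400

    missing = [key for key in FEATURE_KEYS if key not in payload]
    if missing:
        return {"error": "missing features: " + ", ".join(missing)}, 400

    features = {key: payload[key] for key in FEATURE_KEYS}
    outputs = predict_fn(features)

    # Predictor outputs are usually NumPy arrays, which are not JSON
    # serializable; convert them to plain lists when possible.
    response = {
        name: (value.tolist() if hasattr(value, "tolist") else value)
        for name, value in outputs.items()
    }
    return response, 200


# Wiring into Flask (not executed here; needs flask and TensorFlow 1.x):
#
#   from flask import Flask, request, jsonify
#   from tensorflow.contrib import predictor
#
#   app = Flask(__name__)
#   # export_path: the directory returned by estimator.export_saved_model
#   predict_fn = predictor.from_saved_model(export_path)  # load once at startup
#
#   @app.route("/predict", methods=["POST"])  # hypothetical route name
#   def predict():
#       response, status = handle_predict(request.get_data(), predict_fn)
#       return jsonify(response), status
```

Keeping the validation in a plain function means it can be tested with a fake predict_fn before the real model is loaded. Tokenization is left out on purpose: this sketch expects the client to send already-tokenized ids.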

@sarnikowski I am actually using Flask to serve predictions; however, it's very slow. Is it possible for you to share the code with me?
TIA!
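
The thread does not say why this Flask setup is slow. One cause worth ruling out, offered only as a guess, is that the model gets reloaded on every request, for example by calling from_saved_model or estimator.predict inside the request handler. The sketch below shows one way to make sure loading happens once per process. The class name LazyPredictor and the loader argument are hypothetical; in a real app the loader would be something like a function that calls predictor.from_saved_model.

```python
import threading
from typing import Any, Callable, Optional


class LazyPredictor:
    """Wrap a model loader so the model is loaded once, on first use."""

    def __init__(self, loader: Callable[[], Callable[[Any], Any]]) -> None:
        self._loader = loader
        self._predict_fn: Optional[Callable[[Any], Any]] = None
        self._lock = threading.Lock()
        self.load_count = 0  # exposed only to make the behaviour observable

    def __call__(self, features: Any) -> Any:
        # Double-checked locking: the lock is taken only until the
        # model has been loaded, then requests skip it entirely.
        if self._predict_fn is None:
            with self._lock:
                if self._predict_fn is None:
                    self._predict_fn = self._loader()
                    self.load_count += 1
        return self._predict_fn(features)
```

Create one LazyPredictor at module level and call it from the request handler. If latency is still high after that, the cost is probably in the model's forward pass itself rather than in Flask.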

For anyone else interested in this, I wrote a rough implementation and made it available here: https://github.com/sarnikowski/bert_in_a_flask

Maybe check this out if you are looking to serve a fine-tuned BERT model:
BERT Serving and Inferencing from fine-tuned
