Serving: Auto-load a new model without stopping tf-serving

Created on 6 Mar 2019 · 4 Comments · Source: tensorflow/serving

My tf-serving starting command:

tensorflow_model_server --port=8002 --model_config_file=/home/serving/serving-bin/serving_models/config_files/attributes.conf --enable_batching=true --enable_model_warmup=true --per_process_gpu_memory_fraction=1

My config file (attributes.conf):

model_config_list: {
  config: {
     name: "model_1",
     base_path: "/home/serving/serving-bin/serving_models/atttributes/model_1/serving",
     model_platform: "tensorflow"
  },
}

TF-serving is already running with the _attributes.conf_ file defined above.
I have now updated the _attributes.conf_ file (i.e., added a new model _config_ to it).

My updated config file (attributes.conf):

model_config_list: {
  config: {
     name: "model_1",
     base_path: "/home/serving/serving-bin/serving_models/atttributes/model_1/serving",
     model_platform: "tensorflow"
  },
  config: {
     name: "model_2",
     base_path: "/home/serving/serving-bin/serving_models/atttributes/model_2/serving",
     model_platform: "tensorflow"
  },
}
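
If models are added often, the config file could be generated rather than edited by hand. The following is only a sketch: `render_model_config` is a hypothetical helper, not part of TensorFlow Serving, and it assumes every model uses the "tensorflow" platform and that names and paths contain no quotes, backslashes, or newlines.

```python
def render_model_config(models):
    """Render a model_config_list block from (name, base_path) pairs.

    Hypothetical helper for illustration; not a TensorFlow Serving API.
    """
    blocks = []
    for name, base_path in models:
        for value in (name, base_path):
            # Keep the sketch simple: refuse values that would need escaping.
            if '"' in value or "\\" in value or "\n" in value:
                raise ValueError(f"unsupported character in {value!r}")
        blocks.append(
            "  config: {\n"
            f'    name: "{name}",\n'
            f'    base_path: "{base_path}",\n'
            '    model_platform: "tensorflow"\n'
            "  }"
        )
    return "model_config_list: {\n" + ",\n".join(blocks) + "\n}\n"


if __name__ == "__main__":
    # Example paths are placeholders, not the ones from the question.
    print(render_model_config([
        ("model_1", "/models/model_1/serving"),
        ("model_2", "/models/model_2/serving"),
    ]))
```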

What I usually do is restart the tf-serving server by running the first command again.

But I would like to ask: is there a way for it to automatically detect changes to the _attributes.conf_ file and load just the newly added models, without affecting the ones already loaded? Or to continuously reload only the parts that have changed?

gr8Adakron

All 4 comments

@ewilderj @montanaflynn @lamberta @kchodorow I would be glad to get a solution ASAP!

gr8Adakron on 6 Mar 2019

Use the HandleReloadConfigRequest gRPC API:
https://github.com/tensorflow/serving/blob/b3fb70010eed896e6eefb2f431baa9624801711a/tensorflow_serving/apis/model_service.proto#L22

Sample:
https://github.com/tensorflow/serving/issues/765#issuecomment-451905019

This will not give you continuous polling, but you can poll from the outside and send this request each time your config changes. Will that work for you?

netfs on 7 Mar 2019
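
For readers who want to try the suggestion above, here is a minimal Python sketch of such a request. It is not the linked sample. It assumes the `grpcio`, `protobuf`, and `tensorflow-serving-api` packages are installed, that the server's gRPC port is 8002 as in the start command, and that an insecure channel is acceptable. The function name `reload_config` is made up for illustration. Note that the request carries the whole config, and the new config supersedes the old one, so a model omitted from it would be unloaded.

```python
def reload_config(config_path, target="localhost:8002", timeout_s=10.0):
    """Send the full config file at config_path to a running model server."""
    # Read first so a bad path fails before any network setup.
    with open(config_path, "r", encoding="utf-8") as f:
        config_text = f.read()

    # Imported here so the module loads even without the packages below
    # (assumed installed: grpcio, protobuf, tensorflow-serving-api).
    import grpc
    from google.protobuf import text_format
    from tensorflow_serving.apis import model_management_pb2
    from tensorflow_serving.apis import model_service_pb2_grpc
    from tensorflow_serving.config import model_server_config_pb2

    # Parse the text-format file into a ModelServerConfig message.
    server_config = model_server_config_pb2.ModelServerConfig()
    text_format.Parse(config_text, server_config)

    request = model_management_pb2.ReloadConfigRequest()
    request.config.CopyFrom(server_config)

    # Assumption: plain-text channel; use a secure channel where required.
    with grpc.insecure_channel(target) as channel:
        stub = model_service_pb2_grpc.ModelServiceStub(channel)
        response = stub.HandleReloadConfigRequest(request, timeout=timeout_s)

    if response.status.error_code != 0:
        raise RuntimeError(f"reload failed: {response.status.error_msg}")
    return response
```

Calling `reload_config("/path/to/attributes.conf")` after editing the file would ask the server to pick up the new model list without a restart.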

Thanks a lot, that's really helpful. So along with changing the static config file, I will also run this piece of code to load the newly added models immediately and make them ready for inference, without any downtime for the models that are already running.
And later, when the server restarts, it will pick up the new models too. Thanks again.

Anyway, I am open to further suggestions as well.

gr8Adakron on 8 Mar 2019
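
The "polling from the outside" idea discussed in this thread could be sketched as a small watcher that checks the config file's modification time and calls a function when it changes. This is only an illustration: `ConfigWatcher` and its `on_change` callback are hypothetical names, and the 5-second interval is an arbitrary assumption.

```python
import os
import time
from typing import Callable, Optional


class ConfigWatcher:
    """Calls on_change(path) whenever the file's modification time changes."""

    def __init__(self, path: str, on_change: Callable[[str], None]) -> None:
        self.path = path
        self.on_change = on_change
        # Remember the current state so only later edits trigger a reload.
        self._last_mtime: Optional[int] = self._mtime()

    def _mtime(self) -> Optional[int]:
        try:
            return os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            return None  # file is missing, e.g. while it is being rewritten

    def check(self) -> bool:
        """Poll once; return True if a change was detected and handled."""
        current = self._mtime()
        if current is None or current == self._last_mtime:
            return False
        self.on_change(self.path)
        # Advance only after the callback succeeds, so failures are retried.
        self._last_mtime = current
        return True

    def run_forever(self, interval_s: float = 5.0) -> None:
        while True:
            try:
                self.check()
            except Exception as exc:  # keep polling after a failed reload
                print(f"reload failed, will retry: {exc}")
            time.sleep(interval_s)
```

Here `on_change` would be whatever function sends the ReloadConfigRequest for the given path. Polling the modification time keeps the sketch dependency-free; a file-system notification mechanism would be an alternative.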

I also strongly suggest posting such questions to Stack Overflow (SO) rather than creating an "issue".

thanks!

netfs on 8 Mar 2019
