Serving: Auto-load a new model without stopping tf-serving

Created on 6 Mar 2019 · 4 Comments · Source: tensorflow/serving

My tf-serving starting command:

tensorflow_model_server --port=8002 --model_config_file=/home/serving/serving-bin/serving_models/config_files/attributes.conf --enable_batching=true --enable_model_warmup=true --per_process_gpu_memory_fraction=1

My config file (attributes.conf):

model_config_list: {
  config: {
     name: "model_1",
     base_path: "/home/serving/serving-bin/serving_models/atttributes/model_1/serving",
     model_platform: "tensorflow"
  },
}

TF-serving is already running with the _attributes.conf_ file shown above.
Now I have updated the _attributes.conf_ file (i.e. added a new model _config_ to it).

My updated config file (attributes.conf):

model_config_list: {
  config: {
     name: "model_1",
     base_path: "/home/serving/serving-bin/serving_models/atttributes/model_1/serving",
     model_platform: "tensorflow"
  },
  config: {
     name: "model_2",
     base_path: "/home/serving/serving-bin/serving_models/atttributes/model_2/serving",
     model_platform: "tensorflow"
  },
}

Usually I restart the tf-serving server by rerunning the first command.

But I would like to ask: is there a way for it to automatically detect updates to the _attributes.conf_ file and load just the newly added models, without affecting the ones that are already loaded? Or to continuously reload only the parts that have changed?

All 4 comments

@ewilderj @montanaflynn @lamberta @kchodorow I would be glad to get a solution ASAP!

Use the HandleReloadConfigRequest gRPC API:
https://github.com/tensorflow/serving/blob/b3fb70010eed896e6eefb2f431baa9624801711a/tensorflow_serving/apis/model_service.proto#L22

Sample:
https://github.com/tensorflow/serving/issues/765#issuecomment-451905019

This will not give you continuous polling, but you can poll from the outside and send this request each time your config changes. Will that work for you?
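
For reference, a minimal sketch of such a reload call in Python, not a definitive implementation. It assumes the `tensorflow-serving-api` and `grpcio` packages are installed, and that the server's gRPC endpoint is `localhost:8002` (taken from the `--port=8002` flag in the question). The function name `reload_config` is made up for this example. As far as I know, the request carries the complete config, so any model left out of the file would be unloaded. That is why the whole updated _attributes.conf_ is sent rather than only the new entry.

```python
def reload_config(config_path, address="localhost:8002", timeout=10.0):
    """Send the model config file at config_path to a running model server.

    Returns (error_code, error_message) from the server's response status;
    an error_code of 0 means the reload was accepted.
    """
    # Read the text-format config first so a bad path fails early.
    with open(config_path, "r") as f:
        config_text = f.read()

    # Third-party imports are kept local so the dependencies of this
    # helper are explicit (assumed packages: grpcio, tensorflow-serving-api).
    import grpc
    from google.protobuf import text_format
    from tensorflow_serving.apis import model_management_pb2
    from tensorflow_serving.apis import model_service_pb2_grpc
    from tensorflow_serving.config import model_server_config_pb2

    # Parse the same text-format file that is passed to --model_config_file.
    config = model_server_config_pb2.ModelServerConfig()
    text_format.Parse(config_text, config)

    request = model_management_pb2.ReloadConfigRequest()
    request.config.CopyFrom(config)

    with grpc.insecure_channel(address) as channel:
        stub = model_service_pb2_grpc.ModelServiceStub(channel)
        response = stub.HandleReloadConfigRequest(request, timeout=timeout)

    return response.status.error_code, response.status.error_message
```

Calling `reload_config("/home/serving/serving-bin/serving_models/config_files/attributes.conf")` after editing the file should make the server pick up model_2 while model_1 keeps serving.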

Thanks a lot, that's really helpful. So, along with changing the static config file, I will run this piece of code to load the newly added models immediately and make them ready for inference, without any downtime for the models that are already running.
And later, when the server restarts, it will pick up the new models from the config file too. Thanks again.

Anyway, I am open to further suggestions as well.
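
The "poll from the outside" part could look roughly like the sketch below. It uses only the Python standard library and assumes a content hash is an acceptable way to detect a changed config file. The names `file_digest`, `watch_config` and `on_change` are invented for this example; `on_change` stands for whatever code sends the HandleReloadConfigRequest.

```python
import hashlib
import time
from typing import Callable, Optional


def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of the file's current contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def watch_config(
    path: str,
    on_change: Callable[[str], None],
    interval: float = 5.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll the file and call on_change(path) each time its contents change.

    Runs forever when max_polls is None; otherwise stops after that many
    polls. Returns the number of times on_change was called.
    """
    last = file_digest(path)
    polls = 0
    fired = 0
    while max_polls is None or polls < max_polls:
        sleep(interval)
        polls += 1
        current = file_digest(path)
        if current != last:
            # Remember the new contents before notifying, so one edit
            # triggers exactly one reload.
            last = current
            on_change(path)
            fired += 1
    return fired
```

Comparing content rather than modification time means that touching the file without editing it does not trigger a reload. The `sleep` parameter is there only so the loop can be driven without real waiting.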

I also strongly suggest posting such questions to Stack Overflow (SO) rather than creating an "issue".

thanks!
