My tf-serving starting command:
tensorflow_model_server --port=8002 --model_config_file=/home/serving/serving-bin/serving_models/config_files/attributes.conf --enable_batching=true --enable_model_warmup=true --per_process_gpu_memory_fraction=1
My config file (attributes.conf):
model_config_list: {
config: {
name: "model_1",
base_path: "/home/serving/serving-bin/serving_models/atttributes/model_1/serving",
model_platform: "tensorflow"
},
}
TF Serving is already running with the _attributes.conf_ file defined above.
Now I have updated the _attributes.conf_ file (i.e., added a new model _config_ to it).
My updated config file (attributes.conf):
model_config_list: {
config: {
name: "model_1",
base_path: "/home/serving/serving-bin/serving_models/atttributes/model_1/serving",
model_platform: "tensorflow"
},
config: {
name: "model_2",
base_path: "/home/serving/serving-bin/serving_models/atttributes/model_2/serving",
model_platform: "tensorflow"
},
}
Usually I just restart the TF Serving server by running the first command again.
But I would like to ask: is there a way for the server to automatically detect updates to the _attributes.conf_ file and load only the newly added models without affecting the ones already loaded? Or to continuously reload only the parts of the config that have changed?
@ewilderj @montanaflynn @lamberta @kchodorow I would be glad to get a solution ASAP!
Use the HandleReloadConfigRequest gRPC API:
https://github.com/tensorflow/serving/blob/b3fb70010eed896e6eefb2f431baa9624801711a/tensorflow_serving/apis/model_service.proto#L22
Sample:
https://github.com/tensorflow/serving/issues/765#issuecomment-451905019
This does not give you continuous polling, but you can poll from the outside and send this request each time your config changes. Will that work for you?
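The "poll from the outside" idea can be sketched roughly as follows. This is a minimal sketch, not the code from the linked sample: the names `ConfigWatcher`, `send_reload`, and `watch_and_reload` and the polling interval are made up for illustration. It assumes the `grpcio` and `tensorflow-serving-api` Python packages are installed and that the server's gRPC port is reachable without TLS.

```python
import hashlib
import time
from pathlib import Path


class ConfigWatcher:
    """Hypothetical helper: detects content changes by comparing SHA-256 digests."""

    def __init__(self, path):
        self.path = Path(path)
        self._last_digest = self._digest()

    def _digest(self):
        return hashlib.sha256(self.path.read_bytes()).hexdigest()

    def changed(self):
        """Return True once for each observed change in the file contents."""
        current = self._digest()
        if current == self._last_digest:
            return False
        self._last_digest = current
        return True


def send_reload(config_text, target):
    """Send the whole config file to a running server via HandleReloadConfigRequest."""
    # Imported lazily so ConfigWatcher works without these packages installed.
    import grpc
    from google.protobuf import text_format
    from tensorflow_serving.apis import model_management_pb2
    from tensorflow_serving.apis import model_service_pb2_grpc
    from tensorflow_serving.config import model_server_config_pb2

    # Parse the text-format config (same format as attributes.conf).
    config = model_server_config_pb2.ModelServerConfig()
    text_format.Parse(config_text, config)

    request = model_management_pb2.ReloadConfigRequest()
    request.config.CopyFrom(config)

    # Assumption: plain-text (insecure) gRPC channel.
    with grpc.insecure_channel(target) as channel:
        stub = model_service_pb2_grpc.ModelServiceStub(channel)
        response = stub.HandleReloadConfigRequest(request, timeout=60)

    # error_code 0 means OK.
    return response.status.error_code == 0, response.status.error_msg


def watch_and_reload(config_path, target, interval_s=10.0):
    """Poll the config file forever and push every change to the server."""
    watcher = ConfigWatcher(config_path)
    while True:
        time.sleep(interval_s)
        if watcher.changed():
            ok, message = send_reload(Path(config_path).read_text(), target)
            print("reload ok" if ok else f"reload failed: {message}")
```

With the setup from the question, this would be started as `watch_and_reload("/home/serving/serving-bin/serving_models/config_files/attributes.conf", "localhost:8002")`. Note that the request carries the complete config, so the file should always list every model that is meant to stay loaded. If the file is read while it is still being written, parsing may fail, so writing the new config to a temporary file and renaming it into place is probably safer.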
Thanks a lot, it's really helpful. So, along with changing the static config file, I will also run this piece of code to load the newly added models immediately and make them ready for inference, without any downtime for the models already running.
And later, when the server restarts, it will pick up the new models too. Thanks again.
Anyway, I am open to further suggestions as well.
I also strongly suggest posting such questions to Stack Overflow (SO) rather than creating an "issue".
Thanks!