What would be the best way to load and serve multiple models into the system?
Looking at the code that exists today, I couldn't see anything directly usable. If there's nothing that does this today, what would be a good set of things to look at to implement something like the following,
Some thoughts,
FileSystemStoragePathSource only looks under a given path for the aspired version. So ideally, there are some methods somewhere that can instead help look for a modelId rather than within it.
[...] servableId type objects, and I ought to be able to do a Get() from such an object to get the right modelId, and then run classify() on it.
Lastly, do you guys have use cases where there are multiple models spread out across multiple machines, using some sort of DHT implementation to access them?
Hi there,
The simplest approach is to deploy N instances of FileSystemStoragePathSource. The upstream modules that "catch" the aspired-versions calls are thread-safe (although I don't think we have good testing around that -- perhaps something you could contribute?).
If N gets large, that starts to get ridiculous. Also, if you don't know the set of model names (a.k.a. servable names) in advance, it won't work, because each FileSystemStoragePathSource needs to be configured with a model name. It would be great to have an alternative or generalized file-system source that would look for model_name/version_number, as you say. A contribution on that front would be welcome!
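To make the model_name/version_number idea concrete, here is a minimal standalone sketch of the directory scan such a generalized source would need. It is not TensorFlow Serving code: DiscoverModels and IsNumeric are hypothetical names, it uses only C++17 std::filesystem, and it assumes versions are plain decimal directory names small enough to fit in a 64-bit integer.

```cpp
#include <algorithm>
#include <cstdint>
#include <filesystem>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Returns true if `s` is a non-empty string of decimal digits.
bool IsNumeric(const std::string& s) {
  return !s.empty() && std::all_of(s.begin(), s.end(), [](unsigned char c) {
    return c >= '0' && c <= '9';
  });
}

// Hypothetical helper (not part of TensorFlow Serving): maps each model
// directory under `root` to its numeric version subdirectories, ascending.
// Model directories without any numeric subdirectory are skipped.
std::map<std::string, std::vector<std::int64_t>> DiscoverModels(
    const fs::path& root) {
  std::map<std::string, std::vector<std::int64_t>> models;
  if (!fs::is_directory(root)) return models;
  for (const auto& model_dir : fs::directory_iterator(root)) {
    if (!model_dir.is_directory()) continue;
    std::vector<std::int64_t> versions;
    for (const auto& version_dir : fs::directory_iterator(model_dir.path())) {
      const std::string name = version_dir.path().filename().string();
      if (version_dir.is_directory() && IsNumeric(name)) {
        versions.push_back(std::stoll(name));
      }
    }
    if (versions.empty()) continue;
    std::sort(versions.begin(), versions.end());
    models[model_dir.path().filename().string()] = std::move(versions);
  }
  return models;
}
```

A real source would run a scan like this on each poll and emit one aspired-versions call per discovered model name, instead of being configured with a single name up front.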
Regarding models on multiple machines -- we don't have anything that explicitly helps with that scenario at the moment. Perhaps if you can elaborate on your requirements and/or proposed architecture we can discuss further. It might be worth moving that to a different thread -- it's a distinct topic. Thanks.
Chris
Thanks for the answer, @chrisolston.
What I've ended up doing is to create my own version of FileSystemStoragePathSource that lets me discover models within a filesystem, keyed by ID. I've also made a new version of the AspiredVersionsManager that lets me load/unload these according to an eager policy. This gives me the flexibility of either having a model with the same graph loaded with different datasets (versions), or different model graphs + datasets keyed by IDs.
Will start another thread around the multiple model serving, thanks!
Cool. Please consider contributing your code back to the project :). We are open to pull requests.
@chrisolston Absolutely :) - let me put these through the paces first.
One issue that would be good to get more info on is https://github.com/tensorflow/serving/issues/46.
Also, I'm still figuring out how to deploy this code in prod environments (https://github.com/tensorflow/serving/issues/44). Any pointers there would also help!
@viksit FYI this recent commit goes part way toward addressing this issue:
https://github.com/tensorflow/serving/commit/a77b9a78af8d746fbd247b222128a0b64735f6fc
@kirilg @chrisolston thank you for the fyi - reviewing it now!
Is there any documentation on how to use TensorFlow Serving to deploy multiple models (not multiple versions of the same model) in production?
I found this comment in main.cc of the server that seems to suggest this is already possible with a config file: https://github.com/tensorflow/serving/blob/master/tensorflow_serving/model_servers/main.cc#L20
// gRPC server implementation of
// tensorflow_serving/apis/prediction_service.proto.
//
// It bring up a standard server to serve a single TensorFlow model using
// command line flags, or multiple models via config file.
However, all the examples (MNIST & inception) seem to only use the command line flags. Is there any documentation on the config file?
As of this time, serving multiple models is fully supported:
-Chris
@chrisolston Is there any documentation / example on using ModelServerConfig?
I don't think so. It should be pretty straightforward. You just give a list of models (via ModelConfigList). For each model you specify:
@chrisolston could you please explain a little further?
In main.cc, it says 'ModelServer does not yet support custom model config.'. It confuses me. Thanks in advance.
Same problem as @kinhunt . Any help? Thanks so much!
Same problem as @wangbin83-gmail-com - Thank you!
Ok, so not elegant: I made a simple hack / edit, and yes, I had to edit main.cc.
It checks whether the model argument contains a comma (i.e. multiple models); if so, it adds multiple models, appending each model name to the model path.
Seems to work.
This was a this-morning hack, so not tried and tested.
My C++ is rusty, and I didn't want to add too many extra imports to do directory checking, i.e. to see whether the directories in the model path are numeric (assumed to be versions, and therefore only one model) or non-numeric (therefore multi-model).
Hope this helps.
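The comma check described above could look roughly like the following. This is only a sketch of the idea, not the actual edit to main.cc: ExpandModelArgument is a hypothetical name, and the single-model fallback (serve the base path as-is) is an assumption.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper illustrating the hack: splits a comma-separated model
// name argument and appends each name to the base path. Returns
// (model name, model path) pairs; empty entries are skipped.
std::vector<std::pair<std::string, std::string>> ExpandModelArgument(
    const std::string& model_names, const std::string& model_base_path) {
  std::vector<std::pair<std::string, std::string>> models;
  if (model_names.find(',') == std::string::npos) {
    // No comma: assume a single model served directly from the base path.
    models.emplace_back(model_names, model_base_path);
    return models;
  }
  std::string base = model_base_path;
  if (!base.empty() && base.back() != '/') base += '/';
  std::istringstream stream(model_names);
  std::string name;
  while (std::getline(stream, name, ',')) {
    if (name.empty()) continue;
    models.emplace_back(name, base + name);
  }
  return models;
}
```

Each resulting pair would then become one model entry in the server's config, in place of the single entry the stock main.cc builds from its flags.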
I've also gone down a similar path. First I added the following includes:
#include <google/protobuf/text_format.h>
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <fcntl.h>
Then created the following method:
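ModelServerConfig BuildConfigFromFile(const string& config_file_path) {
  LOG(INFO) << "Building from config file: " << config_file_path;

  ModelServerConfig model_config;
  // Open the text-format config; abort if it cannot be read.
  const int fd = open(config_file_path.c_str(), O_RDONLY);
  CHECK_GE(fd, 0) << "Unable to open config file: " << config_file_path;
  google::protobuf::io::FileInputStream fstream(fd);
  fstream.SetCloseOnDelete(true);  // close fd when fstream is destroyed
  // Parse the text-format proto; abort on malformed input.
  CHECK(google::protobuf::TextFormat::Parse(&fstream, &model_config))
      << "Unable to parse config file: " << config_file_path;
  return model_config;
}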
Then updated the main function by commenting out the model_base_path and model_name parameters and adding my own config_file parameter:
int main(int argc, char** argv) {
  tensorflow::int32 port = 8500;
  bool enable_batching = false;
  // tensorflow::string model_name = "default";
  tensorflow::int32 file_system_poll_wait_seconds = 1;
  // tensorflow::string model_base_path;
  bool use_saved_model = false;
  tensorflow::string config_file;
  tensorflow::string model_version_policy =
      FileSystemStoragePathSourceConfig_VersionPolicy_Name(
          FileSystemStoragePathSourceConfig::LATEST_VERSION);
  std::vector<tensorflow::Flag> flag_list = {
      tensorflow::Flag("config_file", &config_file, "config file"),
      tensorflow::Flag("port", &port, "port to listen on"),
      tensorflow::Flag("enable_batching", &enable_batching, "enable batching"),
      // tensorflow::Flag("model_name", &model_name, "name of model"),
      // tensorflow::Flag(
      //     "model_version_policy", &model_version_policy,
      //     "The version policy which determines the number of model versions to "
      //     "be served at the same time. The default value is LATEST_VERSION, "
      //     "which will serve only the latest version. See "
      //     "file_system_storage_path_source.proto for the list of possible "
      //     "VersionPolicy."),
      tensorflow::Flag("file_system_poll_wait_seconds",
                       &file_system_poll_wait_seconds,
                       "interval in seconds between each poll of the file "
                       "system for new model version"),
      // tensorflow::Flag("model_base_path", &model_base_path,
      //                  "path to export (required)"),
      tensorflow::Flag("use_saved_model", &use_saved_model,
                       "If true, use SavedModel in the server; otherwise, use "
                       "SessionBundle. It is used by tensorflow serving team "
                       "to control the rollout of SavedModel and is not "
                       "expected to be set by users directly.")};
  string usage = tensorflow::Flags::Usage(argv[0], flag_list);
  const bool parse_result = tensorflow::Flags::Parse(&argc, argv, flag_list);
  if (!parse_result || config_file.empty()) {
    std::cout << usage;
    return -1;
  }
  tensorflow::port::InitMain(argv[0], &argc, &argv);
  if (argc != 1) {
    std::cout << "unknown argument: " << argv[1] << "\n" << usage;
  }
  ...
and also by updating the creation of the server options (commenting out the old call):
// For ServerCore Options, we leave servable_state_monitor_creator unspecified
// so the default servable_state_monitor_creator will be used.
ServerCore::Options options;
//options.model_server_config = BuildSingleModelConfig(
// model_name, model_base_path, parsed_version_policy);
options.model_server_config = BuildConfigFromFile(config_file);
Finally, I can load a text config file in this format, e.g.:
model_config_list: {
  config: {
    name: "bob",
    base_path: "/tmp/model",
    model_platform: "tensorflow"
  },
  config: {
    name: "bob2",
    base_path: "/tmp/model2",
    model_platform: "tensorflow"
  }
}
Finally running it as such:
./bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --config_file=./tfserv.conf
I get output:
I tensorflow_serving/model_servers/main.cc:166] Building from config file: ./tfserv.conf
I tensorflow_serving/model_servers/server_core.cc:261] Adding/updating models.
I tensorflow_serving/model_servers/server_core.cc:298] (Re-)adding model: bob
I tensorflow_serving/model_servers/server_core.cc:298] (Re-)adding model: bob2
Maybe there's an easier way... hope this helps!
@perdasilva your C++ seems less rusty than mine.
Nice.
Might be worth a commit, maybe with a precedence check: config_file first, then single model.
Thanks
@sendit2me I've already forked the repo - I'm working on the patch now then I'll submit a PR
The last time I touched C++ was at university =S thank god for google and stackoverflow hehehe
My apologies. I was mistaken when I claimed earlier that the binary (model_servers/main.cc) supports multiple models via ModelServerConfig. I am reviewing PR 294, which adds that feature, now. Thanks for your patience.
Any news on this?
I'm looking for an option to deploy multiple models without having to change the C++ code.
PR 294 was merged in January.
You can supply a ModelServerConfig protocol buffer to the tf-serving binary. It contains a "repeated ModelConfig" field [1], which lets you specify multiple models.
Thanks for the quick reply! Unfortunately I don't have much experience with protocol buffers.
So supplying a ModelServerConfig protocol buffer to the tf-serving binary means to supply the --config_file argument to the tensorflow_model_server binary call?
My model config file looks like this:
model_config_list: {
  config: {
    name: "model1",
    base_path: "/serving/models/model1",
    model_platform: "tensorflow"
  },
  config: {
    name: "model2",
    base_path: "/serving/models/model2",
    model_platform: "tensorflow"
  }
}
Then I start the server with:
/serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --config_file=/serving/model_config/model_config.conf
However, it does not work like this. Server is not started :/
Help would be appreciated :)
EDIT:
the argument for tensorflow_model_server must be --model_config_file=foo.conf instead of --config_file=foo.conf
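For anyone who would rather generate the config than write it by hand, here is a small sketch that renders the same text format for a list of models. RenderModelConfig is a hypothetical helper, not part of TensorFlow Serving; it assumes every model uses model_platform "tensorflow" and that names and paths contain no characters that need escaping.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: renders a text-format ModelServerConfig for the given
// (name, base_path) pairs, using the same three fields as the config above.
// No escaping is done, so quotes or backslashes in the inputs are unsupported.
std::string RenderModelConfig(
    const std::vector<std::pair<std::string, std::string>>& models) {
  std::ostringstream out;
  out << "model_config_list: {\n";
  for (std::size_t i = 0; i < models.size(); ++i) {
    out << "  config: {\n"
        << "    name: \"" << models[i].first << "\",\n"
        << "    base_path: \"" << models[i].second << "\",\n"
        << "    model_platform: \"tensorflow\"\n"
        << "  }" << (i + 1 < models.size() ? "," : "") << "\n";
  }
  out << "}\n";
  return out.str();
}
```

The returned string can be written to a file and passed to the server with --model_config_file, as noted in the edit above.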
The exact thing works for me. :-) @mm-manu
I know this is a very old issue, but I'm having trouble finding information on what the maximum number of models one can include in a .conf file is; I assume this limit is likely related to the footprint of the models vs the machine on which serving is running, but it's just a guess.
Indeed, there is no limit other than the server's resources.
How can I deploy multiple models in TF Serving? For example, I have a plate detection model A and a plate recognition model B, and I want to serve both in TF Serving. Is it possible to cascade these two models on the server, so that my client sends just one picture, and the server runs both detection and recognition and returns the final plate number? I do not want the rough bounding boxes of plate candidates produced by the detection model.
Same problem as @CLIsVeryOK . Any inspiration please? Thanks so much!
@CLIsVeryOK @WilliamL1
I'd recommend writing a simple CLI program or webservice which implements a TF Serving client that interacts with the TF Serving instance over gRPC and submits images to both models using your desired logic. There are quite a few python examples readily available, as well as a few for Golang.
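The client-side cascade suggested above can be sketched as below. This shows only the control flow: Box, Image, DetectFn and RecognizeFn are placeholder types invented for the illustration, and the two callables stand in for whatever gRPC calls a real client would make to models A and B.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Placeholder types; a real client would use the request/response messages
// of its RPC layer instead.
struct Box { int x, y, width, height; };
using Image = std::vector<unsigned char>;

// Hypothetical callables standing in for "call model A" and "call model B".
using DetectFn = std::function<std::vector<Box>(const Image&)>;
using RecognizeFn = std::function<std::string(const Image&, const Box&)>;

// Runs detection, then recognition on each candidate box, and returns only
// the non-empty recognized strings. The intermediate boxes never leave this
// function, so the caller sees just the final plate numbers.
std::vector<std::string> ReadPlates(const Image& image, const DetectFn& detect,
                                    const RecognizeFn& recognize) {
  std::vector<std::string> plates;
  for (const Box& box : detect(image)) {
    std::string text = recognize(image, box);
    if (!text.empty()) plates.push_back(std::move(text));
  }
  return plates;
}
```

Wrapping this function in a small web service gives the single-request interface asked for, without modifying the serving binary.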
I solved my problem by changing the C++ code in TensorFlow Serving and recompiling it with Bazel, and it works.
serving\tensorflow_serving\servables\tensorflow\predict_impl.cc
@WilliamL1 @zachgrayio
Hello
Is there a way to avoid absolute paths for base_path? If there is not, is there a way to make it configurable? Thanks in advance!
I have attached the folder structure in my host machine.

Config file :
model_config_list: {
  config: {
    name: "model1",
    base_path: "/models/allmodels/model1",
    model_platform: "tensorflow"
  },
  config: {
    name: "model2",
    base_path: "/models/allmodels/model2",
    model_platform: "tensorflow"
  },
  config: {
    name: "model3",
    base_path: "/models/allmodels/model3",
    model_platform: "tensorflow"
  }
}
Docker command:
docker run -p 8500:8500 --mount type=bind,source=/Users/rahulkumar/Desktop/allmodels/,target=/models/allmodels -t tensorflow/serving --model_config_file=/models/allmodels/models.config
Hope this helps. :-)
@goodrahstar I followed the same command, but I am getting an error saying the file was not found.
The file is there at that path.
Docker command :
docker run -p 8501:8501 --mount type=bind,source=/opt/script/TFServingModelFactory/,target=/models/TFServingModelFactory -t $USER/tensorflow-serving --model_config_file=/opt/script/TFServingModelFactory/models.config
model.config file :
model_config_list {
  config {
    name: "model1"
    base_path: "/models/TFServingModelFactory/model1/"
    model_platform: "tensorflow"
  }
  config {
    name: "model1"
    base_path: "/models/TFServingModelFactory/model1"
    model_platform: "tensorflow"
  }
}
error : F tensorflow_serving/model_servers/server.cc:97] Non-OK-status: ParseProtoTextFile(file, &proto) status: Not found: /opt/script/TFServingModelFactory/models.config; No such file or directory
/usr/bin/tf_serving_entrypoint.sh: line 3: 6 Aborted (core dumped) tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"
It is working fine when I deploy using tensorflow_model_server
I am new to it.
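One possible reading of the error above, offered as an assumption rather than a confirmed diagnosis: the --model_config_file value is opened inside the container, where the host directory is visible only under the mount target (/models/TFServingModelFactory), so the host path /opt/script/... does not exist there. The sketch below shows the path translation; ToContainerPath is a hypothetical helper written for this illustration.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: translates a host path that lies under a bind mount's
// `source` into the path the container sees under `target`. Returns
// std::nullopt if the host path is not inside the mounted directory.
std::optional<std::string> ToContainerPath(std::string source,
                                           std::string target,
                                           const std::string& host_path) {
  // Make sure both prefixes end with '/' so only whole components match.
  if (source.empty() || source.back() != '/') source += '/';
  if (target.empty() || target.back() != '/') target += '/';
  if (host_path.compare(0, source.size(), source) != 0) return std::nullopt;
  return target + host_path.substr(source.size());
}
```

With the mount from the command above, the host file /opt/script/TFServingModelFactory/models.config maps to /models/TFServingModelFactory/models.config, which under this reading is the value to pass to --model_config_file.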
I was able to deploy 3 models at a time using Docker and a config file.
Hi, this works great. I used a GPU to load these models, but I'm not sure how to connect to the server from the client side. Do you have any suggestions?
How can we tie a model to a specific batching config, since that is not a parameter models.config expects?
How does server-side batching work for different models expecting different inputs?
@goodrahstar @chrisolston
TIA