Serving: Deploying multiple models

Created on 16 Apr 2016 · 35 Comments · Source: tensorflow/serving

What would be the best way to load and serve multiple models into the system?

Looking at the code that exists today, I couldn't see anything directly usable. If there's nothing that does this today, what would be a good set of things to look at to implement something like the following,

Some thoughts,

Let's assume each model resides in a directory such as /mnt/modelid/0000001
The current FileSystemStoragePathSource only looks under a given path for the aspired version. So ideally, there are some methods somewhere that can instead help look for modelid rather than within it.
The output of this should be a hashmap of servableId type objects, and I ought to be able to do a Get() from such an object to get the right modelId, and then run classify() on it.
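
The discovery step described in these thoughts could be sketched roughly as follows. This is only an illustration in plain C++17 using std::filesystem, not TensorFlow Serving code: the names IsVersionName and DiscoverLatestVersions are hypothetical, and the sketch assumes a <root>/<model_id>/<numeric_version> layout like the /mnt/modelid/0000001 example, with version numbers that fit in 64 bits.

```cpp
#include <algorithm>
#include <cstdint>
#include <filesystem>
#include <map>
#include <string>

namespace fs = std::filesystem;

// Returns true if `name` is non-empty and consists only of decimal digits,
// which is how version directories such as "0000001" are named.
bool IsVersionName(const std::string& name) {
  return !name.empty() &&
         std::all_of(name.begin(), name.end(),
                     [](unsigned char c) { return c >= '0' && c <= '9'; });
}

// Hypothetical helper (not part of TensorFlow Serving): scans `root` for a
// <model_id>/<version> layout and returns a map from model id to the highest
// numeric version found. Models with no numeric version directory are skipped.
std::map<std::string, std::int64_t> DiscoverLatestVersions(
    const fs::path& root) {
  std::map<std::string, std::int64_t> latest;
  if (!fs::is_directory(root)) return latest;
  for (const auto& model_dir : fs::directory_iterator(root)) {
    if (!model_dir.is_directory()) continue;
    const std::string model_id = model_dir.path().filename().string();
    for (const auto& version_dir : fs::directory_iterator(model_dir.path())) {
      const std::string name = version_dir.path().filename().string();
      if (!version_dir.is_directory() || !IsVersionName(name)) continue;
      // Assumption: the version number fits in a 64-bit integer.
      const std::int64_t version = std::stoll(name);
      const auto it = latest.find(model_id);
      if (it == latest.end() || version > it->second) {
        latest[model_id] = version;
      }
    }
  }
  return latest;
}
```

A caller could then look up a model id in the returned map and load the corresponding <root>/<model_id>/<version> directory; how that directory is turned into a servable is outside the scope of this sketch.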

Lastly, do you have use cases where multiple models are spread across multiple machines and accessed through some sort of DHT implementation?

contributions welcome

viksit

👍4

All 35 comments

Hi there,

The simplest approach is to deploy N instances of FileSystemStoragePathSource. The upstream modules that "catch" the aspired-versions calls are thread-safe (although I don't think we have good testing around that -- perhaps something you could contribute?).

If N gets large, that starts to get ridiculous. Also, if you don't know the set of model names (a.k.a. servable names) in advance, it wouldn't work, because each FileSystemStoragePathSource must be configured with a model name. It would be great to have an alternative or generalized file-system source that would look for model_name/version_number, as you say. A contribution on that front would be welcome!

Regarding models on multiple machines -- we don't have anything that explicitly helps with that scenario at the moment. Perhaps if you can elaborate on your requirements and/or proposed architecture we can discuss further. It might be worth moving that to a different thread -- it's a distinct topic. Thanks.

Chris

chrisolston on 18 Apr 2016

Thanks for the answer, @chrisolston.

What I've ended up doing is creating my own version of FileSystemStoragePathSource that lets me discover models within a filesystem, keyed by ID. I've also made a new version of the AspiredVersionsManager that lets me load/unload these according to an eager policy. This gives me the flexibility of either having a model with the same graph loaded with different datasets (versions), or different model graphs + datasets keyed by IDs.

Will start another thread around the multiple model serving, thanks!

viksit on 19 Apr 2016

Cool. Please consider contributing your code back to the project :). We are open to pull requests.

chrisolston on 19 Apr 2016

@chrisolston Absolutely :) - let me put these through the paces first.

One issue that would be good to get more info on is https://github.com/tensorflow/serving/issues/46.

Also, I'm still figuring out how to deploy this code in prod environments (https://github.com/tensorflow/serving/issues/44). Any pointers there would also help!

viksit on 19 Apr 2016

@viksit FYI this recent commit goes part way toward addressing this issue:
https://github.com/tensorflow/serving/commit/a77b9a78af8d746fbd247b222128a0b64735f6fc

chrisolston on 23 Aug 2016

@kirilg @chrisolston thank you for the fyi - reviewing it now!

viksit on 25 Aug 2016

Is there any documentation on how to use TensorFlow Serving to deploy multiple models (not multiple versions of the same model) in production?

akrai48 on 4 Oct 2016

I found this comment in main.cc of the server that seems to suggest this is already possible with a config file: https://github.com/tensorflow/serving/blob/master/tensorflow_serving/model_servers/main.cc#L20

// gRPC server implementation of
// tensorflow_serving/apis/prediction_service.proto.
//
// It bring up a standard server to serve a single TensorFlow model using
// command line flags, or multiple models via config file.

However, all the examples (MNIST & inception) seem to only use the command line flags. Is there any documentation on the config file?

vikeshkhanna on 24 Oct 2016

As of this time, serving multiple models is fully supported:

If you are using TF-Serving _libraries_ (not the ModelServer _binary_) you can set up a FileSystemStoragePathSourceConfig to serve multiple models.
If you are using the _binary_ you can use ModelServerConfig to serve multiple models.

-Chris

chrisolston on 26 Oct 2016

@chrisolston Is there any documentation / example on using ModelServerConfig?

vikeshkhanna on 26 Oct 2016

I don't think so. It should be pretty straightforward. You just give a list of models (via ModelConfigList). For each model you specify:

Model name
Model base path
Model platform (use "tensorflow" unless you are doing something exotic)
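
As a rough illustration of how those three fields fit together, the sketch below renders a list of models into the text-format model_config_list layout that appears in config files posted later in this thread. It is plain C++ and does not use the protobuf library; ModelEntry and BuildModelConfigText are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical plain struct mirroring the three per-model fields listed above.
struct ModelEntry {
  std::string name;
  std::string base_path;
  std::string model_platform;  // "tensorflow" unless doing something exotic
};

// Renders the entries as a text-format model_config_list block.
// Values are written verbatim, so names and paths must not contain
// double quotes or backslashes.
std::string BuildModelConfigText(const std::vector<ModelEntry>& models) {
  std::ostringstream out;
  out << "model_config_list: {\n";
  for (const ModelEntry& model : models) {
    // Every model needs at least a name and a base path.
    assert(!model.name.empty() && !model.base_path.empty());
    out << "  config: {\n"
        << "    name: \"" << model.name << "\"\n"
        << "    base_path: \"" << model.base_path << "\"\n"
        << "    model_platform: \"" << model.model_platform << "\"\n"
        << "  }\n";
  }
  out << "}\n";
  return out.str();
}
```

Writing the returned string to a file would give one config entry per model; the model names and paths are whatever the caller supplies.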

chrisolston on 26 Oct 2016

😕8 👍2

@chrisolston could you please explain a little further?

By 'binary', do you mean 'bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server'?
How do I give it a list of models? Do I code it in main.cc and recompile, or run the binary with a config file?

In main.cc, it says 'ModelServer does not yet support custom model config.', which confuses me. Thanks in advance.

kinhunt on 22 Dec 2016

Same problem as @kinhunt . Any help? Thanks so much!

wangbin83-gmail-com on 10 Jan 2017

Same problem as @wangbin83-gmail-com - Thank you!

perdasilva on 10 Jan 2017

OK, so it's not elegant, but I made a simple hack. Yes, I had to edit main.cc.

The edit checks whether the model argument contains a comma, i.e. multiple models. If so, it adds multiple models, appending each model name to the model path.

It seems to work. This was a this-morning hack, so it's not tried and tested.

My C++ is rusty, and I didn't want to add too many extra imports to do directory checking, i.e. to see whether the directories in the model path are numeric (assumed to be versions, and therefore only one model) or alphabetic (and therefore multi-model).

Hope this helps
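
The actual edit was not posted, so the following is only a guess at what such a hack might look like, written as standalone C++ rather than as a patch to main.cc. ExpandModelNames is a hypothetical helper: it splits a comma-separated model argument and appends each name to the base path.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: splits a comma-separated list such as "modelA,modelB"
// and pairs each name with "<base_path>/<name>". Empty items (for example
// from "a,,b" or a trailing comma) are skipped.
std::vector<std::pair<std::string, std::string>> ExpandModelNames(
    const std::string& model_names, const std::string& base_path) {
  std::vector<std::pair<std::string, std::string>> models;
  // Avoid a doubled slash when the base path already ends with one.
  const bool has_slash = !base_path.empty() && base_path.back() == '/';
  std::size_t start = 0;
  while (start <= model_names.size()) {
    std::size_t comma = model_names.find(',', start);
    if (comma == std::string::npos) comma = model_names.size();
    const std::string name = model_names.substr(start, comma - start);
    if (!name.empty()) {
      models.emplace_back(name, base_path + (has_slash ? "" : "/") + name);
    }
    start = comma + 1;
  }
  return models;
}
```

Each resulting (name, path) pair would then become one model entry in the server configuration; a single name without commas yields exactly one pair, so the single-model case keeps working.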

sendit2me on 10 Jan 2017

I've also gone down a similar path. First I added the following includes:

#include <google/protobuf/text_format.h>
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <fcntl.h>

Then I created the following method:

ModelServerConfig BuildConfigFromFile(const string& config_file_path) {
  LOG(INFO) << "Building from config file: " << config_file_path;

  ModelServerConfig model_config;
  const int fd = open(config_file_path.c_str(), O_RDONLY);
  // Fail fast if the file is missing or unreadable.
  CHECK_GE(fd, 0) << "Unable to open " << config_file_path;
  google::protobuf::io::FileInputStream fstream(fd);
  // Close the descriptor when the stream is destroyed.
  fstream.SetCloseOnDelete(true);
  // Parse the text-format proto; abort on a malformed config.
  CHECK(google::protobuf::TextFormat::Parse(&fstream, &model_config))
      << "Invalid config file: " << config_file_path;
  return model_config;
}

Then I updated the main function by commenting out the model_base_path and model_name parameters and adding my own config_file parameter:

int main(int argc, char** argv) {
  tensorflow::int32 port = 8500;
  bool enable_batching = false;
  // tensorflow::string model_name = "default";
  tensorflow::int32 file_system_poll_wait_seconds = 1;
  // tensorflow::string model_base_path;
  bool use_saved_model = false;
  tensorflow::string config_file;
  tensorflow::string model_version_policy =
      FileSystemStoragePathSourceConfig_VersionPolicy_Name(
          FileSystemStoragePathSourceConfig::LATEST_VERSION);
  std::vector<tensorflow::Flag> flag_list = {
      tensorflow::Flag("config_file", &config_file, "config file"),
      tensorflow::Flag("port", &port, "port to listen on"),
      tensorflow::Flag("enable_batching", &enable_batching, "enable batching"),
      // tensorflow::Flag("model_name", &model_name, "name of model"),
      // tensorflow::Flag(
      //    "model_version_policy", &model_version_policy,
      //    "The version policy which determines the number of model versions to "
      //    "be served at the same time. The default value is LATEST_VERSION, "
      //    "which will serve only the latest version. See "
      //    "file_system_storage_path_source.proto for the list of possible "
      //    "VersionPolicy."),
      tensorflow::Flag("file_system_poll_wait_seconds",
                       &file_system_poll_wait_seconds,
                       "interval in seconds between each poll of the file "
                       "system for new model version"),
      // tensorflow::Flag("model_base_path", &model_base_path,
      //                 "path to export (required)"),
      tensorflow::Flag("use_saved_model", &use_saved_model,
                       "If true, use SavedModel in the server; otherwise, use "
                       "SessionBundle. It is used by tensorflow serving team "
                       "to control the rollout of SavedModel and is not "
                       "expected to be set by users directly.")};
  string usage = tensorflow::Flags::Usage(argv[0], flag_list);
  const bool parse_result = tensorflow::Flags::Parse(&argc, argv, flag_list);
  if (!parse_result || config_file.empty()) {
    std::cout << usage;
    return -1;
  }
  tensorflow::port::InitMain(argv[0], &argc, &argv);
  if (argc != 1) {
    std::cout << "unknown argument: " << argv[1] << "\n" << usage;
  }
...

I also updated the creation of the server options, commenting out the single-model config:

  // For ServerCore Options, we leave servable_state_monitor_creator unspecified
  // so the default servable_state_monitor_creator will be used.
  ServerCore::Options options;
  //options.model_server_config = BuildSingleModelConfig(
  //    model_name, model_base_path, parsed_version_policy);
  options.model_server_config = BuildConfigFromFile(config_file);

Finally, I can load a text config file in the following format, e.g.:

model_config_list: {
  config: {
    name: "bob",
    base_path: "/tmp/model",
    model_platform: "tensorflow"
  },
  config: {
    name: "bob2",
    base_path: "/tmp/model2",
    model_platform: "tensorflow"
  }
}

Then I run it like this:

./bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --config_file=./tfserv.conf

I get output:

I tensorflow_serving/model_servers/main.cc:166] Building from config file: ./tfserv.conf
I tensorflow_serving/model_servers/server_core.cc:261] Adding/updating models.
I tensorflow_serving/model_servers/server_core.cc:298]  (Re-)adding model: bob
I tensorflow_serving/model_servers/server_core.cc:298]  (Re-)adding model: bob2

Maybe there's an easier way... hope this helps!

perdasilva on 11 Jan 2017

👍20 ❤2

@perdasilva your c++ seems less rusty than mine.
nice.
might be worth a commit, maybe an override on preference check, config_file then single model.

Thanks

sendit2me on 11 Jan 2017

@sendit2me I've already forked the repo - I'm working on the patch now then I'll submit a PR
The last time I touched C++ was at university =S thank god for google and stackoverflow hehehe

perdasilva on 11 Jan 2017

My apologies. I was mistaken when I claimed earlier that the binary (model_servers/main.cc) supports multiple models via ModelServerConfig. I am reviewing PR 294, which adds that feature, now. Thanks for your patience.

chrisolston on 6 Feb 2017

Any news on this?
I'm looking for an option to deploy multiple models without having to change the C++ code.

mm-manu on 29 Jun 2017

PR 294 was merged in January.

You can supply a ModelServerConfig protocol buffer to the tf-serving binary. It contains a "repeated ModelConfig" field [1], which lets you specify multiple models.

[1] https://github.com/tensorflow/serving/blob/master/tensorflow_serving/config/model_server_config.proto#L56

chrisolston on 29 Jun 2017

Thanks for the quick reply! Unfortunately I don't have much experience with protocol buffers.
Does supplying a ModelServerConfig protocol buffer to the tf-serving binary mean passing the --config_file argument to the tensorflow_model_server binary?
My model config file looks like this:

model_config_list: {
  config: {
    name: "model1",
    base_path: "/serving/models/model1",
    model_platform: "tensorflow"
  },
  config: {
    name: "model2",
    base_path: "/serving/models/model2",
    model_platform: "tensorflow"
  }
}

Then I start the server with:

/serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --config_file=/serving/model_config/model_config.conf

However, it does not work like this. Server is not started :/
Help would be appreciated :)

EDIT:
the argument for tensorflow_model_server must be --model_config_file=foo.conf instead of --config_file=foo.conf

mm-manu on 29 Jun 2017

👍11

The exact thing works for me. :-) @mm-manu

sumsuddin on 6 Sep 2017

👍1

I know this is a very old issue, but I'm having trouble finding information on what the maximum number of models one can include in a .conf file is; I assume this limit is likely related to the footprint of the models vs the machine on which serving is running, but it's just a guess.

zachgrayio on 7 Jun 2018

Indeed, there is no limit other than the server's resources.

chrisolston on 7 Jun 2018

👍2

How can I deploy multiple models in TF Serving? For example, I have a plate detection model A and a plate recognition model B, and I want to serve both in TF Serving. Is it possible to cascade these two models in the server, so that my client sends the server just one picture and the server runs both detection and recognition and returns the final plate number? I do not want the rough bounding boxes of plate candidates that come out of the detection model. Best~

CLIsVeryOK on 5 Jul 2018

👍1

Same problem as @CLIsVeryOK . Any inspiration please? Thanks so much!

WilliamL1 on 5 Aug 2018

@CLIsVeryOK @WilliamL1

I'd recommend writing a simple CLI program or webservice which implements a TF Serving client that interacts with the TF Serving instance over gRPC and submits images to both models using your desired logic. There are quite a few python examples readily available, as well as a few for Golang.
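
The orchestration logic of such a client might look roughly like the sketch below. It deliberately leaves out gRPC: the two model calls are passed in as std::function objects, which in a real client would each wrap a request to one of the served models. Box, DetectFn, RecognizeFn and DetectThenRecognize are hypothetical names for illustration only.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical types: a detected region and the two model calls. In a real
// client each std::function would wrap a request to one served model.
struct Box {
  int x;
  int y;
  int width;
  int height;
};
using DetectFn = std::function<std::vector<Box>(const std::string& image)>;
using RecognizeFn =
    std::function<std::string(const std::string& image, const Box& box)>;

// Runs detection once, then recognition on every detected box, and returns
// only the recognized strings, so callers never see the intermediate boxes.
std::vector<std::string> DetectThenRecognize(const std::string& image,
                                             const DetectFn& detect,
                                             const RecognizeFn& recognize) {
  std::vector<std::string> plates;
  for (const Box& box : detect(image)) {
    plates.push_back(recognize(image, box));
  }
  return plates;
}
```

Keeping the cascade in a thin client or web service like this leaves the server configuration unchanged, at the cost of one extra network round trip per image compared with chaining the models inside the server.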

zachgrayio on 5 Aug 2018

I solved my problem by changing the C++ code in TensorFlow Serving and recompiling it with Bazel, and it works. The file I changed:
serving\tensorflow_serving\servables\tensorflow\predict_impl.cc
@WilliamL1 @zachgrayio

CLIsVeryOK on 6 Aug 2018

Hello
Is there a way to avoid absolute paths for base_path? If there is not, is there a way to make it configurable? Thanks in advance!

rola93 on 23 Aug 2018

I was able to deploy 3 models at a time using Docker and a config file.

I have attached the folder structure on my host machine.
[Screenshot: folder structure on the host machine]

Config file :

model_config_list: {
  config: {
    name: "model1",
    base_path: "/models/allmodels/model1",
    model_platform: "tensorflow"
  },
  config: {
    name: "model2",
    base_path: "/models/allmodels/model2",
    model_platform: "tensorflow"
  },
  config: {
    name: "model3",
    base_path: "/models/allmodels/model3",
    model_platform: "tensorflow"
  }
}

Docker command:
docker run -p 8500:8500 --mount type=bind,source=/Users/rahulkumar/Desktop/allmodels/,target=/models/allmodels -t tensorflow/serving --model_config_file=/models/allmodels/models.config

Hope this helps. :-)

goodrahstar on 20 Dec 2018

👍7 ❤1 🎉1

@goodrahstar I followed the same command, but I am getting an error saying the file was not found, even though the file is there at that path.

Docker command:
docker run -p 8501:8501 --mount type=bind,source=/opt/script/TFServingModelFactory/,target=/models/TFServingModelFactory -t $USER/tensorflow-serving --model_config_file=/opt/script/TFServingModelFactory/models.config

model.config file:
model_config_list {
  config {
    name: "model1"
    base_path: "/models/TFServingModelFactory/model1/"
    model_platform: "tensorflow"
  }
  config {
    name: "model1"
    base_path: "/models/TFServingModelFactory/model1"
    model_platform: "tensorflow"
  }
}

Error:
F tensorflow_serving/model_servers/server.cc:97] Non-OK-status: ParseProtoTextFile(file, &proto) status: Not found: /opt/script/TFServingModelFactory/models.config; No such file or directory
/usr/bin/tf_serving_entrypoint.sh: line 3: 6 Aborted (core dumped) tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"

It works fine when I deploy using tensorflow_model_server.

I am new to this.
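
One plausible reading of the error above: the server runs inside the container, where the bind mount exposes the host directory at /models/TFServingModelFactory, while --model_config_file was given the host path /opt/script/TFServingModelFactory/models.config. The working command earlier in this thread passes the container-side path instead. The small sketch below only illustrates that path translation; ToContainerPath is a hypothetical helper and not part of Docker or TensorFlow Serving.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: maps a host path to the path the container sees,
// given one bind mount (host `source` mounted at container `target`).
// Returns std::nullopt when the host path is not under the mounted directory.
std::optional<std::string> ToContainerPath(const std::string& host_path,
                                           std::string source,
                                           std::string target) {
  // Strip trailing slashes so "/a/b/" and "/a/b" behave the same.
  while (!source.empty() && source.back() == '/') source.pop_back();
  while (!target.empty() && target.back() == '/') target.pop_back();
  // The host path must start with the mount source.
  if (host_path.compare(0, source.size(), source) != 0) return std::nullopt;
  const std::string rest = host_path.substr(source.size());
  // Reject partial-component matches such as "/opt/scripts" vs "/opt/script".
  if (!rest.empty() && rest.front() != '/') return std::nullopt;
  return target + rest;
}
```

With the mount from the command above, the host path /opt/script/TFServingModelFactory/models.config corresponds to /models/TFServingModelFactory/models.config inside the container, which is likely the value the flag needs.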

aashish-0393 on 4 Feb 2019

@goodrahstar Hi, this works great. I used a GPU to load these models, but I'm not sure how to connect to the server from the client side. Do you have any suggestions?

EthannyDing on 10 Sep 2019

I think this tutorial and this one explain how to create the client and, in particular, how to specify which model you are calling.

rola93 on 11 Sep 2019

How can we tie a model to a specific batching config, since that is not a parameter models.config expects?
How does server-side batching work for different models expecting different inputs?
@goodrahstar @chrisolston
TIA

chikubee on 27 Aug 2020
