Serving: How to use the custom_model_config to load models at runtime

Created on 28 Mar 2017 · 15 Comments · Source: tensorflow/serving

The TensorFlow Serving advanced tutorial says the models to be loaded can be declared either through model_config_list or through dynamic_model_config, as follows:

ServerCore::Create() takes a ServerCore::Options parameter. Here are a few commonly used options:
  • ModelServerConfig that specifies models to be loaded. Models are declared either through model_config_list, which declares a static list of models, or through dynamic_model_config, which declares a dynamic list of models that may get updated at runtime.

Now, I can load multiple models through model_config_list with a text config file like this:

model_config_list: {
  config: {
    name: "mnist",
    base_path: "/tmp/monitored/inception_model",
    model_platform: "tensorflow"
  },
  config: {
    name: "inception",
    base_path: "/tmp/monitored/inception_model",
    model_platform: "tensorflow"
  }
}

But how do I use dynamic_model_config to load models at runtime? Is there an example or a detailed explanation? I also read the following comments in config/model_server_config.proto. They say the custom model config is fetched dynamically at runtime through network RPC, a custom service, etc. Okay, I am still confused...

// ModelServer config.
message ModelServerConfig {
  // ModelServer takes either a static file-based model config list or an Any
  // proto representing custom model config that is fetched dynamically at
  // runtime (through network RPC, custom service, etc.).

  oneof config {
    ModelConfigList model_config_list = 1;
    google.protobuf.Any custom_model_config = 2;
  }
}

contributions welcome docs

All 15 comments

I still have no idea how to use dynamic_model_config; maybe it's still unsupported? I modified main.cc and server_core.cc in tensorflow_serving/model_servers and implemented the automatic loading function I want.

I have the same issue. I don't see dynamic_model_config anywhere in the code, so it is probably not supported. By the way, the only supported extension is writing plugins to manage new versions:
https://tensorflow.github.io/serving/custom_source
There is a reload config method but I guess it just pulls the new versions, not new models.

Seems we had one tutorial that still had the old name of dynamic_model_config which was renamed to custom_model_config. I'll update the documentation.

A custom_model_config can be used for anything that you can't do with a standard model config that points to models on disk. A common case is if you have your own custom sources for example. It was originally called dynamic because you can configure a source that dynamically changes the list of models to load based on some external signals, but was renamed to something more general (custom_model_config) since it may or may not be dynamic.

In order to use it, you'll need to implement LoadCustomModelConfig (https://github.com/tensorflow/serving/blob/master/tensorflow_serving/model_servers/main.cc#L146) to work with the custom config you defined (it comes in the form of an Any proto).

Thank you for your comment, @kirilg! As you said, besides implementing the LoadCustomModelConfig function, I also need to send some external signals to tell the TensorFlow model server to call LoadCustomModelConfig at runtime. Could you explain more about how to send the signals? Is there any example code or tutorial document?

I see the LoadCustomModelConfig function has an argument with type EventBus, does it have something to do with the signaling?

@kirilg I think the dynamic case from a file should be available by default, since it helps with a running server instead of redeploying with static files. For general sources, it should be up to users to implement it for their needs. I would be happy to do it if I knew the details (I am new to TensorFlow). A common useful case would be loading from S3 or something like that.

I would also be very curious to see a clean way to load new models dynamically at runtime. Since we can already load new versions dynamically, we should have all the components. The missing part for me is how we can trigger the server to load a new model.
I think there are three main options:

  • Instead of watching only for new versions on the file system, also watch the parent directory for a new folder that signals a new model.
  • If we use the 'custom_model_config', periodically check it for changes and see whether a new model is in there (or one is missing).
  • Have a gRPC call that sends the required information for the new model.

Does anyone have experience with which way is best, or know whether one of them is already supported? Since I'm new to TF Serving, it is also not easy for me to see the best way to integrate this and where to start.
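For the second option (periodically checking a config for changes), the change detection itself can be sketched independently of TensorFlow Serving. The class below is a hypothetical helper, not part of the Serving code base: `ConfigPoller`, `pollOnce`, and `start` are names invented for illustration, and the config is treated as an opaque string that could come from a file, an RPC, or any other source. What to do with a changed config (for example, triggering a reload) is left to the callback.

```java
import java.util.Objects;
import java.util.Optional;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper (not part of TensorFlow Serving): polls a config source for changes. */
final class ConfigPoller {
    private final Supplier<String> source; // e.g. reads a file or calls a remote service
    private String lastSeen;               // null until the first poll

    ConfigPoller(Supplier<String> source) {
        this.source = source;
    }

    /** Returns the current config if it differs from the previous poll, otherwise empty. */
    synchronized Optional<String> pollOnce() {
        String current = source.get();
        if (Objects.equals(current, lastSeen)) {
            return Optional.empty();
        }
        lastSeen = current;
        return Optional.ofNullable(current); // a null config is reported as "nothing to apply"
    }

    /** Polls every periodSeconds on a background thread and hands changed configs to onChange. */
    ScheduledExecutorService start(long periodSeconds, Consumer<String> onChange) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                pollOnce().ifPresent(onChange);
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so log and continue.
                System.err.println("config poll failed: " + e);
            }
        }, 0, periodSeconds, TimeUnit.SECONDS);
        return scheduler; // the caller is responsible for calling shutdown()
    }
}
```

Comparing full contents rather than timestamps keeps the sketch independent of where the config lives; for large configs, a hash comparison would be a reasonable substitute.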

@markusnagel The first option is easy to build on the existing code, and I have implemented it. I just added a thread to the TensorFlow model server that checks the model parent directory periodically. Once new models are detected, it rebuilds the ModelServerConfig and calls the ReloadConfig method of ServerCore. The model server then does the rest by itself. Furthermore, calling ReloadConfig doesn't interfere with other already loaded models. I'm not sure if it's a good way, but it's easy enough and meets my requirements for now.
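The directory-scanning half of that approach can be sketched outside the server. This is not the code described above (that lived in a modified main.cc and server_core.cc and was not posted); it is a minimal Java illustration under two assumptions: every immediate subdirectory of the parent directory is one model, and the directory name doubles as the model name. `ModelDirScanner` and its methods are hypothetical names. The output uses the same model_config_list text format shown in the question.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper (not part of TensorFlow Serving). */
final class ModelDirScanner {

    /** Returns the sorted names of the immediate subdirectories of parentDir. */
    static List<String> scanModelNames(Path parentDir) {
        try (Stream<Path> children = Files.list(parentDir)) {
            return children
                    .filter(Files::isDirectory)
                    .map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Renders a model_config_list in text format, one config entry per model.
     * Names are inserted as-is; names containing quotes are not escaped.
     */
    static String renderConfig(Path parentDir, List<String> modelNames) {
        StringBuilder sb = new StringBuilder("model_config_list: {\n");
        for (String name : modelNames) {
            sb.append("  config: {\n")
              .append("    name: \"").append(name).append("\",\n")
              .append("    base_path: \"").append(parentDir.resolve(name)).append("\",\n")
              .append("    model_platform: \"tensorflow\"\n")
              .append("  }\n");
        }
        return sb.append("}\n").toString();
    }
}
```

Calling `scanModelNames` on a timer and comparing the result with the previous scan is enough to detect a new model directory; the rendered config always lists every model found, so already loaded models stay in the list.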

OK, great, thanks for the answer @Hello-Frankey. Would you mind sharing some code? My C++ is a bit rusty by now, and I don't yet fully understand all parts of TF Serving and what implications each has...

I would be very interested in this code too. I think this would also be a great pull request to the main repo since that is probably what everyone needs. Running the server all the time to serve different models and different versions thereof while being able to swap in new models and new versions without down time, that sounds like what a server should do.

@Hello-Frankey I ran into the same situation, and I wonder if you still have the code change for this issue. Would you mind sharing it with me?

First, compile the Serving Java API with the help of https://github.com/alexxyjiang/tensorflow-serving-api

Then use the Java API to reload the configuration at runtime:

// Requires grpc-java and the Java classes generated from the TensorFlow Serving
// protos (ModelServerConfigOuterClass, ModelManagement, ModelServiceGrpc).
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public static void addNewModel() {

    // One ModelConfig per model the server should serve.
    // model_platform replaces the deprecated model_type field.
    ModelServerConfigOuterClass.ModelConfig modelConfig1 =
            ModelServerConfigOuterClass.ModelConfig.newBuilder()
                    .setBasePath("/models/new_model")
                    .setName("new_model")
                    .setModelPlatform("tensorflow")
                    .build();

    ModelServerConfigOuterClass.ModelConfig modelConfig2 =
            ModelServerConfigOuterClass.ModelConfig.newBuilder()
                    .setBasePath("/models/beidian_cart_ctr_wdl_model")
                    .setName("beidian_cart_ctr_wdl_model")
                    .setModelPlatform("tensorflow")
                    .build();

    // Collect the models into a ModelConfigList and wrap it in a ModelServerConfig.
    ModelServerConfigOuterClass.ModelConfigList modelConfigList =
            ModelServerConfigOuterClass.ModelConfigList.newBuilder()
                    .addConfig(modelConfig1)
                    .addConfig(modelConfig2)
                    .build();

    ModelServerConfigOuterClass.ModelServerConfig modelServerConfig =
            ModelServerConfigOuterClass.ModelServerConfig.newBuilder()
                    .setModelConfigList(modelConfigList)
                    .build();

    ModelManagement.ReloadConfigRequest reloadConfigRequest =
            ModelManagement.ReloadConfigRequest.newBuilder()
                    .setConfig(modelServerConfig)
                    .build();

    // Send the reload request to the model server's gRPC port.
    ManagedChannel channel = ManagedChannelBuilder.forAddress("10.2.176.43", 8500).usePlaintext().build();
    try {
        ModelServiceGrpc.ModelServiceBlockingStub modelServiceBlockingStub =
                ModelServiceGrpc.newBlockingStub(channel);
        ModelManagement.ReloadConfigResponse reloadConfigResponse =
                modelServiceBlockingStub.handleReloadConfigRequest(reloadConfigRequest);

        // The error message is empty when the reload succeeded.
        System.out.println(reloadConfigResponse.getStatus().getErrorMessage());
    } finally {
        // Close the channel even if the RPC throws.
        channel.shutdownNow();
    }
}
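One thing to watch with this approach: the reload request carries a complete ModelServerConfig, and as far as I understand the TensorFlow Serving configuration docs, the server realizes only the newly supplied config, so a model left out of the list gets unloaded. A client that wants to add models one at a time therefore has to remember what is already being served. A minimal sketch of that bookkeeping follows; `ModelRegistry` is a hypothetical class, not part of any Serving API, and it stores only the name and base path.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical client-side bookkeeping of the models the server should serve. */
final class ModelRegistry {
    // Model name -> base path, kept in insertion order for stable config output.
    private final Map<String, String> basePathByName = new LinkedHashMap<>();

    /** Adds or replaces a model and returns a snapshot of the complete list. */
    synchronized Map<String, String> add(String name, String basePath) {
        basePathByName.put(name, basePath);
        return new LinkedHashMap<>(basePathByName);
    }

    /** Removes a model (if present) and returns a snapshot of the complete list. */
    synchronized Map<String, String> remove(String name) {
        basePathByName.remove(name);
        return new LinkedHashMap<>(basePathByName);
    }
}
```

Each snapshot can then be turned into one ModelConfig per entry, in the same way addNewModel() builds them above, before sending a single reload request.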

@lebron374 thanks, I would like to have a try.

@lebron374, what is "ModelServiceGrpc"? I can't find it. Would you mind telling me? Thank you~

Thank you for the interest in this feature, everyone. A simple implementation will be added soon. Please follow #1301 for updates.
