Serving: Aspired servables/versions aren't refreshed following config updates

Created on 18 Dec 2019 · 4 Comments · Source: tensorflow/serving

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04, RHEL 8
TensorFlow Serving installed from (source or binary): Both
TensorFlow Serving version: 1.15.0

Describe the problem

Currently, when the configured model list is updated via a call to handleReloadConfigRequest, the request thread blocks until any newly added models become available.

Their availability, however, depends on the filesystem polling thread rescanning the filesystem at its periodic interval, so there is an arbitrary delay (up to a full polling interval) before the requested changes actually take effect and the RPC returns.

This problem may not be very noticeable with the default polling interval of 1 second, but it is undesirable for longer intervals. In particular, it makes API-based dynamic reconfiguration incompatible with the --file_system_poll_wait_seconds=0 setting: in that case, all handleReloadConfigRequest calls time out and do not take effect.
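To make the interaction concrete, here is a toy Python model of the behaviour described above. It is not TensorFlow Serving code: the ToyServer class and its methods are hypothetical, and it assumes that a poll wait of 0 means a single scan at startup with no periodic rescans. It only shows why a reload request that waits on the poller cannot complete when the poller never runs again.

```python
import threading
import time


class ToyServer:
    """Toy model of the reported behaviour; not TensorFlow Serving code."""

    def __init__(self, poll_wait_seconds):
        self.poll_wait_seconds = poll_wait_seconds
        self.configured = set()  # models named in the current config
        self.available = set()   # models the poller has discovered
        self._cond = threading.Condition()
        # Assumption: the filesystem is always scanned once at startup.
        self._poll_once()
        # Assumption: a wait of 0 means no periodic rescans after that.
        if poll_wait_seconds > 0:
            threading.Thread(target=self._poll_loop, daemon=True).start()

    def _poll_once(self):
        # A "scan" makes every currently configured model available.
        with self._cond:
            self.available = set(self.configured)
            self._cond.notify_all()

    def _poll_loop(self):
        while True:
            time.sleep(self.poll_wait_seconds)
            self._poll_once()

    def reload_config(self, models, deadline_seconds):
        """Update the config, then block until the poller catches up.

        Returns True if all models became available before the deadline,
        and False if the deadline expired first.
        """
        with self._cond:
            self.configured = set(models)
            return self._cond.wait_for(
                lambda: self.configured <= self.available,
                timeout=deadline_seconds,
            )


if __name__ == "__main__":
    periodic = ToyServer(poll_wait_seconds=0.2)
    # Returns only after the next periodic scan has run.
    print(periodic.reload_config({"my_model"}, deadline_seconds=2.0))

    one_shot = ToyServer(poll_wait_seconds=0)
    # No further scans happen, so the request waits out its deadline.
    print(one_shot.reload_config({"my_model"}, deadline_seconds=0.5))
```

In this sketch, the request latency in the periodic case is bounded by the polling interval rather than by how long the model takes to load. In the one-shot case the request can never succeed, which mirrors the timeout reported here.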

Exact Steps to Reproduce

1. Start tensorflow_model_server with the --file_system_poll_wait_seconds=0 option and an empty initial config (no models).
2. Call the handleReloadConfigRequest API with a ModelListConfig containing a (valid) new model. The call hangs indefinitely, or until the gRPC deadline expires.

I have opened PR #1518 with a proposed fix.

