Hi,
I've compiled TensorFlow Serving with CUDA on a P2 instance on Amazon. Everything compiled fine (after a few touch-ups covered in the issues). However, when I try to run the mnist_client after training and loading the mnist model, I get:
AbortionError(code=StatusCode.NOT_FOUND, details="FeedInputs: unable to find feed output images")
Any ideas?
Having same issue here, help needed.
I followed the tutorial and ran the following lines:
bazel build //tensorflow_serving/example:mnist_export
bazel-bin/tensorflow_serving/example/mnist_export /tmp/mnist_model
bazel build //tensorflow_serving/model_servers:tensorflow_model_server
bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_name=mnist --model_base_path=/tmp/mnist_model/
Then, in another session:
bazel build //tensorflow_serving/example:mnist_client
bazel-bin/tensorflow_serving/example/mnist_client --num_tests=1000 --server=localhost:9000
Having same issue here, help needed.
My server appears to start properly:
I tensorflow_serving/model_servers/main.cc:117] Building single TensorFlow model file config: model_name: mnist model_base_path: /tmp/monitored model_version_policy: 0
I tensorflow_serving/model_servers/server_core.cc:319] Adding/updating models.
I tensorflow_serving/model_servers/server_core.cc:364] (Re-)adding model: mnist
I tensorflow_serving/core/basic_manager.cc:693] Successfully reserved resources to load servable {name: mnist version: 1}
I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: mnist version: 1}
I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: mnist version: 1}
I external/org_tensorflow/tensorflow/contrib/session_bundle/session_bundle.cc:161] Attempting to load a SessionBundle from: /tmp/monitored/00000001
I external/org_tensorflow/tensorflow/contrib/session_bundle/session_bundle.cc:162] Using RunOptions:
E external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:925] OpKernel ('op: "NegTrain" device_type: "CPU"') for unknown op: NegTrain
E external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:925] OpKernel ('op: "Skipgram" device_type: "CPU"') for unknown op: Skipgram
I external/org_tensorflow/tensorflow/contrib/session_bundle/session_bundle.cc:135] Running restore op for SessionBundle: save/restore_all, save/Const:0
I external/org_tensorflow/tensorflow/contrib/session_bundle/session_bundle.cc:150] Running init op for SessionBundle
I external/org_tensorflow/tensorflow/contrib/session_bundle/session_bundle.cc:244] Loading SessionBundle: success. Took 112097 microseconds.
I tensorflow_serving/servables/tensorflow/saved_model_bundle_factory.cc:66] Wrapping session to perform batch processing
I tensorflow_serving/servables/tensorflow/bundle_factory_util.cc:153] Wrapping session to perform batch processing
I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: mnist version: 1}
I tensorflow_serving/model_servers/main.cc:177] Running ModelServer at 0.0.0.0:9000 ...
But I still get the same problem.
OK, this seems odd, but it looks like --use_saved_model defaults to true.
Try passing --use_saved_model=false when starting the server; that seems to work for me.
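For anyone scripting the server start, here is a minimal sketch that assembles the same command line used earlier in this thread with the flag added. The helper name model_server_command is hypothetical, the paths and model name are the ones from the tutorial commands above, and nothing here is part of TensorFlow Serving itself.

```python
def model_server_command(model_name, model_base_path, port=9000,
                         use_saved_model=False):
    """Build the argument list for the model server binary from this thread.

    Hypothetical helper for illustration only; it just formats flags.
    """
    return [
        "bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server",
        "--port={}".format(port),
        "--model_name={}".format(model_name),
        "--model_base_path={}".format(model_base_path),
        # The workaround suggested above: pass the flag explicitly as false.
        "--use_saved_model={}".format("true" if use_saved_model else "false"),
    ]


command = model_server_command("mnist", "/tmp/mnist_model/")
print(" ".join(command))
```

The list form can be handed to subprocess.Popen directly, which avoids shell quoting problems if the model path contains spaces.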
It works! thanks !
Thanks @perdasilva. However, this time I got an error on the client:
$ bazel-bin/tensorflow_serving/example/mnist_client --num_tests=1000 --server=localhost:9000
D0118 03:46:40.536165687 18608 ev_posix.c:101] Using polling engine: poll
Extracting /tmp/train-images-idx3-ubyte.gz
Extracting /tmp/train-labels-idx1-ubyte.gz
Extracting /tmp/t10k-images-idx3-ubyte.gz
Extracting /tmp/t10k-labels-idx1-ubyte.gz
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/local/lib/python2.7/dist-packages/grpc/_channel.py", line 658, in _call_spin
completed_call = event.tag(event)
File "/usr/local/lib/python2.7/dist-packages/grpc/_channel.py", line 174, in handle_event
callback()
File "/usr/local/lib/python2.7/dist-packages/grpc/_channel.py", line 294, in <lambda>
self._state.callbacks.append(lambda: fn(self))
File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 135, in <lambda>
self._future.add_done_callback(lambda ignored_callback: fn(self))
File "/home/liangcc/serving/bazel-bin/tensorflow_serving/example/mnist_client.runfiles/tf_serving/tensorflow_serving/example/mnist_client.py", line 107, in _callback
exception = result_future.exception()
File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 120, in exception
return _abortion_error(rpc_error_call)
File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 74, in _abortion_error
code = rpc_error_call.code()
AttributeError: 'NoneType' object has no attribute 'code'
@ChihChengLiang I don't know if your problem is related to this. Did you install all of the dependencies as per the documentation?
I built my environment by following the tensorflow/serving Dockerfile.
On Ubuntu 14.04:
sudo su
apt-get update && apt-get install -y \
build-essential \
curl \
git \
libfreetype6-dev \
libpng12-dev \
libzmq3-dev \
pkg-config \
python-dev \
python-numpy \
python-pip \
software-properties-common \
swig \
zip \
zlib1g-dev \
libcurl3-dev
curl -fSsL -O https://bootstrap.pypa.io/get-pip.py && python get-pip.py
pip install enum34 futures mock six
pip install --pre 'protobuf>=3.0.0a3'
pip install -i https://testpypi.python.org/simple --pre grpcio
add-apt-repository -y ppa:openjdk-r/ppa
apt-get update
apt-get install -y openjdk-8-jdk openjdk-8-jre-headless
export BAZEL_VERSION=0.4.2
mkdir /bazel && \
cd /bazel && \
curl -fSsL -O https://github.com/bazelbuild/bazel/releases/download/$BAZEL_VERSION/bazel-$BAZEL_VERSION-installer-linux-x86_64.sh && \
curl -fSsL -o /bazel/LICENSE.txt https://raw.githubusercontent.com/bazelbuild/bazel/master/LICENSE.txt && \
chmod +x bazel-*.sh && \
./bazel-$BAZEL_VERSION-installer-linux-x86_64.sh && \
cd / && \
rm -f /bazel/bazel-$BAZEL_VERSION-installer-linux-x86_64.sh
git clone --recurse-submodules https://github.com/tensorflow/serving
cd serving
cd tensorflow
./configure
cd ..
bazel build tensorflow_serving/...
@ChihChengLiang yeah, I'm really sorry but idk what the problem is =S
It turns out that I hit a known issue in gRPC, which is fixed at HEAD.
I followed the Dockerfile to build the latest gRPC from source, and it works:
git clone -b master https://github.com/grpc/grpc
cd grpc
git submodule update --init
pip install -r requirements.txt
GRPC_PYTHON_BUILD_WITH_CYTHON=1 pip install .
reference: https://github.com/tensorflow/serving/issues/122
Thanks @perdasilva
@ChihChengLiang Nice one! Thanks for posting the fix =D
@ChihChengLiang, the referenced issue is dated Jul 2016. Is the fix still not in the latest gRPC release? Thanks!
Passing --use_saved_model=false when starting the server
resolved AbortionError(code=StatusCode.NOT_FOUND, details="FeedInputs: unable to find feed output images") for me.
I get the same error, AbortionError(code=StatusCode.NOT_FOUND, details="FeedInputs: unable to find feed output inputs"),
when running a simple model with just float features.
Here is the code that trains and exports the model:
def create_model(column_names, model_dir):
    # Specify that all features have real-value data
    feature_columns = [
        feature_column_lib.real_valued_column('feature', dimension=len(column_names))
    ]
    # Build Classifier
    classifier = tf.contrib.learn.LinearClassifier(feature_columns=feature_columns,
                                                   model_dir=model_dir)
    return classifier

def input_fn(df):
    if "__label" in df:
        label = tf.constant(df["__label"].values)
        df_without_label = df.drop("__label", axis=1)
        #feature_cols = {k: tf.constant(df_without_label[k].values) for k in df_without_label.keys()}
        feature_cols = {'feature': constant_op.constant(df_without_label.values, dtype=dtypes.float32)}
    else:
        feature_cols = {'feature': constant_op.constant(df.values, dtype=dtypes.float32)}
        label = None
    return feature_cols, label

def train_and_evaluate_model(training_run_data, model_dir):
    tf.logging.set_verbosity(tf.logging.INFO)
    print "Input training set size: {}".format(len(training_run_data["df_train"]))
    training_set, validation_set = get_training_validation_set(training_run_data, True, 0.4)
    print "Selected training set size: {}, validation set size: {}".format(len(training_set), len(validation_set))
    training_y = training_set["__label"].values
    training_set_sorted = training_set.drop("__label", axis=1).sort_index(axis=1)
    training_x = training_set_sorted.values
    print "Nans in target: {}".format(np.any(np.isnan(training_y)))
    print "Nans in x: {}".format(np.any(np.isnan(training_x)))
    classifier = create_model(training_set_sorted.keys(), model_dir)
    # Fit model.
    classifier.fit(input_fn=lambda: input_fn(training_set), steps=2000)
    # Evaluate accuracy.
    eval_score = classifier.evaluate(input_fn=lambda: input_fn(validation_set), steps=10)
    for key, val in eval_score.items():
        print "{}: {}".format(key, val)
    return classifier

def export_saved_model(model, training_df, export_dir_base):
    def fix_feature_spec(feature_spec):
        for key, feature in feature_spec.items():
            if isinstance(feature, tf.VarLenFeature):
                print("fixing feature %s " % key)
                feature_spec[key] = tf.FixedLenFeature(shape=[1], dtype=feature.dtype, default_value=None)
    training_df_sorted = training_df.drop("__label", axis=1).sort_index(axis=1)
    feature_columns = [
        feature_column_lib.real_valued_column('feature', dimension=len(training_df_sorted.keys()))
    ]
    feature_spc = feature_column_lib.create_feature_spec_for_parsing(feature_columns)
    fix_feature_spec(feature_spc)
    print(feature_spc)
    serving_input_fn = input_fn_utils.build_parsing_serving_input_fn(feature_spc)
    print(serving_input_fn())
    print(export_dir_base)
    save_result = model.export_savedmodel(export_dir_base,
                                          serving_input_fn)
    return save_result

def create_process_model():
    conf_name = "ProcessValidBoolean"
    trd = prep.extract_process_training_run_data(
        extract.get_data_selection_config(1000, 1000, '2016/09/15', '2017/02/15', '2017/03/15'),
        extract.get_extract_config(conf_name))
    mdl = train_and_evaluate_model(trd, "models/process_valid/saved_model_test_4")
    export_saved_model(mdl, trd["df_train"], "saved_models/process_valid/v_3")
And here is a snippet from the client (using the mnist client as boilerplate):
host, port = hostport.split(':')
channel = implementations.insecure_channel(host, int(port))
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)
result_counter = _ResultCounter(num_tests, concurrency)
for _ in range(num_tests):
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'process_validation1'
    request.model_spec.signature_name = 'serving_default'
    data, label = [0,0,0,0,0,0,0,0], 0
    # construct the Example proto object
    example = tf.train.Example(
        # Example contains a Features proto object
        features=tf.train.Features(
            # Features contains a map of string to Feature proto objects
            feature={
                'feature': tf.train.Feature(float_list=tf.train.FloatList(value=data))
            }))
    # use the proto object to serialize the example to a string
    serialized = example.SerializeToString()
    request.inputs['inputs'].CopyFrom(
        tf.contrib.util.make_tensor_proto(serialized, shape=[1]))
    result_counter.throttle()
    result_future = stub.Predict.future(request, 5.0)  # 5 seconds
    result_future.add_done_callback(
        _create_rpc_callback(label, result_counter))
I cannot figure out why this happens!
Just use pip install grpcio nowadays. It includes the fix for the NoneType problem.
On the 14.04 image I get /usr/lib/python2.7/dist-packages (from protobuf>=3.2.0->grpcio)
and the tutorial works as expected.
How can I solve this problem? I need help:
E tensorflow/examples/label_image/main.cc:305] Running model failed: Not found: FeedInputs: unable to find feed output input
Thank you
@fventer Have you found a solution? I have the same problem. I think the serialized example that the client sends to the server doesn't match the server's input example signature for some reason.
I could not find a solution yet.
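One way to test that hypothesis is to compare the keys the client puts in request.inputs with the input names the exported signature declares. The sketch below is plain Python with no TensorFlow dependency; missing_input_keys is a hypothetical helper, and the signature key shown is an assumed placeholder, not a value taken from this thread. The real keys would have to be read from the exported model.

```python
def missing_input_keys(request_keys, signature_input_keys):
    """Return the request input keys that the signature does not declare."""
    return sorted(set(request_keys) - set(signature_input_keys))


# The client snippet above sends a single input under the key 'inputs'.
request_keys = ["inputs"]
# Assumption for illustration only: replace with the keys of the
# 'serving_default' signature read from the exported model.
signature_input_keys = ["examples"]

unknown = missing_input_keys(request_keys, signature_input_keys)
if unknown:
    print("Request keys not declared by the signature: {}".format(unknown))
else:
    print("All request keys are declared by the signature.")
```

If the keys do match, the mismatch is probably elsewhere (for example in how the server loads the model), so this only rules one cause in or out.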
Defining an input layer worked for me (--input_layer=Mul)
Sorry, not yet. I was focusing on other work for a while. I will post an answer after coming back to this issue and solving it.
Hello Guys! I'm facing the same problem: E tensorflow/examples/label_image/main.cc:305] Running model failed: Not found: FeedInputs: unable to find feed output input. I'm running the retraining demo with flowers.
System: Ubuntu 16.04
Python = 3.6
I installed TensorFlow by following the conda tutorial.
Thanks!
Hey guys, I found the solution to the problem above:
bazel build tensorflow/examples/label_image:label_image && \
bazel-bin/tensorflow/examples/label_image/label_image \
--graph=/tmp/output_graph.pb --labels=/tmp/output_labels.txt \
--output_layer=final_result \
--image=$HOME/flower_photos/daisy/21652746_cc379e0eea_m.jpg \
--input_layer=Mul
Here --input_layer and --output_layer set the input and output layer names to "Mul" and "final_result", respectively.
Thanks @davidsmandrade, that worked for me! Just curious: how did you deduce that the input_layer should be "Mul"?
When the Inception v3 model is retrained, the input layer name is updated; see here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py#L870
The tutorial still does not mention that.
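Since the tutorial does not say which name to use, a quick sanity check is to compare the layer name you plan to pass against the node names in the graph before running label_image. Below is a minimal sketch in plain Python. check_layer_name is a hypothetical helper, and the node list is made up for illustration; in practice the names would come from the GraphDef in /tmp/output_graph.pb (for example [n.name for n in graph_def.node]).

```python
import difflib


def check_layer_name(layer, node_names):
    """Return (found, suggestions) for a layer name against graph node names."""
    names = list(node_names)
    if layer in names:
        return True, []
    # Offer a few similarly spelled node names when there is no exact match.
    return False, difflib.get_close_matches(layer, names, n=3, cutoff=0.3)


# Hypothetical node names, not read from a real graph.
node_names = ["DecodeJpeg/contents", "Mul", "pool_3", "final_result"]

for layer in ("input", "Mul", "final_result"):
    found, suggestions = check_layer_name(layer, node_names)
    print(layer, found, suggestions)
```

A name that is not found here is the kind of name that would produce the "unable to find feed output" error quoted above.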
Closing due to staleness. If this is still an issue, please file a new updated issue with current steps to reproduce the bug. If this is a question, please ask it on:
https://stackoverflow.com/questions/tagged/tensorflow-serving
Thanks!