I am trying to serve my Chinese-to-English model and am having trouble querying it. I receive the following error:
(test) root@ubuntu-c-8-16gib-sfo2-01:~/T2T_Model# t2t-query-server --server=0.0.0.0:9000 --servable_name=transformer --problem=translate_enzh_wmt32k_rev --data_dir=/root/T2T_Model/t2t_data --inputs_once='Hello my name is John.'
Traceback (most recent call last):
  File "/usr/local/bin/t2t-query-server", line 17, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/usr/local/bin/t2t-query-server", line 12, in main
    query.main(argv)
  File "/usr/local/lib/python2.7/dist-packages/tensor2tensor/serving/query.py", line 89, in main
    outputs = serving_utils.predict([inputs], problem, request_fn)
  File "/usr/local/lib/python2.7/dist-packages/tensor2tensor/serving/serving_utils.py", line 157, in predict
    predictions = request_fn(examples)
  File "/usr/local/lib/python2.7/dist-packages/tensor2tensor/serving/serving_utils.py", line 113, in _make_grpc_request
    response = stub.Predict(request, timeout_secs)
  File "/usr/local/lib/python2.7/dist-packages/grpc/_channel.py", line 533, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib/python2.7/dist-packages/grpc/_channel.py", line 467, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Requested more than 0 entries, but params is empty. Params shape: [1,4,8,0,64]
[[{{node transformer/while/GatherNd_32}} = GatherNd[Tindices=DT_INT32, Tparams=DT_FLOAT, _output_shapes=[[?,8,?,?,64]], _device="/job:localhost/replica:0/task:0/device:CPU:0"](transformer/while/Reshape_65, transformer/while/stack)]]"
debug_error_string = "{"created":"@1542086942.107507941","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1017,"grpc_message":"Requested more than 0 entries, but params is empty. Params shape: [1,4,8,0,64]\n\t [[{{node transformer/while/GatherNd_32}} = GatherNd[Tindices=DT_INT32, Tparams=DT_FLOAT, _output_shapes=[[?,8,?,?,64]], _device="/job:localhost/replica:0/task:0/device:CPU:0"](transformer/while/Reshape_65, transformer/while/stack)]]","grpc_status":3}"
>
The model server itself starts up fine, but it logs the same error once it is queried:
(test) root@ubuntu-c-8-16gib-sfo2-01:~/T2T_Model# tensorflow_model_server --port=9000 --model_name=transformer --model_base_path=/root/T2T_Model/t2t_train/translate_enzh_wmt32k/transformer-transformer_base/export
2018-11-13 05:28:29.116290: I tensorflow_serving/model_servers/server.cc:82] Building single TensorFlow model file config: model_name: transformer model_base_path: /root/T2T_Model/t2t_train/translate_enzh_wmt32k/transformer-transformer_base/export
2018-11-13 05:28:29.116412: I tensorflow_serving/model_servers/server_core.cc:461] Adding/updating models.
2018-11-13 05:28:29.116424: I tensorflow_serving/model_servers/server_core.cc:558] (Re-)adding model: transformer
2018-11-13 05:28:29.216782: I tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: transformer version: 1542073770}
2018-11-13 05:28:29.216806: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: transformer version: 1542073770}
2018-11-13 05:28:29.216815: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: transformer version: 1542073770}
2018-11-13 05:28:29.216830: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:363] Attempting to load native SavedModelBundle in bundle-shim from: /root/T2T_Model/t2t_train/translate_enzh_wmt32k/transformer-transformer_base/export/1542073770
2018-11-13 05:28:29.216838: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /root/T2T_Model/t2t_train/translate_enzh_wmt32k/transformer-transformer_base/export/1542073770
2018-11-13 05:28:29.537966: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2018-11-13 05:28:29.597214: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2018-11-13 05:28:29.722289: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:162] Restoring SavedModel bundle.
2018-11-13 05:28:30.139345: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:138] Running MainOp with key saved_model_main_op on SavedModel bundle.
2018-11-13 05:28:30.227063: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:259] SavedModel load for tags { serve }; Status: success. Took 1010210 microseconds.
2018-11-13 05:28:30.227116: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:83] No warmup data file found at /root/T2T_Model/t2t_train/translate_enzh_wmt32k/transformer-transformer_base/export/1542073770/assets.extra/tf_serving_warmup_requests
2018-11-13 05:28:30.227223: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: transformer version: 1542073770}
2018-11-13 05:28:30.229398: I tensorflow_serving/model_servers/server.cc:286] Running gRPC ModelServer at 0.0.0.0:9000 ...
2018-11-13 05:59:38.052592: W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at gather_nd_op.cc:50 : Invalid argument: Requested more than 0 entries, but params is empty. Params shape: [1,4,8,0,64]
My environment:
tensor2tensor (1.10.0)
tensorboard (1.12.0)
tensorflow (1.12.0)
tensorflow-serving-api (1.12.0)
Would appreciate any tips or comments.
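For readers trying to interpret the error message: `Params shape: [1,4,8,0,64]` contains a zero-sized axis, so the tensor passed to GatherNd holds no elements at all, and nothing can be gathered from it. The small sketch below only does the arithmetic on the shape quoted in the log. Reading the axes as something like [batch, beam, heads, length, per-head depth] is an assumption (8 and 64 would match the 8 heads and 512/8 depth of transformer_base), not something the log states.

```python
from functools import reduce
import operator


def num_elements(shape):
    """Total number of entries in a tensor of the given shape."""
    return reduce(operator.mul, shape, 1)


# Shape copied from the error message in the server log.
params_shape = [1, 4, 8, 0, 64]

# Any zero-sized axis makes the whole tensor empty.
zero_axes = [i for i, dim in enumerate(params_shape) if dim == 0]

print(num_elements(params_shape))  # 0
print(zero_axes)                   # [3]
```

So the question to chase is why the axis at index 3 ends up with size 0 for this request.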
I have run into the same problem.
My environment:
tensor2tensor 1.9.0
tensorflow_gpu 1.11.0
tensorflow-serving and tensorflow-serving-api 1.11.0
I have also tried more recent versions of t2t and tf, but that doesn't help either!
@leemingtian curious to know which source and target languages your model is serving
@echan00 translate_enzh_wmt32k_rev
I'm wondering if Chinese -> English is causing the problem.
While debugging another issue earlier, I noticed that in translate_enzh.py (line 243) the source and target vocab files appeared to be the other way around:
source_vocab_filename = os.path.join(data_dir, self.source_vocab_name)
target_vocab_filename = os.path.join(data_dir, self.target_vocab_name)
translate_enzh_wmt32k_rev should be Chinese to English, but I saw that source_vocab_filename was EN and target_vocab_filename was ZH.
@echan00 Yes, I get a correct response when I use translate_enzh_wmt32k. Maybe you are right.
Looks like the problem isn't the one mentioned above.
@leemingtian could you add a print(request) call in serving_utils.py (line 113), as shown below? I'm interested to see how the request differs between translate_enzh_wmt32k and translate_enzh_wmt32k_rev.
def _make_grpc_request(examples):
  """Builds and sends request to TensorFlow model server."""
  request = predict_pb2.PredictRequest()
  request.model_spec.name = servable_name
  request.inputs["input"].CopyFrom(
      tf.contrib.util.make_tensor_proto(
          [ex.SerializeToString() for ex in examples], shape=[len(examples)]))
  print(request)  # print request to see what it looks like
  response = stub.Predict(request, timeout_secs)
  outputs = tf.make_ndarray(response.outputs["outputs"])
  scores = tf.make_ndarray(response.outputs["scores"])
  assert len(outputs) == len(scores)
  return [{
      "outputs": outputs[i],
      "scores": scores[i]
  } for i in range(len(outputs))]
For translate_enzh_wmt32k_rev the request looks like this:
model_spec {
  name: "transformer"
}
inputs {
  key: "input"
  value {
    dtype: DT_STRING
    tensor_shape {
      dim {
        size: 1
      }
    }
    string_val: "\n\021\n\017\n\006inputs\022\005\032\003\n\001\001"
  }
}
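The string_val in that request is a serialized tf.train.Example, so it can be decoded to see which token ids were actually sent. Below is a minimal sketch that does this with a hand-written protobuf wire-format reader, so it runs without TensorFlow installed. The helper names (read_varint, read_fields, unpack_varints, decode_example) are made up for this illustration and are not part of tensor2tensor; the sketch assumes Python 3 and only handles int64_list features.

```python
# Bytes copied from the string_val field of the printed request.
REQUEST_STRING_VAL = b"\n\021\n\017\n\006inputs\022\005\032\003\n\001\001"


def read_varint(buf, pos):
    """Read one base-128 varint starting at pos; return (value, new_pos)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def read_fields(buf):
    """Yield (field_number, payload) for each length-delimited field in buf."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        if key & 7 != 2:
            raise ValueError("this sketch only handles length-delimited fields")
        length, pos = read_varint(buf, pos)
        yield key >> 3, buf[pos:pos + length]
        pos += length


def unpack_varints(buf):
    """Decode a packed run of varints into a list of ints."""
    values, pos = [], 0
    while pos < len(buf):
        value, pos = read_varint(buf, pos)
        values.append(value)
    return values


def decode_example(serialized):
    """Return {feature_name: [ids]} for the int64_list features of an Example."""
    features = {}
    for _, features_msg in read_fields(serialized):   # Example.features
        for _, entry in read_fields(features_msg):    # one map entry per feature
            fields = dict(read_fields(entry))         # 1 = name, 2 = Feature
            ids = []
            for number, payload in read_fields(fields[2]):
                if number == 3:                       # Feature.int64_list
                    for _, packed in read_fields(payload):
                        ids.extend(unpack_varints(packed))
            features[fields[1].decode("utf-8")] = ids
    return features


print(decode_example(REQUEST_STRING_VAL))  # {'inputs': [1]}
```

In other words, this request carries a single "inputs" feature whose only id is 1. In tensor2tensor's text encoders id 1 is normally the reserved EOS token, so if the query text was non-empty it looks as though no subword ids were produced for it. Whether that is what triggers the GatherNd error is a guess and is not confirmed anywhere in this thread.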
I would appreciate it if anybody else could chime in on this issue!
@echan00 I can get a correct Chinese -> English response now.
I changed predict() in serving_utils.py as follows:
def predict(inputs_list, problem, request_fn):
  """Encodes inputs, makes request to deployed TF model, and decodes outputs."""
  assert isinstance(inputs_list, list)
  # Changed: always encode the query with the "targets" encoder.
  #fname = "inputs" if problem.has_inputs else "targets"
  fname = "targets"
  input_encoder = problem.feature_info[fname].encoder
  input_ids_list = [
      _encode(inputs, input_encoder, add_eos=problem.has_inputs)
      for inputs in inputs_list
  ]
  examples = [_make_example(input_ids, problem, fname)
              for input_ids in input_ids_list]
  predictions = request_fn(examples)
  # Changed: decode the model output with the "inputs" encoder.
  #output_decoder = problem.feature_info["targets"].encoder
  output_decoder = problem.feature_info["inputs"].encoder
  outputs = [
      (_decode(prediction["outputs"], output_decoder),
       prediction["scores"])
      for prediction in predictions
  ]
  return outputs
Then run t2t-query-server with the argument "--problem=translate_enzh_wmt32k". (Note: passing --problem=translate_enzh_wmt32k_rev will not work with this change.)
Hope this helps.
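For anyone who wants the idea behind the change without patching their install: the workaround keeps the non-reversed problem, but encodes the query with the "targets" vocabulary and decodes the answer with the "inputs" vocabulary. The toy sketch below illustrates only that swap. ToyEncoder, pick_feature_names and the tiny vocabularies are hypothetical stand-ins invented for this example (Python 3), not tensor2tensor APIs, and the "model" here simply echoes the ids back.

```python
class ToyEncoder(object):
    """Hypothetical stand-in for a text encoder: maps words to ids and back."""

    def __init__(self, vocab):
        self._vocab = list(vocab)

    def encode(self, text):
        return [self._vocab.index(word) for word in text.split()]

    def decode(self, ids):
        return " ".join(self._vocab[i] for i in ids)


def pick_feature_names(model_is_reversed):
    """Return (feature to encode the query with, feature to decode the output with)."""
    if model_is_reversed:
        # Model was trained on the _rev problem: query is in the "targets" language.
        return "targets", "inputs"
    return "inputs", "targets"


# For an en->zh problem, "inputs" is English and "targets" is Chinese.
encoders = {
    "inputs": ToyEncoder(["hello", "world"]),
    "targets": ToyEncoder(["你好", "世界"]),
}

encode_name, decode_name = pick_feature_names(model_is_reversed=True)
query_ids = encoders[encode_name].encode("你好 世界")
fake_output_ids = query_ids  # a real model would return translated ids
print(encoders[decode_name].decode(fake_output_ids))  # hello world
```

A switch like model_is_reversed would avoid hard-coding "targets" and "inputs", so the same code could serve both directions; how to derive that flag from a real problem object is left open here.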
Wow thanks @leemingtian, I'll commit a fix to this repo at some point.