_From @kimamula on April 3, 2018 17:17_
tfjs-converter version: 0.1.0
Browser: Chrome 65.0.3325.181 (64-bit)
When I execute a model (i.e., `FrozenModel#execute()`) that was converted by tensorflowjs_converter and loaded with `loadFrozenModel()`, it fails with the error `Error in matMul: inputs must be rank 2, got ranks 1 and 2`.
I compared the .pb files before and after the conversion and found that a Reshape operation is removed during the conversion.
Before conversion, the relevant part of the .pb file is Squeeze -> Reshape -> PlaceholderWithDefault -> MatMul (raw proto dump summarized below as node name, op, and inputs):

```
MobilenetV1/Logits/SpatialSqueeze (Squeeze)
    inputs: MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd
MobilenetV1/Predictions/Reshape/shape (Const)
MobilenetV1/Predictions/Reshape (Reshape)
    inputs: MobilenetV1/Logits/SpatialSqueeze, MobilenetV1/Predictions/Reshape/shape
input_1/BottleneckInputPlaceholder (PlaceholderWithDefault)
    inputs: MobilenetV1/Predictions/Reshape
...
final_training_ops/Wx_plus_b/MatMul (MatMul)
    inputs: input_1/BottleneckInputPlaceholder, final_training_ops/weights/final_weights/read
...
```

After conversion it is Squeeze -> PlaceholderWithDefault -> MatMul (the Reshape disappeared):

```
MobilenetV1/Logits/SpatialSqueeze (Squeeze)
    inputs: MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd
input_1/BottleneckInputPlaceholder (PlaceholderWithDefault)
    inputs: MobilenetV1/Logits/SpatialSqueeze
final_training_ops/Wx_plus_b/MatMul (MatMul)
    inputs: input_1/BottleneckInputPlaceholder, final_training_ops/weights/final_weights
...
```
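To make the failure concrete, here is a minimal plain-Python sketch (no TensorFlow required) of the shape bookkeeping involved. The logits shape `[1, 1, 1, 1001]` is an assumption based on the retrained MobileNet architecture, and the helper names are illustrative, not real TF APIs:

```python
# Shape rules mirroring what Squeeze, Reshape, and MatMul do.

def squeeze(shape):
    """Squeeze with no squeeze_dims: drop every size-1 dimension."""
    return [d for d in shape if d != 1]

def reshape(shape, target):
    """Resolve a single -1 in `target` from the total element count."""
    total = 1
    for d in shape:
        total *= d
    known = 1
    for d in target:
        if d != -1:
            known *= d
    return [total // known if d == -1 else d for d in target]

logits = [1, 1, 1, 1001]           # assumed logits shape
squeezed = squeeze(logits)          # rank 1
restored = reshape(squeezed, [-1, 1001])  # rank 2 again

# MatMul requires rank-2 inputs, so feeding `squeezed` directly into it
# (the pruned graph) fails, while `restored` (the original graph) works.
```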
I confirmed that when I modify matrices_executor.ts as follows, the loaded model works as expected.

```diff
 export let executeOp: OpExecutor =
     (node: Node, tensorMap: NamedTensorsMap): tfc.Tensor[] => {
       switch (node.op) {
         case 'matMul':
+          const a = getParamValue('a', node, tensorMap) as tfc.Tensor2D;
+          const b = getParamValue('b', node, tensorMap) as tfc.Tensor2D;
+          if (a.rank === 1 && b.rank === 2) {
+            return [tfc.vectorTimesMatrix(a, b)];
+          }
           return [tfc.matMul(
-              getParamValue('a', node, tensorMap) as tfc.Tensor2D,
-              getParamValue('b', node, tensorMap) as tfc.Tensor2D,
+              a, b,
               getParamValue('transposeA', node, tensorMap) as boolean,
               getParamValue('transposeB', node, tensorMap) as boolean)];
         case 'transpose':
           return [tfc.transpose(
               getParamValue('x', node, tensorMap) as tfc.Tensor,
               getParamValue('perm', node, tensorMap) as number[])];
         default:
           throw TypeError(`Node type ${node.op} is not implemented`);
       }
     };
```
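For reference, the `vectorTimesMatrix` fallback in the patch above computes a rank-1 times rank-2 product, yielding a rank-1 result. A minimal plain-Python sketch of that semantics (function name and values here are illustrative, not the tfjs implementation):

```python
def vector_times_matrix(v, m):
    """[n] vector times [n, k] matrix -> [k] vector."""
    n, k = len(m), len(m[0])
    assert len(v) == n, "inner dimensions must match"
    return [sum(v[i] * m[i][j] for i in range(n)) for j in range(k)]

# Multiplying by the identity matrix returns the vector unchanged.
vector_times_matrix([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])
```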
I prepared my original model by retraining MobileNet on my own categories as described in the TensorFlow For Poets codelab.
Then I converted the resulting model to the SavedModel format with the following script.
```python
import tensorflow as tf
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import tag_constants

export_dir = 'path/to/saved_model'
graph_pb = 'path/to/original_pb'

builder = tf.saved_model.builder.SavedModelBuilder(export_dir)

with tf.gfile.GFile(graph_pb, 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

with tf.Session(graph=tf.Graph()) as sess:
    tf.import_graph_def(graph_def, name='')
    g = tf.get_default_graph()
    inp = g.get_tensor_by_name('input:0')
    out = g.get_tensor_by_name('final_result:0')
    predict_signature = tf.saved_model.signature_def_utils.predict_signature_def(
        {'input': inp}, {'output': out})
    builder.add_meta_graph_and_variables(
        sess, [tag_constants.SERVING],
        signature_def_map={
            signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: predict_signature
        })

builder.save()
```
Finally, I converted the SavedModel with tensorflowjs_converter as follows.
```sh
tensorflowjs_converter \
    --input_format=tf_saved_model \
    --saved_model_tags=serve \
    --output_node_names="final_result" \
    path/to/saved_model \
    path/to/output
```
_Copied from original issue: tensorflow/tfjs-core#919_
I forgot to mention that the TensorFlow version is 1.7.0.
@kimamula is it possible for you to share the saved model with us, so we can investigate why the reshape node is removed? Thanks!
@pyu10055 Sure! I uploaded my saved model here.
Thanks @kimamula, can you also share the variables/ directory? The variables/ directory should have the checkpoint files.
The Python script I pasted above emitted an empty variables/ directory.
@pyu10055 I followed your advice at https://groups.google.com/a/tensorflow.org/forum/#!topic/tfjs/ztXLl4dcy04 and found that if I remove some of the optimizers from tf_saved_model_conversion.py as follows, the Reshape operation remains.
```python
rewriter_config.optimizers[:] = [
    'pruning', 'constfold', 'arithmetic'
]
```
However, the resulting model still triggers the matMul error.
When I further remove the arithmetic optimizer as follows, the error disappears.

```python
rewriter_config.optimizers[:] = [
    'pruning', 'constfold'
]
```
Removing only the arithmetic optimizer also works:

```python
rewriter_config.optimizers[:] = [
    'pruning', 'constfold', 'dependency', 'pruning',
    'constfold', 'dependency'
]
```
Thank you @kimamula for the investigation; I will check with the Grappler team on fixing this.
Ping, would we be able to turn off the grappler optimization with a flag until we solve this?
Hi @pyu10055, is there any update from the Grappler team? I'm working around this issue by removing the arithmetic optimizer for now, but I'm wondering whether a cleaner solution might be available in the near future.
@tushuhei no update yet, will keep you posted when I hear anything.
@pyu10055 Thank you, that's helpful.
@nsthorat I'm debugging this now.
So far, it looks like the removal of reshape is correct with the (internal) code at head, and the graph runs without error. We fixed several shape inference bugs in TF in early April, which caused Grappler to materialize incorrect shapes, and which may be the cause of what you see.
Thanks @rmlarsen. The model @kimamula shared uses a deprecated param name for the Squeeze op, which was being ignored by FrozenModel. I just added support for that, and the model now executes without error. @tushuhei you can check out the latest converter or wait for the 0.2.0 release.
Thank you @pyu10055! Unfortunately, I'm still seeing the same issue with my model, which was trained with retrain.py in TensorFlow 1.7. I'm generating the model by running a Docker image described in a Dockerfile. I checked the output file, and it looks like the Reshape op is still removed.
@tushuhei The Reshape can sometimes be removed legitimately by the optimizer. Give it a try with TensorFlow 1.8 and see if the issue still exists. Also, can you paste your errors here, or even better, share your model with me?
@pyu10055 I have updated the converter to 0.2.0 and TensorFlow to 1.8, and encountered another error when I ran the converted model: `Error: Can't squeeze axis 2 since its dim '1001' is not 1`.
@pyu10055 Thank you for your info! I've managed to fix the issue by upgrading tfjs-core and tfjs-converter in my dependencies.
@kimamula are you still having the same issue as before? I tried your model and it works as expected.
@pyu10055 Sorry, the error was due to a mistake on my side.
I confirmed that everything is working fine now.
Thanks!
What was the solution for this error: `Error: Can't squeeze axis 2 since its dim '1001' is not 1`?
Update: the error means the input must be normalized and reshaped. I found an example here: https://hpssjellis.github.io/beginner-tensorflowjs-examples-in-javascript/tf-examples/mobinet/index.js
Yes. The solution was reshaping the input to [1, IMAGE_SIZE, IMAGE_SIZE, 3].
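As a rough illustration of the preprocessing described above, here is a plain-Python sketch that normalizes uint8 pixels and adds a batch dimension so the shape becomes [1, IMAGE_SIZE, IMAGE_SIZE, 3]. The [-1, 1] pixel range and the `preprocess` helper are assumptions for illustration (MobileNet models commonly expect that range), not code from this thread:

```python
def preprocess(pixels):
    """pixels: nested list [H][W][3] of 0-255 ints.
    Returns a [1][H][W][3] nested list with values scaled to [-1, 1]."""
    normalized = [[[p / 127.5 - 1.0 for p in px] for px in row]
                  for row in pixels]
    return [normalized]  # prepend a batch dimension of 1
```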