Glow: [Reshape] Allow the shape to be computed from constants

Created on 28 Nov 2019 · 14Comments · Source: pytorch/glow

Assume we have the following case:

Assume also that even though the graph has this form, all the tensor sizes are statically known (that is the size for the output of conv and transpose are known). Right now this model will not work because the Reshape node complains that the "shape" operator is not explicitly a constant even though conceptually the shape is a constant.

Can we do something to support this case?
Maybe something like moving the restriction of "shape" operand being a constant right after all the graph optimizations took place.

Source

mciprian13

All 14 comments

Have you tried -const-fold-ops? It tries to constant fold nodes during loading.

ayermolo on 4 Dec 2019

👍1

Without the flag I get one error:
Error message: Non-constant shape tensors are unsupported by Glow.

With the -const-fold-ops flag I get multiple errors:

Error message: No constant found
Error message: No constant found
Error message: No constant found
Error message: Non-constant shape tensors are unsupported by Glow.

mciprian13 on 5 Dec 2019

Maybe I miss understood the issue. Are you saying output of Conv is constant, as in can be deduced at compile time of the graph?

ayermolo on 5 Dec 2019

The tensor size (shape) is constant not the tensor values. The reshape operator has 2 operands: a tensor and a shape. The problem is about the shape operand.

mciprian13 on 5 Dec 2019

Sorry derp moment. Are you sure problem is not elsewhere in the graph?
Yeah when shape is loaded it will create a constant, and -constant-fold-ops should fold gather, concat, and by the time reshape is loaded that input should be constant.

I just double checked on onnx ssd which has similar pattern and it worked. Although granted it's on internal branch...

ayermolo on 5 Dec 2019

@jfix71, @ayermolo I understand the flag -const-fold-ops might do some hocus-pocus with nodes which involve constants. My questions:

Why isn`t this flag enabled by default?
Why isn`t this flag solving the problem? ayermolo mentions that this does the trick on an internal branch ...

mciprian13 on 14 Jan 2020

@mciprian13

Why isn`t this flag enabled by default?

Not sure 🙂I don't think there's any real downside to enabling it by default.

Why isn`t this flag solving the problem? ayermolo mentions that this does the trick on an internal branch ...

Good question. Are you sure it's an issue with Shape? It should always be loaded as a Constant so that's weird. Otherwise I'm wondering how the indices are being represented for the Gather, and the data for the Unsqueeze. Can you try adding in some dumpDAG() svgs to upload to this issue to help diagnose given what the graph looks like? I can also try to help debug this if you can provide the model and command line you're using.

jfix71 on 14 Jan 2020

The problem above is with the "shape" input operand of the Reshape node (and NOT the Shape node).

mciprian13 on 15 Jan 2020

@mciprian13 Right, but the debug messages you're showing say that there are multiple places where a Constant is not found. The way the const fold ops flag works is it tries to constant fold each op as it's loaded. Compilation might fail at the shape input for Reshape but the problem could be coming because a prior node that was loaded that flows into the shape input for Reshape was not correctly constant folded. Can you share something which repros it, even just that part of the proto that you showed in the issue?

jfix71 on 15 Jan 2020

👍1

The model with the trouble (the picture above is from this model) is this:
mb2-ssd-lite.zip
You can reproduce the problem using the model-compiler like this:
model-compiler -model=mb2-ssd-lite.onnx -backend=CPU -emit-bundle=build -const-fold-ops
This model was taken from here:
https://github.com/qfgaohao/pytorch-ssd

I think the option -const-fold-ops is not enabled by default due to the model-runner application which assumes the model is made up of only constants (don't know what is the utility for this ... testing maybe).
I really think we should enable the -const-fold-ops option by default.

mciprian13 on 16 Jan 2020

OK. Figured it out.
This bugged me a bunch. Since similar graph worked internally, and this one did also.
The issue is in constantFoldInLoader in Protobufloader.h

Code is using mod->getConstantByName.
We are using loader->getConstantByNameOrNull.

Reason for that is that former iterates through constant vector and checks names against node names. Except those names are modified to be unique. So no match happens. The latter uses hash in Protobufloader which has original name. So that mystery solved. :)

ayermolo on 17 Jan 2020

❤1

Nice find @ayermolo! Seems like then a simple 1 line fix is needed -- @mciprian13 Can you try this to make sure it fixes the problem for you? I believe it's simply changing:

https://github.com/pytorch/glow/blob/69124a01e750a624cf5f94999df18a2ed3bf8753/include/glow/Importer/ProtobufLoader.h#L230

to Constant *tmpConst = loader->getConstantByNameOrNull(op.input(i));

jfix71 on 17 Jan 2020

Thanks for the solution. It works!
Created a PR with the fix + enabled the const-fold-ops options by default.

mciprian13 on 17 Jan 2020

Should be fixed as of https://github.com/pytorch/glow/commit/855775629b8be67116012770bc044e90b23646e7.

jfix71 on 25 Jan 2020

Was this page helpful?

0 / 5 - 0 ratings