Hi there, I am using an ONNX model that is larger than 2GB. It loads correctly because ONNX sets load_external_data=True in the load_model function, and I can also print its graph by calling onnx.helper.printable_graph(model.graph).
But when I call infer_shapes() on the loaded model, it fails with the error message "ONNX_REL_1_7.ModelProto exceeds maximum protobuf size of 2GB: 2239723653".
Detailed error message:
model = infer_shapes(model)
File "/usr/local/lib/python3.6/dist-packages/onnx/shape_inference.py", line 34, in infer_shapes
model_str = model.SerializeToString()
ValueError: Message ONNX_REL_1_7.ModelProto exceeds maximum protobuf size of 2GB: 2239723653
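The byte count at the end of the traceback can be compared with the protobuf limit directly. This is only a minimal sketch: `reported_size` is a hypothetical helper, and the limit is assumed to be 2**31 bytes (the message only says "2GB").

```python
import re

# Assumption: the "2GB" in the error message means 2**31 bytes.
PROTOBUF_LIMIT_BYTES = 2**31


def reported_size(error_message):
    """Return the trailing byte count from the ValueError text, or None."""
    match = re.search(r":\s*(\d+)\s*$", error_message)
    return int(match.group(1)) if match else None


msg = ("Message ONNX_REL_1_7.ModelProto exceeds maximum protobuf size "
       "of 2GB: 2239723653")
size = reported_size(msg)
# Print the parsed size and whether it is above the assumed limit.
print(size, size > PROTOBUF_LIMIT_BYTES)
```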
The reason I call infer_shapes() is the onnxconverter_common package: I want to use fp16 instead of fp32 to speed up this model.
import onnx
model = onnx.load('./onnx/graph.onnx')
from onnx.shape_inference import infer_shapes
model = infer_shapes(model)
I expect infer_shapes() to work for models larger than 2GB, or some other solution that provides the same output as infer_shapes() for such models. Thanks a lot!
Hi @yetingqiaqia,
It's a known issue that Python protobuf objects have a 2GB limit. You can try building onnx from the master branch with this PR cherry-picked: https://github.com/onnx/onnx/pull/3012. The PR lets shape_inference take a model path as input, which avoids the 2GB limit. Then you can use something like this:
import onnx
# output inferred model to "./onnx/inferred_model.onnx"
onnx.shape_inference.infer_shapes_path('./onnx/graph.onnx', './onnx/inferred_model.onnx')
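If the inferred model is needed in memory afterwards (for example, for the fp16 conversion), the path-based call can be wrapped and the result loaded back. This is a sketch, not part of the PR: `infer_shapes_from_path` is a hypothetical helper name, and it assumes an onnx build that provides `infer_shapes_path(model_path, output_path)` as described above.

```python
import os


def infer_shapes_from_path(model_path, output_path):
    """Run path-based shape inference, then load and return the result."""
    if not os.path.isfile(model_path):
        raise FileNotFoundError(model_path)
    # Imported lazily so the path check above runs even without onnx installed.
    import onnx
    from onnx import shape_inference

    # Operates on files, so the Python side never calls SerializeToString()
    # on the full ModelProto.
    shape_inference.infer_shapes_path(model_path, output_path)
    return onnx.load(output_path)


# Hypothetical usage with the paths from this thread:
# model = infer_shapes_from_path('./onnx/graph.onnx', './onnx/inferred_model.onnx')
```

If the model stores its tensors as external data, keeping the output file in the same directory as the input is probably the safest choice, so that the external data files can still be found when the inferred model is loaded.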
Hi @jcwchen, thanks for your solution. It works now: I can convert the ONNX model from fp32 to fp16 with the new function. Appreciate it!
That PR has been merged into the main branch, so it will be included in ONNX Release 1.8. Closing this issue now. Thanks.