Vision: ONNX export of FasterRCNN: inference fails when no detections are present

Created on 2 Jul 2020 · 6 Comments · Source: pytorch/vision

🐛 Bug

I am running into an issue similar to the one reported in https://github.com/pytorch/vision/issues/2251: my exported ONNX model fails to run inference when no detections are present, but with FasterRCNN instead of MaskRCNN.

Running inference on a random tensor that will not create detections results in a similar runtime exception:

RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running ReduceMax node. Name:'ReduceMax_1814' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/reduction/reduction_ops.cc:351 onnxruntime::common::Status onnxruntime::cuda::PrepareForReduce(onnxruntime::OpKernelContext*, bool, const std::vector<long int>&, const onnxruntime::Tensor**, onnxruntime::Tensor**, int64_t&, int64_t&, std::vector<long int>&, std::vector<long int>&, std::vector<long int>&, int64_t&, int64_t&) keepdims || dim != 0 was false. Can't reduce on dim with value of 0 if 'keepdims' is false. Invalid output shape would be produced. input_shape:{0,4}
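
The condition quoted in the message, `keepdims || dim != 0`, can be read as a per-axis check: reducing over an axis of size 0 is rejected unless `keepdims` is set. The sketch below restates that reading in plain Python. `reduce_is_valid` is a hypothetical helper for illustration, not onnxruntime code, and the message does not say which axes `ReduceMax_1814` reduces over, so axis 0 is an assumption.

```python
def reduce_is_valid(input_shape, axes, keepdims):
    """Return True if every reduced axis passes `keepdims or dim != 0`."""
    rank = len(input_shape)
    for axis in axes:
        dim = input_shape[axis % rank]  # allow negative axes
        if not (keepdims or dim != 0):
            return False
    return True


# Shape reported in the error message: zero boxes, four coordinates each.
empty_boxes_shape = (0, 4)

# Assumed reduction over axis 0 (the empty axis), without keepdims.
print(reduce_is_valid(empty_boxes_shape, axes=[0], keepdims=False))  # False
# The same reduction with keepdims set passes the quoted condition.
print(reduce_is_valid(empty_boxes_shape, axes=[0], keepdims=True))   # True
# A non-empty input passes either way.
print(reduce_is_valid((3, 4), axes=[0], keepdims=False))             # True
```

Under this reading, any image that yields zero boxes reaches the node with shape `{0,4}` and triggers the failure, which matches the behavior described here.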

I updated my environment to use recent torch, torchvision, and onnxruntime versions as instructed in the closed issue, but I still hit the same runtime exception.

To Reproduce

Steps to reproduce the behavior (mostly copied from the closed issue):

Export a pretrained FasterRCNN model to ONNX:

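import torch

# `model` is the pretrained FasterRCNN and `inputs` the example input; both are defined elsewhere.
torch.onnx.export(
    model,
    inputs,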
    onnx_model_filepath,
    opset_version=11,
    do_constant_folding=True,
    verbose=True,
    input_names=[
        "data"
    ],
    output_names=[
        "boxes", 
        "labels", 
        "scores"
    ],
    dynamic_axes={
        "data": [1, 2],
        "boxes": [0],
        "labels": [0],
        "scores": [0]
    }
)

Run inference on an image that will result in detections and see output without failure:

import onnxruntime

# Load the exported model from the path passed to torch.onnx.export above
ort_session = onnxruntime.InferenceSession(onnx_model_filepath)
input_array = input_tensor.cpu().numpy()
ort_inputs = {"data": input_array}
ort_outputs = ort_session.run(None, ort_inputs)

Run inference on an image that will not result in detections and hit the runtime exception provided above:

random_tensor = torch.randn(input_tensor.shape)
random_array = random_tensor.cpu().numpy()
ort_inputs = {"data": random_array}
ort_outputs = ort_session.run(None, ort_inputs)

Expected behavior

I am expecting output from the ONNX exported FasterRCNN that is similar to the output from the PyTorch version:

[{'boxes': tensor([], size=(0, 4), grad_fn=<StackBackward>),
  'labels': tensor([], dtype=torch.int64),
  'scores': tensor([], grad_fn=<IndexBackward>)}]
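
One way to make "similar to the PyTorch output" checkable is to compare shapes only: `boxes` should be `(N, 4)` and `labels` and `scores` should be `(N,)`, with `N = 0` when nothing is detected. The sketch below is a minimal illustration of that check. `detection_shapes_ok` is a hypothetical helper, and the stand-in objects carry shapes only so the snippet runs without torch or onnxruntime installed.

```python
from types import SimpleNamespace


def detection_shapes_ok(boxes, labels, scores):
    """Check that the three outputs describe the same number of detections.

    Works with anything exposing a `.shape` sequence (NumPy arrays, tensors).
    """
    if len(labels.shape) != 1:
        return False
    n = labels.shape[0]
    return tuple(boxes.shape) == (n, 4) and tuple(scores.shape) == (n,)


# Stand-ins carrying only shapes, matching the empty PyTorch output above.
empty = dict(
    boxes=SimpleNamespace(shape=(0, 4)),
    labels=SimpleNamespace(shape=(0,)),
    scores=SimpleNamespace(shape=(0,)),
)
print(detection_shapes_ok(**empty))  # True
```

Assuming the outputs come back in the order given by `output_names` in the export call (boxes, labels, scores), the same check could be applied as `detection_shapes_ok(*ort_outputs)`.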

Environment

PyTorch version: 1.7.0a0+4102fbd
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: version 3.10.2

Python version: 3.8
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration: 
GPU 0: TITAN Xp
GPU 1: TITAN Xp
GPU 2: TITAN Xp

Nvidia driver version: 440.64.00
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5

Versions of relevant libraries:
[pip3] numpy==1.18.5
[pip3] torch==1.7.0a0+4102fbd
[pip3] torchvision==0.8.0a0+bea6127
[conda] magma-cuda101             2.5.2                         1    pytorch
[conda] mkl                       2020.1                      217  
[conda] mkl-include               2020.1                      219    conda-forge
[conda] numpy                     1.18.5           py38h8854b6b_0    conda-forge
[conda] torch                     1.7.0a0+4102fbd          pypi_0    pypi
[conda] torchvision               0.8.0a0+bea6127          pypi_0    pypi

!pip freeze | grep onnx

onnx==1.7.0
onnxruntime-gpu==1.3.0

Labels: awaiting response, onnx

Source

drwaltman

Most helpful comment

@neginraoof I have tested it with torch 1.5.1 and torchvision 0.6.1 (aka 0.6.0a0+35d732a). The PyTorch inference results match ONNX Runtime's.

@drwaltman An early-June nightly version had a bug in the test_onnx.py unit test. After updating to the June 30 nightly (at that time torch was 1.7.0.dev20200626 and torchvision was 0.8.0.dev20200629; ignore the time zone difference), it was resolved.

zhiqwang on 4 Jul 2020

All 6 comments

cc @neginraoof can you have a look? He seems to have a very recent version of torchvision, which should have had this fixed already.

fmassa on 2 Jul 2020

With torch 1.7.0.dev20200626, torchvision 0.8.0.dev20200629, and onnxruntime 1.3.0, the ONNX Runtime outputs are exactly the same as PyTorch's.

Is the key of ort_inputs in the second case "data"?

ort_inputs = {"data": random_array}

Maybe this typo causes the error?

zhiqwang on 3 Jul 2020
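
A mismatch between the feed dictionary keys and the model's input names can be caught before calling `run`. The sketch below is one way to do that, using `InferenceSession.get_inputs()`, whose entries expose a `name` attribute. `unknown_feed_names` is a hypothetical helper, and the fake session stands in for a real `onnxruntime.InferenceSession` so the snippet runs without a model file.

```python
from types import SimpleNamespace


def unknown_feed_names(session, feed):
    """Return feed keys that are not input names of the session's model."""
    expected = {node.name for node in session.get_inputs()}
    return sorted(set(feed) - expected)


# Stand-in for a session whose model was exported with input_names=["input"];
# a real InferenceSession exposes get_inputs() in the same way.
fake_session = SimpleNamespace(
    get_inputs=lambda: [SimpleNamespace(name="input")]
)

print(unknown_feed_names(fake_session, {"data": None}))   # ['data']
print(unknown_feed_names(fake_session, {"input": None}))  # []
```

An empty result means every key in the feed is a known input name.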

@drwaltman have a look at @zhiqwang's explanation; it looks like it could be the cause of the issue you are facing.

fmassa on 3 Jul 2020

Hi @zhiqwang and @fmassa, thanks for the responses!

Running inference on the random tensor is now working after updating torch to 1.7.0.dev20200626 and torchvision to 0.8.0.dev20200629! @zhiqwang how did you decide on these two versions in particular?

Regarding the typo in the repro steps: I was using "data" instead of "input" as the input key for local testing, but pasted the wrong code by accident while trying to match the examples from the previous issue. My bad!

Thanks again!

drwaltman on 3 Jul 2020

@drwaltman Can you please let us know whether either the latest stable release (1.5.1) or the nightly builds have this issue?
Thanks.
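
Since the outcome in this thread depended on the exact torch, torchvision, and onnxruntime builds, a quick way to report them together is sketched below. It uses only the standard library (`importlib.metadata`, available from Python 3.8). `installed_versions` is a hypothetical helper, and the package list is an assumption based on the names in the environment listing above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# onnxruntime and onnxruntime-gpu are separate distributions, so check both.
report = installed_versions(
    ["torch", "torchvision", "onnx", "onnxruntime", "onnxruntime-gpu"]
)
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```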