Vision: [JIT] Not supported for maskrcnn_resnet50_fpn

Created on 6 Jun 2019 · 59 comments · Source: pytorch/vision

I am trying to accelerate the maskrcnn_resnet50_fpn pretrained model using JIT tracing provided by pytorch. It appears that some operations present in this model are not supported by pytorch JIT.

Are these models supposed to have JIT support officially? If not, would you be able to provide advice for a workaround?

To replicate, running:

import torch
import torchvision
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()
traced_net = torch.jit.trace(model, torch.rand(1, 3,800, 800))

produces

RuntimeError: log2_vml_cpu not implemented for 'Long'

Thank you.

enhancement models object detection

Most helpful comment

@cted18 Yes, I'll be working on adding OrderedDict support so that fcn_resnet101 can be supported. I think that, together with the op support added in https://github.com/pytorch/vision/pull/1267, it shouldn't be too hard to support in script.

All 59 comments

this actually looks like a bug in scale = 2 ** torch.tensor(approx_scale).log2().round().item() in torchvision/ops/poolers.py.

If approx_scale here is an exact integer, the tensor will be a LongTensor, which is unexpected.

That should be changed to torch.tensor(approx_scale, dtype=torch.float32)
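
For illustration, a minimal snippet (not part of torchvision) showing why the original expression fails on the PyTorch version used here, and how forcing the dtype avoids it:

import torch

approx_scale = 1  # a Python int -> torch.tensor() creates a LongTensor
t = torch.tensor(approx_scale)
print(t.dtype)  # torch.int64
# t.log2() raises: RuntimeError: log2_vml_cpu not implemented for 'Long'

# Forcing a float dtype sidesteps the problem:
scale = 2 ** torch.tensor(approx_scale, dtype=torch.float32).log2().round().item()
print(scale)  # 1.0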

@rbrigden as mentioned in the release notes, the detection models do not yet support JIT, in particular because we use custom ops which are not registered with the TorchScript ops.

We plan to add full JIT support for the detection models in follow-up releases.

And @soumith good catch about the location of the error.
But this looks like a problem with tracing, because in https://github.com/pytorch/vision/blob/aa32c9376c46eb284f2b091f3eb98aec4fd64b03/torchvision/ops/poolers.py#L100
we force approx_scale to be a float, so the JIT should take that into account.
Still, a workaround could be to explicitly force a dtype in torch.tensor, as you mentioned.

@fmassa Dear fmassa, when will the detection models support JIT? Thank you.

@lzp0916 A first PyTorch PR that would enable us to start making the model TorchScript friendly has just been sent to PyTorch https://github.com/pytorch/pytorch/pull/22582

But I'd say it will still take a few months to get the detection models to support TorchScript.

cc @fbbradheintz

@soumith, @fmassa I changed the code to torch.tensor(approx_scale, dtype=torch.float32) in torchvision/ops/poolers.py as soumith suggested.
That fixed the error, but another one came up. I think it's because TorchScript does not support Mask R-CNN's output format.
Here is the log:
RuntimeError: Only tensors or tuples of tensors can be output from traced functions (getNestedOutputTrace at /opt/conda/conda-bld/pytorch_1556653099582/work/torch/csrc/jit/tracer.cpp:200)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7f7bb5b1adc5 in /home/lxs/anaconda3/envs/torchscript/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: torch::jit::tracer::getNestedOutputTrace(std::shared_ptr<torch::jit::tracer::TracingState> const&, c10::IValue const&) + 0x23e (0x7f7bb39d5cee in /home/lxs/anaconda3/envs/torchscript/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #2: torch::jit::tracer::exit(std::vector<c10::IValue, std::allocator<c10::IValue> > const&) + 0x2f (0x7f7bb39d5dbf in /home/lxs/anaconda3/envs/torchscript/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #3: <unknown function> + 0x447ab3 (0x7f7be4e3eab3 in /home/lxs/anaconda3/envs/torchscript/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x45a8b4 (0x7f7be4e518b4 in /home/lxs/anaconda3/envs/torchscript/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0x12ce4a (0x7f7be4b23e4a in /home/lxs/anaconda3/envs/torchscript/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #20: __libc_start_main + 0xe7 (0x7f7bf41f0b97 in /lib/x86_64-linux-gnu/libc.so.6)
It seems too hard for me to work around. torchvision.models.detection is such great work, it makes my code a lot easier. I hope this problem can be fixed soon :)

@XushengLee adding support for TorchScript for all models in torchvision is in the plans, but it will still take a few months before we are there.

@XushengLee you can fix the second error if you change how the outputs of inference are returned: rather than putting them into a dictionary, just return the tensors directly.
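
One way to read this suggestion is to unpack the detection dict into plain tensors before they leave the traced forward. A rough sketch (the wrapper class below is hypothetical, and tracing can still fail on other unsupported ops, as the later comments show):

import torch

class TupleOutputWrapper(torch.nn.Module):
    # Hypothetical wrapper: return the detection fields as a tuple of
    # tensors so the traced output contains only tensors.
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, images):
        det = self.model(images)[0]  # detections for the first image
        return det['boxes'], det['labels'], det['scores'], det['masks']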

@remzr7 Thank you for your help. I tried that, and it solved the output problem.
But there is another error, and the log is not as clear as before.
I think it relates to the input format: the Mask R-CNN in torchvision.models.detection takes a list of channel-first image tensors, at least during evaluation, not the typical 4-D tensor.

# this snippet is from engine.py of torchvision.models.detection
for images, targets in metric_logger.log_every(data_loader, print_freq, header):
    images = list(image.to(device) for image in images)
    targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
    loss_dict = model(images, targets)
    losses = sum(loss for loss in loss_dict.values())

Oh yes, I think you can also disable the GeneralizedRCNNTransform that the underlying GeneralizedRCNN class applies, and instead perform the transformations (i.e. resize/to_tensor) before you call model.forward().
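
A rough sketch of doing those steps by hand (the mean/std and min_size values below are the torchvision defaults for the detection models; this is only an illustration, not the exact GeneralizedRCNNTransform implementation):

import torch
import torchvision.transforms.functional as TF

def preprocess(img, min_size=800,
               mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]):
    # img: float tensor of shape [3, H, W] with values in [0, 1]
    img = TF.normalize(img, mean=mean, std=std)
    scale = float(min_size) / min(img.shape[-2:])
    new_size = [int(round(s * scale)) for s in img.shape[-2:]]
    img = torch.nn.functional.interpolate(
        img[None], size=new_size, mode='bilinear', align_corners=False)[0]
    return img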


@remzr7 It doesn't seem that simple. I took the transforms in GeneralizedRCNN outside and changed the output of GeneralizedRCNN to tuples instead of a dict.

Now, it seems that I would have to change the outputs of all modules recursively, i.e.,

  1. IntermediateLayerGetter(..) returns an OrderedDict
  2. FeaturePyramidNetwork(..) returns an OrderedDict
  3. BackboneWithFPN(..) returns an OrderedDict
    and so on..

I changed the outputs of all of them to tuples of tensors, except for IntermediateLayerGetter(..).
I have not been able to get around IntermediateLayerGetter(..) by changing the OrderedDict structure it uses, because TorchScript at this point cannot handle OrderedDict outputs.

@soumith @fmassa since OrderedDict outputs are being used everywhere in detection, maybe it would be easier to add torchscript support for returning OrderedDicts? Is there a quick workaround to solve this problem?

@cted18 yes, OrderedDict support in torchscript is something that should be added.

And we are starting to work on adding support for maskrcnn_resnet50_fpn to work on torchscript / traceable, a first PR in this series has been sent in https://github.com/pytorch/vision/pull/1267

cc @eellison for OrderedDict support in torchscript

@cted18 Yes, I'll be working on adding OrderedDict support so that fcn_resnet101 can be supported. I think that, together with the op support added in https://github.com/pytorch/vision/pull/1267, it shouldn't be too hard to support in script.

@fmassa Dear fmassa, I am using torch.jit.trace and encounter the following error:
"RuntimeError: Tried to trace <__torch__.torchvision.ops.misc.FrozenBatchNorm2d object at 0000029EB0B365E0> but it is not part of the active trace. Modules that are called during a trace must be registered as submodules of the thing being traced."
How can I solve this problem?
OS: Windows
pytorch: 1.3.0.dev20190920
torchvision: 0.5.0.dev20190924
model: fasterrcnn_resnet50_fpn

@lzp0916 this error will be solved when https://github.com/pytorch/vision/pull/1329 is merged

The issue is critical for putting the model into production system. Thanks for working on this.

2 ** torch.tensor(approx_scale).log2().round()

Can someone explain why, if approx_scale < 1, it doesn't get rounded to an integer here? Is this some hack or normal behavior?

@creotiv it's an approximation that avoids us having to manually specify the downscaling for layer n.
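
For example, a quick illustration (not part of torchvision) of what that line computes: it just snaps the approximate scale to the nearest power of two.

import torch

for approx_scale in (0.26, 0.125, 0.03):
    scale = 2 ** torch.tensor(approx_scale, dtype=torch.float32).log2().round().item()
    print(approx_scale, '->', scale)
# 0.26  -> 0.25
# 0.125 -> 0.125
# 0.03  -> 0.03125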

@fmassa No, I understand that. I mean, why does round() not round 0.123, for example, to zero (only after the log function)?
I don't see anything like that in the docs https://pytorch.org/docs/stable/torch.html?highlight=round#torch.round, and it looks like a bug.

And also, torch.log2(2**torch.tensor(0.123, dtype=torch.float64)).round() returns 0.

@creotiv FYI this is unrelated to the issue (which is that maskrcnn_resnet50_fpn is not yet scriptable), but I don't understand your point.

Can you open a new issue describing with an example what you think is the problem?

RuntimeError: Only tensors or tuples of tensors can be output from traced functions

@XushengLee how did you get rid of the error "RuntimeError: Only tensors or tuples of tensors can be output from traced functions"? I am currently having the same issue when trying to trace the Mask R-CNN model from torchvision with the following script:

import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()
test_data = torch.rand(1, 3, 480, 640)
traced_model = torch.jit.trace(model, test_data)

@gemmit support for tracing / scripting maskrcnn is coming soon, check https://github.com/pytorch/vision/pull/1407 and https://github.com/pytorch/vision/pull/1461

@fmassa okay, thanks for the info. Will check the links

@gemmit ~tracing should already be supported for maskrcnn~. Using torch.jit.script will be supported in the coming weeks

@lara-hdr I've just tried tracing maskrcnn, and I got an error

import torch, torchvision
m = torchvision.models.detection.maskrcnn_resnet50_fpn()
m.eval()

traced_model = torch.jit.trace(m, [[torch.rand(3, 300, 300)]])

I get the following error

RuntimeError: Only tensors or tuples of tensors can be output from traced functions (getOutput at /Users/distiller/project/conda/conda-bld/pytorch_1572429967983/work/torch/csrc/jit/tracer.cpp:211)
frame #0: c10::Error::Error(c10::SourceLocation, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 135 (0x112b608b7 in libc10.dylib)
frame #1: torch::jit::tracer::TracingState::getOutput(c10::IValue const&) + 1593 (0x11b1d8549 in libtorch.dylib)
frame #2: torch::jit::tracer::trace(std::__1::vector<c10::IValue, std::__1::allocator<c10::IValue> >, std::__1::function<std::__1::vector<c10::IValue, std::__1::allocator<c10::IValue> > (std::__1::vector<c10::IValue, std::__1::allocator<c10::IValue> >)> const&, std::__1::function<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > (torch::autograd::Variable const&)>, bool, torch::jit::script::Module*) + 1792 (0x11b1d90b0 in libtorch.dylib)
frame #3: torch::jit::tracer::createGraphByTracing(pybind11::function const&, std::__1::vector<c10::IValue, std::__1::allocator<c10::IValue> >, pybind11::function const&, bool, torch::jit::script::Module*) + 361 (0x1121829b9 in libtorch_python.dylib)
frame #4: void pybind11::cpp_function::initialize<torch::jit::script::initJitScriptBindings(_object*)::$_16, void, torch::jit::script::Module&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, pybind11::function, pybind11::tuple, pybind11::function, bool, pybind11::name, pybind11::is_method, pybind11::sibling>(torch::jit::script::initJitScriptBindings(_object*)::$_16&&, void (*)(torch::jit::script::Module&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, pybind11::function, pybind11::tuple, pybind11::function, bool), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::'lambda'(pybind11::detail::function_call&)::__invoke(pybind11::detail::function_call&) + 319 (0x1121bd20f in libtorch_python.dylib)
frame #5: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) + 3324 (0x111c9f3fc in libtorch_python.dylib)
<omitting python frames>
frame #61: start + 1 (0x7fff6fa6d3d5 in libdyld.dylib)
frame #62: 0x0 + 2 (0x2 in ???)

I just now realized that ONNX export does not call into torch.jit.trace, but torch.jit.get_trace_graph. Hum, this is unfortunate :-/

@XushengLee adding support for TorchScript for all models in torchvision is in the plans, but it will still take a few months before we are there.

Any progress on supporting the detection models with JIT? Thanks

@stereomatchingkiss Yes, it's almost ready, just need to fix some unrelated ONNX issues and it will be merged this week

@stereomatchingkiss Yes, it's almost ready, just need to fix some unrelated ONNX issues and it will be merged this week

Thanks, glad to hear that. Will we be able to convert the model to ONNX format after this is merged?

@stereomatchingkiss ONNX and JIT support for Mask R-CNN in torchvision has been merged into master, and is available if you compile from source.

I still cannot trace the Maskrcnn model from the latest branch.

I get this error out of the box:

scale = 2 ** float(torch.tensor(approx_scale).log2().round())
RuntimeError: log2_vml_cpu not implemented for 'Long'

Then I made the changes suggested by @soumith:

this actually looks like a bug in scale = 2 ** torch.tensor(approx_scale).log2().round().item() in torchvision/ops/poolers.py.

If approx_scale here is an exact integer, the tensor will be a LongTensor, which is unexpected.

That should be changed to torch.tensor(approx_scale, dtype=torch.float32)

Now I have this:

File "/../python3.6/site-packages/torchvision-0.5.0a0+5b1716a-py3.6-linux-x86_64.egg/torchvision/ops/poolers.py", line 164, in setup_scales self.map_levels = initLevelMapper(int(lvl_min), int(lvl_max)) OverflowError: cannot convert float infinity to integer

@cted18 can you print torchvision.__version__? I suspect you are in an old version

Sure.

torchvision.__version__ '0.5.0a0+5b1716a'

I just built it from the master.

@cted18 can you share a script that reproduces the error you have?

I am trying to accelerate the maskrcnn_resnet50_fpn pretrained model using JIT tracing provided by pytorch. It appears that some operations present in this model are not supported by pytorch JIT.

Are these models supposed to have JIT support officially? If not, would you be able to provide advice for a workaround?

To replicate, running:

import torch
import torchvision
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()
traced_net = torch.jit.trace(model, torch.rand(1, 3,800, 800))

produces

RuntimeError: log2_vml_cpu not implemented for 'Long'

Thank you.

Yes. It is the exact same script as from @rbrigden

Ubuntu 16.04
python 3.6.7
torch.__version__ '1.3.0a0+de394b6'
torchvision.__version__ '0.5.0a0+cec7ea7'

@cted18 this should be fixed when https://github.com/pytorch/vision/pull/1639 gets merged

Still cannot convert fasterrcnn_resnet50_fpn

Version(print(torchvision.__version__)) :

0.5.0.dev20191206

Codes:

import torch
import torchvision

print(torchvision.__version__)

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(True)
model.eval()
example = torch.rand(1, 3, 300, 400)
traced_script_module = torch.jit.trace(model, example)

Error messages:

RuntimeWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
'incorrect results).', category=RuntimeWarning)
Traceback (most recent call last):
File "pytorch_conversion.py", line 14, in
traced_script_module = torch.jit.trace(model, example)
File "C:\Users\yyyy\Anaconda3\envs\pytorch_preview\lib\site-packages\torch\jit__init__.py", line 877, in trace
check_tolerance, _force_outplace, _module_class)
File "C:\Users\yyyy\Anaconda3\envs\pytorch_preview\lib\site-packages\torch\jit__init__.py", line 1029, in trace_module
module._c._create_method_from_trace(method_name, func, example_inputs, var_lookup_fn, _force_outplace)
RuntimeError: Only tensors or tuples of tensors can be output from traced functions (getOutput at ..\torch\csrc\jit\tracer.cpp:212)
(no backtrace available)

OS : windows 10 64bits
installed by anaconda :

conda create --name pytorch_n python=3.7
conda activate pytorch_n
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch-nightly -c defaults -c conda-forge

Models I need:

keypointrcnn_resnet50_fpn, fasterrcnn_resnet50_fpn

@stereomatchingkiss use torch.jit.script instead of torch.jit.trace, and it should work.

model = torch.jit.script(model)
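
For example, a minimal sketch of scripting the model and saving it so it can later be loaded from C++ with torch::jit::load (the file name is just a placeholder):

import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()
scripted = torch.jit.script(model)  # script instead of trace
scripted.save("fasterrcnn_resnet50_fpn.pt")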

@stereomatchingkiss use torch.jit.script instead of torch.jit.trace, and it should work.

model = torch.jit.script(model)

Thanks, this works, but I fail to load the fasterrcnn_resnet50_fpn model with the C++ API.
OS : ubuntu18.0.4.3 LTS 64bits
libtorch : nightly(2019/12/07)

main.cpp

#include <torch/script.h>

#include <iostream>
#include <memory>

int main(int argc, const char* argv[])
{
    if(argc != 2){
        std::cerr << "usage: example-app <path-to-exported-script-module>\n";
        return -1;
    }


    torch::jit::script::Module module;
    try {
        // Deserialize the ScriptModule from a file using torch::jit::load().
        module = torch::jit::load(argv[1]);
    }
    catch (const c10::Error& e) {
        std::cerr << "error loading the model\n";
        return -1;
    }

    std::cout << "ok\n";
}

CMakeLists.txt

cmake_minimum_required(VERSION 3.5)

project(pytorch_test LANGUAGES CXX)

set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

find_package(Torch REQUIRED)

add_executable(pytorch_test main.cpp)
target_link_libraries(pytorch_test "${TORCH_LIBRARIES}")
set_property(TARGET pytorch_test PROPERTY CXX_STANDARD 14)

Error message:

terminate called after throwing an instance of 'torch::jit::script::ErrorReport'
  what():  
Unknown builtin op: torchvision::_new_empty_tensor_op.
Could not find any similar ops to torchvision::_new_empty_tensor_op. This op may not exist or may not be currently supported in TorchScript.
:
  File "C:\Users\yyyy\Anaconda3\envs\pytorch_preview\lib\site-packages\torchvision\ops\new_empty_tensor.py", line 16
        output (Tensor)
    """
    return torch.ops.torchvision._new_empty_tensor_op(x, shape)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
Serialized   File "code/__torch__/torchvision/ops/new_empty_tensor.py", line 4
def _new_empty_tensor(x: Tensor,
    shape: List[int]) -> Tensor:
  _0 = ops.torchvision._new_empty_tensor_op(x, shape)
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
  return _0
'_new_empty_tensor' is being compiled since it was called from 'interpolate'
Serialized   File "code/__torch__/torchvision/ops/misc.py", line 25
    align_corners: Optional[bool]=None) -> Tensor:
  _1 = __torch__.torchvision.ops.misc._output_size
  _2 = __torch__.torchvision.ops.new_empty_tensor._new_empty_tensor
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
  _3 = uninitialized(Tensor)
  if torch.gt(torch.numel(input), 0):
'interpolate' is being compiled since it was called from 'GeneralizedRCNNTransform.resize'
Serialized   File "code/__torch__/torchvision/models/detection/transform.py", line 79
    target: Optional[Dict[str, Tensor]]) -> Tuple[Tensor, Optional[Dict[str, Tensor]]]:
    _18 = __torch__.torchvision.models.detection.transform.resize_boxes
    _19 = __torch__.torchvision.ops.misc.interpolate
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _20 = __torch__.torchvision.models.detection.transform.resize_keypoints
    _21 = uninitialized(Tuple[Tensor, Optional[Dict[str, Tensor]]])
'GeneralizedRCNNTransform.resize' is being compiled since it was called from 'GeneralizedRCNNTransform.forward'
  File "C:\Users\yyyy\Anaconda3\envs\pytorch_preview\lib\site-packages\torchvision\models\detection\transform.py", line 47
                                 "of shape [C, H, W], got {}".format(image.shape))
            image = self.normalize(image)
            image, target_index = self.resize(image, target_index)
                                  ~~~~~~~~~~~ <--- HERE
            images[i] = image
            if targets is not None and target_index is not None:
Serialized   File "code/__torch__/torchvision/models/detection/transform.py", line 29
        pass
      image0 = (self).normalize(image, )
      _2 = (self).resize(image0, target_index, )
                                 ~~~~~~~~~~~~ <--- HERE
      image1, target_index0, = _2
      _3 = torch._set_item(images0, i, image1)

Aborted (core dumped)

Edit: I downloaded the C++ package (CPU only) about one hour ago.

@stereomatchingkiss use torch.jit.script instead of torch.jit.trace, and it should work.

model = torch.jit.script(model)

I found a solution in issue #1407, but I have another question: how could I know which op I need to register? Or should I not worry about this, because in the future these ops won't need to be registered by end users? Thanks

static auto registry =
        torch::RegisterOperators()
                .op("torchvision::nms", &nms)
                .op("torchvision::roi_align(Tensor input, Tensor rois, float spatial_scale, int pooled_height, int pooled_width, int sampling_ratio) -> Tensor",
                    &roi_align)
                .op("torchvision::roi_pool", &roi_pool)
                .op("torchvision::_new_empty_tensor_op", &new_empty_tensor)
                .op("torchvision::ps_roi_align", &ps_roi_align)
                .op("torchvision::ps_roi_pool", &ps_roi_pool);

@stereomatchingkiss

how could I know which op I need to register?

that's a good question. I don't yet have a good answer for that, I'll discuss with @eellison to see if we can find a good solution to it

@stereomatchingkiss

how could I know which op I need to register?

that's a good question. I don't yet have a good answer for that, I'll discuss with @eellison to see if we can find a good solution to it

When I copied the code, another question came up: where can I find the following headers?

#include "torchvision/PSROIAlign.h"
#include "torchvision/PSROIPool.h"
#include "torchvision/ROIAlign.h"
#include "torchvision/ROIPool.h"
#include "torchvision/empty_tensor_op.h"
#include "torchvision/nms.h"

Are they generated when I compile from source?

@stereomatchingkiss

how could I know which op I need to register?

that's a good question. I don't yet have a good answer for that, I'll discuss with @eellison to see if we can find a good solution to it

I checked issue #1407 again; it looks like I need to change the makefile and compile it myself in order to generate those files.
Any news on using the models with the C++ API?

@stereomatchingkiss

Any news on using the models with the C++ API?

We will be improving the experience of using the torchvision models with the C++ API over time. We have just enabled support for Mask R-CNN models to be torchscripted, and will be refining the C++ export going forward.

@fmassa
I can script parts of Mask R-CNN and load them in C++ using this:

model = models.detection.maskrcnn_resnet50_fpn(pretrained=True).eval()
backbone_script = torch.jit.script(model.backbone)

but when I add a wrapper around the attributes (e.g. the backbone) and load it in C++, it cannot find the torchvision operators.
Why might this happen?

class BackboneWrapper(torch.nn.Module):
    def __init__(self, model):
        super(BackboneWrapper, self).__init__()
        self.transform = model.transform
        self.backbone = model.backbone

    def forward(self, images, targets=None):
        # type: (List[Tensor], Optional[List[Dict[str, Tensor]]]) -> Dict[str, Dict[str, Tensor]]
        images, _ = self.transform(images, targets)
        features = self.backbone(images.tensors)
        return {'features': features}

Error:

Unknown builtin op: torchvision::_new_empty_tensor_op.
Could not find any similar ops to torchvision::_new_empty_tensor_op. This op may not exist or may not be currently supported in TorchScript.
: torchvision-0.4.2-py3.6-linux-x86_64.egg/torchvision/ops/new_empty_tensor.py", line 16
        output (Tensor)
    """
    return torch.ops.torchvision._new_empty_tensor_op(x, shape)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
Serialized   File "code/__torch__/torchvision/ops/new_empty_tensor.py", line 4

@cted18 I believe the solution you are looking for can be found in https://github.com/pytorch/vision/issues/1002#issuecomment-562915463 and https://github.com/pytorch/vision/pull/1407#issuecomment-563048240

If you are still facing issues, can you open a new issue with a full reproducible example of the problem?

@fmassa
I used the comments from #1407, but the problem still exists.
Opened a new issue #1730
Thanks

@cted18 I believe the solution you are looking for can be found in #1002 (comment) and #1407 (comment)

If you are still facing issues, can you open a new issue with a full reproducible example of the problem?

@fmassa How can I get a TorchScript version of torchvision.models.detection.maskrcnn_resnet50_fpn?

torch.jit.script and torch.jit.trace are not working with this model.

With torch.jit.script

model = torch.load(modelname+"-best.pth")
model=model.cuda()
model.eval()
print(img)
with torch.no_grad():
    print(model(img))
    traced_cell = torch.jit.script(model, (img))
torch.jit.save(traced_cell, modelname+"-torchscript.pth")

loaded_trace = torch.jit.load(modelname+"-torchscript.pth")
loaded_trace.eval()
with torch.no_grad():
    print(loaded_trace(img))

TensorMask(torch.argmax(loaded_trace(img),1)).show()

Output:

TensorImage([[[[0.8961, 0.9132, 0.8789,  ..., 0.2453, 0.1939, 0.2282],
          [0.8276, 0.9132, 0.8618,  ..., 0.2282, 0.1939, 0.2282],
          [0.8961, 0.9132, 0.8789,  ..., 0.2282, 0.2282, 0.2453],
          ...,
          [0.8961, 0.8618, 0.9132,  ..., 0.4508, 0.4166, 0.3994],
          [0.9303, 0.9132, 0.9474,  ..., 0.4166, 0.4166, 0.4508],
          [0.9646, 0.8789, 0.9303,  ..., 0.3994, 0.3994, 0.3994]],

         [[1.0455, 1.0630, 1.0280,  ..., 0.3803, 0.3277, 0.3627],
          [0.9755, 1.0630, 1.0105,  ..., 0.3627, 0.3277, 0.3627],
          [1.0455, 1.0630, 1.0280,  ..., 0.3627, 0.3627, 0.3803],
          ...,
          [1.0455, 1.0105, 1.0630,  ..., 0.5903, 0.5553, 0.5378],
          [1.0805, 1.0630, 1.0980,  ..., 0.5553, 0.5553, 0.5903],
          [1.1155, 1.0280, 1.0805,  ..., 0.5378, 0.5378, 0.5378]],

         [[1.2631, 1.2805, 1.2457,  ..., 0.6008, 0.5485, 0.5834],
          [1.1934, 1.2805, 1.2282,  ..., 0.5834, 0.5485, 0.5834],
          [1.2631, 1.2805, 1.2457,  ..., 0.5834, 0.5834, 0.6008],
          ...,
          [1.2631, 1.2282, 1.2805,  ..., 0.8099, 0.7751, 0.7576],
          [1.2980, 1.2805, 1.3154,  ..., 0.7751, 0.7751, 0.8099],
          [1.3328, 1.2457, 1.2980,  ..., 0.7576, 0.7576, 0.7576]]]],
       device='cuda:0')
[{'boxes': tensor([[412.5222, 492.3208, 619.7662, 620.9233]], device='cuda:0'), 'labels': tensor([1], device='cuda:0'), 'scores': tensor([0.1527], device='cuda:0'), 'masks': tensor([[[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.]]]], device='cuda:0')}]
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-23-7216a0dac5a0> in <module>
     12 loaded_trace.eval()
     13 with torch.no_grad():
---> 14     print(loaded_trace(img))
     15 
     16 TensorMask(torch.argmax(loaded_trace(img),1)).show()

~/anaconda3/envs/pro1/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    556             result = self._slow_forward(*input, **kwargs)
    557         else:
--> 558             result = self.forward(*input, **kwargs)
    559         for hook in self._forward_hooks.values():
    560             hook_result = hook(self, input, result)

RuntimeError: forward() Expected a value of type 'List[Tensor]' for argument 'images' but instead found type 'TensorImage'.
Position: 1
Value: TensorImage([[[[0.8961, 0.9132, 0.8789,  ..., 0.2453, 0.1939, 0.2282],
          [0.8276, 0.9132, 0.8618,  ..., 0.2282, 0.1939, 0.2282],
          [0.8961, 0.9132, 0.8789,  ..., 0.2282, 0.2282, 0.2453],
          ...,
          [0.8961, 0.8618, 0.9132,  ..., 0.4508, 0.4166, 0.3994],
          [0.9303, 0.9132, 0.9474,  ..., 0.4166, 0.4166, 0.4508],
          [0.9646, 0.8789, 0.9303,  ..., 0.3994, 0.3994, 0.3994]],

         [[1.0455, 1.0630, 1.0280,  ..., 0.3803, 0.3277, 0.3627],
          [0.9755, 1.0630, 1.0105,  ..., 0.3627, 0.3277, 0.3627],
          [1.0455, 1.0630, 1.0280,  ..., 0.3627, 0.3627, 0.3803],
          ...,
          [1.0455, 1.0105, 1.0630,  ..., 0.5903, 0.5553, 0.5378],
          [1.0805, 1.0630, 1.0980,  ..., 0.5553, 0.5553, 0.5903],
          [1.1155, 1.0280, 1.0805,  ..., 0.5378, 0.5378, 0.5378]],

         [[1.2631, 1.2805, 1.2457,  ..., 0.6008, 0.5485, 0.5834],
          [1.1934, 1.2805, 1.2282,  ..., 0.5834, 0.5485, 0.5834],
          [1.2631, 1.2805, 1.2457,  ..., 0.5834, 0.5834, 0.6008],
          ...,
          [1.2631, 1.2282, 1.2805,  ..., 0.8099, 0.7751, 0.7576],
          [1.2980, 1.2805, 1.3154,  ..., 0.7751, 0.7751, 0.8099],
          [1.3328, 1.2457, 1.2980,  ..., 0.7576, 0.7576, 0.7576]]]],
       device='cuda:0')
Declaration: forward(__torch__.torchvision.models.detection.mask_rcnn.___torch_mangle_1723.MaskRCNN self, Tensor[] images, Dict(str, Tensor)[]? targets=None) -> ((Dict(str, Tensor), Dict(str, Tensor)[]))
Cast error details: Unable to cast Python instance to C++ type (compile in debug mode for details)

With torch.jit.trace

modelname="maskrcnn"
model = torch.load(modelname+"-best.pth")
model=model.cuda()
model.eval()
print(img)
with torch.no_grad():
    print(model(img))
    traced_cell = torch.jit.trace(model, (img))
torch.jit.save(traced_cell, modelname+"-torchscript.pth")

loaded_trace = torch.jit.load(modelname+"-torchscript.pth")
loaded_trace.eval()
with torch.no_grad():
    print(loaded_trace(img))

TensorMask(torch.argmax(loaded_trace(img),1)).show()

Output

TensorImage([[[[0.8961, 0.9132, 0.8789,  ..., 0.2453, 0.1939, 0.2282],
          [0.8276, 0.9132, 0.8618,  ..., 0.2282, 0.1939, 0.2282],
          [0.8961, 0.9132, 0.8789,  ..., 0.2282, 0.2282, 0.2453],
          ...,
          [0.8961, 0.8618, 0.9132,  ..., 0.4508, 0.4166, 0.3994],
          [0.9303, 0.9132, 0.9474,  ..., 0.4166, 0.4166, 0.4508],
          [0.9646, 0.8789, 0.9303,  ..., 0.3994, 0.3994, 0.3994]],

         [[1.0455, 1.0630, 1.0280,  ..., 0.3803, 0.3277, 0.3627],
          [0.9755, 1.0630, 1.0105,  ..., 0.3627, 0.3277, 0.3627],
          [1.0455, 1.0630, 1.0280,  ..., 0.3627, 0.3627, 0.3803],
          ...,
          [1.0455, 1.0105, 1.0630,  ..., 0.5903, 0.5553, 0.5378],
          [1.0805, 1.0630, 1.0980,  ..., 0.5553, 0.5553, 0.5903],
          [1.1155, 1.0280, 1.0805,  ..., 0.5378, 0.5378, 0.5378]],

         [[1.2631, 1.2805, 1.2457,  ..., 0.6008, 0.5485, 0.5834],
          [1.1934, 1.2805, 1.2282,  ..., 0.5834, 0.5485, 0.5834],
          [1.2631, 1.2805, 1.2457,  ..., 0.5834, 0.5834, 0.6008],
          ...,
          [1.2631, 1.2282, 1.2805,  ..., 0.8099, 0.7751, 0.7576],
          [1.2980, 1.2805, 1.3154,  ..., 0.7751, 0.7751, 0.8099],
          [1.3328, 1.2457, 1.2980,  ..., 0.7576, 0.7576, 0.7576]]]],
       device='cuda:0')
[{'boxes': tensor([[412.5222, 492.3208, 619.7662, 620.9233]], device='cuda:0'), 'labels': tensor([1], device='cuda:0'), 'scores': tensor([0.1527], device='cuda:0'), 'masks': tensor([[[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.]]]], device='cuda:0')}]
/opt/conda/conda-bld/pytorch_1587452831668/work/torch/csrc/utils/python_arg_parser.cpp:760: UserWarning: This overload of nonzero is deprecated:
    nonzero(Tensor input, *, Tensor out)
Consider using one of the following signatures instead:
    nonzero(Tensor input, *, bool as_tuple)
/home/david/anaconda3/envs/proy/lib/python3.7/site-packages/torch/tensor.py:467: RuntimeWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  'incorrect results).', category=RuntimeWarning)
/home/david/anaconda3/envs/proy/lib/python3.7/site-packages/fastai2/torch_core.py:272: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  res = getattr(super(TensorBase, self), fn)(*args, **kwargs)
/opt/conda/conda-bld/pytorch_1587452831668/work/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
/home/david/anaconda3/envs/proy/lib/python3.7/site-packages/torchvision/models/detection/rpn.py:164: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  torch.tensor(image_size[1] / g[1], dtype=torch.int64, device=device)] for g in grid_sizes]
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-15-44b7a9360e87> in <module>
      6 with torch.no_grad():
      7     print(model(img))
----> 8     traced_cell = torch.jit.trace(model, (img))
      9 torch.jit.save(traced_cell, modelname+"-torchscript.pth")
     10 

~/anaconda3/envs/proy/lib/python3.7/site-packages/torch/jit/__init__.py in trace(func, example_inputs, optimize, check_trace, check_inputs, check_tolerance, strict, _force_outplace, _module_class, _compilation_unit)
    881         return trace_module(func, {'forward': example_inputs}, None,
    882                             check_trace, wrap_check_inputs(check_inputs),
--> 883                             check_tolerance, strict, _force_outplace, _module_class)
    884 
    885     if (hasattr(func, '__self__') and isinstance(func.__self__, torch.nn.Module) and

~/anaconda3/envs/proy/lib/python3.7/site-packages/torch/jit/__init__.py in trace_module(mod, inputs, optimize, check_trace, check_inputs, check_tolerance, strict, _force_outplace, _module_class, _compilation_unit)
   1035             func = mod if method_name == "forward" else getattr(mod, method_name)
   1036             example_inputs = make_tuple(example_inputs)
-> 1037             module._c._create_method_from_trace(method_name, func, example_inputs, var_lookup_fn, strict, _force_outplace)
   1038             check_trace_method = module._c._get_method(method_name)
   1039 

~/anaconda3/envs/proy/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    554                 input = result
    555         if torch._C._get_tracing_state():
--> 556             result = self._slow_forward(*input, **kwargs)
    557         else:
    558             result = self.forward(*input, **kwargs)

~/anaconda3/envs/proy/lib/python3.7/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
    540                 recording_scopes = False
    541         try:
--> 542             result = self.forward(*input, **kwargs)
    543         finally:
    544             if recording_scopes:

~/anaconda3/envs/proy/lib/python3.7/site-packages/torchvision/models/detection/generalized_rcnn.py in forward(self, images, targets)
     68         if isinstance(features, torch.Tensor):
     69             features = OrderedDict([('0', features)])
---> 70         proposals, proposal_losses = self.rpn(images, features, targets)
     71         detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets)
     72         detections = self.transform.postprocess(detections, images.image_sizes, original_image_sizes)

~/anaconda3/envs/proy/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    554                 input = result
    555         if torch._C._get_tracing_state():
--> 556             result = self._slow_forward(*input, **kwargs)
    557         else:
    558             result = self.forward(*input, **kwargs)

~/anaconda3/envs/proy/lib/python3.7/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
    540                 recording_scopes = False
    541         try:
--> 542             result = self.forward(*input, **kwargs)
    543         finally:
    544             if recording_scopes:

~/anaconda3/envs/proy/lib/python3.7/site-packages/torchvision/models/detection/rpn.py in forward(self, images, features, targets)
    486         proposals = self.box_coder.decode(pred_bbox_deltas.detach(), anchors)
    487         proposals = proposals.view(num_images, -1, 4)
--> 488         boxes, scores = self.filter_proposals(proposals, objectness, images.image_sizes, num_anchors_per_level)
    489 
    490         losses = {}

~/anaconda3/envs/proy/lib/python3.7/site-packages/torchvision/models/detection/rpn.py in filter_proposals(self, proposals, objectness, image_shapes, num_anchors_per_level)
    392 
    393         # select top_n boxes independently per level before applying nms
--> 394         top_n_idx = self._get_top_n_idx(objectness, num_anchors_per_level)
    395 
    396         image_range = torch.arange(num_images, device=device)

~/anaconda3/envs/proy/lib/python3.7/site-packages/torchvision/models/detection/rpn.py in _get_top_n_idx(self, objectness, num_anchors_per_level)
    372                 pre_nms_top_n = min(self.pre_nms_top_n(), num_anchors)
    373             _, top_n_idx = ob.topk(pre_nms_top_n, dim=1)
--> 374             r.append(top_n_idx + offset)
    375             offset += num_anchors
    376         return torch.cat(r, dim=1)

RuntimeError: expected device cuda:0 but got device cpu

@WaterKnight1998's issue is also tracked here with a potential solution.

@WaterKnight1998 to complement @ptrblck comment, it seems that your input is a TensorImage (which is not something that we provide in torchvision I believe)
If you pass instead a list of 3d tensors, it should work.

@WaterKnight1998 to complement @ptrblck comment, it seems that your input is a TensorImage (which is not something that we provide in torchvision I believe)
If you pass instead a list of 3d tensors, it should work.

TensorImage is just a normal Tensor from fastai that adds a show function.

The problem we are finding is that after tracing, the output gets changed!

You can find the concrete output here

@WaterKnight1998 I would recommend converting the TensorImage into a Tensor before feeding the image, and making it be a list of tensors of 3 dimensions.

@WaterKnight1998 I would recommend converting the TensorImage into a Tensor before feeding the image, and making it be a list of tensors of 3 dimensions.

I tried using a list of 3D tensors and I am getting a strange empty dict.

({}, [{'scores': tensor([0.0570], grad_fn=<IndexBackward>), 'labels': tensor([1]), 'boxes': tensor([[165.8691, 434.1203, 527.4108, 714.6182]], grad_fn=<StackBackward>), 'masks': tensor([[[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],

@WaterKnight1998 your output seems ok to me, MaskRCNN detected only a single object, with low confidence.

I would make sure that the inputs are fed in the right format (the images should be in the range 0-1).

your output seems ok to me

@fmassa Mask R-CNN without scripting just outputs the second element of the tuple. Is it normal that after tracing it, it returns a tuple with the first element being an empty dict?

@WaterKnight1998 yes, it is.
We raise a warning in https://github.com/pytorch/vision/blob/11a39aaab5b55a3c116c2e8d8001bad94a96f99d/torchvision/models/detection/generalized_rcnn.py#L108
explaining the differences. It's a limitation of torchscript that we can't have different return types depending on the self.training, so we always return both the losses and the detections, although only one of them will be activated.
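
In other words, a small sketch of handling the scripted model's two-element output (in eval mode the losses dict comes back empty and the detections are in the second element):

import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True).eval()
scripted = torch.jit.script(model)

with torch.no_grad():
    losses, detections = scripted([torch.rand(3, 300, 300)])

print(losses)  # {} in eval mode; only populated during training
print(detections[0].keys())  # boxes, labels, scores, masks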

It's a limitation of torchscript that we can't have different return types depending on the self.training, so we always return both the losses and the detections, although only one of them will be activated.

@fmassa Thank you very much for your explanation. It gave me the intuition that I needed!

Hello @fmassa,

Is there any updates on this issue?

torch==1.7.1
torchaudio==0.7.2
torchvision==0.8.2

Traceback (most recent call last):
  File "D:/Projects/tester/main.py", line 62, in <module>
    torch_out = script_module(x)
  File "D:\Projects\tester\venv\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
RuntimeError: forward() Expected a value of type 'List[Tensor]' for argument 'images' but instead found type 'Tensor'.
Position: 1
Value: tensor([[[[-1.3924, -0.3426,  0.1565,  ..., -1.0010, -0.1127,  0.2637],
          [ 0.1392, -1.3978,  0.4600,  ..., -1.7351, -1.3514, -0.4097],
          [ 1.1242, -0.2859,  0.0956,  ..., -0.9409,  0.6421, -0.0713],
          ...,
          [ 0.4488,  0.1756,  1.9472,  ...,  1.3395,  0.0882,  0.2821],
          [ 1.2623,  0.0925, -2.4398,  ..., -0.9513, -2.2078,  1.7615],
          [-0.0645, -0.4522,  1.2193,  ..., -0.3644,  0.0360, -0.1954]],

         [[ 1.1202, -1.4459, -1.7245,  ..., -1.2972, -0.0717,  0.4818],
          [ 0.8732, -0.1661, -0.1113,  ...,  1.9476, -0.4579,  1.1956],
          [-2.1614,  0.3758, -0.7581,  ..., -1.0231, -0.8411, -0.1101],
          ...,
          [ 0.5501,  0.3279, -0.8761,  ..., -0.8433, -0.2146, -1.6229],
          [ 0.6187, -1.9583, -3.2449,  ...,  1.4666, -0.0826,  1.5495],
          [-1.4143,  0.3092, -0.3439,  ...,  0.8020, -0.5509,  0.0355]],

         [[ 0.7972,  0.5274, -1.5208,  ..., -0.6306,  0.5713, -1.0178],
          [ 0.4690,  0.6849,  0.0668,  ..., -0.5453, -1.1445,  0.2774],
          [-0.0832,  1.3775, -0.8812,  ..., -2.3852,  0.5324,  1.5018],
          ...,
          [ 0.6334,  0.4894,  0.3861,  ...,  0.9698,  1.0560, -0.8113],
          [-0.8962,  1.7035, -0.8178,  ..., -0.1556,  1.7010, -0.4338],
          [ 0.0149, -0.4869, -1.8882,  ..., -1.3715,  0.9658, -0.3530]]]])
Declaration: forward(__torch__.torchvision.models.detection.faster_rcnn.FasterRCNN self, Tensor[] images, Dict(str, Tensor)[]? targets=None) -> ((Dict(str, Tensor), Dict(str, Tensor)[]))
Cast error details: Unable to cast Python instance to C++ type (compile in debug mode for details)

@bulatnv TorchScript should be supported for Mask R-CNN models, but they only support the List[Tensor] interface, not a single Tensor.

So instead of doing

model(torch.rand(1, 3, 300, 300))

do instead

model([torch.rand(3, 300, 300)])
