Maskrcnn-benchmark: previously working training code breaks on PyTorch 1.3

Created on 11 Sep 2019 · 7 Comments · Source: facebookresearch/maskrcnn-benchmark

I just upgraded to PyTorch 1.3 (built from source), and training code that previously worked no longer runs.

/usr/local/lib/python3.5/dist-packages/torch/optim/lr_scheduler.py:82: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule.See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  "https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
/pytorch/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
  " please use a dtype torch.bool instead.");
/pytorch/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
  " please use a dtype torch.bool instead.");
/pytorch/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
  " please use a dtype torch.bool instead.");
/pytorch/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
  " please use a dtype torch.bool instead.");
/pytorch/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
  " please use a dtype torch.bool instead.");




maskrcnn-benchmark_local/vendor/maskrcnn-benchmark/maskrcnn_benchmark/engine/trainer.py", line 57, in do_train
    for iteration, (images, targets, _) in enumerate(data_loader, start_iter):
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 846, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.5/dist-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataset.py", line 207, in __getitem__
    return self.datasets[dataset_idx][sample_idx]
  File "//maskrcnn-benchmark_local/vendor/maskrcnn-benchmark/maskrcnn_benchmark/data/datasets/coco.py", line 94, in __getitem__
    target = target.clip_to_image(remove_empty=True)
  File "s/fagangjin/work/maskrcnn-benchmark_local/vendor/maskrcnn-benchmark/maskrcnn_benchmark/structures/bounding_box.py", line 223, in clip_to_image
    return self[keep]
  File k/maskrcnn-benchmark_local/vendor/maskrcnn-benchmark/maskrcnn_benchmark/structures/bounding_box.py", line 208, in __getitem__
    bbox.add_field(k, v[item])
  File "/maskrcnn-benchmark_local/vendor/maskrcnn-benchmark/maskrcnn_benchmark/structures/segmentation_mask.py", line 553, in __getitem__
    selected_instances = self.instances.__getitem__(item)
  File "/maskrcnn-benchmark_local/vendor/maskrcnn-benchmark/maskrcnn_benchmark/structures/segmentation_mask.py", line 462, in __getitem__
    selected_polygons.append(self.polygons[i])
IndexError: list index out of range

This bug has been reported before, but I am sure this one is unrelated, since I just cloned a fresh copy of maskrcnn-benchmark.

It seems to happen only on PyTorch 1.3?

To be more specific, it happens in these lines of code:

link

jinfagang

All 7 comments

It seems related to a change in the default value type:

INFO 09-11 14:20:25 segmentation_mask.py:452 - polygonlist item: tensor([True])

def __getitem__(self, item):
    logging.info('polygonlist item: {}'.format(item))
    if isinstance(item, int):
        selected_polygons = [self.polygons[item]]
    elif isinstance(item, slice):
        selected_polygons = self.polygons[item]
    else:
        # advanced indexing on a single dimension
        selected_polygons = []
        if isinstance(item, torch.Tensor) and item.dtype == torch.uint8:
            item = item.nonzero()
            item = item.squeeze(1) if item.numel() > 0 else item
            item = item.tolist()
        for i in item:
            selected_polygons.append(self.polygons[i])
    return PolygonList(selected_polygons, size=self.size)

I really hope someone can figure out what's happening here.

jinfagang on 11 Sep 2019

I think it is because the return dtype of comparison operations changed in PyTorch 1.2.
https://github.com/pytorch/pytorch/releases
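
A quick way to check this on a given install is the sketch below. It assumes PyTorch 1.2 or later; the box values are made up for illustration and the comparison is the same one used in clip_to_image.

```python
import torch

# Hypothetical boxes in (x1, y1, x2, y2) format; the second one has zero width.
box = torch.tensor([[0.0, 0.0, 10.0, 10.0],
                    [5.0, 5.0, 5.0, 8.0]])

# Same comparison as in clip_to_image.
keep = (box[:, 3] > box[:, 1]) & (box[:, 2] > box[:, 0])

# On PyTorch 1.2 or later the result dtype is torch.bool, so a check such as
# `keep.dtype == torch.uint8` evaluates to False there.
print(torch.__version__, keep.dtype)
```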

If __getitem__ is called from clip_to_image, the dtype of keep is now torch.bool instead of torch.uint8.

So you could change the dtype check in __getitem__ from item.dtype == torch.uint8: to item.dtype == torch.bool:

def clip_to_image(self, remove_empty=True):
    TO_REMOVE = 1
    self.bbox[:, 0].clamp_(min=0, max=self.size[0] - TO_REMOVE)
    self.bbox[:, 1].clamp_(min=0, max=self.size[1] - TO_REMOVE)
    self.bbox[:, 2].clamp_(min=0, max=self.size[0] - TO_REMOVE)
    self.bbox[:, 3].clamp_(min=0, max=self.size[1] - TO_REMOVE)
    if remove_empty:
        box = self.bbox
        keep = (box[:, 3] > box[:, 1]) & (box[:, 2] > box[:, 0])
        return self[keep]
    return self
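
The dtype check described above could be sketched roughly as follows. This is only an illustration, not the code from the repository or from any pull request: mask_to_indices is a hypothetical helper name, the polygon strings are stand-ins for real polygon objects, and accepting both torch.bool and torch.uint8 is an assumption made so the same code runs on older and newer PyTorch versions that have torch.bool.

```python
import torch

def mask_to_indices(item):
    """Hypothetical helper: turn a 1-D mask tensor into a list of int indices.

    Accepts torch.bool masks (PyTorch >= 1.2 comparison results) as well as
    legacy torch.uint8 masks. Anything else is returned unchanged.
    """
    if isinstance(item, torch.Tensor) and item.dtype in (torch.bool, torch.uint8):
        # nonzero() on a 1-D tensor returns shape (n, 1); flatten it to (n,).
        return item.nonzero().reshape(-1).tolist()
    return item

# Stand-ins for the polygon objects stored in a PolygonList.
polygons = ["poly_a", "poly_b", "poly_c"]
keep = torch.tensor([True, False, True])

selected = [polygons[i] for i in mask_to_indices(keep)]
print(selected)
```

Converting the mask to plain integer indices before the loop means the list is never indexed with a boolean tensor element, which appears to be what triggers the IndexError in the traceback.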

We could also resolve the warnings by modifying the dtype in maskrcnn_benchmark/modeling/balanced_positive_negative_sampler.py.

henrywang1 on 11 Sep 2019

👍2

@henrywang1 Thanks, that fixed the index out of range error. I still get the warning after making these changes in modeling/balanced_positive_negative_sampler.py:

pos_idx_per_image_mask = torch.zeros_like(
    matched_idxs_per_image, dtype=torch.bool
)
neg_idx_per_image_mask = torch.zeros_like(
    matched_idxs_per_image, dtype=torch.bool
)

Do you have any idea how to suppress them? I think there must be a lot of places that still use the old way...

jinfagang on 12 Sep 2019

maskrcnn_benchmark/structures/segmentation_mask.py
443:            masks = torch.empty([0, size[1], size[0]], dtype=torch.uint8)

demo_mine/predictor.py
323:            masks_padded = torch.zeros(max_masks, 1, height, width, dtype=torch.uint8)
328:            (masks_per_dim * height, masks_per_dim * width), dtype=torch.uint8

demo_mine/demo_coco_maskrcnn_fbnet.py
177:            masks_padded = torch.zeros(max_masks, 1, height, width, dtype=torch.uint8)
182:            (masks_per_dim * height, masks_per_dim * width), dtype=torch.uint8

demo_mine/demo_coco_maskrcnn_fbnet_xib.py
176:            masks_padded = torch.zeros(max_masks, 1, height, width, dtype=torch.uint8)
181:            (masks_per_dim * height, masks_per_dim * width), dtype=torch.uint8

demo_mine/trace_model.py
91:    # mask = torch.zeros((height, width), dtype=torch.uint8)
126:        color = ((palette * labels[i]) % 255).to(torch.uint8)

maskrcnn_benchmark/modeling/rpn/inference.py
191:            inds_mask = torch.zeros_like(objectness, dtype=torch.uint8)

maskrcnn_benchmark/modeling/rpn/anchor_generator.py
109:            inds_inside = torch.ones(anchors.shape[0], dtype=torch.uint8, device=device)

vendor/maskrcnn-benchmark/demo/predictor.py
363:            masks_padded = torch.zeros(max_masks, 1, height, width, dtype=torch.uint8)
368:            (masks_per_dim * height, masks_per_dim * width), dtype=torch.uint8

vendor/maskrcnn-benchmark/demo_mine/trace_model.py
91:    # mask = torch.zeros((height, width), dtype=torch.uint8)
126:        color = ((palette * labels[i]) % 255).to(torch.uint8)

vendor/maskrcnn-benchmark/demo_mine/export_to_onnx.py
76:    image = torch.nn.functional.upsample(image.permute(2, 0, 1).unsqueeze(0).to(torch.float), size=(960, 1280)).to(torch.uint8).squeeze(0).permute(1, 2, 0).to(device)

maskrcnn_benchmark/modeling/roi_heads/box_head/inference.py
62:            keep = torch.ones(scores.shape, device=scores.device, dtype=torch.uint8)

maskrcnn_benchmark/modeling/roi_heads/mask_head/inference.py
157:        mask = (mask * 255).to(torch.uint8)
159:    im_mask = torch.zeros((im_h, im_w), dtype=torch.uint8)

maskrcnn_benchmark/modeling/roi_heads/keypoint_head/loss.py
157:        valid = cat(valid, dim=0).to(dtype=torch.uint8)

Do all of these need to change?

jinfagang on 12 Sep 2019

@jinfagang
I pasted the wrong path in my previous reply.
To resolve the warning, I only changed the file below.

maskrcnn_benchmark/modeling/rpn/inference.py
191:            inds_mask = torch.zeros_like(objectness, dtype=torch.uint8)

The 1.2 release notes state that "Masking via torch.uint8 Tensors is now deprecated in favor of masking via torch.bool Tensors."

Therefore, the warning is shown only when a torch.uint8 tensor is used as an index or mask to select from a tensor.
Other places that use dtype=torch.uint8 don't need to be changed.
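
To illustrate the distinction, here is a minimal sketch assuming PyTorch 1.2 or later. The tensor values are made up; only the variable names inds_mask, objectness and im_mask echo the lines listed earlier in the thread.

```python
import torch

objectness = torch.tensor([0.9, 0.1, 0.8, 0.3])  # hypothetical scores

# A tensor used as a mask to select elements: allocate it as torch.bool.
inds_mask = torch.zeros_like(objectness, dtype=torch.bool)
inds_mask[[0, 2]] = True
selected = objectness[inds_mask]  # no deprecation warning

# A uint8 tensor used as data (for example pixel values), not as an index:
# nothing needs to change here.
im_mask = (torch.tensor([0.0, 1.0]) * 255).to(torch.uint8)

# If a uint8 mask comes from code you cannot change, converting it at the
# point of indexing is another option.
legacy_mask = torch.tensor([1, 0, 1, 0], dtype=torch.uint8)
selected_legacy = objectness[legacy_mask.bool()]
```

Only the first and last cases involve indexing, so those are the kind of places where the dtype matters for the warning.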

henrywang1 on 12 Sep 2019

👍1

To fix the error, this will help.
https://github.com/facebookresearch/maskrcnn-benchmark/pull/1053/commits/81c190146fd78919675953be26e4a4d9a5283d40

The warning can be fixed as @henrywang1 suggests.

It was said in https://github.com/facebookresearch/maskrcnn-benchmark/pull/1053 that they will "move the whole project to use pytorch 1.2 and rewrite the INSTALL.md accordingly soon."

zhenglilei on 12 Sep 2019

👍1

@henrywang1 You are right, thanks! The warning disappears when "uint8" is replaced by "bool" on line 191 of maskrcnn_benchmark/modeling/rpn/inference.py.

zhenglilei on 12 Sep 2019

👍6
