Mmdetection: IndexError: list index out of range

Created on 8 May 2020 · 13 comments · Source: open-mmlab/mmdetection

I tried to use this command to evaluate my results:

!python content/mmdetection/tools/test.py \
    content/mmdetection/configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712.py \
    work_dirs/faster_rcnn_r50_fpn_1x_voc0712/latest.pth \
    --eval mAP

and got this error message:

File "content/mmdetection/tools/test.py", line 149, in <module>
    main()
File "content/mmdetection/tools/test.py", line 145, in main
    dataset.evaluate(outputs, args.eval, **kwargs)
File "/content/mmdetection/mmdet/datasets/voc.py", line 48, in evaluate
    logger=logger)
File "/content/mmdetection/mmdet/core/evaluation/mean_ap.py", line 387, in eval_map
    mean_ap, eval_results, dataset, area_ranges, logger=logger)
File "/content/mmdetection/mmdet/core/evaluation/mean_ap.py", line 450, in print_map_summary
    label_names[j], num_gts[i, j], results[j]['num_dets'],
IndexError: list index out of range
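[Editor's note] The failure mode in the last frame can be reproduced in isolation. The following is a toy sketch (not mmdetection's actual code, and the class names are placeholders): `print_map_summary` indexes `label_names[j]` for every class the model predicts, so if the model was built with `num_classes=13` while the dataset only defines 12 names, the 13th lookup goes out of range.

```python
# Toy reconstruction of the failure in print_map_summary (illustrative only):
# evaluation produces one result row per model class, but label_names comes
# from the dataset's class list. 13 model classes vs. 12 names -> IndexError.
label_names = [f'class_{k}' for k in range(12)]   # 12 dataset class names
results = [{'num_dets': 0, 'ap': 0.0}] * 13       # 13 per-class result rows

try:
    for j in range(len(results)):
        row = (label_names[j], results[j]['num_dets'])
except IndexError:
    print('IndexError at j =', j)   # fails at j == 12, the 13th class
```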

my config is:

model = dict(
    type='FasterRCNN',
    pretrained='torchvision://resnet50',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
    roi_head=dict(
        type='StandardRoIHead',
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', out_size=7, sample_num=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=13,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0.0, 0.0, 0.0, 0.0],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='L1Loss', loss_weight=1.0))))
train_cfg = dict(
    rpn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.7,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            match_low_quality=True,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=256,
            pos_fraction=0.5,
            neg_pos_ub=-1,
            add_gt_as_proposals=False),
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    rpn_proposal=dict(
        nms_across_levels=False,
        nms_pre=2000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.5,
            min_pos_iou=0.5,
            match_low_quality=False,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=512,
            pos_fraction=0.25,
            neg_pos_ub=-1,
            add_gt_as_proposals=True),
        pos_weight=-1,
        debug=False))
test_cfg = dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=1000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.5,
        nms=dict(type='nms', iou_thr=0.5),
        max_per_img=20))
dataset_type = 'VOCDataset'
data_root = 'content/mmdetection/data/VOCdevkit/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1000, 600), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1000, 600),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='RepeatDataset',
        times=3,
        dataset=dict(
            type='VOCDataset',
            ann_file=['content/mmdetection/data/VOCdevkit/VOC2007/ImageSets/Main/train.txt'],
            img_prefix=['content/mmdetection/data/VOCdevkit/VOC2007/'],
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='LoadAnnotations', with_bbox=True),
                dict(type='Resize', img_scale=(1000, 600), keep_ratio=True),
                dict(type='RandomFlip', flip_ratio=0.5),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(type='DefaultFormatBundle'),
                dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
            ])),
    val=dict(
        type='VOCDataset',
        ann_file='content/mmdetection/data/VOCdevkit/VOC2007/ImageSets/Main/test.txt',
        img_prefix='content/mmdetection/data/VOCdevkit/VOC2007/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1000, 600),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='VOCDataset',
        ann_file='content/mmdetection/data/VOCdevkit/VOC2007/ImageSets/Main/test.txt',
        img_prefix='content/mmdetection/data/VOCdevkit/VOC2007/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1000, 600),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
evaluation = dict(interval=1, metric='mAP')
checkpoint_config = dict(interval=1)
log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict(policy='step', step=[3])
total_epochs = 4
work_dir = './work_dirs/faster_rcnn_r50_fpn_1x_voc0712'
gpu_ids = range(0, 1)

The dataset I use has 12 classes, so I changed num_classes to 13.
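[Editor's note] As the accepted answer below explains, mmdetection v2.0 no longer counts the background class, so a 12-class dataset needs `num_classes=12`, not 13, and the dataset's class list must have the same 12 names. A hedged sketch of the relevant config override (the class names shown are placeholders, not the reporter's actual categories):

```python
# mmdetection v2.0: num_classes counts foreground classes only (no background).
# For a 12-class custom dataset:
model = dict(
    roi_head=dict(
        bbox_head=dict(num_classes=12)))  # not 13 as in v1.x configs

# The class list used for evaluation (the dataset's CLASSES, or the VOC list
# in mmdet/core/evaluation/class_names.py) must contain exactly 12 names;
# placeholder names shown here:
classes = ('cat1', 'cat2', 'cat3', 'cat4', 'cat5', 'cat6',
           'cat7', 'cat8', 'cat9', 'cat10', 'cat11', 'cat12')
```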

Thank you!

Most helpful comment

Hi @vick-wuwei ,
You need to change to num_classes=1, since v2.0 does not count the background class anymore.

All 13 comments

Please follow the Error report issue template.

I met the same error after I upgraded to mmdet 2.0 and trained on a custom dataset with 1 positive class. I saw the default COCO config has num_classes=80, so I guessed I should set num_classes=1. Then all the losses were 0 during training.

Then I changed to num_classes=2 and got the error above at the end of the evaluation on the val dataset after the 1st epoch of training.

My environment:
PyTorch 1.4 + mmdet 2.0

My config file:

_base_ = './foveabox/fovea_r50_fpn_4x4_1x_coco.py'
model = dict(
    bbox_head=dict(
        num_classes=1,
        sigma=0.5,
        with_deform=True,
        norm_cfg=dict(type='GN', num_groups=32, requires_grad=True)))

# dataset settings

dataset_type = 'TCTDataset'
data_root = '/media/cw/data/tct_train/iter22'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile', to_float32=True),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='PhotoMetricDistortion',
        brightness_delta=10,
        contrast_range=(0.9, 1.1),
        saturation_range=(0.95, 1.05),
        hue_delta=6),
    # mimic random scale: 1126 upper bound, 0.8*1126 ≈ 900 lower bound
    dict(type='Resize', img_scale=(1126, 1126), ratio_range=(0.8, 1)),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1024, 1024),
        flip=False,
        transforms=[
            # dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=3,
    workers_per_gpu=3,
    train=dict(
        type=dataset_type,
        ann_file=data_root + '/iter22_train_v1_mmdet.csv',
        img_prefix=data_root,
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + '/iter22_val_v1_mmdet.csv',
        img_prefix=data_root,
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + '/iter22_val_v1_mmdet.csv',
        img_prefix=data_root,
        pipeline=test_pipeline))
evaluation = dict(interval=1, metric='mAP')

# learning policy

lr_config = dict(
    policy='step',
    warmup=None,
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[15])
total_epochs = 30
checkpoint_config = dict(interval=1)
optimizer = dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(
    _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))

# yapf:disable

log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')
    ])
load_from = './pretrained/fovea_align_gn_ms_r50_fpn_4gpu_2x_20190905-13374f33.pth'
resume_from = None

I upgraded to:
PyTorch 1.5
CUDA 10.2
mmdet 2.0
and I still get the same error as above.

And I did one more experiment and found another error; I don't know if it's related.
When I run training with the default config fovea_align_r50_fpn_gn-head_4x4_2x_coco.py and samples_per_gpu > 1, I get:
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
I'm sure I still have enough GPU memory (my GPU is a 2080 Ti).

I am also facing the same issue.

How do I change the number of classes for a custom dataset?
Maybe that's why it's throwing this error. If you solve it, please let me know.

I have the same issue.
I use mmdet 2.0 and changed a config file that was written for v1.1.0.
In v1.1.0, "SharedFCBBoxHead" with num_classes=2 outputs 2 classes,
but in v2.0, with the same backbone and "Shared2FCBBoxHead" with num_classes=2, I get the warning:

size mismatch for roi_head.bbox_head.fc_cls.weight: copying a param with shape torch.Size([2, 1024]) from checkpoint, the shape in current model is torch.Size([3, 1024]).

Why is the current model size([3, 1024])?

Hi @vick-wuwei ,
You need to change to num_classes=1, since v2.0 does not count the background class anymore.
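[Editor's note] The size-mismatch warning above follows from the same convention change. A rough sketch of the arithmetic, under the assumption that with `use_sigmoid=False` the v2.0 classification layer gets `num_classes + 1` outputs (one extra background logit), while v1.x expected `num_classes` to already include background:

```python
# Rough arithmetic behind the size-mismatch warning (illustrative helper
# functions, not mmdetection API). fc_in is the shared-FC feature width.
fc_in = 1024

def fc_cls_shape_v1(num_classes):
    # v1.x convention: num_classes already counts the background class
    return (num_classes, fc_in)

def fc_cls_shape_v2(num_classes):
    # v2.0 convention: foreground classes only; one background logit appended
    return (num_classes + 1, fc_in)

print(fc_cls_shape_v1(2))  # checkpoint trained with v1: (2, 1024)
print(fc_cls_shape_v2(2))  # same num_classes in a v2 model: (3, 1024)
print(fc_cls_shape_v2(1))  # the suggested fix, num_classes=1: (2, 1024)
```

With `num_classes=1` in v2.0, the classifier shape matches the v1 checkpoint again, which is why the advice above resolves the `[2, 1024]` vs `[3, 1024]` mismatch.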

Hi @ZwwWayne ,

In my case, I tried num_classes=1 and kept the other configs as in mmdet v1, but all losses were 0, as I mentioned above.

When I used mmdet v1, it went well. I don't know if it's a config issue or a bug.

Hi @edwardyangxin ,
I am really confused about the status of your bug report after these posts. Could you raise a new issue describing your bug using the Error Template? If you tried many cases, please point out which settings you changed and what results you got in each case.

Hi @RiyazAina-DeepML ,
If you meet an issue, please raise it following the Error Template. We are not able to help you out with just "I am also facing the same issue". For how to change the number of classes, please read the tutorial.

This issue is closed since the original reporter did not use the Error Template and the following posts are messy and do not help locate the bug. Feel free to open a new issue with a clear description using the Error Template.

For anyone who has encountered the same problem: check whether you have written the correct class names in mmdet/core/evaluation/class_names.py, as the code reads label_names from there.
That may be the cause of the out-of-range index.

I have met the same problem. This is my solution: https://www.codenong.com/cs107067843/
