Mmdetection: Training SSD with WIDER FACE got a problem

Created on 12 Feb 2019 · 5 Comments · Source: open-mmlab/mmdetection

Config and log output are below. RetinaNet trains OK, but SSD fails.

# model settings
input_size = 300
model = dict(
    type='SingleStageDetector',
    # pretrained='open-mmlab://vgg16_caffe',
    pretrained='models/vgg16_caffe-292e1171.pth',
    backbone=dict(
        type='SSDVGG',
        input_size=input_size,
        depth=16,
        with_last_pool=False,
        ceil_mode=True,
        out_indices=(3, 4),
        out_feature_indices=(22, 34),
        l2_norm_scale=20),
    neck=None,
    bbox_head=dict(
        type='SSDHead',
        input_size=input_size,
        in_channels=(512, 1024, 512, 256, 256, 256),
        num_classes=2,
        anchor_strides=(8, 16, 32, 64, 100, 300),
        basesize_ratio_range=(0.15, 0.9),
        anchor_ratios=([2], [2, 3], [2, 3], [2, 3], [2], [2]),
        target_means=(.0, .0, .0, .0),
        target_stds=(0.1, 0.1, 0.2, 0.2)))
cudnn_benchmark = True
train_cfg = dict(
    assigner=dict(
        type='MaxIoUAssigner',
        pos_iou_thr=0.5,
        neg_iou_thr=0.5,
        min_pos_iou=0.,
        ignore_iof_thr=-1,
        gt_max_assign_all=False),
    smoothl1_beta=1.,
    allowed_border=-1,
    pos_weight=-1,
    neg_pos_ratio=3,
    debug=False)
test_cfg = dict(
    nms=dict(type='nms', iou_thr=0.45),
    min_bbox_size=0,
    score_thr=0.02,
    max_per_img=200)
# model training and testing settings
# dataset settings
dataset_type = 'CocoDataset'
data_root = 'data/face_detection/'
img_norm_cfg = dict(mean=[123.675, 116.28, 103.53], std=[1, 1, 1], to_rgb=True)
data = dict(
    imgs_per_gpu=8,
    workers_per_gpu=3,
    train=dict(
        type='RepeatDataset',
        times=5,
        dataset=dict(
            type=dataset_type,
            ann_file=data_root + 'annotations/instances_wider_train_val.json',
            img_prefix=data_root + 'wider_face/',
            img_scale=(300, 300),
            img_norm_cfg=img_norm_cfg,
            size_divisor=None,
            flip_ratio=0.5,
            with_mask=False,
            with_crowd=False,
            with_label=True,
            test_mode=False,
            extra_aug=dict(
                photo_metric_distortion=dict(
                    brightness_delta=32,
                    contrast_range=(0.5, 1.5),
                    saturation_range=(0.5, 1.5),
                    hue_delta=18),
                expand=dict(
                    mean=img_norm_cfg['mean'],
                    to_rgb=img_norm_cfg['to_rgb'],
                    ratio_range=(1, 4)),
                random_crop=dict(
                    min_ious=(0.1, 0.3, 0.5, 0.7, 0.9), min_crop_size=0.3)),
            resize_keep_ratio=False)),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_wider_train_val.json',
        img_prefix=data_root + 'wider_face/',
        img_scale=(300, 300),
        img_norm_cfg=img_norm_cfg,
        size_divisor=None,
        flip_ratio=0,
        with_mask=False,
        with_label=False,
        test_mode=True,
        resize_keep_ratio=False),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_wider_train_val.json',
        img_prefix=data_root + 'wider_face/',
        img_scale=(300, 300),
        img_norm_cfg=img_norm_cfg,
        size_divisor=None,
        flip_ratio=0,
        with_mask=False,
        with_label=False,
        test_mode=True,
        resize_keep_ratio=False))
# optimizer
optimizer = dict(type='SGD', lr=2e-3, momentum=0.9, weight_decay=5e-4)
optimizer_config = dict()
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=1.0 / 3,
    step=[16, 22])
checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
# runtime settings
total_epochs = 24
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/ssd300_wider'
load_from = None
resume_from = None
workflow = [('train', 1)]
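As a sanity check on the schedule above, the learning rate that the first log line reports (`lr: 0.00080` at iteration 50) can be reproduced from `lr=2e-3`, `warmup_iters=500` and `warmup_ratio=1.0 / 3`. The sketch below assumes the linear warmup rule as I understand mmcv implements it (`k = (1 - iter / warmup_iters) * (1 - warmup_ratio)`, `lr = base_lr * (1 - k)`); the function name `linear_warmup_lr` is hypothetical and not part of mmdetection.

```python
def linear_warmup_lr(base_lr, cur_iter, warmup_iters, warmup_ratio):
    """Learning rate under linear warmup (assumed rule, see lead-in).

    Starts at base_lr * warmup_ratio and rises linearly to base_lr
    over warmup_iters iterations.
    """
    if cur_iter >= warmup_iters:
        return base_lr
    k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
    return base_lr * (1 - k)


if __name__ == "__main__":
    # Values taken from the optimizer and lr_config in the config above.
    lr = linear_warmup_lr(base_lr=2e-3, cur_iter=50,
                          warmup_iters=500, warmup_ratio=1.0 / 3)
    print(f"{lr:.5f}")  # 0.00080, matching the first log line
```

So the learning rate in the log is consistent with the config, and the very high initial `loss_cls` is not explained by a warmup misconfiguration alone.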

2019-02-12 10:16:13,219 - INFO - Distributed training: True
2019-02-12 10:16:13,699 - INFO - load model from: models/vgg16_caffe-292e1171.pth
2019-02-12 10:16:13,817 - WARNING - missing keys in source state_dict: extra.6.weight, extra.0.bias, extra.7.bias, extra.3.bias, extra.2.weight, extra.1.bias, extra.5.weight, l2_norm.weight, extra.0.weight, extra.3.weight, extra.4.bias, extra.1.weight, extra.5.bias, extra.4.weight, extra.6.bias, extra.7.weight, extra.2.bias

loading annotations into memory...
loading annotations into memory...
loading annotations into memory...
loading annotations into memory...
loading annotations into memory...
loading annotations into memory...
loading annotations into memory...
loading annotations into memory...
Done (t=1.27s)
creating index...
index created!
Done (t=1.33s)
creating index...
Done (t=1.07s)
creating index...
index created!
Done (t=1.13s)
creating index...
Done (t=1.12s)
creating index...
Done (t=1.09s)
creating index...
Done (t=1.29s)
creating index...
index created!
index created!
index created!
index created!
index created!
Done (t=1.38s)
creating index...
index created!
2019-02-12 10:16:22,574 - INFO - Start running, host: gaozhihua@FA-TRAIN-06, work_dir: /home/gaozhihua/program/mmdetection/work_dirs/ssd300_wider
2019-02-12 10:16:22,575 - INFO - workflow: [('train', 1)], max: 24 epochs
2019-02-12 10:17:41,332 - INFO - Epoch [1][50/1259]     lr: 0.00080, eta: 13:11:54, time: 1.575, data_time: 0.388, loss_cls: 21.3922, loss_reg: 4.9717, loss: 26.3639
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [0,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [1,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [2,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [3,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [4,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [5,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [7,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [8,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [10,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [15,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [19,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [20,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [21,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [22,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/generated/../THCReduceAll.cuh line=317 error=59 : device-side assert triggered
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [5,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/generated/../THCReduceAll.cuh line=317 error=59 : device-side assert triggered
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [2,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/generated/../THCReduceAll.cuh line=317 error=59 : device-side assert triggered
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [34,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [35,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [36,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [37,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [38,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [0,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [1,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [2,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [4,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [5,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [7,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [8,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [9,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [10,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [11,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [12,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [13,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [14,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [15,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [16,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [17,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [18,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [19,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [20,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [21,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [22,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [23,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [24,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [25,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [26,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [29,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/generated/../THCReduceAll.cuh line=317 error=59 : device-side assert triggered
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [0,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/generated/../THCReduceAll.cuh line=317 error=59 : device-side assert triggered
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [0,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [1,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [2,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [3,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [4,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [5,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [6,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [7,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [8,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [9,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [10,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [11,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [12,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [13,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [14,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/generated/../THCReduceAll.cuh line=317 error=59 : device-side assert triggered
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [0,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [1,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [2,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [3,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [4,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [5,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [6,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [7,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [8,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [9,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [10,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [11,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [12,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [13,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [14,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/generated/../THCReduceAll.cuh line=317 error=59 : device-side assert triggered
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [0,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [1,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [2,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [3,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [4,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [5,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [6,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [7,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [8,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [9,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [10,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [11,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [12,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [13,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [14,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [15,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [16,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCTensorScatterGather.cu:124: void THCudaTensor_scatterKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 1]: block: [0,0,0], thread: [17,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/generated/../THCReduceAll.cuh line=317 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "./tools/train.py", line 90, in <module>
    main()
  File "./tools/train.py", line 86, in main
    logger=logger)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmdet/apis/train.py", line 57, in train_detector
    _dist_train(model, dataset, cfg, validate=validate)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmdet/apis/train.py", line 96, in _dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmcv/runner/runner.py", line 355, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmcv/runner/runner.py", line 268, in train
    self.call_hook('after_train_iter')
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmcv/runner/runner.py", line 228, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmdet/core/utils/dist_utils.py", line 53, in after_train_iter
    runner.outputs['loss'].backward()
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/tensor.py", line 102, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/generated/../THCReduceAll.cuh:317
terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: device-side assert triggered (insert_events at /opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCCachingAllocator.cpp:470)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7fe7f9389cc5 in /home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x1338d10 (0x7fe7fce58d10 in /home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/lib/libcaffe2_gpu.so)
frame #2: at::TensorImpl::release_resources() + 0x50 (0x7fe7f99e4f90 in /home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/lib/libcaffe2.so)
frame #3: <unknown function> + 0x2ad15b (0x7fe7f663715b in /home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #4: <unknown function> + 0x313530 (0x7fe7f669d530 in /home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #5: torch::autograd::deleteFunction(torch::autograd::Function*) + 0x2f0 (0x7fe7f66399a0 in /home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #6: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x45 (0x7fe82961edf5 in /home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #7: torch::autograd::Variable::Impl::release_resources() + 0x4a (0x7fe7f68ad92a in /home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #8: <unknown function> + 0x121b2b (0x7fe829636b2b in /home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #9: <unknown function> + 0x31b8df (0x7fe8298308df in /home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #10: <unknown function> + 0x31b921 (0x7fe829830921 in /home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #11: <unknown function> + 0x1993cf (0x55c9b4f2f3cf in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #12: <unknown function> + 0xf18e8 (0x55c9b4e878e8 in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #13: <unknown function> + 0xf18a8 (0x55c9b4e878a8 in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #14: <unknown function> + 0x1993b1 (0x55c9b4f2f3b1 in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #15: <unknown function> + 0xf12b7 (0x55c9b4e872b7 in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #16: <unknown function> + 0xf1147 (0x55c9b4e87147 in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #17: <unknown function> + 0xf115d (0x55c9b4e8715d in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #18: <unknown function> + 0xf115d (0x55c9b4e8715d in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #19: <unknown function> + 0xf115d (0x55c9b4e8715d in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #20: PyDict_SetItem + 0x3da (0x55c9b4ecce7a in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #21: PyDict_SetItemString + 0x4f (0x55c9b4ed578f in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #22: PyImport_Cleanup + 0x99 (0x55c9b4f39709 in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #23: Py_FinalizeEx + 0x61 (0x55c9b4fa55f1 in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #24: Py_Main + 0x35e (0x55c9b4fb01fe in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #25: main + 0xee (0x55c9b4e7902e in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)
frame #26: __libc_start_main + 0xf0 (0x7fe84280a830 in /lib/x86_64-linux-gnu/libc.so.6)
frame #27: <unknown function> + 0x1c3e0e (0x55c9b4f59e0e in /home/gaozhihua/anaconda2/envs/open-mmlab/bin/python)

Traceback (most recent call last):
  File "./tools/train.py", line 90, in <module>
    main()
  File "./tools/train.py", line 86, in main
    logger=logger)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmdet/apis/train.py", line 57, in train_detector
    _dist_train(model, dataset, cfg, validate=validate)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmdet/apis/train.py", line 96, in _dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmcv/runner/runner.py", line 355, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmcv/runner/runner.py", line 268, in train
    self.call_hook('after_train_iter')
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmcv/runner/runner.py", line 228, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmdet/core/utils/dist_utils.py", line 53, in after_train_iter
    runner.outputs['loss'].backward()
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/tensor.py", line 102, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/generated/../THCReduceAll.cuh:317
Traceback (most recent call last):
  File "./tools/train.py", line 90, in <module>
    main()
  File "./tools/train.py", line 86, in main
    logger=logger)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmdet/apis/train.py", line 57, in train_detector
    _dist_train(model, dataset, cfg, validate=validate)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmdet/apis/train.py", line 96, in _dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmcv/runner/runner.py", line 355, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmcv/runner/runner.py", line 268, in train
    self.call_hook('after_train_iter')
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmcv/runner/runner.py", line 228, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/mmdet/core/utils/dist_utils.py", line 53, in after_train_iter
    runner.outputs['loss'].backward()
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/tensor.py", line 102, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/gaozhihua/anaconda2/envs/open-mmlab/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag

All 5 comments

I hit a similar problem. Try decreasing the warmup_ratio or the lr, which worked for me, but I still haven't found the reason.

This is a bug in topk, which is used for hard mining in 'ssd_head.py'.
When the lr is too big, the loss may contain NaN, and for a tensor on the GPU, topk with NaN values is effectively undefined behavior. Please refer to https://github.com/pytorch/pytorch/issues/1810
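
To illustrate the point about NaN and topk: NaN compares false against every value, so a selection that depends on ordering has no well-defined answer once a NaN is present. Below is a minimal pure-Python sketch, not the code in 'ssd_head.py'; the function name `topk_hard_negatives` is hypothetical and only shows the idea of checking the losses before picking the k largest.

```python
import math


def topk_hard_negatives(neg_losses, k):
    """Return indices of the k largest losses, refusing non-finite input.

    Hypothetical helper for illustration only; it is not part of mmdetection.
    """
    # Collect positions of NaN or +/-inf values before any ordering is attempted.
    bad = [i for i, v in enumerate(neg_losses) if not math.isfinite(v)]
    if bad:
        raise ValueError(
            "non-finite loss at indices %s; try a smaller lr or warmup_ratio" % bad)
    # Sort indices by loss, largest first, and keep the first k.
    order = sorted(range(len(neg_losses)),
                   key=lambda i: neg_losses[i], reverse=True)
    return order[:k]


nan = float('nan')
# NaN is neither smaller nor larger than anything, and not equal to itself.
nan_is_unordered = (not nan < 1.0) and (not nan > 1.0) and (nan != nan)

picked = topk_hard_negatives([0.2, 1.5, 0.7, 0.1], 2)
```

In an actual PyTorch training loop the analogous guard would probably be a check such as `torch.isnan(loss).any()` before the topk call, so the run stops with a readable message instead of a device-side assert.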

Yes, you are right.

@yhcao6 Thank you!

It is true that the learning rate was too big.
So I now use lr = 10e-5
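
For anyone landing here, a rough sketch of what the lower lr and warmup_ratio could look like in the config file. Only the lr value and the idea of decreasing warmup_ratio come from this thread; the optimizer type, momentum, weight decay, warmup_iters, step values and the specific warmup_ratio are assumptions based on typical mmdetection configs, so adjust them to your own setup.

```python
# Optimizer and learning-rate schedule fragment for an mmdetection-style config.
# Assumption: every value except lr is an illustrative placeholder.
optimizer = dict(
    type='SGD',
    lr=1e-4,            # same value as the lr = 10e-5 quoted above
    momentum=0.9,       # assumed
    weight_decay=5e-4)  # assumed

lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,        # assumed
    warmup_ratio=1.0 / 10,   # arbitrary example of a decreased warmup_ratio
    step=[16, 22])           # assumed
```

If the loss still turns into NaN with these values, the lr is probably not the only cause and the annotations are worth checking too.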
