Detectron2: The checkpoint contains parameters not used by the model

Created on 8 Feb 2020 · 11 Comments · Source: facebookresearch/detectron2

I am trying to train TridentNet on a custom dataset. This is the config I used:

`import os

from detectron2.checkpoint import DetectionCheckpointer
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer
from detectron2.modeling import build_model
from projects.TridentNet.tridentnet import add_tridentnet_config

# project_root is defined elsewhere in the script (not shown here)
cfg = get_cfg()
add_tridentnet_config(cfg)
cfg.merge_from_file(project_root+"/projects/TridentNet/configs/tridentnet_fast_R_50_C4_3x.yaml")
cfg.DATASETS.TRAIN = ("train", )
# note: this makes the output directory itself end in "model_final.pth"
cfg.OUTPUT_DIR = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.DATASETS.TEST = ("val", )
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.001
cfg.SOLVER.MAX_ITER = 200000
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1
cfg.TEST.EVAL_PERIOD = 200
cfg.SOLVER.CHECKPOINT_PERIOD = 600
cfg.SOLVER.MOMENTUM = 0.87

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()`

I got the following error:

(screenshot: tridenNet_error)

All 11 comments

You're loading an ImageNet pre-trained model (because that's what's written in the config file), and the ImageNet pre-trained model contains classification layers that are not used by the detection model. So the message is expected.
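As a rough illustration of why the message appears, the sketch below compares the parameter names in a checkpoint with the names a model expects. It uses plain Python lists rather than detectron2's actual loader. The `fc1000_*` and `roi_heads.box_predictor.cls_score.*` names come from the log in this thread; `backbone.conv1.weight` and the `compare_keys` helper are hypothetical and only there to make the example self-contained.

```python
# Illustrative only: plain lists stand in for a checkpoint and a model's parameters.
# This is not detectron2's loading code.

def compare_keys(checkpoint_keys, model_keys):
    """Return (unused_in_checkpoint, missing_from_checkpoint), both sorted."""
    checkpoint_keys = set(checkpoint_keys)
    model_keys = set(model_keys)
    unused = sorted(checkpoint_keys - model_keys)    # in the checkpoint, not in the model
    missing = sorted(model_keys - checkpoint_keys)   # in the model, not in the checkpoint
    return unused, missing

# A classification checkpoint: shared backbone weights plus the 1000-way ImageNet head.
checkpoint = ["backbone.conv1.weight", "fc1000_w", "fc1000_b"]

# A detection model: the same backbone plus detection heads that start untrained.
model = [
    "backbone.conv1.weight",
    "roi_heads.box_predictor.cls_score.weight",
    "roi_heads.box_predictor.cls_score.bias",
]

unused, missing = compare_keys(checkpoint, model)
print("checkpoint parameters not used by the model:", unused)
print("model parameters not found in the checkpoint:", missing)
```

Both lists are informational in this situation: the backbone weights are shared, the classification head is dropped, and the detection heads are learned during training.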

Which ImageNet pre-trained model should be used for detection?

that's what's written in the config file

Did you mean that `cfg.MODEL.WEIGHTS = "detectron2://ImageNetPretrained/MSRA/R-50.pkl"` contains a classification layer? How do I solve this error? Can you show which part of the config file mentions the classification layer?

Yes.
It's expected, which means it's not an error.

But training failed at iteration 0 itself; you can see it in the question.

Please provide the full logs. I can't see what the error is in the screenshot.

>   proposal_generator.anchor_generator.cell_anchors.0
  proposal_generator.rpn_head.anchor_deltas.{bias, weight}
  proposal_generator.rpn_head.conv.{bias, weight}
  proposal_generator.rpn_head.objectness_logits.{bias, weight}
  roi_heads.box_predictor.bbox_pred.{bias, weight}
  roi_heads.box_predictor.cls_score.{bias, weight}
[02/10 15:36:04 d2.checkpoint.c2_model_loading]: The checkpoint contains parameters not used by the model:
  fc1000_b
  fc1000_w
  conv1_b
[02/10 15:36:04 d2.engine.train_loop]: Starting training from iteration 0
[02/10 15:36:04 d2.engine.hooks]: Total training time: 0:00:00 (0:00:00 on hooks)
Registering val image 
100%|██████████| 9958/9958 [01:20<00:00, 124.41it/s]
9958 Images registered successfully.
[02/10 15:37:25 d2.data.build]: Distribution of instances among all 1 categories:
|  category  | #instances   |
|:----------:|:-------------|
|   person   | 57016        |
|            |              |
WARNING [02/10 15:37:25 d2.engine.defaults]: No evaluator found. Use `DefaultTrainer.test(evaluators=)`, or implement its `build_evaluator` method.
Traceback (most recent call last):
  File "tridentnet_custom_train.py", line 96, in <module>
    trainer.train()
  File "/mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2/engine/defaults.py", line 373, in train
    super().train(self.start_iter, self.max_iter)
  File "/mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2/engine/train_loop.py", line 132, in train
    self.run_step()
  File "/mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2/engine/train_loop.py", line 212, in run_step
    loss_dict = self.model(data)
  File "/opt/anaconda3/envs/d2_train/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2/modeling/meta_arch/rcnn.py", line 129, in forward
    _, detector_losses = self.roi_heads(images, features, proposals, gt_instances)
  File "/opt/anaconda3/envs/d2_train/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/projects/TridentNet/tridentnet/trident_rcnn.py", line 66, in forward
    pred_instances, losses = super().forward(images, features, proposals, all_targets)
  File "/mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2/modeling/roi_heads/roi_heads.py", line 392, in forward
    box_features = self._shared_roi_transform(
  File "/mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2/modeling/roi_heads/roi_heads.py", line 378, in _shared_roi_transform
    x = self.pooler(features, boxes)
  File "/opt/anaconda3/envs/d2_train/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2/modeling/poolers.py", line 215, in forward
    return self.level_poolers[0](x[0], pooler_fmt_boxes)
  File "/opt/anaconda3/envs/d2_train/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2/layers/roi_align.py", line 94, in forward
    return roi_align(
  File "/mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2/layers/roi_align.py", line 19, in forward
    output = _C.roi_align_forward(
RuntimeError: CUDA error: invalid device function (ROIAlign_forward_cuda at /mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2/layers/csrc/ROIAlign/ROIAlign_cuda.cu:361)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7f4ac3239627 in /opt/anaconda3/envs/d2_train/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: detectron2::ROIAlign_forward_cuda(at::Tensor const&, at::Tensor const&, float, int, int, int, bool) + 0xa24 (0x7f4aa8c8c770 in /mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2/_C.cpython-38-x86_64-linux-gnu.so)
frame #2: detectron2::ROIAlign_forward(at::Tensor const&, at::Tensor const&, float, int, int, int, bool) + 0xb6 (0x7f4aa8c09fc6 in /mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2/_C.cpython-38-x86_64-linux-gnu.so)
frame #3: <unknown function> + 0x543a9 (0x7f4aa8c1a3a9 in /mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2/_C.cpython-38-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x5039e (0x7f4aa8c1639e in /mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2/_C.cpython-38-x86_64-linux-gnu.so)
<omitting python frames> frame #10: THPFunction_apply(_object*, _object*) + 0xb2f (0x7f4af51c0d1f in /opt/anaconda3/envs/d2_train/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
[1]    11845 segmentation fault (core dumped)  python tridentnet_custom_train.py
(d2_train) 

Your issue is answered in https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md#common-installation-issues already.
If you need help, please also include environment information following the issue template.

I have already trained a RetinaNet model using detectron2. I got the above error when I tried other models.
The output of 'python -m detectron2.utils.collect_env':

$ python -m detectron2.utils.collect_env


sys.platform linux
Python 3.8.1 (default, Jan 8 2020, 22:29:32) [GCC 7.3.0]
numpy 1.18.1
detectron2 0.1 @/mnt/Data_common/PPE_Violation_Detection_Samjith/MPC_model/detectron2/detectron2
detectron2 compiler GCC 7.4
detectron2 CUDA compiler 10.0
detectron2 arch flags sm_61
DETECTRON2_ENV_MODULE
PyTorch 1.4.0 @/opt/anaconda3/envs/d2_train/lib/python3.8/site-packages/torch
PyTorch debug build False
CUDA available True
GPU 0,1 GeForce GTX 1080
CUDA_HOME /usr/local/cuda
NVCC Cuda compilation tools, release 10.0, V10.0.130
Pillow 6.2.2
torchvision 0.5.0 @/opt/anaconda3/envs/d2_train/lib/python3.8/site-packages/torchvision
torchvision arch flags sm_35, sm_50, sm_60, sm_70, sm_75
cv2 4.2.0


PyTorch built with:

  • GCC 7.3
  • Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CUDA Runtime 10.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.1
  • Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

I have found the following:

detectron2 CUDA compiler 10.0
CUDA_HOME /usr/local/cuda
PyTorch built with:
- CUDA Runtime 10.1

detectron2 was compiled with CUDA 10.0, but PyTorch was built with CUDA 10.1. Should I rebuild detectron2, or should I install CUDA 10.0 and rebuild PyTorch with CUDA 10.0?
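The comparison above can be sketched as a small script that reads the two CUDA versions out of the `collect_env` text and flags a difference. This is only a minimal sketch: the `cuda_versions` helper is hypothetical, and the regular expressions assume the output keeps the `detectron2 CUDA compiler` and `CUDA Runtime` labels exactly as they appear in this thread.

```python
import re

def cuda_versions(env_text):
    """Extract the CUDA versions reported for detectron2 and PyTorch, if present."""
    d2 = re.search(r"detectron2 CUDA compiler\s+(\d+\.\d+)", env_text)
    pt = re.search(r"CUDA Runtime\s+(\d+\.\d+)", env_text)
    return (d2.group(1) if d2 else None, pt.group(1) if pt else None)

# The relevant lines of the collect_env output quoted in this thread.
env_text = """\
detectron2 CUDA compiler 10.0
CUDA_HOME /usr/local/cuda
PyTorch built with:
- CUDA Runtime 10.1
"""

d2_cuda, torch_cuda = cuda_versions(env_text)
if d2_cuda and torch_cuda and d2_cuda != torch_cuda:
    print(f"mismatch: detectron2 compiled with CUDA {d2_cuda}, "
          f"PyTorch built with CUDA {torch_cuda}")
```

To check a real environment, the output of `python -m detectron2.utils.collect_env` can be saved to a file and passed in as `env_text`.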

Your issue is answered in https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md#common-installation-issues already.
If you need help, please also include environment information following the issue template.

I have built detectron2 with CUDA 10.1 using the following command:

for CUDA 10.1:

pip install detectron2 -f \
https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/index.html
