Detectron2: Why does it take so long to start?

Created on 12 Oct 2019 · 9 Comments · Source: facebookresearch/detectron2

❓ Questions and Help

Hello,
When I start training RetinaNet with the default settings, the preparation phase is very slow.
The console output is as follows:

[10/12 14:51:43 detectron2]: Full config saved to output/detectron2/DEBUG/config.yaml
[10/12 14:51:43 d2.utils.env]: Using a generated random seed 43796016
[10/12 15:03:06 d2.engine.defaults]: Model:

From 14:51:43 to 15:03:06 (more than 11 minutes), training does not start.
Could you tell me why it takes so long?
Thank you very much!

installation / environment

pengzhiliang

Most helpful comment

This issue is now fixed with newly updated binaries.
Uninstalling and reinstalling PyTorch from Anaconda will fix it.

soumith on 12 Oct 2019

All 9 comments

Please include details following the issue template

ppwwyyxx on 12 Oct 2019

@ppwwyyxx OK.

I did not modify the config file; I just ran the following command:

DIR=output/detectron2/coco/Retinanet
CUDA_VISIBLE_DEVICES=4,5,6,7 python tools/train_net.py --num-gpus 4 --dist-url auto \
                            --config-file configs/COCO-Detection/retinanet_R_50_FPN_1x.yaml \
                            SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 \
                            MODEL.WEIGHTS models/R-50.pkl \
                            OUTPUT_DIR $DIR

I didn't get an error, but it took a very long time to start.
The main output in the PyCharm console is as follows:

[10/12 14:51:43 detectron2]: Full config saved to output/detectron2/DEBUG/config.yaml
[10/12 14:51:43 d2.utils.env]: Using a generated random seed 43796016
[10/12 15:03:06 d2.engine.defaults]: Model:
RetinaNet(
  (backbone): FPN(
    (fpn_lateral3): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
    ........

Strangely, from 14:51:43 to 15:03:06, training did not start.

My environment info:

---------------------  --------------------------------------------------
Python                 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]
Detectron2 Compiler    GCC 5.4
DETECTRON2_ENV_MODULE  <not set>
PyTorch                1.3.0
PyTorch Debug Build    False
CUDA available         True
GPU 0,1,2,3            GeForce RTX 2080 Ti
Pillow                 6.2.0
cv2                    4.1.1
---------------------  --------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_50,code=compute_50
  - CuDNN 7.6.3
  - Magma 2.5.1
  - Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

In summary, no error occurred, but the preparation phase took a very long time.

Thank you!

pengzhiliang on 12 Oct 2019

Your build of PyTorch does not include precompiled code for your GPU architecture (the NVCC flags above stop at sm_50, while the RTX 2080 Ti is compute capability 7.5). In that case everything will run very slowly at first.

To resolve this, you need to find a different build of PyTorch or build it yourself.
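
As an illustration only (not part of the original reply), the mismatch can be checked mechanically from the "NVCC architecture flags" line of the environment report. The sketch below assumes the usual CUDA compatibility rules: a native `sm_XY` binary runs on devices with the same major version and an equal or higher minor version, and embedded `compute_XY` PTX can be JIT-compiled for any equal or newer device. The helper names `parse_gencode` and `classify` are hypothetical.

```python
import re

# NVCC architecture flags copied from the environment report in this thread.
NVCC_FLAGS = (
    "-gencode;arch=compute_35,code=sm_35;"
    "-gencode;arch=compute_50,code=sm_50;"
    "-gencode;arch=compute_50,code=compute_50"
)


def _to_capability(digits):
    # "75" -> (7, 5); the last digit is the minor version.
    return int(digits[:-1]), int(digits[-1])


def parse_gencode(flags):
    """Return (sass, ptx): capabilities with native binaries and with embedded PTX."""
    sass = {_to_capability(d) for d in re.findall(r"code=sm_(\d+)", flags)}
    ptx = {_to_capability(d) for d in re.findall(r"code=compute_(\d+)", flags)}
    return sass, ptx


def classify(flags, capability):
    """Hypothetical helper: 'native', 'jit', or 'unsupported' for a device capability."""
    sass, ptx = parse_gencode(flags)
    major, minor = capability
    # Native binaries run on the same major version with an equal or higher minor version.
    if any(m == major and n <= minor for m, n in sass):
        return "native"
    # Otherwise the driver can JIT-compile PTX built for an equal or lower capability.
    if any(p <= capability for p in ptx):
        return "jit"
    return "unsupported"


# The RTX 2080 Ti is compute capability 7.5.
print(classify(NVCC_FLAGS, (7, 5)))  # prints "jit"
```

On a live machine, the capability tuple can be read with `torch.cuda.get_device_capability()` and the build string with `torch.__config__.show()`. A result of "jit" matches the symptom reported here: no error, but a long delay while kernels are compiled on first use.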

ppwwyyxx on 12 Oct 2019

OK, Thanks a lot!

pengzhiliang on 12 Oct 2019

@soumith we've seen two reports about this issue. It seems that the PyTorch 1.3 + CUDA 10.1 package on PyPI is built with GPU code for architectures up to compute capability 7.5, while the package on conda only has GPU code up to 5.0.

To users: installing with pip rather than conda should help.

ppwwyyxx on 12 Oct 2019

Sorry, I ran into the same problem here: it takes a very long time to start (PyTorch 1.3 + CUDA 10.1).