Darknet: Training with GPU=1 fails with CUDA error

Created on 14 Jan 2020 · 12 Comments · Source: AlexeyAB/darknet

I'm trying to train on a custom dataset, but when I compile with GPU=1, training fails with the following error. Training works fine with GPU=0. How can I fix this?
PS: I'm training on a Tesla K80 in a VM.

./darknet detector train AnnotationLarge/obj.data AnnotationLarge/yolov3.cfg darknet53.conv.74
 CUDA-version: 10020 (10020), cuDNN: 7.6.5, GPU count: 1
 OpenCV version: 3.2.0d
CUDA status Error: file: ./src/darknet.c : () : line: 467 : build time: Jan 14 2020 - 21:25:37
CUDA Error: cannot set while device is active in this process
CUDA Error: cannot set while device is active in this process: File exists
darknet: ./src/utils.c:325: error: Assertion `0' failed.
Aborted (core dumped)
Solved

All 12 comments

I get the same error.
!chmod 777 darknet
!./darknet detecter data cfg/obj.data cfg/yolov2-tiny-voc.cfg tiny-yolo-voc.conv.13
CUDA status Error: file: ./src/dark_cuda.c : () : line: 495 : build time: Jan 15 2020 - 00:17:22
CUDA Error: no CUDA-capable device is detected
CUDA Error: no CUDA-capable device is detected: Success
darknet: ./src/utils.c:325: error: Assertion `0' failed.
The development environment is Google Colab.
There seems to be a problem with the Makefile.

@eraser3112 Show the output of the command nvidia-smi
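
For readers who want to script the nvidia-smi check requested above (for example in a Colab cell), here is a minimal Python sketch. The function name `nvidia_smi_report` is hypothetical, and treating a zero exit status as "the driver sees a GPU" is an assumption of this sketch, not something Darknet itself does.

```python
import shutil
import subprocess


def nvidia_smi_report(cmd="nvidia-smi"):
    """Return (ok, text); ok is True when `cmd` runs and exits with status 0."""
    path = shutil.which(cmd)
    if path is None:
        # The tool is not installed or not on PATH, so no driver check is possible.
        return False, f"{cmd} not found on PATH"
    try:
        result = subprocess.run([path], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, f"{cmd} could not be run: {exc}"
    if result.returncode == 0:
        return True, result.stdout.strip()
    # On failure, prefer stderr but fall back to stdout if stderr is empty.
    return False, (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    ok, text = nvidia_smi_report()
    print("GPU visible to the driver" if ok else "No usable GPU reported")
    print(text)
```

If this reports no usable GPU, a build with GPU=1 would be expected to fail at startup in a similar way to the "no CUDA-capable device is detected" log above.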

@maykulkarni

CUDA status Error: file: ./src/darknet.c : () : line: 467

Try the latest code and post the error message, since there is no code at line 467: https://github.com/AlexeyAB/darknet/blob/14172d42b68cf9c81ca1150020475f3e79c82fab/src/detector.c#L467

@AlexeyAB

Surprisingly, it was set to not use the GPU. Thank you!

Hi @AlexeyAB
I got the same error as @maykulkarni.
Your CPU version works, and pjreddie's GPU version works,
but your GPU version does not.

CUDA-version: 10000 (10020)
Warning: CUDA-version is lower than Driver-version!
, GPU count: 3
OpenCV version: 3.2.0d
CUDA status Error: file: ./src/darknet.c : () : line: 467 : build time: Jan 15 2020 - 10:52:44
CUDA Error: cannot set while device is active in this process
CUDA Error: cannot set while device is active in this process: File exists
darknet: ./src/utils.c:325: error: Assertion `0' failed.
Aborted (core dumped)

I cloned the repository yesterday, so I think it is the latest version.
Your link points to detector.c, but the error is reported in darknet.c: https://github.com/AlexeyAB/darknet/blob/14172d42b68cf9c81ca1150020475f3e79c82fab/src/detector.c#L467

darknet.c does have code at line 467:
https://github.com/AlexeyAB/darknet/blob/14172d42b68cf9c81ca1150020475f3e79c82fab/src/darknet.c#L467

thanks ~
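
The `CUDA-version: 10000 (10020)` line in the log above uses CUDA's integer version encoding (1000 × major + 10 × minor), so 10000 is 10.0 and 10020 is 10.2. The sketch below decodes such a line. The helper names are hypothetical. Reading the first number as the toolkit version and the parenthesized number as the driver's version is inferred from the warning text in the log, not from the Darknet source.

```python
import re


def decode_cuda_version(code):
    """Convert CUDA's integer encoding (1000*major + 10*minor) to (major, minor)."""
    return code // 1000, (code % 1000) // 10


def parse_cuda_version_line(line):
    """Extract both versions from a line such as 'CUDA-version: 10000 (10020)'.

    Returns ((major, minor), (major, minor)) for the first and the
    parenthesized number, or None if the line does not match.
    """
    match = re.search(r"CUDA-version:\s*(\d+)\s*\((\d+)\)", line)
    if match is None:
        return None
    first = decode_cuda_version(int(match.group(1)))
    second = decode_cuda_version(int(match.group(2)))
    return first, second


if __name__ == "__main__":
    print(parse_cuda_version_line("CUDA-version: 10000 (10020)"))
```

This only makes the numbers readable. It does not show that the version difference caused the "cannot set while device is active" error.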

I got the same error. Help!

@ntucschen Pull the latest code and run make again; the error disappeared for me. I ran the demo successfully.

@Zhangxiaof001 It still doesn't work for me QQ

I tried to debug the C code but couldn't do it. Finally, I solved the problem by scrapping the VM and creating a new one 😄

My configuration:
Ubuntu 18.04
CUDA 10.0
cuDNN 7.6.5
NVIDIA driver 430
OpenCV 3.4.0
With this configuration it builds successfully.

@ntucschen I added a fix. Download the new Darknet version.

@AlexeyAB Thank you very much!
The new version is working 👍 👍 👍
