Darknet: I get an error when I train on my own dataset

Created on 7 Jan 2020 · 10 Comments · Source: AlexeyAB/darknet

When I run Darknet with a pre-trained model, it works and I get a result image.
But when I train on my own data, I get an error, even though I have already checked my cfg and dataset:

CUDA Error Prev: an illegal memory access was encountered
CUDA Error Prev: an illegal memory access was encountered: File exists
darknet: /home/nvidia/darknet/src/utils.c:297: error: Assertion `0' failed.

My GPU is a GTX 1080:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:01:00.0  On |                  N/A |
|  0%   55C    P0    48W / 210W |    251MiB /  8118MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       985      G   /usr/lib/xorg/Xorg                           146MiB |
|    0      1763      G   compiz                                       102MiB |
+-----------------------------------------------------------------------------+

Can you tell me what is wrong with my settings?

Bug fixed

All 10 comments

Hi bro, I have the same error. I use a 1080 Ti; the problem occurs with the new source, but the old source runs well.

Hi,

The same here when trying to train on yolov3_spp.cfg. I use 2x 2080ti


Attach your cfg-file in zip-archive.
What params do you use in the Makefile?
Do you use the latest version of Darknet?

@AlexeyAB

My cfg file is attached. I used the latest version of Darknet, https://github.com/AlexeyAB/darknet/commit/e62506629eba5d68386b68f05c12bc745d0cfba3,
compiled using CMake without any changes (OpenCV 4.1.0, CUDA 10.0, cuDNN 7.4.2).

As @KhoiNguyen1112 said, this error appears with the latest version; https://github.com/AlexeyAB/darknet/commit/dcfeea30f195e0ca1210d580cac8b91b6beaf3f7 was fine for me.
yolov3-spp-asff_mosaic_angle_mse.zip

@alxBO @psw7420fusion @KhoiNguyen1112
Do you get this issue if you train with random=0 in the last [yolo] layer?

@AlexeyAB The error is gone when I set random=0 in the last [yolo] layer. Hmm... what does that mean?

@alxBO
It means that the error was in the resize_shortcut_layer() function, which I have just fixed: https://github.com/AlexeyAB/darknet/commit/9bd88d7fd7cef872998aa931f8280a1ba1578822#diff-452556c0247ea5a4e6095f1c065edcbeR118-R132
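
For anyone who cannot yet update to a build that includes this fix, the random=0 workaround from this thread can be applied to a cfg file with a small script. The following is only a minimal sketch: the helper name set_random_in_last_yolo and the sample cfg text are made up for illustration and are not part of Darknet. Note that random=0 turns off the random network resizing during training, so it is a workaround rather than a fix.

```python
import re


def set_random_in_last_yolo(cfg_text: str, value: int = 0) -> str:
    """Return cfg_text with random=<value> set in the last [yolo] section.

    Hypothetical helper for illustration; not part of Darknet.
    """
    lines = cfg_text.splitlines()
    # Locate the header line of every [yolo] section and keep the last one.
    starts = [i for i, ln in enumerate(lines) if ln.strip().lower() == "[yolo]"]
    if not starts:
        raise ValueError("no [yolo] section found")
    start = starts[-1]
    # The section ends at the next "[...]" header or at the end of the file.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].strip().startswith("[")),
        len(lines),
    )
    for i in range(start + 1, end):
        # Commented-out keys such as "#random=1" are deliberately not matched.
        if re.match(r"\s*random\s*=", lines[i]):
            lines[i] = f"random={value}"
            break
    else:
        # The section has no random= key: add one right after the header.
        lines.insert(start + 1, f"random={value}")
    return "\n".join(lines) + "\n"


# Made-up cfg excerpt with two [yolo] sections; only the last one is changed.
sample = (
    "[yolo]\nclasses=80\nrandom=1\n\n"
    "[convolutional]\nfilters=255\n\n"
    "[yolo]\nclasses=80\nrandom=1\n"
)
patched = set_random_in_last_yolo(sample)
print(patched)
```

To patch a real file, read its text, pass it through the helper, and write the result to a new cfg so the original stays untouched.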

@AlexeyAB Thank you! It works like a charm now.

Attach your cfg-file in zip-archive.
What params do you use in the Makefile?
Do you use the latest version of Darknet?
Yes, I use the latest version of Darknet.

@psw7420fusion Bug is fixed.
