Dlib: On NVIDIA RTX 2080 Ti, CUDA 10, Ubuntu 18.04, dlib returns error: CUDNN_STATUS_EXECUTION_FAILED

Created on 19 Feb 2019 · 12 Comments · Source: davisking/dlib

Expected Behavior

dlib compiled with the GPU flag should work on an RTX 2080 Ti without the CUDNN_STATUS_EXECUTION_FAILED error.

Current Behavior

On an NVIDIA RTX 2080 Ti with CUDA 10 and Ubuntu 18.04, dlib returns the following error:
File "face_clustering.py", line 77, in <module>
face_descriptor = facerec.compute_face_descriptor(img, shape)
RuntimeError: Error while calling cudnnConvolutionForward( context(), &alpha, descriptor(data), data.device(), (const cudnnFilterDescriptor_t)filter_handle, filters.device(), (const cudnnConvolutionDescriptor_t)conv_handle, (cudnnConvolutionFwdAlgo_t)forward_algo, forward_workspace, forward_workspace_size_in_bytes, &beta, descriptor(output), output.device()) in file /home/maxprog/dlib/dlib/cuda/cudnn_dlibapi.cpp:1007. code: 8, reason: CUDNN_STATUS_EXECUTION_FAILED

Steps to Reproduce

Environment: NVIDIA RTX 2080 Ti, CUDA 10, Ubuntu 18.04
Steps to reproduce: After compiling and building dlib from source with GPU support, run the following from python_examples:
python face_clustering.py shape_predictor_5_face_landmarks.dat dlib_face_recognition_resnet_model_v1.dat ../examples/faces output_folder

Version: 19.16.99
Where did you get dlib: official source from the GitHub repo, https://github.com/davisking/dlib/
Platform: Ubuntu 18.04 64-bit
Compiler:
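
Before running the example, it can help to confirm that the installed Python module was really built with CUDA. The snippet below is a minimal sketch, not part of the original report: it assumes the dlib Python bindings expose `DLIB_USE_CUDA` and `dlib.cuda.get_num_devices()`, and the helper name `describe_dlib_cuda` is made up for illustration.

```python
def describe_dlib_cuda():
    """Report whether dlib is importable and whether it was built with CUDA."""
    try:
        import dlib
    except ImportError:
        # dlib is not installed in this interpreter.
        return {"dlib_installed": False}

    info = {
        "dlib_installed": True,
        "version": dlib.__version__,
        "built_with_cuda": bool(dlib.DLIB_USE_CUDA),
    }
    if info["built_with_cuda"]:
        # Number of GPUs the CUDA runtime can see from this process.
        info["cuda_devices"] = dlib.cuda.get_num_devices()
    return info


print(describe_dlib_cuda())
```

If `built_with_cuda` is reported as false, the build fell back to CPU and the cuDNN error above would not be expected at all.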

Source

maxprog

Most helpful comment

@maxprog You don't want to share your tiny bit of knowledge for free - but you literally posted here asking for free help from a volunteer community about a massive free open source library that Davis and many volunteers spent thousands of hours creating (for free) so that you could use it for free?

ageitgey on 20 Aug 2019

😄14 ❤7

All 12 comments

Problem solved :)

maxprog on 19 Feb 2019

How did you fix it?

LouisScorpio on 26 Feb 2019

Please email me at [email protected] and I will tell you how to solve it.

maxprog on 27 Feb 2019

👎20 😕2

Hi,
I will help you.
Which GPU do you have, an RTX 2080 Ti?

maxprog on 27 Feb 2019

Yes, RTX 2080 Ti, CUDA 9.0, Ubuntu 18.04

LouisScorpio on 11 Mar 2019

@LouisScorpio Have you fixed your problem? I have the same problem as you. Can you tell me how to fix it? Thanks. RTX 2080 Ti, CUDA 10.0, Ubuntu 16.04

fangs22 on 11 Apr 2019

I am facing this issue with CUDA 9.0, Ubuntu 16.04, and an RTX 2080 Ti. Any solutions?

Davidrjx on 13 Apr 2019

Hey, I am not sure why @maxprog doesn't want to share his solution, but I had the exact same issue. After trying out various combinations of CUDA and NVIDIA drivers, I purged everything I could still find. Then I installed cuda_10.1.168_418.67_linux.run (including the packaged drivers!) together with cudnn-10.1-linux-x64-v7.6.0.64.tgz. That worked. Before that, I had tried driver 430.14 with the same CUDA 10.1, but it threw the error.

You can check the runtime version (located in /usr/lib) using nvidia-smi, and the toolkit version (located in /usr/local/lib). Running ldconfig -p | grep cuda is also a good way to see whether any stray CUDA installation is lying around.
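
The ldconfig check can also be scripted. This is only a rough sketch under assumptions: the helper names `find_cuda_libs` and `list_installed_cuda_libs` are hypothetical, and the library name prefixes it looks for (libcuda, libcudnn, libcublas) are a guess at what is relevant, not an exhaustive list.

```python
import shutil
import subprocess


def find_cuda_libs(ldconfig_output):
    """Return (library name, path) pairs for CUDA-related lines of `ldconfig -p` output."""
    # "libcuda" also matches libcudart; the prefixes are an assumption, extend as needed.
    prefixes = ("libcuda", "libcudnn", "libcublas")
    found = []
    for line in ldconfig_output.splitlines():
        # Cache entries look like: "<name> (<flags>) => <path>"
        if "=>" not in line:
            continue
        name_part, path = line.split("=>", 1)
        name = name_part.strip().split(" ", 1)[0]
        if name.startswith(prefixes):
            found.append((name, path.strip()))
    return found


def list_installed_cuda_libs():
    """Run `ldconfig -p` and return the CUDA-related entries (empty list if unavailable)."""
    ldconfig = shutil.which("ldconfig") or "/sbin/ldconfig"
    try:
        result = subprocess.run([ldconfig, "-p"], capture_output=True, text=True, check=False)
    except OSError:
        return []
    return find_cuda_libs(result.stdout)


if __name__ == "__main__":
    for name, path in list_installed_cuda_libs():
        print(f"{name:30s} {path}")
```

Seeing the same library listed under more than one CUDA directory would be a hint that an older installation was not fully purged.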

Hope it helps!

elggem on 4 Jun 2019

Knowledge has a price. That is the reason I do not share my knowledge for free.

maxprog on 4 Jun 2019

👎27 😕2

I've found an ugly workaround for this problem: it seems to fail only the first time the model is called, so you can "warm it up" with a random image such as np.empty(shape=(150, 150), dtype=np.uint8) while ignoring the RuntimeError, and subsequent calls should work fine. That solved the same problem with cnn_face_detection_model_v1 for me; I'm not sure whether it works for face_recognition_model_v1 or any other model.
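
A minimal sketch of that warm-up idea, in case it is useful. The helper name `warm_up` is hypothetical, and the dlib usage in the comments (including the model file name) is an assumption rather than code from this thread. It hides the first failure instead of fixing the underlying driver or cuDNN mismatch.

```python
def warm_up(model_call, dummy_input):
    """Call model_call(dummy_input) once and ignore a RuntimeError.

    Returns True if the warm-up call succeeded, False if it raised RuntimeError.
    """
    try:
        model_call(dummy_input)
        return True
    except RuntimeError:
        # The first cuDNN call is the one reported to fail; swallow it here.
        return False


# Hypothetical usage with dlib (not executed here; the model file name is an assumption):
#   import dlib
#   import numpy as np
#   detector = dlib.cnn_face_detection_model_v1("mmod_human_face_detector.dat")
#   dummy = np.empty(shape=(150, 150), dtype=np.uint8)
#   warm_up(detector, dummy)        # first call may fail and is ignored
#   detections = detector(img, 1)   # later calls are reported to work
```

Only RuntimeError is caught, so other problems (for example a wrong input type) still surface normally.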