Dlib: On NVIDIA RTX 2080 Ti, CUDA 10, Ubuntu 18.04, dlib returns error: CUDNN_STATUS_EXECUTION_FAILED

Created on 19 Feb 2019  ·  12 Comments  ·  Source: davisking/dlib

Expected Behavior

dlib compiled with the GPU flag should work on an RTX 2080 Ti without the CUDNN_STATUS_EXECUTION_FAILED error.

Current Behavior

On NVIDIA RTX 2080 Ti, CUDA 10, Ubuntu 18.04, dlib returns this error:
File "face_clustering.py", line 77, in
face_descriptor = facerec.compute_face_descriptor(img, shape)
RuntimeError: Error while calling cudnnConvolutionForward( context(), &alpha, descriptor(data), data.device(), (const cudnnFilterDescriptor_t)filter_handle, filters.device(), (const cudnnConvolutionDescriptor_t)conv_handle, (cudnnConvolutionFwdAlgo_t)forward_algo, forward_workspace, forward_workspace_size_in_bytes, &beta, descriptor(output), output.device()) in file /home/maxprog/dlib/dlib/cuda/cudnn_dlibapi.cpp:1007. code: 8, reason: CUDNN_STATUS_EXECUTION_FAILED

Steps to Reproduce

Environment: NVIDIA RTX 2080 Ti, CUDA 10, Ubuntu 18.04.
Steps to reproduce: After compiling and building dlib from source with GPU support, run the following from python_examples:
python face_clustering.py shape_predictor_5_face_landmarks.dat dlib_face_recognition_resnet_model_v1.dat ../examples/faces output_folder

All 12 comments

Problem solved :)

How did you fix it?

Please write to me by email: [email protected] - then I will explain how to solve it.

Hi,
I will help you.
Which GPU do you have - an RTX 2080 Ti?


On 26.02.2019, at 09:57, LouisScorpio notifications@github.com wrote:

How do you fix it ?



Yes, RTX 2080 Ti, CUDA 9.0, Ubuntu 18.04.

@LouisScorpio Have you fixed your problem? I have run into the same problem as you. Can you tell me how you fixed it? Thanks. RTX 2080 Ti, CUDA 10.0, Ubuntu 16.04.

I am facing this issue with CUDA 9.0, Ubuntu 16.04, RTX 2080 Ti. Any solutions?

Hey, so I am not sure why @maxprog doesn't want to share his solution, but I had the exact same issue. After trying out various combinations of CUDA and NVIDIA drivers, I purged everything I could still find. Then I installed cuda_10.1.168_418.67_linux.run (including the packaged drivers!), together with cudnn-10.1-linux-x64-v7.6.0.64.tgz. That worked. Before that, I had tried driver 430.14 with the same CUDA 10.1, but it threw the error.

One can investigate the runtime version (located in /usr/lib) using nvidia-smi, and the toolkit version (located in /usr/local/lib). Running ldconfig -p | grep cuda is also a good way to see whether any stray CUDA libraries are lying around.
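
For anyone who wants to script the ldconfig check, here is a minimal Python sketch of the same idea. It is only an illustration: the function names find_cuda_libs and list_cuda_libs and the keyword list are my own assumptions, not part of dlib or CUDA, and it assumes `ldconfig -p` prints lines of the form "name (arch) => path".

```python
import subprocess


def find_cuda_libs(ldconfig_output, keywords=("cuda", "cudnn")):
    """Return (library name, path) pairs from `ldconfig -p` output.

    A line is kept when any keyword appears anywhere in it (case-insensitive),
    which roughly mirrors `ldconfig -p | grep -i cuda`.
    Hypothetical helper, written for this thread only.
    """
    hits = []
    for line in ldconfig_output.splitlines():
        if "=>" not in line:
            continue  # skip the "N libs found in cache" header line
        left, path = line.split("=>", 1)
        name = left.split()[0]
        if any(k in line.lower() for k in keywords):
            hits.append((name, path.strip()))
    return hits


def list_cuda_libs():
    """Run `ldconfig -p` on this machine and filter its output."""
    result = subprocess.run(
        ["ldconfig", "-p"], capture_output=True, text=True, check=True
    )
    return find_cuda_libs(result.stdout)
```

Calling list_cuda_libs() and printing the pairs makes it easy to spot entries that point into more than one CUDA install directory, which is the "stray CUDA" situation described above.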

Hope it helps!

Knowledge has a price. That is the reason I do not share my knowledge for free.

On 4 Jun 2019, at 08:27, Ralf Mayet notifications@github.com wrote:

Hey, so I am not sure why @maxprog https://github.com/maxprog doesn't want to share his solution, but I had the exact same issue.. After trying out various combinations of CUDA & Nvidia Drivers, I purged all that I could still find. Then I installed cuda_10.1.168_418.67_linux.run (including the packaged drivers!), together with cudnn-10.1-linux-x64-v7.6.0.64.tgz. That worked. Before I had tried the drivers 430.14 with the same CUDA 10.1, but it threw the error.

One can investigate the runtime version (located in /usr/lib) using nvidia-smi, and the toolkit version (located in /usr/local/lib). Using ldconfig -p is also a good tool to see if there is any stray CUDA laying around.

Hope it helps!



@maxprog You don't want to share your tiny bit of knowledge for free - but you literally posted here asking for free help from a volunteer community about a massive free open source library that Davis and many volunteers spent thousands of hours creating (for free) so that you could use it for free?

I've found an ugly workaround for this problem: it seems to fail only the first time the model is called, so you can 'warm up' the model with a throwaway image such as np.empty(shape=(150, 150), dtype=np.uint8) while ignoring the RuntimeError, and subsequent calls should work fine. That solved the same problem with _cnn_face_detection_model_v1_ for me; I'm not sure whether it works for _face_recognition_model_v1_ or any other model.
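
A rough sketch of that warm-up idea, in case it helps. The helper warm_up is a hypothetical name of my own, not a dlib function. It is deliberately generic: it takes any callable plus a dummy input, so it does not assume anything about how a particular dlib model is invoked.

```python
def warm_up(model_call, dummy_input, attempts=1):
    """Call model_call(dummy_input) up to `attempts` times, ignoring RuntimeError.

    Returns True as soon as one call succeeds, and False if every attempt
    raised RuntimeError. Either way the caller carries on afterwards, since
    the workaround relies on later calls working once the first has failed.
    """
    for _ in range(attempts):
        try:
            model_call(dummy_input)
            return True
        except RuntimeError:
            # The first call is reported to fail with
            # CUDNN_STATUS_EXECUTION_FAILED; swallow it and move on.
            continue
    return False


# Assumed usage with the CNN face detector mentioned above (untested here,
# `detector` is whatever model object you have already loaded):
#   import numpy as np
#   warm_up(lambda img: detector(img, 1),
#           np.empty(shape=(150, 150), dtype=np.uint8))
```

Only RuntimeError is swallowed, so any other exception (a bad model path, for example) still surfaces instead of being hidden by the workaround.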
