Keras: Doesn't work with cudnn v7.1.1.5

Created on 6 Mar 2018 · 21 Comments · Source: keras-team/keras

Hi, I have updated the nvidia/cuda 9.0 container, which now uses CUDNN_VERSION 7.1.1.5.
Now when I try to run Keras from the Keras Docker image I get an error:

E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 7101 (compatibility version 7100) but source was compiled with 7004 (compatibility version 7000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.

Tensorflow version: 1.6.0
Keras version: 2.1.4

Which source was compiled against version 7004? How can I recompile it?

taneta

👍4 ❤1

All 21 comments

It's a version mismatch. You can solve it by installing cuDNN v7.0.x.
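
For anyone puzzled by the numbers in the error message, here is a minimal sketch of how they can be read. It assumes the integers follow cuDNN's CUDNN_VERSION encoding (major * 1000 + minor * 100 + patch level) and that the "compatibility version" is the same number with the patch level zeroed. The equality check is inferred from the wording of the message, not taken from TensorFlow's source, and the function names are made up for illustration.

```python
def decode_cudnn_version(code):
    """Split an integer such as 7101 into (major, minor, patch)."""
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def compatibility_version(code):
    """Zero out the patch level, e.g. 7101 -> 7100 (assumed meaning)."""
    return code - code % 100


def same_compatibility(runtime_code, compiled_code):
    """Assumed check: runtime and compile-time major.minor must match."""
    return compatibility_version(runtime_code) == compatibility_version(compiled_code)


# Values copied from the error message in this issue.
runtime, compiled = 7101, 7004
print(decode_cudnn_version(runtime))          # runtime library
print(decode_cudnn_version(compiled))         # version TensorFlow was built against
print(same_compatibility(runtime, compiled))
```

Read this way, the runtime library is 7.1.1 and the TensorFlow binary was built against 7.0.4, which is why a 7.0.x runtime is the suggested fix.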

zhaoyang10 on 6 Mar 2018

👍5

@zhaoyang10 Yes, I understand that this problem is caused by the cuDNN version.
What I don't know is how to refer to an nvidia/cuda Docker image with cuDNN v7.0.x.

The Keras Dockerfile is built on top of the Nvidia container:
ARG cuda_version=9.0
ARG cudnn_version=7
FROM nvidia/cuda:${cuda_version}-cudnn${cudnn_version}-devel

But the nvidia/cuda:9.0-cudnn7-devel container now includes cuDNN v7.1.1.5, and I don't know how to request the earlier one.

taneta on 6 Mar 2018

👍9

I actually have the same problem as you, @taneta. Have you found a solution yet?

Naxter on 6 Mar 2018

👍5

@taneta I ran into the same problem running nvidia-docker after upgrading to CUDA 9.1 and CUDNN 7.1. Here is how I was able to fix the issue:

As @zhaoyang10 mentioned, you need to downgrade CUDNN.

First, you can search for available versions using apt-cache madison libcudnn7. Pick an appropriate version (7.0.x) and then run the following command to downgrade, replacing the CUDNN version with your chosen one (I used an official NVIDIA docker file as a reference):

apt-get update && apt-get install -y --allow-downgrades --no-install-recommends \
    libcudnn7=7.0.5.15-1+cuda9.1 \
    libcudnn7-dev=7.0.5.15-1+cuda9.1 && \
    rm -rf /var/lib/apt/lists/*

(Then run a quick apt-get update to refresh the lists)

Hope that helps!

LucidBlue on 6 Mar 2018

🎉6 👍1

@LucidBlue Thanks, that solved the problem!

I added a couple of lines to the Dockerfile and had to switch to the root user to make it work:

# fix cudnn version
USER root
RUN apt-get update && apt-get install -y --allow-downgrades --no-install-recommend$
libcudnn7=7.0.5.15-1+cuda9.1 \
libcudnn7-dev=7.0.5.15-1+cuda9.1 && \
rm -rf /var/lib/apt/lists/*
RUN apt-get update

taneta on 7 Mar 2018

@LucidBlue Actually, it was too early to celebrate.

I can load a model now, but I get this error when I run model.predict():

E tensorflow/stream_executor/cuda/cuda_dnn.cc:385] could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
E tensorflow/stream_executor/cuda/cuda_dnn.cc:393] possibly insufficient driver version: 384.111.0
E tensorflow/stream_executor/cuda/cuda_dnn.cc:352] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM

Have you observed the same problem?

UPD: Sorry, that was easily solved by replacing cuda9.1 with cuda9.0 in the code above.
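
A possible explanation for the "insufficient driver version" message, sketched below: the cuda9.1 build of libcudnn7 presumably needs a newer driver than 384.111. The minimum driver versions in the table are assumptions recalled from NVIDIA's CUDA release notes for Linux and should be checked there before relying on them. The helper names are hypothetical.

```python
# Assumed minimum Linux driver versions per CUDA toolkit.
# These numbers are NOT from this thread; verify them in NVIDIA's release notes.
ASSUMED_MIN_DRIVER = {
    "9.0": (384, 81),
    "9.1": (387, 26),
}


def parse_version(text):
    """Turn '384.111.0' into (384, 111, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def driver_supports(cuda_version, driver_version):
    """True if the driver meets the assumed minimum for the given CUDA version."""
    return parse_version(driver_version) >= ASSUMED_MIN_DRIVER[cuda_version]


# Driver version copied from the error message above.
print(driver_supports("9.0", "384.111.0"))
print(driver_supports("9.1", "384.111.0"))
```

Under these assumptions the 384.111 driver is enough for CUDA 9.0 but not for 9.1, which would be consistent with the cuda9.0 packages working.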

taneta on 8 Mar 2018

Hi, does anybody know whether rebuilding Keras from the latest source code fixes this issue? Or is there a timeline for supporting this cuDNN version? Thanks!

qianggenxiadeshuihu on 8 Mar 2018

@taneta
Thanks! The downgrade (cuDNN 7.1 to 7.0) you gave fixed my problem. Here's a paste of it with the typos fixed ($ -> s in --no-install-recommends, and cuda9.1 -> cuda9.0).

Now my code in Docker works with TF 1.6, CUDA 9.0 and cuDNN 7.0.

============================================

USER root

RUN apt-get update && apt-get install -y --allow-downgrades --no-install-recommends \
libcudnn7=7.0.5.15-1+cuda9.0 libcudnn7-dev=7.0.5.15-1+cuda9.0 && rm -rf /var/lib/apt/lists/*

RUN apt-get update

gsiisg on 8 Mar 2018

👍4 ❤3 🎉3

I tried adding the changes to the Dockerfile, as mentioned above, but for some reason when I inspect the image after building it, it still has cuDNN version 7.1.1.5. If I try to run Keras with the image, the issue persists. Is there anything wrong with my Dockerfile? I'm attaching it (as TXT) to the post: Dockerfile.txt
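
One way to see which cuDNN a container actually loads at runtime is to ask the library itself. This is a minimal sketch that assumes the shared library is named libcudnn.so.7 and is on the loader path. cudnnGetVersion is part of the cuDNN API, while the wrapper function name is made up here.

```python
import ctypes


def loaded_cudnn_version(libname="libcudnn.so.7"):
    """Return the integer reported by cudnnGetVersion(), or None if unavailable."""
    try:
        lib = ctypes.CDLL(libname)
        get_version = lib.cudnnGetVersion
    except (OSError, AttributeError):
        # Library not found, or it does not export cudnnGetVersion.
        return None
    get_version.restype = ctypes.c_size_t
    return int(get_version())


# Prints e.g. an integer like the ones in the error message, or None without cuDNN.
print(loaded_cudnn_version())
```

Running this inside the built image (for example with python in a docker run session) should show whether the downgrade step took effect.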

alberto-oliveira on 8 Mar 2018

@alberto-oliveira
I am a newbie at Docker, so I took a Dockerfile from what is probably a similar source and added the paste at the VERY end, so it first installs the wrong version (7.1) and then downgrades it to 7.0.

My Dockerfile: Dockerfile.txt

gsiisg on 8 Mar 2018

Luke035 on 12 Mar 2018

My env: Windows 10 64-bit + Python 3.5 + Visual Studio 2017 Community + CUDA Toolkit 9.0 + cuDNN v7.1.1 (Feb 28, 2018), for CUDA 9.0 + tensorflow 1.6-gpu
My failure is:

2018-03-13 20:44:03.100572: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\stream_executor\cuda\cuda_dnn.cc:378] Loaded runtime CuDNN library: 7101 (compatibility version 7100) but source was compiled with 7003 (compatibility version 7000).  If using a binary install, upgrade your CuDNN library to match.  If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.

Replacing cuDNN v7.1.1 (Feb 28, 2018), for CUDA 9.0 with cuDNN v7.0.5 (Dec 5, 2017), for CUDA 9.0 fixes it.

So the new env is: Windows 10 64-bit + Python 3.5 + Visual Studio 2017 Community + CUDA Toolkit 9.0 + cuDNN v7.0.5 (Dec 5, 2017), for CUDA 9.0 + tensorflow 1.6-gpu

yichudu on 13 Mar 2018

👍10

I had the same error and tried this:
apt-get install -y --allow-downgrades --no-install-recommends \
libcudnn7=7.0.5.15-1+cuda9.0 libcudnn7-dev=7.0.5.15-1+cuda9.0 && rm -rf /var/lib/apt/lists/*
But it didn't work. The error said it couldn't find version 7.0.5.15-1.
I realized the error was caused by my installed libcudnn version, so I removed the old version and installed libcudnn 7.0.5.15-1 manually, and it works.
I removed the old version with:
rm /usr/local/cuda/include/cudnn.h
rm /usr/local/cuda/lib64/libcudnn*
(The cuda directory may be named cuda9.0 or something else, but it's under /usr/local/...)
Then I installed the other version following: http://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html
That's my solution. Hope it helps :D
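
For a manual install like this, the cudnn.h header can be used to confirm which version ended up on disk. The sketch below assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain integer macros, as cuDNN 7 headers do. The function name and the sample text are illustrative only.

```python
import re


def cudnn_version_from_header(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None."""
    values = []
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        match = re.search(
            r"^\s*#\s*define\s+CUDNN_%s\s+(\d+)" % name,
            header_text,
            re.MULTILINE,
        )
        if match is None:
            return None
        values.append(int(match.group(1)))
    return tuple(values)


# Illustrative header excerpt, not copied from a real file.
SAMPLE_HEADER = """
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 5
"""
print(cudnn_version_from_header(SAMPLE_HEADER))

# On a real machine (path taken from the comment above):
# with open("/usr/local/cuda/include/cudnn.h") as f:
#     print(cudnn_version_from_header(f.read()))
```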

nguyenkhacduyngoc on 19 Mar 2018

👍1

# Downgrade cuDNN for compatibility with TensorFlow 1.5
RUN apt-get update && apt-get install -y --allow-downgrades --no-install-recommends \
    libcudnn7=7.0.4.31-1+cuda9.0 \
    libcudnn7-dev=7.0.4.31-1+cuda9.0 && \
    rm -rf /var/lib/apt/lists/*

waspinator on 4 Apr 2018

I doubt that this will solve most cases, but I installed cuDNN 7.2(.1) via .deb files, reinstalled tensorflow-gpu, and it worked. In the end, it wasn't a version issue with the driver (I had 384.xx, which was correct), but one with cuDNN.

zhanwenchen on 24 Aug 2018

Try this
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # see issue #152
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

tony2037 on 24 Sep 2018

I am facing the same issue - https://stackoverflow.com/questions/52590880/mask-rcnn-resource-exhausted-oom-on-my-own-dataset

E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 7201 (compatibility version 7200) but source was compiled with 7004 (compatibility version 7000).

csyang6052 on 2 Oct 2018

@gsiisg's solution worked like a charm for me with the latest official TF GPU Docker image (FROM nvidia/cuda:9.0-base-ubuntu16.04).
Nvidia driver 390, Ubuntu 18.

juanluisrosaramos on 16 Oct 2018
