I got the following error when running python cifar10_train.py:
E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\cuda\cuda_dnn.cc:390] Loaded runtime CuDNN library: 5005 (compatibility version 5000) but source was compiled with 5105 (compatibility version 5100). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
F c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\kernels\conv_ops.cc:605] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)
Operating System: Windows 10
CUDA: Cuda compilation tools, release 8.0, V8.0.44
cuDNN: 5.1
tensorflow: 1.0.0
The output of python -c "import tensorflow; print(tensorflow.__version__)":
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:135] successfully opened CUDA library cublas64_80.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:135] successfully opened CUDA library cudnn64_5.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:135] successfully opened CUDA library cufft64_80.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:135] successfully opened CUDA library nvcuda.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:135] successfully opened CUDA library curand64_80.dll locally
1.0.0
I have upgraded cuDNN from 5.0 to 5.1, but it didn't work.
@secsilm The error indicates that the cuDNN you've loaded is 5.0, not 5.1. Perhaps the following documentation will help:
https://www.tensorflow.org/install/install_windows#requirements_to_run_tensorflow_with_gpu_support
cuDNN v5.1. For details, see NVIDIA's documentation. Note that cuDNN is typically installed in a
different location from the other CUDA DLLs. Ensure that you add the directory where you installed the
cuDNN DLL to your %PATH% environment variable.
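To double-check whether the cuDNN DLL is actually visible on %PATH%, a small script like the following can help. This is just a sketch: the helper name is mine, and the DLL name cudnn64_5.dll matches the log output above (adjust it for other cuDNN versions).

```python
import os

def find_dll_on_path(dll_name):
    """Return every directory on PATH that contains dll_name."""
    hits = []
    for d in os.environ.get("PATH", "").split(os.pathsep):
        if d and os.path.isfile(os.path.join(d, dll_name)):
            hits.append(d)
    return hits

# An empty list means TensorFlow will not find the DLL either.
print(find_dll_on_path("cudnn64_5.dll"))
```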
@tatatodd Yes, I have reset my cuDNN environment variable and the problem is solved.
@tatatodd How did you reset your cuDNN environment variable? I have the same problem.
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
loading datasets
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:924] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 970M
major: 5 minor: 2 memoryClockRate (GHz) 1.038
pciBusID 0000:01:00.0
Total memory: 3.00GiB
Free memory: 2.64GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 970M, pci bus id: 0000:01:00.0)
E tensorflow/stream_executor/cuda/cuda_dnn.cc:346] Loaded cudnn library: 5110 but source was compiled against 4007. If using a binary install, upgrade your cudnn library to match. If building from sources, make sure the library loaded matches the version you specified during compile configuration.
F tensorflow/core/kernels/conv_ops.cc:459] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)
Aborted (core dumped)
@ehfo0 You can just replace the old files (5.0) with the new files (5.1), provided you have already added the cuDNN directory to PATH.
I had this problem too, but realized that the error is thrown only if I try to use the GPU in a second instance of Python. It seems that only one Python instance can claim the GPU at a time. Weird. If you close the Python session that used the GPU, the memory is freed up and another session can use it. This is the behavior I experienced in PyCharm and with Python 3.6 via the command prompt on Windows 10 x64. Does anyone have further insight or a workaround?
I had the same problem as @omelnikov. The second Python instance to use the same GPU gives the following error:
F tensorflow/core/kernels/conv_ops.cc:667] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
Aborted (core dumped)
System specs:
Ubuntu 16.04.3 LTS (GNU/Linux 4.4.0-87-generic x86_64)
python 3.6.3
tensorflow-gpu 1.4.1
cuda 8.0
cudnn 6
Same problem as reported by @HaoshengZou.
I had the same problem, but after I closed Chrome and VMware (freeing more memory), it worked!
@UesugiErii Special thanks for the insight.
@HaoshengZou I had the same problem, but I don't know how to fix this issue. Can anyone please give a suggestion?
When I used keras==1.2.0, I got the same problem. Fortunately, the issue was solved after I upgraded tensorflow from 1.2.0 to 1.3.0.
The above workarounds work in almost all cases. However, for me the problem persisted in spite of updating the drivers and restarting the machine. I solved it by explicitly sourcing the .bashrc file:
source ~/.bashrc
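For reference, the CUDA-related lines in ~/.bashrc usually look like this; the install path /usr/local/cuda-8.0 is an assumption, so adjust it to your setup:

```shell
# CUDA toolkit location (assumed path; adjust to your install)
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
# Directory containing libcudnn.so and the other CUDA libraries
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
```

Sourcing .bashrc matters because a shell (or IDE) started before these lines were added will not have LD_LIBRARY_PATH set, so TensorFlow can end up loading a different cuDNN than the one you installed.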
Try this:

import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"  # see issue #152
os.environ["CUDA_VISIBLE_DEVICES"]="1"  # must be set before importing tensorflow
I change the version of tensorflow-gpu from 1.4 to 1.2, and it works well.
conda install tensorflow-gpu=1.2
@feedliu Thanks, it worked for me.
Try this:

import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID" # see issue #152
os.environ["CUDA_VISIBLE_DEVICES"]="1"
This code saved me, many thanks:
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID" # see issue #152
os.environ["CUDA_VISIBLE_DEVICES"]="1"
Since I have only one GPU on my machine, this setting turned out to assign the model to the CPU. That is why my code "worked", but the actual issue was not solved.
@cramraj8 Try the code below:
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID" # see issue #152
os.environ["CUDA_VISIBLE_DEVICES"]="0"
Thanks @tony2037, after adding those 3 lines the code works.
The above solutions didn't work for me. Any other approaches, fellas?
Try this:

import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID" # see issue #152
os.environ["CUDA_VISIBLE_DEVICES"]="1"
Hello, I used this method, but I find it trains on the CPU instead of the GPU. Opening Task Manager, it looks like the CPU usage is maxed out.
I change the version of tensorflow-gpu from 1.4 to 1.2, and it works well.
conda install tensorflow-gpu=1.2
Thanks, very useful! The tensorflow-gpu version has problems; you should check your own versions and try again and again. Find the matching tensorflow-gpu version number, then uninstall and reinstall.