Keras: F ./tensorflow/core/util/cuda_launch_config.h:127] Check failed: work_element_count > 0 (0 vs. 0)

Created on 7 Apr 2018 · 16 comments · Source: keras-team/keras

My computer has two GPUs, and everything works fine if I don't add this line:
model = keras.utils.training_utils.multi_gpu_model(base_model, gpus=2)
However, then only one GPU is used for computation. I don't understand what 'work_element_count > 0' means. Is it that I haven't cleared the CUDA worker beforehand?
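For context, this is roughly the full setup being described: a minimal sketch assuming Keras 2.x on the TensorFlow 1.x backend with two visible GPUs. The toy model and random data below are placeholders, not the reporter's actual code.

import numpy as np
import keras
from keras.utils import multi_gpu_model  # public path for the same function in Keras 2.x

# Placeholder single-GPU model; the reporter's base_model would go here.
base_model = keras.models.Sequential([
    keras.layers.Dense(64, activation='relu', input_shape=(100,)),
    keras.layers.Dense(10, activation='softmax'),
])

# Replicate the model on 2 GPUs; each training batch is split between them.
model = multi_gpu_model(base_model, gpus=2)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['acc'])

# Dummy data just to make the sketch runnable.
X = np.random.rand(256, 100).astype('float32')
Y = keras.utils.to_categorical(np.random.randint(10, size=256), num_classes=10)
model.fit(X, Y, batch_size=32, epochs=1)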

Most helpful comment

@mahaishou This issue was resolved when I upgraded tensorflow-gpu version to 1.9.0.

All 16 comments

Same issue here

I have the same question

Has nobody solved this issue?

Same issue here

I don't have the issue with tensorflow-gpu 1.7.0

@vQuagliaro same issue with tensorflow-gpu-1.7.0

Same issue. Did anyone solve it?

Same issue.

Try rebuilding your environment with CUDA 9.0, tensorflow-gpu 1.7.0, and cuDNN 7.0.
That resolved this error for me and for many other people.
Also be careful with your training images: check whether any image contains too many objects, since that can cause an OOM problem even if you reduce the ROI number in the config. Replacing those images may help solve this problem too.

This happens because one of the GPUs ends up processing no data in a batch. Pay attention to the number of inputs: I used four GPUs and fed in only 2 samples, and the error occurred. It has nothing to do with the environment. If you have 2 GPUs, make sure every batch has at least 2 samples.
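In other words, multi_gpu_model slices each incoming batch into one sub-batch per GPU, and a sub-batch with zero samples is what trips the work_element_count > 0 check. Below is a rough sanity check one could run before training; the helper name check_batch_split is made up for illustration and is not part of Keras.

def check_batch_split(num_samples, batch_size, gpus):
    # multi_gpu_model splits a batch of size B into `gpus` slices of about
    # B // gpus samples each; a slice of size 0 crashes the CUDA kernel
    # launch with "work_element_count > 0".
    last_batch = num_samples % batch_size or batch_size
    for b in {batch_size, last_batch}:
        if b < gpus:
            raise ValueError(
                "A batch of %d samples cannot be split across %d GPUs "
                "without leaving one GPU empty." % (b, gpus))

# Example from this comment: 4 GPUs but only 2 samples -> raises.
# check_batch_split(num_samples=2, batch_size=2, gpus=4)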

@mahaishou how do you make sure that one batch has at least 2 samples? Please help!

@ashwinijoshigithub Does your log output look like this?
6976/12000 [================>.............] - ETA: 2:02 - loss: 1.7103 - acc: 0.3749
7008/12000 [================>.............] - ETA: 2:01 - loss: 1.7102 - acc: 0.3747
7040/12000 [================>.............] - ETA: 2:00 - loss: 1.7103 - acc: 0.3741
7072/12000 [================>.............] - ETA: 2:00 - loss: 1.7103 - acc: 0.3740
7104/12000 [================>.............] - ETA: 1:59 - loss: 1.7102 - acc: 0.3744
12000 is my total number of samples and my batch size is 32.

Same question.

@mahaishou how do you make sure that one batch has at least 2 samples? Please help!

@qiuyinglin Control the number of samples fed in per epoch.
I use Keras to train the model, like this:
history = multi_model.fit(X_train, Y_train, batch_size=batch_size, epochs=1,
                          validation_data=(X_test, Y_test))
X_train is my input data, so just control the length of X_train.
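For concreteness, here is a sketch of what "control the length of X_train" could look like: drop trailing samples so the last (possibly partial) batch still has at least one sample per GPU. The helper trim_for_multi_gpu is hypothetical, an assumption based on this thread rather than code from it.

def trim_for_multi_gpu(X, Y, batch_size, gpus):
    # Drop trailing samples whenever the final partial batch would be
    # smaller than the number of GPUs, so every batch can be split
    # without handing any GPU an empty slice.
    remainder = len(X) % batch_size
    if 0 < remainder < gpus:
        X, Y = X[:-remainder], Y[:-remainder]
    return X, Y

# Usage with the fit() call above:
# X_train, Y_train = trim_for_multi_gpu(X_train, Y_train, batch_size, gpus=2)
# history = multi_model.fit(X_train, Y_train, batch_size=batch_size, epochs=1,
#                           validation_data=(X_test, Y_test))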

@mahaishou This issue was resolved when I upgraded tensorflow-gpu version to 1.9.0.

Closing as this is resolved
