Keras: Non-OK-status: Internal: invalid configuration argument Aborted (core dumped)

Created on 11 Jul 2019  ·  22 Comments  ·  Source: keras-team/keras

Please make sure that this is a Bug or a Feature Request and provide all applicable information asked by the template.
If your issue is an implementation question, please ask your question on StackOverflow or on the Keras Slack channel instead of opening a GitHub issue.

System information

  • Have I written custom code (as opposed to using a stock example script):
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • TensorFlow backend (yes / no): yes
  • TensorFlow version: v1.14.0-rc1-22-gaf24dc91b5 1.14.0
  • Keras version: 2.2.4
  • Python version: 3.6
  • CUDA/cuDNN version: Cuda compilation tools, release 10.0, V10.0.130
  • GPU model and memory: 2 gpus, each 11 GB

You can obtain the TensorFlow version with:
python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
You can obtain the Keras version with:
python -c 'import keras as k; print(k.__version__)'

Describe the current behavior
I was using the code below to build an LSTM model.

    from keras import backend
    from keras.layers import (Bidirectional, Dense, Input, Lambda, LSTM,
                              Multiply, Subtract, concatenate)
    from keras.models import Model
    from keras.utils import multi_gpu_model

    left = Input(shape=(128, 3072), dtype='float32', name='Input-Left')
    right = Input(shape=(128, 3072), dtype='float32', name='Input-Right')
    # One shared bidirectional LSTM encoder applied to both inputs.
    lstm = Bidirectional(LSTM(units=768,
                              activation='tanh'),
                         name='Bidirectional-LSTM')
    l_lstm = lstm(left)
    r_lstm = lstm(right)
    # Combine |left - right| and left * right as similarity features.
    subtracted = Subtract(name='Subtract')([l_lstm, r_lstm])
    abs_subtracted = Lambda(function=backend.abs)(subtracted)
    mul = Multiply(name='multiplication')([l_lstm, r_lstm])
    concat = concatenate([abs_subtracted, mul])
    output = Dense(units=1)(concat)
    model = Model(inputs=[left, right],
                  outputs=output)
    # Replicate the model across both GPUs.
    model = multi_gpu_model(model, gpus=2)
    model.compile(loss='mean_squared_error',
                  optimizer='Adam',
                  metrics=['acc'])

Describe the expected behavior
I expect the code to run without error.

Code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate the problem.
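A minimal sketch of a driver for the model above (dummy NumPy inputs with an arbitrary batch count n; the shapes match the Input layers). In graph-mode Keras the LSTM weights are initialized on the GPU once the graph runs, which is where the FillPhiloxRandom launch in the log below aborts:

    import numpy as np

    # Arbitrary dummy data; shapes match the (128, 3072) Input layers above.
    n = 8
    x_left = np.random.rand(n, 128, 3072).astype('float32')
    x_right = np.random.rand(n, 128, 3072).astype('float32')
    y = np.random.rand(n, 1).astype('float32')

    # Running the graph triggers GPU weight initialization
    # (the FillPhiloxRandom kernel launch that aborts in the log below).
    model.fit([x_left, x_right], y, batch_size=2, epochs=1)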

Other info / logs
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

I get the following error:

2019-07-11 00:34:47.259516: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-11 00:34:47.261497: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-11 00:34:47.263346: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-11 00:34:47.263979: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-11 00:34:47.264617: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0, 1
2019-07-11 00:34:49.404341: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-11 00:34:49.404385: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 1 
2019-07-11 00:34:49.404390: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N Y 
2019-07-11 00:34:49.404394: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 1:   Y N 
2019-07-11 00:34:49.404729: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-11 00:34:49.405481: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-11 00:34:49.406172: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-11 00:34:49.406916: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9428 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-07-11 00:34:49.407465: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-11 00:34:49.408128: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 10039 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
2019-07-11 00:34:49.942669: F ./tensorflow/core/kernels/random_op_gpu.h:227] Non-OK-status: CudaLaunchKernel(FillPhiloxRandomKernelLaunch<Distribution>, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: invalid configuration argument
Aborted (core dumped)

Most helpful comment

For me, it was because I imported torch and tf at the same time. You have to import tf before torch so tf can use the GPU correctly. See https://github.com/tensorflow/tensorflow/issues/27487

All 22 comments

@xinsu626 I don't have multiple GPUs, so I ran your code (check the gist here) without any errors. Can you try running the gist locally without multi_gpu_model? Thanks!
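In other words, a sketch of that test (same model as in the report above, just without the multi-GPU wrapper):

    # Compile the base model directly, skipping multi_gpu_model, to
    # separate the wrapper from the underlying GPU problem.
    model = Model(inputs=[left, right], outputs=output)
    model.compile(loss='mean_squared_error',
                  optimizer='Adam',
                  metrics=['acc'])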

@jvishnuvardhan Thanks. I tried it on my local machine and it works. However, if I run it on my GPU server, it always throws the same error.

@jvishnuvardhan I think the error is caused by tensorflow-gpu. When I uninstalled tensorflow-gpu, there was no error.

@xinsu626 Just to understand the source of the issue (tensorflow-gpu vs. the GPU drivers): have you run any TensorFlow code successfully on a single or multi-GPU setup? Thanks!

@jvishnuvardhan Yes. When the model's input is a 2-D array, it runs successfully on the GPUs. However, when the input is 3-D, I get the error.

Closing this issue; I was able to resolve it. I think it is a compatibility issue between tensorflow-gpu and CUDA:

  • CUDA 10.0 and tensorflow-gpu 1.14: no error
  • CUDA 9.2 and tensorflow-gpu 1.14: CudaLaunchKernel error
  • CUDA 9.2 and tensorflow-gpu 1.12: no error
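A quick sanity check of the installed combination (a sketch; tf.__version__ and tf.test.is_built_with_cuda() work on both TF 1.x and 2.x, and the toolkit version comes from nvcc --version):

    import tensorflow as tf

    # Report the installed TF version and whether this build was
    # compiled against CUDA (True for tensorflow-gpu packages).
    print(tf.__version__)
    print(tf.test.is_built_with_cuda())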

I am using CUDA 10.1 and tensorflow-gpu 1.14 but still get this issue when using multi_gpu_model...

@wendingp Sorry, there was a typo I forgot to fix: it should be CUDA 10.0 with TF 1.14, not 10.1.

Experiencing the same issue with CUDA 10.0 and tensorflow-gpu 1.14. I don't think it is a version-related issue, at least not between CUDA and tensorflow-gpu.

I am using CUDA 10.0 and tensorflow-gpu 1.14 but still get this issue:
2019-10-17 10:28:25.101386: F ./tensorflow/core/kernels/random_op_gpu.h:227] Non-OK-status: CudaLaunchKernel(FillPhiloxRandomKernelLaunch<Distribution>, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: invalid configuration argument


Hi, I use CUDA 10.0 and tensorflow-gpu 1.14 but still get the same issue.

I have CUDA 10.0 and TF version 1.15.0-dev20190728, but I still get the same issue.

Same here

CUDA 10.0 and tensorflow-2.0 with the same error.

For me, it was because I imported torch and tf at the same time. You have to import tf before torch so tf can use the GPU correctly. See https://github.com/tensorflow/tensorflow/issues/27487
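A minimal sketch of that workaround for scripts that need both frameworks:

    # Import TensorFlow before torch so TF can claim the GPU correctly
    # (see tensorflow/tensorflow#27487 for the underlying issue).
    import tensorflow as tf
    import torch

    # Sanity check: TF still sees the GPU (TF 1.x API).
    print(tf.test.is_gpu_available())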

CUDA 10.0 and tensorflow-2.0 built from source on NVIDIA's Jetson Nano (see https://pythops.com/post/compile-deeplearning-libraries-for-jetson-nano) gives me this error as well.

For me, it was because I imported torch and tf at the same time. You have to import tf before torch so tf can use the GPU correctly. See tensorflow/tensorflow#27487

This fixed it for me.

I tried installing the torch build for CUDA 10.0 from the PyTorch website using the following command:

pip install torch==1.2.0 torchvision==0.4.0 -f https://download.pytorch.org/whl/torch_stable.html

but instead the packages torch 1.2.0+cu92 and torchvision 0.4.0+cu92 get installed, and then the core-dumped error shows up again.

I'm facing the same issue with CUDA 10.0 and TF 1.14.

Same issue here: CUDA 10.1 (cuDNN 8.0.3), TF 2.2.0, NVIDIA driver 431.60.

Same issue

Hello,

Please pay attention to:

  • multiple versions of the CUDA libraries installed
  • a symlink previously created to pin the CUDA library version for another piece of software.

In my case I had created the following link to fix, say, software #2:

libcudart.so.10.0 -> /usr/local/cuda/lib64/libcudart.so.10

Software #2 then worked, but software #1 (linked against libcudart.so.10) no longer did, producing just this error:

tensorflow/core/kernels/random_op_gpu.h:227] Non-OK-status: GpuLaunchKernel, invalid device function.

I know it is very strange. Luckily I had made that change only the day before, so I could remember the one modification I had made; reverting it resolved the problem. Hope this helps.
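One way to spot this kind of conflict (a sketch, assuming a Linux system with ldconfig on the PATH) is to list every libcudart the dynamic linker can resolve:

    import subprocess

    # List all CUDA runtime libraries known to the dynamic linker;
    # stale symlinks like the one above show up as mismatched entries.
    out = subprocess.run(['ldconfig', '-p'], stdout=subprocess.PIPE,
                         universal_newlines=True).stdout
    for line in out.splitlines():
        if 'libcudart' in line:
            print(line.strip())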

