Tensorflow: tf.keras.Sequential() fails

Created on 29 Jul 2020  ·  42 Comments  ·  Source: tensorflow/tensorflow

System information

  • Running the most basic instruction fails, for example from the documentation page https://www.tensorflow.org/api_docs/python/tf/keras/Sequential
  • OS Platform and Distribution: Arch Linux kernel 5.7.10-arch1-1 (linux@archlinux) (gcc version 10.1.0 (GCC), GNU ld (GNU Binutils) 2.34.0)
  • TensorFlow installed from (source or binary): binary package (official Arch package)
  • TensorFlow version (use command below): tensorflow-cuda 2.3.0
  • Python version: Python 3.8.4
  • CUDA/cuDNN version: cuda 11.0
  • GPU model and memory: GeForce GTX 950M, Driver Version: 450.57 - 2004MiB

Describe the current behavior
Start python then run:


import tensorflow as tf

m = tf.keras.Sequential()

The last line fails with the following error messages:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/keras/engine/sequential.py", line 116, in __init__
    super(functional.Functional, self).__init__(  # pylint: disable=bad-super-call
  File "/usr/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 308, in __init__
    self._init_batch_counters()
  File "/usr/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 317, in _init_batch_counters
    self._train_counter = variables.Variable(0, dtype='int64', aggregation=agg)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/variables.py", line 262, in __call__
    return cls._variable_v2_call(*args, **kwargs)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/variables.py", line 244, in _variable_v2_call
    return previous_getter(
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/variables.py", line 237, in <lambda>
    previous_getter = lambda **kws: default_variable_creator_v2(None, **kws)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/variable_scope.py", line 2633, in default_variable_creator_v2
    return resource_variable_ops.ResourceVariable(
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/variables.py", line 264, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1507, in __init__
    self._init_from_args(
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1661, in _init_from_args
    handle = eager_safe_variable_handle(
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 242, in eager_safe_variable_handle
    return _variable_handle_from_shape_and_dtype(
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 174, in _variable_handle_from_shape_and_dtype
    gen_logging_ops._assert(  # pylint: disable=protected-access
  File "/usr/lib/python3.8/site-packages/tensorflow/python/ops/gen_logging_ops.py", line 49, in _assert
    _ops.raise_from_not_ok_status(e, name)
  File "/usr/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 6843, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [0] [Op:Assert] name: EagerVariableNameReuse

Describe the expected behavior

An empty sequential model is created, no error.

Standalone code to reproduce the issue

import tensorflow as tf

m = tf.keras.Sequential()

Labels: gpu, keras, bug

Most helpful comment

Same problem with:
  • Python 3.7, TF 2.3, CUDA 10.1 and cuDNN 7.6, Windows 10
  • Python 3.7, tf-nightly (1 Sep 2020), CUDA 11.0 and cuDNN 8.03, Windows 10

It can also be reproduced with just tf.Variable((2,3)).

The problem does not happen in TensorFlow 2.2.
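
The one-line reproduction above can be wrapped so that reports from different machines are easier to compare. This is only a sketch: the helper name `try_variable` is made up for illustration, and it assumes nothing beyond `tf.Variable` and `tf.__version__`.

```python
def try_variable():
    """Try the one-line repro; return (tf_version, outcome) or None without TensorFlow."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return None
    try:
        tf.Variable((2, 3))  # same call as in the comment above
        outcome = "ok"
    except Exception as exc:  # the thread reports InvalidArgumentError here
        outcome = "failed: " + type(exc).__name__
    return tf.__version__, outcome


print(try_variable())
```

Posting the printed pair together with the CUDA and cuDNN versions should make it easier to see which combinations are affected.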

All 42 comments

@alexn11
Could you please refer to this issue, which has the same error, and check whether more than one Python process is accessing TensorFlow at the same time on your system? Please go through the link and update us. Also, please try disabling the GPU and running again.

Similar issues for reference:
link link1

I checked whether any other program was using the GPU at the same time and whether any other instance of Python was running. In both cases the answer was no:

  • Before:
$ pgrep python
(nothing)
$ nvidia-smi
Wed Jul 29 12:50:50 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.57       Driver Version: 450.57       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 950M    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   47C    P8    N/A /  N/A |      0MiB /  2004MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
  • During: I launch python in another terminal, type in the commands described above, and get the error.
$ pgrep python
8625
$ nvidia-smi
Wed Jul 29 12:52:16 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.57       Driver Version: 450.57       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 950M    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   48C    P8    N/A /  N/A |   1781MiB /  2004MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      8625      C   python                           1778MiB |
+-----------------------------------------------------------------------------+

(the amount of memory used seems to be quite high?)

  • After: I quit python, everything is back to nothing again:
$ pgrep python
$ nvidia-smi
Wed Jul 29 12:53:15 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.57       Driver Version: 450.57       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 950M    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   47C    P8    N/A /  N/A |      0MiB /  2004MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

(See below for the deactivated GPU)
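
The memory figures in the tables above can also be read programmatically. A minimal sketch, assuming nvidia-smi is on the PATH and supports its `--query-gpu` CSV output; the helper names are made up for illustration. Note that TensorFlow, by default, reserves most of the free GPU memory when it initializes a device, so a figure like 1781MiB out of 2004MiB is probably expected and not a symptom in itself.

```python
import shutil
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_line(line):
    """Parse one 'used, total' CSV line (values in MiB) into a pair of ints."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def gpu_memory():
    """Return [(used_mib, total_mib), ...] per GPU, or None if the query is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    proc = subprocess.run(QUERY, capture_output=True, text=True)
    if proc.returncode != 0:
        return None
    return [parse_memory_line(l) for l in proc.stdout.splitlines() if l.strip()]
```

Calling `gpu_memory()` before, during and after the Python session gives the same three data points as the tables, in a form that is easy to paste into a report.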

Seems to be working when deactivating the GPU:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
import tensorflow as tf
m = tf.keras.Sequential()

At that point I get the following output:

2020-07-29 12:57:51.389675: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-07-29 12:57:51.410142: E tensorflow/stream_executor/cuda/cuda_driver.cc:314] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2020-07-29 12:57:51.410208: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: myhostname
2020-07-29 12:57:51.410220: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: myhostname
2020-07-29 12:57:51.410407: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: 450.57.0
2020-07-29 12:57:51.410449: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 450.57.0
2020-07-29 12:57:51.410459: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:310] kernel version seems to match DSO: 450.57.0
2020-07-29 12:57:51.410912: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-07-29 12:57:51.419007: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2300800000 Hz
2020-07-29 12:57:51.419293: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55d524787410 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-29 12:57:51.419314: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-07-29 12:57:51.420774: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.

The model has been created:

>>> m
<tensorflow.python.keras.engine.sequential.Sequential object at 0x7ff7a49cfc40>
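
To compare the two cases side by side, the same snippet can be run in two fresh interpreters, once with the GPU visible and once with it hidden. This is a sketch only: `run_snippet` and `REPRO` are illustrative names, and it assumes that setting CUDA_VISIBLE_DEVICES to "-1" in the child process environment has the same effect as setting it via os.environ before the import, as done above.

```python
import os
import subprocess
import sys


def run_snippet(code, hide_gpu):
    """Run `code` in a fresh Python process; return (exit_code, stdout, stderr)."""
    env = dict(os.environ)
    if hide_gpu:
        # Set before the child interpreter starts, so it precedes any import.
        env["CUDA_VISIBLE_DEVICES"] = "-1"
    proc = subprocess.run(
        [sys.executable, "-c", code], env=env, capture_output=True, text=True
    )
    return proc.returncode, proc.stdout, proc.stderr


# The failing two-liner from this issue, as a single command string.
REPRO = "import tensorflow as tf; tf.keras.Sequential(); print('created')"

# Example use (needs TensorFlow installed):
# for hide in (False, True):
#     exit_code, out, err = run_snippet(REPRO, hide)
#     print("GPU hidden:", hide, "exit code:", exit_code)
```

Running each case in its own process avoids any doubt about whether the variable was set early enough.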

@alexn11
Since it works with the GPU deactivated, please confirm whether you want to close this issue.

Does this mean that the problem is with CUDA and/or the drivers?

To be honest I don't know what I am supposed to do.

@alexn11, I'd recommend installing tensorflow-gpu through Anaconda. It's the easiest way to get things running. BTW, the official installation guide says CUDA 10.1 (not sure if it helps).

I am having a similar problem. My guess is that cuDNN got updated to 8.0 on your system, along with TF 2.3 and some other packages.

Basically, roll everything back to the previous versions for it to work.
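
A quick way to see whether an installed stack matches the versions the prebuilt packages expect is to compare it against a small table. The sketch below assumes the pairing CUDA 10.1 with cuDNN 7.6 for TF 2.2 and 2.3, which is what the installation guide mentioned above lists as far as I know; distribution packages (such as the Arch one) may be built against other versions, so treat the table as an assumption. All names are illustrative.

```python
# Assumed pairing for the prebuilt TF 2.2 / 2.3 GPU packages: (CUDA, cuDNN).
EXPECTED = {"2.2": ("10.1", "7.6"), "2.3": ("10.1", "7.6")}


def major_minor(version):
    """Reduce a version string such as '7.6.5' to its first two fields, '7.6'."""
    return ".".join(version.split(".")[:2])


def check_stack(tf_version, cuda_version, cudnn_version):
    """Return a list of mismatch messages; an empty list means the table is matched."""
    key = major_minor(tf_version)
    if key not in EXPECTED:
        return ["no entry for TensorFlow " + key]
    cuda, cudnn = EXPECTED[key]
    problems = []
    if major_minor(cuda_version) != cuda:
        problems.append("CUDA " + cuda_version + " installed, " + cuda + " expected")
    if major_minor(cudnn_version) != cudnn:
        problems.append("cuDNN " + cudnn_version + " installed, " + cudnn + " expected")
    return problems


print(check_stack("2.3.0", "11.0", "8.0"))
```

For the combination reported in this thread (TF 2.3 with CUDA 11 and cuDNN 8) the function lists both components as mismatched, which is consistent with the advice to roll back.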

Hello, I get the same problem when I try to use it.
The source of the problem is that TF asks for cudnn64_7.dll,
but in the cuDNN for 10.1 I could only find cudnn64_8.dll.
For this reason I got an older version of cuDNN and took cudnn64_7.dll from it. TF then ran and identified my GPU, but the Sequential() problem appeared. After I removed cudnn64_7.dll the problem went away, but I can't run on the GPU.
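
Which cuDNN library the system loader can actually find may be checked independently of TensorFlow. A rough sketch using ctypes: the DLL names come from this thread, `libcudnn.so.7` is assumed by analogy with the `libcudnn.so.8` seen in the logs, and the search path ctypes uses may differ from the one TensorFlow uses (on Windows with Python 3.8 in particular), so a negative result is only a hint.

```python
import ctypes

# Library names to probe; the first two are the Windows names from this thread.
CANDIDATES = ["cudnn64_7.dll", "cudnn64_8.dll", "libcudnn.so.7", "libcudnn.so.8"]


def can_load(library_name):
    """Return True if the dynamic loader can open `library_name`."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


def report(names=CANDIDATES):
    """Map each candidate library name to whether it could be loaded."""
    return {name: can_load(name) for name in names}
```

If `report()` shows only the version 8 library as loadable while TF 2.3 asks for version 7, that matches the mismatch described above.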

@alexn11
Please update on CUDA installation, is this still an issue?

I have cuda 11, cudnn 8 and tf 2.3, and I get the error.

@thephet
Please confirm if you have visited the mentioned links and if you have tried after deactivating the GPU.

@Saduf2019 If I deactivate the GPU it works fine but obviously very slow.

The error I get is the following:

2020-07-27 12:58:16.554013: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.11
2020-07-27 12:58:16.871243: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-07-27 12:58:17.050553: F tensorflow/stream_executor/cuda/cuda_dnn.cc:1186] Check failed: cudnnSetRNNMatrixMathType(rnn_desc.get(), math_type) == CUDNN_STATUS_SUCCESS (3 vs. 0)
Aborted (core dumped) 

@thephet
This error has been resolved in this comment, please refer to this and confirm.

@Saduf2019 Thanks for your answer. Having read that comment, I am not sure what I am supposed to do to fix this error.

The code I am trying to run is this one:

https://machinelearningmastery.com/develop-encoder-decoder-model-sequence-sequence-prediction-keras/

See the end of the article, just before the comments, where he puts it all together.

@alexn11 Can you please try with CUDA 10.1 and cuDNN 7.6 and let me know if you are still facing the same issue. Thanks!

I have the same issue. I have two laptops: one has an RTX 2060 Mobile and the other a GTX 860M.
I installed Linux a while ago. Before that everything worked smoothly, but since I reinstalled Windows 10 it no longer works.
I get this error on the GTX 860M. The RTX laptop works fine with these versions, but the GTX laptop does not work whatever I do:
  • Visual Studio 2019 Community with C++
  • CUDA: 10.1 update2
  • cuDNN: 7.6.5
  • Windows 10
(Both laptops have the same versions installed.)
I tried every possibility: Python 3.7, Python 3.8, the Anaconda versions, the driver version that comes with CUDA, and updating the driver to version 451. I tried TensorFlow 2.2.0 and 2.3.0.
I reinstalled Windows 10 from scratch each time I tried a different combination.

import tensorflow as tf
train_loss = tf.keras.metrics.Mean(name="train_loss")
import tensorflow as tf
m = tf.keras.Sequential()

or trying to train a model does not work and gives this error:

  File "C:\Users\mehmet\Anaconda3\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 1507, in __init__
    self._init_from_args(
  File "C:\Users\mehmet\Anaconda3\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 1661, in _init_from_args
    handle = eager_safe_variable_handle(
  File "C:\Users\mehmet\Anaconda3\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 242, in eager_safe_variable_handle
    return _variable_handle_from_shape_and_dtype(
  File "C:\Users\mehmet\Anaconda3\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 174, in _variable_handle_from_shape_and_dtype
    gen_logging_ops._assert( # pylint: disable=protected-access
  File "C:\Users\mehmet\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_logging_ops.py", line 49, in _assert
    _ops.raise_from_not_ok_status(e, name)
  File "C:\Users\mehmet\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 6843, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
InvalidArgumentError: assertion failed: [0] [Op:Assert] name: EagerVariableNameReuse

or stopping like this:

2020-08-05 17:36:46.242249: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-08-05 17:36:48.741475: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-08-05 17:36:49.608631: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 860M computeCapability: 5.0
coreClock: 1.0195GHz coreCount: 5 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 74.65GiB/s
2020-08-05 17:36:49.608692: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-08-05 17:36:49.613753: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-08-05 17:36:49.618772: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-08-05 17:36:49.620362: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-08-05 17:36:49.626329: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-08-05 17:36:49.629463: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-08-05 17:36:49.641028: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-08-05 17:36:49.641171: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-05 17:36:49.641781: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-08-05 17:36:49.659086: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x18a99c5aff0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-05 17:36:49.659142: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-08-05 17:36:49.659427: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 860M computeCapability: 5.0
coreClock: 1.0195GHz coreCount: 5 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 74.65GiB/s
2020-08-05 17:36:49.659465: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-08-05 17:36:49.659487: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-08-05 17:36:49.659506: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-08-05 17:36:49.659523: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-08-05 17:36:49.659539: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-08-05 17:36:49.659555: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-08-05 17:36:49.659576: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-08-05 17:36:49.659646: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-05 17:36:49.751107: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-05 17:36:49.751144: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]    0
2020-08-05 17:36:49.751154: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N
2020-08-05 17:36:49.751380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3121 MB memory) -> physical GPU (device: 0, name: GeForce GTX 860M, pci bus id: 0000:01:00.0, compute capability: 5.0)
2020-08-05 17:36:49.755399: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x18a9c483ba0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-08-05 17:36:49.755435: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 860M, Compute Capability 5.0

Same as @mcagricaliskan, on a fresh Windows install.

If I disable GPU acceleration everything works fine.

CUDA: 10.1 update2
cuDNN: 7.6.5
Windows: 10 N Enterprise
Python 3.7 or Python 3.8 (tried both)
TensorFlow 2.2.0 or 2.3.0, same error

Output nvidia-smi

Thu Aug 06 20:13:29 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 451.67       Driver Version: 451.67       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 750 Ti WDDM  | 00000000:01:00.0  On |                  N/A |
| 40%   26C    P8     1W /  38W |    375MiB /  2048MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       888    C+G   C:\Windows\explorer.exe         N/A      |
|    0   N/A  N/A      1136    C+G   Insufficient Permissions        N/A      |
|    0   N/A  N/A      6612    C+G   ...bbwe\Microsoft.Photos.exe    N/A      |
|    0   N/A  N/A      7184    C+G   ...5n1h2txyewy\SearchApp.exe    N/A      |
|    0   N/A  N/A     12040    C+G   ...nputApp\TextInputHost.exe    N/A      |
|    0   N/A  N/A     12192    C+G   ...y\ShellExperienceHost.exe    N/A      |
+-----------------------------------------------------------------------------+

Those apps are Windows defaults, so I cannot remove them from GPU processing.

I read that some of these apps could be using TensorFlow in an embedded way and interfering with execution; I found that in this comment from @pshved.

So I don't know what to try next to use CUDA with TF.

Full Output trace.

>>> import tensorflow as tf
2020-08-06 20:01:54.062996: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
>>> m = tf.keras.Sequential()
2020-08-06 20:02:04.140553: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-08-06 20:02:04.188734: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 750 Ti computeCapability: 5.0
coreClock: 1.0845GHz coreCount: 5 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 80.47GiB/s
2020-08-06 20:02:04.190198: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-08-06 20:02:04.278519: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-08-06 20:02:04.340656: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-08-06 20:02:04.368035: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-08-06 20:02:04.441860: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-08-06 20:02:04.482944: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-08-06 20:02:05.691214: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-08-06 20:02:05.691838: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-06 20:02:05.721854: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1e8dbcffce0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-06 20:02:05.722329: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-08-06 20:02:05.723997: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 750 Ti computeCapability: 5.0
coreClock: 1.0845GHz coreCount: 5 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 80.47GiB/s
2020-08-06 20:02:05.724720: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-08-06 20:02:05.725135: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-08-06 20:02:05.725493: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-08-06 20:02:05.725854: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-08-06 20:02:05.726529: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-08-06 20:02:05.727025: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-08-06 20:02:05.727351: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-08-06 20:02:05.727779: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-06 20:02:06.240587: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-06 20:02:06.241027: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0 
2020-08-06 20:02:06.241452: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N
2020-08-06 20:02:06.242547: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1455 MB memory) -> physical GPU (device: 0, name: GeForce GTX 750 Ti, pci bus id: 0000:01:00.0, compute capability: 5.0)
2020-08-06 20:02:06.248210: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1e8e2876090 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-08-06 20:02:06.248571: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 750 Ti, Compute Capability 5.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\IuriAndreazza\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\training\tracking\base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "C:\Users\IuriAndreazza\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\keras\engine\sequential.py", line 116, in __init__
    super(functional.Functional, self).__init__(  # pylint: disable=bad-super-call
  File "C:\Users\IuriAndreazza\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\training\tracking\base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "C:\Users\IuriAndreazza\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\keras\engine\training.py", line 308, in __init__
    self._init_batch_counters()
  File "C:\Users\IuriAndreazza\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\training\tracking\base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "C:\Users\IuriAndreazza\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\keras\engine\training.py", line 317, in _init_batch_counters
    self._train_counter = variables.Variable(0, dtype='int64', aggregation=agg)
  File "C:\Users\IuriAndreazza\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\ops\variables.py", line 262, in __call__
    return cls._variable_v2_call(*args, **kwargs)
  File "C:\Users\IuriAndreazza\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\ops\variables.py", line 244, in _variable_v2_call
    return previous_getter(
  File "C:\Users\IuriAndreazza\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\ops\variables.py", line 237, in <lambda>
    previous_getter = lambda **kws: default_variable_creator_v2(None, **kws)
  File "C:\Users\IuriAndreazza\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\ops\variable_scope.py", line 2633, in default_variable_creator_v2
    return resource_variable_ops.ResourceVariable(
  File "C:\Users\IuriAndreazza\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\ops\variables.py", line 264, in __call__
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [0] [Op:Assert] name: EagerVariableNameReuse

I have the same problem.
Just two lines of code

from tensorflow.keras.models import Sequential
model = Sequential()

Traceback (most recent call last):
  File "C:/TF3/T2.py", line 3, in <module>
    model = Sequential()
  File "C:\TF3\lib\site-packages\tensorflow\python\training\tracking\base.py", line 464, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "C:\TF3\lib\site-packages\tensorflow\python\keras\engine\sequential.py", line 116, in __init__
    name=name, autocast=False)
  File "C:\TF3\lib\site-packages\tensorflow\python\training\tracking\base.py", line 464, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "C:\TF3\lib\site-packages\tensorflow\python\keras\engine\training.py", line 308, in __init__
    self._init_batch_counters()
  File "C:\TF3\lib\site-packages\tensorflow\python\training\tracking\base.py", line 464, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "C:\TF3\lib\site-packages\tensorflow\python\keras\engine\training.py", line 316, in _init_batch_counters
    self._train_counter = variables.Variable(0, dtype='int64', aggregation=agg)
  File "C:\TF3\lib\site-packages\tensorflow\python\ops\variables.py", line 262, in __call__
    return cls._variable_v2_call(*args, **kwargs)
  File "C:\TF3\lib\site-packages\tensorflow\python\ops\variables.py", line 256, in _variable_v2_call
    shape=shape)
  File "C:\TF3\lib\site-packages\tensorflow\python\ops\variables.py", line 237, in <lambda>
    previous_getter = lambda **kws: default_variable_creator_v2(None, **kws)
  File "C:\TF3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2650, in default_variable_creator_v2
    shape=shape)
  File "C:\TF3\lib\site-packages\tensorflow\python\ops\variables.py", line 264, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "C:\TF3\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 1544, in __init__
    distribute_strategy=distribute_strategy)
  File "C:\TF3\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 1692, in _init_from_args
    graph_mode=self._in_graph_mode)
  File "C:\TF3\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 245, in eager_safe_variable_handle
    shape, dtype, shared_name, name, graph_mode, initial_value)
  File "C:\TF3\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 177, in _variable_handle_from_shape_and_dtype
    math_ops.logical_not(exists), [exists], name="EagerVariableNameReuse")
  File "C:\TF3\lib\site-packages\tensorflow\python\ops\gen_logging_ops.py", line 49, in _assert
    _ops.raise_from_not_ok_status(e, name)
  File "C:\TF3\lib\site-packages\tensorflow\python\framework\ops.py", line 6921, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [0] [Op:Assert] name: EagerVariableNameReuse

Windows 10
PyCharm 2020.2
Tensorflow 2.3.
CUDA and CUDNN compatible versions have been installed correctly.

I am planning to take the TensorFlow Certification exam, and one of the prerequisites is to check that PyCharm works correctly with the Image Classification example on the TensorFlow page.

I need the GPU, so checking whether it works without the GPU is irrelevant.

This is the failing assertion which seems to be GPU independent. Do you know if you're doing something weird in your model?

from tensorflow.keras.models import Sequential
model = Sequential()

seems to work fine for me though, on tf-nightly.

The code file has nothing more than these two lines. No other imports. No further modeling. I have installed all packages in this PyCharm project by updating the interpreter. TensorFlow 2.3.0 is installed and tf-nightly 2.4.0 dev20200811.

Why would this work in Jupyter Notebooks or Colab but not in PyCharm?

I'm not familiar with PyCharm so unfortunately I cannot help with this.

I have the same problem with Python 3.7, tf 2.3, CUDA 10.1 and cuDNN 7.6. Disabling the GPU makes everything very slow on the CPU and doesn't actually solve the problem. Downgrading to tf 2.2.0 does fix it.

@gowthamkpr
Right now I'm running with the following (downgraded) versions

cuda 10.2.89-5
cudnn 7.6.5.32-4
tensorflow-cuda 2.2.0-1

Waiting for the next update that would guarantee that my code runs.

Same problem with:
python 3.7, tf 2.3, CUDA 10.1 and cuDNN 7.6, Windows 10
python 3.7, tf-nightly (1-SEP-2020), CUDA 11.0 and cuDNN 8.03, Windows 10

It can also be reproduced with just tf.Variable((2,3)).

The problem does not happen in TensorFlow 2.2.
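
For anyone who wants to check their own setup, the one-line reproduction mentioned above can be wrapped in a small script that reports the outcome instead of dumping the full traceback. This is only a sketch: the helper name `reproduce` is made up for illustration, and it assumes the failure surfaces as `tf.errors.InvalidArgumentError`, as in the tracebacks posted in this thread.

```python
import importlib


def reproduce():
    """Try to create a trivial variable and summarize what happened.

    Returns "no-tensorflow", "ok", or "failed: <error text>".
    The function name is hypothetical, not part of TensorFlow.
    """
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        return "no-tensorflow"
    try:
        # Same call as reported in this thread; no Keras model is needed.
        tf.Variable((2, 3))
    except tf.errors.InvalidArgumentError as err:
        return "failed: " + str(err)
    return "ok"


print(reproduce())
```

Running it once under each TensorFlow version you have installed makes it easy to compare results when posting here.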

Same problem! CUDA 11.0 and tensorflow 2.3.0
When running the code without a GPU everything is fine, but slow. With a GPU (Google Cloud TensorFlow GPU) the program crashes.

@gowthamkpr thank you, that worked for me. CUDA 11.0 was the problem for me. After downgrading to CUDA 10.1, everything worked as expected.

@samuelvisscher which cuDNN version are you using to make it work, please?
I tried 8.0.3 and 7.6.5 and neither works...

@q-55555 I'm using cuDNN 7.6.5.

If it helps, I setup my environment using Google Cloud AI Platform Notebooks, using the CUDA Toolkit 10.1 instance (with GPU drivers pre-installed). Just had to install tensorflow-gpu 2.3.1.

@samuelvisscher Thanks for your reply.
Unfortunately, it is not working for me on Windows 10, tensorflow-gpu 2.3.1, CUDA Toolkit 10.1 and cuDNN 7.6.5.
I don't have the error with tensorflow 2.2.0 and tf-nightly.

So I guess there are no other ideas to help us make it work with tensorflow 2.3.1?

My understanding is that older GPUs are no longer supported by the default builds of either TensorFlow or CUDA. The solution would be to compile the corresponding library on your own system. I tried that but could not complete the compilation.

Thank you for your reply @alexn11
I don't think the problem is related to GPU compatibility with TensorFlow or CUDA, because I don't get the error with the most recent version of TensorFlow (tf-nightly).

It seems to be a Windows related bug. I tested it on a Linux GPU colab with the same CUDA and cuDNN version and it's working fine...

No, I have the issue on Arch Linux.

@omalleyt12, I think your commit https://github.com/tensorflow/tensorflow/commit/69565ec4003902794bc94e10ba5fe9469a0b3ae4 is creating this issue. When calling self._init_batch_counters() function in tensorflow/python/keras/engine/training.py, we get the following error:

"tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [0] [Op:Assert] name: EagerVariableNameReuse".

The issue appears with the GPU enabled, but not with every GPU: in Colab, for example, it works fine.

Maybe you have an idea or a workaround to propose.
Could you help us, please?

this was resolved for me after upgrading to CUDA/cuDNN: 11.1 (using [email protected])

(though there were a few other less critical issues
https://github.com/tensorflow/tensorflow/issues/44192
https://github.com/tensorflow/tensorflow/issues/44381
)

I managed to make it work with tensorflow 2.3.1 by recompiling tensorflow for CUDA 11.1 and cuDNN 8.0.4.

I have Windows 10 with a GeForce GTX 850M NVIDIA GPU, with CUDA 11.1 and cuDNN 8.04 configured and tensorflow 2.3.0 installed, but I am getting "Could not load dynamic library 'cudart64_101.dll'". Can anyone please assist with this?

@q-55555 how did you manage to get it working? Did you recompile tensorflow on a Windows machine? A step-by-step guide would be appreciated.

Thanks

@amitport I tried renaming the file cudart64_111.dll to cudart64_101.dll but no luck.

@m4masood no I deleted the previous messages because mine was with cusolver64 and not with cudart64_101.dll.

if it helps, make sure you are using a tf version that looks for cuda 11.1 (nightly or 2.4 rc)
2.3 will still look for cuda 10.1 on windows, so if you upgraded cuda and not tf you still have an incompatibility
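
To see which CUDA runtime DLLs are actually reachable, a quick PATH scan can help. This is a rough sketch using only the standard library; the helper name `find_on_path` is made up for illustration, and it assumes the DLL is located through the directories on PATH, which is how the CUDA `bin` folder is normally exposed on Windows. The only file name used is the one from the error message quoted above.

```python
import os


def find_on_path(filename, search_path=None):
    """Return the first full path to `filename` found on PATH, or None.

    `search_path` defaults to the PATH environment variable; it is a
    parameter only so the lookup can be tried against other directories.
    """
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


# Name taken from the "Could not load dynamic library" message in this thread.
name = "cudart64_101.dll"
print(name, "->", find_on_path(name) or "not found on PATH")
```

If the name your TensorFlow build asks for is not found while a differently numbered one is, that points to the version mismatch described above rather than to a broken install.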

@m4masood Yes I recompiled tensorflow on Windows (See https://www.tensorflow.org/install/source_windows).
You need to specify the use of CUDA 11.1 and cudnn 8 in the configure.py step.
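
As a rough sketch of that configure step: my understanding is that `configure.py` reads its answers from environment variables such as `TF_NEED_CUDA`, `TF_CUDA_VERSION` and `TF_CUDNN_VERSION` before prompting, but treat those names as an assumption and check them against the `configure.py` in your own checkout. The helper name `cuda_configure_env` and the source path in the comment are hypothetical.

```python
import os


def cuda_configure_env(cuda_version, cudnn_version, base_env=None):
    """Build an environment for the configure step with CUDA enabled.

    The variable names are assumed from configure.py and should be
    verified against the TensorFlow sources you are building.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "TF_NEED_CUDA": "1",
        "TF_CUDA_VERSION": cuda_version,
        "TF_CUDNN_VERSION": cudnn_version,
    })
    return env


# Versions mentioned in this thread: CUDA 11.1 and cuDNN 8.
env = cuda_configure_env("11.1", "8")
print({key: value for key, value in env.items() if key.startswith("TF_")})

# To run the configure step with these values (source path is hypothetical):
#   import subprocess, sys
#   subprocess.run([sys.executable, "configure.py"], env=env,
#                  cwd=r"C:\src\tensorflow", check=True)
```

Any question not covered by an environment variable should still be asked interactively, so this only saves retyping the CUDA answers.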
