I have been training model updates all day, with runs lasting about 25 to 45 minutes. Several runs have hit this same error, each time at a different point. The stack trace above is from the second failure in a row on the same model, which runs for 65k steps. Other runs have completed with no issue, even up to 134k steps. While training runs on one GPU, I have cifar10_eval.py running on another GPU. I also keep TensorBoard running so that I can watch progress in the scalar graphs.
Running with Python 3.6.4
If it helps, my D: drive is an SSD dedicated to deep learning projects. My C: drive is on a different SSD.
Have you solved this problem?
I got this same error running in python 3.5. Was training my network fine for days and now I get this error non-stop.
Any news on this? I just started getting this today during training, only occasionally. I haven't really changed anything. I've been training 10-20 models in parallel for weeks without seeing this, until now. TF version 1.8.0 and Python 3.6.5 on Windows.
I'm still getting the same error message. Using win10, Python 3.5.3 and tf 1.7.0
Same error on Windows 10, TF 1.13.0rc1, Python 3.6
Any updates on that?
I'm getting a similar error when using BestExporter as exporter for train_and_evaluate.
@John3-16 I wonder if this is linked to the use of TensorBoard. I am also running TensorBoard while training. Could access be denied because TensorBoard was reading that file or folder at that moment?
I will try to reproduce the error while NOT using TensorBoard anyway. If the error still shows up, there is no link to TensorBoard; if it doesn't, a link becomes more likely, but that is not confirmation.
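If another process such as TensorBoard does briefly hold the file open, one way to probe that theory is to retry the failing save after a short pause. Below is a minimal sketch, assuming the lock is transient. `retry_on_error` and its parameters are hypothetical names, not part of TensorFlow, and the exception types you pass need to match what your save call actually raises (later traces in this thread show `tensorflow.python.framework.errors_impl.UnknownError`).

```python
import time


def retry_on_error(fn, exceptions=(OSError,), attempts=5, delay_sec=0.5):
    """Call fn(); on a listed exception, pause and retry up to `attempts` times.

    Hypothetical helper: it only papers over transient failures and
    re-raises the last error if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except exceptions:
            if attempt == attempts:
                raise  # out of retries, surface the original error
            time.sleep(delay_sec)  # give the other process time to let go
```

In a TF 1.x training loop this could wrap the checkpoint write, for example `retry_on_error(lambda: saver.save(sess, path), exceptions=(tf.errors.UnknownError,))`. If the retries make the error disappear, that would support the idea of a short-lived lock, though it would not prove TensorBoard is the culprit.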
Hi There,
We are checking to see if you still need help with this, as this seems to be a considerably old issue. Please update this issue with the latest information, a code snippet that reproduces the problem, and the error you are seeing.
If we don't hear from you in the next 7 days, this issue will be closed automatically. If you don't need help on this issue any more, please consider closing this.
C:\Users\G522405\hub>python .\examples\image_retraining\retrain.py --tfhub_module https://tfhub.dev/google/imagenet/mobilenet_v1_100_224/quantops/feature_vector/3 --image_dir .\train_images\mask\ --how_many_training_steps 5000 --testing_percentage 20 --saved_model_dir .\saved_model\
C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\framework\dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\framework\dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\framework\dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\framework\dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\framework\dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\framework\dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorboard\compat\tensorflow_stub\dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorboard\compat\tensorflow_stub\dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorboard\compat\tensorflow_stub\dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorboard\compat\tensorflow_stub\dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorboard\compat\tensorflow_stub\dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorboard\compat\tensorflow_stub\dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
WARNING:tensorflow:From .\examples\image_retraining\retrain.py:1357: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.
E0424 14:06:49.798524 20932 retrain.py:1001] WARNING: This tool is deprecated in favor of https://github.com/tensorflow/hub/tree/master/tensorflow_hub/tools/make_image_classifier
WARNING:tensorflow:From .\examples\image_retraining\retrain.py:920: The name tf.gfile.Exists is deprecated. Please use tf.io.gfile.exists instead.
W0424 14:06:49.806849 20932 deprecation_wrapper.py:119] From .\examples\image_retraining\retrain.py:920: The name tf.gfile.Exists is deprecated. Please use tf.io.gfile.exists instead.
WARNING:tensorflow:From .\examples\image_retraining\retrain.py:921: The name tf.gfile.DeleteRecursively is deprecated. Please use tf.io.gfile.rmtree instead.
W0424 14:06:49.806849 20932 deprecation_wrapper.py:119] From .\examples\image_retraining\retrain.py:921: The name tf.gfile.DeleteRecursively is deprecated. Please use tf.io.gfile.rmtree instead.
WARNING:tensorflow:From .\examples\image_retraining\retrain.py:922: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.
W0424 14:06:49.816877 20932 deprecation_wrapper.py:119] From .\examples\image_retraining\retrain.py:922: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.
WARNING:tensorflow:From .\examples\image_retraining\retrain.py:171: The name tf.gfile.Walk is deprecated. Please use tf.io.gfile.walk instead.
W0424 14:06:49.816877 20932 deprecation_wrapper.py:119] From .\examples\image_retraining\retrain.py:171: The name tf.gfile.Walk is deprecated. Please use tf.io.gfile.walk instead.
I0424 14:06:54.408019 20932 retrain.py:188] Looking for images in 'mask_detected'
WARNING:tensorflow:From .\examples\image_retraining\retrain.py:191: The name tf.gfile.Glob is deprecated. Please use tf.io.gfile.glob instead.
W0424 14:06:54.423643 20932 deprecation_wrapper.py:119] From .\examples\image_retraining\retrain.py:191: The name tf.gfile.Glob is deprecated. Please use tf.io.gfile.glob instead.
I0424 14:06:55.814184 20932 retrain.py:188] Looking for images in 'no_mask_detected'
I0424 14:06:58.353736 20932 resolver.py:79] Using C:\Users\G522405\AppData\Local\Temp\tfhub_modules to cache modules.
I0424 14:06:58.364075 20932 resolver.py:413] Downloading TF-Hub Module 'https://tfhub.dev/google/imagenet/mobilenet_v1_100_224/quantops/feature_vector/3'.
I0424 14:09:56.641230 20932 resolver.py:122] Downloading https://tfhub.dev/google/imagenet/mobilenet_v1_100_224/quantops/feature_vector/3: 10.65MB
I0424 14:10:31.264947 20932 resolver.py:122] Downloading https://tfhub.dev/google/imagenet/mobilenet_v1_100_224/quantops/feature_vector/3: 12.96MB
I0424 14:10:31.280569 20932 resolver.py:122] Downloaded https://tfhub.dev/google/imagenet/mobilenet_v1_100_224/quantops/feature_vector/3, Total size: 12.96MB
Traceback (most recent call last):
File ".\examples\image_retraining\retrain.py", line 1357, in <module>
tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
File "C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\platform\app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\absl\app.py", line 299, in run
_run_main(main, args)
File "C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\absl\app.py", line 250, in _run_main
sys.exit(main(argv))
File ".\examples\image_retraining\retrain.py", line 1031, in main
module_spec = hub.load_module_spec(FLAGS.tfhub_module)
File "C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorflow_hub\module.py", line 63, in load_module_spec
path = registry.resolver(path)
File "C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorflow_hub\registry.py", line 42, in __call__
return impl(*args, **kwargs)
File "C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorflow_hub\compressed_module_resolver.py", line 88, in __call__
self._lock_file_timeout_sec())
File "C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorflow_hub\resolver.py", line 427, in atomic_download
tf_v1.gfile.Rename(tmp_dir, module_dir)
File "C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\lib\io\file_io.py", line 502, in rename
rename_v2(oldname, newname, overwrite)
File "C:\Users\G522405\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\lib\io\file_io.py", line 519, in rename_v2
compat.as_bytes(src), compat.as_bytes(dst), overwrite)
tensorflow.python.framework.errors_impl.UnknownError: Failed to rename: C:\Users\G522405\AppData\Local\Temp\tfhub_modules\d55dd1be26e6816dbb24ee566c7a6ac2ba6863ed.f22d2630950d4d8daabe63e155fd70fc.tmp to: C:\Users\G522405\AppData\Local\Temp\tfhub_modules\d55dd1be26e6816dbb24ee566c7a6ac2ba6863ed : Access is denied.
; Input/output error
Try running cmd or PowerShell as Administrator. This should solve the issue.
No, it didn't solve the issue.
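Since the failing rename in the log above happens inside the default TF-Hub cache under the user's Temp folder (`tfhub_modules`), it may be worth pointing the cache at a directory you fully control, in case something else on the machine is touching the temp folder. This is a guess, not a confirmed fix. `TFHUB_CACHE_DIR` is the environment variable `tensorflow_hub` reads for its cache location; the helper name and the example path below are made up for illustration.

```python
import os


def use_private_hub_cache(cache_dir):
    """Create cache_dir if needed and tell TF-Hub to download modules there.

    Hypothetical helper. Call it before any module is loaded so the
    resolver picks up the new location.
    """
    os.makedirs(cache_dir, exist_ok=True)
    os.environ["TFHUB_CACHE_DIR"] = cache_dir
    return cache_dir
```

For example, calling `use_private_hub_cache(r"D:\tfhub_cache")` at the top of the script (the path is only an example) would make the download and the final rename happen under that folder instead of under Temp.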
I am having the same problem. It seems to be fairly common, since several threads are open on it, for example keras-team/keras-tuner#339 and tensorflow/tensorflow#41380.
In the latter (41380), a temporary workaround is provided. Could somebody else test it?
Here is a sample of the error log:
File "snippet_ktuner.py", line 144, in <module>
tuner.search(x_train, y_train,
File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python38\lib\site-packages\kerastuner\engine\base_tuner.py", line 130, in search
self.run_trial(trial, *fit_args, **fit_kwargs)
File "snippet_ktuner.py", line 111, in run_trial
super(HyperBatch, self).run_trial(trial, *args, **kwargs)
File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python38\lib\site-packages\kerastuner\tuners\hyperband.py", line 387, in run_trial
super(Hyperband, self).run_trial(trial, *fit_args, **fit_kwargs)
File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python38\lib\site-packages\kerastuner\engine\multi_execution_tuner.py", line 96, in run_trial
history = model.fit(*fit_args, **copied_fit_kwargs)
File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\training.py", line 66, in _method_wrapper
return method(self, *args, **kwargs)
File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\training.py", line 876, in fit
callbacks.on_epoch_end(epoch, epoch_logs)
File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\callbacks.py", line 365, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\callbacks.py", line 1177, in on_epoch_end
self._save_model(epoch=epoch, logs=logs)
File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\callbacks.py", line 1212, in _save_model
self.model.save_weights(filepath, overwrite=True)
File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\network.py", line 1169, in save_weights
checkpoint_management.update_checkpoint_state_internal(
File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\training\checkpoint_management.py", line 247, in update_checkpoint_state_internal
file_io.atomic_write_string_to_file(coord_checkpoint_filename,
File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 545, in atomic_write_string_to_file
rename(temp_pathname, filename, overwrite)
File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 491, in rename
rename_v2(oldname, newname, overwrite)
File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 507, in rename_v2
_pywrap_file_io.RenameFile(
tensorflow.python.framework.errors_impl.UnknownError: Failed to rename: models\keras_tuner\hyperband_tuner-20200721-134553\trials\trial_491451722e7ce7d1c8d48609bdb8e766\checkpoints\epoch_0\checkpoint.tmpd834708963ff4d52afab8d390c6003c2 to: models\keras_tuner\hyperband_tuner-20200721-134553\trials\trial_491451722e7ce7d1c8d48609bdb8e766\checkpoints\epoch_0\checkpoint : Access is denied.
; Input/output error
Check whether a folder already exists at that path with the same name as the file you want to create. For me, deleting the folder named 'checkpoint' allowed the file 'checkpoint' to be created at that path.
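That check can be scripted when there are many trial or epoch folders to look through. Here is a small sketch, assuming the file being written is named `checkpoint` as in the trace above; `find_blocking_dirs` is a hypothetical helper name.

```python
import os


def find_blocking_dirs(root, filename="checkpoint"):
    """Return paths under root where a directory has the name of the file to write.

    A directory sitting at the rename target would block the rename.
    """
    hits = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if filename in dirnames:
            hits.append(os.path.join(dirpath, filename))
    return hits
```

Running it on the tuner output folder, for example `find_blocking_dirs(r"models\keras_tuner")`, lists any stray `checkpoint` directories so they can be inspected before deleting anything.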