cd [your_path_to]/models/research/domain_adaptation
python pixel_domain_adaptation/pixelda_train.py -- --dataset_dir $DSN_DATA_DIR --source_dataset mnist --target_dataset mnist_m
Training of the model crashes with an error raised from within the TensorFlow library itself.
Note: I am not using Bazel, since the build was crashing with errors similar to #2542.
...
INFO:tensorflow:Trainable variables for scope: <filter object at 0x7fcb7ede20b8>
WARNING:tensorflow:update_ops in create_train_op does not contain all the update_ops in GraphKeys.UPDATE_OPS
Traceback (most recent call last):
File "pixel_domain_adaptation/pixelda_train.py", line 409, in <module>
tf.app.run()
File "/home/username/anaconda3/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "pixel_domain_adaptation/pixelda_train.py", line 405, in main
hparams=hparams)
File "pixel_domain_adaptation/pixelda_train.py", line 332, in run_training
summarize_gradients=FLAGS.summarize_gradients)
File "/home/username/anaconda3/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 436, in create_train_op
colocate_gradients_with_ops=colocate_gradients_with_ops)
File "/home/username/anaconda3/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/contrib/training/python/training/training.py", line 437, in create_train_op
colocate_gradients_with_ops=colocate_gradients_with_ops)
File "/home/username/anaconda3/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 378, in compute_gradients
processors = [_get_processor(v) for v in var_list]
File "/home/username/anaconda3/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 378, in <listcomp>
processors = [_get_processor(v) for v in var_list]
File "/home/username/anaconda3/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 153, in _get_processor
if v.op.type == "VarHandleOp":
AttributeError: 'filter' object has no attribute 'op'
@Aldream Did you solve this bug? I came across the same bug at
tensorflow\python\training\optimizer.py", line 179, in _get_processor
if v.op.type == "VarHandleOp":
AttributeError: 'filter' object has no attribute 'op'
Platform: TensorFlow 1.4.0
Hi @StephenGreat,
No, still stuck at the same point.
@bousmalis @dmrd can you take a look? Thanks.
You can fix this by editing pixelda_train.py
I'm not sure this is the best way, but it works.
Go to line 326 and change it to the following (the modified lines are marked in bold):
**vars_to_train = list(discriminator_vars)**
discriminator_train_op = slim.learning.create_train_op(
discriminator_loss,
discriminator_optimizer,
update_ops=discriminator_update_ops,
**variables_to_train=vars_to_train**,
clip_gradient_norm=hparams.clip_gradient_norm,
summarize_gradients=FLAGS.summarize_gradients)
if hparams.generator_steps == 0:
generator_train_op = tf.no_op()
else:
**gen_vars_to_train = list(generator_vars)**
generator_optimizer = tf.train.AdamOptimizer(
learning_rate, beta1=hparams.adam_beta1)
generator_train_op = slim.learning.create_train_op(
generator_loss,
generator_optimizer,
update_ops=generator_update_ops,
**variables_to_train=gen_vars_to_train,**
clip_gradient_norm=hparams.clip_gradient_norm,
summarize_gradients=FLAGS.summarize_gradients)
This won't fix all of your problems. I'm now getting:
Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/99139fd0db7d1d0bf07ad10838e7e6d5/execroot/__main__/bazel-out/k8-fastbuild/bin/research/domain_adaptation/pixel_domain_adaptation/pixelda_train.runfiles/__main__/research/domain_adaptation/pixel_domain_adaptation/pixelda_train.py", line 202, in _train
[discriminator_train_op, global_step])
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 539, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1013, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1102, in run
raise six.reraise(*original_exc_info)
File "/usr/local/lib/python3.6/site-packages/six.py", line 693, in reraise
raise value
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1089, in run
return self._sess.run(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1161, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 941, in run
return self._sess.run(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1128, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
options, run_metadata)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_5_batch/fifo_queue' is closed and has insufficient elements (requested 32, current size 0)
[[Node: batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_INT64], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]
I was able to resolve it by changing
var_list = filter(is_trainable, slim.get_model_variables(scope))
to
var_list = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope)
in _get_vars_and_update_ops(hparams, scope) in pixelda_train.py.
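For context, here is a hedged sketch of that spot after the change (the rest of the function is elided, and the imports are those pixelda_train.py already uses):
import tensorflow as tf  # TF 1.x
slim = tf.contrib.slim

def _get_vars_and_update_ops(hparams, scope):
    # Before (Python 3: filter() returns a lazy iterator, which later makes
    # optimizer._get_processor fail with "'filter' object has no attribute 'op'"):
    #   var_list = filter(is_trainable, slim.get_model_variables(scope))
    # After: read the TRAINABLE_VARIABLES collection directly; it is always
    # a plain Python list of variable objects.
    var_list = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope)
    # ... rest of the function unchanged ...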
The problem is the different behaviour of the filter function in Python 3, where it returns an iterator instead of a list as in Python 2.
A simple fix is to change filter(....) to list(filter(.....))
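A minimal, self-contained illustration of the difference (plain Python, no TensorFlow involved):
# Python 3: filter() yields a lazy iterator object, not a list.
evens = filter(lambda n: n % 2 == 0, [1, 2, 3, 4])
print(type(evens))  # <class 'filter'> -- no indexing, and certainly no .op attribute
# Wrapping it in list() materializes the elements, restoring the Python 2 behaviour:
evens = list(filter(lambda n: n % 2 == 0, [1, 2, 3, 4]))
print(evens)  # [2, 4]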
Alternatively, the same simple fix works at the call site, by changing
variables_to_train=discriminator_vars,
to
variables_to_train=list(discriminator_vars),
and doing the same for the generator.
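Spelled out, the discriminator call then reads as follows (a sketch assembled from the snippet quoted earlier in the thread; the generator call is analogous):
discriminator_train_op = slim.learning.create_train_op(
    discriminator_loss,
    discriminator_optimizer,
    update_ops=discriminator_update_ops,
    variables_to_train=list(discriminator_vars),  # materialize the filter object here
    clip_gradient_norm=hparams.clip_gradient_norm,
    summarize_gradients=FLAGS.summarize_gradients)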
Hi There,
We are checking to see if you still need help on this, as it seems to be a considerably old issue. Please update this issue with the latest information, a code snippet to reproduce your issue, and the error you are seeing.
If we don't hear from you in the next 7 days, this issue will be closed automatically. If you no longer need help with this issue, please consider closing it.