Tensor2tensor: Retval[0] does not have value issue for multiple problems

Created on 17 Jul 2017 · 3 Comments · Source: tensorflow/tensor2tensor

Hello :) I am working on a transformer for multiple problems. With only one problem there are no errors, but with two or more problems t2t-trainer fails with the error in the title.
The error occurs at line 1010, in _train_model of tensorflow/contrib/learn/python/learn/estimators/estimator.py.
How can I resolve this?
I tried adding more swap (100GB), decreasing the max_case numbers, and so on. Could anyone please help me?
(My environment: 46GB RAM, 100GB swap, GTX 1080)

Thank you in advance,
Jae

All 3 comments

Please provide the command line you used to start the trainer and the stack trace of the error. Thanks.

Hi,
Please find below the command line and the stack trace for this error. Note that training a single problem with the same parameters works fine. The error only appears when we try to train multiple problems together.

Command line call:
t2t-trainer \
--data_dir=$DATA_DIR \
--problems=wmt_ende_tokens_32k-wmt_enfr_tokens_32k \
--model=transformer \
--hparams_set=transformer_base_single_gpu \
--output_dir=$TRAIN_DIR
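
Since each problem trains fine on its own, one way to double-check that before blaming the multi-problem setup is to run the same command once per problem. Below is a minimal sketch that splits the dash-joined --problems value and prints one command per problem. The helper name single_problem_commands and the per-problem output subdirectory are hypothetical choices for illustration, not part of t2t-trainer; the flags themselves are copied from the command above.

```python
def single_problem_commands(problems, data_dir="$DATA_DIR", train_dir="$TRAIN_DIR"):
    """Build one t2t-trainer command per problem in a dash-joined --problems value.

    Hypothetical helper for illustration; it only builds strings and runs nothing.
    """
    commands = []
    for name in problems.split("-"):
        args = [
            "t2t-trainer",
            "--data_dir=" + data_dir,
            "--problems=" + name,
            "--model=transformer",
            "--hparams_set=transformer_base_single_gpu",
            # Assumption: a separate output directory per problem, so the
            # runs do not share checkpoints.
            "--output_dir=" + train_dir + "/" + name,
        ]
        commands.append(" ".join(args))
    return commands


for cmd in single_problem_commands("wmt_ende_tokens_32k-wmt_enfr_tokens_32k"):
    print(cmd)
```

If every single-problem command trains without the error, the failure is specific to the combined run.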

Stack trace:
Traceback (most recent call last):
  File "/home/dilip/anaconda2/bin/t2t-trainer", line 83, in <module>
    tf.app.run()
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/home/dilip/anaconda2/bin/t2t-trainer", line 79, in main
    schedule=FLAGS.schedule)
  File "/home/dilip/tensor2tensor/tensor2tensor/utils/trainer_utils.py", line 247, in run
    run_locally(exp_fn(output_dir))
  File "/home/dilip/tensor2tensor/tensor2tensor/utils/trainer_utils.py", line 540, in run_locally
    exp.train_and_evaluate()
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 502, in train_and_evaluate
    self.train(delay_secs=0)
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 280, in train
    hooks=self._train_monitors + extra_hooks)
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 677, in _call_train
    monitors=hooks)
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 296, in new_func
    return func(*args, **kwargs)
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 458, in fit
    loss = self._train_model(input_fn=input_fn, hooks=hooks)
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1010, in _train_model
    _, loss = mon_sess.run([model_fn_ops.train_op, model_fn_ops.loss])
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 518, in run
    run_metadata=run_metadata)
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 862, in run
    run_metadata=run_metadata)
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 818, in run
    return self._sess.run(*args, **kwargs)
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 972, in run
    run_metadata=run_metadata)
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 818, in run
    return self._sess.run(*args, **kwargs)
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/home/dilip/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Retval[0] does not have value

This was caused by summaries not working correctly with tf.cond. It should be fixed in 1.1.0. Please take a look and reopen if you see this again!
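
To see whether an environment already has that release, something like the following could help. It is only a sketch: it assumes that "1.1.0" refers to the tensor2tensor package version (the comment does not say so explicitly), it needs Python 3.8+ for importlib.metadata (the traceback above comes from Python 2.7), and the helper names installed_version, version_tuple and has_fix are made up for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '1.0.14' into (1, 0, 14), keeping only the leading digits of each part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_fix(version, fixed_in="1.1.0"):
    """True if version is at least fixed_in (assumed here to be the fixed release)."""
    return version is not None and version_tuple(version) >= version_tuple(fixed_in)


current = installed_version("tensor2tensor")
print("tensor2tensor version:", current)
print("at least 1.1.0:", has_fix(current))
```

Note that this simple comparison ignores pre-release suffixes, so a version such as 1.1.0rc1 would be treated the same as 1.1.0.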