tensorflow/models/tutorials/rnn/translate/seq2seq_model.py
Hello everyone,
When I ran the translate model, I encountered the following issue:
File "translate.py", line 294, in main
train()
File "translate.py", line 153, in train
model = create_model(sess, False)
File "translate.py", line 132, in create_model
dtype=dtype)
File "/Users/richard_xiong/Documents/DeepLearningMaster/RNN/seq2seq_model.py", line 181, in __init__
softmax_loss_function=softmax_loss_function)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/seq2seq.py", line 1130, in model_with_buckets
softmax_loss_function=softmax_loss_function))
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/seq2seq.py", line 1058, in sequence_loss
softmax_loss_function=softmax_loss_function))
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/seq2seq.py", line 1022, in sequence_loss_by_example
crossent = softmax_loss_function(logit, target)
File "/Users/richard_xiong/Documents/DeepLearningMaster/RNN/seq2seq_model.py", line 117, in sampled_loss
num_classes=self.target_vocab_size),
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/nn.py", line 1412, in sampled_softmax_loss
name=name)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/nn.py", line 1219, in _compute_sampled_logits
inputs, sampled_w, transpose_b=True) + sampled_b
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 1729, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1442, in _mat_mul
transpose_b=transpose_b, name=name)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2242, in create_op
set_shapes_for_outputs(ret)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1617, in set_shapes_for_outputs
shapes = shape_func(op)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1568, in call_with_requiring
return call_cpp_shape_fn(op, require_shape_fn=True)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
debug_python_shape_fn, require_shape_fn)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 675, in _call_cpp_shape_fn_impl
raise ValueError(err.message)
ValueError: Shape must be rank 2 but is rank 1 for 'model_with_buckets/sequence_loss/sequence_loss_by_example/sampled_softmax_loss/MatMul_1' (op: 'MatMul') with input shapes: [?], [?,1024].
It seems to be a matrix multiplication error raised inside the function 'tf.nn.seq2seq.model_with_buckets()'.
Does anyone have any ideas? Thank you!
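To make the error message above easier to read, here is a minimal sketch of the kind of rank check a matrix product applies. It is a simplified stand-in written in plain Python, not TensorFlow's actual shape function; the name `matmul_output_shape` is hypothetical, and `None` plays the role of the unknown `?` dimension.

```python
def matmul_output_shape(a_shape, b_shape, transpose_b=False):
    """Return the output shape of a 2-D matrix product, or raise ValueError.

    Shapes are lists of ints; None stands in for an unknown ('?') size.
    Hypothetical helper for illustration only.
    """
    # Both operands must be matrices (rank 2).
    for shape in (a_shape, b_shape):
        if len(shape) != 2:
            raise ValueError(
                "Shape must be rank 2 but is rank %d" % len(shape))
    # With transpose_b, b of shape [n, k] is used as a [k, n] matrix.
    b_rows, b_cols = (b_shape[1], b_shape[0]) if transpose_b else b_shape
    inner_a = a_shape[1]
    # Only compare the inner dimensions when both are known.
    if inner_a is not None and b_rows is not None and inner_a != b_rows:
        raise ValueError(
            "Inner dimensions differ: %r vs %r" % (inner_a, b_rows))
    return [a_shape[0], b_cols]


# The shapes from the error message: a rank-1 tensor where a matrix is expected.
try:
    matmul_output_shape([None], [None, 1024], transpose_b=True)
except ValueError as err:
    print(err)

# A rank-2 first operand with 1024 columns passes the check.
print(matmul_output_shape([None, 1024], [None, 1024], transpose_b=True))
```

The point of the sketch is only that the first operand reported in the error, `[?]`, is a vector, so the check fails before any multiplication happens.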
I encountered the same issue with Python 3.4 and TensorFlow 0.12.1. Any input?
@richardxiong If you are using r0.12, try tensorflow/models/rnn/translate/ instead.
@bxshi I'm using Python 2.7, and both versions 0.12.0 and 0.12.1 have the same issue. It seems the directory has already changed: the code has been moved to tensorflow/models/tutorials/rnn/translate/
Any ideas?
@richardxiong Ahh, my bad. I did not realize that you were already using the translate.py in the main repo instead of the one from tensorflow/models.
Hi Richard, can you try running it on the nightly build of TensorFlow? Let me know if that works better. A number of these models were updated with new code that may not work with r0.12 unfortunately.
I have tried this with tensorflow/tensorflow’s master branch with no problem, but it does not work with r0.12 branch.
Hi Neal @nealwu
I tried running it on the latest nightly build, and it works fine. One thing to note is that the RNN cell module is still in contrib.rnn instead of nn.rnn_cell (it seems the folder was not updated, but the code works).
Also as a reference @bxshi , thanks for the help!
@richardxiong I believe tf.contrib.rnn is the latest version of the code and tf.nn.rnn_cell is deprecated, at least in master.
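For code that has to run on more than one build while the module location is in flux, one option is to look the module up by name and take whichever location exists. The sketch below is an assumption-laden illustration: `resolve_first` is a hypothetical helper, and the demo uses a stand-in object rather than a real TensorFlow import.

```python
from types import SimpleNamespace


def resolve_first(root, dotted_paths):
    """Return (path, object) for the first dotted path found under root.

    Hypothetical helper: tries each path in order and skips the ones
    that raise AttributeError.
    """
    for path in dotted_paths:
        obj = root
        try:
            for part in path.split("."):
                obj = getattr(obj, part)
        except AttributeError:
            continue
        return path, obj
    raise AttributeError("none of %r found" % (dotted_paths,))


# Stand-in for a build that only exposes the older location.
fake_tf = SimpleNamespace(nn=SimpleNamespace(rnn_cell="rnn cell module"))
print(resolve_first(fake_tf, ["contrib.rnn", "nn.rnn_cell"]))
```

With a real install, the intended call would be something like `resolve_first(tf, ["contrib.rnn", "nn.rnn_cell"])`, listing the preferred location first.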
@nealwu Okay thanks for letting me know Neal!
@richardxiong I tried the code from the directory tensorflow/models/tutorials/rnn/translate/, but still encountered the problem. Any ideas?
@zhihuizheng Are you using the master branch of TensorFlow? The translate model is not compatible with version 0.12.
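A quick way to confirm which release is actually installed is to parse the version string. This is a minimal sketch; the helper names are hypothetical, and it assumes the version string starts with `major.minor` (as in `0.12.1`).

```python
import re


def parse_version(version):
    """Extract (major, minor) from a string such as '0.12.1' or '1.0.0-rc2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % (version,))
    return int(match.group(1)), int(match.group(2))


def is_release_0_12_or_older(version):
    """True when the version is 0.12 or earlier (hypothetical helper)."""
    return parse_version(version) <= (0, 12)


for v in ["0.12.0", "0.12.1", "1.0.0"]:
    print(v, is_release_0_12_or_older(v))
```

With TensorFlow installed, the string to pass in would be `tf.__version__`.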
@bxshi Oh no, I am using master+v0.12. How can I fix it?
@zhihuizheng, you can either use the nightly build (which I believe does not enable SSE) from
https://github.com/tensorflow/tensorflow/blob/master/README.md
or first run pip/pip3 uninstall tensorflow and then compile TensorFlow from the master branch.
I think the command is
bazel build --copt=-march=native -c opt //tensorflow/tools/pip_package:build_pip_package
You can find more details on tensorflow.org.
FYI, here is the guide on installing from source: https://www.tensorflow.org/get_started/os_setup#installing_from_sources
I am getting the same error. Can anyone explain clearly how to get rid of it?
The error is:
ValueError: Shape must be rank 2 but is rank 1 for 'model_with_buckets/sequence_loss/sequence_loss_by_example/sampled_softmax_loss/LogUniformCandidateSampler' (op: 'LogUniformCandidateSampler') with input shapes: [?].
Getting the same error.
Same problem for me. I tried using the nightly build version, but the problem there is that the "current" seq2seq API (as in version 1.0) is completely reworked (see this commit), which means I cannot use the nightly version without throwing away my current implementation.
@ebrevdo, what's the roadmap for the new seq2seq API? Is there already any documentation available for the new API?
@nealwu, do you know an "easy" work around for the problem until a newer version is available?
I hope it doesn't sound too demanding, but the problem is that I'm currently implementing the system for my bachelor thesis in tensorflow and this issue blocks me currently.
For anyone who still gets this bug: on line 1022 of tensorflow/python/ops/seq2seq.py, change softmax_loss_function(logit, target) to softmax_loss_function(target, logit).
Someone swapped the order of that function's arguments.
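An alternative to editing the installed library file is to wrap your own loss function so that it accepts whichever order the installed library passes. The sketch below shows only the wrapping idea; `swap_argument_order` and `my_loss` are hypothetical names, and which order your install uses is something to check in its source (the traceback at the top of this thread shows `softmax_loss_function(logit, target)`).

```python
def swap_argument_order(loss_fn):
    """Wrap a two-argument loss so it receives its arguments in reverse order."""
    def swapped(first, second):
        return loss_fn(second, first)
    return swapped


# Stand-in loss that just reports what it received.
def my_loss(inputs, labels):
    return {"inputs": inputs, "labels": labels}


# A caller that passes (labels, inputs) now reaches my_loss correctly.
labels_first_loss = swap_argument_order(my_loss)
print(labels_first_loss("label tensor", "logit tensor"))
```

The wrapped function can then be passed wherever the original loss function was, which keeps the change in your own code rather than in site-packages.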
Use TensorFlow v0.12.x and Python 3.5.x; you won't get all these errors, and everything will run smoothly except for corpus loading and a utf- unicode error that can be solved easily.
Also experiencing the same problem, on python 2.7. Hopefully, recompiling might save me.
Regarding the new seq2seq API, I just pushed the last of the basics: a new rnncell decoder wrapper. Should be in master tomorrow. We'll be adding more in the coming weeks but I hope we've now got feature parity with the old legacy API.
(well, + dynamic decoding and scheduled sampling)
I think this change that I just merged should solve the issue: https://github.com/tensorflow/models/pull/982. It looks like the sampled_loss method in seq2seq_model.py had its arguments reversed.
For anyone who is still facing this error, you can change the following in seq2seq_model.py:
line 103, from def sampled_loss(inputs, labels): to def sampled_loss(labels, inputs):
My current settings are:
theano: 0.8.2
tensorflow: 1.0.0
Using TensorFlow backend.
keras: 1.2.2
@gerarq looks like you are reversing that change I linked to above. Why does that fix things for you? According to https://www.tensorflow.org/api_docs/python/tf/contrib/legacy_seq2seq/model_with_buckets, softmax_loss_function should take inputs first and labels second.
It looks like the problem is with our documentation. https://www.tensorflow.org/api_docs/python/tf/contrib/legacy_seq2seq/sequence_loss_by_example suggests the other order.
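Since the confusion in this thread is about positional order, one defensive option is to forward everything to the underlying loss by keyword. The sketch below is not the tutorial's code: `make_sampled_loss` is a hypothetical factory, the real functions are injected as parameters so the example runs without TensorFlow, and the reshape of labels to `[-1, 1]` is an assumption about what the sampled loss expects.

```python
def make_sampled_loss(sampled_softmax_loss, reshape, weights, biases,
                      num_sampled, num_classes):
    """Build a (labels, inputs) loss that calls the underlying loss by keyword.

    Hypothetical factory; sampled_softmax_loss and reshape are injected.
    """
    def sampled_loss(labels, inputs):
        # Assumption: the sampled loss wants labels as a [batch, 1] matrix.
        labels = reshape(labels, [-1, 1])
        return sampled_softmax_loss(
            weights=weights, biases=biases, labels=labels, inputs=inputs,
            num_sampled=num_sampled, num_classes=num_classes)
    return sampled_loss


# Stand-ins so the sketch runs on its own; they only echo their arguments.
def fake_reshape(tensor, shape):
    return ("reshaped", tensor, shape)


def fake_sampled_softmax_loss(**kwargs):
    return kwargs


loss = make_sampled_loss(fake_sampled_softmax_loss, fake_reshape,
                         weights="w_t", biases="b",
                         num_sampled=512, num_classes=40000)
print(loss("labels", "inputs"))
```

In real use, the intended arguments would be `tf.nn.sampled_softmax_loss` and `tf.reshape`; I believe both accept these keyword names, but that is worth checking against the installed version. Note that this only protects the inner call: the order in which the library invokes `sampled_loss` itself still has to match the installed release.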
I believe this should be fixed now via https://github.com/tensorflow/models/pull/1226. If you run into further issues let me know.
Thanks for taking care of that!
I was facing the same issue.
@gerarq Thanks for the quick fix!
@dhakrasp you should pull this repository again. The fix was merged in https://github.com/tensorflow/models/pull/1226.