What is the top-level directory of the model you are using:
lm_1b
Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
Yes; e.g. I modified the BATCH_SIZE variable to 10 at https://github.com/tensorflow/models/blob/master/lm_1b/lm_1b_eval.py#L69, plus minor modifications for Python 3 compatibility.
OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
Linux CentOS 7
TensorFlow installed from (source or binary):
binary
TensorFlow version (use command below):
v1.0.0-65-g4763edf-dirty 1.0.1
Bazel version (if compiling from source):
N/A
python3.5 lm_1b_eval.py --mode eval --pbtxt data/graph-2016-09-10.pbtxt --vocab_file data/vocab-2016-09-10.txt --input_data ../data/news.en.heldout-00000-of-00050 --ckpt "data/ckpt-*"
The repository code contains a BATCH_SIZE variable set to 1. Modifying this to 10 raises the following error:
"ValueError: Cannot feed value of shape (10, 1) for Tensor 'inputs_in:0', which has shape '(1, 1)'"
I looked through the actual graph pbtxt, and it pretty clearly only contemplates a batch_size of 1 (similar to the NUM_TIMESTEPS issue referenced in #496), although the referenced paper states "We used a batch size of 128" (https://arxiv.org/pdf/1602.02410.pdf).
It's very difficult to manipulate the graph definition directly to allow for batches greater than 1.
It would be great if the TF team could make available the actual graph source code that allowed for the referenced 128 batch size, or at least the source code for the published graph so we can implement this ourselves.
Even with the GPU, dumping word embeddings is quite slow; batching this up would deliver significant improvements.
https://github.com/tensorflow/models/blob/master/lm_1b/lm_1b_eval.py#L69
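To illustrate what having the graph source would enable: a hypothetical sketch of batch-flexible input placeholders, where the batch dimension is left as None instead of being baked in as 1. The names come from the published graph; the dtypes are assumptions, and this is only an outline of the idea, not a drop-in fix, since the recurrent state variables inside the published graph are also shaped for batch size 1 and would need to be rebuilt to match.

import tensorflow as tf  # TF 1.x API

NUM_TIMESTEPS = 1
MAX_WORD_LENGTH = 50

# Batch dimension left as None so any batch size can be fed at runtime.
inputs_in = tf.placeholder(tf.int32, shape=[None, NUM_TIMESTEPS], name='inputs_in')
char_inputs_in = tf.placeholder(tf.int32, shape=[None, NUM_TIMESTEPS, MAX_WORD_LENGTH],
                                name='char_inputs_in')
targets_in = tf.placeholder(tf.int32, shape=[None, NUM_TIMESTEPS], name='targets_in')
target_weights_in = tf.placeholder(tf.float32, shape=[None, NUM_TIMESTEPS],
                                   name='target_weights_in')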
@panyx0718 Any updates on this?
The eval script is designed to support only batch size 1.
Feel free to contribute a PR that supports larger batch sizes. Thanks!
@panyx0718 I think the issue extends beyond the eval script. For example, attempting to evaluate
softmax = self.sess.run(self.tf_layers['softmax_out'],
                        feed_dict={
                            self.tf_layers['char_inputs_in']: char_ids_inputs,
                            self.tf_layers['inputs_in']: inputs,
                            self.tf_layers['targets_in']: targets,
                            self.tf_layers['target_weights_in']: weights})
where char_ids_inputs has dimension [BATCH_SIZE x NUM_STEPS x MAX_WORD_LENGTH] = [2 x 1 x 50] and inputs has dimension [BATCH_SIZE x NUM_STEPS] = [2 x 1], yields the following error:
ValueError: Cannot feed value of shape (2, 1, 50) for Tensor u'char_inputs_in:0', which has shape '(1, 1, 50)'
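Until the graph supports larger batches, one possible workaround is a per-example loop around the batch-1 graph. Below is a hedged sketch: a hypothetical method on the same wrapper class as the snippet above that splits the batch, runs each example through the existing feeds, and stacks the softmax outputs. Note that the lm_1b graph keeps recurrent state in variables, so interleaving independent sequences this way only makes sense if that state is reset between sequences or does not matter for the quantities being extracted.

import numpy as np

def run_softmax_batched(self, char_ids_inputs, inputs, targets, weights):
    """char_ids_inputs: [batch, 1, 50]; inputs/targets/weights: [batch, 1]."""
    outputs = []
    for i in range(inputs.shape[0]):
        # Slice out one example, keeping the leading batch dimension of 1
        # so the feed matches the graph's (1, ...) placeholder shapes.
        softmax_i = self.sess.run(
            self.tf_layers['softmax_out'],
            feed_dict={
                self.tf_layers['char_inputs_in']: char_ids_inputs[i:i + 1],
                self.tf_layers['inputs_in']: inputs[i:i + 1],
                self.tf_layers['targets_in']: targets[i:i + 1],
                self.tf_layers['target_weights_in']: weights[i:i + 1]})
        outputs.append(softmax_i)
    return np.concatenate(outputs, axis=0)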