Models: lm_1b BATCH_SIZE > 1

Created on 20 May 2017 · 3 comments · Source: tensorflow/models

System information

Note: the eval script was run with minor modifications for Python 3 compatibility.

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
    Linux CentOS 7

  • TensorFlow installed from (source or binary):
    binary

  • TensorFlow version (use command below):
    v1.0.0-65-g4763edf-dirty 1.0.1

  • Bazel version (if compiling from source):
    N/A

  • CUDA/cuDNN version:
    8.0.61

  • GPU model and memory:
    GeForce GTX 1080

  • Exact command to reproduce:

python3.5 lm_1b_eval.py --mode eval --pbtxt data/graph-2016-09-10.pbtxt --vocab_file data/vocab-2016-09-10.txt --input_data ../data/news.en.heldout-00000-of-00050 --ckpt "data/ckpt-*"

Describe the problem

The repository code contains a BATCH_SIZE variable set to 1. Modifying this to 10 raises the following error:

"ValueError: Cannot feed value of shape (10, 1) for Tensor 'inputs_in:0', which has shape '(1, 1)'"

I looked through the graph pbtxt itself, and its placeholder shapes are clearly fixed to a batch size of 1 (similar to the NUM_TIMESTEPS issue referenced in #496), even though the referenced paper states "We used a batch size of 128" (https://arxiv.org/pdf/1602.02410.pdf).
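
For reference, here is roughly how I checked; a minimal sketch (my own, not repo code) that parses the published pbtxt and prints the placeholder shapes, assuming the data/graph-2016-09-10.pbtxt file from the download instructions:

import tensorflow as tf
from google.protobuf import text_format

# Parse the text-format GraphDef and list every placeholder's fixed shape.
with open('data/graph-2016-09-10.pbtxt') as f:
    graph_def = text_format.Parse(f.read(), tf.GraphDef())

for node in graph_def.node:
    if node.op == 'Placeholder':
        dims = [d.size for d in node.attr['shape'].shape.dim]
        print(node.name, dims)

# inputs_in comes out as [1, 1] and char_inputs_in as [1, 1, 50],
# consistent with the ValueError above.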

Manipulating the graph definition directly to allow batches larger than 1 is very difficult.
It would be great if the TF team could release the graph-construction code that supported the paper's batch size of 128, or at least the source for the published graph, so we can implement batching ourselves.

Even with the GPU, dumping word embeddings is quite slow; batching this up would deliver significant improvements.
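
The only workaround I have today is to wrap the session call and feed one example at a time; a rough sketch of my own helper (names hypothetical, not repo code), which keeps a batched interface but recovers none of the speedup that real batching would give:

import numpy as np

def run_batched(sess, fetch, feeds):
    # feeds maps each fixed-shape placeholder to an array whose first axis
    # is the batch; each row is fed on its own and the per-row outputs are
    # concatenated back along the batch axis.
    batch_size = next(iter(feeds.values())).shape[0]
    outputs = []
    for i in range(batch_size):
        single = {t: v[i:i + 1] for t, v in feeds.items()}  # keeps a leading dim of 1
        outputs.append(sess.run(fetch, feed_dict=single))
    return np.concatenate(outputs, axis=0)

This also ignores any recurrent state carried between calls, so it is really only suitable for stateless uses such as dumping embeddings.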

Source code / logs

https://github.com/tensorflow/models/blob/master/lm_1b/lm_1b_eval.py#L69

All 3 comments

@panyx0718 Any updates on this?

The eval script is designed to support only a batch size of 1.
Feel free to contribute a PR that supports larger batch sizes. Thanks!

@panyx0718 I think the issue extends beyond the eval script. For example, attempting to evaluate

softmax = self.sess.run(self.tf_layers['softmax_out'],
                        feed_dict={
                            self.tf_layers['char_inputs_in']: char_ids_inputs,
                            self.tf_layers['inputs_in']: inputs,
                            self.tf_layers['targets_in']: targets,
                            self.tf_layers['target_weights_in']: weights})

where char_ids_inputs has shape [BATCH_SIZE x NUM_STEPS x MAX_WORD_LENGTH] = [2 x 1 x 50] and inputs has shape [BATCH_SIZE x NUM_STEPS] = [2 x 1] yields the following error:

ValueError: Cannot feed value of shape (2, 1, 50) for Tensor u'char_inputs_in:0', which has shape '(1, 1, 50)'
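
For what it's worth, what we are really asking for is graph-construction code in which the batch dimension is left unconstrained. A sketch of the kind of placeholder definitions that would require (names taken from the published tensors, dtypes guessed):

import tensorflow as tf

NUM_STEPS = 1
MAX_WORD_LENGTH = 50

# Leaving the batch dimension as None would let any BATCH_SIZE be fed.
char_inputs_in = tf.placeholder(tf.int32, [None, NUM_STEPS, MAX_WORD_LENGTH], name='char_inputs_in')
inputs_in = tf.placeholder(tf.int32, [None, NUM_STEPS], name='inputs_in')
targets_in = tf.placeholder(tf.int32, [None, NUM_STEPS], name='targets_in')
target_weights_in = tf.placeholder(tf.float32, [None, NUM_STEPS], name='target_weights_in')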