Tensor2tensor: How to use decode with attention_lm on ptb dataset?

Created on 3 Oct 2017 · 6 Comments · Source: tensorflow/tensor2tensor

Environment:

DCA-LAPO-MAC:git lapolonio$ source ~/projects/Research/tensorflow/bin/activate
(tensorflow) DCA-LAPO-MAC:git lapolonio$ python --version
Python 2.7.10
(tensorflow) DCA-LAPO-MAC:git lapolonio$ pip show tensor2tensor
Name: tensor2tensor
Version: 1.2.4
(tensorflow) DCA-LAPO-MAC:git lapolonio$ pip show tensorflow
Name: tensorflow
Version: 1.3.0

Commands:

PROBLEM=languagemodel_ptb10k
MODEL=attention_lm
HPARAMS=attention_lm_small

DATA_DIR=$HOME/t2t_data
TMP_DIR=/tmp/t2t_datagen
TRAIN_DIR=$HOME/t2t_train/$PROBLEM/$MODEL-$HPARAMS

mkdir -p $DATA_DIR $TMP_DIR $TRAIN_DIR

# Generate data
t2t-datagen \
  --data_dir=$DATA_DIR \
  --tmp_dir=$TMP_DIR \
  --problem=$PROBLEM

# Train
# *  If you run out of memory, add --hparams='batch_size=1024'.
t2t-trainer \
  --data_dir=$DATA_DIR \
  --problems=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --output_dir=$TRAIN_DIR

# Decode

DECODE_FILE=$DATA_DIR/decode_this.txt
echo "where are you" >> $DECODE_FILE

t2t-decoder \
  --data_dir=$DATA_DIR \
  --problems=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --output_dir=$TRAIN_DIR \
  --decode_from_file=$DECODE_FILE

The error I get is:

ValueError: Dimension 0 in both shapes must be equal, but are 4 and 5 for 'Slice' (op: 'Slice') with input shapes: [?,?,1,1,1], [4], [5].

How do I use a checkpoint and a seed input to generate output? In https://github.com/tensorflow/tensor2tensor/issues/86 I saw t2t-trainer used with eval_steps to generate output. Is that the only way?

lapolonio
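
For readers puzzling over the ValueError in the question: the three shapes listed are the inputs to the Slice op in order (input, begin, size). The input has rank 5, the begin vector has 4 entries, and the size vector has 5. tf.slice needs begin and size to have one entry per input dimension, so the 4-versus-5 mismatch is rejected when the graph is built. The snippet below is only a plain-Python mimic of that length check, not TensorFlow's code; the function name check_slice_args and the concrete begin/size values are made up for illustration.

```python
def check_slice_args(input_rank, begin, size):
    """Plain-Python mimic of the length check; not TensorFlow's own code."""
    if len(begin) != len(size):
        raise ValueError(
            "begin has %d entries but size has %d" % (len(begin), len(size)))
    if len(begin) != input_rank:
        raise ValueError(
            "begin/size have %d entries but the input has rank %d"
            % (len(begin), input_rank))
    return True


# Lengths taken from the reported error: rank-5 input, 4-entry begin,
# 5-entry size. The values inside the lists are illustrative assumptions.
try:
    check_slice_args(5, begin=[0, 0, 0, 0], size=[-1, -1, -1, -1, -1])
    mismatch = None
except ValueError as err:
    mismatch = str(err)

print(mismatch)
```

The check says nothing about which part of the decoding code built the 4-entry vector; it only shows why the op cannot be constructed.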

Most helpful comment

I ran into a similar problem. It seems that attention_lm is just the decoder part of the Transformer, so it expects sentences in features['targets']. When you decode, whatever you input, either interactively or from a file, is placed in features['inputs'], but the PTB problem assumes there is no features['inputs'] and that sentences are in 'targets'. That is the problem here. I am trying to solve it now and will post an update here if I make progress. Anyway, there are a lot of such 'bugs' in tensor2tensor right now.

renqianluo on 11 Oct 2017

👍4
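
The mismatch described in this comment can be pictured with plain dictionaries. This is a rough sketch of the idea only: the three functions are hypothetical stand-ins, not tensor2tensor APIs, and the real feature values are tensors of token ids rather than word lists.

```python
# Hypothetical stand-ins that mirror the comment's description.
# None of these functions exist in tensor2tensor.

def training_example(sentence):
    """A language-model problem keeps the whole sentence under 'targets'."""
    return {"targets": sentence.split()}


def decode_example(line):
    """File or interactive decoding places the user's text under 'inputs'."""
    return {"inputs": line.split()}


def has_what_lm_needs(features):
    """A decoder-only model, as described above, reads only 'targets'."""
    return "targets" in features


train_features = training_example("where are you")
decode_features = decode_example("where are you")

print(sorted(train_features))    # keys seen during training
print(sorted(decode_features))   # keys seen during decoding
print(has_what_lm_needs(train_features), has_what_lm_needs(decode_features))
```

Under this reading, the seed text would have to reach the model through 'targets' (or be mapped there by the decoding code) before a decoder-only model could continue it.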

All 6 comments

@renqianluo Is there anything I can do to help? Do you have the solution documented somewhere?

lapolonio on 26 Oct 2017

I believe this has been fixed. Please let us know if that's not the case.

rsepassi on 14 Nov 2017

I did the same thing and got decode results like this:

INFO:tensorflow:Inference results INPUT: but while the new york stock exchange did n't fall apart friday as the dow jones industrial average plunged N points most of it in the final hour it barely managed to stay this side of chaos
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: the N stock specialist firms on the big board floor the buyers and sellers of last resort who were criticized after the N crash once again could n't handle the selling pressure
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: once again the specialists were not able to handle the imbalances on the floor of the new york stock exchange said christopher
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: some circuit breakers installed after the october N crash failed their first test traders say unable to cool the selling panic in both stocks and futures
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: big investment banks refused to step up to the plate to support the beleaguered floor traders by buying big blocks of stock traders say
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: seven big board stocks ual amr bankamerica walt disney capital cities/abc philip morris and pacific telesis group stopped trading and never resumed
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: heavy selling of standard & poor 's 500-stock index futures in chicago
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: no it was n't black monday
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: the equity market was
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: the
INFO:tensorflow:Inference results OUTPUT: