Environment:
DCA-LAPO-MAC:git lapolonio$ source ~/projects/Research/tensorflow/bin/activate
(tensorflow) DCA-LAPO-MAC:git lapolonio$ python --version
Python 2.7.10
(tensorflow) DCA-LAPO-MAC:git lapolonio$ pip show tensor2tensor
Name: tensor2tensor
Version: 1.2.4
(tensorflow) DCA-LAPO-MAC:git lapolonio$ pip show tensorflow
Name: tensorflow
Version: 1.3.0
Commands:
PROBLEM=languagemodel_ptb10k
MODEL=attention_lm
HPARAMS=attention_lm_small
DATA_DIR=$HOME/t2t_data
TMP_DIR=/tmp/t2t_datagen
TRAIN_DIR=$HOME/t2t_train/$PROBLEM/$MODEL-$HPARAMS
mkdir -p $DATA_DIR $TMP_DIR $TRAIN_DIR
# Generate data
t2t-datagen \
--data_dir=$DATA_DIR \
--tmp_dir=$TMP_DIR \
--problem=$PROBLEM
# Train
# * If you run out of memory, add --hparams='batch_size=1024'.
t2t-trainer \
--data_dir=$DATA_DIR \
--problems=$PROBLEM \
--model=$MODEL \
--hparams_set=$HPARAMS \
--output_dir=$TRAIN_DIR
# Decode
DECODE_FILE=$DATA_DIR/decode_this.txt
echo "where are you" >> $DECODE_FILE
t2t-decoder \
--data_dir=$DATA_DIR \
--problems=$PROBLEM \
--model=$MODEL \
--hparams_set=$HPARAMS \
--output_dir=$TRAIN_DIR \
--decode_from_file=$DECODE_FILE
The error I get is:
ValueError: Dimension 0 in both shapes must be equal, but are 4 and 5 for 'Slice' (op: 'Slice') with input shapes: [?,?,1,1,1], [4], [5].
How do I use a checkpoint and a seed input to generate output? In https://github.com/tensorflow/tensor2tensor/issues/86 I saw t2t-trainer used with eval_steps to generate output. Is that the only way?
I ran into a similar problem. It seems that attention_lm is just the decoder part of the Transformer, so it expects sentences in features['targets']. When you decode, whatever you supply, either interactively or from a file, is taken as features['inputs'], but the PTB problem assumes there is no features['inputs'] and that sentences are in features['targets']. That mismatch is the problem here. I am trying to solve it now and will update here if I make progress. In any case, there are a lot of such 'bugs' in tensor2tensor right now.
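To make the mismatch described above concrete, here is a minimal sketch in plain Python. It does not use tensor2tensor at all: build_decode_features and lm_model_reads are hypothetical names made up for illustration, and the dictionaries only mimic the features['inputs'] / features['targets'] keys mentioned in this thread.

```python
def build_decode_features(text, has_inputs):
    """Hypothetical helper: put the seed text under the key a model would read."""
    tokens = text.split()
    if has_inputs:
        # Seq2seq-style problem: the seed text is treated as the source sentence.
        return {"inputs": tokens}
    # Language-model-style problem: there is no source side, so the seed text
    # has to be treated as a prefix of the targets.
    return {"targets": tokens}


def lm_model_reads(features):
    """Toy stand-in for a decoder-only model that only looks at 'targets'."""
    return features.get("targets", [])


seed = "where are you"
wrong = build_decode_features(seed, has_inputs=True)   # what the decoder path does
right = build_decode_features(seed, has_inputs=False)  # what the LM problem expects

print(lm_model_reads(wrong))  # []
print(lm_model_reads(right))  # ['where', 'are', 'you']
```

In this toy version the model simply sees nothing when the seed lands under 'inputs'. The real library may fail differently (for example with a shape error), so treat this only as an illustration of the key mismatch, not of the actual code path.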
@renqianluo Is there anything I can do to help? Do you have the solution documented somewhere?
I believe this has been fixed. Please let us know if that's not the case.
I did the same thing and got decode results like this:
INFO:tensorflow:Inference results INPUT: but while the new york stock exchange did n't fall apart friday as the dow jones industrial average plunged N points most of it in the final hour it barely managed to stay this side of chaos
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: the N stock specialist firms on the big board floor the buyers and sellers of last resort who were criticized after the N crash once again could n't handle the selling pressure
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: once again the specialists were not able to handle the imbalances on the floor of the new york stock exchange said christopher
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: some circuit breakers installed after the october N crash failed their first test traders say unable to cool the selling panic in both stocks and futures
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: big investment banks refused to step up to the plate to support the beleaguered floor traders by buying big blocks of stock traders say
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: seven big board stocks ual amr bankamerica walt disney capital cities/abc philip morris and pacific telesis group stopped trading and never resumed
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: heavy selling of standard & poor 's 500-stock index futures in chicago
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: no it was n't black monday
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: the equity market was
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: the
INFO:tensorflow:Inference results OUTPUT:
My TensorFlow version is 1.5 and my t2t version is 1.4.3.
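If you want to check a longer decode log for this symptom, a small script along these lines might help. It is only a sketch: it assumes the log lines look exactly like the "Inference results INPUT:" / "Inference results OUTPUT:" lines pasted above, and pair_results and count_empty_outputs are names invented here, not part of tensor2tensor.

```python
import re

# Matches the log lines shown in this thread; the format is assumed from the
# pasted output, not taken from tensor2tensor documentation.
LINE_RE = re.compile(r"^INFO:tensorflow:Inference results (INPUT|OUTPUT):\s?(.*)$")


def pair_results(log_lines):
    """Pair each INPUT line with the OUTPUT line that follows it."""
    pairs = []
    pending_input = None
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip unrelated log lines
        kind, text = match.group(1), match.group(2).strip()
        if kind == "INPUT":
            pending_input = text
        elif pending_input is not None:
            pairs.append((pending_input, text))
            pending_input = None
    return pairs


def count_empty_outputs(log_lines):
    """Count how many decoded outputs are empty strings."""
    return sum(1 for _, output in pair_results(log_lines) if not output)


sample = [
    "INFO:tensorflow:Inference results INPUT: no it was n't black monday",
    "INFO:tensorflow:Inference results OUTPUT:",
    "INFO:tensorflow:Inference results INPUT: the",
    "INFO:tensorflow:Inference results OUTPUT:",
]
print(count_empty_outputs(sample))  # 2
```

To run it on a saved log, pass the lines of the file (for example open(path).read().splitlines(), where path is wherever you redirected the decoder output) instead of sample.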
@renqianluo I am having the same issue. Were you able to fix it?