Bert: IndexError in run_classifier.py::MrpcProcessor::_create_examples (2)

Created on 4 Apr 2019 · 3 Comments · Source: google-research/bert

What: "IndexError: list index out of range"

Location: MrpcProcessor::_create_examples function from run_classifier.py

Reason: Missing input validation when reading the lines: the empty line produced by an extra trailing newline is treated as a row with at least 5 elements.
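
To illustrate the reason above, here is a minimal sketch of how a blank line ends up as an empty row. It assumes the file is parsed with Python's `csv.reader` using a tab delimiter; the sample data is made up, and the snippet is written for Python 3 rather than the Python 2.7 shown in the traceback.

```python
import csv
import io

# Two well-formed 5-field rows followed by one extra blank line,
# mimicking a train.tsv that has an additional trailing newline.
# The contents are invented for illustration only.
data = (
    "1\tid1\tid2\tsentence a\tsentence b\n"
    "0\tid3\tid4\tsentence c\tsentence d\n"
    "\n"
)

rows = list(csv.reader(io.StringIO(data), delimiter="\t"))

# The blank line is parsed as an empty list, so it has no index 3.
last_row = rows[-1]
try:
    last_row[3]
except IndexError as err:
    print("IndexError:", err)
```

Any code that indexes `row[3]` or `row[4]` without first checking the row length will fail on that last entry.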

Steps to reproduce:
Corrupt the "glue_data/MRPC/train.tsv" file by adding an extra trailing newline character at the end of the file (see attached file).
Run:
python run_classifier.py \
--task_name=MRPC \
--do_train=true \
--do_eval=true \
--data_dir=$GLUE_DIR/MRPC \
--vocab_file=$BERT_BASE_DIR/vocab.txt \
--bert_config_file=$BERT_BASE_DIR/bert_config.json \
--init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
--max_seq_length=128 \
--train_batch_size=32 \
--learning_rate=2e-5 \
--num_train_epochs=3.0 \
--output_dir=/tmp/mrpc_output/

Traceback:
Traceback (most recent call last):
  File "bert/run_classifier.py", line 981, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "bert/run_classifier.py", line 842, in main
    train_examples = processor.get_train_examples(FLAGS.data_dir)
  File "bert/run_classifier.py", line 302, in get_train_examples
    self._read_tsv(os.path.join(data_dir, "train.tsv")), "train")
  File "bert/run_classifier.py", line 325, in _create_examples
    text_a = tokenization.convert_to_unicode(line[3])
IndexError: list index out of range

list_ioob2.zip
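
One possible guard against the failure in the traceback, sketched under assumptions: the helper name `usable_rows` and the `min_fields` parameter are hypothetical and not part of run_classifier.py. The idea is to skip blank rows and to fail with a descriptive message on rows that are non-empty but too short, before any field is indexed.

```python
def usable_rows(rows, min_fields=5):
    """Yield rows that have at least `min_fields` fields.

    Hypothetical helper, not part of the BERT repository.
    Blank rows (e.g. from an extra trailing newline) are skipped;
    non-blank rows that are too short raise a ValueError.
    """
    for row_number, row in enumerate(rows, start=1):
        # An empty list or a row of only whitespace counts as blank.
        if not any(field.strip() for field in row):
            continue
        if len(row) < min_fields:
            raise ValueError(
                "row %d has %d fields, expected at least %d"
                % (row_number, len(row), min_fields)
            )
        yield row


# Example: the empty middle row is dropped, the other two are kept.
sample = [
    ["1", "id1", "id2", "sentence a", "sentence b"],
    [],
    ["0", "id3", "id4", "sentence c", "sentence d"],
]
kept = list(usable_rows(sample))
print(len(kept))
```

Whether blank lines should be skipped silently or reported is a design choice; skipping keeps training running on slightly malformed files, while raising makes data problems visible early.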

All 3 comments

Same error here. I am also stuck on this; could anyone please help?

Supposedly, you can change the following lines to return a list whose length matches the number of classes in your particular case:

https://github.com/google-research/bert/blob/0fce551b55caabcfba52c61e18f34b541aef186a/run_classifier.py#L354-L356
