BERT: Getting all negative predictions when fine-tuning on my data

Created on 21 Nov 2018 · 14 Comments · Source: google-research/bert

Hi, I'm following the fine-tuning code on my own dataset, which is a sentence-pair classification task. All the parameters are the same as in the example code. However, I get all negative predictions when running the evaluation. Any idea what happened?

The code:

export BERT_BASE_DIR=/home/fy/uncased_L-12_H-768_A-12
export GLUE_DIR=/home/fy/glue_data
export TRAINED_CLASSIFIER=/tmp/ml_output/

python run_concept_classifier.py \
  --task_name=MRPC \
  --do_eval=true \
  --data_dir=$GLUE_DIR/ml_concept \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$TRAINED_CLASSIFIER \
  --max_seq_length=512 \
  --output_dir=/tmp/ml_output/

The evaluation result:
INFO:tensorflow: Eval results
INFO:tensorflow: eval_accuracy = 0.64309764
INFO:tensorflow: eval_fn = 106.0
INFO:tensorflow: eval_fp = 0.0
INFO:tensorflow: eval_loss = 1.2086438
INFO:tensorflow: eval_precision = 0.0
INFO:tensorflow: eval_recall = 0.0
INFO:tensorflow: eval_tn = 191.0
INFO:tensorflow: eval_tp = 0.0
INFO:tensorflow: global_step = 83
INFO:tensorflow: loss = 1.1816607
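
The counts in this log already account for the metrics: with eval_tp = 0 and eval_fp = 0 the model never predicts the positive class, so the accuracy is simply the share of negative examples, 191 / 297. A minimal sketch of that arithmetic follows; the function name `summarize` is made up for illustration and is not part of the BERT code.

```python
def summarize(tp, fp, tn, fn):
    """Derive accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard against division by zero when nothing is predicted positive
    # or when there are no positive examples at all.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Counts taken from the evaluation log above.
accuracy, precision, recall = summarize(tp=0, fp=0, tn=191, fn=106)
print(accuracy, precision, recall)
```

An accuracy equal to the majority-class share, together with zero precision and recall, is a typical sign that the classifier has collapsed onto a single class.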

yanfan0531

All 14 comments

Do you shuffle your data?

monanahe on 21 Nov 2018

Thanks for reminding. I forgot to shuffle the data. I'll update the result once the model's trained on the new dataset. Thanks again!

yanfan0531 on 21 Nov 2018

The model performs perfectly after shuffling the dataset. Thanks! I am closing it.

INFO:tensorflow: eval_accuracy = 0.97306395
INFO:tensorflow: eval_fn = 5.0
INFO:tensorflow: eval_fp = 3.0
INFO:tensorflow: eval_loss = 0.13963467
INFO:tensorflow: eval_precision = 0.97115386
INFO:tensorflow: eval_recall = 0.9528302
INFO:tensorflow: eval_tn = 188.0
INFO:tensorflow: eval_tp = 101.0
INFO:tensorflow: global_step = 83
INFO:tensorflow: loss = 0.13657574

yanfan0531 on 21 Nov 2018

Even after shuffling, my model predicts only the positive class, i.e. the second column always has the higher probability. My dataset consists of question and answer pairs with their labels: the label is 1 if the question and answer match, else 0.

Am I doing anything wrong here? I have tried different sequence lengths.

tiru1930 on 12 Dec 2018

@yanfan0531 Does shuffling the dataset mean that the examples of a class should not be in serial order, i.e. all class 0 examples followed by all class 1 examples and so on? Instead, should train.tsv have the examples mixed, e.g. a few of class 0, then a few of class 1, then a few of class 0 again? My BERT classifier always predicts one particular class for all examples in the test dataset, and my training examples are in serial order, not mixed.

anubhavpnp on 4 May 2019
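
One quick way to check for the serial ordering described in the comment above is to look at the run lengths of consecutive identical labels. This is only a sketch and not part of the BERT repository: `label_runs` is a hypothetical helper, and the labels are assumed to have been read from train.tsv into a list already.

```python
from itertools import groupby


def label_runs(labels):
    """Return (label, run_length) for each run of consecutive identical labels."""
    return [(label, sum(1 for _ in group)) for label, group in groupby(labels)]


# Serially ordered data collapses into one long run per class ...
print(label_runs(["0", "0", "0", "1", "1", "1"]))
# ... whereas mixed data produces many short runs.
print(label_runs(["0", "1", "1", "0", "1", "0"]))
```

If the result has about as many runs as there are classes, the file is sorted by label and probably needs shuffling before training.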

@anubhavpnp Hi, you are right about the shuffling. You should try mixing the order of labels in the training data; that solved my problem.

yanfan0531 on 4 May 2019
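
The shuffling discussed here can be done once, offline, before training. The sketch below assumes a line-oriented TSV file such as train.tsv, optionally with a header row; `shuffle_tsv`, its parameters and the output file name are illustrative choices, not something taken from the BERT code.

```python
import random


def shuffle_tsv(in_path, out_path, has_header=True, seed=42):
    """Write a copy of a TSV file with its data rows in random order."""
    with open(in_path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    # Keep the header (if any) in place and shuffle only the data rows.
    header, rows = (lines[:1], lines[1:]) if has_header else ([], lines)
    random.Random(seed).shuffle(rows)  # fixed seed keeps the result reproducible
    with open(out_path, "w", encoding="utf-8") as f:
        f.write("\n".join(header + rows) + "\n")


# Example call (paths are placeholders):
# shuffle_tsv("train.tsv", "train_shuffled.tsv")
```

Writing to a separate file keeps the original ordering available for comparison; whether the file has a header depends on the dataset, so `has_header` should be set accordingly.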

Thanks for your help, @yanfan0531. Shuffling solved my issue and BERT is now giving correct predictions. I read a bit more, and shuffling is usually a prerequisite for training neural networks.

anubhavpnp on 5 May 2019

How do we get probabilities in the range [0.0, 1.0]? Do we need to add an additional softmax as per #322?

adrianog on 18 Jun 2019

I did not add any additional softmax layer. My BERT prediction classifier is working fine, @adrianog.

anubhavpnp on 18 Jun 2019

Are the probabilities returned for each class in the range [0.0,1.0]? Not even the example colab returns normalised probabilities.

adrianog on 18 Jun 2019

Yes, they are between 0 and 1.

anubhavpnp on 18 Jun 2019

Thanks for confirming. Is your code the same as in the example colab?

adrianog on 18 Jun 2019

I am using the code here: https://github.com/winstarwang/rasa_nlu_bert

anubhavpnp on 18 Jun 2019

I don't see any difference from the official Google BERT model function, so I'm not sure why the probabilities are not in the [0, 1] range in the example colab, or in my code.

Relevant bits:

adrianog on 18 Jun 2019
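
If the returned values fall outside [0, 1], one possibility is that they are raw logits or log-probabilities rather than normalised probabilities; the thread does not confirm which. Under that assumption, a minimal sketch of the conversion in plain Python looks like this (`softmax` and `from_log_probs` are illustrative names, not functions from the BERT code):

```python
import math


def softmax(logits):
    """Turn raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def from_log_probs(log_probs):
    """Turn log-probabilities (e.g. log-softmax output) into probabilities."""
    return [math.exp(lp) for lp in log_probs]


# Made-up example values, one entry per class.
print(softmax([2.0, -1.0]))
print(from_log_probs([math.log(0.25), math.log(0.75)]))
```

Log-probabilities are always less than or equal to zero, so consistently negative "probabilities" would point to the second case.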
