Flair: Error when fine-tuning BERT model for classification

Created on 14 Mar 2019 · 7 Comments · Source: flairNLP/flair

When I run the following commands in a Jupyter notebook on Windows 7 to fine-tune BERT for sentence classification on the existing MRPC dataset, I hit these problems:

1) The export command is not recognized.
2) python run_classifier.py gives a syntax error. (run run_classifier.py does work, but the flags are not assigned.)
3) What are the indentation rules for running the command below?

export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12
export GLUE_DIR=/path/to/glue

python run_classifier.py \
--task_name=MRPC \
--do_train=true \
--do_eval=true \
--data_dir=$GLUE_DIR/MRPC \
--vocab_file=$BERT_BASE_DIR/vocab.txt \
--bert_config_file=$BERT_BASE_DIR/bert_config.json \
--init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
--max_seq_length=128 \
--train_batch_size=32 \
--learning_rate=2e-5 \
--num_train_epochs=3.0 \
--output_dir=/tmp/mrpc_output/

question wontfix


sharathyadav1993


All 7 comments

I think this issue is more related to pytorch-pretrained-BERT, but from the given example you should use the following:

!export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12
!export GLUE_DIR=/path/to/glue

Just put "!" before the shell commands. Your notebook will then interpret them as shell commands. Otherwise the Python interpreter is used, which causes the invalid syntax error :)

stefan-it on 15 Mar 2019
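
One caveat on the "!" approach: in IPython/Jupyter each "!" line runs in its own short-lived shell, so a variable set with !export (or !set on Windows) does not survive to the next line. A minimal sketch of one alternative, assuming the two paths from the question are placeholders to be replaced: set the variables on the notebook kernel itself through os.environ, which works the same way on Windows and Linux, and let child processes inherit them.

```python
import os
import subprocess
import sys

# Placeholder paths copied from the question; replace them with real directories.
os.environ["BERT_BASE_DIR"] = "/path/to/bert/uncased_L-12_H-768_A-12"
os.environ["GLUE_DIR"] = "/path/to/glue"

# Any process started from this kernel inherits the variables set above.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['GLUE_DIR'])"],
    capture_output=True,
    text=True,
    check=True,
)
seen_by_child = child.stdout.strip()
print(seen_by_child)
```

The child process here is only a stand-in that echoes the variable back; a training script launched the same way would see the same environment.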

Hi Stefan,

Thank you for the response. As you suggested, I put "!" before the commands, as shown below, but I still got these errors.
I am following the procedure mentioned in https://github.com/google-research/bert#sentence-and-sentence-pair-classification-tasks

!set BERT_BASE_DIR=C:/Users/SHemantharaj/Desktop/NLP_Experiment/Google_BERT_New/bert-master/BERT_BASE_DIR/uncased_L-12_H-768_A-12
!set GLUE_DIR=C:/Users/SHemantharaj/Desktop/NLP_Experiment/Google_BERT_New/bert-master/GLUE_DIR
!python run_classifier.py
!--task_name=MRPC \
!--do_train=True \
!--do_eval=False \
!--data_dir=$GLUE_DIR/MRPC
!--vocab_file=$BERT_BASE_DIR/vocab.txt \
!--bert_config_file=$BERT_BASE_DIR/bert_config.json \
!--init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
!--max_seq_length=128 \
!--train_batch_size=32 \
!--learning_rate=2e-5 \
!--num_train_epochs=3.0 \
!--output_dir=/tmp/mrpc_output/

These are the errors:
C:\Users\SHemantharaj\AppData\Local\Continuum\anaconda3_old_version_Final\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "C:\Users\SHemantharaj\AppData\Local\Continuum\anaconda3_old_version_Final\lib\site-packages\absl\flags\_flagvalues.py", line 527, in _assert_validators
    validator.verify(self)
  File "C:\Users\SHemantharaj\AppData\Local\Continuum\anaconda3_old_version_Final\lib\site-packages\absl\flags\_validators.py", line 81, in verify
    raise _exceptions.ValidationError(self.message)
absl.flags._exceptions.ValidationError: Flag --data_dir must be specified.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_classifier.py", line 981, in <module>
    tf.app.run()
  File "C:\Users\SHemantharaj\AppData\Local\Continuum\anaconda3_old_version_Final\lib\site-packages\tensorflow\python\platform\app.py", line 119, in run
    argv = flags.FLAGS(_sys.argv if argv is None else argv, known_only=True)
  File "C:\Users\SHemantharaj\AppData\Local\Continuum\anaconda3_old_version_Final\lib\site-packages\tensorflow\python\platform\flags.py", line 112, in __call__
    return self.__dict__['__wrapped'].__call__(*args, **kwargs)
  File "C:\Users\SHemantharaj\AppData\Local\Continuum\anaconda3_old_version_Final\lib\site-packages\absl\flags\_flagvalues.py", line 635, in __call__
    self._assert_all_validators()
  File "C:\Users\SHemantharaj\AppData\Local\Continuum\anaconda3_old_version_Final\lib\site-packages\absl\flags\_flagvalues.py", line 509, in _assert_all_validators
    self._assert_validators(all_validators)
  File "C:\Users\SHemantharaj\AppData\Local\Continuum\anaconda3_old_version_Final\lib\site-packages\absl\flags\_flagvalues.py", line 530, in _assert_validators
    raise _exceptions.IllegalFlagValueError('%s: %s' % (message, str(e)))
absl.flags._exceptions.IllegalFlagValueError: flag --data_dir=None: Flag --data_dir must be specified.
'--task_name' is not recognized as an internal or external command,
operable program or batch file.
'--vocab_file' is not recognized as an internal or external command,
operable program or batch file.

sharathyadav1993 on 15 Mar 2019
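
Reading the log above: every "!" line is handed to the shell as a separate command, so python run_classifier.py ran with no flags at all (hence "Flag --data_dir must be specified"), and lines such as !--task_name=MRPC were executed as commands of their own (hence "'--task_name' is not recognized"). In addition, Windows cmd uses %VAR% rather than $VAR and ^ rather than \ for line continuation. A minimal sketch that sidesteps all of this by building the call as one Python argument list; build_mrpc_command is a hypothetical helper, the flag names are the ones from the question, and the directories are placeholders:

```python
import os
import subprocess
import sys


def build_mrpc_command(bert_base_dir, glue_dir, output_dir):
    """Return the run_classifier.py call as one argument list (hypothetical helper)."""
    return [
        sys.executable,
        "run_classifier.py",
        "--task_name=MRPC",
        "--do_train=true",
        "--do_eval=true",
        "--data_dir=" + os.path.join(glue_dir, "MRPC"),
        "--vocab_file=" + os.path.join(bert_base_dir, "vocab.txt"),
        "--bert_config_file=" + os.path.join(bert_base_dir, "bert_config.json"),
        "--init_checkpoint=" + os.path.join(bert_base_dir, "bert_model.ckpt"),
        "--max_seq_length=128",
        "--train_batch_size=32",
        "--learning_rate=2e-5",
        "--num_train_epochs=3.0",
        "--output_dir=" + output_dir,
    ]


# Placeholder directories; substitute the real locations.
cmd = build_mrpc_command(
    bert_base_dir="/path/to/bert/uncased_L-12_H-768_A-12",
    glue_dir="/path/to/glue",
    output_dir="/tmp/mrpc_output/",
)

# Only launch training when the script is actually in the working directory.
if os.path.exists("run_classifier.py"):
    subprocess.run(cmd, check=True)
```

Because the arguments are passed as a list, no shell is involved, so there is nothing to expand, continue, or quote.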

@sharathyadav1993 just put ! before exports and python.

erip on 24 Mar 2019

"just put ! before exports and python" even after doing these, I am getting the following error:
'export' is not recognized as an internal or external command,
operable program or batch file.
'export' is not recognized as an internal or external command,
operable program or batch file.

How can I fix this?

smv1796 on 1 Nov 2019

Learn how to set environment variables in Windows.

erip on 1 Nov 2019

I got a similar error while training the ALBERT model. The main cause was that the parameters were not passed correctly to the .py file.

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/absl/flags/_flag.py", line 181, in _parse
    return self.parser.parse(argument)
  File "/usr/local/lib/python3.6/dist-packages/absl/flags/_argument_parser.py", line 286, in parse
    raise ValueError('Non-boolean argument to boolean flag', argument)
ValueError: ('Non-boolean argument to boolean flag', '/content/drive/My Drive/google-research-master/albert')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_pretraining.py", line 568, in <module>
    tf.compat.v1.app.run()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 293, in run
    flags_parser,
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 362, in _run_init
    flags_parser=flags_parser,
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 212, in _register_and_parse_flags_with_usage
    args_to_main = flags_parser(original_argv)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/platform/app.py", line 31, in _parse_flags_tolerate_undef
    return flags.FLAGS(_sys.argv if argv is None else argv, known_only=True)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/platform/flags.py", line 112, in __call__
    return self.__dict__['__wrapped'].__call__(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/absl/flags/_flagvalues.py", line 626, in __call__
    unknown_flags, unparsed_args = self._parse_args(args, known_only)
  File "/usr/local/lib/python3.6/dist-packages/absl/flags/_flagvalues.py", line 774, in _parse_args
    flag.parse(value)
  File "/usr/local/lib/python3.6/dist-packages/absl/flags/_flag.py", line 166, in parse
    self.value = self._parse(argument)
  File "/usr/local/lib/python3.6/dist-packages/absl/flags/_flag.py", line 184, in _parse
    'flag --%s=%s: %s' % (self.name, argument, e))
absl.flags._exceptions.IllegalFlagValueError: flag --do_eval=/content/drive/My Drive/google-research-master/albert: ('Non-boolean argument to boolean flag', '/content/drive/My Drive/google-research-master/albert')

It worked when I executed it as below:
!python run_pretraining.py --output_dir="/content/drive/My Drive/google-research-master/Albert_Output/" --export_dir="/content/drive/My Drive/google-research-master/Albert_Output/" --do_train=True --do_eval=True --input_file=False --albert_config_file=False

IsinghGitHub on 17 Nov 2019
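
The traceback does not show how the failing command was written, so the following is an illustration of why the quotes in the working command matter, not a diagnosis. It uses shlex.split from the Python standard library, which follows POSIX shell word-splitting rules; that this matches the shell behind "!" in the commenter's environment is an assumption based on the /content/drive paths.

```python
import shlex

# The same two flags, with and without quotes around the path that contains a space.
quoted = '--output_dir="/content/drive/My Drive/google-research-master/Albert_Output/" --do_train=True'
unquoted = '--output_dir=/content/drive/My Drive/google-research-master/Albert_Output/ --do_train=True'

quoted_args = shlex.split(quoted)
unquoted_args = shlex.split(unquoted)

# Quoted: the path stays inside a single --output_dir argument.
print(quoted_args)
# Unquoted: the path is cut at the space, leaving a stray extra argument.
print(unquoted_args)
```

With the quotes, the script receives two arguments; without them it receives three, and the flag parser has to guess what the stray middle piece belongs to.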


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.