The bug happens on the latest version, 0.9.0:
Traceback (most recent call last):
File "<my_local_path>/bin/fairseq-eval-lm", line 33, in <module>
sys.exit(load_entry_point('fairseq==0.9.0', 'console_scripts', 'fairseq-eval-lm')())
File "<my_local_path>/lib/python3.7/site-packages/fairseq_cli/eval_lm.py", line 223, in cli_main
main(args)
File "<my_local_path>/lib/python3.7/site-packages/fairseq_cli/eval_lm.py", line 157, in main
if args.add_bos_token:
AttributeError: 'Namespace' object has no attribute 'add_bos_token'
There is no way to pass the 'add_bos_token' argument from the command line, and 'add_bos_token' is not mentioned in the docs.
Adding '--context-window' on the command line removes the "no attribute 'add_bos_token'" error, but it triggers another AssertionError, in lm_context_window_dataset.py:
Traceback (most recent call last):
File "<my_local_path>/bin/fairseq-eval-lm", line 33, in <module>
sys.exit(load_entry_point('fairseq==0.9.0', 'console_scripts', 'fairseq-eval-lm')())
File "<my_local_path>/lib/python3.7/site-packages/fairseq_cli/eval_lm.py", line 223, in cli_main
main(args)
File "<my_local_path>/lib/python3.7/site-packages/fairseq_cli/eval_lm.py", line 84, in main
pad_idx=task.source_dictionary.pad(),
File "<my_local_path>/lib/python3.7/site-packages/fairseq/data/lm_context_window_dataset.py", line 18, in __init__
assert isinstance(dataset, MonolingualDataset)
AssertionError
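For context: the assertion is a type guard. LMContextWindowDataset only wraps the MonolingualDataset produced by the language_modeling task, while the masked_lm task builds a different dataset class (as this very traceback shows), so '--context-window' cannot work together with '--task masked_lm'. A rough paraphrase of the guard, not the exact 0.9.0 source:

from fairseq.data import MonolingualDataset

class LMContextWindowDataset:
    def __init__(self, dataset, tokens_per_sample, context_window, pad_idx):
        # Only the language_modeling task yields a MonolingualDataset;
        # datasets built by other tasks (e.g. masked_lm) fail this check.
        assert isinstance(dataset, MonolingualDataset)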
When running the "validate.py" script from the master branch (as follows), it throws a different "Unable to infer Criterion arguments" error:
python fairseq/fairseq_cli/validate.py data-bin/<mydata> --path <my_roberta_checkpoint_path>/model.pt --task masked_lm --max-tokens 128
The error message:
Traceback (most recent call last):
File ".../fairseq/fairseq_cli/validate.py", line 132, in <module>
cli_main()
File ".../fairseq/fairseq_cli/validate.py", line 128, in cli_main
distributed_utils.call_main(args, main, override_args=override_args)
File ".../fairseq/fairseq/distributed_utils.py", line 189, in call_main
main(args, **kwargs)
File ".../fairseq/fairseq_cli/validate.py", line 65, in main
criterion = task.build_criterion(model_args)
File ".../fairseq/fairseq/tasks/fairseq_task.py", line 267, in build_criterion
return criterions.build_criterion(args, self)
File ".../fairseq/fairseq/registry.py", line 44, in build_x
return builder(args, *extra_args, **extra_kwargs)
File ".../fairseq/fairseq/criterions/fairseq_criterion.py", line 56, in build_criterion
'{}.build_criterion'.format(cls.__name__)
NotImplementedError: Unable to infer Criterion arguments, please implement MaskedLmLoss.build_criterion
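For reference, the error message is asking for an explicit constructor hook on the criterion class: the base class tries to infer the constructor's arguments from the checkpoint's saved args and raises NotImplementedError when it cannot. A minimal sketch of what such a hook could look like, assuming a task-only constructor; this is an illustration, not the official patch:

from fairseq.criterions import FairseqCriterion

class MaskedLmLoss(FairseqCriterion):
    @classmethod
    def build_criterion(cls, args, task):
        # Build the criterion explicitly instead of letting the base class
        # infer constructor arguments from the checkpoint's saved args
        # (that inference is what raises the NotImplementedError above).
        return cls(task)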
Steps to reproduce the behavior (always include the command you ran):
The command I ran is as follows:
fairseq-eval-lm data-bin/<mydata> --path /<my_model_checkpoint>/model.pt --max-sentences 2 --tokens-per-sample 128 --task masked_lm --criterion masked_lm
The parameters are the same as in my training setup. This is a RoBERTa model trained on the 'masked_lm' task with 'fairseq-train'; the data was preprocessed with 'fairseq-preprocess' and encoded with 'fastBPE' (i.e., --bpe=fastbpe). The test data was likewise pre-encoded with fastBPE and preprocessed.
There is no issue when retraining a model from this checkpoint on the same 'fairseq-preprocess'-ed data.
Expected behavior: the command should return the perplexity score on the test data in data-bin/.
fairseq was installed with pip. This is related to issue #1324.
RoBERTa doesn't work with fairseq-eval-lm. That script is only meant for left-to-right language models, such as the ones here: https://github.com/pytorch/fairseq/tree/master/examples/language_model
If you just want to measure masked language modeling perplexity, you can use fairseq-validate as described here: https://github.com/pytorch/fairseq/issues/1324#issuecomment-566262347
As I reported above, fairseq-validate raises the "Unable to infer Criterion arguments" error. I had to manually modify 'eval_lm.py' (line 179) from the master branch to fix the "'Namespace' object has no attribute 'add_bos_token'" error in order to make it work. It is a bug that the current 'fairseq-eval-lm' tool expects an 'add_bos_token' argument when there is no way to pass it from the command line.
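Roughly, the modification amounts to guarding the attribute lookup with a default; a sketch of the idea (not my exact diff), based on the 'if args.add_bos_token:' line from the traceback above:

# Before (fails when the checkpoint's saved args lack the attribute):
if args.add_bos_token:
    ...

# After (hypothetical workaround: treat a missing attribute as False):
if getattr(args, 'add_bos_token', False):
    ...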
Ah, sorry, I missed the error with fairseq-validate, thanks for clarifying.
This should be fixed. I still recommend using fairseq-validate rather than fairseq-eval-lm, since the latter is really only intended for left-to-right generative models and may not produce correct results.
Many thanks!