Fairseq: UnicodeEncodeError: 'ascii' codec can't encode character '\xe4' in position 8: ordinal not in range(128)

Created on 18 Dec 2019  ·  3 Comments  ·  Source: pytorch/fairseq

❓ Questions and Help

What is your question?

When I run the following command, I get the error below.

fairseq-generate data-bin3/iwslt14.tokenized.de-en --path checkpoints2/transformer_iwslt_de_en/checkpoint_best.pt --batch-size 128 --beam 5 --remove-bpe

Traceback (most recent call last):
  File "/usr/local/python3/bin/fairseq-generate", line 8, in <module>
    sys.exit(cli_main())
  File "/usr/local/python3/lib/python3.6/site-packages/fairseq_cli/generate.py", line 203, in cli_main
    main(args)
  File "/usr/local/python3/lib/python3.6/site-packages/fairseq_cli/generate.py", line 135, in main
    print('S-{}\t{}'.format(sample_id, src_str))
UnicodeEncodeError: 'ascii' codec can't encode character '\xe4' in position 8: ordinal not in range(128)
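The failure in the traceback can be reproduced without fairseq. The sketch below is only an illustration: the sentence is a made-up German example (the real one comes from the dataset), and `try_encode` is a hypothetical helper. It shows that the same formatted line cannot be encoded as ASCII because of the `ä` (`'\xe4'`), while UTF-8 handles it.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or the UnicodeEncodeError raised while encoding."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc


# Hypothetical source sentence; sample id 0 is also just an example.
line = "S-{}\t{}".format(0, "das ändert alles")

ascii_result = try_encode(line, "ascii")   # a UnicodeEncodeError instance
utf8_result = try_encode(line, "utf-8")    # bytes

print(repr(ascii_result))
print(utf8_result)
```

When stdout uses the ASCII codec, `print` performs the same encoding step internally, which is why the error surfaces on the `print` line in `generate.py`.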



All 3 comments

Usually that means your locale environment variables are not set properly. Can you try running:

locale -a

and then (you may need to adjust the locale name based on the output above; the important part is UTF-8):

LC_ALL=en_US.UTF-8 fairseq-generate (...)
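To see what the locale settings look like from Python's side, a small check along these lines may help. It is a sketch, not part of fairseq; `describe_io_encoding` is a hypothetical helper name. If the reported stdout encoding is ASCII (it may show up under a name such as `ANSI_X3.4-1968`), non-ASCII characters in the output will trigger the error above.

```python
import locale
import sys


def describe_io_encoding():
    """Collect the encodings Python derived from the current environment."""
    return {
        "stdout": sys.stdout.encoding,
        "preferred": locale.getpreferredencoding(False),
        "filesystem": sys.getfilesystemencoding(),
    }


info = describe_io_encoding()
for name, value in info.items():
    print("{}: {}".format(name, value))
```

Running this once with and once without `LC_ALL=en_US.UTF-8` in front of the `python` command shows whether the locale change takes effect on the machine.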

Thanks for your reply. I will try.

Thanks for your help! I added PYTHONIOENCODING=utf-8 in front of the command, and now it runs properly.
As follows:
PYTHONIOENCODING=utf-8 fairseq-generate data-bin3/iwslt14.tokenized.de-en --path checkpoints2/transformer_iwslt_de_en/checkpoint_best.pt --batch-size 128 --beam 5 --remove-bpe
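The effect of PYTHONIOENCODING can be checked in isolation. The sketch below assumes nothing about fairseq: it starts a fresh Python interpreter with the variable set and asks it which encoding it chose for stdout. `child_stdout_encoding` is a hypothetical helper written for this illustration.

```python
import os
import subprocess
import sys


def child_stdout_encoding(extra_env):
    """Start a fresh interpreter and report the encoding it picks for stdout."""
    env = dict(os.environ)
    env.update(extra_env)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode the child's output as text
        check=True,
    )
    return result.stdout.strip()


print(child_stdout_encoding({"PYTHONIOENCODING": "utf-8"}))
```

Because the variable is read at interpreter startup, it has to be set before the process starts, as in the command above; changing `os.environ` inside an already running script does not alter `sys.stdout`.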
