Rasa: UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 75: illegal multibyte sequence

Created on 29 May 2019 · 9 Comments · Source: RasaHQ/rasa

I don't know why emoticons were added to the code. Because of them the command can't run, for example this one: 🤖. I'm so sad!

result error: UnicodeEncodeError: 'gbk' codec can't encode character '\U0001f916' in position 196: illegal multibyte sequence
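
The error can be reproduced without Rasa. A minimal sketch using only the standard library: the robot emoji (U+1F916) has no mapping in GBK, so encoding it fails in the same way, while UTF-8 handles it. The helper name `can_encode` is made up for this illustration.

```python
# The robot emoji named in the error message: '\U0001f916'.
robot = "\U0001f916"

def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

print(can_encode(robot, "gbk"))    # GBK has no mapping for this character
print(can_encode(robot, "utf-8"))  # UTF-8 covers it
```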

Rasa version:
1.0.1
Python version:
3.5
Operating system (windows, osx, ...):
windows

All 9 comments

Hi, this is a Windows-specific issue -- I believe @erohmensing has looked into this?

@jiyikong which command are you using when you hit this error?

Any. This is a problem with the system environment.

Right, I realize that -- I was just looking for an example to give you with the temporary fix.
Can you set the environment variable PYTHONIOENCODING='utf8'? Does that work?
If not, do the commands work when you include that variable in the command, e.g.

PYTHONIOENCODING='utf8' rasa init

We are already aware and looking into a full fix for printing in non-utf-8 encoded terminals.
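
Note that the `VAR=value command` form is POSIX shell syntax and will not work as written in Windows cmd. One shell-independent way to try the workaround is to launch the command from Python with the variable set in the child's environment. This is only a sketch: the helper name `run_with_utf8_io` is hypothetical, and the child command is a tiny stand-in Python script rather than `rasa init`, so that the example runs anywhere.

```python
import os
import subprocess
import sys

def run_with_utf8_io(command):
    """Run `command` with PYTHONIOENCODING=utf8 and return its raw stdout bytes."""
    env = dict(os.environ, PYTHONIOENCODING="utf8")
    return subprocess.check_output(command, env=env)

# Stand-in child process that prints the robot emoji, as the CLI output does.
child = [sys.executable, "-c", "print('\\U0001f916')"]
raw = run_with_utf8_io(child)

# The child wrote UTF-8 bytes, so decoding them as UTF-8 recovers the emoji.
print(raw.decode("utf-8").strip() == "\U0001f916")
```

Keep in mind that `PYTHONIOENCODING` only affects stdin, stdout, and stderr. It does not change how files opened with `open()` are decoded.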

I tried Windows cmd with chcp 65001 and still got errors. In the end, I removed all the emojis and then it worked correctly.
But there is still a problem when training with Chinese data.

- Full error:

Traceback (most recent call last):
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "d:\Users\xxx\AppData\Local\Continuum\anaconda3\Scripts\rasa.exe\__main__.py", line 9, in <module>
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\site-packages\rasa\__main__.py", line 70, in main
    cmdline_arguments.func(cmdline_arguments)
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\site-packages\rasa\cli\train.py", line 69, in train
    kwargs=extract_additional_arguments(args),
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\site-packages\rasa\train.py", line 48, in train
    kwargs=kwargs,
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\asyncio\base_events.py", line 467, in run_until_complete
    return future.result()
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\asyncio\futures.py", line 294, in result
    raise self._exception
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\asyncio\tasks.py", line 240, in _step
    result = coro.send(None)
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\site-packages\rasa\train.py", line 91, in train_async
    training_files, skill_imports
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\site-packages\rasa\data.py", line 67, in get_core_nlu_directories
    story_files, nlu_data_files = get_core_nlu_files(paths, skill_imports)
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\site-packages\rasa\data.py", line 114, in get_core_nlu_files
    path, skill_imports
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\site-packages\rasa\data.py", line 139, in _find_core_nlu_files_in_directory
    if _is_nlu_file(full_path):
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\site-packages\rasa\data.py", line 157, in _is_nlu_file
    content = io_utils.read_json_file(file_path)
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\site-packages\rasa\utils\io.py", line 128, in read_json_file
    return json.load(f)
  File "d:\users\xxx\appdata\local\continuum\anaconda3\lib\json\__init__.py", line 265, in load
    return loads(fp.read(),
UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 162: illegal multibyte sequence
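
The traceback ends in `json.load` reading a file that was apparently opened without an explicit encoding. In that case Python falls back to the locale's preferred encoding, which is GBK on this machine judging by the error. The console code page set by `chcp` is a separate setting, which may be why `chcp 65001` did not help here. A small diagnostic sketch, standard library only, to see which encodings are in play (the function name `encoding_report` is made up for illustration):

```python
import locale
import sys

def encoding_report():
    """Collect the encodings Python uses for files and for the console."""
    return {
        # Default used by open() when no encoding argument is given
        # (on the Python versions discussed in this thread).
        "open_default": locale.getpreferredencoding(False),
        # Encoding used when printing; stdout may have been replaced by an
        # object without this attribute, hence getattr.
        "stdout": getattr(sys.stdout, "encoding", None),
        "filesystem": sys.getfilesystemencoding(),
    }

for name, value in encoding_report().items():
    print(name, "=", value)
```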

Hmm okay, we will look into this in https://github.com/RasaHQ/rasa/issues/3408

I'll close this since it's covered in #3408

with open("somefile.py", encoding="utf-8") as f:
    content = f.read()  # code here

I hope this can help you
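
Applied to the JSON case from the traceback, the suggestion might look like the following sketch. `read_json_utf8` is a hypothetical helper, not Rasa's own `read_json_file`, and it assumes the training data was saved as UTF-8. The file name `nlu.json` is also just an example.

```python
import json
import os
import tempfile

def read_json_utf8(path):
    """Load a JSON file, decoding it as UTF-8 regardless of the system locale."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Round trip with Chinese text ("\u4f60\u597d" is "ni hao") and the robot emoji.
data = {"text": "\u4f60\u597d", "emoji": "\U0001f916"}
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "nlu.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False)

# Reading with the explicit encoding returns the original data.
print(read_json_utf8(path) == data)
```

Passing `encoding="utf-8"` on both the write and the read side keeps the result independent of the machine's locale.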
