Click: Behavioural difference between Unicode encoding in `click.testing.CliRunner` and outside of `click.testing` (`UnicodeEncodeError`)

Created on 12 Sep 2017 · 3 Comments · Source: pallets/click

(e2e) Adam@Adams-MBP ~/D/click-experiment> env | grep LANG
LANG=en_gb.UTF-8
(e2e) Adam@Adams-MBP ~/D/click-experiment> env | grep PYTHONIOENCODING
(e2e) Adam@Adams-MBP ~/D/click-experiment>
(e2e) Adam@Adams-MBP ~/D/click-experiment> env
Apple_PubSub_Socket_Render=/private/tmp/com.apple.launchd.VygZz7PVFg/Render
EDITOR=vim
HOME=/Users/Adam
LANG=en_gb.UTF-8
LOGNAME=Adam
LSCOLORS=gxfxbEaEBxxEhEhBaDaCaD
MYVIMRC=/Users/Adam/.config/nvim/init.vim
NVIM_LISTEN_ADDRESS=/var/folders/3l/ffdytd9n2cnbgch4rwt6k7jm0000gn/T/nvimmMGW7R/0
PATH=/Users/Adam/.virtualenvs/e2e/bin:/Users/Adam/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/Users/Adam/.local/bin
PWD=/Users/Adam/Desktop/click-experiment
SECURITYSESSIONID=186a7
SHELL=/usr/local/bin/fish
SHLVL=3
SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.JaAq6mgZJg/Listeners
TERM=xterm-256color
TERM_PROGRAM=Apple_Terminal
TERM_PROGRAM_VERSION=388.1.1
TERM_SESSION_ID=33676A3D-5E6B-4BB7-8BFB-E16430E5ED91
TMPDIR=/var/folders/3l/ffdytd9n2cnbgch4rwt6k7jm0000gn/T/
USER=Adam
VIM=
VIMRUNTIME=
VIRTUAL_ENV=/Users/Adam/.virtualenvs/e2e
XPC_FLAGS=0x0
XPC_SERVICE_NAME=0
__CF_USER_TEXT_ENCODING=0x1F5:0x0:0x2

The Result object is expected to contain the combined stdout and stderr output.
In the following case, however, the output stored in the Result object differs from what is actually written to stderr.

import sys
import click
from click.testing import CliRunner

EXAMPLE = '\udce2\udc98\udcba\udced\udca0\udcbd\udce4\udcbd\udca0\udce5\udca5\udcbd'

@click.command()
def write_to_stderr():
    sys.stderr.write(EXAMPLE)


runner = CliRunner()
result = runner.invoke(write_to_stderr, [])
print(result)
assert result.output == ''

sys.stderr.write(EXAMPLE)

This outputs:

<Result UnicodeEncodeError('utf-8', '\udce2\udc98\udcba\udced\udca0\udcbd\udce4\udcbd\udca0\udce5\udca5\udcbd', 0, 12, 'surrogates not allowed')>
\udce2\udc98\udcba\udced\udca0\udcbd\udce4\udcbd\udca0\udce5\udca5\udcbd

I would expect result.output to match the stderr output.

I believe that a solution to this problem is https://github.com/pallets/click/issues/371.
I put this up as a new issue because https://github.com/pallets/click/issues/371 is a feature request but this is a bug.


All 3 comments

#371 was closed with #868. I tested with that update and it doesn't solve this.

Also tried with click.echo(EXAMPLE, err=True) instead of sys.stderr.write(EXAMPLE) with the same result as OP.

Plain sys.stdout is opened with errors="strict", so you can get the same error by writing to stdout instead of stderr. sys.stderr is opened with errors="backslashreplace", so it prints the escaped representation instead of failing. This suggests that ultimately the error is with your data, so you should investigate why you have a string with invalid characters instead of a bytes object.
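To make the difference between the two error handlers concrete, here is a minimal standard-library sketch that does not involve Click at all. The helper name `write_with` and the in-memory `BytesIO` stream are assumptions made for illustration; they stand in for a real stdout or stderr. The last part shows one way such a string can arise (decoding raw bytes with `surrogateescape`), which is a guess at the origin of the data, not something the report confirms.

```python
import io

# The string from the report: every character is a lone surrogate.
EXAMPLE = '\udce2\udc98\udcba\udced\udca0\udcbd\udce4\udcbd\udca0\udce5\udca5\udcbd'


def write_with(errors):
    """Write EXAMPLE to an in-memory UTF-8 text stream using the given error handler."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="utf-8", errors=errors)
    stream.write(EXAMPLE)
    stream.flush()
    return raw.getvalue()


# errors="strict" (how stdout is opened): lone surrogates cannot be encoded.
try:
    write_with("strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# errors="backslashreplace" (how stderr is opened): each surrogate is
# written as a six-character ASCII escape such as \udce2.
escaped = write_with("backslashreplace")

# Assumption: the string was produced by decoding bytes with
# errors="surrogateescape". Encoding with the same handler recovers them.
raw_bytes = EXAMPLE.encode("utf-8", errors="surrogateescape")
round_trip = raw_bytes.decode("ascii", errors="surrogateescape")
```

If the assumption about the origin holds, keeping the data as bytes (or decoding it with the correct codec) avoids the problem before it ever reaches stdout or stderr.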

Click can at least mirror standard behavior by opening stderr with the same error handler. As you've noticed, you'll also need to create the runner with mix_stderr=False, otherwise it will try to write everything to stdout in strict mode.
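As a rough illustration of what "opening stderr with the same error handler" could look like, the sketch below captures stderr into an in-memory stream wrapped with errors="backslashreplace". This is not Click's implementation; `capture_stderr` is a hypothetical helper built only from the standard library, and the shortened EXAMPLE string is used for brevity.

```python
import contextlib
import io
import sys

# First three characters of the string from the report.
EXAMPLE = '\udce2\udc98\udcba'


def capture_stderr(func):
    """Run func() and return the bytes it wrote to sys.stderr.

    Hypothetical helper: the replacement stream uses the same error
    handler as the interpreter's default stderr, so unencodable
    characters are escaped instead of raising UnicodeEncodeError.
    """
    raw = io.BytesIO()
    fake_stderr = io.TextIOWrapper(
        raw, encoding="utf-8", errors="backslashreplace"
    )
    with contextlib.redirect_stderr(fake_stderr):
        func()
    fake_stderr.flush()
    return raw.getvalue()


# sys.stderr is looked up at call time, so the write goes to the fake stream.
captured = capture_stderr(lambda: sys.stderr.write(EXAMPLE))
```

With a capture stream configured this way, the test output would match what a terminal shows for stderr, rather than failing in strict mode.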

Thank you for following up on this, @davidism.
