Greetings. Thanks for your work on an excellent library!
I am writing some doctests whose output seems to be sensitive to the width of the terminal—dumps of Pandas dataframes that sometimes exceed 80 chars in width. When I run the commands on the Python interpreter interactively, I can reproduce the Pandas behavior: when the window is wide, the full table is printed; when the window is narrow, columns in the middle are truncated.
I can confirm that doctest handles things correctly as well. When I execute python -m doctest myfile.py with the window at 80 chars width, the output is truncated and the test fails. When I execute the same command at 120 chars width, the output is not truncated and the doctests pass.
However, when running py.test --doctest-modules myfile.py the tests fail regardless of window width. I can restart my terminal multiplexer (tmux), resize the window itself, or even close the window and start over from scratch, but no combination of these operations seems to convince py.test that my terminal is wider than 80 chars.
This isn’t a problem on my personal machine (OS X High Sierra) but it is on a work CentOS 6 system running py.test 3.6.2.
Hi @standage,
Thanks for the report.
I have scanned the standard doctest module for any references related to terminal size and could not find any. Can you produce a minimal reproducible case which demonstrates the problem? Thanks!
@standage Can you post a minimal example please, which we can use to reproduce your issue?
Ok, here is a minimal example.
example.py:
import pandas

moduledata = pandas.read_table('data.tsv')

def query(q):
    """Bug demo

    When running the code below in the interactive python interpreter or with
    doctest, the output is sensitive to the window width. With py.test it
    always assumes a width of 80 chars.

    >>> 1 + 1
    2
    >>> query('ID.str.contains("qwerty")')
    ID Name Chromosome Start End Variants
    1 qwertyqwerty qwertyqwerty 2 3333 4444 rs123456789,rs987654321,rs42
    """
    return moduledata.query(q)
data.tsv:
ID Name Chromosome Start End Variants
asdfhjkl asdfhjkl 1 100 200 rs1234,rs5678,rs13579
qwertyqwerty qwertyqwerty 2 3333 4444 rs123456789,rs987654321,rs42
When I run the following in the interactive interpreter I get expected behavior: truncated output with a thin window, full output with a wide window.
>>> import example
>>> example.query('ID.str.contains("qwerty")')
ID Name Chromosome Start End Variants
1 qwertyqwerty qwertyqwerty 2 3333 4444 rs123456789,rs987654321,rs42
>>>
When I run python -m doctest example.py I get the reported behavior: failure with a thin window, no error with a wide window.
When I run py.test --doctest-modules example.py it fails no matter how wide my terminal is.
Hi @standage,
Thanks for the reproducible example, I can reproduce it locally.
The problem seems to be in how terminal width is detected by pandas. Given your example, I wrote this script (foo.py):
from example import query
import pandas
moduledata = pandas.read_table('data.tsv')
print(query('ID.str.contains("qwerty")'))
When I execute this script in my terminal (Windows) while setting COLUMNS with different sizes, it will truncate depending on the size I set:
$ set COLUMNS=30
$ python foo.py
ID ... Variants
1 qwertyqwerty ... rs123456789,rs987654321,rs42
[1 rows x 6 columns]
$ set COLUMNS=120
$ python foo.py
ID Name Chromosome Start End Variants
1 qwertyqwerty qwertyqwerty 2 3333 4444 rs123456789,rs987654321,rs42
$ set COLUMNS=
$ python foo.py
ID Name Chromosome Start End Variants
1 qwertyqwerty qwertyqwerty 2 3333 4444 rs123456789,rs987654321,rs42
So it seems to be related to how pandas detects terminal width in order to print a DataFrame. Not sure if doctest does something to simulate a wider terminal, or if pandas disregards terminal width when it detects it is running under doctest...
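For reference, the effect of COLUMNS can be reproduced without pandas at all. The sketch below assumes the width lookup ends up in the standard library's shutil.get_terminal_size (an assumption about pandas internals on Python 3, not something checked here); width_with_columns is a hypothetical helper name used only for illustration.

```python
import os
import shutil


def width_with_columns(value):
    """Return the width shutil reports while COLUMNS is temporarily set."""
    saved = os.environ.get("COLUMNS")
    os.environ["COLUMNS"] = str(value)
    try:
        # shutil consults the COLUMNS environment variable before it
        # queries the terminal itself.
        return shutil.get_terminal_size().columns
    finally:
        # Restore the environment so the caller is not affected.
        if saved is None:
            del os.environ["COLUMNS"]
        else:
            os.environ["COLUMNS"] = saved


print(width_with_columns(30))   # 30
print(width_with_columns(120))  # 120
```

Any library that defers to this function would see 30 or 120 columns regardless of the real window size, which matches the truncation seen with set COLUMNS=30.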
This has to do with terminal capturing:
$ pytest --doctest-modules t.py
=================================================================== test session starts ===================================================================
platform linux -- Python 3.6.5, pytest-3.8.1, py-1.6.0, pluggy-0.7.1
rootdir: /tmp/t, inifile:
collected 1 item
t.py F [100%]
======================================================================== FAILURES =========================================================================
____________________________________________________________________ [doctest] t.query ____________________________________________________________________
006 Bug demo
007
008 When running the code below in the interactive python interpreter or with
009 doctest, the output is sensitive to the window width. With py.test it
010 always assumes a width of 80 chars.
011
012 >>> 1 + 1
013 2
014 >>> query('ID.str.contains("qwerty")')
Expected:
ID Name Chromosome Start End Variants
1 qwertyqwerty qwertyqwerty 2 3333 4444 rs123456789,rs987654321,rs42
Got:
ID ... Variants
1 qwertyqwerty ... rs123456789,rs987654321,rs42
<BLANKLINE>
[1 rows x 6 columns]
/tmp/t/t.py:14: DocTestFailure
================================================================ 1 failed in 0.33 seconds =================================================================
$ pytest -s --doctest-modules t.py
=================================================================== test session starts ===================================================================
platform linux -- Python 3.6.5, pytest-3.8.1, py-1.6.0, pluggy-0.7.1
rootdir: /tmp/t, inifile:
collected 1 item
t.py .
================================================================ 1 passed in 0.34 seconds =================================================================
when stdin / stdout are redirected they are not ttys, so pandas (rightfully) falls back to a width of 80 characters
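A rough way to see this fallback outside of pytest is to run a child interpreter with its stdout connected to a pipe. This is only an analogy for what capturing does (a pipe rather than pytest's own capture mechanism), and probe_captured_child and CHILD are hypothetical names for this sketch.

```python
import subprocess
import sys

# Code run in the child process. COLUMNS/LINES are removed so that only the
# stream itself can influence the reported width.
CHILD = r"""
import os, shutil, sys
os.environ.pop("COLUMNS", None)
os.environ.pop("LINES", None)
print(sys.stdout.isatty())
print(shutil.get_terminal_size(fallback=(80, 24)).columns)
"""


def probe_captured_child():
    """Run CHILD with stdout redirected to a pipe and report what it saw."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    isatty, columns = result.stdout.split()
    return isatty == "True", int(columns)


print(probe_captured_child())  # (False, 80)
```

With no tty behind stdout, the terminal query fails and the (80, 24) fallback is returned, no matter how wide the window that launched the parent process is.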
@asottile thanks, makes sense.
I wonder what doctest does to circumvent this though... 🤔
The difference is when doctest patches sys: the ordering under pytest differs from the ordering under doctest.
Under doctest, pandas determines the terminal size as an import side-effect (or it retains an instance to the original streams 🤷‍♂️)
under doctest, pandas determines the terminal size as an import side-effect
Can you point to some code? I would like to know if we can reproduce this behavior in pytest.
I guess it doesn't determine it as part of import, but it determines whether it should try to determine it: https://github.com/pandas-dev/pandas/blob/0c58a825181bde2c4d7e1a912e246181d33f55d6/pandas/core/config_init.py#L318-L321
(and since shutil.get_terminal_size just uses file descriptors, it doesn't care that sys.stdout / sys.stderr have been reassigned)
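A small sketch of that last point: swapping sys.stdout for an in-memory buffer does not change what shutil.get_terminal_size reports, because the lookup goes through the environment and the original stream's file descriptor rather than the current sys.stdout object. widths_around_stdout_swap is a hypothetical helper name.

```python
import contextlib
import io
import shutil


def widths_around_stdout_swap():
    """Return the reported width before and while sys.stdout is replaced."""
    before = shutil.get_terminal_size().columns
    # redirect_stdout reassigns sys.stdout, much like a test runner that
    # patches the stream at the Python level.
    with contextlib.redirect_stdout(io.StringIO()):
        during = shutil.get_terminal_size().columns
    return before, during


before, during = widths_around_stdout_swap()
print(before == during)  # True
```

So a Python-level reassignment of sys.stdout alone would not explain a width change; redirection at the file-descriptor level, or the COLUMNS variable, would.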
https://github.com/ESSS/pytest-regressions/issues/3#issuecomment-408511359 seems like a reasonable approach -- probably as an autouse fixture that sets / unsets it 🤷‍♂️
Ok, so if I understand correctly:
I'd be happy to submit a PR, although I don't know enough about pytest internals to know where a command like pandas.set_option('display.width', 1000) would go.
By the way, I've finally posted the package from which I made my contrived example above at https://github.com/bioforensics/MicroHapDB.
pandas.set_option('display.width', 1000)
this wouldn't go in pytest itself, but in your test suite -- probably as an autouse=True, scope='session' fixture in your conftest.py
I'm familiar with fixtures, but not conftest.py files or the autouse/scope attributes. I'm scanning the documentation now but if there is a simple solution for the example above I'd be curious to see what that looks like.
something like this:
import pandas
import pytest

@pytest.fixture(autouse=True, scope='session')
def pandas_terminal_width():
    # 'display.max_columns' is the full name of the max_cols option
    pandas.set_option('display.max_columns', 1000)
put that in tests/conftest.py
Oh good, that looked a lot like my first blundering attempts! :-) Thanks.
For the contrived example above, if I put conftest.py in the same directory as example.py and run py.test --doctest-modules example.py, I can confirm that the fixture is executed (by adding in some print statements) but it doesn't seem to make a difference in the final output. Still fails the test. ☹️
ah yeah, try max_cols instead -- I've edited the example
Actually, I had to set both. Only increasing the width truncated the number of columns, and only increasing the number of columns just wrapped the output at 80 chars. Doing both did the trick!
Updating the example now.
Thanks!!!
Oops, since I can't update your example it's:
import pandas
import pytest

@pytest.fixture(autouse=True, scope='session')
def pandas_terminal_width():
    pandas.set_option('display.width', 1000)
    pandas.set_option('display.max_columns', 1000)