Pytest: Terminal width inference issues causing doctest failures

Created on 24 Sep 2018  ·  18 Comments  ·  Source: pytest-dev/pytest

Greetings. Thanks for your work on an excellent library!

I am writing some doctests whose output seems to be sensitive to the width of the terminal—dumps of Pandas dataframes that sometimes exceed 80 chars in width. When I run the commands on the Python interpreter interactively, I can reproduce the Pandas behavior: when the window is wide, the full table is printed; when the window is narrow, columns in the middle are truncated.

I can confirm that doctest handles things correctly as well. When I execute python -m doctest myfile.py with the window at 80 chars width, the output is truncated and the test fails. When I execute the same command at 120 chars width, the output is not truncated and the doctests pass.

However, when running py.test --doctest-modules myfile.py the tests fail regardless of window width. I can restart my terminal multiplexer (tmux), resize the window itself, or even close the window and start over from scratch, but no combination of these operations seems to convince py.test that my terminal is wider than 80 chars.

This isn’t a problem on my personal machine (OS X High Sierra) but it is on a work CentOS 6 system running py.test 3.6.2.


Thanks for submitting an issue!

Here's a quick checklist of what to include:

  • [x] Include a detailed description of the bug or suggestion
  • [ ] pip list of the virtual environment you are using
  • [x] pytest and operating system versions
  • [x] Minimal example if possible
doctests bug

All 18 comments

Hi @standage,

Thanks for the report.

I have scanned the standard doctest module for any references related to terminal size and could not find any. Can you produce a minimal reproducible case which demonstrates the problem? Thanks!

@standage Can you post a minimal example please, which we can use to reproduce your issue?

Ok, here is a minimal example.

example.py

import pandas

moduledata = pandas.read_table('data.tsv')

def query(q):
    """Bug demo

    When running the code below in the interactive python interpreter or with
    doctest, the output is sensitive to the window width. With py.test it
    always assumes a width of 80 chars.

    >>> 1 + 1
    2
    >>> query('ID.str.contains("qwerty")')
                 ID          Name  Chromosome  Start   End                      Variants
    1  qwertyqwerty  qwertyqwerty           2   3333  4444  rs123456789,rs987654321,rs42
    """
    return moduledata.query(q)

data.tsv

ID  Name    Chromosome  Start   End Variants
asdfhjkl    asdfhjkl    1   100 200 rs1234,rs5678,rs13579
qwertyqwerty    qwertyqwerty    2   3333    4444    rs123456789,rs987654321,rs42

When I run the following in the interactive interpreter I get expected behavior: truncated output with a thin window, full output with a wide window.

>>> import example
>>> example.query('ID.str.contains("qwerty")')
             ID          Name  Chromosome  Start   End                      Variants
1  qwertyqwerty  qwertyqwerty           2   3333  4444  rs123456789,rs987654321,rs42
>>>

When I run python -m doctest example.py I get the reported behavior: failure with a thin window, no error with a wide window.

When I run py.test --doctest-modules example.py it fails no matter how wide my terminal is.

Hi @standage,

Thanks for the reproducible example, I can reproduce it locally.

The problem seems to be in how terminal width is detected by pandas. Given your example:

from example import query
import pandas
moduledata = pandas.read_table('data.tsv')
print(query('ID.str.contains("qwerty")'))

When I save this script as foo.py and run it in my terminal (Windows) with COLUMNS set to different values, the output is truncated according to the value I set:

$ set COLUMNS=30

$ python foo.py
             ID              ...                                   Variants
1  qwertyqwerty              ...               rs123456789,rs987654321,rs42

[1 rows x 6 columns]

$ set COLUMNS=120

$ python foo.py
             ID          Name  Chromosome  Start   End                      Variants
1  qwertyqwerty  qwertyqwerty           2   3333  4444  rs123456789,rs987654321,rs42

$ set COLUMNS=

$ python foo.py
             ID          Name  Chromosome  Start   End                      Variants
1  qwertyqwerty  qwertyqwerty           2   3333  4444  rs123456789,rs987654321,rs42

So it seems to be related to how pandas detects terminal width in order to print a DataFrame. Not sure if doctest does something to simulate a wider terminal, or if pandas disregards terminal width when it detects it is running under doctest...
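The COLUMNS effect can be reproduced without pandas. Here is a minimal sketch, assuming pandas gets its width from the standard library's shutil.get_terminal_size (a later comment in this thread points that way). The helper name width_with_columns is made up for illustration.

```python
import os
import shutil


def width_with_columns(value):
    """Return the width shutil reports while COLUMNS is temporarily set."""
    saved = os.environ.get('COLUMNS')
    os.environ['COLUMNS'] = str(value)
    try:
        # shutil.get_terminal_size() consults COLUMNS before asking the OS.
        return shutil.get_terminal_size().columns
    finally:
        # Restore the environment so the demo leaves no trace behind.
        if saved is None:
            del os.environ['COLUMNS']
        else:
            os.environ['COLUMNS'] = saved


print(width_with_columns(30))
print(width_with_columns(120))
```

If pandas does rely on this call, setting COLUMNS in the environment would be enough to change how wide it thinks the terminal is, which matches the runs above.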

This has to do with terminal capturing:

$ pytest --doctest-modules t.py
=================================================================== test session starts ===================================================================
platform linux -- Python 3.6.5, pytest-3.8.1, py-1.6.0, pluggy-0.7.1
rootdir: /tmp/t, inifile:
collected 1 item                                                                                                                                          

t.py F                                                                                                                                              [100%]

======================================================================== FAILURES =========================================================================
____________________________________________________________________ [doctest] t.query ____________________________________________________________________
006 Bug demo
007 
008     When running the code below in the interactive python interpreter or with
009     doctest, the output is sensitive to the window width. With py.test it
010     always assumes a width of 80 chars.
011 
012     >>> 1 + 1
013     2
014     >>> query('ID.str.contains("qwerty")')
Expected:
                 ID          Name  Chromosome  Start   End                      Variants
    1  qwertyqwerty  qwertyqwerty           2   3333  4444  rs123456789,rs987654321,rs42
Got:
                 ID              ...                                   Variants
    1  qwertyqwerty              ...               rs123456789,rs987654321,rs42
    <BLANKLINE>
    [1 rows x 6 columns]

/tmp/t/t.py:14: DocTestFailure
================================================================ 1 failed in 0.33 seconds =================================================================
$ pytest -s --doctest-modules t.py
=================================================================== test session starts ===================================================================
platform linux -- Python 3.6.5, pytest-3.8.1, py-1.6.0, pluggy-0.7.1
rootdir: /tmp/t, inifile:
collected 1 item                                                                                                                                          

t.py .

================================================================ 1 passed in 0.34 seconds =================================================================

When stdin / stdout are redirected they are not ttys, so pandas cannot query a real terminal size and (rightfully) falls back to a width of 80 characters.
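A rough illustration of that fallback, using only the standard library. This is a sketch, not what pandas or pytest actually runs: fd_terminal_width is a hypothetical helper, and a pipe stands in for a captured stdout.

```python
import os


def fd_terminal_width(fd, default=80):
    """Ask the OS for the terminal width of `fd`, or return `default`."""
    try:
        return os.get_terminal_size(fd).columns
    except OSError:
        # Raised when `fd` is not attached to a terminal (pipe, file, ...).
        return default


# A pipe is never a tty, much like a stdout that has been redirected.
read_end, write_end = os.pipe()
try:
    pipe_width = fd_terminal_width(write_end)
finally:
    os.close(read_end)
    os.close(write_end)

print(pipe_width)
```

Because the pipe is not a terminal, the query fails and the default of 80 is used, no matter how wide the surrounding window is.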

@asottile thanks, makes sense.

I wonder what doctest does to circumvent this though... 🤔

The difference is in when doctest patches sys

The ordering from pytest:

  1. patch sys
  2. import module under test

The ordering from doctest

  1. import module under test
  2. patch sys

under doctest, pandas determines the terminal size as an import side-effect (or it retains a reference to the original streams 🤷‍♂️)

under doctest, pandas determines the terminal size as an import side-effect

Can you point to some code? I would like to know if we can reproduce this behavior in pytest.

xref: https://github.com/ESSS/pytest-regressions/issues/3

I guess it doesn't determine it as part of import, but it determines whether it should try to determine it: https://github.com/pandas-dev/pandas/blob/0c58a825181bde2c4d7e1a912e246181d33f55d6/pandas/core/config_init.py#L318-L321

(and since shutil.get_terminal_size just uses file descriptors, it doesn't care that sys.stdout / sys.stderr have been reassigned)
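A small check of that last point, as a sketch: it swaps sys.stdout for an in-memory buffer (roughly what Python-level capturing does) and compares what shutil.get_terminal_size reports before and during the swap. The function name size_before_and_during_swap is made up for this example. In CPython the call looks at the COLUMNS / LINES environment variables and then at the file descriptor behind sys.__stdout__, not at sys.stdout.

```python
import io
import shutil
import sys


def size_before_and_during_swap():
    """Report the terminal size before and while sys.stdout is replaced."""
    before = shutil.get_terminal_size()
    original = sys.stdout
    sys.stdout = io.StringIO()  # Python-level replacement only
    try:
        during = shutil.get_terminal_size()
    finally:
        sys.stdout = original
    return before, during


before, during = size_before_and_during_swap()
print(before, during)
```

Both values should be the same as long as the window is not resized in between. Redirecting the underlying file descriptor, as fd-level capturing does, is a different matter and does change the result.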

https://github.com/ESSS/pytest-regressions/issues/3#issuecomment-408511359 seems like a reasonable approach -- probably as an autouse fixture that sets / unsets it 🤷‍♂️

Ok, so if I understand correctly:

  • when running doctest, stdout is captured after pandas is loaded and has had a chance to determine terminal width correctly
  • when running in pytest, stdout is already captured before pandas is loaded, so it determines that the output is not a tty and defaults to 80 chars width?

I'd be happy to submit a PR, although I don't know enough about pytest internals to know where a command like pandas.set_option('display.width', 1000) would go.

By the way, I've finally posted the package from which I made my contrived example above at https://github.com/bioforensics/MicroHapDB.

pandas.set_option('display.width', 1000)

this wouldn't go in pytest itself, but in your test suite -- probably as an autouse=True, scope='session' fixture in your conftest.py

I'm familiar with fixtures, but not conftest.py files or the autouse/scope attributes. I'm scanning the documentation now but if there is a simple solution for the example above I'd be curious to see what that looks like.

something like this:

import pandas
import pytest

@pytest.fixture(autouse=True, scope='session')
def pandas_terminal_width():
    pandas.set_option('max_cols', 1000)

put that in tests/conftest.py

Oh good, that looked a lot like my first blundering attempts! :-) Thanks.

For the contrived example above, if I put conftest.py in the same directory as example.py and run py.test --doctest-modules example.py, I can confirm that the fixture is executed (by adding in some print statements) but it doesn't seem to make a difference in the final output. Still fails the test. ☹️

ah yeah, try max_cols instead -- I've edited the example

Actually, I had to set both. Only increasing the width truncated the number of columns, and only increasing the number of columns just wrapped the output at 80 chars. Doing both did the trick!

Updating the example now.

Thanks!!!

Oops, since I can't update your example it's:

import pandas
import pytest

@pytest.fixture(autouse=True, scope='session')
def pandas_terminal_width():
    pandas.set_option('display.width', 1000)
    pandas.set_option('display.max_columns', 1000)
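
A variant that also unsets the options afterwards, along the lines of the "sets / unsets" idea mentioned earlier. This is only a sketch: wide_pandas_display is a hypothetical helper name, and the value 1000 is carried over from the snippet above rather than a recommended setting. It uses pandas.option_context, which restores the previous values on exit.

```python
import contextlib

import pandas
import pytest


@contextlib.contextmanager
def wide_pandas_display(width=1000, max_columns=1000):
    """Temporarily widen pandas output, restoring the old options on exit."""
    with pandas.option_context('display.width', width,
                               'display.max_columns', max_columns):
        yield


@pytest.fixture(autouse=True, scope='session')
def pandas_terminal_width():
    # Options stay widened for the whole session and are reset at teardown.
    with wide_pandas_display():
        yield
```

As with the version above, this would go in conftest.py next to the modules being tested.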