Pytest: xfail doesn't actually run the test (but the docs promise that it does)

Created on 2 Jul 2015  ·  12 comments  ·  Source: pytest-dev/pytest

From http://pytest.org/latest/skipping.html#mark-a-test-function-as-expected-to-fail:

This test will be run but no traceback will be reported when it fails. Instead terminal reporting will list it in the “expected to fail” or “unexpectedly passing” sections.

Consider this snippet:

def test_xfail():
    import pytest
    pytest.xfail("doesn't work")

    import os
    os._exit(0)

This produces an "x" in the pytest output. If the test were actually run beyond the xfail as documented, then the interpreter should have just exited. What xfail seems to do is terminate the execution of the test. But that's inconvenient. Suppose you have a test that'll always raise an exception. You'd have to wrap the test in a try block and call xfail in a finally clause. Instead, pytest should mark the test as expecting to fail and continue on.
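
For example, a minimal sketch of the try/finally workaround described above (exercise_buggy_code is a hypothetical stand-in for the code that currently raises):

import pytest

def exercise_buggy_code():
    # Hypothetical stand-in for the code under test that currently raises.
    raise RuntimeError("known bug")

def test_known_broken():
    try:
        exercise_buggy_code()
    finally:
        # Reached whether the body raised or not, so the test is always
        # reported as xfail -- which also hides an unexpected pass.
        pytest.xfail("exercise_buggy_code raises; remove this workaround once fixed")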

Labels: needs information, enhancement

All 12 comments

The documentation you mention actually refers to the xfail mark, not the explicit pytest.xfail usage.

Consider this example:

# contents of test_foo.py
import pytest

def test_xfail():
    print
    print '  xfail-before'
    pytest.xfail('xfail')
    print '  xfail-after'

@pytest.mark.xfail
def test_xfail_mark():
    print
    print '  mark-before'
    print '  mark-after'     
    assert 0

$ py.test test_foo.py -s -v
============================= test session starts =============================
platform win32 -- Python 2.7.7 -- py-1.4.26 -- pytest-2.7.0 -- D:\Shared\dist\12.0-win64\python-2.7.X\python.exe
qt-api: pyqt4
rootdir: X:\wellres, inifile:
plugins: timeout, localserver, xdist, cov, mock, qt
collected 2 items

test_foo.py::test_xfail
  xfail-before
xfail
test_foo.py::test_xfail_mark
  mark-before
  mark-after
xfail

As you can see, the test with the mark runs completely, but the one with pytest.xfail runs only up to the pytest.xfail call. What happens is that pytest internally uses an exception to signal that a test called pytest.xfail:

def xfail(reason=""):
    """ xfail an executing test or setup functions with the given reason."""
    __tracebackhide__ = True
    raise XFailed(reason)

While this could be changed to just let the test continue execution and report it as xfail, I don't think that's an option as it would unfortunately change existing behavior and might break existing test suites.

Regardless of whether the behavior should change, I think the documentation should explicitly mention this difference. A PR for the docs would be very welcome! :smile:

I think the behavior should be changed. Many reasons:

  • Given your documentation, I don't think anyone expected the two xfails to be inconsistent with each other. I certainly didn't.
  • The behavior change is really quite low-risk. The worst that could happen is that a few test suites now report unexpected passes, which I imagine folks might like to know about anyway. In fact, the current behavior _masks_ these unexpected passes. To me, the whole point of xfail is "I know this is broken right now, but let me know once it starts working." The current behavior defeats this purpose.
  • As it stands, imperative xfail and skip really do the same thing--so what's the point of having the imperative xfail at all? If you're going to stick with that interpretation, imperative xfail should IMO be deprecated to avoid API redundancy.

You raise some valid points!

Given your documentation, I don't think anyone expected the two xfails to be inconsistent with each other. I certainly didn't.

I agree that the documentation could be improved (that's why I mention a PR would be welcomed to fix that), and I agree that it would be better for the two _xfail_ forms to be consistent with each other.

The behavior change is really quite low-risk. The worst that could happen is that a few test suites now report unexpected passes, which I imagine folks might like to know about anyway.

I agree that it is low-risk, and in my experience pytest.mark.xfail is used far more often than pytest.xfail. But keep in mind that a test suite containing a test just like your example (with an os._exit() call), or one in which a pytest.xfail() call prevents crashing code from being reached, would now start to fail. pytest tries very hard to be backward compatible in order to avoid new releases from breaking existing (working) test suites.

As it stands, imperative xfail and skip really do the same thing.

They do the same thing, but the semantics are different: _skip_ means a test should not be executed under certain expected circumstances (a Windows-only test being skipped on Linux, for example). _xfail_, on the other hand, means that a test always or sometimes fails when it shouldn't, like a test exercising a reported bug or a _flaky_ test. So, after the test suite finishes, _skips_ are OK, but _xfails_ should eventually be worked on, or at least treated as red flags worth checking out.
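
For illustration, a minimal sketch of the two declarative forms (the platform condition and the bug reference are hypothetical):

import sys
import pytest

def buggy_function():
    # Hypothetical stand-in for code with a reported, not-yet-fixed bug.
    return "wrong"

@pytest.mark.skipif(sys.platform != "win32", reason="exercises a Windows-only API")
def test_windows_only():
    # Skipped elsewhere: there is nothing meaningful to run on other platforms.
    assert True

@pytest.mark.xfail(reason="known bug, see hypothetical issue #1234")
def test_reported_bug():
    # Runs in full; a failure is reported as xfail, a pass as xpass.
    assert buggy_function() == "expected"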

All in all, I don't lean too strongly on either side of the issue and of course I welcome the discussion. :smile:

Perhaps others can chip in? Any thoughts on this @hpk42, @flub, @bubenkoff, @The-Compiler...?

_(Btw, please keep in mind that I only closed the issue because people usually forget to close their own issues; at no point did I want to put an end to the discussion. Hope you understand.)_

I'm not sure really - I agree they should be consistent, but I also see the danger of breaking existing code.

So I searched for code using pytest.xfail, and found this for example:

def test_record_to_file(camera, previewing, mode, filenames_format_options):
    ...
    if resolution[1] > 480 and format == 'mjpeg':
        pytest.xfail('Locks up camera')
    camera.start_recording(filename1, **options)

So it seems people really rely on that behavior, and it was easy to find an example to prove it - so I agree it should be clarified in the documentation, but not changed.

If it is indeed not changed, I would like to request an xfail-like imperative thing that works like the marker.

Maybe add a skip parameter to pytest.xfail which is True by default? That way you could call pytest.xfail(skip=False) if you want that behavior.
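
A hypothetical sketch of what that signature might look like (the skip parameter does not exist in pytest, and the non-skipping branch is left open because, as the discussion below shows, it is not obvious how to implement it):

def xfail(reason="", skip=True):
    """ xfail an executing test or setup functions with the given reason."""
    __tracebackhide__ = True
    if skip:
        # Current behavior: abort the test immediately via the internal
        # XFailed exception shown above.
        raise XFailed(reason)
    # Hypothetical: record the xfail on the currently running test and
    # return, letting the test body continue executing.
    ...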

If others agree with the idea, would you like to submit a PR? I'm sure it'd be much appreciated, and we're happy to help if you get stuck.

I took a look at the code, and I must admit that I wasn't able to make much sense of it. (In particular how the marking happens and where an xfail mark takes effect.)

Maybe adding a skip parameter which is True by default to pytest.xfail?

I like this suggestion, but I'm having trouble figuring out how one would implement it...

Currently, when pytest.xfail is called, it raises an exception which pytest catches and uses to mark the test as xfailed. All good.

But how could this be implemented in a way that "marks" the test as xfailed yet still continues execution after the call? Mind that pytest.xfail can be called at any time, including from fixtures or other functions, and the fact that pytest.xfail is a global function severely limits our options for obtaining the "test request" at the point of the call.

A thread-local variable maybe that tracks what test request is active? (This would still permit recursion with some care, but no other form of concurrency.)
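
A rough sketch of that thread-local idea (all names here are hypothetical, and a real implementation would also need a reporting hook that consumes the recorded reason):

import threading

_active = threading.local()

def set_active_item(item):
    # Would be called by a runtest hook just before the test body executes.
    _active.item = item

def get_active_item():
    # Returns the item for the test currently executing in this thread.
    return getattr(_active, "item", None)

def xfail_without_skipping(reason=""):
    # Hypothetical non-aborting xfail: record the reason and keep running.
    item = get_active_item()
    if item is None:
        raise RuntimeError("no test is currently active in this thread")
    # A reporting hook would later turn a failure into an xfail
    # (or a pass into an xpass) based on this attribute.
    item._imperative_xfail_reason = reason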

I have to admit I didn't think about the implementation yet :laughing:.

What @inducer proposed or ugly hacks using sys._getframe are about the only things coming to mind...

Marks mark and the functions raise exceptions; I think that is consistent enough.
For experimenting with something more dynamic about outcome control, I suggest writing a plugin and using a fixture.

This issue has died down after getting a bit out of hand. We should open to-the-point issues for the documentation fix and close off the rest after a waiting period.
