Pytest: approx with tuple containing None

Created on 4 Oct 2018  路  15Comments  路  Source: pytest-dev/pytest

Hi, I just updated pytest from 3.2.1 to 3.8.1 and it breaks some of my tests.

The problem seems to be that the behavior of approx changed. I'm using it with tuples that contain a mix of floats and None values. In 3.2.1, this worked just fine, but in 3.8.1 I get an error. Minimal example:

pytest 3.2.1

C:\Users\felix>python
Python 3.6.5 | packaged by conda-forge | (default, Apr  6 2018, 16:13:55) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from pytest import approx
>>> (1, 2, None) == approx((1.000000001, 2, None))
True

pytest 3.8.1

C:\Users\felix>python
Python 3.7.0 (default, Jun 28 2018, 08:04:48) [MSC v.1912 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from pytest import approx
>>> (1, 2, None) == approx((1.000000001, 2, None))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\felix\AppData\Local\conda\conda\envs\py37\lib\site-packages\_pytest\python_api.py", line 528, in approx
    return cls(expected, rel, abs, nan_ok)
  File "C:\Users\felix\AppData\Local\conda\conda\envs\py37\lib\site-packages\_pytest\python_api.py", line 64, in __init__
    self._check_type()
  File "C:\Users\felix\AppData\Local\conda\conda\envs\py37\lib\site-packages\_pytest\python_api.py", line 215, in _check_typ
    self.expected, at="index {}".format(index)
TypeError: cannot make approximate comparisons to non-numeric values: (1.000000001, 2, None)  at index 2

Is this an intended change? If so, I'd like know if there's a better way to compare tuples like these.

Thanks in advance.

approx regression

Most helpful comment

I'd be interested in implementing the following changes (as discussed above):

  1. Allow non-Number elements in sequences / mappings. For those elements, check for exact equality only.
  2. Add support for nested sequences / mappings (e.g. [1, 2, [3, 4]] == approx([1, 2, [3, 4]]))

I already looked through the tests, and the changes above would only make test_expected_value_type_error fail. This test ensures that a type error is raised with inputs like nested sequences / mappings or sequences / mappings containing non-Number elements.

Therefore, I would consider the proposed changes to be backwards-compatible (expands use cases of approx without impacting existing ones).

I'd like to get a short "Go" from you guys before really commiting to this. Thanks in advance ;-)

All 15 comments

i beleive this was unintended, but none was not originally supposed to be supported

When dealing with sequences/mappings, I think it would be best if approx worked like this:

  • for each (non-sequence/mapping) object within the sequence/mapping, first test for equality
  • if the object is not equal but of type float, test if it's within the allowed approximation range
  • if none of the two cases above are satisfied, the assertion fails

This would make approx work like the usual comparison of containers, except that for floats it also allows very close (but not equal) numbers. If two objects are equal, they are always also approximatively equal, and it would be nice if approx worked like this.

What do you think?

its really easy to go wrong there by allowing too much or too little,

we should prepare an exemplar test for the desired behavior and enable it
its really tricky to properly declare the type-matching behavior, since "numeric collections" are required to approximate as well as quasi-numbers, even integers

I'm surprised approx supports sequences at all! Also it seems broken, it completely discards the type:

import pytest


def f_under_test():
    return (1, 2, 3)


def test():
    # expected an assertion error, the lhs is a tuple, the rhs is a list
    assert f_under_test() == pytest.approx([1, 2, 3.0000001])
$ pytest t.py 
============================= test session starts ==============================
platform linux -- Python 3.6.5, pytest-3.8.2, py-1.6.0, pluggy-0.7.1
rootdir: /tmp/t, inifile:
collected 1 item                                                               

t.py .                                                                   [100%]

=========================== 1 passed in 0.02 seconds ===========================

if it were me, I'd completely deprecate anything but numbers being allowed as an argument to approx

approx has a collections concept that is fuzzy, we should put a better lens on it there

if it were me, I'd completely deprecate anything but numbers being allowed as an argument to approx

I disagree, being able to compare sequence of numbers using approx is very useful.

But I agree it should be more strict with the sequence type, not considering tuples and lists equal.

@kalekundert @tadeu what's your opinion on this?

Regarding @fkohlgrueber's original suggestion, I can see that it would be convenient to check for exact equality (or identity even) before checking types. This behavior would be incompatible with #3473, but personally I think it would be more useful than erroring immediately when given non-float data.

It may be worth pointing out that pandas uses NaN (not None) to represent missing data, and this use-case is already supported by the nan_ok option. This also side-steps the type issue, since NaN is still a float.

I don't see how more strictly checking container type would be useful, and I can imagine some cases where it'd be annoying (e.g. if you want to check a function that returns a numpy array without having to make a numpy array yourself). So I'm opposed to that.

Hi @kalekundert thanks for sharing your thoughts.

I don't see how more strictly checking container type would be useful, and I can imagine some cases where it'd be annoying (e.g. if you want to check a function that returns a numpy array without having to make a numpy array yourself). So I'm opposed to that.

Not sure, I think implicitly considering lists and tuples equal will be surprising for most users, because lists and tuples are considered different even if they have the same contents.

I understand that in most cases one might not care about this, but then it is easy to be explicit about it:

assert list(func_that_returns_numpy_array()) == pytest.approx([100, 250, 300])

I think the most important thing in this thread is to establish if there's a consensus on whether we should test for exact equality before doing any type checks. I'm leaning towards this being a good idea. It would be convenient, and I can't think of any way in which it would produce surprising results.

With regards to comparing lists and tuples, I can understand where you're coming from, because I agree that it's more strictly correct to consider lists and tuples to be unequal. But I still think this is a case where the practicality outweighs correctness, for many reasons:

  • The purpose of approx() is to make comparing floats easier, and abstracting various containers in a consistent way advances this purpose. Really approx() just thinks of things as either "iterable" or "scalar" (or "mapping", to be complete). The documentation could be more clear about this, though.
  • The proposed change would make a common use-case (comparing multiple floats) more difficult for the benefit of an extremely rare use-case (wanting to test whether you're getting a tuple or a list).
  • The proposed change also has the potential to break a lot of existing code, which in my opinion should make it a non-starter all on its own.
  • Mimicking the behavior of the == operator for the underlying types isn't a desirable goal in general. For example, equality comparisons with numpy arrays always result in boolean arrays, but approx() implicitly condenses these arrays into a single scalar for compatibility with assert. If comparisons between lists and tuples are surprising, wouldn't the same logic dictate that numpy arrays comparing as scalars is also surprising?
  • The type-cast you have above only works for one-dimensional arrays. (It also requires copying the array, which is a performance issue in general, but probably not so much in tests.) In other words, the proposed change would require the user to think about corner-cases that we've already thought about and handled.

If we want to keep discussing this, we should probably open a new issue. But I just don't see the benefit, and I see a lot of costs.

If we want to keep discussing this, we should probably open a new issue. But I just don't see the benefit, and I see a lot of costs.

Excellent summary @kalekundert, many thanks! Given your points, I now agree that being strict about the types is probably not a good idea. 馃憤

It would be awesome to be able to do this:

expected = {'a': 1.0, 'b': {'x': 2.0, 'y': 3.0, 'z': nan, 'w': None, 'h': 'something'}, 'c': [[4.0, 5.0], [None, 6.0]], 'd': np.array(...)}
obtained = {'a': 1.1, 'b': {'x': 2.1, 'y': 3.1, 'z': nan, 'w': None, 'h': 'something'}, 'c': [[4.1, 5.1], [None, 6.1]], 'd': np.array(...)}

assert obtained == approx(expected, rel=0.2)  # Pass

I don't care much about strict types if it would help to implement comparisons like that, i.e., this could also pass:

obtained = {'a': 1.1, 'b': {'x': 2.1, 'y': 3.1, 'z': nan, 'w': None, 'h': 'something'}, 'c': ((4.1, 5.1), (None, 6.1)), 'd': list(...)}

Things that would be tricky would be set's and dict keys, those could be left unsupported because of ambiguities of comparing, for example, {1.0} with {1.000000001, 1.000000002}.

Related: #3164

I'd be interested in implementing the following changes (as discussed above):

  1. Allow non-Number elements in sequences / mappings. For those elements, check for exact equality only.
  2. Add support for nested sequences / mappings (e.g. [1, 2, [3, 4]] == approx([1, 2, [3, 4]]))

I already looked through the tests, and the changes above would only make test_expected_value_type_error fail. This test ensures that a type error is raised with inputs like nested sequences / mappings or sequences / mappings containing non-Number elements.

Therefore, I would consider the proposed changes to be backwards-compatible (expands use cases of approx without impacting existing ones).

I'd like to get a short "Go" from you guys before really commiting to this. Thanks in advance ;-)

@fkohlgrueber proposal looks good!

Seconded, let me know if you have any questions about how the code works.

Was this page helpful?
0 / 5 - 0 ratings