Hi, I just updated pytest from 3.2.1 to 3.8.1 and it breaks some of my tests.
The problem seems to be that the behavior of approx changed. I'm using it with tuples that contain a mix of floats and None values. In 3.2.1, this worked just fine, but in 3.8.1 I get an error. Minimal example:
C:\Users\felix>python
Python 3.6.5 | packaged by conda-forge | (default, Apr 6 2018, 16:13:55) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from pytest import approx
>>> (1, 2, None) == approx((1.000000001, 2, None))
True
C:\Users\felix>python
Python 3.7.0 (default, Jun 28 2018, 08:04:48) [MSC v.1912 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from pytest import approx
>>> (1, 2, None) == approx((1.000000001, 2, None))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\felix\AppData\Local\conda\conda\envs\py37\lib\site-packages\_pytest\python_api.py", line 528, in approx
return cls(expected, rel, abs, nan_ok)
File "C:\Users\felix\AppData\Local\conda\conda\envs\py37\lib\site-packages\_pytest\python_api.py", line 64, in __init__
self._check_type()
File "C:\Users\felix\AppData\Local\conda\conda\envs\py37\lib\site-packages\_pytest\python_api.py", line 215, in _check_typ
self.expected, at="index {}".format(index)
TypeError: cannot make approximate comparisons to non-numeric values: (1.000000001, 2, None) at index 2
Is this an intended change? If so, I'd like know if there's a better way to compare tuples like these.
Thanks in advance.
i beleive this was unintended, but none was not originally supposed to be supported
When dealing with sequences/mappings, I think it would be best if approx worked like this:
This would make approx work like the usual comparison of containers, except that for floats it also allows very close (but not equal) numbers. If two objects are equal, they are always also approximatively equal, and it would be nice if approx worked like this.
What do you think?
its really easy to go wrong there by allowing too much or too little,
we should prepare an exemplar test for the desired behavior and enable it
its really tricky to properly declare the type-matching behavior, since "numeric collections" are required to approximate as well as quasi-numbers, even integers
I'm surprised approx supports sequences at all! Also it seems broken, it completely discards the type:
import pytest
def f_under_test():
return (1, 2, 3)
def test():
# expected an assertion error, the lhs is a tuple, the rhs is a list
assert f_under_test() == pytest.approx([1, 2, 3.0000001])
$ pytest t.py
============================= test session starts ==============================
platform linux -- Python 3.6.5, pytest-3.8.2, py-1.6.0, pluggy-0.7.1
rootdir: /tmp/t, inifile:
collected 1 item
t.py . [100%]
=========================== 1 passed in 0.02 seconds ===========================
if it were me, I'd completely deprecate anything but numbers being allowed as an argument to approx
approx has a collections concept that is fuzzy, we should put a better lens on it there
if it were me, I'd completely deprecate anything but numbers being allowed as an argument to approx
I disagree, being able to compare sequence of numbers using approx is very useful.
But I agree it should be more strict with the sequence type, not considering tuples and lists equal.
@kalekundert @tadeu what's your opinion on this?
Regarding @fkohlgrueber's original suggestion, I can see that it would be convenient to check for exact equality (or identity even) before checking types. This behavior would be incompatible with #3473, but personally I think it would be more useful than erroring immediately when given non-float data.
It may be worth pointing out that pandas uses NaN (not None) to represent missing data, and this use-case is already supported by the nan_ok option. This also side-steps the type issue, since NaN is still a float.
I don't see how more strictly checking container type would be useful, and I can imagine some cases where it'd be annoying (e.g. if you want to check a function that returns a numpy array without having to make a numpy array yourself). So I'm opposed to that.
Hi @kalekundert thanks for sharing your thoughts.
I don't see how more strictly checking container type would be useful, and I can imagine some cases where it'd be annoying (e.g. if you want to check a function that returns a numpy array without having to make a numpy array yourself). So I'm opposed to that.
Not sure, I think implicitly considering lists and tuples equal will be surprising for most users, because lists and tuples are considered different even if they have the same contents.
I understand that in most cases one might not care about this, but then it is easy to be explicit about it:
assert list(func_that_returns_numpy_array()) == pytest.approx([100, 250, 300])
I think the most important thing in this thread is to establish if there's a consensus on whether we should test for exact equality before doing any type checks. I'm leaning towards this being a good idea. It would be convenient, and I can't think of any way in which it would produce surprising results.
With regards to comparing lists and tuples, I can understand where you're coming from, because I agree that it's more strictly correct to consider lists and tuples to be unequal. But I still think this is a case where the practicality outweighs correctness, for many reasons:
approx() is to make comparing floats easier, and abstracting various containers in a consistent way advances this purpose. Really approx() just thinks of things as either "iterable" or "scalar" (or "mapping", to be complete). The documentation could be more clear about this, though.== operator for the underlying types isn't a desirable goal in general. For example, equality comparisons with numpy arrays always result in boolean arrays, but approx() implicitly condenses these arrays into a single scalar for compatibility with assert. If comparisons between lists and tuples are surprising, wouldn't the same logic dictate that numpy arrays comparing as scalars is also surprising?If we want to keep discussing this, we should probably open a new issue. But I just don't see the benefit, and I see a lot of costs.
If we want to keep discussing this, we should probably open a new issue. But I just don't see the benefit, and I see a lot of costs.
Excellent summary @kalekundert, many thanks! Given your points, I now agree that being strict about the types is probably not a good idea. 馃憤
It would be awesome to be able to do this:
expected = {'a': 1.0, 'b': {'x': 2.0, 'y': 3.0, 'z': nan, 'w': None, 'h': 'something'}, 'c': [[4.0, 5.0], [None, 6.0]], 'd': np.array(...)}
obtained = {'a': 1.1, 'b': {'x': 2.1, 'y': 3.1, 'z': nan, 'w': None, 'h': 'something'}, 'c': [[4.1, 5.1], [None, 6.1]], 'd': np.array(...)}
assert obtained == approx(expected, rel=0.2) # Pass
I don't care much about strict types if it would help to implement comparisons like that, i.e., this could also pass:
obtained = {'a': 1.1, 'b': {'x': 2.1, 'y': 3.1, 'z': nan, 'w': None, 'h': 'something'}, 'c': ((4.1, 5.1), (None, 6.1)), 'd': list(...)}
Things that would be tricky would be set's and dict keys, those could be left unsupported because of ambiguities of comparing, for example, {1.0} with {1.000000001, 1.000000002}.
Related: #3164
I'd be interested in implementing the following changes (as discussed above):
[1, 2, [3, 4]] == approx([1, 2, [3, 4]]))I already looked through the tests, and the changes above would only make test_expected_value_type_error fail. This test ensures that a type error is raised with inputs like nested sequences / mappings or sequences / mappings containing non-Number elements.
Therefore, I would consider the proposed changes to be backwards-compatible (expands use cases of approx without impacting existing ones).
I'd like to get a short "Go" from you guys before really commiting to this. Thanks in advance ;-)
@fkohlgrueber proposal looks good!
Seconded, let me know if you have any questions about how the code works.
Most helpful comment
I'd be interested in implementing the following changes (as discussed above):
[1, 2, [3, 4]] == approx([1, 2, [3, 4]]))I already looked through the tests, and the changes above would only make
test_expected_value_type_errorfail. This test ensures that a type error is raised with inputs like nested sequences / mappings or sequences / mappings containing non-Number elements.Therefore, I would consider the proposed changes to be backwards-compatible (expands use cases of approx without impacting existing ones).
I'd like to get a short "Go" from you guys before really commiting to this. Thanks in advance ;-)