Pandas: assert_almost_equal

Created on 3 Jun 2016 · 9 Comments · Source: pandas-dev/pandas

Hi, I'm wondering if the following behavior of pandas._testing.assert_almost_equal is expected:

Code Sample, a copy-pastable example if possible

from pandas import _testing
_testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=True)

Expected Output

Expect no output / no AssertionError

Actual Output

AssertionError                            Traceback (most recent call last)
<ipython-input-199-ace78e82c603> in <module>()
      1 from pandas import _testing
----> 2 _testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=True)

pandas/src/testing.pyx in pandas._testing.assert_almost_equal (pandas/src/testing.c:3887)()

pandas/src/testing.pyx in pandas._testing.assert_almost_equal (pandas/src/testing.c:3653)()

AssertionError: expected 0.00001 but got 0.00001, with decimal 3

Note that the numbers differ at the sixth decimal place, yet the output suggests they differ at decimal 3.
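The identical-looking values in the message are probably a formatting effect. A minimal sketch, assuming the message prints both floats with five decimal places (the `%.5f` format is an assumption inferred from the output above, not confirmed from the source):

```python
# Assumption: the assertion message formats each value with "%.5f".
expected, actual = 0.000011, 0.000012

shown_expected = "%.5f" % expected
shown_actual = "%.5f" % actual

# Both values collapse to the same string at five decimals,
# which would explain "expected 0.00001 but got 0.00001".
print(shown_expected, shown_actual)

# With seven decimals the difference becomes visible.
wide_expected = "%.7f" % expected
wide_actual = "%.7f" % actual
print(wide_expected, wide_actual)
```

If that assumption holds, the message hides the difference rather than reporting where the values diverge.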

output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Darwin
OS-release: 15.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: None
pip: 8.0.2
setuptools: 21.2.1
Cython: None
numpy: 1.11.0
scipy: 0.17.1
statsmodels: None
xarray: None
IPython: 4.2.0
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: 2.3.3
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None

Bug Testing

All 9 comments

Hmm, that seems odd. In any event, try with master: you can now pass an integer to check_less_precise (it still breaks, but there is more flexibility).

Hi, thanks for the speedy response! Looking at the master branch, we see the same problem:

https://github.com/pydata/pandas/blob/master/pandas/src/testing.pyx#L197

decimal_almost_equal(1, fb / fa, decimal)

The numbers (fa, fb) are divided and the ratio is compared to 1, which is interesting but not quite the same as just comparing the digits after the decimal point (as the docstring suggests). Perhaps I'm just reading the docs incorrectly.
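To make the difference concrete, here is a small pure-Python sketch of the two comparisons. The helper below is a stand-in written for illustration; its tolerance rule (`0.5 * 10 ** -decimal`) is an assumption about how `decimal_almost_equal` behaves, not a copy of the pandas source.

```python
def decimal_almost_equal(desired, actual, decimal):
    # Hypothetical stand-in: treat values as equal when they differ
    # by less than half a unit in the given decimal place (assumed rule).
    return abs(desired - actual) < 0.5 * 10.0 ** -decimal


fa, fb = 0.000011, 0.000012

# Ratio form quoted above: fb / fa is about 1.09, far from 1.
ratio_ok = decimal_almost_equal(1, fb / fa, 3)

# Direct form: the values differ by about 1e-6, well inside 0.0005.
direct_ok = decimal_almost_equal(fa, fb, 3)

# Even one decimal is too strict for the ratio form (0.09 > 0.05).
ratio_ok_one_decimal = decimal_almost_equal(1, fb / fa, 1)

print(ratio_ok, direct_ok, ratio_ok_one_decimal)
```

Under these assumptions the ratio form rejects the pair at every precision tried in this thread, while the direct form accepts it.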

+1 to always compare using specified precision (not fb / fa).

The problem is still present. Here are some examples based on the first comment:

````

import pandas as pd
pd._libs.testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=True)
Traceback (most recent call last):
File "C:\Anaconda3\envs\worker\lib\site-packages\IPython\core\interactiveshell.py", line 2862, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
pd._libs.testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=True)
File "pandas/_libs/testing.pyx", line 59, in pandas._libs.testing.assert_almost_equal (pandas_libs\testing.c:4156)
File "pandas/_libs/testing.pyx", line 209, in pandas._libs.testing.assert_almost_equal (pandas_libs\testing.c:3863)
AssertionError: expected 0.00001 but got 0.00001, with decimal 3

pd._libs.testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=2)
Traceback (most recent call last):
File "C:\Anaconda3\envs\worker\lib\site-packages\IPython\core\interactiveshell.py", line 2862, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
pd._libs.testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=2)
File "pandas/_libs/testing.pyx", line 59, in pandas._libs.testing.assert_almost_equal (pandas_libs\testing.c:4156)
File "pandas/_libs/testing.pyx", line 209, in pandas._libs.testing.assert_almost_equal (pandas_libs\testing.c:3863)
AssertionError: expected 0.00001 but got 0.00001, with decimal 2

pd._libs.testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=1)
Traceback (most recent call last):
File "C:\Anaconda3\envs\worker\lib\site-packages\IPython\core\interactiveshell.py", line 2862, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
pd._libs.testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=1)
File "pandas/_libs/testing.pyx", line 59, in pandas._libs.testing.assert_almost_equal (pandas_libs\testing.c:4156)
File "pandas/_libs/testing.pyx", line 209, in pandas._libs.testing.assert_almost_equal (pandas_libs\testing.c:3863)
AssertionError: expected 0.00001 but got 0.00001, with decimal 1
````

I don't know how to make this work. In addition, I find the message a little weird.

> The problem is still present:

And this is an open issue. You are welcome to submit a PR to fix it.

I would like to fix it, but I'm not 100% sure that I understand the rationale of the code.

When using

decimal_almost_equal(1, fb / fa, decimal)

instead of

decimal_almost_equal(fa, fb, decimal)

From my understanding, the idea is that the _comparison precision_ (represented by decimal here) is expressed relative to 1 and is scaled along with the numbers being compared. I guess the goal is to let the user provide a relative precision that works with numbers spanning very different ranges.
While interesting, this is a bit confusing and is not clearly specified in the documentation.
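A quick way to see the scaling effect is to run both forms over pairs with the same ratio but very different magnitudes. This is only a sketch: the helper names are hypothetical and the tolerance rule is assumed, as it is not taken from the pandas source.

```python
def decimal_almost_equal(desired, actual, decimal):
    # Assumed tolerance rule: half a unit in the given decimal place.
    return abs(desired - actual) < 0.5 * 10.0 ** -decimal


def relative_check(fa, fb, decimal):
    # Ratio form: the precision is measured against 1.
    return decimal_almost_equal(1, fb / fa, decimal)


def absolute_check(fa, fb, decimal):
    # Direct form: the precision is measured on the values themselves.
    return decimal_almost_equal(fa, fb, decimal)


# Three pairs with the same ratio (about 1.09) at different scales.
pairs = [(0.000011, 0.000012), (1.1, 1.2), (1100000.0, 1200000.0)]

relative_results = [relative_check(a, b, 3) for a, b in pairs]
absolute_results = [absolute_check(a, b, 3) for a, b in pairs]

print(relative_results)
print(absolute_results)
```

The ratio form gives the same answer at every scale, while the direct form depends on the magnitude of the inputs, which is the trade-off being discussed here.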

Should we update the documentation or the code? I would vote to change the code. The whole point of providing the precision is that the user _knows_ the correct epsilon to use with the values being compared.
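As an illustration of what a user-chosen epsilon looks like, here is a sketch using the standard library's `math.isclose`. This is not the pandas function and not a proposed patch; the tolerance values are picked only for the example.

```python
import math

fa, fb = 0.000011, 0.000012

# rel_tol=0.0 disables the relative part, so only abs_tol decides.
# The values differ by about 1e-6.
loose = math.isclose(fa, fb, rel_tol=0.0, abs_tol=1e-5)
tight = math.isclose(fa, fb, rel_tol=0.0, abs_tol=1e-7)

print(loose, tight)
```

With an explicit absolute tolerance, the outcome depends only on the epsilon the caller picked, not on the magnitude of the inputs.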

Did I understand the rationale of the code correctly? Are you OK with a PR to change the code?

@NewbiZ : PR changes are welcome so long as they fix the issue AND do not break existing tests! :smile:

@jreback I'm sorry, maybe I did not pick the right words. I just wanted to leave a reminder and add a couple of things; it wasn't a complaint at all. You're right: I tried to fix it, but it was difficult for me to understand the reasoning behind some parts of this function.

Best,

take
