Pandas: Missing data and np.seterr(all='raise'): Viewing the missing yields FloatingPointError

Created on 26 Feb 2016 · 4 Comments · Source: pandas-dev/pandas

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd
np.seterr(all='raise')  # turn every floating-point warning into an exception

s = pd.Series([np.nan, np.nan, np.nan], index=[1, 2, 3])
print(s)
print(s.head())

Expected Output

Certainly not a FloatingPointError:
FloatingPointError: invalid value encountered in greater.

The issue appears to lie in numpy, as

np.array([np.nan, np.nan]) > 1e8

also raises the error. I have cross-posted the issue there, but thought you would also want to be aware of it.
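
Until this is resolved, one possible workaround is to suppress only the 'invalid' category around the comparison itself. The following is a minimal sketch, not something from pandas or numpy: the helper name compare_ignoring_invalid is hypothetical, and it assumes np.errstate is acceptable as a local override of the global np.seterr setting.

```python
import numpy as np


def compare_ignoring_invalid(values, threshold):
    """Return values > threshold with 'invalid' FP errors ignored locally.

    Hypothetical helper for illustration only.
    """
    # np.errstate overrides the global setting only inside this block.
    with np.errstate(invalid='ignore'):
        return np.asarray(values, dtype=float) > threshold


# Mimic the setup from the report, then restore the previous settings.
old_settings = np.seterr(all='raise')
try:
    result = compare_ignoring_invalid([np.nan, np.nan], 1e8)
finally:
    np.seterr(**old_settings)

print(result)
```

Any comparison against NaN evaluates to False, so the result carries no information that was lost by ignoring the error.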

output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Linux
OS-release: 3.19.0-49-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.1
nose: None
pip: 8.0.3
setuptools: 20.1.1
Cython: None
numpy: 1.10.4
scipy: 0.16.0
statsmodels: None
IPython: 4.0.1
sphinx: None
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: None
numexpr: 2.5
matplotlib: 1.5.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: None

Missing-data Usage Question

All 4 comments

We could pretty easily wrap the __repr__s in a context manager that disables the np.seterr, but I wonder how many others will crop up. They come about pretty naturally as part of index alignment.
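
A rough sketch of what such a wrapper might look like, assuming np.errstate is the context manager meant here. The decorator ignore_fp_errors and the class Demo are hypothetical stand-ins for illustration, not pandas code:

```python
import functools

import numpy as np


def ignore_fp_errors(func):
    """Run func with all numpy floating-point errors ignored (hypothetical)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with np.errstate(all='ignore'):
            return func(*args, **kwargs)
    return wrapper


class Demo:
    """Stand-in for an object whose repr performs NaN-producing arithmetic."""

    def __init__(self, values):
        self.values = np.asarray(values, dtype=float)

    @ignore_fp_errors
    def __repr__(self):
        ratio = self.values / self.values  # 0/0 sets the 'invalid' flag
        n_nan = int(np.isnan(ratio).sum())
        return "Demo(%d values, %d NaN)" % (ratio.size, n_nan)


# Even with errors set to 'raise' by the caller, the repr does not raise.
with np.errstate(all='raise'):
    text = repr(Demo([0.0, 1.0]))

print(text)
```

This only covers the methods that are explicitly decorated, which is the concern raised above: other code paths such as index alignment would each need the same treatment.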

So we explicitly set:

In [5]: np.seterr(all='ignore')
Out[5]: {'divide': 'raise', 'invalid': 'raise', 'over': 'raise', 'under': 'raise'}

in pandas/compat/numpy_compat.py to remove all of these issues.
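
For anyone who wants their own settings back after such an override, here is a small sketch that relies only on the fact that np.seterr returns the settings it replaced, as the Out[5] line above shows. The second np.seterr call merely simulates what this comment describes; it does not import pandas, and the variable names are illustrative.

```python
import numpy as np

# Settings in effect before the user's own choice (kept for cleanup).
previous = np.seterr(all='raise')

# Simulate a library switching everything to 'ignore' at import time.
# The return value is the dict that was just overwritten.
overwritten = np.seterr(all='ignore')
print(overwritten)

# Re-apply the user's choice and confirm it took effect.
np.seterr(**overwritten)
reapplied = np.geterr()
print(reapplied)

# Restore whatever was configured before this sketch ran.
np.seterr(**previous)
```

Re-applying the settings brings back the original FloatingPointError in the affected code paths, so this is only useful where that strictness is actually wanted.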

I suppose you could doc it, but it would prob be hard to find. I only recall this happening 1 or 2 times in the past, so not sure it's much of an issue.

So pandas is quietly overwriting each user's numpy error behavior? I think that is something that should be indeed documented. I was searching for this issue for an hour and didn't find anything, before asking on stackoverflow and finally ending up here.

@eXcuvator well, if you want to add it to the documentation, then a pull request would be fine.

The point is all of these errors are irrelevant and converted to NaN as appropriate. That is the point.
