Pandas: Missing data and np.seterr(all='raise'): Viewing the missing yields FloatingPointError

Created on 26 Feb 2016 · 4 Comments · Source: pandas-dev/pandas

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd
np.seterr(all='raise')

s = pd.Series([np.nan, np.nan, np.nan], index=[1, 2, 3])
print(s)
print(s.head())

Expected Output

Certainly not a FloatingPointError:
FloatingPointError: invalid value encountered in greater.

The issue appears to lie in numpy, as

np.array([np.nan, np.nan]) > 1e8

also raises the error. I have cross-posted the issue there, but thought you would also want to be aware of it.
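As a user-side workaround, the error state can be relaxed only around the comparison, leaving the strict global setting in place elsewhere. The following is a minimal sketch using `np.errstate`. The variable names are illustrative, and whether the bare comparison raises at all may depend on the NumPy version (the report above is for 1.10.4).

```python
import numpy as np

values = np.array([np.nan, np.nan])

# Outer context mimics the strict setting from the report.
with np.errstate(all="raise"):
    # Inner context relaxes only the 'invalid' category for this comparison.
    with np.errstate(invalid="ignore"):
        mask = values > 1e8

# Comparisons against NaN are always False.
print(mask.tolist())
```

Everything outside the inner `with` block still raises on floating-point errors.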

output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Linux
OS-release: 3.19.0-49-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.1
nose: None
pip: 8.0.3
setuptools: 20.1.1
Cython: None
numpy: 1.10.4
scipy: 0.16.0
statsmodels: None
IPython: 4.0.1
sphinx: None
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: None
numexpr: 2.5
matplotlib: 1.5.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: None

Missing-data Usage Question

Source

eXcuvator

Most helpful comment

So pandas is quietly overwriting each user's numpy error behavior? I think that is something that should indeed be documented. I searched for this issue for an hour and didn't find anything before asking on Stack Overflow and finally ending up here.

eXcuvator on 27 Feb 2016

👍2

All 4 comments

We could pretty easily wrap the __repr__s in a context manager that disables the np.seterr, but I wonder how many others will crop up. They come about pretty naturally as part of index alignment.
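A rough sketch of that idea, not the pandas implementation: `NoisyRepr` and `quiet_repr` are hypothetical names, and the class only stands in for an object whose `__repr__` performs floating-point work that can trip the 'invalid' flag.

```python
import numpy as np


class NoisyRepr:
    """Hypothetical object whose repr does float math that can produce NaN."""

    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)

    def __repr__(self):
        # 0.0 / 0.0 sets the 'invalid' floating-point flag.
        ratio = self.data / self.data
        undefined = int(np.isnan(ratio).sum())
        return "NoisyRepr(%d values, %d undefined)" % (ratio.size, undefined)


def quiet_repr(obj):
    """Build repr(obj) with NumPy floating-point errors ignored."""
    with np.errstate(all="ignore"):
        return repr(obj)


obj = NoisyRepr([0.0, 1.0])
with np.errstate(all="raise"):
    text = quiet_repr(obj)  # succeeds despite the strict outer setting
    try:
        repr(obj)  # unprotected repr trips the strict setting
        unprotected_raised = False
    except FloatingPointError:
        unprotected_raised = True

print(text)
```

The context manager restores the caller's settings on exit, so the user's `np.seterr` choice is untouched outside the repr call.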

TomAugspurger on 26 Feb 2016

So we explicitly set:

In [5]: np.seterr(all='ignore')
Out[5]: {'divide': 'raise', 'invalid': 'raise', 'over': 'raise', 'under': 'raise'}

in pandas/compat/numpy_compat.py to remove all of these issues.
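Note that `np.seterr` returns the settings that were active before the call, which is why the output above shows 'raise'. A small sketch of that behavior, using only `np.seterr` and `np.geterr` (the variable names are illustrative):

```python
import numpy as np

original = np.geterr()  # remember whatever was active before this example

np.seterr(all="raise")               # the user's strict setting
previous = np.seterr(all="ignore")   # returns the settings before this call
current = np.geterr()                # settings now in effect

print(previous)
print(current)

np.seterr(**original)  # put the starting state back
```

Because the call is global, any later `np.seterr(all='ignore')` silently replaces a stricter setting made earlier in the same process.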

I suppose you could document it, but it would probably be hard to find. I only recall this happening 1 or 2 times in the past, so I'm not sure it's much of an issue.

jreback on 27 Feb 2016


@eXcuvator well, if you want to add it to the documentation, then a pull request would be fine.

The point is all of these errors are irrelevant and converted to NaN as appropriate. That is the point.
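To illustrate the behavior being described, here is a minimal NumPy-only sketch (it does not go through pandas) of what happens when the errors are ignored: the operations complete and yield NaN or infinity according to IEEE 754 rules.

```python
import numpy as np

numerator = np.array([0.0, 1.0, -1.0])
denominator = np.zeros(3)

with np.errstate(all="ignore"):
    # 0/0 gives NaN; nonzero/0 gives signed infinity; nothing is raised.
    result = numerator / denominator

print(result)
```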

jreback on 27 Feb 2016
