Missing data and np.seterr(all='raise'): Viewing the missing yields FloatingPointError #12464

Closed
eXcuvator opened this issue Feb 26, 2016 · 4 comments
Labels
Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Usage Question

Comments

@eXcuvator

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd
np.seterr(all='raise')

s = pd.Series([np.nan, np.nan, np.nan], index=[1, 2, 3])
print(s)
print(s.head())

Expected Output

Certainly not a FloatingPointError:
FloatingPointError: invalid value encountered in greater.

The issue appears to lie in numpy, as

np.array([np.nan, np.nan]) > 1e8

also raises the error. I have cross-posted the issue there, but thought you guys also would want to be aware of this.

output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Linux
OS-release: 3.19.0-49-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.1
nose: None
pip: 8.0.3
setuptools: 20.1.1
Cython: None
numpy: 1.10.4
scipy: 0.16.0
statsmodels: None
IPython: 4.0.1
sphinx: None
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: None
numexpr: 2.5
matplotlib: 1.5.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: None

@TomAugspurger TomAugspurger added the Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate label Feb 26, 2016
@TomAugspurger
Contributor

We could pretty easily wrap the __repr__s in a context manager that disables the np.seterr, but I wonder how many others will crop up. They come about pretty naturally as part of index alignment.
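A minimal sketch of the context-manager idea, using NumPy's `np.errstate`. The names `quiet_repr` and `Noisy` are hypothetical and only illustrate the pattern; this is not how pandas implements its `__repr__` methods.

```python
import numpy as np


def quiet_repr(obj):
    """Hypothetical helper: build repr(obj) with NumPy floating-point errors ignored."""
    with np.errstate(all="ignore"):
        return repr(obj)


class Noisy:
    """Hypothetical stand-in for an object whose repr does NaN-producing arithmetic."""

    def __repr__(self):
        # 0.0 / 0.0 is an invalid floating-point operation and yields NaN.
        value = np.array([0.0]) / np.array([0.0])
        return "Noisy(%s)" % value[0]


# Simulate a user who asked NumPy to raise on every floating-point error.
with np.errstate(all="raise"):
    try:
        repr(Noisy())
        plain_repr_failed = False
    except FloatingPointError:
        plain_repr_failed = True

    # The wrapped call succeeds because the error state is relaxed only inside it.
    text = quiet_repr(Noisy())

    # The user's strict settings are back in force after the helper returns.
    settings_after = np.geterr()
```

Because `np.errstate` restores the previous settings on exit, the user's `'raise'` configuration is untouched outside the wrapped call.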

@jreback
Contributor

jreback commented Feb 27, 2016

So we explicitly set:

In [5]: np.seterr(all='ignore')
Out[5]: {'divide': 'raise', 'invalid': 'raise', 'over': 'raise', 'under': 'raise'}

in pandas/compat/numpy_compat.py to remove all of these issues.
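As a side note on the session above: `np.seterr` returns the settings that were in effect before the call, which is why `Out[5]` shows `'raise'` even though the call switches everything to `'ignore'`. A small sketch of that behavior, independent of pandas (the variable names are illustrative):

```python
import numpy as np

# Remember whatever was configured before, then mimic the reporter's setup.
previous = np.seterr(all="raise")

# This call switches to 'ignore' but returns the old ('raise') settings.
returned = np.seterr(all="ignore")

# The settings now in effect.
current = np.geterr()

# Put back the configuration we started with.
np.seterr(**previous)
```

To see the active settings rather than the previous ones, call `np.geterr()` after the `np.seterr` call.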

I suppose you could document it, but it would probably be hard to find. I only recall this happening once or twice in the past, so I'm not sure it's much of an issue.

@eXcuvator
Author

So pandas is quietly overwriting each user's numpy error behavior? I think that is something that should indeed be documented. I searched for this issue for an hour and didn't find anything, before asking on Stack Overflow and finally ending up here.

@jreback
Contributor

jreback commented Feb 27, 2016

@eXcuvator well, if you want to add it to the documentation, then a pull request would be fine.

The point is all of these errors are irrelevant and converted to NaN as appropriate. That is the point.
