BUG: Series.resample across daylight saving boundary causes segfault. #9468

Closed
jeremywhelchel opened this issue Feb 11, 2015 · 7 comments
Labels
Bug Resample resample method Timezones Timezone data dtype
Comments

@jeremywhelchel

The following snippet causes a segfault. It fails here at HEAD, but I've also seen it fail in the 0.15.2 release, with both NumPy 1.7.1 and NumPy 1.9.1.
Interestingly, this doesn't happen with pandas 0.13.1.

>>> import pandas
>>> import pytz
>>> pandas.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.16.0-30-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.15.2-182-gbb9c311
nose: None
Cython: 0.20.1post0
numpy: 1.9.1
scipy: None
statsmodels: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.4.0
pytz: 2013b
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: None
pymysql: None
psycopg2: None
>>> LA_TZ = pytz.timezone('America/Los_Angeles')
>>> 
>>> s = pandas.Series({
...     pandas.Timestamp('2010-1-1', tz=LA_TZ): 1,
...     pandas.Timestamp('2011-4-1', tz=LA_TZ): 1})
>>> s
2010-01-01 00:00:00-08:00    1
2011-04-01 00:00:00-07:00    1
dtype: int64
>>> s.resample('D', how='max', fill_method='pad')
Segmentation fault (core dumped)

Other observations:

  • The start and end need to straddle a daylight saving time switch. Note UTC-08:00 vs. UTC-07:00 above.
  • Only fails with how=max and fill_method=pad
  • Needs a large enough resampled window: 2011-1-1 to 2011-4-1 won't do it, but 2010-1-1 to 2011-4-1 will.
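
To make the first observation concrete, here is a minimal standard-library sketch. It hard-codes the two UTC offsets shown in the Series output above (an assumption made so that no timezone database is needed) and shows that the span between the two timestamps is not a whole number of 24-hour days. It does not reproduce the segfault; it only illustrates why daily bins built from fixed 24-hour steps cannot line up with local midnight at both ends.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets copied from the Series output above (assumption: no tz database used).
PST = timezone(timedelta(hours=-8))  # 2010-01-01 00:00:00-08:00
PDT = timezone(timedelta(hours=-7))  # 2011-04-01 00:00:00-07:00

start = datetime(2010, 1, 1, tzinfo=PST)
end = datetime(2011, 4, 1, tzinfo=PDT)

# Subtracting aware datetimes measures elapsed absolute time.
span = end - start
whole_days, remainder = divmod(span, timedelta(days=1))

print("elapsed:", span)
print("whole 24h days:", whole_days, "leftover:", remainder)
```

The leftover is nonzero because the two endpoints sit at different UTC offsets, even though both are local midnight.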
@jreback jreback added Can't Repro Bug Resample resample method Timezones Timezone data dtype and removed Can't Repro labels Feb 11, 2015
@jreback jreback added this to the 0.17.0 milestone Feb 11, 2015
@jreback
Contributor

jreback commented Feb 11, 2015

See the linked master issue; the issues listed there are all related.

A pull request from someone on these would be welcome.

@ghost

ghost commented Feb 11, 2015

Are you sure it's the same issue? In this bug's case it's a segfault, and the core dump points at a double free of a pointer deep in Cython-generated code.

@jreback
Contributor

jreback commented Feb 11, 2015

There are about 10 issues linked.
I suspect that, if not the same, they are all pretty closely related;
that's why it's a master issue.
I get that it's a segfault, but the input to the function is invalid in the first place.

@jeremywhelchel
Author

Do we have a way of identifying this invalid input? It seems far preferable to throw a descriptive error than let the process segfault.

@jreback
Contributor

jreback commented Feb 11, 2015

I am all for someone taking this on.

The bin edges are messed up when crossing the DST boundaries, and the input to the grouper is invalid.

You can simply step through this and see.

As I said, there are about 10 linked issues, which boil down to a couple of cases.
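
As a rough illustration of the "descriptive error instead of a segfault" idea raised above, the sketch below validates a list of bin edges before they would be handed to lower-level code. The function name `check_bin_edges`, its parameters, and the use of plain integers as timestamps are all hypothetical; this is not pandas' API, and the actual fix in #9623 may look quite different.

```python
def check_bin_edges(edges, first, last):
    """Hypothetical pre-check: raise ValueError for unusable bin edges.

    edges: sequence of comparable bin boundaries (plain ints here)
    first, last: smallest and largest data timestamps
    """
    if len(edges) < 2:
        raise ValueError("need at least two bin edges")
    # Every edge must be strictly greater than the one before it.
    for left, right in zip(edges, edges[1:]):
        if right <= left:
            raise ValueError(
                f"bin edges not strictly increasing: {left!r} >= {right!r}"
            )
    # The edges must span the whole data range.
    if first < edges[0] or last > edges[-1]:
        raise ValueError("bin edges do not cover the data range")


# Usage: hours since an arbitrary origin, with daily (24-hour) bins.
check_bin_edges([0, 24, 48, 72], first=0, last=71)  # passes silently

try:
    check_bin_edges([0, 24, 48], first=0, last=71)  # last point falls outside
except ValueError as exc:
    print("rejected:", exc)
```

A check like this costs one pass over the edges and turns an invalid grouper input into an ordinary Python exception.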

@rockg
Contributor

rockg commented Feb 12, 2015

The master issue basically has all the fixes laid out. I got stuck fixing the DateOffsets to handle DST in a nice way and then some Nano offset tests were failing and I got frustrated. I will try and put a PR together to at least have a working copy for others to work on as well.

@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 10, 2015
@jreback
Contributor

jreback commented Mar 11, 2015

closed by #9623

@jreback jreback closed this as completed Mar 11, 2015