Skip to content

REF/PERF: PeriodEngine covers narrow type #29038

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
proost opened this issue Oct 16, 2019 · 3 comments
Closed

REF/PERF: PeriodEngine covers narrow type #29038

proost opened this issue Oct 16, 2019 · 3 comments
Labels
Index Related to the Index class or subclasses

Comments

@proost
Copy link
Contributor

proost commented Oct 16, 2019

Code Sample, a copy-pastable example if possible

pi = pd.period_range(start=2000, end=2019, freq="A")

p = pd.Period(2010, freq="A")

In [7]: pi._engine.get_loc(p)                                                   
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-7-cd77985ba7c9> in <module>
----> 1 pi._engine.get_loc(p)

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: Period('2010', 'A-DEC')

Problem description

When i handled #28628, i found PeriodEngine only works for Int type. So PeriodIndex.get_loc compute twice because PeriodEngine.get_loc only accepts Int type.

Suggestion

.get_loc on PeriodIndex computes once. Either expanding PeriodEngine.get_loc accept data type or removeing PeriodEngine.get_loc are ok

Output of pd.show_versions()

[paste the output of pd.show_versions() here below this line]
INSTALLED VERSIONS

commit : None
python : 3.7.1.final.0
python-bits : 64
OS : Linux
OS-release : 5.0.0-31-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : ko_KR.UTF-8
LOCALE : ko_KR.UTF-8

pandas : 0.25.1
numpy : 1.16.1
pytz : 2018.7
dateutil : 2.7.5
pip : 19.3
setuptools : 40.6.3
Cython : 0.29.2
pytest : 5.1.0
hypothesis : None
sphinx : 1.8.2
blosc : None
feather : None
xlsxwriter : 1.1.2
lxml.etree : 4.2.5
html5lib : 1.0.1
pymysql : 0.9.3
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.2.0
pandas_datareader: None
bs4 : 4.8.0
bottleneck : 1.2.1
fastparquet : None
gcsfs : None
lxml.etree : 4.2.5
matplotlib : 3.0.2
numexpr : 2.6.8
odfpy : None
openpyxl : 2.5.12
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.1.0
sqlalchemy : 1.2.15
tables : 3.4.4
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.1.2

@jbrockmendel
Copy link
Member

Does pi.get_loc work as expected? I haven't checked, but it wouldn't surprise me if PeriodEngine isn't supposed to accept a Period object

@proost
Copy link
Contributor Author

proost commented Oct 17, 2019

@jbrockmendel
Yes.

pi._engine.get_loc("2010") 
pi._engine.get_loc(2010)
pi = pd.period_range(start=2000,end=2019,freq="d")
d = datetime.datetime(2010,1,1)
pi._engine.get_loc(d)

all raises exception.
PeriodEngine.get_loc always called when PeriodIndex.get_loccalled. So Why we call function which can do nothing?

And DatetimeEngine.get_loc can find location with Timestamp or datetime object.
So if use PeriodEngine.get_loc , should PeriodEngine handle Period?

What i expect is
if use PeriodEngine.get_locwhen using PeriodIndex.get_loc, fix PeriodEngine.get_loc
or abandon calling PeriodEngine.get_loc when using PeriodIndex.get_loc

@proost
Copy link
Contributor Author

proost commented Oct 24, 2019

Close Issue.
I try to change the code in two way.
First, I try to remove PeriodEngine.get_loc in PeriodIndex.get_loc, so just compute .get_loc once, not several in try .. except statement. But problem is test case change inevitable. I give it up this way.
Second, I try to change PeriodEngine.get_loc to accept "Period" object not only "Period.ordinal". (Original PeriodEngine.get_loc only accept "Period.ordinal") This way is success in that work successfully and pass test cases. However Problem is most cases user don't use "Period" object using PeriodIndex.get_loc And PeriodEngine.get_loc can't accept "intolerance" argument.
Finally, Integrating PeriodIndex.get_loc fails. Close issue

@proost proost reopened this Oct 26, 2019
proost added a commit to proost/pandas that referenced this issue Oct 26, 2019
proost added a commit to proost/pandas that referenced this issue Oct 26, 2019
proost added a commit to proost/pandas that referenced this issue Oct 26, 2019
proost added a commit to proost/pandas that referenced this issue Oct 26, 2019
proost added a commit to proost/pandas that referenced this issue Oct 26, 2019
proost added a commit to proost/pandas that referenced this issue Oct 26, 2019
proost added a commit to proost/pandas that referenced this issue Oct 26, 2019
@jbrockmendel jbrockmendel added Bug Index Related to the Index class or subclasses labels Oct 28, 2019
proost added a commit to proost/pandas that referenced this issue Oct 29, 2019
proost added a commit to proost/pandas that referenced this issue Oct 29, 2019
proost added a commit to proost/pandas that referenced this issue Oct 29, 2019
proost added a commit to proost/pandas that referenced this issue Oct 29, 2019
proost added a commit to proost/pandas that referenced this issue Oct 29, 2019
proost added a commit to proost/pandas that referenced this issue Oct 29, 2019
proost added a commit to proost/pandas that referenced this issue Oct 30, 2019
proost added a commit to proost/pandas that referenced this issue Oct 30, 2019
proost added a commit to proost/pandas that referenced this issue Oct 30, 2019
proost added a commit to proost/pandas that referenced this issue Oct 30, 2019
proost added a commit to proost/pandas that referenced this issue Nov 4, 2019
proost added a commit to proost/pandas that referenced this issue Nov 5, 2019
proost added a commit to proost/pandas that referenced this issue Nov 5, 2019
proost added a commit to proost/pandas that referenced this issue Nov 20, 2019
proost added a commit to proost/pandas that referenced this issue Nov 20, 2019
proost added a commit to proost/pandas that referenced this issue Dec 2, 2019
@jbrockmendel jbrockmendel removed the Bug label Dec 18, 2019
proost added a commit to proost/pandas that referenced this issue Dec 28, 2019
proost added a commit to proost/pandas that referenced this issue Dec 28, 2019
@proost proost changed the title Bug: PeriodEngine doesn't work REF/PERF: PeriodEngine covers narrow type Dec 28, 2019
proost added a commit to proost/pandas that referenced this issue Dec 30, 2019
proost added a commit to proost/pandas that referenced this issue Jan 6, 2020
proost added a commit to proost/pandas that referenced this issue Jan 7, 2020
proost added a commit to proost/pandas that referenced this issue Jan 14, 2020
proost added a commit to proost/pandas that referenced this issue Jan 15, 2020
@proost proost closed this as completed Jan 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Index Related to the Index class or subclasses
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants