Consider extending hashtable-free indexing algorithms for large sorted indexes #14273

Open
ssanderson opened this issue Sep 21, 2016 · 1 comment
Labels
Indexing (Related to indexing on series/frames, not to indexes themselves), Performance (Memory or execution speed performance)

Comments

@ssanderson
Contributor

See discussion in #14266.

IndexEngine.get_loc currently does lookups without building a hash table for large indices, as does DatetimeEngine.__contains__. get_indexer, however, still always populates a table, as does __contains__ on non-datetime indices.
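For readers unfamiliar with the idea, here is a minimal sketch of a hashtable-free lookup on a sorted, unique array, using binary search through `np.searchsorted`. The helper names `sorted_get_loc` and `sorted_contains` are hypothetical and used only for illustration; this is not the actual engine code, just the general technique under the assumption that the values are sorted and unique.

```python
import numpy as np


def sorted_get_loc(values: np.ndarray, key) -> int:
    """Return the position of `key` in a sorted, unique array.

    Uses binary search (O(log n)) instead of building a hash table.
    Hypothetical helper, not part of pandas.
    """
    pos = int(np.searchsorted(values, key, side="left"))
    # searchsorted returns an insertion point, so confirm an exact match.
    if pos == len(values) or values[pos] != key:
        raise KeyError(key)
    return pos


def sorted_contains(values: np.ndarray, key) -> bool:
    """Membership test built on the same binary search."""
    try:
        sorted_get_loc(values, key)
    except KeyError:
        return False
    return True
```

The trade-off is O(log n) per lookup with no extra memory, versus O(1) per lookup after paying the time and memory cost of populating a table.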

@jreback added the Indexing and Performance labels Sep 21, 2016
@jreback added this to the Next Major Release milestone Sep 21, 2016
@mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@jbrockmendel
Member

__contains__ now calls get_loc directly, which avoids creating a hashtable in the same set of cases. get_indexer still always uses a mapping, though.
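As a rough illustration of what a mapping-free get_indexer could look like for a sorted, unique index, the sketch below uses a vectorized binary search. The function name `sorted_get_indexer` is hypothetical, and this is only one possible approach, not how pandas implements get_indexer; it assumes the values are sorted and unique and follows the convention of returning -1 for targets that are not found.

```python
import numpy as np


def sorted_get_indexer(values, targets) -> np.ndarray:
    """Positions of `targets` in sorted, unique `values`; -1 where missing.

    Hypothetical helper, not part of pandas. Costs O(m log n) for m targets
    and allocates no hash table.
    """
    values = np.asarray(values)
    targets = np.asarray(targets)
    if len(values) == 0:
        return np.full(len(targets), -1, dtype=np.intp)
    # Insertion points for every target in one vectorized binary search.
    pos = np.searchsorted(values, targets, side="left")
    # Clip so targets past the end can still be compared safely.
    clipped = np.minimum(pos, len(values) - 1)
    found = values[clipped] == targets
    return np.where(found, clipped, -1).astype(np.intp)
```

Whether this beats a hash table depends on the sizes involved: for a few targets against a very large index the binary search avoids building the table at all, while for many targets the one-time table build may amortize better.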

4 participants