PERF: Implement PeriodArray._unique #23586
Comments
This would also work for DatetimeArray/TimedeltaArray if put into DatetimelikeArrayMixin and the last line were changed to |
@TomAugspurger do you remember why we don't have a base |
Hmm, can we say in general whether factorizing or converting to object is more expensive?
Probably not in general, but my feeling is that keeping track of the codes in factorize likely adds less overhead than doing the work in object mode. For integers:
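A comparison along these lines can be sketched as follows. The array size, value range, and repeat count are arbitrary assumptions, and no particular timings are claimed; the numbers depend on the machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed test data: 100_000 int64 values drawn from 1_000 distinct integers.
rng = np.random.default_rng(0)
ints = rng.integers(0, 1000, size=100_000, dtype=np.int64)
objs = ints.astype(object)  # the same values boxed as Python objects

# unique on the native int64 array
t_unique_int = timeit.timeit(lambda: pd.unique(ints), number=20)
# factorize on the native int64 array (also tracks the codes)
t_factorize_int = timeit.timeit(lambda: pd.factorize(ints), number=20)
# unique in object mode
t_unique_obj = timeit.timeit(lambda: pd.unique(objs), number=20)

print(f"unique(int64):    {t_unique_int:.4f}s")
print(f"factorize(int64): {t_factorize_int:.4f}s")
print(f"unique(object):   {t_unique_obj:.4f}s")
```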
(but so this also clearly shows that for PeriodArray it is worth to explicitly use |
Actually, can't we do something like this as default:

+ def unique(self):
+     from pandas.core.algorithms import unique
+     return self._from_factorized(unique(self._values_for_factorize), self)
Yeah, I think so... I don't think we make any claims about the order of the result (but I think right now unique and the default factorize will preserve it).
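The ordering behaviour mentioned here can be checked with the public `pd.unique` and `pd.factorize` functions. This is a small illustration on a plain int64 array (the sample values are made up), not a statement about every extension array.

```python
import numpy as np
import pandas as pd

# Made-up sample with repeats, deliberately not sorted.
values = np.array([3, 1, 3, 2, 1], dtype=np.int64)

# pd.unique returns uniques in order of first appearance.
uniques = pd.unique(values)

# pd.factorize (sort=False by default) does the same and also returns codes.
codes, fact_uniques = pd.factorize(values)

print(uniques)       # [3 1 2]
print(fact_uniques)  # [3 1 2]
print(codes)         # [0 1 0 2 1]
```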
Avoid an astype(object).
should work.
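As an illustration of the idea (not the code proposed in this issue), the sketch below deduplicates a PeriodArray through its int64 ordinals instead of going through `astype(object)`. It assumes that `asi8` exposes the ordinals and that the `PeriodArray` constructor accepts an int64 ndarray together with the original dtype; the sample data is made up.

```python
import pandas as pd

# Made-up data: four monthly periods, taken with repeats.
periods = pd.period_range("2000-01", periods=4, freq="M")
arr = periods.array.take([0, 1, 0, 2, 1, 3])  # PeriodArray with duplicates

# Deduplicate on the int64 ordinals, avoiding a conversion to object dtype.
unique_ordinals = pd.unique(arr.asi8)

# Rebuild a PeriodArray from the unique ordinals, keeping the original dtype.
result = pd.arrays.PeriodArray(unique_ordinals, dtype=arr.dtype)

print(result)
```

Working on the ordinals keeps the whole operation in integer space, which is the point of avoiding the object conversion.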