Q: Faster way to Filter DataView #6164
Comments
Hi @torronen, is this for the DataView or DataFrame? Looks like DataFrame, but just wanted to confirm before tagging it.
@luisquintanilla yes, you are correct, it is DataFrame.
Thanks for that clarification.
What do you mean by "support some type indices"? Also, do you have any numbers comparing the speed of this against LINQ? It would be good to see how far behind we really are.
@michaelgsharp I am thinking about something like a dictionary or hash set to select items quickly. For example, I might want to get metrics for observations from each city separately: one test set for Helsinki, a second for Seattle, etc. Getting the numbers is a good way to validate it. Actually, this issue is mostly about my perception of slowness, and I do not yet have an exact comparison. I will do some measurements, but I might not be able to get them very quickly.
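To make the dictionary idea concrete, here is a minimal sketch of such an index, built with plain .NET collections only. The names `RowIndex` and `Build` are hypothetical and not part of ML.NET or Microsoft.Data.Analysis; the sketch assumes the key column has already been read into an in-memory list and contains no null values.

```csharp
using System;
using System.Collections.Generic;

public static class RowIndex
{
    // Hypothetical helper: maps each distinct key to the row positions holding it.
    // Built in a single pass, so later lookups do not rescan the column.
    // Assumes keys are non-null (Dictionary does not accept null keys).
    public static Dictionary<string, List<long>> Build(IReadOnlyList<string> keys)
    {
        var index = new Dictionary<string, List<long>>();
        for (int row = 0; row < keys.Count; row++)
        {
            if (!index.TryGetValue(keys[row], out var rows))
            {
                rows = new List<long>();
                index[keys[row]] = rows;
            }
            rows.Add(row);
        }
        return index;
    }
}

public static class CityIndexDemo
{
    public static void Main()
    {
        var cities = new[] { "Helsinki", "Seattle", "Helsinki", "Seattle", "Helsinki" };
        var index = RowIndex.Build(cities);

        // Row positions of all Helsinki observations.
        Console.WriteLine(string.Join(",", index["Helsinki"])); // prints 0,2,4
    }
}
```

Building the index costs one scan of the column; after that, each per-city lookup is a hash lookup rather than a full filter pass. The resulting row positions would still need to be turned into rows. As far as I know, DataFrame accepts a collection of row indices for that, but this is an assumption to check against the version in use.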
#6869 should improve filtering performance somewhat.
I filter data from a DataView to get all items within a specific time period.
It seems slow compared to filtering with LINQ from objects in memory. Is there a faster way to do it?
In this example, I am creating predictions for a certain time period at a time.
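For the time-period case, one possible approach is sketched below under stated assumptions: sort the timestamps once, keep the original row positions alongside them, and locate each period with two binary searches. `TimeRangeIndex`, `LowerBound`, and `RowsInRange` are hypothetical names for illustration, not an existing DataFrame or DataView API, and the sketch assumes the timestamp column fits in memory as an array.

```csharp
using System;
using System.Collections.Generic;

public sealed class TimeRangeIndex
{
    private readonly DateTime[] _sorted; // timestamps in ascending order
    private readonly long[] _rows;       // original row position of each sorted timestamp

    public TimeRangeIndex(IReadOnlyList<DateTime> timestamps)
    {
        _sorted = new DateTime[timestamps.Count];
        _rows = new long[timestamps.Count];
        for (int i = 0; i < timestamps.Count; i++)
        {
            _sorted[i] = timestamps[i];
            _rows[i] = i;
        }
        // Sorts the timestamps and reorders the row positions to match.
        Array.Sort(_sorted, _rows);
    }

    // First position whose timestamp is >= target.
    private int LowerBound(DateTime target)
    {
        int lo = 0, hi = _sorted.Length;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (_sorted[mid] < target) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // Original row positions with from <= timestamp < to, in time order.
    public List<long> RowsInRange(DateTime from, DateTime to)
    {
        var result = new List<long>();
        int end = LowerBound(to);
        for (int i = LowerBound(from); i < end; i++) result.Add(_rows[i]);
        return result;
    }
}

public static class TimeRangeDemo
{
    public static void Main()
    {
        var stamps = new[]
        {
            new DateTime(2021, 1, 3), new DateTime(2021, 1, 1),
            new DateTime(2021, 1, 2), new DateTime(2021, 1, 4),
        };
        var index = new TimeRangeIndex(stamps);
        var rows = index.RowsInRange(new DateTime(2021, 1, 2), new DateTime(2021, 1, 4));
        Console.WriteLine(string.Join(",", rows)); // prints 2,0
    }
}
```

The sort is paid once; each subsequent period costs two binary searches plus the size of the result, which matters when predictions are generated for many consecutive periods. Whether this beats the built-in filtering in practice would need to be measured.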
In another example, I may need to filter by exact match. Normally, I might create a dictionary to help, but is there a way to support some type of "indices" for DataViews?