-
Notifications
You must be signed in to change notification settings - Fork 97
Use more buffers (redux) #128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use more buffers (redux) #128
Conversation
ceb7ccc
to
0153531
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for playing around with this problem as well @alimanfoo. Seems you are encountering the same subtleties I ran into. Left some comments below, but wouldn't take them too seriously. They are more intended as advice. Completely understand this is still a work in progress at this stage.
Would just highlight one point that comes up below and have generally learned by playing with this problem. Namely NumPy ndarray
s end up being a great solution to use for all data in codecs. They support both buffer protocols, they support more types out-of-the-box, and they are easier for us and new users to the codebase to work with. This saves on compat code, work the same on Python 2/3, are easily handled by builtin compressors, and make things like casting, reshaping, and getting bytes trivial. In short, a lot of mileage can be gotten out of ndarray
s.
Thanks @jakirkham for the comments, this is a tricky one to get right. FWIW I think the key issue here is that many codecs require an input that, under Python 3, exposes a new-style buffer interface onto a C contiguous block of memory. E.g., things like A blocker from just passing a numpy array to these functions is the datetime & timedelta datatypes which you can't take a memoryview of. If I included a conversion from datetime or timedelta to int64 within the |
This is my temptation as well. Let's go for it. 😄 |
OK, 'tis done! I'm using There were still some quirks in the Let me know what you think. |
047ac0a
to
f4f6f5d
Compare
Beautiful! Thanks for doing that. 😄
Yeah, I ran into these two as well. Agree this seems fine.
Added a few more comments above in our existing threads to keep continuity with the existing discussion. A couple other comments above as well. For the most part these are moving to minor points. IOW this is looking pretty good. |
…atibility tests This reverts commit 047ac0a.
1acb560
to
5084c56
Compare
Hi @jakirkham, after some thrashing around I think I have basically converged on something very similar to what you have in #121. There are some small differences which I'm more than happy to discuss. I've also pushed on and refactored the buffer compatibility code in the cython modules, to re-use the new compat functions. I still have to address the point you raised about dealing with unicode arrays, but apart from that, I'd be very grateful if you could take a look and see if there's anything you think I've missed, or anything you think should be done differently. |
Latest commit disallows unicode array.array, seemed like it was not worth the effort to support it. |
In the interests of getting to a 0.6 release asap, I'd like to suggest we move forward with this PR instead of #121, mainly for the reason that I've had the chance to add in a bit more documentation and comments, so hopefully when we or someone else needs to revisit this in future, they should be able to comprehend what is being done and why. I believe that the approaches taken here and in #121 are essentially the same, with the main differences being in function naming and some minor implementation details. Please let me know if any objections. Whatever PR we take forward, primary credit for this work should go to @jakirkham. |
Had some pretty minor comments above. Think this is basically done at this stage.
Sounds good to me.
I think the test coverage and function naming were also better here. The API has also improved generally.
Thanks for working on this as well. It has come a long way since that PR. |
Thanks again @jakirkham for the review. Your suggestions are all good and I've gone with them in latest commits. Thanks also for bottoming out some really gnarly Python quirks, it is amazing how deep the rabbit hole goes! If CI passes I propose to merge if no objections. |
Thanks @alimanfoo! 🎉 Agree this looks great. Thanks for your hard work here. 😄 |
Based on work in #121, trying to make a clean split between PY2 and PY3 code paths, and make function naming more intuitive.
TODO:
tox -e py37
passes locallytox -e py27
passes locallytox -e docs
passes locally