BUG: #17778 - DataFrame.to_pickle() fails for .zip format on MacOS and pandas 0.20.3 #2
Codecov Report
@@ Coverage Diff @@
## master #2 +/- ##
=========================================
Coverage ? 91.36%
=========================================
Files ? 164
Lines ? 49891
Branches ? 0
=========================================
Hits ? 45582
Misses ? 4309
Partials ? 0
2813bb9 to 9a80160
52eab51 to df5233c
GH17778: add 'zip' format to unittests. Added entry in doc/source/whatsnew/v0.22.0.txt file to Bug Fixes section.
pandas/io/pickle.py
Outdated
@@ -42,7 +42,17 @@ def to_pickle(obj, path, compression='infer', protocol=pkl.HIGHEST_PROTOCOL):
    if protocol < 0:
        protocol = pkl.HIGHEST_PROTOCOL
    try:
        pkl.dump(obj, f, protocol=protocol)
        import zipfile
Update the docstring, since the function now also supports 'zip' compression.
Moved imports to top.
@@ -42,7 +46,14 @@ def to_pickle(obj, path, compression='infer', protocol=pkl.HIGHEST_PROTOCOL):
    if protocol < 0:
        protocol = pkl.HIGHEST_PROTOCOL
    try:
        pkl.dump(obj, f, protocol=protocol)
        if isinstance(f, zipfile.ZipFile):
I really don't like it - I don't expect _get_handle to return something other than a file-like object. Creating a temp file is unacceptable - some buffered solution can be introduced. Unfortunately, support for buffered zip writing only arrived in Python 3.6, but we can backport it and place it in: https://github.com/pandas-dev/pandas/blob/master/pandas/compat/
Python 3.6 supports mode='w' in ZipFile.open (https://github.com/python/cpython/blob/master/Lib/zipfile.py#L1312), so this whole commit for Python 3.6 could look like:
if mode == 'wb':
    # zf is the zipfile.ZipFile instance; ZipFile.open(mode='w') needs Python 3.6+
    f = zf.open('data.bin', mode='w')
else:
    # find filename and open for read
It should be really easy to backport for Python 3.x - backporting for Python 2.7 can be trickier.
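To make the suggestion above concrete, here is a minimal standalone sketch of pickling directly into a zip member through ZipFile.open(..., mode='w') on Python 3.6+, with no temp file. It is not the code from this PR and does not go through _get_handle. The helper names dump_to_zip and load_from_zip are hypothetical. The member name 'data.bin' is taken from the comment above, and the "exactly one member" check on read is an assumption about how the read side might find the filename.

```python
import io
import pickle
import zipfile


def dump_to_zip(obj, target, member="data.bin", protocol=pickle.HIGHEST_PROTOCOL):
    """Pickle obj straight into a single zip member (hypothetical helper)."""
    with zipfile.ZipFile(target, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        # ZipFile.open(..., mode="w") returns a writable file-like object
        # on Python 3.6 and later, so pickle can stream into it directly.
        with zf.open(member, mode="w") as f:
            pickle.dump(obj, f, protocol=protocol)


def load_from_zip(source):
    """Unpickle the only member of a zip archive (hypothetical helper)."""
    with zipfile.ZipFile(source, mode="r") as zf:
        names = zf.namelist()
        # Assumption: the archive holds exactly one pickled payload.
        if len(names) != 1:
            raise ValueError("expected exactly one file in the zip archive")
        with zf.open(names[0]) as f:
            return pickle.load(f)


# Round trip through an in-memory buffer instead of a file on disk.
buf = io.BytesIO()
dump_to_zip({"a": [1, 2, 3]}, buf)
buf.seek(0)
restored = load_from_zip(buf)
```

Because the write handle is itself file-like, a caller such as to_pickle would not need an isinstance check against zipfile.ZipFile. On Python versions before 3.6 this write mode is unavailable, which is why the comment proposes a backport.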
git diff upstream/master -u -- "*.py" | flake8 --diff