BUG: #17778 - DataFrame.to_pickle() fails for .zip format on MacOS and pandas 0.20.3 #2

Open
wants to merge 2 commits into
base: master

Conversation

@ghost ghost commented Nov 3, 2017


codecov-io commented Nov 3, 2017

Codecov Report

❗ No coverage uploaded for pull request base (master@1647a72).
The diff coverage is 100%.

@@            Coverage Diff            @@
##             master       #2   +/-   ##
=========================================
  Coverage          ?   91.36%           
=========================================
  Files             ?      164           
  Lines             ?    49891           
  Branches          ?        0           
=========================================
  Hits              ?    45582           
  Misses            ?     4309           
  Partials          ?        0

Flag         Coverage Δ
#multiple    89.17% <100%> (?)
#single      39.41% <15.78%> (?)

Impacted Files         Coverage Δ
pandas/io/common.py    69.74% <100%> (ø)
pandas/io/pickle.py    84.21% <100%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1647a72...8903378.

@ghost ghost force-pushed the #17778 branch 5 times, most recently from 2813bb9 to 9a80160 Compare November 9, 2017 13:18
ghost (Author) commented Nov 9, 2017

The Windows AppVeyor build fails with a PermissionError; I need to look into it more closely.

@ghost ghost force-pushed the #17778 branch 3 times, most recently from 52eab51 to df5233c Compare November 10, 2017 17:28
GH17778: add 'zip' format to unittests.
Added entry in doc/source/whatsnew/v0.22.0.txt file to Bug Fixes section.
@@ -42,7 +42,17 @@ def to_pickle(obj, path, compression='infer', protocol=pkl.HIGHEST_PROTOCOL):
    if protocol < 0:
        protocol = pkl.HIGHEST_PROTOCOL
    try:
        pkl.dump(obj, f, protocol=protocol)
        import zipfile

Update the docstring, since it now also supports 'zip' compression.

@@ -42,7 +46,14 @@ def to_pickle(obj, path, compression='infer', protocol=pkl.HIGHEST_PROTOCOL):
    if protocol < 0:
        protocol = pkl.HIGHEST_PROTOCOL
    try:
        pkl.dump(obj, f, protocol=protocol)
        if isinstance(f, zipfile.ZipFile):

I really don't like this: I don't expect _get_handle to return anything other than a file-like object. Creating a temp file is unacceptable; a buffered solution can be introduced instead. Unfortunately, support for buffered zip writing only arrived in Python 3.6, but we can backport it and place it in: https://github.com/pandas-dev/pandas/blob/master/pandas/compat/

Python 3.6 supports mode='w' in ZipFile.open (https://github.com/python/cpython/blob/master/Lib/zipfile.py#L1312), so for Python 3.6 this whole commit could look like:

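# zf is assumed to be an open zipfile.ZipFile for the target path
if mode == 'wb':
    # Python 3.6+: returns a writable file-like object
    f = zf.open('data.bin', mode='w')
else:
    # find the filename and open it for reading
    f = zf.open(zf.namelist()[0])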

It should be really easy to backport for Python 3.x; backporting for Python 2.7 may be trickier.
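
For illustration only, here is a minimal sketch of the buffered approach suggested in this review comment, using just the standard library. It is not the code in this PR: the function names `dump_to_zip`, `dump_to_zip_buffered`, and `load_from_zip`, and the default member name `data.bin`, are hypothetical. The first function relies on `ZipFile.open(..., mode='w')`, which requires Python 3.6 or later. The second is one possible fallback for older interpreters that serializes in memory and calls `ZipFile.writestr`, which avoids a temp file but holds the whole pickle in memory.

```python
import pickle
import zipfile


def dump_to_zip(obj, path, member="data.bin", protocol=pickle.HIGHEST_PROTOCOL):
    """Pickle obj into a single-member zip archive (Python 3.6+)."""
    with zipfile.ZipFile(path, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        # ZipFile.open with mode="w" returns a writable file-like object,
        # so pickle can write straight into the archive member.
        with zf.open(member, mode="w") as f:
            pickle.dump(obj, f, protocol=protocol)


def dump_to_zip_buffered(obj, path, member="data.bin",
                         protocol=pickle.HIGHEST_PROTOCOL):
    """Fallback sketch: serialize in memory, then store the bytes."""
    payload = pickle.dumps(obj, protocol=protocol)
    with zipfile.ZipFile(path, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(member, payload)


def load_from_zip(path):
    """Unpickle the only member of the archive at path."""
    with zipfile.ZipFile(path, mode="r") as zf:
        names = zf.namelist()
        # Assumption: the archive holds exactly one pickled file.
        if len(names) != 1:
            raise ValueError("expected exactly one file in the zip archive")
        with zf.open(names[0]) as f:
            return pickle.load(f)
```

Either writer produces an archive that `load_from_zip` can read back, so a caller could pick one based on the running Python version.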

Development

Successfully merging this pull request may close these issues.

DataFrame.to_pickle() fails for .zip format on MacOS and pandas 0.20.3
3 participants