Skip to content

bpo-17852: Fix PR #3372, flush BufferedWriter objects on exit. #4847

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

nascheme
Copy link
Member

@nascheme nascheme commented Dec 13, 2017

New version of PR #3372, which was reverted by 317def9.

We can't use _Py_PyAtExit() as it only supports registering a single
callback. It is used by the atexit module and so we can't use it. We
can't use Py_AtExit() either because it calls functions too late in the
interpreter shutdown process. Instead, create io._flush_all_buffers.
In io.py, register it with the atexit module.

https://bugs.python.org/issue17852

We can't use _Py_PyAtExit() as it only supports registering a single
callback.  It is used by the atexit module and so we can't use it.  We
can't use Py_AtExit() either because it calls functions too late in the
interpreter shutdown process.  Instead, create io._flush_all_buffers.
In io.py, register it with the atexit module.
@nascheme nascheme added the type-bug An unexpected behavior, bug, or error label Dec 13, 2017
@nascheme nascheme requested a review from pitrou December 13, 2017 19:23
Lib/_pyio.py Outdated
for w in _all_writers:
try:
w.flush()
except:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except Exception sounds better here.

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the approach here is nice and adequate. It would be nice to come up with a test case, if that's easily doable.

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@nascheme
Copy link
Member Author

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@pitrou: please review the changes made to this pull request.

@nascheme
Copy link
Member Author

I spent some time trying to think of how to build a test. It is difficult. I believe you need a BufferedWriter + FileIO pair to be part of a reference cycle. Then you need the finalization of the FileIO object to happen before the BufferedWriter. At least, that is my memory of how to trigger the non-flushing behaviour. I spent about half a day trying to trigger the bug originally and figure out what was happening. Unfortunately I did not save the script that triggered it. I do distinctly recall that the buffered file needs to be part of a reference cycle. The motivation for me to make this patch was to hopefully save someone else going through solving a similar bug.

@nascheme
Copy link
Member Author

Ha ha, I managed to trigger it. I added a script to bpo-17852. Also added some explanation that this bug has to do with topologically ordering of finalizers.

@pitrou
Copy link
Member

pitrou commented Dec 14, 2017

Could you convert your script into a simple unit test?

@nascheme
Copy link
Member Author

nascheme commented Dec 17, 2017

Closing this PR as it is not really a fix. It works if the reference cycle containing the raw file and buffered file object are not re-claimed before atexit gets run. However, the cycle can be claimed during a normal gc.collect() run. In that case, the raw file can get closed before the buffered file and data in the buffer will be discarded.

I thought having the raw file keep a list of weak refs to the buffer and calling flush() on those when the raw close() is called would be a proper fix. However, that doesn't work either. The GC clears the weakrefs in handle_weakrefs() before it calls __del__ on the raw file object. So, the raw file cannot get the buffer to flush itself before the raw file is closed. So, using weak refs does not work, at least with how gcmodule works at the moment.

The only other idea I have is to split the buffered IO object into two parts. A "proxy" object that wraps the underlying state. The raw file object would keep a strong reference to this underlying state object.Then the raw file can call flush() before closing itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting change review type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants