-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
bpo-17852: Fix PR #3372, flush BufferedWriter objects on exit. #4847
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We can't use _Py_PyAtExit() as it only supports registering a single callback. It is used by the atexit module and so we can't use it. We can't use Py_AtExit() either because it calls functions too late in the interpreter shutdown process. Instead, create io._flush_all_buffers. In io.py, register it with the atexit module.
Lib/_pyio.py
Outdated
for w in _all_writers: | ||
try: | ||
w.flush() | ||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
except Exception
sounds better here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the approach here is nice and adequate. It would be nice to come up with a test case, if that's easily doable.
When you're done making the requested changes, leave the comment: |
I have made the requested changes; please review again. |
Thanks for making the requested changes! @pitrou: please review the changes made to this pull request. |
I spent some time trying to think of how to build a test. It is difficult. I believe you need a BufferedWriter + FileIO pair to be part of a reference cycle. Then you need the finalization of the FileIO object to happen before the BufferedWriter. At least, that is my memory of how to trigger the non-flushing behaviour. I spent about half a day trying to trigger the bug originally and figure out what was happening. Unfortunately I did not save the script that triggered it. I do distinctly recall that the buffered file needs to be part of a reference cycle. The motivation for me to make this patch was to hopefully save someone else going through solving a similar bug. |
Ha ha, I managed to trigger it. I added a script to bpo-17852. Also added some explanation that this bug has to do with topologically ordering of finalizers. |
Could you convert your script into a simple unit test? |
Closing this PR as it is not really a fix. It works if the reference cycle containing the raw file and buffered file object are not re-claimed before atexit gets run. However, the cycle can be claimed during a normal gc.collect() run. In that case, the raw file can get closed before the buffered file and data in the buffer will be discarded. I thought having the raw file keep a list of weak refs to the buffer and calling flush() on those when the raw close() is called would be a proper fix. However, that doesn't work either. The GC clears the weakrefs in handle_weakrefs() before it calls The only other idea I have is to split the buffered IO object into two parts. A "proxy" object that wraps the underlying state. The raw file object would keep a strong reference to this underlying state object.Then the raw file can call flush() before closing itself. |
New version of PR #3372, which was reverted by 317def9.
We can't use _Py_PyAtExit() as it only supports registering a single
callback. It is used by the atexit module and so we can't use it. We
can't use Py_AtExit() either because it calls functions too late in the
interpreter shutdown process. Instead, create io._flush_all_buffers.
In io.py, register it with the atexit module.
https://bugs.python.org/issue17852