Skip to content

bpo-17852: Fixed the documentation about closing files #23135

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Nov 27, 2020

Conversation

Volker-Weissmann
Copy link
Contributor

@Volker-Weissmann Volker-Weissmann commented Nov 3, 2020

Most programming languages, including Python 2 call all the destructor of global variables if the program exists successfully.
Python 3 is not guaranteed to.
This can result in the arguments of file.write() not being written to the disk.
The guys in this issue also released this problem and discussed if it should be this way, but apparently all of them were completely brain-dead idiots because they did not check if the documentation of Python matches the actual behaviour. Instead, the documentation stated that Python will call f.close() (and I quote) "eventually", which is completely BS, because Python may never call f.close().
I can't fix the (imho stupid) behavior of Python 3 not calling destructors, but at least I can fix the documentation (this PR).

Especially for a programming language like Python, that is supposed to be used by people with limited CS knowledge, it is important that no features in the programming language yields surprising behavior and good documentation that is correct is also very important.

Excuse me for my language, but if one of the most popular programming languages doesn't manage to fix a simple but important documentation error within a year after someone reports it, it is a disgrace and failure of the open source model. If you support this kind of misleading documentation, you are 100 % responsible to every bug written by someone who thought that f.close() is called when the program exists. And they are 0 % responsible for the bug, because they trusted the documentation.

https://bugs.python.org/issue17852

@nascheme
Copy link
Member

nascheme commented Nov 3, 2020

I understand you are upset but calling the developers names is probably not helping. I agree that either the documentation should note this issue or we should actually try to fix the bug. Having buffered file objects randomly lose data is not a friendly behavior.

Your documentation change is not quite correct, I think. The problem is not that __del__ isn't called, it is that the ordering of the __del__ calls can be unexpected. For this bug, it happens if your buffered file object is part of a garbage reference cycle. The ordering of __del__ calls is essentially random at that point. It can happen that the buffer object gets cleaned up after the file object has been cleaned (closed). In that case, the buffer cannot be written because the file is already closed.

AFAIK, there is no way to fix that in the Python GC logic. We would need to re-structure the _io classes to prevent it (merge the buffer object with the file descriptor object or use a weakref).

@Volker-Weissmann
Copy link
Contributor Author

Volker-Weissmann commented Nov 4, 2020

I understand you are upset but calling the developers names is probably not helping.

I know, I'm sorry. I'm just tired of this "You should have expected this surprising behavior that is not documented!" victim-blaming. Multiple people told me that the old documentation was correct.

I agree that either the documentation should note this issue or we should actually try to fix the bug. Having buffered file objects randomly lose data is not a friendly behavior.

Thank you for saying that.

Your documentation change is not quite correct, I think. The problem is not that __del__ isn't called, it is that the ordering of the __del__ calls can be unexpected. For this bug, it happens if your buffered file object is part of a garbage reference cycle. The ordering of __del__ calls is essentially random at that point. It can happen that the buffer object gets cleaned up after the file object has been cleaned (closed). In that case, the buffer cannot be written because the file is already closed.

AFAIK, there is no way to fix that in the Python GC logic. We would need to re-structure the _io classes to prevent it (merge the buffer object with the file descriptor object or use a weakref).

Fixing this bug would not prevent data loss, because the docs says that python can exit without calling __del__. The other question is whether

f = open("path")
f.write("data")
del(f)
exit(0)

is guaranteed to write to the disk.

Co-authored-by: Inada Naoki <[email protected]>
Suggestion from methane

Co-authored-by: Inada Naoki <[email protected]>
Copy link
Member

@methane methane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. But wait a week for another core dev review.

@Volker-Weissmann
Copy link
Contributor Author

LGTM. But wait a week for another core dev review.

When will "another core dev review" happen?

@methane
Copy link
Member

methane commented Nov 27, 2020

When will "another core dev review" happen?

No one knows. I thought if someone find this and want to review. But it was not happened in two weeks.

@methane methane merged commit c8aaf71 into python:master Nov 27, 2020
@miss-islington
Copy link
Contributor

Thanks @Volker-Weissmann for the PR, and @methane for merging it 🌮🎉.. I'm working now to backport this PR to: 3.8, 3.9.
🐍🍒⛏🤖

@bedevere-bot
Copy link

GH-23527 is a backport of this pull request to the 3.9 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.9 only security fixes label Nov 27, 2020
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Nov 27, 2020
Co-authored-by: Inada Naoki <[email protected]>
(cherry picked from commit c8aaf71)

Co-authored-by: Volker-Weissmann <[email protected]>
@bedevere-bot
Copy link

GH-23528 is a backport of this pull request to the 3.8 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Nov 27, 2020
Co-authored-by: Inada Naoki <[email protected]>
(cherry picked from commit c8aaf71)

Co-authored-by: Volker-Weissmann <[email protected]>
miss-islington added a commit that referenced this pull request Nov 27, 2020
Co-authored-by: Inada Naoki <[email protected]>
(cherry picked from commit c8aaf71)

Co-authored-by: Volker-Weissmann <[email protected]>
@JulienPalard
Copy link
Member

Thanks @Volker-Weissmann!

I agree than relying on destructors / garbage collectors to flush data is a bug magnet, so the doc is way better that way: explicit is better.

Next time you find an inconsistency in the doc, don't hesitate to open such good PRs, they help everybody, and it's appreciated (just skip the part you know, but let's no longer speak about it).

miss-islington added a commit that referenced this pull request Dec 1, 2020
…GH-23527)

Co-authored-by: Inada Naoki <[email protected]>
(cherry picked from commit c8aaf71)


Co-authored-by: Volker-Weissmann <[email protected]>
adorilson pushed a commit to adorilson/cpython that referenced this pull request Mar 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation in the Doc dir skip news
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants