avoid race condition during chunk write #327
zarr/storage.py
Outdated
# On windows, rename() can't overwrite files. So
# the file is removed first.
os.remove(new)
os.rename(old, new)
So this is the case where we would want the pyosreplace package. It's a Windows-only backport of os.replace for Python 2.
Just to mention that implementing this PR has some potential consequences we should be aware of, described in #328. I don't think that should stop us from moving ahead with this PR; using os.replace() is clearly the better thing to do under any circumstances. It's just something to be aware of.
ddfcec1 to a4069b0
@jakirkham I've made a guess at the
Think you had it basically right. Unfortunately there is some lack of clarity about when a colon or semicolon should be used. IIUC it should be a semicolon.
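For readers unsure about the colon/semicolon distinction, here is a minimal sketch of the two spellings. The exact marker expression is an assumption for illustration (the thread does not show the one used in this PR), and `split_requirement` is a hypothetical helper, not part of setuptools or pip.

```python
# PEP 508 form, as used in requirements files and install_requires:
# the requirement and its environment marker are separated by a semicolon.
# The marker expression below is an illustrative assumption.
REQUIREMENT = 'pyosreplace; python_version < "3.3" and sys_platform == "win32"'

# Older setuptools form: the marker is an extras_require key that starts
# with a colon, and the package name goes in the value list.
EXTRAS_REQUIRE = {
    ':python_version < "3.3" and sys_platform == "win32"': ["pyosreplace"],
}


def split_requirement(spec):
    """Split a PEP 508 string into (name, marker) at the first semicolon."""
    name, _, marker = spec.partition(";")
    return name.strip(), marker.strip()
```

So the semicolon belongs in a requirement string, while the leading colon only appears in `extras_require` keys.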
LGTM. Thanks for working on this @sbalmer.
Hopefully the comment above is useful. Am personally fine with how this is. If you want to use environment markers instead (or others prefer that), that could also be done, but I don't think it's critical.
Thanks @sbalmer. Could you add pyosreplace to requirements.txt and requirements_dev.txt, pinned to the latest version in the latter? I guess environment markers are needed in both.
When the chunk file is first removed before the new version is moved into place, racing reads may encounter a missing chunk. Using rename() or replace() without remove() avoids the issue on POSIX systems, where both calls are atomic. The fallback of remove() -> rename() is kept for Windows before Python 3.3. Fixes zarr-developers#263
so it's not repeated on every write
Because the env markers didn't work. Just guessing at this point.
0d28ca2 to 1aee7cf
@alimanfoo
Thanks @sbalmer. LGTM. 😄 Let's see what @alimanfoo thinks. 😉
I wonder if we can just drop the
Many thanks @sbalmer, looks good. Could you add a release note to docs/release.rst?
FWIW I imagine a failure could still occur while writing to the temporary file (e.g., the device fills up), so that an exception is raised before the value has been fully written. In that case we'd still want to clean up the temporary file.
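A minimal sketch of that cleanup, assuming Python 3.3+ so that os.replace() is available. The function name `write_chunk_atomically` and the `.partial` suffix are hypothetical and not taken from zarr.

```python
import os
import tempfile


def write_chunk_atomically(path, data):
    """Write data to path so readers see either the old or the new chunk."""
    directory = os.path.dirname(path) or "."
    # Create the temporary file next to the target so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(suffix=".partial", dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp_path, path)
    finally:
        # After a successful replace the temporary name no longer exists;
        # after a failed write (e.g. disk full) it does and is removed here.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

If the write raises, the exception still propagates to the caller, the existing chunk is left untouched, and no stray temporary file remains in the chunk directory.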
Good point.
I've resolved conflicts after merging #352 and added a release note. Will merge if CI passes.
@sbalmer apologies, I didn't know your full name to include in the release note. You're currently credited just as "sbalmer"; happy to leave it that way or add your full name, whichever you prefer.
Thanks @sbalmer 😄
Thanks @alimanfoo, @jakirkham! Now we can use mainline again :-)
TODO: tox -e docs