Reliable version of os.replace for Windows (wait/retry loop) #3239
Conversation
Honestly, the wait_for() function seems pretty over-engineered. Its only known call site (outside tests) has only one function and one exception. I think we can make it simpler. I'm still pondering Jukka's comment in the issue, but TBH I'm not sure I follow the scenarios. How likely is it that we're in a scenario where we end up waiting a second for each cache file we write? That would be pretty horrible, but does it really happen? After all, the issue only noticed occasional AppVeyor failures. I have some other nits, but I think we should decide on the high-level approach first. Should we really change the behavior of write_cache() (which was pretty carefully designed when it comes to avoiding corrupt files no matter at what point the process is killed hard, assuming reasonable filesystem semantics), or is just retrying the replace() until it succeeds enough?
Sure, but I thought we may also want to use it for
User misconfiguration. For example, cache folder has write permission so new temp files can be created, but the existing cache files are owned by another user so they can never be replaced.
Yes, but you said we wanted to solve it not just for
Let's have a simple wait function now -- we can refactor it when we need it for the remove calls (those haven't been failing AFAIK).
Then the first replace() call will time out, which will raise an exception, which AFAIK isn't caught. So in this case we should only be waiting 1 sec extra before getting an error. Right?
Yes precisely.
They are not, because the
Hm, let me see. (Literally just writing as I reason through this, so you can check for yourself if there's a flaw in this argument.) At this point the only thing we've done to the filesystem is the makedirs() call, which is idempotent. We've also computed the string to be written to the data file but haven't written it to the file yet. Now we're trying to compute the contents of the meta file, and the crucial input, mtime+size of the source file (path), is unavailable due to a stat() error. That file existed before (or we wouldn't have gotten this far).

What we're doing here is mostly a slight optimization then -- cleaning up cache files for a source file that no longer exists. Presumably if the source file reappears it will have a different mtime, so the cache files will be invalid anyway. Or if it is restored from a backup we're pessimizing things slightly. I think the reason this code exists at all is to clean the cache of irrelevant entries. The try/except is meant to avoid needing other checks in case the cache files don't exist at all.

When could ignoring the error cause trouble? I guess the only interesting scenario is when there was a valid data/meta pair and somehow we delete the data but leave the meta. On a subsequent run, assuming the file is restored from backup (otherwise we would rule the meta file invalid without ever looking at the data file), the is_meta_fresh() function would get to the point where it tries to call getmtime() on the data file: Line 818 in 65b9b0b

Now suppose we add error handling to the cache handling code -- then that getmtime() call would still fail, but we'd presumably catch the error -- and then is_meta_fresh() should rule the meta file stale and return False, at which point we're still fine. Concluding, I don't think the code you flagged can cause incorrect semantics -- but thanks for asking, and it was a nice puzzle!
I agree. I would summarize your argument by saying that before anyone relies on a cache file, they call

I was also afraid that someone might call

It never happens in the current version, and based on the code style I see, I don't see much risk it will happen in the future:
I believe there's a very good reason why these two are separated, having to do with the different phases of processing. API design is hard!
I am a bit concerned about my tests; they rely on pretty tight timing constraints (0.1 sec < 0.25 sec < 0.4 sec). While normally this is more than enough of a difference relative to random noise and disk I/O, on a slow VM with multiple processes running in parallel and/or heavy disk activity it might actually fail intermittently. At the same time, I don't want to delete the tests, or make them much longer (since these delays add onto the total test time). I guess I'll use a separate pair of test src / dest files for each test, so that the waiting time for locks to expire after the test already passed isn't done sequentially. This will give me more room to increase the duration of locks and timeouts used for tests.
As you should be, given the (ironic) test failures... |
xdist isn't flexible enough to run a given set of tests in one process. If these tests are split into multiple processes, they will take a lot longer, since each will wait for locks to die out at the end.
mypy/util.py (Outdated)

```python
def _replace(src: PathType, dest: PathType) -> None:
    repl = cast(Callable[[], None], partial(os.replace, src, dest))
```
Can't you use a lambda? Shouldn't even need the cast:
```python
repl = lambda: os.replace(src, dest)
```
Oh duh... This code is no longer in the current version, but I'm glad I won't be casting `partial` as often in the future.
mypy/test/testutil.py (Outdated)

```python
try:
    import collections.abc as collections_abc
except ImportError:
    import collections as collections_abc  # type: ignore  # PY32 and earlier
```
But we don't support PY32 any more.
mypy/test/testutil.py (Outdated)

```python
start_time = time.perf_counter()


def f() -> None:
    if time.perf_counter() - start_time < lag:
```
IMO this would be more readable as `if time.perf_counter() < start_time + lag:` -- then I can see immediately that it's taking this branch when called before `lag` time has passed.
mypy/test/testutil.py (Outdated)

```python
class WaitRetryTests(TestCase):
    def test_waitfor(self) -> None:
        with self.assertRaises(OSError):
            util.wait_for(create_funcs(), (PermissionError, FileExistsError), 0.1)
```
IMO the exceptions should be in a list, not a tuple.
Ah, good to know. Somehow I thought exception syntax was the same as `isinstance`, only tuples allowed :)
mypy/test/testutil.py (Outdated)

```python
class WaitRetryTests(TestCase):
    def test_waitfor(self) -> None:
        with self.assertRaises(OSError):
```
I highly recommend adding a comment to each subtest (i.e. each util.wait_for() call, possibly wrapped in a context manager) explaining what it is for.
mypy/test/testutil.py (Outdated)

```python
def lock_file(filename: str, duration: float) -> Thread:
    '''
```
Can you stick to the prevailing docstring style?
```python
"""Opens filename (which must exist) for reading.

After duration sec, release the handle.
"""
```
Though honestly the first line should probably say something like `"""Open a file and keep it open in a background thread for a while."""`
mypy/test/testutil.py (Outdated)

```python
@skipUnless(WIN32, "only relevant for Windows")
class ReliableReplace(TestCase):
    # will be cleaned up automatically when this class goes out of scope
```
Actually that would depend on Python finalization order which is a nightmare.
Can't you make this more explicit? E.g. in tearDownClass() below.
Oh, I didn't know that... scary, I always relied on that. Fixed.
mypy/util.py (Outdated)

```python
if sys.version_info >= (3, 6):
    PathType = Union[AnyStr, os.PathLike]
```
os.PathLike is also generic in AnyStr. But by not mentioning that here you'll get it instantiated with Any instead.
Fixed. I wonder if perhaps PathType should be available from the `types` stdlib module.
I've been thinking of proposing to add it to `typing`.
There is actually python/typing#402 that seems related.
Hmm, actually it's not going to work at runtime because `PathLike` derives only from `abc.ABC`, and there's no equivalent thing exported from `typing`. I'll use `AnyStr` for now.
@pkch Maybe a string literal `'PathLike[AnyStr]'`?
@ilevkivskyi it works! I thought strings can only help with forward declarations, but I guess they just tell runtime not to worry about it whatever the cause might be.
Yes, it works until someone calls `get_type_hints` on this function. Generally, this is considered a temporary workaround; for example, we prohibit things like `'collections.Counter[str]'`, but here I think it is OK.

Also note that when you write `PathType = Union[AnyStr, 'os.PathLike[AnyStr]']`, you create a generic type alias (since `AnyStr` is a type variable), so that when you write just `PathType`, it will be translated to `Union[Any, os.PathLike[Any]]`. What you want is probably:

```python
def _replace(src: PathType[AnyStr], dest: PathType[AnyStr], timeout: float = 10) -> None: ...
```
mypy/util.py (Outdated)

```python
PathType = AnyStr


def _replace(src: PathType, dest: PathType, timeout: float = 10) -> None:
```
Because you don't parameterize the PathType types, they'll be using Any instead of AnyStr.
I make this mistake 3 times a day. I guess I'll update #3141 (`--warn-implicit-any`) in case anyone else has this problem.
mypy/util.py (Outdated)

```python
    Increase wait time exponentially until total wait of timeout sec
    On timeout, give up and reraise the last exception seen
    '''
    n_iter = max(1, math.ceil(math.log2(timeout / 0.001)))
```
Can you bound the iteration count without using the math module? ISTM you can just use something like:

```python
wait = 0.001
while wait <= 0.5:
    <stuff>
    time.sleep(wait)
    wait = wait * 2
```
Yeah I can, but if I use the no-math version, the final iteration will have wait time anywhere between 0.25 and 0.5 sec, so the total timeout is only precise up to a multiplicative factor of 2. That precision might be fine for our purposes, but the tests become really inefficient. What do you think?
I don't understand why you can't generate exactly the same sequence of wait times using either algorithm?
Well, I need to know what `wait` should start from. I'd like to start it from somewhere around 0.001 sec, but of course I don't care about the precise value. I do care about the total timeout though, which means I care about the final (rather than starting) value of the wait. Specifically, if my timeout is say 1 sec, I want my final `wait`, just before I give up, to be 0.5 sec (that way the total wait will be `0.5 + 0.5/(2**1) + 0.5/(2**2) + ... = 1`).

But in order to get the correct starting value without math, I'd need to create a second loop, where I repeat `timeout /= 2` until it gets to (say) below 0.001. I can do that, but is it better than calling `math.log`?

Actually, it might be easier to read; let me commit such a version, and let's decide which is better.
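The arithmetic of the two-loop, math-free variant can be checked in isolation with a hypothetical helper that just returns the wait sequence instead of sleeping:

```python
def backoff_waits(timeout: float, start_cap: float = 0.001) -> list:
    """Exponentially increasing wait times whose total is roughly `timeout`.

    Mirrors the two-loop idea: first halve `timeout` down to the starting
    wait, then double on each retry.  (Hypothetical helper, factored out
    of the retry loop so the arithmetic is easy to verify.)
    """
    # Loop 1: find the starting wait by repeated halving (no math.log needed).
    wait = timeout
    while wait >= start_cap:
        wait /= 2
    # Loop 2: accumulate doubling waits while they still fit in the budget.
    waits = []
    total = 0.0
    while total + wait <= timeout:
        waits.append(wait)
        total += wait
        wait *= 2
    return waits
```

For `timeout=1` this yields waits from 1/1024 up to a final 0.5 sec, summing to 1023/1024 — i.e. the total honors the timeout to within one starting-wait, which is the precision the tests rely on.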
Do you need to sync typeshed as part of this PR? That's usually an antipattern.

Hmm, AppVeyor refused to build due to an unmergeable branch until I committed typeshed. I'm surprised too; this is only the second time it's happening since I've been working with mypy (the first time was over a month ago). Let me commit without typeshed, and we can figure out what's going on.

Oh, maybe it was due to the typeshed mess-up in a recent PR. It's all fixed now.

(The comment doesn't seem to have a "reply" link so responding here.) Why is the exact duration of the timeout important? Timeouts are usually just guidelines. This seems perhaps just an artifact of the testing strategy?

Yes, I only care about the precise timeout for testing purposes. What can we do though? I don't want to get rid of the tests. And I guess it doesn't hurt if the users know that
mypy/util.py (Outdated)

```python
    if wait < 0.001:
        break
    wait /= 2
while True:
```
@gvanrossum Moving the discussion here.
I could just make the time allowance in the test much larger. That way it will pass even though this function guarantees `timeout` only up to a factor of 2x. But the test is already slow (adds 3-4 sec to the entire test suite) due to the file locks.
I was trying to make this test not affect total test runtime by adding another worker, since it's just sleeping for a while. But it didn't work yet (also, I fear having an extra worker might slow down tests a bit due to more process switching).
Let's just keep it simple rather than trying to work against xdist. If it's hard to figure out now I'm sure it will be hard to maintain in the future.
Just in case, my go-to xdist wizard is @gnprice .
Thanks, indeed "files changed" no longer lists typeshed. Yay!
mypy/test/testutil.py (Outdated)

```python
short_lock = 0.25
long_lock = 2

threads = []  # type: List[Thread]
```
What would happen if you made tmpdir and threads just instance variables, resetting and cleaning them for each test case rather than once for all test cases in the class? You have only one test_* method anyways, so...???
I'd still like you to ponder this, but it's optional at this point.
(FWIW that was actually an old comment that GitHub had somehow hung on to. As were several other comments on code you had already ripped out... Sorry about the noise!)
Getting there! I'm going to be super careful here since it only affects Windows and I don't want to destabilize things again a week before the 0.510 release.
mypy/util.py (Outdated)

```diff
@@ -2,8 +2,16 @@
 import re
 import subprocess
 import os
 import sys
 import math
```
You no longer need this import, nor `itertools.count` below.
mypy/test/testutil.py (Outdated)

```python
def prepare_src_dest(self, src_lock_duration: float, dest_lock_duration: float
                     ) -> Tuple[str, str]:
    # Create two temporary files with random names.
```
Make this into a docstring and add that this also spawns threads that lock each of them for the specified duration. Also mention the content of each is unique.
```python
short_lock = timeout / 4
long_lock = timeout * 2

threads = []  # type: List[Thread]
```
This should become an instance variable.
My idea was that the entire file locking arrangement is shared across all the `test*` methods of the `WindowsReplace` test case. Specifically, I want to wait for all the file locks to expire at the end, rather than wait for the file lock in each test to expire. The former is much more efficient: while the other tests are running, old file locks will naturally use up part (or even all) of their duration, so the wait at the end will be much shorter. If we adopt this approach, I need `threads` to be class-level, since `WindowsReplace` is instantiated as many times as there are `test*` methods in it. It also means I must not delete the `tmpdir` until the end of the entire test case, i.e., until `tearDownClass` rather than instance-level `tearDown`. Is there a disadvantage to this approach?
Note: I temporarily reduced the number of `test*` methods to 1 due to issues with `xdist` sending different tests to different workers (and thus defeating the entire mechanism I described in the previous paragraph). But this is a horrible arrangement; once I figure out how to resolve the `xdist` issue, I definitely want to follow the standard practice of keeping each test in its own method.
I went back to having 4 tests, since otherwise it's too messy. There's no semantic problem with that, and I'll deal with the extra few seconds in test runtime later.
But with this, I'd like to keep the class-level attributes if you don't see an issue with them; both semantically and performance-wise, I don't want to force each individual test to wait for the threads to expire.
mypy/test/testutil.py (Outdated)

```python
threads = []  # type: List[Thread]


@classmethod
def tearDownClass(cls) -> None:
```
Please switch to (instance-level) tearDown.
mypy/test/testutil.py (Outdated)

```python
        self.threads.append(lock_file(dest, dest_lock_duration))
        return src, dest

    def replace_ok(self, src_lock_duration: float, dest_lock_duration: float,
```
Add a docstring for this.
mypy/util.py (Outdated)

```python
if wait > timeout:
    raise
else:
    return
```
You can move this `return` into the `try` block; then you don't need the `else` block.
mypy/test/testutil.py (Outdated)

```python
def prepare_src_dest(self, src_lock_duration: float, dest_lock_duration: float
                     ) -> Tuple[str, str]:
    '''Create two files in self.tmpdir random names (src, dest) and unique contents;
```
Can you use """ for docstrings please?
Oops, one day I'll learn to follow patterns.
```python
import tempfile
from contextlib import contextmanager
from threading import Thread
from unittest import TestCase, main, skipUnless
```
Wow, turns out `mypy.test`, `unittest`, and pytest are currently being used... ouch...
Yeah, but Max didn't start that. IIUC we want to kill mypy.test, but it's low priority (not sure if there's even an issue) and pytest works well enough with unittest (also, personally, I'm more used to the unittest way of writing tests -- let pytest just be the test runner).
This PR is only limited to We don't rely on
This PR can fix (I also submitted a suggestion to fix

Have you heard the story of the Hydra? I'm beginning to despair that this PR will ever be complete. Please just focus on os.replace(), and if the other problem becomes too annoying we'll deal with it later -- but even then I suspect for a pure cleanup the suggestion I made for TemporaryDirectory (just catch and ignore the exception) is good enough, so we won't need to revisit the retry logic there.
No worries, we got it! This PR is now 100% focused on the I ran the tests of
My fix to both problems simplifies the code a lot: instead of asking a thread to lock the file, I'm instead keeping the file open after I wrote to it, and asking a thread to close it later. That way, it's perfectly obvious what's going on, and there's also no unnecessary reopening of files. I also split the tests into several smaller parts because it was too hard to debug without it. It works fine as is, but I do have a fix to I reran the tests again many times, and found no more failures (apart from the intermittent inability to delete folders or files on Windows, which we'll deal with separately).
So I'm getting more and more worried about the complexity of this endeavour. The test failures you fixed do nothing to increase my confidence. (The fact that they weren't obvious in review is a warning sign.) What would be lost if we simply caught the original error in write_cache() and gave up (maybe with a warning message)? The next run will re-create the cache file if necessary, and the tests will pass. (We also recently merged #3255 which disables cache writing in most tests.)

We can do that, but with a large cache failures would be pretty common. Still, it's an option. That said, writing tests in this case is much harder than writing the actual functionality, because we have to artificially and reliably recreate the behavior with concurrency. There was never really a concern with the correctness of the code itself. Maybe we should just allow the (now fixed and simplified) tests to be of slightly higher complexity than the rest of the code in this PR? Either way is fine with me; I don't have a large code base to comment on how annoying the caching issue is.
mypy/test/testutil.py (Outdated)

```python
        """
        src, dest = self.prepare_src_dest(src_lock_duration, dest_lock_duration)
        util._replace(src, dest, timeout=timeout)
        self.assertEqual(open(dest).read(), src, 'replace failed')
```
Should use `with open(...)`; this code will throw a ResourceWarning on recent versions of Python.
mypy/test/testutil.py (Outdated)

```python
        self.replace_ok(0, 0, self.timeout)

    def test_original_problem(self) -> None:
        # Make sure we can reproduce the issue with our setup.
```
There doesn't appear to be any comment to tell us what "the issue" is. Maybe add a reference to the original GitHub issue?
The cache issue so far has only been annoying for the AppVeyor tests. I've got a simpler solution that catches write and replace errors for both JSON files. I think that because of the way I designed the code in the first place this is not going to create incorrectness -- the worst that can happen is that the next run will need to do extra work. See #3288
BTW if we go with my solution, and later find the replace() issue still causing problems, we can revisit your version. But for now, my spider-sense is tingling.
Sure, if the problem from aborting the writing of the cache is just a minor performance issue, we don't have to bring in the wait/retry loop just for Windows. I'll apply CR comments from @JelleZijlstra, then close this PR, and move this code into a separate repo on my GitHub account, just in case we or anyone else wants to use it later.
Fix #3215
Copying my comment from there:
For now I used (3), but made the `wait_for` API support refactoring into (1) if it's desired.