gh-109593: ResourceTracker.ensure_running() calls finalizers #109620
Conversation
Hmm, but
Oh,
What is a process end?
Sorry, I meant interpreter shutdown.
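(For context: the hang this PR addresses comes from a finalizer re-entering ensure_running() while its non-reentrant lock is already held. A minimal illustration of that kind of reentrancy deadlock, using a hypothetical Tracker class rather than the real multiprocessing code:)

import threading

class Tracker:
    # Hypothetical stand-in for ResourceTracker; the real class differs.
    def __init__(self):
        self._lock = threading.Lock()  # plain, non-reentrant lock

    def ensure_running(self):
        with self._lock:
            # If anything run here (e.g. a finalizer triggered by an
            # allocation) calls ensure_running() again in the same thread,
            # the nested acquire blocks forever: a deadlock.
            self._finalizer_fires()

    def _finalizer_fires(self):
        # Simulates a garbage-collected object whose finalizer calls back
        # into the tracker while the lock is held.
        self.ensure_running()

tracker = Tracker()
# tracker.ensure_running()  # would hang: reentrant acquire of a plain Lock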
Force-pushed from 339467e to c1dad09
Oh in fact, what I want is to trigger a GC collection: I fixed my PR for that. I confirm that my change fixes the #109593 (comment) reproducer. @pitrou: Would you mind reviewing this updated fix?
multiprocessing: Reduce the risk of reentrant calls to ResourceTracker.ensure_running() by explicitly running a garbage collection, to call pending finalizers, before acquiring the ResourceTracker lock.
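(A minimal sketch of what this commit message describes, assuming a simplified ensure_running() guarded by a private _lock; this is not the exact CPython code:)

import gc
import threading

class ResourceTrackerSketch:
    def __init__(self):
        self._lock = threading.Lock()

    def ensure_running(self):
        # Run a garbage collection first, while the lock is NOT held, so
        # that pending finalizers which call back into ensure_running()
        # cannot deadlock on a lock this thread already owns.
        gc.collect()
        with self._lock:
            pass  # check the tracker process and start it if needed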
Force-pushed from c1dad09 to d11bc95
Ok, this is better, but the problem is that instantiating a semaphore is currently 500x faster than a GC collection, and that's an optimistic measurement without a lot of objects allocated:

>>> %timeit gc.collect()
10.7 ms ± 352 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit mp.Semaphore()
21.7 µs ± 395 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
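(For reference, the comparison can be reproduced outside IPython with the standard timeit module; the numbers will of course vary with machine and heap size, and the sketch assumes a POSIX platform where Semaphore() goes through the resource tracker:)

import gc
import timeit
import multiprocessing as mp

if __name__ == "__main__":
    # Cost of one full garbage collection (what the fix adds to each call).
    gc_time = timeit.timeit(gc.collect, number=100) / 100
    # Cost of creating one semaphore, which calls ensure_running().
    sem_time = timeit.timeit(mp.Semaphore, number=100) / 100
    print(f"gc.collect():   {gc_time * 1e3:.2f} ms")
    print(f"mp.Semaphore(): {sem_time * 1e6:.2f} us")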
Do you want to propose a different fix? My concern is that, currently, multiprocessing can hang randomly. IMO that's bad and must be fixed. I prefer a slow multiprocessing to a multiprocessing which hangs randomly. Also, I have limited interest in developing the most efficient fix, so if you have free cycles, please go ahead :-) These days, I'm busy fixing tons of buildbot failures.
Obviously, I would be fine with a fast and correct fix for this issue :-)
Well, TBH, I'm not sure this issue is very common, as I don't think I have ever seen it elsewhere. But yes, I'll try to come up with a fix.
It makes more and more buildbots fail, so for me it's urgent. What I meant in my previous message is that I'm considering fixing the issue right now, and we can take time later to revisit it and find a better fix (with lower impact on performance).
This code isn't new, so it's surprising it's failing "more and more"?
I recently modified the CI to stop silently ignoring tests that fail randomly (FAILURE then SUCCESS). The affected buildbot, "PPC64LE Fedora Stable Refleaks 3.x", is blocking Python releases; it's part of the STABLE buildbots.
Well... do you want to undo the change on the failing buildbot until we fix this issue?
Please take a look at the alternate PR #109629.
Oh sure, if I don't have the bandwidth to fix regressions, I will undo this change once we have listed the failing tests. That would be reasonable.
Since @pitrou has a better approach, I'm converting this change to a draft for now.
multiprocessing: Reduce the risk of reentrant calls to ResourceTracker.ensure_running() by explicitly running all finalizers before acquiring the ResourceTracker lock.