Skip to content

test_asyncio environment changed during test execution due to empty threading._dangling without MainThread before some tests #91676

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
gpshead opened this issue Apr 18, 2022 · 6 comments · Fixed by #91680 or #91692
Assignees
Labels
3.9 only security fixes 3.10 only security fixes tests Tests in the Lib/test dir topic-asyncio type-bug An unexpected behavior, bug, or error

Comments

@gpshead
Copy link
Member

gpshead commented Apr 18, 2022

We're seeing frequent flakiness on CI and buildbots (heavily loaded systems, triggering more race conditions?) with test_asyncio of late. #91260 has some discussion as that seems to have surfaced it more frequently, at least in the stable 3.10 branch.

$ python -m test.bisect_cmd test_asyncio --fail-env-changed
...
0:00:00 load avg: 0.93 [1/2/1] test_asyncio.test_threads failed (env changed)
Warning -- threading._dangling was modified by test_asyncio.test_threads
  Before: {<weakref at 0x7f3b2e77fa10; to '_MainThread' at 0x7f3b2f107490>}
  After:  {<weakref at 0x7f3b2e77ffb0; to '_MainThread' at 0x7f3b2f107490>}

from a 3.10 branch with #91674 patched in.

0:09:20 load avg: 1.49 [427/427/1] test_asyncio failed (env changed) (1 min 22 sec)
Warning -- threading_cleanup() failed to cleanup -1 threads (count: 0, dangling: 1)
Warning -- Dangling thread: <_MainThread(MainThread, started 140300013757632)>

from https://github.com/python/cpython/runs/6069450575?check_suite_focus=true CI and from many stable buildbots such as https://buildbot.python.org/all/#/builders/608/builds/700/steps/6/logs/stdio

While "interesting" for _MainThread to wind up in threading._dangling... code wise it should always be in there. Yet somehow it is not before execution?!?

after a fork() from the non-main thread or from C, it won't be. danger - something is using fork with threads running, a guaranteed source of problems. Fixing it to always exist in _dangling beforehand would not be a bad thing, but only if we can figure out how it happens to not be in the first place.

@gpshead gpshead added the type-bug An unexpected behavior, bug, or error label Apr 18, 2022
@gpshead gpshead changed the title test environment chained due to empty threading._dangling without MainThread before some tests test_asyncio environment changed during test execution due to empty threading._dangling without MainThread before some tests Apr 18, 2022
@gpshead
Copy link
Member Author

gpshead commented Apr 18, 2022

I've also seen this one:

../b/python -m test.bisect_cmd test_asyncio.test_threads test_asyncio.test_unix_events --fail-env-changed -j 12 -N 1000 2>&1 | less
...
0:00:00 load avg: 0.64 [1/2/1] test_asyncio.test_threads failed (env changed)
Warning -- threading._dangling was modified by test_asyncio.test_threads
Warning --   Before: {<weakref at 0x7f67f8967970; to '_MainThread' at 0x7f67f92eb370>}
Warning --   After:  {<weakref at 0x7f67f91a1210; to 'Thread' at 0x7f67f8979480>, <weakref at 0x7f67f8967f60; to '_MainThread' at 0x7f67f92eb370>}

@gpshead
Copy link
Member Author

gpshead commented Apr 18, 2022

notice the specific weakref instance to the same address _MainThread changed above and here:

python -m test --matchfile /tmp/tmpu15fc2a3 --fail-env-changed test_asyncio.test_threads test_asyncio.test_unix_events -j 2
0:00:00 load avg: 1.42 Run tests in parallel using 2 child processes
0:00:00 load avg: 1.42 [1/2/1] test_asyncio.test_threads failed (env changed)
Warning -- threading._dangling was modified by test_asyncio.test_threads
  Before: {<weakref at 0x7f3929113a10; to '_MainThread' at 0x7f3929a93490>}
  After:  {<weakref at 0x7f3929979800; to '_MainThread' at 0x7f3929a93490>}
0:00:00 load avg: 1.42 [2/2/1] test_asyncio.test_unix_events ran no tests

== Tests result: ENV CHANGED ==

@gpshead
Copy link
Member Author

gpshead commented Apr 19, 2022

The code in Lib/test/libregrtest/save_env.py has race conditions that could lead to this https://github.com/python/cpython/blob/main/Lib/test/libregrtest/save_env.py#L216 are invoked by enter and exit on the saved_test_environment context manager. see resource_info for how.

@gpshead
Copy link
Member Author

gpshead commented Apr 19, 2022

the irony of code that is supposed to help buildbot instability also being one probable cause of buildbot instability...

@gpshead
Copy link
Member Author

gpshead commented Apr 19, 2022

The excess threads running on occasion after a test ends are asyncio concurrent.futures threadpool _worker threads named asyncio_0 or asyncio_1... ('<Thread(asyncio_0, started 140097331353152)>', (<function _worker at 0x7f6af3c3f5f0>, '_worker', 'concurrent.futures.thread', None))

The change that exacerbated this race condition notably no longer does a loop.run_until_complete(loop.shutdown_default_executor()) which is what I believe would've seen the worker threads cleaned up? [edit: confirmed, PR coming]

@gpshead gpshead added 3.10 only security fixes 3.9 only security fixes labels Apr 19, 2022
@AlexWaygood AlexWaygood added tests Tests in the Lib/test dir topic-asyncio labels Apr 19, 2022
pablogsal pushed a commit that referenced this issue Apr 19, 2022
…eaks its executor (GH-91680)

For things like test_asyncio.test_thread this was causing frequent
"environment modified by test" errors as the executor threads had not
always stopped running after the test was over.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Apr 19, 2022
… no longer leaks its executor (pythonGH-91680)

For things like test_asyncio.test_thread this was causing frequent
"environment modified by test" errors as the executor threads had not
always stopped running after the test was over.
(cherry picked from commit 61570ae)

Co-authored-by: Gregory P. Smith <[email protected]>
miss-islington added a commit that referenced this issue Apr 19, 2022
…eaks its executor (GH-91680)

For things like test_asyncio.test_thread this was causing frequent
"environment modified by test" errors as the executor threads had not
always stopped running after the test was over.
(cherry picked from commit 61570ae)

Co-authored-by: Gregory P. Smith <[email protected]>
@gpshead gpshead closed this as completed May 13, 2022
@vstinner
Copy link
Member

Thanks for the fix ;-)

hello-adam pushed a commit to hello-adam/cpython that referenced this issue Jun 2, 2022
… no longer leaks its executor (pythonGH-91680)

For things like test_asyncio.test_thread this was causing frequent
"environment modified by test" errors as the executor threads had not
always stopped running after the test was over.
(cherry picked from commit 61570ae)

Co-authored-by: Gregory P. Smith <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.9 only security fixes 3.10 only security fixes tests Tests in the Lib/test dir topic-asyncio type-bug An unexpected behavior, bug, or error
Projects
None yet
3 participants