Description
Is there an existing issue for this?
- I have searched the existing issues
Current Behavior
I have a PR in a different repo that rewrites some code to run an asyncio loop in a separate thread, and that PR hit surprisingly consistent segfaults in CI. I managed to reproduce it locally a couple of times and extracted a core dump as well as the Python tracebacks (locally it is less consistent). The segfaults stopped completely when I disabled memray in CI.
The Python stack trace is always like this, with the "Current thread" somewhere in asyncio.run():
Python faulthandler traceback from CI
tests/unit_tests/test_lib/test_progress.py::test_remaining_operations Fatal Python error: Segmentation fault
Current thread 0x000000017d78b000 (most recent call first):
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/selectors.py", line 517 in register
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/selector_events.py", line 284 in _add_reader
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/selector_events.py", line 124 in _make_self_pipe
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/selector_events.py", line 66 in __init__
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/unix_events.py", line 64 in __init__
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/events.py", line 720 in new_event_loop
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/events.py", line 823 in new_event_loop
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/runners.py", line 137 in _lazy_init
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/runners.py", line 58 in __enter__
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/runners.py", line 194 in run
File "/Users/distiller/project/wandb/sdk/lib/asyncio_compat.py", line 75 in run
File "/Users/distiller/project/wandb/sdk/lib/asyncio_manager.py", line 233 in _main
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/threading.py", line 1012 in run
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/threading.py", line 1075 in _bootstrap_inner
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/threading.py", line 1032 in _bootstrap
Thread 0x000000016c74f000 (most recent call first):
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/execnet/gateway_base.py", line 534 in read
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/execnet/gateway_base.py", line 567 in from_io
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/execnet/gateway_base.py", line 1160 in _thread_receiver
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/execnet/gateway_base.py", line 341 in run
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/execnet/gateway_base.py", line 411 in _perform_spawn
Thread 0x00000001e41d5ec0 (most recent call first):
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/contextlib.py", line 137 in __enter__
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pytest_memray/plugin.py", line 212 in wrapper
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/_pytest/python.py", line 157 in pytest_pyfunc_call
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_callers.py", line 121 in _multicall
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_hooks.py", line 512 in __call__
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/_pytest/python.py", line 1671 in runtest
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/_pytest/runner.py", line 178 in pytest_runtest_call
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_callers.py", line 121 in _multicall
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_hooks.py", line 512 in __call__
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/_pytest/runner.py", line 246 in <lambda>
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/_pytest/runner.py", line 344 in from_call
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/_pytest/runner.py", line 245 in call_and_report
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/_pytest/runner.py", line 136 in runtestprotocol
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/_pytest/runner.py", line 117 in pytest_runtest_protocol
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_callers.py", line 121 in _multicall
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_hooks.py", line 512 in __call__
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/xdist/remote.py", line 227 in run_one_test
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/xdist/remote.py", line 206 in pytest_runtestloop
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_callers.py", line 121 in _multicall
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_hooks.py", line 512 in __call__
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/_pytest/main.py", line 343 in _main
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/_pytest/main.py", line 289 in wrap_session
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/_pytest/main.py", line 336 in pytest_cmdline_main
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_callers.py", line 121 in _multicall
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/pluggy/_hooks.py", line 512 in __call__
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/xdist/remote.py", line 427 in <module>
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/execnet/gateway_base.py", line 1291 in executetask
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/execnet/gateway_base.py", line 341 in run
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/execnet/gateway_base.py", line 411 in _perform_spawn
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/execnet/gateway_base.py", line 389 in integrate_as_primary_thread
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/execnet/gateway_base.py", line 1273 in serve
File "/Users/distiller/project/.nox/unit_tests-3-12/lib/python3.12/site-packages/execnet/gateway_base.py", line 1806 in serve
File "<string>", line 8 in <module>
File "<string>", line 1 in <module>
The C tracebacks I managed to get a couple of times are all essentially the same, with Thread 3 crashing in memray::tracking_api::PythonStackTracker::emitPendingPushesAndPops():
C traceback from local run
Thread 0:: Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x187afd3cc __psynch_cvwait + 8
1 libsystem_pthread.dylib 0x187b3c0e0 _pthread_cond_wait + 984
2 Python 0x1039a5e5c take_gil + 456
3 Python 0x1039a65fc PyEval_RestoreThread + 24
4 Python 0x103a00b54 posix_do_stat + 160
5 Python 0x1039f9ce8 os_lstat + 168
6 Python 0x1038d369c cfunction_vectorcall_FASTCALL_KEYWORDS + 92
7 Python 0x103977414 _PyEval_EvalFrameDefault + 42244
8 Python 0x10388527c method_vectorcall + 184
9 Python 0x1038842f0 object_vacall + 228
10 Python 0x103884538 PyObject_CallFunctionObjArgs + 56
11 tracer.cpython-312-darwin.so 0x1031e6cd4 CTracer_trace + 1228
12 Python 0x1039bbefc call_trace_func + 116
13 Python 0x1039b8548 call_one_instrument + 132
14 Python 0x1039b7dcc call_instrumentation_vector + 288
15 Python 0x10396d2a4 _PyEval_EvalFrameDefault + 916
16 Python 0x103882944 _PyVectorcall_Call + 152
17 Python 0x103979288 _PyEval_EvalFrameDefault + 50040
18 Python 0x103881e60 _PyObject_FastCallDictTstate + 208
19 Python 0x10388331c _PyObject_Call_Prepend + 136
20 Python 0x1038f7ff8 slot_tp_call + 144
21 Python 0x103881fcc _PyObject_MakeTpCall + 128
22 Python 0x103977560 _PyEval_EvalFrameDefault + 42576
23 Python 0x103881e60 _PyObject_FastCallDictTstate + 208
24 Python 0x10388331c _PyObject_Call_Prepend + 136
25 Python 0x1038f7ff8 slot_tp_call + 144
26 Python 0x103882cdc _PyObject_Call + 124
27 Python 0x103978e74 _PyEval_EvalFrameDefault + 48996
28 Python 0x103881e60 _PyObject_FastCallDictTstate + 208
29 Python 0x10388331c _PyObject_Call_Prepend + 136
30 Python 0x1038f7ff8 slot_tp_call + 144
31 Python 0x103881fcc _PyObject_MakeTpCall + 128
32 Python 0x103977560 _PyEval_EvalFrameDefault + 42576
33 Python 0x10388527c method_vectorcall + 184
34 Python 0x103978e74 _PyEval_EvalFrameDefault + 48996
35 Python 0x103881e60 _PyObject_FastCallDictTstate + 208
36 Python 0x10388331c _PyObject_Call_Prepend + 136
37 Python 0x1038f7ff8 slot_tp_call + 144
38 Python 0x103881fcc _PyObject_MakeTpCall + 128
39 Python 0x103977560 _PyEval_EvalFrameDefault + 42576
40 Python 0x103881e60 _PyObject_FastCallDictTstate + 208
41 Python 0x10388331c _PyObject_Call_Prepend + 136
42 Python 0x1038f7ff8 slot_tp_call + 144
43 Python 0x103881fcc _PyObject_MakeTpCall + 128
44 Python 0x103977560 _PyEval_EvalFrameDefault + 42576
45 Python 0x10396cca0 PyEval_EvalCode + 184
46 Python 0x103968f20 builtin_exec + 448
47 Python 0x103978258 _PyEval_EvalFrameDefault + 45896
48 Python 0x10388527c method_vectorcall + 184
49 Python 0x103978e74 _PyEval_EvalFrameDefault + 48996
50 Python 0x10396cca0 PyEval_EvalCode + 184
51 Python 0x1039cfccc run_eval_code_obj + 88
52 Python 0x1039cddac run_mod + 132
53 Python 0x1039cd3f4 PyRun_StringFlags + 124
54 Python 0x103969058 builtin_exec + 760
55 Python 0x1038d369c cfunction_vectorcall_FASTCALL_KEYWORDS + 92
56 Python 0x103977414 _PyEval_EvalFrameDefault + 42244
57 Python 0x10396cca0 PyEval_EvalCode + 184
58 Python 0x1039cfccc run_eval_code_obj + 88
59 Python 0x1039cddac run_mod + 132
60 Python 0x1039cd3f4 PyRun_StringFlags + 124
61 Python 0x1039cd320 PyRun_SimpleStringFlags + 64
62 Python 0x1039f1754 Py_RunMain + 720
63 Python 0x1039f1c40 pymain_main + 304
64 Python 0x1039f1ce0 Py_BytesMain + 40
65 dyld 0x18779ab98 start + 6076
Thread 1:
0 libsystem_kernel.dylib 0x187afa7dc read + 8
1 Python 0x1039efab0 _Py_read + 76
2 Python 0x103a1a398 _io_FileIO_readinto + 172
3 Python 0x10388e240 method_vectorcall_FASTCALL_KEYWORDS_METHOD + 136
4 Python 0x103884058 PyObject_VectorcallMethod + 148
5 Python 0x103a1fd24 _bufferedreader_raw_read + 156
6 Python 0x103a1f5e0 _bufferedreader_fill_buffer + 64
7 Python 0x103a20834 _io__Buffered_read + 936
8 Python 0x103978108 _PyEval_EvalFrameDefault + 45560
9 Python 0x103885338 method_vectorcall + 372
10 Python 0x103978e74 _PyEval_EvalFrameDefault + 48996
11 Python 0x10388527c method_vectorcall + 184
12 Python 0x103a4dad8 thread_run + 144
13 Python 0x1039e1e10 pythread_wrapper + 48
14 libsystem_pthread.dylib 0x187b3bc0c _pthread_start + 136
15 libsystem_pthread.dylib 0x187b36b80 thread_start + 8
Thread 2:
0 libsystem_pthread.dylib 0x187b36b6c start_wqthread + 0
Thread 3 Crashed:
0 libsystem_kernel.dylib 0x187b02388 __pthread_kill + 8
1 libsystem_pthread.dylib 0x187b3b88c pthread_kill + 296
2 libsystem_c.dylib 0x187a0cd04 raise + 32
3 Python 0x1039f73a8 faulthandler_fatal_error + 416
4 libsystem_platform.dylib 0x187b756a4 _sigtramp + 56
5 _memray.cpython-312-darwin.so 0x104f676ac memray::tracking_api::PythonStackTracker::emitPendingPushesAndPops() + 2544
6 _memray.cpython-312-darwin.so 0x104f676ac memray::tracking_api::PythonStackTracker::emitPendingPushesAndPops() + 2544
7 _memray.cpython-312-darwin.so 0x104f69b84 memray::tracking_api::Tracker::trackAllocationImpl(void*, unsigned long, memray::hooks::Allocator, std::__1::optional<memray::tracking_api::NativeTrace> const&) + 100
8 _memray.cpython-312-darwin.so 0x104f3e42c memray::intercept::malloc(unsigned long) + 376
9 Python 0x1038dbdac _PyObject_Malloc + 112
10 Python 0x1039b9058 _Py_Instrument + 1932
11 Python 0x10396d230 _PyEval_EvalFrameDefault + 800
12 Python 0x103885338 method_vectorcall + 372
13 Python 0x103978e74 _PyEval_EvalFrameDefault + 48996
14 Python 0x103885338 method_vectorcall + 372
15 Python 0x103a4dad8 thread_run + 144
16 Python 0x1039e1e10 pythread_wrapper + 48
17 libsystem_pthread.dylib 0x187b3bc0c _pthread_start + 136
18 libsystem_pthread.dylib 0x187b36b80 thread_start + 8
Thread 4:
0 libsystem_kernel.dylib 0x187afd3cc __psynch_cvwait + 8
1 libsystem_pthread.dylib 0x187b3c0e0 _pthread_cond_wait + 984
2 libc++.1.dylib 0x187a6c300 std::__1::condition_variable::__do_timed_wait(std::__1::unique_lock<std::__1::mutex>&, std::__1::chrono::time_point<std::__1::chrono::system_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>) + 104
3 _memray.cpython-312-darwin.so 0x104f69634 void* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, memray::tracking_api::Tracker::BackgroundThread::start()::$_3>>(void*) + 200
4 libsystem_pthread.dylib 0x187b3bc0c _pthread_start + 136
5 libsystem_pthread.dylib 0x187b36b80 thread_start + 8
Thread 3 crashed with ARM Thread State (64-bit):
x0: 0x0000000000000000 x1: 0x0000000000000000 x2: 0x0000000000000001 x3: 0x0000000000000000
x4: 0x0000000000000073 x5: 0x0000000000000069 x6: 0x0000000000000100 x7: 0x000000039f9061f8
x8: 0xfca33e065c7d4e37 x9: 0xfca33e05c3ed3e37 x10: 0xcccccccccccccccd x11: 0x000000000000000a
x12: 0x000000039f9061d1 x13: 0x0000000000000000 x14: 0x0000000000000032 x15: 0x0000000000000001
x16: 0x0000000000000148 x17: 0x00000001f6b25558 x18: 0x0000000000000000 x19: 0x000000000000000b
x20: 0x0000000000001843 x21: 0x000000039f9070e0 x22: 0x0000000103cb0e90 x23: 0x0000000000000000
x24: 0x0000000000000004 x25: 0x0000000000000000 x26: 0x000000000000015c x27: 0x000000039e45b3c0
x28: 0x0000000103d20778 fp: 0x000000039f906250 lr: 0x0000000187b3b88c
sp: 0x000000039f906230 pc: 0x0000000187b02388 cpsr: 0x40001000
far: 0x0000000000000000 esr: 0x56000080 Address size fault
So far I have only seen this with Python 3.12. We also run CI with Python 3.8 and it never segfaulted.
Expected Behavior
Running the tests under memray should not segfault.
Steps To Reproduce
See the details above; I haven't figured out a minimal repro yet, but I'm hoping something can be worked out from the C traceback.
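For orientation, here is a rough sketch of the shape of the scenario: a worker thread that calls asyncio.run(), started repeatedly from the main thread. It is not a confirmed reproducer and has not been shown to crash. The names `_noop`, `run_loop_in_thread`, and `run_many` are hypothetical and only illustrate the pattern; the real code path goes through pytest, pytest-memray, and xdist, which this sketch leaves out.

```python
import asyncio
import threading


async def _noop() -> str:
    # Trivial coroutine. In the Python traceback above, the fault occurs while
    # the event loop is still being created, before any coroutine body runs.
    await asyncio.sleep(0)
    return "done"


def run_loop_in_thread() -> str:
    """Run asyncio.run() on a fresh thread and return the coroutine's result."""
    results = []

    def _main() -> None:
        # Mirrors the traceback: threading.Thread.run -> _main -> asyncio.run
        results.append(asyncio.run(_noop()))

    thread = threading.Thread(target=_main)
    thread.start()
    thread.join()
    return results[0]


def run_many(iterations: int = 50) -> int:
    """Repeat the thread + event loop startup and count successful runs."""
    completed = 0
    for _ in range(iterations):
        if run_loop_in_thread() == "done":
            completed += 1
    return completed


if __name__ == "__main__":
    print(run_many())
```

Saved under a hypothetical name such as `sketch.py`, this could be run with `python -m memray run sketch.py` so that memray is tracking allocations while the threads start their event loops.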
Memray Version
1.18.0
Python Version
3.12
Operating System
Linux, macOS
Anything else?
No response