Closed
Description
This is a weird one.
If you have a short-lived Client object with the transport thread intact, it can cause a segfault.
This is what I've found after a ton of debugging:
- It only happens in Python 3.12, and only in CI (race condition?)
- It appears to happen in the transport thread. You can see this if you remove the kwargs passed into `test_python_version_deprecation` (introduced in CI fixup #1913)
  - If you disable the metrics thread (with `metrics_interval="0ms"`), the segfault goes away
    - This is likely a red herring. I "fixed" the segfault here as well, and `sending_elasticapm_client` has the metrics thread disabled. But we do spin the threads up and down in a specific order, so perhaps the metrics thread is just changing the timing enough to cause the segfault in some cases?
  - If you disable the transport thread (with `transport_class="tests.fixtures.DummyTransport"`), the segfault goes away
- You can also see this failure in the `test_wrapper_script_instrumentation` test, which is currently skipped in Python 3.12.
  - You can stop the segfault with a five second sleep in testapp.py (thus the "short-lived Client object")
  - You can also stop the segfault by disabling either thread as mentioned above
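
For anyone trying to reason about the lifecycle described above, here is a rough stdlib-only analogue of a "short-lived client" that starts two background threads and is closed immediately. It is not the agent's code and it does not reproduce the segfault. `Worker`, `ShortLivedClient`, and `run_short_lived` are hypothetical names, and the reverse-order shutdown is an assumption made purely for illustration.

```python
import threading


class Worker(threading.Thread):
    """Hypothetical stand-in for a background thread (transport or metrics)."""

    def __init__(self, name, interval):
        super().__init__(name=name, daemon=True)
        self._stop_event = threading.Event()
        self._interval = interval
        self.ticks = 0

    def run(self):
        # Wake up every `interval` seconds until asked to stop.
        while not self._stop_event.wait(self._interval):
            self.ticks += 1

    def stop(self):
        # Signal the loop to exit, then wait for the thread to finish.
        self._stop_event.set()
        self.join()


class ShortLivedClient:
    """Hypothetical client that owns its background threads."""

    def __init__(self, metrics_enabled=True):
        self.threads = [Worker("transport", 0.01)]
        if metrics_enabled:
            self.threads.append(Worker("metrics", 0.01))
        for thread in self.threads:
            thread.start()

    def close(self):
        # Assumption for this sketch: stop threads in reverse start order.
        for thread in reversed(self.threads):
            thread.stop()


def run_short_lived(metrics_enabled=True):
    # Create the client and close it right away, as a short-lived script would.
    client = ShortLivedClient(metrics_enabled=metrics_enabled)
    client.close()
    return [thread.is_alive() for thread in client.threads]
```

Calling `run_short_lived()` returns the liveness of each thread after `close()`. Because every thread is joined before the function returns, nothing is still running when the interpreter would start shutting down, which is the window the five second sleep appears to paper over.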
With all of the above, I decided it wasn't a blocker, so I'm opening this technical debt issue. I don't think customers will run into this, but it is possible.