Segfault for short-lived Client objects in python 3.12 #1916

Closed
@basepi

This is a weird one.

A short-lived Client object with its transport thread enabled can cause a segfault.

This is what I've found after a ton of debugging:

  • It only happens in Python 3.12, and only in CI (race condition?)
  • It appears to happen in the transport thread. You can see this if you remove the kwargs passed into test_python_version_deprecation (introduced in CI fixup #1913)
  • If you disable the metrics thread (with metrics_interval="0ms"), the segfault goes away
    • This is likely a red herring. I "fixed" the segfault here as well, and sending_elasticapm_client has the metrics thread disabled. But we do spin the threads up and down in a specific order, so perhaps the metrics thread is just changing the timing enough to cause the segfault in some cases?
  • If you disable the transport thread (with transport_class="tests.fixtures.DummyTransport"), the segfault goes away
  • You can also see this failure in the test_wrapper_script_instrumentation test, which is currently skipped in Python 3.12.
    • You can stop the segfault with a five-second sleep in testapp.py (thus the "short-lived Client object")
    • You can also stop the segfault by disabling either thread as mentioned above
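
For anyone picking this up, here is a rough stdlib-only sketch of the lifecycle shape described above: an object that spins up two background threads and is closed almost immediately. It does not reproduce the segfault and it is not the agent's code. `SketchClient`, `short_lived`, the `start_metrics`/`start_transport` flags, and the start/stop order are all hypothetical names and assumptions chosen for illustration.

```python
import threading
import time


class SketchClient:
    """Hypothetical stand-in for an agent client; not the elasticapm API."""

    def __init__(self, start_metrics=True, start_transport=True):
        self._stop = threading.Event()
        self._threads = []
        # Assumed start order, for illustration only: transport, then metrics.
        if start_transport:
            self._start("transport")
        if start_metrics:
            self._start("metrics")

    def _start(self, name):
        thread = threading.Thread(target=self._run, name=name, daemon=True)
        thread.start()
        self._threads.append(thread)

    def _run(self):
        # Idle until close() signals shutdown.
        self._stop.wait()

    def thread_names(self):
        return [t.name for t in self._threads]

    def close(self):
        self._stop.set()
        # Join in reverse start order so no thread outlives the client.
        for thread in reversed(self._threads):
            thread.join(timeout=5)
        return all(not t.is_alive() for t in self._threads)


def short_lived(delay=0.0, **kwargs):
    """Create a client, optionally wait, then close it right away."""
    client = SketchClient(**kwargs)
    time.sleep(delay)
    names = client.thread_names()
    return names, client.close()
```

Calling `short_lived()` mirrors the failing case (both threads, no delay), `short_lived(delay=5)` mirrors the sleep workaround, and `short_lived(start_transport=False)` or `short_lived(start_metrics=False)` mirrors disabling either thread. Whether the real agent joins its threads this way on shutdown is not something this sketch claims.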

With all of the above, I decided it wasn't a blocker, so I'm opening this technical debt issue. I don't think customers will run into this, but it is possible.
