-
Notifications
You must be signed in to change notification settings - Fork 431
chore(telemetry): ensure instrumentation telemetry is compatible with python 3.12 #6859
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Includes: - pinning images and package dependencies in CircleCI - removing stale `master` branch - dropping pylons framework test - updating versioning documentation - drop Python < 3.7 as supported on setup.py - removing pylons/boto from CI suitespec - release note for 2.0
This reverts commit 0d91f94.
Includes: - pinning images and package dependencies in CircleCI - removing stale `master` branch - dropping pylons framework test - updating versioning documentation - drop Python < 3.7 as supported on setup.py - removing pylons/boto from CI suitespec - release note for 2.0
python X.Y and above
This PR adds a note to upgrade to 2.x in `upgrading.rst` and removes all deprecated items slated for removal in 2.0.0. This includes: - `DD_GEVENT_PATCH_ALL`: no special configuration is now necessary to make `ddtrace-run` work with gevent. - `DD_AWS_TAG_ALL_PARAMS`: the boto/botocore/aiobotocore integrations no longer collect all API parameters by default. - `DD_REMOTECONFIG_POLL_SECONDS`: replaced by `DD_REMOTE_CONFIG_POLL_INTERVAL_SECONDS` - ASM deprecated constants including ``APPSEC_ENABLED``, ``APPSEC_JSON``, ``APPSEC_EVENT_RULE_VERSION``, ``APPSEC_EVENT_RULE_ERRORS``, ``APPSEC_EVENT_RULE_LOADED``, ``APPSEC_EVENT_RULE_ERROR_COUNT``, ``APPSEC_WAF_DURATION``, ``APPSEC_WAF_DURATION_EXT``, ``APPSEC_WAF_TIMEOUTS``, ``APPSEC_WAF_VERSION``, ``APPSEC_ORIGIN_VALUE``, ``APPSEC_BLOCKED``, ``IAST_JSON``, ``IAST_ENABLED``, ``IAST_CONTEXT_KEY``. These constants were meant for private use only and should not affect existing code. - ``ddtrace.contrib.grpc.constants.GRPC_PORT_KEY``: replaced by `ddtrace.ext.net.TARGET_PORT` - ``ddtrace.ext.cassandra.ROW_COUNT``, ``ddtrace.ext.mongo.ROW_COUNT``, ``ddtrace.ext.sql.ROW_COUNT``: replaced by `ddtrace.ext.db.ROWCOUNT` - `ddtrace.filters.TraceCiVisibilityFilter`: removed as this was for private use only and does not affect existing code. - `ddtrace.contrib.starlette.get_resource` and `ddtrace.contrib.starlette.span_modifier` and `ddtrace.contrib.fastapi.span_modifier`: the fastapi and starlette integrations now provide the full route and not just mounted route for sub-applications by default. - `ddtrace.contrib.starlette.config['aggregate_resources']` and `ddtrace.contrib.fastapi.config['aggregate_resources']`: the starlette and fastapi integrations no longer have the option to aggregate resources as this occurs by default now. - `DD_TRACE_OBFUSCATION_QUERY_STRING_PATTERN`: replaced by `DD_TRACE_OBFUSCATION_QUERY_STRING_REGEXP`. Additionally, the `pep562` dependency and references to it have been removed as it is no longer needed after dropping support for Python < 3.7. Note that `DD_CALL_BASIC_CONFIG` and `DD_LOG_FORMAT` are removed in #6612 which includes a subtle change in functionality. ## Checklist - [x] Change(s) are motivated and described in the PR description. - [x] Testing strategy is described if automated tests are not included in the PR. - [x] Risk is outlined (performance impact, potential for breakage, maintainability, etc). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed. If no release note is required, add label `changelog/no-changelog`. - [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/)). - [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) ## Reviewer Checklist - [x] Title is accurate. - [x] No unnecessary changes are introduced. - [x] Description motivates each change. - [x] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes unless absolutely necessary. - [x] Testing strategy adequately addresses listed risk(s). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] Release note makes sense to a user of the library. - [x] Reviewer has explicitly acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment. - [x] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
552ce71
to
b8e4221
Compare
@mabdinur great work! Can you add a release note for this? |
I updated the PR scope from |
c9bcb33
to
44a87f8
Compare
BenchmarksBenchmark execution time: 2023-09-08 16:28:38 Comparing candidate commit 6c86ea3 in PR branch Found 4 performance improvements and 7 performance regressions! Performance is the same for 93 metrics, 0 unstable metrics. scenario:otelspan-start
scenario:otelspan-start-finish
scenario:otelspan-start-finish-telemetry
scenario:sethttpmeta-all-disabled
scenario:sethttpmeta-obfuscation-disabled
scenario:sethttpmeta-obfuscation-regular-case-explicit-query
scenario:sethttpmeta-obfuscation-regular-case-implicit-query
scenario:sethttpmeta-obfuscation-send-querystring-disabled
scenario:sethttpmeta-obfuscation-worst-case-implicit-query
scenario:span-start-finish
scenario:span-start-finish-telemetry
|
… python 3.12 (#6859) - Make the instrumentation telemetry client compatible with python3.12: python/cpython#104826 - Start telemetry worker thread as early as possible. - Delays sending all telemetry events until app-started is queued. - Refactors tests to align with this new logic. - Telemetry events (metrics/logs/integrations) are queued as early as possible but these events are only sent when the trace agent writer is started. This **may** result in a memory leak if high cardinality telemetry metrics and logs are added in the future. This is not a concern right now. - [x] Change(s) are motivated and described in the PR description. - [x] Testing strategy is described if automated tests are not included in the PR. - [x] Risk is outlined (performance impact, potential for breakage, maintainability, etc). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed. If no release note is required, add label `changelog/no-changelog`. - [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/)). - [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) - [x] Title is accurate. - [x] No unnecessary changes are introduced. - [x] Description motivates each change. - [x] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes unless absolutely necessary. - [x] Testing strategy adequately addresses listed risk(s). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] Release note makes sense to a user of the library. - [x] Reviewer has explicitly acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment. - [x] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting) - [x] If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from `@DataDog/security-design-and-guidance`. - [x] This PR doesn't touch any of that. --------- Co-authored-by: Yun Kim <[email protected]> Co-authored-by: Tahir H. Butt <[email protected]> Co-authored-by: Gabriele N. Tornetta <[email protected]> Co-authored-by: Yun Kim <[email protected]> Co-authored-by: Emmett Butler <[email protected]> Co-authored-by: ZStriker19 <[email protected]>
… python 3.12 (#6859) ## Motivation - Make the instrumentation telemetry client compatible with python3.12: python/cpython#104826 ## Description - Start telemetry worker thread as early as possible. - Delays sending all telemetry events until app-started is queued. - Refactors tests to align with this new logic. ## Risk - Telemetry events (metrics/logs/integrations) are queued as early as possible but these events are only sent when the trace agent writer is started. This **may** result in a memory leak if high cardinality telemetry metrics and logs are added in the future. This is not a concern right now. ## Checklist - [x] Change(s) are motivated and described in the PR description. - [x] Testing strategy is described if automated tests are not included in the PR. - [x] Risk is outlined (performance impact, potential for breakage, maintainability, etc). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed. If no release note is required, add label `changelog/no-changelog`. - [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/)). - [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) ## Reviewer Checklist - [x] Title is accurate. - [x] No unnecessary changes are introduced. - [x] Description motivates each change. - [x] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes unless absolutely necessary. - [x] Testing strategy adequately addresses listed risk(s). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] Release note makes sense to a user of the library. - [x] Reviewer has explicitly acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment. - [x] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting) - [x] If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from `@DataDog/security-design-and-guidance`. - [x] This PR doesn't touch any of that. --------- Co-authored-by: Yun Kim <[email protected]> Co-authored-by: Tahir H. Butt <[email protected]> Co-authored-by: Gabriele N. Tornetta <[email protected]> Co-authored-by: Yun Kim <[email protected]> Co-authored-by: Emmett Butler <[email protected]> Co-authored-by: ZStriker19 <[email protected]>
## Description This PR ensures the instrumentation telemetry client is compatible with python3.12's threading module by: - Removing `telemetry_writer.add_count_metric` from `SpanAggregator.shutdown`. This ensures telemetry events are not queued on tracer shutdown. - Ensuring the telemetry writer thread is only started once per application. - Ensures the telemetry writer thread is disabled when an application is shutdown. ## Motivation The following change failed to support the telemetry client in python3.12: #6859. This PR will hopefully fix this 🤞. ### Reproduction for python3.12 runtime errors ``` docker run --rm -it python:3.12.0rc3 bash root@a1f3c1d307ec:/# pip install ddtrace==2.0.0rc2 root@a1f3c1d307ec:/# python -c "import ddtrace; _ = ddtrace.tracer.trace('foo'); raise Exception" ``` ### Output ``` Traceback (most recent call last): File "<string>", line 1, in <module> Exception Exception ignored in atexit callback: <bound method Tracer._atexit of <ddtrace.tracer.Tracer object at 0xffffbd967260>> Traceback (most recent call last): File "/usr/local/lib/python3.12/site-packages/ddtrace/tracer.py", line 293, in _atexit self.shutdown(timeout=self.SHUTDOWN_TIMEOUT) File "/usr/local/lib/python3.12/site-packages/ddtrace/tracer.py", line 1024, in shutdown processor.shutdown(timeout) File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/processor/trace.py", line 270, in shutdown self._queue_span_count_metrics("spans_created", "integration_name", None) File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/processor/trace.py", line 291, in _queue_span_count_metrics telemetry_writer.add_count_metric( File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/telemetry/writer.py", line 514, in add_count_metric if self.status == ServiceStatus.RUNNING or self.enable(): ^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/telemetry/writer.py", line 218, in enable self.start() File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/service.py", line 58, in start self._start_service(*args, **kwargs) File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/periodic.py", line 135, in _start_service self._worker.start() File "/usr/local/lib/python3.12/threading.py", line 971, in start _start_new_thread(self._bootstrap, ()) RuntimeError: can't create new thread at interpreter shutdown ``` ## Risk This PR reverts an optimization that ensured telemetry span creation metrics were queued in batches of 100. Without this optimization we can expect a 2-5% increase to span creation and span finish. ## Checklist - [x] Change(s) are motivated and described in the PR description. - [x] Testing strategy is described if automated tests are not included in the PR. - [x] Risk is outlined (performance impact, potential for breakage, maintainability, etc). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed. If no release note is required, add label `changelog/no-changelog`. - [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/)). - [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) ## Reviewer Checklist - [x] Title is accurate. - [x] No unnecessary changes are introduced. - [x] Description motivates each change. - [x] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes unless absolutely necessary. - [x] Testing strategy adequately addresses listed risk(s). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] Release note makes sense to a user of the library. - [x] Reviewer has explicitly acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment. - [x] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting) - [x] If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from `@DataDog/security-design-and-guidance`. - [x] This PR doesn't touch any of that.
## Description This PR ensures the instrumentation telemetry client is compatible with python3.12's threading module by: - Removing `telemetry_writer.add_count_metric` from `SpanAggregator.shutdown`. This ensures telemetry events are not queued on tracer shutdown. - Ensuring the telemetry writer thread is only started once per application. - Ensures the telemetry writer thread is disabled when an application is shutdown. ## Motivation The following change failed to support the telemetry client in python3.12: #6859. This PR will hopefully fix this 🤞. ### Reproduction for python3.12 runtime errors ``` docker run --rm -it python:3.12.0rc3 bash root@a1f3c1d307ec:/# pip install ddtrace==2.0.0rc2 root@a1f3c1d307ec:/# python -c "import ddtrace; _ = ddtrace.tracer.trace('foo'); raise Exception" ``` ### Output ``` Traceback (most recent call last): File "<string>", line 1, in <module> Exception Exception ignored in atexit callback: <bound method Tracer._atexit of <ddtrace.tracer.Tracer object at 0xffffbd967260>> Traceback (most recent call last): File "/usr/local/lib/python3.12/site-packages/ddtrace/tracer.py", line 293, in _atexit self.shutdown(timeout=self.SHUTDOWN_TIMEOUT) File "/usr/local/lib/python3.12/site-packages/ddtrace/tracer.py", line 1024, in shutdown processor.shutdown(timeout) File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/processor/trace.py", line 270, in shutdown self._queue_span_count_metrics("spans_created", "integration_name", None) File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/processor/trace.py", line 291, in _queue_span_count_metrics telemetry_writer.add_count_metric( File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/telemetry/writer.py", line 514, in add_count_metric if self.status == ServiceStatus.RUNNING or self.enable(): ^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/telemetry/writer.py", line 218, in enable self.start() File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/service.py", line 58, in start self._start_service(*args, **kwargs) File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/periodic.py", line 135, in _start_service self._worker.start() File "/usr/local/lib/python3.12/threading.py", line 971, in start _start_new_thread(self._bootstrap, ()) RuntimeError: can't create new thread at interpreter shutdown ``` ## Risk This PR reverts an optimization that ensured telemetry span creation metrics were queued in batches of 100. Without this optimization we can expect a 2-5% increase to span creation and span finish. ## Checklist - [x] Change(s) are motivated and described in the PR description. - [x] Testing strategy is described if automated tests are not included in the PR. - [x] Risk is outlined (performance impact, potential for breakage, maintainability, etc). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed. If no release note is required, add label `changelog/no-changelog`. - [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/)). - [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) ## Reviewer Checklist - [x] Title is accurate. - [x] No unnecessary changes are introduced. - [x] Description motivates each change. - [x] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes unless absolutely necessary. - [x] Testing strategy adequately addresses listed risk(s). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] Release note makes sense to a user of the library. - [x] Reviewer has explicitly acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment. - [x] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting) - [x] If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from `@DataDog/security-design-and-guidance`. - [x] This PR doesn't touch any of that. (cherry picked from commit 065784b)
Backport 065784b from #7043 to 2.0. ## Description This PR ensures the instrumentation telemetry client is compatible with python3.12's threading module by: - Ensuring the telemetry writer thread is only started once per application. - Ensures the telemetry writer thread is disabled when an application is shutdown. - Ensures telemetry metrics are queued `SpanAggregator.shutdown` without restarting the telemetry writer thread. ## Motivation The following change failed to support the telemetry client in python3.12: #6859. This PR will hopefully fix this 🤞. ### Reproduction for python3.12 runtime errors ``` docker run --rm -it python:3.12.0rc3 bash root@a1f3c1d307ec:/# pip install ddtrace==2.0.0rc2 root@a1f3c1d307ec:/# python -c "import ddtrace; _ = ddtrace.tracer.trace('foo'); raise Exception" ``` ### Output ``` Traceback (most recent call last): File "<string>", line 1, in <module> Exception Exception ignored in atexit callback: <bound method Tracer._atexit of <ddtrace.tracer.Tracer object at 0xffffbd967260>> Traceback (most recent call last): File "/usr/local/lib/python3.12/site-packages/ddtrace/tracer.py", line 293, in _atexit self.shutdown(timeout=self.SHUTDOWN_TIMEOUT) File "/usr/local/lib/python3.12/site-packages/ddtrace/tracer.py", line 1024, in shutdown processor.shutdown(timeout) File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/processor/trace.py", line 270, in shutdown self._queue_span_count_metrics("spans_created", "integration_name", None) File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/processor/trace.py", line 291, in _queue_span_count_metrics telemetry_writer.add_count_metric( File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/telemetry/writer.py", line 514, in add_count_metric if self.status == ServiceStatus.RUNNING or self.enable(): ^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/telemetry/writer.py", line 218, in enable self.start() File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/service.py", line 58, in start self._start_service(*args, **kwargs) File "/usr/local/lib/python3.12/site-packages/ddtrace/internal/periodic.py", line 135, in _start_service self._worker.start() File "/usr/local/lib/python3.12/threading.py", line 971, in start _start_new_thread(self._bootstrap, ()) RuntimeError: can't create new thread at interpreter shutdown ``` ## Risk ~~This PR reverts an optimization that ensured telemetry span creation metrics were queued in batches of 100. Without this optimization we can expect a 5-10% increase to span creation and span finish.~~ ## Checklist - [x] Change(s) are motivated and described in the PR description. - [x] Testing strategy is described if automated tests are not included in the PR. - [x] Risk is outlined (performance impact, potential for breakage, maintainability, etc). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed. If no release note is required, add label `changelog/no-changelog`. - [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/)). - [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) ## Reviewer Checklist - [x] Title is accurate. - [x] No unnecessary changes are introduced. - [x] Description motivates each change. - [x] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes unless absolutely necessary. - [x] Testing strategy adequately addresses listed risk(s). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] Release note makes sense to a user of the library. - [x] Reviewer has explicitly acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment. - [x] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting) - [x] If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from `@DataDog/security-design-and-guidance`. - [x] This PR doesn't touch any of that. Co-authored-by: Munir Abdinur <[email protected]>
Motivation
Description
Risk
Checklist
changelog/no-changelog
.Reviewer Checklist
@DataDog/security-design-and-guidance
.