tracing: add execution tracing #138013

Draft
wants to merge 2 commits into
base: master
andrewbaptist
Contributor

This commit adds the ability to capture execution traces from the past
few seconds of execution when something seems wrong. Often when a timer
fires and we detect something is wrong, the relevant information is
already lost. The new flight recorder in Go
(golang/go#63185) maintains a ring buffer that
enables capturing these traces. This commit adds the capability to
capture traces but doesn't enable it anywhere.

There is a small performance cost of having the flight recorder always
enabled, so some performance testing is required to determine if we need
to protect this behind a cluster setting.

Epic: none

Release note: None

blathers-crl bot commented Dec 26, 2024

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Member

This change is Reviewable

@andrewbaptist andrewbaptist force-pushed the 2024-12-26-flight-recorder branch from 28602c6 to 51a1a72 Compare December 27, 2024 18:06
We have seen a number of slow local responses that appear to be due to
slow fsyncs. However, it is not clear whether the delay occurs inside our
process or outside it. This commit adds flight recorder snapshotting when
we see a slow response.

Epic: none

Release note: None