Contributor

@sfleen sfleen commented Sep 30, 2025

The current control plane tracing relies on the linkerd-jaeger extension, and does not work when using the native tracing configuration.

This removes the previous configs and adds a new control plane tracing config that mirrors the existing proxy tracing configs.

The previous configuration was meant entirely for internal testing purposes, and shouldn't be subject to any breaking change guarantees.

This clarifies in the tracing documentation that the trace collector must be meshed, and spells out how the service account name for the collector should be set.

Signed-off-by: Scott Fleener <[email protected]>
This separates out the service account namespace from the name to make it clearer how to set the correct service account for the trace collector.

Signed-off-by: Scott Fleener <[email protected]>
@sfleen sfleen requested a review from a team as a code owner September 30, 2025 17:58
@sfleen sfleen force-pushed the sfleen/trace-control-plane branch from 839876c to 7f43cf4 Compare September 30, 2025 18:00
// PodDisruptionBudget contains the fields to set the PDB
PodDisruptionBudget struct {
-	MaxUnavailable int `json:"maxUnavailable"`
+	MaxUnavailable string `json:"maxUnavailable"`
Member

This is a little surprising. Why is this necessary?

Contributor Author

This field can also be a percentage value, which can't be parsed as an int. Not sure if there's a better way to encode this.
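
One possible alternative to a plain string, sketched under assumptions: a small type that accepts either a JSON number or a string such as "25%". The `IntOrPercent` type, the `pdb` struct, and `parsePDB` are hypothetical names for illustration and are not part of the linkerd2 charts package. Kubernetes itself models this kind of field with `intstr.IntOrString` from k8s.io/apimachinery; whether that type fits the chart values struct is not settled in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// IntOrPercent is a hypothetical type that accepts either a JSON number
// such as 1 or a JSON string such as "25%".
type IntOrPercent struct {
	raw string
}

// UnmarshalJSON tries an int first, then falls back to a string.
func (v *IntOrPercent) UnmarshalJSON(data []byte) error {
	var n int
	if err := json.Unmarshal(data, &n); err == nil {
		v.raw = fmt.Sprintf("%d", n)
		return nil
	}
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return fmt.Errorf("maxUnavailable must be an int or a string: %w", err)
	}
	v.raw = s
	return nil
}

// String returns the value as text, whichever way it was encoded.
func (v IntOrPercent) String() string { return v.raw }

// pdb mirrors the shape of the PodDisruptionBudget values struct, for illustration only.
type pdb struct {
	MaxUnavailable IntOrPercent `json:"maxUnavailable"`
}

// parsePDB decodes a JSON document and returns the maxUnavailable value as text.
func parsePDB(doc string) (string, error) {
	var p pdb
	if err := json.Unmarshal([]byte(doc), &p); err != nil {
		return "", err
	}
	return p.MaxUnavailable.String(), nil
}

func main() {
	for _, doc := range []string{`{"maxUnavailable": 1}`, `{"maxUnavailable": "25%"}`} {
		got, err := parsePDB(doc)
		fmt.Println(got, err)
	}
}
```

The trade-off is extra code in exchange for keeping integer values valid in existing values files; the plain `string` field in the diff is simpler.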

# -- Service account namespace for the trace collector. If there's no explicitly set service account for the
# trace collector, this should be set to the namespace of the deployment/statefulset/daemonset of the trace
# collector instead.
serviceAccountNamespace: ""
Member

Suggested change
- serviceAccountNamespace: ""
+ namespace: ""

blegh, sorry to yakshave this, but we are probably better off just calling this 'namespace'

Contributor Author

WDYT about "name" and "namespace" for the fields? That lets us leave the old one in with the same semantics to avoid a breaking change, and have a less confusing name since "Service account" isn't exactly the right name for this.

Member

Sorry, I think I'm partial to serviceAccountName and namespace here:

  • these are how these fields are named on pod spec/metadata resources, so it's consistent
  • the 'name' is ambiguous. i think it's clearer if we document that this is explicitly a serviceAccountName (as on the pod spec).
  • the namespace is more general -- it includes the workloads and the services blah blah blah

Comment on lines 295 to 296
# collector, this should be set to the name of the deployment/statefulset/daemonset of the trace collector
# instead.
Member

is this true? if the collector doesn't specify a service account, I thought it used the default service account for its namespace.

Contributor Author

From all of my testing, the value here is correct. However, I'm not entirely sure "service account" is the most accurate name, since this is actually the name of the mesh identity the collector's proxy uses (which is either the service account name or, if no service account is set, the deployment/etc. name).

Comment on lines 298 to 301
# -- Service account namespace for the trace collector. If there's no explicitly set service account for the
# trace collector, this should be set to the namespace of the deployment/statefulset/daemonset of the trace
# collector instead.
serviceAccountNamespace: ""
Member

Prior to this change, did the namespace need to be specified as part of the serviceAccountName? Is this breaking change necessary?

Furthermore, we're introducing a new, required value here so this breaking change will affect everyone who has proxy.tracing enabled.

Contributor Author

The namespace did need to be specified before this, in the format <service account>.<namespace>. I'd rather we have two separate fields with a clearer intent.

WDYT about #14557 (comment) as a better state?

Member

@adleong adleong Oct 1, 2025

In that case, can we leave serviceAccountName in the values.yaml but document it as deprecated? The docs should say that the name/namespace fields are preferred and that serviceAccountName still exists for backwards compatibility. Let's leave a paper trail of why there are two different configuration methods here.
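
A minimal sketch of the backwards-compatible lookup discussed above, assuming the split fields take precedence and the deprecated combined value is only a fallback. The `collectorIdentity` struct, its field names, and `resolve` are hypothetical and may not match the merged chart. The split is taken at the last dot on the assumption that namespace names cannot contain dots.

```go
package main

import (
	"fmt"
	"strings"
)

// collectorIdentity holds hypothetical values fields named after the
// options discussed in this thread.
type collectorIdentity struct {
	ServiceAccountName string // deprecated form: "<service account>.<namespace>"
	Name               string // preferred: identity name only
	Namespace          string // preferred: identity namespace only
}

// resolve returns the (name, namespace) pair, preferring the split fields
// and falling back to the deprecated combined field.
func resolve(c collectorIdentity) (string, string, error) {
	if c.Name != "" && c.Namespace != "" {
		return c.Name, c.Namespace, nil
	}
	if c.Name != "" || c.Namespace != "" {
		return "", "", fmt.Errorf("name and namespace must be set together")
	}
	// Split at the last dot so a dotted name stays intact.
	i := strings.LastIndex(c.ServiceAccountName, ".")
	if i <= 0 || i == len(c.ServiceAccountName)-1 {
		return "", "", fmt.Errorf("serviceAccountName %q is not in <service account>.<namespace> form", c.ServiceAccountName)
	}
	return c.ServiceAccountName[:i], c.ServiceAccountName[i+1:], nil
}

func main() {
	name, ns, err := resolve(collectorIdentity{ServiceAccountName: "collector.tracing"})
	fmt.Println(name, ns, err)
}
```

With this shape, existing installs that only set the combined value keep working, and the error messages give the deprecation a visible paper trail.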

# Configures tracing in the controllers and how traces are exported
tracing:
# -- Enables trace collection and export in the proxy
enable: false
Member

any point to having this boolean vs just looking at if collector.endpoints is empty?

Contributor Author

I vaguely remember @alpeb suggesting an explicit enable/disable field, but I don't remember the specifics of why.

Member

I can't remember either 😆
Looking through #13994, there was discussion around not using boolean values, and validating that tracing.enable is consistent with the contents of collector.endpoint in the proxy config below. Anyways, relying on collector.endpoints emptiness sounds like a simpler approach to me too, but would be good to use the same approach here and in the proxy section.
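
To make the two options concrete, here is a rough sketch of both: deriving the on/off state from the endpoint alone, and validating an explicit flag against the endpoint. The `tracingValues` struct, `tracingEnabled`, and `validate` are hypothetical names, and the example endpoint is made up; none of this is taken from the chart code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// tracingValues is a hypothetical, simplified view of the tracing config.
type tracingValues struct {
	Enable   bool
	Endpoint string
}

// tracingEnabled derives the on/off state from the endpoint alone,
// which is the simpler approach suggested in the thread.
func tracingEnabled(endpoint string) bool {
	return strings.TrimSpace(endpoint) != ""
}

// validate covers the other option: keep the explicit flag, but reject
// a config that enables tracing without any endpoint to export to.
func validate(v tracingValues) error {
	if v.Enable && !tracingEnabled(v.Endpoint) {
		return errors.New("tracing is enabled but no collector endpoint is set")
	}
	return nil
}

func main() {
	fmt.Println(validate(tracingValues{Enable: true}))
	fmt.Println(validate(tracingValues{Enable: true, Endpoint: "collector.tracing:4317"}))
}
```

Whichever option is chosen, applying the same check to the controller and proxy sections keeps the two consistent.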

# Conflicts:
#	cli/cmd/install_test.go
#	pkg/charts/linkerd2/values.go
@cratelyn cratelyn changed the title from "feat!(tracing): Improve control plane tracing configuration" to "feat(tracing)!: Improve control plane tracing configuration" Sep 30, 2025
@cratelyn
Member

if you will pardon the drive-by nitpick; i've renamed this to follow the conventional commit spec for breaking changes with a specified scope.
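
For reference, a small sketch of the header shape the rename follows, where the `!` sits immediately before the colon and after any scope. The regular expression and the `parseHeader` helper are an illustrative simplification, not an official grammar from the spec.

```go
package main

import (
	"fmt"
	"regexp"
)

// headerRE matches "type(scope)!: description", with the scope and the
// breaking-change marker both optional. This is a simplified pattern.
var headerRE = regexp.MustCompile(`^([A-Za-z]+)(?:\(([^()]+)\))?(!)?: (.+)$`)

// header is a hypothetical parsed form of a commit title.
type header struct {
	Type, Scope, Description string
	Breaking                 bool
}

// parseHeader reports whether the title matches the simplified pattern.
func parseHeader(s string) (header, bool) {
	m := headerRE.FindStringSubmatch(s)
	if m == nil {
		return header{}, false
	}
	return header{Type: m[1], Scope: m[2], Breaking: m[3] == "!", Description: m[4]}, true
}

func main() {
	h, ok := parseHeader("feat(tracing)!: Improve control plane tracing configuration")
	fmt.Println(h.Type, h.Scope, h.Breaking, ok)
}
```

Under this pattern the original title, with the `!` before the scope, does not match.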

@sfleen sfleen requested review from olix0r and adleong October 1, 2025 15:59
BUILD.md Outdated
flag.
[Distributed Tracing](https://opentracing.io/docs/overview/what-is-tracing/)
for development purposes. It can be enabled globally i.e Control plane
components and their proxies by using the
Member

nit: there's an extra space character after "components"

sfleen added 4 commits October 1, 2025 14:03
# Conflicts:
#	cli/cmd/install_test.go
#	pkg/charts/linkerd2/values.go
#	pkg/charts/linkerd2/values_test.go
Signed-off-by: Scott Fleener <[email protected]>
@sfleen sfleen changed the base branch from main to sfleen/tracing-docs October 1, 2025 18:23
sfleen added 5 commits October 1, 2025 14:37
# Conflicts:
#	cli/cmd/install_test.go
#	pkg/charts/linkerd2/values.go
Signed-off-by: Scott Fleener <[email protected]>
Signed-off-by: Scott Fleener <[email protected]>
Base automatically changed from sfleen/tracing-docs to main October 1, 2025 19:21
# Conflicts:
#	cli/cmd/install_test.go
#	pkg/charts/linkerd2/values.go
#	pkg/charts/linkerd2/values_test.go
@sfleen sfleen enabled auto-merge (squash) October 1, 2025 19:23
@sfleen sfleen merged commit 3b46333 into main Oct 1, 2025
69 of 72 checks passed
@sfleen sfleen deleted the sfleen/trace-control-plane branch October 1, 2025 19:54
5 participants