Skip to content
This repository was archived by the owner on Aug 14, 2024. It is now read-only.

Clarifications for the new trace context spec #410

Closed
bruno-garcia opened this issue Aug 23, 2021 · 8 comments · Fixed by #414
Closed

Clarifications for the new trace context spec #410

bruno-garcia opened this issue Aug 23, 2021 · 8 comments · Fixed by #414
Assignees
Labels
question Further information is requested

Comments

@bruno-garcia
Copy link
Member

bruno-garcia commented Aug 23, 2021

The new spec #407 was merged and while going through in details a few questions came up:

  • We'll only add the data to the envelope item header if:
    • We're creating an envelope from a transaction
    • Or a transaction is bound to the scope (e.g: trace info exists in the scope)
  • For this first iteration, we'll just add to envelope headers, no need to add to tracestate? Given we're not adding support to the backend to read it (at least in Java/.NET)?
  • Or are the first use cases using Python so we should send out the tracestate so they can use it there?
  • On the server side, we read some user data. Do we bind that to the scope and override whatever is there? Or what if the user code already resets user after our middleware runs. Data would just change.
  • Same with DSN public key, do we propagate that downstream? The server will be using its own, so what is its use?
  • Context freezing: When we're continuing a trace, that means it's frozen from the start? The scope data of that service doesn't play a role at all?
  • We only need the envelope item header stuff when the content of the envelope contains a transaction?
  • I assume this should be an opt-in, particularly while it's an experimental thing. What should we name the option? EnabledExperimentalTraceContextPropagation?
@bruno-garcia bruno-garcia added the question Further information is requested label Aug 23, 2021
@jan-auer
Copy link
Member

jan-auer commented Aug 23, 2021

  • We'll only add the data to the envelope item header if:
    • We're creating an envelope from a transaction
    • Or a transaction is bound to the scope (e.g: trace info exists in the scope)
  • We only need the envelope item header stuff when the content of the envelope contains a transaction?

I'll add that to the spec. Envelopes with only attachments should have it too, so that we can sample attachments of sampled transactions... TL;DR what you write in your first bullet point is correct: Add whenever there's a transaction.

  • For this first iteration, we'll just add to envelope headers, no need to add to tracestate? Given we're not adding support to the backend to read it (at least in Java/.NET)?
  • Or are the first use cases using Python so we should send out the tracestate so they can use it there?

Please do add sending of tracestate right away. We have support on Python backend already, which we require for initial testing. You do not have to add parsing or work on a Java server side implementation, yet.

  • On the server side, we read some user data. Do we bind that to the scope and override whatever is there? Or what if the user code already resets user after our middleware runs. Data would just change.
  • Context freezing: When we're continuing a trace, that means it's frozen from the start? The scope data of that service doesn't play a role at all?

See this section: "Such a context is immediately frozen in the SENT state and should no longer allow for modifications". So, you do not change the context when you inherit it from an incoming header. Only do that when the SDK creates it.

Reason for this is to enforce absolutely consistent sampling across the trace. Obviously has the disadvantage of only being able to sample by information available to the frontend SDK, but that's something that belongs in usage docs.

  • Same with DSN public key, do we propagate that downstream? The server will be using its own, so what is its use?

👍 Public key of the SDK that starts the trace needs to be propagate. This is what we use to resolve the same rules across the entire trace if they are different per project.

  • I assume this should be an opt-in, particularly while it's an experimental thing. What should we name the option? EnabledExperimentalTraceContextPropagation?

Yes, and excellent question, feel free to choose a name. Since this is experimental, we can also go with leading _ (as per JS convention etc). cc @lobsterkatie

TODOs

  • Rename to "Trace State"
  • Describe when to add envelope headers and why
  • Describe need for propagating DSN public key
  • Determine internal client option field to enable trace context propagation
  • Add a list of considered gotchas, e.g. changed sampling when frontends are involved

@rhcarvalho
Copy link
Contributor

  • What should we name the option? EnabledExperimentalTraceContextPropagation?

...TraceContext... is an unfortunate name because SDKs already had a "TraceContext" before this spec was merged.
That existing context is always propagated.

Perhaps this is more about 1) Adding or not trace to envelope header and 2) Adding tracestate to outgoing requests. Would the option affect both together? Or do we need finer granularity?
Depending on what the option does we can find a descriptive name.

Also, there is the related action of parsing a potential incoming tracestate header. Would that be covered by the same bool option?

@rhcarvalho
Copy link
Contributor

the disadvantage of only being able to sample by information available to the frontend SDK

I find this to be often a misconception. Some traces do start on the frontend, typically coming from manual instrumentation of a single page application, and some "accidentally" start on the frontend because of lack of backend->frontend propagation (requires manual work to render a <meta> tag).

In a lot of cases traces start on the backend when the first request is made, then it continues on the frontend and other (backend) servers.

Perhaps it was already discussed in another venue, but what may end up happening is that the same "backend transaction" will have to be "dynamically sampled" sometimes with information from a backend, sometimes from a frontend (and sometimes with data from a different backend server, depending on what service originated the trace).

@bruno-garcia
Copy link
Member Author

Perhaps it was already discussed in another venue, but what may end up happening is that the same "backend transaction" will have to be "dynamically sampled" sometimes with information from a backend, sometimes from a frontend (and sometimes with data from a different backend server, depending on what service originated the trace).

This sounds like a completely valid use case. I wonder if we should consider this already as opposed to simply do a "front-end data only flows through" approach.

@jan-auer
Copy link
Member

To keep the discussion focused, I'd suggest scoping this issue to clarification of the spec and discuss extended use cases in a different place.

...TraceContext... is an unfortunate name because SDKs already had a "TraceContext" before this spec was merged.
That existing context is always propagated.

Great point, let's rename to "Trace State" to disambiguate. This also brings it more in line with the W3C spec.

Would the option affect both together? Or do we need finer granularity?

Since this is a fully internal option that doesn't have to be documented and will likely not remain once this feature goes stable (in this way, but to discussed at a much later point), a single option should do the trick.

Perhaps it was already discussed in another venue, but what may end up happening is that the same "backend transaction" will have to be "dynamically sampled" sometimes with information from a backend, sometimes from a frontend (and sometimes with data from a different backend server, depending on what service originated the trace).

Correct, this is a known limitation of the current approach that we're going with. We'll be evaluating this next. For now this is how it works. I think it's a great point to mention in the doc though, to clarify that this was considered.

@jan-auer jan-auer self-assigned this Aug 25, 2021
@maciejwalkowiak
Copy link
Contributor

One more question - doesn't the base64ed header value need to be URI encoded at the very end? Base64 value has a new line after 76 character, which does not seem to work when placed as an HTTP header.

@jan-auer
Copy link
Member

jan-auer commented Aug 26, 2021

Base64 doesn't place newlines in the encoded string, at least not the abstract algorithm. I think you might be looking at the behavior of the command line tool base64 -w , which does that. Ensure that the encoded string doesn't have newlines, or otherwise strip them.

@lobsterkatie
Copy link
Member

lobsterkatie commented Aug 30, 2021

Just read through the spec and I, too, had a few questions/observations/thoughts. I know some of them we've discussed verbally but I'm including them here for visibility and posterity. (Also, I know this issue is closed, but it seemed like the most relevant place to put this.)

  1. An easy one: In the "joined by comma" example in the Tracestate Headers section, there shouldn't be a space between the entries. See https://www.w3.org/TR/trace-context/#combined-header-value.

    [UPDATE: Fixed in https://github.com/fix(tracing): Small fixes to tracestate docs about third-party data #420/.]

  2. Also easy: In the last snippet in that same section, with an example header containing both sentry and third-party data, the order needs to be reversed, since whatever key-value pair you're adding or modifying should end up the leftmost one. See https://www.w3.org/TR/trace-context/#mutating-the-tracestate-field.

    [UPDATE: Fixed in https://github.com/fix(tracing): Small fixes to tracestate docs about third-party data #420/.]

  3. Medium-easy: With the addition of user data and transaction name, we are now reliably over the character limit (256) listed in the spec. See https://www.w3.org/TR/trace-context/#value for the spec and feat(tracing): Add user data/transaction name to tracestate value sentry-python#1177 for example data.

    [UPDATE: Moved to https://github.com/tracing: Including all current data in tracestate breaks character limit #424.]

  4. Less easy: Concerns I and others have raised about including transaction name in the tracestate, to wit: The possibility you mentioned in the Freezing the Context section, in which a transaction name is changed after the tracestate data is frozen, feels like a big blocker. In Express, for example, the transaction name isn't finalized until the end of the transaction (too late for tracestate computation). Not only does this mean that it can't be filtered on, it also leaves open the possibility of the transaction name containing PII, in the form of data embedded in the URL path. Once this feature moves from "experimental" to more than that, If we do settle on including transaction name, we'll need to do a full audit to see which integrations have this problem, and if it's possible to get the correct transaction name sooner.

    [UPDATE: Moved to https://github.com/tracing: PII/correctness problems with including transaction in tracestate #425]

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants