getsentry · AbhiPrasad · Jun 24, 2022 · Jun 22, 2022 · Jun 22, 2022 · Jun 22, 2022
diff --git a/src/components/sidebar.tsx b/src/components/sidebar.tsx
@@ -208,6 +208,7 @@ export default () => {
           <SidebarLink to="/sdk/performance/" title="Performance">
             <SidebarLink to="/sdk/performance/trace-context/">Trace Contexts</SidebarLink>
             <SidebarLink to="/sdk/performance/span-operations/">Span Operations</SidebarLink>
+            <SidebarLink to="/sdk/performance/dynamic-sampling-context/">Dynamic Sampling Context</SidebarLink>
           </SidebarLink>
           <SidebarLink to="/sdk/event-payloads/" title="Event Payloads">
             <SidebarLink to="/sdk/event-payloads/transaction/">

diff --git a/src/docs/sdk/performance/dynamic-sampling-context.mdx b/src/docs/sdk/performance/dynamic-sampling-context.mdx
@@ -0,0 +1,167 @@
+---
+title: "Dynamic Sampling Context (Experimental)"
+---
+
+<Alert level="warning">
+
+This page is under active development.
+Specifications are not final and subject to change.
+Anything that sounds fishy probably is - nothing is set in stone.
+Opening PRs to improve this page is therefore highly encouraged!
+
+</Alert>
+
+Trace sampling done through the <Link to="/sdk/performance/#tracessamplerate">`tracesSampleRate`</Link> or <Link to="/sdk/performance/#tracessampler">`tracesSampler`</Link> options in the SDKs has several drawbacks for users of Sentry SDKs:
+
+- Changing the sampling rate involves either redeploying applications (which is problematic for applications that are not updated automatically, e.g., mobile apps or physically distributed software) or building complex systems to dynamically fetch a sample rate.
+- Sampling decisions are based only on randomness.
+- Employing sampling rules, for example, based on event parameters, is very complex.
+- While writing rules for singular **transactions** is possible, enforcing them on entire **traces** is infeasible.
+
+The solution for these problems is **Dynamic Sampling**.
+Dynamic Sampling allows users to configure **sampling rules** directly in the Sentry interface. Important: Sampling rules may be applied to **entire traces** or to a **single transaction**.
+
+## High-Level Problem Statement
+
+### Ingest
+
+Implementing Dynamic Sampling comes with challenges, especially on the ingestion side of things.
+For Dynamic Sampling, we want to make sampling decisions for entire traces.
+However, to keep ingestion speedy, Relay only looks at singular transactions in isolation (as opposed to looking at whole traces).
+This means that we need the exact same decision basis for all transactions belonging to a trace.
+In other words, all transactions of a trace need to hold all of the information to make a sampling decision, and that **information needs to be the same across all transactions of the trace**.
+We call the information we base sampling decisions on **"Dynamic Sampling Context"** or **"DSC"**.
+As a mental model:
+The head transaction in a trace determines the Dynamic Sampling Context for all following transactions in that trace.
+No information can be changed, added or deleted after the first propagation.
+
+### SDKs
+
+SDKs are responsible for propagating **Dynamic Sampling Context** across all applications that are part of a trace.
+This involves:
+
+1. Collecting the information that makes up the DSC **xor** extracting the DSC from incoming requests.
+2. Propagating DSC to downstream SDKs.
+3. Sending the DSC to Sentry via the `trace` envelope header.
+
+Because there are quite a few things to keep in mind for DSC propagation and to avoid every SDK running into the same problems, we defined a [unified propagation mechanism](#unified-propagation-mechanism) (step-by-step instructions) that all SDK implementations should be able to follow.
+
+## Baggage
+
+We chose `baggage` as the propagation mechanism for DSC (see the [W3C Baggage specification](https://www.w3.org/TR/baggage/)).
+Baggage is a standard HTTP header containing URI-encoded key-value pairs.
+
+For the propagation of DSC, SDKs first read the DSC from the baggage header of incoming requests/messages.
+To propagate DSC to downstream SDKs/services, we create a baggage header (or modify an existing one) through HTTP request instrumentation.
+
+<Alert level="info">
+
+Other vendors might also be using the `baggage` header.
+If a `baggage` header already exists on an outgoing request, SDKs should aim to be good citizens by only **appending** Sentry values to the header.
+In the case that another vendor added Sentry values to an outgoing request, SDKs may overwrite those values.
+
+SDKs must not add other vendors' baggage from incoming requests to outgoing requests.
+Sentry SDKs only concern themselves with Sentry baggage.
+
+</Alert>
+
+The following is an example of what a baggage header containing Dynamic Sampling Context may look like:
+
+```
+baggage: other-vendor-value-1=foo;bar;baz, sentry-traceid=771a43a4192642f0b136d5159a501700, sentry-publickey=49d0f7386ad645858ae85020e393bef3, sentry-userid=Am%C3%A9lie, other-vendor-value-2=foo;bar;
+```
+
+See the [Payloads section](#payloads) for a complete list of key-value pairs that SDKs should propagate.
+
+## Payloads
+
+Dynamic Sampling Context is propagated via a baggage header and sent to Sentry via the `trace` envelope header.
+
+### Baggage-Header
+
+SDKs may set the following key-value pairs on baggage headers:
+
+- `sentry-traceid` - The original trace ID as generated by the SDK.
+- `sentry-publickey` - Public key as defined by the user via the DSN in the SDK options.
+- `sentry-samplerate` - Sample rate as defined by the user in the SDK options.
+- `sentry-release` - The release as defined by the user in the SDK options.
+- `sentry-environment` - The environment as defined by the user in the SDK options.
+- `sentry-transaction` - The name of the trace's origin transaction in unparameterized (raw) format.
+- `sentry-userid` - User ID as set by the user with <Link to="/sdk/unified-api/#scope">`scope.set_user`</Link>.
+- `sentry-usersegment` - User segment as set by the user with <Link to="/sdk/unified-api/#scope">`scope.set_user`</Link>.
+
+All of these values are required in the sense that, when they are known to the head SDK at the time of propagation, they must also be propagated.
+In any case, `sentry-traceid`, `sentry-publickey`, and `sentry-samplerate` should always be known to the SDK, so these values are strictly required.
+
+SDKs must set all of the keys in the form of "`sentry-[name]`".
+The prefix "`sentry-`" acts to identify key-value pairs set by Sentry SDKs.
+Additionally, we chose `[name]` to be written in lowercase without any separator characters such as underscores ( `_` ), that is, "snake case" with the underscores removed. This naming convention is the most language-agnostic.
+
+### Envelope Header
+
+Dynamic Sampling Context is transferred to Sentry through the `trace` envelope header.
+The value of this header corresponds directly to the definition of <Link to="/sdk/performance/trace-context/#trace-context">Trace Context</Link>.
+
+When a transaction is reported to Sentry, the Dynamic Sampling Context must be mapped to the `trace` envelope header in the following way:
+
+- `sentry-traceid` ➝ `trace_id`
+- `sentry-publickey` ➝ `public_key`
+- `sentry-samplerate` ➝ `sample_rate`
+- `sentry-release` ➝ `release`
+- `sentry-environment` ➝ `environment`
+- `sentry-transaction` ➝ `transaction`
+- `sentry-userid` ➝ `user.id`
+- `sentry-usersegment` ➝ `user.segment`
+
+## Unified Propagation Mechanism
+
+SDKs should follow these steps for any incoming and outgoing requests (in Python pseudo-code for illustrative purposes):
+
+```python
+def collect_dynamic_sampling_context():
+  # Placeholder: collects as many values for Dynamic Sampling Context as possible and returns a dict.
+  ...
+
+def has_sentry_value_in_baggage_header(request):
+  # Placeholder: returns True when at least one key in the baggage header of `request` starts with "sentry-", else False.
+  ...
+
+def on_incoming_request(request):
+  if has_header(request, "sentry-trace") and (not has_header(request, "baggage") or not has_sentry_value_in_baggage_header(request)):
+    # Request comes from an old SDK which doesn't support Dynamic Sampling Context yet
+    # --> we don't propagate baggage for this trace
+    transaction.baggage_locked = True
+    transaction.baggage = {}
+  elif has_header(request, "baggage") and has_sentry_value_in_baggage_header(request):
+    transaction.baggage_locked = True
+    transaction.baggage = baggage_header_to_dict(request.headers.baggage)
+
+def on_outgoing_request(request):
+  if not transaction.baggage_locked:
+    transaction.baggage_locked = True
+    if not transaction.baggage:
+      transaction.baggage = {}
+    transaction.baggage = merge_dicts(collect_dynamic_sampling_context(), transaction.baggage)
+
+  if has_header(request, "baggage"):
+    outgoing_baggage_dict = baggage_header_to_dict(request.headers.baggage)
+    merged_baggage_dict = merge_dicts(outgoing_baggage_dict, transaction.baggage)
+    merged_baggage_header = dict_to_baggage_header(merged_baggage_dict)
+    set_header(request, "baggage", merged_baggage_header)
+  else:
+    baggage_header = dict_to_baggage_header(transaction.baggage)
+    set_header(request, "baggage", baggage_header)
+```
+
+While there is no strict necessity for the `transaction.baggage_locked` flag yet, there is a future use case where we need it:
+We might want users to be able to set Dynamic Sampling Context values themselves.
+The flag becomes relevant after the first propagation, where Dynamic Sampling Context becomes immutable.
+When users attempt to set DSC afterwards, our SDKs should make this operation a no-op.
+
+## Considerations
+
+Todo:
+
+- Why baggage and not [trace context](https://www.w3.org/TR/trace-context/)?
+- Why must baggage be immutable before the second transaction has been started?
+- Why can't we just make the decision for the whole trace in Relay after the trace is complete?
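
Note on the "Baggage-Header" and "Envelope Header" payloads in the file above: the following is a minimal Python sketch, not part of this change, of how the `sentry-` entries of a `baggage` header value might be read and mapped onto the `trace` envelope header fields. The helper names `parse_sentry_baggage` and `to_trace_envelope_header` are hypothetical. The parser assumes list members separated by commas, optional `;` properties that are ignored, and URI-encoded values. All values are left as strings; whether `sample_rate` should be converted to a number is not decided here.

```python
from urllib.parse import unquote

# Mapping taken from the "Envelope Header" list: baggage key -> path in the `trace` header.
BAGGAGE_TO_TRACE = {
    "sentry-traceid": ("trace_id",),
    "sentry-publickey": ("public_key",),
    "sentry-samplerate": ("sample_rate",),
    "sentry-release": ("release",),
    "sentry-environment": ("environment",),
    "sentry-transaction": ("transaction",),
    "sentry-userid": ("user", "id"),
    "sentry-usersegment": ("user", "segment"),
}


def parse_sentry_baggage(header: str) -> dict:
    """Return only the `sentry-` key-value pairs of a baggage header value (hypothetical helper)."""
    result = {}
    for member in header.split(","):
        # Ignore optional properties after the first ";" (e.g. "foo;bar;baz").
        key_value = member.split(";", 1)[0]
        if "=" not in key_value:
            continue
        key, value = key_value.split("=", 1)
        key = key.strip()
        if key.startswith("sentry-"):
            result[key] = unquote(value.strip())
    return result


def to_trace_envelope_header(dsc: dict) -> dict:
    """Map DSC baggage keys onto (possibly nested) `trace` header fields (hypothetical helper)."""
    trace = {}
    for key, value in dsc.items():
        path = BAGGAGE_TO_TRACE.get(key)
        if path is None:
            continue  # unknown sentry- key: ignored in this sketch
        target = trace
        for part in path[:-1]:
            target = target.setdefault(part, {})
        target[path[-1]] = value
    return trace


# A header with comma-separated list members, modeled on the example in the Baggage section.
header = (
    "other-vendor-value-1=foo;bar;baz, "
    "sentry-traceid=771a43a4192642f0b136d5159a501700, "
    "sentry-publickey=49d0f7386ad645858ae85020e393bef3, "
    "sentry-userid=Am%C3%A9lie, other-vendor-value-2=foo;bar;"
)
trace_header = to_trace_envelope_header(parse_sentry_baggage(header))
```

In this sketch, entries from other vendors are dropped while reading, which matches the rule that Sentry SDKs only concern themselves with Sentry baggage. The nested `user` object is created only when a `sentry-userid` or `sentry-usersegment` entry is present.
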
diff --git a/src/docs/sdk/performance/trace-context.mdx b/src/docs/sdk/performance/trace-context.mdx
@@ -22,6 +22,7 @@ Regardless of the transport mechanism, the trace context is a JSON object with t
 
 - `trace_id` (string, required) - UUID V4 encoded as a hexadecimal sequence with no dashes (e.g. `771a43a4192642f0b136d5159a501700`) that is a sequence of 32 hexadecimal digits. This must match the trace id of the submitted transaction item.
 - `public_key` (string, required) - Public key from the DSN used by the SDK. It allows Sentry to sample traces spanning multiple projects, by resolving the same set of rules based on the starting project.
+- `sample_rate` (number, optional) - The sample rate as defined by the user on the SDK where the trace originated.
 - `release` (string, optional) - The release name as specified in client options, usually: `package@x.y.z+build`. _This should match the `release` attribute of the transaction event payload_.\*
 - `environment` - The environment name as specified in client options, for example `staging`. _This should match the `environment` attribute of the transaction event payload_.\*
 - `user` (object, optional) - A subset of the scope's user context containing the following fields: