
Commit fd1d124

authored
feat: add guide for trace support (#2212)
The metrics doc is already very long, so tracing is documented in a separate guide, since the trace details are expected to grow over time.
1 parent a347fb3 commit fd1d124

2 files changed

Lines changed: 44 additions & 2 deletions


mkdocs.yml

Lines changed: 4 additions & 2 deletions
@@ -73,7 +73,9 @@ nav:
 - Rollout:
 - Adapter Rollout: guides/adapter-rollout.md
 - InferencePool Rollout: guides/inferencepool-rollout.md
-- Metrics and Observability: guides/metrics-and-observability.md
+- Observability:
+- Metrics: guides/metrics-and-observability.md
+- Traces: guides/trace.md
 - Configuration Guide:
 - Configuring the EndPoint Picker via configuration YAML file: guides/epp-configuration/config-text.md
 - Prefix Cache Aware Plugin: guides/epp-configuration/prefix-aware.md
@@ -85,7 +87,7 @@ nav:
 - Conformance Tests: guides/conformance-tests.md
 - Performance:
 - Benchmark: performance/benchmark/index.md
-- Advanced Benchmarking Configs:
+- Advanced Benchmarking Configs:
 - Prefix Cache Aware: performance/benchmark/advanced-configs/prefix-cache-aware.md
 - Decode Heavy Workload: performance/benchmark/advanced-configs/decode-heavy-workload.md
 - Prefill Heavy Workload: performance/benchmark/advanced-configs/prefill-heavy-workload.md

site-src/guides/trace.md

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
# Trace

This guide describes the current state of tracing support and how to use it.

## Requirements

Enable the tracing feature with the following Helm options:

```yaml
inferenceExtension:
  tracing:
    enabled: true
    otelExporterEndpoint: "http://localhost:4317"
    sampling:
      sampler: "parentbased_traceidratio"
      samplerArg: "0.1"
```

- `otelExporterEndpoint`: The endpoint of your OpenTelemetry collector.
- `sampler`: Currently, only `parentbased_traceidratio` is supported. This sampler respects the parent span's sampling decision and applies the configured ratio to root spans.
- `samplerArg`: The base sampling rate for new traces, in the range [0.0, 1.0]. For example, `"0.1"` samples 10% of new traces.

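To make the sampler semantics concrete, here is a minimal Python sketch of a parent-based, trace-ID-ratio decision. It is an illustration only, not the gateway's or the OpenTelemetry SDK's implementation: the function name `should_sample` and the choice of comparing the low 64 bits of the trace ID against a bound are assumptions, and real SDKs differ in exactly which bits they compare.

```python
from typing import Optional

_LOW_64_BITS = (1 << 64) - 1


def should_sample(trace_id: int, sampler_arg: str,
                  parent_sampled: Optional[bool] = None) -> bool:
    """Simplified parent-based trace-ID-ratio sampling decision.

    trace_id       -- 128-bit trace ID as an integer
    sampler_arg    -- ratio as a string, mirroring `samplerArg` (e.g. "0.1")
    parent_sampled -- the parent's decision, or None for a root span
    """
    # A span with a parent simply inherits the parent's decision.
    if parent_sampled is not None:
        return parent_sampled

    ratio = float(sampler_arg)
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("samplerArg must be within [0.0, 1.0]")

    # Root span: keep the trace if its ID falls below ratio * 2^64.
    # The decision is deterministic for a given trace ID.
    bound = int(ratio * (1 << 64))
    return (trace_id & _LOW_64_BITS) < bound
```

With `sampler_arg="0.1"`, roughly one in ten uniformly random trace IDs falls below the bound, while any request that arrives with an already sampled (or unsampled) parent keeps that decision regardless of the ratio.
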
## Span Coverage

Currently, the inference gateway covers the entry point of the external processing request.

- **Tracer Name**: `gateway-api-inference-extension`
- **Span Name**: `gateway.request`

This span is the root span covering the entire lifecycle of an external processing request from Envoy, including header and body processing, scheduling decisions, and response handling.

## Attributes

### Span Attributes

Currently, the `gateway.request` span does not include custom attributes. It primarily serves as a container for the request's execution time and provides context for child spans in downstream services.

## Context Propagation

The inference gateway supports distributed tracing by propagating the trace context to downstream services (e.g., model servers).
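
This guide does not state which wire format the gateway uses for propagation. Assuming the W3C Trace Context `traceparent` header, which is OpenTelemetry's default propagator, the following Python sketch shows what a downstream service would receive and how it could be read. The helper names `format_traceparent` and `parse_traceparent` are hypothetical, and the parser is simplified to version `00` headers.

```python
import re
from typing import Optional

# version "-" trace-id (32 hex) "-" parent-id (16 hex) "-" flags (2 hex)
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def format_traceparent(trace_id: int, span_id: int, sampled: bool) -> str:
    """Build a version-00 traceparent header value."""
    flags = 0x01 if sampled else 0x00
    return f"00-{trace_id:032x}-{span_id:016x}-{flags:02x}"


def parse_traceparent(header: str) -> Optional[dict]:
    """Parse a traceparent value; return None if it is malformed."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version "ff" and all-zero IDs are invalid per the W3C specification.
    if version == "ff" or int(trace_id, 16) == 0 or int(span_id, 16) == 0:
        return None
    return {
        "trace_id": int(trace_id, 16),
        "span_id": int(span_id, 16),
        "sampled": bool(int(flags, 16) & 0x01),
    }
```

Under this assumption, a model server that parses the incoming header can start its own spans as children of `gateway.request`, and the `sampled` flag carries the gateway's sampling decision downstream.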
