6 changes: 4 additions & 2 deletions mkdocs.yml
@@ -73,7 +73,9 @@ nav:
- Rollout:
- Adapter Rollout: guides/adapter-rollout.md
- InferencePool Rollout: guides/inferencepool-rollout.md
- - Metrics and Observability: guides/metrics-and-observability.md
+ - Observability:
+   - Metrics: guides/metrics-and-observability.md
+   - Traces: guides/trace.md
- Configuration Guide:
- Configuring the EndPoint Picker via configuration YAML file: guides/epp-configuration/config-text.md
- Prefix Cache Aware Plugin: guides/epp-configuration/prefix-aware.md
@@ -85,7 +87,7 @@ nav:
- Conformance Tests: guides/conformance-tests.md
- Performance:
- Benchmark: performance/benchmark/index.md
- - Advanced Benchmarking Configs:
+ - Advanced Benchmarking Configs:
- Prefix Cache Aware: performance/benchmark/advanced-configs/prefix-cache-aware.md
- Decode Heavy Workload: performance/benchmark/advanced-configs/decode-heavy-workload.md
- Prefill Heavy Workload: performance/benchmark/advanced-configs/prefill-heavy-workload.md
40 changes: 40 additions & 0 deletions site-src/guides/trace.md
@@ -0,0 +1,40 @@
# Trace

This guide describes the current state of tracing in the inference gateway and how to use it.

## Requirements

Enable the tracing feature with the following Helm values:

```yaml
inferenceExtension:
  tracing:
    enabled: true
    otelExporterEndpoint: "http://localhost:4317"
    sampling:
      sampler: "parentbased_traceidratio"
      samplerArg: "0.1"
```

- `otelExporterEndpoint`: Points to your OpenTelemetry collector endpoint.
- `sampler`: Currently, only `parentbased_traceidratio` is supported. This sampler respects the parent span's sampling decision and applies the configured ratio for root spans.
- `samplerArg`: Base sampling rate for new traces, range [0.0, 1.0]. For example, "0.1" enables 10% sampling.
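
To illustrate how a `parentbased_traceidratio` sampler reaches its decision, here is a minimal Python sketch. It is not the OpenTelemetry SDK's code: the function names `should_sample` and `observed_rate` and the exact threshold arithmetic are assumptions made for illustration. It shows only the two rules described above, namely that a request with a parent span inherits the parent's decision and that a root span is sampled at the configured ratio.

```python
import random
from typing import Optional

MAX_64 = 2**64  # number of distinct values in the low 64 bits of a trace ID


def should_sample(trace_id: int, parent_sampled: Optional[bool], ratio: float) -> bool:
    """Illustrative parent-based, trace-ID-ratio sampling decision (hypothetical helper)."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be within [0.0, 1.0]")
    # A span with a parent follows the parent's sampling decision.
    if parent_sampled is not None:
        return parent_sampled
    # Root span: compare the low 64 bits of the trace ID with a threshold,
    # so the same trace ID always produces the same decision.
    threshold = int(ratio * MAX_64)
    return (trace_id & (MAX_64 - 1)) < threshold


def observed_rate(ratio: float, n: int = 10_000, seed: int = 0) -> float:
    """Fraction of n random 128-bit root trace IDs that would be sampled."""
    rng = random.Random(seed)
    hits = sum(should_sample(rng.getrandbits(128), None, ratio) for _ in range(n))
    return hits / n
```

With `ratio=0.1`, `observed_rate(0.1)` should come out close to 0.1, which matches the "10% sampling" reading of `samplerArg: "0.1"`. Because the decision is derived from the trace ID rather than a fresh random draw, every service that applies the same ratio to the same trace reaches the same answer.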

## Span Coverage

Currently, the inference gateway creates a span only at the entry point of the external processing request.

- **Tracer Name**: `gateway-api-inference-extension`
- **Span Name**: `gateway.request`

This span is the root span covering the entire lifecycle of an external processing request from Envoy, including header and body processing, scheduling decisions, and response handling.

## Attributes

### Span Attributes

Currently, the `gateway.request` span does not include custom attributes. It primarily serves as a container for the request's execution time and provides context for child spans in downstream services.

## Context Propagation

The inference gateway supports distributed tracing by propagating the trace context to downstream services (e.g., model servers).
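
The guide does not say which propagation format is used. Assuming the W3C Trace Context `traceparent` header (a common choice with OpenTelemetry, but an assumption here), the sketch below shows what "propagating the trace context" amounts to: parse the incoming header, then forward the same trace ID and sampling flag with a new span ID. The names `TraceContext`, `parse_traceparent`, and `child_traceparent` are hypothetical and not part of the gateway.

```python
import re
from typing import NamedTuple, Optional

# version - trace-id (32 hex) - parent span-id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceContext(NamedTuple):
    trace_id: str
    span_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceContext]:
    """Parse a four-field traceparent header; return None if it is invalid."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version "ff" and all-zero IDs are invalid in W3C Trace Context.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return TraceContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


def child_traceparent(parent: TraceContext, new_span_id: str) -> str:
    """Build the header to send downstream: same trace ID, new span ID."""
    flags = "01" if parent.sampled else "00"
    return f"00-{parent.trace_id}-{new_span_id}-{flags}"
```

For example, parsing `00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01` and calling `child_traceparent` with a fresh span ID keeps the trace ID and the sampled flag, so a downstream model server can attach its spans to the same trace. This simplified parser accepts only the four-field form and rejects anything else.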