Description
Which component is this feature for?
Traceloop SDK
🔖 Feature description
Add built-in support for W3C TraceContext propagation across agent service boundaries, enabling unified distributed traces in multi-agent (A2A) architectures where independent agent services communicate over HTTP.
Current state: traceloop-sdk provides excellent in-process tracing via decorators (@workflow, @task, @agent), but trace context is confined to the local process via Python ContextVar. When Agent A calls Agent B over HTTP, the trace breaks -- each agent produces an isolated trace with its own trace_id.
Proposed additions:
- Configure W3C propagators during `Traceloop.init()`:

```python
from opentelemetry.propagate import set_global_textmap
from opentelemetry.propagators.composite import CompositeTextMapPropagator
from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator
from opentelemetry.baggage.propagation import W3CBaggagePropagator

# During Traceloop.init(), after the TracerProvider is set:
set_global_textmap(CompositeTextMapPropagator([
    TraceContextTextMapPropagator(),
    W3CBaggagePropagator(),
]))
```

- Provide helper utilities for inject/extract:
```python
from traceloop.sdk.propagation import inject_trace_context, extract_trace_context

# Calling agent -- inject into outgoing HTTP headers
@task(name="call-remote-agent")
async def call_remote_agent(payload: dict) -> dict:
    headers = inject_trace_context()  # Returns {"traceparent": "00-...", "tracestate": "..."}
    # httpx.post() is synchronous; use AsyncClient in async code
    async with httpx.AsyncClient() as client:
        response = await client.post("http://agent-b:8000/process", headers=headers, json=payload)
    return response.json()

# Receiving agent -- extract from incoming HTTP headers
from traceloop.sdk.propagation.middleware import TraceContextMiddleware

app = FastAPI()
app.add_middleware(TraceContextMiddleware)  # Auto-extracts traceparent from requests
```

- Result -- unified trace across agents:
```
Trace [abc123] -- single trace_id across 2 services
orchestrate (Agent A)
+-- call-remote-agent (Agent A, CLIENT)
    +-- POST /process (Agent B, SERVER) <-- remote parent
        +-- analyze (Agent B)
            +-- llm-call (Agent B)
```
🎤 Why is this feature needed?
The A2A (Agent-to-Agent) protocol is gaining adoption for multi-agent GenAI architectures where specialized agents communicate over HTTP. The A2A spec explicitly recommends W3C TraceContext for distributed tracing across agent boundaries.
The problem today:
- `traceloop-sdk` traces are process-local. When Agent A calls Agent B, two disconnected traces are created.
- Users cannot see end-to-end latency, token usage, or cost across a multi-agent workflow in their observability platform.
- Every team building multi-agent systems has to implement W3C TraceContext propagation themselves on top of `traceloop-sdk`.
Real-world impact:
In production multi-agent deployments exporting to platforms like Langfuse, Instana, Datadog, or Jaeger, users need:
- A single trace tree showing the full orchestration flow across all agents
- Cross-agent latency breakdown (which agent is the bottleneck?)
- Aggregated token usage and cost per end-to-end request
- Service dependency graphs (which agents call which?)
All of this works automatically once trace context propagates via traceparent headers -- the observability backends already support it. The missing piece is that traceloop-sdk doesn't configure the W3C propagators or provide helpers for cross-service context injection/extraction.
Note: This builds on top of the multi-exporter capability discussed in #3478. With multi-export + A2A propagation, users get unified cross-agent traces in multiple observability platforms simultaneously.
✌️ How do you aim to achieve this?
Based on investigation of the traceloop-sdk internals and OpenTelemetry Python SDK:
Step 1: Configure global propagators in Traceloop.init()
After `trace.set_tracer_provider()` is called, add:

```python
set_global_textmap(CompositeTextMapPropagator([
    TraceContextTextMapPropagator(),
    W3CBaggagePropagator(),
]))
```

This is a single-statement addition. Both `TraceContextTextMapPropagator` and `W3CBaggagePropagator` are already bundled with `opentelemetry-api` (a dependency of `traceloop-sdk`). No new dependencies required.
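For reference, the header this propagator reads and writes follows the W3C `traceparent` format (`<version>-<trace-id>-<parent-id>-<trace-flags>`). A stdlib-only sketch of its shape -- `make_traceparent` and `parse_traceparent` are illustrative helpers for this issue, not part of any SDK:

```python
import re
import secrets

def make_traceparent() -> str:
    """Build a W3C traceparent value: version 00, 16-byte trace-id,
    8-byte parent-id, sampled flag 01 -- the same shape that
    TraceContextTextMapPropagator injects."""
    trace_id = secrets.token_hex(16)   # 32 hex chars
    parent_id = secrets.token_hex(8)   # 16 hex chars
    return f"00-{trace_id}-{parent_id}-01"

TRACEPARENT_RE = re.compile(
    r"^(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})"
    r"-(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)

def parse_traceparent(value: str) -> dict:
    """Split a traceparent header into its four fields; reject malformed input."""
    m = TRACEPARENT_RE.match(value)
    if m is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    return m.groupdict()

header = make_traceparent()
fields = parse_traceparent(header)
```

A receiving agent that sees the same `trace_id` field in its incoming `traceparent` joins the caller's trace, which is exactly what produces the unified tree shown above.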
Step 2: Provide propagation helper utilities
```python
# traceloop/sdk/propagation/__init__.py
from opentelemetry import propagate

def inject_trace_context(carrier=None):
    """Inject current trace context into HTTP headers."""
    if carrier is None:
        carrier = {}
    propagate.inject(carrier)
    return carrier

def extract_trace_context(carrier):
    """Extract trace context from incoming HTTP headers."""
    return propagate.extract(carrier)
```

Step 3: Provide optional ASGI middleware
```python
# traceloop/sdk/propagation/middleware.py
from opentelemetry import context, propagate, trace

class TraceContextMiddleware:
    """ASGI middleware for automatic W3C TraceContext extraction."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        headers = {
            k.decode(): v.decode()
            for k, v in scope.get("headers", [])
        }
        remote_ctx = propagate.extract(headers)
        token = context.attach(remote_ctx)
        try:
            tracer = trace.get_tracer("traceloop.sdk")
            with tracer.start_as_current_span(
                f"{scope.get('method', '')} {scope.get('path', '')}",
                kind=trace.SpanKind.SERVER,
            ):
                await self.app(scope, receive, send)
        finally:
            context.detach(token)
```

Key design decisions:
- Zero new dependencies (uses OTel APIs already in the dependency tree)
- Fully backward compatible (propagation helpers are opt-in)
- Follows W3C standards (not proprietary headers)
- Works with any OTLP-compatible backend (Langfuse, Instana, Datadog, Jaeger, etc.)
- Propagator configuration in `init()` is transparent -- existing single-service users see no change
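The header-decoding half of the middleware can be exercised without any OTel dependency; a minimal, dependency-free ASGI sketch (`HeaderCaptureMiddleware` and `dummy_app` are hypothetical names for this illustration -- the real middleware would hand the decoded headers to `propagate.extract()`):

```python
import asyncio

class HeaderCaptureMiddleware:
    """Sketch of the extraction step: decode the raw ASGI header list
    and pick out the W3C traceparent header."""

    def __init__(self, app):
        self.app = app
        self.last_traceparent = None  # stored for inspection in this sketch

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            headers = {k.decode(): v.decode() for k, v in scope.get("headers", [])}
            self.last_traceparent = headers.get("traceparent")
        await self.app(scope, receive, send)

async def dummy_app(scope, receive, send):
    # Minimal ASGI app: respond 200 with an empty body
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b""})

async def demo():
    mw = HeaderCaptureMiddleware(dummy_app)
    sent = []

    async def send(message):
        sent.append(message)

    scope = {
        "type": "http",
        "method": "POST",
        "path": "/process",
        "headers": [(b"traceparent",
                     b"00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01")],
    }
    await mw(scope, None, send)
    return mw.last_traceparent, sent

traceparent, messages = asyncio.run(demo())
```

Note that ASGI delivers headers as a list of `(bytes, bytes)` pairs, which is why the middleware decodes them before lookup.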
🔄️ Additional Information
Performance impact: Negligible. propagate.inject() adds ~0.01ms and ~200 bytes per outgoing call. propagate.extract() adds ~0.02ms per incoming request. This is insignificant compared to HTTP round-trip latency (1-100ms) and LLM API call latency (200-5000ms).
Async context considerations: OTel context is stored in a Python ContextVar, so it flows correctly through await and asyncio.gather(). Each task created via asyncio.create_task() runs on a snapshot of the context taken at creation time, so context changes made inside a task (e.g., attaching an extracted remote context) do not propagate back to the caller -- a known Python/OTel behavior that should be documented.
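The ContextVar mechanics behind this can be sketched with the stdlib alone; `trace_ctx` here is a stand-in for the ContextVar that OTel uses internally, not a real SDK symbol:

```python
import asyncio
import contextvars

# Stand-in for the ContextVar that OTel uses to hold the active context
trace_ctx = contextvars.ContextVar("trace_ctx", default="no-trace")

async def child():
    # Each task runs on a snapshot of the caller's context, taken at task creation
    inherited = trace_ctx.get()
    trace_ctx.set("set-inside-task")  # isolated to this task's own copy
    return inherited

async def main():
    trace_ctx.set("trace-abc123")
    # Flows through plain await and asyncio.gather()
    via_gather = await asyncio.gather(child(), child())
    task = asyncio.create_task(child())
    via_task = await task
    # The set() inside the tasks did not leak back into this coroutine
    return via_gather, via_task, trace_ctx.get()

via_gather, via_task, after = asyncio.run(main())
```

This is PEP 567 behavior: tasks inherit a copy of the current context, and writes inside a task stay in that copy.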
Alternative approaches considered:
- Custom `X-Traceloop-*` headers: Rejected -- proprietary, not recognized by observability backends
- Requiring users to configure propagators themselves: Current state, but every multi-agent team has to figure this out independently
References:
- W3C Trace Context Specification
- W3C Baggage Specification
- A2A Protocol - Enterprise Features
- OpenTelemetry Context Propagation
- Related: 🚀 Feature: Support for Multiple OTLP Endpoints/Exporters (#3478)
👀 Have you spent some time to check if this feature request has been raised before?
- I checked and didn't find similar issue
Are you willing to submit PR?
Yes I am willing to submit a PR!