Feature: Support for A2A (Agent-to-Agent) Distributed Trace Context Propagation #3683

@mmphego

Description

Which component is this feature for?

Traceloop SDK

🔖 Feature description

Add built-in support for W3C TraceContext propagation across agent service boundaries, enabling unified distributed traces in multi-agent (A2A) architectures where independent agent services communicate over HTTP.

Current state: traceloop-sdk provides excellent in-process tracing via decorators (@workflow, @task, @agent), but trace context is confined to the local process via Python ContextVar. When Agent A calls Agent B over HTTP, the trace breaks -- each agent produces an isolated trace with its own trace_id.

Proposed additions:

  1. Configure W3C propagators during Traceloop.init():
from opentelemetry.propagate import set_global_textmap
from opentelemetry.propagators.composite import CompositeTextMapPropagator
from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator
from opentelemetry.baggage.propagation import W3CBaggagePropagator

# During Traceloop.init(), after TracerProvider is set:
set_global_textmap(CompositeTextMapPropagator([
    TraceContextTextMapPropagator(),
    W3CBaggagePropagator(),
]))
  2. Provide helper utilities for inject/extract:
import httpx

from traceloop.sdk.propagation import inject_trace_context, extract_trace_context

# Calling agent -- inject into outgoing HTTP headers
@task(name="call-remote-agent")
async def call_remote_agent(payload: dict) -> dict:
    headers = inject_trace_context()  # Returns {"traceparent": "00-...", "tracestate": "..."}
    async with httpx.AsyncClient() as client:
        response = await client.post("http://agent-b:8000/process", headers=headers, json=payload)
    return response.json()

# Receiving agent -- extract from incoming HTTP headers
from fastapi import FastAPI

from traceloop.sdk.propagation.middleware import TraceContextMiddleware

app = FastAPI()
app.add_middleware(TraceContextMiddleware)  # Auto-extracts traceparent from requests
  3. Result -- unified trace across agents:
Trace [abc123] -- single trace_id across 2 services
  orchestrate (Agent A)
    +-- call-remote-agent (Agent A, CLIENT)
         +-- POST /process (Agent B, SERVER)    <-- remote parent
              +-- analyze (Agent B)
                   +-- llm-call (Agent B)
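
Both agents land in one trace because they share a trace_id carried in the traceparent header. The header's W3C layout can be sketched with a stdlib-only parser (a hypothetical helper for illustration, not part of the proposal):

```python
import re

# W3C traceparent layout: 2-hex-char version, 32-hex-char trace-id,
# 16-hex-char parent-id (span-id), 2-hex-char flags, dash-separated.
TRACEPARENT_RE = re.compile(
    r"^(?P<version>[0-9a-f]{2})-"
    r"(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<parent_id>[0-9a-f]{16})-"
    r"(?P<flags>[0-9a-f]{2})$"
)

def parse_traceparent(header: str) -> dict:
    """Split a traceparent header into its four fields; raise on malformed input."""
    match = TRACEPARENT_RE.match(header)
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    fields = match.groupdict()
    # The spec forbids all-zero trace-id and parent-id values.
    if fields["trace_id"] == "0" * 32 or fields["parent_id"] == "0" * 16:
        raise ValueError("all-zero trace-id or parent-id is invalid")
    return fields

fields = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
```

Every service that sees the same trace_id field reports spans into the same trace tree, which is all the observability backend needs to stitch the picture above together.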

🎤 Why is this feature needed?

The A2A (Agent-to-Agent) protocol is gaining adoption for multi-agent GenAI architectures where specialized agents communicate over HTTP. The A2A spec explicitly recommends W3C TraceContext for distributed tracing across agent boundaries.

The problem today:

  • traceloop-sdk traces are process-local. When Agent A calls Agent B, two disconnected traces are created.
  • Users cannot see end-to-end latency, token usage, or cost across a multi-agent workflow in their observability platform.
  • Every team building multi-agent systems has to implement W3C TraceContext propagation themselves on top of traceloop-sdk.

Real-world impact:

In production multi-agent deployments exporting to platforms like Langfuse, Instana, Datadog, or Jaeger, users need:

  • A single trace tree showing the full orchestration flow across all agents
  • Cross-agent latency breakdown (which agent is the bottleneck?)
  • Aggregated token usage and cost per end-to-end request
  • Service dependency graphs (which agents call which?)

All of this works automatically once trace context propagates via traceparent headers -- the observability backends already support it. The missing piece is that traceloop-sdk doesn't configure the W3C propagators or provide helpers for cross-service context injection/extraction.

Note: This builds on top of the multi-exporter capability discussed in #3478. With multi-export + A2A propagation, users get unified cross-agent traces in multiple observability platforms simultaneously.

✌️ How do you aim to achieve this?

Based on investigation of the traceloop-sdk internals and OpenTelemetry Python SDK:

Step 1: Configure global propagators in Traceloop.init()

After trace.set_tracer_provider() is called, add:

set_global_textmap(CompositeTextMapPropagator([
    TraceContextTextMapPropagator(),
    W3CBaggagePropagator(),
]))

This is a one-line addition. Both TraceContextTextMapPropagator and W3CBaggagePropagator are already bundled with opentelemetry-api (a dependency of traceloop-sdk). No new dependencies required.

Step 2: Provide propagation helper utilities

# traceloop/sdk/propagation/__init__.py

from opentelemetry import context, propagate

def inject_trace_context(carrier=None):
    """Inject current trace context into HTTP headers."""
    if carrier is None:
        carrier = {}
    propagate.inject(carrier)
    return carrier

def extract_trace_context(carrier):
    """Extract trace context from incoming HTTP headers."""
    return propagate.extract(carrier)
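
One usage note on the carrier argument: passing an existing dict lets callers merge trace headers into headers they already have (e.g. auth headers). A stdlib-only stand-in illustrating that contract (the fixed traceparent value is a stub, not real output):

```python
# Stand-in with the same signature as the proposed helper, so the
# carrier contract can be shown without opentelemetry installed.
def inject_trace_context(carrier=None):
    if carrier is None:
        carrier = {}
    # A real implementation delegates to opentelemetry.propagate.inject(carrier);
    # here a fixed traceparent is stubbed in for illustration.
    carrier["traceparent"] = "00-" + "ab" * 16 + "-" + "cd" * 8 + "-01"
    return carrier

# Existing headers are preserved alongside the injected trace headers.
headers = inject_trace_context({"Authorization": "Bearer <token>"})
```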

Step 3: Provide optional ASGI middleware

# traceloop/sdk/propagation/middleware.py

from opentelemetry import context, propagate, trace

class TraceContextMiddleware:
    """ASGI middleware for automatic W3C TraceContext extraction."""
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        headers = {
            k.decode(): v.decode()
            for k, v in scope.get("headers", [])
        }
        remote_ctx = propagate.extract(headers)
        token = context.attach(remote_ctx)
        try:
            tracer = trace.get_tracer("traceloop.sdk")
            with tracer.start_as_current_span(
                f"{scope.get('method', '')} {scope.get('path', '')}",
                kind=trace.SpanKind.SERVER,
            ):
                await self.app(scope, receive, send)
        finally:
            context.detach(token)
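
The extraction half of that middleware (decoding the ASGI byte-pair header list) can be exercised without OTel. A stdlib-only stand-in, with a hypothetical class name, that stashes the decoded traceparent where a real implementation would call propagate.extract():

```python
import asyncio

class TraceHeaderExtractor:
    """Stdlib-only stand-in for TraceContextMiddleware: decodes the ASGI
    header list of (bytes, bytes) pairs and records the incoming
    traceparent on the scope instead of attaching an OTel context."""
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            headers = {k.decode(): v.decode() for k, v in scope.get("headers", [])}
            scope["traceparent"] = headers.get("traceparent")
        await self.app(scope, receive, send)

seen = {}

async def inner_app(scope, receive, send):
    # Records what the middleware extracted for inspection.
    seen["traceparent"] = scope.get("traceparent")

scope = {
    "type": "http",
    "method": "POST",
    "path": "/process",
    "headers": [(b"traceparent", b"00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")],
}
asyncio.run(TraceHeaderExtractor(inner_app)(scope, None, None))
```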

Key design decisions:

  • Zero new dependencies (uses OTel APIs already in the dependency tree)
  • Fully backward compatible (propagation helpers are opt-in)
  • Follows W3C standards (not proprietary headers)
  • Works with any OTLP-compatible backend (Langfuse, Instana, Datadog, Jaeger, etc.)
  • Propagator configuration in init() is transparent -- existing single-service users see no change

🔄️ Additional Information

Performance impact: Negligible. propagate.inject() adds ~0.01ms and ~200 bytes per outgoing call. propagate.extract() adds ~0.02ms per incoming request. This is insignificant compared to HTTP round-trip latency (1-100ms) and LLM API call latency (200-5000ms).

Async context considerations: OTel context flows correctly through await and asyncio.gather(). With asyncio.create_task(), the task receives a copy of the context taken at creation time, so changes made afterwards in the parent (or inside the task) do not cross the task boundary -- a known Python/OTel subtlety that should be documented.
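
The snapshot behavior around asyncio.create_task() can be demonstrated with a plain ContextVar (OTel's context is backed by one); each task sees the context as it was at creation time:

```python
import asyncio
import contextvars

# Stand-in for OTel's context slot, which is also a ContextVar under the hood.
trace_id: contextvars.ContextVar[str] = contextvars.ContextVar("trace_id", default="unset")
results = {}

async def child(label: str) -> None:
    await asyncio.sleep(0)
    results[label] = trace_id.get()

async def main() -> None:
    trace_id.set("abc123")
    await child("awaited")                       # same context as the caller
    early = asyncio.create_task(child("early"))  # snapshot taken now: abc123
    trace_id.set("def456")                       # later change, invisible to `early`
    late = asyncio.create_task(child("late"))    # fresh snapshot: def456
    await asyncio.gather(early, late)

asyncio.run(main())
```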

Alternative approaches considered:

  • Custom X-Traceloop-* headers: Rejected -- proprietary, not recognized by observability backends
  • Requiring users to configure propagators themselves: Current state, but every multi-agent team has to figure this out independently

👀 Have you spent some time to check if this feature request has been raised before?

  • I checked and didn't find a similar issue

Are you willing to submit PR?

Yes I am willing to submit a PR!
