OpenTelemetry compatibility through new collector #3500

@lfagliano

Feature description

Hello!

The problem

Today, DLT sort of lacks native integration with OpenTelemetry, an industry standard for observability, especially in cloud-native infrastructure and open-source projects. Teams running DLT in production alongside other instrumented services cannot easily:

  • Correlate DLT pipeline execution with upstream triggers (API calls, scheduled jobs)
  • Send pipeline metrics to their existing observability stack (Datadog, Grafana, Honeycomb, etc.)
  • Get distributed traces that include DLT as part of a larger system view

However, I say 'sort of' because DLT was designed in a way that aligns quite well with OpenTelemetry. For instance, steps resemble spans in nature. This is even more evident from the fact that DLT also supports Sentry, which is closely related to OTel. Because of that, I think adding OpenTelemetry support is neither a complicated task nor an intrusive one.

Are you a dlt user?

Yes, I run dlt in production.

Use case

Benefits

🔍 End-to-end visibility across distributed systems
📊 Pipeline metrics in existing dashboards/alerting
🧩 Works with any OTEL-compatible backend (vendor-neutral)
⚡ Non-intrusive: optional dependency, extends existing Collector pattern

Proposed solution

Add an OpenTelemetryCollector that leverages DLT's existing trace lifecycle hooks (on_start_trace, on_start_trace_step, etc.) to emit OTEL spans and metrics.

This is a natural fit because DLT already has a span-like architecture:

  • on_start_trace → root span
  • on_start_trace_step → child spans for extract/normalize/load
  • on_end_trace_step → record metrics, end span

No changes to core pipeline logic are required, just a new Collector implementation.
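The hook-to-span mapping above can be sketched roughly as follows. This is a minimal illustration, not the collector from the PR: `SketchSpan` is a hypothetical stand-in for an OTel span (it does not use the OpenTelemetry SDK), the hook argument lists are simplified assumptions rather than DLT's real signatures, and `on_end_trace` is assumed to be among the hooks covered by "etc.".

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SketchSpan:
    """Hypothetical stand-in for an OTel span; not the OpenTelemetry API."""
    name: str
    parent: Optional["SketchSpan"] = None
    attributes: Dict[str, object] = field(default_factory=dict)
    start: float = field(default_factory=time.monotonic)
    end: Optional[float] = None

    def finish(self) -> None:
        self.end = time.monotonic()


class SketchOtelCollector:
    """Maps trace lifecycle hooks to one root span plus one child span per step.

    Hook names follow the issue text; the arguments are simplified assumptions.
    """

    def __init__(self) -> None:
        self.root: Optional[SketchSpan] = None
        self.open_steps: Dict[str, SketchSpan] = {}
        self.finished: List[SketchSpan] = []

    def on_start_trace(self, pipeline_name: str) -> None:
        # Root span covers the whole pipeline run.
        self.root = SketchSpan(name=f"pipeline:{pipeline_name}")

    def on_start_trace_step(self, step: str) -> None:
        # Child span for extract / normalize / load, parented to the root.
        self.open_steps[step] = SketchSpan(name=step, parent=self.root)

    def on_end_trace_step(self, step: str, metrics: Dict[str, int]) -> None:
        # Record step metrics as span attributes, then end the span.
        span = self.open_steps.pop(step)
        span.attributes.update(metrics)
        span.finish()
        self.finished.append(span)

    def on_end_trace(self) -> None:
        # Assumed closing hook: end the root span once all steps are done.
        if self.root is not None:
            self.root.finish()
            self.finished.append(self.root)


# Simulate one pipeline run driving the hooks in order.
collector = SketchOtelCollector()
collector.on_start_trace("demo")
for step_name in ("extract", "normalize", "load"):
    collector.on_start_trace_step(step_name)
    collector.on_end_trace_step(step_name, {"rows": 10})
collector.on_end_trace()
span_names = [s.name for s in collector.finished]
```

In a real implementation the stand-in span would presumably be replaced by spans from an OpenTelemetry tracer, with the step metrics also emitted through OTel metric instruments.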

This is something we have been exploring at Vandebron to further increase the observability of our DLT pipelines, and we have managed to apply it to them. In this PR I add a proposed solution: a new collector that extends the LogCollector and lets us work with OpenTelemetry.

This gives us (in this case, with Tempo in Grafana) traces for root cause analysis

Image

and metrics (rows, tables, and pipeline runs) plus trace attributes for observability and monitoring

Image

Related issues

No response
