Default values for histogram bucket boundaries are oriented around milliseconds rather than seconds #5821

@jakegavin

Problem Statement

Histograms are commonly used for recording latencies. The default values for bucket boundaries are []float64{0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000} (code). These work well for latencies measured in milliseconds. However, the Prometheus documentation recommends using seconds rather than milliseconds as the unit. When latency is recorded in seconds with the default buckets, the vast majority of timings land in the single bucket between 0 and 5 seconds, which results in inaccurate histogram quantile calculations.
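To illustrate the problem, here is a minimal standard-library sketch that counts how a handful of latencies in seconds fall into the default boundaries. It does not use the SDK. The helper names `bucketIndex` and `countBuckets` and the sample latencies are made up for illustration, and it assumes each boundary is an inclusive upper bound.

```go
package main

import (
	"fmt"
	"sort"
)

// defaultBounds are the default boundaries quoted in this issue.
var defaultBounds = []float64{0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000}

// bucketIndex returns the index of the bucket v falls into, treating each
// boundary as an inclusive upper bound (an assumption of this sketch).
// Index len(bounds) is the overflow bucket.
func bucketIndex(bounds []float64, v float64) int {
	return sort.SearchFloat64s(bounds, v)
}

// countBuckets tallies how many values land in each bucket.
func countBuckets(bounds, values []float64) []int {
	counts := make([]int, len(bounds)+1)
	for _, v := range values {
		counts[bucketIndex(bounds, v)]++
	}
	return counts
}

func main() {
	// Hypothetical request latencies, recorded in seconds.
	latencies := []float64{0.004, 0.012, 0.040, 0.120, 0.300, 0.900, 2.0}
	fmt.Println(countBuckets(defaultBounds, latencies))
}
```

Every sample, from 4 ms to 2 s, ends up in the same bucket (index 1, between 0 and 5), so a quantile estimate has nothing to interpolate over except that one wide bucket.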

This is very similar to this issue in the .NET repo: open-telemetry/opentelemetry-dotnet#4797

Proposed Solution

opentelemetry-go could use a different set of default buckets when the histogram units are known to be seconds.

This was implemented in the .NET library here: open-telemetry/opentelemetry-dotnet#4820

Alternatives

The current workaround is to use the WithExplicitBucketBoundaries option on all histograms dealing in seconds.
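As a rough sketch of what that workaround involves, the snippet below derives seconds-oriented boundaries by dividing the current defaults by 1000 and compares how many distinct buckets a set of sample latencies occupies. It uses only the standard library. `secondsBounds`, `occupiedBuckets`, the instrument name, and the sample values are hypothetical, and scaling by 1000 is just one possible choice of boundaries. The comment shows where the result would be handed to `WithExplicitBucketBoundaries`.

```go
package main

import (
	"fmt"
	"sort"
)

// millisecondBounds are the current default boundaries quoted in this issue.
var millisecondBounds = []float64{0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000}

// secondsBounds converts millisecond-oriented boundaries to seconds.
func secondsBounds(ms []float64) []float64 {
	out := make([]float64, len(ms))
	for i, b := range ms {
		out[i] = b / 1000
	}
	return out
}

// occupiedBuckets reports how many distinct buckets the values fall into,
// treating each boundary as an inclusive upper bound.
func occupiedBuckets(bounds, values []float64) int {
	seen := make(map[int]bool)
	for _, v := range values {
		seen[sort.SearchFloat64s(bounds, v)] = true
	}
	return len(seen)
}

func main() {
	bounds := secondsBounds(millisecondBounds)

	// With the metric API, these would be supplied per instrument, e.g.:
	//   meter.Float64Histogram("request.duration", // hypothetical name
	//       metric.WithUnit("s"),
	//       metric.WithExplicitBucketBoundaries(bounds...))

	// Hypothetical request latencies, recorded in seconds.
	latencies := []float64{0.004, 0.012, 0.040, 0.120, 0.300, 0.900, 2.0}
	fmt.Println("default boundaries:", occupiedBuckets(millisecondBounds, latencies), "bucket(s) used")
	fmt.Println("seconds boundaries:", occupiedBuckets(bounds, latencies), "bucket(s) used")
}
```

The downside is that this has to be repeated for every histogram that records seconds, which is what the proposed change to the defaults would avoid.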

Prior Art

.NET issue: open-telemetry/opentelemetry-dotnet#4797
.NET solution: open-telemetry/opentelemetry-dotnet#4820

Additional Context

This would likely be a breaking change.

Labels: enhancement (new feature or request)