@shakuzen
Member

When a Timer/LongTaskTimer is registered without its builder, the DistributionStatisticConfig passed to registerMeterIfNecessary is necessarily the default configuration. Because that config is always the same, we can avoid creating and configuring a new DistributionStatisticConfig on every call by keeping a static default instance to use in these cases.

For this reason, performance-critical code should prefer the Timer/LongTaskTimer registration API that does not use the Builder.

The usage in DefaultMeterObservationHandler has been updated accordingly, which should reduce latency and allocations for all applications using it.
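
To illustrate the idea, here is a minimal, self-contained sketch. Every class and method name in it (`StatsConfig`, `SketchTimer`, `Registry`, `timerViaBuilder`) is a hypothetical stand-in rather than Micrometer's actual code; it only shows why passing a shared immutable default avoids per-call work when the timer already exists.

```java
// Simplified, hypothetical stand-ins for illustration; these are NOT Micrometer's classes.
import java.util.HashMap;
import java.util.Map;

class DefaultConfigSketch {

    /** Immutable settings; the boxed fields mimic nullable config values. */
    static final class StatsConfig {
        /** Shared default, created once and safe to reuse because it is immutable. */
        static final StatsConfig DEFAULT = new StatsConfig(null, null);

        final Double minimumExpectedValue;
        final Double maximumExpectedValue;

        StatsConfig(Double minimumExpectedValue, Double maximumExpectedValue) {
            this.minimumExpectedValue = minimumExpectedValue;
            this.maximumExpectedValue = maximumExpectedValue;
        }
    }

    static final class SketchTimer {
        final String name;
        final StatsConfig config;

        SketchTimer(String name, StatsConfig config) {
            this.name = name;
            this.config = config;
        }
    }

    static final class Registry {
        private final Map<String, SketchTimer> timers = new HashMap<>();
        int configsCreated; // counts configs built by the builder-style path

        /** Non-builder path: passes the shared default, so a lookup builds no config. */
        SketchTimer timer(String name) {
            return registerIfNecessary(name, StatsConfig.DEFAULT);
        }

        /** Builder-style path: builds a fresh config on every call, even if the timer exists. */
        SketchTimer timerViaBuilder(String name) {
            configsCreated++;
            return registerIfNecessary(name, new StatsConfig(null, null));
        }

        private SketchTimer registerIfNecessary(String name, StatsConfig config) {
            SketchTimer existing = timers.get(name);
            if (existing != null) {
                return existing; // the config argument is simply discarded
            }
            SketchTimer created = new SketchTimer(name, config);
            timers.put(name, created);
            return created;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.timer("requests");           // registers the timer
        registry.timer("requests");           // lookup; no config is built
        registry.timerViaBuilder("requests"); // lookup; one config is built and thrown away
        System.out.println("configs built on lookup: " + registry.configsCreated);
        // prints "configs built on lookup: 1"
    }
}
```

In the sketch, the wasted work on the builder-style path is a single small object; in the real code the description above refers to, it is the config plus whatever is needed to configure it.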

@shakuzen shakuzen added this to the 1.16.0-M3 milestone Aug 25, 2025
@shakuzen shakuzen added the enhancement (A general enhancement), performance (Issues related to general performance), and module: micrometer-core (An issue that is related to our core module) labels Aug 25, 2025
@shakuzen
Member Author

Benchmark/profiling analysis of the changes

Retrieve existing timer benchmark - Before

Benchmark                                                    Mode  Cnt      Score     Error   Units
MeterRegistrationBenchmark.registerTimer                     avgt    5      0.023 ±   0.001   us/op
MeterRegistrationBenchmark.registerTimer:gc.alloc.rate       avgt    5  14867.901 ± 255.952  MB/sec
MeterRegistrationBenchmark.registerTimer:gc.alloc.rate.norm  avgt    5    176.000 ±   0.001    B/op

Retrieve existing timer benchmark - After

Benchmark                                                    Mode  Cnt     Score     Error   Units
MeterRegistrationBenchmark.registerTimer                     avgt    5     0.015 ±   0.001   us/op
MeterRegistrationBenchmark.registerTimer:gc.alloc.rate       avgt    5  7883.567 ± 172.471  MB/sec
MeterRegistrationBenchmark.registerTimer:gc.alloc.rate.norm  avgt    5    64.000 ±   0.001    B/op

These changes save 8 nanoseconds per operation when retrieving an existing timer (the benchmark method was renamed to registerExistingTimer after these benchmarks were run) and, perhaps more importantly, they save 112 bytes of allocation per operation.
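
For reference, the arithmetic behind those two figures, using the Score column of the before/after tables above and ignoring the reported ±0.001 us/op error (the percentages are rounded and are derived here, not reported by the benchmark):

$$
\begin{aligned}
\Delta t &= 0.023 - 0.015 = 0.008\ \mu\text{s/op} = 8\ \text{ns/op} && (8/23 \approx 35\%\ \text{less time per operation}) \\
\Delta B &= 176 - 64 = 112\ \text{B/op} && (112/176 \approx 64\%\ \text{fewer bytes per operation})
\end{aligned}
$$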

Retrieve existing timer allocation profiling - Before

       bytes  percent  samples  top
  ----------  -------  -------  ---
180253016322   27.26%   343806  io.micrometer.core.instrument.distribution.DistributionStatisticConfig
180020757181   27.23%   343363  java.lang.Double
149984354951   22.69%   286073  io.micrometer.core.instrument.Meter$Id
 91151489246   13.79%   173858  io.micrometer.core.instrument.MeterRegistry$$Lambda.0x00003fc00102dcf8
 59728872188    9.03%   113924  io.micrometer.core.instrument.distribution.DistributionStatisticConfig$Builder
     2621435    0.00%        5  java.util.concurrent.locks.AbstractQueuedSynchronizer$ExclusiveNode

Retrieve existing timer allocation profiling - After

       bytes  percent  samples  top
  ----------  -------  -------  ---
224818984183   62.49%   428809  io.micrometer.core.instrument.Meter$Id
134962483827   37.51%   257421  io.micrometer.core.instrument.MeterRegistry$$Lambda.0x00000fc00102da70
     2097148    0.00%        4  java.util.concurrent.locks.AbstractQueuedSynchronizer$ExclusiveNode

It is perhaps interesting to note that the allocation profiling shows (and the JIT compilation logs confirm) that even before these changes there was no allocation for the Timer Builder itself; that allocation gets eliminated. What was allocated, and is avoided after these changes, is the DistributionStatisticConfig itself, its Builder, and two Doubles boxed for its minimum/maximum expected value settings.

The lambda getting allocated is a capturing lambda where the configured PauseDetector is being passed. In a separate PR from this, we can look at eliminating this in the case the default PauseDetector (no-op implementation) is being used.
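
As a rough sketch of that idea (not the approach taken in any actual PR), the code below returns a pre-built, shared factory when the default detector is in use and falls back to a capturing lambda otherwise. `PauseDetector` here is a local hypothetical interface, and `NO_OP`, `SketchTimer`, and `factoryFor` are made-up names for illustration.

```java
// Hypothetical, simplified names; this is not Micrometer's code.
import java.util.function.Function;

class LambdaCaptureSketch {

    /** Local stand-in interface, declared here so the sketch is self-contained. */
    interface PauseDetector {}

    /** Stand-in for a default no-op detector, modelled as a singleton. */
    static final PauseDetector NO_OP = new PauseDetector() {};

    static final class SketchTimer {
        final String name;
        final PauseDetector pauseDetector;

        SketchTimer(String name, PauseDetector pauseDetector) {
            this.name = name;
            this.pauseDetector = pauseDetector;
        }
    }

    /** Built once; it captures no local state, so handing it out allocates nothing. */
    private static final Function<String, SketchTimer> DEFAULT_FACTORY =
            name -> new SketchTimer(name, NO_OP);

    /** Returns a factory that creates timers using the given detector. */
    static Function<String, SketchTimer> factoryFor(PauseDetector detector) {
        if (detector == NO_OP) {
            return DEFAULT_FACTORY; // shared instance for the default case
        }
        // Captures 'detector', so evaluating this expression typically creates
        // a new function object unless the JIT manages to eliminate it.
        return name -> new SketchTimer(name, detector);
    }

    public static void main(String[] args) {
        Function<String, SketchTimer> a = factoryFor(NO_OP);
        Function<String, SketchTimer> b = factoryFor(NO_OP);
        System.out.println("default factory shared: " + (a == b));
        // prints "default factory shared: true"
    }
}
```

The shared instance is guaranteed here only because it lives in a static final field; the sketch does not rely on how the JVM treats non-capturing lambdas.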

@shakuzen
Member Author

I did consider trying to avoid creating a DistributionStatisticConfig even when the Timer Builder is used, if nothing is called on it that changes the DSC defaults. While perhaps possible, it would be significantly more complicated than this change, partly because AbstractTimerBuilder is public and has a protected final field for the DSC Builder. Perhaps we could extend the builder to track whether anything has been changed from the defaults, and return the static default DSC instance if nothing has. I'm not sure that is worth doing when this change hopefully offers a reasonable compromise for performance-critical instrumentation.
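
A minimal sketch of that "track whether anything changed" idea might look like the following. It is purely illustrative: `StatsConfig` and `Builder` are hypothetical names, and it sidesteps the real difficulty mentioned above, namely the existing public AbstractTimerBuilder and its protected field.

```java
// Hypothetical sketch; names do not correspond to Micrometer's actual classes.
class TrackingBuilderSketch {

    static final class StatsConfig {
        /** Shared immutable default returned when the builder was never touched. */
        static final StatsConfig DEFAULT = new StatsConfig(false, null);

        final boolean percentileHistogram;
        final Double maximumExpectedValue;

        StatsConfig(boolean percentileHistogram, Double maximumExpectedValue) {
            this.percentileHistogram = percentileHistogram;
            this.maximumExpectedValue = maximumExpectedValue;
        }
    }

    static final class Builder {
        private boolean modified; // flips to true on the first setter call
        private boolean percentileHistogram = StatsConfig.DEFAULT.percentileHistogram;
        private Double maximumExpectedValue = StatsConfig.DEFAULT.maximumExpectedValue;

        Builder percentileHistogram(boolean enabled) {
            this.percentileHistogram = enabled;
            this.modified = true;
            return this;
        }

        Builder maximumExpectedValue(double max) {
            this.maximumExpectedValue = max; // boxes the value
            this.modified = true;
            return this;
        }

        /** Allocates a new config only if a setter was called; otherwise reuses DEFAULT. */
        StatsConfig build() {
            return modified
                    ? new StatsConfig(percentileHistogram, maximumExpectedValue)
                    : StatsConfig.DEFAULT;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Builder().build() == StatsConfig.DEFAULT);
        // prints "true"
        System.out.println(new Builder().percentileHistogram(true).build() == StatsConfig.DEFAULT);
        // prints "false"
    }
}
```

The flag is deliberately conservative: calling a setter with a value equal to the default still counts as a modification, which keeps the check cheap at the cost of an occasional unnecessary allocation.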

@shakuzen shakuzen changed the title from "Skip DistributionStatisticsConfig creation when retrieving timers" to "Avoid DistributionStatisticsConfig creation when retrieving timers" Aug 25, 2025
@shakuzen shakuzen merged commit 84322ca into micrometer-metrics:main Aug 25, 2025
11 checks passed
@shakuzen shakuzen deleted the nano-timer branch August 25, 2025 10:49
@shakuzen
Member Author

> The lambda getting allocated is a capturing lambda where the configured PauseDetector is being passed. In a separate PR from this, we can look at eliminating this in the case the default PauseDetector (no-op implementation) is being used.

The capture was more than just the PauseDetector, which required more changes to avoid, but I've opened #6670 for this.
