[ML] Improvements to bounds scaling for metric functions when there is a strong periodicity in the rate of values

An issue raised by support has shown that the correction we apply to the model variance for non-uniform data rate (for functions such as the mean) interacts with the seasonal variance scale we estimate when we model periodicity in the function values. In particular, we can end up "double counting" the increase in variance we expect if the periodic pattern in the rate coincides with the periodic pattern in the values.
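To make the double counting concrete, here is an idealised illustration. It is not the exact form of either correction in the code. The assumptions are mine: values in a bucket are i.i.d. with per-value variance $\sigma^2(t)$, the bucket at time $t$ holds $n(t)$ values, $\bar{n}$ is a reference count, and $\sigma_0^2$ is a reference per-value variance. The variance of the bucket mean $\bar{x}(t)$ is then

$$\operatorname{Var}[\bar{x}(t)] = \frac{\sigma^2(t)}{n(t)}$$

A seasonal variance scale $s(t)$ estimated from the bucket means already absorbs the count, and a separate count correction $c(t)$ applies it again:

$$s(t) \approx \frac{\sigma^2(t) / n(t)}{\sigma_0^2 / \bar{n}}, \qquad c(t) = \frac{\bar{n}}{n(t)}, \qquad s(t)\,c(t) \approx \frac{\sigma^2(t)}{\sigma_0^2} \left( \frac{\bar{n}}{n(t)} \right)^2$$

Under these assumptions the count ratio enters squared when both scales are applied, so the bounds are widened too much wherever the rate drops in step with the values.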

In fact, this could all be modelled. Ideally, for each residual model (normal, gamma, log-normal, mixture, etc.) we'd estimate the model parameters as a function of the count of values in the bucket and the estimate of the variance at that offset in the period.
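As a rough sketch of what "variance as a function of count and offset" could look like in the simplest case, the class below keeps a per-offset estimate of the per-value variance, using the fact that for i.i.d. values the squared residual of a bucket mean, multiplied by the bucket count, has expectation equal to the per-value variance. `CSeasonalPerValueVariance` and its methods are hypothetical names, not part of our code base, and this is only a moment-based illustration. It does not parameterise any of the real residual models.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: per-offset estimate of the per-value
// variance, from which the variance of a bucket mean with any count follows.
class CSeasonalPerValueVariance {
public:
    explicit CSeasonalPerValueVariance(std::size_t bucketsPerPeriod)
        : m_Sums(bucketsPerPeriod, 0.0), m_Updates(bucketsPerPeriod, 0) {
        assert(bucketsPerPeriod > 0);
    }

    // residual: bucket mean minus its prediction.
    // count: number of values that contributed to the bucket mean.
    void add(std::size_t offset, double residual, std::size_t count) {
        assert(offset < m_Sums.size() && count > 0);
        // Assuming i.i.d. values, Var(mean of n values) = sigma^2 / n, so
        // n * residual^2 has expectation sigma^2 at this offset.
        m_Sums[offset] += static_cast<double>(count) * residual * residual;
        ++m_Updates[offset];
    }

    // Estimated variance of a single value at this offset in the period.
    double perValueVariance(std::size_t offset) const {
        assert(offset < m_Sums.size() && m_Updates[offset] > 0);
        return m_Sums[offset] / static_cast<double>(m_Updates[offset]);
    }

    // Predicted variance of the mean of `count` values at this offset.
    // The count enters exactly once, so it cannot be double counted.
    double bucketMeanVariance(std::size_t offset, std::size_t count) const {
        assert(count > 0);
        return this->perValueVariance(offset) / static_cast<double>(count);
    }

private:
    std::vector<double> m_Sums;
    std::vector<std::size_t> m_Updates;
};
```

The design point is that the count and the seasonal variance are separated at estimation time, so a single scale is applied when computing bounds.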

This is not a trivial change: we'd need to adapt all our residual model classes and also rework `maths::CRegression::COnlineLeastSquares` to support multiple regressors. A good first step would be to produce a data set which displays this problem, since we don't have access to enough of the data the issue was reported against to reproduce it directly. The required characteristics are seasonality in the metric values, with a seasonal increase in observed variance coinciding with a large seasonal drop-off in data rate.
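A minimal sketch of a generator for such a data set follows. Everything in it is an assumption chosen for illustration: the `SBucket` type, the function names, the sinusoidal shapes, and the specific numbers (expected count falling from 100 to 5 values per bucket while the per-value standard deviation rises from 1 to 5) are not taken from the reported data.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical container: the raw metric values that arrived in one bucket.
struct SBucket {
    std::size_t s_Offset;         // bucket index within the period
    std::vector<double> s_Values; // individual metric values
};

// Generate `periods` repeats of a period of `bucketsPerPeriod` buckets. At
// mid-period the data rate drops and the per-value variance rises, while the
// mean level follows its own seasonal pattern.
std::vector<SBucket> generateBuckets(std::size_t periods,
                                     std::size_t bucketsPerPeriod,
                                     unsigned seed) {
    assert(periods > 0 && bucketsPerPeriod > 0);
    const double twoPi = 2.0 * std::acos(-1.0);
    std::mt19937 rng(seed);
    std::vector<SBucket> buckets;
    buckets.reserve(periods * bucketsPerPeriod);
    for (std::size_t i = 0; i < periods * bucketsPerPeriod; ++i) {
        std::size_t offset = i % bucketsPerPeriod;
        double phase = twoPi * static_cast<double>(offset) /
                       static_cast<double>(bucketsPerPeriod);
        double trough = 0.5 * (1.0 - std::cos(phase)); // 0 at start, 1 mid-period
        double level = 10.0 + 5.0 * std::sin(phase);   // seasonal mean of values
        double sd = 1.0 + 4.0 * trough;                // variance peaks in trough
        double rate = 100.0 - 95.0 * trough;           // data rate drops in trough
        std::poisson_distribution<int> countDist(rate);
        std::normal_distribution<double> valueDist(level, sd);
        // Keep at least one value so every bucket has a defined mean.
        std::size_t count = static_cast<std::size_t>(std::max(1, countDist(rng)));
        SBucket bucket{offset, {}};
        bucket.s_Values.reserve(count);
        for (std::size_t j = 0; j < count; ++j) {
            bucket.s_Values.push_back(valueDist(rng));
        }
        buckets.push_back(std::move(bucket));
    }
    return buckets;
}

// Average number of values per bucket at a given offset in the period.
double meanCount(const std::vector<SBucket>& buckets, std::size_t offset) {
    double total = 0.0;
    std::size_t n = 0;
    for (const auto& bucket : buckets) {
        if (bucket.s_Offset == offset) {
            total += static_cast<double>(bucket.s_Values.size());
            ++n;
        }
    }
    return n > 0 ? total / static_cast<double>(n) : 0.0;
}

// Sample variance of all values seen at a given offset, pooled over periods.
double pooledVariance(const std::vector<SBucket>& buckets, std::size_t offset) {
    std::vector<double> values;
    for (const auto& bucket : buckets) {
        if (bucket.s_Offset == offset) {
            values.insert(values.end(), bucket.s_Values.begin(), bucket.s_Values.end());
        }
    }
    if (values.size() < 2) {
        return 0.0;
    }
    double mean = 0.0;
    for (double value : values) {
        mean += value;
    }
    mean /= static_cast<double>(values.size());
    double sumSquares = 0.0;
    for (double value : values) {
        sumSquares += (value - mean) * (value - mean);
    }
    return sumSquares / static_cast<double>(values.size() - 1);
}
```

For example, `generateBuckets(20, 24, 42u)` gives 20 periods of 24 buckets. `meanCount` and `pooledVariance` at offsets 0 and 12 can be compared to confirm the rate trough and the variance peak line up before feeding the bucket means to the `mean` function.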

[ML] Improvements to bounds scaling for metric functions when there is a strong periodicity in the rate of values #435
