[ML] Improvements to bounds scaling for metric functions when there is a strong periodicity in the rate of values #435

Closed
@tveasey

An issue raised by support has revealed an interplay between the correction we apply to the model variance for non-uniform data rate, for functions such as the mean, and the seasonal variance scale we estimate if we model periodicity in the function values. In particular, we can end up "double counting" the increase in variance we expect if the periodic pattern in the rate coincides with the periodic pattern in the values.
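To make the "double counting" concrete, here is a minimal arithmetic sketch. It is not the actual ml-cpp calculation. It assumes the bucket mean of n independent values has variance sigma^2 / n, that the rate correction scales the variance by the ratio of typical to observed count, and that the seasonal scale is learned from the observed bucket means. The function name and the numbers are made up for illustration.

```python
def variance_scales(value_variance: float, typical_count: int, quiet_count: int) -> dict:
    """Compare the true variance scale of a quiet bucket with a double-counted one.

    Assumption (illustrative only): the bucket mean of n independent values
    has variance value_variance / n.
    """
    typical_variance = value_variance / typical_count
    quiet_variance = value_variance / quiet_count

    # How much wider the bucket mean really is in the quiet part of the period.
    true_scale = quiet_variance / typical_variance

    # Correction derived from the data rate alone.
    count_scale = typical_count / quiet_count

    # A seasonal variance estimate fitted to the observed bucket means would
    # learn the same increase, because it sees the effect of the low count.
    seasonal_scale = quiet_variance / typical_variance

    return {
        "true": true_scale,
        "count": count_scale,
        "seasonal": seasonal_scale,
        "combined": count_scale * seasonal_scale,
    }


if __name__ == "__main__":
    # Hypothetical numbers: 64 values per bucket normally, 4 in the quiet period.
    print(variance_scales(value_variance=4.0, typical_count=64, quiet_count=4))
```

Under these assumptions the two scales each account for the full increase on their own, so applying both multiplies the variance by the square of the correct factor and the bounds come out too wide.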

In fact, this could all be modelled. Ideally, for each residual model (normal, gamma, log-normal, mixture, etc.) we'd estimate the model parameters as a function of both the count of values in the bucket and the estimate of the variance at that offset in the period.

This is not a trivial change: we'd need to adapt all our residual model classes and also rework maths::CRegression::COnlineLeastSquares to support multiple regressors. A good first step would be to produce a data set which displays this problem, since we don't have access to enough of the data the issue was reported against to reproduce it directly. The characteristics would be seasonality in the metric values, with a seasonal increase in observed variance coinciding with a large seasonal drop-off in data rate.
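A data set with those characteristics could be generated along the following lines. This is only a sketch: the function names, the daily period, the hourly bucket length, the Gaussian noise and all the numeric levels are assumptions chosen for illustration, not values taken from the reported data.

```python
import math
import random
import statistics


def generate_series(days=14, bucket_seconds=3600, period_seconds=86400,
                    peak_count=60, trough_count=2, seed=42):
    """Generate bucketed means with a coinciding seasonal pattern in
    value level, value variance and data rate (all hypothetical)."""
    rng = random.Random(seed)
    buckets = []
    for i in range(days * period_seconds // bucket_seconds):
        start = i * bucket_seconds
        phase = 2.0 * math.pi * (start % period_seconds) / period_seconds
        # Activity is 1 at the start of each period and 0 half way through.
        activity = 0.5 * (1.0 + math.cos(phase))
        # The data rate drops off sharply as activity falls.
        count = max(1, round(trough_count + (peak_count - trough_count) * activity))
        # Seasonal level of the metric values.
        level = 100.0 + 20.0 * math.sin(phase)
        # Per-value standard deviation grows exactly when the rate drops.
        sigma = 2.0 + 8.0 * (1.0 - activity)
        values = [rng.gauss(level, sigma) for _ in range(count)]
        buckets.append({"time": start, "count": count, "mean": sum(values) / count})
    return buckets


def offset_variance(series, offset_seconds, period_seconds=86400):
    """Sample variance of the bucket means at one offset in the period."""
    means = [b["mean"] for b in series if b["time"] % period_seconds == offset_seconds]
    return statistics.variance(means)


if __name__ == "__main__":
    series = generate_series()
    print("variance of means at peak rate:  ", offset_variance(series, 0))
    print("variance of means at trough rate:", offset_variance(series, 12 * 3600))
```

Feeding the `mean` and `count` columns to a mean detector should exercise both the rate correction and the seasonal variance scale at the same offsets in the period, which is the situation described above.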
