[Feature] "MetricsContext#add" (reduce metrics duplication) #46

Open
@simonespa

Describe the user story

Currently, every metric is applied to every defined dimension set. Metrics therefore end up associated with unrelated dimensions, which creates duplicates. The only workaround is to flush the metric logger multiple times, which produces separate JSON logs in CloudWatch.

As a developer, I'd like to filter different groups of metrics by different dimensions within the same metric logger context (same JSON payload).

Use Case

I'm currently working on a project where the server sends events rather than individual metrics. This event is a single JSON object containing all relevant metrics, dimensions and properties measured during the course of it (the metric logger is flushed only once per event instance).

As an example, let's take a common event for web applications and call it page-request. It is triggered any time a user requests a web page. Let's assume the collected metrics and dimensions are as follows:

RequestCount
Counts the number of HTTP requests the server receives from the user. This metric is used to calculate the RPS.

  • Dimensions: PageType
  • Unit: Count
  • Aggregations: Sum

ResponseTime
The response time in milliseconds.

  • Dimensions: PageType
  • Unit: Milliseconds
  • Aggregations: Avg, 50th percentile, 95th percentile, 99th percentile

UpstreamRequestCount
Counts the number of HTTP requests the app performs towards its upstream services.

  • Dimensions: Client
  • Unit: Count
  • Aggregations: Sum

Where:

  • PageType: the type of page requested by the user (e.g. home, player, etc.)
  • Client: the name of the upstream service

From the example, RequestCount and ResponseTime share the same dimension, whereas UpstreamRequestCount is applied to a different one.

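Let's write an example metric logger that is called once, immediately after the HTTP response has been sent to the user:

import config from 'config'; // assumed: the node-config package
import { metricScope, Unit } from 'aws-embedded-metrics';

const namespace = config.get('namespace');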

export const logPageRequest = metricScope(metrics => {
  return async pageRequestEvent => {
    const {
      requestCount,
      responseTime,
      upstreamRequestCount,
      pageType,
      client
    } = pageRequestEvent;
    metrics.setNamespace(namespace);
    metrics.putMetric('RequestCount', requestCount, Unit.Count);
    metrics.putMetric('ResponseTime', responseTime, Unit.Milliseconds);
    metrics.putMetric('UpstreamRequestCount', upstreamRequestCount, Unit.Count);
    metrics.setDimensions(
      { PageType: pageType },
      { Client: client }
    );
  };
});

This example generates the following JSON log

[Image: generated JSON log]

and the following metrics are extracted

[Three screenshots of the extracted metrics, 2020-06-12]

We can see that UpstreamRequestCount is also applied to the PageType dimension and RequestCount and ResponseTime to Client, effectively generating unnecessary new metrics (3 in this example).
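To make the duplication concrete, here is a minimal sketch (not the library's code) that pairs every metric with every dimension set, which is how a single group of metrics and dimensions gets extracted. The helper name `expandMetrics` is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper: lists every (metric, dimension set) pair produced when
// all metrics and all dimension sets are declared together. Every metric is
// combined with every dimension set.
function expandMetrics(metricNames, dimensionSets) {
  const pairs = [];
  for (const dimensions of dimensionSets) {
    for (const name of metricNames) {
      pairs.push(`${name} by ${dimensions.join(', ')}`);
    }
  }
  return pairs;
}

const extracted = expandMetrics(
  ['RequestCount', 'ResponseTime', 'UpstreamRequestCount'],
  [['PageType'], ['Client']]
);

// The only pairs the use case actually needs.
const wanted = new Set([
  'RequestCount by PageType',
  'ResponseTime by PageType',
  'UpstreamRequestCount by Client'
]);
const unwanted = extracted.filter(pair => !wanted.has(pair));

console.log(extracted.length); // 6 pairs in total
console.log(unwanted);         // the 3 pairs that were never requested
```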

Describe the outcome you'd like

Following the previous example, I'd like to filter UpstreamRequestCount by Client only, and RequestCount and ResponseTime by PageType only, resulting in the following metrics:

[Three screenshots of the desired metrics, 2020-06-12]

Here, the PageType group contains only the metrics we want to filter by PageType, and the same holds for Client.

According to the EMF specification, it is possible to add multiple CloudWatchMetrics objects

"CloudWatchMetrics": [
  {
    ... ...
  },
  {
    ... ...
  }
]

in order to define different groups of metrics that we want to apply to different dimensions. If we consider the previous example once again, we need to generate a JSON payload like the following

[Image: JSON payload with two CloudWatchMetrics objects]

where the two metrics sharing the same dimensions are defined within the same CloudWatchMetrics object.

Generally speaking, each CloudWatchMetrics object contains metrics that are filtered by the same group of dimensions.
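As a hand-written sketch of what such a payload could look like for the page-request example, the object below follows the EMF layout (an `_aws` member holding `Timestamp` and `CloudWatchMetrics`, with dimension and metric values at the root). The namespace and all values are made up for illustration.

```javascript
// Illustrative values only: the namespace, dimension values and numbers are made up.
const payload = {
  _aws: {
    Timestamp: Date.now(),
    CloudWatchMetrics: [
      {
        // First group: the two metrics that share the PageType dimension.
        Namespace: 'my-app',
        Dimensions: [['PageType']],
        Metrics: [
          { Name: 'RequestCount', Unit: 'Count' },
          { Name: 'ResponseTime', Unit: 'Milliseconds' }
        ]
      },
      {
        // Second group: the metric filtered by Client only.
        Namespace: 'my-app',
        Dimensions: [['Client']],
        Metrics: [{ Name: 'UpstreamRequestCount', Unit: 'Count' }]
      }
    ]
  },
  // Dimension and metric values live at the root of the document.
  PageType: 'home',
  Client: 'some-upstream',
  RequestCount: 1,
  ResponseTime: 120,
  UpstreamRequestCount: 3
};

console.log(JSON.stringify(payload, null, 2));
```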

Describe the solution you are proposing

To do so, I propose adding a new method called add to the MetricsContext interface. The method accepts a single parameter, an object defined in the next section.

Syntax

{
    "Name": String,
    "Value": Number,
    "Unit": String,
    "Metrics": [ MetricItem, ... ],
    "Dimensions": Object
}

Properties

Name

The metric name.

Required: only if Metrics is undefined or an empty array, optional otherwise.
Type: String

Value

The metric value.

Required: only if Metrics is undefined or an empty array, optional otherwise.
Type: Number

Unit

The metric unit (e.g. Unit.Count, Unit.Milliseconds, etc.)

Required: only if Metrics is undefined or an empty array, optional otherwise.
Type: String

Metrics

An array of objects (see MetricItem type).

Required: only if Name, Value and Unit are undefined, optional otherwise.
Type: Array

Dimensions

The dimensions to filter the defined metrics by. This object is a map of key/value pairs that stores the name and value of each dimension. Each property value must be of type String.

Required: yes
Type: Object

Types

MetricItem

{
  "Name": String,
  "Value": Number,
  "Unit": String
}

Considering the previous example, our metric logger will look like this:

const namespace = config.get('namespace');

export const logPageRequest = metricScope(metrics => {
  return async pageRequestEvent => {
    const {
      requestCount,
      responseTime,
      upstreamRequestCount,
      pageType,
      client
    } = pageRequestEvent;
    metrics.setNamespace(namespace);
    metrics.add({
      Metrics: [
        {
          Name: 'RequestCount',
          Value: requestCount,
          Unit: Unit.Count
        },
        {
          Name: 'ResponseTime',
          Value: responseTime,
          Unit: Unit.Milliseconds
        }
      ],
      Dimensions: { PageType: pageType }
    });
  };
});

When there is only one metric, we can either write

    metrics.add({
      Metrics: [
        {
          Name: 'UpstreamRequestCount',
          Value: upstreamRequestCount,
          Unit: Unit.Count
        }
      ],
      Dimensions: { Client: client }
    });

or

    metrics.add({
      Name: 'UpstreamRequestCount',
      Value: upstreamRequestCount,
      Unit: Unit.Count,
      Dimensions: { Client: client }
    });
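The "Required" rules from the Properties section, and the equivalence of the two single-metric forms above, can be sketched as a small normalising function. This is only one possible reading of the proposal: the name `normalizeAddInput` is hypothetical, and merging a top-level metric into a non-empty Metrics array is an assumption, since the proposal does not say what happens when both are given.

```javascript
// Hypothetical normaliser for the proposed `add` parameter. It enforces the
// "Required" rules and always returns the { Metrics, Dimensions } form.
function normalizeAddInput(input) {
  const { Name, Value, Unit, Metrics, Dimensions } = input;

  // Dimensions is always required and must map names to String values.
  if (Dimensions === null || typeof Dimensions !== 'object' || Array.isArray(Dimensions)) {
    throw new TypeError('Dimensions is required and must be an object');
  }
  for (const [key, value] of Object.entries(Dimensions)) {
    if (typeof value !== 'string') {
      throw new TypeError(`Dimension "${key}" must be a String`);
    }
  }

  const items = Array.isArray(Metrics) ? [...Metrics] : [];
  const shorthand = [Name, Value, Unit].filter(field => field !== undefined);

  if (shorthand.length === 3) {
    // Assumption: a top-level metric is merged with the Metrics array.
    items.push({ Name, Value, Unit });
  } else if (shorthand.length > 0 || items.length === 0) {
    throw new TypeError('Provide Name, Value and Unit together, or a non-empty Metrics array');
  }

  return { Metrics: items, Dimensions };
}

// Both single-metric forms normalise to the same structure.
const shortForm = normalizeAddInput({
  Name: 'UpstreamRequestCount',
  Value: 3,
  Unit: 'Count',
  Dimensions: { Client: 'some-upstream' }
});
const arrayForm = normalizeAddInput({
  Metrics: [{ Name: 'UpstreamRequestCount', Value: 3, Unit: 'Count' }],
  Dimensions: { Client: 'some-upstream' }
});
console.log(JSON.stringify(shortForm) === JSON.stringify(arrayForm)); // true
```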

Any other considerations about the solution

We could have modified the LogSerializer to optionally generate multiple CloudWatchMetrics objects by means of a flag, but there is currently no association between groups of metrics sharing the same dimensions. To achieve that, we would have had to modify the internal data structure by adding a mapping between them, which would still require a new method so that the user can express this relationship via the public API.

By creating a brand new method, we keep the current data structure as is and the API backward compatible with the previous version. The new method will have a separate data structure so that the LogSerializer can easily understand how to serialise it.
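One way to picture that separate data structure is a list with one entry per add call, which a serialiser then turns into one CloudWatchMetrics object each. The sketch below is standalone and hypothetical: `GroupedMetrics` is not part of the library, and it does not reflect how MetricsContext or LogSerializer are actually implemented.

```javascript
// Hypothetical stand-in for the separate data structure: one entry per add() call.
class GroupedMetrics {
  constructor(namespace) {
    this.namespace = namespace;
    this.groups = [];
  }

  // Stores a group of metrics together with the dimensions they belong to.
  add({ Metrics, Dimensions }) {
    this.groups.push({ Metrics, Dimensions });
  }

  // Builds one CloudWatchMetrics object per group; values go to the root.
  serialize(timestamp = Date.now()) {
    const root = { _aws: { Timestamp: timestamp, CloudWatchMetrics: [] } };
    for (const { Metrics, Dimensions } of this.groups) {
      root._aws.CloudWatchMetrics.push({
        Namespace: this.namespace,
        Dimensions: [Object.keys(Dimensions)],
        Metrics: Metrics.map(({ Name, Unit }) => ({ Name, Unit }))
      });
      Object.assign(root, Dimensions);
      for (const { Name, Value } of Metrics) {
        root[Name] = Value;
      }
    }
    return root;
  }
}

const grouped = new GroupedMetrics('my-app'); // namespace is made up
grouped.add({
  Metrics: [
    { Name: 'RequestCount', Value: 1, Unit: 'Count' },
    { Name: 'ResponseTime', Value: 120, Unit: 'Milliseconds' }
  ],
  Dimensions: { PageType: 'home' }
});
grouped.add({
  Metrics: [{ Name: 'UpstreamRequestCount', Value: 3, Unit: 'Count' }],
  Dimensions: { Client: 'some-upstream' }
});
const serialized = grouped.serialize();
console.log(serialized._aws.CloudWatchMetrics.length); // 2
```

A limitation of this sketch is that values sit at the root of the document, so two groups that reuse a dimension or metric name with different values would overwrite each other. A real implementation would need to decide how to handle that case.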
