Description
Describe the user story
Currently, all metrics are filtered against all defined dimensions, which associates metrics with unrelated dimensions and duplicates them. The only workaround is to flush the metric logger multiple times, which in turn produces separate JSON logs in CloudWatch.
As a developer, I'd like to filter a group of metrics against different dimensions within the same metric logger context (i.e. the same JSON payload).
Use Case
I'm currently working on a project where the server sends events rather than individual metrics. Each event is a single JSON object containing all the relevant metrics, dimensions, and properties measured over its course (the metric logger is flushed only once per event instance).
As an example, let's take a pretty common event for web applications and call it `page-request`, which is triggered any time a web page is requested by a user. Let's assume the collected metrics and dimensions are the following:
`RequestCount`
Counts the number of HTTP requests the server receives from the user. This metric is used to calculate the RPS.
- Dimensions: `PageType`
- Unit: `Count`
- Aggregations: `Sum`
`ResponseTime`
The response time in milliseconds.
- Dimensions: `PageType`
- Unit: `Milliseconds`
- Aggregations: `Avg`, 50th percentile, 95th percentile, 99th percentile
`UpstreamRequestCount`
Counts the number of HTTP requests the app performs towards its upstream services.
- Dimensions: `Client`
- Unit: `Count`
- Aggregations: `Sum`
Where:
- `PageType`: the type of page requested by the user (e.g. home, player, etc.)
- `Client`: the name of the upstream service
From the example, `RequestCount` and `ResponseTime` share the same dimension, whereas `UpstreamRequestCount` is applied to a different one.
Let's write an example of a metric logger that is called once, immediately after the HTTP response has been sent to the user:
```js
import { metricScope, Unit } from 'aws-embedded-metrics';
import config from 'config'; // assumes the node-config package

const namespace = config.get('namespace');

export const logPageRequest = metricScope(metrics => {
  return async pageRequestEvent => {
    const {
      requestCount,
      responseTime,
      upstreamRequestCount,
      pageType,
      client
    } = pageRequestEvent;

    metrics.setNamespace(namespace);
    metrics.putMetric('RequestCount', requestCount, Unit.Count);
    metrics.putMetric('ResponseTime', responseTime, Unit.Milliseconds);
    metrics.putMetric('UpstreamRequestCount', upstreamRequestCount, Unit.Count);

    // All three metrics end up filtered by BOTH dimension sets.
    metrics.setDimensions(
      { PageType: pageType },
      { Client: client }
    );
  };
});
```
This example generates the following JSON log, from which CloudWatch extracts the metrics.
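A sketch of the emitted payload (timestamp, metric values, and namespace are illustrative):

```json
{
  "_aws": {
    "Timestamp": 1583902595342,
    "CloudWatchMetrics": [
      {
        "Namespace": "my-namespace",
        "Dimensions": [["PageType"], ["Client"]],
        "Metrics": [
          { "Name": "RequestCount", "Unit": "Count" },
          { "Name": "ResponseTime", "Unit": "Milliseconds" },
          { "Name": "UpstreamRequestCount", "Unit": "Count" }
        ]
      }
    ]
  },
  "PageType": "home",
  "Client": "my-upstream-service",
  "RequestCount": 1,
  "ResponseTime": 187,
  "UpstreamRequestCount": 3
}
```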
We can see that `UpstreamRequestCount` is also applied to the `PageType` dimension, and `RequestCount` and `ResponseTime` to `Client`, effectively generating unnecessary new metrics (3 in this example).
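Concretely, CloudWatch extracts every metric against each dimension set:
- under `PageType`: `RequestCount`, `ResponseTime`, and `UpstreamRequestCount` (unwanted)
- under `Client`: `RequestCount` (unwanted), `ResponseTime` (unwanted), and `UpstreamRequestCount`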
Describe the outcome you'd like
Following the previous example, I'd like to filter `UpstreamRequestCount` by `Client` only, and `RequestCount` and `ResponseTime` by `PageType`, resulting in the following metrics:
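- under `PageType`: `RequestCount` and `ResponseTime`
- under `Client`: `UpstreamRequestCount`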
Here, the `PageType` group contains only the metrics that we want to apply to it, and the same goes for `Client`.
According to the EMF specification, it is possible to add multiple `CloudWatchMetrics` objects:
"CloudWatchMetrics": [
{
... ...
},
{
... ...
}
]
in order to define different groups of metrics that we want to apply to different dimensions. If we consider the previous example once again, we need to generate a JSON payload like the following
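A sketch of such a payload (timestamp, metric values, and namespace are illustrative, following the EMF format shown above):

```json
{
  "_aws": {
    "Timestamp": 1583902595342,
    "CloudWatchMetrics": [
      {
        "Namespace": "my-namespace",
        "Dimensions": [["PageType"]],
        "Metrics": [
          { "Name": "RequestCount", "Unit": "Count" },
          { "Name": "ResponseTime", "Unit": "Milliseconds" }
        ]
      },
      {
        "Namespace": "my-namespace",
        "Dimensions": [["Client"]],
        "Metrics": [
          { "Name": "UpstreamRequestCount", "Unit": "Count" }
        ]
      }
    ]
  },
  "PageType": "home",
  "Client": "my-upstream-service",
  "RequestCount": 1,
  "ResponseTime": 187,
  "UpstreamRequestCount": 3
}
```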
where the two metrics sharing the same dimensions are defined within the same `CloudWatchMetrics` object.
Generally speaking, each `CloudWatchMetrics` object contains metrics that are filtered by the same group of dimensions.
Describe the solution you are proposing
To do so, I'm proposing to add a new method to the `MetricsContext` interface called `add`. The method accepts a single parameter, an object whose shape is defined in the next section.
Syntax
```
{
  "Name": String,
  "Value": Number,
  "Unit": String,
  "Metrics": [ MetricItem, ... ],
  "Dimensions": Object
}
```
Properties

`Name`
The metric name.
- Required: only if `Metrics` is undefined or an empty array, optional otherwise.
- Type: String

`Value`
The metric value.
- Required: only if `Metrics` is undefined or an empty array, optional otherwise.
- Type: Number

`Unit`
The metric unit (e.g. `Unit.Count`, `Unit.Milliseconds`, etc.).
- Required: only if `Metrics` is undefined or an empty array, optional otherwise.
- Type: String

`Metrics`
An array of objects (see the `MetricItem` type).
- Required: only if `Name`, `Value` and `Unit` are undefined, optional otherwise.
- Type: Array

`Dimensions`
The dimensions to filter the defined metrics by. This object is a map of key/value pairs storing the name and value of each dimension. Each property value must be of type String.
- Required: yes
- Type: Object
Types
`MetricItem`
```
{
  "Name": String,
  "Value": Number,
  "Unit": String
}
```
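To make the required/optional rules above concrete, here is a hypothetical validation sketch (the helper name is illustrative and not part of the proposal):

```js
// Either Name/Value/Unit are set at the top level, or a non-empty
// Metrics array is provided instead; Dimensions is always required.
function validateAddParams({ Name, Value, Unit, Metrics, Dimensions }) {
  const hasMetricsArray = Array.isArray(Metrics) && Metrics.length > 0;
  const hasSingleMetric =
    Name !== undefined && Value !== undefined && Unit !== undefined;

  if (!hasMetricsArray && !hasSingleMetric) {
    throw new Error('Provide either a non-empty Metrics array or Name/Value/Unit');
  }
  if (typeof Dimensions !== 'object' || Dimensions === null) {
    throw new Error('Dimensions is required and must be an object');
  }
}
```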
Considering the previous example, our metric logger would look like this:
```js
import { metricScope, Unit } from 'aws-embedded-metrics';
import config from 'config'; // assumes the node-config package

const namespace = config.get('namespace');

export const logPageRequest = metricScope(metrics => {
  return async pageRequestEvent => {
    const {
      requestCount,
      responseTime,
      upstreamRequestCount,
      pageType,
      client
    } = pageRequestEvent;

    metrics.setNamespace(namespace);

    // RequestCount and ResponseTime share the PageType dimension,
    // so they are grouped in a single `add` call.
    metrics.add({
      Metrics: [
        {
          Name: 'RequestCount',
          Value: requestCount,
          Unit: Unit.Count
        },
        {
          Name: 'ResponseTime',
          Value: responseTime,
          Unit: Unit.Milliseconds
        }
      ],
      Dimensions: { PageType: pageType }
    });
  };
});
```
When we have a single metric, we can either do
```js
metrics.add({
  Metrics: [
    {
      Name: 'UpstreamRequestCount',
      Value: upstreamRequestCount,
      Unit: Unit.Count
    }
  ],
  Dimensions: { Client: client }
});
```
or
```js
metrics.add({
  Name: 'UpstreamRequestCount',
  Value: upstreamRequestCount,
  Unit: Unit.Count,
  Dimensions: { Client: client }
});
```
Any other considerations about the solution
We could have modified the `LogSerializer` to optionally generate the multiple `CloudWatchMetrics` objects by means of a flag, but currently there is no association between groups of metrics sharing the same dimensions. To achieve that, we would have had to modify the internal data structure by adding a mapping between them, which would have required a new method anyway to let the user express this relationship via the public API.
By creating a brand new method, we keep the current data structure as is and the API backwards compatible with the previous version. The new method will use a separate data structure that allows the `LogSerializer` to easily understand how to serialise it, as sketched below.
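A minimal sketch of what this could look like (assumed names and structure, not the library's actual internals):

```js
// Each call to `add` pushes a self-contained group, so the serializer can
// emit one CloudWatchMetrics object per group, independently of the
// structures used by putMetric/setDimensions.
class MetricsContext {
  constructor() {
    this.metricGroups = []; // populated only by the new `add` method
  }

  add({ Name, Value, Unit, Metrics, Dimensions }) {
    const metrics =
      Metrics && Metrics.length > 0 ? Metrics : [{ Name, Value, Unit }];
    this.metricGroups.push({ metrics, dimensions: Dimensions });
  }
}

// Serializer side: one CloudWatchMetrics entry per group.
function serializeGroups(namespace, metricGroups) {
  return metricGroups.map(({ metrics, dimensions }) => ({
    Namespace: namespace,
    Dimensions: [Object.keys(dimensions)],
    Metrics: metrics.map(({ Name, Unit }) => ({ Name, Unit }))
  }));
}
```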