Histogram #151

Open
wants to merge 3 commits into
base: histogram_feature_branch
36 changes: 36 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -409,6 +409,42 @@ Thread-safety for the second use case is achieved by using a ReentrantReadWriteL
Even with these internal synchronization measures, some multi-threaded use cases are not covered by this library and may require external synchronization or other protective measures.
This is because the execution order of API calls is not deterministic in asynchronous contexts. For example, if each thread needs to associate its own set of properties with a metric, the results are not guaranteed, since the order in which `putProperty()` calls execute across threads is not determined. In such cases, we recommend using a separate MetricsLogger instance for each thread, so that no resources are shared and no thread-safety problems can arise. Note that this can often be simplified by using a ThreadLocal variable.
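
As a minimal sketch of the per-thread approach, a `ThreadLocal` can hand each thread its own instance. To stay self-contained, the example uses a hypothetical `StubLogger` class in place of `MetricsLogger`, and `PerThreadLoggers` is likewise an illustrative name; only `ThreadLocal.withInitial` is standard Java API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for MetricsLogger, used only to keep this sketch self-contained. */
final class StubLogger {
    private final Map<String, Object> properties = new HashMap<>();

    StubLogger putProperty(String key, Object value) {
        properties.put(key, value);
        return this;
    }

    Object getProperty(String key) {
        return properties.get(key);
    }
}

/** Illustrative holder that gives every thread its own logger instance. */
final class PerThreadLoggers {
    // Each thread lazily creates its own logger on first access, so no logger is shared.
    private static final ThreadLocal<StubLogger> LOGGER =
            ThreadLocal.withInitial(StubLogger::new);

    static StubLogger current() {
        return LOGGER.get();
    }
}
```

Properties set through `PerThreadLoggers.current()` on one thread are not visible to any other thread, so their ordering across threads no longer matters.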

## Aggregation

### Built-in Aggregation

This library implements three types of aggregation: List, Statistic Set, and Histogram.

- List reports every value added to a metric as a list of values.
- Statistic Set reports only the maximum, minimum, sum, and count of the values added to a metric.
- Histogram uses the Sparse Exponential Histogram (SEH) algorithm to place each value added to a metric into a bin that keeps track of how many values have been added to it. A histogram reports the bin values, the count of values in each bin, and a statistic set for the provided values. Note: SEH only accepts values greater than 0.
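
The binning step can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `SehSketch` and its methods are hypothetical names, the bin width uses the `EPSILON = 0.1` constant from the `Histogram` class in this change, and the check for non-positive values simply mirrors the note above.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of sparse exponential bucketing; names here are illustrative, not library API. */
final class SehSketch {
    // Assumed relative bin width, matching the EPSILON constant in this change.
    static final double EPSILON = 0.1;
    static final double BIN_SIZE = Math.log(1 + EPSILON);

    // Only bins that have received a value are stored, hence "sparse".
    private final Map<Integer, Integer> countsByIndex = new HashMap<>();

    /** Index i of the bin covering [(1 + EPSILON)^i, (1 + EPSILON)^(i + 1)). */
    static int binIndex(double value) {
        if (!(value > 0)) {
            throw new IllegalArgumentException("SEH only accepts values greater than 0");
        }
        return (int) Math.floor(Math.log(value) / BIN_SIZE);
    }

    /** Representative value reported for a bin: its geometric midpoint. */
    static double binValue(int index) {
        return Math.exp((index + 0.5) * BIN_SIZE);
    }

    /** Increments the count of the bin that the value falls into. */
    void add(double value) {
        countsByIndex.merge(binIndex(value), 1, Integer::sum);
    }

    /** Number of values recorded so far in the bin that the given value maps to. */
    int countFor(double value) {
        return countsByIndex.getOrDefault(binIndex(value), 0);
    }

    /** Number of non-empty bins. */
    int binCount() {
        return countsByIndex.size();
    }
}
```

Because bin boundaries grow geometrically, nearby values such as 1.0 and 1.05 share a bin, while values that differ by more than the bin width land in separate bins.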

There are two ways to set the aggregation type of a metric:
1. Pass the `AggregationType` parameter when calling `putMetric` on `MetricsLogger`:
```
MetricsLogger logger = new MetricsLogger();
logger.putMetric("metric", 1, AggregationType.HISTOGRAM);
```
2. By default, `MetricsLogger` uses List aggregation for every metric added with `putMetric` without specifying an aggregation type. This default can be changed to any of the other aggregation types by using a setter. It is recommended to do this before any metrics are added to the logger, because trying to change the aggregation type of an existing metric will throw an error:
```
MetricsLogger logger = new MetricsLogger();
logger.setDefaultAggregationType(AggregationType.STATISTIC_SET);
```

### Custom Histograms

Custom histograms can also be created if the sparse exponential histogram algorithm is not the best fit for the given data. To do this, use the `HistogramMetric` class and supply the bin values together with the count for each bin.

```
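// Bin values and the number of observations recorded in each bin
List<Double> values = Arrays.asList(1.0, 1234.0, 7.0, 100.0);
List<Integer> counts = Arrays.asList(1, 2, 7, 10);

// Arguments follow the constructor added in this change:
// (Unit, StorageResolution, values, counts)
HistogramMetric histogram =
        new HistogramMetric(Unit.NONE, StorageResolution.STANDARD, values, counts);

MetricsLogger logger = new MetricsLogger();
logger.setMetric("myHistogram", histogram);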
```
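
The statistic set reported alongside a custom histogram follows directly from the two lists. The sketch below mirrors the arithmetic in the `Histogram` constructor from this change; `HistogramSummary` is a hypothetical helper introduced only for illustration and is not part of the library.

```java
import java.util.Collections;
import java.util.List;

/** Illustrative helper (not library API) showing the statistics implied by bins and counts. */
final class HistogramSummary {
    final double min;
    final double max;
    final double sum;
    final int count;

    HistogramSummary(List<Double> values, List<Integer> counts) {
        if (values.size() != counts.size()) {
            throw new IllegalArgumentException("Counts and values must have the same size");
        }
        if (values.isEmpty()) {
            throw new IllegalArgumentException("At least one bin is required");
        }
        // Min and max are taken over the bin values, regardless of their counts.
        this.min = Collections.min(values);
        this.max = Collections.max(values);

        int totalCount = 0;
        double weightedSum = 0d;
        for (int i = 0; i < values.size(); i++) {
            totalCount += counts.get(i);
            // Each bin contributes its value once per recorded observation.
            weightedSum += values.get(i) * counts.get(i);
        }
        this.count = totalCount;
        this.sum = weightedSum;
    }
}
```

For the bins in the example above (values 1, 1234, 7, 100 with counts 1, 2, 7, 10), this gives a count of 20, a sum of 3518, a minimum of 1, and a maximum of 1234.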

## Examples

Check out the [examples](https://github.com/awslabs/aws-embedded-metrics-java/tree/master/examples) directory to get started.
@@ -359,6 +359,7 @@ public MetricsLogger putMetric(String key, double value, AggregationType aggrega
* @throws InvalidMetricException if the metric is invalid
*/
public MetricsLogger setMetric(String key, Metric value) throws InvalidMetricException {
rwl.readLock().lock();
try {
this.context.setMetric(key, value);
return this;
@@ -19,6 +19,7 @@
public enum AggregationType {
LIST(0),
STATISTIC_SET(1),
HISTOGRAM(2),
UNKNOWN_TO_SDK_VERSION(-1);

private final int value;
160 changes: 160 additions & 0 deletions src/main/java/software/amazon/cloudwatchlogs/emf/model/Histogram.java
@@ -0,0 +1,160 @@
/* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License").
* You may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package software.amazon.cloudwatchlogs.emf.model;

import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonProperty;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import software.amazon.cloudwatchlogs.emf.Constants;
import software.amazon.cloudwatchlogs.emf.exception.InvalidMetricException;

/** Histogram metric type */
class Histogram extends Statistics {
Histogram(List<Double> values, List<Integer> counts) throws IllegalArgumentException {
if (counts.size() != values.size()) {
throw new IllegalArgumentException("Counts and values must have the same size");
}

if (values.stream().anyMatch(n -> n == null) || counts.stream().anyMatch(n -> n == null)) {
throw new IllegalArgumentException("Values and counts cannot contain null values");
}

if (!validSize(counts.size())) {
throw new IllegalArgumentException(
String.format(
"Histogram provided with %d bins but CloudWatch will drop Histograms with more than %d bins",
counts.size(), Constants.MAX_DATAPOINTS_PER_METRIC));
}

this.max = Collections.max(values);
this.min = Collections.min(values);
this.count = counts.stream().mapToInt(Integer::intValue).sum();
this.sum = 0d;
for (int i = 0; i < counts.size(); i++) {
this.sum += values.get(i) * counts.get(i);
}
this.counts = counts;
this.values = values;
}

Histogram() {
count = 0;
sum = 0.;
values = new ArrayList<>();
counts = new ArrayList<>();
}

@JsonProperty("Values")
public List<Double> values;

@JsonProperty("Counts")
public List<Integer> counts;

@JsonIgnore private boolean reduced = false;

@JsonIgnore private static final double EPSILON = 0.1;
@JsonIgnore private static final double BIN_SIZE = Math.log(1 + EPSILON);
@JsonIgnore private final Map<Double, Integer> buckets = new HashMap<>();

/**
* @param value the value to add to the histogram
* @throws InvalidMetricException if adding this value would increase the number of bins in the
* histogram to more than {@value Constants#MAX_DATAPOINTS_PER_METRIC}
* @see Constants#MAX_DATAPOINTS_PER_METRIC
*/
@Override
void addValue(double value) throws InvalidMetricException {
double bucket = getBucket(value);
// Validate against the live bucket map before mutating any state: the counts list is only
// refreshed by reduce(), so it does not reflect the current number of bins
if (!buckets.containsKey(bucket) && !validSize(buckets.size() + 1)) {
throw new InvalidMetricException(
String.format(
"Adding this value increases the number of bins in this histogram to %d"
+ ", CloudWatch will drop any Histogram metrics with more than %d bins",
buckets.size() + 1, Constants.MAX_DATAPOINTS_PER_METRIC));
}

reduced = false;
super.addValue(value);

// Add the value to the appropriate bucket (or create a new bucket if necessary)
buckets.merge(bucket, 1, Integer::sum);
}

/**
* Updates the Values and Counts lists to represent the buckets of this histogram.
*
* @return the reduced histogram
*/
Histogram reduce() {
if (reduced) {
return this;
}

this.values = new ArrayList<>(buckets.size());
this.counts = new ArrayList<>(buckets.size());

for (Map.Entry<Double, Integer> entry : buckets.entrySet()) {
this.values.add(entry.getKey());
this.counts.add(entry.getValue());
}

reduced = true;
return this;
}

/**
* Gets the value of the bucket for the given value.
*
* @param value the value to find the closest bucket for
* @return the value of the bucket the given value goes in
*/
private static double getBucket(double value) {
short index = (short) Math.floor(Math.log(value) / BIN_SIZE);
return Math.exp((index + 0.5) * BIN_SIZE);
}

private boolean validSize(int size) {
return size <= Constants.MAX_DATAPOINTS_PER_METRIC;
}

@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Histogram that = (Histogram) o;
return count == that.count
&& that.sum.equals(sum)
&& that.max.equals(max)
&& that.min.equals(min)
&& buckets.equals(that.buckets);
}

@Override
public int hashCode() {
return super.hashCode() + buckets.hashCode();
}
}
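
A short derivation of what `getBucket` implies for accuracy, offered as a sketch that assumes `value > 0` and ignores floating-point rounding. The symbols `b`, `i(v)`, and `r_i` are introduced here for illustration; `b` corresponds to `BIN_SIZE` and `\varepsilon` to `EPSILON` in the class above.

```latex
\begin{aligned}
b &= \ln(1+\varepsilon), \qquad \varepsilon = 0.1 \\
i(v) &= \left\lfloor \frac{\ln v}{b} \right\rfloor
  \quad\Longrightarrow\quad (1+\varepsilon)^{i} \le v < (1+\varepsilon)^{i+1} \\
r_i &= e^{(i+0.5)\,b} = (1+\varepsilon)^{i+1/2} \\
\frac{\lvert r_i - v \rvert}{v} &\le \sqrt{1+\varepsilon} - 1 \approx 0.0488
\end{aligned}
```

In other words, each reported bin value is the geometric midpoint of its bin, so it differs from any value recorded in that bin by less than about 5 percent.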
@@ -0,0 +1,131 @@
/*
* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License").
* You may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package software.amazon.cloudwatchlogs.emf.model;

import java.util.LinkedList;
import java.util.List;
import java.util.Queue;
import software.amazon.cloudwatchlogs.emf.Constants;
import software.amazon.cloudwatchlogs.emf.exception.InvalidMetricException;

/** Represents the Histogram of the EMF schema. */
public class HistogramMetric extends Metric<Histogram> {

Review comment (Contributor): class Histogram has less visibility than the member that uses it

HistogramMetric(
Unit unit,
StorageResolution storageResolution,
List<Double> values,
List<Integer> counts)
throws IllegalArgumentException {
this(unit, storageResolution, new Histogram(values, counts));
}

protected HistogramMetric(
String name, Unit unit, StorageResolution storageResolution, Histogram histogram) {
this.unit = unit;
this.storageResolution = storageResolution;
this.values = histogram;
this.name = name;
}

HistogramMetric(Unit unit, StorageResolution storageResolution, Histogram histogram) {
this.unit = unit;
this.storageResolution = storageResolution;
this.values = histogram;
}

@Override
protected Queue<Metric<Histogram>> serialize() throws InvalidMetricException {
// CloudWatch Logs rejects histograms that have more than
// Constants.MAX_DATAPOINTS_PER_METRIC bins. Unlike a MetricDefinition, a histogram cannot
// be split across multiple messages, so an error is raised to let users know that the
// metric will not be published
if (isOversized()) {
throw new InvalidMetricException(
String.format(
"Histogram metric, %s, has %d values which exceeds the maximum amount "
+ "of bins allowed, %d, and Histograms cannot be broken into "
+ "multiple metrics therefore it will not be published",
name, values.values.size(), Constants.MAX_DATAPOINTS_PER_METRIC));
}
Queue<Metric<Histogram>> metrics = new LinkedList<>();
metrics.offer(this);
return metrics;
}

@Override
protected boolean isOversized() {
return values.values.size() > Constants.MAX_DATAPOINTS_PER_METRIC;
}

@Override
public boolean hasValidValues() {
return values != null && values.count > 0 && !isOversized();
}

public static HistogramMetricBuilder builder() {
return new HistogramMetricBuilder();
}

public static class HistogramMetricBuilder
extends Metric.MetricBuilder<Histogram, HistogramMetricBuilder> {

@Override
protected HistogramMetricBuilder getThis() {
return this;
}

public HistogramMetricBuilder() {
this.values = new Histogram();
}

@Override
public Histogram getValues() {
rwl.readLock().lock();
try {
return values.reduce();
} finally {
rwl.readLock().unlock();
}
}

@Override
public HistogramMetricBuilder addValue(double value) {
rwl.readLock().lock();
try {
values.addValue(value);
return this;
} finally {
rwl.readLock().unlock();
}
}

@Override
public HistogramMetric build() {
rwl.writeLock().lock();
try {
values.reduce();
if (name == null) {
return new HistogramMetric(unit, storageResolution, values);
}
return new HistogramMetric(name, unit, storageResolution, values);
} finally {
rwl.writeLock().unlock();
}
}
}
}