Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
de249a6
Added base tests cases for aggregation functions
ivancea Jun 20, 2024
cc71f76
Merge branch 'main' into esql/aggregations-base-tests
ivancea Jun 21, 2024
04f6327
Unified aggregations with functions parent classes, and discarded str…
ivancea Jun 21, 2024
031988d
Added generated top_list docs, and ip_prefix link to docs
ivancea Jun 21, 2024
88e510e
Spotless apply
ivancea Jun 21, 2024
435eab2
Restored public finals
ivancea Jun 21, 2024
31241e2
Removed some finals to not change original code
ivancea Jun 21, 2024
8258564
Add test case for no input
ivancea Jun 21, 2024
7423f8f
Merge branch 'main' into esql/aggregations-base-tests
ivancea Jun 25, 2024
e4f8b89
Added null check tests
ivancea Jun 25, 2024
3b1291b
Remove unused imports
ivancea Jun 25, 2024
40f480f
Added random cases and multirow support
ivancea Jun 25, 2024
81f059b
Renamed base class Function to ScalarFunction
ivancea Jun 25, 2024
63e6d6d
Divided methods between classes
ivancea Jun 25, 2024
e1c1426
Adapted to work with evaluators and folding
ivancea Jun 26, 2024
80cc03b
Merge branch 'main' into esql/aggregations-base-tests
ivancea Jun 26, 2024
8be98ed
Merge branch 'esql/aggregations-base-tests' into esql/aggregations-ba…
ivancea Jun 26, 2024
a1d2437
Merge branch 'main' into esql/aggregations-base-tests-random-cases
ivancea Jul 1, 2024
7cd74fb
Moved toMultiRow to base class
ivancea Jul 1, 2024
36804bc
Format
ivancea Jul 1, 2024
8252962
Added tests and new docs to avg agg
ivancea Jul 1, 2024
7d52560
Merge branch 'main' into esql/aggregations-base-tests-random-cases
ivancea Jul 2, 2024
0153061
Merge branch 'main' into esql/aggregations-base-tests-random-cases
ivancea Jul 4, 2024
995276e
Added custom multirow test cases
ivancea Jul 4, 2024
dbe3435
Merge branch 'esql/aggregations-base-tests-random-cases' into avg-agg…
ivancea Jul 4, 2024
18b2be9
Updated avg tests with new cases
ivancea Jul 4, 2024
5392e92
Comment existing functions
ivancea Jul 4, 2024
d3fff55
Remove redundant "multiRow" from test case generators
ivancea Jul 5, 2024
289a827
Merge branch 'esql/aggregations-base-tests-random-cases' into avg-agg…
ivancea Jul 5, 2024
12b4908
Updated avg tests
ivancea Jul 5, 2024
4749b41
Format
ivancea Jul 5, 2024
e014d20
Merge branch 'esql/aggregations-base-tests-random-cases' into avg-agg…
ivancea Jul 5, 2024
e85e657
Merge branch 'main' into avg-agg-tests
ivancea Jul 8, 2024
3c65007
Merge branch 'main' into avg-agg-tests
elasticmachine Jul 8, 2024
c48b393
Use stream syntax for test cases
ivancea Jul 8, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions docs/reference/esql/functions/aggregation-functions.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@
The <<esql-stats-by>> command supports these aggregate functions:

// tag::agg_list[]
* <<esql-agg-avg>>
* <<esql-avg>>
* <<esql-agg-count>>
* <<esql-agg-count-distinct>>
* <<esql-agg-max>>
Expand All @@ -23,7 +23,6 @@ The <<esql-stats-by>> command supports these aggregate functions:
* experimental:[] <<esql-agg-weighted-avg>>
// end::agg_list[]

include::avg.asciidoc[]
include::count.asciidoc[]
include::count-distinct.asciidoc[]
include::max.asciidoc[]
Expand All @@ -33,6 +32,7 @@ include::min.asciidoc[]
include::percentile.asciidoc[]
include::st_centroid_agg.asciidoc[]
include::sum.asciidoc[]
include::layout/avg.asciidoc[]
include::layout/top.asciidoc[]
include::values.asciidoc[]
include::weighted-avg.asciidoc[]
47 changes: 0 additions & 47 deletions docs/reference/esql/functions/avg.asciidoc

This file was deleted.

5 changes: 5 additions & 0 deletions docs/reference/esql/functions/description/avg.asciidoc

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

22 changes: 22 additions & 0 deletions docs/reference/esql/functions/examples/avg.asciidoc

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

48 changes: 48 additions & 0 deletions docs/reference/esql/functions/kibana/definition/avg.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

11 changes: 11 additions & 0 deletions docs/reference/esql/functions/kibana/docs/avg.md

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

15 changes: 15 additions & 0 deletions docs/reference/esql/functions/layout/avg.asciidoc

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 6 additions & 0 deletions docs/reference/esql/functions/parameters/avg.asciidoc

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions docs/reference/esql/functions/signature/avg.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
11 changes: 11 additions & 0 deletions docs/reference/esql/functions/types/avg.asciidoc

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
import org.elasticsearch.xpack.esql.core.tree.Source;
import org.elasticsearch.xpack.esql.core.type.DataType;
import org.elasticsearch.xpack.esql.expression.SurrogateExpression;
import org.elasticsearch.xpack.esql.expression.function.Example;
import org.elasticsearch.xpack.esql.expression.function.FunctionInfo;
import org.elasticsearch.xpack.esql.expression.function.Param;
import org.elasticsearch.xpack.esql.expression.function.scalar.multivalue.MvAvg;
Expand All @@ -28,7 +29,20 @@
public class Avg extends AggregateFunction implements SurrogateExpression {
public static final NamedWriteableRegistry.Entry ENTRY = new NamedWriteableRegistry.Entry(Expression.class, "Avg", Avg::new);

@FunctionInfo(returnType = "double", description = "The average of a numeric field.", isAggregation = true)
@FunctionInfo(
returnType = "double",
description = "The average of a numeric field.",
isAggregation = true,
examples = {
@Example(file = "stats", tag = "avg"),
@Example(
description = "The expression can use inline functions. For example, to calculate the average "
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the existing examples here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

+ "over a multivalued column, first use `MV_AVG` to average the multiple values per row, "
+ "and use the result with the `AVG` function",
file = "stats",
tag = "docsStatsAvgNestedExpression"
) }
)
public Avg(Source source, @Param(name = "number", type = { "double", "integer", "long" }) Expression field) {
super(source, field);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,8 @@
import org.elasticsearch.compute.data.Page;
import org.elasticsearch.core.Releasables;
import org.elasticsearch.xpack.esql.core.expression.Expression;
import org.elasticsearch.xpack.esql.core.expression.FieldAttribute;
import org.elasticsearch.xpack.esql.core.expression.Literal;
import org.elasticsearch.xpack.esql.core.type.DataType;
import org.elasticsearch.xpack.esql.core.util.NumericUtils;
import org.elasticsearch.xpack.esql.expression.SurrogateExpression;
Expand Down Expand Up @@ -251,6 +253,13 @@ private void resolveExpression(Expression expression, Consumer<Expression> onAgg
expression = new FoldNull().rule(expression);
assertThat(expression.dataType(), equalTo(testCase.expectedType()));

assumeTrue(
"Surrogate expression with non-trivial children cannot be evaluated",
expression.children()
.stream()
.allMatch(child -> child instanceof FieldAttribute || child instanceof DeepCopy || child instanceof Literal)
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the right thing to do for now because it doesn't block the rest of the aggs. It's really a shame most of the tests for AVG don't pick up, but that's life sometimes. We'll get there.


if (expression instanceof AggregateFunction == false) {
onEvaluableExpression.accept(expression);
return;
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

package org.elasticsearch.xpack.esql.expression.function.aggregate;

import com.carrotsearch.randomizedtesting.annotations.Name;
import com.carrotsearch.randomizedtesting.annotations.ParametersFactory;

import org.elasticsearch.xpack.esql.core.expression.Expression;
import org.elasticsearch.xpack.esql.core.tree.Source;
import org.elasticsearch.xpack.esql.core.type.DataType;
import org.elasticsearch.xpack.esql.expression.function.AbstractAggregationTestCase;
import org.elasticsearch.xpack.esql.expression.function.MultiRowTestCaseSupplier;
import org.elasticsearch.xpack.esql.expression.function.TestCaseSupplier;

import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import static org.hamcrest.Matchers.equalTo;

public class AvgTests extends AbstractAggregationTestCase {
public AvgTests(@Name("TestCase") Supplier<TestCaseSupplier.TestCase> testCaseSupplier) {
this.testCase = testCaseSupplier.get();
}

@ParametersFactory
public static Iterable<Object[]> parameters() {
var suppliers = new ArrayList<TestCaseSupplier>();

Stream.of(
MultiRowTestCaseSupplier.intCases(1, 1000, Integer.MIN_VALUE, Integer.MAX_VALUE, true),
MultiRowTestCaseSupplier.longCases(1, 1000, Long.MIN_VALUE, Long.MAX_VALUE, true),
MultiRowTestCaseSupplier.doubleCases(1, 1000, -Double.MAX_VALUE, Double.MAX_VALUE, true)
).flatMap(List::stream).map(AvgTests::makeSupplier).collect(Collectors.toCollection(() -> suppliers));

suppliers.add(
// Folding
new TestCaseSupplier(
List.of(DataType.INTEGER),
() -> new TestCaseSupplier.TestCase(
List.of(TestCaseSupplier.TypedData.multiRow(List.of(200), DataType.INTEGER, "field")),
"Avg[field=Attribute[channel=0]]",
DataType.DOUBLE,
equalTo(200.)
)
)
);

return parameterSuppliersFromTypedDataWithDefaultChecks(suppliers);
}

@Override
protected Expression build(Source source, List<Expression> args) {
return new Avg(source, args.get(0));
}

private static TestCaseSupplier makeSupplier(TestCaseSupplier.TypedDataSupplier fieldSupplier) {
return new TestCaseSupplier(List.of(fieldSupplier.type()), () -> {
var fieldTypedData = fieldSupplier.get();

Object expected = switch (fieldTypedData.type().widenSmallNumeric()) {
case INTEGER -> fieldTypedData.multiRowData()
.stream()
.map(v -> (Integer) v)
.collect(Collectors.summarizingInt(Integer::intValue))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've found that sometimes we don't line up with these fancy methods in Collectors. So long as we do that's all good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this works with doubles, I guess the worst that could happen is we comparing doubles with an error margin (Which I think was commented already somewhere?). If some test fails because of that, we can use it as an example and do it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

.getAverage();
case LONG -> fieldTypedData.multiRowData()
.stream()
.map(v -> (Long) v)
.collect(Collectors.summarizingLong(Long::longValue))
.getAverage();
case DOUBLE -> fieldTypedData.multiRowData()
.stream()
.map(v -> (Double) v)
.collect(Collectors.summarizingDouble(Double::doubleValue))
.getAverage();
default -> throw new IllegalStateException("Unexpected value: " + fieldTypedData.type());
};

return new TestCaseSupplier.TestCase(
List.of(fieldTypedData),
"Avg[field=Attribute[channel=0]]",
DataType.DOUBLE,
equalTo(expected)
);
});
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
import java.util.Comparator;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import static org.hamcrest.Matchers.equalTo;
Expand All @@ -37,14 +38,15 @@ public static Iterable<Object[]> parameters() {

for (var limitCaseSupplier : TestCaseSupplier.intCases(1, 1000, false)) {
for (String order : List.of("asc", "desc")) {
for (var fieldCaseSupplier : Stream.of(
Stream.of(
MultiRowTestCaseSupplier.intCases(1, 1000, Integer.MIN_VALUE, Integer.MAX_VALUE, true),
MultiRowTestCaseSupplier.longCases(1, 1000, Long.MIN_VALUE, Long.MAX_VALUE, true),
MultiRowTestCaseSupplier.doubleCases(1, 1000, -Double.MAX_VALUE, Double.MAX_VALUE, true),
MultiRowTestCaseSupplier.dateCases(1, 1000)
).flatMap(List::stream).toList()) {
suppliers.add(TopTests.makeSupplier(fieldCaseSupplier, limitCaseSupplier, order));
}
)
.flatMap(List::stream)
.map(fieldCaseSupplier -> TopTests.makeSupplier(fieldCaseSupplier, limitCaseSupplier, order))
.collect(Collectors.toCollection(() -> suppliers));
}
}

Expand Down