ArcadeDB version
Observed on Docker images:
arcadedata/arcadedb:26.4.1-SNAPSHOT
arcadedata/arcadedb:26.4.2
Environment
- Host OS: Windows 10
- Architecture: x86_64
- Deployment: Docker
- ArcadeDB endpoint: HTTP /api/v1/command/arcade
- Request mode used for direct reproduction: language: cypher, serializer: studio
- Differential comparison target: Neo4j Docker (neo4j:latest)
Describe the bug
ArcadeDB may evaluate reduce(...) incorrectly when the list operand is produced by an aggregate expression inline in the same projection.
The problem is not reduce(...) by itself, and it is not the aggregate by itself. The boundary is much narrower:
- WITH ... AS ages RETURN reduce(... IN ages ...): works
- RETURN reduce(... IN collect(...) ...): does not work
- WITH ... AS age_sum RETURN reduce(... IN [age_sum] ...): works
- RETURN reduce(... IN [sum(...)] ...): does not work
So the issue appears to be specific to inline aggregate expressions nested directly inside the reduce(...) list operand.
To Reproduce
Setup:
CREATE (:Person {name:'Alice', age:30});
CREATE (:Person {name:'Bob', age:25});
CREATE (:Person {name:'Charlie', age:35});
Query:
MATCH (p:Person)
RETURN p.name AS name,
reduce(total = 0, n IN collect(p.age) | total + n) AS total_age_sum
ORDER BY name;
Expected behavior
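Because p.name is a non-aggregated projection, it acts as the grouping key. The names are unique, so each group holds one person, and each grouped row should reduce over its own one-element collected list: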
Alice, 30
Bob, 25
Charlie, 35
Observed Neo4j result:
Alice, 30
Bob, 25
Charlie, 35
Actual behavior
Observed ArcadeDB 26.4.1-SNAPSHOT and 26.4.2 result:
Alice, 0
Bob, 0
Charlie, 0
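Every row returns 0, which is the initial accumulator value. So the inline collect(p.age) is not being fed into reduce(...) correctly: reduce(...) behaves as if the list had no elements to iterate over.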
Control case
If the aggregate result is materialized first and only then passed into reduce(...), ArcadeDB behaves normally:
MATCH (p:Person)
WITH p.name AS name, collect(p.age) AS ages
RETURN name,
reduce(total = 0, n IN ages | total + n) AS total_age_sum
ORDER BY name;
Observed result on Neo4j and ArcadeDB 26.4.1-SNAPSHOT / 26.4.2:
Alice, 30
Bob, 25
Charlie, 35
This makes the boundary clear: reduce(...) itself works, and collect(...) itself works, but the inline aggregate expression inside reduce(...) does not.
Stronger reproducer
The same family also appears with other inline aggregate expressions, not just collect(...).
For example:
MATCH (p:Person)
RETURN p.name AS name,
reduce(total = 0, n IN [sum(p.age)] | total + n) AS s
ORDER BY name;
Observed Neo4j result:
Alice, 30
Bob, 25
Charlie, 35
Observed ArcadeDB 26.4.1-SNAPSHOT / 26.4.2 result:
Alice, 30
Bob, 55
Charlie, 90
The same cumulative pattern also appears with count(*):
MATCH (p:Person)
RETURN p.name AS name,
reduce(total = 0, n IN [count(*)] | total + n) AS s
ORDER BY name;
Observed Neo4j result:
Alice, 1
Bob, 1
Charlie, 1
Observed ArcadeDB 26.4.1-SNAPSHOT / 26.4.2 result:
Alice, 1
Bob, 2
Charlie, 3
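So this is not limited to collect(...). In both cases the ArcadeDB values are running totals across the ordered rows (30, 55, 90 and 1, 2, 3) instead of per-group values. More generally, inline aggregate expressions inside reduce(...) appear to be evaluated against the wrong state, possibly aggregate state carried over from one group to the next.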
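For comparison, the materialized shape described at the top of this report (WITH ... AS age_sum RETURN reduce(... IN [age_sum] ...)) can be written out in full for these two reproducers. This is a sketch, not a verified fix: the sum(...) form is the one reported above as working, while the count(*) variant and the alias name row_count are assumptions added here by analogy and were not tested as part of this report.

```cypher
// Materialize sum(p.age) per name first, then reduce over the alias.
MATCH (p:Person)
WITH p.name AS name, sum(p.age) AS age_sum
RETURN name,
       reduce(total = 0, n IN [age_sum] | total + n) AS s
ORDER BY name;

// Assumption: the same shape applied to count(*); not verified in this report.
MATCH (p:Person)
WITH p.name AS name, count(*) AS row_count
RETURN name,
       reduce(total = 0, n IN [row_count] | total + n) AS s
ORDER BY name;
```

If the second query also returns per-group values on ArcadeDB, that would further support the reading that only the inline placement of the aggregate triggers the problem.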
Additional failure mode
Without a grouping key, ArcadeDB can also silently lose the projected result column:
MATCH (p:Person)
RETURN reduce(total = 0, n IN collect(p.age) | total + n) AS total_age_sum;
Observed Neo4j result:
Observed ArcadeDB 26.4.1-SNAPSHOT / 26.4.2 result:
<row present, but `total_age_sum` missing>
This suggests the same underlying issue can show up either as a wrong numeric result or as a dropped projection value, depending on the aggregation shape.
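A materialized variant of this query, following the same pattern as the control case, may help isolate whether the dropped column is also tied to the inline aggregate. This is only a sketch: it was not run against ArcadeDB as part of this report, so whether it avoids the missing column is an open question.

```cypher
// Assumption: untested on ArcadeDB; mirrors the control case without a grouping key.
MATCH (p:Person)
WITH collect(p.age) AS ages
RETURN reduce(total = 0, n IN ages | total + n) AS total_age_sum;
```

Under standard Cypher semantics, with the three Person nodes from the setup, the collected list is the three ages and the reduce(...) should produce their sum, 30 + 25 + 35 = 90, in a single row.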