-
Notifications
You must be signed in to change notification settings - Fork 25.8k
Improve ES|QL rate performance for high-cardinality using local circuit breaker #140359
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Buildkite benchmark this with tsdb-metricsgen-240m-highcardinality please |
💚 Build Succeeded
This build ran two tsdb-metricsgen-240m-highcardinality benchmarks to evaluate performance impact of this PR. History |
|
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
dnhatn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have two comments, but the changes look great. Thank you for the investigation @JonasKunz!
| this.dateFactor = isDateNanos ? 1_000_000_000.0 : 1000.0; | ||
| try { | ||
| this.reducedStates = driverContext.bigArrays().newObjectArray(256); | ||
| this.reducedStates = bigArrays.newObjectArray(256); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should release localCircuitBreakerService if creating reducedStates fails.
| this.driverContext = driverContext; | ||
| this.bigArrays = driverContext.bigArrays(); | ||
| localCircuitBreakerService = new LocalCircuitBreaker.SingletonService( | ||
| driverContext.bigArrays().breakerService(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should pass the actual settings here, but we can do that in a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to, but I looked at DriverContext and it didn't expose the Settings unless I missed it.
If it's easy, I think we can fix it on this PR.
Do you have a suggestion on what we the best way to gain access to the real Settings here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are currently missing some wiring. We can move these settings to PlannerSettings and pass PlannerSettings to DriverContext. Let's address this in a follow-up.
Removes a bottleneck in the ES|QL rate implementation which was found via profiling:
I benchmarked the rate function locally with a high cardinality (2k series) series on fine granular time buckets.
This means that for the aggregation there are many groups and each group only contains little to medium amounts of data.
The profiling showed that the main bottleneck for this use-case was in fact the memory accounting for the
CircuitBreaker: We allocate multiple arrays for each group, which in turn means that we are hammering the underlying atomic counter with concurrent updates.This PR fixes this by making sure that we use the
LocalCircuitBreakerhere instead: This circuit breaker pre-allocates larger chunks of memory from the atomic counter, causing the number of updates to be significantly reduced.In my local benchmarks, this reduced the response time for the query by about ~30%.
Unfortunately those results don't exactly translate to our high cardinality CI benchmarks: Those benchmarks also have a high number of groups, but much more data per group, causing this problem to have a lower impact. For that reason we only see a 3~5% improvement here on CI for the high cardinality benchmarks.