Optimise memberlist kv store access by storing data unencoded. (#4345)
* Optimise memberlist kv store access by storing data unencoded.
The following profile data was taken from running 50 idle ingesters with
memberlist, with almost everything at default values (5s heartbeats):
```
52.16% mergeBytesValueForKey
+- 52.16% mergeValueForKey
+- 47.84% computeNewValue
+- 27.24% codec Proto Decode
+- 26.25% mergeWithTime
```
It is apparent from this that a lot of time is spent on the memberlist
receive path, as might be expected, specifically in merging the update
into the current state. The cost, however, is not in decoding the incoming
state (which occurs in `mergeBytesValueForKey` before `mergeValueForKey`),
but in decoding the _current state_ of the value in the store (as it is
stored encoded). The ring state was measured at 123K (50 ingesters), so it
makes sense that decoding could be costly.
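
To illustrate why the encoded store is expensive, here is a minimal sketch of a receive path that keeps its state as bytes. It is not the Cortex code: `ringDesc`, `encodedStore` and `mergeBytes` are hypothetical names, JSON stands in for the proto codec, and the whole merge runs under one mutex for brevity. What it shows is that every update pays for two decodes, one for the small incoming state and one for the full stored state.

```go
package main

import (
	"encoding/json"
	"sync"
)

// ringDesc is a hypothetical stand-in for the ring state:
// instance ID -> last heartbeat timestamp.
type ringDesc map[string]int64

// encodedStore keeps the current value as encoded bytes.
// JSON is used here only as a stand-in codec (assumption).
type encodedStore struct {
	mu      sync.Mutex
	current []byte
	decodes int // number of decode calls, for illustration only
}

// decode turns encoded bytes back into a ringDesc and counts the call.
func (s *encodedStore) decode(b []byte) (ringDesc, error) {
	s.decodes++
	d := ringDesc{}
	if len(b) == 0 {
		return d, nil
	}
	err := json.Unmarshal(b, &d)
	return d, err
}

// mergeBytes applies one incoming update to the stored state.
func (s *encodedStore) mergeBytes(incoming []byte) error {
	s.mu.Lock()
	defer s.mu.Unlock()

	in, err := s.decode(incoming) // usually small
	if err != nil {
		return err
	}
	cur, err := s.decode(s.current) // full state, decoded on every update
	if err != nil {
		return err
	}
	// Keep the newest heartbeat per instance.
	for id, ts := range in {
		if old, ok := cur[id]; !ok || ts > old {
			cur[id] = ts
		}
	}
	enc, err := json.Marshal(cur)
	if err != nil {
		return err
	}
	s.current = enc
	return nil
}
```

With a large stored state and frequent heartbeats, the second `decode` call is the one that would dominate a profile like the one above.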
This can be avoided by storing the value in its decoded `Mergeable` form.
When doing this, care has to be taken to deep copy the value when it is
accessed, as it is modified in place before being updated in the store,
and is accessed outside the store mutex.
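
The deep-copy requirement can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation in this change: the `Mergeable` interface here is a hypothetical, trimmed-down version with only `Merge` and `Clone`, and `ringState`, `decodedStore`, `get`, `merge` and the version counter are invented for the example. The point is that readers only ever receive a copy, so modifying it in place outside the mutex cannot corrupt the stored value.

```go
package main

import "sync"

// Mergeable is a hypothetical, trimmed-down interface for this sketch.
type Mergeable interface {
	// Merge folds other into the receiver in place and reports changes.
	Merge(other Mergeable) (changed bool)
	// Clone returns a deep copy of the receiver.
	Clone() Mergeable
}

// ringState is a stand-in value: instance ID -> heartbeat timestamp.
type ringState map[string]int64

func (r ringState) Merge(other Mergeable) bool {
	o, ok := other.(ringState)
	if !ok {
		return false
	}
	changed := false
	for id, ts := range o {
		if old, found := r[id]; !found || ts > old {
			r[id] = ts
			changed = true
		}
	}
	return changed
}

func (r ringState) Clone() Mergeable {
	c := make(ringState, len(r))
	for id, ts := range r {
		c[id] = ts
	}
	return c
}

// decodedStore holds the value in decoded form, so no decode on merge.
type decodedStore struct {
	mu      sync.Mutex
	value   Mergeable
	version uint
}

// get returns a deep copy of the value and its version.
func (s *decodedStore) get() (Mergeable, uint) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.value == nil {
		return nil, s.version
	}
	return s.value.Clone(), s.version
}

// merge modifies a private copy outside the mutex, then swaps it in
// only if nobody else updated the store in the meantime.
func (s *decodedStore) merge(incoming Mergeable) bool {
	for {
		cur, ver := s.get()
		changed := true
		if cur == nil {
			cur = incoming.Clone()
		} else {
			changed = cur.Merge(incoming)
		}
		if !changed {
			return false
		}
		s.mu.Lock()
		if s.version == ver {
			s.value, s.version = cur, ver+1
			s.mu.Unlock()
			return true
		}
		s.mu.Unlock() // lost a race; retry against the newer value
	}
}
```

The trade-off in this sketch is one in-memory copy per access instead of one decode per update.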
Note a side effect of this change is that it is no longer straightforward
to expose the `memberlist_kv_store_value_bytes` metric, as this reported
the size of the encoded data; it has therefore been removed.
Signed-off-by: Steve Simpson <[email protected]>
* Typo.
Signed-off-by: Steve Simpson <[email protected]>
* Review comments.
Signed-off-by: Steve Simpson <[email protected]>
CHANGELOG.md (+2)
@@ -4,6 +4,7 @@
 * [FEATURE] Ruler: Add new `-ruler.query-stats-enabled` which when enabled will report the `cortex_ruler_query_seconds_total` as a per-user metric that tracks the sum of the wall time of executing queries in the ruler in seconds. #4317

 * [CHANGE] Querier / ruler: Change `-querier.max-fetched-chunks-per-query` configuration to limit to maximum number of chunks that can be fetched in a single query. The number of chunks fetched by ingesters AND long-term storare combined should not exceed the value configured on `-querier.max-fetched-chunks-per-query`. #4260
+* [CHANGE] Memberlist: the `memberlist_kv_store_value_bytes` has been removed due to values no longer being stored in-memory as encoded bytes. #4345
 * [ENHANCEMENT] Add timeout for waiting on compactor to become ACTIVE in the ring. #4262
 * [ENHANCEMENT] Reduce memory used by streaming queries, particularly in ruler. #4341
 * [ENHANCEMENT] Ring: allow experimental configuration of disabling of heartbeat timeouts by setting the relevant configuration value to zero. Applies to the following: #4342
@@ -13,6 +14,7 @@
   * `-alertmanager.sharding-ring.heartbeat-timeout`
   * `-compactor.ring.heartbeat-timeout`
   * `-store-gateway.sharding-ring.heartbeat-timeout`
+* [ENHANCEMENT] Memberlist: optimized receive path for processing ring state updates, to help reduce CPU utilization in large clusters. #4345
 * [BUGFIX] HA Tracker: when cleaning up obsolete elected replicas from KV store, tracker didn't update number of cluster per user correctly. #4336