Skip to content

Commit c09d6ee

Browse files
authored
Merge pull request #1560 from cortexproject/hand-over-doc
Document the ingester hand-over process
2 parents 4f397ee + 45d191a commit c09d6ee

File tree

3 files changed

+57
-0
lines changed

3 files changed

+57
-0
lines changed

docs/architecture.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -65,6 +65,8 @@ Ingesters are semi-stateful in that they always retain the last 12 hours worth o
6565

6666
As *semi*-stateful processes, ingesters are *not* designed to be long-term data stores. In Cortex, that role is played by the [chunk store](#chunk-store).
6767

68+
A [hand-over process](ingester-handover.md) manages the state when ingesters are added, removed or replaced.
69+
6870
#### Write de-amplification
6971

7072
Ingesters store the last 12 hours worth of samples in order to perform **write de-amplification**, i.e. batching and compressing samples for the same series and flushing them out to the [chunk store](#chunk-store). Under normal operations, there should be *many* orders of magnitude fewer queries per second (QPS) worth of writes to the chunk store than to the ingesters.

docs/arguments.md

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,7 @@
11
# Cortex Arguments Explained
22

3+
Duration arguments should be specified with a unit like `5s` or `3h`. Valid time units are "ms", "s", "m", "h".
4+
35
## Querier
46

57
- `-querier.max-concurrent`
@@ -155,6 +157,14 @@ It also talks to a KVStore and has it's own copies of the same flags used by the
155157

156158
## Ingester
157159

160+
- `-ingester.join-after`
161+
162+
How long to wait in PENDING state during the [hand-over process](ingester-handover.md). (default 0s)
163+
164+
- `-ingester.ingester.max-transfer-retries`
165+
166+
How many times a LEAVING ingester tries to find a PENDING ingester during the [hand-over process](ingester-handover.md). Each attempt takes a second or so. (default 10)
167+
158168
- `-ingester.normalise-tokens`
159169

160170
Write out "normalised" tokens to the ring. Normalised tokens consume less memory to encode and decode; as the ring is unmarshalled regularly, this significantly reduces memory usage of anything that watches the ring.

docs/ingester-handover.md

Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,45 @@
1+
# Ingester Hand-over
2+
3+
The [ingester](architecture.md#ingester) holds several hours of sample
4+
data in memory. When we want to shut down an ingester, either for
5+
software version update or to drain a node for maintenance, this data
6+
must not be discarded.
7+
8+
Each ingester goes through different states in its lifecycle. When
9+
working normally, the state is `ACTIVE`.
10+
11+
On start-up, an ingester first goes into state `PENDING`. After a
12+
[short time](arguments.md#ingester), if nothing happens, it adds
13+
itself to the ring and goes into state ACTIVE.
14+
15+
A running ingester is notified to shut down by Unix signal
16+
`SIGINT`. On receipt of this signal it goes into state `LEAVING` and
17+
looks for an ingester in state `PENDING`. If it finds one, that
18+
ingester goes into state `JOINING` and the leaver transfers all its
19+
in-memory data over to the joiner. On successful transfer the leaver
20+
removes itself from the ring and exits and the joiner changes to
21+
`ACTIVE`, taking over ownership of the leaver's
22+
[ring tokens](architecture.md#hashing).
23+
24+
If a leaving ingester does not find a pending ingester after [several
25+
attempts](arguments.md#ingester), it will flush all of its chunks to
26+
the backing database, then remove itself from the ring and exit. This
27+
may take tens of minutes to complete.
28+
29+
During hand-over, neither the leaving nor joining ingesters will
30+
accept new samples. Distributors are aware of this, and "spill" the
31+
samples to the next ingester in the ring. This creates a set of extra
32+
"spilled" chunks which will idle out and flush after hand-over is
33+
complete. The sudden increase in flush queue can be alarming!
34+
35+
The following metrics can be used to observe this process:
36+
37+
- `cortex_member_ring_tokens_owned` - how many tokens each ingester thinks it owns
38+
- `cortex_ring_tokens_owned` - how many tokens each ingester is seen to own by other components
39+
- `cortex_ring_member_ownership_percent` same as `cortex_ring_tokens_owned` but expressed as a percentage
40+
- `cortex_ring_members` - how many ingesters can be seen in each state, by other components
41+
- `cortex_ingester_sent_chunks` - number of chunks sent by leaving ingester
42+
- `cortex_ingester_received_chunks` - number of chunks received by joining ingester
43+
44+
You can see the current state of the ring via http browser request to
45+
`/ring` on a distributor.

0 commit comments

Comments
 (0)