| 1 | +# Ingester Hand-over |
| 2 | + |
| 3 | +The [ingester](architecture.md#ingester) holds several hours of sample |
| 4 | +data in memory. When we want to shut down an ingester, either for |
| 5 | +a software version update or to drain a node for maintenance, this data |
| 6 | +must not be discarded. |
| 7 | + |
| 8 | +Each ingester goes through different states in its lifecycle. When |
| 9 | +working normally, the state is `ACTIVE`. |
| 10 | + |
| 11 | +On start-up, an ingester first goes into state `PENDING`. After a |
| 12 | +short time, if nothing happens, it adds itself to the ring and goes |
| 13 | +into state `ACTIVE`. |
| 14 | + |
| 15 | +A running ingester is notified to shut down by Unix signal |
| 16 | +`SIGINT`. On receipt of this signal it goes into state `LEAVING` and |
| 17 | +looks for an ingester in state `PENDING`. If it finds one, that |
| 18 | +ingester goes into state `JOINING` and the leaver transfers all its |
| 19 | +in-memory data over to the joiner. On successful transfer the leaver |
| 20 | +removes itself from the ring and exits, and the joiner changes to |
| 21 | +`ACTIVE`, taking over ownership of the leaver's |
| 22 | +[ring tokens](architecture.md#hashing). |
| 23 | + |
| 24 | +If a leaving ingester does not find a pending ingester, it will flush |
| 25 | +all of its chunks to the backing database, then remove itself from the |
| 26 | +ring and exit. This may take tens of minutes to complete. |
| 27 | + |
| 28 | +During hand-over, neither the leaving nor the joining ingester will |
| 29 | +accept new samples. Distributors are aware of this, and "spill" the |
| 30 | +samples to the next ingester in the ring. This creates a set of extra |
| 31 | +"spilled" chunks which will idle out and flush after hand-over is |
| 32 | +complete. The sudden increase in the flush queue can be alarming! |
| 33 | + |
| 34 | +The following metrics can be used to observe this process: |
| 35 | + |
| 36 | + - `cortex_member_ring_tokens_owned` - how many tokens each ingester thinks it owns |
| 37 | + - `cortex_ring_tokens_owned` - how many tokens each ingester is seen to own by other components |
| 38 | + - `cortex_ring_member_ownership_percent` - same as `cortex_ring_tokens_owned` but expressed as a percentage |
| 39 | + - `cortex_ring_members` - how many ingesters can be seen in each state, by other components |
| 40 | + - `cortex_ingester_sent_chunks` - number of chunks sent by leaving ingester |
| 41 | + - `cortex_ingester_received_chunks` - number of chunks received by joining ingester |
| 42 | + |
| 43 | +You can see the current state of the ring by opening |
| 44 | +`/ring` on a distributor in a web browser. |
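
The lifecycle described in this document can be read as a small state machine. The sketch below is only an illustration of those transitions, not the ingester's actual implementation: the event names (`timeout`, `claimed_by_leaver`, `transfer_done`, `sigint`, `flush_done`), the `EXITED` pseudo-state, and the `next_state` helper are all hypothetical and were chosen for this example.

```python
from enum import Enum


class State(Enum):
    PENDING = "PENDING"
    JOINING = "JOINING"
    ACTIVE = "ACTIVE"
    LEAVING = "LEAVING"
    EXITED = "EXITED"  # hypothetical marker for "process has exited", not a ring state


# (current state, event) -> next state.
# Event names are assumptions made for this sketch only.
TRANSITIONS = {
    # Nothing happened for a short time: add self to the ring.
    (State.PENDING, "timeout"): State.ACTIVE,
    # A leaving ingester picked this pending ingester for hand-over.
    (State.PENDING, "claimed_by_leaver"): State.JOINING,
    # Transfer finished: the joiner takes over the leaver's tokens.
    (State.JOINING, "transfer_done"): State.ACTIVE,
    # Shutdown requested by signal.
    (State.ACTIVE, "sigint"): State.LEAVING,
    # Hand-over succeeded: the leaver removes itself from the ring and exits.
    (State.LEAVING, "transfer_done"): State.EXITED,
    # No pending ingester found: chunks were flushed to the backing database.
    (State.LEAVING, "flush_done"): State.EXITED,
}


def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"invalid transition: {state.name} on {event!r}") from None


if __name__ == "__main__":
    # Walk a joiner through a hand-over.
    s = State.PENDING
    for event in ("claimed_by_leaver", "transfer_done"):
        s = next_state(s, event)
        print(event, "->", s.name)
```

Keeping the transitions in a single table makes it easy to check that every path out of `LEAVING` ends with the ingester gone from the ring, whether by hand-over or by flushing.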