Commit c29b1b0

Review feedback.
Signed-off-by: Goutham Veeramachaneni <[email protected]>
1 parent b2dacf4 commit c29b1b0

1 file changed

+8
-8
lines changed

docs/ha-pair-handling.md

Lines changed: 8 additions & 8 deletions
@@ -1,22 +1,22 @@
1-
# Config for sending HA Pairs data to cortex
1+
# Config for sending HA Pairs data to Cortex
22

33
## Context
44

5-
With Prometheus, you can have more than a single prometheus monitoring and ingesting the same metrics for redundancy. Cortex already does replication for redundancy and it doesn't make sense to ingest the same data twice in cortex. So in cortex, we made sure we can dedupe the data we receive from HA Pairs of Prometheus. We do this via the following:
5+
You can have more than a single Prometheus monitoring and ingesting the same metrics for redundancy. Cortex already does replication for redundancy and it doesn't make sense to ingest the same data twice. So in Cortex, we made sure we can dedupe the data we receive from HA Pairs of Prometheus. We do this via the following:
66

7-
Assume that there are two teams, each running their own prometheus, monitoring different services. Let's call the Prometheis T1 and T2. Now, if the teams are running HA pairs, let's call the individual Prometheis, T1.a, T1.b and T2.a and T2.b.
7+
Assume that there are two teams, each running their own Prometheus, monitoring different services. Let's call the Prometheis T1 and T2. Now, if the teams are running HA pairs, let's call the individual Prometheis, T1.a, T1.b and T2.a and T2.b.
88

9-
In cortex we make sure we only ingest from one of T1.a and T1.b, and only from one of T2.a and T2.b. We do this by electing a leader replica for each cluster of Prometheus. For example, in the case of T1, let it be T1.a. As long as T1.a is the leader, we drop the samples sent by T1.b. And if cortex sees no new samples from T1.a for a short period (30s by default), it'll switch the leader to be T1.b.
9+
In Cortex we make sure we only ingest from one of T1.a and T1.b, and only from one of T2.a and T2.b. We do this by electing a leader replica for each cluster of Prometheus. For example, in the case of T1, let it be T1.a. As long as T1.a is the leader, we drop the samples sent by T1.b. And if Cortex sees no new samples from T1.a for a short period (30s by default), it'll switch the leader to be T1.b.
1010

11-
This means if T1.a goes down for 10 mins and comes back, we will no longer be accepting samples from T1.a, we will be accepting from T1.b and dropping the samples from T1.a. This way we can preserve the HA redundancy behaviour and make sure we're only accepting samples from a single replica and also we don't drop too much data in case of issues. Please note that with the default scrape period of 15s, you'd ideally be losing the metrics from only a single scrape in case we need to switch leaders. Your rate windows should be atleast 4x the scrape period to make sure you can tolerate this potentially rare occurrence.
11+
This means that if T1.a goes down for a few minutes, Cortex's HA sample handling will have switched and elected T1.b as the leader. This failover timeout is what lets us accept samples from only a single replica at a time while ensuring we don't drop too much data in case of issues. Note that with the default scrape period of 15s and the default timeouts in Cortex, in most cases you'll lose only a single scrape of data during a leader election failover. For any rate queries, the rate window should be at least 4x the scrape period to account for these failover scenarios. For example, with the default scrape period of 15s, you should calculate rates over at least 1m periods.
1212

1313
Now we do the same leader election process for T2.
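
The election and failover behaviour described above can be sketched as a small toy model. This is not Cortex's implementation: the `HATracker` class, its `accept` method, and the `min_rate_window_s` helper are hypothetical names made up for illustration, and the sketch keeps state in a plain in-memory dictionary. Only the 30s failover period, the 15s scrape period, and the 4x rule come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class HATracker:
    """Toy model of per-cluster leader election (hypothetical, not Cortex code)."""

    failover_timeout_s: float = 30.0  # "30s by default" in the text
    # cluster -> (elected replica, time its last sample was accepted)
    _leaders: Dict[str, Tuple[str, float]] = field(default_factory=dict)

    def accept(self, cluster: str, replica: str, now_s: float) -> bool:
        """Return True if the sample is ingested, False if it is dropped."""
        leader = self._leaders.get(cluster)
        if leader is None:
            # First replica seen for this cluster becomes the leader.
            self._leaders[cluster] = (replica, now_s)
            return True
        elected, last_seen_s = leader
        if replica == elected:
            # Samples from the leader are ingested and refresh its timestamp.
            self._leaders[cluster] = (replica, now_s)
            return True
        if now_s - last_seen_s >= self.failover_timeout_s:
            # Leader has been silent for the whole timeout: fail over.
            self._leaders[cluster] = (replica, now_s)
            return True
        # Leader is still healthy: drop the other replica's sample.
        return False


def min_rate_window_s(scrape_period_s: float) -> float:
    """Smallest rate window suggested by the 4x rule of thumb in the text."""
    return 4 * scrape_period_s


tracker = HATracker()
tracker.accept("T1", "a", 0.0)   # T1.a is elected, sample ingested
tracker.accept("T1", "b", 5.0)   # dropped, T1.a is the leader
tracker.accept("T1", "a", 15.0)  # ingested, then T1.a goes down
tracker.accept("T1", "b", 50.0)  # 35s of silence from T1.a: T1.b takes over
tracker.accept("T1", "a", 60.0)  # T1.a is back, but is now dropped
```

Each cluster is tracked independently, so the state for T2 is unaffected by what happens to T1. With a 15s scrape period, `min_rate_window_s(15.0)` gives 60 seconds, which matches the 1m window recommended above.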
1414

1515
## Config
1616

1717
### Client Side
1818

19-
So for cortex to achieve this, we need 2 identifiers for each process, one identifier for the cluster (T1 or T2, etc) and one identifier to identify the replica in the cluster (a or b). We do this by setting the external labels, ideally `cluster` and `replica`. For example:
19+
So for Cortex to achieve this, we need 2 identifiers for each process: one identifier for the cluster (T1 or T2, etc.) and one identifier for the replica within the cluster (a or b). The easiest way to do this is by setting external labels, ideally `cluster` and `replica` (note the default is `__replica__`). For example:
2020

2121
```
2222
cluster: prom-team1
@@ -32,9 +32,9 @@ replica: replica2
3232

3333
Note: These are external labels and have nothing to do with remote_write config.
3434

35-
Now these two label-names are totally configurable on Cortex's end, and should be set to something sensible. For example, cluster label is already used by some workloads, and you should set the label to be something else but uniquely identifies the cluster. Good examples for this label-name would be `team`, `cluster`, `prometheus`, etc.
35+
These two label names are configurable per-tenant within Cortex, and should be set to something sensible. For example, if the `cluster` label is already used by some workloads, set the label name to something else that still uniquely identifies the cluster. Good examples for this label name would be `team`, `cluster`, `prometheus`, etc.
3636

37-
And coming to the replica label, the name is totally configurable again and should be set so that the value for each prometheus to be unique in that cluster. Note: Cortex drops this label when ingesting data, but preserves the cluster label. This way, your timeseries won't change when replicas change.
37+
The replica label should be set so that the value for each Prometheus is unique in that cluster. Note: Cortex drops this label when ingesting data, but preserves the cluster label. This way, your timeseries won't change when replicas change.
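
To illustrate why dropping the replica label keeps series stable, here is a minimal sketch. The `split_ha_labels` function is a hypothetical helper, not a Cortex API. The label names follow the text (`cluster`, and `__replica__` as the default replica label), and the `replica1` value and the `up` metric are assumed for the example.

```python
from typing import Dict, Tuple


def split_ha_labels(
    labels: Dict[str, str],
    cluster_label: str = "cluster",
    replica_label: str = "__replica__",
) -> Tuple[str, str, Dict[str, str]]:
    """Return (cluster, replica, labels as stored), dropping only the replica label."""
    cluster = labels[cluster_label]
    replica = labels[replica_label]
    # The cluster label is preserved; the replica label is removed.
    stored = {k: v for k, v in labels.items() if k != replica_label}
    return cluster, replica, stored


# Two replicas of the same cluster sending the same series (values are illustrative).
from_r1 = {"__name__": "up", "cluster": "prom-team1", "__replica__": "replica1"}
from_r2 = {"__name__": "up", "cluster": "prom-team1", "__replica__": "replica2"}

_, _, stored_r1 = split_ha_labels(from_r1)
_, _, stored_r2 = split_ha_labels(from_r2)
# stored_r1 and stored_r2 are identical, so a leader switch does not create a new series.
```

Because the stored label sets are equal, queries keep returning one continuous series no matter which replica is currently elected.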
3838

3939
### Server Side
4040
