Do not consider PENDING state as healthy #1866

pracucci · 2019-12-02T09:35:34Z

What this PR does:
While working on PR #1818, I realized that an ingester is considered healthy for read operations while PENDING but not while JOINING. From my perspective, an ingester shouldn't be considered healthy while PENDING, so I'm suggesting the following:

Do not consider PENDING state as healthy
Explicitly enumerate healthy ingester states for the read path

In the read path, there are some API endpoints, like LabelNames and LabelValuesForLabelName, for which we query all healthy ingesters (see distributor.forAllIngesters()). A PENDING ingester shouldn't hold any data yet, so there should be no need to query it.
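To make the "explicitly enumerate healthy states" idea concrete, here is a minimal Go sketch. It is not the code from this PR: the State and Operation types and the isHealthyState function are hypothetical stand-ins, the heartbeat timeout check is left out, and the choice of ACTIVE and LEAVING as the healthy read states (and ACTIVE only for writes) is an assumption.

```go
package main

import "fmt"

// State is a hypothetical stand-in for the ingester state stored in the ring.
type State int

const (
	PENDING State = iota
	JOINING
	ACTIVE
	LEAVING
)

// Operation distinguishes the read path from the write path.
type Operation int

const (
	Read Operation = iota
	Write
)

// isHealthyState lists the healthy states for each operation explicitly,
// so every "true" case is visible at a glance. The heartbeat check that a
// real health check would also need is intentionally omitted.
func isHealthyState(s State, op Operation) bool {
	switch op {
	case Write:
		// Assumption: only ACTIVE ingesters accept writes.
		return s == ACTIVE
	case Read:
		// Assumption: only ACTIVE and LEAVING ingesters can hold data
		// worth querying; PENDING and JOINING are excluded.
		return s == ACTIVE || s == LEAVING
	}
	return false
}

func main() {
	for _, s := range []State{PENDING, JOINING, ACTIVE, LEAVING} {
		fmt.Println(s, isHealthyState(s, Read))
	}
}
```

Listing the healthy states, rather than the unhealthy ones, means a newly introduced state is unhealthy by default until someone opts it in.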

One comment I received in a previous discussion on this topic is that this change may introduce quorum issues during an ingester rollout. However, I can't see a real difference compared to when the ingester switches from PENDING to JOINING, given that we already consider the JOINING state unhealthy.

Which issue(s) this PR fixes:
No issue

Checklist

Tests updated
Documentation added
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

pstibrany · 2019-12-02T09:50:38Z

Personally, I would find the IsHealthy method easier to read if the "true" cases were more explicit. As it is now, each time I read it, I find it confusing. :-(

gouthamve

I think this is better, yeah.

csmarchbanks · 2019-12-06T16:50:44Z

I am concerned that this could break queries after a single ingester failure. If an ingester pod vanishes (node issue or something), it will still be unhealthy in the ring, and now the new pending pod will also be considered unhealthy, causing all reads to fail. Would you be able to test that?
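The arithmetic behind this concern can be sketched in a few lines of Go. This assumes, hypothetically, a replication factor of 3 and a read path that tolerates at most replicationFactor/2 unhealthy ingesters; tooManyUnhealthy is an illustrative helper, not a function from the codebase.

```go
package main

import "fmt"

// tooManyUnhealthy reports whether a query to all ingesters would fail,
// assuming the ring tolerates at most replicationFactor/2 unhealthy ones.
func tooManyUnhealthy(unhealthy, replicationFactor int) bool {
	maxErrors := replicationFactor / 2
	return unhealthy > maxErrors
}

func main() {
	const replicationFactor = 3 // hypothetical value

	// Only the vanished ingester is unhealthy: reads still succeed.
	fmt.Println(tooManyUnhealthy(1, replicationFactor))

	// The vanished ingester plus its PENDING replacement are both
	// counted as unhealthy: reads fail.
	fmt.Println(tooManyUnhealthy(2, replicationFactor))
}
```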

bboreham · 2019-12-09T18:47:13Z

I think Chris is correct, and I think the solution is not to count pending ingesters as an error.
That implies a change to Ring.GetAll().
Maybe we can lose reallyAll at the same time.
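A rough Go sketch of that idea, under stated assumptions: getAll, Ingester, and the Alive field (standing in for the heartbeat check) are hypothetical, and the replicationFactor/2 error budget is assumed rather than taken from the actual Ring.GetAll() code. PENDING ingesters are skipped without consuming the error budget, while JOINING and dead ingesters still count as errors.

```go
package main

import (
	"errors"
	"fmt"
)

// State is a hypothetical stand-in for the ingester state stored in the ring.
type State int

const (
	PENDING State = iota
	JOINING
	ACTIVE
	LEAVING
)

// Ingester is a simplified, hypothetical ring entry.
type Ingester struct {
	Addr  string
	State State
	Alive bool // stand-in for a recent-heartbeat check
}

// getAll returns the ingesters to query for a read that must hit all of them.
// PENDING ingesters are skipped and do not count as errors.
func getAll(ingesters []Ingester, replicationFactor int) ([]Ingester, error) {
	maxErrors := replicationFactor / 2 // assumed error budget
	healthy := make([]Ingester, 0, len(ingesters))

	for _, ing := range ingesters {
		if ing.State == PENDING {
			// Holds no data yet: neither queried nor counted as a failure.
			continue
		}
		if !ing.Alive || ing.State == JOINING {
			maxErrors--
			continue
		}
		healthy = append(healthy, ing)
	}

	if maxErrors < 0 {
		return nil, errors.New("too many failed ingesters")
	}
	return healthy, nil
}

func main() {
	ring := []Ingester{
		{Addr: "ingester-1", State: ACTIVE, Alive: true},
		{Addr: "ingester-2", State: ACTIVE, Alive: false}, // vanished pod
		{Addr: "ingester-2-new", State: PENDING, Alive: true},
		{Addr: "ingester-3", State: ACTIVE, Alive: true},
	}
	targets, err := getAll(ring, 3)
	fmt.Println(len(targets), err)
}
```

In this scenario the vanished pod uses up the whole error budget, and the PENDING replacement is ignored instead of tipping the read into failure.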

In light of new information :)

stale · 2020-03-17T20:29:15Z

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

pracucci · 2020-07-01T14:26:29Z

Closing for now

Explicitly enumerate healthy ingester states for the read path and do…

987a231

… not consider PENDING state as healthy Signed-off-by: Marco Pracucci <[email protected]>

gouthamve previously approved these changes Dec 6, 2019

pstibrany mentioned this pull request Dec 6, 2019

Explicit healty states #1890

Merged

3 tasks

pracucci mentioned this pull request Feb 1, 2020

Incrementally transfer chunks per token to improve handover #1764

Closed

stale bot added the stale label Mar 17, 2020

gouthamve added the keepalive Skipped by stale bot label Mar 17, 2020

stale bot removed the stale label Mar 17, 2020

pstibrany mentioned this pull request May 4, 2020

Update loki to cortex master grafana/loki#2030

Merged

pracucci closed this Jul 1, 2020
