-
Notifications
You must be signed in to change notification settings - Fork 816
More visibility on ingesters #68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Fixes #59
If we're only going to error anyway, might as well do it quickly.
b57aef3
to
c891ab5
Compare
@@ -263,6 +263,10 @@ func (d *Distributor) Query(ctx context.Context, from, to model.Time, matchers . | |||
return err | |||
} | |||
|
|||
if len(ingesters) < d.cfg.MinReadSuccesses { | |||
return fmt.Errorf("Could only find %d ingesters for query. Need at least %d", len(ingesters), d.cfg.MinReadSuccesses) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Start error
strings lower-cased.
👍 besides nit |
ch <- prometheus.MustNewConstMetric( | ||
d.ingestersAlive, | ||
prometheus.GaugeValue, | ||
float64(len(d.cfg.Ring.GetAll(d.cfg.HeartbeatTimeout))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This metric might strengthen the argument of moving the heartbeat timeout into the ring completely. We have the total ingesters metric in the ring, but the one about alive ones in the distributor, bleh. Just something to keep in mind though, fine for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as the Ring in inextricably tied to IngesterDesc, sure, why not?
👍 |
I guess this fixes #60. Gives us a measure of non-dead ingesters.
Also, I noticed that the reason we'd be getting
nil
errors on query failures is that we haven't got enough ingesters, so I added an early failure path with a better error message.I thought about adding a crappy status page at
/ringz
or something, but I would have had to pierce through a layer or two of abstraction and decided that it wasn't worth the effort. Happy to revisit if people think otherwise.Assumes #65
This change is