Skip to content

More visibility on ingesters #68

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Oct 25, 2016
Merged

More visibility on ingesters #68

merged 8 commits into from
Oct 25, 2016

Conversation

jml
Copy link
Contributor

@jml jml commented Oct 24, 2016

I guess this fixes #60. Gives us a measure of non-dead ingesters.

Also, I noticed that the reason we'd be getting nil errors on query failures is that we haven't got enough ingesters, so I added an early failure path with a better error message.

I thought about adding a crappy status page at /ringz or something, but I would have had to pierce through a layer or two of abstraction and decided that it wasn't worth the effort. Happy to revisit if people think otherwise.

Assumes #65


This change is Reviewable

@jml jml force-pushed the 60-more-visibility branch from b57aef3 to c891ab5 Compare October 24, 2016 17:57
@@ -263,6 +263,10 @@ func (d *Distributor) Query(ctx context.Context, from, to model.Time, matchers .
return err
}

if len(ingesters) < d.cfg.MinReadSuccesses {
return fmt.Errorf("Could only find %d ingesters for query. Need at least %d", len(ingesters), d.cfg.MinReadSuccesses)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Start error strings lower-cased.

@juliusv
Copy link
Contributor

juliusv commented Oct 24, 2016

👍 besides nit

ch <- prometheus.MustNewConstMetric(
d.ingestersAlive,
prometheus.GaugeValue,
float64(len(d.cfg.Ring.GetAll(d.cfg.HeartbeatTimeout))),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This metric might strengthen the argument of moving the heartbeat timeout into the ring completely. We have the total ingesters metric in the ring, but the one about alive ones in the distributor, bleh. Just something to keep in mind though, fine for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as the Ring in inextricably tied to IngesterDesc, sure, why not?

@juliusv
Copy link
Contributor

juliusv commented Oct 24, 2016

👍

@jml jml merged commit eafc3f6 into master Oct 25, 2016
@jml jml deleted the 60-more-visibility branch October 25, 2016 08:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Insufficient visibility into distributor ring state
2 participants