Skip to content

Horizontally Scalable Ruler #1258

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 17, 2019

Conversation

jtlisi
Copy link
Contributor

@jtlisi jtlisi commented Mar 5, 2019

This PR provides a flag that allows the ruler to shard ownership of which rule groups to evaluate by re-using the ring code. When enabled each ruler will act as a Lifecycler and interact with consul to create a ring.

TO DO

  • Add tests for the ruler ring
  • Add better instrumentation for monitoring the distributed ruler

BREAKING CHANGES

Metrics deleted:

rules_queue_length

Metrics renamed:

Old New
cortex_ingester_consul_heartbeats_total cortex_member_consul_heartbeats_total
cortex_ingester_ring_tokens_owned cortex_member_ring_tokens_owned
cortex_ingester_ring_tokens_to_own cortex_member_ring_tokens_to_own
cortex_ring_ingester_ownership_percent cortex_ring_member_ownership_percent
cortex_ring_ingesters cortex_ring_members

The ingester label on some metrics is renamed member.
All of the above and cortex_shutdown_duration_seconds add a name label

@khaines
Copy link
Contributor

khaines commented Mar 9, 2019

Thank you for working on this change! I'll be very happy to test this when it is ready :)

@rsteneteg FYI, since we've discussed this as a scaling contention point in the past

@jtlisi jtlisi force-pushed the 20190219_distributed_ruler branch from db8fcc0 to 4c54de9 Compare March 11, 2019 19:34
@tomwilkie
Copy link
Contributor

Does this also fix #310?

@bboreham
Copy link
Contributor

Fixes #1213.

@jtlisi jtlisi marked this pull request as ready for review April 5, 2019 16:31
@tomwilkie
Copy link
Contributor

This looks really good! Lots of small nits and we need to chat about how we handle config on Cortex. When you get this green & all the points address, give me a ping and we can do a final pass.

@jtlisi jtlisi force-pushed the 20190219_distributed_ruler branch from c5774e8 to 5a8d444 Compare April 11, 2019 00:38
@jtlisi
Copy link
Contributor Author

jtlisi commented Apr 12, 2019

@tomwilkie I think this is ready for another look

@jtlisi jtlisi force-pushed the 20190219_distributed_ruler branch from 94f7902 to 06db06f Compare April 24, 2019 19:25
Copy link
Contributor

@gouthamve gouthamve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through the whole thing and found nothing wrong :) But this is also my first pass over this codebase. Given this has been running for us for a few weeks now, I'm giving it an LGTM.

@jtlisi jtlisi force-pushed the 20190219_distributed_ruler branch from 06db06f to bdefa7b Compare May 13, 2019 19:31
@jtlisi jtlisi force-pushed the 20190219_distributed_ruler branch 2 times, most recently from a35a3f4 to 1bbde65 Compare May 16, 2019 22:31
@jtlisi jtlisi force-pushed the 20190219_distributed_ruler branch from 1bbde65 to 12aec3d Compare May 16, 2019 22:58
@jtlisi
Copy link
Contributor Author

jtlisi commented May 16, 2019

I rebased this on top of master to work with a single binary. This should be ready for another look @tomwilkie

@tomwilkie
Copy link
Contributor

LGTM! Thanks for you work on this Jacob.

@tomwilkie tomwilkie merged commit a50412e into cortexproject:master May 17, 2019
@tomwilkie tomwilkie deleted the 20190219_distributed_ruler branch May 17, 2019 13:24
@tomwilkie tomwilkie mentioned this pull request May 18, 2019
@bboreham
Copy link
Contributor

I have edited the description to mention some breaking changes to metrics.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants