Add a worker time-spent-idle counter to the ruler #727

Merged
1 commit merged into master from worker-idle-time-counter
Feb 28, 2018

@leth (Contributor) commented Feb 28, 2018

The blocked_workers metric can't give us a good indication of how busy the ruler is: work happens every 15s and we sample every ~15s, so each sample tends to land at the same point in the work cycle.
The evalLatency metric indicates when evaluations are late, i.e. when we have reached/exceeded our capacity, but doesn't indicate how close we are to it in normal circumstances.

Using an idle time counter should give us an idea of how much slack we have in the queue processing; if it approaches 0, then we know our workers are at capacity.
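
As a rough illustration of the idea (not the code in this PR), a worker can time how long it blocks waiting for work and add that to a counter. The `idleCounter` type, `worker` function and `demo` helper below are hypothetical names, and a plain mutex-guarded float stands in for a real Prometheus counter so the sketch needs only the standard library:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// idleCounter is a hypothetical stand-in for a Prometheus counter. It only
// ever increases and holds the total seconds workers have spent waiting.
type idleCounter struct {
	mu      sync.Mutex
	seconds float64
}

// Add records d of idle time. Non-positive durations are ignored so the
// total stays monotonic.
func (c *idleCounter) Add(d time.Duration) {
	if d <= 0 {
		return
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.seconds += d.Seconds()
}

// Value returns the accumulated idle time in seconds.
func (c *idleCounter) Value() float64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.seconds
}

// worker runs jobs until the channel is closed, timing each wait for work.
func worker(jobs <-chan func(), idle *idleCounter, wg *sync.WaitGroup) {
	defer wg.Done()
	for {
		waitStart := time.Now()
		job, ok := <-jobs // while blocked here the worker is idle
		idle.Add(time.Since(waitStart))
		if !ok {
			return
		}
		job()
	}
}

// demo leaves one worker idle for about 50ms, gives it a 10ms job, and
// returns the idle seconds recorded.
func demo() float64 {
	idle := &idleCounter{}
	jobs := make(chan func())
	var wg sync.WaitGroup
	wg.Add(1)
	go worker(jobs, idle, &wg)

	time.Sleep(50 * time.Millisecond)
	jobs <- func() { time.Sleep(10 * time.Millisecond) }
	close(jobs)
	wg.Wait()
	return idle.Value()
}

func main() {
	fmt.Printf("idle seconds: %.3f\n", demo())
}
```

Only the time spent blocked on the channel is counted; time spent inside `job()` is not, so a worker that is always busy adds almost nothing to the total.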

Since counters are monotonic, this should allow us to measure utilisation without hitting the same sampling issue.
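
To make the arithmetic concrete, here is a minimal sketch of turning two samples of such a counter into a utilisation figure. The `utilisation` function and its parameters are assumptions for illustration; it also assumes the counter was not reset between the two samples:

```go
package main

import "fmt"

// utilisation estimates the fraction of worker capacity in use between two
// samples of a cumulative idle-seconds counter (hypothetical helper).
// idleStart and idleEnd are the counter values at the two sample times,
// intervalSeconds is the time between them, and workers is the pool size.
func utilisation(idleStart, idleEnd, intervalSeconds float64, workers int) float64 {
	if intervalSeconds <= 0 || workers <= 0 {
		return 0
	}
	// Idle worker-seconds accumulated per second of wall-clock time.
	idleRate := (idleEnd - idleStart) / intervalSeconds
	u := 1 - idleRate/float64(workers)
	// Clamp to [0, 1] to absorb timing jitter between samples.
	if u < 0 {
		return 0
	}
	if u > 1 {
		return 1
	}
	return u
}

func main() {
	// 4 workers, 15s between samples, 30 idle worker-seconds accumulated.
	fmt.Println(utilisation(100, 130, 15, 4)) // prints 0.5
}
```

Because the counter accumulates everything that happened between the two samples, the result does not depend on where in the 15s work cycle each sample falls.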

@leth requested review from jml, bboreham and a team February 28, 2018 08:48
@leth merged commit 275f773 into master Feb 28, 2018
@leth deleted the worker-idle-time-counter branch February 28, 2018 11:17