DynamoDB auto-scaling should use max(queue) not sum #1812

Closed
bboreham opened this issue Nov 12, 2019 · 3 comments

@bboreham
Contributor

We had a situation where one ingester was somehow being throttled more than the others. The overall queue didn't get big enough to trigger a scale-up, but that one ingester OOMed.

Any lines like this should use max(), not sum():

defaultQueueLenQuery = `sum(avg_over_time(cortex_ingester_flush_queue_length{job="cortex/ingester"}[2m]))`
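
To make the difference concrete, here is a small Go sketch with made-up per-ingester queue lengths. The helper names `sumQueue` and `maxQueue`, the sample numbers, and the constant name `maxQueueLenQuery` are assumptions for illustration, not code from Cortex. The query string is simply the one above with `sum` swapped for `max`.

```go
package main

import "fmt"

// Hypothetical variant of the query above with sum() swapped for max().
// The constant name is an assumption, not an identifier from Cortex.
const maxQueueLenQuery = `max(avg_over_time(cortex_ingester_flush_queue_length{job="cortex/ingester"}[2m]))`

// sumQueue mirrors what sum() reports: the total across all ingesters.
func sumQueue(lengths []float64) float64 {
	var total float64
	for _, l := range lengths {
		total += l
	}
	return total
}

// maxQueue mirrors what max() reports: the single worst ingester.
// It returns 0 for an empty slice.
func maxQueue(lengths []float64) float64 {
	var worst float64
	for _, l := range lengths {
		if l > worst {
			worst = l
		}
	}
	return worst
}

func main() {
	// Made-up sample: nine healthy ingesters and one heavily throttled one.
	lengths := []float64{10, 10, 10, 10, 10, 10, 10, 10, 10, 5000}
	fmt.Println("query:", maxQueueLenQuery)
	fmt.Printf("sum=%.0f max=%.0f\n", sumQueue(lengths), maxQueue(lengths))
}
```

With a target sized for the whole cluster, the sum can stay below the threshold while one ingester carries almost all of the backlog. The max exposes that outlier directly.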

We need to be a bit careful releasing that change, as it gives a very different meaning to --metrics.target-queue-length. Now that I think about it, maybe we could have both: a target value that guides gentle scaling, and a maximum value for any one ingester that triggers more urgent measures.
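
One way the "have both" idea might look is sketched below in Go. This is not the Cortex table manager's code. The `Action` type, the `decide` function, and both threshold parameters are hypothetical names. The sketch assumes the existing target keeps its current meaning (compared against the total queue) and a separate per-ingester limit is compared against the worst single queue.

```go
package main

import "fmt"

// Action is a hypothetical scaling decision, not a type from Cortex.
type Action int

const (
	NoChange Action = iota
	GentleScaleUp
	UrgentScaleUp
)

func (a Action) String() string {
	switch a {
	case GentleScaleUp:
		return "gentle scale-up"
	case UrgentScaleUp:
		return "urgent scale-up"
	default:
		return "no change"
	}
}

// decide checks the per-ingester limit first so that one overloaded
// ingester triggers urgent action even when the total looks healthy.
// targetTotal plays the role of the existing target queue length;
// maxPerIngester is an assumed new setting.
func decide(lengths []float64, targetTotal, maxPerIngester float64) Action {
	var total, worst float64
	for _, l := range lengths {
		total += l
		if l > worst {
			worst = l
		}
	}
	switch {
	case worst > maxPerIngester:
		return UrgentScaleUp
	case total > targetTotal:
		return GentleScaleUp
	default:
		return NoChange
	}
}

func main() {
	// Made-up numbers: the total is far below target, but one queue is huge.
	lengths := []float64{10, 10, 10, 5000}
	fmt.Println(decide(lengths, 100000, 1000))
}
```

Keeping the total-based target unchanged would avoid silently redefining the existing flag, while the per-ingester limit covers the case described at the top of this issue.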

See also #921

@stale

stale bot commented Feb 3, 2020

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Feb 3, 2020
@bboreham
Contributor Author

bboreham commented Feb 5, 2020

Let's keep this alive at least a little longer.

@stale stale bot removed the stale label Feb 5, 2020
@stale

stale bot commented Apr 5, 2020

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 5, 2020
@stale stale bot closed this as completed Apr 20, 2020