Ingesters are routinely taking longer than 20mins to flush. #158
The code to manage the tables will need to:
So, when can we guarantee no more writes will go to the old table? If the max chunk age is fixed (it isn't right now), then we can - see #127.
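A minimal sketch of the guarantee being described, assuming the fixed max chunk age from #127; the names and the extra flush-delay margin are hypothetical, not the actual Cortex code:

```go
// canStopWriting reports whether a table whose time range ends at tableEnd
// can no longer receive writes: once every chunk that could overlap the
// table has aged out and been flushed, the table is effectively sealed.
// maxChunkAge is the fixed bound discussed in #127; maxFlushDelay is a
// hypothetical safety margin for in-flight flushes.
package main

import (
	"fmt"
	"time"
)

func canStopWriting(tableEnd, now time.Time, maxChunkAge, maxFlushDelay time.Duration) bool {
	return now.After(tableEnd.Add(maxChunkAge + maxFlushDelay))
}

func main() {
	tableEnd := time.Date(2017, 1, 8, 0, 0, 0, 0, time.UTC)
	now := time.Date(2017, 1, 9, 12, 0, 0, 0, time.UTC)
	fmt.Println(canStopWriting(tableEnd, now, 12*time.Hour, time.Hour)) // true
}
```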
I made some cosmetic edits to the original post to make it a bit easier to read.
I'm pretty sure I understood the logic for this last year, but I've forgotten. Can you please explain a bit more?
I'd prefer a new, distinct job for the following reasons:
More interesting is the number of writes per chunk: http://frontend.dev.weave.works/admin/grafana/dashboard/file/cortex-chunks.json?panelId=8&fullscreen&from=1483430633952&to=1483459433952 It peaked at over 30 whilst flushing an ingester that had been running for a while; it needs to be less than 10 for the pricing to work out.
See https://aws.amazon.com/premiumsupport/knowledge-center/throttled-ddb/
The dev table was 900GB, and each shard is 10GB, so we had 90 shards. At 2000 write capacity, that's only ~22 writes/shard. The entropy we use (user id, metric name) to distribute amongst the ingesters is the same entropy we use in the hash (distribution) key in DynamoDB - (user id, metric name and time) - so when shutting down one ingester, we shouldn't expect a uniform distribution of writes across the shards, hence we can't saturate our provisioned capacity when flushing. Migrating to an empty table (with a single shard) allowed us to get much closer to the provisioned throughput, which IMO confirms this theory.
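To make the capacity arithmetic concrete, a quick back-of-the-envelope using the figures quoted in this thread (the constant names are mine; the 10GB-per-shard figure is the one above):

```go
// Back-of-the-envelope: shard count and the per-shard write ceiling for
// the 900GB dev table, versus an empty table with a single shard.
package main

import "fmt"

func main() {
	const (
		tableSizeGB       = 900.0
		shardSizeGB       = 10.0   // DynamoDB splits partitions at roughly this size
		provisionedWrites = 2000.0 // write capacity units on the table
	)
	shards := tableSizeGB / shardSizeGB
	fmt.Printf("shards: %.0f, write ceiling per shard: %.1f/s\n", shards, provisionedWrites/shards)
	// => shards: 90, write ceiling per shard: 22.2/s
	//
	// An empty table has a single shard, so the whole 2000 writes/s is
	// available to it - which matches what we saw after migrating.
}
```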
I am also cool with this.
Re: aligning buckets and tables, discussed on slack and decided this is not necessary (and is hard). Instead, we'll go with the rule that a bucket lives in the table containing the bucket's start time - allowing bucket time ranges to 'span' table time ranges, but only be written once. Something along the lines of:
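A minimal illustration of that rule, assuming weekly tables named prefix + week number (the prefix, period, and function names here are hypothetical, not the actual Cortex schema):

```go
// tableForBucket implements the rule above: a bucket is written to the
// table whose time range contains the bucket's start time, even if the
// bucket's end spills over into the next table's range.
package main

import (
	"fmt"
	"time"
)

const (
	tablePrefix = "cortex_"          // hypothetical table-name prefix
	tablePeriod = 7 * 24 * time.Hour // weekly tables
)

func tableForBucket(bucketStart time.Time) string {
	week := bucketStart.Unix() / int64(tablePeriod/time.Second)
	return fmt.Sprintf("%s%d", tablePrefix, week)
}

func main() {
	// A bucket starting just before a table boundary is still written to
	// that table, even though most of its range may fall in the next one.
	fmt.Println(tableForBucket(time.Date(2017, 1, 3, 23, 0, 0, 0, time.UTC)))
}
```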
Can we change that?
We can, if you can think of a better scheme.
@jml More entropy = better distribution, but harder to find things again. Would need a cleverer indexing scheme for queries, and we haven't been able to come up with one yet.
Yeah, I'm thinking about it now, and basically anything that lets an ingester find it again by computation doesn't actually change the entropy.
Although I guess you could do something crazy and append a random number between [0, 4) on write and then read from all 4 every time.
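For what it's worth, a minimal sketch of that write-sharding idea; the key layout, names, and the factor of 4 are illustrative only, not a proposed schema:

```go
// Widen the hash key with a random suffix in [0, 4) on write, and fan the
// query out to all four suffixes on read.
package main

import (
	"fmt"
	"math/rand"
)

const shardFactor = 4

// writeKey spreads writes for one (user, metric) pair across shardFactor
// DynamoDB hash keys, at the cost of extra read fan-out.
func writeKey(userID, metricName string) string {
	return fmt.Sprintf("%s:%s:%d", userID, metricName, rand.Intn(shardFactor))
}

// readKeys returns every hash key a query has to consult.
func readKeys(userID, metricName string) []string {
	keys := make([]string, 0, shardFactor)
	for i := 0; i < shardFactor; i++ {
		keys = append(keys, fmt.Sprintf("%s:%s:%d", userID, metricName, i))
	}
	return keys
}

func main() {
	fmt.Println(writeKey("user1", "http_requests_total"))
	fmt.Println(readKeys("user1", "http_requests_total"))
}
```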
Yup, we could do that. I'd prefer to investigate alternative indexing schemes first though.
(And by that I mean #13)
Yeah, except #13 would only help with the ingester load balancing, not DynamoDB (more entropy in DynamoDB would then require a multi-stage index, probably not helping the cause).
Going to use a different ticket to track the weekly buckets work, as this ticket tracks a bunch of stuff: #189 |
#201 is probably a major cause. |
With #201 fixed, and the daily buckets and weekly tables, ingesters are taking 10 mins to flush in dev! |
Is in prod, should go live tomorrow. |
Discussed on slack this morning; there are 2 things going on here: (1) we write to DynamoDB too many times per chunk, and (2) those writes are poorly balanced across the DynamoDB shards, so we can't saturate the provisioned capacity.
To help with (1), we plan to reduce the number of writes to DynamoDB by:
To help with (2), we plan to move to weekly DynamoDB tables, to minimise the number of shards per table and reduce the impact of poorly-balanced writes to the DynamoDB shards. We will need:
Other considerations:
Open questions: