-
Notifications
You must be signed in to change notification settings - Fork 816
Updated Cortex architecture doc and documented the experimental blocks storage #1945
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated Cortex architecture doc and documented the experimental blocks storage #1945
Conversation
Nice work with the docs. Frontend section is looking good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As there's plenty of changes in this document, I want to avoid giving an all-at-once review.
Is this is the kind of feedback you expected? Let me know and I'll keep going.
Overall, fantastic job @pracucci 👌
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor nit
around some wording otherwise LGTM from the TSDB side of things.
689c800
to
f86df9c
Compare
Thanks @owen-d and @gotjosh. I've addressed your comments. May you re-check it, please?
@gotjosh Yes, definitely. I'm just a bit dubious about the "Blocks" and "Chunks" uppercase letter. If we decide to go into this direction, it's better if we decide the rationale and I take care of all the changes cause there are many (more than then the places you edited). So far, I've kept them lowercase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry Marco, for leaving you with more comments 🙈. Safe to say my comments are optional - feel free to pick the ones that make sense to you.
Overall, great job on these docs - the improvement is superb! 🍾
|
||
#### The hash ring | ||
|
||
A consistent hash ring is stored in [Consul](https://www.consul.io/) as a single key-value pair, with the ring data structure also encoded as a [Protobuf](https://developers.google.com/protocol-buffers/) message. The consistent hash ring consists of a list of tokens and ingesters. Hashed values are looked up in the ring; the replication set is built for the closest unique ingesters by token. One of the benefits of this system is that adding and remove ingesters results in only 1/_N_ of the series being moved (where _N_ is the number of ingesters). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, this section is one slightly more complex pieces of Cortex. You've done a fantastic job trying to expand on it, but I'm not sure about the way it reads.
If you feel strongly about it we can leave it as is, if you have doubts - I'm happy to take a stab and try to reword it and perhaps add a small diagram to it. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I never feel very strong about something, and I'm always open to discussion 😉 That being said, I would appreciate if could express the same concept in a easier way.
About the diagram, sure. I also thought about it, but I didn't (yet) because I'm not sure how to draw a diagram in a way which will be easily editable by other contributors in the future (ie. which tool to use? where should we share the sources?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to use mermaid for that exact same reasons. The source code is extremely simple which makes it easy to check it into version control and they have a live editor which means anyone can copy/paste then start editing right away.
Let's merge this as is and I'll create a follow-up issue assigned to me to come up with something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have created #1949 to record this conversation. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's good not blocking this PR with the improved ring description. I will comment the issue.
@pracucci I'll leave that in your hands then. My opinion is that if these components have their own semantics they should be consider proper nouns e.g. A Block within Cortex does not fall under the regular definition of the word. As I linked in the Prometheus doc, not capitalizing makes sense if we refer to them as adjectives e.g. block files or index files - in every other case, they should. |
f399e7e
to
ac912b6
Compare
Ok @gotjosh. May you check it again, please? I've capitalized them - when they're nouns (at least I believe they are) - in the commit 0e481f3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed the architecture doc from Storage to right before Quorum consistency. Looks good from my side 👌
docs/architecture.md
Outdated
|
||
## Services | ||
|
||
Cortex has a service-based architecture, in which the overall system is split up into a variety of components that perform specific tasks and run separately (and potentially in parallel). | ||
Cortex has a service-based architecture, in which the overall system is split up into a variety of components that perform a specific task, run separately, and in parallel. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe mention that you can run all of them together in a single process mode too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. What's about now?
docs/architecture.md
Outdated
|
||
> This hashing scheme was originally chosen to reduce the number of required ingesters on the query path. The trade-off, however, is that the write load on the ingesters is less even. | ||
The first hashing strategy (default) was originally chosen to reduce the number of required ingesters on the query path, but running Cortex in production showed that an evenly distributed write load on the ingesters is preferable at the cost of always querying most of the ingesters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say that this is mostly us tbh. We might also be the only ones who are running it this way. Others are still using the defaults and it works for them. For us, it was sparked by one big customer that had an huge number of metrics with the same label-name.
I think the doc makes it sound like the shard-by-all should be the way to go for everyone, but for most people the defaults should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I simplified it, removing "opinions" and just leaving the trade-off mention.
|
||
### Query frontend | ||
Contrary to the sole replication and given the persistent local disk data is not lost, in the event of multiple ingesters failure each ingester will recover the in-memory series samples from WAL upon subsequent restart. The replication is still recommended in order to ensure no temporary failures on the read path in the event of a single ingester failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Contrary to the sole replication
--> what does this mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It means, "differently then when you setup a cluster only with the replication and without the WAL"
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
0e481f3
to
72e47bf
Compare
Signed-off-by: Marco Pracucci <[email protected]>
@gotjosh @gouthamve I've addressed your comments and replied to a couple. May you take another look, please? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good from here @pracucci! 🥇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks!
What this PR does:
In this PR:
Notes to the reviewer:
Which issue(s) this PR fixes:
N/A
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]