Skip to content

Updated Cortex architecture doc and documented the experimental blocks storage #1945

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

pracucci
Copy link
Contributor

@pracucci pracucci commented Jan 2, 2020

What this PR does:
In this PR:

  • Added missing TSDB-related changes (already merged) to changelog
  • Updated Cortex architecture doc
  • Documented the experimental blocks storage

Notes to the reviewer:

  • I suggest to read the updated architecture doc entirely and not from the diff, to have a better overview of how has been covered
  • Mentioned the WAL, since I do expect to be merged soon.
  • Mentioned the blocks compactor, which I hope to see merged soon (PR Introduce TSDB blocks compactor #1942).
  • I've specified the blocks storage config giving preference to YAML and mentioning each related CLI flag. I've used the Prometheus config documentation syntax (also used by Loki). If there are no objection, I would like to submit a subsequent PR to document the entire Cortex config the same way.

Which issue(s) this PR fixes:
N/A

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@pracucci
Copy link
Contributor Author

pracucci commented Jan 2, 2020

Specific requests:

  • @codesome may you double check the WAL section, please?
  • @owen-d may you double check the query frontend section, please?
  • @jtlisi may you help me enriching the ruler and alert manager doc, please?
  • @thorfour may you review the TSDB related stuff?

@owen-d
Copy link
Contributor

owen-d commented Jan 2, 2020

Nice work with the docs. Frontend section is looking good.

Copy link
Contributor

@gotjosh gotjosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there's plenty of changes in this document, I want to avoid giving an all-at-once review.

Is this is the kind of feedback you expected? Let me know and I'll keep going.

Overall, fantastic job @pracucci 👌

Copy link
Contributor

@thorfour thorfour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit around some wording otherwise LGTM from the TSDB side of things.

@pracucci pracucci force-pushed the doc-experimental-tsdb-so-far branch 4 times, most recently from 689c800 to f86df9c Compare January 2, 2020 16:30
@pracucci
Copy link
Contributor Author

pracucci commented Jan 2, 2020

Thanks @owen-d and @gotjosh. I've addressed your comments. May you re-check it, please?

Is this is the kind of feedback you expected? Let me know and I'll keep going.

@gotjosh Yes, definitely. I'm just a bit dubious about the "Blocks" and "Chunks" uppercase letter. If we decide to go into this direction, it's better if we decide the rationale and I take care of all the changes cause there are many (more than then the places you edited). So far, I've kept them lowercase.

Copy link
Contributor

@gotjosh gotjosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry Marco, for leaving you with more comments 🙈. Safe to say my comments are optional - feel free to pick the ones that make sense to you.

Overall, great job on these docs - the improvement is superb! 🍾


#### The hash ring

A consistent hash ring is stored in [Consul](https://www.consul.io/) as a single key-value pair, with the ring data structure also encoded as a [Protobuf](https://developers.google.com/protocol-buffers/) message. The consistent hash ring consists of a list of tokens and ingesters. Hashed values are looked up in the ring; the replication set is built for the closest unique ingesters by token. One of the benefits of this system is that adding and remove ingesters results in only 1/_N_ of the series being moved (where _N_ is the number of ingesters).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, this section is one slightly more complex pieces of Cortex. You've done a fantastic job trying to expand on it, but I'm not sure about the way it reads.

If you feel strongly about it we can leave it as is, if you have doubts - I'm happy to take a stab and try to reword it and perhaps add a small diagram to it. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I never feel very strong about something, and I'm always open to discussion 😉 That being said, I would appreciate if could express the same concept in a easier way.

About the diagram, sure. I also thought about it, but I didn't (yet) because I'm not sure how to draw a diagram in a way which will be easily editable by other contributors in the future (ie. which tool to use? where should we share the sources?).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to use mermaid for that exact same reasons. The source code is extremely simple which makes it easy to check it into version control and they have a live editor which means anyone can copy/paste then start editing right away.

Let's merge this as is and I'll create a follow-up issue assigned to me to come up with something.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have created #1949 to record this conversation. Let me know what you think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's good not blocking this PR with the improved ring description. I will comment the issue.

@gotjosh
Copy link
Contributor

gotjosh commented Jan 2, 2020

@gotjosh Yes, definitely. I'm just a bit dubious about the "Blocks" and "Chunks" uppercase letter. If we decide to go into this direction, it's better if we decide the rationale and I take care of all the changes cause there are many (more than then the places you edited). So far, I've kept them lowercase.

@pracucci I'll leave that in your hands then. My opinion is that if these components have their own semantics they should be consider proper nouns e.g. A Block within Cortex does not fall under the regular definition of the word.

As I linked in the Prometheus doc, not capitalizing makes sense if we refer to them as adjectives e.g. block files or index files - in every other case, they should.

@pracucci pracucci force-pushed the doc-experimental-tsdb-so-far branch from f399e7e to ac912b6 Compare January 2, 2020 18:43
@pracucci
Copy link
Contributor Author

pracucci commented Jan 3, 2020

As I liked in the Prometheus doc, not capitalizing makes sense if we refer to them as adjectives e.g. block files or index files - in every other case, they should.

Ok @gotjosh. May you check it again, please? I've capitalized them - when they're nouns (at least I believe they are) - in the commit 0e481f3

Copy link
Contributor

@gotjosh gotjosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the architecture doc from Storage to right before Quorum consistency. Looks good from my side 👌


## Services

Cortex has a service-based architecture, in which the overall system is split up into a variety of components that perform specific tasks and run separately (and potentially in parallel).
Cortex has a service-based architecture, in which the overall system is split up into a variety of components that perform a specific task, run separately, and in parallel.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mention that you can run all of them together in a single process mode too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. What's about now?


> This hashing scheme was originally chosen to reduce the number of required ingesters on the query path. The trade-off, however, is that the write load on the ingesters is less even.
The first hashing strategy (default) was originally chosen to reduce the number of required ingesters on the query path, but running Cortex in production showed that an evenly distributed write load on the ingesters is preferable at the cost of always querying most of the ingesters.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that this is mostly us tbh. We might also be the only ones who are running it this way. Others are still using the defaults and it works for them. For us, it was sparked by one big customer that had an huge number of metrics with the same label-name.

I think the doc makes it sound like the shard-by-all should be the way to go for everyone, but for most people the defaults should be fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simplified it, removing "opinions" and just leaving the trade-off mention.


### Query frontend
Contrary to the sole replication and given the persistent local disk data is not lost, in the event of multiple ingesters failure each ingester will recover the in-memory series samples from WAL upon subsequent restart. The replication is still recommended in order to ensure no temporary failures on the read path in the event of a single ingester failure.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contrary to the sole replication --> what does this mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means, "differently then when you setup a cluster only with the replication and without the WAL"

@pracucci pracucci force-pushed the doc-experimental-tsdb-so-far branch from 0e481f3 to 72e47bf Compare January 7, 2020 11:32
Signed-off-by: Marco Pracucci <[email protected]>
@pracucci
Copy link
Contributor Author

pracucci commented Jan 7, 2020

@gotjosh @gouthamve I've addressed your comments and replied to a couple. May you take another look, please?

Copy link
Contributor

@gotjosh gotjosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good from here @pracucci! 🥇

Copy link
Contributor

@bboreham bboreham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks!

@bboreham bboreham merged commit 1a3b008 into cortexproject:master Jan 7, 2020
@pracucci pracucci deleted the doc-experimental-tsdb-so-far branch January 7, 2020 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants