Flush tokens to disk #1750

Merged
merged 21 commits into from
Dec 5, 2019

@codesome codesome commented Oct 23, 2019

This PR implements flushing of tokens to the disk and reading them back again when a pod is restarted/upgraded. This goes with this design of retaining tokens required for WAL in ingesters.

Signed-off-by: Ganesh Vernekar <[email protected]>
@codesome codesome marked this pull request as ready for review October 23, 2019 15:54
@codesome

And it is ready for review

@pracucci pracucci left a comment

@codesome I left a few comments and would be glad if you could take a look. I will do a deeper review as the next step 🙏

@pracucci pracucci left a comment

Thanks @codesome for iterating based on my feedback.

A couple of things:

  1. Can you rebase with master? I think there have been several changes related to the ring in the meantime.
  2. Please add unit tests for your logic (serialization / deserialization / changes to lifecycler)

> I tried to have type Token uint32 but that would have led to a lot of changes and I would like to keep that for a separate PR in itself.

Actually it's what I see you did. I left my comments based on this, but I have a feeling it may be limiting in the future. The main problem is that you're binding the serialization to the list of tokens. What if we need to store the state as well? The current design will break. My suggestion was actually simpler: create a new struct which is the "persisted state" of the ring (currently containing only the tokens), then marshal/unmarshal it with a version added (no need to add code to support multiple versions at this stage, but once a version is written inside you can support multiple versions in the future). If you follow this path, it means:

  1. Roll back the tokens to []uint32
  2. Add something like type PersistedLifecyclerState struct with functions like:
    • Marshal()
    • Unmarshal()
    • LoadFromFile()
    • StoreToFile()

I will let you pick the path you believe is most appropriate, and I will do the next review accordingly.
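A minimal sketch of the "persisted state" suggestion above, for illustration only. It assumes JSON as the encoding, a hypothetical `currentVersion` constant, and a `Version` field name; none of these are confirmed by the PR, and the merged code may look different. Only the struct name and the four function names come from the review comment.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// currentVersion is a hypothetical format version stamped into every file.
const currentVersion = 1

// PersistedLifecyclerState is the on-disk state of the lifecycler.
// For now it only carries the tokens; new fields can be added later.
type PersistedLifecyclerState struct {
	Version int      `json:"version"`
	Tokens  []uint32 `json:"tokens"`
}

// Marshal encodes the state as JSON (an assumption), stamping the version.
func (s PersistedLifecyclerState) Marshal() ([]byte, error) {
	s.Version = currentVersion
	return json.Marshal(s)
}

// Unmarshal decodes the state and rejects versions this code does not know.
func (s *PersistedLifecyclerState) Unmarshal(b []byte) error {
	var decoded PersistedLifecyclerState
	if err := json.Unmarshal(b, &decoded); err != nil {
		return err
	}
	if decoded.Version != currentVersion {
		return fmt.Errorf("unsupported state version %d", decoded.Version)
	}
	*s = decoded
	return nil
}

// StoreToFile writes the marshalled state to path.
func (s PersistedLifecyclerState) StoreToFile(path string) error {
	b, err := s.Marshal()
	if err != nil {
		return err
	}
	return os.WriteFile(path, b, 0o644)
}

// LoadFromFile reads and decodes the state stored at path.
func (s *PersistedLifecyclerState) LoadFromFile(path string) error {
	b, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	return s.Unmarshal(b)
}

func main() {
	dir, err := os.MkdirTemp("", "lifecycler-state")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "state.json")

	in := PersistedLifecyclerState{Tokens: []uint32{1, 5, 9}}
	if err := in.StoreToFile(path); err != nil {
		panic(err)
	}
	var out PersistedLifecyclerState
	if err := out.LoadFromFile(path); err != nil {
		panic(err)
	}
	fmt.Println(out.Version, out.Tokens)
}
```

Because the version is written from the start, a later change can branch on `Version` inside `Unmarshal` without having to guess the layout of older files.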

codesome commented Nov 6, 2019

> Can you rebase with master? I think there have been several changes related to the ring in the meantime.

I had already rebased with master to include the ring changes. I rebased again in case anything was missed.

> Please add unit tests for your logic (serialization / deserialization / changes to lifecycler)

I had added TestTokensOnDisk. Is any case missing from it?

> Actually it's what I see you did.

Yeah. I made some changes following @gouthamve's suggestion after I posted that comment :)

Regarding persisting the lifecycler state, that would include more than just the tokens. I would prefer to restore other state (like the IngesterState) from the ring rather than from the file, and to persist only the tokens. The idea was to keep track of only the tokens and let everything else be handled as usual. WDYT?

@pracucci pracucci left a comment

Thanks @codesome for addressing my previous feedback. The overall design looks better to me, but I left a few comments here and there about things that I believe should be improved.

@jtlisi jtlisi self-assigned this Nov 18, 2019
@pracucci pracucci left a comment

Thanks @codesome for patiently addressing my feedback. LGTM!

pstibrany commented Nov 19, 2019

Just out of curiosity: can't the ingester simply reuse its own tokens from the ring? That would assume that it keeps its identity, but otherwise it should already work. [Update: after reading the design document, I'm even more convinced that this is not necessary, since with stateful sets, ingesters keep their identity.]

codesome commented Nov 20, 2019

> can't the ingester simply reuse its own tokens from the ring

There was some discussion about this. It would work, but it would depend on how reliable the ring is, since the ring can go away for a while or be restarted at some point. Keeping the tokens in a file beside the WAL also makes it easy to relate the tokens to that WAL, and makes the tokens on disk as reliable as the WAL itself. Though the ring is reliable enough, I would personally push forward with having the tokens on disk. (With the ring, we would need some state other than the current ones while upgrading the ingester, otherwise it is likely to interfere with other operations. Working with a local file feels less complex than working with the ring. UPDATE: Apparently the change required for using the ring is pretty small!)

@pstibrany

> There was some discussion about this. It would work, but it would depend on how reliable the ring is, since the ring can go away for a while or be restarted at some point. Keeping the tokens in a file beside the WAL also makes it easy to relate the tokens to that WAL, and makes the tokens on disk as reliable as the WAL itself.

Fair enough. I just wanted to make sure this option was considered. Thanks.

codesome commented Dec 2, 2019

@pstibrany I have addressed all your review comments.

@pstibrany pstibrany left a comment

Thanks for the updates. I have a few more small comments.

@pstibrany pstibrany left a comment

Thanks for your patience with my feedback!

@jtlisi jtlisi left a comment

LGTM

@gouthamve gouthamve merged commit a1a8e04 into cortexproject:master Dec 5, 2019
@codesome codesome deleted the tokens-file branch December 6, 2019 11:21
@gouthamve gouthamve mentioned this pull request Jan 8, 2020