
Added KV Store client that mirrors one backend to another #1749


Merged
merged 31 commits into from
Jan 7, 2020

Conversation

pstibrany
Contributor

@pstibrany pstibrany commented Oct 23, 2019

Since we support multiple KV Store backends now (Consul, Etcd, memberlist-gossiping soon), we want to be able to migrate from one backend to a different one gracefully, without shutting down all components at once. (Issue #1525) This PR implements one way of doing that.

It adds yet another KV Store backend, called multiClient. This client is configured with several backends, one of which is designated as primary. The primary is used for all client operations like Get, CAS, WatchKey, and WatchPrefix. At the same time, multiClient reads all changes made in the primary backend and copies them to the secondary backend. (That is done using WatchPrefix, with "" prefix [update: now configurable]) Updates via CAS are also written to the secondary store, if mirroring is enabled.

It is possible to switch the primary backend at runtime by using the existing overrides mechanism. This mechanism has been extended and refactored a bit. [update: it's now also possible to enable/disable mirroring at runtime via overrides]

Cortex has new "runtimeConfigValues" struct that is reloaded using OverridesManager (now runtime_config.Manager). runtimeConfigValues currently contains limit overrides and multi-client primary store. runtime_config.Manager now lives in its own runtime_config package, and its job is to reload runtime YAML file (overrides.yaml), and provide new value to clients and listeners.

Components interested in using runtimeConfigValues set themselves up in the main Cortex setup code (cortex.go, modules.go). At the moment, the interested components are Overrides (using limits) and Ring (when using multi-client, for listening on primary store changes).

After all this introduction, how is this supposed to work? In a scenario where one wants to migrate from Consul to Etcd, the migration steps are:

  1. set up all components to use multi-client, with consul as primary and etcd as secondary store, and restart them. Components will keep using consul, but will also mirror the ring to etcd.
  2. switch the primary store to etcd at runtime (e.g. by pushing a new config map)
  3. once done, all components are using etcd (and mirroring to consul)
  4. remove the multi-client configuration, and let components use etcd only

All this can be done without any downtime. The cost is higher load on consul/etcd while components do mirroring, but that can be alleviated by using rate limits on the consul client. [update: mirroring itself now has rate limit] [Watch+mirroring functionality has been replaced with simple update of values in secondary stores when performing CAS operation]
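As a rough illustration of the migration sequence above, here is a toy in-memory model in Go. The `multi` type, its fields, and the helper names are hypothetical and are not the API added by this PR; the maps stand in for real Consul and Etcd backends, and the switch of primary is done by direct assignment rather than through the runtime config file.

```go
package main

import "fmt"

// multi is a hypothetical toy model of a multi-backend KV client.
// Each named store is just an in-memory map.
type multi struct {
	stores  map[string]map[string]string
	primary string // name of the store used for reads and writes
	mirror  bool   // when true, writes are copied to all non-primary stores
}

// newMulti creates a client with mirroring enabled (step 1 of the migration).
func newMulti(primary string, names ...string) *multi {
	m := &multi{stores: map[string]map[string]string{}, primary: primary, mirror: true}
	for _, n := range names {
		m.stores[n] = map[string]string{}
	}
	return m
}

// put writes to the primary and, if mirroring is on, to every other store.
func (m *multi) put(key, value string) {
	m.stores[m.primary][key] = value
	if !m.mirror {
		return
	}
	for name, s := range m.stores {
		if name != m.primary {
			s[key] = value
		}
	}
}

// get always reads from the current primary.
func (m *multi) get(key string) string { return m.stores[m.primary][key] }

func main() {
	// Step 1: consul is primary, etcd is secondary, writes are mirrored.
	m := newMulti("consul", "consul", "etcd")
	m.put("ring", "v1")

	// Step 2: switch the primary at runtime.
	m.primary = "etcd"
	fmt.Println("read from etcd:", m.get("ring"))

	// Step 3: components now use etcd and mirror back to consul.
	m.put("ring", "v2")
	fmt.Println("consul copy:", m.stores["consul"]["ring"])

	// Step 4: stop mirroring; only etcd is updated from here on.
	m.mirror = false
	m.put("ring", "v3")
	fmt.Println("etcd:", m.get("ring"), "consul:", m.stores["consul"]["ring"])
}
```

One consequence visible in this model: a key reaches the secondary only if it is written while mirroring is enabled, so the switch in step 2 is presumably safe only after the relevant keys have been rewritten at least once.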

@pstibrany
Contributor Author

TODO: think about how to do this without killing consul/etcd with too many requests. If every component (distributor/ingester/querier) does the mirroring all the time, that's probably a little too much.

@pstibrany
Contributor Author

Cleaned up the code a little bit, and added:

  • rate limit for watching for changes in the primary store
  • config option for watch prefix
  • flag for enabling/disabling mirroring in runtime (via overrides)

@tomwilkie tomwilkie requested review from gouthamve and jtlisi October 29, 2019 09:11
Contributor

@jtlisi jtlisi left a comment

LGTM, I just have a few questions

Contributor

@rfratto rfratto left a comment

LGTM, just some nitpicky things 🙂

Contributor

@pracucci pracucci left a comment

Very good job @pstibrany! I left a few minor non-blocking comments.

Tomorrow morning I will re-review the cancel fn logic in the multi client, because I'm feeling tired and I may have missed something. Great job again!

jtlisi
jtlisi previously approved these changes Oct 30, 2019
Contributor

@jtlisi jtlisi left a comment

LGTM

@pstibrany
Contributor Author

Rebased on top of master, and squashed to single commit, to ease possible future rebases.

Contributor

@sandeepsukhani sandeepsukhani left a comment

Changes look really good to me, great job!
Just a small change to do to fix an issue with limits changes.

Contributor

@pracucci pracucci left a comment

Good job @pstibrany! I left a couple of non blocking comments, so I'm approving it, given - to my 👀 - the changes look good.

Contributor

@gouthamve gouthamve left a comment

So my big comment would be around simplifying the entire thing to be: Write to both, read from primary. Also, forget about inflight ops, just do it on the primary client when we initiated the call.


// Runs supplied fn with current primary client. If primary client changes, fn is restarted.
// When fn finishes (with or without error), this method returns given error value.
func (m *MultiClient) runWithPrimaryClient(origCtx context.Context, fn func(newCtx context.Context, primary kvclient) error) error {
Contributor

Actually can we make the requirements laxer? Instead of the whole inProgress and cancellation, can we just run the function against the primary client and forget about it changing midway?

I wonder what the worst case scenario is then.

Contributor Author

WatchKey and WatchPrefix operations are long-running, basically designed to never return (eg. see usage in pkg/ring/ring.go:155 or pkg/distributor/ha_tracker.go:144). So we need to either preserve that behaviour (what this PR is trying to do), or modify clients to restart if it returns early.

Get and CAS are short operations, and switching midway isn't necessary there.

}

// watchChanges performs mirroring -- it watches changes in the primary client, and puts them into secondary.
func (m *MultiClient) watchChanges(prefix string) {
Contributor

Alternatively, what about writing to both clients but the reads go off only primary? Shouldn't that achieve the same?

Contributor Author

I think it would achieve the same, at the cost of making CAS slower for clients. But that is probably just fine. (There is question of what to do when CAS to primary succeeds, but CAS to secondary fails, but the only reasonable thing we can do is to ignore it... we should not call f again)

@pstibrany
Contributor Author

pstibrany commented Nov 6, 2019

So my big comment would be around simplifying the entire thing to be: Write to both, read from primary. Also, forget about inflight ops, just do it on the primary client when we initiated the call.

I've changed functionality to follow this logic: writes via CAS function are now mirrored (if enabled) to secondary store, Get always goes to primary. WatchKey and WatchPrefix are left as they were (explanation why).

I've removed mirroring goroutine and rate limiting, and added metrics. I've also introduced new config value, MirrorTimeout, used when forwarding write to secondary store. (I've observed very long delays when doing Consul -> Etcd mirroring and Etcd was down -- over minute, and ingester got to Unhealthy state.)

@gouthamve gouthamve requested a review from jtlisi November 10, 2019 14:36
@gouthamve gouthamve dismissed jtlisi’s stale review November 10, 2019 14:38

Tons of changes have been made after and this requires another look.

Contributor

@jtlisi jtlisi left a comment

LGTM

The KV code was much clearer than the last time I reviewed this PR. The logic around the KV store looks good and has good associated metrics/logs.

I want to note I would prefer if the OverridesManager and associated code lived in its own package. We don't have a strong style guide around these types of things as a project. However, since a significant part of Cortex is going to be consumed by downstream projects I think it is important to maintain well named, clearly defined packages throughout the code base.

@pstibrany
Contributor Author

LGTM

The KV code was much clearer than the last time I reviewed this PR. The logic around the KV store looks good and has good associated metrics/logs.

I want to note I would prefer if the OverridesManager and associated code lived in its own package. We don't have a strong style guide around these types of things as a project. However, since a significant part of Cortex is going to be consumed by downstream projects I think it is important to maintain well named, clearly defined packages throughout the code base.

Thanks @jtlisi for your review.

"much cleaner" -- I think I mostly removed code from KV code since your last review. It's now not doing the "mirroring" as a background task, but updates values as part of CAS.

re OverridesManager: I've now moved it and renamed to runtime_config.Manager. Originally I didn't want to do big changes to it, as this PR is primarily about something else, but I also think this is a good change.

Contributor

@pracucci pracucci left a comment

Good job @pstibrany! I checked out the change set again, to catch up with the latest commits, and left a few comments.

Contributor

@pracucci pracucci left a comment

Thanks @pstibrany for addressing my feedback. LGTM (once the tests are fixed)!

@tomwilkie tomwilkie requested a review from gouthamve November 26, 2019 14:51
@pstibrany
Contributor Author

pstibrany commented Nov 28, 2019

I've added two small changes:

  1. Initialize YAML defaults for limits before starting runtimeconfig.Manager. Otherwise the defaults would only be applied after the next reload. This fixes a bug observed in production, where incorrect limits were applied during the first 10 seconds of an ingester running.

  2. Communicate configuration updates via channels: runtimeconfig.Manager no longer spawns new goroutine to tell Listener about new configuration, but sends value to channel. If there is no receiver, or channel buffer is full, update is discarded.
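The channel-based notification in item 2 corresponds to a common Go pattern: a non-blocking send using `select` with a `default` case. The sketch below is an assumption about the general shape, not the actual runtimeconfig.Manager code; the `manager` and `config` types and the `subscribe`/`notify` names are hypothetical, and the sketch is not safe for concurrent calls to `subscribe`.

```go
package main

import "fmt"

// config is a hypothetical stand-in for the reloaded runtime configuration.
type config struct{ Primary string }

// manager is a hypothetical sketch of a component that notifies listeners.
type manager struct {
	listeners []chan config
}

// subscribe registers a new listener channel with the given buffer size.
func (m *manager) subscribe(buffer int) <-chan config {
	ch := make(chan config, buffer)
	m.listeners = append(m.listeners, ch)
	return ch
}

// notify offers c to every listener without blocking and returns how many
// listeners accepted it. If a listener is not receiving and its buffer is
// full, the update is dropped for that listener.
func (m *manager) notify(c config) (delivered int) {
	for _, ch := range m.listeners {
		select {
		case ch <- c:
			delivered++
		default:
			// No receiver ready and no buffer space: discard the update.
		}
	}
	return delivered
}

func main() {
	m := &manager{}
	ch := m.subscribe(1)

	fmt.Println("delivered:", m.notify(config{Primary: "consul"})) // fits in the buffer
	fmt.Println("delivered:", m.notify(config{Primary: "etcd"}))   // buffer full, dropped
	fmt.Println("listener sees:", (<-ch).Primary)
}
```

With this pattern a slow listener can never stall the reload loop, at the cost of that listener possibly missing an intermediate update until the next reload delivers a fresh value.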

Contributor

@pracucci pracucci left a comment

👍

@pstibrany
Contributor Author

Updated CHANGELOG.md and documented runtime config section and updated format of overrides/runtime config yaml file.

Contributor

@gouthamve gouthamve left a comment

Thanks for all this @pstibrany! LGTM!

}
go mgr.loop()
} else {
level.Info(util.Logger).Log("msg", "config disabled")
Contributor

runtime config disabled.

Contributor Author

What about "runtime config file not specified, reload disabled", to explain the "why" as well?

Contributor Author

or rather runtime config disabled: file not specified

This client is configured with multiple stores, one of them is designated as
primary. All client operations are forwarded to the primary store.

MultiClient also does "mirroring" of values from primary to secondary store.

MultiClient can listen on changes in runtime configuration (via overrides
mechanism), and switch primary store and enable/disable mirroring.

Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Mirroring goroutine was removed, and replaced by forwarding writes
done via CAS function to secondary client. Rate limits config was
removed, but there is now timeout for secondary write, to avoid
blocking CAS function for too long, if secondary write is slow
(eg. etcd being down can cause very long writes).

Only WatchKey and WatchPrefix functions now react on change of
primary client.

Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Without watch-and-mirror functionality, there is no need to check
if value is already present in the secondary store.

Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
…mes, removed forgotten log.

Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Instead of spawning new goroutine for each config update,
we now use channels to communicate config updates.

Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
Addressed other review feedback.

Signed-off-by: Peter Štibraný <[email protected]>
@gouthamve gouthamve merged commit b43bd74 into cortexproject:master Jan 7, 2020