✨ multiNamespaceCache: support custom newCache funcs per namespace #1962

joelanford · 2022-07-28T17:24:07Z

This PR adds more flexibility to multi-namespace caches by introducing a new cache builder function (cache.ByNamespaceBuilder) that enables callers to specify individual cache builder functions on a per-namespace basis (with additional support for specifying a catch-all cache builder and a cluster-scoped cache builder)

The primary use case of this is to enable caching the same GVK with a different set of selectors/transformers/deepcopy config based on which namespace the objects are in.

Concretely, imagine a controller that needs to read configmaps in its own namespace and also manage configmaps during reconciliation of its primary object, where each set has effectively nothing to do with the other (i.e. a shared label selector is not feasible). In order to efficiently cache both sets, one might need to:

Cache all configmaps in <controller>-system namespace
In all other namespaces, cache only configmaps with label managed-by: <controller>

I originally considered implementing this out-of-tree, but because it so closely resembles and builds upon the existing in-tree multiNamespaceCache, I thought it made more sense to include here.

Signed-off-by: Joe Lanford [email protected]

joelanford · 2022-08-08T15:03:35Z

pkg/cache/multi_namespace_cache.go

+func MultiNamespacedCacheBuilder(namespaces []string) NewCacheFunc {
+	byNamespaceOpts := ByNamespaceOptions{NewNamespaceCaches: map[string]NewCacheFunc{}}
+	for _, ns := range namespaces {
+		byNamespaceOpts.NewNamespaceCaches[ns] = New
+	}
+	return BuilderByNamespace(byNamespaceOpts)
+}


Now, MultiNamespacedCacheBuilder is simply a specific configuration of the BuilderByNamespace where the catch-all namespace-scoped cache is left as nil (to avoid caching namespace-scoped objects outside the specified namespaces)

I agree with @alvaroaleman on making the functionality look similar across both options. But the question is if BuilderByNamespace and MultiNamespaceCacheBuilder support specifying cache funcs per namespace/cluster/all other namespaces wise, then what would be the underlying difference between them. Can we embed byNamespaceOpts directly into MultiNamespacedCacheBuilder? This way, the user can either provide the list of namespaces they want to cache or cache funcs for each of them. One option for this would be expose these options through WithNamespace or WithCacheOptions methods. This is definitely a major breaking change, but I guess its worth exploring instead of having two methods do the same with difference just being the parameters passed?

I just felt it didn't make sense to break MultiNamespacedCacheBuilder for no good reason. I'd be on board with deprecating it though.

Instead of having BuilderByNamespace and MultiNamespaceCacheBuilder as it is currently, can we introduce MultiNamespaceCacheBuilderWithOptions (or some better name) that allows both options - providing a list of namespaces, and/or cache funcs (with builder opts) and then deprecate (eventually remove) MultiNamespacedCacheBuilder in following releases?

OK updated to:

Call the new function MultiNamespacedCacheWithOptionsBuilder

Use functional options instead of a struct

Deprecate MultiNamespacedCacheBuilder

Include an example in the MultiNamespacedCacheBuilder deprecation message that explains how to achieve the existing functionality with the new builder function.

Does this get closer to what y'all are imagining?

alvaroaleman · 2022-08-08T19:54:00Z

pkg/cache/multi_namespace_cache.go

+			if err != nil {
+				return nil, err
+			}
+			nsToCache[corev1.NamespaceAll] = defaultNamespaceCache


I don't think I understand this. The clusterCache is for cluster-scoped objects, so what for is this? I thought NewDefaultNamespaceCache was intended to be the default used if a namespace has no explicit NewCacheFunc?

It looks like we fall through to this cache if no other cache matches, but that is new behavior or not? I don't think it is very expected that we have a multi_namespace_cache that will create a global cache if used to request an object that is not in one of the configured namespaces?

I thought NewDefaultNamespaceCache was intended to be the default used if a namespace has no explicit NewCacheFunc?

That's correct.

It looks like we fall through to this cache if no other cache matches, but that is new behavior or not?

That is new behavior, but only if you're using the ByNamespaceBuilder variation of the builder. The existing MultiNamespacedCacheBuilder leaves NewDefaultNamespaceCache set to nil, meaning there is no catch-all global cache, so the behavior there remains the same.

That is new behavior, but only if you're using the ByNamespaceBuilder variation of the builder. The existing MultiNamespacedCacheBuilder leaves NewDefaultNamespaceCache set to nil, meaning there is no catch-all global cache, so the behavior there remains the same.

IMHO it is extremely unintuitive that this happens in one builder and not the other and IMHO it is wrong to do this by default. If anything, we should have an explicit knob to opt it in to this in both the builders.

Also IMHO rather than forcing a selector for all namespaces by using the map, it would be nicer to have a struct as value that optionally takes a selector and an option for a DefaultSelector that if set will be used if a namespace does not have a selector.

IMHO it is extremely unintuitive that this happens in one builder and not the other and IMHO it is wrong to do this by default. If anything, we should have an explicit knob to opt it in to this in both the builders.

In the BuilderByNamespace, it is opt-in. You have to specifically set NewDefaultNamespaceCache.

In MultiNamespacedCacheBuilder there has never been a fall-back option, and my thought was to not make a breaking change to that builder whatsoever, meaning there's no way to opt-in.

Also IMHO rather than forcing a selector for all namespaces by using the map, it would be nicer to have a struct as value that optionally takes a selector and an option for a DefaultSelector that if set will be used if a namespace does not have a selector.

I'm not sure I'm following this. There's nothing in this layer that deals directly with selectors. This is just "I want different NewCacheFuncs per specific namespace/catch-all namespaces/cluster-scope."

pkg/cache/cache.go

joelanford · 2022-08-31T19:06:06Z

#1980 merged, so I rebased to latest master. This PR is ready for review again.

joelanford · 2022-09-01T17:14:34Z

/retest

joelanford · 2022-09-01T17:23:22Z

/retest

varshaprasad96 · 2022-09-02T20:00:00Z

pkg/cache/multi_namespace_cache.go

+
+// WithNamespaceCache configures MultiNamespacedCacheWithOptionsBuilder
+// with a namespace cache that uses the provided NewCacheFunc.
+func WithNamespaceCache(namespace string, f NewCacheFunc) MultiNamespacedOption {


MultiNamespacedCacheWithOptionsBuilder is 👍 . Just one question, since we are exposing
WithNamespaceCaches and WithNamespaceCache, what if a user does:

cache.WithNamespaceCache(testNamespaceOne, cache.BuilderWithOptions(....)), cache.WithNamespaceCaches([]string{testNamespaceOne, testNamespaceTwo, testNamespaceThree}, cache.New),

In this case, looks like we will simply override the custom cache func which the user provided for testNamespaceOne. Is this something we want to address and validate the inputs, or leave it to the user's responsibility to not give us duplicates?

Correct, that would result in testNamespaceOne: cache.New

I could document that if the same namespace is provided multiple times, the last applied NewCacheFunc for that namespace wins.

Last commit adds a line in the godoc that last setting for a particular namespace wins.

Signed-off-by: Joe Lanford <[email protected]>

varshaprasad96

/hold
/lgtm
Looks good from my end. Placing it on hold for @alvaroaleman's reviews.

k8s-ci-robot · 2022-09-02T23:25:24Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: joelanford, varshaprasad96

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [joelanford]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sbueringer

Just a few nits

pkg/cache/multi_namespace_cache.go

Signed-off-by: Joe Lanford <[email protected]>

sbueringer · 2022-09-06T12:32:49Z

Thx!
/lgtm

alvaroaleman · 2022-09-06T14:39:06Z

pkg/cache/multi_namespace_cache.go

+	}
+}
+
+func setDefaultNamespacedCacheOpts(opts Options, newObjectCaches map[string]NewCacheFunc) Options {


So this will end up setting up a cluster-scoped cache with a field-selector that excludes all namespaces for which we have a cache?

If yes, the problem with that is that it will fail if you don't have cluster-wide rbac - Wouldn't it be more useful to create a namespace-scoped cache with the configured selector whenever we get a request for a namespace that doesn't have a cache?

So this will end up setting up a cluster-scoped cache with a field-selector that excludes all namespaces for which we have a cache?

It depends on the NewCacheFunc defined by WithDefaultNamespacedCache. But yeah, I'd generally imagine that would be a cluster-wide cache.

Wouldn't it be more useful to create a namespace-scoped cache with the configured selector whenever we get a request for a namespace that doesn't have a cache?

Two thoughts:

If a caller uses client.List() without client.InNamespace they're essentially asking for all $things in the cluster (modulo the selectors used to configure the various subcaches). It isn't immediately obvious to me how we'd handle this situation with individual per-namespace caches for catch-all namespaces.

There's a performance/permissions tradeoff here. If you assume cluster-wide list/watch RBAC, you only need one informer. If you don't/can't assume cluster-wide list/watch RBAC, you'd end up creating an informer and open a watch connection for each catch-all namespace you end up querying. In a cluster with lots of namespaces, that could end up putting quite a load on the apiserver.

Thoughts? Perhaps we could make this configurable?

If a caller uses client.List() without client.InNamespace they're essentially asking for all $things in the cluster (modulo the selectors used to configure the various subcaches). It isn't immediately obvious to me how we'd handle this situation with individual per-namespace caches for catch-all namespaces.

I think the expectation there would be that you get whatever is in the cache, which is a subset of what is actually in the cluster

There's a performance/permissions tradeoff here. If you assume cluster-wide list/watch RBAC, you only need one informer. If you don't/can't assume cluster-wide list/watch RBAC, you'd end up creating an informer and open a watch connection for each catch-all namespace you end up querying. In a cluster with lots of namespaces, that could end up putting quite a load on the apiserver.

But what would be the reason for using this cache in the first place if you have cluster-wide rbac? Just getting a selector for one namespace only? Or is the actual, hidden use-case behind this "namespaced cache for object type X, global cache for everything else"?

I feel like the more we start supporting such rather exotic use-cases, the more ppl will just come up with even more of them. What if the next person wants to have different cache configurations based on object type or a combination of object type and namespace? Wouldn't that be the point where it would be easier to have some kind of CacheRouter that allows to freely configure something that inspects requests and then forwards them to the appropriate, configurable cache?

Sorry for being a bit negative here, it is just that this is IMHO both complicated enough to not be obvious to reason about and not solving the problem of "People might want an arbitrary cache depending on any possible combination of object type and selectors".

But what would be the reason for using this cache in the first place if you have cluster-wide rbac?

The specific motivation I have for this cache is memory optimization. I want to cache all configmaps in my controller's namespace; but in every other namespace, I only want to cache configmaps that match a certain label selector.

joelanford · 2022-10-28T14:42:20Z

@alvaroaleman I'm still interested in getting this merged. I think my overall feeling on this is that it's just a refactoring of the existing MultiNamespaceCacheBuilder that gives users more flexibility in that it allows the NewCache functions to be provided by the user rather than hard-coded (and it adds the optional catch-all cache for namespaced objects)

From my PR description:

I originally considered implementing this out-of-tree, but because it so closely resembles and builds upon the existing in-tree multiNamespaceCache, I thought it made more sense to include here.

I did look into implementing this out-of-tree, but IIRC it would have meant one of:

exporting the multiNamespaceCache struct and its fields to enable and out-of-tree caller to manipulate them to support different cache building functions
copying basically all of the multiNamespaceCache implementation out-of-tree AND exporting some of the internal cache code that it uses.

I completely understand your point about not wanting controller-runtime to be a dumping ground for exotic solutions to niche problems.

I don't think this is that. I think the use case I'm solving for is probably quite common, but I'd imagine that lots of controllers just end up using a cluster-wide cache and unknowingly consume more memory than necessary.
We might need to think about ways to make it easier to build caches like this out-of-tree. It seems to me that there's a good chunk of internal code that would be really useful for building other cache implementations.

joelanford · 2022-11-21T13:43:14Z

For what it's worth, I've been able to workaround not having this PR merged by making use of a separate Cluster, which is configured with the same apiserver credentials, but with a different namespace scoping.

This workaround means there are two separate caches:

The "main" cache, which is scoped to all namespaces but is configured with a label selector for objects.
A cache for the operator's namespace, which is scoped to just the operator's namespace but without any selectors.

As a result of the separate caches, I have to be careful to setup my watches. Specifically, I need to use source.NewKindWithCache for watches of objects in my operator's namespace.

Nonetheless, IMO this workaround is non-intuitive and isn't really an intended use of the Cluster object, so I still believe this PR should be accepted and merged.

alvaroaleman · 2022-11-25T18:20:14Z

Okay, I am trying to reason through all the possible use-cases here:

Limit the namespaces that can be used globally (implemented in multi-ns cache today)
Configure a PerNS+PerObject selector (not available today)
Configure a per-object selector (available today, if we implement 2 this would only be used for all namesapces that do not have a perNS+perObject selector)
Limit namespaces per object, this isn't really possible today (there is no or field selector from what I can see, might be possible to propose upstream? Not sure)
Configure a per-namespace selector, this would be used for all namespace+object combination where we don't have a namesapce+object or object level selector
Configure a global default for everything else (available today)

Is the above approximately correct or did I miss anything? If yes, I would probably suggest that we end up extenting the cache.Options with:

Adding a NamespacesByObject option that limits the namespaces by object, probably with a useSingleInformer setting if it is somehow possible to do that
Adding a Namespaces option that globally limits the namespaces and makes the constructor use the multi-namespace cache underneath for objects that do not have a NamespacesByObject configuration
If this is possible add a UseSingleInformer option that will be used in the multi ns cache
Adding an SelectorByNamespaceAndObject option. This has precedence over the existing SelectorByObject for the configured namespaces. This is basically what you want with this change, IIUIC.
Adding a SelectorByNamespace option that we will use unless there is a SelectorByNamespaceAndObject or SelectorByObject config.
Finally, we will fall through to DefaultSelector if nothing more specific exists
Eventually someone will likely ask for DefaultSelectorExclude lol

What do you think, did I miss anything? I am btw not asking you to implement this, I would just like to have a coherent design of where we want to get to in the end rather than a series of ad-hoc changes that don't really fit in together and are confusing to use.

k8s-ci-robot · 2023-02-01T05:36:09Z

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot · 2023-02-16T15:42:58Z

@joelanford: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
pull-controller-runtime-test	`40e2e1f`	link	true	`/test pull-controller-runtime-test`

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

vincepri · 2023-02-16T22:11:36Z

@joelanford We should chat about a design document on how the above can become a result of reworking the options, the multi-namespace constructor got deprecated on main, and it'll be removed later on. We should work on an overall design that makes sense around the cache.Options

vincepri · 2023-02-16T22:12:05Z

Closing this for now, let's rally behind a feature design in a hackmd of sorts, or a different PR based on the latest changes, once ready

/close

k8s-ci-robot · 2023-02-16T22:12:10Z

@vincepri: Closed this PR.

In response to this:

Closing this for now, let's rally behind a feature design in a hackmd of sorts, or a different PR based on the latest changes, once ready

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 28, 2022

k8s-ci-robot requested review from gerred and varshaprasad96 July 28, 2022 17:24

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 28, 2022

joelanford removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 28, 2022

joelanford requested review from vincepri and alvaroaleman and removed request for gerred August 8, 2022 14:53

joelanford commented Aug 8, 2022

View reviewed changes

alvaroaleman reviewed Aug 8, 2022

View reviewed changes

joelanford mentioned this pull request Aug 8, 2022

Add ConfigMapSyncer controller and rukpak-ca configmap operator-framework/rukpak#483

Closed

joelanford force-pushed the cache-by-namespace-builder branch from f831e3d to 73b74e8 Compare August 18, 2022 12:50

k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 18, 2022

joelanford force-pushed the cache-by-namespace-builder branch from 73b74e8 to af1dc80 Compare August 18, 2022 12:52

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 25, 2022

joelanford force-pushed the cache-by-namespace-builder branch from af1dc80 to dbc5e59 Compare August 29, 2022 17:39

k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 29, 2022

joelanford force-pushed the cache-by-namespace-builder branch 3 times, most recently from c579070 to bec1314 Compare August 31, 2022 19:03

k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 31, 2022

joelanford force-pushed the cache-by-namespace-builder branch 2 times, most recently from c5fee54 to 60dddc0 Compare September 2, 2022 14:14

joelanford force-pushed the cache-by-namespace-builder branch from 60dddc0 to 9ee454d Compare September 2, 2022 17:50

varshaprasad96 reviewed Sep 2, 2022

View reviewed changes

multiNamespaceCache: support custom newCache funcs per namespace

408df4e

Signed-off-by: Joe Lanford <[email protected]>

joelanford force-pushed the cache-by-namespace-builder branch from 9ee454d to 408df4e Compare September 2, 2022 20:41

varshaprasad96 approved these changes Sep 2, 2022

View reviewed changes

k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 2, 2022

k8s-ci-robot assigned varshaprasad96 Sep 2, 2022

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 2, 2022

sbueringer reviewed Sep 6, 2022

View reviewed changes

pkg/cache/multi_namespace_cache.go Outdated Show resolved Hide resolved

pkg/cache/multi_namespace_cache.go Outdated Show resolved Hide resolved

pkg/cache/multi_namespace_cache.go Outdated Show resolved Hide resolved

pkg/cache/multi_namespace_cache.go Outdated Show resolved Hide resolved

k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 6, 2022

joelanford force-pushed the cache-by-namespace-builder branch from f941ae4 to f69cdb4 Compare September 6, 2022 12:23

address PR feedback

40e2e1f

Signed-off-by: Joe Lanford <[email protected]>

joelanford force-pushed the cache-by-namespace-builder branch from f69cdb4 to 40e2e1f Compare September 6, 2022 12:24

k8s-ci-robot assigned sbueringer Sep 6, 2022

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 6, 2022

alvaroaleman reviewed Sep 6, 2022

View reviewed changes

SgtCoDFish mentioned this pull request Jan 10, 2023

Cache correctness investigations cert-manager/trust-manager#91

Closed

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 1, 2023

k8s-ci-robot closed this Feb 16, 2023

alvaroaleman mentioned this pull request Apr 1, 2023

📖 Add a design for cache options configuration #2261

Merged

✨ multiNamespaceCache: support custom newCache funcs per namespace #1962

✨ multiNamespaceCache: support custom newCache funcs per namespace #1962

Conversation

joelanford commented Jul 28, 2022

Uh oh!

joelanford Aug 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

varshaprasad96 Sep 1, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

joelanford Aug 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

joelanford commented Aug 31, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

joelanford commented Sep 1, 2022

Uh oh!

joelanford commented Sep 1, 2022

Uh oh!

varshaprasad96 Sep 2, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

varshaprasad96 left a comment

Choose a reason for hiding this comment

Uh oh!

k8s-ci-robot commented Sep 2, 2022

Uh oh!

sbueringer left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sbueringer commented Sep 6, 2022

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

joelanford Sep 9, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

joelanford commented Oct 28, 2022

Uh oh!

joelanford commented Nov 21, 2022

Uh oh!

alvaroaleman commented Nov 25, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

k8s-ci-robot commented Feb 1, 2023

Uh oh!

joelanford Aug 8, 2022 •

edited

Loading

varshaprasad96 Sep 1, 2022 •

edited

Loading

joelanford Aug 8, 2022 •

edited

Loading

joelanford commented Aug 31, 2022 •

edited

Loading

varshaprasad96 Sep 2, 2022 •

edited

Loading

joelanford Sep 9, 2022 •

edited

Loading

alvaroaleman commented Nov 25, 2022 •

edited

Loading