✨ multiNamespaceCache: support custom newCache funcs per namespace #1962
pkg/cache/multi_namespace_cache.go
Outdated
func MultiNamespacedCacheBuilder(namespaces []string) NewCacheFunc {
	byNamespaceOpts := ByNamespaceOptions{NewNamespaceCaches: map[string]NewCacheFunc{}}
	for _, ns := range namespaces {
		byNamespaceOpts.NewNamespaceCaches[ns] = New
	}
	return BuilderByNamespace(byNamespaceOpts)
}
Now, MultiNamespacedCacheBuilder is simply a specific configuration of BuilderByNamespace where the catch-all namespace-scoped cache is left as nil (to avoid caching namespace-scoped objects outside the specified namespaces).
I agree with @alvaroaleman on making the functionality look similar across both options. But the question is: if BuilderByNamespace and MultiNamespaceCacheBuilder both support specifying cache funcs per namespace, for cluster scope, and for all other namespaces, what would be the underlying difference between them? Can we embed byNamespaceOpts directly into MultiNamespacedCacheBuilder? This way, the user can provide either the list of namespaces they want to cache or cache funcs for each of them. One option would be to expose these options through WithNamespace or WithCacheOptions methods. This is definitely a major breaking change, but I guess it's worth exploring instead of having two methods that do the same thing, with the only difference being the parameters passed.
I just felt it didn't make sense to break MultiNamespacedCacheBuilder for no good reason. I'd be on board with deprecating it though.
Instead of having BuilderByNamespace and MultiNamespaceCacheBuilder as they are currently, can we introduce MultiNamespaceCacheBuilderWithOptions (or some better name) that allows both options (providing a list of namespaces and/or cache funcs with builder opts), and then deprecate (and eventually remove) MultiNamespacedCacheBuilder in following releases?
OK, updated to:
- Call the new function MultiNamespacedCacheWithOptionsBuilder
- Use functional options instead of a struct
- Deprecate MultiNamespacedCacheBuilder
- Include an example in the MultiNamespacedCacheBuilder deprecation message that explains how to achieve the existing functionality with the new builder function.
Does this get closer to what y'all are imagining?
pkg/cache/multi_namespace_cache.go
Outdated
if err != nil {
	return nil, err
}
nsToCache[corev1.NamespaceAll] = defaultNamespaceCache
I don't think I understand this. The clusterCache is for cluster-scoped objects, so what is this for? I thought NewDefaultNamespaceCache was intended to be the default used if a namespace has no explicit NewCacheFunc?
It looks like we fall through to this cache if no other cache matches, but is that new behavior or not? I don't think it is expected that a multi_namespace_cache will create a global cache when it is used to request an object that is not in one of the configured namespaces.
> I thought NewDefaultNamespaceCache was intended to be the default used if a namespace has no explicit NewCacheFunc?

That's correct.

> It looks like we fall through to this cache if no other cache matches, but that is new behavior or not?

That is new behavior, but only if you're using the ByNamespaceBuilder variation of the builder. The existing MultiNamespacedCacheBuilder leaves NewDefaultNamespaceCache set to nil, meaning there is no catch-all global cache, so the behavior there remains the same.
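The two behaviors described in this reply can be modeled in a few lines. This is a minimal sketch with stand-in types, not the controller-runtime API: `subCache`, `nsCaches`, `resolve`, and `errNoCache` are hypothetical names, and only the difference between an unset and a set catch-all cache is illustrated.

```go
package main

import (
	"errors"
	"fmt"
)

// subCache is a hypothetical stand-in for a per-namespace cache.
type subCache struct{ name string }

// nsCaches models the routing discussed in this thread: explicit
// per-namespace caches plus an optional catch-all.
type nsCaches struct {
	byNamespace map[string]*subCache
	catchAll    *subCache // nil models NewDefaultNamespaceCache being left unset
}

var errNoCache = errors.New("no cache configured for namespace")

// resolve returns the explicit cache for ns if one exists, then falls back to
// the catch-all, and fails when neither is configured.
func (c nsCaches) resolve(ns string) (*subCache, error) {
	if sc, ok := c.byNamespace[ns]; ok {
		return sc, nil
	}
	if c.catchAll != nil {
		return c.catchAll, nil
	}
	return nil, fmt.Errorf("%w: %q", errNoCache, ns)
}

func main() {
	explicit := map[string]*subCache{"team-a": {name: "team-a"}}

	// Existing builder behavior: no catch-all, so unknown namespaces fail.
	strict := nsCaches{byNamespace: explicit}
	_, err := strict.resolve("team-b")
	fmt.Println(errors.Is(err, errNoCache)) // true

	// Opt-in behavior: the catch-all serves namespaces without an explicit cache.
	lenient := nsCaches{byNamespace: explicit, catchAll: &subCache{name: "catch-all"}}
	sc, _ := lenient.resolve("team-b")
	fmt.Println(sc.name) // catch-all
}
```

In this model the fall-through only exists when the caller sets the catch-all field, which is the opt-in property the reply relies on.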
> That is new behavior, but only if you're using the ByNamespaceBuilder variation of the builder. The existing MultiNamespacedCacheBuilder leaves NewDefaultNamespaceCache set to nil, meaning there is no catch-all global cache, so the behavior there remains the same.

IMHO it is extremely unintuitive that this happens in one builder and not the other, and IMHO it is wrong to do this by default. If anything, we should have an explicit knob to opt in to this in both builders.
Also IMHO, rather than forcing a selector for all namespaces by using the map, it would be nicer to have a struct as the value that optionally takes a selector, plus an option for a DefaultSelector that, if set, will be used when a namespace does not have a selector.
> IMHO it is extremely unintuitive that this happens in one builder and not the other and IMHO it is wrong to do this by default. If anything, we should have an explicit knob to opt it in to this in both the builders.

- In BuilderByNamespace, it is opt-in. You have to specifically set NewDefaultNamespaceCache.
- In MultiNamespacedCacheBuilder there has never been a fall-back option, and my thought was to not make a breaking change to that builder whatsoever, meaning there's no way to opt in.

> Also IMHO rather than forcing a selector for all namespaces by using the map, it would be nicer to have a struct as value that optionally takes a selector and an option for a DefaultSelector that if set will be used if a namespace does not have a selector.

I'm not sure I'm following this. There's nothing in this layer that deals directly with selectors. This is just "I want different NewCacheFuncs per specific namespace/catch-all namespaces/cluster-scope."
#1980 merged, so I rebased to latest
/retest
pkg/cache/multi_namespace_cache.go
Outdated
// WithNamespaceCache configures MultiNamespacedCacheWithOptionsBuilder
// with a namespace cache that uses the provided NewCacheFunc.
func WithNamespaceCache(namespace string, f NewCacheFunc) MultiNamespacedOption {
MultiNamespacedCacheWithOptionsBuilder is 👍. Just one question: since we are exposing WithNamespaceCaches and WithNamespaceCache, what if a user does:
cache.WithNamespaceCache(testNamespaceOne, cache.BuilderWithOptions(....)),
cache.WithNamespaceCaches([]string{testNamespaceOne, testNamespaceTwo, testNamespaceThree}, cache.New),
In this case, it looks like we will simply override the custom cache func that the user provided for testNamespaceOne. Is this something we want to address by validating the inputs, or do we leave it to the user to not give us duplicates?
Correct, that would result in testNamespaceOne: cache.New
I could document that if the same namespace is provided multiple times, the last applied NewCacheFunc for that namespace wins.
The last commit adds a line to the godoc stating that the last setting for a particular namespace wins.
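The "last setting wins" rule follows directly from applying functional options in order. The sketch below illustrates that with stand-in types; `newCacheFunc`, `config`, `option`, `withNamespaceCache`, `withNamespaceCaches`, and `apply` are hypothetical lowercase analogues of the names in this PR, not the PR's actual implementation.

```go
package main

import "fmt"

// newCacheFunc is a hypothetical stand-in for cache.NewCacheFunc; the string
// it returns only labels which constructor was chosen.
type newCacheFunc func() string

// config collects the per-namespace constructors.
type config struct {
	namespaceCaches map[string]newCacheFunc
}

// option plays the role of the MultiNamespacedOption type from the diff.
type option func(*config)

// withNamespaceCache sets the constructor for a single namespace.
func withNamespaceCache(ns string, f newCacheFunc) option {
	return func(c *config) { c.namespaceCaches[ns] = f }
}

// withNamespaceCaches sets the same constructor for several namespaces.
func withNamespaceCaches(namespaces []string, f newCacheFunc) option {
	return func(c *config) {
		for _, ns := range namespaces {
			c.namespaceCaches[ns] = f
		}
	}
}

// apply runs the options in order, so a later option overwrites an earlier
// one for the same namespace.
func apply(opts ...option) config {
	c := config{namespaceCaches: map[string]newCacheFunc{}}
	for _, o := range opts {
		o(&c)
	}
	return c
}

func main() {
	custom := func() string { return "custom" }
	plain := func() string { return "plain" }

	c := apply(
		withNamespaceCache("one", custom),
		withNamespaceCaches([]string{"one", "two", "three"}, plain),
	)
	fmt.Println(c.namespaceCaches["one"]()) // plain: the later option wins
}
```

Swapping the order of the two options keeps the custom constructor for namespace "one", which is why documenting the ordering matters more than validating for duplicates.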
Signed-off-by: Joe Lanford <[email protected]>
/hold
/lgtm
Looks good from my end. Placing it on hold for @alvaroaleman's reviews.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: joelanford, varshaprasad96
Just a few nits
Signed-off-by: Joe Lanford <[email protected]>
Thx!
}
}

func setDefaultNamespacedCacheOpts(opts Options, newObjectCaches map[string]NewCacheFunc) Options {
So this will end up setting up a cluster-scoped cache with a field-selector that excludes all namespaces for which we have a cache?
If yes, the problem with that is that it will fail if you don't have cluster-wide RBAC. Wouldn't it be more useful to create a namespace-scoped cache with the configured selector whenever we get a request for a namespace that doesn't have a cache?
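For readers unfamiliar with the kind of selector being asked about: a Kubernetes field selector can exclude namespaces with `metadata.namespace!=<name>` terms joined by commas. The sketch below only builds such a string; `excludeNamespacesSelector` is a hypothetical helper, and whether setDefaultNamespacedCacheOpts does anything like this is the reviewer's reading, not something the diff excerpt confirms.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// excludeNamespacesSelector builds a field selector string that matches
// objects outside the given namespaces. It returns an empty string when no
// namespaces are given.
func excludeNamespacesSelector(namespaces []string) string {
	sorted := append([]string(nil), namespaces...) // copy so the input is not reordered
	sort.Strings(sorted)                           // deterministic output
	terms := make([]string, 0, len(sorted))
	for _, ns := range sorted {
		terms = append(terms, "metadata.namespace!="+ns)
	}
	return strings.Join(terms, ",")
}

func main() {
	fmt.Println(excludeNamespacesSelector([]string{"team-b", "team-a"}))
	// prints: metadata.namespace!=team-a,metadata.namespace!=team-b
}
```

A list/watch with a selector like this is still a cluster-wide request, so it needs cluster-wide list/watch permissions, which is the RBAC concern raised in the comment.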
> So this will end up setting up a cluster-scoped cache with a field-selector that excludes all namespaces for which we have a cache?

It depends on the NewCacheFunc defined by WithDefaultNamespacedCache. But yeah, I'd generally imagine that would be a cluster-wide cache.

> Wouldn't it be more useful to create a namespace-scoped cache with the configured selector whenever we get a request for a namespace that doesn't have a cache?

Two thoughts:
- If a caller uses client.List() without client.InNamespace, they're essentially asking for all $things in the cluster (modulo the selectors used to configure the various subcaches). It isn't immediately obvious to me how we'd handle this situation with individual per-namespace caches for catch-all namespaces.
- There's a performance/permissions tradeoff here. If you assume cluster-wide list/watch RBAC, you only need one informer. If you don't/can't assume cluster-wide list/watch RBAC, you'd end up creating an informer and opening a watch connection for each catch-all namespace you end up querying. In a cluster with lots of namespaces, that could end up putting quite a load on the apiserver.
Thoughts? Perhaps we could make this configurable?
> If a caller uses client.List() without client.InNamespace they're essentially asking for all $things in the cluster (modulo the selectors used to configure the various subcaches). It isn't immediately obvious to me how we'd handle this situation with individual per-namespace caches for catch-all namespaces.

I think the expectation there would be that you get whatever is in the cache, which is a subset of what is actually in the cluster.

> There's a performance/permissions tradeoff here. If you assume cluster-wide list/watch RBAC, you only need one informer. If you don't/can't assume cluster-wide list/watch RBAC, you'd end up creating an informer and open a watch connection for each catch-all namespace you end up querying. In a cluster with lots of namespaces, that could end up putting quite a load on the apiserver.

But what would be the reason for using this cache in the first place if you have cluster-wide RBAC? Just getting a selector for one namespace only? Or is the actual, hidden use-case behind this "namespaced cache for object type X, global cache for everything else"?
I feel like the more we start supporting such rather exotic use-cases, the more people will just come up with even more of them. What if the next person wants to have different cache configurations based on object type or a combination of object type and namespace? Wouldn't that be the point where it would be easier to have some kind of CacheRouter that allows users to freely configure something that inspects requests and then forwards them to the appropriate, configurable cache?
Sorry for being a bit negative here, it is just that this is IMHO both complicated enough to not be obvious to reason about and not solving the problem of "People might want an arbitrary cache depending on any possible combination of object type and selectors".
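To make the CacheRouter idea concrete, here is one possible shape for it. Nothing like this exists in the PR or, as far as this thread shows, in controller-runtime; `request`, `route`, `cacheRouter`, and `pick` are all hypothetical, and caches are represented by plain names to keep the sketch self-contained.

```go
package main

import "fmt"

// request describes what a caller is asking the cache for.
type request struct {
	kind      string
	namespace string
}

// route pairs a predicate with the name of the cache that should serve
// matching requests.
type route struct {
	matches func(request) bool
	cache   string
}

// cacheRouter forwards each request to the first matching route.
type cacheRouter struct {
	routes   []route
	fallback string // empty means there is no fallback cache
}

// pick returns the cache name for r and whether any cache applies.
func (cr cacheRouter) pick(r request) (string, bool) {
	for _, rt := range cr.routes {
		if rt.matches(r) {
			return rt.cache, true
		}
	}
	if cr.fallback != "" {
		return cr.fallback, true
	}
	return "", false
}

func main() {
	router := cacheRouter{
		routes: []route{
			{
				matches: func(r request) bool {
					return r.kind == "ConfigMap" && r.namespace == "controller-system"
				},
				cache: "all-configmaps",
			},
			{
				matches: func(r request) bool { return r.kind == "ConfigMap" },
				cache:   "labeled-configmaps",
			},
		},
		fallback: "global",
	}

	name, _ := router.pick(request{kind: "ConfigMap", namespace: "tenant-1"})
	fmt.Println(name) // labeled-configmaps
}
```

Because routes are evaluated in order, combinations of object type and namespace reduce to writing another predicate rather than adding another builder.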
> But what would be the reason for using this cache in the first place if you have cluster-wide rbac?

The specific motivation I have for this cache is memory optimization. I want to cache all configmaps in my controller's namespace; but in every other namespace, I only want to cache configmaps that match a certain label selector.
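The caching rule described here can be written as a single predicate. This is only a model of the intent under assumed names: `shouldCache` and its parameters are hypothetical, and in a real setup the filtering would be done by the selectors configured on each sub-cache, not by a function like this.

```go
package main

import "fmt"

// shouldCache models the memory optimization: keep every ConfigMap in the
// controller's own namespace, and in any other namespace keep only those that
// carry the required label value.
func shouldCache(namespace string, labels map[string]string, ownNamespace, labelKey, labelValue string) bool {
	if namespace == ownNamespace {
		return true
	}
	v, ok := labels[labelKey]
	return ok && v == labelValue
}

func main() {
	own := "controller-system"

	// Unlabeled ConfigMap in the controller's namespace: cached.
	fmt.Println(shouldCache(own, nil, own, "managed-by", "controller")) // true

	// Unlabeled ConfigMap elsewhere: skipped, which is where the memory saving comes from.
	fmt.Println(shouldCache("tenant-1", nil, own, "managed-by", "controller")) // false

	// Labeled ConfigMap elsewhere: cached.
	labeled := map[string]string{"managed-by": "controller"}
	fmt.Println(shouldCache("tenant-1", labeled, own, "managed-by", "controller")) // true
}
```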
@alvaroaleman I'm still interested in getting this merged. I think my overall feeling on this is that it's just a refactoring of the existing MultiNamespaceCacheBuilder that gives users more flexibility, in that it allows the NewCache functions to be provided by the user rather than hard-coded (and it adds the optional catch-all cache for namespaced objects). From my PR description:
I did look into implementing this out-of-tree, but IIRC it would have meant one of:
I completely understand your point about not wanting controller-runtime to be a dumping ground for exotic solutions to niche problems.
For what it's worth, I've been able to work around not having this PR merged by making use of a separate
This workaround means there are two separate caches:
As a result of the separate caches, I have to be careful to set up my watches. Specifically, I need to use
Nonetheless, IMO this workaround is non-intuitive and isn't really an intended use of the
Okay, I am trying to reason through all the possible use-cases here:
Is the above approximately correct or did I miss anything? If yes, I would probably suggest that we end up extending the cache.Options with:
What do you think, did I miss anything? I am btw not asking you to implement this, I would just like to have a coherent design of where we want to get to in the end rather than a series of ad-hoc changes that don't really fit together and are confusing to use.
PR needs rebase.
@joelanford: The following test failed, say
@joelanford We should chat about a design document on how the above can become a result of reworking the options. The multi-namespace constructor got deprecated on main, and it'll be removed later on. We should work on an overall design around cache.Options that makes sense.
Closing this for now. Let's rally behind a feature design in a hackmd of sorts, or a different PR based on the latest changes, once ready.
/close
@vincepri: Closed this PR.
This PR adds more flexibility to multi-namespace caches by introducing a new cache builder function (cache.ByNamespaceBuilder) that enables callers to specify individual cache builder functions on a per-namespace basis (with additional support for specifying a catch-all cache builder and a cluster-scoped cache builder).
The primary use case of this is to enable caching the same GVK with a different set of selectors/transformers/deepcopy config based on which namespace the objects are in.
Concretely, imagine a controller that needs to read configmaps in its own namespace and also manage configmaps during reconciliation of its primary object, where each set has effectively nothing to do with the other (i.e. a shared label selector is not feasible). In order to efficiently cache both sets, one might need to:
- cache all configmaps in the <controller>-system namespace
- cache configmaps labeled managed-by: <controller> in all other namespaces
I originally considered implementing this out-of-tree, but because it so closely resembles and builds upon the existing in-tree multiNamespaceCache, I thought it made more sense to include here.
Signed-off-by: Joe Lanford [email protected]