✨ multiNamespaceCache: support custom newCache funcs per namespace #1962
joelanford wants to merge 2 commits into kubernetes-sigs:main
    func MultiNamespacedCacheBuilder(namespaces []string) NewCacheFunc {
        byNamespaceOpts := ByNamespaceOptions{NewNamespaceCaches: map[string]NewCacheFunc{}}
        for _, ns := range namespaces {
            byNamespaceOpts.NewNamespaceCaches[ns] = New
        }
        return BuilderByNamespace(byNamespaceOpts)
    }
Now, MultiNamespacedCacheBuilder is simply a specific configuration of the BuilderByNamespace where the catch-all namespace-scoped cache is left as nil (to avoid caching namespace-scoped objects outside the specified namespaces)
I agree with @alvaroaleman on making the functionality look similar across both options. But the question is: if BuilderByNamespace and MultiNamespaceCacheBuilder both support specifying cache funcs per namespace, for cluster scope, and for all other namespaces, then what would be the underlying difference between them? Can we embed byNamespaceOpts directly into MultiNamespacedCacheBuilder? This way, the user can provide either the list of namespaces they want to cache or cache funcs for each of them. One option would be to expose these options through WithNamespace or WithCacheOptions methods. This is definitely a major breaking change, but I guess it's worth exploring instead of having two methods that do the same thing, with the only difference being the parameters passed.
I just felt it didn't make sense to break MultiNamespacedCacheBuilder for no good reason. I'd be on board with deprecating it though.
Instead of having BuilderByNamespace and MultiNamespaceCacheBuilder as it is currently, can we introduce MultiNamespaceCacheBuilderWithOptions (or some better name) that allows both options - providing a list of namespaces, and/or cache funcs (with builder opts) and then deprecate (eventually remove) MultiNamespacedCacheBuilder in following releases?
OK updated to:
- Call the new function MultiNamespacedCacheWithOptionsBuilder
- Use functional options instead of a struct
- Deprecate MultiNamespacedCacheBuilder
- Include an example in the MultiNamespacedCacheBuilder deprecation message that explains how to achieve the existing functionality with the new builder function.
Does this get closer to what y'all are imagining?
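For readers following along, here is a minimal, self-contained sketch of the functional-options shape described in this comment. It is not the PR's code: `Cache`, `NewCacheFunc`, and `builderConfig` are simplified stand-ins (the real `NewCacheFunc` in controller-runtime takes a REST config and cache options), and `legacyConfig` is a hypothetical helper that only illustrates how the deprecated list-of-namespaces form could be expressed with the new options.

```go
package main

import "fmt"

// Cache and NewCacheFunc are simplified stand-ins for the controller-runtime
// types of the same name (assumption: the real signatures are richer).
type Cache interface{ Name() string }

type NewCacheFunc func() (Cache, error)

type namedCache string

func (c namedCache) Name() string { return string(c) }

// builderConfig is a hypothetical holder for what the options configure.
type builderConfig struct {
	namespaceCaches map[string]NewCacheFunc
	defaultCache    NewCacheFunc // nil means no catch-all cache
}

// MultiNamespacedOption mutates the builder configuration.
type MultiNamespacedOption func(*builderConfig)

// WithNamespaceCache registers f for a single namespace.
func WithNamespaceCache(namespace string, f NewCacheFunc) MultiNamespacedOption {
	return func(c *builderConfig) { c.namespaceCaches[namespace] = f }
}

// WithNamespaceCaches registers the same f for every listed namespace.
func WithNamespaceCaches(namespaces []string, f NewCacheFunc) MultiNamespacedOption {
	return func(c *builderConfig) {
		for _, ns := range namespaces {
			c.namespaceCaches[ns] = f
		}
	}
}

// newConfig applies the options in argument order.
func newConfig(opts ...MultiNamespacedOption) *builderConfig {
	c := &builderConfig{namespaceCaches: map[string]NewCacheFunc{}}
	for _, o := range opts {
		o(c)
	}
	return c
}

// legacyConfig (hypothetical) expresses "a plain list of namespaces" with the
// new options: one shared NewCacheFunc and no catch-all cache.
func legacyConfig(namespaces []string, f NewCacheFunc) *builderConfig {
	return newConfig(WithNamespaceCaches(namespaces, f))
}

func main() {
	plain := func() (Cache, error) { return namedCache("plain"), nil }
	cfg := legacyConfig([]string{"ns-a", "ns-b"}, plain)
	fmt.Println(len(cfg.namespaceCaches), cfg.defaultCache == nil) // prints: 2 true
}
```

The point of the shape is that the list-of-namespaces form and the per-namespace form share one entry point, so the old builder can be deprecated without losing any behavior.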
pkg/cache/multi_namespace_cache.go (outdated)

    if err != nil {
        return nil, err
    }
    nsToCache[corev1.NamespaceAll] = defaultNamespaceCache
I don't think I understand this. The clusterCache is for cluster-scoped objects, so what is this for? I thought NewDefaultNamespaceCache was intended to be the default used if a namespace has no explicit NewCacheFunc?
It looks like we fall through to this cache if no other cache matches, but is that new behavior or not? I don't think it is expected that a multi_namespace_cache will create a global cache when it is asked for an object that is not in one of the configured namespaces.
> I thought NewDefaultNamespaceCache was intended to be the default used if a namespace has no explicit NewCacheFunc?

That's correct.

> It looks like we fall through to this cache if no other cache matches, but is that new behavior or not?

That is new behavior, but only if you're using the ByNamespaceBuilder variation of the builder. The existing MultiNamespacedCacheBuilder leaves NewDefaultNamespaceCache set to nil, meaning there is no catch-all global cache, so the behavior there remains the same.
> That is new behavior, but only if you're using the ByNamespaceBuilder variation of the builder. The existing MultiNamespacedCacheBuilder leaves NewDefaultNamespaceCache set to nil, meaning there is no catch-all global cache, so the behavior there remains the same.

IMHO it is extremely unintuitive that this happens in one builder and not the other, and IMHO it is wrong to do this by default. If anything, we should have an explicit knob to opt in to this in both builders.
Also IMHO, rather than forcing a selector for all namespaces by using the map, it would be nicer to have a struct as the value that optionally takes a selector, plus an option for a DefaultSelector that, if set, is used when a namespace does not have a selector.
> IMHO it is extremely unintuitive that this happens in one builder and not the other, and IMHO it is wrong to do this by default. If anything, we should have an explicit knob to opt in to this in both builders.

- In the BuilderByNamespace, it is opt-in. You have to specifically set NewDefaultNamespaceCache.
- In MultiNamespacedCacheBuilder there has never been a fall-back option, and my thought was to not make a breaking change to that builder whatsoever, meaning there's no way to opt in.

> Also IMHO, rather than forcing a selector for all namespaces by using the map, it would be nicer to have a struct as the value that optionally takes a selector, plus an option for a DefaultSelector that, if set, is used when a namespace does not have a selector.

I'm not sure I'm following this. There's nothing in this layer that deals directly with selectors. This is just "I want different NewCacheFuncs per specific namespace/catch-all namespaces/cluster-scope."
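The opt-in fall-through being debated here can be illustrated with a small lookup sketch. It is an illustration under assumptions, not the PR's implementation: caches are represented by plain names, and `namespaceCaches`, `cacheFor`, and `errNoCache` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// namespaceCaches is a hypothetical stand-in that tracks cache names
// instead of real cache objects.
type namespaceCaches struct {
	byNamespace map[string]string
	catchAll    string // empty means no catch-all cache was configured
}

var errNoCache = errors.New("no cache configured for namespace")

// cacheFor picks the cache that serves a request for namespace ns:
// an explicit per-namespace cache first, then the optional catch-all.
func (c namespaceCaches) cacheFor(ns string) (string, error) {
	if name, ok := c.byNamespace[ns]; ok {
		return name, nil
	}
	if c.catchAll != "" {
		return c.catchAll, nil
	}
	return "", fmt.Errorf("%w: %q", errNoCache, ns)
}

func main() {
	// No catch-all: mirrors the existing multi-namespace behavior.
	strict := namespaceCaches{byNamespace: map[string]string{"ns-a": "cache-a"}}
	// Catch-all set: the opt-in behavior discussed in this thread.
	withFallback := namespaceCaches{byNamespace: strict.byNamespace, catchAll: "cache-default"}

	_, err := strict.cacheFor("other")
	fmt.Println(err != nil) // prints: true

	name, _ := withFallback.cacheFor("other")
	fmt.Println(name) // prints: cache-default
}
```

Leaving the catch-all unset keeps lookups outside the configured namespaces an error, which is why the existing builder's behavior would not change.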
#1980 merged, so I rebased to latest.
pkg/cache/multi_namespace_cache.go (outdated)

    // WithNamespaceCache configures MultiNamespacedCacheWithOptionsBuilder
    // with a namespace cache that uses the provided NewCacheFunc.
    func WithNamespaceCache(namespace string, f NewCacheFunc) MultiNamespacedOption {
MultiNamespacedCacheWithOptionsBuilder is 👍. Just one question: since we are exposing WithNamespaceCaches and WithNamespaceCache, what if a user does:

    cache.WithNamespaceCache(testNamespaceOne, cache.BuilderWithOptions(....)),
    cache.WithNamespaceCaches([]string{testNamespaceOne, testNamespaceTwo, testNamespaceThree}, cache.New),

In this case, it looks like we will simply override the custom cache func that the user provided for testNamespaceOne. Is this something we want to address by validating the inputs, or do we leave it to the user to avoid giving us duplicates?
Correct, that would result in testNamespaceOne: cache.New
I could document that if the same namespace is provided multiple times, the last applied NewCacheFunc for that namespace wins.
The last commit adds a line to the godoc stating that the last setting for a particular namespace wins.
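The "last setting wins" rule follows directly from applying options in order onto a map. A minimal sketch, assuming a hypothetical `option` type in which a string stands in for the NewCacheFunc so the outcome is easy to print:

```go
package main

import "fmt"

// option is a hypothetical stand-in for MultiNamespacedOption; the string
// value stands in for a NewCacheFunc.
type option func(map[string]string)

func withNamespaceCache(ns, newCache string) option {
	return func(m map[string]string) { m[ns] = newCache }
}

func withNamespaceCaches(namespaces []string, newCache string) option {
	return func(m map[string]string) {
		for _, ns := range namespaces {
			m[ns] = newCache
		}
	}
}

// resolve applies options in argument order, so a later option overwrites
// an earlier one for the same namespace.
func resolve(opts ...option) map[string]string {
	m := map[string]string{}
	for _, o := range opts {
		o(m)
	}
	return m
}

func main() {
	// Custom func first, bulk option second: the bulk option wins.
	overridden := resolve(
		withNamespaceCache("one", "custom"),
		withNamespaceCaches([]string{"one", "two", "three"}, "plain"),
	)
	// Bulk option first, custom func second: the custom func wins.
	kept := resolve(
		withNamespaceCaches([]string{"one", "two", "three"}, "plain"),
		withNamespaceCache("one", "custom"),
	)
	fmt.Println(overridden["one"], kept["one"]) // prints: plain custom
}
```

Under this rule a caller who wants one namespace to differ from the rest only has to list the specific option after the bulk one.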
Signed-off-by: Joe Lanford <joe.lanford@gmail.com>
varshaprasad96 left a comment:
/hold
/lgtm
Looks good from my end. Placing it on hold for @alvaroaleman's reviews.
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: joelanford, varshaprasad96
Signed-off-by: Joe Lanford <joe.lanford@gmail.com>
Thx!
        }
    }

    func setDefaultNamespacedCacheOpts(opts Options, newObjectCaches map[string]NewCacheFunc) Options {
So this will end up setting up a cluster-scoped cache with a field-selector that excludes all namespaces for which we have a cache?
If yes, the problem with that is that it will fail if you don't have cluster-wide RBAC. Wouldn't it be more useful to create a namespace-scoped cache with the configured selector whenever we get a request for a namespace that doesn't have a cache?
> So this will end up setting up a cluster-scoped cache with a field-selector that excludes all namespaces for which we have a cache?

It depends on the NewCacheFunc defined by WithDefaultNamespacedCache. But yeah, I'd generally imagine that would be a cluster-wide cache.

> Wouldn't it be more useful to create a namespace-scoped cache with the configured selector whenever we get a request for a namespace that doesn't have a cache?

Two thoughts:
- If a caller uses client.List() without client.InNamespace they're essentially asking for all $things in the cluster (modulo the selectors used to configure the various subcaches). It isn't immediately obvious to me how we'd handle this situation with individual per-namespace caches for catch-all namespaces.
- There's a performance/permissions tradeoff here. If you assume cluster-wide list/watch RBAC, you only need one informer. If you don't/can't assume cluster-wide list/watch RBAC, you'd end up creating an informer and opening a watch connection for each catch-all namespace you end up querying. In a cluster with lots of namespaces, that could end up putting quite a load on the apiserver.

Thoughts? Perhaps we could make this configurable?
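To make the tradeoff concrete, here is a rough sketch of the "create a namespace-scoped cache on first request" alternative. Everything in it is hypothetical (`lazyNamespaceCaches`, `get`, the string-valued caches); the counter only marks where a real implementation would start an informer and open a watch.

```go
package main

import (
	"fmt"
	"sync"
)

// lazyNamespaceCaches (hypothetical) creates one cache per namespace the
// first time that namespace is requested.
type lazyNamespaceCaches struct {
	mu       sync.Mutex
	caches   map[string]string
	newCache func(ns string) string
}

// get returns the cache for ns, creating it on first use.
func (l *lazyNamespaceCaches) get(ns string) string {
	l.mu.Lock()
	defer l.mu.Unlock()
	if c, ok := l.caches[ns]; ok {
		return c
	}
	c := l.newCache(ns)
	l.caches[ns] = c
	return c
}

func main() {
	created := 0
	l := &lazyNamespaceCaches{
		caches: map[string]string{},
		newCache: func(ns string) string {
			created++ // a real cache would start an informer and a watch here
			return "cache-for-" + ns
		},
	}
	for _, ns := range []string{"a", "b", "a", "c", "b"} {
		l.get(ns)
	}
	fmt.Println(created) // prints: 3
}
```

The number of caches grows with the number of distinct namespaces queried, which is the apiserver load concern raised above; a single cluster-wide cache keeps that at one but requires cluster-wide list/watch RBAC.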
> If a caller uses client.List() without client.InNamespace they're essentially asking for all $things in the cluster (modulo the selectors used to configure the various subcaches). It isn't immediately obvious to me how we'd handle this situation with individual per-namespace caches for catch-all namespaces.

I think the expectation there would be that you get whatever is in the cache, which is a subset of what is actually in the cluster.

> There's a performance/permissions tradeoff here. If you assume cluster-wide list/watch RBAC, you only need one informer. If you don't/can't assume cluster-wide list/watch RBAC, you'd end up creating an informer and opening a watch connection for each catch-all namespace you end up querying. In a cluster with lots of namespaces, that could end up putting quite a load on the apiserver.

But what would be the reason for using this cache in the first place if you have cluster-wide RBAC? Just getting a selector for one namespace only? Or is the actual, hidden use-case behind this "namespaced cache for object type X, global cache for everything else"?
I feel like the more we start supporting such rather exotic use-cases, the more ppl will just come up with even more of them. What if the next person wants to have different cache configurations based on object type, or a combination of object type and namespace? Wouldn't that be the point where it would be easier to have some kind of CacheRouter that lets you freely configure something that inspects requests and then forwards them to the appropriate, configurable cache?
Sorry for being a bit negative here, it is just that this is IMHO both complicated enough to not be obvious to reason about and not solving the problem of "People might want an arbitrary cache depending on any possible combination of object type and selectors".
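The CacheRouter idea floated in this comment was never specified in the thread, so the following is only one possible reading of it: an ordered list of predicates over a request, each naming the cache that should handle matches. All names (`request`, `route`, `cacheRouter`, `pick`) are hypothetical and do not exist in controller-runtime.

```go
package main

import "fmt"

// request (hypothetical) describes what a read is asking for.
type request struct {
	Kind      string
	Namespace string
}

// route pairs a predicate with the name of the cache that handles
// requests the predicate accepts.
type route struct {
	matches func(request) bool
	cache   string
}

// cacheRouter forwards each request to the first matching route and
// uses the fallback cache when nothing matches.
type cacheRouter struct {
	routes   []route
	fallback string
}

func (r cacheRouter) pick(req request) string {
	for _, rt := range r.routes {
		if rt.matches(req) {
			return rt.cache
		}
	}
	return r.fallback
}

func main() {
	router := cacheRouter{
		routes: []route{
			{
				matches: func(q request) bool { return q.Kind == "Secret" && q.Namespace == "team-a" },
				cache:   "secrets-team-a",
			},
		},
		fallback: "global",
	}
	fmt.Println(router.pick(request{Kind: "Secret", Namespace: "team-a"})) // prints: secrets-team-a
	fmt.Println(router.pick(request{Kind: "Pod", Namespace: "team-a"}))    // prints: global
}
```

A router like this would cover "namespaced cache for object type X, global cache for everything else" as one configuration among many, rather than as a dedicated builder.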
> But what would be the reason for using this cache in the first place if you have cluster-wide RBAC?

The specific motivation I have for this cache is memory optimization. I want to cache all configmaps in my controller's namespace; but in every other namespace, I only want to cache configmaps that match a certain label selector.
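The caching rule described here can be written down as a small predicate. This is only an in-memory illustration of the rule: a real cache would apply the selector server-side through its list/watch options, and `shouldCache` plus the namespace and label values are hypothetical.

```go
package main

import "fmt"

// shouldCache (hypothetical) encodes the rule: keep every ConfigMap in the
// controller's own namespace; elsewhere keep only ConfigMaps whose labels
// contain every key/value pair in selector. An empty selector matches all.
func shouldCache(controllerNS, objNS string, objLabels, selector map[string]string) bool {
	if objNS == controllerNS {
		return true
	}
	for k, want := range selector {
		got, ok := objLabels[k]
		if !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	selector := map[string]string{"managed-by": "my-controller"}

	own := shouldCache("controller-system", "controller-system", nil, selector)
	managed := shouldCache("controller-system", "team-a",
		map[string]string{"managed-by": "my-controller"}, selector)
	unrelated := shouldCache("controller-system", "team-a",
		map[string]string{"app": "web"}, selector)

	fmt.Println(own, managed, unrelated) // prints: true true false
}
```

The memory saving comes from the third case: unrelated ConfigMaps in other namespaces never enter the cache at all.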
@alvaroaleman I'm still interested in getting this merged. I think my overall feeling on this is that it's just a refactoring of the existing MultiNamespaceCacheBuilder that gives users more flexibility, in that it allows the NewCache functions to be provided by the user rather than hard-coded (and it adds the optional catch-all cache for namespaced objects). From my PR description:
I did look into implementing this out-of-tree, but IIRC it would have meant one of:
I completely understand your point about not wanting controller-runtime to be a dumping ground for exotic solutions to niche problems.
For what it's worth, I've been able to work around not having this PR merged by making use of a separate
This workaround means there are two separate caches:
As a result of the separate caches, I have to be careful to set up my watches. Specifically, I need to use
Nonetheless, IMO this workaround is non-intuitive and isn't really an intended use of the
Okay, I am trying to reason through all the possible use-cases here:
Is the above approximately correct or did I miss anything? If yes, I would probably suggest that we end up extending the cache.Options with:
What do you think, did I miss anything? I am btw not asking you to implement this; I would just like to have a coherent design of where we want to get to in the end, rather than a series of ad-hoc changes that don't really fit together and are confusing to use.
@joelanford We should chat about a design document on how the above can become a result of reworking the options. The multi-namespace constructor got deprecated on main, and it'll be removed later on. We should work on an overall design that makes sense around the cache.Options.
Closing this for now. Let's rally behind a feature design in a hackmd of sorts, or a different PR based on the latest changes, once ready.
/close
This PR adds more flexibility to multi-namespace caches by introducing a new cache builder function (cache.ByNamespaceBuilder) that enables callers to specify individual cache builder functions on a per-namespace basis (with additional support for specifying a catch-all cache builder and a cluster-scoped cache builder).
The primary use case of this is to enable caching the same GVK with a different set of selectors/transformers/deepcopy config based on which namespace the objects are in.
Concretely, imagine a controller that needs to read configmaps in its own namespace and also manage configmaps during reconciliation of its primary object, where each set has effectively nothing to do with the other (i.e. a shared label selector is not feasible). In order to efficiently cache both sets, one might need to:
- <controller>-system namespace
- managed-by: <controller>
I originally considered implementing this out-of-tree, but because it so closely resembles and builds upon the existing in-tree multiNamespaceCache, I thought it made more sense to include here.
Signed-off-by: Joe Lanford joe.lanford@gmail.com