POC for Opaque config in ResourceClaim status instead of Individual Mode#165
POC for Opaque config in ResourceClaim status instead of Individual Mode#165pravk03 wants to merge 4 commits into
Conversation
Allows bypassing the automatic topology-packed CPU allocator in GROUP_BY_MACHINE mode by specifying a custom "cpuset" parameter in the claim's opaque configurations.
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: pravk03 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
thanks for the PoC. It seems to me this is already in a pretty good shape. The main blockers I'm seeing atm are
- decision on having the opaque data mandatory or optional/recommended
- the schema of the opaque data. It's very implicit. I wonder how we can make it explicit, besides docs.
- some e2e tests
| }) | ||
| } | ||
| case GROUP_BY_MACHINE: | ||
| deviceName := fmt.Sprintf("%s%03d", cpuDeviceMachineGroupedPrefix, 0) |
There was a problem hiding this comment.
nit: should be a constant, we will always have a single device
| allocatableCPUs := allCPUs.Difference(cp.reservedCPUs) | ||
| availableCPUs := int64(allocatableCPUs.Size()) | ||
|
|
||
| if allocatableCPUs.Size() > 0 { |
There was a problem hiding this comment.
this should probably be a generic pre-flight check in driver.Start() or, in general, should be done earlier I believe
| } else { | ||
| availableCPUsForDevice := sharedCPUs.Difference(cpuAssignment) | ||
| logger.V(4).Info("no opaque cpuset config override found, falling back to topology-packed allocator", "device", alloc.Device, "availableCPUs", availableCPUsForDevice.String()) | ||
| cur, err = cpumanager.TakeByTopologyNUMAPacked(logger, topo, availableCPUsForDevice, int(claimCPUCount), cpumanager.CPUSortingStrategyPacked, true) |
There was a problem hiding this comment.
this is probably the main/only open point in the PoC/proposal: if we do not get a cpuset in the opaque data, should we hard fail or do a best-effort allocation?
If the driver focuses on esclusive CPUs allocation, I don't see much value in giving X random exclusive CPUs.
|
PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
The current allocation logic attempts to pack CPUs within a single NUMA node and spills over across NUMA domains only if it cannot satisfy the request on a single node. So, this already provides a soft guarantee. Some points for keeping opaque parameter optional:
However, the point you bring up for making All in all, I am ok either options here. Or, the third option is adding a flag to make it configurable, though the question then becomes what the default behavior should be. Maybe we can start with making
Good point. I will think more and try to come up with a proposal.
I think we need this only if we allow fallback. With |
POC for #164