As part of my Codeflare/MCAD test automation, I observed the following behavior:

A `RayCluster`, part of an `AppWrapper`, has been instantiated in the cluster, although there are not enough resources to schedule it:

"message": "0/8 nodes are available: 1 Insufficient cpu, 1 node(s) had untolerated taint {only-test-pods: yes},
3 node(s) didn't match Pod's node affinity/selector,
3 node(s) had untolerated taint {node-role.kubernetes.io/master: }.
preemption: 0/8 nodes are available: 1 No preemption victims found for incoming pod,
7 Preemption is not helpful for scheduling.",

My understanding is that MCAD did not take into account the taint on one of the nodes (`1 node(s) had untolerated taint {only-test-pods: yes}`) when it decided that the `AppWrapper` would fit in the cluster. Unfortunately, the only node available for this workload did not have enough CPU to host the `RayCluster` (`1 Insufficient cpu`).
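To illustrate the suspected gap, here is a minimal Python sketch that contrasts a capacity check that ignores taints with one that honors them. It is not MCAD's actual code: the node names, CPU figures, and every function name (`naive_fits`, `fits_on_some_node`, and so on) are hypothetical, and the toleration matching is a simplified version of the Kubernetes rules.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Taint:
    key: str
    value: str = ""
    effect: str = "NoSchedule"


@dataclass(frozen=True)
class Toleration:
    key: str = ""
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""  # an empty effect matches every effect


@dataclass
class Node:
    name: str
    free_cpu_millis: int
    taints: list = field(default_factory=list)


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Simplified Kubernetes toleration matching."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.operator == "Exists":
        # An empty key with "Exists" matches any taint key.
        return tol.key == "" or tol.key == taint.key
    return tol.key == taint.key and tol.value == taint.value


def is_schedulable(node: Node, tolerations: list) -> bool:
    """A node is usable only if every blocking taint is tolerated."""
    blocking = [t for t in node.taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in blocking)


def naive_fits(pod_cpu_millis: int, nodes: list) -> bool:
    """Aggregate check that ignores taints (the suspected behavior)."""
    return sum(n.free_cpu_millis for n in nodes) >= pod_cpu_millis


def fits_on_some_node(pod_cpu_millis: int, nodes: list, tolerations: list) -> bool:
    """Taint-aware check: the pod must fit on one node it is allowed to use."""
    return any(
        n.free_cpu_millis >= pod_cpu_millis
        for n in nodes
        if is_schedulable(n, tolerations)
    )


# Hypothetical cluster loosely modeled on the scheduler message above.
nodes = [
    Node("worker-a", 500),
    Node("worker-b", 4000, [Taint("only-test-pods", "yes")]),
    Node("master-0", 4000, [Taint("node-role.kubernetes.io/master")]),
]
pod_cpu = 2000  # millicores requested by a single pod, no tolerations

print(naive_fits(pod_cpu, nodes))             # aggregate view says it fits
print(fits_on_some_node(pod_cpu, nodes, []))  # taint-aware view says it does not
```

With these made-up numbers the aggregate check accepts the workload while the taint-aware check rejects it, which mirrors the situation reported here: the untainted node is the only candidate, and it lacks the CPU.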
MCAD controller image: `quay.io/project-codeflare/mcad-controller:release-v1.32.0` (found in the Pod details link above)