RHOAIENG-50554: Make kueue optional for RayJob #1013

Open
kryanbeane wants to merge 1 commit into project-codeflare:main from kryanbeane:RHOAIENG-50554

Conversation

@kryanbeane
Contributor

Issue link

https://issues.redhat.com/browse/RHOAIENG-50554

What changes have been made

Make Kueue optional for RayJob, matching existing RayCluster behavior.

Previously, RayJob always injected a Kueue queue label (falling back to "default") and set suspend: true, causing jobs to hang when Kueue was not installed or no LocalQueue existed. Now:

  • If local_queue is omitted and no default LocalQueue is found, no Kueue label is added and the job runs without Kueue
  • If local_queue is provided or a default is auto-detected, the label is set
  • The SDK no longer sets suspend: true — Kueue's mutating admission webhook handles suspension automatically when installed
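
The label decision described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the SDK's actual code: `resolve_queue_labels`, `build_rayjob_manifest`, and the `find_default_queue` callback are hypothetical names, and the label key `kueue.x-k8s.io/queue-name` and `apiVersion` value are assumed rather than taken from this PR.

```python
from typing import Callable, Dict, Optional

# Assumed Kueue queue label key (not quoted in this PR description).
KUEUE_QUEUE_LABEL = "kueue.x-k8s.io/queue-name"


def resolve_queue_labels(
    local_queue: Optional[str],
    find_default_queue: Callable[[], Optional[str]],
) -> Dict[str, str]:
    """Return the Kueue label only when a queue is given or auto-detected."""
    # An explicit queue wins; otherwise fall back to the detected default.
    queue = local_queue or find_default_queue()
    # No queue at all means no label, so the job runs without Kueue.
    return {KUEUE_QUEUE_LABEL: queue} if queue else {}


def build_rayjob_manifest(
    name: str,
    namespace: str,
    local_queue: Optional[str] = None,
    find_default_queue: Callable[[], Optional[str]] = lambda: None,
) -> dict:
    """Build a skeleton RayJob manifest (hypothetical helper)."""
    manifest = {
        "apiVersion": "ray.io/v1",  # assumed API version
        "kind": "RayJob",
        "metadata": {"name": name, "namespace": namespace},
        # Note: spec.suspend is deliberately not set here; suspension is
        # left to Kueue's admission webhook when Kueue is installed.
        "spec": {},
    }
    labels = resolve_queue_labels(local_queue, find_default_queue)
    if labels:
        manifest["metadata"]["labels"] = labels
    return manifest
```

With no queue supplied and no default found, the manifest carries neither a queue label nor `suspend`, which is the case that previously caused jobs to hang.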

Verification steps

  1. With Kueue + explicit queue: Submit a RayJob with local_queue="your-queue" — verify the queue label is set and the job completes
  2. With Kueue + auto-detect: Submit a RayJob without local_queue in a namespace with a default LocalQueue — verify the default queue is auto-detected and the job completes
  3. Without Kueue: Submit a RayJob without local_queue on a cluster where Kueue is not installed or no LocalQueue exists — verify no Kueue label is added and the job runs immediately without hanging
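
The label checks in these steps could be scripted along the following lines. This is only a sketch: `check_rayjob` is a hypothetical helper that inspects a RayJob object already loaded as a dict (for example from JSON output of the cluster), and the label key is assumed to be `kueue.x-k8s.io/queue-name`. It does not check job completion.

```python
from typing import List, Optional

# Assumed Kueue queue label key (not quoted in this PR description).
KUEUE_QUEUE_LABEL = "kueue.x-k8s.io/queue-name"


def check_rayjob(rayjob: dict, expected_queue: Optional[str]) -> List[str]:
    """Return a list of problems; an empty list means the check passed.

    Pass the queue name for steps 1 and 2, or None for step 3.
    """
    problems = []
    labels = rayjob.get("metadata", {}).get("labels") or {}
    actual = labels.get(KUEUE_QUEUE_LABEL)
    if actual != expected_queue:
        problems.append(
            f"queue label is {actual!r}, expected {expected_queue!r}"
        )
    # Without a queue, a suspended job has nothing to unsuspend it.
    suspended = rayjob.get("spec", {}).get("suspend") is True
    if expected_queue is None and suspended:
        problems.append("spec.suspend is true but no queue is expected")
    return problems
```

The suspend check is applied only to the no-queue case, since with Kueue installed a job may legitimately be suspended until it is admitted.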

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • Testing is not required for this change

@openshift-ci-robot
Collaborator

openshift-ci-robot commented Feb 23, 2026

@kryanbeane: This pull request references RHOAIENG-50554 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Contributor

openshift-ci bot commented Feb 23, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign kryanbeane for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@codecov

codecov bot commented Feb 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.96%. Comparing base (37b9009) to head (dda4085).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1013      +/-   ##
==========================================
- Coverage   95.97%   95.96%   -0.01%     
==========================================
  Files          23       23              
  Lines        2211     2208       -3     
==========================================
- Hits         2122     2119       -3     
  Misses         89       89              


@kryanbeane added the test-guided-notebooks, test-ui-notebooks, and test-additional-notebooks labels Feb 23, 2026

Labels

  • jira/valid-reference
  • test-additional-notebooks
  • test-guided-notebooks: Run PR check to verify Guided notebooks
  • test-ui-notebooks: Run PR check to verify UI notebooks


2 participants