@googs1025
Collaborator

Pull Request Description

  • Check that the rayclusters CRD is installed before the controller starts (fixes a TODO)

Related Issues

Resolves: None

@googs1025
Collaborator Author

Another way: we could do the check in Helm, like this: InftyAI/llmaz#316 (comment)

// TODO: check crd exists or not. If not, we should fail here directly without moving forward.
// This is used to validate whether kuberay is installed now.
// Check if the CRD exists. If not, fail directly.
crdName := "rayclusters.ray.io"
Collaborator Author

This is hard-coded; not sure if that is good enough.

Collaborator

This is OK for the short term, but it would be better to construct it from the scheme.
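
One way to avoid the literal string, sketched with the standard library only. A CRD's object name is always `<plural>.<group>`, so it can be derived from those two parts. The `groupResource` type below is a hypothetical stand-in for whatever group/resource pair the project would obtain from its scheme or a REST mapper; where those values come from is left open here.

```go
package main

import "fmt"

// groupResource is a stdlib-only stand-in (hypothetical) for an API group
// plus the plural resource name served under it.
type groupResource struct {
	Group    string
	Resource string
}

// crdName returns the name of the CRD object for gr, which Kubernetes
// requires to have the form "<plural>.<group>".
func crdName(gr groupResource) string {
	return gr.Resource + "." + gr.Group
}

func main() {
	rayCluster := groupResource{Group: "ray.io", Resource: "rayclusters"}
	fmt.Println(crdName(rayCluster)) // prints "rayclusters.ray.io"
}
```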

Collaborator Author

🤔 In addition to the ray dependency, do we also need to do the same for the EnvoyGateway object?

Collaborator

Do you find anywhere suitable for the EnvoyGateway object check? Without it, the load balancer won’t be populated. Adding checks could help catch the issue earlier, but the failure would still be visible even without them.

Collaborator Author

> Do you find anywhere suitable for the EnvoyGateway object check? Without it, the load balancer won’t be populated. Adding checks could help catch the issue earlier, but the failure would still be visible even without them.

Sounds reasonable. The problem would surface anyway when creating envoy-gateway-config. 🤔

@googs1025
Collaborator Author

googs1025 commented Mar 30, 2025

@Jeffwan
Collaborator

Jeffwan commented Apr 1, 2025

@googs1025 do you mean the CI failure is due to the Ray CRD dependency?

kubectl create -k config/dependency

Let me double-check whether that's related to this change: #793

@googs1025
Collaborator Author

> @googs1025 do you mean the CI failure is due to the Ray CRD dependency?
>
> kubectl create -k config/dependency
>
> Let me double-check whether that's related to this change: #793

Oh, I mistakenly thought the dependency was not installed.
Thanks for the reminder.

@Jeffwan
Collaborator

Jeffwan commented Apr 2, 2025

image

@googs1025 the problem is that controller-manager lacks permission to get/list CRDs. Please grant the permission in this PR.
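
For illustration, the missing permission amounts to a rule granting `get` and `list` on `customresourcedefinitions` in the `apiextensions.k8s.io` API group. The sketch below shows that rule twice: as a kubebuilder RBAC marker (assuming the project generates its ClusterRole with controller-gen, which this thread does not confirm) and as a simplified `policyRule` struct with an `allows` helper. Both the struct and the helper are hypothetical models written for this example, not Kubernetes types, and wildcards are not handled.

```go
package main

import "fmt"

// Assumed marker form of the rule; controller-gen would turn it into a
// ClusterRole entry. Whether this project uses markers is an assumption.
// +kubebuilder:rbac:groups=apiextensions.k8s.io,resources=customresourcedefinitions,verbs=get;list

// policyRule is a simplified, hypothetical model of one RBAC rule.
type policyRule struct {
	APIGroups []string
	Resources []string
	Verbs     []string
}

// contains reports whether x is an element of xs.
func contains(xs []string, x string) bool {
	for _, v := range xs {
		if v == x {
			return true
		}
	}
	return false
}

// allows reports whether any rule permits verb on group/resource.
// Only exact matches are considered; "*" wildcards are ignored.
func allows(rules []policyRule, group, resource, verb string) bool {
	for _, r := range rules {
		if contains(r.APIGroups, group) && contains(r.Resources, resource) && contains(r.Verbs, verb) {
			return true
		}
	}
	return false
}

func main() {
	crdRead := policyRule{
		APIGroups: []string{"apiextensions.k8s.io"},
		Resources: []string{"customresourcedefinitions"},
		Verbs:     []string{"get", "list"},
	}

	// Without the rule the lookup is denied; with it, get is allowed.
	fmt.Println(allows(nil, "apiextensions.k8s.io", "customresourcedefinitions", "get"))
	fmt.Println(allows([]policyRule{crdRead}, "apiextensions.k8s.io", "customresourcedefinitions", "get"))
}
```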

googs1025 force-pushed the add_ray_crd_first branch from aab6d8d to a0cd509 on April 3, 2025 01:03
@googs1025
Collaborator Author

> @googs1025 the problem is that controller-manager lacks permission to get/list CRDs. Please grant the permission in this PR.

Thanks for this. Done.

Collaborator

@Jeffwan Jeffwan left a comment

/lgtm

@Jeffwan Jeffwan merged commit c03d7ca into vllm-project:main Apr 3, 2025
11 checks passed
gangmuk pushed a commit to gangmuk/aibrix-gangmuk that referenced this pull request Jun 21, 2025
…led before controller start (vllm-project#922)

feature(rayclusterreplicaset): check rayclusters crd is installed before controller start

Signed-off-by: googs1025 <[email protected]>
Yaegaki1Erika pushed a commit to Yaegaki1Erika/aibrix that referenced this pull request Jul 23, 2025
…led before controller start (vllm-project#922)

feature(rayclusterreplicaset): check rayclusters crd is installed before controller start

Signed-off-by: googs1025 <[email protected]>