Model replacement to Qwen3-32B #2189
Hi @sats-23. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Force-pushed from 40dfba9 to 82a6a88.
/cc @ahg-g
/ok-to-test
tools/dynamic-lora-sidecar/README.md (outdated)
Not done. Please revert the changes in this file and only update the model name here.
Force-pushed from ee2aad6 to 026bc39.
As the tests are tightly coupled with the docs and manifests, I would prefer that all changes related to the model port go into this single PR. Converting the PR to draft until all models are ported.
/test pull-gateway-api-inference-extension-test-unit-main
It seems a unit test is failing.
Yep @ahg-g, I will look into it soon.
Force-pushed from 034203f to fc42c33.
@ahg-g All comments have been addressed, PTAL.
ahg-g left a comment:
Thanks.
The file https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/site-src/_includes/verify-status-latest.md also needs updating.
Force-pushed from f50ce0a to b6ec9db.
Signed-off-by: Sathvik <Sathvik.S@ibm.com>
add back sglang engine type label
add back vllm engine type label
add back vllm engine type label
Thanks a lot!
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ahg-g, sats-23
* Model replacement to Qwen3-32B

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

* Update config/manifests/sglang/gpu-deployment.yaml

add back sglang engine type label

* Update config/manifests/vllm/cpu-deployment.yaml

add back vllm engine type label

* Update config/manifests/vllm/gpu-deployment.yaml

add back vllm engine type label

---------

Signed-off-by: Sathvik <Sathvik.S@ibm.com>
Co-authored-by: Abdullah Gharaibeh <40361897+ahg-g@users.noreply.github.com>
What type of PR is this?
/kind documentation
/kind cleanup
What this PR does / why we need it:
- Replaces the llama3-8b model with qwen3-32b across all references in the guides
- Propagates the changes to the manifests referenced in the guides
- For the fine-tuned example, replaces the adapters with https://huggingface.co/nicoboss/Qwen3-32B-Uncensored
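A rename like this is easy to leave half-finished, so a quick scan for leftover references to the old model can help reviewers. The following is only a minimal sketch, not part of this PR: the function name `find_stale_references`, the search string `llama3-8b` (taken from the description above; the guides and manifests may spell the model name differently), and the file suffixes are all assumptions.

```python
from pathlib import Path

# Assumption: the old model is referenced with this literal string.
# The real guides/manifests may use a different spelling.
OLD_MODEL = "llama3-8b"


def find_stale_references(root, needle=OLD_MODEL, suffixes=(".md", ".yaml", ".yml")):
    """Return (path, line number, line) for every line under root that still mentions needle."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only look at regular files with a doc or manifest suffix.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            # Case-insensitive match so "Llama3-8B" is caught as well.
            if needle.lower() in line.lower():
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Scan the current checkout and print anything that still names the old model.
    for path, lineno, line in find_stale_references("."):
        print(f"{path}:{lineno}: {line}")
```

Run from the repository root, an empty output would suggest no remaining references in the scanned file types; it does not cover other file types or differently spelled model names.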
Which issue(s) this PR fixes:
Issue #2151 (more PRs to follow)
Does this PR introduce a user-facing change?: Yes