Model replacement to Qwen3-32B by sats-23 · Pull Request #2189 · kubernetes-sigs/gateway-api-inference-extension

sats-23 · 2026-01-21T09:08:28Z

What type of PR is this?
/kind documentation
/kind cleanup

What this PR does / why we need it:
- Replaces the llama3-8b model with qwen3-32b across all references in the guides
- Propagates the changes to the manifests referenced in the guides
- For the fine-tuned example, replaces the adapters with https://huggingface.co/nicoboss/Qwen3-32B-Uncensored

Which issue(s) this PR fixes:

Issue #2151 (more PRs to follow)

Does this PR introduce a user-facing change?: Yes

Update manifests to use Qwen3-32B

netlify · 2026-01-21T09:08:35Z

✅ Deploy Preview for gateway-api-inference-extension ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | `394d49c` |
| 🔍 Latest deploy log | https://app.netlify.com/projects/gateway-api-inference-extension/deploys/69981e885e6b0c0008c32230 |
| 😎 Deploy Preview | https://deploy-preview-2189--gateway-api-inference-extension.netlify.app |

k8s-ci-robot · 2026-01-21T09:08:38Z

Hi @sats-23. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

sats-23 · 2026-01-22T12:47:54Z

/cc @ahg-g
/cc @nirrozenbaum

ahg-g · 2026-01-22T12:50:21Z

/ok-to-test

ahg-g

Thanks a lot!

conformance/tests/epp_unavailable_fail_open.go

conformance/tests/gateway_destination_endpoint_served.go

conformance/tests/gateway_following_epp_routing.go

conformance/tests/gateway_following_epp_routing_dp.go

conformance/tests/gateway_weighted_two_pools.go

test/e2e/epp/e2e_test.go

ahg-g · 2026-01-22T13:06:10Z

tools/dynamic-lora-sidecar/README.md

needs updating as well

not done, pls revert the changes in this file and only update the model name here

sats-23 · 2026-01-22T13:49:26Z

As the tests are tightly coupled with the docs and manifests, I would prefer all the changes related to the model port to go into this single PR. Converting the PR to draft until all models are ported.

sats-23 · 2026-01-22T17:08:27Z

/test pull-gateway-api-inference-extension-test-unit-main

ahg-g · 2026-01-23T05:29:27Z

it seems a unit test is failing

sats-23 · 2026-01-23T08:43:38Z

> it seems a unit test is failing

Yep @ahg-g, I will look into it soon

sats-23 · 2026-02-12T11:15:23Z

@ahg-g All comments have been addressed, PTAL

ahg-g

Thanks.

There is also the file https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/site-src/_includes/verify-status-latest.md that needs updating

site-src/_includes/epp.md

site-src/_includes/verify-status.md

site-src/guides/getting-started-latest.md

site-src/guides/index.md

site-src/guides/standalone.md

tools/dynamic-lora-sidecar/Dockerfile

ahg-g · 2026-02-13T14:30:41Z

tools/dynamic-lora-sidecar/README.md

not done, pls revert the changes in this file and only update the model name here

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

config/manifests/sglang/gpu-deployment.yaml

config/manifests/vllm/cpu-deployment.yaml

config/manifests/vllm/gpu-deployment.yaml

add back sglang engine type label

add back vllm engine type label

ahg-g · 2026-02-20T11:41:22Z

Thanks a lot!

/lgtm
/approve

k8s-ci-robot · 2026-02-20T11:41:31Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, sats-23

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [ahg-g]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

* Model replacement to Qwen3-32B

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

* Update config/manifests/sglang/gpu-deployment.yaml

add back sglang engine type label

* Update config/manifests/vllm/cpu-deployment.yaml

add back vllm engine type label

* Update config/manifests/vllm/gpu-deployment.yaml

add back vllm engine type label

---------

Signed-off-by: Sathvik <Sathvik.S@ibm.com>
Co-authored-by: Abdullah Gharaibeh <40361897+ahg-g@users.noreply.github.com>

k8s-ci-robot added the kind/documentation (categorizes issue or PR as related to documentation) and kind/cleanup (categorizes issue or PR as related to cleaning up code, process, or technical debt) labels Jan 21, 2026

k8s-ci-robot requested review from nirrozenbaum and robscott January 21, 2026 09:08

k8s-ci-robot added the cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) and needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test) labels Jan 21, 2026

k8s-ci-robot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) Jan 21, 2026

sats-23 force-pushed the issue2151 branch 2 times, most recently from 40dfba9 to 82a6a88 January 21, 2026 11:14

k8s-ci-robot requested a review from ahg-g January 22, 2026 12:47

k8s-ci-robot added the ok-to-test label (indicates a non-member PR verified by an org member that is safe to test) and removed the needs-ok-to-test label Jan 22, 2026

ahg-g reviewed Jan 22, 2026

sats-23 force-pushed the issue2151 branch 2 times, most recently from ee2aad6 to 026bc39 January 22, 2026 13:31

k8s-ci-robot added the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) and removed the size/L label Jan 22, 2026

sats-23 force-pushed the issue2151 branch from 745d307 to ee35d52 January 22, 2026 13:35

sats-23 marked this pull request as draft January 22, 2026 13:49

k8s-ci-robot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Jan 22, 2026

sats-23 force-pushed the issue2151 branch from ee35d52 to 73e1695 January 22, 2026 16:59

sats-23 force-pushed the issue2151 branch from 73e1695 to f9d95e0 January 26, 2026 15:32

sats-23 force-pushed the issue2151 branch 3 times, most recently from 034203f to fc42c33 February 12, 2026 11:14

k8s-ci-robot removed the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) Feb 12, 2026

k8s-ci-robot added the needs-rebase label Feb 12, 2026

ahg-g reviewed Feb 13, 2026

sats-23 force-pushed the issue2151 branch 5 times, most recently from f50ce0a to b6ec9db February 19, 2026 06:19

k8s-ci-robot added and removed the needs-rebase label Feb 19, 2026

Model replacement to Qwen3-32B

8265475

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

sats-23 force-pushed the issue2151 branch from 7d3a083 to 8265475 February 20, 2026 06:36

k8s-ci-robot removed the needs-rebase label Feb 20, 2026

sats-23 requested a review from ahg-g February 20, 2026 06:36

ahg-g reviewed Feb 20, 2026

config/manifests/sglang/gpu-deployment.yaml (resolved)

config/manifests/vllm/cpu-deployment.yaml (resolved)

config/manifests/vllm/gpu-deployment.yaml (resolved)

ahg-g added 3 commits February 20, 2026 00:42

Update config/manifests/sglang/gpu-deployment.yaml

6c4b043

add back sglang engine type label

Update config/manifests/vllm/cpu-deployment.yaml

74c4f0a

add back vllm engine type label

Update config/manifests/vllm/gpu-deployment.yaml

394d49c

add back vllm engine type label

k8s-ci-robot assigned ahg-g Feb 20, 2026

k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged) Feb 20, 2026

k8s-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Feb 20, 2026

k8s-ci-robot merged commit 3aa604b into kubernetes-sigs:main Feb 20, 2026
11 checks passed

4 participants
