clean up: remove model server type #2311

Merged
k8s-ci-robot merged 1 commit into kubernetes-sigs:main from
capri-xiyue:capri-xiyue/remove-inference-model-server-type
Feb 10, 2026

Contributor

@capri-xiyue capri-xiyue commented Feb 10, 2026

What type of PR is this?

/kind cleanup

What this PR does / why we need it:
Now that #2161 has merged, the Helm charts no longer need a model server type setting, because that information is exposed on the model server deployment through the `inference.networking.k8s.io/engine-type` label.
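
As a rough illustration of the idea (not the project's actual implementation), the engine type can be read from a workload's labels instead of from a chart value. The function name and the example label set are hypothetical; only the label key comes from this description.

```python
from typing import Mapping, Optional

# Label key quoted in the PR description.
ENGINE_TYPE_LABEL = "inference.networking.k8s.io/engine-type"


def engine_type_from_labels(labels: Mapping[str, str]) -> Optional[str]:
    """Return the engine type declared in the labels, or None if it is unset or blank."""
    value = labels.get(ENGINE_TYPE_LABEL, "").strip()
    return value or None


# Hypothetical labels, as they might appear on a model server deployment.
example_labels = {
    "app": "triton-llama3-8b-instruct",
    ENGINE_TYPE_LABEL: "triton-tensorrt-llm",
}
print(engine_type_from_labels(example_labels))
```

Returning `None` rather than a default keeps the sketch neutral about which engine a real consumer would fall back to.
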
Which issue(s) this PR fixes:

Fixes #

Does this PR introduce a user-facing change?:

Removes `inferencePool.modelServerType` from the `inferencepool` and `standalone` Helm charts, and removes `inferenceExtension.endpointsServer.modelServerType` from the `standalone` Helm chart.
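
For anyone checking an existing values file against this change, a minimal sketch along these lines could flag the two removed keys. The helper name and the sample values are hypothetical; the key paths are the ones named in the note above, and the values are assumed to be already parsed into nested dictionaries.

```python
from collections.abc import Mapping
from typing import Any, List

# Value paths named in the release note above.
REMOVED_KEYS = (
    "inferencePool.modelServerType",
    "inferenceExtension.endpointsServer.modelServerType",
)


def find_removed_keys(values: Mapping) -> List[str]:
    """Return the removed dotted paths that are still set in a parsed values mapping."""
    found = []
    for dotted in REMOVED_KEYS:
        node: Any = values
        for part in dotted.split("."):
            if not isinstance(node, Mapping) or part not in node:
                break  # path is not present in these values
            node = node[part]
        else:
            found.append(dotted)  # every path segment was present
    return found


# Hypothetical values, as if loaded from a values.yaml override.
sample_values = {"inferencePool": {"modelServerType": "triton-tensorrt-llm"}}
print(find_removed_keys(sample_values))
```
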

@k8s-ci-robot k8s-ci-robot added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Feb 10, 2026

netlify Bot commented Feb 10, 2026

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 8d8829e
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/698b8756e0278d0008ff7a6a
😎 Deploy Preview https://deploy-preview-2311--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 10, 2026
@k8s-ci-robot k8s-ci-robot requested a review from shmuelk February 10, 2026 19:30
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Feb 10, 2026
Use `--set inferencePool.modelServerType=triton-tensorrt-llm` to install for Triton TensorRT-LLM, e.g.,

```txt
$ helm install triton-llama3-8b-instruct \
```
Contributor Author

@capri-xiyue capri-xiyue Feb 10, 2026

I deleted this because the configuration should be done by setting labels on the model server deployment, which is already covered in https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/site-src/implementations/model-servers.md

@capri-xiyue
Contributor Author

/assign @ahg-g

@capri-xiyue
Contributor Author

/assign @kfswain

Collaborator

kfswain commented Feb 10, 2026

/lgtm
/approve

Thanks @capri-xiyue !

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 10, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: capri-xiyue, kfswain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 10, 2026
@k8s-ci-robot k8s-ci-robot merged commit a80ab3c into kubernetes-sigs:main Feb 10, 2026
15 checks passed
Contributor

ahg-g commented Feb 10, 2026

Actually, we have to revert this. While #2161 is merged, it is behind the datalayer feature flag, which is disabled by default. Multi-engine support is implemented as part of the metrics datalayer plugin, not in the legacy metrics collection path, which unfortunately is still the default and relies on the command-line flags.
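
The dependency described in this comment can be sketched roughly as follows. This is an illustration only, not the project's code: the function name, parameters, and example values are hypothetical, and only the label key and the flag-versus-label split come from this thread.

```python
from typing import Mapping, Optional

# Label key quoted in the PR description.
ENGINE_TYPE_LABEL = "inference.networking.k8s.io/engine-type"


def resolve_engine_type(
    datalayer_enabled: bool,
    workload_labels: Mapping[str, str],
    flag_value: Optional[str],
) -> Optional[str]:
    """Pick the engine type source according to which metrics path is active."""
    if datalayer_enabled:
        # Datalayer path: multi-engine support reads the label.
        return workload_labels.get(ENGINE_TYPE_LABEL)
    # Legacy path (the default): the label is ignored; only the flag counts.
    return flag_value


labels = {ENGINE_TYPE_LABEL: "triton-tensorrt-llm"}
# With the feature flag off and no chart value feeding the flag, the label has no effect.
print(resolve_engine_type(False, labels, None))
print(resolve_engine_type(True, labels, None))
```

In this sketch, removing the chart value while the flag is off leaves the legacy path with no engine type at all, which is the reason given above for the revert.
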

capri-xiyue added a commit to capri-xiyue/gateway-api-inference-extension that referenced this pull request Feb 10, 2026
k8s-ci-robot pushed a commit that referenced this pull request Feb 10, 2026
RyanRosario pushed a commit to RyanRosario/gateway-api-inference-extension that referenced this pull request Mar 9, 2026
RyanRosario pushed a commit to RyanRosario/gateway-api-inference-extension that referenced this pull request Mar 9, 2026
Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
kind/cleanup: Categorizes issue or PR as related to cleaning up code, process, or technical debt.
lgtm: "Looks good to me", indicates that a PR is ready to be merged.
size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.

4 participants