clean up: remove model server type #2311
Conversation
> Use `--set inferencePool.modelServerType=triton-tensorrt-llm` to install for Triton TensorRT-LLM, e.g.,
>
> ```txt
> $ helm install triton-llama3-8b-instruct \
> ```
I deleted it because such configuration should just be done by setting model server deployment labels, and this is already covered in other docs: https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/site-src/implementations/model-servers.md
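A minimal sketch of what the comment describes: advertising the engine type via a label on the model server Deployment instead of a Helm value. The Deployment name, image, and container name here are hypothetical; the label key `inference.networking.k8s.io/engine-type` comes from the PR description, and placing it on the pod template is an assumption (see the linked model-servers doc for the authoritative placement).

```yaml
# Hypothetical model server Deployment; the engine-type label is the point.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: triton-llama3-8b-instruct   # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: triton-llama3-8b-instruct
  template:
    metadata:
      labels:
        app: triton-llama3-8b-instruct
        # Engine type exposed via label, replacing the removed
        # inferencePool.modelServerType Helm value (assumed placement).
        inference.networking.k8s.io/engine-type: triton-tensorrt-llm
    spec:
      containers:
      - name: triton                # hypothetical container name/image
        image: nvcr.io/nvidia/tritonserver:latest
```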
|
/assign @ahg-g
|
/assign @kfswain
|
/lgtm Thanks @capri-xiyue !
|
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: capri-xiyue, kfswain

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
|
Actually we have to revert this. While #2161 is merged, it is behind the datalayer feature flag, which is disabled by default. The multi-engine support is implemented as part of the metrics datalayer plugin, not the legacy metrics collection path, which unfortunately is still the default and relies on the command-line flags.
This reverts commit a80ab3c (…sigs#2312).
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
As #2161 merged, we no longer need the model server type in the Helm chart; that info is now exposed on the model server deployment via the label `inference.networking.k8s.io/engine-type`.

Which issue(s) this PR fixes:
Fixes #
Does this PR introduce a user-facing change?: