helm support for sidecar injection in EPP by capri-xiyue · Pull Request #1821 · kubernetes-sigs/gateway-api-inference-extension

capri-xiyue · 2025-11-05T21:17:45Z

What type of PR is this?

What this PR does / why we need it:
see #1778

Which issue(s) this PR fixes:

Does this PR introduce a user-facing change?:

NONE

netlify · 2025-11-05T21:17:51Z

✅ Deploy Preview for gateway-api-inference-extension ready!

Name	Link
🔨 Latest commit	`fd9cdea`
🔍 Latest deploy log	https://app.netlify.com/projects/gateway-api-inference-extension/deploys/691781c63eca46000871b2e7
😎 Deploy Preview	https://deploy-preview-1821--gateway-api-inference-extension.netlify.app

kfswain · 2025-11-05T21:41:25Z

Thanks @capri-xiyue! Can we make this a distinct helm chart? Since we call this current chart inferencePool it would be odd to have a mode in that chart that doesn't even use inferencePool.

capri-xiyue · 2025-11-05T21:59:45Z

@kfswain At this stage, EPP standalone mode still uses InferencePool, which is why I didn't put it in a distinct chart. Later, if we decide to have a standalone EPP without any Kubernetes CRD, I will refactor it into a distinct Helm chart.

To clarify, this PR only removes the Gateway API dependency; the InferencePool API is still used.

capri-xiyue · 2025-11-05T22:00:20Z

/assign @ahg-g

kfswain · 2025-11-05T22:04:29Z

That may lead to confusing UX when this is deployed in a cluster that has a GW controller, because the controller will attempt to reconcile the InferencePool and integrate it into the GW system. Is modifying EPP to just accept a selector not a viable path forward?

capri-xiyue · 2025-11-05T22:09:28Z

I talked with @ahg-g earlier; modifying EPP to just accept a selector needs further discussion, so he suggested that I first finalize EPP with Envoy proxy in the Helm chart.

I'm curious what happens when an InferencePool is deployed in a cluster with two GW controllers (for example, kgateway and Istio). Will it cause issues here? I initially assumed each GW controller could handle this case.

ahg-g · 2025-11-05T22:25:29Z

Yeah, my suggestion is to take a gradual approach. A gateway controller should not care about an InferencePool that is not referenced by an HTTPRoute.
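
For context on the comment above, a gateway controller normally learns about an InferencePool because an HTTPRoute names it in `backendRefs`. The fragment below is a minimal sketch of such a reference, not taken from this PR. It assumes the `inference.networking.k8s.io` API group, and every resource name in it is hypothetical.

```yaml
# Sketch only: all names are hypothetical, and the InferencePool API group is an assumption.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route                 # hypothetical route name
spec:
  parentRefs:
    - name: inference-gateway     # hypothetical Gateway managed by a GW controller
  rules:
    - backendRefs:
        # This reference is what ties the pool into the gateway system.
        - group: inference.networking.k8s.io
          kind: InferencePool
          name: my-pool           # hypothetical InferencePool name
```

Under the gradual approach suggested here, an InferencePool with no route like this pointing at it would simply be left alone by the controller.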

capri-xiyue · 2025-11-06T03:30:30Z

As an update, I'm now working on another PR that modifies EPP to just accept a selector. Since no InferencePool is needed in that PR, I will refactor the Helm chart into a distinct one there. The EPP refactor will probably take a while, as fixing a number of unit tests takes time. I will let you know when I send the PR. @kfswain

danehans · 2025-11-07T19:42:21Z

IMHO we need to get the #1778 design gdoc that @ahg-g created converted to Markdown and added to docs/proposals.

capri-xiyue · 2025-11-14T00:44:50Z

With that being said, I agree with the point of splitting the vendor-specific pieces into a separate repo, provided that we offer the necessary "hooks" to enable that. So my suggestion is to make the necessary changes in this chart so that an external chart can customize it to allow those other deployment patterns.

@capri-xiyue isn't all that's needed here to enable the above simply allowing a sidecar proxy to be added to the EPP deployment?

Yes. I've updated it to enable additional sidecar proxy injection.

ahg-g

Great!

@danehans can you please make sure this works for agentgateway?

ahg-g · 2025-11-14T18:23:25Z

config/charts/inferencepool/values.yaml

+#      # Because the template just dumps this section, the keys become filenames.
+#      # The values MUST be strings (note the literal block scalar '|')
+#      data:
+#        envoy.yaml: |


What assumptions does this configuration make about the EPP configuration, the model server configuration, and the InferencePool? Put another way, does this configuration assume that the EPP, model servers, and InferencePool are configured in a specific way?

The example Envoy config only makes assumptions about the communication between the Envoy proxy and EPP. It has nothing to do with the model servers or the InferencePool. Also, users can customize the Envoy config and use whatever config they want.

Ideally we want a config that just works, and the helm chart we offer should not even offer any customization for this config.

Yep. The current default config works, since I simply disable sidecar injection. By the way, the example Envoy config was originally commented out already. I have now removed it, as it will live in llm-d.

I was referring to the Envoy config. That config in llm-d should just work and not be customizable.

Yep. The envoy config in llm-d will just work and won't be customizable. I tested it with no inference pool dependency. Will ping you when I open a PR in llm-d
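
The values.yaml comment quoted at the start of this review thread says that the ConfigMap keys become filenames and that the values must be strings. The fragment below is a generic Kubernetes sketch of that behavior. It is not the chart's template and not the llm-d Envoy config. The ConfigMap name is hypothetical and the Envoy content is only a placeholder.

```yaml
# Sketch only: the ConfigMap name and the Envoy content are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  name: epp-sidecar-config        # hypothetical name
data:
  # The key "envoy.yaml" becomes the filename when this ConfigMap is mounted as a volume.
  # The '|' literal block scalar keeps the value a single multi-line string,
  # which ConfigMap data requires.
  envoy.yaml: |
    admin:
      address:
        socket_address:
          address: 127.0.0.1
          port_value: 9901
```

If a sidecar container mounted this ConfigMap at, say, `/etc/envoy`, the file would appear as `/etc/envoy/envoy.yaml`. Writing the value as a nested YAML mapping instead of a string would be rejected, because ConfigMap `data` values must be strings.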

ahg-g · 2025-11-14T20:06:07Z

/lgtm
/approve

Great!

k8s-ci-robot · 2025-11-14T20:06:17Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, capri-xiyue

Needs approval from an approver in each of these files:

~~OWNERS~~ [ahg-g]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

* draft
* draft
* workable version
* added docs
* fixed format
* move epp-envoy to a standalone file
* make standalone a provider
* removed unused change
* remove unused change
* syntax fix
* workable version
* remove unused yaml
* added more customized info and removed example
* fixed typo
* fixed default image pull policy

capri-xiyue added 4 commits November 3, 2025 16:35

draft

67d8bc9

draft

35b63ad

workable version

b5f49b4

added docs

d8b3966

k8s-ci-robot requested review from liu-cong and nirrozenbaum November 5, 2025 21:17

k8s-ci-robot added the cncf-cla: yes and size/XL labels Nov 5, 2025

fixed format

6d06c16

k8s-ci-robot assigned ahg-g Nov 5, 2025

ahg-g reviewed Nov 5, 2025

config/charts/inferencepool/templates/epp-deployment.yaml (outdated)

move epp-envoy to a standalone file

92d36bd

capri-xiyue requested a review from ahg-g November 6, 2025 00:28

k8s-ci-robot added the size/L label and removed the size/XL label Nov 6, 2025

ahg-g reviewed Nov 7, 2025

config/charts/inferencepool/values.yaml (outdated)

ahg-g reviewed Nov 7, 2025

config/charts/inferencepool/values.yaml (outdated)

make standalone a provider

f11f891

k8s-ci-robot added the size/XL label and removed the size/L label Nov 7, 2025

capri-xiyue requested a review from ahg-g November 7, 2025 18:18

k8s-ci-robot added the size/L label Nov 13, 2025

capri-xiyue added 3 commits November 13, 2025 11:16

remove unused change

b294e51

syntax fix

139eec8

workable version

1538ae3

k8s-ci-robot added the size/XL label and removed the size/L label Nov 14, 2025

remove unused yaml

2b59475

k8s-ci-robot added the size/L label and removed the size/XL label Nov 14, 2025

capri-xiyue changed the title ~~helm support for EPP standalone mode with Envoy Proxy~~ helm support for sidecar injection in EPP Nov 14, 2025

capri-xiyue requested review from ahg-g and kfswain November 14, 2025 17:58

ahg-g reviewed Nov 14, 2025

added more customized info and removed example

f840091

k8s-ci-robot added the size/M label and removed the size/L label Nov 14, 2025

fixed typo

bd809e5

capri-xiyue requested a review from ahg-g November 14, 2025 18:51

ahg-g reviewed Nov 14, 2025

config/charts/inferencepool/templates/epp-deployment.yaml (outdated)

ahg-g reviewed Nov 14, 2025

config/charts/inferencepool/values.yaml

fixed default image pull policy

fd9cdea

capri-xiyue requested a review from ahg-g November 14, 2025 19:24

k8s-ci-robot added the lgtm label Nov 14, 2025

k8s-ci-robot added the approved label Nov 14, 2025

k8s-ci-robot merged commit 81deb19 into kubernetes-sigs:main Nov 14, 2025
11 checks passed

delavet mentioned this pull request Dec 8, 2025

feature: (helm) support custom volumes and volumeMounts for epp #1945

Merged
