update bbr quickstart guide with latest functionality #2150

Merged
k8s-ci-robot merged 2 commits into kubernetes-sigs:main from nirrozenbaum:bbr-configmap-guide on Jan 15, 2026

@nirrozenbaum
Contributor

What type of PR is this?
/kind documentation

What this PR does / why we need it:
Updates the quickstart guide for serving multiple inference pools so that it matches the code on main.

Which issue(s) this PR fixes:

This PR completes the work on managing multiple inference pools through BBR.
Fixes #1812

Does this PR introduce a user-facing change?:

None

@k8s-ci-robot k8s-ci-robot added the kind/documentation Categorizes issue or PR as related to documentation. label Jan 14, 2026
@netlify

netlify bot commented Jan 14, 2026

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 6d6e6af
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/6967efa3fef939000842265b
😎 Deploy Preview https://deploy-preview-2150--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 14, 2026
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jan 14, 2026
@nirrozenbaum
Contributor Author

/hold self-verifying before opening for review.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 14, 2026
@nirrozenbaum
Contributor Author

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 14, 2026
Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>
@nirrozenbaum
Contributor Author

cc: @howardjohn please note that, following our conversation on #1908, kgateway was temporarily removed because it doesn't support BBR's new functionality. It would be great if we could add kgateway back with support for the new functionality.

Contributor

@howardjohn howardjohn left a comment

Please don't remove our implementation. We can support the native and ext-proc approach, so there is no need for removal -- maybe tweaks.

I don't see any new functionality that cannot be done natively, IIUC. I believe the idea is there is now a mapping of adapter to model, instead of a simple body --> header.

This can be trivially done by the proxy as well with a minor tweak:

apiVersion: gateway.kgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: bbr
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: inference-gateway
  traffic:
    phase: PreRouting
    transformation:
      request:
        set:
        - name: X-Gateway-Model-Name
          value: |
            {
              "food-review-1": "meta-llama/Llama-3.1-8B-Instruct",
              "meta-llama/Llama-3.1-8B-Instruct": "meta-llama/Llama-3.1-8B-Instruct",
            }[json(request.body).model]
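
For readers unfamiliar with the expression syntax, the lookup in the policy above can be sketched in plain Python. This is only an illustration of the mapping logic, not how the proxy or BBR implements it: the function name `model_header` and the constant names are hypothetical, and returning an empty result for an unknown model is an assumption of this sketch rather than documented gateway behavior.

```python
import json

# Mapping copied from the policy above: each model or adapter name that may
# appear in the request body maps to the base model that serves it.
ADAPTER_TO_BASE_MODEL = {
    "food-review-1": "meta-llama/Llama-3.1-8B-Instruct",
    "meta-llama/Llama-3.1-8B-Instruct": "meta-llama/Llama-3.1-8B-Instruct",
}

# Header name taken from the policy above.
HEADER_NAME = "X-Gateway-Model-Name"


def model_header(body):
    """Return the header to set for a JSON request body (hypothetical helper).

    Assumption: an unknown or missing model yields no header at all.
    """
    model = json.loads(body).get("model")
    base = ADAPTER_TO_BASE_MODEL.get(model)
    return {HEADER_NAME: base} if base is not None else {}


if __name__ == "__main__":
    print(model_header(b'{"model": "food-review-1", "prompt": "hi"}'))
```

The adapter name and the base model name both resolve to the same base model, which is what lets a route match on a single header value regardless of which name the client sent.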

@nirrozenbaum
Contributor Author

Sounds good.
I'll make these additions and ping you to review that section.
Holding to prevent merging without kgateway.
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 14, 2026
Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jan 14, 2026
@nirrozenbaum
Contributor Author

@howardjohn I updated the AgentgatewayPolicy as you suggested.
Please let me know if this looks OK.

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 14, 2026
Contributor

@howardjohn howardjohn left a comment

thanks!

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: howardjohn, nirrozenbaum

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kfswain
Collaborator

kfswain commented Jan 15, 2026

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 15, 2026
@k8s-ci-robot k8s-ci-robot merged commit 2a92276 into kubernetes-sigs:main Jan 15, 2026
11 checks passed
@nirrozenbaum nirrozenbaum deleted the bbr-configmap-guide branch January 19, 2026 08:29
RyanRosario pushed a commit to RyanRosario/gateway-api-inference-extension that referenced this pull request Jan 20, 2026
…s#2150)

* update bbr quickstart guide with latest functionality

Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>

* add back kgateway bbr based on agentgateway policy

Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>

---------

Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/documentation Categorizes issue or PR as related to documentation. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Multi Pool support through BBR

4 participants