@RyanRosario
Contributor

@RyanRosario RyanRosario commented Dec 30, 2025

What type of PR is this?
/kind documentation
/kind feature

What this PR does / why we need it:

It adds new observability metrics for flow control.

Which issue(s) this PR fixes:
Related to #1708

Does this PR introduce a user-facing change?:

NO

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/documentation Categorizes issue or PR as related to documentation. kind/feature Categorizes issue or PR as related to a new feature. labels Dec 30, 2025
@netlify

netlify bot commented Dec 30, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit fa18396
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/69537608ce4e1e00083611c3
😎 Deploy Preview https://deploy-preview-2044--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 30, 2025
@k8s-ci-robot
Contributor

Hi @RyanRosario. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Dec 30, 2025
@RyanRosario RyanRosario changed the title [WIP] Add additional observability metrics for flow control Add flowcontrol queue length in bytes metric Jan 9, 2026
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 9, 2026
@kfswain
Collaborator

kfswain commented Jan 12, 2026

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 12, 2026
@ahg-g
Contributor

ahg-g commented Jan 14, 2026

/assign @LukeAVanDrie

@k8s-ci-robot
Contributor

@ahg-g: GitHub didn't allow me to assign the following users: LukeAVanDrie.

Note that only kubernetes-sigs members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide


Contributor

@LukeAVanDrie LukeAVanDrie left a comment


Thanks, Ryan! I have left a few minor inline comments, but I have no blocking concerns.
This LGTM!

/assign @ahg-g

Comment on lines +225 to +232
metrics.AddFlowControlQueueBytes(
	flowKey.ID, priority,
	req.InferencePoolName(),
	req.ModelName(), req.TargetModelName(), req.ByteSize())
defer metrics.SubFlowControlQueueBytes(
	flowKey.ID, priority,
	req.InferencePoolName(),
	req.ModelName(), req.TargetModelName(), req.ByteSize())
Contributor


nit: Even though req.ByteSize() is technically immutable right now, it is a defensive best practice to capture the size in a local variable before the defer.

This guarantees that the Add and Sub operations are always mathematically symmetric. If a future refactor changes how ByteSize() is calculated (making it mutable), we risk the Gauge drifting permanently (e.g., subtracting more than we added or vice versa).
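
For illustration, here is a minimal sketch of the capture-before-defer pattern. The `gauge` and `request` types and the `enqueueAndWait` function are hypothetical stand-ins, not the PR's actual code; only the `ByteSize()` method name comes from the diff. Go evaluates a deferred call's arguments when the `defer` statement executes, so the original code calls `ByteSize()` twice in quick succession. Capturing the value once removes any dependence on those two calls agreeing.

```go
package main

import "fmt"

// gauge is a hypothetical stand-in for the real metric gauge.
type gauge struct{ value float64 }

func (g *gauge) Add(n uint64) { g.value += float64(n) }
func (g *gauge) Sub(n uint64) { g.value -= float64(n) }

// request is a hypothetical stand-in for the flow control request type.
type request struct{ body []byte }

// ByteSize mirrors the method name used in the diff; the body is assumed.
func (r *request) ByteSize() uint64 { return uint64(len(r.body)) }

// enqueueAndWait captures the size once, so Add and Sub always use
// the same value even if the request changes while it is queued.
func enqueueAndWait(g *gauge, r *request, work func()) {
	size := r.ByteSize()
	g.Add(size)
	defer g.Sub(size)
	work()
}

func main() {
	g := &gauge{}
	r := &request{body: make([]byte, 32)}
	enqueueAndWait(g, r, func() {
		// Simulate a future refactor in which the size changes mid-flight.
		r.body = append(r.body, make([]byte, 8)...)
		fmt.Println("while queued:", g.value)
	})
	fmt.Println("after dequeue:", g.value)
}
```

With the captured local, the gauge returns to its starting value regardless of what happens to the request in between.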

)

// Basic Inc/Dec
AddFlowControlQueueBytes("user-a", "100", pool, model, target, 32.0)
Contributor


nit: The helper function AddFlowControlQueueBytes accepts a uint64. While Go's untyped constants allow 32.0 to compile, it is cleaner to use integer literals (e.g., 32) to match the function signature.
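
A small sketch of why `32.0` compiles here, using a hypothetical `addBytes` helper rather than the real `AddFlowControlQueueBytes`: an untyped constant is accepted for a `uint64` parameter as long as its value is exactly representable in that type.

```go
package main

import "fmt"

// addBytes is a hypothetical helper that, like the function under
// review, takes its size argument as a uint64.
func addBytes(total *uint64, n uint64) { *total += n }

func main() {
	var total uint64
	addBytes(&total, 32)   // integer literal, matches the signature
	addBytes(&total, 32.0) // compiles: the untyped constant 32.0 is representable as uint64
	// addBytes(&total, 32.5) // would not compile: 32.5 is not an integer value
	fmt.Println(total)
}
```

Both calls behave identically at run time; the integer literal simply makes the intended type obvious to the reader.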

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: LukeAVanDrie, RyanRosario
Once this PR has been reviewed and has the lgtm label, please assign danehans for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@LukeAVanDrie
Contributor

Oh, @RyanRosario, since we are adding a new public metric that operators will use, this is a user-facing change. Please update the release note section in your PR description (on this and the other metrics PRs):

E.g.,

Added `inference_extension_flow_control_queue_bytes` metric to track the total size (in bytes) of requests currently buffered in the Flow Control layer.

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/documentation Categorizes issue or PR as related to documentation. kind/feature Categorizes issue or PR as related to a new feature. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

5 participants