Add logs (at log level 1) for better observability #2384

Merged
k8s-ci-robot merged 4 commits into kubernetes-sigs:main from rahulgurnani:add-logs
Feb 24, 2026

Conversation

@rahulgurnani
Contributor

@rahulgurnani rahulgurnani commented Feb 19, 2026

Add logs that record when a request/response is sent to the proxy, at the default log level 1 (info logs), for better observability.

What type of PR is this?

/kind cleanup

What this PR does / why we need it:
Ensure that EPP has logs at info level for the request/response lifecycle. Sample logs for a request processed by EPP, based on the current PR:

{"level":"info","ts":"2026-02-20T22:24:39Z","caller":"handlers/server.go:217","msg":"EPP received request","x-request-id":"bench-1f452021-72"}
{"level":"info","ts":"2026-02-20T22:24:39Z","caller":"handlers/server.go:406","msg":"EPP sent request body response(s) to proxy","x-request-id":"bench-1f452021-72","modelName":"meta-llama/Llama-3.1-8B-Instruct","targetModelName":"meta-llama/Llama-3.1-8B-Instruct"}
{"level":"info","ts":"2026-02-20T22:25:00Z","caller":"handlers/server.go:434","msg":"EPP sent response body back to proxy","x-request-id":"bench-1f452021-72"}

Which issue(s) this PR fixes:

Fixes #

Does this PR introduce a user-facing change?:

More per-LLM-request logging at the default log level for better observability

@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. labels Feb 19, 2026
@netlify

netlify Bot commented Feb 19, 2026

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 5792c3a
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/699e04cdf7d6350008b0df13
😎 Deploy Preview https://deploy-preview-2384--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 19, 2026
@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Feb 19, 2026
@rahulgurnani rahulgurnani marked this pull request as ready for review February 19, 2026 22:07
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 19, 2026
@rahulgurnani
Contributor Author

rahulgurnani commented Feb 19, 2026

Assigning it to @kfswain since they have more context about it

@rahulgurnani
Contributor Author

/assign @kfswain

@nirrozenbaum
Contributor

this seems to me like too much information for log level 1.
in level 1 we should essentially log only warning/error and critical messages.

in high load this will thrash the log, making it almost impossible to see warnings or errors

Comment thread pkg/epp/handlers/server.go Outdated
return status.Errorf(codes.Unknown, "failed to send response back to Envoy: %v", err)
}
}
logger.Info("EPP sent request body response(s) to envoy", "modelName", r.IncomingModelName, "targetModelName", r.TargetModelName)
Collaborator

We support more than Envoy (NGINX works with our protocol). Can we specify proxy here?

Contributor Author

updated, thanks

Comment thread pkg/epp/handlers/server.go Outdated
}
logger = logger.WithValues(requtil.RequestIdHeaderKey, requestID)

logger.Info("EPP received request") // Request ID will be logged too as part of logger context values.
Collaborator

Nit: pls set the log level

Contributor Author

Info log is already set to log level 1, so do we really need to set the log level here? 🤔

Collaborator

verbosity level** thanks!

Collaborator

Pls set the verbosity level to 1. Even though we consider our default and floor to be 1, a user who for some reason wants to rig this for silent running could set it to 0: https://pkg.go.dev/github.com/go-logr/logr#Logger.V

Note that this is distinct from the log level in terms of severity.

Contributor Author

@rahulgurnani rahulgurnani Feb 24, 2026

Updated to set the level explicitly in the whole PR. Thanks!

Comment thread pkg/epp/handlers/server.go Outdated
if err := srv.Send(r.respHeaderResp); err != nil {
return status.Errorf(codes.Unknown, "failed to send response back to Envoy: %v", err)
}
logger.Info("EPP sent response headers to envoy", "response headers", r.respHeaderResp)
Collaborator

Instead of dumping the response headers, can we output just the error code?

Contributor Author

actually, after some thought and reading through the discussion, I removed this log line. I am only logging when the complete response is sent back

@kfswain
Collaborator

kfswain commented Feb 19, 2026

Thanks @rahulgurnani! I think we can/should do a quick test of performance to make sure this isn't going to generate lock contention in the buffer, as the RC cut for the next release is in a week.

> in high load this will thrash the log, making it almost impossible to see warnings or errors

The logging solutions I'm aware of have filters for the different log types; I'm mostly concerned about the perf impacts.

@nirrozenbaum
Contributor

nirrozenbaum commented Feb 20, 2026

> Thanks @rahulgurnani! I think we can/should do a quick test of performance to make sure this isn't going to generate lock contention in the buffer, as the RC cut for the next release is in a week.
>
> > in high load this will thrash the log, making it almost impossible to see warnings or errors
>
> The logging solutions I'm aware of have filters for the different log types; I'm mostly concerned about the perf impacts.

I agree that performance is the main concern, but at the same time these log messages are informative rather than critical.
Any logging best practices guide would tell us not to log these messages at v=1.
When we run in production and there's a real issue, we want to find it in the logs easily.

Can we update the log level of these messages to 2?
Anyone who wants to see them can always use higher verbosity.

@kfswain
Collaborator

kfswain commented Feb 20, 2026

I disagree... we have gotten direct customer feedback that this is critical observability required for their system. And it makes sense: our v1 is very quiet and it's hard to know if anything is going on.

> any logging best practices guide would tell us not to log these messages in v=1.

If you have a link to that, I would love to see it. That's typically around log level (info, warn, error, etc.) and not verbosity level.

If we wanted to compile this all to a single log entry per request, I could agree with that, and that's more in line with.
But otherwise the point of:

> when we run in production and there’s a real issue, we want to find it in the logs easily.

Flies counter to the point of these logs. If a customer is running at v=1, they will want some knowledge of their request outcome (as well as the status code) to help triage a prod issue. We provide none of that. Again, if you're looking for error or warning logs, since this is structured logging, you can easily filter on that.

@nirrozenbaum
Contributor

a single log line per request would be a good place to meet in the middle 👍🏻

@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 20, 2026
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Feb 20, 2026
@rahulgurnani
Contributor Author

> Thanks @rahulgurnani! I think we can/should do a quick test of performance to make sure this isn't going to generate lock contention in the buffer, as the RC cut for the next release is in a week.
>
> > in high load this will thrash the log, making it almost impossible to see warnings or errors
>
> The logging solutions I'm aware of have filters for the different log types; I'm mostly concerned about the perf impacts.

I ran the vllm benchmark with some profiling and didn't see any performance issues.

I also reduced the log lines to 3 per request, which I think is in line with the feedback we received of 3 log lines per request.
Furthermore, the error logs in the file for a request were at Debug level 2, which meant they would not be logged at the default verbosity, so I changed them as well in the last commit. Let me know what you think. Thanks!

@nirrozenbaum
Contributor

/approve

Thanks @rahulgurnani 🙏🏼

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 21, 2026
@rahulgurnani rahulgurnani requested a review from kfswain February 23, 2026 19:39
@kfswain
Collaborator

kfswain commented Feb 24, 2026

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 24, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kfswain, nirrozenbaum, rahulgurnani

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [kfswain,nirrozenbaum]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 6a3d548 into kubernetes-sigs:main Feb 24, 2026
11 checks passed
RyanRosario pushed a commit to RyanRosario/gateway-api-inference-extension that referenced this pull request Mar 9, 2026
)

* Add debug logs for better observability

* Log only when the entire body is sent back to proxy

* Make error logs be logged at level 1 in server.go

* Set explicit verbosity
elevran pushed a commit to llm-d/llm-d-inference-scheduler that referenced this pull request Apr 23, 2026
…ateway-api-inference-extension#2384)

* Add debug logs for better observability

* Log only when the entire body is sent back to proxy

* Make error logs be logged at level 1 in server.go

* Set explicit verbosity
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

4 participants