Add logs (at log level 1) for better observability#2384
Add logs (at log level 1) for better observability#2384k8s-ci-robot merged 4 commits intokubernetes-sigs:mainfrom
Conversation
|
Skipping CI for Draft Pull Request. |
✅ Deploy Preview for gateway-api-inference-extension ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
Assigning it to @kfswain since they have more context about it |
|
/assign @kfswain |
a571c98 to
c250f42
Compare
|
this seems to me like too much information for log level 1. in high load this will thrash the log, making it almost impossible to see warnings or errors |
| return status.Errorf(codes.Unknown, "failed to send response back to Envoy: %v", err) | ||
| } | ||
| } | ||
| logger.Info("EPP sent request body response(s) to envoy", "modelName", r.IncomingModelName, "targetModelName", r.TargetModelName) |
There was a problem hiding this comment.
We support more than Envoy (NGINX works with our protocol). Can we specify proxy here?
There was a problem hiding this comment.
updated, thanks
| } | ||
| logger = logger.WithValues(requtil.RequestIdHeaderKey, requestID) | ||
|
|
||
| logger.Info("EPP received request") // Request ID will be logged too as part of logger context values. |
There was a problem hiding this comment.
Nit: pls set the log level
There was a problem hiding this comment.
Info log is already set to log level 1, so do we really need to set the log level here? 🤔
There was a problem hiding this comment.
verbosity level** thanks!
There was a problem hiding this comment.
Pls set the verbosity level to 1, even though we consider our default and floor to be 1, if for some reason a user wants to rig this for silent running they could set to 0 https://pkg.go.dev/github.com/go-logr/logr#Logger.V
Note that this is distinct from the log level in terms of severity.
There was a problem hiding this comment.
Updated to set the level explicitly in the whole PR. Thanks!
| if err := srv.Send(r.respHeaderResp); err != nil { | ||
| return status.Errorf(codes.Unknown, "failed to send response back to Envoy: %v", err) | ||
| } | ||
| logger.Info("EPP sent response headers to envoy", "response headers", r.respHeaderResp) |
There was a problem hiding this comment.
Instead of dumping the response headers, can we output just the error code?
There was a problem hiding this comment.
actually, after some thought and reading through the discussion, I removed this log line. I am only logging when the complete response is sent back
|
Thanks @rahulgurnani! I think we can/should do a quick test of performance to make sure this isn't going to generate lock contention in the buffer. As the RC cut for the next release is in a week.
|
I agree that performance is the main concern, but at the same time these logging messages aren’t critical, but informative. can we update log level of these messages to 2? |
|
I disagree... we have gotten direct customer feedback that this is critical observability required for their system. And it makes sense, our v1 is very quiet and it's hard to know if anything is going on.
If you have a link to that, I would love to see it. That's typically around log level (Info, warn,error, etc) and not verbosity level. If we wanted to compile this all to a single log entry per request, I could agree with that, and thats more in line with.
Flys counter to the point of these logs. if a customer is running at v=1, they will want some knowledge of their request outcome (as well as the status code) to help triage prod issue. We provide none of that. Again, if you're looking for error or warning logs, since this is structured logging, you can easily filter on that. |
|
a single log line per request would be a good place to meet in the middle 👍🏻 |
c250f42 to
7fce268
Compare
16d9733 to
a4b8a6b
Compare
I did run vllm benchmark with some profiling and didn't see any performance issues in my benchmarking. I also reduced the log lines to 3 per request and I think its inline with the feedback we received of 3 log lines per request. |
|
/approve Thanks @rahulgurnani 🙏🏼 |
c5355a0 to
5792c3a
Compare
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kfswain, nirrozenbaum, rahulgurnani The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…ateway-api-inference-extension#2384) * Add debug logs for better observability * Log only when the entire body is sent back to proxy * Make error logs be logged at level 1 in server.go * Set explicit verbosity
Add debug logs to log request/response is sent to the proxy at the default log level 1 (info logs) for better observability
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Ensure that EPP has logs at info level for request/response lifecycle. Sample logs for a request processed by EPP with based on current PR:
Which issue(s) this PR fixes:
Fixes #
Does this PR introduce a user-facing change?: