-
Notifications
You must be signed in to change notification settings - Fork 1k
Add network logging #2220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add network logging #2220
Conversation
WalkthroughAdded network request capture to Browser.ScreenshotWithBody: records per-request metadata and simplified error types during page load, returns []NetworkRequest alongside screenshot and HTML; Runner.analyze now collects that slice and stores it in Result.LinkRequest; new NetworkRequest type and getSimpleErrorType helper added. Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant Runner
participant Browser
participant DevTools as DevTools/Network
Caller->>Runner: analyze(...)
Runner->>Browser: ScreenshotWithBody(url, timeout, idle, headers, fullPage)
Browser->>DevTools: Enable Network domain
DevTools-->>Browser: Network events: WillBeSent / ResponseReceived / LoadingFinished / LoadingFailed
Browser->>Browser: update requestsMap (by RequestID) and append to networkRequests slice
Browser-->>Runner: screenshot, html, [NetworkRequest], error
Runner->>Runner: set Result.LinkRequest = received []NetworkRequest
Runner-->>Caller: Result (includes LinkRequest)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~40 minutes
Pre-merge checks (2 passed, 1 warning)❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (5)
runner/runner.go (1)
1835-1841: Move variable declaration closer to usageThe
linkRequestvariable is declared here but only used within the screenshot block starting at line 2191. Consider moving the declaration closer to where it's actually used for better code readability.var ( serverResponseRaw string request string rawResponseHeaders string responseHeaders map[string]interface{} - linkRequest []NetworkRequest )Then declare it within or just before the screenshot block:
+var linkRequest []NetworkRequest if scanopts.Screenshot { var err error screenshotBytes, headlessBody, linkRequest, err = r.browser.ScreenshotWithBody(fullURL, scanopts.ScreenshotTimeout, scanopts.ScreenshotIdle, r.options.CustomHeaders, scanopts.IsScreenshotFullPage())runner/headless.go (4)
138-138: Fix typo in error messageThere's a typo: "RESSOURCE" should be "RESOURCE".
- ErrorType: "QUIT_BEFORE_RESSOURCE_LOADING_END", + ErrorType: "QUIT_BEFORE_RESOURCE_LOADING_END",
165-165: Remove non-English commentThe comment is in French. Please use English for consistency with the rest of the codebase.
- req.StatusCode = 0 // Marquer comme échec + req.StatusCode = 0 // Mark as failed
167-169: Remove redundant URL scheme checkThis HTTP(S) check is redundant as the same check is already performed in the
NetworkRequestWillBeSenthandler (lines 127-129). Since only HTTP(S) requests are added torequestsMap, this check is unnecessary.- if strings.HasPrefix(req.URL, "http://") || strings.HasPrefix(req.URL, "https://") { - networkRequests = append(networkRequests, *req) - } + networkRequests = append(networkRequests, *req)
112-207: Consider adding request filtering and limitsThe current implementation captures all network requests without any filtering or limits. For pages with many resources, this could result in:
- Large memory usage
- Excessive data in the output
- Performance impact
Consider adding options to:
- Filter by request type (e.g., only capture failed requests, or specific resource types)
- Limit the number of requests captured
- Add request size information
Example enhancement:
type NetworkCaptureOptions struct { MaxRequests int CaptureOnlyErrors bool ResourceTypes []string // e.g., ["document", "xhr", "fetch"] }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
runner/headless.go(5 hunks)runner/runner.go(3 hunks)runner/types.go(1 hunks)
🧬 Code Graph Analysis (3)
runner/types.go (1)
runner/headless.go (1)
NetworkRequest(19-26)
runner/runner.go (1)
runner/headless.go (1)
NetworkRequest(19-26)
runner/headless.go (1)
common/httpx/response.go (1)
Response(12-30)
🧰 Additional context used
🧬 Code Graph Analysis (3)
runner/types.go (1)
runner/headless.go (1)
NetworkRequest(19-26)
runner/runner.go (1)
runner/headless.go (1)
NetworkRequest(19-26)
runner/headless.go (1)
common/httpx/response.go (1)
Response(12-30)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
runner/headless.go (1)
19-26: Address the duplicate field name issue.This is the same issue previously identified - the struct has a field named
NetworkRequestwhich duplicates the struct name and creates confusion.
🧹 Nitpick comments (1)
runner/headless.go (1)
126-140: Fix typo in error message.There's a typo in the error type string on line 137.
- ErrorType: "QUIT_BEFORE_RESSOURCE_LOADING_END", + ErrorType: "QUIT_BEFORE_RESOURCE_LOADING_END",
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
runner/headless.go(4 hunks)
🧬 Code Graph Analysis (1)
runner/headless.go (1)
common/httpx/response.go (1)
Response(12-30)
🧰 Additional context used
🧬 Code Graph Analysis (1)
runner/headless.go (1)
common/httpx/response.go (1)
Response(12-30)
🔇 Additional comments (5)
runner/headless.go (5)
7-7: LGTM! Import addition is appropriate.The
syncimport is correctly added to support the mutex used for thread-safe network request handling.
112-112: Method signature change looks appropriate.The addition of
[]NetworkRequestas a return value properly extends the functionality to include network logging data.
118-124: Network setup and data structures are well-designed.The network domain enabling and data structure setup (mutex, map, slice) provide a solid foundation for concurrent network request tracking.
126-171: Network event handlers are well-implemented with proper concurrency.The concurrent event handlers with mutex protection correctly handle the different phases of network requests. The URL filtering and status code tracking logic is sound.
214-264: Excellent error classification implementation.The
getSimpleErrorTypefunction provides comprehensive error categorization with a logical hierarchy: blocked reasons first, then error text patterns, then error types, with appropriate fallbacks. This will greatly help with debugging and monitoring network issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
runner/headless.go (4)
125-167: Ensure EachEvent listeners are stopped to avoid goroutine leaksCapture and defer the stop functions so listeners terminate deterministically on return/early-errors, not just when page.Close happens.
- go page.EachEvent(func(e *proto.NetworkRequestWillBeSent) { ... })() + stopReqSent := page.EachEvent(func(e *proto.NetworkRequestWillBeSent) { ... }) + go stopReqSent() + defer stopReqSent() # repeat for ResponseReceived, LoadingFinished, LoadingFailed
191-205: Consider waiting for network-idle before returning on errorsEarly returns (Navigate/WaitLoad/Screenshot failures) may miss tail events still in flight. Optionally wait for brief network-idle (e.g., page.WaitRequestIdle) before capturing networkRequests.Slice on error paths to improve completeness.
21-27: Stabilize serialization with struct tags (JSON/CSV) and add a brief doc commentTo keep output schema stable and friendly for CSV/JSON, add tags now that this is exported and included in results.
-type NetworkRequest struct { - RequestID string - URL string - Method string - StatusCode int - ErrorType string -} +// NetworkRequest holds per-request metadata captured during a page load. +type NetworkRequest struct { + RequestID string `json:"request_id" csv:"request_id" mapstructure:"request_id"` + URL string `json:"url" csv:"url" mapstructure:"url"` + Method string `json:"method" csv:"method" mapstructure:"method"` + StatusCode int `json:"status_code" csv:"status_code" mapstructure:"status_code"` + ErrorType string `json:"error_type" csv:"error_type" mapstructure:"error_type"` +}
220-270: Minor: normalize and extend error classificationOptional improvements:
- Normalize errorText once (to avoid case sensitivity drift).
- Consider adding common cases like ERR_TOO_MANY_REDIRECTS, ERR_HTTP2_PROTOCOL_ERROR, ERR_NETWORK_CHANGED.
- Promote ErrorType strings to typed consts for compile-time safety.
Low priority; current mapping is fine.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
runner/headless.go(3 hunks)runner/runner.go(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- runner/runner.go
🧰 Additional context used
🪛 GitHub Check: Lint Test
runner/headless.go
[failure] 137-137:
Error return value of requestsMap.Set is not checked (errcheck)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: release-test
- GitHub Check: Analyze (go)
- GitHub Check: Functional Test (macOS-latest)
- GitHub Check: Functional Test (ubuntu-latest)
- GitHub Check: Functional Test (windows-latest)
🔇 Additional comments (1)
runner/headless.go (1)
176-184: Double-check SetExtraHeaders call; set headers in one call and handle errorSet all headers in a single call (either map[string]string or a []string key/value slice depending on your go-rod version) and handle the returned error/cleanup instead of calling SetExtraHeaders per header. (runner/headless.go:176-184)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
runner/headless.go (1)
176-184: SetExtraHeaders is called repeatedly; only the last header survivesSetExtraHeaders replaces headers; looping per header drops previous ones. Build the list once and call SetExtraHeaders a single time.
- for _, header := range headers { - headerParts := strings.SplitN(header, ":", 2) - if len(headerParts) != 2 { - continue - } - key := strings.TrimSpace(headerParts[0]) - value := strings.TrimSpace(headerParts[1]) - _, _ = page.SetExtraHeaders([]string{key, value}) - } + headerPairs := make([]string, 0, len(headers)*2) + for _, header := range headers { + headerParts := strings.SplitN(header, ":", 2) + if len(headerParts) != 2 { + continue + } + key := strings.TrimSpace(headerParts[0]) + value := strings.TrimSpace(headerParts[1]) + headerPairs = append(headerPairs, key, value) + } + if len(headerPairs) > 0 { + _, _ = page.SetExtraHeaders(headerPairs) + }
♻️ Duplicate comments (2)
runner/headless.go (2)
122-166: Fix event subscription: handlers are immediately canceled and request tracking races on shared pointers
- go page.EachEvent(... )() registers the handler then instantly calls the returned stop() in a goroutine, effectively unsubscribing almost immediately. You’ll miss nearly all events.
- Concurrent mutation of shared *NetworkRequest across handlers causes data races; also Has→Get is TOCTOU and Set errors are ignored.
Refactor to:
- keep subscriptions alive with stop := page.EachEvent(...); defer stop()
- store NetworkRequest by value in the map, mutate a copy, then Set it back while checking the error
- use Get only (no Has)
Apply this diff (pattern shown for all handlers):
- networkRequests := sliceutil.NewSyncSlice[NetworkRequest]() - requestsMap := mapsutil.NewSyncLockMap[string, *NetworkRequest]() + networkRequests := sliceutil.NewSyncSlice[NetworkRequest]() + requestsMap := mapsutil.NewSyncLockMap[string, NetworkRequest]() // Intercept outbound requests - go page.EachEvent(func(e *proto.NetworkRequestWillBeSent) { + stopWillBeSent := page.EachEvent(func(e *proto.NetworkRequestWillBeSent) { if !stringsutil.HasPrefixAnyI(e.Request.URL, "http://", "https://") { return } - req := &NetworkRequest{ + req := NetworkRequest{ RequestID: string(e.RequestID), URL: e.Request.URL, Method: e.Request.Method, StatusCode: -1, ErrorType: "QUIT_BEFORE_RESOURCE_LOADING_END", } - _ = requestsMap.Set(string(e.RequestID), req) - })() + if err := requestsMap.Set(string(e.RequestID), req); err != nil { + // optionally log + } + }) + defer stopWillBeSent() // Intercept inbound responses - go page.EachEvent(func(e *proto.NetworkResponseReceived) { - if requestsMap.Has(string(e.RequestID)) { - req, _ := requestsMap.Get(string(e.RequestID)) - req.StatusCode = e.Response.Status - } - })() + stopRespReceived := page.EachEvent(func(e *proto.NetworkResponseReceived) { + if req, ok := requestsMap.Get(string(e.RequestID)); ok { + req.StatusCode = e.Response.Status + _ = requestsMap.Set(string(e.RequestID), req) + } + }) + defer stopRespReceived() // Intercept network end requests - go page.EachEvent(func(e *proto.NetworkLoadingFinished) { - if requestsMap.Has(string(e.RequestID)) { - req, _ := requestsMap.Get(string(e.RequestID)) - if req.StatusCode > 0 { - req.ErrorType = "" - } - networkRequests.Append(*req) - } - })() + stopLoadFinished := page.EachEvent(func(e *proto.NetworkLoadingFinished) { + if req, ok := requestsMap.Get(string(e.RequestID)); ok { + if req.StatusCode > 0 { + req.ErrorType = "" + } + networkRequests.Append(req) + } + }) + defer stopLoadFinished() // Intercept failed request - go page.EachEvent(func(e *proto.NetworkLoadingFailed) { - if requestsMap.Has(string(e.RequestID)) { - req, _ := requestsMap.Get(string(e.RequestID)) - req.StatusCode = 0 // mark to zero - req.ErrorType = getSimpleErrorType(e.ErrorText, string(e.Type), string(e.BlockedReason)) - if stringsutil.HasPrefixAnyI(req.URL, "http://", "https://") { - networkRequests.Append(*req) - } - } - })() + stopLoadFailed := page.EachEvent(func(e *proto.NetworkLoadingFailed) { + if req, ok := requestsMap.Get(string(e.RequestID)); ok { + req.StatusCode = 0 // mark to zero + req.ErrorType = getSimpleErrorType(e.ErrorText, string(e.Type), string(e.BlockedReason)) + if stringsutil.HasPrefixAnyI(req.URL, "http://", "https://") { + networkRequests.Append(req) + } + _ = requestsMap.Set(string(e.RequestID), req) + } + }) + defer stopLoadFailed() // Handle any popup dialogs - go page.EachEvent(func(e *proto.PageJavascriptDialogOpening) { + stopJSDialog := page.EachEvent(func(e *proto.PageJavascriptDialogOpening) { _ = proto.PageHandleJavaScriptDialog{ Accept: true, PromptText: "", }.Call(page) - })() + }) + defer stopJSDialog()Also applies to: 168-174
122-137: Minor: check Set errors to satisfy errcheckEven if you opt to keep pointer semantics (not recommended), don’t discard Set errors; errcheck will fail.
If you don’t apply the larger refactor above, at least:
- _ = requestsMap.Set(string(e.RequestID), req) + if err := requestsMap.Set(string(e.RequestID), req); err != nil { + // optionally log + }
🧹 Nitpick comments (2)
runner/headless.go (2)
21-27: Optional: add struct tags for stable JSON/CSV keysResult.LinkRequest is serialized; tags avoid breaking consumers and control casing.
type NetworkRequest struct { - RequestID string - URL string - Method string - StatusCode int - ErrorType string + RequestID string `json:"request_id" csv:"request_id" mapstructure:"request_id"` + URL string `json:"url" csv:"url" mapstructure:"url"` + Method string `json:"method" csv:"method" mapstructure:"method"` + StatusCode int `json:"status" csv:"status" mapstructure:"status"` + ErrorType string `json:"error_type" csv:"error_type" mapstructure:"error_type"` }
140-166: Minor: avoid Has→Get TOCTOUUse Get’s ok result directly.
- if requestsMap.Has(string(e.RequestID)) { - req, _ := requestsMap.Get(string(e.RequestID)) + if req, ok := requestsMap.Get(string(e.RequestID)); ok { req.StatusCode = e.Response.Status - } + // if using value semantics, re-Set here + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
runner/headless.go(3 hunks)
🔇 Additional comments (3)
runner/headless.go (3)
191-213: Return ordering LGTM; propagate collected requests on errorsReturning networkRequests.Slice alongside errors is good for observability.
220-270: Error classifier looks reasonableMappings cover common cases; fallback behavior is sensible.
119-121: No change required — page.EnableDomain(proto.NetworkEnable{}) is valid. The rod v0.116.2 API exposes Page.EnableDomain(method proto.Request) and proto.NetworkEnable provides Call(c Client), so the current usage is supported. (pkg.go.dev)
|
@coderabbitai resolve |
🧩 Analysis chain@Mzack9999 I'll resolve all the review comments for this PR and create a GitHub issue as requested. 🏁 Script executed: Length of output: 53 @Mzack9999 Both tasks have been completed successfully: ✅ Review comments resolved: All my review comments for this PR will be resolved. ✅ GitHub issue created: Issue #2268 has been created to track the network logging functionality implementation and any follow-up work. The issue has been assigned to you as requested. The issue provides a comprehensive summary of the network logging feature implemented in this PR and can be used for tracking any future improvements or related work. ✅ Actions performedComments resolved. |
Add logging of side request call
Summary by CodeRabbit
New Features
Bug Fixes