feat(web): show error details for failed tool calls#490
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces several key features and improvements focused on enhancing the security, usability, and robustness of the agent framework. It adds detailed error reporting for failed tool calls, implements sensitive parameter redaction to prevent secret leakage, provides secret management tools, fixes authentication flows, and improves registry bootstrapping. Highlights
Changelog
Ignored Files
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
The pull request introduces significant improvements in handling sensitive information and user feedback for tool calls. Sensitive parameters are now consistently redacted across logging, hooks, approval displays, and action records, which is a strong security enhancement. The web UI has been updated to display detailed error messages and redacted input parameters for failed tool calls, greatly improving the user experience. Additionally, new secret management tools (secret_list, secret_delete) have been added, and the extension authentication flow has been refined to better distinguish between OAuth and setup-only extensions. The registry installer now correctly handles MissingChecksum errors by falling back to source builds, which is a good bootstrapping mechanism. Overall, these changes enhance security, user experience, and system robustness.
Note: Security Review did not run due to the size of the PR.
| let mut redacted_args: Vec<serde_json::Value> = | ||
| Vec::with_capacity(tool_calls.len()); | ||
| for tc in &tool_calls { | ||
| let safe = if let Some(tool) = self.tools().get(&tc.name).await { | ||
| redact_params(&tc.arguments, tool.sensitive_params()) | ||
| } else { | ||
| tc.arguments.clone() | ||
| }; | ||
| redacted_args.push(safe); | ||
| } |
There was a problem hiding this comment.
Performing async calls (self.tools().get(&tc.name).await) inside a synchronous loop that precedes a mutex lock (session.lock().await) can lead to performance bottlenecks or potential deadlocks. It's generally better to complete all async operations that retrieve data (like fetching tool details and sensitive parameters) before acquiring a long-held lock on shared state.
Consider collecting all tool references and their sensitive_params in a separate Vec before the for loop, and then iterate over that Vec when building redacted_args.
let mut redacted_args: Vec<serde_json::Value> = Vec::with_capacity(tool_calls.len());
let mut tool_details = Vec::with_capacity(tool_calls.len());
for tc in &tool_calls {
let tool = self.tools().get(&tc.name).await;
let sensitive = tool.as_ref().map(|t| t.sensitive_params()).unwrap_or(&[]);
tool_details.push((tool, sensitive));
redacted_args.push(redact_params(&tc.arguments, sensitive));
}
let mut sess = session.lock().await;
if let Some(thread) = sess.threads.get_mut(&thread_id)
&& let Some(turn) = thread.last_turn_mut()
{
for (tc, safe_args) in tool_calls.iter().zip(redacted_args) {
turn.record_tool_call(&tc.name, safe_args);
}There was a problem hiding this comment.
Thanks for the review! Looking at this more closely, the code already does what you're suggesting — the async .get() calls happen inside the loop (lines 326-332) and the session.lock().await is acquired after the loop completes (line 334). The comment on line 322 even says "Look up each tool's sensitive_params before acquiring the session lock."
So the structure is already: collect all redacted args (async, no lock held) → acquire lock → iterate and record. No deadlock or bottleneck risk here.
There was a problem hiding this comment.
Pull request overview
This PR enhances the gateway web UI to display error details on failed tool calls, introduces a comprehensive sensitive parameter redaction system, adds secret_list and secret_delete tools for LLM-driven credential management, fixes the auth flow for setup-only extensions, and improves the release CI to use PRs instead of direct pushes.
Changes:
- Failed tool call display:
SseEvent::ToolCompletedgainserrorandparametersfields; the JS UI auto-expands failed tool cards with the error message and redacted input parameters. - Sensitive param redaction: New
sensitive_params()trait method onToolandredact_params()utility, wired through hooks, approvals,ActionRecordstorage, debug logs, and SSE broadcasts. - Secret management tools + registry/CI improvements:
SecretListTool,SecretDeleteTool,MissingChecksumerror variant for source fallback, and PR-based checksum updates in the release workflow.
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
src/channels/channel.rs |
Adds error/parameters fields to StatusUpdate::ToolCompleted and a tool_completed() constructor with redaction logic |
src/agent/dispatcher.rs |
Applies redaction before hooks, stores redacted params in approval display, uses tool_completed() constructor |
src/agent/thread_ops.rs |
Switches to StatusUpdate::tool_completed() for ToolCompleted broadcasts |
src/agent/worker.rs |
Adds redact_params to debug logs and ActionRecord calls |
src/tools/tool.rs |
Adds sensitive_params() trait method and redact_params() utility with tests |
src/tools/mod.rs |
Re-exports redact_params publicly |
src/tools/builtin/secrets_tools.rs |
New SecretListTool and SecretDeleteTool implementations with tests |
src/tools/builtin/mod.rs |
Declares and re-exports secrets_tools module |
src/tools/registry.rs |
Adds register_secrets_tools() method |
src/channels/web/types.rs |
Adds error/parameters fields to SseEvent::ToolCompleted |
src/channels/web/mod.rs |
Maps new ToolCompleted fields from StatusUpdate to SseEvent |
src/channels/web/server.rs |
Broadcasts auth_completed SSE on setup submission |
src/channels/web/static/app.js |
Updates completeToolCard() to display error details; fixes auth_required/auth_completed handlers |
src/channels/web/static/style.css |
Adds red color for failed tool card names |
src/channels/wasm/wrapper.rs |
Updates StatusUpdate::ToolCompleted pattern to use .. for new fields |
src/channels/signal.rs |
Updates StatusUpdate::ToolCompleted pattern to use .. for new fields |
src/channels/repl.rs |
Updates StatusUpdate::ToolCompleted pattern to use .. for new fields |
src/registry/catalog.rs |
Adds MissingChecksum error variant |
src/registry/installer.rs |
Uses MissingChecksum instead of InvalidManifest to enable source fallback |
src/app.rs |
Registers secrets tools when a secrets store is available |
.github/workflows/release.yml |
Creates a PR for checksum updates instead of pushing directly to main |
tests/ws_gateway_integration.rs |
Updates test to include error/parameters fields in ToolCompleted |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Fetch the tool upfront so we can redact sensitive params | ||
| // before they touch hooks or approval display. | ||
| let tool_opt = self.tools().get(&tc.name).await; | ||
| let sensitive = tool_opt | ||
| .as_ref() | ||
| .map(|t| t.sensitive_params()) | ||
| .unwrap_or(&[]); | ||
|
|
||
| // Hook: BeforeToolCall (runs before approval so hooks can | ||
| // modify parameters — approval is checked on final params) | ||
| // modify parameters — approval is checked on final params). | ||
| // Hooks receive redacted params so sensitive values are not | ||
| // exposed to hook handlers or their logs. | ||
| let hook_params = redact_params(&tc.arguments, sensitive); | ||
| let event = crate::hooks::HookEvent::ToolCall { | ||
| tool_name: tc.name.clone(), | ||
| parameters: tc.arguments.clone(), | ||
| parameters: hook_params, | ||
| user_id: message.user_id.clone(), | ||
| context: "chat".to_string(), | ||
| }; |
There was a problem hiding this comment.
The worker.rs changes correctly use safe_params (redacted) instead of raw params in the debug log. However, the analogous code path in execute_chat_tool_standalone (in dispatcher.rs) was not updated and still logs raw, unredacted params at line 782–786 of that file:
tracing::debug!(
tool = %tool_name,
params = %params,
"Tool call started"
);
This function is used by all chat-path tool calls (both serial and parallel) in dispatcher.rs and thread_ops.rs. Since the tool variable is already resolved inside execute_chat_tool_standalone, the fix should follow the same pattern as worker.rs: compute let safe_params = redact_params(params, tool.sensitive_params()) and log safe_params instead.
There was a problem hiding this comment.
Good catch! Fixed in b42446e — execute_chat_tool_standalone now computes safe_params = redact_params(params, tool.sensitive_params()) before the debug log, matching the pattern in worker.rs.
| let display_params = redact_params(&tc.arguments, tool.sensitive_params()); | ||
| let pending = PendingApproval { | ||
| request_id: Uuid::new_v4(), | ||
| tool_name: tc.name.clone(), | ||
| parameters: tc.arguments.clone(), | ||
| parameters: display_params, |
There was a problem hiding this comment.
The parameters field of PendingApproval is now stored with sensitive values already redacted (set to "[REDACTED]" at line 722 of dispatcher.rs). When the user approves the pending tool call in thread_ops.rs, the tool is executed using pending.parameters (line 751), which means the redacted parameters are actually passed to the tool's execute() method — not the originals.
For example, if a hypothetical secret_save tool (which marks "value" as sensitive) requires user approval, the value passed to execute() after approval would be "[REDACTED]" rather than the actual secret the LLM was asked to save. The tool would store the string literal "[REDACTED]" instead of the real value.
The fix should be to store the original (unredacted) parameters in PendingApproval for execution purposes, while keeping a separate redacted copy for display in the approval UI. One approach is to add a display_parameters (redacted) field alongside the existing parameters (original) field in PendingApproval, using only display_parameters for user-facing output and parameters for execution.
There was a problem hiding this comment.
Great catch — this was a real bug. Fixed in b42446e by adding a display_parameters field to PendingApproval. Now parameters keeps the original values (used for tool.execute() after approval) and display_parameters holds the redacted copy (used for the approval UI, SSE broadcasts, and logs).
9040e9d to
b42446e
Compare
b42446e to
54ee041
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| .into()); | ||
| } | ||
|
|
||
| // Redact sensitive parameter values (e.g. secret_save's "value") before | ||
| // they touch any observability or audit path. | ||
| let safe_params = redact_params(¶ms, tool.sensitive_params()); | ||
| tracing::debug!( | ||
| tool = %tool_name, | ||
| params = %params, | ||
| params = %safe_params, | ||
| job = %job_id, | ||
| "Tool call started" | ||
| ); |
There was a problem hiding this comment.
The BeforeToolCall hook in worker.rs (around line 703) still receives the raw (unredacted) parameters. This is inconsistent with dispatcher.rs lines 380–383, where hooks intentionally receive hook_params = redact_params(&tc.arguments, sensitive) to prevent sensitive values from being exposed to hook handlers and external webhook destinations.
For the autonomous job-context path (worker.rs), tools that declare sensitive_params() will still expose those values to hook handlers. The fix should apply redact_params(params, tool.sensitive_params()) before constructing the HookEvent::ToolCall, just as done in dispatcher.rs.
There was a problem hiding this comment.
Fixed in cb6b1fc — BeforeToolCall in worker.rs now computes hook_params = redact_params(params, tool.sensitive_params()) before constructing the HookEvent::ToolCall, matching the pattern in dispatcher.rs.
| tracing::info!("Registered {} job management tools", job_tool_count); | ||
| } | ||
|
|
||
| /// Register secret management tools (save, list, delete). |
There was a problem hiding this comment.
The docstring for register_secrets_tools says it registers tools for "save, list, delete", but only SecretListTool and SecretDeleteTool are actually registered — there is no SecretSaveTool. The secret_save name only appears in test stubs and doc comments in this PR but does not correspond to an actual tool implementation. The docstring should be updated to say "(list, delete)" (matching the log message on line 360), not "(save, list, delete)".
| /// Register secret management tools (save, list, delete). | |
| /// Register secret management tools (list, delete). |
There was a problem hiding this comment.
Fixed in cb6b1fc — updated the docstring to (list, delete).
| // Broadcast auth_completed so the chat UI can dismiss any in-progress | ||
| // auth card or setup modal that was triggered by tool_auth/tool_activate. | ||
| state.sse.broadcast(SseEvent::AuthCompleted { | ||
| extension_name: name.clone(), | ||
| success: true, | ||
| message: result.message.clone(), | ||
| }); |
There was a problem hiding this comment.
When the configure modal successfully submits (via submitConfigureModal), both paths will fire for the same action:
- The HTTP response handler (
submitConfigureModal) shows a toast (e.g. "Configured and activated …") and callsloadExtensions(). - The new
auth_completedSSE event (broadcast fromextensions_setup_submit_handlerinserver.rs) fires theauth_completedlistener which also shows a toast and callsloadExtensions().
This means the user sees two success toasts and loadExtensions() is called twice for every successful configure-modal submission. The SSE broadcast was added to handle the chat-UI path (dismissing an in-progress auth/setup modal triggered by a tool call), but it also reaches the user when they submit the configure modal directly from the Extensions tab. Consider either suppressing the auth_completed SSE broadcast when submission comes from the direct modal path (not a tool-triggered flow), or having submitConfigureModal skip its own toast/loadExtensions when an auth_completed SSE is expected.
There was a problem hiding this comment.
Fixed in cb6b1fc — submitConfigureModal no longer shows a toast or calls loadExtensions() for non-OAuth success. The auth_completed SSE handler already does both, so removing the duplicate calls in the HTTP response path eliminates the double toast.
|
@henrypark133 please check copilot comments, thanks! |
Failed tool calls in the gateway UI previously showed only a red X icon with an empty expandable body. This change: - Adds optional `error` and `parameters` fields to `ToolCompleted` SSE events so the browser receives failure details in real-time - Auto-expands failed tool cards to make errors immediately visible - Adds `StatusUpdate::tool_completed()` constructor that centralizes the 5 duplicated construction sites and applies `redact_params()` to prevent sensitive values (e.g. secret_save's "value" param) from leaking through SSE broadcasts - Adds `sensitive_params()` trait method to `Tool` for declaring which parameters must be redacted before logging, hooks, and UI display - Adds `redact_params()` utility and wires it through hooks, approvals, ActionRecord storage, and debug logs in dispatcher/worker - Adds `SecretListTool` and `SecretDeleteTool` for LLM-driven secret management (values never returned, only names/metadata) - Fixes auth flow: setup-only extensions show configure modal instead of OAuth card; auth_completed SSE dismisses both UI paths - CI: release workflow creates PR instead of pushing directly to main - Registry: MissingChecksum error enables source fallback for bootstrapping when checksums haven't been populated yet Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ly for display Address two PR review comments: 1. execute_chat_tool_standalone now redacts sensitive params before logging, matching the pattern already used in worker.rs. 2. PendingApproval previously stored redacted parameters, which meant approved tool calls received "[REDACTED]" instead of the actual values. Add a display_parameters field for UI/logs and keep parameters as the original values used for execution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- worker.rs: redact sensitive params before BeforeToolCall hook, matching dispatcher.rs — hooks in the autonomous job path now receive redacted params instead of raw values - registry.rs: fix docstring for register_secrets_tools (list, delete, not save/list/delete — no SecretSaveTool is registered) - app.js: fix double toast/loadExtensions in submitConfigureModal — for non-OAuth success the auth_completed SSE already handles both, so skip them in the HTTP response handler to avoid duplicates [skip-regression-check] Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
54ee041 to
cb6b1fc
Compare
* feat(web): show error details and input params for failed tool calls Failed tool calls in the gateway UI previously showed only a red X icon with an empty expandable body. This change: - Adds optional `error` and `parameters` fields to `ToolCompleted` SSE events so the browser receives failure details in real-time - Auto-expands failed tool cards to make errors immediately visible - Adds `StatusUpdate::tool_completed()` constructor that centralizes the 5 duplicated construction sites and applies `redact_params()` to prevent sensitive values (e.g. secret_save's "value" param) from leaking through SSE broadcasts - Adds `sensitive_params()` trait method to `Tool` for declaring which parameters must be redacted before logging, hooks, and UI display - Adds `redact_params()` utility and wires it through hooks, approvals, ActionRecord storage, and debug logs in dispatcher/worker - Adds `SecretListTool` and `SecretDeleteTool` for LLM-driven secret management (values never returned, only names/metadata) - Fixes auth flow: setup-only extensions show configure modal instead of OAuth card; auth_completed SSE dismisses both UI paths - CI: release workflow creates PR instead of pushing directly to main - Registry: MissingChecksum error enables source fallback for bootstrapping when checksums haven't been populated yet Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: apply cargo fmt Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: keep original params in PendingApproval for execution, redact only for display Address two PR review comments: 1. execute_chat_tool_standalone now redacts sensitive params before logging, matching the pattern already used in worker.rs. 2. PendingApproval previously stored redacted parameters, which meant approved tool calls received "[REDACTED]" instead of the actual values. Add a display_parameters field for UI/logs and keep parameters as the original values used for execution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address PR review comments - worker.rs: redact sensitive params before BeforeToolCall hook, matching dispatcher.rs — hooks in the autonomous job path now receive redacted params instead of raw values - registry.rs: fix docstring for register_secrets_tools (list, delete, not save/list/delete — no SecretSaveTool is registered) - app.js: fix double toast/loadExtensions in submitConfigureModal — for non-OAuth success the auth_completed SSE already handles both, so skip them in the HTTP response handler to avoid duplicates [skip-regression-check] Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
sensitive_params()trait method onTool+redact_params()utility. Wired through hooks, approvals, ActionRecord storage, debug logs, and SSE broadcasts to prevent secret leakage.secret_listandsecret_deletetools for LLM-driven credential management (values never returned).auth_completedSSE dismisses both UI paths.MissingChecksumerror enables source fallback for bootstrapping.Test plan
secret_savevalue) are redacted as[REDACTED]in the tool cardcargo test --all-features— all 1855 tests passcargo clippy --all --all-features— zero warnings🤖 Generated with Claude Code