feat(web): persist tool calls, restore approvals on thread switch, and UI fixes#382
feat(web): persist tool calls, restore approvals on thread switch, and UI fixes#382serrrfirat merged 5 commits intomainfrom
Conversation
…d UI fixes - Persist tool call summaries as role="tool_calls" DB messages between user and assistant rows. Both build_turns_from_db_messages (server.rs and handlers/chat.rs) parse the user → tool_calls → assistant triple. - Enrich ToolCallInfo with result_preview and error fields; render tool call history in loadHistory with collapsible summaries and CSS. - Fix UTF-8 truncation panic: replace &s[..500] with truncate_preview() that respects char boundaries (4 call sites across 3 modules). - Fix turn count query (PG + libSQL) to count only user messages. - Add PendingApprovalInfo to HistoryResponse so the frontend can re-render approval cards when switching back to an awaiting thread. - Show processing indicator (thinking dots) when loadHistory loads an in-progress turn, so the UI isn't blank mid-inference after a switch. - Auto-remove approval cards from DOM 1.5s after user acts on them. - Silence stale "No pending approval request" errors (return no-op instead of error when thread is no longer in AwaitingApproval state). - Remove status bar text for normal processing flow; inline activity cards now handle all visual feedback. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary of ChangesHello @henrypark133, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the user experience and robustness of the web interface by improving how tool calls and approvals are handled. It introduces persistent storage for tool call summaries, ensuring that conversational context is maintained across sessions and thread switches. Additionally, it refines the approval workflow, making it more intuitive and less error-prone, while also addressing several UI and backend stability issues, such as UTF-8 truncation and accurate turn counting. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request introduces several valuable features and fixes, including persisting tool calls, restoring pending approvals on thread switch, and fixing a UTF-8 truncation panic. The implementation of these features is generally solid, and the addition of related tests is appreciated.
My main feedback concerns code duplication. The truncate_preview utility function has been added to multiple files, and there's significant duplication of request handlers. While the truncate_preview function should be moved to a shared utility, the larger refactoring of request handlers should be deferred to a follow-up task, aligning with our guidelines for PR scope. Please see the specific comments for details.
| /// Truncate a string to at most `max_bytes` bytes at a char boundary, appending "...". | ||
| fn truncate_preview(s: &str, max_bytes: usize) -> String { | ||
| if s.len() <= max_bytes { | ||
| return s.to_string(); | ||
| } | ||
| // Walk backwards from max_bytes to find a valid char boundary | ||
| let mut end = max_bytes; | ||
| while !s.is_char_boundary(end) { | ||
| end -= 1; | ||
| } | ||
| format!("{}...", &s[..end]) | ||
| } |
There was a problem hiding this comment.
This truncate_preview function is also defined in src/channels/web/handlers/chat.rs and src/channels/web/server.rs. To improve maintainability and avoid code duplication, consider moving this function to a shared utility module so it can be reused across the codebase.
References
- Consolidate related sequences of operations, such as creating, persisting, and scheduling a job, into a single reusable method to improve code consistency and maintainability.
| /// Truncate a string to at most `max_bytes` bytes at a char boundary, appending "...". | ||
| fn truncate_preview(s: &str, max_bytes: usize) -> String { | ||
| if s.len() <= max_bytes { | ||
| return s.to_string(); | ||
| } | ||
| let mut end = max_bytes; | ||
| while !s.is_char_boundary(end) { | ||
| end -= 1; | ||
| } | ||
| format!("{}...", &s[..end]) | ||
| } |
There was a problem hiding this comment.
This file, src/channels/web/server.rs, appears to contain a significant amount of duplicated code from src/channels/web/handlers/chat.rs. For example, the truncate_preview function, chat_history_handler, and build_turns_from_db_messages are present in both files with identical changes in this PR. While addressing this duplication is important for maintainability, given that the primary goal of this pull request is to achieve correctness and feature parity, this large structural refactoring should be deferred and tracked as a follow-up task.
References
- Defer large structural refactorings, like breaking down a large file, when the primary goal of a pull request is to achieve correctness and feature parity with an existing implementation. The refactoring should be tracked as a follow-up task.
serrrfirat
left a comment
There was a problem hiding this comment.
Summary
This is a well-structured PR that adds three valuable features: persisting tool call summaries to DB, restoring pending approvals on thread switch, and several UI polish fixes. The code is generally correct, handles edge cases well (malformed JSON, legacy data, stale approvals), and includes good test coverage for the new DB message parsing logic.
The most significant concern is the massive code duplication between server.rs and handlers/chat.rs — the same handlers, parsing logic, and utility functions exist in both files and must be kept in manual sync. This PR correctly applies changes to both copies, but this is a fragile arrangement that will eventually lead to divergence bugs. The truncate_preview function alone is copy-pasted three times.
The correctness changes are sound: the UTF-8 truncation fix prevents a real panic, the turn count query change correctly adapts to the new tool_calls message role, and the approval silencing is a reasonable UX trade-off (though logging the suppression would improve debuggability). The build_turns_from_db_messages parser handles all three message patterns gracefully and degrades well on malformed data.
Overall: approve with the strong recommendation to deduplicate the server.rs/handlers/chat.rs code and add tests for truncate_preview.
| Ok(bound_addr) | ||
| } | ||
|
|
||
| /// Truncate a string to at most `max_bytes` bytes at a char boundary, appending "...". |
There was a problem hiding this comment.
truncate_preview is copy-pasted identically in three files
The truncate_preview function is defined with the exact same implementation in src/agent/thread_ops.rs, src/channels/web/handlers/chat.rs, and src/channels/web/server.rs. Any future fix (e.g., handling of the max_bytes=0 edge case, or adding grapheme cluster awareness) must be applied to all three copies. This is a divergence bug waiting to happen.
Suggested fix:
Extract `truncate_preview` into a shared utility module (e.g., `src/util.rs` or `src/channels/web/util.rs`) and import it from all three call sites.Severity: medium · Confidence: high
There was a problem hiding this comment.
Fixed in 06786c6 — extracted truncate_preview into src/channels/web/util.rs and removed all three copies. The shared version also adds an end > 0 guard for the zero-max-bytes edge case.
| @@ -730,12 +742,13 @@ async fn chat_history_handler( | |||
| turns, | |||
There was a problem hiding this comment.
chat_history_handler and build_turns_from_db_messages are fully duplicated between server.rs and handlers/chat.rs
The entire chat_history_handler, build_turns_from_db_messages, and chat_threads_handler functions exist in both src/channels/web/server.rs and src/channels/web/handlers/chat.rs with identical logic. This PR correctly applied the same changes (pending_approval, tool_calls parsing, turn_count fix) to both copies, but this is extremely fragile — a future change applied to only one copy would silently create an inconsistency. The two handlers appear to be reached via different routing paths, making the divergence especially dangerous since bugs would be path-dependent.
Suggested fix:
Consolidate the handlers into one module (handlers/chat.rs is the better home) and route from server.rs to the shared implementation. If both entry points exist for a legitimate reason (e.g., different middleware stacks), at minimum extract `build_turns_from_db_messages` into a shared function since it's pure logic with no dependencies on the handler context.Severity: high · Confidence: high
There was a problem hiding this comment.
Fixed in 06786c6 — extracted build_turns_from_db_messages into src/channels/web/util.rs and both server.rs and handlers/chat.rs now import from the shared module. chat_history_handler still has two copies since handlers/chat.rs isn't wired up yet (it's behind #[allow(dead_code)] as an in-progress migration) — will consolidate fully when that migration lands.
|
|
||
| if thread.state != ThreadState::AwaitingApproval { | ||
| return Ok(SubmissionResult::error("No pending approval request.")); | ||
| // Stale or duplicate approval (tool already executed) — silently ignore. |
There was a problem hiding this comment.
Silencing stale approval errors masks real bugs
Changing SubmissionResult::error("No pending approval request.") to SubmissionResult::ok_with_message("") silences the response when a user submits an approval for a thread not in AwaitingApproval state or when take_pending_approval() returns None. While this improves UX for the common stale/duplicate approval case, it also means that if there's a real bug that causes approval state to be lost (e.g., a race condition in session management), the error signal is completely suppressed. The user's approval action silently succeeds with an empty message, and the tool never executes.
Suggested fix:
Consider logging at `debug!` or `info!` level when a stale approval is silently ignored, so the suppression is at least observable in logs. E.g.: `tracing::debug!("Ignoring stale approval for thread {}: state is {:?}", thread_id, thread.state);`Severity: medium · Confidence: medium
There was a problem hiding this comment.
Fixed in 06786c6 — added tracing::debug! with thread_id and current state for both stale approval paths (not in AwaitingApproval state, and no pending approval found).
| use crate::error::Error; | ||
| use crate::llm::ChatMessage; | ||
|
|
||
| /// Truncate a string to at most `max_bytes` bytes at a char boundary, appending "...". |
There was a problem hiding this comment.
No unit test for truncate_preview despite it fixing a known panic
The truncate_preview function was introduced specifically to fix a UTF-8 truncation panic (&s[..500] on multi-byte characters). However, there are no tests verifying the fix works correctly with multi-byte strings. A regression here would reintroduce the panic.
Suggested fix:
Add tests covering: (1) a string shorter than max_bytes returns unchanged, (2) a pure-ASCII string is truncated at exactly max_bytes with '...' appended, (3) a multi-byte string (e.g., containing emoji or CJK characters) is truncated at a valid char boundary without panicking, (4) an empty string returns empty.Severity: medium · Confidence: high
There was a problem hiding this comment.
Fixed in 06786c6 — added 8 unit tests in src/channels/web/util.rs covering: short string (no-op), exact boundary, ASCII truncation, empty string, multi-byte char boundary (€), emoji (🦀), CJK characters (你好世界), and zero max_bytes.
| let store = match self.store() { | ||
| Some(s) => Arc::clone(s), | ||
| None => return, | ||
| }; |
There was a problem hiding this comment.
persist_tool_calls and persist_assistant_response lack transactional ordering guarantee
The persist_tool_calls and persist_assistant_response are called sequentially but each performs its own ensure_conversation + add_conversation_message pair without a wrapping transaction. If the process crashes between persisting tool_calls and assistant response, the DB will have a tool_calls row without a following assistant row. build_turns_from_db_messages handles this gracefully (the turn is marked 'Failed'), but the tool_calls data would be orphaned relative to its turn.
In the normal case, since these are sequential awaits in a single-user gateway, interleaving is unlikely. But it's worth noting the lack of atomicity.
Suggested fix:
Consider wrapping tool_calls + assistant message persistence in a single DB transaction, or document that partial persistence is acceptable and handled by the reader.Severity: low · Confidence: medium
There was a problem hiding this comment.
Acknowledged — the reader (build_turns_from_db_messages) already handles partial persistence gracefully by marking turns without an assistant response as "Failed". Adding a wrapping transaction would require plumbing a transaction API through the Database trait which feels out of scope here. Leaving as-is with the understanding that partial persistence is handled by the reader.
| if let Some(next) = iter.peek() | ||
| && next.role == "tool_calls" | ||
| { | ||
| let tc_msg = iter.next().expect("peeked"); |
There was a problem hiding this comment.
Malformed tool_calls JSON is silently consumed with no logging
In build_turns_from_db_messages, when the tool_calls message content fails to parse as Vec<serde_json::Value>, the tool_calls message is consumed (via iter.next()) but no log or diagnostic is emitted. The turn proceeds with empty tool_calls. While the graceful degradation is correct (and tested), silent data corruption can make debugging difficult.
Suggested fix:
Add a `tracing::warn!` when deserialization fails, including the conversation message ID, so corrupted data is discoverable in logs.Severity: low · Confidence: medium
There was a problem hiding this comment.
Fixed in 06786c6 — build_turns_from_db_messages (now in src/channels/web/util.rs) emits tracing::warn! with the message_id when deserialization fails.
| /// Lightweight DTO for a pending tool approval (excludes context_messages). | ||
| #[derive(Debug, Serialize)] | ||
| pub struct PendingApprovalInfo { | ||
| pub request_id: String, |
There was a problem hiding this comment.
PendingApprovalInfo is not persisted to DB — implicit assumption
The PendingApprovalInfo is populated from in-memory thread state only. The history response returns pending_approval: None for all DB-sourced paths. This means if the server restarts while a thread is awaiting approval, the approval is silently lost — the user would need to re-send their message. The HistoryResponse struct suggests persistence support with the field, but the behavior is purely in-memory. This assumption should be documented.
Suggested fix:
Add a doc comment on the `pending_approval` field: `/// Only populated from in-memory state; not persisted to DB. Server restart clears pending approvals.`Severity: low · Confidence: high
There was a problem hiding this comment.
Fixed in 06786c6 — added doc comment on the pending_approval field: "Only populated from in-memory state; not persisted to DB. Server restart clears pending approvals."
| has_error: tc.error.is_some(), | ||
| result_preview: tc.result.as_ref().map(|r| { | ||
| let s = match r { | ||
| serde_json::Value::String(s) => s.clone(), |
There was a problem hiding this comment.
Raw tool error strings exposed to frontend via persisted history
The error field from TurnToolCall is now serialized into the tool_calls DB content and returned to the frontend in ToolCallInfo.error. These error strings originate from e.to_string() on tool execution failures and could contain internal file paths, stack traces, or system-specific details. While these errors are already visible in real-time via SSE tool_completed events, persisting them in the DB and returning them in history responses increases their exposure surface — they're now available long after the original error occurred.
Suggested fix:
Consider truncating or sanitizing error strings before persisting (e.g., `truncate_preview(&error, 200)`) to limit exposure of potentially sensitive internal details.Severity: low · Confidence: low
There was a problem hiding this comment.
Fixed in 06786c6 — error strings are now truncated to 200 bytes via truncate_preview(&error, 200) before persisting to DB.
…dd diagnostics - Extract `truncate_preview` and `build_turns_from_db_messages` into shared `src/channels/web/util.rs`, removing three copies across thread_ops.rs, handlers/chat.rs, and server.rs - Add unit tests for `truncate_preview` (multi-byte, emoji, CJK, edge cases) and `build_turns_from_db_messages` (complete, incomplete, tool calls, malformed JSON, backward compat) - Add `tracing::debug!` when stale approvals are silently ignored - Add `tracing::warn!` when tool_calls JSON fails to parse from DB - Truncate persisted error strings to 200 bytes via `truncate_preview` - Document `pending_approval` field as in-memory only (not persisted to DB) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
role="tool_calls"DB messages between user and assistant rows, rendered as collapsible summaries when loading historypending_approvalfield inHistoryResponse&s[..500]→ char-boundary-safetruncate_preview())Test plan
cargo clippy --all --all-features— zero warningscargo test— all tests pass🤖 Generated with Claude Code