feat(cli): Interactive Progress Visualization & Task Stepping #21484
Description
What would you like to be added?
Gemini CLI's current UX treats multi-step agentic execution as a black box. Current chain of thought visuals come from the CLI Help Agent, inline thinking, and thought blocks. Tool calls appear in a flat, sequential list with no indication of hierarchy, nesting, or decision flow. Users also have no way to inspect what the model decided at each step, pause before a specific tool fires, or zoom in/out of output detail without permanently changing their approval mode.
This proposal (which I will implement) adds three interlocking features to the Ink TUI:
- Real-time hierarchical task tree: visualizes the parent/child `callId` graph of all tool calls, agent decisions, and subagent invocations as a live-updating tree.
- Step-through mode: a new `STEP` execution mode where the user presses Enter to advance through each tool call, with the ability to inspect inputs before approving.
- Collapsible tool output sections + configurable verbosity levels: per-call expand/collapse and a new `ui.verbosityLevel` setting controlling how much output is shown by category.
Current Behavior
- Tool calls render in `ToolGroupMessage → ToolMessage → ToolResultDisplay` as a flat list. The `parentCallId` field exists on `ToolCallRequestInfo` but is never visualized.
- `DEFAULT` approval mode only asks before using dangerous tools (edit/exec/mcp). There is no mode that pauses before every tool, regardless of danger level.
- No collapse/expand on tool outputs: long results are either shown in full or truncated at `truncateToolOutputThreshold` (40k chars).
- The only verbosity control is a global `ui.errorVerbosity: 'low' | 'full'` toggle. There are no per-category or per-tool verbosity levels.
- The `WriteTodosTool` / `TodoTray` system tracks task status but renders it in a sidebar, not integrated with the execution trace.
Expected Behavior
1. Task Tree Visualization
When an agent turn is active, the MainContent area shows a live-updating tree instead of the current flat list:
▶ Turn 1 — "refactor the auth module"
├─ ✓ read_file src/auth.ts [0.3s]
├─ ✓ glob src/**/*.test.ts [0.1s]
├─ ● edit_file src/auth.ts [running…]
│ └─ ◷ shell tsc --noEmit [pending]
└─ ◌ subagent write-tests [queued]
├─ ◌ write_file src/auth.test.ts
└─ ◌ shell npm test
- Status icons: `●` running / `✓` success / `✗` error / `◷` pending / `◌` queued / `⊘` cancelled
- Duration shown for completed calls
- Subagent trees are nested under their parent `callId`
- Pressing `→` / `←` on a node expands/collapses its output inline
- The tree is scrollable in alternate-buffer mode via the existing `ScrollableList`
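The tree construction described above can be sketched as follows. This is a minimal illustration, not the proposed implementation: `FlatCall`, `TaskTreeNode`, `buildTaskTree`, and `renderTree` are hypothetical names, and the real `TrackedToolCall` type carries more fields than shown. It assumes unique `callId` values and no parent cycles.

```typescript
// Hypothetical, simplified shapes for illustration only.
type NodeStatus = 'running' | 'success' | 'error' | 'pending' | 'queued' | 'cancelled';

interface FlatCall {
  callId: string;
  parentCallId?: string;
  name: string;
  status: NodeStatus;
}

interface TaskTreeNode extends FlatCall {
  children: TaskTreeNode[];
}

// Status icons taken from the list above.
const STATUS_ICON: Record<NodeStatus, string> = {
  running: '●',
  success: '✓',
  error: '✗',
  pending: '◷',
  queued: '◌',
  cancelled: '⊘',
};

// Build a forest from the flat list. A call whose parent is unknown becomes a root.
function buildTaskTree(calls: FlatCall[]): TaskTreeNode[] {
  const byId = new Map<string, TaskTreeNode>();
  for (const call of calls) {
    byId.set(call.callId, { ...call, children: [] });
  }
  const roots: TaskTreeNode[] = [];
  for (const call of calls) {
    const node = byId.get(call.callId)!;
    const parent = call.parentCallId ? byId.get(call.parentCallId) : undefined;
    if (parent && parent !== node) {
      parent.children.push(node);
    } else {
      roots.push(node);
    }
  }
  return roots;
}

// Render the forest as text lines with box-drawing connectors.
function renderTree(nodes: TaskTreeNode[], prefix = ''): string[] {
  const lines: string[] = [];
  nodes.forEach((node, i) => {
    const last = i === nodes.length - 1;
    lines.push(`${prefix}${last ? '└─' : '├─'} ${STATUS_ICON[node.status]} ${node.name}`);
    lines.push(...renderTree(node.children, prefix + (last ? '   ' : '│  ')));
  });
  return lines;
}
```

Because the builder is a pure function of the flat list, it could be re-run on every update without keeping tree state of its own; collapse state would live elsewhere, keyed by `callId`.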
2. Step-Through Mode
Adds a new ApprovalMode.STEP mode (or, alternatively, a separate stepThroughEnabled boolean; see Alternative Approaches below). When active:
- Before every tool call is dispatched (regardless of `Kind`), the scheduler pauses at a new `StepPending` status
- A new `ToolStepDialog` renders at the bottom of the screen (similar to `ToolConfirmationQueue`)
- The dialog shows: tool name, inputs rendered as syntax-highlighted JSON, and a summary of what the tool will do
- Keyboard actions:
  - `Enter` / `n`: execute this tool, advance to next step
  - `s`: skip this tool (return empty result, don't execute)
  - `c`: continue (exit step-through, resume normal execution)
  - `q` / `Esc`: cancel the entire agent turn
- Added to the `CYCLE_APPROVAL_MODE` keybinding cycle: `DEFAULT → AUTO_EDIT → STEP → PLAN`
- A step counter `Step X of ~Y` is shown (Y is estimated from the pending queue)
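A rough sketch of the keyboard mapping, mode cycle, and step counter listed above. All names here (`StepAction`, `resolveStepAction`, `nextApprovalMode`, `formatStepCounter`) are hypothetical, the key names are assumed to be already normalized from raw input events, and wrapping from `PLAN` back to `DEFAULT` is an assumption about how the cycle closes.

```typescript
type StepAction = 'execute' | 'skip' | 'continue' | 'cancel';

// Normalized key name -> action, following the keyboard actions listed above.
const STEP_KEYMAP = new Map<string, StepAction>([
  ['enter', 'execute'],
  ['n', 'execute'],
  ['s', 'skip'],
  ['c', 'continue'],
  ['q', 'cancel'],
  ['escape', 'cancel'],
]);

// Returns undefined for keys the dialog does not handle.
function resolveStepAction(key: string): StepAction | undefined {
  return STEP_KEYMAP.get(key.toLowerCase());
}

type CycleMode = 'DEFAULT' | 'AUTO_EDIT' | 'STEP' | 'PLAN';

// Successor table for the proposed cycle; PLAN -> DEFAULT is assumed.
const NEXT_MODE: Record<CycleMode, CycleMode> = {
  DEFAULT: 'AUTO_EDIT',
  AUTO_EDIT: 'STEP',
  STEP: 'PLAN',
  PLAN: 'DEFAULT',
};

function nextApprovalMode(current: CycleMode): CycleMode {
  return NEXT_MODE[current];
}

// Y is only an estimate: steps already completed plus whatever is queued right now
// (the queued count is assumed to include the step currently awaiting approval).
function formatStepCounter(completed: number, queuedIncludingCurrent: number): string {
  return `Step ${completed + 1} of ~${completed + queuedIncludingCurrent}`;
}
```

For example, with two tools already executed and three still queued, `formatStepCounter(2, 3)` yields `Step 3 of ~5`; the estimate changes as the model schedules more calls.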
3. Collapsible Output + Verbosity Levels
Per-call collapsible sections:
- `ToolResultDisplay` wraps output in a collapsible component
- Outputs longer than a configurable threshold (default 20 lines) auto-collapse
- `→` / `←` arrow keys toggle collapse state on a focused node in the tree
- "Expand all" / "Collapse all" global shortcut
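One way the collapse rules above could be expressed, as a sketch only. The function names and the `Map<callId, collapsed>` state shape are assumptions for illustration; the 20-line default comes from the list above.

```typescript
const DEFAULT_COLLAPSE_THRESHOLD_LINES = 20;

// Count lines, ignoring a single trailing newline.
function countLines(output: string): number {
  if (output === '') return 0;
  return output.replace(/\n$/, '').split('\n').length;
}

// Outputs longer than the threshold start collapsed.
function shouldAutoCollapse(
  output: string,
  thresholdLines: number = DEFAULT_COLLAPSE_THRESHOLD_LINES,
): boolean {
  return countLines(output) > thresholdLines;
}

// Collapse state keyed by callId; both helpers return a new Map instead of mutating.
function toggleCollapsed(state: ReadonlyMap<string, boolean>, callId: string): Map<string, boolean> {
  const next = new Map(state);
  next.set(callId, !(state.get(callId) ?? false));
  return next;
}

// "Expand all" / "Collapse all": apply one value to every known node.
function setAllCollapsed(state: ReadonlyMap<string, boolean>, collapsed: boolean): Map<string, boolean> {
  const next = new Map<string, boolean>();
  state.forEach((_value, callId) => next.set(callId, collapsed));
  return next;
}
```

Returning fresh maps keeps the helpers compatible with React-style state updates, where identity changes trigger a re-render.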
New ui.verbosityLevel setting with per-category overrides:
| Level | Tool inputs | Tool outputs | Model thoughts | Timing | Status transitions |
|---|---|---|---|---|---|
| `quiet` | never | never | never | never | final only |
| `standard` | never | truncated | never | on hover | success/error |
| `verbose` | always | full | on demand | always | all |
| `debug` | always | full + raw | always | always | all + state |
Setting path: ui.verbosityLevel (global) + ui.verbosityOverrides (per-tool-kind map).
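To make the lookup order concrete, here is a sketch of resolving the effective policy for a tool call. The type and function names are hypothetical, and two details are assumptions not stated in this proposal: that each override maps a tool kind to a level name (rather than to individual columns), and that `standard` is the fallback when nothing is configured.

```typescript
type VerbosityLevel = 'quiet' | 'standard' | 'verbose' | 'debug';

interface VerbosityPolicy {
  toolInputs: 'never' | 'always';
  toolOutputs: 'never' | 'truncated' | 'full' | 'full + raw';
  modelThoughts: 'never' | 'on demand' | 'always';
  timing: 'never' | 'on hover' | 'always';
  statusTransitions: 'final only' | 'success/error' | 'all' | 'all + state';
}

// Direct transcription of the verbosity table.
const POLICIES: Record<VerbosityLevel, VerbosityPolicy> = {
  quiet: { toolInputs: 'never', toolOutputs: 'never', modelThoughts: 'never', timing: 'never', statusTransitions: 'final only' },
  standard: { toolInputs: 'never', toolOutputs: 'truncated', modelThoughts: 'never', timing: 'on hover', statusTransitions: 'success/error' },
  verbose: { toolInputs: 'always', toolOutputs: 'full', modelThoughts: 'on demand', timing: 'always', statusTransitions: 'all' },
  debug: { toolInputs: 'always', toolOutputs: 'full + raw', modelThoughts: 'always', timing: 'always', statusTransitions: 'all + state' },
};

// Hypothetical slice of the settings object.
interface UiVerbositySettings {
  verbosityLevel?: VerbosityLevel;
  // Keyed by tool kind (e.g. 'edit', 'exec'); assumed to hold level names.
  verbosityOverrides?: Partial<Record<string, VerbosityLevel>>;
}

// Per-kind override wins, then the global level, then the assumed default.
function resolveVerbosity(ui: UiVerbositySettings, toolKind: string): VerbosityPolicy {
  const level = ui.verbosityOverrides?.[toolKind] ?? ui.verbosityLevel ?? 'standard';
  return POLICIES[level];
}
```

With `{ verbosityLevel: 'quiet', verbosityOverrides: { exec: 'debug' } }`, shell-style calls would show full and raw output while every other kind stays silent.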
4. Error State Visualization for Nested Failures
Currently, a failed subagent shows a single error message. With this feature:
- `ErroredToolCall` expands to show the full nested call chain that led to the failure
- Retried calls show the retry count and reason inline on the tree node
- A "copy error chain" shortcut copies the full nested stack to clipboard
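The nested call chain could be derived by walking `parentCallId` links upward from the failing call. The sketch below is illustrative only: `CallRecord`, `errorChain`, and `formatErrorChain` are hypothetical, and the `error` / `retryCount` fields are assumed shapes rather than the existing types.

```typescript
// Hypothetical record shape for illustration.
interface CallRecord {
  callId: string;
  parentCallId?: string;
  name: string;
  error?: string;
  retryCount?: number;
}

// Walk from the failed call up to its root; the result is ordered root-first.
function errorChain(calls: CallRecord[], failedCallId: string): CallRecord[] {
  const byId = new Map<string, CallRecord>();
  for (const call of calls) byId.set(call.callId, call);

  const chain: CallRecord[] = [];
  const seen = new Set<string>(); // guards against accidental parent cycles
  let current = byId.get(failedCallId);
  while (current && !seen.has(current.callId)) {
    seen.add(current.callId);
    chain.unshift(current);
    current = current.parentCallId ? byId.get(current.parentCallId) : undefined;
  }
  return chain;
}

// Plain-text form suitable for a "copy error chain" shortcut.
function formatErrorChain(chain: CallRecord[]): string {
  return chain
    .map((call, depth) => {
      const retry = call.retryCount ? ` (retried ${call.retryCount}x)` : '';
      const err = call.error ? `: ${call.error}` : '';
      return `${'  '.repeat(depth)}${call.name}${retry}${err}`;
    })
    .join('\n');
}
```

The same chain can drive both the expanded tree view and the clipboard text, so the two stay consistent.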
High-Level Architecture Overview
flowchart TD
Scheduler --> MessageBus
MessageBus -->|"TOOL_CALLS_UPDATE"| useToolScheduler
useToolScheduler --> useTaskTree
useTaskTree -->|"TaskTreeNode[]"| TaskTreeDisplay
TaskTreeDisplay --> TaskNode
TaskNode --> ToolResultDisplay
Scheduler -->|"STEP_THROUGH_REQUEST"| MessageBus2[MessageBus]
MessageBus2 -->|"STEP_THROUGH_REQUEST"| useStepThrough
useStepThrough --> ToolStepDialog
ToolStepDialog -->|"STEP_THROUGH_RESPONSE"| MessageBus2
MessageBus2 --> Scheduler
High Level Implementation Phases
Phase 1: Core scaffolding (no UI yet). Add ApprovalMode.STEP to the policy system, add STEP_THROUGH_REQUEST / STEP_THROUGH_RESPONSE to the MessageBus, and insert the pause point in the Scheduler. At the end of this phase, the step-through plumbing is in place but has no UI, so the scheduler blocks indefinitely waiting for a response that nothing sends yet.
Phase 2: Task tree. Build useTaskTree (a tree builder that turns the flat TrackedToolCall[] into a hierarchy using parentCallId), then the TaskTreeDisplay + TaskNode components. Replace the flat pending items list in MainContent with the tree, add collapse state to TaskTreeNode, and wire the expand/collapse keybindings. At the end of this phase, users can see the execution hierarchy in real time.
Phase 3: Step-through UI. Build the useStepThrough hook + ToolStepDialog component and wire them into AppContainer / MainContent. This mirrors the existing ToolConfirmationQueue pattern almost exactly. At the end of this phase, step-through mode is fully usable end-to-end on top of the task tree visualization.
Phase 4: Verbosity. Wrap ToolResultDisplay in a verbosity-aware collapsible, and add ui.verbosityLevel / ui.verbosityOverrides to settings. This is largely additive UI work on top of Phase 3.
Phase 5: Error visualization + polish. Improve the error rendering for nested failures, update the system prompt snippet so the model knows about step-through, and add tests for the new hooks and components.
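The Phase 1 pause point amounts to a request/response round trip over the bus. The sketch below shows the idea with a tiny stand-in bus; `TinyBus`, `awaitStep`, `StepOutcome`, and the `correlationId` field are all hypothetical, since the real MessageBus API is not shown in this issue.

```typescript
type StepOutcome = 'execute' | 'skip' | 'continue' | 'cancel';

interface StepThroughRequest {
  type: 'STEP_THROUGH_REQUEST';
  correlationId: string;
  toolName: string;
  args: unknown;
}

interface StepThroughResponse {
  type: 'STEP_THROUGH_RESPONSE';
  correlationId: string;
  outcome: StepOutcome;
}

type BusMessage = StepThroughRequest | StepThroughResponse;
type Listener = (message: BusMessage) => void;

// Minimal stand-in for the real MessageBus (assumed API, illustration only).
class TinyBus {
  private listeners = new Set<Listener>();

  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  publish(message: BusMessage): void {
    // Iterate over a copy so listeners may unsubscribe while being notified.
    for (const listener of Array.from(this.listeners)) listener(message);
  }
}

// Scheduler side: publish a request, then wait for the response with the same id.
// With no responder attached, the promise never settles, which matches Phase 1.
function awaitStep(
  bus: TinyBus,
  toolName: string,
  args: unknown,
  correlationId: string,
): Promise<StepOutcome> {
  return new Promise((resolve) => {
    const unsubscribe = bus.subscribe((message) => {
      if (message.type === 'STEP_THROUGH_RESPONSE' && message.correlationId === correlationId) {
        unsubscribe();
        resolve(message.outcome);
      }
    });
    bus.publish({ type: 'STEP_THROUGH_REQUEST', correlationId, toolName, args });
  });
}
```

In Phase 3 the dialog would play the responder role: it listens for `STEP_THROUGH_REQUEST`, renders the tool inputs, and publishes a `STEP_THROUGH_RESPONSE` carrying the user's choice.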
Alternative Approaches (Open To Suggestions):
Task tree as a slash command (/tree) rather than always-on: Lower rendering complexity, but loses the "live update" value. A hybrid — off by default, auto-enabled when STEP mode or VERBOSE verbosity is active — is a reasonable middle ground.
Step-through as a separate boolean flag rather than ApprovalMode.STEP: Avoids coupling step behavior to the approval mode cycle and is easier to toggle independently. Downside: the existing approval mode system (policy engine, UI indicator, keybinding cycle, system prompt awareness) already handles exactly this kind of mode; extending it is less code.
Collapsible output via Ink's <Box> height clamping vs. true component toggling: Height-clamping avoids re-renders but doesn't work well with Static (committed history). True component toggling (conditional render of the full output) is the correct approach for committed history items.
Verbosity as a runtime /verbosity slash command only (no settings persistence): Simpler but transient — resets on every session. Better as a setting with a /verbosity command as an alias.
Why is this needed?
- Transparency: Agentic tasks touching dozens of files across multiple subagents are completely opaque today in the Gemini CLI. Users cannot tell what the model decided or why.
- Trust: Step-through lets cautious users gain confidence before switching to YOLO/AUTO_EDIT on an unfamiliar codebase.
- Debugging: When a 20-tool chain fails, the current UI shows a flat error. The tree immediately shows the user which branch failed and the full input/output context at that node.
- Verbosity: Power users want to see everything; new users want silence. One `ui.errorVerbosity` toggle is not enough.
Additional context
- `parentCallId` is already populated on `ToolCallRequestInfo` and flows through to `IndividualToolCallDisplay`; the hierarchy data exists, it is simply never rendered as a tree.
- `SubagentProgressDisplay` (`packages/cli/src/ui/components/messages/`) is an early precursor to `TaskNode` and can be refactored into the new component.
- The existing `ToolConfirmationQueue` modal pattern (pause → render dialog → dispatch outcome via `MessageBus`) is the exact pattern to replicate for `ToolStepDialog`.
- Ink's `<Static>` component does not support re-renders of committed items; collapse/expand state must live in the pending/live region until items are committed to history.
- The `McpProgressIndicator` in `ToolMessage` shows `progressMessage` and a `progress` / `progressTotal` bar; this can be extended to show step position in step-through mode.