Merged
112 changes: 112 additions & 0 deletions docs/design/fork-subagent/fork-subagent-design.md
@@ -0,0 +1,112 @@
# Fork Subagent Design

> Implicit fork subagent that inherits the parent's full conversation context and shares prompt cache for cost-efficient parallel task execution.

## Overview

When the Agent tool is called without `subagent_type`, it triggers an implicit **fork** — a background subagent that inherits the parent's conversation history, system prompt, and tool definitions. The fork uses `CacheSafeParams` to ensure its API requests share the same prefix as the parent's, enabling DashScope prompt cache hits.

## Architecture

```
Parent conversation: [SystemPrompt | Tools | Msg1 | Msg2 | ... | MsgN (model)]
↑ identical prefix for all forks ↑

Fork A: [...MsgN | placeholder results | "Research A"] ← shared cache
Fork B: [...MsgN | placeholder results | "Modify B"] ← shared cache
Fork C: [...MsgN | placeholder results | "Test C"] ← shared cache
```
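The property the diagram relies on can be sketched as a small check: every fork request starts with the same messages as the parent, and only the trailing directive differs. This is a minimal sketch under stated assumptions; `Message` and `sharedPrefixLength` are hypothetical names, and the placeholder tool results from the diagram are left out for brevity.

```typescript
// Hypothetical, simplified message shape; the real code uses Content from @google/genai.
type Message = { role: 'user' | 'model'; text: string };

// Count how many leading messages are identical across every request.
function sharedPrefixLength(requests: Message[][]): number {
  const [first, ...rest] = requests;
  if (!first) return 0;
  let length = 0;
  while (
    length < first.length &&
    rest.every(
      (r) =>
        length < r.length &&
        JSON.stringify(r[length]) === JSON.stringify(first[length]),
    )
  ) {
    length++;
  }
  return length;
}

const parent: Message[] = [
  { role: 'user', text: 'Refactor the parser' },
  { role: 'model', text: 'I will split this into three tasks.' },
];

// Each fork appends only its own directive to the parent's history.
const forks = ['Research A', 'Modify B', 'Test C'].map(
  (directive): Message[] => [...parent, { role: 'user', text: directive }],
);

const prefixLength = sharedPrefixLength(forks); // 2: the whole parent history
```

Anything that changes a message inside the shared prefix would shorten it and reduce what a prompt cache can reuse.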

## Key Components

### 1. FORK_AGENT (`forkSubagent.ts`)

Synthetic agent config, not registered in `builtInAgents`. Has a fallback `systemPrompt` but in practice uses the parent's rendered system prompt via `generationConfigOverride`.

### 2. CacheSafeParams Integration (`agent.ts` + `forkedQuery.ts`)

```
agent.ts (fork path)
  │
  ├── getCacheSafeParams()        ← parent's generationConfig snapshot
  │     ├── generationConfig      ← systemInstruction + tools + temp/topP
  │     └── history               ← (not used; extraHistory is built instead)
  │
  ├── forkGenerationConfig        ← passed as generationConfigOverride
  └── forkToolsOverride           ← FunctionDeclaration[] extracted from tools
        │
        ▼
AgentHeadless.execute(context, signal, {
  extraHistory,                   ← parent conversation history
  generationConfigOverride,       ← parent's exact systemInstruction + tools
  toolsOverride,                  ← parent's exact tool declarations
  skipEnvHistory,                 ← extraHistory already has env context
})
        │
        ▼
AgentCore.createChat(context, {
  extraHistory,
  generationConfigOverride,       ← bypasses buildChatSystemPrompt()
  skipEnvHistory,                 ← skips getInitialChatHistory()
})
        │
        ▼
new GeminiChat(config, generationConfig, startHistory)
                       ↑ byte-identical to parent's config
```
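The override step in `createChat` can be sketched as a single selection function. This is a simplified illustration, not the actual implementation: `GenConfig` and `resolveGenerationConfig` are hypothetical names, and the real config type is `GenerateContentConfig`.

```typescript
// Hypothetical, trimmed-down config type; the real one is GenerateContentConfig.
interface GenConfig {
  systemInstruction?: string;
  temperature?: number;
  topP?: number;
}

function resolveGenerationConfig(
  override: GenConfig | undefined,
  buildSystemPrompt: () => string | undefined,
  model: { temp: number; top_p: number },
): GenConfig {
  // Fork path: hand back the parent's object untouched so the request
  // prefix matches the parent's. The prompt builder is never called.
  if (override) return override;

  // Named-subagent path: build a fresh config from the agent's own settings.
  const config: GenConfig = { temperature: model.temp, topP: model.top_p };
  const systemInstruction = buildSystemPrompt();
  if (systemInstruction) config.systemInstruction = systemInstruction;
  return config;
}
```

Returning the override as-is, rather than copying and re-deriving fields, is what keeps the fork's system instruction and sampling parameters identical to the parent's.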

### 3. History Construction (`agent.ts` + `forkSubagent.ts`)

The fork's `extraHistory` must end with a `model` message: `agent-headless` sends `task_prompt` as the next `user` message, and the Gemini API expects `user` and `model` turns to alternate.

Three cases:

| Parent history ends with | extraHistory construction | task_prompt |
| ----------------------------- | ---------------------------------------------------------------------- | ------------------------------ |
| `model` (no function calls) | `[...rawHistory]` (unchanged) | `buildChildMessage(directive)` |
| `model` (with function calls) | `[...rawHistory, model(clone), user(responses+directive), model(ack)]` | `'Begin.'` |
| `user` (unusual) | `rawHistory.slice(0, -1)` (drop trailing user) | `buildChildMessage(directive)` |
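The three cases can be sketched as one function whose result always ends with a `model` message. This is a simplified sketch, not the code in `agent.ts`: the `Part` and `Content` types are hypothetical stand-ins for the `@google/genai` types, the placeholder and acknowledgement strings are invented, the directive is used directly instead of `buildChildMessage(directive)`, and the function-call case appends only the placeholder responses and an acknowledgement, which simplifies the second table row.

```typescript
// Hypothetical, simplified stand-ins for Content/Part from @google/genai.
type FunctionCallPart = { functionCall: { name: string } };
type Part =
  | { text: string }
  | FunctionCallPart
  | { functionResponse: { name: string; response: { output: string } } };
type Content = { role: 'user' | 'model'; parts: Part[] };

const PLACEHOLDER_OUTPUT = 'Fork started.'; // assumed wording

function buildForkInput(
  rawHistory: Content[],
  directive: string,
): { extraHistory: Content[]; taskPrompt: string } {
  const last = rawHistory[rawHistory.length - 1];
  if (!last) return { extraHistory: [], taskPrompt: directive };

  // Case 3: drop a trailing user message so the history ends with model.
  if (last.role === 'user') {
    return { extraHistory: rawHistory.slice(0, -1), taskPrompt: directive };
  }

  const calls = last.parts.filter(
    (p): p is FunctionCallPart => 'functionCall' in p,
  );

  // Case 1: plain model message, history can be reused unchanged.
  if (calls.length === 0) {
    return { extraHistory: [...rawHistory], taskPrompt: directive };
  }

  // Case 2: answer every pending function call with a placeholder, attach
  // the directive, then close with a model acknowledgement.
  const responses: Part[] = calls.map((c) => ({
    functionResponse: {
      name: c.functionCall.name,
      response: { output: PLACEHOLDER_OUTPUT },
    },
  }));
  return {
    extraHistory: [
      ...rawHistory,
      { role: 'user', parts: [...responses, { text: directive }] },
      { role: 'model', parts: [{ text: 'Understood.' }] },
    ],
    taskPrompt: 'Begin.',
  };
}
```

In the function-call case the directive travels inside the history, so the task prompt only needs to start the fork.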

### 4. Recursive Fork Prevention (`forkSubagent.ts`)

`isInForkChild()` scans conversation history for the `<fork-boilerplate>` tag. If found, the fork attempt is rejected with an error message.
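A check of this kind can be sketched in a few lines. The sketch below is an assumption about the general shape only: `HistoryEntry` and `looksLikeForkChild` are hypothetical names, and it scans every text part in every message because the source does not say which message carries the tag.

```typescript
// Hypothetical, simplified history shape.
type HistoryEntry = {
  role: 'user' | 'model';
  parts: Array<{ text?: string }>;
};

const FORK_TAG = '<fork-boilerplate>';

// True when any text part anywhere in the history contains the fork tag.
function looksLikeForkChild(history: HistoryEntry[]): boolean {
  return history.some((entry) =>
    entry.parts.some((part) => part.text?.includes(FORK_TAG) ?? false),
  );
}
```

Because the marker lives in the conversation itself, the check needs no extra state: any agent whose history contains the tag is treated as a fork child.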

### 5. Background Execution (`agent.ts`)

Fork uses `void executeSubagent()` (fire-and-forget) and returns `FORK_PLACEHOLDER_RESULT` immediately to the parent. Errors in the background task are caught, logged, and reflected in the display state.
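The fire-and-forget pattern can be sketched as follows. This is an illustration under stated assumptions: `launchFork`, `DisplayState`, and the placeholder text are hypothetical, and the real code also logs the error.

```typescript
// Hypothetical display state and placeholder text.
type DisplayState = {
  status: 'running' | 'completed' | 'failed';
  error?: string;
};

const PLACEHOLDER_RESULT = 'Fork started in background.';

function launchFork(run: () => Promise<void>, display: DisplayState): string {
  const execute = async (): Promise<void> => {
    try {
      await run();
      display.status = 'completed';
    } catch (err) {
      // Errors never reach the caller; they only update the display state.
      display.status = 'failed';
      display.error = err instanceof Error ? err.message : String(err);
    }
  };

  // `void` marks the promise as intentionally not awaited.
  void execute();

  // The parent gets its answer before the fork has finished.
  return PLACEHOLDER_RESULT;
}
```

Catching inside the background function matters: without it, a failing fork would surface as an unhandled promise rejection instead of a failed status in the UI.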

## Data Flow

```
1. Model calls Agent tool (no subagent_type)
2. agent.ts: import forkSubagent.js
3. agent.ts: getCacheSafeParams() → forkGenerationConfig + forkToolsOverride
4. agent.ts: build extraHistory from parent's getHistory(true)
5. agent.ts: build forkTaskPrompt (directive or 'Begin.')
6. agent.ts: createAgentHeadless(FORK_AGENT, ...)
7. agent.ts: void executeSubagent() — background
8. agent.ts: return FORK_PLACEHOLDER_RESULT to parent immediately
9. Background:
   a. AgentHeadless.execute(context, signal, {extraHistory, generationConfigOverride, toolsOverride, skipEnvHistory})
   b. AgentCore.createChat(): uses parent's generationConfig (cache-shared)
   c. runReasoningLoop(): uses parent's tool declarations
   d. Fork executes tools, produces result
   e. updateDisplay() with final status
```

## Graceful Degradation

If `getCacheSafeParams()` returns null (first turn, no history yet), the fork falls back to:

- `FORK_AGENT.systemPrompt` for system instruction
- `prepareTools()` for tool declarations

This ensures the fork always works, even without cache sharing.
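The fallback can be sketched as a null check around the cached parameters. Every name here is hypothetical (`CacheSafeParamsLike`, `resolveForkSetup`, `FunctionDecl`), and falling back to the default prompt when the snapshot has no system instruction is an assumption of this sketch.

```typescript
// Hypothetical, trimmed-down shapes; the real types live in @google/genai
// and forkedQuery.ts.
type FunctionDecl = { name: string };

interface CacheSafeParamsLike {
  generationConfig: {
    systemInstruction?: string;
    tools?: Array<{ functionDeclarations?: FunctionDecl[] }>;
  };
}

interface ForkSetup {
  systemInstruction: string;
  tools: FunctionDecl[];
  cacheShared: boolean;
}

function resolveForkSetup(
  params: CacheSafeParamsLike | null,
  fallbackPrompt: string,
  prepareTools: () => FunctionDecl[],
): ForkSetup {
  // No snapshot yet (e.g. first turn): use the fork agent's own defaults.
  if (params === null) {
    return {
      systemInstruction: fallbackPrompt,
      tools: prepareTools(),
      cacheShared: false,
    };
  }

  // Snapshot available: reuse the parent's instruction and tool declarations.
  const tools: FunctionDecl[] = [];
  for (const tool of params.generationConfig.tools ?? []) {
    tools.push(...(tool.functionDeclarations ?? []));
  }
  return {
    systemInstruction: params.generationConfig.systemInstruction ?? fallbackPrompt,
    tools,
    cacheShared: true,
  };
}
```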

## Files

| File | Role |
| ---------------------------------------------------- | ------------------------------------------------------------------------------------- |
| `packages/core/src/agents/runtime/forkSubagent.ts` | FORK_AGENT config, buildForkedMessages(), isInForkChild(), buildChildMessage() |
| `packages/core/src/tools/agent.ts` | Fork path: CacheSafeParams retrieval, extraHistory construction, background execution |
| `packages/core/src/agents/runtime/agent-headless.ts` | execute() options: extraHistory, generationConfigOverride, toolsOverride, skipEnvHistory |
| `packages/core/src/agents/runtime/agent-core.ts`     | CreateChatOptions.generationConfigOverride, CreateChatOptions.skipEnvHistory, EXCLUDED_TOOLS_FOR_SUBAGENTS |
| `packages/core/src/followup/forkedQuery.ts` | CacheSafeParams infrastructure (existing, no changes) |
38 changes: 37 additions & 1 deletion docs/users/features/sub-agents.md
@@ -12,18 +12,54 @@ Subagents are independent AI assistants that:
- **Work autonomously** - Once given a task, they work independently until completion or failure
- **Provide detailed feedback** - You can see their progress, tool usage, and execution statistics in real-time

## Fork Subagent (Implicit Fork)

In addition to named subagents, Qwen Code supports **implicit forking** — when the AI omits the `subagent_type` parameter, it triggers a fork that inherits the parent's full conversation context.

### How Fork Differs from Named Subagents

| | Named Subagent | Fork Subagent |
| ------------- | --------------------------------- | ----------------------------------------------------- |
| Context | Starts fresh, no parent history | Inherits parent's full conversation history |
| System prompt | Uses its own configured prompt | Uses parent's exact system prompt (for cache sharing) |
| Execution | Blocks the parent until done | Runs in background, parent continues immediately |
| Use case | Specialized tasks (testing, docs) | Parallel tasks that need the current context |

### When Fork is Used

The AI automatically uses fork when it needs to:

- Run multiple research tasks in parallel (e.g., "investigate module A, B, and C")
- Perform background work while continuing the main conversation
- Delegate tasks that require understanding of the current conversation context

### Prompt Cache Sharing

All forks share the parent's exact API request prefix (system prompt, tools, conversation history), enabling DashScope prompt cache hits. When 3 forks run in parallel, the shared prefix is cached once and reused, which can save 80%+ of token costs compared to independent subagents. The actual saving depends on how much of each request the shared prefix accounts for.
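How the saving scales can be sketched with a small calculation. Everything here is a hypothetical model, not DashScope pricing: `cachedRate` is an assumed fraction of the normal price charged for cached prefix tokens, the first fork is assumed to pay full price for the prefix, and `estimateForkCost` is an invented helper.

```typescript
// Hypothetical cost model: token counts are billed "token-equivalents".
function estimateForkCost(
  prefixTokens: number,
  suffixTokens: number,
  forkCount: number,
  cachedRate: number, // assumed: 0 = cached tokens free, 1 = no discount
): { withoutCache: number; withCache: number; savings: number } {
  // Without caching, every fork pays for the full prefix and its own suffix.
  const withoutCache = forkCount * (prefixTokens + suffixTokens);

  // With caching, later forks pay the discounted rate on the shared prefix.
  const withCache =
    prefixTokens +
    suffixTokens +
    (forkCount - 1) * (prefixTokens * cachedRate + suffixTokens);

  return { withoutCache, withCache, savings: 1 - withCache / withoutCache };
}

// Example with made-up numbers: a 20,000-token prefix, 200-token directives,
// 3 forks, and cached tokens billed at a quarter of the normal rate.
const estimate = estimateForkCost(20_000, 200, 3, 0.25);
```

The saving grows with the number of forks and with the share of each request taken up by the prefix, and shrinks as `cachedRate` approaches 1.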

### Recursive Fork Prevention

Fork children cannot create further forks. This is enforced at runtime — if a fork attempts to spawn another fork, it receives an error instructing it to execute tasks directly.

### Current Limitations

- **No result feedback**: Fork results are reflected in the UI progress display but are not automatically fed back into the main conversation. The parent AI sees a placeholder message and cannot act on the fork's output.
- **No worktree isolation**: Forks share the parent's working directory. Concurrent file modifications from multiple forks may conflict.

## Key Benefits

- **Task Specialization**: Create agents optimized for specific workflows (testing, documentation, refactoring, etc.)
- **Context Isolation**: Keep specialized work separate from your main conversation
- **Context Inheritance**: Fork subagents inherit the full conversation for context-heavy parallel tasks
- **Prompt Cache Sharing**: Fork subagents share the parent's cache prefix, reducing token costs
- **Reusability**: Save and reuse agent configurations across projects and sessions
- **Controlled Access**: Limit which tools each agent can use for security and focus
- **Progress Visibility**: Monitor agent execution with real-time progress updates

## How Subagents Work

1. **Configuration**: You create Subagent configurations that define their behavior, tools, and system prompts
-2. **Delegation**: The main AI can automatically delegate tasks to appropriate Subagents
+2. **Delegation**: The main AI can automatically delegate tasks to appropriate Subagents, or implicitly fork when no specific subagent type is needed
3. **Execution**: Subagents work independently, using their configured tools to complete tasks
4. **Results**: They return results and execution summaries back to the main conversation

73 changes: 55 additions & 18 deletions packages/core/src/agents/runtime/agent-core.ts
@@ -58,14 +58,25 @@ import type {
 import { type AgentEventEmitter, AgentEventType } from './agent-events.js';
 import { AgentStatistics, type AgentStatsSummary } from './agent-statistics.js';
 import { matchesMcpPattern } from '../../permissions/rule-parser.js';
-import { AgentTool } from '../../tools/agent.js';
 import { ToolNames } from '../../tools/tool-names.js';
 import { DEFAULT_QWEN_MODEL } from '../../config/models.js';
 import { type ContextState, templateString } from './agent-headless.js';

 /**
  * Result of a single reasoning loop invocation.
  */
+/**
+ * Tools that must never be available to subagents (including forked agents).
+ * - AgentTool prevents recursive subagent spawning.
+ * - Cron tools are session-scoped and should only run from the main session.
+ */
+export const EXCLUDED_TOOLS_FOR_SUBAGENTS: ReadonlySet<string> = new Set([
+  ToolNames.AGENT,
+  ToolNames.CRON_CREATE,
+  ToolNames.CRON_LIST,
+  ToolNames.CRON_DELETE,
+]);
+
 export interface ReasoningLoopResult {
   /** The final model text response (empty if terminated by abort/limits). */
   text: string;
@@ -102,6 +113,26 @@ export interface CreateChatOptions {
    * conversational context (e.g., from the main session that spawned it).
    */
   extraHistory?: Content[];
+  /**
+   * When provided, replaces the auto-built generationConfig
+   * (systemInstruction, temperature, etc.) with this exact config.
+   * Used by fork subagents to share the parent conversation's cache
+   * prefix for DashScope prompt caching.
+   */
+  generationConfigOverride?: GenerateContentConfig & {
+    systemInstruction?: string | Content;
+  };
+  /**
+   * When true, skip injecting the env bootstrap messages from
+   * `getInitialChatHistory()`. Set by fork subagents because their
+   * `extraHistory` is the full parent history that already contains
+   * those env messages; re-injecting would duplicate them.
+   *
+   * Other callers (e.g. arena interactive agents) pass an
+   * env-stripped history and DO need fresh env init for their own
+   * working directory, so they must leave this unset.
+   */
+  skipEnvHistory?: boolean;
 }

 /**
@@ -223,30 +254,43 @@ export class AgentCore {
       );
     }

-    const envHistory = await getInitialChatHistory(this.runtimeContext);
+    // Skip env bootstrap when the caller (fork) explicitly says its
+    // extraHistory already contains those messages. Other callers that
+    // provide an env-stripped history (e.g. arena) still get fresh env init.
+    const envHistory = options?.skipEnvHistory
+      ? []
+      : await getInitialChatHistory(this.runtimeContext);

     const startHistory = [
       ...envHistory,
       ...(options?.extraHistory ?? []),
       ...(this.promptConfig.initialMessages ?? []),
     ];

-    const systemInstruction = this.promptConfig.systemPrompt
-      ? this.buildChatSystemPrompt(context, options)
-      : undefined;
+    // If an override is provided (fork path), use it directly for cache
+    // sharing. Otherwise, build the config from this agent's promptConfig.
+    // Note: buildChatSystemPrompt is called OUTSIDE the try/catch so template
+    // errors propagate to the caller (not swallowed by reportError).
+    let generationConfig: GenerateContentConfig & {
+      systemInstruction?: string | Content;
+    };

-    try {
-      const generationConfig: GenerateContentConfig & {
-        systemInstruction?: string | Content;
-      } = {
+    if (options?.generationConfigOverride) {
+      generationConfig = options.generationConfigOverride;
+    } else {
+      const systemInstruction = this.promptConfig.systemPrompt
+        ? this.buildChatSystemPrompt(context, options)
+        : undefined;
+      generationConfig = {
         temperature: this.modelConfig.temp,
         topP: this.modelConfig.top_p,
       };

       if (systemInstruction) {
         generationConfig.systemInstruction = systemInstruction;
       }
+    }

+    try {
       return new GeminiChat(
         this.runtimeContext,
         generationConfig,
@@ -275,14 +319,7 @@ export class AgentCore {
     const toolRegistry = this.runtimeContext.getToolRegistry();
     const toolsList: FunctionDeclaration[] = [];

-    // Tools excluded from subagents: AgentTool (prevent recursion) and
-    // cron tools (session-scoped, should only be used by the main session).
-    const excludedFromSubagents = new Set<string>([
-      AgentTool.Name,
-      ToolNames.CRON_CREATE,
-      ToolNames.CRON_LIST,
-      ToolNames.CRON_DELETE,
-    ]);
+    const excludedFromSubagents = EXCLUDED_TOOLS_FOR_SUBAGENTS;

     if (this.toolConfig) {
       const asStrings = this.toolConfig.tools.filter(
17 changes: 15 additions & 2 deletions packages/core/src/agents/runtime/agent-headless.ts
@@ -192,8 +192,21 @@ export class AgentHeadless {
   async execute(
     context: ContextState,
     externalSignal?: AbortSignal,
+    options?: {
+      extraHistory?: Array<import('@google/genai').Content>;
+      /** Override generationConfig for cache sharing (fork subagent). */
+      generationConfigOverride?: import('@google/genai').GenerateContentConfig;
+      /** Override tool declarations for cache sharing (fork subagent). */
+      toolsOverride?: Array<import('@google/genai').FunctionDeclaration>;
+      /** Skip env bootstrap injection (fork already inherits parent env). */
+      skipEnvHistory?: boolean;
+    },
   ): Promise<void> {
-    const chat = await this.core.createChat(context);
+    const chat = await this.core.createChat(context, {
+      extraHistory: options?.extraHistory,
+      generationConfigOverride: options?.generationConfigOverride,
+      skipEnvHistory: options?.skipEnvHistory,
+    });

     if (!chat) {
       this.terminateMode = AgentTerminateMode.ERROR;
@@ -212,7 +225,7 @@ export class AgentHeadless {
       abortController.abort();
     }

-    const toolsList = this.core.prepareTools();
+    const toolsList = options?.toolsOverride ?? this.core.prepareTools();

     const initialTaskText = String(
       (context.get('task_prompt') as string) ?? 'Get Started!',