fix(core): fix fork subagent bugs and add CacheSafeParams integration

wenshao · wenshao · commit 51728dc4363e · 2026-04-07T12:07:53.000+08:00
Bug fixes:
- Fix AgentParams.subagent_type type: string -> string? (match schema)
- Fix undefined agentType passed to hook system (fallback to subagentConfig.name)
- Fix hook continuation missing extraHistory parameter
- Fix functionResponse missing id field (match coreToolScheduler pattern)
- Fix consecutive user messages in Gemini API (ensure history ends with model)
- Fix duplicate task_prompt when directive already in extraHistory
- Fix FORK_AGENT.systemPrompt empty string causing createChat to throw
- Fix redundant dynamic import of forkSubagent.js (merge into single import)
- Fix non-fork agent returning empty string on execution failure
- Fix misleading fork child rule referencing non-existent system prompt config
- Fix functionResponse.response key from {result:} to {output:} for consistency

CacheSafeParams integration:
- Retrieve parent's generationConfig via getCacheSafeParams() for cache sharing
- Add generationConfigOverride to CreateChatOptions and AgentHeadless.execute()
- Add toolsOverride to AgentHeadless.execute() for parent tool declarations
- Fork API requests now share byte-identical prefix with parent (DashScope cache hits)
- Graceful degradation when CacheSafeParams unavailable (first turn)

Docs:
- Add Fork Subagent section to sub-agents.md user manual
- Add fork-subagent-design.md design document
diff --git a/docs/design/fork-subagent/fork-subagent-design.md b/docs/design/fork-subagent/fork-subagent-design.md
@@ -0,0 +1,112 @@
+# Fork Subagent Design
+
+> Implicit fork subagent that inherits the parent's full conversation context and shares prompt cache for cost-efficient parallel task execution.
+
+## Overview
+
+When the Agent tool is called without `subagent_type`, it triggers an implicit **fork** — a background subagent that inherits the parent's conversation history, system prompt, and tool definitions. The fork uses `CacheSafeParams` to ensure its API requests share the same prefix as the parent's, enabling DashScope prompt cache hits.
+
+## Architecture
+
+```
+Parent conversation: [SystemPrompt | Tools | Msg1 | Msg2 | ... | MsgN (model)]
+                              ↑ identical prefix for all forks ↑
+
+Fork A: [...MsgN | placeholder results | "Research A"]  ← shared cache
+Fork B: [...MsgN | placeholder results | "Modify B"]    ← shared cache
+Fork C: [...MsgN | placeholder results | "Test C"]      ← shared cache
+```
+
+## Key Components
+
+### 1. FORK_AGENT (`forkSubagent.ts`)
+
+`FORK_AGENT` is a synthetic agent config that is not registered in `builtInAgents`. It has a fallback `systemPrompt`, but in practice the fork uses the parent's rendered system prompt via `generationConfigOverride`.
+
+### 2. CacheSafeParams Integration (`agent.ts` + `forkedQuery.ts`)
+
+```
+agent.ts (fork path)
+  │
+  ├── getCacheSafeParams()          ← parent's generationConfig snapshot
+  │     ├── generationConfig        ← systemInstruction + tools + temp/topP
+  │     └── history                 ← (not used — we build extraHistory instead)
+  │
+  ├── forkGenerationConfig          ← passed as generationConfigOverride
+  └── forkToolsOverride             ← FunctionDeclaration[] extracted from tools
+        │
+        ▼
+  AgentHeadless.execute(context, signal, {
+    extraHistory,                   ← parent conversation history
+    generationConfigOverride,       ← parent's exact systemInstruction + tools
+    toolsOverride,                  ← parent's exact tool declarations
+  })
+        │
+        ▼
+  AgentCore.createChat(context, {
+    extraHistory,
+    generationConfigOverride,       ← bypasses buildChatSystemPrompt()
+  })
+        │
+        ▼
+  new GeminiChat(config, generationConfig, startHistory)
+                          ↑ byte-identical to parent's config
+```
+
+### 3. History Construction (`agent.ts` + `forkSubagent.ts`)
+
+The fork's `extraHistory` must end with a model message to maintain the Gemini API's user/model alternation when `agent-headless` sends the `task_prompt` as the next user message.
+
+Three cases:
+
+| Parent history ends with      | extraHistory construction                                              | task_prompt                    |
+| ----------------------------- | ---------------------------------------------------------------------- | ------------------------------ |
+| `model` (no function calls)   | `[...rawHistory]` (unchanged)                                          | `buildChildMessage(directive)` |
+| `model` (with function calls) | `[...rawHistory, model(clone), user(responses+directive), model(ack)]` | `'Begin.'`                     |
+| `user` (unusual)              | `rawHistory.slice(0, -1)` (drop trailing user)                         | `buildChildMessage(directive)` |
+
+### 4. Recursive Fork Prevention (`forkSubagent.ts`)
+
+`isInForkChild()` scans conversation history for the `<fork-boilerplate>` tag. If found, the fork attempt is rejected with an error message.
+
+### 5. Background Execution (`agent.ts`)
+
+Fork uses `void executeSubagent()` (fire-and-forget) and returns `FORK_PLACEHOLDER_RESULT` immediately to the parent. Errors in the background task are caught, logged, and reflected in the display state.
+
+## Data Flow
+
+```
+1. Model calls Agent tool (no subagent_type)
+2. agent.ts: import forkSubagent.js
+3. agent.ts: getCacheSafeParams() → forkGenerationConfig + forkToolsOverride
+4. agent.ts: build extraHistory from parent's getHistory(true)
+5. agent.ts: build forkTaskPrompt (directive or 'Begin.')
+6. agent.ts: createAgentHeadless(FORK_AGENT, ...)
+7. agent.ts: void executeSubagent() — background
+8. agent.ts: return FORK_PLACEHOLDER_RESULT to parent immediately
+9. Background:
+   a. AgentHeadless.execute(context, signal, {extraHistory, generationConfigOverride, toolsOverride})
+   b. AgentCore.createChat() — uses parent's generationConfig (cache-shared)
+   c. runReasoningLoop() — uses parent's tool declarations
+   d. Fork executes tools, produces result
+   e. updateDisplay() with final status
+```
+
+## Graceful Degradation
+
+If `getCacheSafeParams()` returns null (first turn, no history yet), the fork falls back to:
+
+- `FORK_AGENT.systemPrompt` for system instruction
+- `prepareTools()` for tool declarations
+
+This ensures the fork always works, even without cache sharing.
+
+## Files
+
+| File                                                 | Role                                                                                  |
+| ---------------------------------------------------- | ------------------------------------------------------------------------------------- |
+| `packages/core/src/agents/runtime/forkSubagent.ts`   | FORK_AGENT config, buildForkedMessages(), isInForkChild(), buildChildMessage()        |
+| `packages/core/src/tools/agent.ts`                   | Fork path: CacheSafeParams retrieval, extraHistory construction, background execution |
+| `packages/core/src/agents/runtime/agent-headless.ts` | execute() options: generationConfigOverride, toolsOverride                            |
+| `packages/core/src/agents/runtime/agent-core.ts`     | CreateChatOptions.generationConfigOverride                                            |
+| `packages/core/src/followup/forkedQuery.ts`          | CacheSafeParams infrastructure (existing, no changes)                                 |
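
The three history cases in the design document's table can be sketched as follows. This is a minimal illustration, not the code in `agent.ts`: `SketchContent` and `SketchPart` are simplified stand-ins for the `@google/genai` types, `wrapDirective` is a hypothetical substitute for `buildChildMessage()`, the placeholder and acknowledgement strings are invented, and the sketch assumes the cloned model message stands in for the trailing one so that roles keep alternating.

```typescript
// Simplified stand-ins for Content/Part from @google/genai (assumption).
interface SketchPart {
  text?: string;
  functionCall?: { id?: string | undefined; name: string };
  functionResponse?: {
    id?: string | undefined;
    name: string;
    response: { output: string };
  };
}
interface SketchContent {
  role: 'user' | 'model';
  parts: SketchPart[];
}
interface ForkInput {
  extraHistory: SketchContent[];
  taskPrompt: string;
}

// Hypothetical placeholder text; the real constant is FORK_PLACEHOLDER_RESULT.
const PLACEHOLDER = 'fork placeholder result';

// Hypothetical stand-in for buildChildMessage(); the real wrapper adds more rules.
function wrapDirective(directive: string): string {
  return `<fork-boilerplate>${directive}</fork-boilerplate>`;
}

function planForkHistory(
  rawHistory: SketchContent[],
  directive: string,
): ForkInput {
  const last = rawHistory[rawHistory.length - 1];

  // Case 3 (unusual): history ends with a user message, so drop it.
  if (last === undefined || last.role === 'user') {
    return {
      extraHistory: rawHistory.slice(0, -1),
      taskPrompt: wrapDirective(directive),
    };
  }

  // Collect one placeholder response per pending function call.
  const responses: SketchPart[] = [];
  for (const part of last.parts) {
    if (part.functionCall !== undefined) {
      responses.push({
        functionResponse: {
          id: part.functionCall.id,
          name: part.functionCall.name,
          response: { output: PLACEHOLDER },
        },
      });
    }
  }

  // Case 1: model message without function calls, history is reused unchanged.
  if (responses.length === 0) {
    return {
      extraHistory: [...rawHistory],
      taskPrompt: wrapDirective(directive),
    };
  }

  // Case 2: responses and directive share one user message, followed by a
  // model acknowledgement so the history still ends with a model turn.
  const clone: SketchContent = { role: last.role, parts: [...last.parts] };
  const user: SketchContent = {
    role: 'user',
    parts: [...responses, { text: wrapDirective(directive) }],
  };
  const ack: SketchContent = { role: 'model', parts: [{ text: 'Acknowledged.' }] };
  return {
    extraHistory: [...rawHistory.slice(0, -1), clone, user, ack],
    taskPrompt: 'Begin.',
  };
}
```

In every branch the returned `extraHistory` ends with a model message (or is empty), so the `taskPrompt` that `agent-headless` sends next is always a user turn following a model turn.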
diff --git a/docs/users/features/sub-agents.md b/docs/users/features/sub-agents.md
@@ -12,18 +12,49 @@ Subagents are independent AI assistants that:
 - **Work autonomously** - Once given a task, they work independently until completion or failure
 - **Provide detailed feedback** - You can see their progress, tool usage, and execution statistics in real-time
 
+## Fork Subagent (Implicit Fork)
+
+In addition to named subagents, Qwen Code supports **implicit forking** — when the AI omits the `subagent_type` parameter, it triggers a fork that inherits the parent's full conversation context.
+
+### How Fork Differs from Named Subagents
+
+|               | Named Subagent                    | Fork Subagent                                         |
+| ------------- | --------------------------------- | ----------------------------------------------------- |
+| Context       | Starts fresh, no parent history   | Inherits parent's full conversation history           |
+| System prompt | Uses its own configured prompt    | Uses parent's exact system prompt (for cache sharing) |
+| Execution     | Blocks the parent until done      | Runs in background, parent continues immediately      |
+| Use case      | Specialized tasks (testing, docs) | Parallel tasks that need the current context          |
+
+### When Fork is Used
+
+The AI automatically uses fork when it needs to:
+
+- Run multiple research tasks in parallel (e.g., "investigate module A, B, and C")
+- Perform background work while continuing the main conversation
+- Delegate tasks that require understanding of the current conversation context
+
+### Prompt Cache Sharing
+
+All forks share the parent's exact API request prefix (system prompt, tools, conversation history), enabling DashScope prompt cache hits. When three forks run in parallel, the shared prefix is cached once and reused, saving 80%+ of token costs compared to independent subagents.
+
+### Recursive Fork Prevention
+
+Fork children cannot create further forks. This is enforced at runtime — if a fork attempts to spawn another fork, it receives an error instructing it to execute tasks directly.
+
 ## Key Benefits
 
 - **Task Specialization**: Create agents optimized for specific workflows (testing, documentation, refactoring, etc.)
 - **Context Isolation**: Keep specialized work separate from your main conversation
+- **Context Inheritance**: Fork subagents inherit the full conversation for context-heavy parallel tasks
+- **Prompt Cache Sharing**: Fork subagents share the parent's cache prefix, reducing token costs
 - **Reusability**: Save and reuse agent configurations across projects and sessions
 - **Controlled Access**: Limit which tools each agent can use for security and focus
 - **Progress Visibility**: Monitor agent execution with real-time progress updates
 
 ## How Subagents Work
 
 1. **Configuration**: You create Subagents configurations that define their behavior, tools, and system prompts
-2. **Delegation**: The main AI can automatically delegate tasks to appropriate Subagents
+2. **Delegation**: The main AI can automatically delegate tasks to appropriate Subagents — or implicitly fork when no specific subagent type is needed
 3. **Execution**: Subagents work independently, using their configured tools to complete tasks
 4. **Results**: They return results and execution summaries back to the main conversation
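
The prefix sharing described in the "Prompt Cache Sharing" section can be illustrated with a small sketch. It only models the idea that forks differ in their final message; `Msg` and `sharedPrefixLength` are hypothetical names, and real cache behavior depends on the provider.

```typescript
// Hypothetical, simplified message shape used only for this illustration.
interface Msg {
  role: 'user' | 'model';
  text: string;
}

// Count how many leading messages are identical across all requests.
function sharedPrefixLength(requests: Msg[][]): number {
  const first = requests[0];
  if (first === undefined) return 0;
  let n = 0;
  for (; n < first.length; n++) {
    const expected = JSON.stringify(first[n]);
    const allMatch = requests.every(
      (r) => n < r.length && JSON.stringify(r[n]) === expected,
    );
    if (!allMatch) break;
  }
  return n;
}

const parentTurns: Msg[] = [
  { role: 'user', text: 'Investigate modules A, B, and C' },
  { role: 'model', text: 'Forking three workers.' },
];

// Each fork appends only its own directive to the shared parent turns.
const forks: Msg[][] = ['Research A', 'Modify B', 'Test C'].map(
  (directive): Msg[] => [...parentTurns, { role: 'user', text: directive }],
);

const shared = sharedPrefixLength(forks); // 2: both parent turns are shared
```

Only the last message differs between the three requests, which is why the parent turns are the part a prompt cache can reuse.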
 
diff --git a/packages/core/src/agents/runtime/agent-core.ts b/packages/core/src/agents/runtime/agent-core.ts
@@ -101,6 +101,15 @@ export interface CreateChatOptions {
    * conversational context (e.g., from the main session that spawned it).
    */
   extraHistory?: Content[];
+  /**
+   * When provided, replaces the auto-built generationConfig
+   * (systemInstruction, temperature, etc.) with this exact config.
+   * Used by fork subagents to share the parent conversation's cache
+   * prefix for DashScope prompt caching.
+   */
+  generationConfigOverride?: GenerateContentConfig & {
+    systemInstruction?: string | Content;
+  };
 }
 
 /**
@@ -230,22 +239,30 @@ export class AgentCore {
       ...(this.promptConfig.initialMessages ?? []),
     ];
 
-    const systemInstruction = this.promptConfig.systemPrompt
-      ? this.buildChatSystemPrompt(context, options)
-      : undefined;
+    // If an override is provided (fork path), use it directly for cache
+    // sharing. Otherwise, build the config from this agent's promptConfig.
+    // Note: buildChatSystemPrompt is called OUTSIDE the try/catch so template
+    // errors propagate to the caller (not swallowed by reportError).
+    let generationConfig: GenerateContentConfig & {
+      systemInstruction?: string | Content;
+    };
 
-    try {
-      const generationConfig: GenerateContentConfig & {
-        systemInstruction?: string | Content;
-      } = {
+    if (options?.generationConfigOverride) {
+      generationConfig = options.generationConfigOverride;
+    } else {
+      const systemInstruction = this.promptConfig.systemPrompt
+        ? this.buildChatSystemPrompt(context, options)
+        : undefined;
+      generationConfig = {
         temperature: this.modelConfig.temp,
         topP: this.modelConfig.top_p,
       };
-
       if (systemInstruction) {
         generationConfig.systemInstruction = systemInstruction;
       }
+    }
 
+    try {
       return new GeminiChat(
         this.runtimeContext,
         generationConfig,
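
The override branch in this hunk can be exercised in isolation. A minimal sketch, assuming a simplified `GenConfig` type in place of `GenerateContentConfig` and a hypothetical `selectGenerationConfig` helper that does not exist in the codebase:

```typescript
// Simplified stand-in for GenerateContentConfig (assumption, not the real type).
interface GenConfig {
  temperature?: number;
  topP?: number;
  systemInstruction?: string;
}

// Hypothetical helper mirroring the if/else added to createChat().
function selectGenerationConfig(
  override: GenConfig | undefined,
  buildSystemPrompt: () => string | undefined,
  model: { temp: number; top_p: number },
): GenConfig {
  if (override) {
    // Fork path: reuse the parent's config object untouched, so the
    // system prompt builder is never invoked.
    return override;
  }
  // Named-subagent path: build the config from the agent's own settings.
  const config: GenConfig = { temperature: model.temp, topP: model.top_p };
  const systemInstruction = buildSystemPrompt();
  if (systemInstruction) {
    config.systemInstruction = systemInstruction;
  }
  return config;
}

const parentConfig: GenConfig = {
  temperature: 0.2,
  topP: 0.9,
  systemInstruction: 'parent prompt',
};

let builderCalls = 0;
const countingBuilder = (): string => {
  builderCalls += 1;
  return 'agent prompt';
};

const forkConfig = selectGenerationConfig(parentConfig, countingBuilder, {
  temp: 1,
  top_p: 1,
});
const namedConfig = selectGenerationConfig(undefined, countingBuilder, {
  temp: 0.7,
  top_p: 0.95,
});
```

Returning the override by reference rather than copying fields is what keeps the fork's config identical to the parent's in this sketch; any field the fork added or reordered would change the serialized request.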
diff --git a/packages/core/src/agents/runtime/agent-headless.ts b/packages/core/src/agents/runtime/agent-headless.ts
@@ -192,10 +192,17 @@ export class AgentHeadless {
   async execute(
     context: ContextState,
     externalSignal?: AbortSignal,
-    options?: { extraHistory?: Array<import('@google/genai').Content> },
+    options?: {
+      extraHistory?: Array<import('@google/genai').Content>;
+      /** Override generationConfig for cache sharing (fork subagent). */
+      generationConfigOverride?: import('@google/genai').GenerateContentConfig;
+      /** Override tool declarations for cache sharing (fork subagent). */
+      toolsOverride?: Array<import('@google/genai').FunctionDeclaration>;
+    },
   ): Promise<void> {
     const chat = await this.core.createChat(context, {
       extraHistory: options?.extraHistory,
+      generationConfigOverride: options?.generationConfigOverride,
     });
 
     if (!chat) {
@@ -215,7 +222,7 @@ export class AgentHeadless {
       abortController.abort();
     }
 
-    const toolsList = this.core.prepareTools();
+    const toolsList = options?.toolsOverride ?? this.core.prepareTools();
 
     const initialTaskText = String(
       (context.get('task_prompt') as string) ?? 'Get Started!',
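
The fallback behind `options?.toolsOverride ?? this.core.prepareTools()` and the graceful degradation described in the design document can be sketched together. The types and the `resolveForkOverrides` and `pickTools` helpers below are hypothetical simplifications, not the real `CacheSafeParams` API:

```typescript
// Simplified stand-ins (assumptions); the real types come from @google/genai.
interface ToolDecl {
  name: string;
}
interface SnapshotConfig {
  systemInstruction?: string;
  tools?: Array<{ functionDeclarations?: ToolDecl[] }>;
}
interface CacheSafeSnapshot {
  generationConfig: SnapshotConfig;
}
interface ForkOverrides {
  generationConfigOverride?: SnapshotConfig;
  toolsOverride?: ToolDecl[];
}

// Hypothetical helper: derive overrides from the parent's snapshot, or return
// no overrides when the snapshot is unavailable (for example on the first turn).
function resolveForkOverrides(snapshot: CacheSafeSnapshot | null): ForkOverrides {
  if (snapshot === null) {
    return {};
  }
  const toolsOverride: ToolDecl[] = [];
  for (const tool of snapshot.generationConfig.tools ?? []) {
    for (const decl of tool.functionDeclarations ?? []) {
      toolsOverride.push(decl);
    }
  }
  return { generationConfigOverride: snapshot.generationConfig, toolsOverride };
}

// Mirrors the nullish-coalescing fallback in AgentHeadless.execute().
function pickTools(
  overrides: ForkOverrides,
  prepareTools: () => ToolDecl[],
): ToolDecl[] {
  return overrides.toolsOverride ?? prepareTools();
}

const firstTurnTools = pickTools(resolveForkOverrides(null), () => [
  { name: 'local_tool' },
]);
const laterTurnTools = pickTools(
  resolveForkOverrides({
    generationConfig: {
      systemInstruction: 'parent prompt',
      tools: [{ functionDeclarations: [{ name: 'read_file' }] }],
    },
  }),
  () => [{ name: 'local_tool' }],
);
```

Because `??` only falls back on `null` or `undefined`, an empty `toolsOverride` array would be used as-is in this sketch rather than triggering `prepareTools()`.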
diff --git a/packages/core/src/agents/runtime/forkSubagent.ts b/packages/core/src/agents/runtime/forkSubagent.ts
@@ -10,7 +10,8 @@ export const FORK_AGENT = {
   description:
     'Implicit fork — inherits full conversation context. Not selectable via subagent_type; triggered by omitting subagent_type.',
   tools: ['*'],
-  systemPrompt: '',
+  systemPrompt:
+    'You are a forked worker process. Follow the directive in the conversation history. Execute tasks directly using available tools. Do not spawn sub-agents.',
   level: 'session' as const,
 };
 
@@ -26,33 +27,49 @@ export function isInForkChild(messages: Content[]): boolean {
 export const FORK_PLACEHOLDER_RESULT =
   'Fork started — processing in background';
 
+/**
+ * Build extra history messages for a forked subagent.
+ *
+ * When the last model message has function calls, we must include matching
+ * function responses in a user message (Gemini API requirement). The
+ * directive is embedded in this same user message to avoid consecutive
+ * user messages.
+ *
+ * When there are no function calls, we return [] — the parent history
+ * already ends with a model text message and the directive will be sent
+ * as the task_prompt by agent-headless (model → user alternation is OK).
+ *
+ * @param directive - The fork directive text (user's prompt)
+ * @param assistantMessage - The last model message from the parent history
+ * @returns Extra messages to append to history (may be empty)
+ */
 export function buildForkedMessages(
   directive: string,
   assistantMessage: Content,
 ): Content[] {
-  // Clone the assistant message to avoid mutating the original
-  const fullAssistantMessage: Content = {
-    role: assistantMessage.role,
-    parts: [...(assistantMessage.parts || [])],
-  };
-
   const toolUseParts =
     assistantMessage.parts?.filter((part) => part.functionCall) || [];
 
   if (toolUseParts.length === 0) {
-    return [
-      {
-        role: 'user',
-        parts: [{ text: buildChildMessage(directive) }],
-      },
-    ];
+    // No function calls — no extra messages needed.
+    // The parent history already ends with this model message.
+    return [];
   }
 
-  // Build tool_result blocks for every tool_use, all with identical placeholder text
+  // Clone the assistant message to avoid mutating the original
+  const fullAssistantMessage: Content = {
+    role: assistantMessage.role,
+    parts: [...(assistantMessage.parts || [])],
+  };
+
+  // Build a functionResponse part for every functionCall, all with identical
+  // placeholder text. Include the directive text in the same user message to
+  // maintain proper user/model alternation.
   const toolResultParts = toolUseParts.map((part) => ({
     functionResponse: {
+      id: part.functionCall!.id,
       name: part.functionCall!.name,
-      response: { result: FORK_PLACEHOLDER_RESULT },
+      response: { output: FORK_PLACEHOLDER_RESULT },
     },
   }));
 
@@ -76,7 +93,7 @@ STOP. READ THIS FIRST.
 You are a forked worker process. You are NOT the main agent.
 
 RULES (non-negotiable):
-1. Your system prompt says "default to forking." IGNORE IT — that's for the parent. You ARE the fork. Do NOT spawn sub-agents; execute directly.
+1. You ARE the fork. Do NOT spawn sub-agents; execute directly.
 2. Do NOT converse, ask questions, or suggest next steps
 3. Do NOT editorialize or add meta-commentary
 4. USE your tools directly: Bash, Read, Write, etc.
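
The recursive fork check that relies on this boilerplate can be sketched as a scan for the `<fork-boilerplate>` tag. The body of `isInForkChild()` is not shown in this diff, so the version below is an assumption: `HistMessage`, `HistPart`, and `containsForkTag` are hypothetical names, and the sketch checks text parts of every message regardless of role.

```typescript
// Simplified stand-ins for Content/Part from @google/genai (assumption).
interface HistPart {
  text?: string;
}
interface HistMessage {
  role: string;
  parts?: HistPart[];
}

const FORK_TAG = '<fork-boilerplate>';

// Returns true when any text part in the history carries the fork tag,
// which marks the conversation as already running inside a fork child.
function containsForkTag(messages: HistMessage[]): boolean {
  return messages.some((message) =>
    (message.parts ?? []).some(
      (part) => part.text !== undefined && part.text.indexOf(FORK_TAG) !== -1,
    ),
  );
}

const parentHistory: HistMessage[] = [
  { role: 'user', parts: [{ text: 'Investigate module A' }] },
];
const childHistory: HistMessage[] = [
  ...parentHistory,
  { role: 'user', parts: [{ text: `${FORK_TAG}Research A</fork-boilerplate>` }] },
];

const parentMayFork = !containsForkTag(parentHistory);
const childMayFork = !containsForkTag(childHistory);
```

A caller would reject the fork attempt when the check returns true, which matches the runtime error described in the user manual.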
diff --git a/packages/core/src/tools/agent.ts b/packages/core/src/tools/agent.ts