fix(followup): prevent tool call UI leak and Enter accept buffer race #2872
yiliang114 merged 14 commits into QwenLM:main from
Conversation
… tool call UI

The follow-up suggestion generation was leaking into the conversation UI through three channels:

1. The forked query included tools in its generation config, allowing the model to produce function calls during suggestion generation. Fixed by setting `tools: []` in runForkedQuery's per-request config (kept in createForkedChat for speculation, which needs tools).
2. logApiResponse and logApiError recorded suggestion API events to the chatRecordingService, causing them to appear in session JSONL files and the WebUI. Fixed by adding an isInternalPromptId() guard that skips chatRecordingService for 'prompt_suggestion' and 'forked_query' IDs. uiTelemetryService.addEvent() is preserved so /stats still tracks suggestion token usage.
3. LoggingContentGenerator logged suggestion requests/responses to the OpenAI logger and telemetry pipeline. Fixed by skipping logApiRequest, buildOpenAIRequestForLogging, and logOpenAIInteraction for internal prompt IDs. _logApiResponse is preserved (for /stats), but its chatRecordingService path is filtered by fix #2.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📋 Review Summary
This PR draws a clean separation between internal background operations (suggestion generation, forked queries) and user-visible session logging. The changes prevent internal API calls from appearing in the tool call UI, session JSONL recordings, and WebUI replays, while preserving token usage tracking for `/stats`.
🔍 General Feedback
🎯 Specific Feedback
🟡 High
🟢 Medium
🔵 Low
✅ Highlights
wenshao
left a comment
The overall approach is more thorough than a pure UI-layer fix; I agree with solving this at the source. A few additional suggestions:
1. Extract isInternalPromptId to eliminate the duplication (matches the bot's high-priority suggestion)
`isInternalPromptId()` has one implementation each in loggers.ts and loggingContentGenerator.ts, with identical logic. Consider extracting it into a shared utility module, for example:
```typescript
// packages/core/src/utils/internalPromptIds.ts
export const INTERNAL_PROMPT_IDS = Object.freeze({
  SUGGESTION: 'prompt_suggestion',
  FORKED_QUERY: 'forked_query',
} as const);

export function isInternalPromptId(promptId: string): boolean {
  // Check the map's values (the actual prompt ID strings), not its keys.
  return (Object.values(INTERNAL_PROMPT_IDS) as string[]).includes(promptId);
}
```

Using Object.freeze plus a constant map instead of hardcoded string comparisons means that adding a new internal promptId later only requires touching one place, avoiding drift between the two implementations.
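A minimal sketch of the shape the PR eventually adopted (a `ReadonlySet` registry; `'speculation'` was added in a later commit on this thread, and the `export` keywords of the real module are omitted so the snippet runs standalone):

```typescript
// Registry of prompt IDs used by internal background operations
// (exported from utils/internalPromptIds.ts in the actual PR).
const INTERNAL_PROMPT_IDS: ReadonlySet<string> = new Set([
  'prompt_suggestion',
  'forked_query',
  'speculation',
]);

// Guard shared by loggers.ts and LoggingContentGenerator: internal
// traffic is kept out of session recordings and the tool call UI.
function isInternalPromptId(promptId: string): boolean {
  return INTERNAL_PROMPT_IDS.has(promptId);
}
```

A `Set` lookup keeps the membership check O(1) and makes adding a new internal ID a one-line change.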
2. Extract tools: [] into a named constant
Consider extracting the `{ tools: [] }` in forkedQuery.ts into a named constant:

```typescript
const NO_TOOLS_CONFIG: GenerateContentConfig = Object.freeze({ tools: [] });
// ...
const requestConfig: GenerateContentConfig = { ...NO_TOOLS_CONFIG };
```

3. Type safety of openaiRequest = undefined
For internal promptIds, `openaiRequest` is assigned `undefined`, but the parameter type of `logOpenAIInteraction` may not accept `undefined`. Consider declaring `openaiRequest` as `SomeType | undefined`, or adding a type guard at the call site, so TypeScript emits no warning.
4. Add unit tests for isInternalPromptId
Verify "prompt_suggestion" → true, "forked_query" → true, "user_query" → false, and empty string → false.
— qwen3.6-plus
…ers.ts Address review feedback: extract isInternalPromptId() to a single exported function in telemetry/loggers.ts and import it in LoggingContentGenerator, eliminating the duplicate private method. Also update loggingContentGenerator.test.ts mock to use importOriginal so the real isInternalPromptId is available during tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Review feedback addressed; commit pushed: 🟡 High — duplication · 🟢 Medium · 🟢 Medium · 🔵 Low — JSDoc and export
Address maintainer review feedback:
1. Move isInternalPromptId() to packages/core/src/utils/internalPromptIds.ts using a ReadonlySet for the ID registry. Adding new internal prompt IDs only requires changing one file. loggers.ts re-exports for compatibility; loggingContentGenerator.ts imports directly from utils.
2. Extract the `tools: []` magic value to a frozen NO_TOOLS constant in forkedQuery.ts.
3. Add unit tests for isInternalPromptId: prompt_suggestion → true, forked_query → true, user_query → false, empty string → false.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Maintainer feedback addressed; commit pushed covering suggestions 1 (extraction), 2, 3, and 4 (unit tests).
Pull request overview
This PR prevents internal follow-up/suggestion generation requests from surfacing as tool-call UI events by treating them as “internal” prompt IDs and suppressing their recording/logging in the session replay artifacts.
Changes:
- Added `isInternalPromptId()` and used it to skip `chatRecordingService` recording for `api_response`/`api_error` events tied to internal prompt IDs.
- Updated forked query execution to strip tools per request (`tools: []`) so follow-up generation can't produce function calls.
- Updated `LoggingContentGenerator` to skip request/OpenAI interaction logging for internal prompt IDs while preserving API response/error metrics logging.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| packages/core/src/telemetry/loggers.ts | Adds internal prompt ID detection and prevents internal API events from being recorded into session JSONL/WebUI replay. |
| packages/core/src/followup/forkedQuery.ts | Forces forked queries to run with no tools to avoid tool-call UI artifacts. |
| packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts | Skips request/OpenAI logging for internal prompt IDs while keeping response/error metric logging. |
| packages/core/src/core/loggingContentGenerator/loggingContentGenerator.test.ts | Adjusts logger mocking to preserve new exports while still mocking request/response/error logging. |
Comments suppressed due to low confidence (3)
packages/core/src/followup/forkedQuery.ts:211
`runForkedQuery` now explicitly overrides `tools` to an empty list to prevent function calls. There's test coverage for CacheSafeParams in this module, but nothing asserting that `runForkedQuery` actually sends requests with tools stripped (or that it cannot emit tool-call parts). Add a unit test that stubs `GeminiChat.sendMessageStream`/the content generator and verifies the merged config contains `tools: []` for the `'forked_query'` prompt_id.
```typescript
const model = options?.model ?? params.model;
const chat = createForkedChat(config, params);

// Build per-request config overrides.
// NO_TOOLS prevents the model from producing function calls — forked
// queries are pure text completion and must not appear in tool-call UI.
const requestConfig: GenerateContentConfig = { ...NO_TOOLS };
if (options?.abortSignal) {
  requestConfig.abortSignal = options.abortSignal;
}
if (options?.jsonSchema) {
  requestConfig.responseMimeType = 'application/json';
  requestConfig.responseJsonSchema = options.jsonSchema;
}
const stream = await chat.sendMessageStream(
  model,
  {
```
packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts:217
- For internal prompt IDs, `openaiRequest` is set to `undefined` (so OpenAI logging is effectively disabled), but the streaming path still runs the full response consolidation work in `loggingStreamWrapper` before calling `logOpenAIInteraction` (which then returns early). To avoid unnecessary CPU/memory overhead for suggestion/forked queries, propagate `isInternal` into the wrapper and skip consolidating + `logOpenAIInteraction` entirely when internal.
```typescript
const isInternal = isInternalPromptId(userPromptId);
if (!isInternal) {
  this.logApiRequest(
    this.toContents(req.contents),
    req.model,
    userPromptId,
  );
}
const openaiRequest = isInternal
  ? undefined
  : await this.buildOpenAIRequestForLogging(req);
let stream: AsyncGenerator<GenerateContentResponse>;
try {
  stream = await this.wrapped.generateContentStream(req, userPromptId);
} catch (error) {
  const durationMs = Date.now() - startTime;
  this._logApiError('', durationMs, error, req.model, userPromptId);
  if (!isInternal) {
    await this.logOpenAIInteraction(openaiRequest, undefined, error);
  }
  throw error;
}
return this.loggingStreamWrapper(
  stream,
  startTime,
  userPromptId,
  req.model,
  openaiRequest,
);
```
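The suggested change can be sketched as a standalone wrapper (the signature is simplified: the real method also threads startTime, the model, and the OpenAI request, and `onComplete` stands in for the consolidation + `logOpenAIInteraction` step):

```typescript
// Standalone sketch: propagate isInternal so the wrapper skips response
// collection entirely for internal prompts, instead of consolidating a
// response that logOpenAIInteraction would discard anyway.
async function* loggingStreamWrapper<T>(
  stream: AsyncGenerator<T>,
  isInternal: boolean,
  onComplete: (collected: readonly T[] | undefined) => void,
): AsyncGenerator<T> {
  // Internal prompts collect nothing, so there is no buffering overhead.
  const collected: T[] | undefined = isInternal ? undefined : [];
  for await (const chunk of stream) {
    collected?.push(chunk); // no-op when internal
    yield chunk;
  }
  onComplete(collected);
}
```

Chunks still pass through to the caller unchanged; only the bookkeeping is skipped.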
packages/core/src/telemetry/loggers.ts:133
`isInternalPromptId` and the new guards in `logApiError`/`logApiResponse` change recording behavior (skip `chatRecordingService.recordUiTelemetryEvent` for internal prompt IDs). There are existing logger unit tests, but none exercise this new branch. Add tests that pass `prompt_id = 'prompt_suggestion'`/`'forked_query'` with a mocked `getChatRecordingService()` and assert `recordUiTelemetryEvent` is not called, while it still is for normal prompt IDs.
```typescript
// Re-export for consumers that import from this module.
export { isInternalPromptId } from '../utils/internalPromptIds.js';

export function logStartSession(
  config: Config,
  event: StartSessionEvent,
): void {
  QwenLogger.getInstance(config)?.logStartSessionEvent(event);
  if (!isTelemetrySdkInitialized()) return;
```
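The guard pattern under test can be sketched standalone (the event shape and the two sink arrays are simplified stand-ins for `uiTelemetryService` and `chatRecordingService`):

```typescript
interface ApiResponseEvent { prompt_id: string; tokens: number; }

// Simplified stand-ins for the two sinks involved (assumptions).
const uiTelemetryEvents: ApiResponseEvent[] = [];   // feeds /stats
const chatRecordingEvents: ApiResponseEvent[] = []; // feeds session JSONL / WebUI replay

const INTERNAL_PROMPT_IDS: ReadonlySet<string> = new Set([
  'prompt_suggestion',
  'forked_query',
  'speculation',
]);
const isInternalPromptId = (id: string): boolean => INTERNAL_PROMPT_IDS.has(id);

function logApiResponse(event: ApiResponseEvent): void {
  // Always feed /stats so suggestion token usage stays visible.
  uiTelemetryEvents.push(event);
  // Internal traffic must not reach session recordings or WebUI replay.
  if (isInternalPromptId(event.prompt_id)) return;
  chatRecordingEvents.push(event);
}
```

The ordering matters: the `/stats` sink is fed before the early return, so filtering never hides token usage.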
1. Update forkedQuery.ts module docs to reflect that runForkedQuery overrides tools: [] at the per-request level while createForkedChat retains the full generationConfig for speculation callers.
2. Propagate isInternal into loggingStreamWrapper to skip response collection and consolidation for internal prompts, avoiding unnecessary CPU/memory overhead.
3. Add logApiResponse chatRecordingService filter tests: verify prompt_suggestion/forked_query skip recording while normal IDs still record.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
Address Copilot review round 3: 1. Deep-freeze NO_TOOLS.tools array to prevent shared mutable state across forked query calls. 2. Add LoggingContentGenerator tests verifying that internal prompt IDs (prompt_suggestion, forked_query) skip logApiRequest and OpenAI interaction logging while preserving logApiResponse. 3. Add logApiError chatRecordingService filter tests matching the existing logApiResponse coverage. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Clarify that createForkedChat retains the full generationConfig (including tools) for speculation callers, while runForkedQuery strips tools at the per-request level via NO_TOOLS. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (2)
packages/core/src/followup/forkedQuery.ts:224
`requestConfig` is now always non-empty because it always includes `tools: []` via `NO_TOOLS`, so `Object.keys(requestConfig).length > 0 ? requestConfig : undefined` is redundant and will always select `requestConfig`. Consider simplifying to pass `config: requestConfig` directly (or keep the empty-config branch only if `NO_TOOLS` can be conditionally applied).
```typescript
// Build per-request config overrides.
// NO_TOOLS prevents the model from producing function calls — forked
// queries are pure text completion and must not appear in tool-call UI.
const requestConfig: GenerateContentConfig = { ...NO_TOOLS };
if (options?.abortSignal) {
  requestConfig.abortSignal = options.abortSignal;
}
if (options?.jsonSchema) {
  requestConfig.responseMimeType = 'application/json';
  requestConfig.responseJsonSchema = options.jsonSchema;
}
const stream = await chat.sendMessageStream(
  model,
  {
    message: [{ text: userMessage }],
    config: Object.keys(requestConfig).length > 0 ? requestConfig : undefined,
  },
```
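The redundancy is easy to demonstrate in isolation (`NO_TOOLS` here is a local stand-in for the real constant):

```typescript
// NO_TOOLS guarantees requestConfig always has at least one key,
// so the Object.keys(...).length check below can never pick undefined.
const NO_TOOLS = Object.freeze({ tools: [] as unknown[] });
const requestConfig: Record<string, unknown> = { ...NO_TOOLS };

// Dead branch: this ternary always selects requestConfig …
const viaTernary =
  Object.keys(requestConfig).length > 0 ? requestConfig : undefined;

// … so passing the config directly is equivalent.
const direct = requestConfig;
```

Dropping the ternary removes a branch that can never be taken, without changing behavior.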
packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts:217
- The internal-prompt behavior is updated for `generateContentStream` (skipping `logApiRequest`, OpenAI request conversion, and OpenAI interaction logging), but the new tests only cover `generateContent`. Add a focused unit test for `generateContentStream` with `userPromptId` in `['prompt_suggestion', 'forked_query']` to assert: `logApiRequest`/`OpenAILogger.logInteraction` are not called, while `logApiResponse` still is (for `/stats`).
```typescript
async generateContentStream(
  req: GenerateContentParameters,
  userPromptId: string,
): Promise<AsyncGenerator<GenerateContentResponse>> {
  const startTime = Date.now();
  const isInternal = isInternalPromptId(userPromptId);
  if (!isInternal) {
    this.logApiRequest(
      this.toContents(req.contents),
      req.model,
      userPromptId,
    );
  }
  const openaiRequest = isInternal
    ? undefined
    : await this.buildOpenAIRequestForLogging(req);
  let stream: AsyncGenerator<GenerateContentResponse>;
  try {
    stream = await this.wrapped.generateContentStream(req, userPromptId);
  } catch (error) {
    const durationMs = Date.now() - startTime;
    this._logApiError('', durationMs, error, req.model, userPromptId);
    if (!isInternal) {
      await this.logOpenAIInteraction(openaiRequest, undefined, error);
    }
    throw error;
  }
  return this.loggingStreamWrapper(
    stream,
    startTime,
    userPromptId,
    req.model,
    openaiRequest,
  );
```
1. Fix NO_TOOLS type: Object.freeze produces a readonly array incompatible with ToolUnion[]. Use Readonly<Pick<>> instead; the spread in requestConfig already creates a fresh mutable copy per call.
2. Fix test missing the required 'model' field in ContentGeneratorConfig.
3. Track firstResponseId/firstModelVersion in loggingStreamWrapper so _logApiResponse/_logApiError have accurate values even when full response collection is skipped for internal prompts.
4. Strengthen the OpenAI logger test assertion: assert OpenAILogger was constructed (not guarded by if), then assert logInteraction was not called.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
1. Simplify runForkedQuery: requestConfig always has tools: [] from the NO_TOOLS spread, so the Object.keys().length > 0 ternary is dead code. Pass requestConfig directly.
2. Add a generateContentStream test for internal prompt IDs to match the existing generateContent coverage, ensuring the streaming wrapper also skips logApiRequest and OpenAI interaction logging.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.
When accepting a followup suggestion via Enter, accept() queued buffer.insert(suggestion) in a microtask that executed after handleSubmitAndClear had already cleared the buffer, leaving the suggestion text stuck in the input. Add skipOnAccept option to accept() so the Enter path bypasses the onAccept callback. Also add runForkedQuery unit tests verifying tools: [] is passed in per-request config. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated no new comments.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
wenshao
left a comment
Review Summary
Core package changes are well-implemented with good test coverage (183 tests pass). However, two critical issues need attention:
Critical

- WebUI `createFollowupController` does NOT implement `skipOnAccept` — the interface declares `accept(method?, options?)` with `{ skipOnAccept?: boolean }`, but the actual implementation silently ignores the second argument. The microtask still fires `onAccept`, re-inserting the suggestion text after the buffer was cleared. This is the exact same race condition the PR fixes in the CLI, but still present in the WebUI.
- `'speculation'` missing from `INTERNAL_PROMPT_IDS` — only `'prompt_suggestion'` and `'forked_query'` are included. Speculation uses the same `LoggingContentGenerator` → `loggers.ts` → `chatRecordingService` pipeline, so speculation tool calls will still leak into chat recordings — the same problem this PR fixes for followup suggestions.

Suggestion

- WebUI duplicates `createFollowupController` — a full ~110-line re-implementation from core. This duplication is the root cause of issue (1). Consider importing from core directly or using a thin adapter.
— qwen3.6-plus via Qwen Code /review
Reply to the qwen3.6-plus review:
[Critical] On Suggested fix: restore the
— gpt-5.4 via Qwen Code /review
…, improve suggestion prompt

- Add 'speculation' to INTERNAL_PROMPT_IDS so speculation API traffic and tool calls are hidden from chat recordings and the tool call UI
- Add an isInternalPromptId check to logToolCall() for consistency with logApiError/logApiResponse
- Improve SUGGESTION_PROMPT: prioritize the assistant's last few lines and extract actionable text from explicit tips (e.g. "Tip: type X")
- Fix garbled unicode in prompt text
- Update design docs and user docs to reflect the changes
- Add test coverage for all new behavior
Reply to the gpt-5.4 review: false positive. That change appears in this PR's diff only because it came in when merging main; nothing needs to be restored.
…erator tests

- Object.freeze NO_TOOLS and its tools array to prevent runtime mutation
- Add 'speculation' to the loggingContentGenerator internal prompt ID tests for consistency with loggers.test.ts and internalPromptIds.ts
Use `as const` with type assertion to satisfy TypeScript while keeping runtime immutability via Object.freeze.
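The resulting pattern can be sketched standalone (`GenerateContentConfig` is reduced to a minimal local stand-in; the real SDK type has many more fields):

```typescript
// Minimal stand-in for the SDK's GenerateContentConfig (assumption).
interface GenerateContentConfig {
  tools?: unknown[];
  responseMimeType?: string;
}

// Both the object and its tools array are frozen at runtime; the inner
// cast keeps the declared element type mutable (unknown[]) so the value
// stays assignable to the config's mutable tools field.
const NO_TOOLS: Readonly<Pick<GenerateContentConfig, 'tools'>> = Object.freeze({
  tools: Object.freeze([] as unknown[]) as unknown[],
});

// Spreading yields a fresh, mutable config object per call; note that the
// nested tools array itself is still the shared frozen instance.
const requestConfig: GenerateContentConfig = { ...NO_TOOLS };
```

Freezing at runtime guards against accidental shared-state mutation even where the static types can't express it.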
…rs.ts All consumers import directly from utils/internalPromptIds.js. The re-export was dead code with no importers.
yiliang114
left a comment
Core / CLI fixes look good to me. I’m treating the exported WebUI followup consumer alignment as separate follow-up work, tracked in #3030, rather than a blocker for this PR.
Summary
Follow-up suggestion generation API calls were leaking into the tool call UI. Additionally, accepting a suggestion via Enter left the suggestion text stuck in the input buffer due to a microtask race condition. This PR fixes both issues, plus improves suggestion quality and hardens internal prompt filtering.
Root Cause 1 — Tool call UI leak
- `logApiResponse`/`logApiError` wrote to chatRecordingService
- `logToolCall` wrote to chatRecordingService unconditionally
- `LoggingContentGenerator` recorded requests/responses

Root Cause 2 — Buffer race on Enter accept
When accepting a followup suggestion via Enter, `accept()` queued `buffer.insert(suggestion)` in a microtask, which executed after `handleSubmitAndClear` had already cleared the buffer — leaving the suggestion text stuck in the input. Fixed by adding a `skipOnAccept` option to `accept()` for the Enter path, which bypasses the `onAccept` callback since the text is passed directly to submit.

Changes
- New `utils/internalPromptIds.ts` — centralises internal prompt IDs (`prompt_suggestion`, `forked_query`, `speculation`) in a `ReadonlySet` with an exported `isInternalPromptId()` guard. Adding a new internal ID requires changing only this file.
- `followup/forkedQuery.ts` — `runForkedQuery` sets `tools: []` via a deep-frozen `NO_TOOLS` constant (`Object.freeze` with `as const`) in the per-request config, preventing the model from producing function calls. `createForkedChat` retains the full generationConfig (including tools) so speculation callers can still execute tool calls.
- `telemetry/loggers.ts` — `logApiResponse`, `logApiError`, and `logToolCall` skip `chatRecordingService.recordUiTelemetryEvent()` for internal prompt IDs. `uiTelemetryService.addEvent()` is preserved so `/stats` still tracks suggestion token usage. Removed the unused `isInternalPromptId` re-export (all consumers import directly from `utils/internalPromptIds.js`).
- `loggingContentGenerator.ts` — for internal prompt IDs: skips `logApiRequest`, `buildOpenAIRequestForLogging`, and `logOpenAIInteraction`; `loggingStreamWrapper` skips response collection and consolidation to reduce CPU/memory overhead; tracks `firstResponseId`/`firstModelVersion` during iteration so `_logApiResponse`/`_logApiError` retain accurate IDs even without full response collection.
- `followup/followupState.ts` — `accept()` gains an optional `{ skipOnAccept: true }` parameter. When set, the microtask skips the `onAccept` callback (buffer insert) while still firing telemetry and clearing state.
- `followup/suggestionGenerator.ts` — improved `SUGGESTION_PROMPT`: the `FIRST:` directive now prioritizes the last few lines of the assistant's response (where tips and next-step hints appear), and the `PRIORITY:` section extracts actionable text from explicit tips (e.g., `Tip: type post comments` → `post comments`).
- `InputPrompt.tsx` (CLI) / `InputForm.tsx` (WebUI) — the Enter accept path passes `{ skipOnAccept: true }` to prevent the microtask from re-inserting the suggestion after the buffer was already cleared by submit.
prompt-suggestion-design.md(LLM Prompt section synced, new Internal Prompt ID Filtering section) andfollowup-suggestions.md(tip extraction mention in user docs).Not affected
/statstoken tracking ✅skipOnAccept, behavior unchanged)Test plan
skipOnAcceptunit test: skipsonAcceptcallback, still fires telemetrybuffer.insertis NOT calledlogToolCallinternal ID filtering: 3 prompt IDs verifiedloggingContentGeneratorinternal ID filtering: speculation includedinternalPromptIdstest: all 3 IDs + negative cases/statsincludes suggestion token counts