fix(api-server): typed Responses input items + duplicated history on chained turns#21963
Open
WKHarmon wants to merge 2 commits into
Open
fix(api-server): typed Responses input items + duplicated history on chained turns#21963WKHarmon wants to merge 2 commits into
WKHarmon wants to merge 2 commits into
Conversation
When conversation_history is loaded from previous_response_id or body.conversation_history, the input[] array's leading items are a client-side replay of the same turns — appending them duplicates every prior turn in stored history. Open WebUI's Responses mode triggers this: it sends both previous_response_id (which loads stored prior history) AND re-inlines the entire prior transcript as typed message items in input[]. Without this guard, every chained turn doubles conversation_history; long chains grow exponentially. This is distinct from NousResearch#18995 / NousResearch#21185 (which deduplicates result["messages"] on the storage path). That fix runs at storage time and inspects the agent's returned transcript; this fix runs at request time and rejects redundant inlined history before it ever reaches conversation_history. Test: test_previous_response_id_does_not_duplicate_inlined_history, test_explicit_conversation_history_is_not_duplicated_by_input.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What does this PR do?
Fixes two related bugs in
/v1/responsesrequest parsing that cause Open WebUI's stateful Responses-mode multi-turn chats to corrupt their storedconversation_history.Bug 1: Typed Responses input items get coerced into
{role: "user", content: ""}messagesThe Responses API spec allows
input[]to contain typed items:{type: "function_call", ...},{type: "function_call_output", ...},{type: "reasoning", ...},{type: "message", role: ..., content: ...}. Open WebUI forwards prior assistant turns as a sequence of these typed items when chaining.The current parser at
gateway/platforms/api_server.py:2025treats every dict the same way:function_call/function_call_outputitems have norole, so they default touserwith emptycontent. They become spurious user-shaped history entries, bloating context and making the agent re-address old user questions.Bug 2:
previous_response_id+ inlined history → duplicated transcriptWhen
previous_response_idis set, the server loads stored prior history from the response store (api_server.py:2081). It then unconditionally appendsinput_messages[:-1]to that loaded history. Open WebUI's Responses-mode connector sends bothprevious_response_idand re-inlines the entire prior transcript ininput[]. Result: every chained turn duplicates every prior turn in stored history; long chains grow exponentially.This is distinct from #18995 / #21185 ("avoid duplicated Responses history"). That fix runs at storage time and inspects
result["messages"]returned by the agent. It does not catch the input-side duplication that happens before the agent runs.Related Issue
No existing issue — first reproduction is described below. Happy to file a bug-report issue separately if maintainers prefer.
Type of Change
Changes Made
gateway/platforms/api_server.py:input[]dicts, skip items whosetypeis set to anything other than"message". Untyped role/content dicts (chat-style callers) still pass through unchanged.conversation_historywas loaded from a prior source (body.conversation_historyorprevious_response_id). When it was, skip thefor msg in input_messages[:-1]: conversation_history.append(msg)loop — those inlined items are a redundant client-side replay.tests/gateway/test_api_server.py: three regression teststest_responses_input_skips_function_call_items— Bug 1 coveragetest_previous_response_id_does_not_duplicate_inlined_history— Bug 2 coverage (previous_response_idpath)test_explicit_conversation_history_is_not_duplicated_by_input— Bug 2 coverage (body.conversation_historypath)How to Test
Reproduction (before fix)
Connect Open WebUI in Responses mode (
api_configs[N].api_type = "responses") to a Hermes api_server endpoint. Send two turns:Observe the second turn's stored
conversation_historyin~/.hermes/response_store.db:hist_len = 11for what should be a 5-message chain.After fix
Same two-turn sequence; second turn stores:
hist_len = 3. Empty user messages gone. No duplication.Test commands
141/141 pass on my checkout (138 existing + 3 new). The three new tests fail on
mainand pass after this PR.Checklist
Code
pytest tests/gateway/test_api_server.py -qand all tests passDocumentation & Housekeeping