
fix(api-server): typed Responses input items + duplicated history on chained turns #21963

Status: Open
WKHarmon wants to merge 2 commits into NousResearch:main from WKHarmon:fix/responses-input-parsing

Conversation


@WKHarmon commented May 8, 2026

What does this PR do?

Fixes two related bugs in /v1/responses request parsing that cause Open WebUI's stateful Responses-mode multi-turn chats to corrupt their stored conversation_history.

Bug 1: Typed Responses input items get coerced into {role: "user", content: ""} messages

The Responses API spec allows input[] to contain typed items: {type: "function_call", ...}, {type: "function_call_output", ...}, {type: "reasoning", ...}, {type: "message", role: ..., content: ...}. Open WebUI forwards prior assistant turns as a sequence of these typed items when chaining.
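To make the item shapes concrete, here is a minimal sketch of an `input[]` array as a Responses-mode connector might send it on a chained turn. The field values and tool name (`run_shell`, `call_1`) are illustrative, not taken from an actual Open WebUI payload:

```python
# Hypothetical chained-turn input[]: typed items per the Responses API spec.
chained_input = [
    {"type": "message", "role": "user", "content": "What is in my home folder?"},
    {"type": "function_call", "name": "run_shell", "call_id": "call_1",
     "arguments": '{"cmd": "ls ~"}'},
    {"type": "function_call_output", "call_id": "call_1",
     "output": "Documents  Downloads  notes.txt"},
    {"type": "message", "role": "assistant",
     "content": "Your home folder contains Documents, Downloads, and notes.txt."},
    {"type": "message", "role": "user", "content": "Which file is the largest?"},
]

# Only the "message" items carry a role/content pair; the function_call and
# function_call_output items have neither.
non_message = [i for i in chained_input if i.get("type") != "message"]
print(len(non_message))  # 2
```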

The current parser at gateway/platforms/api_server.py:2025 treats every dict the same way:

elif isinstance(item, dict):
    role = item.get("role", "user")
    ...
    input_messages.append({"role": role, "content": content})

function_call / function_call_output items have no role, so they default to user with empty content. They become spurious user-shaped history entries, bloating context and making the agent re-address old user questions.
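The coercion can be reproduced in isolation. The helper below mirrors the parser's per-item behavior described above; it is a standalone sketch, not the actual `api_server.py` code:

```python
# Sketch of the buggy per-item coercion: any dict without role/content
# falls back to a user message with empty content.
def parse_item_buggy(item: dict) -> dict:
    role = item.get("role", "user")       # function_call items have no role
    content = item.get("content", "")     # ...and no content
    return {"role": role, "content": content}

fc = {"type": "function_call", "name": "run_shell", "call_id": "call_1",
      "arguments": "{}"}
print(parse_item_buggy(fc))  # {'role': 'user', 'content': ''}
```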

Bug 2: previous_response_id + inlined history → duplicated transcript

When previous_response_id is set, the server loads stored prior history from the response store (api_server.py:2081). It then unconditionally appends input_messages[:-1] to that loaded history. Open WebUI's Responses-mode connector sends both previous_response_id and re-inlines the entire prior transcript in input[]. Result: every chained turn duplicates every prior turn in stored history; long chains grow exponentially.
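The guard the fix adds can be sketched as follows. Function and variable names here are illustrative, not the PR's actual identifiers:

```python
# Sketch of the Bug 2 guard: when history was already loaded from a prior
# source (previous_response_id or body.conversation_history), skip appending
# the client's inlined replay of earlier turns.
def merge_history(loaded_history, input_messages, history_was_loaded):
    history = list(loaded_history)
    if not history_was_loaded:
        # Fresh conversation: everything before the final item is history.
        history.extend(input_messages[:-1])
    # Otherwise the leading input items are a redundant client-side replay;
    # only the final (current-turn) message is new.
    return history

prior = [{"role": "user", "content": "q1"},
         {"role": "assistant", "content": "a1"}]
inlined = prior + [{"role": "user", "content": "q2"}]

print(len(merge_history(prior, inlined, history_was_loaded=True)))   # 2
print(len(merge_history([], inlined, history_was_loaded=False)))     # 2
```

Without the guard, the first call would return 4 entries (the stored pair plus its replay), and each subsequent chained turn would replay the now-doubled history again.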

This is distinct from #18995 / #21185 ("avoid duplicated Responses history"). That fix runs at storage time and inspects result["messages"] returned by the agent. It does not catch the input-side duplication that happens before the agent runs.

Related Issue

No existing issue — first reproduction is described below. Happy to file a bug-report issue separately if maintainers prefer.

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)

Changes Made

  • gateway/platforms/api_server.py:
    • Bug 1 fix (lines 2018–2030): when iterating input[] dicts, skip items whose type is set to anything other than "message". Untyped role/content dicts (chat-style callers) still pass through unchanged.
    • Bug 2 fix (lines 2087–2102): track whether conversation_history was loaded from a prior source (body.conversation_history or previous_response_id). When it was, skip the for msg in input_messages[:-1]: conversation_history.append(msg) loop — those inlined items are a redundant client-side replay.
  • tests/gateway/test_api_server.py: three regression tests
    • test_responses_input_skips_function_call_items — Bug 1 coverage
    • test_previous_response_id_does_not_duplicate_inlined_history — Bug 2 coverage (previous_response_id path)
    • test_explicit_conversation_history_is_not_duplicated_by_input — Bug 2 coverage (body.conversation_history path)
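For readers who want the shape of the Bug 1 regression test without opening the test file, here is a self-contained sketch. The `parse_input_items` helper approximates the fixed parser's behavior; the real test in tests/gateway/test_api_server.py goes through the server's actual request plumbing:

```python
# Approximation of the fixed parser: drop typed non-"message" items,
# keep untyped role/content dicts (chat-style callers) unchanged.
def parse_input_items(items):
    messages = []
    for item in items:
        if isinstance(item, dict):
            if item.get("type") not in (None, "message"):
                continue  # function_call / function_call_output / reasoning
            messages.append({"role": item.get("role", "user"),
                             "content": item.get("content", "")})
    return messages

def test_responses_input_skips_function_call_items():
    items = [
        {"type": "message", "role": "user", "content": "hi"},
        {"type": "function_call", "call_id": "c1", "arguments": "{}"},
        {"type": "function_call_output", "call_id": "c1", "output": "ok"},
    ]
    assert parse_input_items(items) == [{"role": "user", "content": "hi"}]
```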

How to Test

Reproduction (before fix)

Connect Open WebUI in Responses mode (api_configs[N].api_type = "responses") to a Hermes api_server endpoint. Send two turns:

  1. "What is in my home folder?" (triggers a tool call)
  2. "Which file is the largest?" (context-dependent follow-up)

Observe the second turn's stored conversation_history in ~/.hermes/response_store.db:

[0] user: "What is in my home folder?"
[1] user: ""                                       ← function_call item, coerced
[2] user: ""                                       ← function_call_output item, coerced
[3] assistant: "Your home folder is /home/kyle..."
[4] user: "Which file is the largest?"
[5] user: "What is in my home folder?"             ← DUPLICATE of [0]
[6] assistant: "Your home folder is..."            ← DUPLICATE of [3]
[7] user: "Which file is the largest?"             ← DUPLICATE of [4]
[8-10] current turn's tool flow

hist_len = 11 for what should be a 5-message chain.

After fix

Same two-turn sequence; second turn stores:

[0] user: "What is in my home folder..."
[1] assistant: "Your home folder is /home/kyle..."
[2] user: "Which file is the largest?"

hist_len = 3. Empty user messages gone. No duplication.

Test commands

pytest tests/gateway/test_api_server.py -q

141/141 pass on my checkout (138 existing + 3 new). The three new tests fail on main and pass after this PR.

Checklist

Documentation & Housekeeping

  • No documentation changes needed — internal request-parsing fix
  • No config keys added/changed
  • No architecture changes
  • No cross-platform impact — pure Python request handler logic

WKHarmon added 2 commits May 8, 2026 09:21
When conversation_history is loaded from previous_response_id or
body.conversation_history, the input[] array's leading items are a
client-side replay of the same turns — appending them duplicates every
prior turn in stored history.

Open WebUI's Responses mode triggers this: it sends both
previous_response_id (which loads stored prior history) AND re-inlines
the entire prior transcript as typed message items in input[].  Without
this guard, every chained turn doubles conversation_history; long chains
grow exponentially.

This is distinct from NousResearch#18995 / NousResearch#21185 (which deduplicates result["messages"]
on the storage path).  That fix runs at storage time and inspects the
agent's returned transcript; this fix runs at request time and rejects
redundant inlined history before it ever reaches conversation_history.

Test: test_previous_response_id_does_not_duplicate_inlined_history,
      test_explicit_conversation_history_is_not_duplicated_by_input.
@alt-glitch added labels May 11, 2026: type/bug (Something isn't working), comp/gateway (Gateway runner, session dispatch, delivery), P2 (Medium: degraded but workaround exists)