Skip to content

fix(gateway): avoid duplicated Responses history#18995

Closed
thelumiereguy wants to merge 1 commit into
NousResearch:mainfrom
thelumiereguy:fix/responses-history-duplication
Closed

fix(gateway): avoid duplicated Responses history#18995
thelumiereguy wants to merge 1 commit into
NousResearch:mainfrom
thelumiereguy:fix/responses-history-duplication

Conversation

@thelumiereguy
Copy link
Copy Markdown
Contributor

@thelumiereguy thelumiereguy commented May 2, 2026

Closes the gap: Responses continuation history duplicates prior turns

Problem

/v1/responses supports stateful continuation through previous_response_id, but the API server treated AIAgent.run_conversation()'s result["messages"] as a turn-local suffix. In practice that value is the full internal transcript: previous history, the current user message, and new assistant/tool messages.

The persistence path rebuilt conversation_history + current user and then appended result["messages"], so each chained response stored the previous transcript twice. Long chains grew roughly exponentially and could trigger unnecessary context compression before otherwise small follow-up requests.

The non-streaming response body had a related replay issue: output extraction walked the full transcript and could re-emit old tool call artifacts from earlier previous_response_id turns.

Fix

  • Add a shared turn-boundary detector for transcript-shaped result["messages"].
  • Store the agent transcript directly when it already contains the prior history and current user message.
  • Preserve compatibility with suffix-shaped mocked/legacy results by appending them to conversation_history + current user.
  • Scope non-streaming Responses output extraction to current-turn assistant/tool messages so previous tool artifacts are not replayed.
  • Use the same storage helper in streaming and non-streaming completion paths.

Behaviour table

Scenario Before After
Chained non-streaming response Stored old + user + old + user + new Stores old + user + new
Chained streaming response Stored duplicated prior history Stores one linear transcript
Non-streaming response output with previous tools Replayed old function calls/results Emits current-turn output only
Suffix-shaped agent result Appended as suffix Still appended as suffix

Scope

  • /v1/responses persistence in gateway/platforms/api_server.py
  • Non-streaming Responses output item extraction
  • Regression coverage in tests/gateway/test_api_server.py

No run_agent.py contract change: run_conversation() still returns the canonical full transcript used by CLI, TUI, ACP, gateway, and batch paths.

Test plan

  • scripts/run_tests.sh tests/gateway/test_api_server.py -q
  • Result: 127 passed, 81 warnings
  • Platform tested: macOS (Darwin), Python 3.11.15

Type of Change

  • New feature
  • Bug fix
  • Documentation
  • Refactor

References

  • gateway/platforms/api_server.py:_build_response_conversation_history
  • gateway/platforms/api_server.py:_response_messages_turn_start_index
  • gateway/platforms/api_server.py:_extract_output_items
  • Regression tests: test_previous_response_id_stores_full_agent_transcript_once, test_previous_response_id_outputs_only_current_turn_items, test_streamed_previous_response_id_stores_full_agent_transcript_once

@thelumiereguy thelumiereguy changed the title fix(api): avoid duplicated Responses history fix(api_server): avoid duplicated Responses history May 2, 2026
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/gateway Gateway runner, session dispatch, delivery labels May 2, 2026
@thelumiereguy thelumiereguy force-pushed the fix/responses-history-duplication branch from 6b4ef85 to fcf659a Compare May 2, 2026 22:24
@thelumiereguy thelumiereguy changed the title fix(api_server): avoid duplicated Responses history fix(gateway): avoid duplicated Responses history May 2, 2026
@teknium1
Copy link
Copy Markdown
Contributor

teknium1 commented May 7, 2026

Merged via #21185 with your commit cherry-picked onto current main — your authorship is preserved in git log via rebase-merge. Thanks @thelumiereguy, clean fix and great repro case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/gateway Gateway runner, session dispatch, delivery P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants