fix(agent): propagate ContextVars to concurrent tool worker threads (salvage #16660) #18123
Merged
`_execute_tool_calls_concurrent` submits tools via `executor.submit(_run_tool, ...)`
without `copy_context().run`, so worker threads do not inherit the parent's
ContextVar values — including `_approval_session_key` set by the gateway before
`agent.run`. Worker tools fall through `tools/approval.py:get_current_session_key`'s
resolution order to the `os.environ` fallback ("default" session key), silently
collapsing per-session dispatch for any tool that runs on a worker thread.
Fix: wrap the submitted callable in `contextvars.copy_context().run`, mirroring
`asyncio.to_thread`'s implementation. The existing threading.local callback
propagation (0046d17 / GHSA-qg5c-hvr5-hjgr) is preserved unchanged — it
handles a different propagation surface that ContextVars cannot carry.
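
A minimal sketch of the contract this commit relies on, assuming a default CPython build where new threads start with an empty context. The `session_key` variable and `read_session_key` helper are hypothetical stand-ins, not the real `tools.approval` names:

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the approval session ContextVar.
session_key = contextvars.ContextVar("session_key", default="default")


def read_session_key() -> str:
    """Return whatever value the current thread's context holds."""
    return session_key.get()


def run_demo() -> tuple[str, str]:
    # The caller (e.g. a gateway adapter) sets the key before dispatching tools.
    session_key.set("slack:channel-A")
    with ThreadPoolExecutor(max_workers=1) as executor:
        # Bare submit: the worker thread does not see the caller's value.
        bare = executor.submit(read_session_key).result()
        # Wrapped submit: snapshot the caller's context and run inside it.
        ctx = contextvars.copy_context()
        wrapped = executor.submit(ctx.run, read_session_key).result()
    return bare, wrapped


bare, wrapped = run_demo()
print(f"bare={bare!r} wrapped={wrapped!r}")
```

The wrapped form costs one context copy per submitted call, which is the same trade-off `asyncio.to_thread` makes.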
… executor

Regression suite for the PR #16660 fix. Five layers of guards:

1. test_executor_submit_without_copy_context_does_not_propagate: documents the Python contract the fix relies on. If this ever flips, the fix becomes redundant (and the comment explains why).
2. test_executor_submit_with_copy_context_run_propagates: positive contract test for the copy_context().run(...) pattern itself.
3. test_run_tool_worker_sees_parent_approval_session_key: exercises the real tools.approval._approval_session_key ContextVar through the executor pattern end-to-end.
4. test_run_agent_concurrent_executor_wraps_submit_with_copy_context: AST-level guard that parses run_agent.py and asserts the executor.submit call site for _run_tool is invoked with ctx.run as the first arg, not _run_tool directly. Reverting the fix fails this test with a concrete diagnostic message. This is the PRIMARY regression guard; the behavioral tests exercise the pattern but not the actual call site.
5. test_two_concurrent_tool_batches_keep_session_keys_isolated: two concurrent callers each set a different session key; each worker must observe its own caller's key. Guards against future refactors that share a single context snapshot across callers.

Validated by plant-and-revert: reverting run_agent.py to origin/main makes guard #4 fail with the expected diagnostic. Restoring the fix gives 5/5 passing.
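
For illustration, an AST-level guard like #4 could look roughly like the sketch below. This is not the test from the suite: the helper name `submit_calls_wrap_with_ctx_run` and the two source snippets are assumptions, and the real test parses `run_agent.py` rather than an inline string.

```python
import ast


def submit_calls_wrap_with_ctx_run(source: str, target: str = "_run_tool") -> bool:
    """True if every `*.submit(...)` call that passes `target` has `<obj>.run` as its first arg."""
    found = False
    for node in ast.walk(ast.parse(source)):
        is_submit = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "submit"
        )
        if not is_submit:
            continue
        arg_names = {arg.id for arg in node.args if isinstance(arg, ast.Name)}
        if target not in arg_names:
            continue  # a submit call for some other callable
        found = True
        first = node.args[0]
        if not (isinstance(first, ast.Attribute) and first.attr == "run"):
            return False  # target is submitted directly, without the wrapper
    return found


# Hypothetical call-site shapes, used only to exercise the helper.
FIXED = (
    "ctx = contextvars.copy_context()\n"
    "future = executor.submit(ctx.run, _run_tool, call)\n"
)
REVERTED = "future = executor.submit(_run_tool, call)\n"

print(submit_calls_wrap_with_ctx_run(FIXED))
print(submit_calls_wrap_with_ctx_run(REVERTED))
```

A source-level check is brittle against renames, which is presumably why the suite pairs it with behavioral tests.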
donald131 pushed a commit to donald131/hermes-agent that referenced this pull request on May 2, 2026
…ousResearch#18123)

Propagates ContextVars (notably `tools.approval._approval_session_key`) into concurrent tool worker threads via `copy_context().run`, mirroring `asyncio.to_thread` semantics. Fixes approval-card cross-session misrouting in concurrent gateway traffic. Repro'd on Slack: session A's dangerous-command approval was delivered to channel B (@syahidfrd).

Salvages NousResearch#16660: core 4-LOC fix preserved, unrelated `tests/eval_018/` scope contamination dropped. Adds 5 regression guards including an AST-level source check on the real call site.

Closes NousResearch#16660.

Co-authored-by: firefly <promptsiren@gmail.com>
Co-authored-by: banditburai <banditburai@users.noreply.github.com>
nickdlkk pushed a commit to nickdlkk/hermes-agent that referenced this pull request on May 11, 2026
Salvages the core fix from #16660 by @banditburai (commit authored by @firefly). The original PR had scope contamination: a `tests/eval_018/` directory containing eval-oracle test files for a different project ("talaria") that fail 5/5 on hermes main (they check for a non-existent `cli.CliError` class, a rename from `ContextCompressor` → `ContextCompactor` that never happened here, `talaria.tools._anchor_state` imports, a wrong `generate_title` timeout default, etc.). Those were dropped from this salvage.

The real fix (4 LOC in run_agent.py)

`_execute_tool_calls_concurrent` submits tools via `executor.submit(_run_tool, ...)` without `copy_context().run`, so worker threads run with a fresh context: `tools.approval._approval_session_key` (set by gateway adapters before `agent.run`) is invisible. Workers fall through `get_current_session_key()`'s resolution order to the `os.environ` fallback (which every agent step overwrites), silently collapsing per-session dispatch to whichever session stepped most recently.

Fix: snapshot the caller's context and submit `ctx.run(_run_tool, …)`. Mirrors `asyncio.to_thread` semantics. The existing threading.local callback propagation at `run_agent.py:~8796` (from commit 0046d17 / GHSA-qg5c-hvr5-hjgr) is preserved; that one handles a different propagation surface (approval/sudo callbacks) that ContextVars cannot carry across thread boundaries.

Real-world repro
Via @syahidfrd on #16660 comment: two concurrent Slack sessions (channels A and B), session A's agent fired a dangerous-command approval for a recursive delete → approval card was delivered to channel B — the user there saw an approval prompt for a command they had no context for, while session A's thread blocked waiting for a response that would never come. Any user in B could click "Allow Once" without understanding what they were authorizing.
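
A simplified sketch of how this misrouting can arise, assuming a two-step resolution order (ContextVar first, then an environment variable). The names `DEMO_SESSION_KEY`, `resolve_session_key`, `begin_step`, and `dispatch_tool` are hypothetical and only approximate the behavior described for `get_current_session_key()`:

```python
import contextvars
import os
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins; the real variable lives in tools/approval.py.
_session_key = contextvars.ContextVar("_session_key", default=None)
ENV_NAME = "DEMO_SESSION_KEY"  # assumed name for the process-wide fallback


def resolve_session_key() -> str:
    """Assumed resolution order: ContextVar first, then the shared env var."""
    key = _session_key.get()
    if key:
        return key
    return os.environ.get(ENV_NAME, "default")


def begin_step(key: str) -> None:
    """Each agent step sets its own ContextVar and overwrites the shared env var."""
    _session_key.set(key)
    os.environ[ENV_NAME] = key


def dispatch_tool(executor: ThreadPoolExecutor, wrap: bool) -> str:
    """Runs in the caller's context and hands one tool call to a worker thread."""
    if wrap:
        ctx = contextvars.copy_context()
        return executor.submit(ctx.run, resolve_session_key).result()
    return executor.submit(resolve_session_key).result()


def route_approval(wrap: bool) -> str:
    """Session A steps, session B steps afterwards, then A's tool runs on a worker."""
    ctx_a = contextvars.copy_context()
    ctx_b = contextvars.copy_context()
    ctx_a.run(begin_step, "channel-A")
    ctx_b.run(begin_step, "channel-B")  # most recent step wins the env var
    with ThreadPoolExecutor(max_workers=1) as executor:
        return ctx_a.run(dispatch_tool, executor, wrap)


print(route_approval(wrap=False))  # session A's tool resolves B's key
print(route_approval(wrap=True))   # session A's tool resolves its own key
```

Without the wrapper, session A's worker falls back to the process-wide value that session B wrote last, which matches the cross-channel delivery reported above.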
Regression suite
`tests/run_agent/test_tool_executor_contextvar_propagation.py`: 5 guards, following the `contextvar-run-in-executor-bridge` skill's two-test pattern plus a source-level guard for the real call site:

1. Contract documentation: `executor.submit(fn)` without `copy_context` does NOT propagate ContextVars. If this ever flips, the fix becomes redundant.
2. Contract validation: `copy_context().run(fn)` does propagate. Positive baseline.
3. End-to-end: set the real `_approval_session_key` in a caller, verify the worker thread observes it via `get_current_session_key()`.
4. Source-level guard: AST-parses `run_agent.py` and asserts the `executor.submit` call site for `_run_tool` is invoked with `ctx.run` as its first arg. This is the primary regression guard. Behavioral tests 1-3 + 5 exercise the pattern but not the real call site; they keep passing even if someone reverts the wrapper in `run_agent.py`. Test 4 fails with a concrete diagnostic.
5. Concurrent-caller isolation: two callers each set a different session key; each worker must see its own caller's key.
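
The concurrent-caller isolation property in guard 5 can be sketched as follows. This is an illustrative approximation, not the test in the suite; `session_key`, `observe`, `caller`, and `run_two_callers` are hypothetical names. Each submit takes its own `copy_context()` snapshot, because a single `Context` object cannot be entered by two threads at the same time.

```python
import contextvars
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the approval session ContextVar.
session_key = contextvars.ContextVar("session_key", default="default")


def observe() -> str:
    """Tool body: report the session key visible on the worker thread."""
    return session_key.get()


def caller(key: str, executor: ThreadPoolExecutor, results: dict, n_tools: int = 3) -> None:
    """One gateway caller: set its key, then fan a batch of tools out to the shared pool."""
    session_key.set(key)
    futures = [
        # Fresh snapshot per submit, so concurrent workers never share a Context.
        executor.submit(contextvars.copy_context().run, observe)
        for _ in range(n_tools)
    ]
    results[key] = [future.result() for future in futures]


def run_two_callers() -> dict:
    results: dict = {}
    with ThreadPoolExecutor(max_workers=4) as executor:
        threads = [
            threading.Thread(target=caller, args=(key, executor, results))
            for key in ("session-A", "session-B")
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
    return results


isolation_results = run_two_callers()
print(isolation_results)
```

Both callers share one worker pool, so the check only passes if the key travels with each submitted call rather than with the worker thread.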
Regression guard validation
Planted the pre-fix shape: reverted `run_agent.py` to `origin/main` → guard #4 fails with the expected diagnostic ✓. Restored the fix → 5/5 pass ✓.

Test plan

- `scripts/run_tests.sh tests/run_agent/test_tool_executor_contextvar_propagation.py`: 5/5 pass
- `tests/run_agent/ tests/tools/test_approval.py tests/acp/`: 1481 pass, 17 skipped, 1 pre-existing failure on `origin/main` (`test_interactive_env_var_routes_to_callback`, unrelated to this PR)
- `executor.submit(read_ctxvar)` sees `DEFAULT` while `executor.submit(ctx.run, read_ctxvar)` sees the parent-set value

Closes #16660.
Co-authored-by: firefly <promptsiren@gmail.com>
Co-authored-by: banditburai <banditburai@users.noreply.github.com>