Skip to content

Force forked agents to inherit parent model settings#16055

Closed
friel-openai wants to merge 11 commits intomainfrom
dev/friel/fork-context-inherits-parent-model
Closed

Force forked agents to inherit parent model settings#16055
friel-openai wants to merge 11 commits intomainfrom
dev/friel/fork-context-inherits-parent-model

Conversation

@friel-openai
Copy link
Copy Markdown
Contributor

@friel-openai friel-openai commented Mar 27, 2026

Summary

  • make explicit forked spawns (fork_context = true / fork_turns) ignore child model and reasoning_effort overrides
  • keep forked agents on the spawning thread's effective model, provider, reasoning, service tier, verbosity, context-window, auto-compact, and profile settings
  • preserve forked-child prompt-cache behavior by reusing the parent prompt cache key
  • preserve MCP tool-surface parity for forked children by inheriting the parent MCP connection manager instead of starting a fresh manager
  • add v1/v2 regression tests for model inheritance, cache-key inheritance, and MCP inheritance

Test

  • cargo +1.93.1 test -p codex-core spawn_agent_fork_context_ignores_child_model_overrides --lib
  • cargo +1.93.1 test -p codex-core multi_agent_v2_spawn_fork_turns_ignores_child_model_overrides --lib
  • cargo +1.93.1 test -p codex-core spawn_agent_can_fork_parent_thread_history_with_sanitized_items --lib

Copy link
Copy Markdown
Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9b0bc4b161

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread codex-rs/core/src/tools/handlers/multi_agents/spawn.rs
Comment thread codex-rs/core/src/tools/handlers/multi_agents_tests.rs
@friel-openai friel-openai force-pushed the dev/friel/fork-context-inherits-parent-model branch 4 times, most recently from 8bbc644 to 3379d41 Compare April 1, 2026 00:07
@friel-openai friel-openai force-pushed the dev/friel/fork-context-inherits-parent-model branch from 3379d41 to fc08138 Compare April 1, 2026 23:46
@friel-openai friel-openai force-pushed the dev/friel/fork-context-inherits-parent-model branch 3 times, most recently from 7d0d474 to 01c2cff Compare April 7, 2026 17:33
@friel-openai friel-openai force-pushed the dev/friel/fork-context-inherits-parent-model branch from 6b62601 to e351cad Compare April 7, 2026 22:31
Copy link
Copy Markdown
Collaborator

@jif-oai jif-oai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is way too large. Forked agent should just inherit the full config of the parent agent. We shouldn't need to touch any other components

Btw profiles will disappear so feel free to ignore them all

@friel-openai
Copy link
Copy Markdown
Contributor Author

@jif-oai there are a few issues fixed in this PR:

  • configuration that affects forked agent behavior, such as model, reasoning effort, context window
  • MCP servers refreshing and generating distinct tool calls, not relying on the parent's MCP
  • Prompt cache keys

I think your comment is primarily about the first, is that right?

Copy link
Copy Markdown
Collaborator

@jif-oai jif-oai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this would be good to share more cache here but I don't think this PR fits at the right level.
I would suggest 3 different small PRs:

  1. Fully re-use the parent config in case of a fork + return a warning or error to the agent if it tries to set anything else
  2. Shared cache key, do a proper wiring that let us "override" the cache key for a given thread. That we fix at the creation of the thread and gets set directly into the API client
  3. Make the MCP tools listing stable. Either by definition or using a snapshot

mcp_connection_manager: Arc::new(RwLock::new(McpConnectionManager::new_uninitialized(
&config.permissions.approval_policy,
))),
mcp_connection_manager: session_configuration
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't do this. MCP manager get replaced/modified if we have sandbox updates, elicitation, ...
In general, we can't share the manager without re-designing it
If I get it correctly, the only reason we do this is because we want a stability over built_tools. First, this shouldn't drift most of the time. Second, we should use a snapshot around it to fix the issue then

let inherited_exec_policy = self
.inherited_exec_policy_for_source(&state, Some(&session_source), &config)
.await;
let inherited_prompt_cache_key = self
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This resume path should not opportunistically inherit parent cache/MCP state from a live parent. SubAgentSource::ThreadSpawn does not persist whether this child was originally forked, so the same resumed agent will get parent-shared cache/MCP only when the parent is currently loaded, and isolated state otherwise... we can't have such non-determinism

inherited_shell_snapshot: None,
user_shell_override: None,
inherited_exec_policy: Some(Arc::clone(&parent_session.services.exec_policy)),
inherited_prompt_cache_key: Some(parent_session.prompt_cache_key()),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are not fork

@@ -770,6 +773,8 @@ impl ThreadManagerState {
metrics_service_name: Option<String>,
inherited_shell_snapshot: Option<Arc<ShellSnapshot>>,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a lot of optional inherited things. We should have a small builder pattern for the inherited things

@@ -225,7 +225,11 @@ fn build_agent_shared_config(turn: &TurnContext) -> Result<Config, FunctionCallE
let mut config = (*base_config).clone();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this should be cleared IMO
We should just re-use the full same config... otherwise this will become brittle over time

config.model = Some(turn.model_info.slug.clone());
config.model_provider = turn.provider.clone();
config.model_reasoning_effort = turn.reasoning_effort;
// Forked children must preserve the spawning turn's effective model settings, including a
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code path is not only for forked agents

pub(crate) inherited_shell_snapshot: Option<Arc<ShellSnapshot>>,
pub(crate) inherited_exec_policy: Option<Arc<ExecPolicyManager>>,
pub(crate) inherited_prompt_cache_key: Option<ThreadId>,
pub(crate) inherited_mcp_connection_manager: Option<Arc<RwLock<McpConnectionManager>>>,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up Option of an Arc of RwLock looks like a code smell in Rust... there are better alternatives for hot swap shared references

}

#[tokio::test]
async fn spawn_agent_fork_context_ignores_child_model_overrides() {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test is kind of pointless IMO
A better test would be an integration test that makes sure the full context is the same at the API level


let mut config = (*turn.config).clone();
let mut role_provider =
built_in_model_providers(/* openai_base_url */ /*openai_base_url*/ None)["openai"].clone();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dup in the comment

args.reasoning_effort,
)
.await?;
if !args.fork_context {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably error to the model if it specify fork with a model and reasoning effort. Or at least emit a warning

friel-openai added a commit that referenced this pull request Apr 13, 2026
## Summary

When a `spawn_agent` call does a full-history fork, keep the parent's
effective agent type and model configuration instead of applying child
role/model overrides.

This is the minimal config-inheritance slice of #16055. Prompt-cache key
inheritance and MCP tool-surface stability are split into follow-up PRs.

## Design

- Reject `agent_type`, `model`, and `reasoning_effort` for v1
`fork_context` spawns.
- Reject `agent_type`, `model`, and `reasoning_effort` for v2
`fork_turns = "all"` spawns.
- Keep v2 partial-history forks (`fork_turns = "N"`) configurable;
requested model/reasoning overrides and role config still apply there.
- Keep non-forked spawn behavior unchanged.

## Tests

- `cargo +1.93.1 test -p codex-core spawn_agent_fork_context --lib`
- `cargo +1.93.1 test -p codex-core multi_agent_v2_spawn_fork_turns
--lib`
- `cargo +1.93.1 test -p codex-core
multi_agent_v2_spawn_partial_fork_turns_allows_agent_type_override
--lib`
manoelcalixto added a commit to Electivus/codex that referenced this pull request Apr 14, 2026
* Mirror user text into realtime (openai#17520)

- Let typed user messages submit while realtime is active and mirror
accepted text into the realtime text stream.
- Add integration coverage and snapshot for outbound realtime text.

* feat(tui): add reverse history search to composer (openai#17550)

## Problem

The TUI had shell-style Up/Down history recall, but `Ctrl+R` did not
provide the reverse incremental search workflow users expect from
shells. Users needed a way to search older prompts without immediately
replacing the current draft, and the interaction needed to handle async
persistent history, repeated navigation keys, duplicate prompt text,
footer hints, and preview highlighting without making the main composer
file even harder to review.


https://github.com/user-attachments/assets/5165affd-4c9a-46e9-adbd-89088f5f7b6b

<img width="1227" height="722" alt="image"
src="https://github.com/user-attachments/assets/8bc83289-eeca-47c7-b0c3-8975101901af"
/>

## Mental model

`Ctrl+R` opens a temporary search session owned by the composer. The
footer line becomes the search input, the composer body previews the
current match only after the query has text, and `Enter` accepts that
preview as an editable draft while `Esc` restores the draft that existed
before search started. The history layer provides a combined offset
space over persistent and local history, but search navigation exposes
unique prompt text rather than every physical history row.

## Non-goals

This change does not rewrite stored history, change normal Up/Down
browsing semantics, add fuzzy matching, or add persistent metadata for
attachments in cross-session history. Search deduplication is
deliberately scoped to the active Ctrl+R search session and uses exact
prompt text, so case, whitespace, punctuation, and attachment-only
differences are not normalized.

## Tradeoffs

The implementation keeps search state in the existing composer and
history state machines instead of adding a new cross-module controller.
That keeps ownership local and testable, but it means the composer still
coordinates visible search status, draft restoration, footer rendering,
cursor placement, and match highlighting while `ChatComposerHistory`
owns traversal, async fetch continuation, boundary clamping, and
unique-result caching. Unique-result caching stores cloned
`HistoryEntry` values so known matches can be revisited without cache
lookups; this is simple and robust for interactive search sizes, but it
is not a global history index.

## Architecture

`ChatComposer` detects `Ctrl+R`, snapshots the current draft, switches
the footer to `FooterMode::HistorySearch`, and routes search-mode keys
before normal editing. Query edits call `ChatComposerHistory::search`
with `restart = true`, which starts from the newest combined-history
offset. Repeated `Ctrl+R` or Up searches older; Down searches newer
through already discovered unique matches or continues the scan.
Persistent history entries still arrive asynchronously through
`on_entry_response`, where a pending search either accepts the response,
skips a duplicate, or requests the next offset.

The composer-facing pieces now live in
`codex-rs/tui/src/bottom_pane/chat_composer/history_search.rs`, leaving
`chat_composer.rs` responsible for routing and rendering integration
instead of owning every search helper inline.
`codex-rs/tui/src/bottom_pane/chat_composer_history.rs` remains the
owner of stored history, combined offsets, async fetch state, boundary
semantics, and duplicate suppression. Match highlighting is computed
from the current composer text while search is active and disappears
when the match is accepted.

## Observability

There are no new logs or telemetry. The practical debug path is state
inspection: `ChatComposer.history_search` tells whether the footer query
is idle, searching, matched, or unmatched; `ChatComposerHistory.search`
tracks selected raw offsets, pending persistent fetches, exhausted
directions, and unique match cache state. If a user reports skipped or
repeated results, first inspect the exact stored prompt text, the
selected offset, whether an async persistent response is still pending,
and whether a query edit restarted the search session.

## Tests

The change is covered by focused `codex-tui` unit tests for opening
search without previewing the latest entry, accepting and canceling
search, no-match restoration, boundary clamping, footer hints,
case-insensitive highlighting, local duplicate skipping, and persistent
duplicate skipping through async responses. Snapshot coverage captures
the footer-mode visual changes. Local verification used `just fmt`,
`cargo test -p codex-tui history_search`, `cargo test -p codex-tui`, and
`just fix -p codex-tui`.

* Remove context status-line meter (openai#17420)

Addresses openai#17313

Problem: The visual context meter in the status line was confusing and
continued to draw negative feedback, and context reporting should remain
an explicit opt-in rather than part of the default footer.

Solution: Remove the visual meter, restore opt-in context remaining/used
percentage items that explicitly say "Context", keep existing
context-usage configs working as a hidden alias, and update the setup
text and snapshots.

* Expose instruction sources (AGENTS.md) via app server (openai#17506)

Addresses openai#17498

Problem: The TUI derived /status instruction source paths from the local
client environment, which could show stale <none> output or incorrect
paths when connected to a remote app server.

Solution: Add an app-server v2 instructionSources snapshot to thread
start/resume/fork responses, default it to an empty list when older
servers omit it, and render TUI /status from that server-provided
session data.

Additional context: The app-server field is intentionally named
instructionSources rather than AGENTS.md-specific terminology because
the loaded instruction sources can include global instructions, project
AGENTS.md files, AGENTS.override.md, user-defined instruction files, and
future dynamic sources.

* fix(mcp) pause timer for elicitations (openai#17566)

## Summary
Stop counting elicitation time towards mcp tool call time. There are
some tradeoffs here, but in general I don't think time spent waiting for
elicitations should count towards tool call time, or at least not
directly towards timeouts.

Elicitations are not exactly like exec_command escalation requests, but
I would argue it's ~roughly equivalent.

## Testing
- [x] Added unit tests
- [x] Tested locally

* Add MCP tool wall time to model output (openai#17406)

Include MCP wall time in the output so the model is aware of how long
it's calls are taking.

* Run exec-server fs operations through sandbox helper (openai#17294)

## Summary
- run exec-server filesystem RPCs requiring sandboxing through a
`codex-fs` arg0 helper over stdin/stdout
- keep direct local filesystem execution for `DangerFullAccess` and
external sandbox policies
- remove the standalone exec-server binary path in favor of top-level
arg0 dispatch/runtime paths
- add sandbox escape regression coverage for local and remote filesystem
paths

## Validation
- `just fmt`
- `git diff --check`
- remote devbox: `cd codex-rs && bazel test --bes_backend=
--bes_results_url= //codex-rs/exec-server:all` (6/6 passed)

---------

Co-authored-by: Codex <noreply@openai.com>

* Stabilize exec-server process tests (openai#17605)

Problem: After openai#17294 switched exec-server tests to launch the top-level
`codex exec-server` command, parallel remote exec-process cases can
flake while waiting for the child server's listen URL or transport
shutdown.

Solution: Serialize remote exec-server-backed process tests and harden
the harness so spawned servers are killed on drop and shutdown waits for
the child process to exit.

* feat: ignore keyring on 0.0.0 (openai#17221)

To prevent the spammy: 
<img width="424" height="172" alt="Screenshot 2026-04-09 at 13 36 16"
src="https://github.com/user-attachments/assets/b5ece9e3-c561-422f-87ec-041e7bd6813d"
/>

* Build remote exec env from exec-server policy (openai#17216)

## Summary
- add an exec-server `envPolicy` field; when present, the server starts
from its own process env and applies the shell environment policy there
- keep `env` as the exact environment for local/embedded starts, but
make it an overlay for remote unified-exec starts
- move the shell-environment-policy builder into `codex-config` so Core
and exec-server share the inherit/filter/set/include behavior
- overlay only runtime/sandbox/network deltas from Core onto the
exec-server-derived env

## Why
Remote unified exec was materializing the shell env inside Core and
forwarding the whole map to exec-server, so remote processes could
inherit the orchestrator machine's `HOME`, `PATH`, etc. This keeps the
base env on the executor while preserving Core-owned runtime additions
like `CODEX_THREAD_ID`, unified-exec defaults, network proxy env, and
sandbox marker env.

## Validation
- `just fmt`
- `git diff --check`
- `cargo test -p codex-exec-server --lib`
- `cargo test -p codex-core --lib unified_exec::process_manager::tests`
- `cargo test -p codex-core --lib exec_env::tests`
- `cargo test -p codex-core --lib exec_env_tests` (compile-only; filter
matched 0 tests)
- `cargo test -p codex-config --lib shell_environment` (compile-only;
filter matched 0 tests)
- `just bazel-lock-update`

## Known local validation issue
- `just bazel-lock-check` is not runnable in this checkout: it invokes
`./scripts/check-module-bazel-lock.sh`, which is missing.

---------

Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: pakrym-oai <pakrym@openai.com>

* nit: change consolidation model (openai#17633)

* fix: stability exec server (openai#17640)

* fix: dedup compact (openai#17643)

* Make forked agent spawns keep parent model config (openai#17247)

## Summary

When a `spawn_agent` call does a full-history fork, keep the parent's
effective agent type and model configuration instead of applying child
role/model overrides.

This is the minimal config-inheritance slice of openai#16055. Prompt-cache key
inheritance and MCP tool-surface stability are split into follow-up PRs.

## Design

- Reject `agent_type`, `model`, and `reasoning_effort` for v1
`fork_context` spawns.
- Reject `agent_type`, `model`, and `reasoning_effort` for v2
`fork_turns = "all"` spawns.
- Keep v2 partial-history forks (`fork_turns = "N"`) configurable;
requested model/reasoning overrides and role config still apply there.
- Keep non-forked spawn behavior unchanged.

## Tests

- `cargo +1.93.1 test -p codex-core spawn_agent_fork_context --lib`
- `cargo +1.93.1 test -p codex-core multi_agent_v2_spawn_fork_turns
--lib`
- `cargo +1.93.1 test -p codex-core
multi_agent_v2_spawn_partial_fork_turns_allows_agent_type_override
--lib`

* Fix custom tool output cleanup on stream failure (openai#17470)

Addresses openai#16255

Problem: Incomplete Responses streams could leave completed custom tool
outputs out of cleanup and retry prompts, making persisted history
inconsistent and retries stale.

Solution: Route stream and output-item errors through shared cleanup,
and rebuild retry prompts from fresh session history after the first
attempt.

* Emit plan-mode prompt notifications for questionnaires (openai#17417)

Addresses openai#17252

Problem: Plan-mode clarification questionnaires used the generic
user-input notification type, so configs listening for plan-mode-prompt
did not fire when request_user_input waited for an answer.

Solution: Map request_user_input prompts to the plan-mode-prompt
notification and remove the obsolete user-input TUI notification
variant.

* Wrap status reset timestamps in narrow layouts (openai#17481)

Addresses openai#17453

Problem: /status rate-limit reset timestamps can be truncated in narrow
layouts, leaving users with partial times or dates.

Solution: Let narrow rate-limit rows drop the fixed progress bar to
preserve the percent summary, and wrap reset timestamps onto
continuation lines instead of truncating them.

* Suppress duplicate compaction and terminal wait events (openai#17601)

Addresses openai#17514

Problem: PR openai#16966 made the TUI render the deprecated context-compaction
notification, while v2 could also receive legacy unified-exec
interaction items alongside terminal-interaction notifications, causing
duplicate "Context compacted" and "Waited for background terminal"
messages.

Solution: Suppress deprecated context-compaction notifications and
legacy unified-exec interaction command items from the app-server v2
projection, and render canonical context-compaction items through the
existing TUI info-event path.

* Fix TUI compaction item replay (openai#17657)

Problem: PR openai#17601 updated context-compaction replay to call a new
ChatWidget handler, but the handler was never implemented, breaking
codex-tui compilation on main.

Solution: Render context-compaction replay through the existing
info-message path, preserving the intended `Context compacted` UI marker
without adding a one-off handler.

* Do not fail thread start when trust persistence fails (openai#17595)

Addresses openai#17593

Problem: A regression introduced in
openai#16492 made thread/start fail when
Codex could not persist trusted project state, which crashes startup for
users with read-only config.toml.

Solution: Treat trusted project persistence as best effort and keep the
current thread's config trusted in memory when writing config.toml
fails.

* Use AbsolutePathBuf in skill loading and codex_home (openai#17407)

Helps with FS migration later

* feat: disable memory endpoint (openai#17626)

* Include legacy deny paths in elevated Windows sandbox setup (openai#17365)

## Summary

This updates the Windows elevated sandbox setup/refresh path to include
the legacy `compute_allow_paths(...).deny` protected children in the
same deny-write payload pipe added for split filesystem carveouts.

Concretely, elevated setup and elevated refresh now both build
deny-write payload paths from:

- explicit split-policy deny-write paths, preserving missing paths so
setup can materialize them before applying ACLs
- legacy `compute_allow_paths(...).deny`, which includes existing
`.git`, `.codex`, and `.agents` children under writable roots

This lets the elevated backend protect `.git` consistently with the
unelevated/restricted-token path, and removes the old janky hard-coded
`.codex` / `.agents` elevated setup helpers in favor of the shared
payload path.

## Root Cause

The landed split-carveout PR threaded a `deny_write_paths` pipe through
elevated setup/refresh, but the legacy workspace-write deny set from
`compute_allow_paths(...).deny` was not included in that payload. As a
result, elevated workspace-write did not apply the intended deny-write
ACLs for existing protected children like `<cwd>/.git`.

## Notes

The legacy protected children still only enter the deny set if they
already exist, because `compute_allow_paths` filters `.git`, `.codex`,
and `.agents` with `exists()`. Missing explicit split-policy deny paths
are preserved separately because setup intentionally materializes those
before applying ACLs.

## Validation

- `cargo fmt --check -p codex-windows-sandbox`
- `cargo test -p codex-windows-sandbox`
- `cargo build -p codex-cli -p codex-windows-sandbox --bins`
- Elevated `codex exec` smoke with `windows.sandbox='elevated'`: fresh
git repo, attempted append to `.git/config`, observed `Access is
denied`, marker not written, Deny ACE present on `.git`
- Unelevated `codex exec` smoke with `windows.sandbox='unelevated'`:
fresh git repo, attempted append to `.git/config`, observed `Access is
denied`, marker not written, Deny ACE present on `.git`

* feat: Avoid reloading curated marketplaces for tool-suggest discovera… (openai#17638)

- stop `list_tool_suggest_discoverable_plugins()` from reloading the
curated marketplace for each discoverable plugin
- reuse a direct plugin-detail loader against the already-resolved
marketplace entry


The trigger was to stop those logs spamming:
```
d=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/life-science-research/.codex-plugin/plugin.json
2026-04-13T12:27:30.402Z WARN  [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/build-ios-apps/.codex-plugin/plugin.json
2026-04-13T12:27:30.402Z WARN  [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/life-science-research/.codex-plugin/plugin.json
2026-04-13T12:27:30.405Z WARN  [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/build-ios-apps/.codex-plugin/plugin.json
2026-04-13T12:27:30.406Z WARN  [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/life-science-research/.codex-plugin/plugin.json
2026-04-13T12:27:30.408Z WARN  [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/build-ios-apps/.codex-plugin/plugin.json
```

* app-server: Only unload threads which were unused for some time (openai#17398)

Currently app-server may unload actively running threads once the last
connection disconnects, which is not expected.
Instead track when was the last active turn & when there were any
subscribers the last time, also add 30 minute idleness/no subscribers
timer to reduce the churn.

* only specify remote ports when the rule needs them (openai#17669)

Windows gives an error when you combine `protocol = ANY` with
`SetRemotePorts`
This fixes that

* Fix tui compilation (openai#17691)

The recent release broke, codex suggested this as the fix

Source failure:
https://github.com/openai/codex/actions/runs/24362949066/job/71147202092

Probably from
openai@ac82443

For why it got in:
```
The relevant setup:

.github/workflows/rust-ci.yml (line 1) runs on PRs, but for codex-rs it only does:

cargo fmt --check
cargo shear
argument-comment lint via Bazel
no cargo check, no cargo clippy over the workspace, no cargo test over codex-tui
.github/workflows/rust-ci-full.yml (line 1) runs on pushes to main and branches matching **full-ci**. That one does compile TUI because:

codex-rs/Cargo.toml includes "tui" as a workspace member
lint_build runs cargo clippy --target ... --tests --profile ...
the matrix includes both dev and release profiles
tests runs cargo nextest run ..., but only dev-profile tests
Release CI also compiles it indirectly. .github/workflows/rust-release.yml (line 235) builds --bin codex, and cli/Cargo.toml (line 46) depends on codex-tui.
```

Codex tested locally with `cargo check -p codex-tui --release` and was
able to repro, and verified that this fixed it

* Update phase 2 memory model to gpt-5.4 (openai#17384)

### Motivation
- Switch the default model used for memory Phase 2 (consolidation) to
the newer `gpt-5.4` model.

### Description
- Change the Phase 2 model constant from `"gpt-5.3-codex"` to
`"gpt-5.4"` in `codex-rs/core/src/memories/mod.rs`.

### Testing
- Ran `just fmt`, which completed successfully.
- Attempted `cargo test -p codex-core`, but the build failed in this
environment because the `codex-linux-sandbox` crate requires the system
`libcap` pkg-config entry and the required system packages could not be
installed, so the test run was blocked.

------
[Codex
Task](https://chatgpt.com/codex/cloud/tasks/task_i_69d977693b48832a967e78d73c66dc8e)

* Remove unnecessary tests (openai#17395)

# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.

* Cap realtime mirrored user turns (openai#17685)

Cap mirrored user text sent to realtime with the existing 300-token turn
budget while preserving the full model turn.

Adds integration coverage for capped realtime mirror payloads.

---------

Co-authored-by: Codex <noreply@openai.com>

* change realtime tool description (openai#17699)

# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.

* Add `supports_parallel_tool_calls` flag to included mcps (openai#17667)

## Why

For more advanced MCP usage, we want the model to be able to emit
parallel MCP tool calls and have Codex execute eligible ones
concurrently, instead of forcing all MCP calls through the serial block.

The main design choice was where to thread the config. I made this
server-level because parallel safety depends on the MCP server
implementation. Codex reads the flag from `mcp_servers`, threads the
opted-in server names into `ToolRouter`, and checks the parsed
`ToolPayload::Mcp { server, .. }` at execution time. That avoids relying
on model-visible tool names, which can be incomplete in
deferred/search-tool paths or ambiguous for similarly named
servers/tools.

## What was added

Added `supports_parallel_tool_calls` for MCP servers.

Before:

```toml
[mcp_servers.docs]
command = "docs-server"
```

After:

```toml
[mcp_servers.docs]
command = "docs-server"
supports_parallel_tool_calls = true
```

MCP calls remain serial by default. Only tools from opted-in servers are
eligible to run in parallel. Docs also now warn to enable this only when
the server’s tools are safe to run concurrently, especially around
shared state or read/write races.

## Testing

Tested with a local stdio MCP server exposing real delay tools. The
model/Responses side was mocked only to deterministically emit two MCP
calls in the same turn.

Each test called `query_with_delay` and `query_with_delay_2` with `{
"seconds": 25 }`.

| Build/config | Observed | Wall time |
| --- | --- | --- |
| main with flag enabled | serial | `58.79s` |
| PR with flag enabled | parallel | `31.73s` |
| PR without flag | serial | `56.70s` |

PR with flag enabled showed both tools start before either completed;
main and PR-without-flag completed the first delay before starting the
second.

Also added an integration test.

Additional checks:

- `cargo test -p codex-tools` passed
- `cargo test -p codex-core
mcp_parallel_support_uses_exact_payload_server` passed
- `git diff --check` passed

---------

Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
Co-authored-by: Felipe Coury <felipe.coury@openai.com>
Co-authored-by: Eric Traut <etraut@openai.com>
Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
Co-authored-by: pakrym-oai <pakrym@openai.com>
Co-authored-by: starr-openai <starr@openai.com>
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: jif-oai <jif@openai.com>
Co-authored-by: friel-openai <friel@openai.com>
Co-authored-by: iceweasel-oai <iceweasel@openai.com>
Co-authored-by: Ruslan Nigmatullin <ruslan@openai.com>
Co-authored-by: David Z Hao <david.hao@openai.com>
Co-authored-by: Kevin Liu <kevin@kliu.io>
Co-authored-by: josiah-openai <josiah@openai.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants