
feat(llm): Add OpenAI Codex (ChatGPT subscription) as LLM provider#1461

Merged
ilblackdragon merged 7 commits into staging from takeover/744-openai-codex-provider
Mar 20, 2026

Conversation

@ilblackdragon
Member

Summary

Continuation of #744 by @Sanjeev-S.

Adds openai_codex as a new LLM backend so ChatGPT Plus/Pro subscribers can use IronClaw without a separate API key. Includes OAuth device code login, native Responses API client with SSE parsing and tool call round-trips, and a token-refreshing LlmProvider decorator.

Changes from original

  • Merged with latest staging (resolved 4 conflicts: app.rs, CLAUDE.md, mod.rs, wizard.rs)
  • Security: Added SSRF validation on OPENAI_CODEX_AUTH_URL and OPENAI_CODEX_API_URL via validate_base_url() — every other base URL was validated except these two ([HIGH] SSRF risk via configurable embedding base URL without validation #1103)
  • Added missing cache_write_multiplier() and cache_read_discount() trait delegation in TokenRefreshingProvider
  • Capped device-code polling backoff at 60s to prevent unbounded interval growth on repeated 429s
  • Default expires_in to 3600s when server returns 0, preventing immediately-expired sessions
  • Extracted duplicated make_test_jwt() and test_codex_config() into shared codex_test_helpers module
  • Fixed pre-existing SseEvent::JobResult missing fallback_deliverable field in job_monitor.rs tests
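
The polling-backoff cap above can be sketched in a few lines. This is a hypothetical, std-only illustration (the function name and the 5-second step are assumptions, not the PR's actual code):

```rust
use std::time::Duration;

/// Hypothetical sketch of the capped polling backoff: grow the interval
/// by a fixed step on each 429 response, but never beyond 60 seconds.
fn next_poll_interval(current: Duration) -> Duration {
    const STEP: Duration = Duration::from_secs(5);
    const MAX: Duration = Duration::from_secs(60);
    // Saturating growth: repeated 429s converge on MAX instead of
    // growing without bound.
    std::cmp::min(current + STEP, MAX)
}
```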

Original PR

#744 — feat(llm): Add OpenAI Codex (ChatGPT subscription) as LLM provider

Review comments addressed

All review comments from @zmanian and @ilblackdragon were already resolved by the original author in commit bd588e0. The remaining blocker was merge staleness (9 days behind staging) and the SSRF validation gap identified during takeover review.

Test plan

  • cargo fmt — clean
  • cargo clippy --all --benches --tests --examples --all-features — zero warnings
  • All 30 codex-related unit tests pass (including 2 new SSRF regression tests)
  • Manual E2E: device code login → send message → received response

Co-Authored-By: Sanjeev-S <Sanjeev-S@users.noreply.github.com>

Generated with Claude Code

Sanjeev-S and others added 6 commits March 11, 2026 13:51
Add OpenAiCodex as a new LLM backend variant with config for auth
endpoint, API base URL, client ID, and session persistence path.

The session manager implements OpenAI's device code auth flow
(headless-friendly, no browser required on the server) with automatic
token refresh, following the same persistence pattern as the existing
NEAR AI session manager.

Closes #742

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Native Responses API client for chatgpt.com/backend-api/codex/responses,
the endpoint that works with ChatGPT subscription tokens. Handles SSE
streaming, text completions, and tool call round-trips.

Token-refreshing decorator wraps the provider to pre-emptively refresh
OAuth tokens before API calls and retry once on auth failures. Reports
zero cost since billing is through subscription.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…zard

Connect the new provider to the LLM factory, add openai_codex to the
CLI --backend flag, and add it as an option in the onboarding wizard.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Review fixes for the OpenAI Codex provider PR:

- Remove dead `generate_pkce()` code (device flow gets PKCE from server)
- Fix `refresh_tokens()` to use `.form()` instead of `.json()` per OAuth spec
- Inline codex dispatch into `build_provider_chain()` (single async function,
  no separate `assemble_provider_chain()` helper — matches main's pattern)
- Remove Clone from `OpenAiCodexSession`, restrict fields to `pub(crate)`
- Propagate HTTP client builder error instead of silent fallback
- Redact device code response body from debug log
- Change `set_model()` in TokenRefreshingProvider to delegate to inner
- Replace hardcoded `/tmp/` test path with `tempfile::tempdir()`
- Accept `request_timeout_secs` from config instead of hardcoded 300s
- Parse `Retry-After` header on 429 responses (matches nearai_chat.rs pattern)
- Reuse `normalize_schema_strict()` for Codex tool definitions
- Add warning log for dropped image attachments
- Add doc comments on `list_models()` and `include` field
- Add `OPENAI_CODEX_API_URL` to `.env.example`
- Fix codex error message in `create_llm_provider()` for clarity
- Revert unrelated `.worktrees` addition to `.gitignore`
- Update `src/llm/CLAUDE.md` with Codex provider docs

[skip-regression-check]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# Conflicts:
#	src/app.rs
#	src/llm/CLAUDE.md
#	src/llm/mod.rs
#	src/setup/wizard.rs
#744)

Security:
- Add SSRF validation (validate_base_url) on OPENAI_CODEX_AUTH_URL and
  OPENAI_CODEX_API_URL, matching the pattern used by all other base URL
  configs (regression test for #1103 included)

Correctness:
- Add missing cache_write_multiplier() and cache_read_discount() trait
  delegation in TokenRefreshingProvider
- Cap device-code polling backoff at 60s to prevent unbounded interval
  growth on repeated 429 responses
- Default expires_in to 3600s when server returns 0, preventing
  immediately-expired sessions
- Fix pre-existing SseEvent::JobResult missing fallback_deliverable field
  in job_monitor.rs tests

Cleanup:
- Extract duplicated make_test_jwt() and test_codex_config() into shared
  codex_test_helpers module

Co-Authored-By: Sanjeev-S <Sanjeev-S@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 20, 2026 07:38
@github-actions Bot added labels (Mar 20, 2026): scope: agent (Agent core: agent loop, router, scheduler), scope: channel/cli (TUI / CLI channel), scope: llm (LLM integration), scope: setup (Onboarding / setup), scope: docs (Documentation)
@github-actions Bot added labels (Mar 20, 2026): size: XL (500+ changed lines), risk: high (Safety, secrets, auth, or critical infrastructure), contributor: core (20+ merged PRs)
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces comprehensive support for OpenAI Codex, enabling users with ChatGPT subscriptions to seamlessly integrate and utilize this LLM within the application. The changes encompass a new provider implementation, a robust OAuth authentication flow with token management, and critical security enhancements. This significantly expands the range of available LLM backends and improves the user experience for a segment of OpenAI subscribers.

Highlights

  • New LLM Provider: OpenAI Codex (ChatGPT subscription): Integrated OpenAI Codex as a new LLM backend, allowing ChatGPT Plus/Pro subscribers to use IronClaw without needing a separate API key. This leverages OAuth device code login for authentication.
  • Native Responses API Client: Implemented a native client for the OpenAI Responses API, including Server-Sent Events (SSE) parsing and support for tool call round-trips, mirroring OpenClaw's approach.
  • Token-Refreshing LlmProvider Decorator: Introduced a TokenRefreshingProvider decorator that handles pre-emptive token refreshes and retries on authentication failures, ensuring continuous access to the OpenAI Codex service.
  • Enhanced Security: SSRF Validation: Added Server-Side Request Forgery (SSRF) validation for OPENAI_CODEX_AUTH_URL and OPENAI_CODEX_API_URL to prevent potential security vulnerabilities, addressing a previously identified gap.
  • Improved OAuth Flow Robustness: Capped device-code polling backoff at 60 seconds to prevent unbounded interval growth on repeated 429 (Too Many Requests) errors and defaulted expires_in to 3600 seconds when the server returns zero, preventing immediately expired sessions.
  • CLI Login Command: Added a new ironclaw login --openai-codex command to facilitate re-authentication for OpenAI Codex users.
  • Setup Wizard Integration: Integrated OpenAI Codex into the setup wizard, providing a guided experience for configuring this new LLM provider.

Contributor

Copilot AI left a comment


Pull request overview

Adds OpenAI Codex (ChatGPT subscription) as a first-class LLM backend in IronClaw, including OAuth device-code authentication, a Responses API client, and provider wiring so users can run without an OpenAI API key.

Changes:

  • Introduces openai_codex backend with OAuth session persistence + token refresh, and a Responses API provider implementation.
  • Wires Codex into setup wizard + CLI (ironclaw login --openai-codex) and into the LLM provider factory chain.
  • Extends config/env/docs and fixes/updates related tests & CLI snapshots.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.

Summary per file:

  • src/setup/wizard.rs: Adds OpenAI Codex to provider selection and implements device-code login setup flow.
  • src/main.rs: Adds login command handling that triggers Codex device-code authentication.
  • src/llm/token_refreshing.rs: New decorator provider that pre-refreshes tokens and retries on auth failures.
  • src/llm/rig_adapter.rs: Exposes normalize_schema_strict for reuse by Codex tool schema conversion.
  • src/llm/openai_codex_session.rs: New OAuth session manager with persistence, refresh, and device-code flow.
  • src/llm/openai_codex_provider.rs: New Responses API client + SSE event parsing + tool-call extraction.
  • src/llm/models.rs: Ensures LlmConfig builder includes openai_codex field.
  • src/llm/mod.rs: Exports Codex modules and adds a dedicated factory path in build_provider_chain().
  • src/llm/config.rs: Introduces OpenAiCodexConfig and adds it into LlmConfig.
  • src/llm/codex_test_helpers.rs: Shared helpers for Codex unit tests (JWT + config builders).
  • src/llm/CLAUDE.md: Documents Codex provider behavior, auth flow, and env vars.
  • src/config/mod.rs: Re-exports OpenAiCodexConfig; minor comment tweak.
  • src/config/llm.rs: Resolves openai_codex backend, including SSRF validation + tests.
  • src/cli/snapshots/ironclaw__cli__tests__long_help_output.snap: Updates CLI snapshot to include new login command.
  • src/cli/snapshots/ironclaw__cli__tests__long_help_output_without_import.snap: Updates CLI snapshot to include new login command.
  • src/cli/snapshots/ironclaw__cli__tests__help_output.snap: Updates CLI snapshot to include new login command.
  • src/cli/snapshots/ironclaw__cli__tests__help_output_without_import.snap: Updates CLI snapshot to include new login command.
  • src/cli/mod.rs: Adds Login { openai_codex: bool } Clap subcommand definition.
  • src/app.rs: Allows openai_codex backend to bypass registry-provider requirement checks.
  • src/agent/job_monitor.rs: Fixes test struct construction by adding fallback_deliverable.
  • .env.example: Documents openai_codex backend and relevant OPENAI_CODEX_* env vars.


Comment thread src/main.rs Outdated
Comment on lines +145 to +150
let config = Config::from_env()
.await
.map_err(|e| anyhow::anyhow!("{}", e))?;
let codex_config = config.llm.openai_codex.unwrap_or_default();
let mgr = ironclaw::llm::OpenAiCodexSessionManager::new(codex_config)
.map_err(|e| anyhow::anyhow!("{}", e))?;

Copilot AI Mar 20, 2026


ironclaw login --openai-codex loads Config::from_env() but then falls back to OpenAiCodexConfig::default() when config.llm.openai_codex is None. Since LlmConfig::resolve() only populates openai_codex when LLM_BACKEND/settings are set to Codex, this path ignores OPENAI_CODEX_* env overrides (auth/api URL, session path, refresh margin, model). Consider resolving OpenAiCodexConfig directly from env in this command (or temporarily treating the backend as openai_codex) so login uses the same validated config as runtime.

Member Author


Fixed in 3bb817b. The login command now resolves OPENAI_CODEX_* env overrides directly when config.llm.openai_codex is None, so env vars like OPENAI_CODEX_AUTH_URL are honoured regardless of LLM_BACKEND setting.

Comment thread src/setup/wizard.rs Outdated
return Ok(());
}
if current == "openai_codex" {
return self.setup_openai_codex().await;

Copilot AI Mar 20, 2026


When the current provider is openai_codex and the user answers “Keep current provider?”, the wizard calls setup_openai_codex() which triggers a device-code login flow again. That’s a surprising side effect for a “keep” path and may force unnecessary re-auth. Consider mirroring the Bedrock behavior (just keep and return), or doing a non-interactive ensure_authenticated() check and only prompting for login if the session is missing/expired.

Suggested change
return self.setup_openai_codex().await;
// Keeping the existing OpenAI Codex config — avoid re-running the
// full setup flow (device-code login) on a no-op "keep" choice.
print_info("Keeping existing OpenAI Codex configuration.");
return Ok(());

Member Author


Fixed in 3bb817b. The "Keep current provider?" path now mirrors Bedrock: prints "Keeping existing OpenAI Codex configuration." and returns without re-triggering device code login.

Comment on lines +233 to +243
// Read the full body and parse SSE events
let body_bytes = response
.bytes()
.await
.map_err(|e| LlmError::RequestFailed {
provider: "openai_codex".to_string(),
reason: format!("Failed to read response body: {e}"),
})?;

let body_text = String::from_utf8_lossy(&body_bytes);
parse_sse_response(&body_text)

Copilot AI Mar 20, 2026


send_request() advertises SSE streaming (Accept: text/event-stream, stream: true) but then calls response.bytes().await and parses after the full body is buffered. This defeats streaming (slower time-to-first-token) and risks unbounded memory use for long outputs. Consider parsing the response incrementally from bytes_stream() (similar to CodexChatGptProvider::parse_sse_response_stream) and building the output as events arrive.

Member Author


Acknowledged — this is a known limitation noted during the takeover review. Switching to incremental SSE parsing requires a significant refactor (using bytes_stream() + line-buffered parsing). Filed as a follow-up, not addressing in this PR.
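
For reference, the incremental approach the follow-up describes could look roughly like this std-only sketch: a line buffer that is fed arbitrary network chunks and emits complete `data:` payloads as they arrive. Type and method names are hypothetical; a real implementation would feed it from reqwest's `bytes_stream()` rather than strings:

```rust
/// Hypothetical line-buffered SSE splitter. Chunks may cut events at
/// arbitrary positions; complete `data:` lines are emitted as soon as
/// their terminating newline arrives.
struct SseLineBuffer {
    buf: String,
}

impl SseLineBuffer {
    fn new() -> Self {
        Self { buf: String::new() }
    }

    /// Feed one chunk; return the payloads of any now-complete `data:` lines.
    fn push_chunk(&mut self, chunk: &str) -> Vec<String> {
        self.buf.push_str(chunk);
        let mut events = Vec::new();
        while let Some(pos) = self.buf.find('\n') {
            // Remove one complete line (including the newline) from the buffer.
            let line: String = self.buf.drain(..=pos).collect();
            if let Some(data) = line.trim_end().strip_prefix("data:") {
                events.push(data.trim_start().to_string());
            }
        }
        events
    }
}
```

Parsing per line like this bounds memory to one partial line plus the current event, instead of buffering the whole response body.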

Comment thread src/llm/mod.rs Outdated
} else {
create_llm_provider(config, session.clone()).await?
};
tracing::info!("LLM provider initialized: {}", llm.model_name());

Copilot AI Mar 20, 2026


build_provider_chain() now logs provider initialization at info level. This message includes the model name and will fire on every startup; if the rest of the codebase uses debug for this kind of internal wiring message, consider keeping it at debug (or adding structured fields and a more specific target) to avoid noisy logs in production.

Suggested change
tracing::info!("LLM provider initialized: {}", llm.model_name());
tracing::debug!("LLM provider initialized: {}", llm.model_name());

Member Author


Fixed in 3bb817b. Reverted to tracing::debug! to match the rest of the codebase's internal wiring log level.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces OpenAI Codex as a new LLM provider, enabling ChatGPT Plus/Pro subscribers to use IronClaw without a separate API key. The changes include implementing OAuth device code login, a native Responses API client with SSE parsing and tool call round-trips, and a token-refreshing decorator. Key updates also involve integrating SSRF validation for the new endpoints, adding missing trait delegations, capping device-code polling backoff, and fixing a pre-existing SseEvent::JobResult field. The overall implementation is robust, with good test coverage for the new features and security considerations. The addition of openai_codex_test_helpers is a good practice for shared test utilities.

Comment thread src/cli/mod.rs
Comment on lines +248 to +250
/// Authenticate with OpenAI Codex (ChatGPT subscription)
#[arg(long)]
openai_codex: bool,
Contributor


medium

The Login command currently only supports openai_codex. Consider adding a more generic --provider <name> argument or a subcommand structure for future extensibility, rather than adding a new flag for each provider. This would make the CLI more scalable as more providers are integrated.

Member Author


Acknowledged as a follow-up. The boolean flag pattern works for now and the current implementation is from the original PR. Will revisit if/when more OAuth providers are added.

Comment on lines +205 to +212
if let Ok(dt) = chrono::DateTime::parse_from_rfc2822(v.trim()) {
let now = chrono::Utc::now();
let delta = dt.signed_duration_since(now);
return Some(std::time::Duration::from_secs(
delta.num_seconds().max(0) as u64
));
}
None
Contributor


medium

The chrono::DateTime::parse_from_rfc2822 function is used here, but RFC 2822 is typically for email dates. Given that expires_at is an ISO-8601 timestamp in UserCodeResponse, it might be more appropriate to use chrono::DateTime::parse_from_rfc3339 for consistency and accuracy when parsing Retry-After headers, as RFC 3339 is a profile of ISO 8601. If the header can indeed be RFC 2822, then this is fine, but it's worth double-checking the expected format from the OpenAI API.

Member Author


Not a bug. The Retry-After header uses HTTP-date format (RFC 7231 §7.1.3), which is IMF-fixdate — a profile of RFC 2822. Using parse_from_rfc2822 is correct here. The expires_at in UserCodeResponse (ISO 8601) is a different field parsed elsewhere with parse_from_rfc3339.
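
As a concrete illustration of the two formats: Retry-After carries either delta-seconds or an HTTP-date. A std-only sketch of the delta-seconds branch (the HTTP-date branch is the part the PR handles with chrono's parse_from_rfc2822; the helper name here is hypothetical):

```rust
/// Hypothetical parser for the delta-seconds form of Retry-After
/// (RFC 7231 §7.1.3). Returns None for anything else, e.g. the
/// HTTP-date form, which a real implementation would hand to a date
/// parser such as chrono's `DateTime::parse_from_rfc2822`.
fn retry_after_delta_seconds(value: &str) -> Option<u64> {
    value.trim().parse::<u64>().ok()
}
```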

Comment on lines +416 to +420
let args_str = if tc.arguments.is_string() {
tc.arguments.as_str().unwrap_or("{}").to_string()
} else {
tc.arguments.to_string()
};
Contributor


medium

The tc.arguments field is converted to a string using to_string() if it's not already a string. This might lead to issues if the arguments are complex JSON objects that need to be passed as-is, rather than as a string representation. The serde_json::Value type should ideally be passed directly or serialized in a way that preserves its structure, especially for tool calls where the arguments are expected to be a JSON object.

Member Author


This is correct behavior. The Responses API function_call item expects arguments as a JSON string (not a nested object). When tc.arguments is already a serde_json::Value::String, we extract it; when it's an object, we serialize to string. This matches the API spec.

Comment on lines +620 to +623
let arguments: serde_json::Value =
serde_json::from_str(&state.arguments).unwrap_or_else(|_| {
serde_json::Value::String(state.arguments.clone())
});
Contributor


medium

The fallback logic for serde_json::from_str(&state.arguments).unwrap_or_else(|_| serde_json::Value::String(state.arguments.clone())) could potentially mask parsing errors. If state.arguments is expected to be a JSON string, a parsing failure should probably be treated as an error rather than silently converting it to a string literal. This could hide malformed tool call arguments.

References
  1. This comment aligns with the principle of avoiding silent failures when handling critical data. Using unwrap_or_else to fallback to a string literal instead of failing on a parsing error for state.arguments can mask malformed input, similar to how #[serde(default)] can hide missing critical data.

Member Author


This is intentionally defensive. The arguments are accumulated from SSE stream chunks. If the final concatenated string isn't valid JSON (e.g., truncated stream), wrapping it as a string value lets the tool handler see the raw text rather than aborting the entire response. The agent loop will handle the malformed arguments gracefully.

Comment on lines +463 to +467
+ chrono::Duration::seconds(if token_resp.expires_in > 0 {
token_resp.expires_in
} else {
3600
} as i64),
Contributor


medium

The expires_in field from the TokenResponse is used to calculate expires_at. If expires_in is 0, it defaults to 3600 seconds. While this prevents immediately expired sessions, an expires_in of 0 might indicate an issue with the token or the server's response. It might be safer to log a warning or treat expires_in == 0 as an error condition, or at least use a more conservative default, to avoid silently extending potentially invalid token lifetimes.

References
  1. Defaulting expires_in to 3600 seconds when it's 0 for a TokenResponse field can lead to silent failures by extending the lifetime of a potentially invalid token. This aligns with the rule to avoid default values for critical fields where missing or unexpected data should result in an explicit failure.

Member Author


Good point. Added a tracing::warn! when expires_in=0 before defaulting to 3600s, in 3bb817b. This makes the fallback visible in logs for debugging without failing hard (since in practice this usually means the server omitted the field rather than indicating an error).
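
The guarded default, with the warning made visible, amounts to something like this std-only sketch (eprintln! stands in for tracing::warn!; the function name is hypothetical):

```rust
/// Hypothetical guard for a token response's lifetime: keep a positive
/// expires_in, otherwise warn and assume one hour rather than creating
/// an immediately-expired session.
fn effective_expires_in(expires_in: i64) -> i64 {
    if expires_in > 0 {
        expires_in
    } else {
        // Stand-in for tracing::warn! so the fallback is visible in logs.
        eprintln!("token response had expires_in={expires_in}; defaulting to 3600s");
        3600
    }
}
```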

Comment on lines +536 to +540
+ chrono::Duration::seconds(if token_resp.expires_in > 0 {
token_resp.expires_in
} else {
3600
} as i64),
Contributor


medium

Similar to the previous comment, the expires_in field from the TokenResponse is defaulted to 3600 seconds if it's 0. This behavior should be consistent and potentially re-evaluated. If the server explicitly sends expires_in: 0, it might imply immediate expiration or an invalid token, which should be handled more explicitly.

References
  1. This comment reiterates the concern from the previous comment regarding the expires_in field. Explicitly handling expires_in: 0 as a potential error or immediate expiration, rather than defaulting, prevents silent failures and aligns with the rule's guidance on critical fields in deserialized data.

Member Author


Same as above — added warning log in 3bb817b for both occurrences (device code login and token refresh paths).

Comment on lines +53 to +57
if let Err(e) = self.update_inner_token().await {
tracing::warn!(
"Pre-emptive token update failed: {e}, will retry on auth failure"
);
}
Contributor


medium

The tracing::warn! message here indicates that a pre-emptive token update failed and that the system will "retry on auth failure". This implies that the ensure_fresh_token function might not be fully effective in preventing AuthFailed errors from reaching the main complete or complete_with_tools methods. It might be clearer to explicitly state that the token update is part of a best-effort pre-check and that the retry mechanism in the complete methods will handle the actual AuthFailed error.

Member Author


The message accurately describes the behavior — ensure_fresh_token is explicitly documented as a best-effort pre-check (see doc comment on line 45-48). The retry-on-auth-failure path in complete()/complete_with_tools() handles the actual recovery. No change needed.
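
The overall shape of that contract (best-effort pre-check, then exactly one hard retry on an auth failure) can be sketched std-only. Names and the string-based error check are hypothetical simplifications of the decorator's actual types:

```rust
/// Hypothetical sketch of the TokenRefreshingProvider call pattern:
/// a best-effort refresh before the call, then a single retry (with a
/// mandatory refresh) when the call reports an auth failure.
fn call_with_refresh(
    mut refresh: impl FnMut() -> Result<(), String>,
    mut call: impl FnMut() -> Result<String, String>,
) -> Result<String, String> {
    // Best-effort pre-check: a failure here is only logged, because the
    // retry path below is the actual recovery mechanism.
    if let Err(e) = refresh() {
        eprintln!("Pre-emptive token refresh failed: {e}, will retry on auth failure");
    }
    match call() {
        Err(e) if e.contains("auth") => {
            refresh()?; // this time a refresh failure is fatal
            call() // single retry; a second auth failure propagates
        }
        other => other,
    }
}
```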

Comment on lines +60 to +62
tracing::warn!(
"Pre-emptive token refresh failed: {e}, will retry on auth failure"
);
Contributor


medium

The warning message for a failed pre-emptive token refresh is informative. However, if refresh_tokens() fails, it might be useful to include the specific error e in the log message to provide more context for debugging, rather than just stating that it failed and will retry on auth failure.

References
  1. This comment suggests including the specific error e in the log message for a failed token refresh. This aligns with the rule's emphasis on capturing debugging information in error logs to provide more context for failure analysis.

Member Author


The error IS included in the format string ("Pre-emptive token refresh failed: {e}, will retry on auth failure"); {e} expands to the full error. No change needed.

Comment thread src/main.rs Outdated
let config = Config::from_env()
.await
.map_err(|e| anyhow::anyhow!("{}", e))?;
let codex_config = config.llm.openai_codex.unwrap_or_default();
Contributor


medium

Using unwrap_or_default() on config.llm.openai_codex here means that if openai_codex is None (i.e., the backend is not configured as openai_codex), it will still proceed with a default OpenAiCodexConfig. This might lead to unexpected behavior if the user intended to use a different backend but accidentally triggered the login --openai-codex command. It might be more robust to return an error or a more explicit message if the openai_codex config is not present when attempting to log in.

References
  1. This comment advises against using unwrap_or_default() in application logic when a None value for config.llm.openai_codex could lead to unexpected behavior or silently incorrect logic. This mirrors the rule's guidance to avoid unwrap_or() in tests for similar reasons, preferring explicit failure or error handling over silent fallbacks.

Member Author


Fixed in 3bb817b. The login command now resolves OPENAI_CODEX_* env overrides via std::env::var() when config.llm.openai_codex is None, rather than silently using defaults.

- Login command now resolves OPENAI_CODEX_* env overrides even when
  LLM_BACKEND isn't set to openai_codex (Copilot review)
- Setup wizard "Keep current provider?" for codex no longer re-triggers
  device code login — mirrors Bedrock's keep-and-return pattern (Copilot)
- Revert provider init log from info back to debug (Copilot)
- Add warning log when token expires_in=0, before defaulting to 3600s
  (Gemini review)

Co-Authored-By: Sanjeev-S <Sanjeev-S@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude

claude Bot commented Mar 20, 2026

Code review

No issues found.

@claude

claude Bot commented Mar 20, 2026

Code review

Found 5 issues:

  1. [CRITICAL:100] Blocking I/O during provider initialization

    • OpenAiCodexSessionManager::new() performs synchronous std::fs::read_to_string() on session file
    • Blocks tokio runtime during app startup, violates CLAUDE.md "All I/O is async with tokio" requirement
    • Fix: Make new() async or defer session load to lazy-loading pattern
  2. [HIGH:90] Full SSE response body buffered into memory

    • send_request() calls response.bytes().await buffering entire SSE response before parsing
    • Unbounded memory consumption for large completions with multiple tool calls
    • Fix: Implement streaming SSE parsing using response.bytes_stream() with line-buffered reader
  3. [MEDIUM:85] Polling interval escalation during rate limiting can stall

    • Device auth polling increases interval by 5s per attempt, capping at 60s (simple linear backoff, not exponential)
    • No jitter applied, causes thundering herd risk when rate limit lifts
    • Fix: Use exponential backoff with jitter (min(interval * 1.5, max))
  4. [MEDIUM:75] No overall timeout on device code polling loop

    • Individual polls have 30s timeout, but entire auth flow can stall if server hangs
    • Polling deadline prevents infinite loops but allows long waits at 60s interval caps
    • Verify HTTP client timeout applies to all polling requests
  5. [LOW:50] No backoff on pre-emptive token refresh failures

    • If refresh fails (e.g., refresh_token expired), every API call retries immediately
    • Doesn't follow standard retry patterns; can hammer auth endpoints
    • Fix: Track consecutive refresh failures, apply circuit-breaker or backoff
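Issue 2's fix (line-buffered streaming instead of buffering the whole body) can be sketched with a std-only parser fed incremental chunks; the SSE `data:` field format is standard, but the type and method names are illustrative:

```rust
// Accumulates network chunks and emits complete `data:` payloads as soon
// as their line terminator arrives, so the full response body is never
// held in memory at once.
struct SseLineBuffer {
    buf: String,
}

impl SseLineBuffer {
    fn new() -> Self {
        Self { buf: String::new() }
    }

    /// Feed one chunk; return any `data:` payloads completed by it.
    fn push_chunk(&mut self, chunk: &str) -> Vec<String> {
        self.buf.push_str(chunk);
        let mut events = Vec::new();
        while let Some(pos) = self.buf.find('\n') {
            let line: String = self.buf.drain(..=pos).collect();
            let line = line.trim_end(); // also strips a trailing '\r'
            if let Some(data) = line.strip_prefix("data: ") {
                events.push(data.to_string());
            }
        }
        events
    }
}

fn main() {
    let mut parser = SseLineBuffer::new();
    // A payload split across two chunks is only emitted once complete.
    assert!(parser.push_chunk("data: par").is_empty());
    assert_eq!(
        parser.push_chunk("tial\ndata: second\n"),
        vec!["partial", "second"]
    );
}
```

In the real provider this would sit behind `response.bytes_stream()`, draining each chunk as it arrives instead of calling `response.bytes()`.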

@claude

claude Bot commented Mar 20, 2026

Code review (Architecture & Design Issues)

Found 8 additional issues:

  1. [CRITICAL:100] Trait delegation in TokenRefreshingProvider may be incomplete

    • Cache methods (cache_write_multiplier(), cache_read_discount()) are delegated, but need verification that ALL optional LlmProvider methods are delegated
    • Methods to check: list_models(), model_metadata(), active_model_name(), effective_model_name(), set_model()
    • Fix: Audit all trait methods and ensure decorator delegates everything to inner provider
  2. [HIGH:100] Unwrap in production code at src/llm/codex_chatgpt.rs:726

    • serde_json::from_str(&tc.arguments).unwrap_or_else(...) violates CLAUDE.md "No unwrap/expect in production code"
    • While the unwrap has a fallback, the pattern itself violates the explicit rule
    • Fix: Use Result types or match expressions instead
  3. [HIGH:100] Factory function dispatch logic violation in mod.rs

    • create_llm_provider() rejects openai_codex backend with error instead of handling internally
    • Error message tells caller "Use build_provider_chain() instead" — violates module encapsulation
    • CLAUDE.md mandates: "Module-owned initialization logic must live in owning module as public factory function"
    • Fix: Handle openai_codex dispatch transparently in factory, don't defer to caller
  4. [HIGH:75] Hardcoded Codex CLI integration

    • codex_auth.rs tightly coupled to the `~/.codex/auth.json` path and Codex CLI file format
    • Violates "Prefer generic/extensible architectures over hardcoding specific integrations"
    • Fix: Consider a generic OAuth credential storage abstraction
  5. [MEDIUM:100] Configuration structure inconsistency

    • OpenAiCodexConfig lacks the factory methods that NearAiConfig provides
    • Three backends use three different patterns (NearAiConfig with session manager, BedrockConfig with AWS SDK, CodexConfig with custom Responses API)
    • Creates maintainability debt for future backends
    • Fix: Standardize factory patterns across all LLM backends
  6. [MEDIUM:75] Potential race condition in concurrent 401 retries

    • codex_chatgpt.rs:361-365 checks 401 before acquiring refresh_lock
    • Window between "401 detected" and "lock acquired" allows multiple concurrent refresh attempts
    • Mitigation exists (double-check after lock), but this is defensive programming against an avoidable race
    • Fix: Acquire the lock before checking 401, or use an atomic swap for the token check
  7. [LOW:50] Error type specificity lost in LlmError mapping

    • Multiple distinct errors (SSE corruption, token refresh failure, HTTP errors) all map to RequestFailed or AuthFailed
    • Lower priority since shared across all providers, but reduces debuggability
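The "acquire the lock, then re-check" pattern from the race-condition item can be sketched with a generation counter; the types and method names below are illustrative, not the provider's real API:

```rust
use std::sync::Mutex;

// Each refresh bumps a generation number. A caller that saw a 401 against
// an old generation re-checks under the lock: if another caller already
// refreshed, it skips the duplicate refresh.
struct Refresher {
    generation: Mutex<u64>,
    refresh_count: Mutex<u64>, // counts actual refreshes, for the demo
}

impl Refresher {
    fn new() -> Self {
        Self { generation: Mutex::new(0), refresh_count: Mutex::new(0) }
    }

    /// Called after observing a 401 issued against `seen_generation`.
    fn refresh_if_stale(&self, seen_generation: u64) {
        let mut gen = self.generation.lock().unwrap();
        // Double-check under the lock: another caller may have refreshed
        // between our 401 and acquiring the lock.
        if *gen == seen_generation {
            *self.refresh_count.lock().unwrap() += 1; // the real token refresh
            *gen += 1;
        }
    }
}

fn main() {
    let r = Refresher::new();
    r.refresh_if_stale(0); // first 401 triggers a refresh
    r.refresh_if_stale(0); // a stale concurrent retry is a no-op
    assert_eq!(*r.refresh_count.lock().unwrap(), 1);
    assert_eq!(*r.generation.lock().unwrap(), 1);
}
```

(The `unwrap()` on lock results is demo shorthand; production code would map poisoning to an error per the project's no-unwrap rule.)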

@claude

claude Bot commented Mar 20, 2026

Code review (Security & Safety Issues)

Found 4 security-related issues:

  1. [MEDIUM:75] No symlink validation on session file path
  • Session file written to ~/.ironclaw/openai_codex_session.json without symlink checks
  • Attacker with local filesystem access could create symlink pointing to other users' files
  • Mitigated by 0600 file permissions, but risk persists at directory/parent level on shared systems
  • Fix: Use std::fs::canonicalize() or check is_symlink() before write
  1. [MEDIUM:50] Session file created without TOCTOU protection
  • Config allows OPENAI_CODEX_SESSION_PATH env var override without validation
  • If directory is traversable, attacker could pre-create directory at path before app runs
  • Fix: Validate that session_path parent is owned by user before write
  1. [LOW:50] JWT parsing doesn't validate signature (by design, acceptable)
  • extract_account_id() decodes JWT without signature verification
  • Acceptable because: token comes from OpenAI's HTTPS endpoint, only payload extracted for internal use, actual API calls use full Bearer token
  • If this pattern spreads elsewhere, could become a risk
  1. [LOW:25] Unbounded HashMap during SSE parsing (mitigated)
  • parse_sse_response() uses HashMap for in-flight function calls without documented limit
  • Mitigated in practice: streaming processed in real-time, HashMap cleared after response, OpenAI API likely enforces limits
  • Recommendation: Add configurable cap or warning if size exceeds threshold (e.g., 100 calls)
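The symlink check suggested in item 1 can be sketched with `std::fs::symlink_metadata`, which does not follow links; the function name and error message are illustrative:

```rust
use std::fs;
use std::io;
use std::path::Path;

// Refuse to write through a symlink planted at the session-file path.
// symlink_metadata() inspects the path itself rather than its target.
fn refuse_symlink(path: &Path) -> io::Result<()> {
    match fs::symlink_metadata(path) {
        Ok(meta) if meta.file_type().is_symlink() => Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "session path is a symlink; refusing to write",
        )),
        Ok(_) => Ok(()),  // regular file: fine to overwrite
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(()), // fresh path: fine to create
        Err(e) => Err(e),
    }
}

fn main() {
    let dir = std::env::temp_dir();
    let file = dir.join("codex_session_demo.json");
    fs::write(&file, b"{}").expect("write demo file");
    assert!(refuse_symlink(&file).is_ok()); // regular file passes
    assert!(refuse_symlink(&dir.join("does_not_exist.json")).is_ok()); // missing path passes
    fs::remove_file(&file).ok();
}
```

This closes the symlink case at the file itself; parent-directory ownership (item 2) would still need a separate check.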

Positive Security Findings:
✓ SSRF validation on both auth and API URLs
✓ Session file permissions set to 0600 (Unix)
✓ Secrets handled with SecretString (not plain strings)
✓ No token/secret details in logs
✓ Proper error handling with context
✓ Device code polling backoff capped at 60s
✓ Token expiry defaulted to 3600s (not 0)
✓ Concurrent token refresh protected by renewal_lock Mutex
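Two of the hardening checks ticked above (the 60s backoff cap and the 3600s expiry default) reduce to small pure functions; this is a sketch with the same constants, under hypothetical names:

```rust
use std::time::Duration;

// Linear 5s escalation on device-code polling, capped so repeated 429s
// can't grow the interval without bound.
fn next_poll_interval(current_secs: u64) -> Duration {
    Duration::from_secs((current_secs + 5).min(60))
}

// A server-reported expires_in of 0 would make the session expire
// immediately; default it to one hour instead.
fn normalize_expires_in(server_value: u64) -> u64 {
    if server_value == 0 { 3600 } else { server_value }
}

fn main() {
    assert_eq!(next_poll_interval(5), Duration::from_secs(10));
    assert_eq!(next_poll_interval(58), Duration::from_secs(60)); // clamped
    assert_eq!(next_poll_interval(60), Duration::from_secs(60)); // stays at cap
    assert_eq!(normalize_expires_in(0), 3600);
    assert_eq!(normalize_expires_in(7200), 7200);
}
```

Keeping these as pure functions also makes the regression tests mentioned in the PR summary trivial to write.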

@claude

claude Bot commented Mar 20, 2026

Code review (Bug Scan Results): NO OBVIOUS BUGS DETECTED. Comprehensive diff-only scan found no crashes, data corruption, or critical logic errors. Final summary: 17 total issues found across 4 categories (2 CRITICAL, 4 HIGH, 4 MEDIUM, 3 LOW). Recommend addressing CRITICAL items before merge.

@ilblackdragon
Member Author

This was reviewed in #744 - rebased and re-reviewed

@ilblackdragon ilblackdragon merged commit 3da9810 into staging Mar 20, 2026
14 checks passed
@ilblackdragon ilblackdragon deleted the takeover/744-openai-codex-provider branch March 20, 2026 15:14
zmanian pushed a commit that referenced this pull request Mar 21, 2026
…1461)

* feat(llm): add OpenAI Codex backend config and OAuth session manager

Add OpenAiCodex as a new LLM backend variant with config for auth
endpoint, API base URL, client ID, and session persistence path.

The session manager implements OpenAI's device code auth flow
(headless-friendly, no browser required on the server) with automatic
token refresh, following the same persistence pattern as the existing
NEAR AI session manager.

Closes #742

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(llm): add Responses API client and token-refreshing decorator

Native Responses API client for chatgpt.com/backend-api/codex/responses,
the endpoint that works with ChatGPT subscription tokens. Handles SSE
streaming, text completions, and tool call round-trips.

Token-refreshing decorator wraps the provider to pre-emptively refresh
OAuth tokens before API calls and retry once on auth failures. Reports
zero cost since billing is through subscription.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(llm): wire OpenAI Codex into provider factory, CLI, and setup wizard

Connect the new provider to the LLM factory, add openai_codex to the
CLI --backend flag, and add it as an option in the onboarding wizard.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(llm): address PR #744 review feedback (20 items)

Review fixes for the OpenAI Codex provider PR:

- Remove dead `generate_pkce()` code (device flow gets PKCE from server)
- Fix `refresh_tokens()` to use `.form()` instead of `.json()` per OAuth spec
- Inline codex dispatch into `build_provider_chain()` (single async function,
  no separate `assemble_provider_chain()` helper — matches main's pattern)
- Remove Clone from `OpenAiCodexSession`, restrict fields to `pub(crate)`
- Propagate HTTP client builder error instead of silent fallback
- Redact device code response body from debug log
- Change `set_model()` in TokenRefreshingProvider to delegate to inner
- Replace hardcoded `/tmp/` test path with `tempfile::tempdir()`
- Accept `request_timeout_secs` from config instead of hardcoded 300s
- Parse `Retry-After` header on 429 responses (matches nearai_chat.rs pattern)
- Reuse `normalize_schema_strict()` for Codex tool definitions
- Add warning log for dropped image attachments
- Add doc comments on `list_models()` and `include` field
- Add `OPENAI_CODEX_API_URL` to `.env.example`
- Fix codex error message in `create_llm_provider()` for clarity
- Revert unrelated `.worktrees` addition to `.gitignore`
- Update `src/llm/CLAUDE.md` with Codex provider docs

[skip-regression-check]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address review feedback and harden OpenAI Codex provider (takeover #744)

Security:
- Add SSRF validation (validate_base_url) on OPENAI_CODEX_AUTH_URL and
  OPENAI_CODEX_API_URL, matching the pattern used by all other base URL
  configs (regression test for #1103 included)

Correctness:
- Add missing cache_write_multiplier() and cache_read_discount() trait
  delegation in TokenRefreshingProvider
- Cap device-code polling backoff at 60s to prevent unbounded interval
  growth on repeated 429 responses
- Default expires_in to 3600s when server returns 0, preventing
  immediately-expired sessions
- Fix pre-existing SseEvent::JobResult missing fallback_deliverable field
  in job_monitor.rs tests

Cleanup:
- Extract duplicated make_test_jwt() and test_codex_config() into shared
  codex_test_helpers module

Co-Authored-By: Sanjeev-S <Sanjeev-S@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address PR review feedback on OpenAI Codex provider (#1461)

- Login command now resolves OPENAI_CODEX_* env overrides even when
  LLM_BACKEND isn't set to openai_codex (Copilot review)
- Setup wizard "Keep current provider?" for codex no longer re-triggers
  device code login — mirrors Bedrock's keep-and-return pattern (Copilot)
- Revert provider init log from info back to debug (Copilot)
- Add warning log when token expires_in=0, before defaulting to 3600s
  (Gemini review)

Co-Authored-By: Sanjeev-S <Sanjeev-S@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Sanjeev Suresh <Sanjeev-S@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
zmanian pushed a commit that referenced this pull request Mar 21, 2026
bkutasi pushed a commit to bkutasi/ironclaw that referenced this pull request Mar 28, 2026
drchirag1991 pushed a commit to drchirag1991/ironclaw that referenced this pull request Apr 8, 2026

Labels

contributor: core (20+ merged PRs)
risk: high (Safety, secrets, auth, or critical infrastructure)
scope: agent (Agent core: agent loop, router, scheduler)
scope: channel/cli (TUI / CLI channel)
scope: docs (Documentation)
scope: llm (LLM integration)
scope: setup (Onboarding / setup)
size: XL (500+ changed lines)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants