Fix CI approval flows and stale fixtures by henrypark133 · Pull Request #1478 · nearai/ironclaw

henrypark133 · 2026-03-20T18:25:22Z

Summary

update stale config URL fixtures so the Rust test suites stop cascading via ENV_MUTEX poisoning
allow explicit approval submissions to target an already-loaded owner thread by UUID across channels
update the owner-scope E2E to approve full-job routine creation through the web API and remove the stale staging CI triage plan doc
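
For readers unfamiliar with the cascade mentioned in the first item, here is a minimal sketch of mutex poisoning in Rust. It assumes `ENV_MUTEX` is a `std::sync::Mutex` that serializes tests touching environment variables; the actual definition is not shown in this PR, and the function names below are illustrative only.

```rust
use std::sync::Mutex;
use std::thread;

// Stand-in for the suite-wide lock; the real ENV_MUTEX definition is an assumption here.
static ENV_MUTEX: Mutex<()> = Mutex::new(());

/// Models a test that panics (for example on a stale fixture) while holding the lock.
fn test_with_stale_fixture() {
    let _guard = ENV_MUTEX.lock().unwrap();
    panic!("stale fixture rejected");
}

/// Models an unrelated test that only needs to take the lock.
fn unrelated_test() -> Result<(), String> {
    ENV_MUTEX
        .lock()
        .map(|_guard| ())
        .map_err(|_| "ENV_MUTEX is poisoned".to_string())
}

/// Runs the failing test on its own thread, then reports whether the first
/// test panicked and whether the unrelated test could still take the lock.
fn run_cascade() -> (bool, Result<(), String>) {
    let first_panicked = thread::spawn(test_with_stale_fixture).join().is_err();
    (first_panicked, unrelated_test())
}
```

Calling `run_cascade()` reports that the first test panicked and that the second one then fails on the poisoned lock, even though it has nothing to do with the fixture. One common mitigation is to recover the guard with `PoisonError::into_inner`; this PR instead removes the original panic by fixing the fixtures.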

Testing

tests/e2e/.venv/bin/pytest tests/e2e/scenarios/test_owner_scope.py -q

gemini-code-assist · 2026-03-20T18:25:40Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the robustness of CI processes and improving the flexibility of approval workflows. It addresses issues with stale test configurations that could lead to cascading failures and introduces a more versatile mechanism for handling cross-channel approval submissions, particularly for routine creation. The changes aim to streamline development and testing by ensuring more reliable and adaptable system interactions.

Highlights

CI Approval Flows: Implemented the ability for explicit approval submissions to target an already-loaded owner thread by its UUID, allowing cross-channel approvals (e.g., web UI approving work from HTTP channels).
Stale Fixture Updates: Updated stale configuration URL fixtures in Rust test suites to prevent ENV_MUTEX poisoning and ensure test stability.
E2E Test Refinement: Updated the owner-scope end-to-end tests to validate the approval of full-job routine creation through the web API, and removed an outdated staging CI triage plan document.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

The pull request effectively addresses the stated goals of fixing CI approval flows and updating stale fixtures. The removal of the outdated triage plan document and the test fixture updates contribute to a cleaner and more relevant codebase. The core functional change in src/agent/agent_loop.rs to allow approval submissions to target specific threads by UUID across channels is a valuable enhancement for the web approval UI. The corresponding updates to the E2E tests in tests/e2e/scenarios/test_owner_scope.py correctly reflect and validate this new approval mechanism. Overall, the changes are well-implemented and improve the system's functionality and test coverage.

gemini-code-assist · 2026-03-20T18:28:03Z

+            } else {
+                drop(sess);
+                self.session_manager
+                    .resolve_thread(
+                        &message.user_id,
+                        &message.channel,
+                        message.conversation_scope(),
+                    )
+                    .await
+            }


The self.session_manager.resolve_thread(...) call is duplicated here and in the final else block. While functionally correct, consider extracting this logic into a helper function or a dedicated method within SessionManager to reduce repetition and improve readability. For instance, SessionManager could expose a method like resolve_or_activate_thread_for_approval that encapsulates this specific logic.

Suggested change

            } else {
                drop(sess);
                self.session_manager
                    .resolve_thread(
                        &message.user_id,
                        &message.channel,
                        message.conversation_scope(),
                    )
                    .await
            }

            } else {
                // If the target_thread_id is not in the current session, fall back to generic resolution.
                // This might create a new thread or resolve an existing one based on channel/scope.
                self.session_manager
                    .resolve_thread(
                        &message.user_id,
                        &message.channel,
                        message.conversation_scope(),
                    )
                    .await
            }

References

When an issue is found in duplicated code, prefer refactoring into a shared function over applying localized fixes.

Consolidate related sequences of operations, such as creating, persisting, and scheduling a job, into a single reusable method to improve code consistency and maintainability.

Thanks. I left this as-is for now to keep the CI fix narrowly scoped, since the duplicated branch is just the fallback into the existing SessionManager path. If we touch this area again, I agree a small helper such as resolve_or_activate_thread_for_approval would make the control flow easier to read.
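
A rough sketch of what such a helper could look like follows. It is not code from this PR: `Session` is reduced to two fields, `ThreadId` is a `u128` stand-in for the UUID type, the generic resolution path is passed in as a closure, and everything is synchronous. Only the helper name comes from the review comment.

```rust
use std::collections::HashSet;

/// Stand-in for the thread UUID type (assumption for this sketch).
type ThreadId = u128;

/// Heavily simplified session: only the fields the approval path touches.
#[derive(Default)]
struct Session {
    threads: HashSet<ThreadId>,
    active_thread: Option<ThreadId>,
}

/// Hypothetical helper named after the reviewer's suggestion. The generic
/// resolution path is written once, as the single fallback arm.
fn resolve_or_activate_thread_for_approval(
    session: &mut Session,
    approval_thread: Option<ThreadId>,
    resolve_thread: impl FnOnce(&mut Session) -> ThreadId,
) -> ThreadId {
    // Keep the target only if it is already loaded in this session.
    match approval_thread.filter(|id| session.threads.contains(id)) {
        Some(id) => {
            session.active_thread = Some(id);
            id
        }
        // Covers both "no UUID supplied" and "UUID not loaded".
        None => resolve_thread(session),
    }
}

/// Stand-in for the generic path: always creates and activates a new thread.
fn create_thread(session: &mut Session) -> ThreadId {
    let id = session.threads.iter().max().map_or(1, |max| max + 1);
    session.threads.insert(id);
    session.active_thread = Some(id);
    id
}
```

Filtering the optional target first collapses the two fallback branches into one arm, which is the duplication the review comment points at.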

gemini-code-assist · 2026-03-20T18:28:03Z

 - the owner-scoped HTTP webhook channel
 - routine tools / routines tab
- job creation via routine execution / jobs tab
 """


The removal of this line from the docstring correctly reflects the updated scope of the E2E tests, which no longer directly cover job creation via routine execution in this scenario.

Yep, that was intentional. The E2E harness runs with sandbox-backed full-job execution disabled, so I narrowed the docstring to match what this scenario actually verifies: owner-scoped routine creation plus web approval/visibility.

Copilot

Pull request overview

This PR updates CI/E2E fixtures and approval handling so owner-scoped workflows (especially approvals for work originating in non-gateway channels) behave consistently and stop failing due to stale URL fixtures.

Changes:

Update E2E owner-scope scenario to approve pending requests via the web API for routines created from the HTTP owner-scoped webhook.
Adjust agent-loop thread resolution to allow approval submissions to target an already-loaded owned thread UUID across channels.
Refresh config URL test fixtures (LLM + embeddings) to avoid DNS-related flakiness/ENV_MUTEX poisoning, and remove a stale staging CI triage plan doc.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.


| File | Description |
| --- | --- |
| tests/e2e/scenarios/test_owner_scope.py | Switches the “full-job routine from HTTP” scenario to use `/api/chat/history` + `/api/chat/approval` for explicit approvals. |
| src/agent/agent_loop.rs | Adds an approval-specific thread-resolution fast path using UUID thread targeting. |
| src/config/llm.rs | Updates base URL test fixtures to localhost URLs. |
| src/config/embeddings.rs | Updates embedding base URL fixture to an IP literal to avoid DNS resolution. |
| docs/plans/2026-03-18-staging-ci-triage.md | Removes outdated triage plan document. |
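
To illustrate why an IP literal sidesteps DNS, the sketch below checks whether a URL's host parses as an IP address. It is not the repository's config validation: the URL parsing is deliberately minimal (no userinfo, no bracketed IPv6 literals) and the function names are made up for this example.

```rust
use std::net::IpAddr;

/// Extracts the host from a URL shaped like scheme://host[:port][/path].
/// Minimal on purpose; a real implementation would use a URL parser.
fn host_of(url: &str) -> Option<&str> {
    let rest = url.split_once("://")?.1;
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    // Drop an optional ":port" suffix.
    let host = authority.rsplit_once(':').map_or(authority, |(h, _)| h);
    if host.is_empty() {
        None
    } else {
        Some(host)
    }
}

/// Returns Some(true) when the host is a name that must be resolved,
/// Some(false) when it is already an IP literal, and None when the
/// input does not look like a URL.
fn needs_name_resolution(url: &str) -> Option<bool> {
    host_of(url).map(|host| host.parse::<IpAddr>().is_err())
}
```

Under this check a host such as `127.0.0.1` needs no lookup at all, while `localhost` is still a name, although it normally resolves locally without a network query.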


Copilot · 2026-03-20T18:31:13Z

+        let (session, thread_id) = if let Some(target_thread_id) = approval_thread_uuid {
+            let session = self
+                .session_manager
+                .get_or_create_session(&message.user_id)
+                .await;
+            let mut sess = session.lock().await;
+            if sess.threads.contains_key(&target_thread_id) {
+                sess.active_thread = Some(target_thread_id);
+                sess.last_active_at = chrono::Utc::now();
+                drop(sess);
+                (session, target_thread_id)
+            } else {
+                drop(sess);
+                self.session_manager
+                    .resolve_thread(
+                        &message.user_id,
+                        &message.channel,
+                        message.conversation_scope(),
+                    )
+                    .await
+            }


In the approval-thread fast path you switch active_thread and return the UUID, but you don't register a thread_map entry for the current (user_id, channel, external_thread_id) key. That means subsequent non-approval gateway messages that include the same UUID can still be routed by resolve_thread() into a new thread because this UUID is already mapped under a different channel key. Consider registering the mapping (e.g., via SessionManager::register_thread) when the UUID is accepted so future requests on this channel+thread_id consistently resolve to the same in-memory thread.

Fixed in aa426a00. When the approval fast path adopts an already-loaded UUID, it now also registers that thread for the current (user_id, channel, external_thread_id) key via SessionManager::register_thread, so later gateway messages on the same UUID resolve back to the same in-memory thread instead of creating a fresh channel-scoped thread. I also added a focused unit test for the second-channel registration case and reran tests/e2e/scenarios/test_owner_scope.py (3 passed).
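
The routing problem and the fix can be modeled with a small synchronous sketch. Everything here is a simplification under stated assumptions: `SessionManager` is reduced to a few maps, `ThreadId` is a `u128` stand-in for the UUID, the key shape `(user_id, channel, external_thread_id)` is taken from the review comment, and the `backfill_mapping` flag exists only so the before and after behavior can be compared. The real `resolve_thread` and `register_thread` signatures may differ.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for the thread UUID type (assumption for this sketch).
type ThreadId = u128;
/// (user_id, channel, external_thread_id), as described in the review comment.
type ThreadKey = (String, String, Option<String>);

#[derive(Default)]
struct SessionManager {
    threads: HashSet<ThreadId>,
    thread_map: HashMap<ThreadKey, ThreadId>,
    active_thread: Option<ThreadId>,
    next_id: ThreadId,
}

impl SessionManager {
    /// Generic path: reuse the thread mapped to this key, else create one.
    fn resolve_thread(&mut self, key: ThreadKey) -> ThreadId {
        let id = match self.thread_map.get(&key).copied() {
            Some(id) => id,
            None => {
                self.next_id += 1;
                self.threads.insert(self.next_id);
                self.thread_map.insert(key, self.next_id);
                self.next_id
            }
        };
        self.active_thread = Some(id);
        id
    }

    /// Maps an additional key onto an existing thread.
    fn register_thread(&mut self, key: ThreadKey, id: ThreadId) {
        self.thread_map.insert(key, id);
    }

    /// Approval fast path. `backfill_mapping` is a demo-only toggle.
    fn resolve_for_approval(
        &mut self,
        key: ThreadKey,
        target: Option<ThreadId>,
        backfill_mapping: bool,
    ) -> ThreadId {
        match target.filter(|id| self.threads.contains(id)) {
            Some(id) => {
                self.active_thread = Some(id);
                if backfill_mapping {
                    self.register_thread(key, id);
                }
                id
            }
            None => self.resolve_thread(key),
        }
    }
}

/// Convenience constructor for keys in this sketch.
fn key(user: &str, channel: &str, external: Option<&str>) -> ThreadKey {
    (user.to_string(), channel.to_string(), external.map(str::to_string))
}
```

With `backfill_mapping` set to `false`, an approval from a second channel returns the right thread, but the next `resolve_thread` call with that channel's key creates a new one. With it set to `true`, the same call resolves back to the approved thread, which is the behavior the fix describes.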

henrypark133 · 2026-03-20T19:03:31Z

Followed up on the approval-thread mapping review note in aa426a0. When the approval fast path adopts an already-loaded UUID, it now also registers that thread for the current channel key via SessionManager::register_thread, so later gateway messages using the same UUID resolve back to the same in-memory thread instead of creating a fresh channel-scoped thread. I also added a focused unit test for the second-channel registration case and reran tests/e2e/scenarios/test_owner_scope.py (3 passed).

* Fix CI approval flows and stale fixtures
* Backfill approval thread mapping across channels

Fix CI approval flows and stale fixtures

308999f

Copilot AI review requested due to automatic review settings March 20, 2026 18:25

github-actions Bot added labels on Mar 20, 2026: scope: agent (Agent core: agent loop, router, scheduler), scope: docs (Documentation), size: M (50-199 changed lines), risk: medium (Business logic, config, or moderate-risk modules), contributor: core (20+ merged PRs)

Copilot started reviewing on behalf of henrypark133 March 20, 2026 18:25 View session

gemini-code-assist Bot reviewed Mar 20, 2026


Copilot AI reviewed Mar 20, 2026


Backfill approval thread mapping across channels

aa426a0

github-actions Bot added the size: L (200-499 changed lines) label and removed the size: M (50-199 changed lines) label on Mar 20, 2026

nickpismenkov approved these changes Mar 20, 2026


henrypark133 merged commit d3b69e7 into staging Mar 20, 2026
14 checks passed

henrypark133 deleted the codex/ci-triage-fixes branch March 20, 2026 19:21

ironclaw-ci Bot mentioned this pull request Mar 20, 2026

chore: promote staging to staging-promote/ee6f5cd6-23354122351 (2026-03-20 19:41 UTC) #1483

Merged

tianhaoz95 pushed a commit to tianhaoz95/clawgo that referenced this pull request Mar 22, 2026

Fix CI approval flows and stale fixtures (nearai#1478)

3cb340f

* Fix CI approval flows and stale fixtures * Backfill approval thread mapping across channels

github-actions Bot mentioned this pull request Mar 24, 2026

chore: promote staging to staging-promote/89203225-23327092672 (2026-03-20 04:32 UTC) #1451

Merged

This was referenced Mar 25, 2026

chore: release v0.22.0 #1601

Merged

chore(ironclaw): release v0.23.0 #1658

Merged

bkutasi pushed a commit to bkutasi/ironclaw that referenced this pull request Mar 28, 2026

Fix CI approval flows and stale fixtures (nearai#1478)

da2291b

* Fix CI approval flows and stale fixtures * Backfill approval thread mapping across channels

drchirag1991 pushed a commit to drchirag1991/ironclaw that referenced this pull request Apr 8, 2026

Fix CI approval flows and stale fixtures (nearai#1478)

d1135ee

* Fix CI approval flows and stale fixtures * Backfill approval thread mapping across channels
