Fix CI approval flows and stale fixtures#1478

Merged
henrypark133 merged 2 commits into staging from codex/ci-triage-fixes
Mar 20, 2026
@henrypark133
Collaborator

Summary

  • update stale config URL fixtures so the Rust test suites stop cascading via ENV_MUTEX poisoning
  • allow explicit approval submissions to target an already-loaded owner thread by UUID across channels
  • update the owner-scope E2E to approve full-job routine creation through the web API and remove the stale staging CI triage plan doc
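
To illustrate the cascade described in the first bullet, here is a minimal sketch of how one panicking test can poison a shared lock so that later, unrelated tests fail as well. It assumes ENV_MUTEX is a `std::sync::Mutex` guarding environment mutation (the PR does not show its definition); the function names are hypothetical.

```rust
use std::sync::{Mutex, MutexGuard};
use std::thread;

// Hypothetical stand-in for the test suite's shared environment lock.
static ENV_MUTEX: Mutex<()> = Mutex::new(());

/// Simulates a test that panics while holding the lock, for example
/// because a stale URL fixture fails an assertion.
/// Returns true if the thread panicked and the lock is now poisoned.
fn failing_test_poisons_lock() -> bool {
    let result = thread::spawn(|| {
        let _guard = ENV_MUTEX.lock().unwrap();
        panic!("stale fixture: base URL did not validate");
    })
    .join();
    result.is_err() && ENV_MUTEX.is_poisoned()
}

/// A later test that calls `.lock().unwrap()` would panic too, because
/// locking a poisoned mutex returns `Err`. That is the cascade.
fn later_test_would_cascade() -> bool {
    ENV_MUTEX.lock().is_err()
}

/// Poison-tolerant acquisition: recover the guard from the error so that
/// one failing test does not take the rest of the suite down with it.
fn lock_env() -> MutexGuard<'static, ()> {
    ENV_MUTEX
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}
```

This PR fixes the fixtures rather than the locking; `lock_env` only shows one common way to make a suite resilient to poisoning, and is not something the PR claims to add.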

Testing

  • tests/e2e/.venv/bin/pytest tests/e2e/scenarios/test_owner_scope.py -q

Copilot AI review requested due to automatic review settings March 20, 2026 18:25
@github-actions (bot) added the labels scope: agent (Agent core: agent loop, router, scheduler), scope: docs (Documentation), size: M (50-199 changed lines), risk: medium (Business logic, config, or moderate-risk modules), and contributor: core (20+ merged PRs) on Mar 20, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the robustness of CI processes and improving the flexibility of approval workflows. It addresses issues with stale test configurations that could lead to cascading failures and introduces a more versatile mechanism for handling cross-channel approval submissions, particularly for routine creation. The changes aim to streamline development and testing by ensuring more reliable and adaptable system interactions.

Highlights

  • CI Approval Flows: Implemented the ability for explicit approval submissions to target an already-loaded owner thread by its UUID, allowing cross-channel approvals (e.g., web UI approving work from HTTP channels).
  • Stale Fixture Updates: Updated stale configuration URL fixtures in Rust test suites to prevent ENV_MUTEX poisoning and ensure test stability.
  • E2E Test Refinement: Updated the owner-scope end-to-end tests to validate the approval of full-job routine creation through the web API, and removed an outdated staging CI triage plan document.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

The pull request effectively addresses the stated goals of fixing CI approval flows and updating stale fixtures. The removal of the outdated triage plan document and the test fixture updates contribute to a cleaner and more relevant codebase. The core functional change in src/agent/agent_loop.rs to allow approval submissions to target specific threads by UUID across channels is a valuable enhancement for the web approval UI. The corresponding updates to the E2E tests in tests/e2e/scenarios/test_owner_scope.py correctly reflect and validate this new approval mechanism. Overall, the changes are well-implemented and improve the system's functionality and test coverage.

Comment thread src/agent/agent_loop.rs
Comment on lines +1040 to +1049
        } else {
            drop(sess);
            self.session_manager
                .resolve_thread(
                    &message.user_id,
                    &message.channel,
                    message.conversation_scope(),
                )
                .await
        }
Contributor

medium

The self.session_manager.resolve_thread(...) call is duplicated here and in the final else block. While functionally correct, consider extracting this logic into a helper function or a dedicated method within SessionManager to reduce repetition and improve readability. For instance, SessionManager could expose a method like resolve_or_activate_thread_for_approval that encapsulates this specific logic.

Suggested change

    -        } else {
    -            drop(sess);
    -            self.session_manager
    -                .resolve_thread(
    -                    &message.user_id,
    -                    &message.channel,
    -                    message.conversation_scope(),
    -                )
    -                .await
    -        }
    +        } else {
    +            // If the target_thread_id is not in the current session, fall back to generic resolution.
    +            // This might create a new thread or resolve an existing one based on channel/scope.
    +            self.session_manager
    +                .resolve_thread(
    +                    &message.user_id,
    +                    &message.channel,
    +                    message.conversation_scope(),
    +                )
    +                .await
    +        }

References
  1. When an issue is found in duplicated code, prefer refactoring into a shared function over applying localized fixes.
  2. Consolidate related sequences of operations, such as creating, persisting, and scheduling a job, into a single reusable method to improve code consistency and maintainability.

Collaborator Author

Thanks. I left this as-is for now to keep the CI fix narrowly scoped, since the duplicated branch is just the fallback into the existing SessionManager path. If we touch this area again, I agree a small helper such as resolve_or_activate_thread_for_approval would make the control flow easier to read.
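
For readers following the thread, here is a rough sketch of what such a helper might look like. It is not code from this PR: it uses a simplified, synchronous model in which `ThreadId` stands in for the UUID, `Session` and `SessionManager` are reduced to the fields needed here, and `resolve_thread` is a stub that always creates a new thread (the real method also takes a channel and conversation scope and is async).

```rust
use std::collections::{HashMap, HashSet};

// Stand-in for the UUID thread identifier used in the real code.
type ThreadId = u64;

// Simplified session: only the fields this sketch needs.
#[derive(Default)]
struct Session {
    threads: HashSet<ThreadId>,
    active_thread: Option<ThreadId>,
}

// Hypothetical, synchronous stand-in for the real SessionManager.
#[derive(Default)]
struct SessionManager {
    sessions: HashMap<String, Session>,
    next_id: ThreadId,
}

impl SessionManager {
    /// Stub for the existing generic path: here it always creates and
    /// activates a fresh thread for the user.
    fn resolve_thread(&mut self, user_id: &str) -> ThreadId {
        self.next_id += 1;
        let id = self.next_id;
        let sess = self.sessions.entry(user_id.to_string()).or_default();
        sess.threads.insert(id);
        sess.active_thread = Some(id);
        id
    }

    /// Activates `target` if it is already loaded in the user's session;
    /// otherwise falls back to `resolve_thread`. The fallback call now
    /// appears exactly once instead of in two `else` branches.
    fn resolve_or_activate_thread_for_approval(
        &mut self,
        user_id: &str,
        target: Option<ThreadId>,
    ) -> ThreadId {
        if let Some(id) = target {
            let sess = self.sessions.entry(user_id.to_string()).or_default();
            if sess.threads.contains(&id) {
                sess.active_thread = Some(id);
                return id;
            }
        }
        self.resolve_thread(user_id)
    }
}
```

The early return keeps the fast path and the fallback visibly separate, which is the readability gain the review comment is after.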

- the owner-scoped HTTP webhook channel
- routine tools / routines tab
- job creation via routine execution / jobs tab
"""
Contributor

medium

The removal of this line from the docstring correctly reflects the updated scope of the E2E tests, which no longer directly cover job creation via routine execution in this scenario.

Collaborator Author

Yep, that was intentional. The E2E harness runs with sandbox-backed full-job execution disabled, so I narrowed the docstring to match what this scenario actually verifies: owner-scoped routine creation plus web approval/visibility.

Contributor

Copilot AI left a comment

Pull request overview

This PR updates CI/E2E fixtures and approval handling so owner-scoped workflows (especially approvals for work originating in non-gateway channels) behave consistently and stop failing due to stale URL fixtures.

Changes:

  • Update E2E owner-scope scenario to approve pending requests via the web API for routines created from the HTTP owner-scoped webhook.
  • Adjust agent-loop thread resolution to allow approval submissions to target an already-loaded owned thread UUID across channels.
  • Refresh config URL test fixtures (LLM + embeddings) to avoid DNS-related flakiness/ENV_MUTEX poisoning, and remove a stale staging CI triage plan doc.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/e2e/scenarios/test_owner_scope.py | Switches the “full-job routine from HTTP” scenario to use /api/chat/history + /api/chat/approval for explicit approvals. |
| src/agent/agent_loop.rs | Adds an approval-specific thread-resolution fast path using UUID thread targeting. |
| src/config/llm.rs | Updates base URL test fixtures to localhost URLs. |
| src/config/embeddings.rs | Updates embedding base URL fixture to an IP literal to avoid DNS resolution. |
| docs/plans/2026-03-18-staging-ci-triage.md | Removes outdated triage plan document. |
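
As a small illustration of why an IP-literal fixture avoids DNS, the sketch below uses only the standard library: parsing an `IpAddr` or `SocketAddr` from a string never consults a resolver, so it succeeds for literals and fails for hostnames. The function names and the example addresses are assumptions; the PR does not show the actual fixture values or the validation code in src/config.

```rust
use std::net::{IpAddr, SocketAddr};

/// Returns true when `host` is an IPv4 or IPv6 literal, which can be used
/// without any name resolution.
fn is_ip_literal(host: &str) -> bool {
    host.parse::<IpAddr>().is_ok()
}

/// Parses an "ip:port" authority without touching a resolver.
/// Hostnames such as "localhost:8080" are rejected because `SocketAddr`
/// parsing only accepts IP literals.
fn parse_without_dns(authority: &str) -> Option<SocketAddr> {
    authority.parse::<SocketAddr>().ok()
}
```

A fixture such as `127.0.0.1:8080` passes both checks, while a hostname would need a lookup (via DNS or the hosts file) before it could be turned into an address.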

Comment thread src/agent/agent_loop.rs
Comment on lines +1029 to +1049
    let (session, thread_id) = if let Some(target_thread_id) = approval_thread_uuid {
        let session = self
            .session_manager
            .get_or_create_session(&message.user_id)
            .await;
        let mut sess = session.lock().await;
        if sess.threads.contains_key(&target_thread_id) {
            sess.active_thread = Some(target_thread_id);
            sess.last_active_at = chrono::Utc::now();
            drop(sess);
            (session, target_thread_id)
        } else {
            drop(sess);
            self.session_manager
                .resolve_thread(
                    &message.user_id,
                    &message.channel,
                    message.conversation_scope(),
                )
                .await
        }
Copilot AI Mar 20, 2026

In the approval-thread fast path you switch active_thread and return the UUID, but you don't register a thread_map entry for the current (user_id, channel, external_thread_id) key. That means subsequent non-approval gateway messages that include the same UUID can still be routed by resolve_thread() into a new thread because this UUID is already mapped under a different channel key. Consider registering the mapping (e.g., via SessionManager::register_thread) when the UUID is accepted so future requests on this channel+thread_id consistently resolve to the same in-memory thread.

Collaborator Author

Fixed in aa426a00. When the approval fast path adopts an already-loaded UUID, it now also registers that thread for the current (user_id, channel, external_thread_id) key via SessionManager::register_thread, so later gateway messages on the same UUID resolve back to the same in-memory thread instead of creating a fresh channel-scoped thread. I also added a focused unit test for the second-channel registration case and reran tests/e2e/scenarios/test_owner_scope.py (3 passed).
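
The routing problem and the fix can be sketched with a simplified model. Everything below is an assumption made for illustration: the real `SessionManager::register_thread` signature is not shown in this PR, `ThreadId` stands in for the UUID, and `thread_map` is modeled as a plain `HashMap` keyed by (user_id, channel, external_thread_id).

```rust
use std::collections::HashMap;

// Stand-in for the UUID thread identifier.
type ThreadId = u64;
// (user_id, channel, external_thread_id)
type ThreadKey = (String, String, Option<String>);

// Hypothetical, synchronous stand-in for the real SessionManager.
#[derive(Default)]
struct SessionManager {
    thread_map: HashMap<ThreadKey, ThreadId>,
    next_id: ThreadId,
}

fn key(user_id: &str, channel: &str, external: Option<&str>) -> ThreadKey {
    (
        user_id.to_string(),
        channel.to_string(),
        external.map(str::to_string),
    )
}

impl SessionManager {
    /// Generic resolution: reuse the thread mapped to this exact key,
    /// otherwise create a new thread and map it.
    fn resolve_thread(
        &mut self,
        user_id: &str,
        channel: &str,
        external: Option<&str>,
    ) -> ThreadId {
        let k = key(user_id, channel, external);
        if let Some(&id) = self.thread_map.get(&k) {
            return id;
        }
        self.next_id += 1;
        let id = self.next_id;
        self.thread_map.insert(k, id);
        id
    }

    /// Maps an additional key to an existing thread, so later messages on
    /// that channel resolve to the same in-memory thread.
    fn register_thread(
        &mut self,
        user_id: &str,
        channel: &str,
        external: Option<&str>,
        id: ThreadId,
    ) {
        self.thread_map.insert(key(user_id, channel, external), id);
    }
}
```

In this model, a thread created on the `http` channel is invisible to a `gateway` lookup until `register_thread` adds the second key, which mirrors the behavior described in the review comment and the follow-up fix.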

@github-actions (bot) added the label size: L (200-499 changed lines) and removed the label size: M (50-199 changed lines) on Mar 20, 2026
@henrypark133
Collaborator Author

Followed up on the approval-thread mapping review note in aa426a0. When the approval fast path adopts an already-loaded UUID, it now also registers that thread for the current channel key via SessionManager::register_thread, so later gateway messages using the same UUID resolve back to the same in-memory thread instead of creating a fresh channel-scoped thread. I also added a focused unit test for the second-channel registration case and reran tests/e2e/scenarios/test_owner_scope.py (3 passed).

@henrypark133 henrypark133 merged commit d3b69e7 into staging Mar 20, 2026
14 checks passed
@henrypark133 henrypark133 deleted the codex/ci-triage-fixes branch March 20, 2026 19:21
tianhaoz95 pushed a commit to tianhaoz95/clawgo that referenced this pull request Mar 22, 2026
* Fix CI approval flows and stale fixtures

* Backfill approval thread mapping across channels
bkutasi pushed a commit to bkutasi/ironclaw that referenced this pull request Mar 28, 2026
* Fix CI approval flows and stale fixtures

* Backfill approval thread mapping across channels
drchirag1991 pushed a commit to drchirag1991/ironclaw that referenced this pull request Apr 8, 2026
* Fix CI approval flows and stale fixtures

* Backfill approval thread mapping across channels
Labels

contributor: core (20+ merged PRs), risk: medium (Business logic, config, or moderate-risk modules), scope: agent (Agent core: agent loop, router, scheduler), scope: docs (Documentation), size: L (200-499 changed lines)

3 participants