feat(jobs): per-job MCP server filtering and max_iterations cap #1243

Merged
serrrfirat merged 6 commits into nearai:staging from nick-stebbings:pr/per-job-mcp-filtering
Apr 1, 2026

@nick-stebbings
Contributor

Summary

  • Add mcp_servers and max_iterations optional parameters to create_job tool
  • mcp_servers filters which MCP servers are mounted into worker containers, gated behind MCP_PER_JOB_ENABLED env var (default false)
  • max_iterations caps the worker agent loop iteration count (default 50, max 500)
  • Per-job MCP configs written to temp files, mounted read-only, cleaned up on job completion

Motivation

Deployments running multiple MCP servers need per-job scoping — a research job should only access research tools, not production APIs. Simple data-gathering jobs shouldn't burn 50 iterations when 10 suffice.

Test plan

  • cargo test --lib passes (3155 tests)
  • cargo clippy --all --all-features zero warnings
  • cargo fmt --check clean
  • Manual: create job with mcp_servers: ["serpstat"] → only serpstat in container config
  • Manual: create job with max_iterations: 5 → worker exits at 5
  • Manual: create job without params → full config mounted (backward compatible)
  • Verify MCP_PER_JOB_ENABLED=false (default) ignores mcp_servers param
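
The gate described in the last test-plan item can be sketched as follows. This is a minimal illustration, not the PR's code: the helper names `per_job_mcp_enabled` and `effective_filter` are hypothetical, and the set of accepted truthy values ("true" or "1") is an assumption, since the thread only states that the default is false.

```rust
/// Hypothetical helper: interpret the raw value of `MCP_PER_JOB_ENABLED`.
/// Assumption: only "true" or "1" (case-insensitive) enable the feature;
/// an unset variable means disabled, matching the documented default.
fn per_job_mcp_enabled(raw: Option<&str>) -> bool {
    matches!(
        raw.map(|v| v.trim().to_ascii_lowercase()).as_deref(),
        Some("true") | Some("1")
    )
}

/// Hypothetical helper: when the gate is off, the caller's `mcp_servers`
/// filter is dropped, so the job falls back to the full, unfiltered config.
fn effective_filter(enabled: bool, requested: Option<Vec<String>>) -> Option<Vec<String>> {
    if enabled {
        requested
    } else {
        None
    }
}
```

A caller could feed it with `per_job_mcp_enabled(std::env::var("MCP_PER_JOB_ENABLED").ok().as_deref())`.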

🤖 Generated with Claude Code

@github-actions github-actions bot added labels on Mar 16, 2026: scope: channel/web (Web gateway channel); scope: tool/builtin (Built-in tools); scope: orchestrator (Container orchestrator); size: L (200-499 changed lines); risk: medium (Business logic, config, or moderate-risk modules); contributor: experienced (6-19 merged PRs)
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances job configuration flexibility and security by introducing per-job control over MCP server access and worker agent iteration limits. These changes allow for more granular resource management and improved isolation between different job types, ensuring that jobs only access necessary tools and operate within defined computational boundaries, thereby boosting efficiency and reducing potential security risks.

Highlights

  • Per-Job MCP Server Filtering: Introduced an optional mcp_servers parameter to the create_job tool, allowing users to specify which MCP servers are mounted into worker containers. This feature is gated by the MCP_PER_JOB_ENABLED environment variable.
  • Worker Iteration Cap: Added an optional max_iterations parameter to the create_job tool, enabling users to cap the worker agent loop iteration count for a job. The default is 50, with a maximum of 500 iterations.
  • Dynamic MCP Configuration: Per-job MCP configurations are now dynamically generated, written to temporary files, mounted read-only into worker containers, and automatically cleaned up upon job completion.
  • API and Schema Updates: The create_job API and its corresponding JSON schema have been updated to include the new mcp_servers and max_iterations parameters.
Changelog
  • src/channels/web/handlers/jobs.rs
    • Updated jobs_restart_handler to pass None for the newly introduced mcp_servers and max_iterations parameters when restarting jobs, ensuring backward compatibility.
  • src/orchestrator/job_manager.rs
    • Added mcp_per_job_enabled field to ContainerJobConfig to control the per-job MCP filtering feature.
    • Modified create_job and create_job_inner function signatures to accept optional mcp_servers and max_iterations parameters.
    • Implemented logic to inject IRONCLAW_MAX_ITERATIONS environment variable into job containers based on the max_iterations parameter.
    • Added functionality to generate and mount filtered MCP server configurations into job containers when mcp_per_job_enabled is active.
    • Introduced a cleanup mechanism in cleanup_job to remove temporary per-job MCP configuration files.
    • Created a new helper function generate_worker_mcp_config responsible for creating filtered MCP config files based on provided server names.
    • Added comprehensive unit tests for the generate_worker_mcp_config function.
  • src/orchestrator/mod.rs
    • Configured the mcp_per_job_enabled setting in ContainerJobConfig by reading the MCP_PER_JOB_ENABLED environment variable during orchestrator setup.
  • src/tools/builtin/job.rs
    • Modified the execute_sandbox function signature to accommodate the new mcp_servers and max_iterations parameters.
    • Updated the call to jm.create_job to pass through the new optional parameters.
    • Extended the JSON schema for the create_job tool to formally define mcp_servers (array of strings) and max_iterations (integer) as optional inputs.
    • Implemented parsing logic to extract mcp_servers and max_iterations from tool parameters and forward them to the job execution.
Activity
  • The pull request author has provided a detailed summary and motivation for the changes.
  • Comprehensive test plans are outlined, including unit tests and manual verification steps for the new features.
  • The changes were generated with Claude Code, indicating AI assistance in development.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces per-job MCP server filtering and a max_iterations cap, which are valuable additions for scoping job permissions and managing resources. The implementation is solid, with new parameters correctly propagated through the job creation flow and handled during container setup. The temporary file management for per-job configurations is also well-implemented.

My review includes a few suggestions to enhance maintainability and robustness, primarily by addressing hardcoded paths and refining error handling. Additionally, I've noted that several functions now have a large number of arguments; while acceptable for this change, it would be beneficial to consider using a builder pattern or an options struct in the future to improve code clarity.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces per-job configuration for MCP servers and iteration limits, which is a great enhancement for security and resource management. The implementation is generally solid, with good use of feature flags and temporary file handling. I've included a few suggestions to improve error handling and code maintainability.

Collaborator

@zmanian zmanian left a comment

Review: REQUEST CHANGES

Good architecture -- per-job MCP filtering via temp config bind-mounted read-only into containers, behind a feature gate. But two critical bugs in the max_iterations implementation.

Critical

1. max_iterations is completely dead code
The cap is injected as IRONCLAW_MAX_ITERATIONS env var, but the worker binary reads --max-iterations from clap CLI args (default 50). The env var is never consumed anywhere in the codebase. Jobs always use 50 iterations regardless of what the user specifies.

Fix: add env = "IRONCLAW_MAX_ITERATIONS" to the clap arg definition in src/cli/mod.rs:

#[arg(long, env = "IRONCLAW_MAX_ITERATIONS", default_value = "50")]

2. max_iterations: 0 is allowed
No minimum bound in the parsing at job.rs. max_iterations: 0 creates a job that immediately terminates with zero loop iterations. Fix: add .max(1) after .min(500).
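
To make item 2 concrete, here is a small sketch of the difference between an upper-only cap and a two-sided clamp. The constant and function names are hypothetical; only the bounds (1 and 500) come from the thread.

```rust
/// Upper bound quoted in the PR description (hypothetical constant name).
const ITERATION_CAP: u32 = 500;

/// Upper bound only: a request of 0 passes through unchanged.
fn cap_upper_only(requested: u32) -> u32 {
    requested.min(ITERATION_CAP)
}

/// Both bounds: equivalent to `.min(500).max(1)`, written with `clamp`.
fn cap_both_ends(requested: u32) -> u32 {
    requested.clamp(1, ITERATION_CAP)
}
```

With the upper-only version a request of 0 stays 0, which is the zero-iteration job described above. The clamped version raises it to 1 and still limits large values to 500.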

Important

3. Hardcoded /tmp/ironclaw-mcp-configs/
Per .claude/rules/review-discipline.md: "Never hardcode /tmp/... paths." Use std::env::temp_dir() or store the temp dir path on ContainerJobManager.

4. No test for max_iterations plumbing
The generate_worker_mcp_config tests are good, but there's no test verifying max_iterations reaches the container. Would have caught bug #1.
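
For item 3, a per-job config path can be derived from `std::env::temp_dir()` rather than a literal `/tmp`. The sketch below is illustrative only: the helper name and the `<job_id>.json` file name are assumptions, while the `ironclaw-mcp-configs` directory name is taken from the thread.

```rust
use std::path::PathBuf;

/// Hypothetical helper: where a per-job MCP config could live.
/// Assumption: one JSON file per job, named after the job id.
fn per_job_config_path(job_id: &str) -> PathBuf {
    std::env::temp_dir()
        .join("ironclaw-mcp-configs")
        .join(format!("{job_id}.json"))
}
```

Because the base directory comes from the platform, the same code respects `TMPDIR` on Unix and the user's temp folder on Windows.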

Suggestions

  • Consider a CreateJobRequest struct instead of #[allow(clippy::too_many_arguments)]
  • MCP server name matching should be case-insensitive per project style rules

What's good

  • Feature gate (MCP_PER_JOB_ENABLED, default false) is the right safety choice
  • generate_worker_mcp_config semantics are clean: None (full), Some([]) (none), Some([...]) (filtered)
  • Disabled servers correctly excluded from filtered configs
  • Good test coverage for the MCP filtering path
  • schema_version preserved in filtered configs

CI all green.

@github-actions github-actions bot added the scope: channel/cli (TUI / CLI channel) label Mar 16, 2026
@nick-stebbings
Contributor Author

Thanks for the thorough review @zmanian. All four issues addressed in 2c564df:

Critical fixes:

  1. max_iterations dead code — Added env = "IRONCLAW_MAX_ITERATIONS" to clap arg in cli/mod.rs. Worker CLI now reads the env var injected by the orchestrator.
  2. max_iterations: 0 — Changed .min(500) to .clamp(1, 500).

Important fixes:
3. Hardcoded /tmp/ — Replaced with std::env::temp_dir().join("ironclaw-mcp-configs") in all three locations (cleanup, generation, doc comment).
4. max_iterations plumbing test — Added test_max_iterations_env_var_injected that uses include_str! to verify the env var name in cli/mod.rs matches what create_job_inner injects. Would have caught bug #1.

Suggestions addressed:

  • MCP server name matching is now case-insensitive (eq_ignore_ascii_case) + test added.
  • CreateJobRequest struct — agree it's the right direction, happy to do that as a follow-up to keep this PR focused.

@nick-stebbings nick-stebbings requested a review from zmanian March 17, 2026 21:07
@henrypark133 henrypark133 requested a review from Copilot March 18, 2026 21:29
Contributor

Copilot AI left a comment

Pull request overview

Adds per-job scoping controls for sandboxed job execution by extending the create_job tool with optional MCP server filtering and a worker loop iteration cap, plus orchestrator support for enforcing those settings at container creation time.

Changes:

  • Extend create_job tool schema and parameter parsing to accept mcp_servers and max_iterations.
  • Add orchestrator plumbing to optionally mount a filtered per-job MCP config and inject IRONCLAW_MAX_ITERATIONS.
  • Add tests for MCP config filtering behavior and an env-var wiring assertion for max_iterations.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/tools/builtin/job.rs | Adds mcp_servers / max_iterations tool parameters and forwards them into sandbox job creation. |
| src/orchestrator/mod.rs | Adds MCP_PER_JOB_ENABLED env-var gate into ContainerJobConfig. |
| src/orchestrator/job_manager.rs | Implements MCP config filtering/mounting, IRONCLAW_MAX_ITERATIONS injection, cleanup, and unit tests. |
| src/cli/mod.rs | Allows worker max_iterations to be set via IRONCLAW_MAX_ITERATIONS. |
| src/channels/web/handlers/jobs.rs | Updates restart path to pass new create_job arguments. |

@nick-stebbings nick-stebbings force-pushed the pr/per-job-mcp-filtering branch from 2c564df to 04e6e6a Compare March 19, 2026 19:46
@github-actions github-actions bot added the scope: docs (Documentation) and scope: dependencies (Dependency updates) labels Mar 19, 2026
@nick-stebbings
Contributor Author

Addressed Copilot review feedback in 79aecc8:

  1. IRONCLAW_MAX_ITERATIONS mode guard — now only injected for JobMode::Worker (ClaudeCode uses max_turns instead)
  2. Extracted WORKER_MCP_CONFIG_PATH constant — removed hardcoded /opt/ironclaw/config/worker/mcp-servers.json string
  3. TOCTOU race in cleanup_job — replaced exists() + remove_file() with direct remove_file() matching on NotFound
  4. schema_version default — changed from 0 to 1 to match McpServersFile default
  5. Serialization error propagation — replaced unwrap_or_else(|| "{}") with proper map_err — config corruption now surfaces as an error
  6. Type validation for mcp_servers/max_iterations — wrong JSON types now log a warning instead of being silently ignored
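
The TOCTOU fix in item 3 follows a common pattern: attempt the removal directly and treat "not found" as success, instead of checking `exists()` first. A minimal sketch with the standard library follows. The function name is hypothetical, and it uses blocking `std::fs` only to stay dependency-free.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Remove `path` without a prior `exists()` check.
/// Returns Ok(true) if a file was removed, Ok(false) if it was already gone.
/// Any other I/O error is propagated to the caller.
fn remove_if_present(path: &Path) -> io::Result<bool> {
    match fs::remove_file(path) {
        Ok(()) => Ok(true),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(false),
        Err(e) => Err(e),
    }
}
```

Calling it twice on the same path is harmless, which is the idempotency a cleanup routine needs.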

Re: temp file leak on Docker create failure — cleanup_job already handles this via the UUID-based temp path. The temp dir is also cleaned up by the OS reaper. Happy to add explicit early-cleanup if preferred.

Re: predictable temp path — the ironclaw-mcp-configs/ dir is only writable by the deploy user (umask 0022). Happy to switch to tempfile crate for symlink protection if this is a hard requirement.

@zmanian ready for re-review when you get a chance.

Collaborator

@zmanian zmanian left a comment

Feature logic is sound and well-designed, but CI is blocking on missing regression tests.

Strengths:

  • Per-job MCP filtering gated behind MCP_PER_JOB_ENABLED (opt-in, backward compatible)
  • Three clean modes: None (full config), Some([]) (no MCP), Some(["name"]) (filtered)
  • Case-insensitive server name matching, respects enabled flag
  • max_iterations correctly clamped to [1, 500] and injected via env var
  • 8 unit tests for core filtering logic
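
The three filter modes and the matching rules listed above can be modelled without any JSON handling. The sketch below is a simplified stand-in, not the PR's `generate_worker_mcp_config`: `ServerEntry`, `McpMount`, and `select_servers` are hypothetical names, and treating "nothing matched" the same as an empty filter is an assumption based on the test names mentioned in this thread.

```rust
/// Simplified stand-in for one entry of the master MCP config (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct ServerEntry {
    name: String,
    enabled: bool,
}

/// What ends up mounted into the worker container.
#[derive(Debug, PartialEq)]
enum McpMount {
    /// `None` filter: mount the master config unchanged.
    Full,
    /// Empty filter, or no enabled server matched: mount no MCP config.
    Nothing,
    /// Only these servers are written to the per-job config.
    Filtered(Vec<ServerEntry>),
}

fn select_servers(master: &[ServerEntry], filter: Option<&[String]>) -> McpMount {
    // No filter at all means "use everything".
    let Some(wanted) = filter else {
        return McpMount::Full;
    };
    // Keep enabled servers whose name matches a requested name, ignoring ASCII case.
    let kept: Vec<ServerEntry> = master
        .iter()
        .filter(|s| s.enabled && wanted.iter().any(|w| w.eq_ignore_ascii_case(&s.name)))
        .cloned()
        .collect();
    if kept.is_empty() {
        McpMount::Nothing
    } else {
        McpMount::Filtered(kept)
    }
}
```

Keeping `None` and `Some([])` distinct is what lets a caller say "no MCP servers at all" without changing the default behaviour for callers that pass nothing.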

Required changes:

  1. Regression test enforcement failing -- add integration tests covering:

    • Create job with mcp_servers: ["serpstat"] -> verify mounted config contains only serpstat
    • Create job with max_iterations: 5 -> verify env var injected
    • Create job with MCP_PER_JOB_ENABLED=false and mcp_servers param -> verify param ignored
    • Verify temp file cleanup after job completion
  2. Security hardening (minor): Set restrictive permissions (0o700) on /tmp/ironclaw-mcp-configs/ directory

Will approve once the regression tests are added and CI passes.

@nick-stebbings nick-stebbings force-pushed the pr/per-job-mcp-filtering branch from 79aecc8 to 237c7dc Compare March 21, 2026 21:55
@nick-stebbings
Contributor Author

Addressed all review feedback in d49d615:

Regression tests added (5 new tests, 18 total):

  • test_filtered_config_contains_only_requested_server — verifies filtered config contains only serpstat, explicitly asserts notion and archon do not leak through, checks schema_version preservation
  • test_feature_flag_disabled_skips_mcp_filtering — verifies mcp_per_job_enabled defaults to false and the gate in create_job_inner is present
  • test_temp_file_cleanup_removes_per_job_config — creates a filtered config, verifies the temp path matches what cleanup_job expects, removes it, asserts it's gone
  • test_cleanup_job_is_idempotent — calls cleanup_job twice for a non-existent job, verifies no panic
  • test_temp_dir_has_restrictive_permissions (unix) — verifies /tmp/ironclaw-mcp-configs/ is set to 0o700

Security hardening:

  • Set 0o700 permissions on the ironclaw-mcp-configs temp directory after creation to prevent other host users from reading filtered MCP configs
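
As an illustration of the hardening step, the following sketch creates a directory and then restricts it to the owning user. It is a minimal example under assumptions: the function name is hypothetical, and it uses blocking `std::fs` calls for brevity.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Create `dir` (and any missing parents), then restrict it to the owner on Unix.
/// On non-Unix platforms this sketch only creates the directory.
fn ensure_private_dir(dir: &Path) -> io::Result<()> {
    fs::create_dir_all(dir)?;
    #[cfg(unix)]
    {
        use std::os::unix::fs::PermissionsExt;
        // rwx for the owner, nothing for group and others.
        fs::set_permissions(dir, fs::Permissions::from_mode(0o700))?;
    }
    Ok(())
}
```

Setting the mode explicitly after creation makes the result independent of the process umask.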

All 18 tests pass locally.

Collaborator

@zmanian zmanian left a comment

Code Review: per-job MCP server filtering and max_iterations cap

What was done well

  • Clean separation: MCP filtering is gated behind MCP_PER_JOB_ENABLED (default false), good defense-in-depth
  • Temp directory permissions hardened to 0700 on Unix
  • TOCTOU-safe cleanup in cleanup_job (remove_file directly, match on NotFound)
  • Case-insensitive server name matching
  • Disabled servers excluded from filtered config
  • Tests cover the core generate_worker_mcp_config function thoroughly (empty filter, no match, case-insensitive, permissions, cleanup idempotency)
  • Read-only bind mount (:ro) prevents container from modifying its MCP config

Critical (must fix)

1. max_iterations is NOT enforced server-side for the container worker path

The max_iterations value is injected as an env var IRONCLAW_MAX_ITERATIONS and read by the worker CLI via clap. But src/worker/container.rs:176 passes self.config.max_iterations to AgenticLoopConfig -- this comes from the clap-parsed CLI arg, which the container process controls.

The problem: there are TWO worker paths:

  • Container worker (src/worker/container.rs): reads max_iterations from its own CLI args (clap env = "IRONCLAW_MAX_ITERATIONS"). This is the path used by create_job. The value is honored, but only because the container environment is set by the host. A compromised container process could ignore the env var.
  • Job worker (src/worker/job.rs:304-311): reads max_iterations from ctx.metadata, clamps to MAX_WORKER_ITERATIONS = 500. This is the in-process scheduler path. The PR doesn't appear to inject max_iterations into job metadata for this path.

For the container path specifically: the clamp happens in job.rs tool parsing (n.clamp(1, 500)), but create_job_inner passes the value straight through to the env var without re-validating. If the tool parsing is bypassed (e.g., direct API call to the web handler -- though the restart handler passes None today), there's no server-side enforcement in create_job_inner.

Recommendation: Add a clamp in create_job_inner before injecting the env var:

if let Some(iters) = max_iterations && mode == JobMode::Worker {
    let capped = iters.clamp(1, 500);
    env_vec.push(format!("IRONCLAW_MAX_ITERATIONS={}", capped));
}

2. #[allow(clippy::too_many_arguments)] -- structural concern

Two new #[allow(clippy::too_many_arguments)] suppressions were added. Per the project's zero-clippy-warnings policy, this is technically compliant but signals that create_job and execute_sandbox are accumulating parameters. These functions now take 7-8 positional arguments, which is fragile.

Recommendation: Introduce a JobCreationParams struct to bundle mcp_servers, max_iterations, and credential_grants. This would reduce parameter count and make future extensions safer:

pub struct JobCreationParams {
    pub credential_grants: Vec<CredentialGrant>,
    pub mcp_servers: Option<Vec<String>>,
    pub max_iterations: Option<u32>,
}

Important (should fix)

3. No validation of MCP server names

Server names from the mcp_servers array are compared against the master config via eq_ignore_ascii_case, which is correct. However, there is no validation that the names are reasonable strings (e.g., no path separators, no null bytes). While the names are only used for string comparison (not file paths), defensive validation would prevent future misuse if the names were ever used in paths or commands.

Recommendation: Add a simple check that rejects names containing /, \, \0, or names longer than 128 chars.
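
A check along the lines of this recommendation might look like the sketch below. The function name and error strings are hypothetical, and counting the 128-character limit in Unicode scalar values rather than bytes is an assumption.

```rust
/// Hypothetical validator following the rules suggested above:
/// no `/`, `\`, or NUL characters, and at most 128 characters.
fn validate_server_name(name: &str) -> Result<(), String> {
    let len = name.chars().count();
    if len > 128 {
        return Err(format!("MCP server name too long ({len} chars, max 128)"));
    }
    if let Some(bad) = name.chars().find(|&c| matches!(c, '/' | '\\' | '\0')) {
        return Err(format!(
            "MCP server name {name:?} contains forbidden character {bad:?}"
        ));
    }
    Ok(())
}
```

Rejecting the name outright, rather than sanitising it, keeps the behaviour easy to reason about if the names are ever reused in paths or commands.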

4. generate_worker_mcp_config uses synchronous I/O

The function uses std::fs::read_to_string, std::fs::write, std::fs::create_dir_all, etc. It's called from an async context (create_job_inner). This will block the tokio runtime thread during file I/O.

For small JSON config files this is unlikely to cause problems in practice, but it violates the project's "all I/O is async with tokio" convention stated in CLAUDE.md.

Recommendation: Either use tokio::fs equivalents (making the function async) or wrap the call in tokio::task::spawn_blocking. Low urgency since these files are small.

5. Missing test: max_iterations not injected for ClaudeCode mode

The code correctly gates IRONCLAW_MAX_ITERATIONS injection on mode == JobMode::Worker. There should be a test verifying that max_iterations is ignored when mode == JobMode::ClaudeCode, since ClaudeCode has its own max_turns parameter.
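
One way to make this behaviour testable without a container is to factor the decision into a pure function. The sketch below is an assumption about how that could look, not the PR's code: `JobMode` and `MAX_WORKER_ITERATIONS` are redefined locally as stand-ins, and `iteration_env` is a hypothetical helper.

```rust
/// Local stand-in for the job mode enum named in the review.
#[derive(Debug, Clone, Copy, PartialEq)]
enum JobMode {
    Worker,
    ClaudeCode,
}

/// Local stand-in for the shared cap (500 per the thread).
const MAX_WORKER_ITERATIONS: u32 = 500;

/// Returns the env entry to inject, or None when the cap does not apply.
/// Only Worker mode receives the variable; the value is clamped to [1, 500].
fn iteration_env(mode: JobMode, max_iterations: Option<u32>) -> Option<String> {
    match (mode, max_iterations) {
        (JobMode::Worker, Some(n)) => Some(format!(
            "IRONCLAW_MAX_ITERATIONS={}",
            n.clamp(1, MAX_WORKER_ITERATIONS)
        )),
        _ => None,
    }
}
```

A unit test can then assert directly that `JobMode::ClaudeCode` never yields the variable, with no source scanning involved.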


Suggestions (nice to have)

6. The test_max_iterations_env_var_injected test uses include_str! to grep source code. This is clever but brittle -- if the clap attribute format changes, the test breaks without a real behavioral regression. Consider replacing with a clap try_get_matches_from test that actually parses args.

7. The test_feature_flag_disabled_skips_mcp_filtering test also uses include_str! source scanning. Same concern as above.

8. Consider adding the max_iterations cap constant (500) as a named constant in job_manager.rs rather than having it as a magic number in the tool parsing code, since it needs to match MAX_WORKER_ITERATIONS in worker/job.rs.


Summary

The core design is sound: MCP filtering is properly gated, filtered configs are written to temp files with restrictive permissions, cleanup is TOCTOU-safe, and the mount is read-only. The main concern is that max_iterations lacks server-side clamping in create_job_inner -- the defense currently relies entirely on the tool parameter parsing layer, which is bypassable. Adding a clamp at the container creation layer (Critical #1) would close this gap. The parameter proliferation (Critical #2) should be addressed to prevent further erosion of the function signatures.

Requesting changes for Critical #1 (server-side clamp). Critical #2 is a strong recommendation but can be deferred to a follow-up if needed.

@github-actions github-actions bot added the size: XL (500+ changed lines) label and removed the size: L (200-499 changed lines) label Mar 28, 2026
nick-stebbings and others added 5 commits March 28, 2026 08:32
Add mcp_servers and max_iterations optional params to create_job.
mcp_servers filters which MCP servers are mounted into worker
containers (gated behind MCP_PER_JOB_ENABLED, default false).
max_iterations caps the worker agent loop (default 50, max 500).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix max_iterations dead code: add env = "IRONCLAW_MAX_ITERATIONS"
  to clap arg so worker CLI reads the env var injected by orchestrator
- Fix max_iterations: 0 allowed: use .clamp(1, 500) instead of .min(500)
- Replace hardcoded /tmp/ironclaw-mcp-configs with std::env::temp_dir()
- Make MCP server name matching case-insensitive
- Add test for case-insensitive matching
- Add test verifying max_iterations env var name matches clap definition

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Guard IRONCLAW_MAX_ITERATIONS injection to Worker mode only (ClaudeCode uses max_turns)
- Extract WORKER_MCP_CONFIG_PATH as constant (no more hardcoded path)
- Fix TOCTOU race in cleanup_job: use remove_file directly, match on NotFound
- Fix schema_version default: 0 → 1 to match McpServersFile default
- Propagate serialization errors instead of silently writing empty config
- Add type validation warnings for mcp_servers and max_iterations params
…tering

Add 5 regression tests covering CI-required scenarios:
- Filtered config contains only the requested server (no leaks)
- Feature flag disabled skips MCP filtering entirely
- Temp file cleanup removes per-job config
- cleanup_job is idempotent (no panic on missing file/handle)
- Temp directory has restrictive 0o700 permissions (unix)

Security: set 0o700 permissions on /tmp/ironclaw-mcp-configs/ to prevent
other users on the host from reading filtered MCP server configs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… validation

Critical:
1. Server-side max_iterations clamp in create_job_inner — defense no longer
   relies solely on tool parameter parsing. Uses MAX_WORKER_ITERATIONS constant
   (matching worker/job.rs) so the cap is enforced even for direct API calls.

2. Introduce JobCreationParams struct to bundle credential_grants, mcp_servers,
   and max_iterations. Removes #[allow(clippy::too_many_arguments)] from both
   create_job and execute_sandbox (7→5 and 9→7 positional args).

Important:
3. Validate MCP server names: reject path separators (/\), null bytes, and
   names longer than 128 chars to prevent future misuse.

5. Add test verifying max_iterations is NOT injected for ClaudeCode mode.
   Add test verifying server-side clamp uses MAX_WORKER_ITERATIONS constant.
   Add test verifying name validation rejects path separators and null bytes.
@nick-stebbings nick-stebbings force-pushed the pr/per-job-mcp-filtering branch from 69cc96e to 87eb293 Compare March 28, 2026 07:32
@nick-stebbings
Contributor Author

nick-stebbings commented Mar 28, 2026

Addressed all review items in 87eb293:

Critical #1 — Server-side max_iterations clamp:
Added iters.clamp(1, MAX_WORKER_ITERATIONS) in create_job_inner before injecting the env var. The cap is now enforced server-side regardless of whether the tool parsing layer is bypassed (e.g., direct API call via web restart handler). Uses a named MAX_WORKER_ITERATIONS constant (500) matching worker/job.rs.

Critical #2 -- JobCreationParams struct:
Introduced JobCreationParams to bundle credential_grants, mcp_servers, and max_iterations. Removes both #[allow(clippy::too_many_arguments)] suppressions from create_job (7→5 args) and execute_sandbox (9→7 args). All callers updated.

Important #3 — MCP server name validation:
Added validation rejecting names containing /, \, \0, or exceeding 128 chars. Returns ContainerCreationFailed error with the invalid name.

Important #5 — Test: max_iterations ignored for ClaudeCode:
Added test_max_iterations_not_injected_for_claude_code verifying the mode == JobMode::Worker gate. Also added test_server_side_max_iterations_clamp and test_mcp_server_name_validation_rejects_path_separators.

@nick-stebbings nick-stebbings requested a review from zmanian March 28, 2026 07:47
zmanian previously approved these changes Mar 28, 2026
Collaborator

@zmanian zmanian left a comment

Re-review: APPROVE

All critical and important issues from previous reviews have been addressed across the four fix commits.

Verified fixes

Critical (all resolved):

  • Server-side clamp added in create_job_inner: iters.clamp(1, MAX_WORKER_ITERATIONS) enforces the cap even when tool parameter parsing is bypassed (e.g., direct API calls via the web restart handler). This was our primary blocking concern.
  • env = "IRONCLAW_MAX_ITERATIONS" added to the clap arg in src/cli/mod.rs, closing the dead-code bug where the env var was injected but never consumed.
  • JobCreationParams struct introduced, eliminating positional argument proliferation on create_job and execute_sandbox.

Important (all resolved):

  • MCP server name validation rejects path separators, null bytes, and names >128 chars.
  • Temp directory uses std::env::temp_dir() instead of hardcoded /tmp/.
  • Temp directory permissions hardened to 0700 on Unix.
  • Test added for ClaudeCode mode not injecting IRONCLAW_MAX_ITERATIONS.

Regression tests added (15 tests):

  • test_mcp_config_none_filter_returns_master_path
  • test_mcp_config_empty_filter_returns_none
  • test_mcp_config_missing_master_returns_none
  • test_mcp_config_filters_to_named_servers
  • test_mcp_config_no_match_returns_none
  • test_mcp_config_case_insensitive_match
  • test_max_iterations_env_var_injected
  • test_max_iterations_not_injected_for_claude_code
  • test_server_side_max_iterations_clamp
  • test_mcp_server_name_validation_rejects_path_separators
  • test_filtered_config_contains_only_requested_server
  • test_feature_flag_disabled_skips_mcp_filtering
  • test_temp_file_cleanup_removes_per_job_config
  • test_cleanup_job_is_idempotent
  • test_temp_dir_has_restrictive_permissions (unix-only)

Remaining minor items (non-blocking)

  1. Synchronous I/O in async context: generate_worker_mcp_config still uses std::fs rather than tokio::fs. Low risk for small config files but technically violates the project's async I/O convention. Fine as a follow-up.

  2. Source-scanning tests: test_max_iterations_env_var_injected and test_feature_flag_disabled_skips_mcp_filtering use include_str! to grep source code. These are brittle if formatting changes. Consider replacing with behavioral tests in a follow-up, but they serve their purpose for now.

  3. Dual MAX_WORKER_ITERATIONS constants: The 500 cap exists in both src/orchestrator/job_manager.rs and src/worker/job.rs. A shared constant in ironclaw_common would prevent drift. Non-blocking since the test_server_side_max_iterations_clamp test catches mismatches.

Co-Authored-By: Claude Opus 4.6 (1M context) noreply@anthropic.com

…TIONS

1. Convert generate_worker_mcp_config from sync std::fs to async tokio::fs.
   The function is called from async create_job_inner — sync I/O was blocking
   the tokio runtime thread. All test callers converted to #[tokio::test].

2. Move MAX_WORKER_ITERATIONS (500) to ironclaw_common as single source of
   truth. Both src/orchestrator/job_manager.rs and src/worker/job.rs now
   import from the shared crate, preventing drift.
@nick-stebbings
Contributor Author

Addressed remaining items in 79da64f:

  1. Async I/O: generate_worker_mcp_config now uses tokio::fs (read_to_string, create_dir_all, set_permissions, write, try_exists) instead of blocking std::fs. All test callers converted to #[tokio::test].

  2. Shared MAX_WORKER_ITERATIONS: Moved to ironclaw_common crate as single source of truth. Both job_manager.rs and worker/job.rs import from the shared crate — no more dual constants that could drift.

@nick-stebbings
Contributor Author

@zmanian Your last review (APPROVE) was dismissed by the rebase force-push. All critical and important issues from your four review rounds are addressed — could you re-approve when you get a chance? CI is all green.

Collaborator

@zmanian left a comment

Re-Review (4th round) -- APPROVE

All items from the previous 3 review cycles are resolved:

  • max_iterations: env var wired to clap, server-side clamp with shared MAX_WORKER_ITERATIONS constant
  • Temp file handling: std::env::temp_dir(), directory permissions 0o700, server name validation (rejects /, \, \0, >128 chars)
  • Async I/O: generate_worker_mcp_config converted to async with tokio::fs
  • JobCreationParams struct introduced
  • 15 regression tests added
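The server name validation summarized above might look roughly like the following. This is a sketch under stated assumptions, not the PR's code: `validate_server_name` and `MAX_SERVER_NAME_LEN` are hypothetical names, and the length is counted in characters because the review says ">128 chars".

```rust
/// Maximum accepted name length (128 per the review summary; the name is hypothetical).
pub const MAX_SERVER_NAME_LEN: usize = 128;

/// Rejects names containing path separators or NUL bytes, and names longer
/// than `MAX_SERVER_NAME_LEN` characters. Returns a short reason on failure.
pub fn validate_server_name(name: &str) -> Result<(), String> {
    if name.chars().count() > MAX_SERVER_NAME_LEN {
        return Err(format!(
            "server name exceeds {} characters",
            MAX_SERVER_NAME_LEN
        ));
    }
    if name.chars().any(|c| matches!(c, '/' | '\\' | '\0')) {
        return Err("server name contains a path separator or null byte".to_string());
    }
    Ok(())
}
```

Returning a `Result` rather than a `bool` lets the caller surface which rule failed; whether the PR does this is not stated in the thread.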

Suggestions (non-blocking)

  • cleanup_job still uses sync std::fs::remove_file in async context -- should use tokio::fs::remove_file for consistency
  • Tool parsing at job.rs:740 uses magic number 500 instead of MAX_WORKER_ITERATIONS constant -- could drift
  • Consider setting 0o600 on individual config files, not just the directory
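Two of these suggestions can be sketched together: creating a config file with mode 0o600, and removing it without a prior existence check. The helper names are hypothetical, the code is Unix-only, and it uses blocking `std::fs` purely to stay self-contained; in the async code path the thread recommends the `tokio::fs` equivalents instead.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::os::unix::fs::OpenOptionsExt;
use std::path::Path;

/// Writes `contents` to `path`, requesting mode 0o600 at creation time so the
/// file is never briefly readable by other users. The mode only applies when
/// the file is newly created and is still subject to the process umask.
pub fn write_private_config(path: &Path, contents: &str) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .mode(0o600)
        .open(path)?;
    file.write_all(contents.as_bytes())
}

/// Removes a per-job config file. A missing file counts as success, so the
/// call is idempotent and avoids a check-then-delete (TOCTOU) race.
pub fn remove_config_if_present(path: &Path) -> io::Result<()> {
    match fs::remove_file(path) {
        Ok(()) => Ok(()),
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()),
        Err(err) => Err(err),
    }
}
```

Setting the mode at `open` time, rather than calling `set_permissions` after the write, avoids a window in which the file exists with default permissions.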

@serrrfirat serrrfirat merged commit 81bb705 into nearai:staging Apr 1, 2026
14 checks passed
serrrfirat pushed a commit that referenced this pull request Apr 5, 2026
* feat(jobs): per-job MCP server filtering and max_iterations cap

Add mcp_servers and max_iterations optional params to create_job.
mcp_servers filters which MCP servers are mounted into worker
containers (gated behind MCP_PER_JOB_ENABLED, default false).
max_iterations caps the worker agent loop (default 50, max 500).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address review feedback on per-job MCP filtering

- Fix max_iterations dead code: add env = "IRONCLAW_MAX_ITERATIONS"
  to clap arg so worker CLI reads the env var injected by orchestrator
- Fix max_iterations: 0 allowed: use .clamp(1, 500) instead of .min(500)
- Replace hardcoded /tmp/ironclaw-mcp-configs with std::env::temp_dir()
- Make MCP server name matching case-insensitive
- Add test for case-insensitive matching
- Add test verifying max_iterations env var name matches clap definition

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address Copilot review feedback on per-job MCP filtering

- Guard IRONCLAW_MAX_ITERATIONS injection to Worker mode only (ClaudeCode uses max_turns)
- Extract WORKER_MCP_CONFIG_PATH as constant (no more hardcoded path)
- Fix TOCTOU race in cleanup_job: use remove_file directly, match on NotFound
- Fix schema_version default: 0 → 1 to match McpServersFile default
- Propagate serialization errors instead of silently writing empty config
- Add type validation warnings for mcp_servers and max_iterations params

* test: add regression tests and security hardening for per-job MCP filtering

Add 5 regression tests covering CI-required scenarios:
- Filtered config contains only the requested server (no leaks)
- Feature flag disabled skips MCP filtering entirely
- Temp file cleanup removes per-job config
- cleanup_job is idempotent (no panic on missing file/handle)
- Temp directory has restrictive 0o700 permissions (unix)

Security: set 0o700 permissions on /tmp/ironclaw-mcp-configs/ to prevent
other users on the host from reading filtered MCP server configs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address code review — server-side clamp, JobCreationParams, name validation

Critical:
1. Server-side max_iterations clamp in create_job_inner — defense no longer
   relies solely on tool parameter parsing. Uses MAX_WORKER_ITERATIONS constant
   (matching worker/job.rs) so the cap is enforced even for direct API calls.

2. Introduce JobCreationParams struct to bundle credential_grants, mcp_servers,
   and max_iterations. Removes #[allow(clippy::too_many_arguments)] from both
   create_job and execute_sandbox (7→5 and 9→7 positional args).

Important:
3. Validate MCP server names: reject path separators (/\), null bytes, and
   names longer than 128 chars to prevent future misuse.

5. Add test verifying max_iterations is NOT injected for ClaudeCode mode.
   Add test verifying server-side clamp uses MAX_WORKER_ITERATIONS constant.
   Add test verifying name validation rejects path separators and null bytes.

* fix: async I/O in generate_worker_mcp_config, shared MAX_WORKER_ITERATIONS

1. Convert generate_worker_mcp_config from sync std::fs to async tokio::fs.
   The function is called from async create_job_inner — sync I/O was blocking
   the tokio runtime thread. All test callers converted to #[tokio::test].

2. Move MAX_WORKER_ITERATIONS (500) to ironclaw_common as single source of
   truth. Both src/orchestrator/job_manager.rs and src/worker/job.rs now
   import from the shared crate, preventing drift.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Labels

  • contributor: experienced (6-19 merged PRs)
  • risk: medium (Business logic, config, or moderate-risk modules)
  • scope: channel/cli (TUI / CLI channel)
  • scope: channel/web (Web gateway channel)
  • scope: dependencies (Dependency updates)
  • scope: docs (Documentation)
  • scope: orchestrator (Container orchestrator)
  • scope: tool/builtin (Built-in tools)
  • scope: worker (Container worker)
  • size: XL (500+ changed lines)

4 participants