feat: add IRONCLAW_BASE_DIR env var with LazyLock caching #397

Merged
serrrfirat merged 2 commits into nearai:main from ibhagwan:feature/ironclaw-base-dir-env-override
Feb 27, 2026

Conversation

@ibhagwan
Contributor

This allows users to place their ironclaw data directory anywhere by setting the IRONCLAW_BASE_DIR environment variable instead of the hardcoded ~/.ironclaw path.

This also enables a multi-agent setup where each agent has their own base_dir with different config/database (if sqlite), etc.

Usage:
IRONCLAW_BASE_DIR=/custom/path ironclaw

Features:

  • Value computed once at startup and cached via LazyLock for thread safety
  • Empty string or null bytes in env var treated as unset (falls back to default)
  • Warns user if home directory cannot be determined (falls back to ./.ironclaw)
  • Warns user if IRONCLAW_BASE_DIR contains null bytes

This is useful for development, testing, or running ironclaw in environments where modifying HOME is not desirable.
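
The behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `resolve_base_dir` is a hypothetical helper, and reading `HOME` directly is an assumption standing in for whatever home-directory lookup the project uses.

```rust
use std::path::{Path, PathBuf};
use std::sync::LazyLock;

/// Name of the override variable, as given in the PR description.
const IRONCLAW_BASE_DIR_ENV: &str = "IRONCLAW_BASE_DIR";

/// Pure resolution step (hypothetical helper). It takes the raw env value and
/// the home directory as inputs, so it can be tested without touching the
/// process environment.
fn resolve_base_dir(env_value: Option<&str>, home: Option<&Path>) -> PathBuf {
    match env_value {
        // A non-empty value wins.
        Some(v) if !v.is_empty() => PathBuf::from(v),
        // Unset or empty: use <home>/.ironclaw, or ./.ironclaw without a home.
        _ => match home {
            Some(h) => h.join(".ironclaw"),
            None => {
                eprintln!("warning: home directory unknown, using ./.ironclaw");
                PathBuf::from(".").join(".ironclaw")
            }
        },
    }
}

/// Computed on first access, then shared by every thread.
static IRONCLAW_BASE_DIR: LazyLock<PathBuf> = LazyLock::new(|| {
    let env_value = std::env::var(IRONCLAW_BASE_DIR_ENV).ok();
    // Assumption: HOME stands in for the project's real home-directory lookup.
    let home = std::env::var_os("HOME").map(PathBuf::from);
    resolve_base_dir(env_value.as_deref(), home.as_deref())
});

/// Accessor returning the cached path.
pub fn ironclaw_base_dir() -> &'static Path {
    IRONCLAW_BASE_DIR.as_path()
}
```

Keeping the resolution logic in a pure function means the fallback rules can be checked with plain arguments, while the `LazyLock` only decides when the computation runs.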

@github-actions (bot) added the following labels on Feb 27, 2026: scope: channel/cli (TUI / CLI channel); scope: channel/web (Web gateway channel); scope: channel/wasm (WASM channel runtime); scope: tool/builtin (Built-in tools); size: L (200-499 changed lines); scope: tool/mcp (MCP client); scope: llm (LLM integration); scope: workspace (Persistent memory / workspace); scope: orchestrator (Container orchestrator); scope: config (Configuration); scope: setup (Onboarding / setup); scope: pairing (Pairing mode); risk: high (Safety, secrets, auth, or critical infrastructure); contributor: regular (2-5 merged PRs).
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the flexibility of IronClaw by externalizing the configuration of its base data directory. By introducing an environment variable and centralizing path resolution, it simplifies multi-agent deployments and provides greater control over data storage locations, making the application more adaptable to diverse operational environments.

Highlights

  • Configurable Base Directory: Introduced the IRONCLAW_BASE_DIR environment variable, allowing users to specify a custom data directory instead of the default ~/.ironclaw.
  • LazyLock Caching: Implemented LazyLock for the IRONCLAW_BASE_DIR to ensure the base directory path is computed only once at startup and cached for thread-safe access throughout the application's lifetime.
  • Robust Environment Variable Handling: Added logic to handle various IRONCLAW_BASE_DIR environment variable scenarios, including treating empty strings or paths containing null bytes as unset, and falling back to a default path with warnings if the home directory cannot be determined.
  • Centralized Path Resolution: Refactored numerous internal path resolutions across the codebase to consistently use the new ironclaw_base_dir() function, ensuring all data-related paths respect the configured base directory.
  • Comprehensive Testing: Added new unit tests to verify the correct behavior of ironclaw_base_dir() under different conditions, such as default usage, environment variable overrides, and handling of empty or special character paths.
Changelog
  • src/bootstrap.rs
    • Added IRONCLAW_BASE_DIR_ENV constant for the environment variable name.
    • Implemented LazyLock for IRONCLAW_BASE_DIR to cache the computed base directory.
    • Created compute_ironclaw_base_dir function to determine the base directory from the environment, handling empty strings and null bytes.
    • Developed default_base_dir function to provide a fallback path and log warnings if the home directory is unavailable.
    • Exported ironclaw_base_dir as a public function to retrieve the resolved base directory.
    • Updated ironclaw_env_path to use the new ironclaw_base_dir function.
    • Modified migrate_disk_to_db to utilize ironclaw_base_dir for resolving the IronClaw directory.
    • Introduced ENV_MUTEX for thread-safe environment variable manipulation in tests.
    • Added new tests for test_ironclaw_base_dir_default, test_ironclaw_base_dir_env_override, test_ironclaw_env_path_uses_base_dir, test_ironclaw_base_dir_empty_env, and test_ironclaw_base_dir_special_chars.
  • src/channels/repl.rs
    • Imported ironclaw_base_dir.
    • Updated history_path to use ironclaw_base_dir for path resolution.
  • src/channels/signal.rs
    • Imported ironclaw_base_dir.
    • Modified validate_attachment_paths to use ironclaw_base_dir for determining the sandbox base directory.
  • src/channels/wasm/loader.rs
    • Imported ironclaw_base_dir.
    • Updated default_channels_dir to use ironclaw_base_dir for path resolution.
  • src/channels/web/handlers/static_files.rs
    • Imported ironclaw_base_dir.
    • Modified serve_project_file to use ironclaw_base_dir for resolving project paths.
  • src/channels/web/server.rs
    • Imported ironclaw_base_dir.
    • Modified serve_project_file to use ironclaw_base_dir for resolving project paths.
  • src/cli/doctor.rs
    • Imported ironclaw_base_dir.
    • Updated check_workspace_dir to use ironclaw_base_dir for determining the workspace directory.
  • src/cli/status.rs
    • Imported ironclaw_base_dir.
    • Updated default_tools_dir and default_channels_dir to use ironclaw_base_dir for path resolution.
  • src/cli/tool.rs
    • Imported ironclaw_base_dir.
    • Updated default_tools_dir to use ironclaw_base_dir for path resolution.
  • src/config/channels.rs
    • Imported ironclaw_base_dir.
    • Updated default_channels_dir to use ironclaw_base_dir for path resolution.
  • src/config/database.rs
    • Imported ironclaw_base_dir.
    • Updated default_libsql_path to use ironclaw_base_dir for path resolution.
  • src/config/hygiene.rs
    • Imported ironclaw_base_dir.
    • Modified HygieneConfig::build to use ironclaw_base_dir for the state directory.
  • src/config/llm.rs
    • Imported ironclaw_base_dir.
    • Updated default_session_path to use ironclaw_base_dir for path resolution.
  • src/config/skills.rs
    • Imported ironclaw_base_dir.
    • Updated default_skills_dir and default_installed_skills_dir to use ironclaw_base_dir for path resolution.
  • src/config/wasm.rs
    • Imported ironclaw_base_dir.
    • Updated default_tools_dir to use ironclaw_base_dir for path resolution.
  • src/llm/session.rs
    • Imported ironclaw_base_dir.
    • Updated default_session_path to use ironclaw_base_dir for path resolution.
  • src/orchestrator/job_manager.rs
    • Imported ironclaw_base_dir.
    • Modified validate_bind_mount_path to use ironclaw_base_dir for resolving project base paths.
    • Updated test_validate_bind_mount_valid_path to use ironclaw_base_dir.
  • src/pairing/store.rs
    • Imported ironclaw_base_dir.
    • Updated default_pairing_dir to use ironclaw_base_dir for path resolution.
  • src/registry/installer.rs
    • Imported ironclaw_base_dir.
    • Modified RegistryInstaller::with_defaults to use ironclaw_base_dir for tools and channels directories.
  • src/service.rs
    • Imported ironclaw_base_dir.
    • Updated ironclaw_logs_dir to use ironclaw_base_dir for path resolution.
  • src/settings.rs
    • Imported ironclaw_base_dir.
    • Updated default_path and default_toml_path to use ironclaw_base_dir for settings file paths.
  • src/setup/wizard.rs
    • Imported ironclaw_base_dir.
    • Modified SetupWizard methods to use ironclaw_base_dir for resolving channels and tools directories during setup.
    • Updated install_selected_registry_channels to use ironclaw_base_dir.
  • src/tools/builtin/job.rs
    • Imported ironclaw_base_dir.
    • Updated projects_base to use ironclaw_base_dir for path resolution.
  • src/tools/builtin/message.rs
    • Imported ironclaw_base_dir.
    • Modified MessageTool::new to use ironclaw_base_dir for its base directory.
  • src/tools/mcp/config.rs
    • Imported ironclaw_base_dir.
    • Updated default_config_path to use ironclaw_base_dir for path resolution.
  • src/workspace/hygiene.rs
    • Imported ironclaw_base_dir.
    • Modified HygieneConfig::default to use ironclaw_base_dir for the state directory.
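
The call-site changes in the changelog share one shape: a default-path function that joins a fixed relative name onto the resolved base directory. A rough sketch of that pattern follows. The function names echo the changelog, but taking the base directory as a parameter and the subdirectory and file names (`tools`, `channels`, `ironclaw.db`) are assumptions made for illustration.

```rust
use std::path::{Path, PathBuf};

// Hypothetical call-site helpers. Each takes the resolved base directory and
// appends a fixed relative name. The relative names are illustrative only.
fn default_tools_dir(base: &Path) -> PathBuf {
    base.join("tools")
}

fn default_channels_dir(base: &Path) -> PathBuf {
    base.join("channels")
}

fn default_libsql_path(base: &Path) -> PathBuf {
    base.join("ironclaw.db")
}
```

Because every path hangs off one base directory, two agents started with different `IRONCLAW_BASE_DIR` values end up with disjoint tools, channels, and database locations.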

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a valuable feature by allowing the IronClaw base directory to be configured via the IRONCLAW_BASE_DIR environment variable. This greatly improves flexibility for development, testing, and multi-agent setups. The implementation is robust, using std::sync::LazyLock for efficient, thread-safe initialization of the base path. The refactoring across the codebase to use the new centralized ironclaw_base_dir() function is extensive and well-executed. The added tests are comprehensive and correctly use a mutex to handle the non-thread-safe nature of environment variable manipulation. I have one suggestion regarding the unnecessary use of unsafe blocks in the tests, which I've detailed in a specific comment.

Comment thread src/bootstrap.rs
@ibhagwan
Contributor Author

@serrrfirat, I used your skill to review the code; let's see if the review finds fewer issues this time.

Collaborator

@serrrfirat serrrfirat left a comment


Summary

This is a well-structured refactoring PR that correctly centralizes ~26 instances of duplicated base-directory computation into a single ironclaw_base_dir() function with LazyLock caching. The core logic is sound and the code quality is good. The main concerns are: (1) test reliability — the ENV_MUTEX doesn't prevent the LazyLock from being poisoned by concurrent test threads in other modules, and a signal channel test still uses dirs::home_dir() instead of the new API; (2) the silent fallback to a CWD-relative path (./.ironclaw) when the home directory is unavailable changes the error behavior in security-sensitive code paths like Docker bind-mount validation and daemon log directory resolution; (3) the null-byte check on the env var is dead code since std::env::var() cannot return strings with embedded null bytes. None of these are critical bugs, but the test reliability issue (f-1) and the sandbox validation change (f-2) deserve attention before merging.

Comment thread src/bootstrap.rs
.collect();
assert_eq!(parsed.len(), 2);
let onboard = parsed.iter().find(|(k, _)| k == "ONBOARD_COMPLETED");
assert!(onboard.is_some(), "ONBOARD_COMPLETED must be present");
Collaborator


Test env-var manipulation races with LazyLock initialization in other test threads

The tests use unsafe { std::env::set_var("IRONCLAW_BASE_DIR", ...) } and protect against inter-test races with ENV_MUTEX. However, ENV_MUTEX only synchronizes tests within this module. When cargo test runs, tests from other modules execute in parallel threads. If another module's test calls ironclaw_base_dir() for the first time while a bootstrap test has temporarily set IRONCLAW_BASE_DIR to a test value, the LazyLock will be permanently initialized with that test value. All subsequent calls to ironclaw_base_dir() from any test in the process will return the wrong path, causing non-deterministic test failures. This is a classic test-order-dependent flaky test pattern.

Suggested fix:

Either: (1) Run bootstrap env-var tests in a separate test binary via `[[test]]` in Cargo.toml with `harness = false` so they have their own process. (2) Or avoid relying on the LazyLock at all in production code that's also used in tests — e.g., make `ironclaw_base_dir()` take an optional override parameter, or use a test-specific initialization mechanism. The current approach of testing only `compute_ironclaw_base_dir()` avoids corrupting the LazyLock itself, but `test_validate_bind_mount_valid_path` in `job_manager.rs` (line 617) calls `ironclaw_base_dir()` which initializes the LazyLock, and this test runs in the same process.

Severity: medium · Confidence: high
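
The failure mode described in this comment can be reproduced in miniature. In the sketch below, a `Mutex<Option<String>>` named `FAKE_ENV` stands in for the process environment (an assumption made so the example does not need `std::env::set_var`); all names are hypothetical.

```rust
use std::sync::{LazyLock, Mutex};

// Stand-in for the process environment, so the example avoids set_var.
static FAKE_ENV: Mutex<Option<String>> = Mutex::new(None);

// Initialized on first access with whatever FAKE_ENV holds at that moment.
static CACHED_BASE_DIR: LazyLock<String> = LazyLock::new(|| {
    FAKE_ENV
        .lock()
        .unwrap()
        .clone()
        .unwrap_or_else(|| "~/.ironclaw".to_string())
});

fn set_fake_env(value: Option<&str>) {
    *FAKE_ENV.lock().unwrap() = value.map(str::to_owned);
}

fn cached_base_dir() -> &'static str {
    CACHED_BASE_DIR.as_str()
}

// A "test" that temporarily overrides the value, the way the bootstrap tests do.
fn simulate_race() -> (String, String) {
    set_fake_env(Some("/tmp/test-override"));
    let during = cached_base_dir().to_string(); // first access happens here
    set_fake_env(None); // the test restores the environment...
    let after = cached_base_dir().to_string(); // ...but the cache keeps the old value
    (during, after)
}
```

Once the first access happens while the override is in place, restoring the override has no effect on later reads, which is why a test that only touches the pure computation function avoids the problem.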

Comment thread src/orchestrator/job_manager.rs
Comment thread src/bootstrap.rs Outdated
Comment thread src/bootstrap.rs
Comment thread src/bootstrap.rs
Comment thread src/service.rs Outdated
Comment thread src/channels/signal.rs
Comment thread src/bootstrap.rs
ibhagwan added a commit to ibhagwan/ironclaw that referenced this pull request Feb 27, 2026
- Make compute_ironclaw_base_dir() public for use in tests
- Add absolute path check in validate_bind_mount_path for security
- Add warning for relative IRONCLAW_BASE_DIR paths
- Remove unreachable null-byte check (std::env::var cannot contain nulls)
- Rename misleading test name to test_compute_base_dir_env_path_join
- Change ironclaw_logs_dir() return type to PathBuf (cannot fail anymore)
- Update signal test to use ironclaw_base_dir() instead of dirs::home_dir()
- Fix fallback to use current_dir() instead of "." for predictability
- Add SAFETY comments to unsafe env var operations in tests
@ibhagwan ibhagwan force-pushed the feature/ironclaw-base-dir-env-override branch from 1110b52 to a5e3e2f Compare February 27, 2026 13:13
@ibhagwan ibhagwan force-pushed the feature/ironclaw-base-dir-env-override branch from a5e3e2f to 1ab9a4f Compare February 27, 2026 13:15
Collaborator

@serrrfirat serrrfirat left a comment


All review feedback from the previous round has been thoroughly addressed in commit 1ab9a4f:

  • LazyLock test safety: compute_ironclaw_base_dir() made public for tests; job_manager.rs test avoids LazyLock initialization — fixes the cross-module race.
  • Sandbox security: Absolute path check added in validate_bind_mount_path — hard fail instead of silent fallback.
  • Dead code removed: Unreachable null-byte check replaced with relative path warning.
  • API cleanup: ironclaw_logs_dir() simplified to return PathBuf (infallible), signal test updated to use ironclaw_base_dir(), test renamed for accuracy.
  • Fallback improved: current_dir() + /tmp instead of PathBuf::from(".").

LGTM — all 8 findings resolved. Approving.
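
Two of the fixes above lend themselves to a short sketch: rejecting a relative base directory before it is used for bind-mount validation, and falling back to the current directory, then `/tmp`, when no home directory is available. The function names, the `projects` subdirectory, the error type, and the reading of "current_dir() + /tmp" as "try `current_dir()`, else `/tmp`" are all assumptions; the merged code may differ.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hard-fail on a relative base directory instead of silently resolving it
/// against the current working directory. Hypothetical helper.
fn projects_base(base: &Path) -> Result<PathBuf, String> {
    if !base.is_absolute() {
        return Err(format!(
            "base directory {} is not absolute; refusing to validate bind mounts",
            base.display()
        ));
    }
    // "projects" is an illustrative subdirectory name.
    Ok(base.join("projects"))
}

/// Fallback when the home directory is unknown: prefer the current directory,
/// and use /tmp if even that cannot be determined. Takes the current_dir()
/// result as a parameter so both branches are easy to exercise.
fn fallback_base_dir(cwd: io::Result<PathBuf>) -> PathBuf {
    cwd.unwrap_or_else(|_| PathBuf::from("/tmp")).join(".ironclaw")
}
```

In real use the second function would be called as `fallback_base_dir(std::env::current_dir())`, which yields an absolute path in both branches on Unix-like systems.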

@ibhagwan
Contributor Author

All review feedback from the previous round has been thoroughly addressed in commit 1ab9a4f:

  • LazyLock test safety: compute_ironclaw_base_dir() made public for tests; job_manager.rs test avoids LazyLock initialization — fixes the cross-module race.
  • Sandbox security: Absolute path check added in validate_bind_mount_path — hard fail instead of silent fallback.
  • Dead code removed: Unreachable null-byte check replaced with relative path warning.
  • API cleanup: ironclaw_logs_dir() simplified to return PathBuf (infallible), signal test updated to use ironclaw_base_dir(), test renamed for accuracy.
  • Fallback improved: current_dir() + /tmp instead of PathBuf::from(".").

LGTM — all 8 findings resolved. Approving.

Ty @serrrfirat. I am in the process of addressing the unsafe in the tests by refactoring the code to use OnceLock instead of LazyLock. It's not a big change, so it's up to you whether to wait for this commit or merge as is.

I'm referring to this review comment (which I left unresolved until I commit):

The tests use unsafe { std::env::set_var("IRONCLAW_BASE_DIR", ...) } and protect against inter-test races with ENV_MUTEX. However, ENV_MUTEX only synchronizes tests within this module. When cargo test runs, tests from other modules execute in parallel threads. If another module's test calls ironclaw_base_dir() for the first time while a bootstrap test has temporarily set IRONCLAW_BASE_DIR to a test value, the LazyLock will be permanently initialized with that test value. All subsequent calls to ironclaw_base_dir() from any test in the process will return the wrong path, causing non-deterministic test failures. This is a classic test-order-dependent flaky test pattern.
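
The OnceLock refactor mentioned here is not shown in this thread. One way such a change could look is sketched below, purely as an illustration: the names `init_base_dir` and `base_dir` and the relative default are hypothetical. The idea is that a caller (or a test) can set the value explicitly once, instead of steering it through the environment.

```rust
use std::path::{Path, PathBuf};
use std::sync::OnceLock;

static BASE_DIR: OnceLock<PathBuf> = OnceLock::new();

/// Set the base directory explicitly. Fails, returning the rejected path,
/// if a value has already been stored.
pub fn init_base_dir(path: PathBuf) -> Result<(), PathBuf> {
    BASE_DIR.set(path)
}

/// Read the base directory, initializing it with a default on first use.
/// The default here is a placeholder; real code would consult the
/// environment and home directory at this point.
pub fn base_dir() -> &'static Path {
    BASE_DIR.get_or_init(|| PathBuf::from(".ironclaw"))
}
```

With this shape, a test could pass a temporary directory to `init_base_dir` without calling `std::env::set_var`. The value is still process-wide and write-once, so tests that need different directories would still have to run in separate processes or work against the pure computation function.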

@serrrfirat serrrfirat merged commit c592a8f into nearai:main Feb 27, 2026
14 checks passed
@ibhagwan
Contributor Author

Ty @serrrfirat for merging, lmk if you still want me to push the LazyLock -> OnceLock refactor as a new PR.

bkutasi pushed a commit to bkutasi/ironclaw that referenced this pull request Mar 28, 2026
