feat: add IRONCLAW_BASE_DIR env var with LazyLock caching#397
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the flexibility of IronClaw by externalizing the configuration of its base data directory. By introducing an environment variable and centralizing path resolution, it simplifies multi-agent deployments and provides greater control over data storage locations, making the application more adaptable to diverse operational environments. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a valuable feature by allowing the IronClaw base directory to be configured via the IRONCLAW_BASE_DIR environment variable. This greatly improves flexibility for development, testing, and multi-agent setups. The implementation is robust, using std::sync::LazyLock for efficient, thread-safe initialization of the base path. The refactoring across the codebase to use the new centralized ironclaw_base_dir() function is extensive and well-executed. The added tests are comprehensive and correctly use a mutex to handle the non-thread-safe nature of environment variable manipulation. I have one suggestion regarding the unnecessary use of unsafe blocks in the tests, which I've detailed in a specific comment.
|
@serrrfirat, used your skill to review the code, let's see if this time the review finds less stuff. |
serrrfirat
left a comment
There was a problem hiding this comment.
Summary
This is a well-structured refactoring PR that correctly centralizes ~26 instances of duplicated base-directory computation into a single ironclaw_base_dir() function with LazyLock caching. The core logic is sound and the code quality is good. The main concerns are: (1) test reliability — the ENV_MUTEX doesn't prevent the LazyLock from being poisoned by concurrent test threads in other modules, and a signal channel test still uses dirs::home_dir() instead of the new API; (2) the silent fallback to a CWD-relative path (./.ironclaw) when the home directory is unavailable changes the error behavior in security-sensitive code paths like Docker bind-mount validation and daemon log directory resolution; (3) the null-byte check on the env var is dead code since std::env::var() cannot return strings with embedded null bytes. None of these are critical bugs, but the test reliability issue (f-1) and the sandbox validation change (f-2) deserve attention before merging.
| .collect(); | ||
| assert_eq!(parsed.len(), 2); | ||
| let onboard = parsed.iter().find(|(k, _)| k == "ONBOARD_COMPLETED"); | ||
| assert!(onboard.is_some(), "ONBOARD_COMPLETED must be present"); |
There was a problem hiding this comment.
Test env-var manipulation races with LazyLock initialization in other test threads
The tests use unsafe { std::env::set_var("IRONCLAW_BASE_DIR", ...) } and protect against inter-test races with ENV_MUTEX. However, ENV_MUTEX only synchronizes tests within this module. When cargo test runs, tests from other modules execute in parallel threads. If another module's test calls ironclaw_base_dir() for the first time while a bootstrap test has temporarily set IRONCLAW_BASE_DIR to a test value, the LazyLock will be permanently initialized with that test value. All subsequent calls to ironclaw_base_dir() from any test in the process will return the wrong path, causing non-deterministic test failures. This is a classic test-order-dependent flaky test pattern.
Suggested fix:
Either: (1) Run bootstrap env-var tests in a separate test binary via `[[test]]` in Cargo.toml with `harness = false` so they have their own process. (2) Or avoid relying on the LazyLock at all in production code that's also used in tests — e.g., make `ironclaw_base_dir()` take an optional override parameter, or use a test-specific initialization mechanism. The current approach of testing only `compute_ironclaw_base_dir()` avoids corrupting the LazyLock itself, but `test_validate_bind_mount_valid_path` in `job_manager.rs` (line 617) calls `ironclaw_base_dir()` which initializes the LazyLock, and this test runs in the same process.Severity: medium · Confidence: high
This allows users to place their ironclaw data directory anywhere by setting the IRONCLAW_BASE_DIR environment variable instead of the hardcoded ~/.ironclaw path. Usage: IRONCLAW_BASE_DIR=/custom/path ironclaw Features: - Value computed once at startup and cached via LazyLock for thread safety - Empty string or null bytes in env var treated as unset (falls back to default) - Warns user if home directory cannot be determined (falls back to ./.ironclaw) - Warns user if IRONCLAW_BASE_DIR contains null bytes This is useful for development, testing, or running ironclaw in environments where modifying HOME is not desirable.
- Make compute_ironclaw_base_dir() public for use in tests - Add absolute path check in validate_bind_mount_path for security - Add warning for relative IRONCLAW_BASE_DIR paths - Remove unreachable null-byte check (std::env::var cannot contain nulls) - Rename misleading test name to test_compute_base_dir_env_path_join - Change ironclaw_logs_dir() return type to PathBuf (cannot fail anymore) - Update signal test to use ironclaw_base_dir() instead of dirs::home_dir() - Fix fallback to use current_dir() instead of "." for predictability - Add SAFETY comments to unsafe env var operations in tests
1110b52 to
a5e3e2f
Compare
- Make compute_ironclaw_base_dir() public for use in tests - Add absolute path check in validate_bind_mount_path for security - Add warning for relative IRONCLAW_BASE_DIR paths - Remove unreachable null-byte check (std::env::var cannot contain nulls) - Rename misleading test name to test_compute_base_dir_env_path_join - Change ironclaw_logs_dir() return type to PathBuf (cannot fail anymore) - Update signal test to use ironclaw_base_dir() instead of dirs::home_dir() - Fix fallback to use current_dir() instead of "." for predictability - Add SAFETY comments to unsafe env var operations in tests
a5e3e2f to
1ab9a4f
Compare
serrrfirat
left a comment
There was a problem hiding this comment.
All review feedback from the previous round has been thoroughly addressed in commit 1ab9a4f:
- LazyLock test safety:
compute_ironclaw_base_dir()made public for tests;job_manager.rstest avoids LazyLock initialization — fixes the cross-module race. - Sandbox security: Absolute path check added in
validate_bind_mount_path— hard fail instead of silent fallback. - Dead code removed: Unreachable null-byte check replaced with relative path warning.
- API cleanup:
ironclaw_logs_dir()simplified to returnPathBuf(infallible), signal test updated to useironclaw_base_dir(), test renamed for accuracy. - Fallback improved:
current_dir()+/tmpinstead ofPathBuf::from(".").
LGTM — all 8 findings resolved. Approving.
Ty @serrrfirat, I am just in the process of addressing the I'm referring to this review comment (which I left unresolved until I commit):
|
|
Ty @serrrfirat for merging, lmk if you still want to to push the LazyLock -> OnceLock refactor as a new PR? |
This allows users to place their ironclaw data directory anywhere by setting the IRONCLAW_BASE_DIR environment variable instead of the hardcoded ~/.ironclaw path.
This also enables a multi-agent setup where each agent has their own base_dir with different config/database (if sqlite), etc.
Usage:
IRONCLAW_BASE_DIR=/custom/path ironclaw
Features:
This is useful for development, testing, or running ironclaw in environments where modifying HOME is not desirable.