Improve test infrastructure: StubChannel, gateway helpers, security tests, search edge cases by zmanian · Pull Request #623 · nearai/ironclaw

zmanian · 2026-03-06T18:52:57Z

Summary

Implements testing improvements inspired by OpenClaw's test architecture (see #466).

Batch 1: StubChannel + Test Tier Separation

StubChannel test double for the Channel trait (mpsc-based message injection, response capture)
Wired into TestHarnessBuilder for easy test setup
6 ChannelManager unit tests using StubChannel (first-ever tests for this module)
Integration tests gated behind integration feature flag, replacing silent try_connect() skip pattern
Test tier documentation added to CLAUDE.md

Batch 2: Gateway Helpers, Security Tests, Search Edge Cases

TestGatewayBuilder eliminates 19-field GatewayState struct duplication across 2 integration test files
11 security regression tests for skill installer: ZIP extraction safety (path traversal, nested paths, oversized entries) and SSRF prevention (private IPs, loopback, link-local, IPv4-mapped IPv6, restricted hosts)
8 RRF search edge case tests: empty inputs, single-method results, duplicate chunk merging, limit=0, min_score filtering, config mode builders
Architecture boundary check script (scripts/check-boundaries.sh): detects direct DB driver usage outside db layer, unwrap/expect in production, env var reads outside config

Batch 3: SSRF Fix, WASM Loader Tests, Skills Limits Tests

IPv6 SSRF bypass fix: validate_fetch_url() used host_str() which returns bracketed IPv6 that IpAddr::parse() cannot handle. Switched to url::Host enum matching for correct IP extraction
6 WASM loader security tests: tool name path separator rejection, empty name rejection, nonexistent file handling, invalid WASM bytes rejection, dotfile discovery, subdirectory non-recursion
Activation criteria limits test: verifies enforce_limits() trims excess patterns (>5), keywords (>20), and filters short keywords/tags (<3 chars)

Test plan

cargo test --lib -- all 1882 tests pass (up from 1848)
cargo clippy --all --tests -- -D warnings -- zero warnings
cargo test --test ws_gateway_integration -- 10 tests pass
cargo test --test openai_compat_integration -- 16 tests pass
bash scripts/check-boundaries.sh -- runs, reports known violations as warnings

Generated with Claude Code

Adds StubChannel to src/testing.rs alongside StubLlm. Supports message injection via mpsc sender, response/status capture, and configurable health check toggling. Includes handle methods for use after ownership transfer to ChannelManager. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add with_stub_channel() builder method that creates a StubChannel pre-registered in a ChannelManager. Tests can inject messages via the sender and verify routing through the manager. The channel field on TestHarness is Optional, defaulting to None for backward compat. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace silent try_connect() skip pattern with explicit feature gating. cargo test now runs only self-contained tests. cargo test --features integration runs tests requiring PostgreSQL. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Cover add/start_all stream merging, respond routing, unknown channel errors, health_check_all with mixed health, empty-channels error path, and injection channel merging -- all via StubChannel test double. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Grep-based checks for three architecture boundaries: - Direct database driver usage (tokio_postgres/libsql) outside src/db/ - .unwrap()/.expect() in production code (warning only) - Direct std::env::var reads outside config layer (warning only) The DB driver check is a hard violation; the other two are warnings for gradual cleanup. Run with: bash scripts/check-boundaries.sh Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…onfig modes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… protections Add 11 regression tests covering the security controls in skill_tools: ZIP extraction safety: - Valid SKILL.md extraction works correctly - Non-SKILL.md entries are ignored (returns error) - Path traversal entries (../../SKILL.md) do not match - Nested path entries (subdir/SKILL.md) do not match - Oversized entries (>1MB uncompressed) are rejected SSRF prevention: - Loopback addresses (127.0.0.1) are blocked - Private ranges (10.x, 172.16.x, 192.168.x) are blocked - Link-local addresses (169.254.x) are blocked - Public IPs (8.8.8.8, 1.1.1.1) are allowed - IPv4-mapped IPv6 unwrapping logic works correctly - Metadata endpoints and .internal/.local hostnames are blocked - Normal hostnames (github.com, clawhub.dev) are allowed Also documents a known gap: url::Url::host_str() returns bracketed IPv6 addresses that std::net::IpAddr cannot parse, so IPv4-mapped IPv6 URLs currently bypass IP-based checks in validate_fetch_url. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…st duplication Both ws_gateway_integration.rs and openai_compat_integration.rs manually constructed GatewayState with 19+ fields. Extracted to a shared builder in src/channels/web/test_helpers.rs that provides sensible defaults and lets tests override only what they need. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

validate_fetch_url used host_str() which returns bracketed IPv6 (e.g. "[::ffff:7f00:1]") that IpAddr::parse() cannot handle, silently skipping IP-based SSRF checks for all IPv6 URLs. Switch to url::Host enum matching to extract proper IpAddr values without string parsing. IPv4-mapped IPv6 addresses like ::ffff:127.0.0.1 are now correctly unwrapped and blocked. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Adds test_activation_criteria_enforce_limits to verify that enforce_limits() correctly trims excess patterns (>5), keywords (>20), and tags (>10), and filters out short keywords/tags (<3 chars). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add 6 tests covering: tool name path separator rejection, empty name rejection, nonexistent file handling, invalid WASM bytes rejection, dotfile discovery behavior, and subdirectory non-recursion. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Remove plan files from repo (ilblackdragon review) - Replace CLAUDE.md test tier rules with pointer to check-boundaries.sh - Add Check 4 to check-boundaries.sh: enforces integration tests are gated behind the 'integration' feature flag Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Check 5 catches try_connect() and similar silent-skip patterns in integration tests. Tests should use feature gates to fail loudly when prerequisites are missing, not silently return. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge origin/main into the PR branch, resolving merge conflicts in src/testing.rs by keeping both the branch's StubChannel/harness tests and main's database CRUD coverage tests. Fix the Tests (all-features) CI failure caused by workspace_integration tests trying to connect to PostgreSQL when none is available. The --all-features flag enables the integration feature, which gates these tests, but CI has no PostgreSQL service. Changed the CI test matrix to use explicit feature flags instead of --all-features, excluding the integration feature (which requires a running database). Also applied cargo fmt to files touched by the merge. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

gemini-code-assist · 2026-03-06T18:53:26Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the project's testing capabilities and security posture. It introduces new test doubles and builders to simplify writing robust tests, particularly for channel management and web gateway interactions. A new script enforces architectural boundaries, promoting code quality and maintainability. Crucially, it addresses several security vulnerabilities and edge cases through dedicated regression tests for skill installation, URL fetching (SSRF), and WASM tool loading, ensuring the system is more resilient against malicious inputs and configurations. The changes also improve the reliability of search functionality by covering various edge cases in reciprocal rank fusion.

Highlights

Testing Infrastructure: Introduced a StubChannel test double for the Channel trait, enabling easier message injection and response capture in tests. This is integrated into TestHarnessBuilder for streamlined test setup.
ChannelManager Unit Tests: Added 6 new unit tests for ChannelManager, leveraging the StubChannel to provide comprehensive coverage for this module for the first time.
Integration Test Gating: Refactored integration tests to be gated behind an integration feature flag, replacing previous silent try_connect() skip patterns and improving test tier separation. Documentation for test tiers was added to CLAUDE.md.
Gateway Test Helpers: Created TestGatewayBuilder to simplify the setup of GatewayState in integration tests, reducing boilerplate and duplication across test files.
Security Regression Tests (Skill Installer): Implemented 11 new security regression tests for the skill installer, covering ZIP extraction safety (path traversal, nested paths, oversized entries) and SSRF prevention (private IPs, loopback, link-local, IPv4-mapped IPv6, restricted hosts).
RRF Search Edge Case Tests: Added 8 new tests for Reciprocal Rank Fusion (RRF) search, addressing edge cases such as empty inputs, single-method results, duplicate chunk merging, limit=0, min_score filtering, and config mode builders.
Architecture Boundary Check Script: Introduced a new scripts/check-boundaries.sh script to enforce architectural rules, detecting direct database driver usage outside the db layer, unwrap()/expect() in production code, and environment variable reads outside config/bootstrap layers.
IPv6 SSRF Bypass Fix: Fixed an IPv6 SSRF bypass vulnerability in validate_fetch_url() by switching to url::Host enum matching for correct IP extraction, addressing issues with bracketed IPv6 addresses.
WASM Loader Security Tests: Added 6 security tests for the WASM loader, verifying rejection of tool names with path separators, empty names, handling of nonexistent files, invalid WASM bytes, dotfile discovery, and subdirectory non-recursion.
Activation Criteria Limits Test: Included a test to verify that enforce_limits() correctly trims excess activation patterns, keywords, and filters short keywords/tags.

Changelog

CLAUDE.md
- Updated documentation to include a new 'Test Tiers' section, detailing unit, integration, and live testing commands and dependencies.
- Added instructions to run the new check-boundaries.sh script.
Cargo.lock
- Updated the syn dependency from version 2.0.114 to 2.0.117.
scripts/check-boundaries.sh
- Added a new shell script to perform architecture boundary checks, including direct database driver usage, unwrap()/expect() in production, direct environment variable reads, test tier gating, and silent test-skip patterns.
src/channels/manager.rs
- Added a new test module with 6 unit tests for ChannelManager, covering adding channels, starting all channels, responding to channels, handling unknown channels, health checks, and message injection.
src/channels/web/mod.rs
- Exported the new test_helpers module to provide shared utilities for gateway integration tests.
src/channels/web/test_helpers.rs
- Added a new module containing TestGatewayBuilder, a builder for constructing GatewayState with sensible test defaults and methods to start a test server.
src/skills/mod.rs
- Added a new test case test_activation_criteria_enforce_limits to verify that activation criteria (keywords, patterns, tags) are correctly limited and filtered based on predefined caps and minimum lengths.
src/testing.rs
- Introduced StubChannel, a configurable test double for the Channel trait, allowing message injection and response/status capture.
- Updated TestHarness to include an optional channel field for StubChannel and ChannelManager.
- Modified TestHarnessBuilder with a with_stub_channel() method to easily integrate StubChannel into test harnesses.
- Added new tests for StubChannel functionality (injecting messages, capturing responses, health checks) and for TestHarness integration with StubChannel.
src/tools/builtin/skill_tools.rs
- Modified validate_fetch_url to use url::Host for more robust IP address parsing, specifically to correctly handle bracketed IPv6 addresses and prevent SSRF bypasses.
- Updated the host string conversion for internal hostname checks to use host.to_string().to_lowercase().
- Added extensive security regression tests for ZIP extraction, covering valid skill extraction, ignoring non-skill entries, rejecting path traversal, nested paths, and oversized entries.
- Added comprehensive SSRF prevention regression tests, blocking loopback, private ranges, link-local, IPv4-mapped IPv6, and restricted internal hostnames, while allowing public IPs.
src/tools/wasm/loader.rs
- Added a suite of security regression tests for the WASM tool loader, including checks for tool names with path separators, empty names, nonexistent WASM files, invalid WASM bytes, dotfile handling, and non-recursive subdirectory discovery.
src/workspace/search.rs
- Added new edge case tests for the reciprocal rank fusion (RRF) search algorithm, covering scenarios like empty input lists, FTS-only or vector-only results, merging duplicate chunks, handling a limit of zero, and filtering by minimum score.
tests/heartbeat_integration.rs
- Updated the cfg attribute to all(feature = "postgres", feature = "integration") to explicitly gate the integration test behind both PostgreSQL and the general integration feature flag.
tests/openai_compat_integration.rs
- Refactored start_test_server_with_provider and the test_no_llm_provider_returns_503 test to utilize the new TestGatewayBuilder for creating GatewayState instances, reducing boilerplate.
tests/workspace_integration.rs
- Updated the cfg attribute to all(feature = "postgres", feature = "integration") for all integration tests.
- Removed the try_connect() helper function and its calls within tests, relying on the new feature flag gating for conditional test execution.

Ignored Files

Ignored by pattern: .github/workflows/** (1)
- .github/workflows/test.yml

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request significantly improves the test infrastructure with StubChannel and TestGatewayBuilder, and introduces a new test tiering system. It also enhances security by addressing an IPv6 SSRF bypass in validate_fetch_url and adding comprehensive regression tests for ZIP extraction and WASM loading. However, a high-severity vulnerability was identified: the SSRF protection remains incomplete as it does not account for domain names resolving to private IP addresses, posing a DNS-based SSRF bypass risk. Additionally, a suggestion was made to improve the robustness of the check-boundaries.sh script.

gemini-code-assist · 2026-03-06T18:57:48Z

+    let raw_ip = match &host {
+        url::Host::Ipv4(v4) => Some(std::net::IpAddr::V4(*v4)),
+        url::Host::Ipv6(v6) => Some(std::net::IpAddr::V6(*v6)),
+        url::Host::Domain(_) => None,
+    };
+    if let Some(raw_ip) = raw_ip {


The validate_fetch_url function is vulnerable to SSRF via DNS resolution. While it correctly checks for private/loopback IP addresses when the host is provided as an IP, it skips these checks when a domain name is used (line 407). An attacker can bypass the SSRF protection by using a domain name that resolves to a private IP address (e.g., 127.0.0.1.nip.io).

To remediate this, you should resolve the domain name to its IP addresses and validate each IP against the restricted ranges before allowing the request to proceed. Alternatively, configure the HTTP client with a custom resolver or socket factory that enforces these restrictions at the connection level.

Addressed in commit ced6b96 (already on the branch before this review round). The build_safe_fetch_client function now performs async DNS resolution via tokio::net::lookup_host and validates all resolved IPs against private/loopback/link-local ranges through validate_resolved_addrs. This prevents DNS-based SSRF bypass via domains like 127.0.0.1.nip.io. The HTTP client is also configured with resolve_to_addrs to pin the connection to the validated IPs, preventing TOCTOU races between resolution and connection.

Addressed in ced6b96. The skill fetch path now resolves hostnames before connecting, rejects any private/loopback/link-local result, and pins the validated addresses into reqwest with resolve_to_addrs so the request cannot re-resolve to a different target.

gemini-code-assist · 2026-03-06T18:57:48Z

+tier_violations=""
+for test_file in tests/*.rs; do
+    [ -f "$test_file" ] || continue
+
+    # Check if the file actually connects to a database (imports DB types
+    # or calls pool/connect). Mere string references like "DATABASE_URL"
+    # in config tests don't count.
+    needs_gate=false
+    if grep -q 'PgPool\|tokio_postgres::\|create_pool\|\.connect(' "$test_file" 2>/dev/null; then
+        needs_gate=true
+    fi
+
+    if [ "$needs_gate" = true ]; then
+        # Check first 5 lines for the cfg gate
+        if ! head -5 "$test_file" | grep -q 'cfg.*feature.*integration' 2>/dev/null; then
+            tier_violations="$tier_violations\n  $test_file: needs #![cfg(all(feature = \"postgres\", feature = \"integration\"))]"
+        fi
+    fi
+done
+
+if [ -n "$tier_violations" ]; then
+    echo "VIOLATION: Integration tests missing feature gate:"
+    echo -e "$tier_violations"
+    echo
+    echo "(Tests requiring external services must be gated behind the 'integration' feature)"
+    violations=$((violations + 1))
+else
+    echo "OK"
+fi


The current method of building the tier_violations string by concatenation is a bit fragile. It can lead to issues if filenames contain special characters, and echo -e has known portability and security concerns compared to printf. Using a bash array to collect violations and printf to print them would be more robust and safer.

Here's a suggested refactoring of this check:

tier_violations=() for test_file in tests/*.rs; do [ -f "$test_file" ] || continue # Check if the file actually connects to a database (imports DB types # or calls pool/connect). Mere string references like "DATABASE_URL" # in config tests don't count. needs_gate=false if grep -q 'PgPool\|tokio_postgres::\|create_pool\|\.connect(' "$test_file" 2>/dev/null; then needs_gate=true fi if [ "$needs_gate" = true ]; then # Check first 5 lines for the cfg gate if ! head -5 "$test_file" | grep -q 'cfg.*feature.*integration' 2>/dev/null; then tier_violations+=(" $test_file: needs '#![cfg(all(feature = \"postgres\", feature = \"integration\"))]'") fi fi done if [ ${#tier_violations[@]} -gt 0 ]; then echo "VIOLATION: Integration tests missing feature gate:" printf '%s\n' "${tier_violations[@]}" echo echo "(Tests requiring external services must be gated behind the 'integration' feature)" violations=$((violations + 1)) else echo "OK" fi

Addressed in ec6e328. Refactored Check 4 to use a bash array (tier_violations=()) and printf '%s\n' instead of string concatenation with echo -e, as suggested.

ilblackdragon · 2026-03-07T02:26:33Z

Known gap: DNS-based SSRF bypass. The fix handles IP literals but a domain like evil.attacker.com that
DNS-resolves to 127.0.0.1 still bypasses all checks. This is out of scope for this PR, but worth a //
TODO: DNS rebinding protection comment at the top of validate_fetch_url so nobody assumes it's fully
hardened. The fix should happen at connect time (check the resolved IP before establishing the TCP
connection), not at URL validation time.

Code Issues

test_helpers.rs ships in production builds. pub mod test_helpers; in src/channels/web/mod.rs has no
cfg gate, so this test builder code compiles into release binaries. The comment explains why (tests/
can't see #[cfg(test)] modules), but the standard pattern is a dedicated feature flag:
#[cfg(any(test, feature = "test-helpers"))]
pub mod test_helpers;
Or gate it behind the existing integration feature since that's who uses it.
use super::* in manager.rs tests (line 288). Project convention (CLAUDE.md) says prefer crate:: over
super::. Should be specific imports like use crate::channels::manager::{ChannelManager,
OutgoingResponse}; or at minimum use crate::channels::manager::*;.
Hand-rolled ZIP entries with zeroed CRC32. The build_zip_entry_store helper writes 0x00000000 for
CRC32. This works because the current ZIP parser doesn't verify checksums, but it's fragile. If someone
swaps the ZIP library or enables CRC verification, all these tests break silently. Consider computing the
actual CRC32 (it's a 4-line addition with the crc32fast crate, or just hardcode the known values for
fixed test content).

…heck Refactor Check 4 in check-boundaries.sh to use bash arrays and printf instead of string concatenation with echo -e. This is more robust with special characters in filenames and avoids portability concerns with echo -e. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

zmanian · 2026-03-07T02:48:41Z

Review Feedback Addressed

Both review comments from gemini-code-assist have been addressed:

1. SSRF DNS bypass (High severity) - `src/tools/builtin/skill_tools.rs`

Already fixed in ced6b96 (prior commit on this branch). The build_safe_fetch_client function performs async DNS resolution via tokio::net::lookup_host, validates all resolved IPs through validate_resolved_addrs, and pins the HTTP client to validated addresses via resolve_to_addrs to prevent TOCTOU races.

2. Shell script robustness (Medium) - `scripts/check-boundaries.sh`

Fixed in ec6e328. Check 4's tier_violations variable refactored from string concatenation + echo -e to a bash array + printf '%s\n' for better portability and robustness with special characters.

gemini-code-assist · 2026-03-07T02:49:09Z

Hi @zmanian,

Thank you for the update and for addressing the feedback! I appreciate you taking the time to implement these improvements.

Just a quick clarification: my initial comment (Comment 1) was a summary of the pull request's changes, not a review with specific issues. The detailed review comments you're referring to were provided by @ilblackdragon in Comment 2.

Regarding the points you've addressed:

DNS-based SSRF bypass (src/tools/builtin/skill_tools.rs): I've reviewed the changes, and the implementation of build_safe_fetch_client with async DNS resolution, validation of resolved IPs via validate_resolved_addrs, and pinning the HTTP client to validated addresses effectively addresses the "Known gap" raised by @ilblackdragon. This is a significant security improvement. Well done!
Shell script robustness (scripts/check-boundaries.sh): The refactoring of tier_violations to use a bash array and printf for better portability and robustness is a good enhancement.

However, there are a few other points from @ilblackdragon's review that still need attention:

test_helpers.rs ships in production builds
- Severity: high
- Description: The pub mod test_helpers; declaration in src/channels/web/mod.rs lacks a #[cfg] gate, causing test helper code to be compiled into release binaries. This can lead to unnecessary code in production builds and potentially expose test-specific functionalities.
- Suggestion: Gate the test_helpers module behind a feature flag or the existing integration feature.
- Reference: src/channels/web/mod.rs (lines 29-32 in the AFTER patch)
use super::* in manager.rs tests
- Severity: low
- Description: The test module in src/channels/manager.rs uses use super::*;. The project convention (as noted in CLAUDE.md) prefers explicit crate:: imports or specific use statements for better clarity and to avoid potential naming conflicts.
- Suggestion: Replace use super::* with specific imports like use crate::channels::{IncomingMessage, OutgoingResponse}; and use crate::testing::StubChannel;.
- Reference: src/channels/manager.rs (line 241 in the AFTER patch)
Hand-rolled ZIP entries with zeroed CRC32
- Severity: high
- Description: The build_zip_entry_store helper in src/tools/builtin/skill_tools.rs uses a zeroed CRC32 value. While the current ZIP parser might not verify checksums, this makes the tests fragile. If the ZIP library changes or CRC verification is enabled in the future, these tests could silently break or pass incorrectly.
- Suggestion: Consider computing the actual CRC32 for the content (e.g., using a crate like crc32fast) or hardcoding known correct CRC32 values for fixed test content to make the tests more robust.
- Reference: src/tools/builtin/skill_tools.rs (lines 1012-1027 in the AFTER patch)

Please let me know if you have any questions about these remaining points.

…ests, search edge cases (nearai#623) * feat(testing): add StubChannel test double for Channel trait Adds StubChannel to src/testing.rs alongside StubLlm. Supports message injection via mpsc sender, response/status capture, and configurable health check toggling. Includes handle methods for use after ownership transfer to ChannelManager. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(testing): wire StubChannel into TestHarnessBuilder Add with_stub_channel() builder method that creates a StubChannel pre-registered in a ChannelManager. Tests can inject messages via the sender and verify routing through the manager. The channel field on TestHarness is Optional, defaulting to None for backward compat. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: gate external-service tests behind integration feature flag Replace silent try_connect() skip pattern with explicit feature gating. cargo test now runs only self-contained tests. cargo test --features integration runs tests requiring PostgreSQL. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test(channels): add ChannelManager unit tests using StubChannel Cover add/start_all stream merging, respond routing, unknown channel errors, health_check_all with mixed health, empty-channels error path, and injection channel merging -- all via StubChannel test double. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: document test tier separation (unit/integration/live) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: add architecture boundary check script Grep-based checks for three architecture boundaries: - Direct database driver usage (tokio_postgres/libsql) outside src/db/ - .unwrap()/.expect() in production code (warning only) - Direct std::env::var reads outside config layer (warning only) The DB driver check is a hard violation; the other two are warnings for gradual cleanup. Run with: bash scripts/check-boundaries.sh Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test(search): add RRF edge case tests for empty inputs, limits, and config modes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test(security): add regression tests for skill installer ZIP and SSRF protections Add 11 regression tests covering the security controls in skill_tools: ZIP extraction safety: - Valid SKILL.md extraction works correctly - Non-SKILL.md entries are ignored (returns error) - Path traversal entries (../../SKILL.md) do not match - Nested path entries (subdir/SKILL.md) do not match - Oversized entries (>1MB uncompressed) are rejected SSRF prevention: - Loopback addresses (127.0.0.1) are blocked - Private ranges (10.x, 172.16.x, 192.168.x) are blocked - Link-local addresses (169.254.x) are blocked - Public IPs (8.8.8.8, 1.1.1.1) are allowed - IPv4-mapped IPv6 unwrapping logic works correctly - Metadata endpoints and .internal/.local hostnames are blocked - Normal hostnames (github.com, clawhub.dev) are allowed Also documents a known gap: url::Url::host_str() returns bracketed IPv6 addresses that std::net::IpAddr cannot parse, so IPv4-mapped IPv6 URLs currently bypass IP-based checks in validate_fetch_url. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor(testing): extract TestGatewayBuilder to eliminate gateway test duplication Both ws_gateway_integration.rs and openai_compat_integration.rs manually constructed GatewayState with 19+ fields. Extracted to a shared builder in src/channels/web/test_helpers.rs that provides sensible defaults and lets tests override only what they need. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add implementation plans for testing batches 1 and 2 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(security): close IPv6 SSRF bypass in validate_fetch_url validate_fetch_url used host_str() which returns bracketed IPv6 (e.g. "[::ffff:7f00:1]") that IpAddr::parse() cannot handle, silently skipping IP-based SSRF checks for all IPv6 URLs. Switch to url::Host enum matching to extract proper IpAddr values without string parsing. IPv4-mapped IPv6 addresses like ::ffff:127.0.0.1 are now correctly unwrapped and blocked. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test(skills): add activation criteria limits enforcement tests Adds test_activation_criteria_enforce_limits to verify that enforce_limits() correctly trims excess patterns (>5), keywords (>20), and tags (>10), and filters out short keywords/tags (<3 chars). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test(wasm): add security regression tests for WASM tool loader Add 6 tests covering: tool name path separator rejection, empty name rejection, nonexistent file handling, invalid WASM bytes rejection, dotfile discovery behavior, and subdirectory non-recursion. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: address PR review feedback - Remove plan files from repo (ilblackdragon review) - Replace CLAUDE.md test tier rules with pointer to check-boundaries.sh - Add Check 4 to check-boundaries.sh: enforces integration tests are gated behind the 'integration' feature flag Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: add try_connect silent-skip pattern check to check-boundaries.sh Check 5 catches try_connect() and similar silent-skip patterns in integration tests. Tests should use feature gates to fail loudly when prerequisites are missing, not silently return. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(security): harden skill fetch SSRF checks * fix(scripts): use bash arrays in check-boundaries.sh tier violation check Refactor Check 4 in check-boundaries.sh to use bash arrays and printf instead of string concatenation with echo -e. This is more robust with special characters in filenames and avoids portability concerns with echo -e. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

zmanian and others added 16 commits March 3, 2026 17:09

docs: document test tier separation (unit/integration/live)

7d583af

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

test(search): add RRF edge case tests for empty inputs, limits, and c…

be3bbdd

…onfig modes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: add implementation plans for testing batches 1 and 2

6a13e87

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

gemini-code-assist Bot reviewed Mar 6, 2026

View reviewed changes

zmanian and others added 2 commits March 6, 2026 18:39

fix(security): harden skill fetch SSRF checks

ced6b96

ilblackdragon approved these changes Mar 7, 2026

View reviewed changes

ilblackdragon merged commit 45ec691 into main Mar 7, 2026
27 checks passed

ilblackdragon deleted the do-some-research-into-openclaw branch March 7, 2026 08:30

This was referenced Mar 7, 2026

chore: release v0.17.0 #633

Closed

chore: release v0.17.0 #842

Merged

BrewTestBot mentioned this pull request Mar 10, 2026

ironclaw 0.17.0 Homebrew/homebrew-core#271600

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve test infrastructure: StubChannel, gateway helpers, security tests, search edge cases#623

Improve test infrastructure: StubChannel, gateway helpers, security tests, search edge cases#623
ilblackdragon merged 18 commits intomainfrom
do-some-research-into-openclaw

zmanian commented Mar 6, 2026

Uh oh!

gemini-code-assist Bot commented Mar 6, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Mar 6, 2026

Uh oh!

zmanian Mar 7, 2026

Uh oh!

zmanian Mar 7, 2026

Uh oh!

gemini-code-assist Bot Mar 6, 2026

Uh oh!

zmanian Mar 7, 2026

Uh oh!

ilblackdragon commented Mar 7, 2026

Uh oh!

zmanian commented Mar 7, 2026

Uh oh!

gemini-code-assist Bot commented Mar 7, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

zmanian commented Mar 6, 2026

Summary

Batch 1: StubChannel + Test Tier Separation

Batch 2: Gateway Helpers, Security Tests, Search Edge Cases

Batch 3: SSRF Fix, WASM Loader Tests, Skills Limits Tests

Test plan

Uh oh!

gemini-code-assist Bot commented Mar 6, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

zmanian Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

zmanian Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

zmanian Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

ilblackdragon commented Mar 7, 2026

Code Issues

Uh oh!

zmanian commented Mar 7, 2026

Review Feedback Addressed

1. SSRF DNS bypass (High severity) - src/tools/builtin/skill_tools.rs

2. Shell script robustness (Medium) - scripts/check-boundaries.sh

Uh oh!

gemini-code-assist Bot commented Mar 7, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. SSRF DNS bypass (High severity) - `src/tools/builtin/skill_tools.rs`

2. Shell script robustness (Medium) - `scripts/check-boundaries.sh`