feat(gateway): add per-endpoint rate limiting and webhook idempotency#188
Conversation
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
|
Warning
|
| Cohort / File(s) | Summary |
|---|---|
Runtime Infrastructure src/runtime/traits.rs, src/runtime/native.rs, src/runtime/docker.rs, src/runtime/mod.rs |
Introduces RuntimeAdapter trait with build_shell_command method; implements DockerRuntime with container configuration (memory, CPU, mount options, allowlist validation) alongside updated NativeRuntime; wires Docker factory into create_runtime based on config.kind. |
Configuration Schema src/config/schema.rs, src/config/mod.rs |
Expands GatewayConfig with per-client rate limits (pair, webhook) and idempotency TTL; extends AutonomyConfig with approval and blocking flags for medium/high-risk commands; adds DockerRuntimeConfig with container resource limits, mount behavior, and workspace allowlist; updates Default implementations and serde handling. |
Gateway Rate Limiting & Idempotency src/gateway/mod.rs |
Implements sliding-window GatewayRateLimiter and TTL-based IdempotencyStore; integrates checks into handle_pair and handle_webhook with 429 responses on limit exceeded; adds X-Idempotency-Key deduplication to prevent webhook duplicates; wires components into AppState via run_gateway. |
Security Policy & Command Risk src/security/policy.rs |
Introduces CommandRiskLevel enum (Low/Medium/High) and validates commands based on risk patterns; adds validate_command_execution method enforcing allowlist, risk-based gating, and approval requirements; configurable strict defaults via require_approval_for_medium_risk and block_high_risk_commands flags. |
Tools Runtime Integration src/tools/mod.rs, src/tools/shell.rs |
Updates tools registry with runtime-aware factory functions (all_tools_with_runtime, default_tools_with_runtime); extends ShellTool to accept and use RuntimeAdapter for command building; adds approval flag to shell command schema; integrates risk validation and action budgeting into execution flow. |
Agent Runtime Wiring src/agent/loop_.rs |
Replaces unused _runtime with Arc<dyn RuntimeAdapter>; passes runtime into tools initialization via all_tools_with_runtime, enabling runtime-aware tool setup. |
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
- Fix/imessage sql injection #61 — Modifies GatewayConfig defaults and Default implementation in the same schema file, overlapping with rate limit and idempotency TTL field additions.
- fix: apply TimeoutLayer to gateway router for request timeouts #74 — Updates gateway router initialization and AppState wiring in src/gateway/mod.rs, similar architectural layer as rate limiter and idempotency injection.
- feat: add browser automation tool using Vercel agent-browser #46 — Extends the tools registry in src/tools/mod.rs with new tool configurations, directly affected by runtime-aware factory function signatures introduced here.
Poem
🐰 A Docker shell now springs to life,
Rate limits guard from surging strife,
Commands assessed for risk, then blessed,
With runtimes chosen—put to the test! 🚀
🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | Docstring coverage is 58.56% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat(gateway): add per-endpoint rate limiting and webhook idempotency' directly and clearly summarizes the main changes in this PR, which focus on adding rate limiting and idempotency mechanisms to the gateway. |
| Merge Conflict Detection | ✅ Passed | ✅ No merge conflicts detected when merging into main |
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing touches
- 📝 Generate docstrings
🧪 Generate unit tests (beta)
- Create PR with unit tests
- Post copyable unit tests in a comment
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Comment @coderabbitai help to get the list of available commands and usage tips.
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/tools/shell.rs (1)
99-118:⚠️ Potential issue | 🟠 Major
env_clear()on the Docker CLI process may break Docker host configuration.When the runtime is
DockerRuntime,cmdis thedockerCLI process. Callingenv_clear()strips variables likeDOCKER_HOST,DOCKER_CONFIG,DOCKER_TLS_VERIFY, andDOCKER_CERT_PATHthat are commonly used to configure the Docker client (especially in CI or remote-Docker setups). The currentSAFE_ENV_VARSallowlist does not include any Docker-specific variables, sodocker runwill silently fall back to the default socket — which may not exist or may point to the wrong daemon.Consider letting the runtime declare additional safe env vars, or moving env sanitization inside the runtime's
build_shell_commandso each runtime can make the right call.Possible approach: add a trait method for runtime-specific safe env vars
In
src/runtime/traits.rs:/// Additional environment variables this runtime needs on the host process. fn required_env_vars(&self) -> &[&str] { &[] }In
DockerRuntime:fn required_env_vars(&self) -> &[&str] { &["DOCKER_HOST", "DOCKER_CONFIG", "DOCKER_TLS_VERIFY", "DOCKER_CERT_PATH"] }Then in
src/tools/shell.rs, after re-addingSAFE_ENV_VARS, also propagateself.runtime.required_env_vars().
🤖 Fix all issues with AI agents
In `@src/config/schema.rs`:
- Around line 420-449: The doc comment for DockerRuntimeConfig should clarify
the actual default behavior of allowed_workspace_roots (empty vector = allow all
workspaces, non-empty = allowlist-only) to match runtime logic in the Docker
mount validation; update the struct-level docs above DockerRuntimeConfig and the
field doc for allowed_workspace_roots to state "empty = allow all, non-empty =
allowlist-only (see runtime/docker.rs validation)". If you prefer a restrictive
default instead, change the serde default for allowed_workspace_roots to return
a restrictive list (e.g., default to vec!["/tmp".to_string()]) by replacing the
current #[serde(default)] with a #[serde(default =
"default_restrictive_workspace_roots")] and implement that function, and update
the docs accordingly.
In `@src/gateway/mod.rs`:
- Around line 55-58: The current allow() implementation treats limit_per_window
== 0 as "unlimited", which is counterintuitive; change the config/type so 0 no
longer means unlimited by making limit_per_window an Option<u32> (None =
unlimited) and update the config parsing for pair_rate_limit_per_minute to map 0
=> Some(0) or None appropriately (e.g., interpret an explicit sentinel like
-1/absent as None), then modify the allow(&self, key: &str) method to: return
true only when limit_per_window is None (unlimited), treat Some(0) as "block
all" and otherwise enforce the numeric limit. Also update any docs/comments to
state the new semantics for pair_rate_limit_per_minute and adjust any code that
constructs or reads limit_per_window.
- Around line 137-147: client_key_from_headers currently trusts X-Forwarded-For
/ X-Real-IP unconditionally which allows header spoofing; update the logic so
the default behavior derives the rate-limit key from the real TCP peer address
(use axum::extract::ConnectInfo<SocketAddr> where the rate-limit lookup occurs)
and only fallback to or combine with X-Forwarded-For when a new boolean config
flag (e.g. trust_proxy_headers) is true, and add a short code comment in
client_key_from_headers and an explicit note in SECURITY.md explaining the
trusted-proxy requirement when trust_proxy_headers is enabled.
🧹 Nitpick comments (5)
src/runtime/docker.rs (1)
130-134: Empty image name would produce a confusing Docker error.If
self.config.imageis empty or whitespace-only,trim()yields"", and Docker will fail with an unhelpful error. Consider validating the image name early (e.g., innew()or at the start ofbuild_shell_command).src/security/policy.rs (2)
171-289:command_risk_leveldoesn't use&self— consider making it a pure function or documenting intent.The method takes
&selfbut accesses no fields ofSecurityPolicy. If you plan to make risk classification configurable per-policy (e.g. user-defined high-risk lists), the current signature is forward-looking. Otherwise, it could be an associated function or free function for clarity and easier testability.This is a minor style observation, not blocking.
647-710: Good test coverage for risk classification and approval gates.The tests cover low/medium/high classification and the approval/blocking behavior. Consider adding an edge-case test where
block_high_risk_commands = falseandapproved = trueto verify the allow path for high-risk commands when the safety net is explicitly disabled.src/gateway/mod.rs (2)
104-135: IdempotencyStore has no max-capacity bound — memory can grow under abusive traffic.Each call to
record_if_newcleans expired entries, which bounds the store to keys from the last TTL window. However, within that window, there's no cap on unique keys. Under a DDoS with unique idempotency keys (which bypasses per-client rate limiting via spoofed headers), the HashMap grows unbounded.Consider adding a maximum capacity (e.g., evict oldest entry when exceeding N keys) as a safety net.
184-192: Nit: hardcoded default model string duplicated.The default model
"anthropic/claude-sonnet-4-20250514"is defined inline here and also presumably inConfig::default()(line 807 of schema.rs). Consider referencing a shared constant to keep them in sync.
| #[derive(Debug, Clone, Serialize, Deserialize)] | ||
| pub struct DockerRuntimeConfig { | ||
| /// Runtime image used to execute shell commands. | ||
| #[serde(default = "default_docker_image")] | ||
| pub image: String, | ||
|
|
||
| /// Docker network mode (`none`, `bridge`, etc.). | ||
| #[serde(default = "default_docker_network")] | ||
| pub network: String, | ||
|
|
||
| /// Optional memory limit in MB (`None` = no explicit limit). | ||
| #[serde(default = "default_docker_memory_limit_mb")] | ||
| pub memory_limit_mb: Option<u64>, | ||
|
|
||
| /// Optional CPU limit (`None` = no explicit limit). | ||
| #[serde(default = "default_docker_cpu_limit")] | ||
| pub cpu_limit: Option<f64>, | ||
|
|
||
| /// Mount root filesystem as read-only. | ||
| #[serde(default = "default_true")] | ||
| pub read_only_rootfs: bool, | ||
|
|
||
| /// Mount configured workspace into `/workspace`. | ||
| #[serde(default = "default_true")] | ||
| pub mount_workspace: bool, | ||
|
|
||
| /// Optional workspace root allowlist for Docker mount validation. | ||
| #[serde(default)] | ||
| pub allowed_workspace_roots: Vec<String>, | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
rg -n 'allowed_workspace_roots' --type=rust -C5Repository: theonlyhennygod/zeroclaw
Length of output: 3853
Update the documentation to clarify the default workspace root behavior, or reconsider the permissive default.
When allowed_workspace_roots is empty, the code permits any workspace path to be mounted (see src/runtime/docker.rs:33-35). This contradicts the claim of "strong security defaults"—the default configuration allows unrestricted workspace mounting. Either:
- Document this behavior explicitly in the struct's doc comment (clarify that empty = allow all, non-empty = allowlist-only), or
- Change the default to a restrictive allowlist (e.g.,
["/tmp"]) to align with stated security posture.
🤖 Prompt for AI Agents
In `@src/config/schema.rs` around lines 420 - 449, The doc comment for
DockerRuntimeConfig should clarify the actual default behavior of
allowed_workspace_roots (empty vector = allow all workspaces, non-empty =
allowlist-only) to match runtime logic in the Docker mount validation; update
the struct-level docs above DockerRuntimeConfig and the field doc for
allowed_workspace_roots to state "empty = allow all, non-empty = allowlist-only
(see runtime/docker.rs validation)". If you prefer a restrictive default
instead, change the serde default for allowed_workspace_roots to return a
restrictive list (e.g., default to vec!["/tmp".to_string()]) by replacing the
current #[serde(default)] with a #[serde(default =
"default_restrictive_workspace_roots")] and implement that function, and update
the docs accordingly.
| fn allow(&self, key: &str) -> bool { | ||
| if self.limit_per_window == 0 { | ||
| return true; | ||
| } |
There was a problem hiding this comment.
limit_per_window == 0 disables rate limiting — document or invert the semantics.
When limit_per_window is 0, allow() returns true unconditionally (line 57), effectively meaning "unlimited." This is counterintuitive — 0 might reasonably be expected to mean "block all requests." The config defaults (10, 60) make this unlikely in practice, but if an operator sets pair_rate_limit_per_minute = 0 expecting to disable the endpoint, they'd get the opposite.
Consider either:
- Documenting that
0 = unlimitedin the config schema, or - Inverting so
0 = block alland using a sentinel (e.g.u32::MAXor anOption) for unlimited.
🤖 Prompt for AI Agents
In `@src/gateway/mod.rs` around lines 55 - 58, The current allow() implementation
treats limit_per_window == 0 as "unlimited", which is counterintuitive; change
the config/type so 0 no longer means unlimited by making limit_per_window an
Option<u32> (None = unlimited) and update the config parsing for
pair_rate_limit_per_minute to map 0 => Some(0) or None appropriately (e.g.,
interpret an explicit sentinel like -1/absent as None), then modify the
allow(&self, key: &str) method to: return true only when limit_per_window is
None (unlimited), treat Some(0) as "block all" and otherwise enforce the numeric
limit. Also update any docs/comments to state the new semantics for
pair_rate_limit_per_minute and adjust any code that constructs or reads
limit_per_window.
| fn client_key_from_headers(headers: &HeaderMap) -> String { | ||
| for header_name in ["X-Forwarded-For", "X-Real-IP"] { | ||
| if let Some(value) = headers.get(header_name).and_then(|v| v.to_str().ok()) { | ||
| let first = value.split(',').next().unwrap_or("").trim(); | ||
| if !first.is_empty() { | ||
| return first.to_owned(); | ||
| } | ||
| } | ||
| } | ||
| "unknown".into() | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Check for ConnectInfo usage
rg -n 'ConnectInfo' --type=rust
# Check for trusted proxy configuration
rg -n 'trusted.proxy|trust_proxy|real_ip_from' --type=rust -i
# Check if there's any proxy configuration in the codebase
rg -n 'proxy|forwarded|x-forwarded|x-real-ip' --type=rust -i | head -30
# Check how the gateway is actually configured/documented
fd -e md -e txt -e toml | xargs rg -l 'gateway|proxy|reverse' 2>/dev/null | head -10Repository: theonlyhennygod/zeroclaw
Length of output: 318
🏁 Script executed:
# Check SECURITY.md for proxy/trusted proxy documentation
cat SECURITY.md
# Check gateway configuration in template
cat dev/config.template.toml | grep -A 20 -B 5 gateway
# Check README for gateway deployment info
grep -A 10 -B 5 -i 'gateway\|proxy\|reverse' README.md | head -50Repository: theonlyhennygod/zeroclaw
Length of output: 5907
🏁 Script executed:
# Check the actual gateway handler code to see how it uses client_key_from_headers
sed -n '300,450p' src/gateway/mod.rs
# Check if there's any ConnectInfo or connection-based validation
rg -n 'SocketAddr|peer|connection|remote_addr' src/gateway/mod.rs --type=rust
# Look at the full client_key_from_headers usage context
rg -B5 -A10 'client_key_from_headers' src/gateway/mod.rs --type=rustRepository: theonlyhennygod/zeroclaw
Length of output: 7313
Client rate-limit key is spoofable via header manipulation — document trusted proxy requirement or use connection peer address.
client_key_from_headers extracts rate-limiting keys from X-Forwarded-For / X-Real-IP headers without validating their source. An attacker with direct access (not behind a validating reverse proxy) can rotate these headers on each request, obtaining a fresh rate-limit bucket per request and completely bypassing the limiter.
This is particularly critical for the /pair endpoint, which relies on rate limiting as its sole DoS protection.
Options:
- If always behind trusted proxy: Document this requirement explicitly in
SECURITY.mdand code comments. - Better: Use
axum::extract::ConnectInfo<SocketAddr>to derive the rate-limit key from the actual TCP peer address, optionally layering inX-Forwarded-Foronly when a configurabletrust_proxy_headers = truesetting is enabled.
For default deployments (localhost only), this is not exploitable, but users enabling allow_public_bind = true need clear guidance.
🤖 Prompt for AI Agents
In `@src/gateway/mod.rs` around lines 137 - 147, client_key_from_headers currently
trusts X-Forwarded-For / X-Real-IP unconditionally which allows header spoofing;
update the logic so the default behavior derives the rate-limit key from the
real TCP peer address (use axum::extract::ConnectInfo<SocketAddr> where the
rate-limit lookup occurs) and only fallback to or combine with X-Forwarded-For
when a new boolean config flag (e.g. trust_proxy_headers) is true, and add a
short code comment in client_key_from_headers and an explicit note in
SECURITY.md explaining the trusted-proxy requirement when trust_proxy_headers is
enabled.
…zeroclaw-labs#188) * feat(runtime): add Docker runtime MVP and runtime-aware command builder * feat(security): add shell risk classification, approval gates, and action throttling * feat(gateway): add per-endpoint rate limiting and webhook idempotency --------- Co-authored-by: chumyin <chumyin@users.noreply.github.com>
Summary
/pairand/webhookpair_rate_limit_per_minute,webhook_rate_limit_per_minute, andidempotency_ttl_secsWhy
This improves gateway resilience under abuse/replay patterns and reduces duplicate downstream work while preserving predictable latency under load.
Notes
Testing
cargo test --lib gateway::tests::gateway_rate_limiter_blocks_after_limitcargo test --lib gateway::tests::idempotency_store_rejects_duplicate_keyCurrent
mainhas unrelated compile failures in concurrency/memory modules, so repo-wide test execution is not green independent of this PR.Summary by CodeRabbit
Release Notes
New Features
Configuration Enhancements