feat(gateway): add per-endpoint rate limiting and webhook idempotency by chumyin · Pull Request #188 · zeroclaw-labs/zeroclaw

chumyin · 2026-02-15T16:59:34Z

Summary

add per-client sliding-window rate limiting for /pair and /webhook
add in-memory webhook idempotency key store with TTL-based deduplication
wire gateway state to enforce limit and idempotency gates before provider dispatch
extend gateway config schema with pair_rate_limit_per_minute, webhook_rate_limit_per_minute, and idempotency_ttl_secs
add focused gateway tests for limiter and idempotency behavior

Why

This improves gateway resilience under abuse/replay patterns and reduces duplicate downstream work while preserving predictable latency under load.

Notes

This branch is stacked on feat(runtime): add Docker runtime MVP and runtime-aware command builder #186 and feat(security): add shell risk classification, approval gates, and action throttling #187, and currently includes those dependency commits.

Testing

attempted: cargo test --lib gateway::tests::gateway_rate_limiter_blocks_after_limit
attempted: cargo test --lib gateway::tests::idempotency_store_rejects_duplicate_key

Current main has unrelated compile failures in concurrency/memory modules, so repo-wide test execution is not green independent of this PR.

Summary by CodeRabbit

Release Notes

New Features
- Added per-client rate limiting for pairing and webhook endpoints with configurable request limits.
- Introduced idempotency mechanism to prevent duplicate webhook request processing.
- Added Docker runtime support as an alternative execution environment with configurable resource limits and security settings.
- Implemented command risk classification (Low/Medium/High) with approval-based controls and optional blocking of high-risk commands.
Configuration Enhancements
- Expanded gateway configuration with rate limit and idempotency TTL settings.
- Added risk control policies for command execution approval workflows.
- Added Docker runtime configuration options for image, network, memory, and CPU constraints.

…tion throttling

gemini-code-assist · 2026-02-15T16:59:37Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

coderabbitai · 2026-02-15T16:59:52Z

Warning

`.coderabbit.yaml` has a parsing error

The CodeRabbit configuration file in this repository has a parsing error and default settings were used instead. Please fix the error(s) in the configuration file. You can initialize chat with CodeRabbit to get help with the configuration file.

💥 Parsing errors (1)

Validation error: Expected boolean, received object at "reviews.poem"

⚙️ Configuration instructions

Please see the configuration documentation for more information.
You can also validate your configuration using the online YAML validator.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

📝 Walkthrough

Walkthrough

This PR adds Docker runtime support via a RuntimeAdapter abstraction, introduces gateway-level rate limiting and idempotency for request deduplication, and implements command risk classification with approval workflows for supervised autonomy control. Configuration schema is expanded to support these features with per-client limits and security controls.

Changes

Cohort / File(s)	Summary
Runtime Infrastructure `src/runtime/traits.rs`, `src/runtime/native.rs`, `src/runtime/docker.rs`, `src/runtime/mod.rs`	Introduces RuntimeAdapter trait with build_shell_command method; implements DockerRuntime with container configuration (memory, CPU, mount options, allowlist validation) alongside updated NativeRuntime; wires Docker factory into create_runtime based on config.kind.
Configuration Schema `src/config/schema.rs`, `src/config/mod.rs`	Expands GatewayConfig with per-client rate limits (pair, webhook) and idempotency TTL; extends AutonomyConfig with approval and blocking flags for medium/high-risk commands; adds DockerRuntimeConfig with container resource limits, mount behavior, and workspace allowlist; updates Default implementations and serde handling.
Gateway Rate Limiting & Idempotency `src/gateway/mod.rs`	Implements sliding-window GatewayRateLimiter and TTL-based IdempotencyStore; integrates checks into handle_pair and handle_webhook with 429 responses on limit exceeded; adds X-Idempotency-Key deduplication to prevent webhook duplicates; wires components into AppState via run_gateway.
Security Policy & Command Risk `src/security/policy.rs`	Introduces CommandRiskLevel enum (Low/Medium/High) and validates commands based on risk patterns; adds validate_command_execution method enforcing allowlist, risk-based gating, and approval requirements; configurable strict defaults via require_approval_for_medium_risk and block_high_risk_commands flags.
Tools Runtime Integration `src/tools/mod.rs`, `src/tools/shell.rs`	Updates tools registry with runtime-aware factory functions (all_tools_with_runtime, default_tools_with_runtime); extends ShellTool to accept and use RuntimeAdapter for command building; adds approval flag to shell command schema; integrates risk validation and action budgeting into execution flow.
Agent Runtime Wiring `src/agent/loop_.rs`	Replaces unused _runtime with Arc<dyn RuntimeAdapter>; passes runtime into tools initialization via all_tools_with_runtime, enabling runtime-aware tool setup.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Fix/imessage sql injection #61 — Modifies GatewayConfig defaults and Default implementation in the same schema file, overlapping with rate limit and idempotency TTL field additions.
fix: apply TimeoutLayer to gateway router for request timeouts #74 — Updates gateway router initialization and AppState wiring in src/gateway/mod.rs, similar architectural layer as rate limiter and idempotency injection.
feat: add browser automation tool using Vercel agent-browser #46 — Extends the tools registry in src/tools/mod.rs with new tool configurations, directly affected by runtime-aware factory function signatures introduced here.

Poem

🐰 A Docker shell now springs to life,
Rate limits guard from surging strife,
Commands assessed for risk, then blessed,
With runtimes chosen—put to the test! 🚀

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 58.56% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat(gateway): add per-endpoint rate limiting and webhook idempotency' directly and clearly summarizes the main changes in this PR, which focus on adding rate limiting and idempotency mechanisms to the gateway.
Merge Conflict Detection	✅ Passed	✅ No merge conflicts detected when merging into `main`

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/tools/shell.rs (1)
99-118: ⚠️ Potential issue | 🟠 Major

env_clear() on the Docker CLI process may break Docker host configuration.

When the runtime is DockerRuntime, cmd is the docker CLI process. Calling env_clear() strips variables like DOCKER_HOST, DOCKER_CONFIG, DOCKER_TLS_VERIFY, and DOCKER_CERT_PATH that are commonly used to configure the Docker client (especially in CI or remote-Docker setups). The current SAFE_ENV_VARS allowlist does not include any Docker-specific variables, so docker run will silently fall back to the default socket — which may not exist or may point to the wrong daemon.

Consider letting the runtime declare additional safe env vars, or moving env sanitization inside the runtime's build_shell_command so each runtime can make the right call.
Possible approach: add a trait method for runtime-specific safe env vars

In src/runtime/traits.rs:
/// Additional environment variables this runtime needs on the host process.
fn required_env_vars(&self) -> &[&str] {
    &[]
}
In DockerRuntime:
fn required_env_vars(&self) -> &[&str] {
    &["DOCKER_HOST", "DOCKER_CONFIG", "DOCKER_TLS_VERIFY", "DOCKER_CERT_PATH"]
}
Then in src/tools/shell.rs, after re-adding SAFE_ENV_VARS, also propagate self.runtime.required_env_vars().

🤖 Fix all issues with AI agents

In `@src/config/schema.rs`:
- Around line 420-449: The doc comment for DockerRuntimeConfig should clarify
the actual default behavior of allowed_workspace_roots (empty vector = allow all
workspaces, non-empty = allowlist-only) to match runtime logic in the Docker
mount validation; update the struct-level docs above DockerRuntimeConfig and the
field doc for allowed_workspace_roots to state "empty = allow all, non-empty =
allowlist-only (see runtime/docker.rs validation)". If you prefer a restrictive
default instead, change the serde default for allowed_workspace_roots to return
a restrictive list (e.g., default to vec!["/tmp".to_string()]) by replacing the
current #[serde(default)] with a #[serde(default =
"default_restrictive_workspace_roots")] and implement that function, and update
the docs accordingly.

In `@src/gateway/mod.rs`:
- Around line 55-58: The current allow() implementation treats limit_per_window
== 0 as "unlimited", which is counterintuitive; change the config/type so 0 no
longer means unlimited by making limit_per_window an Option<u32> (None =
unlimited) and update the config parsing for pair_rate_limit_per_minute to map 0
=> Some(0) or None appropriately (e.g., interpret an explicit sentinel like
-1/absent as None), then modify the allow(&self, key: &str) method to: return
true only when limit_per_window is None (unlimited), treat Some(0) as "block
all" and otherwise enforce the numeric limit. Also update any docs/comments to
state the new semantics for pair_rate_limit_per_minute and adjust any code that
constructs or reads limit_per_window.
- Around line 137-147: client_key_from_headers currently trusts X-Forwarded-For
/ X-Real-IP unconditionally which allows header spoofing; update the logic so
the default behavior derives the rate-limit key from the real TCP peer address
(use axum::extract::ConnectInfo<SocketAddr> where the rate-limit lookup occurs)
and only fallback to or combine with X-Forwarded-For when a new boolean config
flag (e.g. trust_proxy_headers) is true, and add a short code comment in
client_key_from_headers and an explicit note in SECURITY.md explaining the
trusted-proxy requirement when trust_proxy_headers is enabled.

🧹 Nitpick comments (5)

src/runtime/docker.rs (1)

130-134: Empty image name would produce a confusing Docker error.

If self.config.image is empty or whitespace-only, trim() yields "", and Docker will fail with an unhelpful error. Consider validating the image name early (e.g., in new() or at the start of build_shell_command).

src/security/policy.rs (2)

171-289: command_risk_level doesn't use &self — consider making it a pure function or documenting intent.

The method takes &self but accesses no fields of SecurityPolicy. If you plan to make risk classification configurable per-policy (e.g. user-defined high-risk lists), the current signature is forward-looking. Otherwise, it could be an associated function or free function for clarity and easier testability.

This is a minor style observation, not blocking.

647-710: Good test coverage for risk classification and approval gates.

The tests cover low/medium/high classification and the approval/blocking behavior. Consider adding an edge-case test where block_high_risk_commands = false and approved = true to verify the allow path for high-risk commands when the safety net is explicitly disabled.

src/gateway/mod.rs (2)

104-135: IdempotencyStore has no max-capacity bound — memory can grow under abusive traffic.

Each call to record_if_new cleans expired entries, which bounds the store to keys from the last TTL window. However, within that window, there's no cap on unique keys. Under a DDoS with unique idempotency keys (which bypasses per-client rate limiting via spoofed headers), the HashMap grows unbounded.

Consider adding a maximum capacity (e.g., evict oldest entry when exceeding N keys) as a safety net.

184-192: Nit: hardcoded default model string duplicated.

The default model "anthropic/claude-sonnet-4-20250514" is defined inline here and also presumably in Config::default() (line 807 of schema.rs). Consider referencing a shared constant to keep them in sync.

coderabbitai · 2026-02-15T17:06:15Z

+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct DockerRuntimeConfig {
+    /// Runtime image used to execute shell commands.
+    #[serde(default = "default_docker_image")]
+    pub image: String,
+
+    /// Docker network mode (`none`, `bridge`, etc.).
+    #[serde(default = "default_docker_network")]
+    pub network: String,
+
+    /// Optional memory limit in MB (`None` = no explicit limit).
+    #[serde(default = "default_docker_memory_limit_mb")]
+    pub memory_limit_mb: Option<u64>,
+
+    /// Optional CPU limit (`None` = no explicit limit).
+    #[serde(default = "default_docker_cpu_limit")]
+    pub cpu_limit: Option<f64>,
+
+    /// Mount root filesystem as read-only.
+    #[serde(default = "default_true")]
+    pub read_only_rootfs: bool,
+
+    /// Mount configured workspace into `/workspace`.
+    #[serde(default = "default_true")]
+    pub mount_workspace: bool,
+
+    /// Optional workspace root allowlist for Docker mount validation.
+    #[serde(default)]
+    pub allowed_workspace_roots: Vec<String>,
+}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg -n 'allowed_workspace_roots' --type=rust -C5

Repository: theonlyhennygod/zeroclaw

Length of output: 3853

Update the documentation to clarify the default workspace root behavior, or reconsider the permissive default.

When allowed_workspace_roots is empty, the code permits any workspace path to be mounted (see src/runtime/docker.rs:33-35). This contradicts the claim of "strong security defaults"—the default configuration allows unrestricted workspace mounting. Either:

Document this behavior explicitly in the struct's doc comment (clarify that empty = allow all, non-empty = allowlist-only), or

Change the default to a restrictive allowlist (e.g., ["/tmp"]) to align with stated security posture.

🤖 Prompt for AI Agents

In `@src/config/schema.rs` around lines 420 - 449, The doc comment for DockerRuntimeConfig should clarify the actual default behavior of allowed_workspace_roots (empty vector = allow all workspaces, non-empty = allowlist-only) to match runtime logic in the Docker mount validation; update the struct-level docs above DockerRuntimeConfig and the field doc for allowed_workspace_roots to state "empty = allow all, non-empty = allowlist-only (see runtime/docker.rs validation)". If you prefer a restrictive default instead, change the serde default for allowed_workspace_roots to return a restrictive list (e.g., default to vec!["/tmp".to_string()]) by replacing the current #[serde(default)] with a #[serde(default = "default_restrictive_workspace_roots")] and implement that function, and update the docs accordingly.

coderabbitai · 2026-02-15T17:06:15Z

+    fn allow(&self, key: &str) -> bool {
+        if self.limit_per_window == 0 {
+            return true;
+        }


⚠️ Potential issue | 🟡 Minor

limit_per_window == 0 disables rate limiting — document or invert the semantics.

When limit_per_window is 0, allow() returns true unconditionally (line 57), effectively meaning "unlimited." This is counterintuitive — 0 might reasonably be expected to mean "block all requests." The config defaults (10, 60) make this unlikely in practice, but if an operator sets pair_rate_limit_per_minute = 0 expecting to disable the endpoint, they'd get the opposite.

Consider either:

Documenting that 0 = unlimited in the config schema, or

Inverting so 0 = block all and using a sentinel (e.g. u32::MAX or an Option) for unlimited.

🤖 Prompt for AI Agents

In `@src/gateway/mod.rs` around lines 55 - 58, The current allow() implementation treats limit_per_window == 0 as "unlimited", which is counterintuitive; change the config/type so 0 no longer means unlimited by making limit_per_window an Option<u32> (None = unlimited) and update the config parsing for pair_rate_limit_per_minute to map 0 => Some(0) or None appropriately (e.g., interpret an explicit sentinel like -1/absent as None), then modify the allow(&self, key: &str) method to: return true only when limit_per_window is None (unlimited), treat Some(0) as "block all" and otherwise enforce the numeric limit. Also update any docs/comments to state the new semantics for pair_rate_limit_per_minute and adjust any code that constructs or reads limit_per_window.

coderabbitai · 2026-02-15T17:06:15Z

+fn client_key_from_headers(headers: &HeaderMap) -> String {
+    for header_name in ["X-Forwarded-For", "X-Real-IP"] {
+        if let Some(value) = headers.get(header_name).and_then(|v| v.to_str().ok()) {
+            let first = value.split(',').next().unwrap_or("").trim();
+            if !first.is_empty() {
+                return first.to_owned();
+            }
+        }
+    }
+    "unknown".into()
+}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Check for ConnectInfo usage rg -n 'ConnectInfo' --type=rust # Check for trusted proxy configuration rg -n 'trusted.proxy|trust_proxy|real_ip_from' --type=rust -i # Check if there's any proxy configuration in the codebase rg -n 'proxy|forwarded|x-forwarded|x-real-ip' --type=rust -i | head -30 # Check how the gateway is actually configured/documented fd -e md -e txt -e toml | xargs rg -l 'gateway|proxy|reverse' 2>/dev/null | head -10

Repository: theonlyhennygod/zeroclaw

Length of output: 318

🏁 Script executed:

# Check SECURITY.md for proxy/trusted proxy documentation cat SECURITY.md # Check gateway configuration in template cat dev/config.template.toml | grep -A 20 -B 5 gateway # Check README for gateway deployment info grep -A 10 -B 5 -i 'gateway\|proxy\|reverse' README.md | head -50

Repository: theonlyhennygod/zeroclaw

Length of output: 5907

🏁 Script executed:

# Check the actual gateway handler code to see how it uses client_key_from_headers sed -n '300,450p' src/gateway/mod.rs # Check if there's any ConnectInfo or connection-based validation rg -n 'SocketAddr|peer|connection|remote_addr' src/gateway/mod.rs --type=rust # Look at the full client_key_from_headers usage context rg -B5 -A10 'client_key_from_headers' src/gateway/mod.rs --type=rust

Repository: theonlyhennygod/zeroclaw

Length of output: 7313

Client rate-limit key is spoofable via header manipulation — document trusted proxy requirement or use connection peer address.

client_key_from_headers extracts rate-limiting keys from X-Forwarded-For / X-Real-IP headers without validating their source. An attacker with direct access (not behind a validating reverse proxy) can rotate these headers on each request, obtaining a fresh rate-limit bucket per request and completely bypassing the limiter.

This is particularly critical for the /pair endpoint, which relies on rate limiting as its sole DoS protection.

Options:

If always behind trusted proxy: Document this requirement explicitly in SECURITY.md and code comments.

Better: Use axum::extract::ConnectInfo<SocketAddr> to derive the rate-limit key from the actual TCP peer address, optionally layering in X-Forwarded-For only when a configurable trust_proxy_headers = true setting is enabled.

For default deployments (localhost only), this is not exploitable, but users enabling allow_public_bind = true need clear guidance.

🤖 Prompt for AI Agents

In `@src/gateway/mod.rs` around lines 137 - 147, client_key_from_headers currently trusts X-Forwarded-For / X-Real-IP unconditionally which allows header spoofing; update the logic so the default behavior derives the rate-limit key from the real TCP peer address (use axum::extract::ConnectInfo<SocketAddr> where the rate-limit lookup occurs) and only fallback to or combine with X-Forwarded-For when a new boolean config flag (e.g. trust_proxy_headers) is true, and add a short code comment in client_key_from_headers and an explicit note in SECURITY.md explaining the trusted-proxy requirement when trust_proxy_headers is enabled.

…zeroclaw-labs#188) * feat(runtime): add Docker runtime MVP and runtime-aware command builder * feat(security): add shell risk classification, approval gates, and action throttling * feat(gateway): add per-endpoint rate limiting and webhook idempotency --------- Co-authored-by: chumyin <chumyin@users.noreply.github.com>

chumyin added 3 commits February 16, 2026 00:53

feat(runtime): add Docker runtime MVP and runtime-aware command builder

4921615

feat(security): add shell risk classification, approval gates, and ac…

15e591d

…tion throttling

feat(gateway): add per-endpoint rate limiting and webhook idempotency

86380e4

chumyin mentioned this pull request Feb 15, 2026

feat(config): make config writes atomic with rollback-safe replacement #190

Merged

coderabbitai Bot reviewed Feb 15, 2026

View reviewed changes

theonlyhennygod merged commit 9e55ab0 into zeroclaw-labs:main Feb 15, 2026
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(gateway): add per-endpoint rate limiting and webhook idempotency#188

feat(gateway): add per-endpoint rate limiting and webhook idempotency#188
theonlyhennygod merged 3 commits intozeroclaw-labs:mainfrom
chumyin:feat/pr03-gateway-rate-limit-idempotency-v2

chumyin commented Feb 15, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

gemini-code-assist Bot commented Feb 15, 2026

Uh oh!

coderabbitai Bot commented Feb 15, 2026 •

edited

Loading

`.coderabbit.yaml` has a parsing error

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Feb 15, 2026

Uh oh!

coderabbitai Bot Feb 15, 2026

Uh oh!

coderabbitai Bot Feb 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

chumyin commented Feb 15, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Notes

Testing

Summary by CodeRabbit

Release Notes

Uh oh!

gemini-code-assist Bot commented Feb 15, 2026

Uh oh!

coderabbitai Bot commented Feb 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

.coderabbit.yaml has a parsing error

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Feb 15, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Feb 15, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Feb 15, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

chumyin commented Feb 15, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Feb 15, 2026 •

edited

Loading

`.coderabbit.yaml` has a parsing error