Skip to content

feat(gateway): add per-endpoint rate limiting and webhook idempotency#188

Merged
theonlyhennygod merged 3 commits intozeroclaw-labs:mainfrom
chumyin:feat/pr03-gateway-rate-limit-idempotency-v2
Feb 15, 2026
Merged

feat(gateway): add per-endpoint rate limiting and webhook idempotency#188
theonlyhennygod merged 3 commits intozeroclaw-labs:mainfrom
chumyin:feat/pr03-gateway-rate-limit-idempotency-v2

Conversation

@chumyin
Copy link
Copy Markdown
Contributor

@chumyin chumyin commented Feb 15, 2026

Summary

  • add per-client sliding-window rate limiting for /pair and /webhook
  • add in-memory webhook idempotency key store with TTL-based deduplication
  • wire gateway state to enforce limit and idempotency gates before provider dispatch
  • extend gateway config schema with pair_rate_limit_per_minute, webhook_rate_limit_per_minute, and idempotency_ttl_secs
  • add focused gateway tests for limiter and idempotency behavior

Why

This improves gateway resilience under abuse/replay patterns and reduces duplicate downstream work while preserving predictable latency under load.

Notes

Testing

  • attempted: cargo test --lib gateway::tests::gateway_rate_limiter_blocks_after_limit
  • attempted: cargo test --lib gateway::tests::idempotency_store_rejects_duplicate_key

Current main has unrelated compile failures in concurrency/memory modules, so repo-wide test execution is not green independent of this PR.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added per-client rate limiting for pairing and webhook endpoints with configurable request limits.
    • Introduced idempotency mechanism to prevent duplicate webhook request processing.
    • Added Docker runtime support as an alternative execution environment with configurable resource limits and security settings.
    • Implemented command risk classification (Low/Medium/High) with approval-based controls and optional blocking of high-risk commands.
  • Configuration Enhancements

    • Expanded gateway configuration with rate limit and idempotency TTL settings.
    • Added risk control policies for command execution approval workflows.
    • Added Docker runtime configuration options for image, network, memory, and CPU constraints.

@gemini-code-assist
Copy link
Copy Markdown

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 15, 2026

Warning

.coderabbit.yaml has a parsing error

The CodeRabbit configuration file in this repository has a parsing error and default settings were used instead. Please fix the error(s) in the configuration file. You can initialize chat with CodeRabbit to get help with the configuration file.

💥 Parsing errors (1)
Validation error: Expected boolean, received object at "reviews.poem"
⚙️ Configuration instructions
  • Please see the configuration documentation for more information.
  • You can also validate your configuration using the online YAML validator.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json
📝 Walkthrough

Walkthrough

This PR adds Docker runtime support via a RuntimeAdapter abstraction, introduces gateway-level rate limiting and idempotency for request deduplication, and implements command risk classification with approval workflows for supervised autonomy control. Configuration schema is expanded to support these features with per-client limits and security controls.

Changes

Cohort / File(s) Summary
Runtime Infrastructure
src/runtime/traits.rs, src/runtime/native.rs, src/runtime/docker.rs, src/runtime/mod.rs
Introduces RuntimeAdapter trait with build_shell_command method; implements DockerRuntime with container configuration (memory, CPU, mount options, allowlist validation) alongside updated NativeRuntime; wires Docker factory into create_runtime based on config.kind.
Configuration Schema
src/config/schema.rs, src/config/mod.rs
Expands GatewayConfig with per-client rate limits (pair, webhook) and idempotency TTL; extends AutonomyConfig with approval and blocking flags for medium/high-risk commands; adds DockerRuntimeConfig with container resource limits, mount behavior, and workspace allowlist; updates Default implementations and serde handling.
Gateway Rate Limiting & Idempotency
src/gateway/mod.rs
Implements sliding-window GatewayRateLimiter and TTL-based IdempotencyStore; integrates checks into handle_pair and handle_webhook with 429 responses on limit exceeded; adds X-Idempotency-Key deduplication to prevent webhook duplicates; wires components into AppState via run_gateway.
Security Policy & Command Risk
src/security/policy.rs
Introduces CommandRiskLevel enum (Low/Medium/High) and validates commands based on risk patterns; adds validate_command_execution method enforcing allowlist, risk-based gating, and approval requirements; configurable strict defaults via require_approval_for_medium_risk and block_high_risk_commands flags.
Tools Runtime Integration
src/tools/mod.rs, src/tools/shell.rs
Updates tools registry with runtime-aware factory functions (all_tools_with_runtime, default_tools_with_runtime); extends ShellTool to accept and use RuntimeAdapter for command building; adds approval flag to shell command schema; integrates risk validation and action budgeting into execution flow.
Agent Runtime Wiring
src/agent/loop_.rs
Replaces unused _runtime with Arc<dyn RuntimeAdapter>; passes runtime into tools initialization via all_tools_with_runtime, enabling runtime-aware tool setup.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 A Docker shell now springs to life,
Rate limits guard from surging strife,
Commands assessed for risk, then blessed,
With runtimes chosen—put to the test! 🚀

🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 58.56% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(gateway): add per-endpoint rate limiting and webhook idempotency' directly and clearly summarizes the main changes in this PR, which focus on adding rate limiting and idempotency mechanisms to the gateway.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into main

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/tools/shell.rs (1)

99-118: ⚠️ Potential issue | 🟠 Major

env_clear() on the Docker CLI process may break Docker host configuration.

When the runtime is DockerRuntime, cmd is the docker CLI process. Calling env_clear() strips variables like DOCKER_HOST, DOCKER_CONFIG, DOCKER_TLS_VERIFY, and DOCKER_CERT_PATH that are commonly used to configure the Docker client (especially in CI or remote-Docker setups). The current SAFE_ENV_VARS allowlist does not include any Docker-specific variables, so docker run will silently fall back to the default socket — which may not exist or may point to the wrong daemon.

Consider letting the runtime declare additional safe env vars, or moving env sanitization inside the runtime's build_shell_command so each runtime can make the right call.

Possible approach: add a trait method for runtime-specific safe env vars

In src/runtime/traits.rs:

/// Additional environment variables this runtime needs on the host process.
fn required_env_vars(&self) -> &[&str] {
    &[]
}

In DockerRuntime:

fn required_env_vars(&self) -> &[&str] {
    &["DOCKER_HOST", "DOCKER_CONFIG", "DOCKER_TLS_VERIFY", "DOCKER_CERT_PATH"]
}

Then in src/tools/shell.rs, after re-adding SAFE_ENV_VARS, also propagate self.runtime.required_env_vars().

🤖 Fix all issues with AI agents
In `@src/config/schema.rs`:
- Around line 420-449: The doc comment for DockerRuntimeConfig should clarify
the actual default behavior of allowed_workspace_roots (empty vector = allow all
workspaces, non-empty = allowlist-only) to match runtime logic in the Docker
mount validation; update the struct-level docs above DockerRuntimeConfig and the
field doc for allowed_workspace_roots to state "empty = allow all, non-empty =
allowlist-only (see runtime/docker.rs validation)". If you prefer a restrictive
default instead, change the serde default for allowed_workspace_roots to return
a restrictive list (e.g., default to vec!["/tmp".to_string()]) by replacing the
current #[serde(default)] with a #[serde(default =
"default_restrictive_workspace_roots")] and implement that function, and update
the docs accordingly.

In `@src/gateway/mod.rs`:
- Around line 55-58: The current allow() implementation treats limit_per_window
== 0 as "unlimited", which is counterintuitive; change the config/type so 0 no
longer means unlimited by making limit_per_window an Option<u32> (None =
unlimited) and update the config parsing for pair_rate_limit_per_minute to map 0
=> Some(0) or None appropriately (e.g., interpret an explicit sentinel like
-1/absent as None), then modify the allow(&self, key: &str) method to: return
true only when limit_per_window is None (unlimited), treat Some(0) as "block
all" and otherwise enforce the numeric limit. Also update any docs/comments to
state the new semantics for pair_rate_limit_per_minute and adjust any code that
constructs or reads limit_per_window.
- Around line 137-147: client_key_from_headers currently trusts X-Forwarded-For
/ X-Real-IP unconditionally which allows header spoofing; update the logic so
the default behavior derives the rate-limit key from the real TCP peer address
(use axum::extract::ConnectInfo<SocketAddr> where the rate-limit lookup occurs)
and only fallback to or combine with X-Forwarded-For when a new boolean config
flag (e.g. trust_proxy_headers) is true, and add a short code comment in
client_key_from_headers and an explicit note in SECURITY.md explaining the
trusted-proxy requirement when trust_proxy_headers is enabled.
🧹 Nitpick comments (5)
src/runtime/docker.rs (1)

130-134: Empty image name would produce a confusing Docker error.

If self.config.image is empty or whitespace-only, trim() yields "", and Docker will fail with an unhelpful error. Consider validating the image name early (e.g., in new() or at the start of build_shell_command).

src/security/policy.rs (2)

171-289: command_risk_level doesn't use &self — consider making it a pure function or documenting intent.

The method takes &self but accesses no fields of SecurityPolicy. If you plan to make risk classification configurable per-policy (e.g. user-defined high-risk lists), the current signature is forward-looking. Otherwise, it could be an associated function or free function for clarity and easier testability.

This is a minor style observation, not blocking.


647-710: Good test coverage for risk classification and approval gates.

The tests cover low/medium/high classification and the approval/blocking behavior. Consider adding an edge-case test where block_high_risk_commands = false and approved = true to verify the allow path for high-risk commands when the safety net is explicitly disabled.

src/gateway/mod.rs (2)

104-135: IdempotencyStore has no max-capacity bound — memory can grow under abusive traffic.

Each call to record_if_new cleans expired entries, which bounds the store to keys from the last TTL window. However, within that window, there's no cap on unique keys. Under a DDoS with unique idempotency keys (which bypasses per-client rate limiting via spoofed headers), the HashMap grows unbounded.

Consider adding a maximum capacity (e.g., evict oldest entry when exceeding N keys) as a safety net.


184-192: Nit: hardcoded default model string duplicated.

The default model "anthropic/claude-sonnet-4-20250514" is defined inline here and also presumably in Config::default() (line 807 of schema.rs). Consider referencing a shared constant to keep them in sync.

Comment thread src/config/schema.rs
Comment on lines +420 to +449
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DockerRuntimeConfig {
/// Runtime image used to execute shell commands.
#[serde(default = "default_docker_image")]
pub image: String,

/// Docker network mode (`none`, `bridge`, etc.).
#[serde(default = "default_docker_network")]
pub network: String,

/// Optional memory limit in MB (`None` = no explicit limit).
#[serde(default = "default_docker_memory_limit_mb")]
pub memory_limit_mb: Option<u64>,

/// Optional CPU limit (`None` = no explicit limit).
#[serde(default = "default_docker_cpu_limit")]
pub cpu_limit: Option<f64>,

/// Mount root filesystem as read-only.
#[serde(default = "default_true")]
pub read_only_rootfs: bool,

/// Mount configured workspace into `/workspace`.
#[serde(default = "default_true")]
pub mount_workspace: bool,

/// Optional workspace root allowlist for Docker mount validation.
#[serde(default)]
pub allowed_workspace_roots: Vec<String>,
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg -n 'allowed_workspace_roots' --type=rust -C5

Repository: theonlyhennygod/zeroclaw

Length of output: 3853


Update the documentation to clarify the default workspace root behavior, or reconsider the permissive default.

When allowed_workspace_roots is empty, the code permits any workspace path to be mounted (see src/runtime/docker.rs:33-35). This contradicts the claim of "strong security defaults"—the default configuration allows unrestricted workspace mounting. Either:

  1. Document this behavior explicitly in the struct's doc comment (clarify that empty = allow all, non-empty = allowlist-only), or
  2. Change the default to a restrictive allowlist (e.g., ["/tmp"]) to align with stated security posture.
🤖 Prompt for AI Agents
In `@src/config/schema.rs` around lines 420 - 449, The doc comment for
DockerRuntimeConfig should clarify the actual default behavior of
allowed_workspace_roots (empty vector = allow all workspaces, non-empty =
allowlist-only) to match runtime logic in the Docker mount validation; update
the struct-level docs above DockerRuntimeConfig and the field doc for
allowed_workspace_roots to state "empty = allow all, non-empty = allowlist-only
(see runtime/docker.rs validation)". If you prefer a restrictive default
instead, change the serde default for allowed_workspace_roots to return a
restrictive list (e.g., default to vec!["/tmp".to_string()]) by replacing the
current #[serde(default)] with a #[serde(default =
"default_restrictive_workspace_roots")] and implement that function, and update
the docs accordingly.

Comment thread src/gateway/mod.rs
Comment on lines +55 to +58
fn allow(&self, key: &str) -> bool {
if self.limit_per_window == 0 {
return true;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

limit_per_window == 0 disables rate limiting — document or invert the semantics.

When limit_per_window is 0, allow() returns true unconditionally (line 57), effectively meaning "unlimited." This is counterintuitive — 0 might reasonably be expected to mean "block all requests." The config defaults (10, 60) make this unlikely in practice, but if an operator sets pair_rate_limit_per_minute = 0 expecting to disable the endpoint, they'd get the opposite.

Consider either:

  • Documenting that 0 = unlimited in the config schema, or
  • Inverting so 0 = block all and using a sentinel (e.g. u32::MAX or an Option) for unlimited.
🤖 Prompt for AI Agents
In `@src/gateway/mod.rs` around lines 55 - 58, The current allow() implementation
treats limit_per_window == 0 as "unlimited", which is counterintuitive; change
the config/type so 0 no longer means unlimited by making limit_per_window an
Option<u32> (None = unlimited) and update the config parsing for
pair_rate_limit_per_minute to map 0 => Some(0) or None appropriately (e.g.,
interpret an explicit sentinel like -1/absent as None), then modify the
allow(&self, key: &str) method to: return true only when limit_per_window is
None (unlimited), treat Some(0) as "block all" and otherwise enforce the numeric
limit. Also update any docs/comments to state the new semantics for
pair_rate_limit_per_minute and adjust any code that constructs or reads
limit_per_window.

Comment thread src/gateway/mod.rs
Comment on lines +137 to +147
fn client_key_from_headers(headers: &HeaderMap) -> String {
for header_name in ["X-Forwarded-For", "X-Real-IP"] {
if let Some(value) = headers.get(header_name).and_then(|v| v.to_str().ok()) {
let first = value.split(',').next().unwrap_or("").trim();
if !first.is_empty() {
return first.to_owned();
}
}
}
"unknown".into()
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Check for ConnectInfo usage
rg -n 'ConnectInfo' --type=rust

# Check for trusted proxy configuration
rg -n 'trusted.proxy|trust_proxy|real_ip_from' --type=rust -i

# Check if there's any proxy configuration in the codebase
rg -n 'proxy|forwarded|x-forwarded|x-real-ip' --type=rust -i | head -30

# Check how the gateway is actually configured/documented
fd -e md -e txt -e toml | xargs rg -l 'gateway|proxy|reverse' 2>/dev/null | head -10

Repository: theonlyhennygod/zeroclaw

Length of output: 318


🏁 Script executed:

# Check SECURITY.md for proxy/trusted proxy documentation
cat SECURITY.md

# Check gateway configuration in template
cat dev/config.template.toml | grep -A 20 -B 5 gateway

# Check README for gateway deployment info
grep -A 10 -B 5 -i 'gateway\|proxy\|reverse' README.md | head -50

Repository: theonlyhennygod/zeroclaw

Length of output: 5907


🏁 Script executed:

# Check the actual gateway handler code to see how it uses client_key_from_headers
sed -n '300,450p' src/gateway/mod.rs

# Check if there's any ConnectInfo or connection-based validation
rg -n 'SocketAddr|peer|connection|remote_addr' src/gateway/mod.rs --type=rust

# Look at the full client_key_from_headers usage context
rg -B5 -A10 'client_key_from_headers' src/gateway/mod.rs --type=rust

Repository: theonlyhennygod/zeroclaw

Length of output: 7313


Client rate-limit key is spoofable via header manipulation — document trusted proxy requirement or use connection peer address.

client_key_from_headers extracts rate-limiting keys from X-Forwarded-For / X-Real-IP headers without validating their source. An attacker with direct access (not behind a validating reverse proxy) can rotate these headers on each request, obtaining a fresh rate-limit bucket per request and completely bypassing the limiter.

This is particularly critical for the /pair endpoint, which relies on rate limiting as its sole DoS protection.

Options:

  1. If always behind trusted proxy: Document this requirement explicitly in SECURITY.md and code comments.
  2. Better: Use axum::extract::ConnectInfo<SocketAddr> to derive the rate-limit key from the actual TCP peer address, optionally layering in X-Forwarded-For only when a configurable trust_proxy_headers = true setting is enabled.

For default deployments (localhost only), this is not exploitable, but users enabling allow_public_bind = true need clear guidance.

🤖 Prompt for AI Agents
In `@src/gateway/mod.rs` around lines 137 - 147, client_key_from_headers currently
trusts X-Forwarded-For / X-Real-IP unconditionally which allows header spoofing;
update the logic so the default behavior derives the rate-limit key from the
real TCP peer address (use axum::extract::ConnectInfo<SocketAddr> where the
rate-limit lookup occurs) and only fallback to or combine with X-Forwarded-For
when a new boolean config flag (e.g. trust_proxy_headers) is true, and add a
short code comment in client_key_from_headers and an explicit note in
SECURITY.md explaining the trusted-proxy requirement when trust_proxy_headers is
enabled.

@theonlyhennygod theonlyhennygod merged commit 9e55ab0 into zeroclaw-labs:main Feb 15, 2026
1 check passed
AllenHyang pushed a commit to AllenHyang/zeroclaw that referenced this pull request Feb 20, 2026
…zeroclaw-labs#188)

* feat(runtime): add Docker runtime MVP and runtime-aware command builder

* feat(security): add shell risk classification, approval gates, and action throttling

* feat(gateway): add per-endpoint rate limiting and webhook idempotency

---------

Co-authored-by: chumyin <chumyin@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants