fix(templates): coder template uses existing tests and edit_file by qhkm · Pull Request #359 · qhkm/zeptoclaw

qhkm · 2026-03-14T17:45:26Z

Summary

Updated coder template system prompt to instruct the LLM to check for existing test files before writing new ones
Added workflow: discover existing tests → run them first → fix in-place with edit_file → verify with existing suite → summarize changes
Added regression test test_coder_prompt_instructs_existing_test_workflow to guard the key instructions

Root cause

The coder template said "write tests when appropriate" but never told the LLM to look for existing tests first. This caused it to write its own tests matching buggy behavior, pass them, and report "no bugs found."

Test plan

cargo nextest run --lib -E 'test(templates)' — 32/32 pass
cargo clippy -- -D warnings — clean
cargo fmt -- --check — clean
Manual: zeptoclaw agent --template coder -m "Read test-coding-zepto/calculator.py, find bugs, fix them, verify with tests" should now use existing test_calculator.py

Closes #345

🤖 Generated with Claude Code

Summary by CodeRabbit

Improvements
- Enhanced built-in coder template with clearer, step-by-step guidance for bug-fix workflows: check existing tests, perform in-place edits, re-run test suites, and guidance on when to add new tests.
Tests
- Added a unit test to verify the coder template includes the updated bug-fix workflow guidance.
Style
- Minor whitespace/formatting cleanups in message parsing tests.

coderabbitai · 2026-03-14T17:45:40Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2d011293-e859-489d-b062-5afd12817be4

📥 Commits

Reviewing files that changed from the base of the PR and between 677d28d and 742178e.

📒 Files selected for processing (1)

src/channels/discord.rs

✅ Files skipped from review due to trivial changes (1)

src/channels/discord.rs

📝 Walkthrough

Walkthrough

Expanded the built-in coder template's system_prompt with explicit, structured bug-fix guidance (check existing tests, use edit_file for in-place edits, re-run existing test suites, write new tests only if none exist). Added a unit test validating those prompt instructions. Minor whitespace tweaks in a Discord parser test.

Changes

Cohort / File(s)	Summary
Coder template & test `src/config/templates.rs`	Expanded the coder template `system_prompt` with step-by-step bug-fix workflow: check for existing tests, run them first, prefer `edit_file` for in-place fixes, re-run existing test suite to verify, and only write new tests when no relevant tests exist. Added `test_coder_prompt_instructs_existing_test_workflow` to assert these instructions are present.
Discord channel whitespace `src/channels/discord.rs`	Whitespace-only adjustments in parsing and corresponding test (`test_message_create_with_image_only`), no behavioral changes.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 I sniffed the prompt and hopped with glee,
"Use the tests that live," I whispered with tea,
Edit in place, then run them anew,
Only write tests when none exist true,
A tiny hop for a clearer dev spree. 🐇

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Out of Scope Changes check	⚠️ Warning	The PR contains whitespace-only changes to discord.rs unrelated to the template workflow fix objectives outlined in issue `#345`.	Remove the whitespace-only changes in src/channels/discord.rs as they are out of scope and unrelated to the coder template fix.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly reflects the main change: updating the coder template to use existing tests and edit_file for bug fixing workflows.
Linked Issues check	✅ Passed	All objectives from issue `#345` are met: prompt instructs checking existing tests, using edit_file for in-place fixes, and re-running existing tests for verification.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch fix/coder-template-workflow

📝 Coding Plan

Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (1)

src/r8r_bridge/health.rs (1)
27-37: Consider interval_at for deferred first tick.

time::interval() fires immediately on the first .tick(). If you want the first ping to occur after interval_secs rather than immediately on spawn, use time::interval_at(Instant::now() + duration, duration). This is likely acceptable as-is if an immediate health check is desired.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/r8r_bridge/health.rs` around lines 27 - 37, The current tokio::spawn loop
uses time::interval which causes the first interval.tick().await to fire
immediately; to defer the first health ping use time::interval_at(Instant::now()
+ Duration::from_secs(interval_secs), Duration::from_secs(interval_secs))
instead of time::interval so the first send_health_ping() is invoked after
interval_secs; update the spawned task in health.rs (the async block that
references interval_secs, bridge.is_connected(), and bridge.send_health_ping())
to construct the interval with interval_at and the appropriate Instant and
Duration.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/config/types.rs`:
- Around line 135-137: The KNOWN_TOP_LEVEL constant in validate.rs is missing
the top-level field r8r_bridge; update the KNOWN_TOP_LEVEL collection (symbol:
KNOWN_TOP_LEVEL) to include "r8r_bridge" so the R8rBridgeConfig declared as
r8r_bridge in types.rs is validated properly, and then review src/config/mod.rs
(where env overrides are defined) to decide if you need to add environment
overrides for r8r_bridge (e.g., ZEPTOCLAW_R8R_BRIDGE_*) and add them if
required.

In `@src/r8r_bridge/mod.rs`:
- Around line 263-269: disconnect() currently only clears the public sender and
flips connected to false but does not close the underlying WebSocket or stop the
spawned reader task; locate the bridge struct fields (sender, connected,
ws_read, ws_write, and any task handles like reader_handle or shutdown_tx) and
update disconnect() to: signal/trigger shutdown to the reader task (e.g., via a
shutdown channel or by aborting the task handle), await or join the
reader_handle if available, and explicitly close/ shutdown both ws_write and
ws_read (or call the WebSocket close procedure) before setting sender to None
and connected to false so the old socket is fully torn down prior to logging
disconnection.
- Around line 187-205: The duplicate-check logic skips duplicates before sending
an Ack; change it so you still send an Ack for duplicates: call
dedup.lock().await and read dd.is_new(&envelope.id) into a local boolean, then
drop the dedup lock (exit that scope) and if the event is duplicate construct
Ack and use serde_json::to_string + ws_write.lock().await to send
WsMessage::Text via writer.send, then continue; if it is new, proceed with
normal processing. Ensure you do not hold the dedup lock while awaiting
ws_write.lock().await to avoid deadlocks (use a local flag, not the locked
guard, when sending the ack).
- Around line 110-118: The code currently calls unwrap() when parsing header
values for the Host and Authorization headers (the insert calls on
request.headers_mut() using host_with_port.parse().unwrap() and format!("Bearer
{token}").parse().unwrap()); replace these unwraps with proper error propagation
by attempting to parse the header values (e.g., via HeaderValue::from_str or
.parse()) and mapping any parsing error into the function's Result<(), String>
(e.g., map_err(|e| format!("invalid Host header: {}", e)) and similar for the
Authorization header using self.token), then insert the resulting HeaderValue
into request.headers_mut() so parsing failures return Err(...) instead of
panicking.

In `@tests/r8r_bridge_client_test.rs`:
- Around line 92-110: Replace the fixed
tokio::time::sleep(Duration::from_millis(500)).await with a timeout-based
polling loop: use tokio::time::timeout(...) to bound the total wait and
repeatedly call bridge.last_health_status().await (with a short sleep between
polls) until it returns Some(BridgeEvent::HealthStatus { ... }) or the timeout
elapses; then assert on the returned version and uptime_secs. Ensure you import
and use tokio::time::timeout and keep the same assertion logic that checks the
BridgeEvent::HealthStatus variant and fields.

---

Nitpick comments:
In `@src/r8r_bridge/health.rs`:
- Around line 27-37: The current tokio::spawn loop uses time::interval which
causes the first interval.tick().await to fire immediately; to defer the first
health ping use time::interval_at(Instant::now() +
Duration::from_secs(interval_secs), Duration::from_secs(interval_secs)) instead
of time::interval so the first send_health_ping() is invoked after
interval_secs; update the spawned task in health.rs (the async block that
references interval_secs, bridge.is_connected(), and bridge.send_health_ping())
to construct the interval with interval_at and the appropriate Instant and
Duration.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7cc6a605-e6aa-493a-82fc-abd532f5face

📥 Commits

Reviewing files that changed from the base of the PR and between 2d39ea2 and 156fffa.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (14)

Cargo.toml
src/cli/common.rs
src/config/mod.rs
src/config/templates.rs
src/config/types.rs
src/lib.rs
src/r8r_bridge/approval.rs
src/r8r_bridge/dedup.rs
src/r8r_bridge/events.rs
src/r8r_bridge/health.rs
src/r8r_bridge/mod.rs
tests/r8r_bridge_approval_test.rs
tests/r8r_bridge_client_test.rs
tests/r8r_bridge_events_test.rs

coderabbitai · 2026-03-14T17:57:23Z

+    /// r8r workflow-engine bridge configuration.
+    #[serde(default)]
+    pub r8r_bridge: R8rBridgeConfig,


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail printf "== src/config/mod.rs ==\n" rg -n "r8r_bridge|system_prompt|ZEPTOCLAW_AGENTS_DEFAULTS_SYSTEM_PROMPT|R8R_BRIDGE" src/config/mod.rs || true printf "\n== src/config/validate.rs ==\n" rg -n "KNOWN_TOP_LEVEL|r8r_bridge" src/config/validate.rs || true

Repository: qhkm/zeptoclaw

Length of output: 643

🏁 Script executed:

#!/bin/bash set -euo pipefail printf "== src/config/validate.rs: KNOWN_TOP_LEVEL content ==\n" sed -n '7,30p' src/config/validate.rs printf "\n== src/config/mod.rs: env override section (around line 142) ==\n" sed -n '135,160p' src/config/mod.rs printf "\n== Check if r8r_bridge appears anywhere in src/config/mod.rs ==\n" rg -i "r8r" src/config/mod.rs || echo "No r8r_bridge found in mod.rs"

Repository: qhkm/zeptoclaw

Length of output: 2005

Add r8r_bridge to KNOWN_TOP_LEVEL in src/config/validate.rs.

The env override for system_prompt is correctly in place at src/config/mod.rs:142-143, but r8r_bridge is missing from the KNOWN_TOP_LEVEL list in src/config/validate.rs. Since r8r_bridge is a top-level config field, it must be registered there for proper validation. Consider also whether an env override (e.g., ZEPTOCLAW_R8R_BRIDGE_*) is needed in src/config/mod.rs.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/config/types.rs` around lines 135 - 137, The KNOWN_TOP_LEVEL constant in validate.rs is missing the top-level field r8r_bridge; update the KNOWN_TOP_LEVEL collection (symbol: KNOWN_TOP_LEVEL) to include "r8r_bridge" so the R8rBridgeConfig declared as r8r_bridge in types.rs is validated properly, and then review src/config/mod.rs (where env overrides are defined) to decide if you need to add environment overrides for r8r_bridge (e.g., ZEPTOCLAW_R8R_BRIDGE_*) and add them if required.

coderabbitai · 2026-03-14T17:57:23Z

+        request
+            .headers_mut()
+            .insert("Host", host_with_port.parse().unwrap());
+
+        if let Some(ref token) = self.token {
+            request.headers_mut().insert(
+                "Authorization",
+                format!("Bearer {token}").parse().unwrap(),
+            );


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and read the exact file and lines head -n 120 src/r8r_bridge/mod.rs | tail -n 20

Repository: qhkm/zeptoclaw

Length of output: 603

🏁 Script executed:

# Get the full context around lines 110-118 sed -n '100,130p' src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 1007

🏁 Script executed:

# Check the function signature to understand return type sed -n '90,120p' src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 1102

🏁 Script executed:

# Verify the function signature and return type grep -n "pub async fn connect" src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 116

🏁 Script executed:

# Check what type .parse() is parsing into by looking at HeaderMap usage rg "headers_mut.*insert" -A 1 -B 1 src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 178

🏁 Script executed:

# Check if there are any tests or callers that show what values token and host could have rg "R8rBridge::new|R8rBridge.*token" -A 2 -B 2 src/

Repository: qhkm/zeptoclaw

Length of output: 40

🏁 Script executed:

# Verify the http crate's HeaderValue to understand parse() failure modes rg "use.*HeaderValue|use.*http" src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 40

🏁 Script executed:

# Check the imports and type signatures in the file head -n 50 src/r8r_bridge/mod.rs | grep -E "^use|^impl"

Repository: qhkm/zeptoclaw

Length of output: 378

🏁 Script executed:

# Look for the full module context to understand what types are involved sed -n '1,80p' src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 2939

🏁 Script executed:

# Check the http crate documentation to understand HeaderValue::parse() behavior # by looking for how it's used elsewhere in the codebase rg "HeaderValue" src/

Repository: qhkm/zeptoclaw

Length of output: 463

🏁 Script executed:

# Check the tungstenite ClientRequest and HeaderMap to understand what type parse() targets rg "ClientRequest|headers_mut" -A 2 src/r8r_bridge/mod.rs | head -20

Repository: qhkm/zeptoclaw

Length of output: 436

🏁 Script executed:

# Look for any error handling patterns with HeaderValue in the codebase rg "HeaderValue::from_str|parse.*HeaderValue" -B 2 -A 2 src/

Repository: qhkm/zeptoclaw

Length of output: 1370

🏁 Script executed:

# Check if there are any tests or examples showing R8rBridge with various token values fd "test" src/ --type f -x grep -l "R8rBridge\|connect" {} \;

Repository: qhkm/zeptoclaw

Length of output: 40

🏁 Script executed:

# Verify: can a valid URL host and port actually produce an invalid HTTP header? # Check the http crate's HeaderValue validation rules by looking at documentation comments python3 << 'EOF' # HeaderValue from http crate must contain only ASCII and specific chars # Let's check what characters are valid in HTTP header values import re # HTTP header values must be VCHAR (33-126) or SPACE (32) or HTAB (9) def is_valid_header_value(s): for c in s: code = ord(c) if not (code in [9, 32] or (33 <= code <= 126)): return False return True # Test with a URL host that might be problematic test_cases = [ "example.com:8080", "localhost:9000", "192.168.1.1:443", "example.com", # no port "example-with-dashes.com:8080", "example.com:invalid", # should fail URL parsing first ] for case in test_cases: print(f"{case}: valid_header={is_valid_header_value(case)}") EOF

Repository: qhkm/zeptoclaw

Length of output: 280

🏁 Script executed:

# Check if there are any actual test cases with malformed tokens rg "token.*=" tests/ --type rs -A 2 -B 2

Repository: qhkm/zeptoclaw

Length of output: 84

🏁 Script executed:

# Verify the exact type that .parse() is calling - check tungstenite's implementation # by looking at what types the headers_mut().insert expects python3 << 'EOF' # Let's verify what the tungstenite library expects # The insert method signature is: insert(K, V) where V typically needs to be HeaderValue # HeaderValue::from_str() can fail if the string contains invalid characters # According to HTTP spec (RFC 7230), header field values are: # field-value = *( field-content / obs-fold ) # field-content = field-vchar [ 1*( SP / HTAB ) field-vchar ] # field-vchar = VCHAR / obs-text # Let's simulate what could fail test_tokens = [ "Bearer validtoken123", # valid "Bearer token\nwith\nnewlines", # INVALID - contains newlines "Bearer token\x00null", # INVALID - null bytes "Bearer token with spaces", # valid "Bearer token™", # potentially invalid - non-ASCII ] for token in test_tokens: is_ascii = all(ord(c) < 128 for c in token) has_control = any(ord(c) < 32 for c in token) print(f"{token!r}: ascii_only={is_ascii}, has_control_chars={has_control}") EOF

Repository: qhkm/zeptoclaw

Length of output: 386

🏁 Script executed:

# Most importantly: check what the coding guideline says about unwrap grep -n "unwrap\|expect" src/r8r_bridge/mod.rs | head -20

Repository: qhkm/zeptoclaw

Length of output: 187

🏁 Script executed:

# Verify if there are other examples in r8r_bridge that handle Results properly sed -n '90,300p' src/r8r_bridge/mod.rs | grep -E "\.map_err|Result|return Err"

Repository: qhkm/zeptoclaw

Length of output: 613

Remove unwrap() calls and handle header parsing failures gracefully.

The function returns Result<(), String> and already propagates errors consistently throughout, but lines 112 and 117 panic on invalid header values instead. The bearer token in particular is user-provided at runtime and could contain control characters or non-ASCII bytes, causing HeaderValue::parse() to fail. Propagate these errors instead of unwrapping.

Suggested change

- request - .headers_mut() - .insert("Host", host_with_port.parse().unwrap()); + let host_header = host_with_port + .parse() + .map_err(|e| format!("Invalid Host header value: {e}"))?; + request.headers_mut().insert("Host", host_header); if let Some(ref token) = self.token { + let auth_header = format!("Bearer {token}") + .parse() + .map_err(|e| format!("Invalid Authorization header value: {e}"))?; request.headers_mut().insert( "Authorization", - format!("Bearer {token}").parse().unwrap(), + auth_header, ); }

Per coding guidelines: "Do not add unwrap()/expect() in production paths unless failure is truly unrecoverable."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/r8r_bridge/mod.rs` around lines 110 - 118, The code currently calls unwrap() when parsing header values for the Host and Authorization headers (the insert calls on request.headers_mut() using host_with_port.parse().unwrap() and format!("Bearer {token}").parse().unwrap()); replace these unwraps with proper error propagation by attempting to parse the header values (e.g., via HeaderValue::from_str or .parse()) and mapping any parsing error into the function's Result<(), String> (e.g., map_err(|e| format!("invalid Host header: {}", e)) and similar for the Authorization header using self.token), then insert the resulting HeaderValue into request.headers_mut() so parsing failures return Err(...) instead of panicking.

coderabbitai · 2026-03-14T17:57:23Z

+                // Dedup check.
+                {
+                    let mut dd = dedup.lock().await;
+                    if !dd.is_new(&envelope.id) {
+                        info!("r8r bridge: skipping duplicate event {}", envelope.id);
+                        continue;
+                    }
+                }
+
+                // Send ack.
+                let ack = Ack {
+                    event_id: envelope.id.clone(),
+                };
+                if let Ok(ack_json) = serde_json::to_string(&ack) {
+                    let mut writer = ws_write.lock().await;
+                    if let Err(e) = writer.send(WsMessage::Text(ack_json.into())).await {
+                        warn!("r8r bridge: failed to send ack: {e}");
+                    }
+                }


⚠️ Potential issue | 🟠 Major

Ack duplicates before you skip them.

The duplicate check currently continues before the ack path. If r8r retransmits because the first ack was lost, this client will keep dropping the retry without ever acknowledging it, which can trap the sender in a retry loop.

🔁 Suggested change

- // Dedup check. - { - let mut dd = dedup.lock().await; - if !dd.is_new(&envelope.id) { - info!("r8r bridge: skipping duplicate event {}", envelope.id); - continue; - } - } - // Send ack. let ack = Ack { event_id: envelope.id.clone(), }; if let Ok(ack_json) = serde_json::to_string(&ack) { let mut writer = ws_write.lock().await; if let Err(e) = writer.send(WsMessage::Text(ack_json.into())).await { warn!("r8r bridge: failed to send ack: {e}"); } } + + // Dedup check. + { + let mut dd = dedup.lock().await; + if !dd.is_new(&envelope.id) { + info!("r8r bridge: skipping duplicate event {}", envelope.id); + continue; + } + }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// Dedup check.

{

let mut dd = dedup.lock().await;

if !dd.is_new(&envelope.id) {

info!("r8r bridge: skipping duplicate event {}", envelope.id);

continue;

}

}

// Send ack.

let ack = Ack {

event_id: envelope.id.clone(),

};

if let Ok(ack_json) = serde_json::to_string(&ack) {

let mut writer = ws_write.lock().await;

if let Err(e) = writer.send(WsMessage::Text(ack_json.into())).await {

warn!("r8r bridge: failed to send ack: {e}");

}

}

// Send ack.

let ack = Ack {

event_id: envelope.id.clone(),

};

if let Ok(ack_json) = serde_json::to_string(&ack) {

let mut writer = ws_write.lock().await;

if let Err(e) = writer.send(WsMessage::Text(ack_json.into())).await {

warn!("r8r bridge: failed to send ack: {e}");

}

}

// Dedup check.

{

let mut dd = dedup.lock().await;

if !dd.is_new(&envelope.id) {

info!("r8r bridge: skipping duplicate event {}", envelope.id);

continue;

}

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/r8r_bridge/mod.rs` around lines 187 - 205, The duplicate-check logic skips duplicates before sending an Ack; change it so you still send an Ack for duplicates: call dedup.lock().await and read dd.is_new(&envelope.id) into a local boolean, then drop the dedup lock (exit that scope) and if the event is duplicate construct Ack and use serde_json::to_string + ws_write.lock().await to send WsMessage::Text via writer.send, then continue; if it is new, proceed with normal processing. Ensure you do not hold the dedup lock while awaiting ws_write.lock().await to avoid deadlocks (use a local flag, not the locked guard, when sending the ack).

coderabbitai · 2026-03-14T17:57:23Z

+    /// Disconnect from the bridge, clearing the sender and setting connected to false.
+    pub async fn disconnect(&self) {
+        let mut sender_guard = self.sender.lock().await;
+        *sender_guard = None;
+        self.connected.store(false, Ordering::Relaxed);
+        info!("r8r bridge disconnected (manual)");
+    }


⚠️ Potential issue | 🟠 Major

disconnect() doesn't actually tear down the WebSocket session.

This only clears the public sender and flips connected to false. The spawned reader task still owns ws_read, and the reader also retains ws_write for acks, so the old socket can stay alive while a later connect() creates a second session. Store shutdown/task handles and close the split stream here before reporting the bridge as disconnected.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/r8r_bridge/mod.rs` around lines 263 - 269, disconnect() currently only clears the public sender and flips connected to false but does not close the underlying WebSocket or stop the spawned reader task; locate the bridge struct fields (sender, connected, ws_read, ws_write, and any task handles like reader_handle or shutdown_tx) and update disconnect() to: signal/trigger shutdown to the reader task (e.g., via a shutdown channel or by aborting the task handle), await or join the reader_handle if available, and explicitly close/ shutdown both ws_write and ws_read (or call the WebSocket close procedure) before setting sender to None and connected to false so the old socket is fully torn down prior to logging disconnection.

coderabbitai · 2026-03-14T17:57:23Z

+    // Wait briefly for the bridge to receive the health status response.
+    tokio::time::sleep(Duration::from_millis(500)).await;
+
+    // The mock server echoes a health status; we may also receive the ack first.
+    // Drain the ack message from rx (the bridge sends an ack for the health status).
+    // The health status should now be stored.
+    let health = bridge.last_health_status().await;
+    assert!(health.is_some(), "bridge should have stored health status");
+
+    if let Some(BridgeEvent::HealthStatus {
+        version,
+        uptime_secs,
+        ..
+    }) = health
+    {
+        assert_eq!(version, "0.1.0-test");
+        assert_eq!(uptime_secs, 42);
+    } else {
+        panic!("Expected HealthStatus variant");


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -name "r8r_bridge_client_test.rs" -type f

Repository: qhkm/zeptoclaw

Length of output: 91

🏁 Script executed:

sed -n '92,110p' ./tests/r8r_bridge_client_test.rs

Repository: qhkm/zeptoclaw

Length of output: 789

🏁 Script executed:

sed -n '50,115p' ./tests/r8r_bridge_client_test.rs

Repository: qhkm/zeptoclaw

Length of output: 2236

🏁 Script executed:

rg "last_health_status" ./tests/r8r_bridge_client_test.rs

Repository: qhkm/zeptoclaw

Length of output: 109

🏁 Script executed:

rg "fn last_health_status" -A 5

Repository: qhkm/zeptoclaw

Length of output: 391

🏁 Script executed:

rg "tokio::time::sleep" ./tests/r8r_bridge_client_test.rs

Repository: qhkm/zeptoclaw

Length of output: 115

Replace the fixed sleep with a timeout-based polling loop.

Lines 93–100 use an arbitrary 500ms sleep before checking the health status. This makes the test timing-sensitive and flaky on slower CI runners. Instead, poll last_health_status() within a timeout() (consistent with the timeout pattern already used at line 82 for the ping receipt) so the assertion waits for the actual state change.

Suggested change

- // Wait briefly for the bridge to receive the health status response. - tokio::time::sleep(Duration::from_millis(500)).await; - - // The mock server echoes a health status; we may also receive the ack first. - // Drain the ack message from rx (the bridge sends an ack for the health status). - // The health status should now be stored. - let health = bridge.last_health_status().await; - assert!(health.is_some(), "bridge should have stored health status"); - - if let Some(BridgeEvent::HealthStatus { + let health = tokio::time::timeout(Duration::from_secs(5), async { + loop { + if let Some(health) = bridge.last_health_status().await { + break health; + } + tokio::time::sleep(Duration::from_millis(50)).await; + } + }) + .await + .expect("bridge should receive health status"); + + if let BridgeEvent::HealthStatus { version, uptime_secs, .. - }) = health + } = health { assert_eq!(version, "0.1.0-test"); assert_eq!(uptime_secs, 42); } else { panic!("Expected HealthStatus variant");

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// Wait briefly for the bridge to receive the health status response.

tokio::time::sleep(Duration::from_millis(500)).await;

// The mock server echoes a health status; we may also receive the ack first.

// Drain the ack message from rx (the bridge sends an ack for the health status).

// The health status should now be stored.

let health = bridge.last_health_status().await;

assert!(health.is_some(), "bridge should have stored health status");

if let Some(BridgeEvent::HealthStatus {

version,

uptime_secs,

..

}) = health

{

assert_eq!(version, "0.1.0-test");

assert_eq!(uptime_secs, 42);

} else {

panic!("Expected HealthStatus variant");

let health = tokio::time::timeout(Duration::from_secs(5), async {

loop {

if let Some(health) = bridge.last_health_status().await {

break health;

}

tokio::time::sleep(Duration::from_millis(50)).await;

}

})

.await

.expect("bridge should receive health status");

if let BridgeEvent::HealthStatus {

version,

uptime_secs,

..

} = health

{

assert_eq!(version, "0.1.0-test");

assert_eq!(uptime_secs, 42);

} else {

panic!("Expected HealthStatus variant");

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/r8r_bridge_client_test.rs` around lines 92 - 110, Replace the fixed tokio::time::sleep(Duration::from_millis(500)).await with a timeout-based polling loop: use tokio::time::timeout(...) to bound the total wait and repeatedly call bridge.last_health_status().await (with a short sleep between polls) until it returns Some(BridgeEvent::HealthStatus { ... }) or the timeout elapses; then assert on the returned version and uptime_secs. Ensure you import and use tokio::time::timeout and keep the same assertion logic that checks the BridgeEvent::HealthStatus variant and fields.

The coder template system prompt lacked instructions to check for existing test files before writing new ones, causing the LLM to write its own tests that matched buggy behavior and report false success. Added a 5-step workflow: discover existing tests, run them first, fix in-place with edit_file, verify with existing suite, summarize changes. Closes #345 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/config/templates.rs`:
- Around line 440-465: The test
test_coder_prompt_instructs_existing_test_workflow should also assert that the
coder.system_prompt enforces running existing tests first; update the test
(using the prompt variable already defined) to include an assertion that
prompt.contains("run the existing tests first") or a similar phrase that
includes both "run" and "first" (e.g., "run tests first" / "run the existing
tests first") so the contract that tests must be executed before editing is
explicitly locked down.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 58553670-3d96-4603-9ae1-5e1a12afcc45

📥 Commits

Reviewing files that changed from the base of the PR and between 156fffa and 677d28d.

📒 Files selected for processing (1)

src/config/templates.rs

coderabbitai · 2026-03-14T18:05:09Z

+    #[test]
+    fn test_coder_prompt_instructs_existing_test_workflow() {
+        let registry = TemplateRegistry::new();
+        let coder = registry.get("coder").unwrap();
+        let prompt = &coder.system_prompt;
+        // Must instruct to check for existing tests before writing new ones
+        assert!(
+            prompt.contains("existing test"),
+            "coder prompt must mention existing tests"
+        );
+        // Must instruct to use edit_file for in-place fixes
+        assert!(
+            prompt.contains("edit_file"),
+            "coder prompt must instruct using edit_file"
+        );
+        // Must instruct to run existing tests for verification
+        assert!(
+            prompt.contains("re-running the EXISTING test suite"),
+            "coder prompt must instruct re-running existing tests"
+        );
+        // Must NOT encourage writing own tests as default verification
+        assert!(
+            prompt.contains("Only write new tests if no relevant tests exist"),
+            "coder prompt must discourage writing new tests when existing ones are available"
+        );
+    }


⚠️ Potential issue | 🟡 Minor

Regression test should also lock the “run tests first” contract.

At Line 445-Line 464, the assertions cover existing-tests usage and re-run behavior, but they don’t explicitly guard the “run them FIRST” requirement.

Suggested patch

#[test] fn test_coder_prompt_instructs_existing_test_workflow() { let registry = TemplateRegistry::new(); let coder = registry.get("coder").unwrap(); let prompt = &coder.system_prompt; @@ assert!( prompt.contains("Only write new tests if no relevant tests exist"), "coder prompt must discourage writing new tests when existing ones are available" ); + // Must instruct to run existing tests first before editing + assert!( + prompt.contains("run them FIRST"), + "coder prompt must require running existing tests first" + ); }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/config/templates.rs` around lines 440 - 465, The test test_coder_prompt_instructs_existing_test_workflow should also assert that the coder.system_prompt enforces running existing tests first; update the test (using the prompt variable already defined) to include an assertion that prompt.contains("run the existing tests first") or a similar phrase that includes both "run" and "first" (e.g., "run tests first" / "run the existing tests first") so the contract that tests must be executed before editing is explicitly locked down.

…m#359) ## Summary - Updated coder template system prompt to instruct the LLM to check for existing test files before writing new ones - Added workflow: discover existing tests → run them first → fix in-place with `edit_file` → verify with existing suite → summarize changes - Added regression test `test_coder_prompt_instructs_existing_test_workflow` to guard the key instructions ## Root cause The coder template said "write tests when appropriate" but never told the LLM to look for *existing* tests first. This caused it to write its own tests matching buggy behavior, pass them, and report "no bugs found." ## Test plan - [x] `cargo nextest run --lib -E 'test(templates)'` — 32/32 pass - [x] `cargo clippy -- -D warnings` — clean - [x] `cargo fmt -- --check` — clean - [ ] Manual: `zeptoclaw agent --template coder -m "Read test-coding-zepto/calculator.py, find bugs, fix them, verify with tests"` should now use existing `test_calculator.py` Closes qhkm#345 🤖 Generated with [Claude Code](https://claude.com/claude-code)  ## Summary by CodeRabbit * **Improvements** * Enhanced built-in coder template with clearer, step-by-step guidance for bug-fix workflows: check existing tests, perform in-place edits, re-run test suites, and guidance on when to add new tests. * **Tests** * Added a unit test to verify the coder template includes the updated bug-fix workflow guidance. * **Style** * Minor whitespace/formatting cleanups in message parsing tests.  --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai Bot reviewed Mar 14, 2026

View reviewed changes

qhkm force-pushed the fix/coder-template-workflow branch from 156fffa to 677d28d Compare March 14, 2026 18:02

chore: fix trailing whitespace in discord.rs

742178e

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot reviewed Mar 14, 2026

View reviewed changes

github-actions Bot mentioned this pull request Mar 15, 2026

🦞 OpenClaw 生态日报 2026-03-15 gsscsd/big_model_radar#38

Open

qhkm merged commit 624ad11 into main Mar 15, 2026
9 checks passed

qhkm deleted the fix/coder-template-workflow branch March 15, 2026 03:41

github-actions Bot mentioned this pull request Mar 16, 2026

🦞 OpenClaw 生态日报 2026-03-16 gsscsd/big_model_radar#44

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(templates): coder template uses existing tests and edit_file#359

fix(templates): coder template uses existing tests and edit_file#359
qhkm merged 2 commits intomainfrom
fix/coder-template-workflow

qhkm commented Mar 14, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Mar 14, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Mar 14, 2026

Uh oh!

coderabbitai Bot Mar 14, 2026

Uh oh!

coderabbitai Bot Mar 14, 2026

Uh oh!

coderabbitai Bot Mar 14, 2026

Uh oh!

coderabbitai Bot Mar 14, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Mar 14, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

qhkm commented Mar 14, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Root cause

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Mar 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 14, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

qhkm commented Mar 14, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Mar 14, 2026 •

edited

Loading