fix(agent): normalize empty successful tool output by aliasliao · Pull Request #5565 · zeroclaw-labs/zeroclaw

aliasliao · 2026-04-09T17:06:18Z

Summary

Describe this PR in 2-5 bullets:

Base branch target (master for all contributions): master
Problem: successful tool executions with empty stdout serialize into an empty tool-role content payload for some custom providers.
Why it matters: providers such as Deepseek V3.2 via GCP reject empty tool-role content and fail the follow-up LLM request with HTTP 400.
What changed: normalize empty successful tool output to (no output) before scrubbing and forwarding it.
What did not change (scope boundary): no provider-specific branching, no tool failure-path changes, no config/schema changes.
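The normalization described in the bullets above can be sketched as follows. This is a minimal, self-contained illustration rather than the code from the PR: `provider_facing_output`, `NO_OUTPUT_PLACEHOLDER`, and the body of `scrub_credentials` are assumptions made for the example. The project's real `scrub_credentials` is not shown in this thread.

```rust
/// Placeholder substituted for an empty successful tool output.
const NO_OUTPUT_PLACEHOLDER: &str = "(no output)";

/// Hypothetical stand-in for the project's credential scrubber.
/// This version only masks one fixed marker string, for illustration.
fn scrub_credentials(input: &str) -> String {
    input.replace("sk-example-token", "[REDACTED]")
}

/// Normalize first, then scrub, so the provider never receives empty content.
fn provider_facing_output(success_output: &str) -> String {
    let normalized = if success_output.is_empty() {
        NO_OUTPUT_PLACEHOLDER
    } else {
        success_output
    };
    scrub_credentials(normalized)
}
```

With these definitions, `provider_facing_output("")` returns `(no output)`, while any non-empty output is handed to the scrubber unchanged. Whitespace-only output is not treated as empty in this sketch, because the check is `is_empty()`.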

Label Snapshot (required)

Risk label (risk: low|medium|high): risk: medium
Size label (size: XS|S|M|L|XL, auto-managed/read-only): size: XS
Scope labels: agent,provider,tool
Module labels: agent: tool execution, provider: custom, tool: shell
Contributor tier label: trusted contributor
If any auto-label is incorrect, note requested correction: none

Change Metadata

Change type: bug
Primary scope: multi

Linked Issue

Closes [Bug]: Custom provider tool follow-up fails when successful tool output is empty #5564

Supersede Attribution (required when `Supersedes #` is used)

Superseded PRs + authors: N/A
Integrated scope by source PR: N/A
Co-authored-by trailers added: No — not applicable
Trailer format check: Pass

Validation Evidence (required)

Commands and result summary:

cargo fmt --all -- --check
cargo clippy --all-targets -- -D warnings
cargo test

Evidence provided: all three commands run locally after addressing review feedback; cargo fmt clean, cargo clippy --all-targets -- -D warnings 0 warnings, cargo test --lib agent::loop_::tests::execute_one_tool_normalizes_empty_success_output -- --exact passes. CI green on Apr 15 run (lint, test, build, security audit, all gates). (Validation evidence updated by @singlerider — original body incorrectly stated clippy and test were not run; author confirmed in thread that both were run post-feedback.)
If any command is intentionally skipped, explain why: N/A

Security Impact (required)

New permissions/capabilities? No
New external network calls? No
Secrets/tokens handling changed? No
File system access scope changed? No

Privacy and Data Hygiene (required)

Data-hygiene status: pass
Redaction/anonymization notes: issue reproduction and logs are sanitized and contain no personal data.
Neutral wording confirmation: confirmed

Compatibility / Migration

Backward compatible? Yes
Config/env changes? No
Migration needed? No

i18n Follow-Through (required when docs or user-facing wording changes)

i18n follow-through triggered? No — no docs or user-facing copy changed.

Human Verification (required)

Verified scenarios: reviewed the execute_one_tool success branch to confirm empty successful outputs now become (no output) before provider-facing scrubbing.
Edge cases checked: non-empty successful output still uses the original tool output; failure paths remain unchanged.
What was not verified: end-to-end execution against Deepseek V3.2 on GCP in a live environment.

Side Effects / Blast Radius (required)

Affected subsystems/workflows: tool result packaging in the agent loop for successful tool executions.
Potential unintended effects: downstream consumers will now receive (no output) instead of an empty string for this success edge case.
Guardrails/monitoring for early detection: provider request failures for empty tool-role content should stop occurring; diff scope is limited to a single normalization branch.

Agent Collaboration Notes (recommended)

Agent tools used: Codex terminal, gh CLI
Workflow/plan summary: isolated the success-path normalization in src/agent/tool_execution.rs, formatted, committed, pushed, then opened linked issue and PR.
Verification focus: keep the patch minimal and avoid changing failure semantics.
Confirmation: naming + architecture boundaries followed (AGENTS.md + CONTRIBUTING.md): confirmed

Rollback Plan (required)

Fast rollback command/path: revert commit eac18a94 or restore the previous success-path assignment in src/agent/tool_execution.rs.
Feature flags or config toggles: none
Observable failure symptoms: regression would reintroduce provider-side 400 errors for successful empty-output tool calls.

Risks and Mitigations

Risk: some downstream code may distinguish empty string from a non-empty placeholder for successful tool results.
Mitigation: the placeholder only applies when the previous value was empty, and it prevents provider protocol errors in the reported workflow.

theonlyhennygod

Review — PR #5565: fix(agent): normalize empty successful tool output

Comprehension Summary

This PR normalizes empty successful tool output to "(no output)" in execute_one_tool (in src/agent/tool_execution.rs) before the result is passed to scrub_credentials and forwarded to providers. The motivation is that some providers (e.g. Deepseek V3.2 via GCP) reject empty tool-role content with HTTP 400. The blast radius is narrow: only the success branch of execute_one_tool is affected, and only when the tool returns an empty string on success. Downstream consumers (loop detector, hooks, trace recording, message assembly) will now see "(no output)" instead of "" in this edge case.

Security / Performance Assessment

Security: No security impact identified. No change to access control, input validation, secret handling, or attack surface. The scrub_credentials call is preserved.
Performance: No meaningful performance impact. The change adds one string comparison and, in the empty case, one small allocation. See suggestion below about an unnecessary .clone() in the non-empty path.

What was reviewed and verified

CI Required Gate: all checks passing (lint, test, build on all platforms, security audit, 32-bit check).
PR template: fully completed with all required sections.
Privacy/data hygiene: pass — no PII, credentials, or identity-specific language in the diff.
Scope: single concern, minimal patch, no unrelated changes.
Duplicate scan: no overlapping open PRs found.
Architectural alignment: no new dependencies, no trait bypass, no security weakening.
Downstream consumers: reviewed all callers of execute_one_tool and all uses of ToolExecutionOutcome.output in loop_.rs — the change is compatible. The loop detector will now hash "(no output)" instead of "" for this edge case, which is correct behavior (empty outputs were already degenerate for detection purposes).
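To illustrate the loop-detector point, here is a hypothetical sketch. The project's actual detector is not shown in this thread, so `output_fingerprint`, `normalize`, and the use of `DefaultHasher` are all assumptions. The sketch shows that repeated empty outputs still collapse to a single fingerprint after normalization, so repeated-call detection sees the same pattern as before.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Same normalization as the fix: empty success output becomes a placeholder.
fn normalize(output: &str) -> &str {
    if output.is_empty() {
        "(no output)"
    } else {
        output
    }
}

/// Hypothetical fingerprint a loop detector might compute per tool call.
fn output_fingerprint(tool_name: &str, output: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    tool_name.hash(&mut hasher);
    output.hash(&mut hasher);
    hasher.finish()
}
```

One consequence visible in the sketch: a tool that literally prints `(no output)` gets the same fingerprint as a tool that prints nothing.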

Findings

[suggestion] Unnecessary .clone() in the non-empty path (line 96 in the diff):
```
r.output.clone()
```
scrub_credentials takes &str, so the original code passed &r.output directly without allocating. The new code clones r.output into normalized_output even in the common (non-empty) case, only to take a reference to it on the next line. Consider using Cow<'_, str> or restructuring to avoid the extra allocation:
```
let normalized: &str = if r.output.is_empty() { "(no output)" } else { &r.output };
Ok(ToolExecutionOutcome {
    output: scrub_credentials(normalized),
    ...
})
```
Why: Avoids an unnecessary heap allocation on every successful non-empty tool call — the hot path.
Action: Consider restructuring to avoid the clone.
[suggestion] No test for the new normalization behavior:
The existing tests in loop_.rs cover the unknown-tool error path and the activated-tool success path, but neither exercises the empty-successful-output normalization. A small test that creates a tool returning ToolResult { success: true, output: String::new(), error: None } and asserts the outcome output is "(no output)" would lock in this behavior.
Why: Prevents future regressions if this normalization is accidentally removed.
Action: Consider adding a targeted test.
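The allocation concern in the first suggestion can be made concrete with a small sketch. The function names below are hypothetical and `scrub_credentials` is left out. The point is only that the borrowing version returns a view into the original string, while the cloning version produces a new heap buffer for every non-empty input. A `Cow<'_, str>` variant is included because the review names it as an alternative.

```rust
use std::borrow::Cow;

const NO_OUTPUT: &str = "(no output)";

/// Shape flagged by the review: copies the output even when it is non-empty.
fn normalize_cloning(output: &str) -> String {
    if output.is_empty() {
        NO_OUTPUT.to_owned()
    } else {
        output.to_owned()
    }
}

/// Shape suggested by the review: borrows in both branches, no allocation.
fn normalize_borrowing(output: &str) -> &str {
    if output.is_empty() {
        NO_OUTPUT
    } else {
        output
    }
}

/// Cow alternative: both branches borrow here, so nothing is allocated,
/// but a caller could still upgrade to an owned String when needed.
fn normalize_cow(output: &str) -> Cow<'_, str> {
    if output.is_empty() {
        Cow::Borrowed(NO_OUTPUT)
    } else {
        Cow::Borrowed(output)
    }
}
```

Since both `Cow` branches borrow, the plain `&str` version is the simpler choice when the result is only passed on by reference.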

Verdict: Needs author action

Thank you for this clean, well-scoped fix — the PR template is thorough, the motivation is clear, and the change is minimal. The two suggestions above are non-blocking improvements that would strengthen the patch before merge. Please address or acknowledge them, and this should be ready for maintainer merge.

| Field | Content |
| --- | --- |
| PR | #5565 — fix(agent): normalize empty successful tool output |
| Author | @aliasliao |
| Summary | Normalizes empty successful tool output to `"(no output)"` to prevent provider-side HTTP 400 errors for empty tool-role content. Blast radius: success branch of `execute_one_tool` only. |
| Action | Needs-action |
| Reason | Two non-blocking suggestions: unnecessary `.clone()` and missing test coverage. |
| Security/performance | No security impact. Minor unnecessary allocation in non-empty path (suggestion filed). |
| Changes requested | (1) Avoid unnecessary `.clone()` in non-empty path. (2) Add test for empty-output normalization. |
| Architectural notes | No footprint, dependency, or design concerns. |
| Tests | CI all green. No new tests added for the changed behavior. |
| Notes | Well-motivated bugfix with clear provider-side evidence. Straightforward to finalize. |

aliasliao · 2026-04-15T06:48:18Z

@theonlyhennygod review comments are addressed: the success path now avoids the extra clone, and I added a regression test for empty successful tool output. I also verified the branch is current with master (no merge conflict), ran cargo fmt --all -- --check, cargo clippy --all-targets -- -D warnings, and cargo test --lib agent::loop_::tests::execute_one_tool_normalizes_empty_success_output -- --exact. Please take another look when you have a moment.

aliasliao · 2026-04-15T09:59:50Z

@singlerider @WareWolf-MoonWall could you please take another look when you have a moment?

Issue #5564 can still be reproduced on the latest commit of the master branch: 9f0de18.

Reproduce prompt: 执行 shell tool：open . (in English: "Run the shell tool: open .")

Captured LLM HTTP payload:

singlerider

Agent Review — Routing: Needs Maintainer Review

@WareWolf-MoonWall @JordanTheJet — this PR modifies crates/zeroclaw-runtime/src/agent/ (runtime path), requiring maintainer sign-off.

DRY note: @theonlyhennygod reviewed this PR and raised two suggestions. @aliasliao has addressed both:

✅ Unnecessary .clone() removed — non-empty path now uses &r.output directly
✅ Regression test added — execute_one_tool_normalizes_empty_success_output asserts "(no output)" for empty successful tool output

What the change does: Normalizes empty successful tool output to "(no output)" before forwarding to providers. Fixes HTTP 400 errors from Deepseek V3.2 via GCP (and any other provider that rejects empty tool-role content). Blast radius: success branch of execute_one_tool only. 57 lines added, 1 deleted.

Zero remaining findings. Code is correct and ready for maintainer merge.

WareWolf-MoonWall

PR Review — #5565 `fix(agent): normalize empty successful tool output`

I've read the full diff, the linked issue (#5564), and all prior review threads.

What this change does

In execute_one_tool, when a tool returns success: true with an empty output string, the fix normalizes that to "(no output)" before passing it to scrub_credentials and forwarding to the provider. Without this, providers that reject empty tool-role content — including Deepseek V3.2 via GCP — return HTTP 400, breaking the tool-calling workflow entirely. S1 severity, confirmed reproduced on current master by the author with a screenshot.

The fix is minimal, single-concern, and correctly scoped to the success branch only. The failure path is unchanged. The non-empty success path is unchanged.

✅ Commendation

Both of @theonlyhennygod's prior suggestions were addressed cleanly:

The unnecessary .clone() is gone — the non-empty path now passes &r.output directly to scrub_credentials as it did before, avoiding the heap allocation on the hot path. The fix uses a &str branch (if r.output.is_empty() { "(no output)" } else { &r.output }) which is the right shape — zero-cost for the common case.

The regression test execute_one_tool_normalizes_empty_success_output is well-structured: EmptySuccessTool is a purpose-built mock that returns success: true with empty output, and the assertion confirms both that outcome.output == "(no output)" and that error_reason.is_none() — correctly distinguishing the success-with-no-output case from the failure case. This is the FND-006 §4.3 standard applied correctly: a test that would have caught the bug.
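The shape of that regression test can be sketched as below. This is a simplified, synchronous illustration under stated assumptions, not the code merged in the PR: the `Tool` trait, the signature of `execute_one_tool`, and the failure branch are hypothetical, and credential scrubbing is omitted. Only the names `EmptySuccessTool`, `ToolResult`, `ToolExecutionOutcome`, `output`, `error_reason`, and the test name come from the thread.

```rust
#[allow(dead_code)]
struct ToolResult {
    success: bool,
    output: String,
    error: Option<String>,
}

struct ToolExecutionOutcome {
    output: String,
    error_reason: Option<String>,
}

/// Hypothetical, synchronous stand-in for the project's tool trait.
trait Tool {
    fn execute(&self) -> ToolResult;
}

/// Purpose-built mock: reports success but produces no output.
struct EmptySuccessTool;

impl Tool for EmptySuccessTool {
    fn execute(&self) -> ToolResult {
        ToolResult { success: true, output: String::new(), error: None }
    }
}

/// Simplified stand-in for the real function; only the success-path
/// normalization mirrors the fix. The failure branch is illustrative.
fn execute_one_tool(tool: &dyn Tool) -> ToolExecutionOutcome {
    let r = tool.execute();
    if r.success {
        let normalized: &str = if r.output.is_empty() { "(no output)" } else { &r.output };
        ToolExecutionOutcome { output: normalized.to_string(), error_reason: None }
    } else {
        let reason = r.error.unwrap_or_default();
        ToolExecutionOutcome { output: format!("Error: {}", reason), error_reason: Some(reason) }
    }
}

#[test]
fn execute_one_tool_normalizes_empty_success_output() {
    let outcome = execute_one_tool(&EmptySuccessTool);
    // Success with no output must surface the placeholder, not an error.
    assert_eq!(outcome.output, "(no output)");
    assert!(outcome.error_reason.is_none());
}
```

Asserting on `error_reason` as well as `output` is what separates the success-with-no-output case from a failure that happens to carry an empty message.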

🟡 Conditional — validation evidence section is now stale

The PR body still states that cargo clippy --all-targets -- -D warnings and cargo test "were not run." The author confirmed in the thread that both were run after addressing @theonlyhennygod's feedback, and CI is green. The PR body should be updated to reflect the current state. This is a template hygiene item — the evidence exists, it's just not recorded where the template requires it.

No new blocking findings

The fix is correct, the test covers the regression, and both prior requests have been addressed. No architectural concerns, no security impact, no performance impact on the non-empty path. Risk label (risk: high applied, risk: medium in body) follows the same pattern seen in other runtime PRs — the label is correct per AGENTS.md.

Ready for maintainer merge once the validation evidence section is updated.

singlerider

Thanks for the persistence on this one, @aliasliao — the fix is solid.

Comprehension summary

Normalizes empty successful tool output to "(no output)" in the success branch of execute_one_tool before forwarding to scrub_credentials and the provider. Fixes HTTP 400 errors from providers (Deepseek V3.2 via GCP confirmed) that reject empty tool-role content. Failure path and non-empty success path are unchanged. Single-concern, minimal diff.

What was verified

CI fully green on Apr 15 run — lint, test, build (linux/mac/windows), security audit, all gates pass
Fix correctly scoped to success branch only; &r.output used on non-empty path (no unnecessary clone)
Regression test execute_one_tool_normalizes_empty_success_output covers the bug: EmptySuccessTool returns success: true with empty output, asserts outcome.output == "(no output)" and error_reason.is_none()
PR body validation evidence section updated (was stale — original stated clippy/test not run; author confirmed in thread both were run post-feedback)

Security / performance assessment

No security impact. No performance impact on the non-empty hot path — &r.output branch is zero-cost.

This PR is ready for maintainer merge.

Stale

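When a tool returns success=true but its output is empty, passing an empty string to the LLM pollutes the history, and some providers also report invalid content. Normalize empty output to "(no output)", and use the normalized text in the tracing logs as well. Ports upstream fbb5ae9 (zeroclaw-labs#5565) to master_wecom's pre-workspace-split layout. Co-authored-by: Liao Jinyuan <jinyuanovo@gmail.com>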

… changelog
- Bump workspace version 0.7.0 → 0.7.1 in root Cargo.toml
- Revert release workflow to gh release create --target for workflow_dispatch (the git-push approach from zeroclaw-labs#5860 is blocked by the org Restrict creations rule; --target uses the Releases API which bypasses it, and v0.7.1 has no immutable release lock so the previous blocker does not apply)
- Update CHANGELOG-next.md: retitle to v0.6.9 → v0.7.1, restore full comprehensive notes from the upstream draft, and add entries that were missing from the original v0.7.0 draft:
  - feat(observability): otel_headers for authenticated OTLP exporters (zeroclaw-labs#5700)
  - feat: GitHub Copilot provider onboarding (zeroclaw-labs#5321)
  - fix(channels/telegram): inline_keyboard for tool approval requests (zeroclaw-labs#5790)
  - fix(provider): strip tool_stream for non-Z.AI providers (zeroclaw-labs#5806)
  - fix(agent): normalize empty successful tool output (zeroclaw-labs#5565)
  - fix(web): theme mode switch not applying correctly (zeroclaw-labs#5724)
  - fix(web): add visual preview swatches to theme selector (zeroclaw-labs#5767)
  - fix: cron_run tool output not delivered to configured channels

Version
- Bump workspace version 0.7.0 → 0.7.1 in root Cargo.toml + Cargo.lock

CI rationalisation (FND-004 Phase 1 — Rationalise)
- Delete checks-on-pr.yml and ci-run.yml — two workflows doing identical work on every PR, producing duplicate signal and double compute cost
- Add ci.yml (name: Quality Gate) — single staged pipeline replacing both:
  - Stage 1: fmt + clippy --workspace (fast gate)
  - Stage 2: build matrix, check all-features / no-default-features / 32-bit, benchmarks compile (parallel, gated on Stage 1)
  - Stage 3: nextest (gated on Stage 1)
  - Stage 4: cargo deny check — licenses, sources, advisories (deny.toml already present and triaged)
  - Stage 5: CI Required Gate composite job (branch protection target)
- Remove rust_strict_delta_gate.sh — workspace-aware clippy --workspace makes delta comparison implicit (clean baseline = any warning fails)
- pre-release-validate.yml: remove pull_request trigger (secrets unavailable on fork PRs caused guaranteed failure on every Cargo.toml bump); remove stale CARGO_REGISTRY_TOKEN check (crates.io publishing removed in zeroclaw-labs#5858)

Release workflow
- Revert release-stable-manual.yml to gh release create --target for workflow_dispatch (git push approach from zeroclaw-labs#5860 blocked by org Restrict creations rule; Releases API bypasses it; v0.7.1 has no immutable lock)

Changelog
- Retitle CHANGELOG-next.md to v0.6.9 → v0.7.1, restore full release notes, add entries missing from original draft: otel_headers (zeroclaw-labs#5700), GitHub Copilot onboarding (zeroclaw-labs#5321), Telegram inline_keyboard (zeroclaw-labs#5790), tool_stream fix (zeroclaw-labs#5806), empty tool output (zeroclaw-labs#5565), web theme fixes (zeroclaw-labs#5724, zeroclaw-labs#5767), cron_run delivery fix

… 1) (#5867)

Version
- Bump workspace version 0.7.0 → 0.7.1 in root Cargo.toml + Cargo.lock

CI rationalisation (FND-004 Phase 1 — Rationalise)
- Delete checks-on-pr.yml and ci-run.yml — two workflows doing identical work on every PR, producing duplicate signal and double compute cost
- Add ci.yml (name: Quality Gate) — single staged pipeline replacing both:
  - Stage 1: fmt + clippy --workspace (fast gate)
  - Stage 2: build matrix, check all-features / no-default-features / 32-bit, benchmarks compile (parallel, gated on Stage 1)
  - Stage 3: nextest (gated on Stage 1)
  - Stage 4: cargo deny check — licenses, sources, advisories (deny.toml already present and triaged)
  - Stage 5: CI Required Gate composite job (branch protection target)
- Remove rust_strict_delta_gate.sh — workspace-aware clippy --workspace makes delta comparison implicit (clean baseline = any warning fails)
- pre-release-validate.yml: remove pull_request trigger (secrets unavailable on fork PRs caused guaranteed failure on every Cargo.toml bump); remove stale CARGO_REGISTRY_TOKEN check (crates.io publishing removed in #5858)

Release workflow
- Revert release-stable-manual.yml to gh release create --target for workflow_dispatch (git push approach from #5860 blocked by org Restrict creations rule; Releases API bypasses it; v0.7.1 has no immutable lock)

Changelog
- Retitle CHANGELOG-next.md to v0.6.9 → v0.7.1, restore full release notes, add entries missing from original draft: otel_headers (#5700), GitHub Copilot onboarding (#5321), Telegram inline_keyboard (#5790), tool_stream fix (#5806), empty tool output (#5565), web theme fixes (#5724, #5767), cron_run delivery fix

fix(agent): normalize empty successful tool output

eac18a9

aliasliao requested review from JordanTheJet and theonlyhennygod as code owners April 9, 2026 17:06

github-project-automation bot added this to ZeroClaw Project Board Apr 9, 2026

github-project-automation bot moved this to Backlog in ZeroClaw Project Board Apr 9, 2026

github-actions bot added the agent label (Auto scope: src/agent/** changed.) Apr 9, 2026

theonlyhennygod self-assigned this Apr 9, 2026

theonlyhennygod reviewed Apr 9, 2026


singlerider mentioned this pull request Apr 14, 2026

[Bug]: Custom provider tool follow-up fails when successful tool output is empty #5564

Closed

2 tasks

fix(agent): address empty tool output review

63a818c

merge: bring empty tool output fix up to date with master

9ec7ca5

github-actions bot removed the agent label (Auto scope: src/agent/** changed.) Apr 15, 2026

aliasliao requested a review from theonlyhennygod April 15, 2026 07:11

singlerider added the risk: high (Auto risk: security/runtime/gateway/tools/workflows.), size: XS (Auto size: <=80 non-doc changed lines.), and needs-maintainer-review labels Apr 17, 2026

singlerider requested a review from WareWolf-MoonWall April 17, 2026 03:04

singlerider reviewed Apr 17, 2026


WareWolf-MoonWall previously requested changes Apr 17, 2026


github-project-automation bot moved this from Backlog to Needs Changes in ZeroClaw Project Board Apr 17, 2026

singlerider approved these changes Apr 17, 2026


singlerider added the agent-approved label (PR approved by automated review agent) and removed the needs-maintainer-review label Apr 17, 2026

singlerider merged commit fbb5ae9 into zeroclaw-labs:master Apr 17, 2026
20 checks passed

github-project-automation bot moved this from Needs Changes to Shipped in ZeroClaw Project Board Apr 17, 2026

WareWolf-MoonWall mentioned this pull request Apr 18, 2026

chore: bump version to 0.7.1 and update release changelog #5867

Merged

fix(agent): normalize empty successful tool output #5565
singlerider merged 3 commits into zeroclaw-labs:master from aliasliao:codex/fix-empty-tool-output-custom-provider

4 participants
