Context
Filing this from the rockbot side after running into a related bug there. The MCP spec lets a tool response carry isError: true to signal a logical (content-level) failure — the call mechanically completed, but the operation failed. Aggregators / gateways need to propagate that flag faithfully from upstream server → caller, otherwise downstream consumers (agents, recorders, learning loops) treat the error response as a successful call that just happens to contain error text.
Symptom we hit in rockbot
Today's investigation in rockbot:
- An agent called an MCP tool (
list_files against the wrong OneDrive account).
- The upstream MCP server returned
Graph API error: The resource could not be found. — clearly an error.
- rockbot's tool-call recorder logged the call as
succeeded:true because nothing on the path between the upstream error response and the recorder converted that into a flagged failure.
- Result: rockbot's dream-time retry-pattern detector saw zero failures and learned nothing from the incident.
We fixed it on the rockbot side (rockbot PR #313, commit 495c38e) by detecting the \"Error: \" prefix that RegistryToolFunction prepends when IsError=true, and flipping the recorded status. But that only works if IsError=true actually arrived at the agent — i.e. the gateway preserved it.
Why mcp-aggregator might have the same gap
mcp-aggregator is functionally similar to rockbot's MCP bridge — it proxies tool calls between agents and upstream MCP servers. If the proxy path:
- Treats the upstream's
isError: true content-block response as a normal successful response and serializes only content, dropping the flag, or
- Maps upstream errors only when the transport itself fails (HTTP non-2xx, timeout) and not when the upstream returns
isError: true in a 2xx body, or
- Translates
ToolInvokeResponse.IsError only on the inbound MCP→aggregator side but not on the outbound aggregator→caller side,
…then any client that records or learns from tool-call outcomes will silently lose the error signal, with no symptom until something downstream needs to distinguish logical success from logical failure.
Suggested check
In whichever path the aggregator handles tool-call responses (likely something around tools/call in McpAggregator.Core and surfaced through both HttpServer and StdioServer):
- When an upstream MCP server returns a response with
isError: true, does the aggregator propagate that flag through to its caller? Both transports (HTTP and stdio) and both MCP semantics (raw MCP and any REST mapping) should preserve it.
- A regression test would help: stub an upstream that returns
isError: true with a content block, invoke the tool through the aggregator, assert the caller-visible response has isError: true.
Not asking for a specific fix
Filing this so it lands in the queue rather than getting lost in conversation. Happy to dig deeper / send a PR if useful — wanted to surface the concern before forgetting about the parallel.
Context
Filing this from the rockbot side after running into a related bug there. The MCP spec lets a tool response carry
isError: trueto signal a logical (content-level) failure — the call mechanically completed, but the operation failed. Aggregators / gateways need to propagate that flag faithfully from upstream server → caller, otherwise downstream consumers (agents, recorders, learning loops) treat the error response as a successful call that just happens to contain error text.Symptom we hit in rockbot
Today's investigation in rockbot:
list_filesagainst the wrong OneDrive account).Graph API error: The resource could not be found.— clearly an error.succeeded:truebecause nothing on the path between the upstream error response and the recorder converted that into a flagged failure.We fixed it on the rockbot side (rockbot PR #313, commit 495c38e) by detecting the
\"Error: \"prefix thatRegistryToolFunctionprepends whenIsError=true, and flipping the recorded status. But that only works ifIsError=trueactually arrived at the agent — i.e. the gateway preserved it.Why mcp-aggregator might have the same gap
mcp-aggregator is functionally similar to rockbot's MCP bridge — it proxies tool calls between agents and upstream MCP servers. If the proxy path:
isError: truecontent-block response as a normal successful response and serializes onlycontent, dropping the flag, orisError: truein a 2xx body, orToolInvokeResponse.IsErroronly on the inbound MCP→aggregator side but not on the outbound aggregator→caller side,…then any client that records or learns from tool-call outcomes will silently lose the error signal, with no symptom until something downstream needs to distinguish logical success from logical failure.
Suggested check
In whichever path the aggregator handles tool-call responses (likely something around
tools/callinMcpAggregator.Coreand surfaced through bothHttpServerandStdioServer):isError: true, does the aggregator propagate that flag through to its caller? Both transports (HTTP and stdio) and both MCP semantics (raw MCP and any REST mapping) should preserve it.isError: truewith a content block, invoke the tool through the aggregator, assert the caller-visible response hasisError: true.Not asking for a specific fix
Filing this so it lands in the queue rather than getting lost in conversation. Happy to dig deeper / send a PR if useful — wanted to surface the concern before forgetting about the parallel.