Server tool input reconstruction missing in streaming #954

@BexTuychiev

Bug Report: Server tool input reconstruction missing in streaming

Summary

The SDK correctly reconstructs input fields for tool_use blocks during streaming via input_json_delta events, but fails to do the same for server_tool_use blocks (e.g., code execution tool). This creates inconsistent behavior between client and server tools, breaking legitimate use cases like code extraction from streaming responses. If confirmed, I can submit a PR.

Use Case & Context

I was building a math solver application that uses Claude's code execution tool with streaming for better user experience. The application needed to:

  1. Stream the response for real-time feedback
  2. Extract the executed code blocks for logging/analysis
  3. Provide a smooth user experience with both streaming and code extraction

Problem Description

When using streaming with server tools (like code_execution_20250522), the final message contains server_tool_use blocks with empty input dictionaries, making it impossible to extract the actual code that was executed.

Expected Behavior

# After streaming completion
for item in final_message.content:
    if item.type == "server_tool_use" and item.name == "code_execution":
        print(item.input)  # Should contain: {"code": "print(2 + 2)"}

Actual Behavior

# After streaming completion  
for item in final_message.content:
    if item.type == "server_tool_use" and item.name == "code_execution":
        print(item.input)  # Actually contains: {}

Investigation & Root Cause

What We Tried

  1. Non-streaming vs Streaming comparison: Non-streaming works perfectly, streaming fails
  2. Different streaming approaches: Both simple text_stream and complex event handling fail
  3. current_message_snapshot inspection: Same empty inputs (it's the same object as get_final_message())
  4. Manual delta reconstruction: Successfully implemented by tracking input_json_delta events
  5. Client vs Server tool comparison: Client tools work, server tools don't

Root Cause in SDK Source Code

Found in src/anthropic/lib/streaming/_messages.py, line 431:

elif event.delta.type == "input_json_delta":
    if content.type == "tool_use":  # ← Only handles CLIENT tools
        from jiter import from_json
        # JSON reconstruction logic...
        json_buf = cast(bytes, getattr(content, JSON_BUF_PROPERTY, b""))
        json_buf += bytes(event.delta.partial_json, "utf-8")
        if json_buf:
            content.input = from_json(json_buf, partial_mode=True)
        setattr(content, JSON_BUF_PROPERTY, json_buf)
    # Missing: elif content.type == "server_tool_use": block

The SDK only reconstructs inputs for tool_use (client tools), completely ignoring server_tool_use (server tools).
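
The effect of that type check can be reproduced with a simplified, self-contained model. Everything in it is hypothetical and for illustration only: `Block`, `apply_input_json_delta`, and `handled_types` are invented names, and the standard-library `json.loads` stands in for the SDK's partial-mode `jiter` parsing, so `input` is only updated once the buffer holds complete JSON.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical stand-in for a content block in the message snapshot."""
    type: str
    input: dict = field(default_factory=dict)
    json_buf: str = ""


def apply_input_json_delta(block, partial_json, handled_types=("tool_use",)):
    """Append a JSON fragment; update block.input only for handled block types."""
    if block.type not in handled_types:
        return  # fragment is dropped, mirroring the type-gated branch
    block.json_buf += partial_json
    try:
        block.input = json.loads(block.json_buf)
    except json.JSONDecodeError:
        pass  # buffer is still incomplete; keep the previous input


fragments = ['{"code": "pri', 'nt(2 + 2)"}']

client_block = Block(type="tool_use")
server_block = Block(type="server_tool_use")
fixed_block = Block(type="server_tool_use")

for fragment in fragments:
    apply_input_json_delta(client_block, fragment)
    apply_input_json_delta(server_block, fragment)
    # Same fragments, but with server_tool_use added to the handled types.
    apply_input_json_delta(
        fixed_block, fragment, handled_types=("tool_use", "server_tool_use")
    )

print(client_block.input)
print(server_block.input)
print(fixed_block.input)
```

In this model the `server_tool_use` block keeps its empty `input` unless its type is included in `handled_types`, which matches the behavior observed with the real stream.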

Reproduction Steps

Minimal Test Case

import os
from anthropic import Anthropic

client = Anthropic(
    api_key=os.getenv("ANTHROPIC_API_KEY"),
    default_headers={"anthropic-beta": "code-execution-2025-05-22"}
)

# Test with streaming
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Calculate 2+2 using Python"}],
    tools=[{"type": "code_execution_20250522", "name": "code_execution"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    
    final_message = stream.get_final_message()
    
    for item in final_message.content:
        if item.type == "server_tool_use":
            print(f"\nServer tool input: {item.input}")  # Shows: {}

# Compare with non-streaming (works correctly)
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Calculate 2+2 using Python"}],
    tools=[{"type": "code_execution_20250522", "name": "code_execution"}],
)

for item in response.content:
    if item.type == "server_tool_use":
        print(f"Non-streaming input: {item.input}")  # Shows: {"code": "print(2 + 2)"}

Client vs Server Tool Comparison

# CLIENT TOOL (works with streaming)
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "name": "get_weather",
        "description": "Get weather",
        "input_schema": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"]
        }
    }],
    tool_choice={"type": "tool", "name": "get_weather"}
) as stream:
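    final_message = stream.get_final_message()  # waits for the stream to finish
    for item in final_message.content:
        if item.type == "tool_use":
            print(item.input)  # ✅ Shows: {"location": "Paris"}

# SERVER TOOL (broken with streaming)
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Calculate 2+2 using Python"}],
    tools=[{"type": "code_execution_20250522", "name": "code_execution"}],
) as stream:
    final_message = stream.get_final_message()
    for item in final_message.content:
        if item.type == "server_tool_use":
            print(item.input)  # ❌ Shows: {}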

Evidence from API Documentation

The official streaming documentation clearly shows that input_json_delta events are sent for server tools:

// Code execution streaming example from docs
event: content_block_delta  
data: {"type": "content_block_delta", "index": 1, "delta": {"type": "input_json_delta", "partial_json": "{\"code\":\"import pandas as pd\\ndf = pd.read_csv('data.csv')\\nprint(df.head())\""}}

The API sends the data, but the SDK ignores it for server tools.
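
To make concrete what that event carries, here is a small sketch that parses the `data:` payload quoted above using only the standard library. The variable names are illustrative, and the final step assumes that a later delta supplies the closing brace, since the quoted fragment on its own is not complete JSON.

```python
import json

# The `data:` payload from the documentation excerpt, kept as a raw string
# so the backslashes survive exactly as they appear on the wire.
sse_data = r'''{"type": "content_block_delta", "index": 1, "delta": {"type": "input_json_delta", "partial_json": "{\"code\":\"import pandas as pd\\ndf = pd.read_csv('data.csv')\\nprint(df.head())\""}}'''

event = json.loads(sse_data)
fragment = event["delta"]["partial_json"]

# The fragment is only part of the tool input, so it does not parse by itself.
try:
    json.loads(fragment)
    fragment_is_complete = True
except json.JSONDecodeError:
    fragment_is_complete = False

# Assumption for illustration: a subsequent delta would deliver the final "}".
tool_input = json.loads(fragment + "}")

print(event["delta"]["type"])
print(fragment_is_complete)
print(tool_input["code"])
```

The code being executed is present in the delta payload; it only needs to be accumulated and parsed once the block is complete.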

Impact

This affects any application that needs to:

  • Extract executed code for logging/analysis
  • Build debugging tools for AI code execution
  • Implement code history/replay features
  • Provide transparency about what code was run
  • Create educational tools showing step-by-step code execution

Recommended Fix

Extend the existing reconstruction logic to handle server tools:

elif event.delta.type == "input_json_delta":
    if content.type in ("tool_use", "server_tool_use"):  # ← Also handle server tools
        # existing reconstruction logic, now shared by client and server tools
        from jiter import from_json
        json_buf = cast(bytes, getattr(content, JSON_BUF_PROPERTY, b""))
        json_buf += bytes(event.delta.partial_json, "utf-8")
        if json_buf:
            content.input = from_json(json_buf, partial_mode=True)
        setattr(content, JSON_BUF_PROPERTY, json_buf)

Workaround (Manual Implementation)

We successfully implemented manual delta tracking as a workaround:

def extract_code_blocks_streaming_fixed(response):
    """Working code extraction with manual delta reconstruction."""
    code_blocks = []
    accumulated_deltas = {}  # Track by content block index
    
    # During streaming, accumulate input_json_delta events
    # Then manually parse and reconstruct after completion
    # (Full implementation available if needed)
    
    return code_blocks

This should not be necessary, though: the SDK should handle it automatically, as it does for client tools.
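
A minimal sketch of what such manual tracking could look like follows. It is not necessarily the implementation referred to above: `extract_server_tool_inputs` is a hypothetical helper, and it assumes events expose `type`, `index`, `content_block`, and `delta` attributes shaped like the raw streaming events. The demo uses `SimpleNamespace` objects as stand-ins so it runs without an API call.

```python
import json
from types import SimpleNamespace


def extract_server_tool_inputs(events):
    """Rebuild server tool inputs from raw stream events, keyed by block index."""
    buffers = {}  # content block index -> list of JSON fragments
    inputs = {}   # content block index -> parsed input dict
    for event in events:
        if event.type == "content_block_start":
            if event.content_block.type == "server_tool_use":
                buffers[event.index] = []
        elif event.type == "content_block_delta":
            if event.delta.type == "input_json_delta" and event.index in buffers:
                buffers[event.index].append(event.delta.partial_json)
        elif event.type == "content_block_stop" and event.index in buffers:
            # The block is finished, so the joined fragments should be full JSON.
            raw = "".join(buffers.pop(event.index))
            inputs[event.index] = json.loads(raw) if raw else {}
    return inputs


def _ns(**kwargs):
    """Build a fake event object for the demo."""
    return SimpleNamespace(**kwargs)


fake_events = [
    _ns(type="content_block_start", index=0, content_block=_ns(type="text")),
    _ns(type="content_block_delta", index=0,
        delta=_ns(type="text_delta", text="Let me compute that.")),
    _ns(type="content_block_stop", index=0),
    _ns(type="content_block_start", index=1,
        content_block=_ns(type="server_tool_use", name="code_execution")),
    _ns(type="content_block_delta", index=1,
        delta=_ns(type="input_json_delta", partial_json='{"code": "pri')),
    _ns(type="content_block_delta", index=1,
        delta=_ns(type="input_json_delta", partial_json='nt(2 + 2)"}')),
    _ns(type="content_block_stop", index=1),
]

reconstructed = extract_server_tool_inputs(fake_events)
print(reconstructed)
```

With the real SDK, the same function could presumably be fed the events from `for event in stream:` inside the `client.messages.stream(...)` block, assuming the stream yields the raw `content_block_start`, `content_block_delta`, and `content_block_stop` events with these attributes.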

Environment

  • anthropic-sdk-python: Latest version
  • Python: 3.9+
  • Model: claude-sonnet-4-20250514
  • Tool: code_execution_20250522

Conclusion

This appears to be an oversight in the SDK implementation rather than intentional design. The API sends input_json_delta events for server tools, the documentation shows examples of it, and the SDK already has the reconstruction logic - it just doesn't apply it consistently to both tool types.

The fix would be minimal, low-risk, and would restore API consistency while enabling legitimate use cases.
