feat: anthropic provider #637

Merged
droot merged 7 commits into GoogleCloudPlatform:main from zvdy:feat/anthropic-provider on Mar 10, 2026

@zvdy zvdy commented Mar 4, 2026

Context

Adds a native Anthropic provider for kubectl-ai using the official github.com/anthropics/anthropic-sdk-go SDK, instead of routing Claude through the OpenAI compatibility shim.

The compat shim hides Claude-exclusive features and has caused real bugs (tool_call history corruption in streaming). This adds a first-class anthropic:// provider.

Why native?

  • Prompt caching: cache_control on the system prompt and tool definitions; the system prompt is large and repeated every turn, so caching cuts cost significantly
  • Extended thinking: thinking content blocks for complex k8s debugging queries
  • Reliable streaming: native SSE events instead of the compat translation layer
  • Native error codes: overloaded_error and rate_limit_error map cleanly to retry logic
  • Future-proofing: adopts the SDK early so new Claude features (e.g. interleaved thinking) are one import away

Testing

Trace: Prompt Caching

Both API turns show cache_creation_input_tokens: 0 and cache_read_input_tokens: 2254 — the system prompt + tool definitions (2254 tokens) are served from Anthropic's prompt cache on every turn.

Cache breakpoints are set via "cache_control": {"type": "ephemeral"} on the system prompt block and the last tool definition.

Turn 1: input_tokens: 339, cache_read: 2254, cache_creation: 0

event: message_start
data: {
  "usage": {
    "input_tokens": 339,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 2254,
    "output_tokens": 3
  }
}

Turn 2 — history added (input_tokens grew 339 → 696), cache still hits at 2254

event: message_start
data: {
  "usage": {
    "input_tokens": 696,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 2254,
    "output_tokens": 1
  }
}
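
The usage numbers from the two turns can be turned into a cached share of the prompt. This is a small sketch with hypothetical type and method names; it relies on input_tokens counting only the uncached part of the prompt, so the total prompt size is the sum of the three input fields.

```go
package main

import "fmt"

// usage mirrors the input-side fields of the "usage" object in message_start.
type usage struct {
	InputTokens              int
	CacheCreationInputTokens int
	CacheReadInputTokens     int
}

// totalInput is the full prompt size: uncached tokens plus tokens written
// to the cache plus tokens read from it.
func (u usage) totalInput() int {
	return u.InputTokens + u.CacheCreationInputTokens + u.CacheReadInputTokens
}

// cachedFraction is the share of the prompt served from the cache.
func (u usage) cachedFraction() float64 {
	total := u.totalInput()
	if total == 0 {
		return 0
	}
	return float64(u.CacheReadInputTokens) / float64(total)
}

func main() {
	turns := []usage{
		{InputTokens: 339, CacheReadInputTokens: 2254}, // turn 1 from the trace
		{InputTokens: 696, CacheReadInputTokens: 2254}, // turn 2 from the trace
	}
	for i, u := range turns {
		fmt.Printf("turn %d: %d prompt tokens, %.1f%% from cache\n",
			i+1, u.totalInput(), 100*u.cachedFraction())
	}
}
```

The cached share shrinks as history grows, since the cached prefix stays at 2254 tokens while the uncached part increases.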

Trace: Tool Use (parallel, streamed)

Claude streamed two parallel tool_use blocks in a single response. Tool input JSON is assembled incrementally via input_json_delta events, then executed and returned as tool_result blocks.

event: content_block_start
data: {"index":1,"content_block":{"type":"tool_use","id":"toolu_01Co6xj94B74RC3zmsmi7eHs","name":"kubectl","input":{}}}

event: content_block_delta
data: {"index":1,"delta":{"type":"input_json_delta","partial_json":"{\"command\": \"kubectl get namespaces\", \"modifies_resource\": \"no\"}"}}

event: content_block_stop
data: {"index":1}

event: content_block_start
data: {"index":2,"content_block":{"type":"tool_use","id":"toolu_01WitVoRNAaJ39aUSaide8ui","name":"kubectl","input":{}}}

event: content_block_delta
data: {"index":2,"delta":{"type":"input_json_delta","partial_json":"{\"command\": \"kubectl get pods -n default\", \"modifies_resource\": \"no\"}"}}

event: content_block_stop
data: {"index":2}

event: message_delta
data: {"delta":{"stop_reason":"tool_use"}}
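
The incremental assembly shown in this trace can be sketched as a buffer per content block index: start on content_block_start, append on each input_json_delta, and parse on content_block_stop. All type and method names below are hypothetical, and treating an empty buffer as {} is an assumption for tools called without arguments; this is not the PR's actual code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// toolCall is a fully assembled tool_use block.
type toolCall struct {
	ID    string
	Name  string
	Input map[string]any
}

type pending struct {
	id, name string
	buf      strings.Builder
}

// toolAssembler collects partial_json fragments keyed by content block index.
type toolAssembler struct {
	blocks map[int]*pending
}

func newToolAssembler() *toolAssembler {
	return &toolAssembler{blocks: make(map[int]*pending)}
}

// start handles content_block_start for a tool_use block.
func (a *toolAssembler) start(index int, id, name string) {
	a.blocks[index] = &pending{id: id, name: name}
}

// delta handles an input_json_delta event by appending its fragment.
func (a *toolAssembler) delta(index int, partialJSON string) error {
	p, ok := a.blocks[index]
	if !ok {
		return errors.New("delta for unknown content block")
	}
	p.buf.WriteString(partialJSON)
	return nil
}

// stop handles content_block_stop: only now is the JSON complete enough to parse.
func (a *toolAssembler) stop(index int) (toolCall, error) {
	p, ok := a.blocks[index]
	if !ok {
		return toolCall{}, errors.New("stop for unknown content block")
	}
	delete(a.blocks, index)
	raw := p.buf.String()
	if raw == "" {
		raw = "{}" // assumption: no fragments means no arguments
	}
	call := toolCall{ID: p.id, Name: p.name}
	if err := json.Unmarshal([]byte(raw), &call.Input); err != nil {
		return toolCall{}, fmt.Errorf("tool %s: invalid input JSON: %w", p.name, err)
	}
	return call, nil
}

func main() {
	a := newToolAssembler()
	a.start(1, "toolu_01Co6xj94B74RC3zmsmi7eHs", "kubectl")
	// The trace delivered the input in one fragment; it is split here to show accumulation.
	for _, frag := range []string{`{"command": "kubectl get `, `namespaces", "modifies_resource": "no"}`} {
		if err := a.delta(1, frag); err != nil {
			panic(err)
		}
	}
	call, err := a.stop(1)
	if err != nil {
		panic(err)
	}
	fmt.Println(call.Name, "->", call.Input["command"])
}
```

Keying the buffers by index is what keeps parallel tool_use blocks (index 1 and 2 in the trace) from being mixed together.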

Tool execution:

action: tool-request
  name: kubectl
  arguments:
    command: kubectl get namespaces

action: tool-response
  response:
    stdout: |
      NAME                 STATUS   AGE
      app                  Active   18d
      default              Active   18d
      kube-node-lease      Active   18d
      kube-public          Active   18d
      kube-system          Active   18d
      local-path-storage   Active   18d
      pgbouncer-reader     Active   16d
      pgbouncer-writer     Active   16d
      postgres             Active   16d

action: tool-request
  name: kubectl
  arguments:
    command: kubectl get pods -n default

action: tool-response
  response:
    stderr: No resources found in default namespace.

Trace: Native SSE Streaming

The final answer is delivered as a stream of text_delta events, with each chunk yielded to the UI as it arrives rather than buffered until completion.

event: content_block_start
data: {"index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"index":0,"delta":{"type":"text_delta","text":"## 📋 All Namespaces in your cluster:"}}

event: content_block_delta
data: {"index":0,"delta":{"type":"text_delta","text":"\n\n| Namespace | Status | Age |\n|-----------|--------|-----|"}}

event: content_block_delta
data: {"index":0,"delta":{"type":"text_delta","text":"\n| app | Active | 18d |\n| default | Active | 18d |"}}

event: content_block_delta
data: {"index":0,"delta":{"type":"text_delta","text":"\n| kube-system | Active | 18d |\n| postgres | Active | 16d |"}}

event: content_block_delta
data: {"index":0,"delta":{"type":"text_delta","text":"\n\n## 🔍 Pods in the default namespace:\n\n**No resources found in default namespace.** ✨"}}

event: content_block_stop
data: {"index":0}

event: message_delta
data: {"delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":246}}

event: message_stop
data: {"type":"message_stop"}

Trace: Multi-Turn History

input_tokens grew from 339 (turn 1) to 696 (turn 2), reflecting the full conversation history appended to each request. cache_read_input_tokens stays at 2254 — the system prompt cache is unaffected by history growth.

Messages array sent in turn 2:

[
  {
    "role": "user",
    "content": [{"type": "text", "text": "list all namespaces\nwhat pods are running in the default namespace?"}]
  },
  {
    "role": "assistant",
    "content": [
      {"type": "text", "text": "I'll help you with those requests! Let me fetch the information from your Kubernetes cluster."},
      {"type": "tool_use", "id": "toolu_01Co6xj94B74RC3zmsmi7eHs", "name": "kubectl", "input": {"command": "kubectl get namespaces"}},
      {"type": "tool_use", "id": "toolu_01WitVoRNAaJ39aUSaide8ui", "name": "kubectl", "input": {"command": "kubectl get pods -n default"}}
    ]
  },
  {
    "role": "user",
    "content": [
      {"type": "tool_result", "tool_use_id": "toolu_01Co6xj94B74RC3zmsmi7eHs", "content": "...kubectl get namespaces output..."},
      {"type": "tool_result", "tool_use_id": "toolu_01WitVoRNAaJ39aUSaide8ui", "content": "...No resources found..."}
    ]
  }
]
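
The third message in this array pairs every tool_use id from the assistant turn with a tool_result block. Below is a minimal sketch of building that message; the type and function names are hypothetical and the output strings are placeholders, not the PR's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolUse identifies one tool_use block from the assistant turn.
type toolUse struct {
	ID   string
	Name string
}

type toolResult struct {
	Type      string `json:"type"`
	ToolUseID string `json:"tool_use_id"`
	Content   string `json:"content"`
}

type userMessage struct {
	Role    string       `json:"role"`
	Content []toolResult `json:"content"`
}

// toolResultsMessage builds the follow-up user message, emitting one
// tool_result per tool_use in the original order. It fails if any call
// has no recorded output, so an unpaired tool_use never reaches history.
func toolResultsMessage(calls []toolUse, outputs map[string]string) (userMessage, error) {
	msg := userMessage{Role: "user"}
	for _, c := range calls {
		out, ok := outputs[c.ID]
		if !ok {
			return userMessage{}, fmt.Errorf("no result recorded for tool_use %s (%s)", c.ID, c.Name)
		}
		msg.Content = append(msg.Content, toolResult{
			Type:      "tool_result",
			ToolUseID: c.ID,
			Content:   out,
		})
	}
	return msg, nil
}

func main() {
	calls := []toolUse{
		{ID: "toolu_01Co6xj94B74RC3zmsmi7eHs", Name: "kubectl"},
		{ID: "toolu_01WitVoRNAaJ39aUSaide8ui", Name: "kubectl"},
	}
	outputs := map[string]string{
		"toolu_01Co6xj94B74RC3zmsmi7eHs": "placeholder namespaces output",
		"toolu_01WitVoRNAaJ39aUSaide8ui": "No resources found in default namespace.",
	}
	msg, err := toolResultsMessage(calls, outputs)
	if err != nil {
		panic(err)
	}
	out, _ := json.MarshalIndent(msg, "", "  ")
	fmt.Println(string(out))
}
```

Failing fast on a missing result is a design choice for the sketch: it guards against the kind of tool-call history corruption mentioned in the Context section.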

@zvdy zvdy marked this pull request as draft March 4, 2026 20:27

droot commented Mar 10, 2026

Love it!! Looking forward to it. Will be happy to review it :)

@zvdy zvdy force-pushed the feat/anthropic-provider branch from 799b251 to 9402919 Compare March 10, 2026 19:56
@zvdy zvdy marked this pull request as ready for review March 10, 2026 20:36

zvdy commented Mar 10, 2026

I'm OK with the current state. I'll iterate on more features, like temperature, later; I think most users don't tweak those, so they're minor future improvements.

ptal @droot

@zvdy zvdy changed the title from "feat(wip): anthropic provider" to "feat: anthropic provider" Mar 10, 2026

@droot droot left a comment

Thank you!

@droot droot merged commit e7ec597 into GoogleCloudPlatform:main Mar 10, 2026
8 checks passed