feat: anthropic provider #637
Merged
Member
Love it!! Looking forward to it. Will be happy to review it :)
Force-pushed from 799b251 to 9402919
Contributor
Author
I'm OK with the current state. I'll iterate on more features such as temperature later; I think most users don't tweak those, so they are minor future improvements. PTAL @droot
Context
Adds a native Anthropic provider for `kubectl-ai` using the official `github.com/anthropics/anthropic-sdk-go` SDK, instead of routing Claude through the OpenAI compatibility shim.

Why native?
- `cache_control` on the system prompt and tool definitions: the system prompt is large and repeated every turn, so caching cuts cost significantly
- `thinking` content blocks for complex k8s debugging queries
- `overloaded_error` and `rate_limit_error` map cleanly to retry logic

Testing
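Each trace in this section reports the `usage` counters returned with a response. As a rough sketch of how those counters can be read, the Go snippet below uses a hypothetical `Usage` struct (a stand-in named after the JSON keys in the traces, not a type taken from the SDK) and assumes that `input_tokens` counts only the uncached part of the prompt.

```go
package main

import "fmt"

// Usage is a hypothetical stand-in for the usage counters quoted in the
// traces. Field names follow the JSON keys; this is not the SDK's type.
type Usage struct {
	InputTokens              int // uncached prompt tokens (assumption)
	CacheCreationInputTokens int // tokens written to the cache this turn
	CacheReadInputTokens     int // tokens served from the cache this turn
}

// CacheHit reports whether the cached prefix was read without being rewritten.
func (u Usage) CacheHit() bool {
	return u.CacheReadInputTokens > 0 && u.CacheCreationInputTokens == 0
}

// PromptTokens is the total prompt size: uncached input plus the cached
// prefix, whether it was read from or newly written to the cache.
func (u Usage) PromptTokens() int {
	return u.InputTokens + u.CacheCreationInputTokens + u.CacheReadInputTokens
}

func main() {
	// Counter values taken from the two turns reported in this PR.
	turns := []Usage{
		{InputTokens: 339, CacheReadInputTokens: 2254},
		{InputTokens: 696, CacheReadInputTokens: 2254},
	}
	for i, u := range turns {
		fmt.Printf("turn %d: hit=%v prompt=%d cached=%d\n",
			i+1, u.CacheHit(), u.PromptTokens(), u.CacheReadInputTokens)
	}
}
```

Under that assumption, a turn where `cache_creation_input_tokens` is nonzero would indicate that the prefix was written to the cache rather than served from it.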
Trace: Prompt Caching
Both API turns show `cache_creation_input_tokens: 0` and `cache_read_input_tokens: 2254`: the system prompt and tool definitions (2254 tokens) are served from Anthropic's prompt cache on every turn.

Cache breakpoints are set via `"cache_control": {"type": "ephemeral"}` on the system prompt block and the last tool definition.

Turn 1: `input_tokens: 339`, `cache_read: 2254`, `cache_creation: 0`
Turn 2: history added (`input_tokens` grew 339 → 696), cache still hits at 2254

Trace: Tool Use (parallel, streamed)
Claude streamed two parallel `tool_use` blocks in a single response. Tool input JSON is assembled incrementally via `input_json_delta` events; the tools are then executed and their output is returned as `tool_result` blocks.

Tool execution:
Trace: Native SSE Streaming
The final answer is delivered as a stream of `text_delta` events: each chunk is yielded to the UI incrementally, with no buffering until completion.

Trace: Multi-Turn History
`input_tokens` grew from 339 (turn 1) to 696 (turn 2), reflecting the full conversation history appended to each request. `cache_read_input_tokens` stays at 2254; the system prompt cache is unaffected by history growth.

Messages array sent in turn 2:
```json
[
  {
    "role": "user",
    "content": [
      {"type": "text", "text": "list all namespaces\nwhat pods are running in the default namespace?"}
    ]
  },
  {
    "role": "assistant",
    "content": [
      {"type": "text", "text": "I'll help you with those requests! Let me fetch the information from your Kubernetes cluster."},
      {"type": "tool_use", "id": "toolu_01Co6xj94B74RC3zmsmi7eHs", "name": "kubectl", "input": {"command": "kubectl get namespaces"}},
      {"type": "tool_use", "id": "toolu_01WitVoRNAaJ39aUSaide8ui", "name": "kubectl", "input": {"command": "kubectl get pods -n default"}}
    ]
  },
  {
    "role": "user",
    "content": [
      {"type": "tool_result", "tool_use_id": "toolu_01Co6xj94B74RC3zmsmi7eHs", "content": "...kubectl get namespaces output..."},
      {"type": "tool_result", "tool_use_id": "toolu_01WitVoRNAaJ39aUSaide8ui", "content": "...No resources found..."}
    ]
  }
]
```

@droot
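In the messages array above, each `tool_use` block in the assistant turn is answered by a `tool_result` block with the same ID in the following user turn. Below is a minimal Go sketch of a check for that pairing. The `Block` and `Message` types and the `UnansweredToolUses` helper are hypothetical simplifications for illustration; they are not taken from the SDK or from this PR's code.

```go
package main

import "fmt"

// Block and Message are hypothetical, simplified stand-ins for the content
// blocks in the messages array; they are not the SDK's types.
type Block struct {
	Type      string // "text", "tool_use", or "tool_result"
	ID        string // set on tool_use blocks
	ToolUseID string // set on tool_result blocks
}

type Message struct {
	Role    string
	Content []Block
}

// UnansweredToolUses returns the IDs of tool_use blocks in assistant messages
// that have no tool_result with a matching tool_use_id in the next message.
func UnansweredToolUses(history []Message) []string {
	var missing []string
	for i, msg := range history {
		if msg.Role != "assistant" {
			continue
		}
		// Collect the tool_result IDs from the user message that follows.
		answered := map[string]bool{}
		if i+1 < len(history) && history[i+1].Role == "user" {
			for _, b := range history[i+1].Content {
				if b.Type == "tool_result" {
					answered[b.ToolUseID] = true
				}
			}
		}
		for _, b := range msg.Content {
			if b.Type == "tool_use" && !answered[b.ID] {
				missing = append(missing, b.ID)
			}
		}
	}
	return missing
}

func main() {
	// Same shape and IDs as the turn 2 messages array, with text omitted.
	history := []Message{
		{Role: "user", Content: []Block{{Type: "text"}}},
		{Role: "assistant", Content: []Block{
			{Type: "text"},
			{Type: "tool_use", ID: "toolu_01Co6xj94B74RC3zmsmi7eHs"},
			{Type: "tool_use", ID: "toolu_01WitVoRNAaJ39aUSaide8ui"},
		}},
		{Role: "user", Content: []Block{
			{Type: "tool_result", ToolUseID: "toolu_01Co6xj94B74RC3zmsmi7eHs"},
			{Type: "tool_result", ToolUseID: "toolu_01WitVoRNAaJ39aUSaide8ui"},
		}},
	}
	fmt.Println("unanswered tool_use blocks:", len(UnansweredToolUses(history)))
}
```

A check along these lines could run before each request, so that a dropped tool result shows up locally rather than as an API error.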