What would you propose? Even between Anthropic and Gemini, the exposed controls vary. I'm not against the idea of adding support to the core abstractions if there's a reasonable way to handle it robustly in a provider-agnostic manner. It doesn't necessarily have to be on the exact content instance. It could, for example, be a dedicated content type that's added immediately after the target content, in which case it could be provider-specific without complicating the model. The IChatClient implementation then peeks ahead at the next content to see whether the two should be combined logically. (This is how, for example, the Google NuGet package handles thought signatures on arbitrary content: by adding an additional reasoning content node after the target.) AdditionalProperties can also be used, e.g.
TextContent tc = ...;
tc.AdditionalProperties = new() { ["CacheControl"] = new CacheControlEphemeral { TTL = ttl } };
and then the IChatClient implementation just looks for the additional property information there.
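For illustration, the "peek ahead" idea could look roughly like the sketch below. It is not taken from any existing IChatClient implementation. To stay self-contained, it declares minimal stand-ins for `AIContent` and `TextContent` instead of referencing the Microsoft.Extensions.AI package, and `CacheMarkerContent`, `ProviderBlock`, and `CacheMarkerMapper` are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-ins for the Microsoft.Extensions.AI types of the same name,
// declared here only so the sketch compiles without any package reference.
public class AIContent { }

public sealed class TextContent : AIContent
{
    public TextContent(string text) => Text = text;
    public string Text { get; }
}

// Hypothetical marker content: it carries provider-specific cache settings
// and applies to the content item immediately before it in the list.
public sealed class CacheMarkerContent : AIContent
{
    public TimeSpan? Ttl { get; init; }
}

// Hypothetical provider-side shape: one text block plus optional cache settings.
public sealed record ProviderBlock(string Text, TimeSpan? CacheTtl, bool Cached);

public static class CacheMarkerMapper
{
    // Walks the contents once, peeking at the next item to decide whether the
    // current block should be emitted with cache settings attached.
    public static List<ProviderBlock> Map(IReadOnlyList<AIContent> contents)
    {
        var blocks = new List<ProviderBlock>();
        for (int i = 0; i < contents.Count; i++)
        {
            if (contents[i] is not TextContent text)
            {
                // Markers are consumed by the peek below; a marker with no
                // preceding text content is simply ignored in this sketch.
                continue;
            }

            if (i + 1 < contents.Count && contents[i + 1] is CacheMarkerContent marker)
            {
                blocks.Add(new ProviderBlock(text.Text, marker.Ttl, Cached: true));
                i++; // skip the marker that was just merged into this block
            }
            else
            {
                blocks.Add(new ProviderBlock(text.Text, null, Cached: false));
            }
        }
        return blocks;
    }
}
```

Given a list of a `TextContent`, a `CacheMarkerContent`, and a second `TextContent`, `Map` returns two blocks: the first flagged as cached with the marker's TTL, the second uncached. A provider that does not understand the marker type can skip it, which keeps the core content model unchanged.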
Some APIs, in particular the Anthropic API, use explicit caching, where content parts need to be marked for caching. The cost savings on cached input are up to 90%, so in agentic loops with a lot of tool use, the difference in cost between caching and not caching is immense.
Currently one has to do something like this:
This is extremely awkward. And I'm not even sure it is possible to do for System-level content, depending on how the mapping to the top-level property is done (the Anthropic API doesn't have system messages in the messages array).
I think a shared abstraction for caching would be good so that provider-specific IChatClient implementations can support this. Something like a property on AIContent, perhaps?
I'll probably propose an extension to the Anthropic C# SDK to handle this in the meantime, but I think it's a valid top-level concern, as Gemini also supports explicit caching.
See: anthropics/anthropic-sdk-csharp#73