Releases: cloudflare/ai

workers-ai-provider@3.1.8

25 Mar 08:47
525b4ad

Patch Changes

@cloudflare/tanstack-ai@0.1.6

25 Mar 08:48
525b4ad

Patch Changes

  • #459 a375d3f Thanks @TimoWilhelm! - Add maxTokens support to WorkersAi chat and handle non-string responses
    • Forward maxTokens from TextOptions to the Workers AI binding as max_tokens in both streaming and non-streaming paths.
    • Stringify object responses from the binding when building assistant messages instead of defaulting to an empty string.
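
The two changes can be sketched as a pair of small helpers. This is an illustrative sketch, not the provider's actual internals; `TextOptions`, `toBindingInput`, and `toAssistantContent` are hypothetical names.

```typescript
// Hypothetical shape of the caller-facing options.
type TextOptions = { maxTokens?: number };

// Forward maxTokens to the Workers AI binding under its snake_case name,
// omitting the field entirely when the caller did not set it.
function toBindingInput(
  prompt: string,
  options: TextOptions,
): { prompt: string; max_tokens?: number } {
  return {
    prompt,
    ...(options.maxTokens !== undefined ? { max_tokens: options.maxTokens } : {}),
  };
}

// Stringify non-string binding responses instead of falling back to "".
function toAssistantContent(response: unknown): string {
  if (typeof response === "string") return response;
  return JSON.stringify(response);
}
```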

workers-ai-provider@3.1.7

22 Mar 13:51
1ad154e

Patch Changes

  • #457 cc94a06 Thanks @threepointone! - Fix request cancellation by propagating abortSignal to outbound network calls.

    ai-gateway-provider: Pass abortSignal to the fetch call (API path) and to binding.run() (binding path) so that cancelled requests are properly aborted.

    workers-ai-provider: Pass abortSignal to binding.run() for chat, embedding, and image models, matching the existing behavior in transcription, speech, and reranking models.

    @cloudflare/tanstack-ai: Pass signal through to binding.run() in both createGatewayFetch (AI Gateway binding path) and createWorkersAiBindingFetch (Workers AI binding path).
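
The common pattern across all three packages can be sketched as follows. The `AiBinding` interface and option names here are simplified assumptions, not the actual binding types.

```typescript
// Simplified stand-in for the Workers AI binding interface.
interface AiBinding {
  run(
    model: string,
    input: Record<string, unknown>,
    options?: { signal?: AbortSignal },
  ): Promise<unknown>;
}

// Propagate the caller's abort signal to the outbound call, so cancelling
// the request actually aborts the work instead of letting it run to completion.
async function runWithCancellation(
  binding: AiBinding,
  model: string,
  input: Record<string, unknown>,
  abortSignal?: AbortSignal,
) {
  return binding.run(model, input, { signal: abortSignal });
}
```

The REST path is the same idea applied to `fetch(url, { signal: abortSignal, ... })`.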

workers-ai-provider@3.1.6

22 Mar 10:12
8b86a0f

Patch Changes

  • #454 29087ad Thanks @mchenco! - Fix three tool calling bugs that caused multi-turn agentic loops to fail

    1. Tool result output not unwrapped

    convert-to-workersai-chat-messages.ts was calling JSON.stringify(toolResponse.output) on the entire LanguageModelV3ToolResultOutput wrapper object ({ type: 'text', value: '...' }), sending the wrapper as the tool message content instead of just the value. Models received garbled tool results and stopped after the first tool call instead of continuing.

    Fix: extract output.value and serialize only that.

    2. toolChoice: "required" mapped to "any" instead of "required"

    utils.ts mapped toolChoice: "required" to tool_choice: "any". All vLLM-backed models (@cf/moonshotai/kimi-k2.5, @cf/meta/llama-4-scout-17b-16e-instruct, @cf/zai-org/glm-4.7-flash) return 8001: Invalid input for tool_choice: "any". The same incorrect mapping applied to toolChoice: { type: "tool" }.

    Fix: map both to "required".

    3. description: false in tool definitions

    utils.ts used && short-circuit for tool description and parameters, which evaluates to false (not undefined) when tool.type !== "function". Sending description: false to the binding causes 8001: Invalid input.

    Fix: use ternary to produce undefined when not applicable.

    Tested against @cf/moonshotai/kimi-k2.5, @cf/meta/llama-4-scout-17b-16e-instruct, and @cf/zai-org/glm-4.7-flash via the Workers AI binding.
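
The three fixes above can be condensed into a sketch like the following. Types and function names are simplified stand-ins for the provider's internals, not the real code.

```typescript
// Simplified tool-result wrapper, modeled on LanguageModelV3ToolResultOutput.
type ToolResultOutput =
  | { type: "text"; value: string }
  | { type: "json"; value: unknown };

// 1. Serialize only the wrapped value, never the { type, value } wrapper.
function toToolMessageContent(output: ToolResultOutput): string {
  return typeof output.value === "string" ? output.value : JSON.stringify(output.value);
}

// 2. Map "required" and forced single-tool choices to "required", not "any".
function mapToolChoice(
  choice: "auto" | "none" | "required" | { type: "tool"; toolName: string },
): string {
  if (typeof choice === "object") return "required";
  return choice; // "auto" | "none" | "required" pass through unchanged
}

// 3. Use a ternary so non-function tools yield undefined, never `false`.
function toolDescription(tool: { type: string; description?: string }): string | undefined {
  return tool.type === "function" ? tool.description : undefined;
}
```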

ai-gateway-provider@3.1.2

22 Mar 13:51
1ad154e

Patch Changes

  • #457 cc94a06 Thanks @threepointone! - Fix request cancellation by propagating abortSignal to outbound network calls.

    ai-gateway-provider: Pass abortSignal to the fetch call (API path) and to binding.run() (binding path) so that cancelled requests are properly aborted.

    workers-ai-provider: Pass abortSignal to binding.run() for chat, embedding, and image models, matching the existing behavior in transcription, speech, and reranking models.

    @cloudflare/tanstack-ai: Pass signal through to binding.run() in both createGatewayFetch (AI Gateway binding path) and createWorkersAiBindingFetch (Workers AI binding path).

@cloudflare/tanstack-ai@0.1.5

22 Mar 13:51
1ad154e

Patch Changes

  • #457 cc94a06 Thanks @threepointone! - Fix request cancellation by propagating abortSignal to outbound network calls.

    ai-gateway-provider: Pass abortSignal to the fetch call (API path) and to binding.run() (binding path) so that cancelled requests are properly aborted.

    workers-ai-provider: Pass abortSignal to binding.run() for chat, embedding, and image models, matching the existing behavior in transcription, speech, and reranking models.

    @cloudflare/tanstack-ai: Pass signal through to binding.run() in both createGatewayFetch (AI Gateway binding path) and createWorkersAiBindingFetch (Workers AI binding path).

workers-ai-provider@3.1.5

21 Mar 14:20
e07d57e

Patch Changes

  • #451 2a62e23 Thanks @mchenco! - Fix reasoning content being concatenated into assistant message content in multi-turn conversations

    Previously, reasoning parts in assistant messages were concatenated into the content string when building message history. This caused models like kimi-k2.5 and deepseek-r1 to receive their own internal reasoning as if it were spoken text, corrupting the conversation history and resulting in empty text responses or leaked special tokens on subsequent turns.

    Reasoning parts are now sent as the reasoning field on the assistant message object, which is the field name vLLM expects on input for reasoning models (kimi-k2.5, glm-4.7-flash).
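
A minimal sketch of the history-building fix, assuming simplified part shapes (the real message parts carry more fields):

```typescript
// Simplified assistant-message parts.
type AssistantPart =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string };

function toAssistantMessage(parts: AssistantPart[]) {
  const text = parts.filter((p) => p.type === "text").map((p) => p.text).join("");
  const reasoning = parts.filter((p) => p.type === "reasoning").map((p) => p.text).join("");
  return {
    role: "assistant" as const,
    // Only spoken text goes into content...
    content: text,
    // ...while reasoning travels under its own field, which vLLM expects
    // on input for reasoning models.
    ...(reasoning ? { reasoning } : {}),
  };
}
```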

workers-ai-provider@3.1.4

19 Mar 13:09
fec9b69

Patch Changes

  • #448 054ccb8 Thanks @threepointone! - Fix image inputs for vision-capable chat models

    • Handle all LanguageModelV3DataContent variants (Uint8Array, base64 string, data URL) instead of only Uint8Array
    • Send images as OpenAI-compatible image_url content parts inline in messages, enabling vision for models like Llama 4 Scout and Kimi K2.5
    • Works with both the binding and REST API paths
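
Normalizing the three data-content variants into an OpenAI-style `image_url` part can be sketched like this. The helper names and default media type are illustrative assumptions:

```typescript
// Accepts the three variants: Uint8Array, bare base64 string, or data URL.
function toDataUrl(data: Uint8Array | string, mediaType = "image/png"): string {
  if (typeof data === "string") {
    // Already a data URL? Pass it through; otherwise treat it as raw base64.
    return data.startsWith("data:") ? data : `data:${mediaType};base64,${data}`;
  }
  // Bytes → base64. btoa is available in Workers and modern Node; a real
  // implementation might stream or use Buffer instead.
  const binary = Array.from(data, (b) => String.fromCharCode(b)).join("");
  return `data:${mediaType};base64,${btoa(binary)}`;
}

// Wrap as an OpenAI-compatible image_url content part for the message body.
function toImagePart(data: Uint8Array | string, mediaType?: string) {
  return { type: "image_url" as const, image_url: { url: toDataUrl(data, mediaType) } };
}
```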

workers-ai-provider@3.1.3

19 Mar 07:17
761720e

Patch Changes

  • #429 ae24f06 Thanks @michaeldwan! - Pass tool_choice through to binding.run() so tool selection mode (auto, required, none) is respected when using Workers AI with the binding API

  • #410 bc2eba3 Thanks @vaibhavshn! - fix: route REST API requests through AI Gateway when the gateway option is provided in createRun()

  • #446 3c35051 Thanks @threepointone! - Remove tool_call_id sanitization that truncated IDs to 9 alphanumeric chars, which caused all tool call IDs to collide after a round trip

  • #444 b1c742b Thanks @mchenco! - Add sessionAffinity setting to send x-session-affinity header for prefix-cache optimization. Also forward extraHeaders in the REST API path instead of discarding them.
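
The sessionAffinity and extraHeaders changes from #444 might look roughly like this header-building sketch; the option names are assumptions, not the package's documented API.

```typescript
// Build REST API headers, forwarding caller-supplied extras and the
// session-affinity key instead of discarding them.
function buildHeaders(opts: {
  apiToken: string;
  sessionAffinity?: string;
  extraHeaders?: Record<string, string>;
}): Record<string, string> {
  return {
    Authorization: `Bearer ${opts.apiToken}`,
    "Content-Type": "application/json",
    // Forward extraHeaders rather than dropping them on the REST path.
    ...opts.extraHeaders,
    // Prefix-cache optimization: requests sharing an affinity key can be
    // routed to the same backend.
    ...(opts.sessionAffinity ? { "x-session-affinity": opts.sessionAffinity } : {}),
  };
}
```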

@cloudflare/tanstack-ai@0.1.4

19 Mar 13:09
fec9b69

Patch Changes

  • #448 054ccb8 Thanks @threepointone! - Fix image inputs for vision-capable chat models

    • Handle all LanguageModelV3DataContent variants (Uint8Array, base64 string, data URL) instead of only Uint8Array
    • Send images as OpenAI-compatible image_url content parts inline in messages, enabling vision for models like Llama 4 Scout and Kimi K2.5
    • Works with both the binding and REST API paths