Releases: cloudflare/ai

@cloudflare/tanstack-ai@0.1.3

19 Mar 07:18
761720e

Patch Changes

  • #435 7381171 Thanks @mdhruvil! - Fix workers-ai adapter silently dropping image content parts.

  • #424 b2eeca8 Thanks @vaibhavshn! - Avoid duplicate tool call IDs by generating unique IDs per tool call index instead of trusting backend-provided IDs

  • #411 af08464 Thanks @baldyeagle! - Annotate createAnthropicChat to improve client type narrowing

  • #398 40e53c8 Thanks @vaibhavshn! - fix: add run/ prefix to workers-ai gateway endpoint and make API key optional for gateway bindings

  • #444 414b4d5 Thanks @mchenco! - Add sessionAffinity option to WorkersAiAdapterConfig for prefix-cache optimization. Routes requests with the same key to the same backend replica via the x-session-affinity header. Supported across binding, REST, and gateway modes.
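
    A minimal sketch of the session-affinity idea: requests that share a key carry the documented x-session-affinity header so they land on the same replica. The helper name and shape below are illustrative, not part of the package's API.

```typescript
// Sketch: attach the x-session-affinity header (named in the note above)
// so requests sharing a key reach the same backend replica for
// prefix-cache reuse. Helper name and shape are illustrative.
function withSessionAffinity(
  headers: Record<string, string>,
  sessionKey?: string,
): Record<string, string> {
  if (!sessionKey) return headers; // affinity is opt-in
  return { ...headers, "x-session-affinity": sessionKey };
}
```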

workers-ai-provider@3.1.2

20 Feb 10:12
9b6d4a9

Patch Changes

  • #400 8822603 Thanks @threepointone! - Add early config validation to createWorkersAI that throws a clear error when neither a binding nor credentials (accountId + apiKey) are provided. Widen all model type parameters (TextGenerationModels, ImageGenerationModels, EmbeddingModels, TranscriptionModels, SpeechModels, RerankingModels) to accept arbitrary strings while preserving autocomplete for known models.
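
    The validation and the widened model types might look like the following sketch; the function and interface names are illustrative, not the provider's internal code.

```typescript
// Sketch of the early validation described above: throw a clear error
// when neither a binding nor full credentials (accountId + apiKey)
// are provided. Names are illustrative.
interface WorkersAIConfig {
  binding?: object;
  accountId?: string;
  apiKey?: string;
}

function validateWorkersAIConfig(config: WorkersAIConfig): void {
  const hasBinding = config.binding !== undefined;
  const hasCredentials =
    config.accountId !== undefined && config.apiKey !== undefined;
  if (!hasBinding && !hasCredentials) {
    throw new Error(
      "createWorkersAI: provide either a `binding` or both `accountId` and `apiKey`.",
    );
  }
}

// Widened model type: known IDs keep autocomplete via the union,
// while `string & {}` still admits arbitrary model strings.
type TextGenerationModels = "@cf/meta/llama-3.1-8b-instruct" | (string & {});
```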

@cloudflare/tanstack-ai@0.1.2

20 Feb 10:12
9b6d4a9

Patch Changes

  • #406 9af703b Thanks @vaibhavshn! - Pass the API key correctly for the Gemini TanStack AI adapter

  • #400 8822603 Thanks @threepointone! - Add config validation to all Workers AI adapter constructors that throws a clear error when neither a binding, credentials (accountId + apiKey), nor a gateway configuration is provided. Widen all model type parameters (WorkersAiTextModel, WorkersAiImageModel, WorkersAiEmbeddingModel, WorkersAiTranscriptionModel, WorkersAiTTSModel, WorkersAiSummarizeModel) to accept arbitrary strings while preserving autocomplete for known models.

workers-ai-provider@3.1.1

12 Feb 15:24
409b28d

Patch Changes

  • #396 2fb3ca8 Thanks @threepointone! -
    • Rewrite README with updated model recommendations (GPT-OSS 120B, EmbeddingGemma 300M, Aura-2 EN)
    • Stream tool calls incrementally using tool-input-start/delta/end events instead of buffering until stream end
    • Fix REST streaming for models that don't support it on /ai/run/ (GPT-OSS, Kimi) by retrying without streaming
    • Add Aura-2 EN/ES to SpeechModels type
    • Log malformed SSE events with console.warn instead of silently swallowing
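
    The SSE handling described above can be sketched as a small parser: extract `data:` payloads, treat [DONE] as the end sentinel, and warn on malformed JSON instead of dropping it silently. This is an illustration in the spirit of the notes, not the package's actual parser.

```typescript
// Minimal SSE parsing sketch: collect `data:` payloads from a chunk,
// surface the [DONE] sentinel, and warn (not swallow) on bad JSON.
function parseSSEChunk(chunk: string): Array<object | "[DONE]"> {
  const events: Array<object | "[DONE]"> = [];
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data:")) continue; // ignore comments/other fields
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") {
      events.push("[DONE]");
      continue;
    }
    try {
      events.push(JSON.parse(payload));
    } catch {
      console.warn("[workers-ai-provider] malformed SSE event:", payload);
    }
  }
  return events;
}
```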

@cloudflare/tanstack-ai@0.1.1

12 Feb 15:24
409b28d

Patch Changes

  • #396 2fb3ca8 Thanks @threepointone! -
    • Update model recommendations: Aura-2 EN for TTS, Llama 4 Scout for chat examples
    • Add Aura-2 EN/ES to TTS model type
    • Preserve image/vision content in user messages instead of stripping to text-only
    • Add non-streaming fallback when REST streaming fails (GPT-OSS, Kimi)
    • Warn on premature stream termination instead of silently reporting "stop"
    • Consistent console.warn prefix for SSE parse errors
    • Move @cloudflare/workers-types from optionalDependencies to devDependencies (types-only, no runtime use)
    • Fix @openrouter/sdk version mismatch type errors

workers-ai-provider@3.1.0

11 Feb 21:25
122eae6

Minor Changes

  • #389 8538cd5 Thanks @vaibhavshn! - Add transcription, text-to-speech, and reranking support to the Workers AI provider.

    New capabilities

    • Transcription (provider.transcription(model)) — implements TranscriptionModelV3. Supports Whisper models (@cf/openai/whisper, whisper-tiny-en, whisper-large-v3-turbo) and Deepgram Nova-3 (@cf/deepgram/nova-3). Handles model-specific input formats: number arrays for basic Whisper, base64 for v3-turbo via REST, and { body, contentType } for Nova-3 via binding or raw binary upload for Nova-3 via REST.

    • Speech / TTS (provider.speech(model)) — implements SpeechModelV3. Supports Workers AI TTS models including Deepgram Aura-1 (@cf/deepgram/aura-1). Accepts text, voice, and speed options. Returns audio as Uint8Array. Uses returnRawResponse to handle binary audio from the REST path without JSON parsing.

    • Reranking (provider.reranking(model)) — implements RerankingModelV3. Supports BGE reranker models (@cf/baai/bge-reranker-base, bge-reranker-v2-m3). Converts AI SDK's document format to Workers AI's { query, contexts, top_k } input. Handles both text and JSON object documents.
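
    The document conversion for reranking can be sketched as below; the `{ query, contexts, top_k }` field names come from the note above, while the function itself is illustrative.

```typescript
// Sketch: convert AI SDK reranking documents (plain strings or JSON
// objects) into Workers AI's { query, contexts, top_k } input shape.
function toRerankerInput(
  query: string,
  documents: Array<string | object>,
  topK?: number,
): { query: string; contexts: Array<{ text: string }>; top_k?: number } {
  return {
    query,
    contexts: documents.map((doc) => ({
      // JSON object documents are serialized to text for the reranker.
      text: typeof doc === "string" ? doc : JSON.stringify(doc),
    })),
    ...(topK !== undefined ? { top_k: topK } : {}),
  };
}
```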

    Bug fixes

    • AbortSignal passthrough: the createRun REST shim now passes the abort signal to fetch, enabling request cancellation and timeout handling. Previously the signal was silently dropped.
    • Nova-3 REST support — Added createRunBinary utility for models that require raw binary upload instead of JSON (used by Nova-3 transcription via REST).

    Usage

    import { createWorkersAI } from "workers-ai-provider";
    import {
      experimental_transcribe,
      experimental_generateSpeech,
      rerank,
    } from "ai";
    
    const workersai = createWorkersAI({ binding: env.AI });
    
    // Transcription
    const transcript = await experimental_transcribe({
      model: workersai.transcription("@cf/openai/whisper-large-v3-turbo"),
      audio: audioData,
      mediaType: "audio/wav",
    });
    
    // Speech
    const speech = await experimental_generateSpeech({
      model: workersai.speech("@cf/deepgram/aura-1"),
      text: "Hello world",
      voice: "asteria",
    });
    
    // Reranking
    const ranked = await rerank({
      model: workersai.reranking("@cf/baai/bge-reranker-base"),
      query: "What is machine learning?",
      documents: ["ML is a branch of AI.", "The weather is sunny."],
    });

workers-ai-provider@3.0.5

11 Feb 00:52
c8b1507

Patch Changes

  • #393 91b32e0 Thanks @threepointone! - Comprehensive cleanup of the workers-ai-provider package.

    Bug fixes:

    • Fixed phantom dependency on fetch-event-stream that caused runtime crashes when installed outside the monorepo. Replaced with a built-in SSE parser.

    • Fixed streaming buffering: responses now stream token-by-token instead of arriving all at once. The root cause was twofold — an eager ReadableStream start() pattern that buffered all chunks, and a heuristic that silently fell back to non-streaming doGenerate whenever tools were defined. Both are fixed. Streaming now uses a proper TransformStream pipeline with backpressure.

    • Fixed reasoning-delta ID mismatch in simulated streaming — was using generateId() instead of the reasoningId from the preceding reasoning-start event, causing the AI SDK to drop reasoning content.

    • Fixed REST API client (createRun) silently swallowing HTTP errors. Non-200 responses now throw with status code and response body.

    • Fixed response_format being sent as undefined on every non-JSON request. Now only included when actually set.

    • Fixed json_schema field evaluating to false (a boolean) instead of undefined when schema was missing.
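
    The last two fixes can be sketched together: response_format is only included when JSON output is requested, and a missing schema yields `undefined` rather than the boolean `false`. The function name and exact shapes below are illustrative.

```typescript
// Sketch of the request-body fixes described above.
function buildResponseFormat(
  jsonMode: boolean,
  schema?: object,
): { type: string; json_schema?: object } | undefined {
  // Previously a response_format key was sent as `undefined` on every
  // non-JSON request; now the whole value is omitted.
  if (!jsonMode) return undefined;
  return {
    type: schema ? "json_schema" : "json_object",
    // Previously a `schema && {...}`-style expression could evaluate to
    // the boolean `false`; passing the optional value through keeps it
    // `undefined` when no schema is given.
    json_schema: schema,
  };
}
```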

      Workers AI quirk workarounds:

    • Added sanitizeToolCallId() — strips non-alphanumeric characters and pads/truncates to 9 chars, fixing tool call round-trips through the binding which rejects its own generated IDs.

    • Added normalizeMessagesForBinding() — converts content: null to "" and sanitizes tool call IDs before every binding call. Only applied on the binding path (REST preserves original IDs).

    • Added null-finalization chunk filtering for streaming tool calls.

    • Added numeric value coercion in native-format streams (Workers AI sometimes returns numbers instead of strings for the response field).

    • Improved image model to handle all output types from binding.run(): ReadableStream, Uint8Array, ArrayBuffer, Response, and { image: base64 } objects.

    • Graceful degradation: if binding.run() returns a non-streaming response despite stream: true, it wraps the complete response as a simulated stream instead of throwing.
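
      The two message-normalization helpers above can be sketched as follows. The 9-character alphanumeric constraint is from the notes; the "0" pad character and the simplified message type are assumptions.

```typescript
// Sketch: enforce the binding's strict 9-char alphanumeric tool call
// IDs (pad character "0" is an assumption).
function sanitizeToolCallId(id: string): string {
  const alnum = id.replace(/[^a-zA-Z0-9]/g, "");
  return alnum.slice(0, 9).padEnd(9, "0");
}

// Simplified message shape for illustration.
interface BindingMessage {
  role: string;
  content: string | null;
  tool_call_id?: string;
}

// Sketch: convert null content to "" and sanitize tool call IDs before
// a binding call (the REST path keeps original IDs).
function normalizeMessagesForBinding(
  messages: BindingMessage[],
): BindingMessage[] {
  return messages.map((message) => ({
    ...message,
    content: message.content ?? "",
    ...(message.tool_call_id !== undefined
      ? { tool_call_id: sanitizeToolCallId(message.tool_call_id) }
      : {}),
  }));
}
```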

      Premature stream termination detection:

    • Streams that end without a [DONE] sentinel now report finishReason: "error" with raw: "stream-truncated" instead of silently reporting "stop".

    • Stream read errors are caught and emit finishReason: "error" with raw: "stream-error".
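
      The detection logic above reduces to a small classification, sketched here with illustrative names:

```typescript
// Sketch: a read error or a stream that ends without the [DONE]
// sentinel reports an error instead of a clean "stop".
function classifyStreamFinish(
  sawDoneSentinel: boolean,
  hadReadError: boolean,
): { finishReason: "stop" | "error"; raw?: string } {
  if (hadReadError) return { finishReason: "error", raw: "stream-error" };
  if (!sawDoneSentinel) {
    return { finishReason: "error", raw: "stream-truncated" };
  }
  return { finishReason: "stop" };
}
```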

      AI Search (formerly AutoRAG):

    • Added createAISearch and AISearchChatLanguageModel as the canonical exports, reflecting the rename from AutoRAG to AI Search.

    • createAutoRAG still works but emits a one-time deprecation warning pointing to createAISearch.

    • createAutoRAG preserves "autorag.chat" as the provider name for backward compatibility.

    • AI Search now warns when tools or JSON response format are provided (unsupported by the aiSearch API).

    • Simplified AI Search internals — removed dead tool/response-format processing code.

      Code quality:

    • Removed dead code: workersai-error.ts (never imported), workersai-image-config.ts (inlined).

    • Consistent file naming: renamed workers-ai-embedding-model.ts to workersai-embedding-model.ts.

    • Replaced StringLike catch-all index signatures with [key: string]: unknown on settings types.

    • Replaced any types with proper interfaces (FlatToolCall, OpenAIToolCall, PartialToolCall).

    • Tightened processToolCall format detection to check function.name instead of just the presence of a function property.

    • Removed @ai-sdk/provider-utils and zod peer dependencies (no longer used in source).

    • Added imageModel to the WorkersAI interface type for consistency.

      Tests:

    • 149 unit tests across 10 test files (up from 82).

    • New test coverage: sanitizeToolCallId, normalizeMessagesForBinding, prepareToolsAndToolChoice, processText, mapWorkersAIUsage, image model output types, streaming error scenarios (malformed SSE, premature termination, empty stream), backpressure verification, graceful degradation (non-streaming fallback with text/tools/reasoning), REST API error handling (401/404/500), AI Search warnings, embedding TooManyEmbeddingValuesForCallError, message conversion with images and reasoning.

    • Integration tests for REST API and binding across 12 models and 7 categories (chat, streaming, multi-turn, tool calling, tool round-trip, structured output, image generation, embeddings).

    • All tests use the AI SDK's public APIs (generateText, streamText, generateImage, embedMany) instead of internal .doGenerate()/.doStream() methods.

      README:

    • Rewritten from scratch with concise examples, model recommendations, configuration guide, and known limitations section.

    • Updated to use current AI SDK v6 APIs (generateText + Output.object instead of deprecated generateObject, generateImage instead of experimental_generateImage, stopWhen: stepCountIs(2) instead of maxSteps).

    • Added sections for tool calling, structured output, embeddings, image generation, and AI Search.

    • Uses wrangler.jsonc format for configuration examples.

@cloudflare/tanstack-ai@0.1.0

11 Feb 21:25
122eae6

Minor Changes

  • #389 a4b756e Thanks @vaibhavshn! - Add @cloudflare/tanstack-ai — adapters for using TanStack AI with Cloudflare Workers AI and AI Gateway.

    Workers AI adapters

    All Workers AI adapters support four configuration modes: plain binding (env.AI), plain REST (account ID + API key), AI Gateway binding (env.AI.gateway(id)), and AI Gateway REST (account ID + gateway ID).

    • Chat (createWorkersAiChat) — Streaming chat completions via the OpenAI-compatible API. Includes tool calling with full round-trip support, structured output via json_schema, and reasoning text streaming (STEP_STARTED/STEP_FINISHED AG-UI events) for models like QwQ, DeepSeek R1, and Kimi K2.5. A custom fetch shim translates OpenAI SDK calls to env.AI.run() for binding mode, with a stream transformer that handles both Workers AI native format and OpenAI-compatible format.
    • Image generation (createWorkersAiImage) — Stable Diffusion and other text-to-image models.
    • Transcription (createWorkersAiTranscription) — Speech-to-text via Whisper and Deepgram Nova-3.
    • Text-to-speech (createWorkersAiTts) — Audio generation via Deepgram Aura-1.
    • Summarization (createWorkersAiSummarize) — Text summarization via BART-large-CNN.
    • Embeddings (createWorkersAiEmbedding) — Text embeddings (implemented but not yet exported, pending TanStack AI's BaseEmbeddingAdapter).
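
    The four configuration modes above might look like this fragment; the option key names (accountId, apiKey, gatewayId) are assumptions inferred from the mode descriptions, and `env` stands for a Worker's environment bindings, so the snippet is not runnable standalone.

```typescript
// Illustrative configuration fragment; option key names are assumptions.
import { createWorkersAiChat } from "@cloudflare/tanstack-ai";

// 1. Plain binding
const viaBinding = createWorkersAiChat({ binding: env.AI });

// 2. Plain REST (account ID + API key)
const viaRest = createWorkersAiChat({ accountId, apiKey });

// 3. AI Gateway binding
const viaGatewayBinding = createWorkersAiChat({
  binding: env.AI.gateway("my-gateway"),
});

// 4. AI Gateway REST (account ID + gateway ID)
const viaGatewayRest = createWorkersAiChat({
  accountId,
  gatewayId: "my-gateway",
});
```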

    AI Gateway adapters (third-party providers)

    Route requests through Cloudflare AI Gateway for caching, rate limiting, and unified billing. Each adapter injects a custom fetch (or httpOptions for Gemini) that handles both binding and credential-based gateway configurations.

    • OpenAI — Chat, summarize, image, transcription, TTS, video (createOpenAi*)
    • Anthropic — Chat, summarize (createAnthropic*)
    • Gemini — Chat, summarize, image, TTS (createGemini*). Credentials-only (Google GenAI SDK lacks custom fetch support).
    • Grok — Chat, summarize, image (createGrok*)
    • OpenRouter — Chat, summarize, image (createOpenRouter*). Accepts any model string.

    Utilities

    • createGatewayFetch — Shared fetch factory that routes requests through AI Gateway (binding or REST), with support for cache control headers (skipCache, cacheTtl, customCacheKey, metadata).
    • createWorkersAiBindingFetch — Fetch shim that makes env.AI look like an OpenAI endpoint, including stream transformation and tool call ID sanitization for the binding's strict [a-zA-Z0-9]{9} validation.
    • Config detection helpers (isDirectBindingConfig, isDirectCredentialsConfig, isGatewayConfig) using structural typing to discriminate env.AI from env.AI.gateway(id).
    • Shared binary utilities for normalizing Workers AI responses (Uint8Array, ArrayBuffer, ReadableStream, JSON wrapper) to base64.
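
    Structural discrimination of configs might be sketched as below; the exact properties checked are assumptions (a direct binding is assumed to expose a run() function, and credential configs to carry plain string fields), not the package's actual checks.

```typescript
// Sketch: discriminate config shapes by structure rather than by class.
function isDirectBindingConfig(config: unknown): boolean {
  // A direct env.AI binding is assumed to expose a run() function.
  return typeof (config as { binding?: { run?: unknown } })?.binding?.run ===
    "function";
}

function isDirectCredentialsConfig(config: unknown): boolean {
  const c = config as {
    accountId?: unknown;
    apiKey?: unknown;
    gatewayId?: unknown;
  };
  // Plain REST credentials: accountId + apiKey, and no gateway ID.
  return typeof c?.accountId === "string" &&
    typeof c?.apiKey === "string" &&
    c?.gatewayId === undefined;
}
```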

    Robustness

    • Premature stream termination detection — if Workers AI truncates a response or the connection drops (no finish_reason), the adapter emits proper closing events so consumers don't hang.
    • Graceful non-streaming fallback — if a model returns a complete response despite stream: true, the binding shim wraps it into a valid response.
    • Deepgram Nova-3 transcription uses raw binary audio via REST (not JSON), automatically detected by model name.

    Testing

    • Comprehensive unit tests (186 tests) covering all adapters, config modes, stream transformation, message building, tool calling, reasoning events, premature termination, and public API surface.
    • E2E integration tests against real Workers AI APIs (both binding and REST paths) across 12 chat models + 4 transcription models + image/TTS/summarize, validating chat, multi-turn, tool calling, tool round-trips, structured output, reasoning, and all non-chat capabilities.
    • Tree-shakeable package exports with per-adapter entry points for ESM and CJS.

workers-ai-provider@3.0.4

31 Jan 01:50
bcc4ee2

Patch Changes

  • #390 41b92a3 Thanks @mchenco! - fix(workers-ai-provider): extract actual finish reason in streaming instead of hardcoded "stop"

    Previously, the streaming implementation always returned finishReason: "stop" regardless of the actual completion reason. This caused:

    • Tool calling scenarios to incorrectly report "stop" instead of "tool-calls"
    • Multi-turn tool conversations to fail because the AI SDK couldn't detect when tools were requested
    • Length limit scenarios to show "stop" instead of "length"
    • Error scenarios to show "stop" instead of "error"

    The fix extracts the actual finish_reason from streaming chunks and uses the existing mapWorkersAIFinishReason() function to properly map it to the AI SDK's finish reason format. This enables proper multi-turn tool calling and accurate completion status reporting.
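
    An illustrative mapping in the spirit of mapWorkersAIFinishReason (whose exact table is not shown in these notes); the target vocabulary is the AI SDK's finish-reason set:

```typescript
type FinishReason = "stop" | "length" | "tool-calls" | "error" | "unknown";

// Sketch: translate a chunk's finish_reason instead of hardcoding "stop".
function mapFinishReason(reason: string | null | undefined): FinishReason {
  switch (reason) {
    case "stop":
      return "stop";
    case "length":
      return "length";
    case "tool_calls":
      return "tool-calls";
    case "error":
      return "error";
    default:
      return "unknown"; // unrecognized reasons no longer report "stop"
  }
}
```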

ai-gateway-provider@3.1.1

29 Jan 00:46
ff539d6

Patch Changes