Workers AI provider for the AI SDK. Run Cloudflare's models for chat, embeddings, image generation, transcription, text-to-speech, reranking, and AI Search — all from a single provider.
```ts
import { createWorkersAI } from "workers-ai-provider";
import { streamText } from "ai";

export default {
  async fetch(req: Request, env: { AI: Ai }) {
    const workersai = createWorkersAI({ binding: env.AI });
    const result = streamText({
      model: workersai("@cf/moonshotai/kimi-k2.5"),
      messages: [{ role: "user", content: "Write a haiku about Cloudflare" }],
    });
    return result.toTextStreamResponse();
  },
};
```

Install the provider alongside the AI SDK:

```sh
npm install workers-ai-provider ai
```

Inside a Cloudflare Worker, pass the `env.AI` binding directly. No API keys needed.
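The binding is declared in your Wrangler config; a minimal sketch (the binding name `AI` matches the example above):

```jsonc
// wrangler.jsonc
{
  "ai": { "binding": "AI" }
}
```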
```ts
const workersai = createWorkersAI({ binding: env.AI });
```

Outside of Workers (Node.js, Bun, etc.), use your Cloudflare credentials:
```ts
const workersai = createWorkersAI({
  accountId: process.env.CLOUDFLARE_ACCOUNT_ID,
  apiKey: process.env.CLOUDFLARE_API_TOKEN,
});
```

Route requests through AI Gateway for caching, rate limiting, and observability:
```ts
const workersai = createWorkersAI({
  binding: env.AI,
  gateway: { id: "my-gateway" },
});
```

Browse the full catalog at developers.cloudflare.com/workers-ai/models.
Some good defaults:
| Task | Model | Notes |
|---|---|---|
| Chat | `@cf/moonshotai/kimi-k2.5` | 256k ctx, tools, vision, reasoning |
| Chat | `@cf/zai-org/glm-4.7-flash` | Fast, multilingual, 131k ctx |
| Chat | `@cf/openai/gpt-oss-120b` | OpenAI open-weights, high reasoning |
| Reasoning | `@cf/moonshotai/kimi-k2.5` | Configurable `reasoning_effort` |
| Reasoning | `@cf/qwen/qwq-32b` | Emits `reasoning_content` |
| Embeddings | `@cf/baai/bge-base-en-v1.5` | 768-dim, English |
| Embeddings | `@cf/google/embeddinggemma-300m` | 100+ languages, by Google |
| Images | `@cf/black-forest-labs/flux-1-schnell` | Fast, free-tier image generation |
| Transcription | `@cf/openai/whisper-large-v3-turbo` | Best accuracy, multilingual |
| Transcription | `@cf/deepgram/nova-3` | Fast, high accuracy |
| Text-to-Speech | `@cf/deepgram/aura-2-en` | Context-aware, natural pacing |
| Reranking | `@cf/baai/bge-reranker-base` | Fast document reranking |
Text generation:

```ts
import { generateText } from "ai";

const { text } = await generateText({
  model: workersai("@cf/moonshotai/kimi-k2.5"),
  prompt: "Explain Workers AI in one paragraph",
});
```

Streaming:
```ts
import { streamText } from "ai";

const result = streamText({
  model: workersai("@cf/moonshotai/kimi-k2.5"),
  messages: [{ role: "user", content: "Write a short story" }],
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
```

Reasoning-capable Workers AI models (GLM-4.7-flash, Kimi K2.5/K2.6, GPT-OSS, QwQ) accept `reasoning_effort` and `chat_template_kwargs` on their inputs. Set them at model creation time as settings, or per call via `providerOptions["workers-ai"]`; the per-call value wins:
```ts
// Settings-level (applies to every request on this model instance)
const model = workersai("@cf/zai-org/glm-4.7-flash", {
  reasoning_effort: "low", // "low" | "medium" | "high" | null
  chat_template_kwargs: { enable_thinking: false },
});
await generateText({ model, prompt: "Summarize in one sentence." });
```

```ts
// Per-call (overrides any settings-level value)
const model = workersai("@cf/zai-org/glm-4.7-flash");
await generateText({
  model,
  prompt: "Summarize in one sentence.",
  providerOptions: {
    "workers-ai": { reasoning_effort: "low" },
  },
});
```

`reasoning_effort: null` is meaningful — it's the explicit "disable reasoning" signal for models that support it. Both fields land on the `inputs` object of `binding.run()` (and the JSON body of the REST request), matching the shape expected by Workers AI. See the model catalog for per-model reasoning capabilities.
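For example, disabling reasoning entirely at the settings level (a minimal sketch reusing the model above):

```ts
// Explicit "disable reasoning" signal for a reasoning-capable model
const model = workersai("@cf/zai-org/glm-4.7-flash", {
  reasoning_effort: null,
});
await generateText({ model, prompt: "Answer directly, no thinking." });
```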
Send images to vision-capable models like Kimi K2.5:
```ts
import { generateText } from "ai";

const { text } = await generateText({
  model: workersai("@cf/moonshotai/kimi-k2.5"),
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        { type: "image", image: imageUint8Array },
      ],
    },
  ],
});
```

Images can be provided as `Uint8Array`, base64 strings, or data URLs. Multiple images per message are supported. Works with both the binding and REST API configurations.
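A data URL works the same way as raw bytes (a sketch; the base64 payload below is a truncated placeholder):

```ts
const { text } = await generateText({
  model: workersai("@cf/moonshotai/kimi-k2.5"),
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Describe this image" },
        // Placeholder data URL: substitute a real base64-encoded image
        { type: "image", image: "data:image/png;base64,iVBORw0KGgo..." },
      ],
    },
  ],
});
```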
Tool calling:

```ts
import { generateText, stepCountIs } from "ai";
import { z } from "zod";

const { text } = await generateText({
  model: workersai("@cf/moonshotai/kimi-k2.5"),
  prompt: "What's the weather in London?",
  tools: {
    getWeather: {
      description: "Get the current weather for a city",
      inputSchema: z.object({ city: z.string() }),
      execute: async ({ city }) => ({ city, temperature: 18, condition: "Cloudy" }),
    },
  },
  stopWhen: stepCountIs(2),
});
```
Structured output:

```ts
import { generateText, Output } from "ai";
import { z } from "zod";

const { output } = await generateText({
  model: workersai("@cf/moonshotai/kimi-k2.5"),
  prompt: "Recipe for spaghetti bolognese",
  output: Output.object({
    schema: z.object({
      name: z.string(),
      ingredients: z.array(z.object({ name: z.string(), amount: z.string() })),
      steps: z.array(z.string()),
    }),
  }),
});
```
Embeddings:

```ts
import { embedMany } from "ai";

const { embeddings } = await embedMany({
  model: workersai.textEmbedding("@cf/baai/bge-base-en-v1.5"),
  values: ["sunny day at the beach", "rainy afternoon in the city"],
});
```
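The resulting vectors can be compared with the AI SDK's `cosineSimilarity` helper, for example to score how related the two phrases are:

```ts
import { cosineSimilarity } from "ai";

const similarity = cosineSimilarity(embeddings[0], embeddings[1]);
console.log(`similarity: ${similarity.toFixed(3)}`);
```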
Image generation:

```ts
import { generateImage } from "ai";

const { images } = await generateImage({
  model: workersai.image("@cf/black-forest-labs/flux-1-schnell"),
  prompt: "A mountain landscape at sunset",
  size: "1024x1024",
});
// images[0].uint8Array contains the PNG bytes
```
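Inside a Worker's fetch handler, the generated image can be returned directly (a sketch; PNG content type per the comment above):

```ts
// From within a fetch handler, serve the image bytes as the response
return new Response(images[0].uint8Array, {
  headers: { "Content-Type": "image/png" },
});
```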
Transcribe audio using Whisper or Deepgram Nova-3 models.

```ts
import { transcribe } from "ai";
import { readFile } from "node:fs/promises";

const { text, segments } = await transcribe({
  model: workersai.transcription("@cf/openai/whisper-large-v3-turbo"),
  audio: await readFile("./audio.mp3"),
  mediaType: "audio/mpeg",
});
```

With language hints (Whisper only):
```ts
const { text } = await transcribe({
  model: workersai.transcription("@cf/openai/whisper-large-v3-turbo", {
    language: "fr",
  }),
  audio: audioBuffer,
  mediaType: "audio/wav",
});
```

Deepgram Nova-3 is also supported and detects language automatically:
```ts
const { text } = await transcribe({
  model: workersai.transcription("@cf/deepgram/nova-3"),
  audio: audioBuffer,
  mediaType: "audio/wav",
});
```

Generate spoken audio from text using Deepgram Aura-2.
```ts
import { speech } from "ai";

const { audio } = await speech({
  model: workersai.speech("@cf/deepgram/aura-2-en"),
  text: "Hello from Cloudflare Workers AI!",
  voice: "asteria",
});
// audio is a Uint8Array of MP3 bytes
```

Reorder documents by relevance to a query — useful for RAG pipelines.
```ts
import { rerank } from "ai";

const { results } = await rerank({
  model: workersai.reranking("@cf/baai/bge-reranker-base"),
  query: "What is Cloudflare Workers?",
  documents: [
    "Cloudflare Workers lets you run JavaScript at the edge.",
    "A cookie is a small piece of data stored in the browser.",
    "Workers AI runs inference on Cloudflare's global network.",
  ],
  topN: 2,
});
// results is sorted by relevance score
```

AI Search is Cloudflare's managed RAG service. Connect your data and query it with natural language.
```jsonc
// wrangler.jsonc
{
  "ai_search": [{ "binding": "AI_SEARCH", "name": "my-search-index" }],
}
```
```ts
import { createAISearch } from "workers-ai-provider";
import { generateText } from "ai";

const aisearch = createAISearch({ binding: env.AI_SEARCH });

const { text } = await generateText({
  model: aisearch(),
  messages: [{ role: "user", content: "How do I set up AI Gateway?" }],
});
```

Streaming works the same way — use `streamText` instead of `generateText`.
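A minimal streaming sketch with the same setup:

```ts
import { streamText } from "ai";

const result = streamText({
  model: aisearch(),
  messages: [{ role: "user", content: "How do I set up AI Gateway?" }],
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
```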
`createAutoRAG` still works but is deprecated. Use `createAISearch` instead.
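Migration is just a rename (a sketch; this assumes the deprecated factory accepts the same `binding` option):

```ts
// Before (deprecated)
const rag = createAutoRAG({ binding: env.AI_SEARCH });

// After
const search = createAISearch({ binding: env.AI_SEARCH });
```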
`createWorkersAI(options)` accepts:

| Option | Type | Description |
|---|---|---|
| `binding` | `Ai` | Workers AI binding (`env.AI`). Use this OR credentials. |
| `accountId` | `string` | Cloudflare account ID. Required with `apiKey`. |
| `apiKey` | `string` | Cloudflare API token. Required with `accountId`. |
| `gateway` | `GatewayOptions` | Optional AI Gateway config. |
Returns a provider with model factories. Each factory accepts an optional second argument for per-model settings:
workersai("@cf/moonshotai/kimi-k2.5", {
sessionAffinity: "my-unique-session-id",
});| Setting | Type | Description |
|---|---|---|
safePrompt |
boolean |
Inject a safety prompt before all conversations. |
sessionAffinity |
string |
Routes requests with the same key to the same backend replica for prefix-cache optimization. |
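Both settings can be combined (a sketch; values illustrative):

```ts
const model = workersai("@cf/moonshotai/kimi-k2.5", {
  safePrompt: true,             // prepend the safety prompt
  sessionAffinity: "user-1234", // pin this user's requests to one replica
});
```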
Model factories:
```ts
// Chat — for generateText / streamText
workersai(modelId);
workersai.chat(modelId);

// Embeddings — for embedMany / embed
workersai.textEmbedding(modelId);

// Images — for generateImage
workersai.image(modelId);

// Transcription — for transcribe
workersai.transcription(modelId, settings?);

// Text-to-Speech — for speech
workersai.speech(modelId);

// Reranking — for rerank
workersai.reranking(modelId);
```
`createAISearch(options)` accepts:

| Option | Type | Description |
|---|---|---|
| `binding` | `AutoRAG` | AI Search binding (`env.AI_SEARCH`). |
Returns a callable provider:
```ts
aisearch();      // AI Search model (shorthand)
aisearch.chat(); // AI Search model
```