Skip to content

fix(workers-ai-provider): forward reasoning_effort and chat_template_kwargs (#501)#504

Merged
threepointone merged 2 commits intomainfrom
fix/workers-ai-provider-reasoning-passthrough
Apr 23, 2026
Merged

fix(workers-ai-provider): forward reasoning_effort and chat_template_kwargs (#501)#504
threepointone merged 2 commits intomainfrom
fix/workers-ai-provider-reasoning-passthrough

Conversation

@threepointone
Copy link
Copy Markdown
Collaborator

Summary

  • Forwards reasoning_effort and chat_template_kwargs onto binding.run(model, inputs)'s inputs object instead of silently dropping them into the options arg / REST query string.
  • Supports both settings-level and per-call usage (via providerOptions["workers-ai"]; per-call wins).
  • Types both fields directly on WorkersAIChatSettings; reasoning_effort: null is preserved as the explicit "disable reasoning" signal.

Closes #501.

Why

modelSettings passed as the 2nd arg of workersAi(modelId, modelSettings) were flowing through getRunOptions() into the 3rd arg (options) of binding.run(model, inputs, options). But Cloudflare Workers AI's reasoning_effort and chat_template_kwargs parameters belong on the 2nd arg (inputs). The result: reasoning models (GLM-4.7-flash, Kimi K2.5/K2.6, GPT-OSS, QwQ) burned the entire output token budget on chain-of-thought with no visible content, because reasoning_effort: "low" was silently ignored. The REST path had an additional latent bug where reasoning_effort: null would throw from createRun at the query-string coercion step.

What changed

src/workersai-chat-settings.ts — typed reasoning_effort and chat_template_kwargs directly on WorkersAIChatSettings alongside the existing [key: string]: unknown escape hatch.

src/workersai-chat-language-model.ts — three surgical changes:

  1. buildRunInputs() now pulls both fields from settings and from providerOptions["workers-ai"] (per-call wins via "key" in perCall, so per-call null overrides settings "high"). A typeof + Array.isArray guard protects against malformed runtime values (since "key" in primitive throws).
  2. getRunOptions() explicitly destructures the two fields out so they can't leak into ...passthroughOptions (which is what bled into URL query strings / options arg).
  3. doGenerate and doStream now wire options.providerOptions through to buildRunInputs.

Usage

// Settings-level
const model = workersai("@cf/zai-org/glm-4.7-flash", {
  reasoning_effort: "low",
  chat_template_kwargs: { enable_thinking: false },
});

// Per-call (overrides settings)
await generateText({
  model,
  prompt,
  providerOptions: {
    "workers-ai": { reasoning_effort: "low" },
  },
});

Test plan

  • 15 new unit tests (binding + REST + gateway + providerOptions paths)
  • Binding path: reasoning_effort lands on inputs (2nd arg), not options
  • Binding path: chat_template_kwargs lands on inputs, not options
  • reasoning_effort: null is preserved on inputs
  • Omission stays absent from both inputs and options
  • Per-call providerOptions["workers-ai"] overrides settings
  • Per-call null overrides non-null settings (locks in "key" in precedence)
  • Malformed providerOptions["workers-ai"] (string / array) falls back to settings instead of crashing
  • Reasoning params + Gateway on binding path — no interference
  • Streaming request forwards both fields with stream: true
  • REST path: both fields land in the JSON body, not the URL query string
  • REST path: reasoning_effort: null round-trips (pre-fix regression)
  • REST path: unrelated custom settings (custom_flag: "yes") still flow to the URL query — no regression for existing passthrough behavior
  • 250/250 unit tests passing
  • Type-check clean

Sibling chat classes (AutoRAGChatLanguageModel, AISearchChatLanguageModel) audited — they don't use binding.run(model, inputs, options) and have no reasoning surface, so no duplicate fix is needed.

Notes for reviewers

  • Per-call providerOptions key is "workers-ai" (matches npm package name). Other reasonable candidates are "workersai" (matches the internal provider field prefix) or "workersai.chat" (matches the full provider name). Happy to flip this pre-merge if the maintainers prefer a different convention.
  • Split into two commits for auditable review: the original fix, plus a review-driven hardening pass with extra tests and the defensive typeof guard. Happy to squash before merge if preferred.

Made with Cursor

…kwargs (#501)

`modelSettings` passed to the provider were flowing through `getRunOptions()`
into the 3rd arg (options) of `binding.run(model, inputs, options)`, but
Cloudflare Workers AI's `reasoning_effort` and `chat_template_kwargs`
parameters belong on the 2nd arg (inputs). As a result they were silently
dropped, causing reasoning models (GLM-4.7-flash, Kimi K2.5/K2.6, GPT-OSS,
QwQ) to burn the entire output token budget on chain-of-thought.

- Type `reasoning_effort` and `chat_template_kwargs` directly on
  `WorkersAIChatSettings`.
- In `buildRunInputs()`, pull both values from settings and from
  `providerOptions["workers-ai"]` (per-call wins) and place them on the
  inputs object. `reasoning_effort: null` is preserved (`!== undefined`
  check) because it's the explicit "disable reasoning" signal.
- In `getRunOptions()`, strip them from `passthroughOptions` so they don't
  leak into the binding's options arg or the REST URL query string.
- Wire `options.providerOptions` through `doGenerate` and `doStream` so
  per-call overrides work without settings.

Adds 11 tests covering binding inputs placement, REST body placement,
null preservation, no leakage into options/query, per-call override, and
unrelated settings passthrough (no regression).

Closes #501.

Made-with: Cursor
Review-driven follow-ups on top of the #501 fix:

- Defensively guard `providerOptions["workers-ai"]` against non-object
  runtime values. `"key" in x` throws for primitives, so fall back to
  settings if a user passes a string / number / boolean / array rather
  than crashing the call.
- Test: per-call `null` overrides a non-null settings value (confirms the
  `"key" in perCall` precedence logic works when the value is explicitly
  falsy).
- Test: malformed `providerOptions["workers-ai"]` falls back to settings.
- Test: reasoning params + AI Gateway on the binding path — inputs and
  options stay cleanly separated; gateway doesn't see reasoning_effort.
- Test: `reasoning_effort: null` in settings no longer throws on the REST
  path. Before the fix, `createRun` rejected null at the query-string
  coercion step. Now that reasoning_effort lives in the JSON body, this
  round-trips cleanly.

Made-with: Cursor
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Apr 23, 2026

🦋 Changeset detected

Latest commit: 4b25307

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
workers-ai-provider Minor

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented Apr 23, 2026

Open in StackBlitz

npx https://pkg.pr.new/cloudflare/ai/ai-gateway-provider@504
npx https://pkg.pr.new/cloudflare/ai/@cloudflare/tanstack-ai@504
npx https://pkg.pr.new/cloudflare/ai/workers-ai-provider@504

commit: 4b25307

@threepointone threepointone merged commit 8024caf into main Apr 23, 2026
3 checks passed
@threepointone threepointone deleted the fix/workers-ai-provider-reasoning-passthrough branch April 23, 2026 13:04
@github-actions github-actions Bot mentioned this pull request Apr 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

workers-ai-provider: modelSettings not forwarded to inputs — reasoning_effort and chat_template_kwargs silently ignored

1 participant