fix(workers-ai-provider): forward reasoning_effort and chat_template_kwargs (#501) #504
Merged
threepointone merged 2 commits into main on Apr 23, 2026
…kwargs (#501)

`modelSettings` passed to the provider were flowing through `getRunOptions()` into the 3rd arg (options) of `binding.run(model, inputs, options)`, but Cloudflare Workers AI's `reasoning_effort` and `chat_template_kwargs` parameters belong on the 2nd arg (inputs). As a result they were silently dropped, causing reasoning models (GLM-4.7-flash, Kimi K2.5/K2.6, GPT-OSS, QwQ) to burn the entire output token budget on chain-of-thought.

- Type `reasoning_effort` and `chat_template_kwargs` directly on `WorkersAIChatSettings`.
- In `buildRunInputs()`, pull both values from settings and from `providerOptions["workers-ai"]` (per-call wins) and place them on the inputs object. `reasoning_effort: null` is preserved (`!== undefined` check) because it's the explicit "disable reasoning" signal.
- In `getRunOptions()`, strip them from `passthroughOptions` so they don't leak into the binding's options arg or the REST URL query string.
- Wire `options.providerOptions` through `doGenerate` and `doStream` so per-call overrides work without settings.

Adds 11 tests covering binding inputs placement, REST body placement, null preservation, no leakage into options/query, per-call override, and unrelated settings passthrough (no regression).

Closes #501.

Made-with: Cursor
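The inputs/options split described in this commit can be sketched roughly as follows. This is a minimal illustration, not the provider's actual code: `ChatSettings`, `reasoningInputs`, and `passthroughOptions` are hypothetical stand-ins for `WorkersAIChatSettings`, `buildRunInputs()`, and `getRunOptions()`, and the `"low" | "medium" | "high"` union is an assumption about the accepted values.

```typescript
// Hypothetical, simplified shape; the real type is WorkersAIChatSettings.
type ReasoningEffort = "low" | "medium" | "high" | null; // assumed value set

interface ChatSettings {
  reasoning_effort?: ReasoningEffort;
  chat_template_kwargs?: Record<string, unknown>;
  [key: string]: unknown; // escape hatch for unrelated passthrough settings
}

// Collects the fields that belong on `inputs` (2nd arg of binding.run).
function reasoningInputs(settings: ChatSettings): Record<string, unknown> {
  const inputs: Record<string, unknown> = {};
  // `!== undefined` instead of a truthiness check, so an explicit null survives.
  if (settings.reasoning_effort !== undefined) {
    inputs["reasoning_effort"] = settings.reasoning_effort;
  }
  if (settings.chat_template_kwargs !== undefined) {
    inputs["chat_template_kwargs"] = settings.chat_template_kwargs;
  }
  return inputs;
}

// Returns everything that may still flow to the options arg / query string.
function passthroughOptions(settings: ChatSettings): Record<string, unknown> {
  // Destructure the two reasoning fields out so they cannot ride along in `rest`.
  const { reasoning_effort: _effort, chat_template_kwargs: _kwargs, ...rest } = settings;
  return rest;
}
```

With `{ reasoning_effort: null, custom_flag: "yes" }` as settings, the first helper keeps the `null` and the second returns only `custom_flag`.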
Review-driven follow-ups on top of the #501 fix:

- Defensively guard `providerOptions["workers-ai"]` against non-object runtime values. `"key" in x` throws for primitives, so fall back to settings if a user passes a string / number / boolean / array rather than crashing the call.
- Test: per-call `null` overrides a non-null settings value (confirms the `"key" in perCall` precedence logic works when the value is explicitly falsy).
- Test: malformed `providerOptions["workers-ai"]` falls back to settings.
- Test: reasoning params + AI Gateway on the binding path: inputs and options stay cleanly separated; gateway doesn't see `reasoning_effort`.
- Test: `reasoning_effort: null` in settings no longer throws on the REST path. Before the fix, `createRun` rejected null at the query-string coercion step. Now that `reasoning_effort` lives in the JSON body, this round-trips cleanly.

Made-with: Cursor
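The guard and the `in`-based precedence from these follow-ups could look something like the sketch below. `isPlainObject` and `resolveParam` are hypothetical helper names, not functions from the provider; the only behavior relied on is that JavaScript's `in` operator throws a `TypeError` when its right-hand side is a primitive.

```typescript
type Params = Record<string, unknown>;

// True only for non-null, non-array objects; primitives and arrays are rejected.
function isPlainObject(value: unknown): value is Params {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Resolves one parameter: a per-call value wins whenever the key is present,
// even if its value is null; otherwise the settings value is used.
function resolveParam(key: string, settings: Params, perCall: unknown): unknown {
  // The guard runs first because `key in perCall` would throw for a
  // string, number, or boolean.
  if (isPlainObject(perCall) && key in perCall) {
    return perCall[key];
  }
  return settings[key];
}
```

Checking key presence rather than truthiness is what lets a per-call `null` override a settings value such as `"high"`.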
🦋 Changeset detected. Latest commit: 4b25307. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Summary
- Forward `reasoning_effort` and `chat_template_kwargs` onto `binding.run(model, inputs)`'s `inputs` object instead of silently dropping them into the options arg / REST query string.
- Both are read from two places (settings and `providerOptions["workers-ai"]`; per-call wins).
- Both are typed on `WorkersAIChatSettings`; `reasoning_effort: null` is preserved as the explicit "disable reasoning" signal.

Closes #501.
Why
`modelSettings` passed as the 2nd arg of `workersAi(modelId, modelSettings)` were flowing through `getRunOptions()` into the 3rd arg (options) of `binding.run(model, inputs, options)`. But Cloudflare Workers AI's `reasoning_effort` and `chat_template_kwargs` parameters belong on the 2nd arg (inputs). The result: reasoning models (GLM-4.7-flash, Kimi K2.5/K2.6, GPT-OSS, QwQ) burned the entire output token budget on chain-of-thought with no visible content, because `reasoning_effort: "low"` was silently ignored. The REST path had an additional latent bug where `reasoning_effort: null` would throw from `createRun` at the query-string coercion step.
What changed
- `src/workersai-chat-settings.ts`: typed `reasoning_effort` and `chat_template_kwargs` directly on `WorkersAIChatSettings` alongside the existing `[key: string]: unknown` escape hatch.
- `src/workersai-chat-language-model.ts`: three surgical changes:
  - `buildRunInputs()` now pulls both fields from settings and from `providerOptions["workers-ai"]` (per-call wins via `"key" in perCall`, so per-call `null` overrides settings `"high"`). A `typeof` + `Array.isArray` guard protects against malformed runtime values (since `"key" in primitive` throws).
  - `getRunOptions()` explicitly destructures the two fields out so they can't leak into `...passthroughOptions` (which is what bled into URL query strings / options arg).
  - `doGenerate` and `doStream` now wire `options.providerOptions` through to `buildRunInputs`.
Usage
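A minimal sketch of the two configuration surfaces, assuming the shapes described in this PR. `ReasoningParams` is a hypothetical local type rather than an export of the package, the `"low" | "medium" | "high"` union is an assumption, and `enable_thinking` is only an assumed example of a model-dependent template kwarg.

```typescript
// Hypothetical local type mirroring the two fields typed on WorkersAIChatSettings.
type ReasoningEffort = "low" | "medium" | "high" | null; // assumed value set

interface ReasoningParams {
  reasoning_effort?: ReasoningEffort;
  chat_template_kwargs?: Record<string, unknown>;
}

// Settings-level defaults: passed as the 2nd arg of workersAi(modelId, modelSettings).
const modelSettings: ReasoningParams = {
  reasoning_effort: "low",
  chat_template_kwargs: { enable_thinking: false }, // assumed key, varies by model
};

// Per-call override: passed as `providerOptions` on an individual call.
// It wins over modelSettings; null is the explicit "disable reasoning" signal.
const callOptions: { providerOptions: { "workers-ai": ReasoningParams } } = {
  providerOptions: {
    "workers-ai": { reasoning_effort: null },
  },
};
```

With both in place, the per-call `null` is the value that should reach the `inputs` object for that call, while `chat_template_kwargs` still comes from settings.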
Test plan
- `reasoning_effort` lands on inputs (2nd arg), not options
- `chat_template_kwargs` lands on inputs, not options
- `reasoning_effort: null` is preserved on inputs
- Per-call `providerOptions["workers-ai"]` overrides settings
- Per-call `null` overrides non-null settings (locks in `"key" in` precedence)
- Malformed `providerOptions["workers-ai"]` (string / array) falls back to settings instead of crashing
- … `stream: true`
- `reasoning_effort: null` round-trips (pre-fix regression)
- Unrelated settings (`custom_flag: "yes"`) still flow to the URL query: no regression for existing passthrough behavior

Sibling chat classes (`AutoRAGChatLanguageModel`, `AISearchChatLanguageModel`) audited: they don't use `binding.run(model, inputs, options)` and have no reasoning surface, so no duplicate fix is needed.
Notes for reviewers
- The `providerOptions` key is `"workers-ai"` (matches the npm package name). Other reasonable candidates are `"workersai"` (matches the internal `provider` field prefix) or `"workersai.chat"` (matches the full provider name). Happy to flip this pre-merge if the maintainers prefer a different convention.
- … `typeof` guard. Happy to squash before merge if preferred.

Made with Cursor