Problem
modelSettings passed to the provider (the 2nd argument of workersAi(modelId, modelSettings)) are forwarded to getRunOptions(), whose result becomes the 3rd argument (options) of binding.run(model, inputs, options).
However, Cloudflare Workers AI parameters such as reasoning_effort and chat_template_kwargs belong at the inputs level (the 2nd argument of binding.run()). buildRunInputs() constructs inputs only from the AI SDK-mapped args (max_tokens, temperature, top_p, response_format, tools, tool_choice) and never includes anything from modelSettings.
This means these parameters are silently dropped:
import { createWorkersAI } from 'workers-ai-provider';

const workersAi = createWorkersAI({ binding: env.AI });

// reasoning_effort ends up in runOptions (3rd arg of binding.run), not inputs (2nd arg)
const lowEffortModel = workersAi('@cf/zai-org/glm-4.7-flash', {
  reasoning_effort: 'low',
});

// chat_template_kwargs is also silently dropped
const noThinkingModel = workersAi('@cf/zai-org/glm-4.7-flash', {
  chat_template_kwargs: { enable_thinking: false },
});
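For contrast, setting the same parameters directly at the inputs level of binding.run() (inside a Worker handler) does reach the model, since both fields are typed there; the messages payload here is illustrative:

// Direct binding call with the params at the inputs level -- the payload
// the provider should be producing. The messages content is illustrative.
const response = await env.AI.run('@cf/zai-org/glm-4.7-flash', {
  messages: [{ role: 'user', content: 'Hello' }],
  reasoning_effort: 'low',
  chat_template_kwargs: { enable_thinking: false },
});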
Affected parameters
From @cloudflare/workers-types (ChatCompletionsCommonOptions):
reasoning_effort — "low" | "medium" | "high" | null
chat_template_kwargs — { enable_thinking?: boolean, clear_thinking?: boolean }
Both are typed on the inputs object and accepted by binding.run(), but the provider never puts them there.
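Condensed from the workers-types declarations (field shapes only; the real interface has additional members):

// Condensed sketch of the relevant fields from @cloudflare/workers-types;
// the actual ChatCompletionsCommonOptions interface has more members.
interface ChatCompletionsCommonOptions {
  reasoning_effort?: 'low' | 'medium' | 'high' | null;
  chat_template_kwargs?: {
    enable_thinking?: boolean;
    clear_thinking?: boolean;
  };
  // ...other inputs-level fields omitted
}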
Where it breaks
In workers-ai-provider@3.1.10:
buildRunInputs() in workersai-chat-language-model.ts — hardcoded field list, no passthrough from settings
getRunOptions() in workersai-chat-language-model.ts — rest-spreads settings into options (wrong destination for these params)
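In condensed form (paraphrased from workersai-chat-language-model.ts and simplified to the behavior described above; not verbatim source):

// Paraphrased, simplified view of the two functions -- not verbatim source.
function buildRunInputs(args: Record<string, unknown>) {
  const { max_tokens, temperature, top_p, response_format, tools, tool_choice } = args;
  // Hardcoded field list: nothing from modelSettings is passed through.
  return { max_tokens, temperature, top_p, response_format, tools, tool_choice };
}

function getRunOptions(settings: Record<string, unknown>) {
  // Everything from modelSettings lands in options (3rd arg of binding.run),
  // where inputs-level params like reasoning_effort have no effect.
  return { ...settings };
}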
Impact
GLM-4.7-flash, Kimi K2.5/K2.6, and any other reasoning model on Workers AI cannot have their reasoning controlled through the provider. In our case, GLM-4.7-flash burned through all 16,384 output tokens in an infinite reasoning loop because reasoning_effort: 'low' was silently ignored, ending in AI_NoOutputGeneratedError after all retries were exhausted.
Expected behavior
reasoning_effort and chat_template_kwargs from modelSettings should be included in the inputs payload passed to binding.run().
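One possible shape of the fix, as an illustrative sketch (the helper name and the allow-list approach are assumptions, not a proposed patch):

// Illustrative only: forward known inputs-level settings from modelSettings
// into the inputs payload. Helper name and allow-list are assumptions.
const INPUTS_LEVEL_SETTINGS = ['reasoning_effort', 'chat_template_kwargs'] as const;

function mergeInputsLevelSettings(
  inputs: Record<string, unknown>,
  settings: Record<string, unknown>,
): Record<string, unknown> {
  for (const key of INPUTS_LEVEL_SETTINGS) {
    if (settings[key] !== undefined) inputs[key] = settings[key];
  }
  return inputs;
}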
Versions
workers-ai-provider@3.1.10
ai@6.0.159
@cloudflare/workers-types@4.20260418.1