Open
Description

The Z.AI Coding Plan limits usage by the number of prompts within an N-hour window; it is not billed per token. However, the Z.AI entry in provider.json defines per-million-token input and output prices, so crush calculates and displays a cost as if usage were metered. The API endpoint itself is configured correctly and automatically by crush.
Expected result: no cost is shown at all while using the Z.AI Coding Plan.
Relevant part of the current provider.json entry:
"name": "Z.AI",
"id": "zai",
"api_key": "$ZAI_API_KEY",
"api_endpoint": "https://api.z.ai/api/coding/paas/v4",
"type": "openai",
"default_large_model_id": "glm-4.5",
"default_small_model_id": "glm-4.5-air",
"models": [
  {
    "id": "glm-4.5",
    "name": "GLM-4.5",
    "cost_per_1m_in": 0.6,
    "cost_per_1m_out": 2.2,
    "cost_per_1m_in_cached": 0.11,
    "cost_per_1m_out_cached": 0,
    "context_window": 131072,
    "default_max_tokens": 98304,
    "can_reason": true,
    "has_reasoning_efforts": false,
    "supports_attachments": false
  }
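
To illustrate why a cost appears, here is a minimal sketch of how a per-million-token price turns into a displayed cost. This is not crush's actual code: the formula, the `ModelPricing` class, and the `estimate_cost` function are assumptions for illustration only. The field names and the metered prices are taken from the config above; the zeroed prices are a hypothetical way to represent the Coding Plan.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class ModelPricing:
    """Hypothetical container mirroring the cost_per_1m_* fields in provider.json."""
    cost_per_1m_in: float
    cost_per_1m_out: float
    cost_per_1m_in_cached: float = 0.0


def estimate_cost(pricing: ModelPricing, input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Assumed formula: each token count is multiplied by its per-million price."""
    total = (
        input_tokens * pricing.cost_per_1m_in
        + cached_input_tokens * pricing.cost_per_1m_in_cached
        + output_tokens * pricing.cost_per_1m_out
    )
    return total / TOKENS_PER_MILLION


# Prices as currently listed for glm-4.5 in provider.json.
GLM_45_METERED = ModelPricing(cost_per_1m_in=0.6, cost_per_1m_out=2.2,
                              cost_per_1m_in_cached=0.11)

# Hypothetical: all prices zeroed, since the Coding Plan is not billed per token.
GLM_45_CODING_PLAN = ModelPricing(cost_per_1m_in=0.0, cost_per_1m_out=0.0,
                                  cost_per_1m_in_cached=0.0)

if __name__ == "__main__":
    # Same request, two pricing assumptions.
    print(estimate_cost(GLM_45_METERED, input_tokens=10_000, output_tokens=2_000))
    print(estimate_cost(GLM_45_CODING_PLAN, input_tokens=10_000, output_tokens=2_000))
```

Under this assumed formula, any non-zero `cost_per_1m_*` value yields a non-zero cost for every request, while zeroed values always yield zero. Whether the right fix is zeroed prices for the coding endpoint or a separate subscription flag is a design decision for the maintainers.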
Version
0.9.2
Environment
macOS 26.0