Skip to content

Tool calls with self-hosted Qwen3-Coder fail (almost always) #124

@ruudniew

Description

@ruudniew

What happened?

Self hosting Qwen3-Coder-480B-A35B-Instruct was a success, using something like this, but tool calls don't work:

Without a jinja template

 lama-box --host 0.0.0.0 --embeddings --gpu-layers 63 --parallel 4 --ctx-size 8192 --port 40003 --model /mnt/nfs/models/Qwen3-Coder-480B-A35B-Instruct-Q4_K_M-00001-of-00006.gguf --alias Qwen3-Coder-480B-A35B-Instruct --no-mmap --no-warmup --tensor-split 182335,182335 --ctx-size 262144 --flash-attn --parallel 4 --mmap --mlock --verbose --top-k 20 --temp 0.7 --top-p 0.8 --repeat-penalty 1.05 --min-p 0.00

With a jinja template

 lama-box --host 0.0.0.0 --embeddings --gpu-layers 63 --parallel 4 --ctx-size 8192 --port 40003 --model /mnt/nfs/models/Qwen3-Coder-480B-A35B-Instruct-Q4_K_M-00001-of-00006.gguf --alias Qwen3-Coder-480B-A35B-Instruct --no-mmap --no-warmup --tensor-split 182335,182335 --ctx-size 262144 --flash-attn --parallel 4 --mmap --mlock --verbose --top-k 20 --temp 0.7 --top-p 0.8 --repeat-penalty 1.05 --jinja --min-p 0.00 --chat-template-file /mnt/nfs/models/chat_template.jinja

Qwen Code appears to call tools, but it doesn't actually execute them. It just says something like [calling tool x with argument y and reasoning z].

What did you expect to happen?

I'd expect the tool calls to work.

Client information

Details

Login information

No response

Anything else we need to know?

I tried various quantizations and jinja files

Metadata

Metadata

Assignees

No one assigned

    Labels

    type/bugSomething isn't working as expected

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions