
Misc. bug: --jinja for llama-cli is not enabled by default as the documentation indicates #18446

@ctataryn

Description


Name and Version

$ llama-cli --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3070, compute capability 8.6, VMM: yes
version: 7149 (134e6940c)
built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-cli

Command line

llama-cli -m ./qwen2.5-1.5b-instruct.gguf --lora ./qwen-lora.gguf

Problem description & steps to reproduce

I noticed this issue while using the --lora switch to run a fine-tuned LoRA adapter on top of a base model. The LoRA didn't seem to take effect: asking the model questions with it enabled produced the same results as not using the LoRA at all. With --jinja added, however, the LoRA does take effect.
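For anyone hitting the same thing, explicitly passing --jinja works around it for now. This is just the invocation that worked in my setup (same model and adapter paths as in the command line above), not necessarily the intended fix:

llama-cli -m ./qwen2.5-1.5b-instruct.gguf --lora ./qwen-lora.gguf --jinja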

The documentation for llama-cli indicates that --jinja is the default behaviour. However, the following pull request suggests that --jinja is only enabled by default when running in server mode, not in the CLI:

#17524

If I can figure out how the auto-generated documentation works, I'll also submit a pull request to correct the llama-cli README.

First Bad Commit

No response

Relevant log output

