Misc. bug: Model Presets duplicate model entries #23931

Description

@sgayda2

Name and Version

sudo docker run ghcr.io/ggml-org/llama.cpp:server-cuda13 --version
version: 9404 (241cbd4)
built with GNU 14.2.0 for Linux x86_64

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

No response

Command line

llama-server --models-preset config.ini

Problem description & steps to reproduce

I am running llama-server in Docker with LLAMA_CACHE set to "/models", which already contains some downloaded models. When I run with a model preset for a model such as gpt-oss-120b, the result depends on how the preset is specified: the server either recognizes the already-downloaded model and applies the preset to it, or it creates a new entry in the models endpoint.

This produces a duplicate:

[gpt-oss-120b]
hf          = ggml-org/gpt-oss-120b-GGUF
top-p       = 1.0

While this correctly recognizes the cached value:

[ggml-org/gpt-oss-120b-GGUF]
top-p       = 1.0

Ideally, neither form of the preset would create a duplicate entry in the model list.

First Bad Commit

No response

Relevant log output

v1/models shows "id":"ggml-org/gpt-oss-120b-GGUF:MXFP4" and "id":"gpt-oss-120b"
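
To make the report easier to check, here is a rough sketch of a script that flags the situation described above. It is not part of llama.cpp: the function names `find_aliased_presets` and `find_duplicate_ids` are hypothetical. It also assumes that the preset file parses as ordinary INI and that a cached model's id is the `hf` repo name with an optional `:QUANT` suffix, which is only inferred from the ids shown in this issue.

```python
import configparser


def find_aliased_presets(ini_text: str) -> dict[str, str]:
    """Return {section_name: hf_repo} for sections named differently from their hf repo.

    Assumption: the preset file is plain INI with '=' as the only delimiter.
    """
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(ini_text)
    aliased = {}
    for section in parser.sections():
        hf_repo = parser.get(section, "hf", fallback=None)
        # A section like [gpt-oss-120b] with hf = ggml-org/gpt-oss-120b-GGUF
        # is the form that produced a second entry in this report.
        if hf_repo is not None and hf_repo != section:
            aliased[section] = hf_repo
    return aliased


def find_duplicate_ids(model_ids: list[str], aliased: dict[str, str]) -> list[tuple[str, str]]:
    """Pair each preset alias with a listed id that points at the same hf repo.

    Assumption: the cached model is listed as '<repo>' or '<repo>:<quant>'.
    """
    duplicates = []
    for alias, repo in aliased.items():
        if alias not in model_ids:
            continue
        for model_id in model_ids:
            if model_id == repo or model_id.startswith(repo + ":"):
                duplicates.append((alias, model_id))
    return duplicates


if __name__ == "__main__":
    preset = "[gpt-oss-120b]\nhf = ggml-org/gpt-oss-120b-GGUF\ntop-p = 1.0\n"
    # Ids copied from the log output in this issue.
    listed = ["ggml-org/gpt-oss-120b-GGUF:MXFP4", "gpt-oss-120b"]
    print(find_duplicate_ids(listed, find_aliased_presets(preset)))
```

The list of ids would come from the `data` array of the v1/models response. The second preset form from this report (section named after the repo, no `hf` key) yields no aliased sections and therefore no reported duplicates.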
