Segfault when submitting image to ggml-org/Qwen2.5-VL-7B-Instruct-GGUF #13467

Closed
@mlang

Description


I get the following segfault when submitting an image to
ggml-org/Qwen2.5-VL-7B-Instruct-GGUF,
running llama-server with the Python llm package as the client.
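For context, llama-server exposes an OpenAI-compatible /v1/chat/completions endpoint, and image-capable clients submit the picture inline as a base64 data URL. The sketch below shows the shape of such a request (the model alias, placeholder bytes, and server URL are assumptions for illustration, not taken from my setup):

```python
import base64
import json

# Build an OpenAI-style chat request with an inline base64 image, the way
# a client like the Python "llm" package submits it to llama-server.
# The bytes below are a placeholder standing in for the real JPEG file.
jpeg_bytes = b"\xff\xd8\xff\xe0fake-jpeg-payload"  # placeholder, not a real image
data_url = "data:image/jpeg;base64," + base64.b64encode(jpeg_bytes).decode("ascii")

payload = {
    "model": "Qwen2.5-VL-7B-Instruct",  # assumed model alias on the server
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }
    ],
}

body = json.dumps(payload)
# POST body to http://localhost:8080/v1/chat/completions (default llama-server port)
```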

The same image is processed perfectly fine with
ggml-org/gemma-3-4b-it-GGUF
and
ggml-org/SmolVLM2-2.2B-Instruct-GGUF.

This is the output of "identify" on the image file:
neocube-one-layer-pattern.jpg JPEG 2592x1944 2592x1944+0+0 8-bit sRGB 858245B 0.000u 0:00.000

And here is the output from llama-server before it segfaults.
Note that it attempts to allocate roughly 44 GB of RAM, which points to a bug somewhere.

slot launch_slot_: id  0 | task 0 | processing task
slot update_slots: id  0 | task 0 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 11
slot update_slots: id  0 | task 0 | kv cache rm [0, end)
slot update_slots: id  0 | task 0 | prompt processing progress, n_past = 4, n_tokens = 4, progress = 0.363636
encoding image or slice...
slot update_slots: id  0 | task 0 | kv cache rm [4, end)
srv  process_chun: processing image...
ggml_aligned_malloc: insufficient memory (attempted to allocate 44668.09 MB)
ggml_backend_cpu_buffer_type_alloc_buffer: failed to allocate buffer of size 46837887616
ggml_gallocr_reserve_n: failed to allocate CPU buffer of size 46837887616
ggml_aligned_malloc: insufficient memory (attempted to allocate 44668.09 MB)
ggml_backend_cpu_buffer_type_alloc_buffer: failed to allocate buffer of size 46837887616
ggml_gallocr_reserve_n: failed to allocate CPU buffer of size 46837887616
make: *** [Makefile:20: llm-api] Segmentation fault
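A back-of-envelope estimate suggests where a number of this magnitude could come from: self-attention over the full-resolution patch grid is quadratic in the patch count. The figures below (14x14 patches, 16 heads, one f32 score per head per patch pair) are assumptions for illustration, not values read from the llama.cpp source, but they land in the same tens-of-gigabytes range as the failed 46.8 GB allocation:

```python
# Rough estimate of a ViT attention scratch buffer for a 2592x1944 image.
# Assumptions (not taken from llama.cpp): 14x14 pixel patches, 16 attention
# heads, one f32 score per head per patch pair.
width, height = 2592, 1944
patch = 14
n_heads = 16

grid_w, grid_h = width // patch, height // patch   # 185 x 138 patch grid
n_patches = grid_w * grid_h                        # 25530 patches
scores = n_patches * n_patches * n_heads           # quadratic in patch count
size_bytes = scores * 4                            # f32 scores

print(f"{size_bytes / 1e9:.1f} GB")                # prints "41.7 GB"
```

If this is roughly what happens, the encoder is attending over the raw patch grid instead of downscaling or tiling the image first, which would explain why the other two models (which resize more aggressively) handle the same file fine.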
