Closed
I am getting the following error and segfault when submitting an image to ggml-org/Qwen2.5-VL-7B-Instruct-GGUF via llama-server, with the Python llm package as the client. The same image is processed perfectly fine by ggml-org/gemma-3-4b-it-GGUF and ggml-org/SmolVLM2-2.2B-Instruct-GGUF.
This is the output of `identify` on the image file:

```
neocube-one-layer-pattern.jpg JPEG 2592x1944 2592x1944+0+0 8-bit sRGB 858245B 0.000u 0:00.000
```
And here is the output I get from llama-server before it segfaults. Note that it attempts to allocate about 44 GB of RAM, which points to an error somewhere in the buffer-size calculation for image encoding.
```
slot launch_slot_: id 0 | task 0 | processing task
slot update_slots: id 0 | task 0 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 11
slot update_slots: id 0 | task 0 | kv cache rm [0, end)
slot update_slots: id 0 | task 0 | prompt processing progress, n_past = 4, n_tokens = 4, progress = 0.363636
encoding image or slice...
slot update_slots: id 0 | task 0 | kv cache rm [4, end)
srv process_chun: processing image...
ggml_aligned_malloc: insufficient memory (attempted to allocate 44668.09 MB)
ggml_backend_cpu_buffer_type_alloc_buffer: failed to allocate buffer of size 46837887616
ggml_gallocr_reserve_n: failed to allocate CPU buffer of size 46837887616
ggml_aligned_malloc: insufficient memory (attempted to allocate 44668.09 MB)
ggml_backend_cpu_buffer_type_alloc_buffer: failed to allocate buffer of size 46837887616
ggml_gallocr_reserve_n: failed to allocate CPU buffer of size 46837887616
make: *** [Makefile:20: llm-api] Segmentation fault
```
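For reference, the two figures in the log are internally consistent: the buffer size in bytes matches the megabyte count that `ggml_aligned_malloc` reports (a quick arithmetic check of the numbers above, not part of llama.cpp):

```python
# Both numbers are taken verbatim from the log above.
buffer_bytes = 46837887616   # from ggml_backend_cpu_buffer_type_alloc_buffer
reported_mb = 44668.09       # from ggml_aligned_malloc

computed_mb = buffer_bytes / 2**20  # bytes -> MiB
print(round(computed_mb, 2))        # matches the reported 44668.09
```

So the oversized request is computed once and propagated consistently; the bug is in whatever sizes the image-encoding buffer, not in the allocator.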
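As a possible workaround until this is fixed (my own suggestion, not from the llama.cpp docs): if the oversized allocation scales with input resolution, downscaling the 2592x1944 image before submitting it may avoid the failure. The `fit_within` helper below is hypothetical; the actual resize would use a library such as Pillow (shown in the comments, assuming it is installed):

```python
def fit_within(width, height, max_side):
    """Return (w, h) scaled so the longest side is at most max_side,
    preserving aspect ratio (hypothetical helper, not from llama.cpp)."""
    if max(width, height) <= max_side:
        return width, height
    scale = max_side / max(width, height)
    return round(width * scale), round(height * scale)

# The image from this report, capped at 1024 px on the longest side:
print(fit_within(2592, 1944, 1024))  # -> (1024, 768)

# With Pillow, the equivalent in-place resize would be:
# from PIL import Image
# img = Image.open("neocube-one-layer-pattern.jpg")
# img.thumbnail((1024, 1024))
# img.save("neocube-small.jpg")
```

This only sidesteps the issue; the server should either handle a 5-megapixel JPEG or fail with a clean error rather than a segfault.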