Description
Name and Version
Built at commit 6562e5a; the crash also reproduces on the latest master.
Operating systems
Mac
Which llama.cpp modules do you know to be affected?
libllama (core library)
Problem description & steps to reproduce
The process crashes with SIGSEGV when calling llama_state_get_size on a context created with a reranking model (bge-reranker-v2-m3-Q8_0.gguf in my tests).

Here's a minimal reproduction:
```cpp
#include <cstdio>
#include <string>

#include "llama.h"

void repro() {
    llama_backend_init();

    auto model_params = llama_model_default_params();
    model_params.n_gpu_layers = 33;

    auto model_path = "/home/user/models/bge-reranker-v2-m3-Q8_0.gguf";
    auto model = llama_model_load_from_file(model_path, model_params);
    fputs("model loaded\n", stdout);
    fflush(stdout);

    auto context_params = llama_context_default_params();
    context_params.embeddings = true;
    context_params.pooling_type = LLAMA_POOLING_TYPE_RANK;
    auto ctx = llama_init_from_model(model, context_params);
    fputs("context created\n", stdout);
    fflush(stdout);

    auto state_size = llama_state_get_size(ctx); // <- crashes here
    fputs(("State size: " + std::to_string(state_size) + "\n").c_str(), stdout);
    fflush(stdout);

    llama_free(ctx);
    llama_model_free(model);
    llama_backend_free();
}
```
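As a temporary caller-side workaround, state serialization can simply be skipped for contexts created with `embeddings = true`. A minimal sketch, assuming the crash is specific to such contexts (the helper name `state_size_or_zero` is hypothetical, not part of the llama.cpp API):

```cpp
#include <cstddef>

#include "llama.h"

// Hypothetical guard: avoid llama_state_get_size() on embeddings/reranking
// contexts, where it currently segfaults (this issue).
static size_t state_size_or_zero(llama_context * ctx, const llama_context_params & params) {
    if (params.embeddings) {
        return 0; // state serialization crashes for these contexts
    }
    return llama_state_get_size(ctx);
}
```

This only hides the crash on the caller's side; the underlying null dereference in state_write_data still needs an upstream fix.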
First Bad Commit
The issue was introduced by commit 6562e5a (PR #13108).
Relevant log output
Last logs before the crash:
set_abort_callback: call
llama_context: CPU output buffer size = 0.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 3
llama_context: max_nodes = 65536
context created
state_write_data: writing state
state_write_data: - writing model info
state_write_data: - writing output ids
state_write_data: - writing logits
state_write_data: - writing embeddings
state_write_data: - writing KV self
From lldb:
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x0000000104d2dd08 libllama.dylib`llama_context::state_write_data(llama_io_write_i&) + 684
frame #1: 0x0000000104d2d9e4 libllama.dylib`llama_context::state_get_size() + 40
frame #2: 0x000000010456f708 llama-addon.node`repro() + 240