Misc. bug: crashes when calling llama_state_get_size on a reranking model #13463

@giladgd

Description

Name and Version

Compiled at commit 6562e5a; the crash also reproduces on the latest master.

Operating systems

Mac

Which llama.cpp modules do you know to be affected?

libllama (core library)

Problem description & steps to reproduce

The process crashes with SIGSEGV when calling llama_state_get_size on a context created with a reranking model (bge-reranker-v2-m3-Q8_0.gguf in my tests).

Here's a minimal reproduction:

#include "llama.h" // llama.cpp C API
#include <cstdio>
#include <string>

void repro() {
    llama_backend_init();

    auto model_params = llama_model_default_params();
    model_params.n_gpu_layers = 33;

    auto model_path = "/home/user/models/bge-reranker-v2-m3-Q8_0.gguf";
    auto model = llama_model_load_from_file(model_path, model_params);
    fputs("model loaded\n", stdout);
    fflush(stdout);

    auto context_params = llama_context_default_params();
    context_params.embeddings = true;
    context_params.pooling_type = LLAMA_POOLING_TYPE_RANK;
    auto ctx = llama_init_from_model(model, context_params);
    fputs("context created\n", stdout);
    fflush(stdout);

    auto state_size = llama_state_get_size(ctx); // <- crashes here
    fputs(("State size: " + std::to_string(state_size) + "\n").c_str(), stdout);
    fflush(stdout);

    llama_free(ctx);
    llama_model_free(model);

    llama_backend_free();
}

First Bad Commit

The issue was introduced in commit 6562e5a (PR #13108).

Relevant log output

Last logs before the crash:

set_abort_callback: call
llama_context:        CPU  output buffer size =     0.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 3
llama_context: max_nodes = 65536
context created
state_write_data: writing state
state_write_data: - writing model info
state_write_data: - writing output ids
state_write_data: - writing logits
state_write_data: - writing embeddings
state_write_data: - writing KV self

From lldb:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x0000000104d2dd08 libllama.dylib`llama_context::state_write_data(llama_io_write_i&) + 684
    frame #1: 0x0000000104d2d9e4 libllama.dylib`llama_context::state_get_size() + 40
    frame #2: 0x000000010456f708 llama-addon.node`repro() + 240
