Releases · ngxson/llama.cpp
b5560
b5558
threading: support for GGML_SCHED_PRIO_LOW, update thread info on Win…
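The release note above adds a low scheduling priority level. As a hedged illustration (the enum values below follow ggml naming conventions, but the mapping function and its return values are hypothetical, not the actual ggml implementation), thread priorities might be modeled like this:

```cpp
// Scheduling priority levels; GGML_SCHED_PRIO_LOW is the addition mentioned
// in b5558. On Windows these would be forwarded to SetThreadPriority(), on
// POSIX to nice()/sched_setscheduler(). Here we just return an illustrative
// nice-style value (higher = lower priority) for demonstration.
enum ggml_sched_priority {
    GGML_SCHED_PRIO_LOW = -1,
    GGML_SCHED_PRIO_NORMAL,
    GGML_SCHED_PRIO_MEDIUM,
    GGML_SCHED_PRIO_HIGH,
    GGML_SCHED_PRIO_REALTIME,
};

// Hypothetical mapping, for illustration only.
int prio_to_nice(ggml_sched_priority prio) {
    switch (prio) {
        case GGML_SCHED_PRIO_LOW:      return 10;   // background work
        case GGML_SCHED_PRIO_NORMAL:   return 0;
        case GGML_SCHED_PRIO_MEDIUM:   return -5;
        case GGML_SCHED_PRIO_HIGH:     return -10;
        case GGML_SCHED_PRIO_REALTIME: return -20;  // highest priority
    }
    return 0;
}
```

A low priority lets token generation run without starving interactive processes on the same machine.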
b5556
server: allow unclosed thinking tags (#13931)
b5555
llama : deprecate explicit kv_self defrag/update calls (#13921) ggml-ci
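Deprecating the explicit defrag/update entry points implies the library decides internally when to defragment the KV cache. A minimal sketch of such an automatic trigger, assuming a fragmentation-threshold policy (the function names, the `thold` parameter, and the exact formula are illustrative, not the actual llama.cpp code):

```cpp
#include <cstdint>

// Fragmentation = fraction of the used span of the KV cache that holds no
// live cells. Hypothetical formulation for illustration.
float kv_fragmentation(uint32_t n_used, uint32_t n_span) {
    if (n_span == 0) return 0.0f;
    return 1.0f - (float) n_used / (float) n_span;
}

// Decide inside decode whether to defragment, instead of relying on the
// caller to invoke an explicit defrag call. A threshold of 0 disables it.
bool should_defrag(uint32_t n_used, uint32_t n_span, float thold) {
    return thold > 0.0f && kv_fragmentation(n_used, n_span) > thold;
}
```

With a check like this running inside decode, application code no longer needs to schedule defragmentation itself.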
b5554
llama : use n_swa + n_ubatch cells for SWA cache (#13833)
* llama : use n_swa + n_ubatch cells for SWA cache ggml-ci
* llama : add warning about multi-sequence SWA contexts
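The sizing rule in the commit title is straightforward: the sliding-window-attention cache needs cells for the attention window plus cells for the micro-batch currently being decoded. A sketch, with hypothetical function names (only the `n_swa + n_ubatch` rule comes from the release note):

```cpp
#include <cstdint>

// SWA cache sizing per b5554: n_swa cells cover the sliding attention
// window, n_ubatch cells cover the tokens of the current micro-batch.
uint32_t swa_cache_cells(uint32_t n_swa, uint32_t n_ubatch) {
    return n_swa + n_ubatch;
}

// Illustrative window test: can a new token at position pos_cur still
// attend to a cached token at position pos under a window of width n_swa?
bool swa_in_window(int32_t pos, int32_t pos_cur, int32_t n_swa) {
    return pos_cur - pos < n_swa;
}
```

Cells that fall outside the window can be evicted, which is why a dedicated, smaller SWA cache suffices.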
b5553
webui : Replace alert and confirm with custom modals. (#13711)
* Replace alert and confirm with custom modals. This is needed because the Webview in VS Code doesn't permit alert and confirm for security reasons.
* Use a Modal Provider to simplify the use of confirm and alert modals.
* Increase the z-index of the modal dialogs.
* Update index.html.gz
* Also add showPrompt
* Rebuild
Co-authored-by: igardev <[email protected]>
Co-authored-by: Xuan Son Nguyen <[email protected]>
b5551
mtmd : drop `_shared` from `libmtmd` name, merge helpers into libmtmd…
b5549
CUDA: add a prop in ggml_cuda_device_info for distinguish iGPU or dG…
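Distinguishing integrated from discrete GPUs lets the backend make placement decisions (an iGPU shares host memory, a dGPU has its own VRAM). A hedged sketch of what such a property and a selection helper could look like; the struct layout and helper are hypothetical, not the actual `ggml_cuda_device_info`:

```cpp
#include <vector>

// Illustrative device record carrying the new iGPU/dGPU distinction (b5549).
struct cuda_device_info {
    int  id;
    bool integrated;  // true: iGPU sharing host memory; false: dGPU with VRAM
};

// Prefer the first discrete GPU; fall back to the first device if only
// integrated GPUs exist, or -1 if there are no devices at all.
int pick_device(const std::vector<cuda_device_info> & devs) {
    for (const auto & d : devs) {
        if (!d.integrated) {
            return d.id;
        }
    }
    return devs.empty() ? -1 : devs[0].id;
}
```

CUDA itself exposes this distinction via the `integrated` field of `cudaDeviceProp`, which is presumably the source of such a flag.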
b5548
CUDA: fix typo in FlashAttention code (#13926)
b5547
sched : avoid changing cur_copy when a graph is already allocated (#1…