Releases: ngxson/llama.cpp

b5560

01 Jun 08:58
c046217
parallel : fix n_junk == 0 (#13952)

b5558

31 May 23:45
053b153
threading: support for GGML_SCHED_PRIO_LOW, update thread info on Win…

b5556

31 May 15:45
e15898d
server: allow unclosed thinking tags (#13931)

b5555

31 May 13:29
803f8ba
llama : deprecate explicit kv_self defrag/update calls (#13921)

ggml-ci

b5554

31 May 13:22
3600cc2
llama : use n_swa + n_ubatch cells for SWA cache (#13833)

* llama : use n_swa + n_ubatch cells for SWA cache

ggml-ci

* llama : add warning about multi-sequence SWA contexts

b5553

31 May 10:26
c7e0a20
webui : Replace alert and confirm with custom modals. (#13711)

* Replace alert and confirm with custom modals. This is needed as Webview in VS Code doesn't permit alert and confirm for security reasons.

* use Modal Provider to simplify the use of confirm and alert modals.

* Increase the z index of the modal dialogs.

* Update index.html.gz

* also add showPrompt

* rebuild

---------

Co-authored-by: igardev
Co-authored-by: Xuan Son Nguyen

b5551

31 May 08:46
51fa76f
mtmd : drop `_shared` from `libmtmd` name, merge helpers into libmtmd…

b5549

31 May 07:56
eb39499
CUDA: add a prop in ggml_cuda_device_infor for distinguish iGPU or dG…

b5548

30 May 19:46
e562eec
CUDA: fix typo in FlashAttention code (#13926)

b5547

30 May 17:24
b47ab7b
sched : avoid changing cur_copy when a graph is already allocated (#1…