Pull requests: ggml-org/llama.cpp
requirements: relax torch~=2.6.0 to torch>=2.6.0 for convert_hf_to_gguf
#23503
opened May 21, 2026 by
adityasingh2400
Optimized flash attention (FA) for OpenCL backend, and add Q4/Q8 KV cache quantization with FA for Adreno GPUs.
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
#23501
opened May 21, 2026 by
wanghqc
Contributor
perplexity: fix integer overflow
examples
merge ready
A maintainer can use this label to indicate that they consider the changes final and ready to merge.
#23496
opened May 21, 2026 by
fairydreaming
Collaborator
opencl: batch profiling to prevent resource exhaustion
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
#23495
opened May 21, 2026 by
shaofeiqi
Contributor
ggm-cpu: ARM Repack kernels for Q1_0
ggml
changes relating to the ggml tensor library for machine learning
#23492
opened May 21, 2026 by
pl752
Contributor
common/download: prevent duplicate MTP draft model downloads
#23489
opened May 21, 2026 by
iOptimizeThings
Add missing buffer set in allreduce fallback !COMPUTE clear
ggml
changes relating to the ggml tensor library for machine learning
#23480
opened May 21, 2026 by
TheBlueMatt
Contributor
feat(ui): add lazy-loaded mermaid diagram rendering
examples
server/ui
#23475
opened May 21, 2026 by
StrikeOner
Optimize ggml_vec_dot_q4_K_q8_K_generic
ggml
changes relating to the ggml tensor library for machine learning
#23474
opened May 21, 2026 by
pauser0000001
ui: media attachments before text
examples
server/ui
#23467
opened May 21, 2026 by
sfallah
Contributor
vocab : keep DNA k-mer ids distinct from colliding BPE tokens
merge ready
A maintainer can use this label to indicate that they consider the changes final and ready to merge.
python
python script changes
#23466
opened May 21, 2026 by
kashif
Contributor
[WebGPU] Check batch_compute_passes before sending passes when not doing GPU profiling
ggml
changes relating to the ggml tensor library for machine learning
WebGPU
#23457
opened May 21, 2026 by
nikhilJain17
Contributor
hexagon: apply repl optimization in flash attn softmax as #22993
ggml
changes relating to the ggml tensor library for machine learning
Hexagon
#23455
opened May 21, 2026 by
njsyw1997
Contributor
Generalize Adreno MoE kernels on size M
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
#23449
opened May 20, 2026 by
shawngu-quic
Contributor
Hip fattn expf approx
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#23441
opened May 20, 2026 by
a-huk
MoE disk offloading for Metal
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#23440
opened May 20, 2026 by
kisasexypantera94
Draft
ggml/cpu: skip zero-scale blocks in TQ1_0 and TQ2_0 vec_dot kernels
ggml
changes relating to the ggml tensor library for machine learning
#23439
opened May 20, 2026 by
eriirfos-eng
json-schema-to-grammar: expand PCRE shorthands in pattern strings
testing
Everything test related
#23436
opened May 20, 2026 by
iOptimizeThings
ggml: replace fixed 1GB context pool with growable buffer in meta backend (#22404)
ggml
changes relating to the ggml tensor library for machine learning
#23432
opened May 20, 2026 by
nonml