Pull requests: ggml-org/llama.cpp

Pull requests list

ggml: add initial MetaX backend integration [ggml] [Nvidia GPU]
#22212 opened Apr 21, 2026 by Dayuxiaoshui
ggml: vectorize ggml_vec_dot_q4_1_q8_1 with WASM SIMD128 [ggml]
#22209 opened Apr 21, 2026 by sirohikartik
cuda: LRU eviction + overalloc for legacy pool [ggml] [Nvidia GPU]
#22207 opened Apr 21, 2026 by TheTom
cuda: flash attention DKQ 320 / DV 256 (MLA) [ggml] [Nvidia GPU] [python]
#22205 opened Apr 21, 2026 by lnigam
ggml-webgpu: enable FLASH_ATTN_EXT on browser without subgroup matrix [ggml] [WebGPU]
#22199 opened Apr 21, 2026 by ArberSephirotheca (Contributor)
hexagon: add support for FILL op [ggml] [Hexagon]
#22198 opened Apr 21, 2026 by aparmp-quic (Contributor)
ggml-cuda: Repost of 21896: Blackwell native NVFP4 support [ggml] [Nvidia GPU] [testing]
#22196 opened Apr 21, 2026 by michaelw9999 (Contributor)
Hexagon: DAIG op [ggml] [Hexagon]
#22195 opened Apr 21, 2026 by shreyajn (Contributor)
hexagon: fix missing v79 entry in libggml-htp.inf [ggml] [Hexagon]
#22194 opened Apr 21, 2026 by mengshengwu (Contributor)
cuda: add partial eviction on pool OOM [ggml] [Nvidia GPU]
#22193 opened Apr 21, 2026 by leonardHONG (Contributor)
server: add ipv6 support [examples] [ggml]
#22192 opened Apr 21, 2026 by alphaonex86
Add IP whitelist with CIDR support to llama-server [examples] [server]
#22191 opened Apr 21, 2026 by cabelo (Contributor)
gguf-py: Read and write empty array. [python]
#22189 opened Apr 21, 2026 by zhujunling-nj
Modality conditional adapters [examples] [server] [testing]
#22184 opened Apr 20, 2026 by gabe-l-hart (Collaborator)
Optimize reduction stage of dot product of q4_L/q5_K to q8_K on AVX2 [ggml]
#22181 opened Apr 20, 2026 by nariox
cuda: disable MMQ stream-k by default for MoE [ggml] [Nvidia GPU]
#22174 opened Apr 20, 2026 by nisparks (Contributor)
ngram-mod: Reset i_last when low acceptance streak occurs
#22168 opened Apr 20, 2026 by treo
Fix incorrect assertion [ggml]
#22167 opened Apr 20, 2026 by fiesh
common: improve GGUF quantization tag regex
#22164 opened Apr 20, 2026 by v1b3coder
sycl: scalar SWAR byte-subtract in Q6_K MMVQ dot product [ggml] [SYCL]
#22156 opened Apr 20, 2026 by aicss-genai
sycl: add GGML_SYCL_USE_ASYNC_MEM_OP env toggle [ggml] [SYCL]
#22153 opened Apr 20, 2026 by aicss-genai
sycl: Q5_K reorder MMVQ/dequant + Q8_0 reorder MMVQ path [ggml] [SYCL]
#22152 opened Apr 20, 2026 by aicss-genai