Pull requests: ggml-org/llama.cpp

Pull requests list

hexagon: add support for CONCAT op
#23648 opened May 25, 2026 by max-krasnyansky Member Draft
hexagon: add support for Q4_1 in MUL_MAT and MUL_MAT_ID (labels: ggml, Hexagon)
#23647 opened May 25, 2026 by max-krasnyansky Member Draft
server: MTP layer kv-cache should respect draft type ctk (labels: examples, merge ready, server)
#23646 opened May 25, 2026 by am17an Contributor
llama: add llm_graph_input_mtp (labels: model)
#23643 opened May 25, 2026 by am17an Contributor Draft
ci: update spacemit toolchain url and enhance curl command (labels: devops, documentation)
#23642 opened May 25, 2026 by alex-spacemit Collaborator
vulkan: don't hold the device mutex while compiling pipelines (labels: ggml, Vulkan)
#23641 opened May 25, 2026 by jeffbolznv Contributor
vendor : update cpp-httplib to 0.45.1 (labels: python, script)
#23639 opened May 25, 2026 by cabelo Contributor
tests: test-backend-ops -j <N> to run tests in parallel (labels: testing)
#23637 opened May 25, 2026 by jeffbolznv Contributor
ci : install host compiler on android-ndk build (labels: devops)
#23630 opened May 24, 2026 by aldehir Contributor
Fix 23627: Attach Mistral3 NVFP4 weight scales (labels: model)
#23629 opened May 24, 2026 by michaelw9999 Contributor
gguf-py: preserve MoE size labels for mmproj metadata (labels: python)
#23618 opened May 24, 2026 by ooovenenoso
CUDA: add fast walsh-hadamard transform (labels: ggml, Nvidia GPU, testing)
#23615 opened May 24, 2026 by am17an Contributor
cuda : fix KQ mask offset integer overflow in flash attention MMA kernel (labels: ggml, Nvidia GPU)
#23610 opened May 24, 2026 by fairydreaming Collaborator
cuda: read memory through NVML if available (labels: ggml, Nvidia GPU)
#23604 opened May 24, 2026 by 0cc4m Contributor
common: preserve horizontal whitespace in tool calls (labels: testing)
#23602 opened May 24, 2026 by Krish2882005
1 task done
Parallelize quant LUT init (labels: ggml)
#23595 opened May 24, 2026 by jeffbolznv Contributor
ggml-webgpu: Add MMVQ path for Q4/Q8/Q2_K/Q4_K and clean up legacy MUL_MAT pipeline (labels: ggml, WebGPU)
#23594 opened May 24, 2026 by yomaytk Contributor
ggml: fix AVX-512 BF16 build with clang-cl (labels: ggml)
#23593 opened May 24, 2026 by marcusds
Update build.md with Fedora Vulkan dependencies (labels: documentation)
#23584 opened May 23, 2026 by JCTRoth
cmake : error when LLAMA_BUILD_APP=ON and LLAMA_BUILD_TOOLS=OFF (labels: build)
#23580 opened May 23, 2026 by Pento95
Static quantize mtp layers
#23575 opened May 23, 2026 by de-wim Draft