abetlen / llama-cpp-python Public

Fork 1.4k
Star 10.4k

Pull requests: abetlen/llama-cpp-python

66 Open 560 Closed

Pull requests list

security: fix SSRF in multimodal image URL loading (_load_image)

#2220 opened May 16, 2026 by hoangperry

5 tasks done

fix: improve error message when LlamaModel fails to load

#2187 opened Apr 21, 2026 by Anai-Guo Contributor

Add chat template for gemma models

#2183 opened Apr 13, 2026 by C00kieFact0ry

fix: prevent KV cache corruption on SWA/ISWA models + hot-path perf

#2180 opened Apr 12, 2026 by avion23 Contributor

perf: vectorize KV cache prefix matching with numpy

#2179 opened Apr 11, 2026 by nausicaalii

4 tasks done

build: disable soname to reduce binary size

#2177 opened Apr 9, 2026 by Bing-su

feat(example): Updated server example (batch processing, /v1/responses api, response parsing)

#2174 opened Apr 5, 2026 by abetlen Owner

feat: add reasoning_effort to chat completions API

#2167 opened Mar 30, 2026 by abetlen Owner

ci: refactor cpu wheel build workflow

#2164 opened Mar 26, 2026 by Bing-su

fix: auto-disable mmap when all layers offloaded to GPU (#1964)

#2147 opened Mar 22, 2026 by ljluestc • Draft

Clear kv cache and reset tokens after chat completion

#2141 opened Mar 14, 2026 by thisisayushg

This PR implements the previously stubbed state management methods in the _internals.py module and updates the corresponding API calls in llama.py to use the correct underlying C++ function names.

#2134 opened Mar 5, 2026 by bsides230

feat: Add DeepSeek R1 and distilled model support

#2131 opened Mar 1, 2026 by ljluestc • Draft

feat: add streaming tool use (rebased #1884 on latest main)

#2129 opened Feb 23, 2026 by XyLearningProgramming

chore: bump conda-incubator/setup-miniconda from v3.1.0 to v3.3.0

#2128 opened Feb 22, 2026 by Aiudadadadf

feat: support Granite-Docling model

#2109 opened Jan 4, 2026 by dhdaines

Fix issue #2096: Handle URLs with embedded HTTP credentials in _load_image

#2102 opened Dec 10, 2025 by nMaroulis

chore: update typing-extensions dependency and set github actions setup-python to v6

#2099 opened Nov 28, 2025 by AnvithaCodes

Fix: Install correct CUDA toolkit during build

#2088 opened Nov 12, 2025 by chamalgomes

Include x64 directory for CUDA DLLs on Windows

#2083 opened Oct 24, 2025 by ajparsons

Better Qwen2.5-VL chat template.

#2066 opened Sep 7, 2025 by alcoftTAO Contributor

Add timeout and error handling in FastAPI uvicorn server

#2044 opened Jul 22, 2025 by amandwivedi45

Actually create a random seed when using seed = -1 on load

#2042 opened Jul 16, 2025 by m-from-space

Improve error message when model file is missing

#2041 opened Jul 9, 2025 by NITHIN0710

ARM Runners support CUDA SBSA

#2039 opened Jul 7, 2025 by johnnynunez
