Synchronize LLAMA_API with ggml-org/llama.cpp and update cuda workflow for windows#1966

Closed
JamePeng wants to merge 0 commit into
abetlen:mainfrom
JamePeng:main

JamePeng commented Mar 9, 2025

Update llama.cpp from 794fe2 to f08f4b3
Use llama_sampler_init instead of llama_sampler() for safe usage
Sync llama : add Phi-4-mini support
Sync llama : expose llama_model_n_head_kv in the API
Sync tool-call: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars
class LlamaSampler: add add_xtc(), add_top_n_sigma() and add_dry()
Remove Tail-Free sampling
Add Top-n-sigma/XTC/DRY sampler code to the sampler
Sync llama : Add Gemma 3 support
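
For readers unfamiliar with the top-n-sigma sampler mentioned above, here is a minimal pure-Python sketch of the idea as it is commonly described: keep only tokens whose logit lies within n standard deviations of the maximum logit, and mask the rest. This is an illustration only. The function names `top_n_sigma_filter` and `softmax` are hypothetical, and this is not the code from this PR or from llama.cpp.

```python
import math
import statistics


def top_n_sigma_filter(logits, n=1.0):
    """Mask logits that fall more than n standard deviations below the max.

    Hypothetical helper for illustration; not the llama.cpp implementation.
    """
    max_logit = max(logits)
    sigma = statistics.pstdev(logits)  # population standard deviation
    threshold = max_logit - n * sigma
    # Tokens below the threshold get -inf so softmax assigns them probability 0.
    return [x if x >= threshold else -math.inf for x in logits]


def softmax(logits):
    """Numerically stable softmax that tolerates -inf entries."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 3.5, 1.0, -2.0]
filtered = top_n_sigma_filter(logits, n=1.0)
probs = softmax(filtered)
```

With these example logits, only the two highest-scoring tokens stay above the threshold, so the remaining probability mass is split between them.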

JamePeng changed the title from "Sync LLAMA_API names with ggml-org/llama.cpp 20250309, support LLAMA_VOCAB_PRE_TYPE_GPT4O" to "Sync LLAMA_API names with ggml-org/llama.cpp 20250309" on Mar 9, 2025

JamePeng commented Mar 9, 2025

I adjusted the workflow to build pip wheels with VS2022. It generates Windows wheels for Python 3.10 to 3.12 against two CUDA versions, 12.4.1 and 12.6.3, for your convenience.
They should be compiled by now: https://github.com/JamePeng/llama-cpp-python/releases

JamePeng changed the title from "Sync LLAMA_API names with ggml-org/llama.cpp 20250309" to "Synchronize LLAMA_API with ggml-org/llama.cpp and update cuda workflow for windows" on Mar 9, 2025

JamePeng commented Mar 13, 2025

llama.cpp : refactor llama_context, llama_kv_cache, llm_build_context (ggml-org/llama.cpp#12181)
They changed the API names again :<

@JamePeng

The adjusted code has been moved to https://github.com/JamePeng/llama-cpp-python/tree/1966-branch
