Comparing changes

base repository: abetlen/llama-cpp-python
base: main
...
head repository: etemiz/llama-cpp-python
compare: main
  • 1 commit
  • 1 file changed
  • 1 contributor

Commits on Apr 22, 2026

  1. Disable thinking/reasoning for Qwen3.5

    from llama_cpp import Llama

    # Load-time setting
    model = Llama(
        model_path="./Qwen3.5-0.8B/Qwen3.5-0.8B-Q8_0.gguf",
        chat_template_kwargs={"enable_thinking": False},
    )

    # Per-call override
    model.create_chat_completion(
        messages=[{"role": "user", "content": "hi"}],
        enable_thinking=False,  # overrides load-time setting
    )
    etemiz committed Apr 22, 2026
    e318199
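
The commit message describes a load-time setting that a per-call argument can override. The precedence it implies might be sketched as follows. This is only an illustration under stated assumptions: `resolve_template_kwargs` is a hypothetical helper, not a function from llama-cpp-python, and treating `None` as "not specified" is an assumption rather than something the commit confirms.

```python
from typing import Any, Dict, Mapping, Optional


def resolve_template_kwargs(
    load_time: Optional[Mapping[str, Any]] = None,
    **per_call: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: merge load-time template kwargs with per-call overrides."""
    # Start from the kwargs supplied when the model was loaded.
    merged = dict(load_time or {})
    # Assumption: a per-call value of None means "not specified",
    # so only explicit values replace the load-time setting.
    merged.update({k: v for k, v in per_call.items() if v is not None})
    return merged


load_time_kwargs = {"enable_thinking": False}

# No per-call value: the load-time setting is used as-is.
print(resolve_template_kwargs(load_time_kwargs))

# An explicit per-call value takes precedence over the load-time setting.
print(resolve_template_kwargs(load_time_kwargs, enable_thinking=True))
```

Letting the per-call value win keeps the load-time setting as a default while still allowing a single request to opt back in to thinking.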