microsoft / onnxruntime-genai Public

Fork 285
Star 1k

Code
Issues 134
Pull requests 33
Discussions
Actions
Projects
Models
Wiki
Security and quality
Insights

Pull requests: microsoft/onnxruntime-genai

Labels 56 Milestones 0

33 Open 1,405 Closed

Pull requests list

Add quant mode for qwen3.5

#2096 opened Apr 22, 2026 by apsonawane Contributor • Draft

[Qwen3.5] Make linear attention INT8 quantization opt-in via extra_options

#2094 opened Apr 22, 2026 by daijh Contributor

Add AMDGPU execution provider support

#2093 opened Apr 20, 2026 by aditya-dl

Gemma 4 builder scaffolding (Part 1 of N, #2062)

#2088 opened Apr 16, 2026 by jaburges • Draft

WIP: TurboQuant for ORT WebGPU

#2084 opened Apr 14, 2026 by sushraja-msft Contributor • Draft

[WebGPU] Support continuous decoding (RewindTo) with graph capture

#2083 opened Apr 13, 2026 by qjia7 Contributor

[Don't merge it] Fix quark quantize weight loading for Qwen3-VL-4B text model

#2082 opened Apr 13, 2026 by Tianping-amd

[Draft] First chunk handling and drop count for Nemotron

#2081 opened Apr 11, 2026 by jiafatom Contributor • Draft

extend modelbuilder to build Olmo3, SmolLM3 and other models

#2078 opened Apr 10, 2026 by xadupre Member

162

[Mistral3] Add VLM support with multi-image inference

#2077 opened Apr 8, 2026 by titaiwangms Contributor

Enable CUDA graph capture for CUDA EP to improve decode throughput

#2070 opened Apr 7, 2026 by apsonawane Contributor

Fix CUDA build with MSVC by enabling /Zc:preprocessor for nvcc host compilation on VS 16.5 or greater

#2054 opened Apr 1, 2026 by nsubaru

Fix: Win32 build failure when paths contain spaces

#2053 opened Apr 1, 2026 by nsubaru

Add HunYuan Dense V1 (hunyuan_v1_dense) model support

#2045 opened Mar 25, 2026 by amdrajeevp1 Contributor

Add WebGPU EP support and repetitions flag to whisper.py

#2032 opened Mar 17, 2026 by qjia7 Contributor • Draft

[VitisAI] external_ep_library typo fix

#2027 opened Mar 13, 2026 by akholodnamdcom Contributor

Add Qwen3.5 support

#2025 opened Mar 13, 2026 by kinfey Contributor

Support Visual Studio 18 2026 build

#2017 opened Mar 11, 2026 by Copilot AI • Draft

[Don't review] Optimizations for graph capture

#2011 opened Mar 6, 2026 by qjia7 Contributor • Draft

Handle VLM weight name prefixes in QuantizedModel loader

#1996 opened Feb 28, 2026 by uday610 • Draft

GenAI changes to support EPContext compilation and validation

#1993 opened Feb 27, 2026 by lnigam Contributor

Add model builder support for LFM2

#1979 opened Feb 14, 2026 by xenova Contributor

[Draft] Parakeet export

#1977 opened Feb 12, 2026 by jiafatom Contributor • Draft

remove one assert not verified with model microsoft/OptiMind-SFT

#1975 opened Feb 12, 2026 by xadupre Member

KV Cache optimization based on sequence length

#1974 opened Feb 11, 2026 by chilukam-qti Contributor
