Pull requests: NVIDIA/TensorRT-LLM
[None][perf] Remove redundant allreduce
#14974
opened Jun 4, 2026 by
mikeiovine
Collaborator
1 task done
[https://nvbugs/6266705][fix] Gate the FlashInfer import-time selection on get_sm_version() == 90 (in…
#14973
opened Jun 4, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[None][infra] Reduce Docker image layer count in release stage
#14972
opened Jun 4, 2026 by
tburt-nv
Collaborator
1 task done
[None][chore] Unwaive AutoDeploy accuracy tests
#14971
opened Jun 4, 2026 by
bmarimuthu-nv
Collaborator
1 task done
[None][feat] Add PyTorch reset_prefix_cache API
api-compatible
Accepted LLM API contract change that is backwards-compatible
#14970
opened Jun 4, 2026 by
milesial
Collaborator
1 task done
feat(autodeploy): register Llama-3.1 Nemotron Ultra variants in model registry
#14969
opened Jun 4, 2026 by
Priyanshu31102003
[None][test] Waive qwen3_30b_a3b_fp8 pd_disagg mm-encoder test on single-GPU (NIXL setup failure)
#14968
opened Jun 4, 2026 by
venkywonka
Collaborator
[TRTLLM-13177][doc] Add Nemotron 3 Ultra doc
#14964
opened Jun 4, 2026 by
nv-guomingz
Collaborator
1 task done
[None][feat] add MXFP8 weight format + CUTLASS W8A8 Linear and MoE
#14962
opened Jun 4, 2026 by
WeiHaocheng
Collaborator
Draft
1 task
[None][feat] enable GQA and cross-attention for attn2d
#14961
opened Jun 4, 2026 by
NVShreyas
Collaborator
1 task done
[None][test] Add GLM-5 into CI Perf Test
#14960
opened Jun 4, 2026 by
chenfeiz0326
Collaborator
1 task done
[None][fix] add use_remote_kv_events option in kvaware router
#14959
opened Jun 4, 2026 by
reasonsolo
Collaborator
1 task done
[https://nvbugs/6248776][fix] Add trust_remote_code=True to the LLM(...) call in test_nemotron_nas_lora and…
#14958
opened Jun 4, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[None][perf] ConversationRouter: skip block-hash compute on sticky conversation_id routing
#14957
opened Jun 4, 2026 by
lishicheng1996-nv
Collaborator
1 task done
[None][refactor] split VisualGen pipeline and model configs
#14956
opened Jun 4, 2026 by
bobboli
Collaborator
[None][infra] Add nv-xtf, rahul-steiger-nv, tedzhouhk, tensorrt-cicd to blossom-ci allowlist
#14955
opened Jun 4, 2026 by
ZhanruiSunCh
Collaborator
1 task done
[https://nvbugs/6160629][fix] AutoDeploy: increase rtol for bf16 HF vs FI rope test
#14954
opened Jun 4, 2026 by
galagam
Collaborator
1 task done
[None][feat] Sparse-attention behavior-layer framework + V2-migrated RocketKV with chunked prefill
api-compatible
Accepted LLM API contract change that is backwards-compatible
#14953
opened Jun 4, 2026 by
Hudayday
Collaborator
[https://nvbugs/5859886][fix] Remove the waiver
#14948
opened Jun 4, 2026 by
ziyixiong-nv
Collaborator
1 task
[None][opt] attn kernel epilogue fuse RopeQuant
#14947
opened Jun 4, 2026 by
yunruis
Contributor
1 task done
[None][feat] Support beam search in KV cache manager v2
#14945
opened Jun 4, 2026 by
yizhang-nv
Member
1 task done
[TRTLLM-13052][feat] Enable TRTLLM moe backend for nemotron-h BF16 ckpt
#14944
opened Jun 4, 2026 by
Wanli-Jiang
Collaborator
Draft
1 task done
[None][feat] AutoDeploy: Fix hardcoded configs
#14943
opened Jun 4, 2026 by
taylor-yb-lee
Collaborator
1 task done
[TRTLLM-10184][chore] Remove legacy XQA precompiled path
#14941
opened Jun 4, 2026 by
pengbowang-nv
Collaborator
Draft
1 task done
[https://nvbugs/6211441][fix] Resolve yaml_extra paths from the configs dir via a class-level YAML_EXTRA…
#14938
opened Jun 4, 2026 by
tensorrt-cicd
Collaborator
2 tasks done