Pull requests: alibaba/rtp-llm
fix(scheduler): guard GatherBatchScheduler against py_model mixed batch
#928
opened Apr 23, 2026 by
Vinkle-hzt
Collaborator
feat: add pre-merge-gate workflow to sync internal merge before external
#926
opened Apr 23, 2026 by
guoj14
Contributor
build: migrate from Bazel Python to setup.py + pytest
#924
opened Apr 22, 2026 by
LLLLKKKK
Collaborator
3 of 4 tasks
Defer engine and RPC loop start until after full server init
#916
opened Apr 21, 2026 by
xinfei-shi
Collaborator
Support batch_prefill && TPS bench mode
#914
opened Apr 21, 2026 by
alibaba-miji
Collaborator
6 tasks done
perf: optimize MoE model weight loading (8.6x speedup)
#908
opened Apr 17, 2026 by
netaddi
Collaborator
3 tasks
feat: support input_embeddings in inference pipeline
#905
opened Apr 17, 2026 by
KrisCheng9
Collaborator
perf: add masked aware top-k op to boost performance of beam search with constrained decoding
#901
opened Apr 16, 2026 by
zhangjianning-zjn
Collaborator
[ROCm] Optimize Qwen3.5 with fused kernel and allreduce merging
#900
opened Apr 16, 2026 by
chengshu-lcc
Collaborator
feat: add Kimi Linear (KDA) model support
#899
opened Apr 16, 2026 by
theNiemand
Collaborator
feat: Qwen3.5 Blackwell GDN prefill optimization
#897
opened Apr 15, 2026 by
netaddi
Collaborator
3 tasks
fix: fix nvfp4 dp2 cuda graph smoke crash bug
#887
opened Apr 14, 2026 by
JackTan25
Collaborator