Pull requests: alibaba/rtp-llm

Pull requests list

fix(scheduler): guard GatherBatchScheduler against py_model mixed batch
#928 opened Apr 23, 2026 by Vinkle-hzt Collaborator
Develop/fix int64
#927 opened Apr 23, 2026 by xinfei-shi Collaborator
fix: write cache store wrong gid
#925 opened Apr 23, 2026 by SJTUGavinLiu Collaborator
build: migrate from Bazel Python to setup.py + pytest
#924 opened Apr 22, 2026 by LLLLKKKK Collaborator
3 of 4 tasks
fix - fix linear reuse cache core
#923 opened Apr 22, 2026 by Nancheng-11 Collaborator
amd qwen35 optimize fused_l2norm_qk
#920 opened Apr 22, 2026 by hxy0118 Collaborator
update: update kvcm client
#918 opened Apr 21, 2026 by lucky-zzz Collaborator
feat: refactor py model device
#917 opened Apr 21, 2026 by JackTan25 Collaborator
Defer engine and RPC loop start until after full server init
#916 opened Apr 21, 2026 by xinfei-shi Collaborator
Support batch_prefill && TPS bench mode
#914 opened Apr 21, 2026 by alibaba-miji Collaborator
6 tasks done
Feature/p2p connector complete
#910 opened Apr 17, 2026 by ZhihanYan Collaborator
refactor: refactor codes
#909 opened Apr 17, 2026 by JackTan25 Collaborator
perf: optimize MoE model weight loading (8.6x speedup)
#908 opened Apr 17, 2026 by netaddi Collaborator
3 tasks
Feat/hybrid cp gdn
#906 opened Apr 17, 2026 by yang1556 Collaborator
feat: support input_embeddings in inference pipeline
#905 opened Apr 17, 2026 by KrisCheng9 Collaborator
optimize beam search
#903 opened Apr 16, 2026 by parkerpang
feat: support xgrammer
#902 opened Apr 16, 2026 by wanglining97 Collaborator
[ROCm] Optimize Qwen3.5 with fused kernel and allreduce merging
#900 opened Apr 16, 2026 by chengshu-lcc Collaborator
feat: add Kimi Linear (KDA) model support
#899 opened Apr 16, 2026 by theNiemand Collaborator
feat: Qwen3.5 Blackwell GDN prefill optimization
#897 opened Apr 15, 2026 by netaddi Collaborator
3 tasks
Constrained decoding changes
#893 opened Apr 14, 2026 by Glen11111Z
Gb200 Qwen3.5 NVFP4
#888 opened Apr 14, 2026 by qqbbiu Collaborator
fix: fix nvfp4 dp2 cuda graph smoke crash bug
#887 opened Apr 14, 2026 by JackTan25 Collaborator