Pull requests: vllm-project/vllm-ascend
[BugFix][Platform] Guard --ray-workers-use-nsight on Ascend NPU
module:core
#8703
opened Apr 25, 2026 by
underfituu
Contributor
2 tasks done
[Refactor] Replace BailingMoELinearAttention monkey-patching with PluggableLayer
module:core
module:ops
#8702
opened Apr 25, 2026 by
ghphotoframe
Contributor
[Doc][v0.18.0] Fix documentation formatting and improve code examples
#8701
opened Apr 25, 2026 by
MrZ20
Contributor
[WIP][Feature]Using torch.float8_e8m0fnu instead of torch_npu.float8_e8m0fnu
module:quantization
#8700
opened Apr 25, 2026 by
lijiahang226
Contributor
[Doc][Misc] Refactor and simplify KV Pool documentation and scripts
documentation
Improvements or additions to documentation
#8696
opened Apr 25, 2026 by
internel-error
[BugFix] non-stream recompute text join
#8695
opened Apr 25, 2026 by
wangxiaoteng888
Contributor
[CI][BugFix] Increase /dev/shm size limit from 15Gi to 128Gi in multi-node LWS template
module:tests
#8693
opened Apr 25, 2026 by
zhangxinyuehfad
Collaborator
[CI]Fix the error caused by enabling layer_sharding in Dsv32 mixed de…
module:tests
nightly-test
#8691
opened Apr 25, 2026 by
Nagisa125
Contributor
[Doc] Translated Doc files 2026-04-25
documentation
Improvements or additions to documentation
#8689
opened Apr 25, 2026 by
vllm-ascend-ci
Collaborator
[Doc][v0.13.0] remove duplicate --net=host flag in DeepSeek-V4 tutorial
#8688
opened Apr 25, 2026 by
LGiki
[Ops][Feature] Support Bailing25 (bailing_hybrid) quantization
module:quantization
#8685
opened Apr 24, 2026 by
alex101-ops
Contributor
[CI][Cherry-pick] Relax TTFT benefits threshold from 0.4 to 0.5 to account for DP load imbalance
#8684
opened Apr 24, 2026 by
underfituu
Contributor
[CI][Main] Relax TTFT benefits threshold from 0.4 to 0.5 to account for DP load imbalance
module:tests
#8683
opened Apr 24, 2026 by
underfituu
Contributor
[CI] add nightly MiniMax-M2.5-w8a8-QuaRot
ci/build
module:tests
nightly-test
#8681
opened Apr 24, 2026 by
weixinAc
[CI] Add nightly case:GLM-5_1-W8A8
ci/build
module:tests
nightly-test
#8680
opened Apr 24, 2026 by
guxin108
Contributor
[BugFix] Fix DSV3.1 W4A8 TTFT degradation
ready
ready for review
ready-for-test
start test by label for PR
#8675
opened Apr 24, 2026 by
wangbj127
Contributor
[v0.18.0][BugFix] Fix DSV3.1 W4A8 TTFT degradation
ready
ready for review
ready-for-test
start test by label for PR
#8674
opened Apr 24, 2026 by
wangbj127
Contributor
[CI] repair ci customop for main
module:tests
#8673
opened Apr 24, 2026 by
ZT-AIA
Contributor
[Doc][0.18.0] Fix the wrong triton uninstall guide
#8672
opened Apr 24, 2026 by
Tflowers-0129
Contributor
In scenarios A2 and A3, replace npu_fusion_attention with the _npu_flash_attention_unpad operator.
module:ops
#8671
opened Apr 24, 2026 by
chenxi-hh
Collaborator
[BugFix][Eagle3] Add fullgraph case and check mock function
module:tests
#8668
opened Apr 24, 2026 by
lilinsiman
Collaborator
[CI] add nightly case: Kimi-2.5
ci/build
module:tests
nightly-test
#8667
opened Apr 24, 2026 by
chen-commits
[Test]Add quantization test case
module:tests
ready
ready for review
ready-for-test
start test by label for PR
#8666
opened Apr 24, 2026 by
kunpengW-code
Contributor
Fix formatting of gpu-memory-utilization flag
documentation
Improvements or additions to documentation
#8665
opened Apr 24, 2026 by
zkryakgul