[None][test] Update K2.5 andGLM-5 into CI Perf Test by chenfeiz0326 · Pull Request #14960 · NVIDIA/TensorRT-LLM

chenfeiz0326 · 2026-06-04T14:44:56Z

Summary by CodeRabbit

Tests
- Added performance sanity test coverage for GLM-5 FP4 model across GB200 and GB300 hardware configurations.
- Expanded multi-node and multi-GPU test scenarios with new test cases for various parallelism and batch size combinations.
- Added new benchmark configurations supporting disaggregated and aggregated testing modes.
Chores
- Updated test pipeline triggering conditions and test count adjustments for multi-GPU and multi-node performance validation stages.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>

coderabbitai · 2026-06-04T14:56:47Z

📝 Walkthrough

Walkthrough

This PR extends TensorRT-LLM's multi-GPU performance sanity testing infrastructure by introducing GLM-5-fp4 model support across B200, GB200, and GB300 hardware platforms. It updates Jenkins pipeline orchestration logic, integration test definitions, and provides comprehensive benchmark reference configurations for both aggregated and disaggregated testing modes.

Changes

GLM-5-fp4 Multi-GPU Performance Testing

Layer / File(s)	Summary
Jenkins pipeline test orchestration and stage matrix updates `jenkins/L0_MergeRequest.groovy`, `jenkins/L0_Test.groovy`	File-change detection patterns updated to include GB300 `gpu2`-variant test configs instead of `gpu4` variants; testCount parameters increased across GB200 multi-node disaggregated stages (7→9, 5→7, 8→14, 11→15), decreased for GB300 2-node (4→2), and new GLM-5 DEP2 configurations added for GB300 2/3/9-node cases.
GB200 and B200 integration test lists with GLM-5-fp4 entries `tests/integration/test_lists/test-db/l0_b200_multi_gpus_perf_sanity.yml`, `tests/integration/test_lists/test-db/l0_gb200_multi_gpus_perf_sanity.yml`, `tests/integration/test_lists/test-db/l0_gb200_multi_nodes_perf_sanity_*`	Adds GLM-5-fp4 test cases to B200 and GB200 multi-GPU/multi-node test matrices, including post-merge aggregated upload variants and context-only test conditions, each with appropriate timeout settings (90–120 seconds).
GB300 integration test lists with GPU2-based configurations `tests/integration/test_lists/test-db/l0_gb300_multi_gpus_perf_sanity.yml`, `tests/integration/test_lists/test-db/l0_gb300_multi_nodes_perf_sanity_ctx1_node1_gpu2_gen1_*`, `tests/integration/test_lists/test-db/l0_gb300_multi_nodes_perf_sanity_ctx1_node1_gpu4_gen1_node8_gpu32.yml`	Replaces deepseek-v32-fp4 with GLM-5-fp4 entries in multi-GPU tests; introduces three new GPU2-based 2-node, 3-node, and 9-node disaggregated configurations for GB300; removes obsolete GPU4 9-node variant.
Aggregated benchmark configurations for GLM-5-fp4 `tests/scripts/perf-sanity/aggregated/glm5_fp4_*.yaml`	Defines reference aggregated (context and generation handled by the same server instances, in contrast to disaggregated serving) benchmark configurations for GLM-5-fp4 on B200 (8 GPUs/node) and GB200 (4 GPUs/node), each with TEP/DEP parallelism variants, TRTLLM or CUTLASS attention backends, fp8 KV cache, MTP speculative decoding, and openai client configuration templates.
GB200 disaggregated benchmark configuration tuning `tests/scripts/perf-sanity/disaggregated/gb200_glm-5-fp4_*.yaml`	Adjusts token generation limits (128→64 tokens), reduces parallelism degrees (tensor parallel 8→4, moe expert parallel 8→4), lowers GPU memory headroom (0.9→0.85 fraction), and adds load_balancer configuration (num_slots 256, layer updates per iteration) to existing GB200 disaggregated benchmark configs.
GB300 disaggregated benchmark configurations (new) `tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_*.yaml`	Introduces six new disaggregated benchmark configurations covering 1k1k and 8k1k input/output sequences with varying tensor/moe parallelism, context-generation server splits, and concurrency profiles; each config defines SLURM job templates, worker batch/token limits, KV cache settings (fp4/fp8 dtypes), attention DP toggles, NIXL cache transceiver, and shared speculative decoding configuration via YAML anchors.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

NVIDIA/TensorRT-LLM#10912: Introduces the buildStageConfigs(...) function used to define stage-matrix entries in L0_Test.groovy, directly related to the stage orchestration updates in this PR.

Suggested reviewers

yufeiwu-nv
ruodil
LarryXFly

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description contains only template placeholders with no substantive content filled in.	Provide a clear description of the changes, explain why the GLM-5 model tests are being added, list the relevant test coverage, and complete the PR checklist items as applicable.
Title check	❓ Inconclusive	The title partially relates to the changeset by mentioning GLM-5, which is a model added throughout the PR, but it is unclear and contains inconsistencies (e.g., 'K2.5' is mentioned but not substantive in the changes; 'andGLM-5' appears to be a typo; '[None][test]' is vague). The title does not clearly convey the main focus of updating test configurations and performance sanity benchmarks.	Clarify the title to explicitly describe the primary changes, such as 'Add GLM-5 model to multi-GPU and multi-node perf sanity tests' or 'Update perf sanity test configs for GLM-5 on GB200/GB300'.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

Review ran into problems

🔥 Problems

Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a @coderabbit review after the pipeline has finished.

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai

🧹 Nitpick comments (4)

tests/scripts/perf-sanity/aggregated/glm5_fp4_grace_blackwell.yaml (1)
1-75: Coverage status: sufficient in-PR; one follow-up is outside this layer.

For tests/scripts/perf-sanity/aggregated/glm5_fp4_grace_blackwell.yaml, coverage is sufficient for GB200 aggregated 1k1k TEP/DEP variants.
Follow-up outside this PR layer (if not already handled in other stacked files): confirm CI selection references include all three new aggregated configs:

tests/scripts/perf-sanity/aggregated/glm5_fp4_2_nodes_grace_blackwell.yaml

tests/scripts/perf-sanity/aggregated/glm5_fp4_blackwell.yaml

tests/scripts/perf-sanity/aggregated/glm5_fp4_grace_blackwell.yaml
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/scripts/perf-sanity/aggregated/glm5_fp4_grace_blackwell.yaml` around
lines 1 - 75, The CI selection references need to include all three new
aggregated config files so tests run for each variant; update whatever
CI/selection list references (e.g., in the perf-sanity CI matrix or selection
files) to add
"tests/scripts/perf-sanity/aggregated/glm5_fp4_grace_blackwell.yaml",
"tests/scripts/perf-sanity/aggregated/glm5_fp4_blackwell.yaml", and
"tests/scripts/perf-sanity/aggregated/glm5_fp4_2_nodes_grace_blackwell.yaml" so
the new GB200 aggregated 1k1k TEP/DEP variants are selected by CI. Ensure any
selection logic that filters by the directory
tests/scripts/perf-sanity/aggregated or by model_name "glm_5_nvfp4" also
accounts for these three files.
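
The follow-up above could be checked mechanically. The following is a minimal sketch, assuming the test-list files mention each aggregated config by its file stem; the helper name `find_unreferenced` and the sample test-list text are hypothetical and not taken from the TensorRT-LLM harness.

```python
import re
from typing import Iterable

# File stems of the three aggregated configs named in the review comment.
AGGREGATED_CONFIGS = [
    "glm5_fp4_grace_blackwell",
    "glm5_fp4_blackwell",
    "glm5_fp4_2_nodes_grace_blackwell",
]


def find_unreferenced(config_stems: Iterable[str], test_list_text: str) -> list[str]:
    """Return the config stems that never appear as a whole token in the text."""
    missing = []
    for stem in config_stems:
        # Lookarounds prevent one stem from matching inside a longer identifier.
        pattern = rf"(?<!\w){re.escape(stem)}(?!\w)"
        if re.search(pattern, test_list_text) is None:
            missing.append(stem)
    return missing


if __name__ == "__main__":
    # Hypothetical test-list content; real entries will look different.
    sample = "aggr-glm5_fp4_blackwell-x\naggr-glm5_fp4_grace_blackwell-y\n"
    print(find_unreferenced(AGGREGATED_CONFIGS, sample))
```

Running this over the concatenated contents of the `l0_*_perf_sanity*.yml` lists would flag any config that CI never selects.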
tests/scripts/perf-sanity/aggregated/glm5_fp4_2_nodes_grace_blackwell.yaml (1)
1-75: Coverage status: sufficient for this file’s aggregated scope.

For tests/scripts/perf-sanity/aggregated/glm5_fp4_2_nodes_grace_blackwell.yaml, coverage is sufficient in this PR for GB200 aggregated perf sanity because both TEP (glm5_fp4_tep8_mtp3_8k1k) and DEP (glm5_fp4_dep8_mtp1_8k1k) variants are included with matching client workloads.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/scripts/perf-sanity/aggregated/glm5_fp4_2_nodes_grace_blackwell.yaml`
around lines 1 - 75, The YAML already includes both TEP and DEP aggregated
configs, but update/verify that metadata.model_name ("glm_5_nvfp4") matches each
server_config.model_name and that the two server_configs named
"glm5_fp4_tep8_mtp3_8k1k" and "glm5_fp4_dep8_mtp1_8k1k" remain present; also
replace the placeholder dataset_file in each client_configs entry with the
actual dataset path (or a CI-provided variable) so the perf-sanity jobs can run
end-to-end.
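
The consistency check described above can be sketched as follows. The key names `metadata`, `model_name`, `server_configs`, and `name` come from the review comment; the exact nesting and the function `check_aggregated_config` are assumptions for illustration. The config is assumed to be already parsed into a dict (for example by a YAML loader).

```python
def check_aggregated_config(cfg: dict, expected_names: set[str]) -> list[str]:
    """Return human-readable problems found in a parsed aggregated config."""
    problems = []
    model = cfg.get("metadata", {}).get("model_name")
    servers = cfg.get("server_configs", [])

    # Every server entry should use the model declared in the metadata block.
    for server in servers:
        if server.get("model_name") != model:
            problems.append(
                f"{server.get('name')}: model_name {server.get('model_name')!r} "
                f"does not match metadata {model!r}"
            )

    # Every expected server config should still be present.
    present = {server.get("name") for server in servers}
    for name in sorted(expected_names - present):
        problems.append(f"missing server config {name}")
    return problems


if __name__ == "__main__":
    # Hypothetical, heavily trimmed config used only to exercise the check.
    cfg = {
        "metadata": {"model_name": "glm_5_nvfp4"},
        "server_configs": [
            {"name": "glm5_fp4_tep8_mtp3_8k1k", "model_name": "glm_5_nvfp4"},
        ],
    }
    expected = {"glm5_fp4_tep8_mtp3_8k1k", "glm5_fp4_dep8_mtp1_8k1k"}
    print(check_aggregated_config(cfg, expected))
```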
tests/scripts/perf-sanity/aggregated/glm5_fp4_blackwell.yaml (1)
1-75: Coverage status: sufficient for this file’s aggregated scope.

For tests/scripts/perf-sanity/aggregated/glm5_fp4_blackwell.yaml, coverage is sufficient in this PR for B200 aggregated perf sanity via both TEP and DEP 8k1k configurations.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/scripts/perf-sanity/aggregated/glm5_fp4_blackwell.yaml` around lines 1
- 75, Coverage for glm5_fp4_blackwell.yaml is already sufficient so no
structural changes are required; however ensure the client_configs dataset_file
placeholder is wired to the test runner by replacing the literal
"<dataset_file>" with the CI/test variable your harness expects (e.g.,
${DATASET_FILE}) so the two server configs named "glm5_fp4_tep8_mtp3_8k1k" and
"glm5_fp4_dep8_mtp1_8k1k" (and keys metadata.model_name and supported_gpus) run
with a real dataset path during execution.
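
One way the `<dataset_file>` placeholder could be wired to a real value is a plain text substitution before the config is parsed. This is a sketch only: `resolve_placeholders` is a hypothetical helper, and the actual harness may resolve placeholders differently.

```python
import re

# Matches angle-bracket tokens such as <dataset_file> or <model_path>.
TOKEN = re.compile(r"<([a-z_]+)>")


def resolve_placeholders(text: str, values: dict[str, str]) -> str:
    """Replace each <name> token with values[name]; fail if a name is unknown."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for <{name}>")
        return values[name]

    return TOKEN.sub(substitute, text)


if __name__ == "__main__":
    # The path below is made up for the example.
    raw = "dataset_file: <dataset_file>"
    print(resolve_placeholders(raw, {"dataset_file": "/data/glm5_8k1k.json"}))
```

Raising on an unknown name keeps a typo in a placeholder from silently reaching the job.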
tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_1k1k_con1_ctx1_dep2_gen1_tep4_eplb0_mtp3_ccb-NIXL.yaml (1)
1-94: QA coverage status: sufficient for config-definition scope; execution evidence needs follow-up outside this PR.

Coverage is sufficient across the new GB300 disaggregated config set for this cohort:

tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_1k1k_con1_ctx1_dep2_gen1_tep4_eplb0_mtp3_ccb-NIXL.yaml

tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_1k1k_con4096_ctx1_dep2_gen1_dep8_eplb256_mtp1_ccb-NIXL.yaml

tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_1k1k_con512_ctx1_dep2_gen1_dep32_eplb0_mtp3_ccb-NIXL.yaml

tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_8k1k_con1024_ctx1_dep2_gen1_dep8_eplb256_mtp1_ccb-NIXL.yaml

tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_8k1k_con1_ctx1_dep2_gen1_tep8_eplb0_mtp3_ccb-NIXL.yaml

tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_8k1k_con512_ctx1_dep2_gen1_dep32_eplb0_mtp3_ccb-NIXL.yaml

Actionable follow-up outside this PR: capture CI artifact evidence that placeholder fields (<partition>, <account>, <dataset_file>, <model_path>) are fully resolved at runtime for each listed file.

As per coding guidelines, "Act as a QA engineer reviewing test changes and coverage for TensorRT-LLM. Keep feedback actionable: suggest concrete list file names and whether coverage is sufficient, insufficient, or needs follow-up outside the PR."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_1k1k_con1_ctx1_dep2_gen1_tep4_eplb0_mtp3_ccb-NIXL.yaml`
around lines 1 - 94, The YAML contains unresolved placeholders (<partition>,
<account>, <dataset_file>, <model_path>) that must be validated before job
submission; update the test harness or the config generation step to replace
those placeholders for the files (e.g.,
tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_1k1k_con1_ctx1_dep2_gen1_tep4_eplb0_mtp3_ccb-NIXL.yaml
and the other listed YAMLs) and add a preflight check that reads keys partition,
account, dataset_file, model_path and fails early with a clear error if any
still match the placeholder pattern; alternatively wire them to concrete CI
variables or templating logic so create_job()/load_config() (or whatever config
loader function you use) performs substitution and validation.
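
The preflight check suggested above might look like the sketch below. It assumes the config has already been parsed into nested dicts and lists; the function names and the section names in the example (`slurm`, `benchmark`) are hypothetical, while the four key names are the ones listed in the review comment.

```python
import re

# A value is treated as unresolved if it is exactly an angle-bracket token.
PLACEHOLDER = re.compile(r"^<[A-Za-z_]+>$")
REQUIRED_KEYS = ("partition", "account", "dataset_file", "model_path")


def find_placeholders(node, path=""):
    """Yield (path, value) for required keys whose value is still '<...>'."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            if key in REQUIRED_KEYS and isinstance(value, str) and PLACEHOLDER.match(value):
                yield child, value
            else:
                yield from find_placeholders(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_placeholders(item, f"{path}[{index}]")


def preflight(cfg) -> None:
    """Raise ValueError before job submission if any placeholder is unresolved."""
    unresolved = list(find_placeholders(cfg))
    if unresolved:
        details = ", ".join(f"{path}={value}" for path, value in unresolved)
        raise ValueError(f"unresolved placeholders: {details}")


if __name__ == "__main__":
    cfg = {
        "slurm": {"partition": "<partition>", "account": "my_account"},
        "benchmark": {"dataset_file": "/data/glm5_1k1k.json"},
    }
    print(list(find_placeholders(cfg)))
```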

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8f8c1c3a-d00e-4a02-983b-8504f53ec70c

📥 Commits

Reviewing files that changed from the base of the PR and between c17611c and eabbbc1.

📒 Files selected for processing (28)

jenkins/L0_MergeRequest.groovy
jenkins/L0_Test.groovy
tests/integration/test_lists/test-db/l0_b200_multi_gpus_perf_sanity.yml
tests/integration/test_lists/test-db/l0_gb200_multi_gpus_perf_sanity.yml
tests/integration/test_lists/test-db/l0_gb200_multi_nodes_perf_sanity_ctx1_node1_gpu4_gen1_node1_gpu4.yml
tests/integration/test_lists/test-db/l0_gb200_multi_nodes_perf_sanity_ctx1_node1_gpu4_gen1_node2_gpu8.yml
tests/integration/test_lists/test-db/l0_gb200_multi_nodes_perf_sanity_ctx1_node1_gpu4_gen1_node8_gpu32.yml
tests/integration/test_lists/test-db/l0_gb200_multi_nodes_perf_sanity_node2_gpu8.yml
tests/integration/test_lists/test-db/l0_gb300_multi_gpus_perf_sanity.yml
tests/integration/test_lists/test-db/l0_gb300_multi_nodes_perf_sanity_ctx1_node1_gpu2_gen1_node1_gpu4.yml
tests/integration/test_lists/test-db/l0_gb300_multi_nodes_perf_sanity_ctx1_node1_gpu2_gen1_node2_gpu8.yml
tests/integration/test_lists/test-db/l0_gb300_multi_nodes_perf_sanity_ctx1_node1_gpu2_gen1_node8_gpu32.yml
tests/integration/test_lists/test-db/l0_gb300_multi_nodes_perf_sanity_ctx1_node1_gpu4_gen1_node1_gpu4.yml
tests/integration/test_lists/test-db/l0_gb300_multi_nodes_perf_sanity_ctx1_node1_gpu4_gen1_node8_gpu32.yml
tests/scripts/perf-sanity/aggregated/glm5_fp4_2_nodes_grace_blackwell.yaml
tests/scripts/perf-sanity/aggregated/glm5_fp4_blackwell.yaml
tests/scripts/perf-sanity/aggregated/glm5_fp4_grace_blackwell.yaml
tests/scripts/perf-sanity/disaggregated/gb200_glm-5-fp4_1k1k_con1_ctx1_dep4_gen1_tep4_eplb0_mtp3_ccb-NIXL.yaml
tests/scripts/perf-sanity/disaggregated/gb200_glm-5-fp4_1k1k_con4096_ctx1_dep4_gen1_dep8_eplb256_mtp1_ccb-NIXL.yaml
tests/scripts/perf-sanity/disaggregated/gb200_glm-5-fp4_1k1k_con512_ctx1_dep4_gen1_dep32_eplb0_mtp3_ccb-NIXL.yaml
tests/scripts/perf-sanity/disaggregated/gb200_glm-5-fp4_8k1k_con1024_ctx1_dep4_gen1_dep8_eplb256_mtp1_ccb-NIXL.yaml
tests/scripts/perf-sanity/disaggregated/gb200_glm-5-fp4_8k1k_con512_ctx1_dep4_gen1_dep32_eplb0_mtp3_ccb-NIXL.yaml
tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_1k1k_con1_ctx1_dep2_gen1_tep4_eplb0_mtp3_ccb-NIXL.yaml
tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_1k1k_con4096_ctx1_dep2_gen1_dep8_eplb256_mtp1_ccb-NIXL.yaml
tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_1k1k_con512_ctx1_dep2_gen1_dep32_eplb0_mtp3_ccb-NIXL.yaml
tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_8k1k_con1024_ctx1_dep2_gen1_dep8_eplb256_mtp1_ccb-NIXL.yaml
tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_8k1k_con1_ctx1_dep2_gen1_tep8_eplb0_mtp3_ccb-NIXL.yaml
tests/scripts/perf-sanity/disaggregated/gb300_glm-5-fp4_8k1k_con512_ctx1_dep2_gen1_dep32_eplb0_mtp3_ccb-NIXL.yaml
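
The disaggregated config names in this list follow a regular pattern, so they can be split into fields for filtering or reporting. The sketch below is an interpretation inferred from the names alone: reading `con` as concurrency, `ctx`/`gen` as context and generation server counts, `dep`/`tep` as the parallelism mode and size, `eplb` as load-balancer slots, `mtp` as the speculative-decoding setting, and `ccb` as the cache transceiver backend is an assumption, and `parse_config_name` is a hypothetical helper.

```python
import re

NAME_PATTERN = re.compile(
    r"^(?P<gpu>gb\d+)_(?P<model>[a-z0-9.-]+)_(?P<seq>\d+k\d+k)"
    r"_con(?P<concurrency>\d+)"
    r"_ctx(?P<ctx_servers>\d+)_(?P<ctx_mode>dep|tep)(?P<ctx_size>\d+)"
    r"_gen(?P<gen_servers>\d+)_(?P<gen_mode>dep|tep)(?P<gen_size>\d+)"
    r"_eplb(?P<eplb_slots>\d+)_mtp(?P<mtp>\d+)_ccb-(?P<transceiver>\w+)\.yaml$"
)

INT_FIELDS = (
    "concurrency", "ctx_servers", "ctx_size",
    "gen_servers", "gen_size", "eplb_slots", "mtp",
)


def parse_config_name(filename: str) -> dict:
    """Split a disaggregated config file name into its named fields."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized config name: {filename}")
    fields = match.groupdict()
    for key in INT_FIELDS:
        fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    name = "gb300_glm-5-fp4_1k1k_con4096_ctx1_dep2_gen1_dep8_eplb256_mtp1_ccb-NIXL.yaml"
    print(parse_config_name(name))
```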

💤 Files with no reviewable changes (2)

tests/integration/test_lists/test-db/l0_gb300_multi_nodes_perf_sanity_ctx1_node1_gpu4_gen1_node1_gpu4.yml
tests/integration/test_lists/test-db/l0_gb300_multi_nodes_perf_sanity_ctx1_node1_gpu4_gen1_node8_gpu32.yml

Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>

chenfeiz0326 · 2026-06-05T02:51:50Z

/bot run --disable-fail-fast --stage-list "GB200-8_GPUs-2_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE1-GPU4-Post-Merge-1,GB200-8_GPUs-2_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE1-GPU4-Post-Merge-2,GB200-8_GPUs-2_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE1-GPU4-Post-Merge-3,GB200-8_GPUs-2_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE1-GPU4-Post-Merge-4,GB200-8_GPUs-2_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE1-GPU4-Post-Merge-5,GB200-8_GPUs-2_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE1-GPU4-Post-Merge-6,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-1,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-2,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-3,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-4,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-5,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-6,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-7,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-8,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-9,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-10,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-11,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-12,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-13,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-1,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-2,GB20
0-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-3,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-4,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-5,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-6,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-7,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-8,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-9,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-10,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-11,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-12,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-13,GB300-8_GPUs-2_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE1-GPU4-Post-Merge-1,GB300-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE2-GPU8-Post-Merge-1,GB300-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE2-GPU8-Post-Merge-2,GB300-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE2-GPU8-Post-Merge-3,GB300-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE2-GPU8-Post-Merge-4,GB300-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE2-GPU8-Post-Merge-5,GB300-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE8-GPU32-Post-Merge-1,GB300-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE8-GPU32-Post-Merge-2,GB200-4_GPUs-PyTorch-PerfSanity-Post-Merge-1,GB200-4_GPUs-PyTorch-PerfSanity-Post-Merge-2,GB200-4_GPUs-PyTorch-PerfSanity-Post-Merge-3,GB200-4_GPUs-PyTorch-PerfSanity-Post-Merge-4,GB200-4_GPUs-PyTorch-PerfSanity-Post-Merg
e-5,GB200-4_GPUs-PyTorch-PerfSanity-Post-Merge-6,GB200-4_GPUs-PyTorch-PerfSanity-Post-Merge-7,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-1,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-2,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-3,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-4,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-5,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-6,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-7,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-8,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-9"
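
A stage list of this size is easier to generate than to type. The following sketch builds the same kind of command from stage prefixes and shard numbers; `--disable-fail-fast` and `--stage-list` are the flags used in the comment above, while the helper names are hypothetical.

```python
from typing import Iterable, Tuple


def build_stage_list(groups: Iterable[Tuple[str, Iterable[int]]]) -> str:
    """Join '<prefix>-<shard>' names for every shard of every group."""
    stages = [f"{prefix}-{shard}" for prefix, shards in groups for shard in shards]
    return ",".join(stages)


def bot_run_command(groups: Iterable[Tuple[str, Iterable[int]]]) -> str:
    """Format the PR comment that asks the CI bot to run the given stages."""
    return f'/bot run --disable-fail-fast --stage-list "{build_stage_list(groups)}"'


if __name__ == "__main__":
    groups = [
        ("GB200-4_GPUs-PyTorch-PerfSanity-Post-Merge", range(1, 8)),
        ("GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge", range(1, 10)),
    ]
    print(bot_run_command(groups))
```

Keeping the shard counts in one place also makes it simpler to keep the command in step with the `testCount` values in `jenkins/L0_Test.groovy`.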

tensorrt-cicd · 2026-06-05T02:58:25Z

PR_Github #52219 [ run ] triggered by Bot. Commit: d2afbdb Link to invocation

tensorrt-cicd · 2026-06-05T07:33:10Z

PR_Github #52292 [ run ] triggered by Bot. Commit: d2afbdb Link to invocation

tensorrt-cicd · 2026-06-05T07:37:15Z

PR_Github #52219 [ run ] completed with state ABORTED. Commit: d2afbdb

Link to invocation

tensorrt-cicd · 2026-06-05T07:47:53Z

PR_Github #52292 [ run ] completed with state FAILURE. Commit: d2afbdb
/LLM/main/L0_MergeRequest_PR pipeline #41601 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

chenfeiz0326 · 2026-06-05T07:58:09Z

/bot run --disable-fail-fast --stage-list "GB200-8_GPUs-2_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE1-GPU4-Post-Merge-1,GB200-8_GPUs-2_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE1-GPU4-Post-Merge-2,GB200-8_GPUs-2_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE1-GPU4-Post-Merge-3,GB200-8_GPUs-2_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE1-GPU4-Post-Merge-4,GB200-8_GPUs-2_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE1-GPU4-Post-Merge-5,GB200-8_GPUs-2_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE1-GPU4-Post-Merge-6,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-1,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-2,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-3,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-4,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-5,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-6,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-7,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-8,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-9,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-10,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-11,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-12,GB200-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE2-GPU8-Post-Merge-13,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-1,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-2,GB20
0-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-3,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-4,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-5,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-6,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-7,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-8,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-9,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-10,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-11,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-12,GB200-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU4-GEN1-NODE8-GPU32-Post-Merge-13,GB300-8_GPUs-2_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE1-GPU4-Post-Merge-1,GB300-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE2-GPU8-Post-Merge-1,GB300-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE2-GPU8-Post-Merge-2,GB300-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE2-GPU8-Post-Merge-3,GB300-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE2-GPU8-Post-Merge-4,GB300-12_GPUs-3_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE2-GPU8-Post-Merge-5,GB300-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE8-GPU32-Post-Merge-1,GB300-36_GPUs-9_Nodes-PyTorch-Disagg-PerfSanity-CTX1-NODE1-GPU2-GEN1-NODE8-GPU32-Post-Merge-2,GB200-4_GPUs-PyTorch-PerfSanity-Post-Merge-1,GB200-4_GPUs-PyTorch-PerfSanity-Post-Merge-2,GB200-4_GPUs-PyTorch-PerfSanity-Post-Merge-3,GB200-4_GPUs-PyTorch-PerfSanity-Post-Merge-4,GB200-4_GPUs-PyTorch-PerfSanity-Post-Merg
e-5,GB200-4_GPUs-PyTorch-PerfSanity-Post-Merge-6,GB200-4_GPUs-PyTorch-PerfSanity-Post-Merge-7,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-1,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-2,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-3,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-4,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-5,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-6,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-7,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-8,GB200-8_GPUs-2_Nodes-PyTorch-PerfSanity-Node2-GPU8-Post-Merge-9"

tensorrt-cicd · 2026-06-05T08:03:40Z

PR_Github #52308 [ run ] triggered by Bot. Commit: d2afbdb Link to invocation

tensorrt-cicd · 2026-06-05T11:27:25Z

PR_Github #52308 [ run ] completed with state FAILURE. Commit: d2afbdb
/LLM/main/L0_MergeRequest_PR pipeline #41614 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>

update

eabbbc1

Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>

chenfeiz0326 requested review from a team as code owners June 4, 2026 14:44

chenfeiz0326 requested review from dpitman-nvda and mlefeb01 June 4, 2026 14:44

github-actions Bot assigned chenfeiz0326 Jun 4, 2026

coderabbitai Bot reviewed Jun 4, 2026

chenfeiz0326 added 3 commits June 4, 2026 08:29

update

4514c57

Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>

update

c90adb7

Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>

update

d2afbdb

Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>

chenfeiz0326 requested a review from a team as a code owner June 5, 2026 02:49

update

1e7216c

Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>

chenfeiz0326 changed the title ~~[None][test] Add GLM-5 into CI Perf Test~~ [None][test] Update K2.5 andGLM-5 into CI Perf Test Jun 5, 2026
