performance-metrics: Add --continue-on-failure flag and status tracking#7905

Open
anirudhrb wants to merge 1 commit into cloud-hypervisor:main from anirudhrb:perf_metrics_harness_improvements

Conversation

@anirudhrb
Member

Add a --continue-on-failure CLI flag that allows the test harness to continue executing remaining tests after encountering a failure, instead of aborting immediately. When set, failed tests are recorded with zeroed metrics and a "FAILED" status, the report file is always generated, and the process exits with a non-zero code if any test failed.

Without the flag, the existing fail-fast behavior is preserved.

Also add a "status" field ("PASSED"/"FAILED") to PerformanceTestResult so report consumers can distinguish successful tests from failed ones.
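The control flow described above can be sketched as follows. This is a simplified, dependency-free illustration of the semantics (run loop, fail-fast vs. continue, failure flag), not the actual harness code; `run_all` and the boolean-driven test list are stand-in assumptions.

```rust
// Sketch of --continue-on-failure semantics: record every outcome,
// keep going only when the flag is set, and report whether any test failed.
#[derive(Debug, PartialEq)]
enum TestStatus {
    Passed,
    Failed,
}

struct TestResult {
    name: String,
    status: TestStatus,
}

// Each test is (name, passes) here purely for illustration.
fn run_all(tests: &[(&str, bool)], continue_on_failure: bool) -> (Vec<TestResult>, bool) {
    let mut results = Vec::new();
    let mut has_failure = false;
    for (name, passes) in tests {
        if *passes {
            results.push(TestResult {
                name: name.to_string(),
                status: TestStatus::Passed,
            });
        } else {
            has_failure = true;
            // Failed tests are still recorded so the report is always generated.
            results.push(TestResult {
                name: name.to_string(),
                status: TestStatus::Failed,
            });
            if !continue_on_failure {
                break; // preserve the existing fail-fast behavior
            }
        }
    }
    // The caller would exit non-zero when has_failure is true.
    (results, has_failure)
}
```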

@anirudhrb anirudhrb requested a review from a team as a code owner March 26, 2026 10:30
max: 0.0,
min: 0.0,
status: "FAILED".to_string(),
});
Member

A panicked test could leave things hanging. The failure path should call cleanup_stale_processes() to avoid contaminating subsequent tests.

Thanks

Member Author

Done!
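The suggested failure-path cleanup amounts to invoking a cleanup hook before recording the failed result. A minimal sketch, assuming hypothetical signatures (the real cleanup_stale_processes presumably kills leftover VMM processes; here a log stands in for the side effect):

```rust
// Stand-in for the harness's cleanup routine; the real one would kill
// stale cloud-hypervisor processes left behind by a panicked test.
fn cleanup_stale_processes(log: &mut Vec<String>) {
    log.push("cleanup".to_string());
}

// Failure path: record the failure, then clean up so later tests
// start from a known-good state.
fn handle_failure(test_name: &str, log: &mut Vec<String>) {
    log.push(format!("{test_name} failed"));
    cleanup_stale_processes(log);
}
```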

std_dev: f64,
max: f64,
min: f64,
status: String,
Member

If there are no compatibility concerns regarding the JSON format change, this should at least update docs/performance-metrics.md to describe the changed output format.

Thanks

Member Author

docs/performance-metrics.md doesn't currently define this schema. It just says:

| results            | A list of metrics                        |

But the schema of results is not mentioned there.

if continue_on_failure {
eprintln!("Test '{}' failed: '{e:?}'. Continuing.", test.name);
has_failure = true;
metrics_report.results.push(PerformanceTestResult {
Member

@weltling weltling Mar 26, 2026

I'd suggest unifying PerformanceTestResult creation through a helper/constructor, so the success/failure status and fields stay consistent and future changes don't diverge.

Thanks

Member Author

I'm unsure how a constructor/helper would be useful here. Could you please elaborate?

Member

With this change, the success and failure paths construct PerformanceTestResult independently, duplicating the field list and the status logic. A helper would centralize that, so future field additions only need to be updated in one place.

Thanks

Member Author

Please check now.

Member

LGTM, thanks for addressing this. One could tighten it further with impl Default or similar, but given that both variants are on the same page, it's obvious enough if either one changes.

Thanks
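The helper the review converged on could look like the sketch below: one constructor per outcome, so the field list and status logic live in a single place. Field names follow the snippets quoted in this PR, but the metric fields shown are an assumed subset and the constructors are illustrative, not the merged code.

```rust
// Sketch of a centralized PerformanceTestResult constructor pair.
struct PerformanceTestResult {
    name: String,
    mean: f64,
    std_dev: f64,
    max: f64,
    min: f64,
    status: String,
}

impl PerformanceTestResult {
    fn passed(name: &str, mean: f64, std_dev: f64, max: f64, min: f64) -> Self {
        Self {
            name: name.to_string(),
            mean,
            std_dev,
            max,
            min,
            status: "PASSED".to_string(),
        }
    }

    // Failed tests are recorded with zeroed metrics, as the PR describes.
    fn failed(name: &str) -> Self {
        Self {
            name: name.to_string(),
            mean: 0.0,
            std_dev: 0.0,
            max: 0.0,
            min: 0.0,
            status: "FAILED".to_string(),
        }
    }
}
```

With this shape, adding a metric field means touching only these two constructors rather than every call site.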

@liuw
Member

liuw commented Mar 27, 2026

It looks like we're trying to replicate what cargo nextest already does.

It should be possible to convert the performance metric tests to use the same framework.

To be clear, I'm not asking you to stop this PR. Switching to cargo nextest is a medium to long term thing.

@anirudhrb
Member Author

It looks like we're trying to replicate what cargo nextest already does.

It should be possible to convert the performance metric tests to use the same framework.

Yeah, moving these tests to cargo nextest would be great. We wouldn't have to maintain our own test harness, and we'd get a lot of features for free.

To be clear, I'm not asking you to stop this PR. Switching to cargo nextest is a medium to long term thing.

Ack. I'll continue with this PR for the short term.

@anirudhrb anirudhrb force-pushed the perf_metrics_harness_improvements branch from 7656e98 to 5688819 Compare March 30, 2026 12:32
@rbradford rbradford force-pushed the perf_metrics_harness_improvements branch from 5688819 to 849f729 Compare April 1, 2026 15:07
@anirudhrb anirudhrb force-pushed the perf_metrics_harness_improvements branch from 849f729 to 1ce8c5c Compare April 3, 2026 04:16
std_dev,
max,
min,
status: "PASSED".to_string(),
Contributor

Why can't we use an enum here?

Member Author

Changed it to an enum now.
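An enum-based status keeping the "PASSED"/"FAILED" report strings could look like this sketch. The hand-rolled string mapping is a dependency-free stand-in; the actual change would more plausibly derive serde's Serialize with a rename_all attribute to get the same uppercase JSON values.

```rust
// Status as an enum instead of a raw String: invalid states become
// unrepresentable, and the JSON report still sees "PASSED"/"FAILED".
#[derive(Debug, Clone, Copy, PartialEq)]
enum TestStatus {
    Passed,
    Failed,
}

impl TestStatus {
    // Uppercase strings preserve the report format described in the PR.
    fn as_str(&self) -> &'static str {
        match self {
            TestStatus::Passed => "PASSED",
            TestStatus::Failed => "FAILED",
        }
    }
}
```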

.long("continue-on-failure")
.help("Continue running remaining tests after a test failure")
.num_args(0)
.action(ArgAction::SetTrue)
Contributor

What is the default?

Member Author

@anirudhrb anirudhrb Apr 6, 2026

The default is false. ArgAction::SetTrue in clap inherently defaults to false: an absent flag means false, and passing --continue-on-failure flips it to true.
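Those SetTrue semantics can be simulated without pulling in clap itself; this sketch only models the behavior being discussed (flag absent means false, flag present means true), not clap's actual parser:

```rust
// Dependency-free simulation of clap's ArgAction::SetTrue semantics
// for a boolean flag: false unless the flag appears on the command line.
fn parse_continue_on_failure(args: &[&str]) -> bool {
    args.iter().any(|a| *a == "--continue-on-failure")
}
```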

Add a --continue-on-failure CLI flag that allows the test harness to
continue executing remaining tests after encountering a failure, instead
of aborting immediately. When set, failed tests are recorded with zeroed
metrics and a "FAILED" status, the report file is always generated, and
the process exits with a non-zero code if any test failed.

Without the flag, the existing fail-fast behavior is preserved.

Also add a "status" field ("PASSED"/"FAILED") to PerformanceTestResult
so report consumers can distinguish successful tests from failed ones.

Signed-off-by: Anirudh Rayabharam <anrayabh@microsoft.com>
@anirudhrb anirudhrb force-pushed the perf_metrics_harness_improvements branch from 1ce8c5c to e225eea Compare April 6, 2026 12:01