Fix audit comparison turns delta for cached summaries by Copilot · Pull Request #43501 · github/gh-aw

Copilot · 2026-07-05T07:25:49Z

audit comparison could report comparison.delta.turns.after = 0 even when the run had non-zero turns, causing false turns_decrease classifications. The bug occurred on cached-summary rendering paths where Run.Turns could be stale/zero while Metrics.Turns was correct.

Current-run snapshot backfill in cached path
- Updated processedRunFromSummary to backfill processedRun.Run.Turns from summary.Metrics.Turns only when Run.Turns is zero.
- Preserves existing non-zero Run.Turns values.
Regression coverage
- Added tests for:
  - backfilling turns from metrics when cached run turns are missing
  - preserving existing run turns when already present

if processedRun.Run.Turns == 0 && summary.Metrics.Turns > 0 {
	processedRun.Run.Turns = summary.Metrics.Turns
}

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot

Pull request overview

Fixes an audit comparison bug where cached-summary rendering could incorrectly treat Run.Turns as 0 (stale) even when the cached Metrics.Turns was non-zero, leading to incorrect turns_decrease classifications.

Changes:

Backfill ProcessedRun.Run.Turns from RunSummary.Metrics.Turns when Run.Turns is 0 and metrics turns are available.
Add regression tests covering both the backfill behavior and the “do not overwrite non-zero turns” behavior.

Show a summary per file

File	Description
pkg/cli/audit.go	Backfills `processedRun.Run.Turns` from cached summary metrics when the cached run metadata is missing turns.
pkg/cli/audit_test.go	Adds unit tests to prevent regressions in cached-summary turns handling.

Review details

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Files reviewed: 2/2 changed files
Comments generated: 0
Review effort level: Low

github-actions · 2026-07-05T07:58:11Z

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

github-actions · 2026-07-05T07:58:16Z

✅ PR Code Quality Reviewer completed the code quality review.

github-actions · 2026-07-05T07:58:16Z

✅ Test Quality Sentinel completed test quality analysis.

github-actions · 2026-07-05T07:58:17Z

✅ Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #43501 does not have the 'implementation' label and has only 26 new lines of code in business logic directories (threshold: >100).

github-actions

Review: Fix audit comparison turns delta for cached summaries

The fix is correct, well-scoped, and well-tested.

Root cause: processedRunFromSummary copies summary.Run into processedRun.Run, but Run.Turns can be zero when the summary was cached before that field was populated — while summary.Metrics.Turns is always authoritative (computed from logs). The backfill guard (Run.Turns == 0 && Metrics.Turns > 0) is the right condition and correctly preserves any pre-existing non-zero value.

What was reviewed:

processedRunFromSummary fix in pkg/cli/audit.go — minimal, correct, placed before the LogsPath assignment
Two regression tests in pkg/cli/audit_test.go — cover the backfill path and preserve-existing-turns path with clear assertions
Downstream in audit_comparison.go — delta.Turns.After now gets the correct non-zero value, fixing the false turns_decrease classification

No issues found. LGTM.

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

proxy.golang.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "proxy.golang.org"

See Network Configuration for more information.

🧵 Reviewed using Impeccable skills by Impeccable Skills Reviewer · 31.1 AIC · ⌖ 4.99 AIC · ⊞ 4.8K

github-actions · 2026-07-05T08:04:54Z

🧪 Test Quality Sentinel Report

✅ Test Quality Score: 90/100 — Excellent

Analyzed 2 test(s): 2 design, 0 implementation, 0 violation(s).

📊 Metrics (2 tests)

Metric	Value
Analyzed	2 (Go: 2, JS: 0)
✅ Design	2 (100%)
⚠️ Implementation	0 (0%)
Edge/error coverage	2 (100%)
Duplicate clusters	0
Inflation	⚠️ Yes (7.67:1 — 23 test lines / 3 prod lines)
🚨 Violations	0

Test	File	Classification	Issues
`TestProcessedRunFromSummaryBackfillsTurnsFromMetrics`	`pkg/cli/audit_test.go`	design_test / behavioral_contract / high_value	None
`TestProcessedRunFromSummaryPreservesExistingTurns`	`pkg/cli/audit_test.go`	design_test / behavioral_contract / high_value	None

Verdict

✅ Passed. 0% implementation tests (threshold: 30%). Both tests verify distinct behavioral contracts of processedRunFromSummary: (1) backfilling turns from metrics when the run has no turn data, and (2) preserving existing turns when present. All assertions include descriptive failure messages. Build tag //go:build !integration is present.

⚠️ Inflation note: test-to-production line ratio is 7.67:1, above the 2:1 threshold. This is expected for a focused 3-line production fix — the test setup requires more lines than the conditional being tested. The tests are substantive, not inflated.

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

🧪 Test quality analysis by Test Quality Sentinel · 28.5 AIC · ⌖ 12.7 AIC · ⊞ 6.8K · ◷
_{Comment /review to run again}

github-actions

✅ Test Quality Sentinel: 90/100. 0% implementation tests (threshold: 30%). Both tests cover distinct behavioral contracts with descriptive assertions and no violations.

github-actions · 2026-07-05T08:05:44Z

Skills-Based Review 🧠

Applied /diagnosing-bugs and /tdd — two minor suggestions on this clean, well-scoped bug fix.

📋 Key Themes & Highlights

Suggestions

Missing code comment (audit.go:501) — The backfill condition is non-obvious without knowing why Run.Turns can be stale on cached-summary paths. Without a comment, a future developer may treat it as dead code and remove it. See inline comment.
Uncovered edge case (audit_test.go:130) — No test covers the case where both Run.Turns and Metrics.Turns are zero. The code handles it correctly (no backfill), but documenting it as an explicit invariant protects against future drift. See inline comment.

Positive Highlights

✅ Surgical, minimal fix — 3 lines targeting only the stale-turns problem
✅ Excellent test-to-code ratio: 23 lines of test coverage for 3 lines of logic
✅ Defensive condition (Run.Turns == 0 && Metrics.Turns > 0) correctly preserves non-zero existing values
✅ Test names read as specifications — easy to understand without reading the body
✅ PR description clearly explains the root cause and fix approach

@copilot please address the review comments above.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 45.6 AIC · ⌖ 5.51 AIC · ⊞ 6.6K · ◷
_{Comment /matt to run again}

github-actions

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 45.6 AIC · ⌖ 5.51 AIC · ⊞ 6.6K
_{Comment /matt to run again}

github-actions · 2026-07-05T08:05:46Z

 		GitHubRateLimitUsage:    summary.GitHubRateLimitUsage,
 		JobDetails:              summary.JobDetails,
 	}
+	if processedRun.Run.Turns == 0 && summary.Metrics.Turns > 0 {


[/diagnosing-bugs] The fix is correct, but the why is missing — future readers will not know that Run.Turns can be stale/zero specifically in the cached-summary rendering path.

💡 Suggested comment

// Run.Turns may be zero on cached-summary paths where the RunSummary was // serialised before the run completed. Metrics.Turns is populated from log // parsing and is authoritative; backfill here so that audit comparison deltas // are computed from an accurate value. if processedRun.Run.Turns == 0 && summary.Metrics.Turns > 0 { processedRun.Run.Turns = summary.Metrics.Turns }

Without this context, the next developer may remove it as apparently defensive dead code.

@copilot please address this.

Added the explanatory comment in 3d0db57.

github-actions · 2026-07-05T08:05:46Z

 	}
 }

+func TestProcessedRunFromSummaryBackfillsTurnsFromMetrics(t *testing.T) {


[/tdd] Consider also testing the edge case where both Run.Turns and Metrics.Turns are zero — the current fix silently leaves turns as 0 in that scenario, which would still produce a false turns_decrease in comparison. A dedicated test makes the invariant explicit.

💡 Suggested test

func TestProcessedRunFromSummaryBothTurnsZero(t *testing.T) { summary := &RunSummary{ Run: WorkflowRun{DatabaseID: 789, Turns: 0}, Metrics: LogMetrics{Turns: 0}, } processed := processedRunFromSummary(summary, "/tmp/run-output") // Both are zero — no backfill occurs, and turns should remain 0. assert.Equal(t, 0, processed.Run.Turns, "turns should remain zero when neither Run.Turns nor Metrics.Turns is available") }

Documents the known limitation: if a run genuinely has zero turns and Metrics.Turns is also zero, the delta will still reflect 0.

@copilot please address this.

Added TestProcessedRunFromSummaryBothTurnsZero in 3d0db57.

github-actions

REQUEST_CHANGES — incomplete fix scope

The processedRunFromSummary backfill is correct, but the same staleness bug exists in computeRunMetricsDiff (audit_diff.go) which reads summary.Run.Turns directly and is not patched by this PR. Old cached summaries will still produce zero turns in audit diff output. See inline comments for details and suggested fixes.

Review summary

High: computeRunMetricsDiff reads summary{1,2}.Run.Turns directly without the backfill — audit diff remains broken for old cached summaries.

Medium: Test suite is missing the edge case where both Run.Turns == 0 and Metrics.Turns == 0, leaving the Metrics.Turns > 0 guard untested.

🔎 Code quality review by PR Code Quality Reviewer · 112.3 AIC · ⌖ 6.55 AIC · ⊞ 5.4K
_{Comment /review to run again}

github-actions · 2026-07-05T08:09:42Z

+		Metrics: LogMetrics{Turns: 34},
+	}
+
+	processed := processedRunFromSummary(summary, "/tmp/run-output")


Missing test for the both-zeros edge case (Run.Turns == 0, Metrics.Turns == 0). The two new tests cover backfill (zero→non-zero) and preservation (non-zero→non-zero). Neither exercises the path where Metrics.Turns == 0 too — meaning the guard summary.Metrics.Turns > 0 is never exercised as false. A future regression that drops the > 0 check and unconditionally writes summary.Metrics.Turns back to processedRun.Run.Turns would not be caught by the current suite.

💡 Suggested test

func TestProcessedRunFromSummaryBothTurnsZero(t *testing.T) { summary := &RunSummary{ Run: WorkflowRun{DatabaseID: 789, Turns: 0}, Metrics: LogMetrics{Turns: 0}, } processed := processedRunFromSummary(summary, "/tmp/run-output") // Neither field has data; turns should remain 0, not be set to Metrics.Turns. assert.Equal(t, 0, processed.Run.Turns, "run turns should remain 0 when both Run.Turns and Metrics.Turns are 0") }

Added TestProcessedRunFromSummaryBothTurnsZero and also applied the same Run.Turns backfill from Metrics.Turns inside computeRunMetricsDiff (audit_diff.go) so audit diff is fixed for old cached summaries too — see 3d0db57.

gh-aw-bot · 2026-07-05T09:32:19Z

@copilot please run the pr-finisher skill, review the remaining feedback, ensure the branch is current, and rerun checks.

Generated by 👨‍🍳 PR Sous Chef · 16.1 AIC · ⌖ 6.23 AIC · ⊞ 4.6K · ◷
_{Comment /souschef to run again}

…backfill, add both-zeros test Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Initial plan

46339c7

Copilot AI assigned Copilot and pelikhan Jul 5, 2026

Copilot AI linked an issue Jul 5, 2026 that may be closed by this pull request

[cli-tools-test] [exploratory-test] audit: comparison.delta.turns.after always reports 0 regardless of actual turn count #43489

Open

Copilot started work on behalf of pelikhan July 5, 2026 07:26 View session

fix(audit): backfill cached run turns from summary metrics

8aa9a07

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix comparison.delta.turns.after reporting incorrect value~~ Fix audit comparison turns delta for cached summaries Jul 5, 2026

Copilot finished work on behalf of pelikhan July 5, 2026 07:39

Copilot AI requested a review from pelikhan July 5, 2026 07:39

pelikhan marked this pull request as ready for review July 5, 2026 07:40

Copilot AI review requested due to automatic review settings July 5, 2026 07:40

Copilot started reviewing on behalf of pelikhan July 5, 2026 07:40 View session

Copilot AI reviewed Jul 5, 2026

View reviewed changes

github-actions Bot mentioned this pull request Jul 5, 2026

[aw] No-Op Runs #43499

Open

github-actions Bot approved these changes Jul 5, 2026

View reviewed changes

github-actions Bot reviewed Jul 5, 2026

View reviewed changes

github-actions Bot requested changes Jul 5, 2026

View reviewed changes

github-actions Bot mentioned this pull request Jul 5, 2026

[pr-sous-chef] Run report — 13 nudged, 0 skipped #43518

Closed

Copilot started work on behalf of gh-aw-bot July 5, 2026 09:32 View session

fix(audit): address review feedback - add comment, fix audit_diff.go …

ec0eb83

…backfill, add both-zeros test Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot finished work on behalf of gh-aw-bot July 5, 2026 09:46

Copilot AI requested a review from gh-aw-bot July 5, 2026 09:46

Uh oh!

Conversation

Copilot AI commented Jul 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Review details

Uh oh!

github-actions Bot commented Jul 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jul 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jul 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jul 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Review: Fix audit comparison turns delta for cached summaries

Uh oh!

github-actions Bot commented Jul 5, 2026

🧪 Test Quality Sentinel Report

Verdict

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Jul 5, 2026

Skills-Based Review 🧠

Suggestions

Positive Highlights

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jul 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jul 5, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jul 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jul 5, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

REQUEST_CHANGES — incomplete fix scope

Uh oh!

github-actions Bot Jul 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jul 5, 2026

Choose a reason for hiding this comment

Uh oh!

gh-aw-bot commented Jul 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Copilot AI commented Jul 5, 2026 •

edited

Loading

github-actions Bot commented Jul 5, 2026 •

edited

Loading

github-actions Bot commented Jul 5, 2026 •

edited

Loading

github-actions Bot commented Jul 5, 2026 •

edited

Loading

github-actions Bot commented Jul 5, 2026 •

edited

Loading