logs: keep paging in all-workflows mode when full batches filter to zero runs#39741
Conversation
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Fixes gh aw logs pagination in all-workflows mode so iteration doesn’t terminate early when a fully-fetched raw page filters down to zero (e.g., skipped/cancelled-heavy windows). The orchestrator now continues paging based on raw API exhaustion and advances the cursor using the oldest run timestamp from the raw batch.
Changes:
- Reworked pagination stop/continue logic to use
totalFetched < batchSizerather than post-filter emptiness. - Propagated the oldest raw
CreatedAttimestamp fromlistWorkflowRunsWithPaginationto drive cursor advancement even when filtered batches are empty. - Added focused unit tests for the pagination helper logic; updated several compiled workflow
.lock.ymlschemas forpull_request_number.
Show a summary per file
| File | Description |
|---|---|
| pkg/cli/logs_orchestrator.go | Adjusts the main logs paging loop to continue on filtered-empty full pages and advances cursor from raw batch oldest timestamp. |
| pkg/cli/logs_orchestrator_pagination_test.go | Adds unit tests for the new pagination helper functions. |
| pkg/cli/logs_github_api.go | Extends ListWorkflowRunsOptions and returns the oldest raw run timestamp to the orchestrator for pagination cursor progression. |
| .github/workflows/test-quality-sentinel.lock.yml | Updates pull_request_number schema to issueOrPRNumber. |
| .github/workflows/smoke-copilot.lock.yml | Updates pull_request_number schema to issueOrPRNumber. |
| .github/workflows/smoke-copilot-arm.lock.yml | Updates pull_request_number schema to issueOrPRNumber. |
| .github/workflows/smoke-copilot-aoai-entra.lock.yml | Updates pull_request_number schema to issueOrPRNumber. |
| .github/workflows/smoke-copilot-aoai-apikey.lock.yml | Updates pull_request_number schema to issueOrPRNumber. |
| .github/workflows/smoke-claude.lock.yml | Updates pull_request_number schema to issueOrPRNumber. |
| .github/workflows/security-review.lock.yml | Updates pull_request_number schema to issueOrPRNumber. |
| .github/workflows/refiner.lock.yml | Updates pull_request_number schema to issueOrPRNumber. |
| .github/workflows/pr-triage-agent.lock.yml | Updates pull_request_number schema to issueOrPRNumber. |
| .github/workflows/pr-nitpick-reviewer.lock.yml | Updates pull_request_number schema to issueOrPRNumber. |
| .github/workflows/pr-code-quality-reviewer.lock.yml | Updates pull_request_number schema to issueOrPRNumber. |
| .github/workflows/mattpocock-skills-reviewer.lock.yml | Updates pull_request_number schema to issueOrPRNumber. |
| .github/workflows/grumpy-reviewer.lock.yml | Updates pull_request_number schema to issueOrPRNumber. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 16/16 changed files
- Comments generated: 2
|
@copilot review all comments and address the unresolved review feedback on the pagination cursor handling.
|
|
✅ PR Code Quality Reviewer completed the code quality review. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
|
❌ Design Decision Gate 🏗️ failed during design decision gate check. |
|
✅ Test Quality Sentinel completed test quality analysis. |
🧪 Test Quality Sentinel Report✅ Test Quality Score: 100/100 — Excellent
📊 Metrics & Test Classification (5 scenarios analyzed)
Go: 1 file ( Verdict
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /diagnose and /tdd — 5 observations, no blocking issues.
📋 Key Themes & Highlights
Key Themes
- Missing orchestrator-level regression test: the two extracted helpers are well-tested in isolation, but the
continuepath that is the heart of the bug fix has no end-to-end test. A swap ofcontinue→breakin the loop would pass all new tests. - Cursor stagnation not guarded: in repos with many same-second runs a full batch can repeatedly produce the same
beforeDate, spinning toMaxIterationssilently. - Out-param in options struct:
OldestFetchedCreatedAt *time.Timemixes input and output concerns inListWorkflowRunsOptions; an extra return value would be cleaner Go. - Undocumented ordering assumption: both
selectPaginationCursorDateand theruns[totalFetched-1]assignment rely on descending-time sort order from the API without documenting that contract.
Positive Highlights
- ✅ Root cause is correctly identified and addressed — stop condition now uses raw API exhaustion, not post-filter emptiness.
- ✅
shouldStopPaginationandselectPaginationCursorDateextracted as pure, testable helpers — good decomposition. - ✅ PR description is clear and even includes a representative code snippet; existing comment in the loop explains the old bug well.
- ✅ Verbose logging on both the continue and fallback-break paths makes the new behaviour observable.
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer
| if ok { | ||
| t.Fatalf("expected no cursor, got %s", cursor) | ||
| } | ||
| } |
There was a problem hiding this comment.
[/diagnose] The helpers are tested in isolation, but the orchestrator loop's continue path — the core of this bug fix — has no regression test. If someone swaps continue for break in the if len(runs) == 0 block, all these tests still pass.
💡 Suggested approach
Add a test that exercises the orchestrator with a mock listWorkflowRunsWithPagination that returns:
- A full batch (
totalFetched == batchSize) with zero filtered runs - A partial batch that terminates pagination
Assert the API is called at least twice and beforeDate correctly advances. This directly documents and protects the fix.
| totalFetched: 250, | ||
| batchSize: 250, | ||
| want: false, | ||
| }, |
There was a problem hiding this comment.
[/tdd] TestShouldStopPagination is missing the totalFetched == 0 case. It correctly returns true (stop) via the formula, but an explicit test documents the intent and would catch a future refactor that accidentally changes the semantics for empty batches.
💡 Suggested addition
{
name: "stop when no runs fetched",
totalFetched: 0,
batchSize: 250,
want: true,
},| if len(filteredRuns) == 0 { | ||
| return "", false | ||
| } | ||
| return filteredRuns[len(filteredRuns)-1].CreatedAt.Format(time.RFC3339), true |
There was a problem hiding this comment.
[/diagnose] selectPaginationCursorDate silently assumes filteredRuns is sorted descending (newest first) so the last element is the oldest. This matches GitHub's default run ordering, but the assumption isn't documented or enforced — a future change that reorders runs before calling this function would silently pick the wrong cursor.
💡 Suggestion
Add a brief comment to document the contract:
// filteredRuns must be ordered newest-first (GitHub API default);
// the last element is treated as the oldest.The same assumption applies in logs_github_api.go where runs[totalFetched-1].CreatedAt is used for OldestFetchedCreatedAt.
| break | ||
| } | ||
|
|
||
| cursor, ok := selectPaginationCursorDate(nil, oldestFetchedCreatedAt) |
There was a problem hiding this comment.
[/diagnose] No guard against cursor stagnation: if GitHub returns a full batch where all runs share the same CreatedAt timestamp as the previous batch's oldest, beforeDate is set to the same value and the next API call returns an identical page. The loop then spins until MaxIterations, silently truncating results.
💡 Suggested safeguard
Track the previous cursor and break (or log a warning) if it hasn't advanced:
if cursor == beforeDate {
if verbose {
fmt.Fprintln(os.Stderr, console.FormatWarningMessage("Pagination cursor did not advance; stopping to prevent infinite loop"))
}
break
}This is a degenerate case, but in high-volume repos with many same-second runs it is reachable.
| RepoOverride string // fetch from a specific repository instead of current | ||
| // OldestFetchedCreatedAt, when set, is populated with the oldest run creation | ||
| // timestamp returned by GitHub in this batch before any workflow/conclusion filtering. | ||
| OldestFetchedCreatedAt *time.Time |
There was a problem hiding this comment.
[/diagnose] OldestFetchedCreatedAt is an output field on an options struct, mixing input and output concerns in ListWorkflowRunsOptions. Future callers reading the struct type won't immediately know this field is populated by the function, not consumed by it.
💡 Alternative
Return it as an additional named return value alongside totalFetched:
func listWorkflowRunsWithPagination(opts ListWorkflowRunsOptions) ([]WorkflowRun, int, time.Time, error)
// ^^^^^^^^^^^ oldest fetchedThis keeps the options struct as pure input and makes the out-param visible at every call site.
In all-workflows mode,
gh aw logscould stop after a filtered-empty batch even when GitHub returned a full page of runs, causing silent truncation in high-CI repos (especially with skipped/cancelled-heavy agentic workflows). This change makes pagination termination depend on raw API exhaustion, not post-filter emptiness.Pagination stop condition
totalFetched < batchSize).Cursor progression from raw batch
listWorkflowRunsWithPaginationback to the orchestrator.Focused pagination logic coverage