Skip to content

Warn on outdated action versions in user-provided steps#43455

Open
pelikhan with Copilot wants to merge 7 commits into
mainfrom
copilot/scan-user-provided-steps
Open

Warn on outdated action versions in user-provided steps#43455
pelikhan with Copilot wants to merge 7 commits into
mainfrom
copilot/scan-user-provided-steps

Conversation

Copilot AI commented Jul 5, 2026

Copy link
Copy Markdown
Contributor

Agents tend to generate uses: references with stale major versions (e.g. actions/checkout@v3) even when a newer version is pinned in action_pins.json. The compiler now detects this and surfaces an actionable warning at compile time.

Changes

  • pkg/workflow/action_pins.go — new warnIfOutdatedActionVersion helper called from applyActionPinToTypedStep (the single pinning entry point for all user-provided steps, pre-steps, pre-agent-steps, and post-steps):

    • Partial tags (@vN): warns only when the latest major is higher — avoids false positives since @v4 floats to the latest v4.x anyway
    • Exact tags (@vN.M.P): warns when latest > requested via semver compare
    • SHA refs and non-semver refs are silently skipped
    • Deduplicates per repo@version within a compilation via ActionPinWarnings
  • pkg/workflow/action_pins_test.go — unit tests covering all branching cases plus deduplication

Example output

⚠ Action actions/checkout@v3 is outdated; latest available version is v7.0.0.
  Consider upgrading (update the version tag in your workflow file).

Copilot AI and others added 2 commits July 5, 2026 03:58
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title Warn when user-provided action steps reference outdated versions Warn on outdated action versions in user-provided steps Jul 5, 2026
Copilot AI requested a review from pelikhan July 5, 2026 04:00
@pelikhan pelikhan marked this pull request as ready for review July 5, 2026 04:01
Copilot AI review requested due to automatic review settings July 5, 2026 04:01

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds compile-time diagnostics for user-authored uses: steps that reference an outdated GitHub Action version compared to the embedded action_pins.json “latest” pin, helping users avoid stale major tags like @v3.

Changes:

  • Introduces warnIfOutdatedActionVersion to detect outdated semver-style action tags (major-only vs fully-qualified) and emit a deduplicated warning.
  • Calls the warning helper from applyActionPinToTypedStep, the central pinning entry point for user-provided steps.
  • Adds unit tests covering the warning decision tree and deduplication behavior.
Show a summary per file
File Description
pkg/workflow/action_pins.go Adds outdated-version warning helper and wires it into step pinning.
pkg/workflow/action_pins_test.go Adds tests for warning behavior, deduplication, and integration via applyActionPinToTypedStep.

Review details

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 2/2 changed files
  • Comments generated: 3
  • Review effort level: Low

Comment on lines +1810 to +1824
if tt.expectWarn {
if output == "" {
t.Errorf("expected a warning to be printed but got none")
}
if tt.warnContains != "" && !strings.Contains(output, tt.warnContains) {
t.Errorf("warning output %q does not contain %q", output, tt.warnContains)
}
if !strings.Contains(output, tt.latestVer) {
t.Errorf("warning output %q does not mention latest version %q", output, tt.latestVer)
}
} else {
if output != "" {
t.Errorf("expected no warning but got: %q", output)
}
}
Comment thread pkg/workflow/action_pins_test.go Outdated
Comment on lines +1848 to +1851
count := strings.Count(output, "v5.0.0")
if count != 1 {
t.Errorf("expected warning to appear exactly once, got %d occurrences in: %q", count, output)
}
Comment thread pkg/workflow/action_pins_test.go Outdated
Comment on lines +1881 to +1886
if !strings.Contains(output, "v5.0.0") {
t.Errorf("expected outdated-version warning mentioning v5.0.0 in stderr, got: %q", output)
}
if !strings.Contains(output, "v7.0.0") {
t.Errorf("expected outdated-version warning mentioning latest v7.0.0 in stderr, got: %q", output)
}
@github-actions

github-actions Bot commented Jul 5, 2026

Copy link
Copy Markdown
Contributor

Design Decision Gate 🏗️ completed the design decision gate check.

@github-actions

github-actions Bot commented Jul 5, 2026

Copy link
Copy Markdown
Contributor

Test Quality Sentinel completed test quality analysis.

@github-actions

github-actions Bot commented Jul 5, 2026

Copy link
Copy Markdown
Contributor

⚠️ PR Code Quality Reviewer failed during code quality review.

@github-actions

github-actions Bot commented Jul 5, 2026

Copy link
Copy Markdown
Contributor

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions

github-actions Bot commented Jul 5, 2026

Copy link
Copy Markdown
Contributor

Design Decision Gate — ADR Required

This PR makes significant changes to core business logic (271 new lines in pkg/) but does not have a linked Architecture Decision Record (ADR).

Draft ADR committed: docs/adr/43455-warn-on-outdated-action-versions.md — review and complete it before merging.

This PR cannot merge until an ADR is linked in the PR body.

What to do next
  1. Review the draft ADR committed to your branch — it was generated from the PR diff
  2. Complete the missing sections — add context the AI could not infer, refine the decision rationale, and list real alternatives you considered
  3. Commit the finalized ADR to docs/adr/ on your branch
  4. Reference the ADR in this PR body by adding a line such as:

    ADR: ADR-43455: Warn on Outdated Action Versions

Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision.

Why ADRs Matter

ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you.

Michael Nygard ADR Format Reference

An ADR must contain these four sections to be considered complete:

  • Context — What is the problem? What forces are at play?
  • Decision — What did you decide? Why?
  • Alternatives Considered — What else could have been done?
  • Consequences — What are the trade-offs (positive and negative)?

All ADRs are stored in docs/adr/ as Markdown files numbered by PR number (e.g., 0042-use-postgresql.md for PR #42).

🏗️ ADR gate enforced by Design Decision Gate 🏗️ · 50.4 AIC · ⌖ 7.4 AIC · ⊞ 8.4K ·
Comment /review to run again

@github-actions

github-actions Bot commented Jul 5, 2026

Copy link
Copy Markdown
Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 80/100 — Excellent

Analyzed 4 test(s): 3 design, 1 implementation, 0 violation(s).

📊 Metrics (4 tests)
Metric Value
Analyzed 4 (Go: 4, JS: 0)
✅ Design 3 (75%)
⚠️ Implementation 1 (25%)
Edge/error coverage 4 (100%)
Duplicate clusters 0
Inflation YES (204 test lines / 67 prod lines = 3.04:1)
🚨 Violations 0

| Test | File | Classification | Issues |
|---|---|---|
| TestWarnIfOutdatedActionVersion (8 sub-cases) | action_pins_test.go:1716 | design_test | None |
| TestWarnIfOutdatedActionVersion_Deduplication | action_pins_test.go:1796 | design_test | None |
| TestApplyActionPinToTypedStep_OutdatedWarning | action_pins_test.go:1821 | design_test | None |
| TestApplyActionPinToTypedStep_NoOutdatedWarningForCurrentVersion | action_pins_test.go:1863 | implementation_test | Absence-of-output assertion; duplicated stderr-capture boilerplate |

⚠️ Flagged Tests (1)

TestApplyActionPinToTypedStep_NoOutdatedWarningForCurrentVersion (action_pins_test.go:~1863) — implementation_test. The negative assertion (no warning emitted for a current version) is regression-safe but verifies absence of output rather than a user-visible behavioral contract. Additionally, the os.Pipe/stderr-capture boilerplate is duplicated verbatim across all 4 tests — extracting a captureStderr(t, fn func()) helper would reduce noise and make intent clearer.

Test inflation note: 204 test lines added vs. 67 production lines (3.04:1 ratio, above the 2:1 threshold). The extra lines are justified by the os.Pipe stderr-capture boilerplate required in each test plus an 8-row table-driven suite. Coverage is solid — the inflation is cosmetic, not padding.

Verdict

Passed. 25% implementation tests (threshold: 30%). No guideline violations (build tag //go:build !integration present on line 1, no mock libraries used). Test inflation (3.04:1) exceeds the 2:1 threshold due to stderr-capture boilerplate, not artificial test padding.

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

🧪 Test quality analysis by Test Quality Sentinel · 34.5 AIC · ⌖ 12.1 AIC · ⊞ 6.8K ·
Comment /review to run again

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 80/100 — Excellent. 25% implementation tests (threshold: 30%). No guideline violations. Test inflation (3.04:1) is driven by stderr-capture boilerplate, not padding.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skills-Based Review 🧠

Applied /tdd, /codebase-design, and /grill-with-docs — requesting changes on a few correctness and test-quality issues.

📋 Key Themes & Highlights

Key Themes

  • Test fragility: os.Stderr is mutated globally in the new tests without t.Cleanup guards, which can cause flaky failures if t.Fatal fires before the manual restore.
  • Partial-tag edge case: The isPartialTag logic only guards v4 (no dot) but not v4.1 (major.minor), which would generate a false-positive warning for v4.1 when the latest is v4.2.0. A test and a one-line fix are both needed.
  • Nil-data deduplication gap: When data == nil, the dedup cache is bypassed and the warning can fire multiple times per compilation — the contract should be tightened or documented.
  • Message format mismatch: The emitted warning string doesn't match the two-line format shown in the PR description.
  • Test brittleness: Integration tests hardcode v5.0.0/v7.0.0 against the embedded pins; they'll silently stop being meaningful when the embedded pins are bumped.

Positive Highlights

  • ✅ Excellent overall test coverage: 8 unit cases + deduplication test + 2 integration tests — well above average for a new helper.
  • ✅ Clean integration point: calling from applyActionPinToTypedStep ensures all uses: step types (pre-steps, steps, post-steps) are covered by a single hook.
  • ✅ Smart floating-tag semantics: suppressing warnings for same-major @vN refs avoids common false positives.
  • ✅ Deduplication via ActionPinWarnings prevents warning spam in multi-step workflows.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 81.8 AIC · ⌖ 6.24 AIC · ⊞ 6.6K
Comment /matt to run again

Comment thread pkg/workflow/action_pins_test.go Outdated
if err != nil {
t.Fatalf("os.Pipe() failed: %v", err)
}
os.Stderr = w

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] os.Stderr is a process-global: if a subtest panics or t.Fatal fires before the manual restore, all subsequent tests in the same process will write into the closed pipe — causing flaky or incorrect results in parallel CI runs.

💡 Use t.Cleanup for guaranteed restoration

Replace the manual save/restore with a t.Cleanup call immediately after opening the pipe:

os.Stderr = w
t.Cleanup(func() {
    w.Close()
    os.Stderr = oldStderr
})

This ensures stderr is always restored even when t.Fatal / t.FailNow short-circuits the test, and prevents state leakage if tests are ever run with t.Parallel().

@copilot please address this.

Comment thread pkg/workflow/action_pins.go Outdated

// Deduplicate: only emit the warning once per repo@version within this compilation.
cacheKey := "outdated:" + actionpins.FormatCacheKey(actionRepo, rawVersion)
if data != nil {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/codebase-design] When data is nil, deduplication is silently skipped — meaning each call to warnIfOutdatedActionVersion with a nil data pointer will emit a duplicate warning. This is a subtle correctness gap since applyActionPinToTypedStep can be called with nil data in tests.

💡 Suggested fix

Consider a stricter guard: emit the warning only when data != nil, or document explicitly that callers must always supply a non-nil WorkflowData. If nil data is a valid production path, the warning should either be suppressed or dedup should use a package-level set:

// If data is nil, skip the warning to avoid duplicate noise.
if data == nil {
    return
}

@copilot please address this.

Comment thread pkg/workflow/action_pins.go Outdated
// resolve to the latest compatible patch within that major. Only warn when
// the latest available major version is higher — same-major newer patches are
// not "outdated" for a floating tag.
isPartialTag := !strings.Contains(strings.TrimPrefix(rawVersion, "v"), ".")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/codebase-design] The isPartialTag check treats v4.1 (major.minor, no patch) as a *full* tag — but a v4.1 floating ref behaves the same as v4 within the same major: it floats over patch releases. This means v4.1 with latest v4.2.0 would emit a spurious warning even though both are within the same major.

💡 Consider extending the partial-tag logic to major.minor refs

The partial-tag guard could be defined as *fewer than 3 components*:

// Treat tags without a patch component (e.g. "v4", "v4.1") as floating refs.
withoutV := strings.TrimPrefix(rawVersion, "v")
isPartialTag := strings.Count(withoutV, ".") < 2

This avoids a false positive for v4.1 → latest v4.2.0 since both share the same major, and aligns the warning semantics with how GitHub Actions resolves floating tags.

@copilot please address this.

}

// TestWarnIfOutdatedActionVersion tests the warnIfOutdatedActionVersion helper.
func TestWarnIfOutdatedActionVersion(t *testing.T) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] There is no test case for the v4.1 (major.minor, no patch) partial-tag edge case. Given the current logic, v4.1 with latest v4.2.0 would incorrectly emit a warning — a regression test would catch this and any future regressions in the partial-tag boundary.

💡 Suggested additional test cases
{
    name:       "minor partial tag same major does not warn",
    repo:       checkoutRepo,
    rawVersion: "v7.0",
    latestVer:  "v7.1.0",
    expectWarn: false,
},
{
    name:         "minor partial tag older major warns",
    repo:         checkoutRepo,
    rawVersion:   "v6.0",
    latestVer:    "v7.1.0",
    expectWarn:   true,
    warnContains: "v6.0",
},

@copilot please address this.


// TestApplyActionPinToTypedStep_OutdatedWarning checks that applying an outdated action
// pin via applyActionPinToTypedStep emits a warning on stderr.
func TestApplyActionPinToTypedStep_OutdatedWarning(t *testing.T) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] The integration test TestApplyActionPinToTypedStep_OutdatedWarning hardcodes v5.0.0 and v7.0.0 as implicit knowledge of the embedded pins database. If the embedded pins are updated (e.g. a future v8.0.0 ships), the test still passes — but it no longer proves the warning fires for a version that's actually below latest.

💡 Derive expected values from the live embedded data

Use getLatestActionPinByRepo to dynamically look up the real latest version and construct an older version from it, making the test self-adapting:

latestPin, ok := getLatestActionPinByRepo("actions/checkout")
if !ok {
    t.Skip("no embedded pin for actions/checkout")
}
// Use a version guaranteed to be older than whatever is embedded.
olderVersion := fmt.Sprintf("v%d.0.0", latestPin.ParsedVersion().Major-2)
step := &WorkflowStep{Uses: "actions/checkout@" + olderVersion}

@copilot please address this.


warningMsg := fmt.Sprintf("Action %s@%s is outdated; latest available version is %s. Consider upgrading (update the version tag in your workflow file).",
actionRepo, rawVersion, latestVersion)
fmt.Fprintln(os.Stderr, console.FormatWarningMessage(warningMsg))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/grill-with-docs] The warning message emitted to stderr is formatted as a single long sentence with no newline separation from the "Consider upgrading" suggestion. The PR's example output in the description shows a two-line format — the implementation collapses it into one line, which differs from what's advertised.

💡 Match the PR's documented output format

The PR description shows:

⚠ Action actions/checkout@v3 is outdated; latest available version is v7.0.0.
  Consider upgrading (update the version tag in your workflow file).

Splitting the message at the period keeps the warning scannable:

warningMsg := fmt.Sprintf("Action %s@%s is outdated; latest available version is %s.\n  Consider upgrading (update the version tag in your workflow file).",
    actionRepo, rawVersion, latestVersion)

@copilot please address this.

@pelikhan

pelikhan commented Jul 5, 2026

Copy link
Copy Markdown
Collaborator

@copilot invoke pr-finisher skill

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jul 5, 2026

Copy link
Copy Markdown
Contributor Author

@copilot invoke pr-finisher skill

Handled. The follow-up fixes are in bc05783, and local validation is green (make lint, make test-unit, make test, and make agent-report-progress). CI on this head still needs a maintainer re-trigger.

@github-actions

github-actions Bot commented Jul 5, 2026

Copy link
Copy Markdown
Contributor

🤖 PR Triage — Run #7

Field Value
Category feature
Risk 🟡 medium
Score 60 / 100
Action fast_track

Score breakdown: impact 28 + urgency 18 + quality 14

New compiler warning for outdated uses: action versions vs action_pins.json. Handles partial (@vN), exact (@vN.M.P), and SHA refs with deduplication per compilation. 3 files, +384/-0 including unit tests. Useful DX improvement — fast-track for review.

Generated by 🔧 PR Triage Agent · 56.3 AIC · ⌖ 8.84 AIC · ⊞ 5.5K ·

@pelikhan

pelikhan commented Jul 5, 2026

Copy link
Copy Markdown
Collaborator

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jul 5, 2026

Copy link
Copy Markdown
Contributor Author

@copilot fix https://github.com/github/gh-aw/actions/runs/28733083714/job/85202304261?pr=43455

Fixed in a8e881b. The failing CGO test was caused by a stale pr-sous-chef contract assertion; it now matches the current workflow text (Process all eligible PRs per run.). Final validation is clean.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants