Make Copilot/Claude harness retry policy configurable via GH_AW_HARNESS_* #43051

Open
pelikhan with Copilot wants to merge 18 commits into main from copilot/make-retry-count-configurable

Conversation

Copilot AI commented Jul 2, 2026

Contributor

Copilot and Claude harness retries were hardcoded to a short fixed window, which made sustained transient API outages fail the run before safe outputs could be written. This change keeps the existing defaults but allows workflows to widen the retry window without patching the built-in harnesses.

  • Harness retry policy

    • Added env-driven retry configuration to copilot_harness.cjs and claude_harness.cjs:
      • GH_AW_HARNESS_MAX_RETRIES
      • GH_AW_HARNESS_INITIAL_DELAY_MS
      • GH_AW_HARNESS_BACKOFF_MULTIPLIER
      • GH_AW_HARNESS_MAX_DELAY_MS
    • Preserved current defaults (max retries 3, initial delay 5000 ms, backoff multiplier 2, max delay 60000 ms).
    • Added parsing/validation with safe fallback to defaults for invalid values.
    • Clamped max delay to at least the initial delay to avoid invalid backoff windows.
  • Workflow-facing configuration

    • Documented the intended workflow-level usage via engine.env, so retry tuning can be done in frontmatter without replacing the harness.
  • Coverage

    • Added focused harness tests for:
      • default retry config
      • env overrides
      • invalid env values
      • max-delay clamping
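
For readers who want a concrete picture of the parsing, fallback, and clamping behavior listed above, here is a minimal sketch. It is not the code from this PR: `DEFAULTS`, `readNumber`, and `loadRetryConfig` are hypothetical names, and the exact validation rules (for example, whether `0` retries or a fractional multiplier is accepted) are assumptions. Later commits in this PR also mention a max-retries cap; its value is not stated in the conversation, so the sketch leaves it out.

```javascript
// Illustrative sketch only; names and validation rules are assumptions, not the PR's code.
const DEFAULTS = Object.freeze({
  maxRetries: 3,
  initialDelayMs: 5000,
  backoffMultiplier: 2,
  maxDelayMs: 60000,
});

// Convert an env string to a number; return the fallback when it is missing or invalid.
function readNumber(raw, fallback, { integer = true, min = 0 } = {}) {
  if (raw === undefined || String(raw).trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isFinite(value) || value < min) return fallback;
  if (integer && !Number.isInteger(value)) return fallback;
  return value;
}

// Build the retry policy from GH_AW_HARNESS_* variables, defaulting each field independently.
function loadRetryConfig(env = process.env) {
  const maxRetries = readNumber(env.GH_AW_HARNESS_MAX_RETRIES, DEFAULTS.maxRetries);
  const initialDelayMs = readNumber(env.GH_AW_HARNESS_INITIAL_DELAY_MS, DEFAULTS.initialDelayMs, { min: 1 });
  const backoffMultiplier = readNumber(env.GH_AW_HARNESS_BACKOFF_MULTIPLIER, DEFAULTS.backoffMultiplier, {
    integer: false,
    min: 1,
  });
  const requestedMaxDelayMs = readNumber(env.GH_AW_HARNESS_MAX_DELAY_MS, DEFAULTS.maxDelayMs, { min: 1 });
  // Clamp so the cap can never be smaller than the first delay.
  const maxDelayMs = Math.max(requestedMaxDelayMs, initialDelayMs);
  return { maxRetries, initialDelayMs, backoffMultiplier, maxDelayMs };
}
```

Calling `loadRetryConfig({})` in this sketch returns the four defaults, while `loadRetryConfig({ GH_AW_HARNESS_INITIAL_DELAY_MS: "90000", GH_AW_HARNESS_MAX_DELAY_MS: "1000" })` raises the max delay to 90000 so the backoff window stays valid.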

Example workflow override:

engine:
  id: copilot
  env:
    GH_AW_HARNESS_MAX_RETRIES: "6"
    GH_AW_HARNESS_INITIAL_DELAY_MS: "10000"
    GH_AW_HARNESS_BACKOFF_MULTIPLIER: "2"
    GH_AW_HARNESS_MAX_DELAY_MS: "180000"
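
To see roughly how much this override widens the retry window, the delays can be worked out by hand. The sketch below assumes a plain capped exponential backoff, `delay(n) = min(initialDelayMs * backoffMultiplier^n, maxDelayMs)`, with no jitter; the conversation does not confirm the harness's exact formula, and `backoffSchedule` is a hypothetical helper written only for this illustration.

```javascript
// Illustrative sketch; assumes capped exponential backoff without jitter.
function backoffSchedule({ maxRetries, initialDelayMs, backoffMultiplier, maxDelayMs }) {
  const delays = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // Grow the delay geometrically, then cap it at maxDelayMs.
    const uncapped = initialDelayMs * Math.pow(backoffMultiplier, attempt);
    delays.push(Math.min(uncapped, maxDelayMs));
  }
  return delays;
}

// Defaults preserved by this PR: 3 retries, 5000 ms, multiplier 2, 60000 ms cap.
const defaults = backoffSchedule({ maxRetries: 3, initialDelayMs: 5000, backoffMultiplier: 2, maxDelayMs: 60000 });

// Values from the example workflow override.
const widened = backoffSchedule({ maxRetries: 6, initialDelayMs: 10000, backoffMultiplier: 2, maxDelayMs: 180000 });

console.log(defaults); // [ 5000, 10000, 20000 ]
console.log(widened); // [ 10000, 20000, 40000, 80000, 160000, 180000 ]
```

Under that assumption the defaults wait about 35 seconds in total across retries, while the override waits about 490 seconds (a little over 8 minutes), with only the sixth delay hitting the 180000 ms cap.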

Generated by 👨‍🍳 PR Sous Chef · 6.07 AIC · ⌖ 7.49 AIC · ⊞ 6.4K ·

Copilot AI and others added 3 commits July 2, 2026 23:56
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Make Copilot harness retry count and backoff cap configurable" to "Make Copilot/Claude harness retry policy configurable via GH_AW_HARNESS_*" Jul 3, 2026
Copilot AI requested a review from pelikhan July 3, 2026 00:02

pelikhan commented Jul 3, 2026

Collaborator

@copilot refactor the configuration code so that all harnesses can use it. Reuse the existing function to read an integer from env in JavaScript.
These are configuration vars that should be reused by every agentic engine.

Copilot AI and others added 4 commits July 3, 2026 00:39
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jul 3, 2026

Contributor Author

@copilot refactor configuration code so that all harnesses can use it. Reuse existing function to read integer from env in JavaScript.
These are configuration vars that should be reused by every agenti...

Addressed in 64ddfca.


github-actions Bot commented Jul 3, 2026

Contributor

Triage Result

| Field | Value |
| --- | --- |
| Category | feature |
| Risk | Medium |
| Priority Score | 57/100 |
| Score Breakdown | Impact: 30 · Urgency: 15 · Quality: 12 |
| Recommended Action | fast_track |

Rationale: Makes harness retry policy configurable via GH_AW_HARNESS_* env vars. Defaults preserved. Larger diff (+297/-74, 7 files) for a config change — review to confirm backward compat. Draft state needs resolution.

Generated by 🔧 PR Triage Agent · 98.6 AIC · ⌖ 9.89 AIC · ⊞ 5.5K ·

pelikhan marked this pull request as ready for review July 3, 2026 02:35
Copilot AI review requested due to automatic review settings July 3, 2026 02:35

github-actions Bot commented Jul 3, 2026

Contributor

Test Quality Sentinel completed test quality analysis.


pelikhan commented Jul 3, 2026

Collaborator

@copilot the single-string legacy format is still accepted as the "short form". Test both.

…nly long form

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jul 3, 2026

Contributor Author

@copilot single string legacy format is still accepted as "short form". Test both.

Done in the latest commit. The existing string-form test was renamed to "object format - harness string short form (legacy)" and a new "object format - harness sub-object use-only (long form)" test was added alongside it, both asserting that HarnessScript is set to the same value.

@gh-aw-bot

Collaborator

@copilot please run the pr-finisher skill, address unresolved review comments, and rerun checks once the branch is up to date.

Generated by 👨‍🍳 PR Sous Chef · 6.07 AIC · ⌖ 7.49 AIC · ⊞ 6.4K ·

Copilot AI and others added 2 commits July 3, 2026 04:32
…dependency, add max-retries cap

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>
…ility

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

pelikhan commented Jul 4, 2026

Collaborator

@copilot merge main and recompile, fix conflicts

Copilot AI and others added 2 commits July 4, 2026 03:36
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jul 4, 2026

Contributor Author

@copilot merge main and recompile, fix conflicts

Done. Merged main into the branch and resolved conflicts in claude_harness.cjs and copilot_harness.cjs — kept the env-driven variable names (maxRetries, backoffMultiplier, maxDelayMs) from our branch while incorporating the soft-timeout guard checks added in main. Recompiled all 258 workflows successfully.

@gh-aw-bot

Collaborator

@copilot please run the pr-finisher skill, address the unresolved review feedback from github-actions (review: GH_AW_HARNESS retry policy correctness/security concerns), add and link the required ADR in the PR body, refresh the branch from main if needed, and rerun checks once updates are pushed.

Generated by 👨‍🍳 PR Sous Chef · 5.58 AIC · ⌖ 7.65 AIC · ⊞ 6.7K ·
Comment /souschef to run again

Copilot AI and others added 2 commits July 4, 2026 03:56
Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>
…DR-43051

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot AI commented Jul 4, 2026

Contributor Author

@copilot please run the pr-finisher skill, address the unresolved review feedback from github-actions (review: GH_AW_HARNESS retry policy correctness/security concerns), add and link the required AD...

Done. All unresolved review threads were already addressed in prior commits (all five threads are marked outdated or their concerns resolved). Added ADR-43051 at docs/adr/43051-configurable-harness-retry-policy.md documenting the design decision. Branch is up to date with main (merged in the previous round). Local validation passes (make fmt, make lint — one pre-existing unrelated lint issue in arc_dind_artifacts.go, harness tests all green). CI needs a human re-trigger after this push.

Development

Successfully merging this pull request may close these issues.

Make Copilot/Claude harness retry count and backoff cap configurable (transient API outages)

4 participants