Harden issue-intent confidence handling in safe-output ingestion by Copilot · Pull Request #43168 · github/gh-aw

Copilot · 2026-07-03T10:26:14Z

Issue-intent safe outputs were too brittle: malformed optional confidence / rationale metadata could cause otherwise valid set_issue_type and set_issue_field calls to be dropped during NDJSON ingestion. This change makes intent guidance more explicit and treats invalid optional intent metadata as removable enrichment instead of a hard failure.

Clarify issue-intent tool guidance
- Update the generated INTENT suffix for issue-intent-aware tools to explicitly require:
  - rationale as a string
  - confidence as a string enum: LOW | MEDIUM | HIGH

Make optional intent metadata lenient
- Strip invalid top-level rationale / confidence from:
  - set_issue_type
  - set_issue_field
- Strip invalid structured-label rationale / confidence from issue-intent label payloads
- Preserve the rest of the safe-output item when required fields are valid

Add regression coverage
- Cover the updated description suffix
- Cover validator behavior for malformed optional intent fields
- Cover NDJSON collection preserving valid tool calls after intent-field stripping

Example of the new behavior:

{"type":"set_issue_field","field_name":"Priority","value":"P1","confidence":"0.95","rationale":{"why":"bad"}}

Now ingests as:

{"type":"set_issue_field","field_name":"Priority","value":"P1"}

instead of rejecting the entire message.
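
The before/after pair above can be reproduced with a minimal sketch like the following. It is not the validator's actual code: `stripInvalidIntentMetadata` and `CONFIDENCE_LEVELS` are hypothetical names, and the case-insensitive match is an assumption based on the enum handling shown in the review diff.

```javascript
// Hypothetical sketch: these names are illustrative, not the real validator API.
const CONFIDENCE_LEVELS = new Set(["LOW", "MEDIUM", "HIGH"]);

// Returns a copy of the item with malformed optional intent metadata removed.
// Required fields are left untouched; validating them is out of scope here.
function stripInvalidIntentMetadata(item) {
  const cleaned = { ...item };

  // rationale must be a string, otherwise it is dropped.
  if ("rationale" in cleaned && typeof cleaned.rationale !== "string") {
    delete cleaned.rationale;
  }

  // confidence must be a string naming one of the enum levels (case-insensitive).
  if ("confidence" in cleaned) {
    const confidence = cleaned.confidence;
    const valid = typeof confidence === "string" && CONFIDENCE_LEVELS.has(confidence.toUpperCase());
    if (!valid) {
      delete cleaned.confidence;
    }
  }

  return cleaned;
}

const line = '{"type":"set_issue_field","field_name":"Priority","value":"P1","confidence":"0.95","rationale":{"why":"bad"}}';
const result = stripInvalidIntentMetadata(JSON.parse(line));
console.log(JSON.stringify(result));
// {"type":"set_issue_field","field_name":"Priority","value":"P1"}
```

The point of the design is that rationale and confidence are enrichment: losing them costs some context, while rejecting the whole line would lose the tool call itself.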

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot

Pull request overview

This PR hardens “issue intent” safe-output ingestion so malformed optional rationale / confidence metadata no longer causes otherwise-valid set_issue_type / set_issue_field (and structured label) outputs to be dropped during NDJSON collection. It also updates the generated tool-description INTENT suffix to specify the expected types for these optional fields.

Changes:

- Make issue-intent enrichment fields (rationale, confidence) lenient by stripping them when malformed instead of rejecting the entire safe-output item.
- Update structured-label validation to ignore invalid rationale/confidence values while preserving the rest of each label entry.
- Add regression tests for the updated INTENT suffix, stripping behavior, and NDJSON preservation.

Summary per file:

| File | Description |
| --- | --- |
| actions/setup/js/safe_output_type_validator.cjs | Implements lenient stripping for invalid optional issue-intent fields and structured-label normalization changes. |
| actions/setup/js/safe_output_type_validator.test.cjs | Adds validator regression coverage for structured labels and top-level stripping behavior. |
| actions/setup/js/generate_safe_outputs_tools.cjs | Updates the generated INTENT guidance text for issue-intent-aware tools. |
| actions/setup/js/generate_safe_outputs_tools.test.cjs | Updates tests to assert the new INTENT suffix string. |
| actions/setup/js/collect_ndjson_output.test.cjs | Adds NDJSON ingestion regression coverage to ensure valid items are preserved after stripping invalid intent fields. |

Review details

Files reviewed: 5/5 changed files
Comments generated: 2
Review effort level: Low

+/**
+ * Optional issue-intent enrichment fields should be stripped rather than causing
+ * the entire safe-output item to be rejected when they are malformed.
+ * @param {string} fieldName
+ * @param {FieldValidation} validation
+ * @returns {boolean}
+ */
+function shouldStripInvalidIssueIntentField(fieldName, validation) {
+  return !validation.required && LENIENT_ISSUE_INTENT_FIELDS.has(fieldName);
+}


    // Handle enum validation
    if (validation.enum) {
      const normalizedValue = value.toLowerCase ? value.toLowerCase() : value;
      const normalizedEnum = validation.enum.map(e => (e.toLowerCase ? e.toLowerCase() : e));
      if (!normalizedEnum.includes(normalizedValue)) {
+        if (shouldStripInvalidIssueIntentField(fieldName, validation)) {
+          return { isValid: true, removeField: true };
+        }


gh-aw-bot · 2026-07-03T11:22:31Z

@copilot please run the pr-finisher skill, verify whether this branch needs a refresh from main, and do a final pass on validation/check coverage before rerunning checks if needed.

Generated by 👨‍🍳 PR Sous Chef · 5.99 AIC · ⌖ 7.3 AIC · ⊞ 5.1K · ◷
Comment /souschef to run again

Copilot · 2026-07-03T11:33:50Z

@copilot please run the pr-finisher skill, verify whether this branch needs a refresh from main, and do a final pass on validation/check coverage before rerunning checks if needed.

...

Done. I ran a PR-finisher pass and validated readiness on f50e626.

- Branch refresh check: this branch is currently 2 commits behind main (and 2 ahead), so it does need a refresh if up-to-date-with-base is required before merge.
- Final local validation pass completed successfully: make fmt, make lint, make test-unit, make test, make recompile.
- CI/check coverage review: recent action_required workflow runs on this head show no failed jobs via log inspection; they appear to be workflow/action-required outcomes rather than test/build failures.

Since agent pushes do not re-trigger CI, a maintainer should re-run/re-trigger checks after any refresh/update to confirm green status on the final head.

github-actions · 2026-07-03T12:51:39Z

✅ PR Code Quality Reviewer completed the code quality review.

github-actions · 2026-07-03T12:51:40Z

✅ Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #43168 does not have the 'implementation' label and has 0 new lines of code in business logic directories (threshold: 100).

github-actions · 2026-07-03T12:51:45Z

🧠 Matt Pocock Skills Reviewer failed during the skills-based review.

github-actions · 2026-07-03T12:51:51Z

✅ Test Quality Sentinel completed test quality analysis.

github-actions · 2026-07-03T13:02:21Z

🤖 PR Triage

| Field | Value |
| --- | --- |
| Category | `bug` |
| Risk | 🟢 Low |
| Score | 58/100 (Impact 26 + Urgency 18 + Quality 14) |
| Action | ⚡ `fast_track` |
| Batch | `pr-batch:safe-output-fixes` (with #43169) |

Score Breakdown

- Impact (26/50): Hardens safe-output NDJSON ingestion; malformed optional confidence/rationale metadata no longer drops valid intent calls
- Urgency (18/30): 6 issue comments including bot review; copilot-pull-request-reviewer COMMENTED; pr-finisher ping
- Quality (14/20): 5 files, 100 adds, well-described defensive hardening

Batch candidate with #43169. Both are defensive safe-output pipeline fixes.

Generated by 🔧 PR Triage Agent · 92.3 AIC · ⌖ 6.73 AIC · ⊞ 5.5K · ◷

gh-aw-bot · 2026-07-03T14:33:02Z

@copilot please run the pr-finisher skill, address unresolved review comments, and rerun checks once the branch is up to date.

Generated by 👨‍🍳 PR Sous Chef · 7.05 AIC · ⌖ 5.18 AIC · ⊞ 6.6K · ◷
Comment /souschef to run again

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot · 2026-07-03T14:51:28Z

@copilot please run the pr-finisher skill, address unresolved review comments, and rerun checks once the branch is up to date.

...

Addressed unresolved review feedback in 9e1958c0f4.

- Scoped lenient stripping for rationale/confidence to top-level issue-intent types (set_issue_type, set_issue_field)
- Trimmed enum input so values like " medium " normalize to "MEDIUM"
- Applied hard-limit rationale normalization for top-level issue-intent rationale
- Made structured-label confidence handling explicitly string-only
- Removed the redundant mkdir in NDJSON test and expanded regression coverage
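
The enum trimming described in this list could look roughly like the sketch below. `normalizeEnumValue` is a hypothetical helper written for illustration; the validator's real function names and return shape are not shown in this thread.

```javascript
// Hypothetical helper, not the validator's real code.
// Returns the canonical enum member for a loosely formatted string,
// or undefined when the input is not a string or matches no member.
function normalizeEnumValue(value, allowed) {
  if (typeof value !== "string") {
    return undefined; // string-only: numbers such as 0.95 never match
  }
  const needle = value.trim().toLowerCase();
  return allowed.find(candidate => candidate.toLowerCase() === needle);
}

const CONFIDENCE = ["LOW", "MEDIUM", "HIGH"];

console.log(normalizeEnumValue(" medium ", CONFIDENCE)); // MEDIUM
console.log(normalizeEnumValue("0.95", CONFIDENCE)); // undefined
console.log(normalizeEnumValue(0.95, CONFIDENCE)); // undefined
```

Returning the canonical member rather than the raw input means downstream consumers only ever see LOW, MEDIUM, or HIGH.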

Branch status against main: 0 behind / 3 ahead after refresh.

I can’t reliably rerun CI on the updated head from this agent context; please re-trigger checks from the PR UI (or via maintainer push/workflow dispatch) to validate the latest commit.

gh-aw-bot · 2026-07-03T15:29:58Z

@copilot please run the pr-finisher skill, address the latest review suggestions (especially any unresolved comment threads), and rerun checks once the branch is up to date.

Generated by 👨‍🍳 PR Sous Chef · 10.6 AIC · ⌖ 3.18 AIC · ⊞ 6.6K · ◷
Comment /souschef to run again

Copilot · 2026-07-03T15:46:45Z

@copilot please run the pr-finisher skill, address the latest review suggestions (especially any unresolved comment threads), and rerun checks once the branch is up to date.

...

Completed a fresh PR-finisher pass.

- Branch refresh: merged origin/main into this branch in b5297f7df5 (now 0 behind / 5 ahead).
- Latest review suggestions: rechecked unresolved threads; no additional code changes were needed beyond the prior validator/test fixes in 9e1958c0f4.
- Local validation on the updated head passed: make fmt, make lint, make test-unit, make test, make recompile.
- CI/log review: inspected recent runs for this PR branch and fetched failed-job logs where applicable; no actionable product/test failure was reproducible from agent context.

CI for the current agent-pushed HEAD is still unverified until a maintainer re-triggers workflows from the PR UI (close/reopen, workflow_dispatch, or maintainer push).

gh-aw-bot · 2026-07-03T16:48:34Z

@copilot please run the pr-finisher skill, address unresolved review comments, and rerun checks once the branch is up to date.

Generated by 👨‍🍳 PR Sous Chef · 7.23 AIC · ⌖ 3.11 AIC · ⊞ 6.6K · ◷
Comment /souschef to run again

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot · 2026-07-03T17:02:46Z

@copilot please run the pr-finisher skill, address unresolved review comments, and rerun checks once the branch is up to date.

...

Addressed the remaining actionable validator review feedback in 36366ea by hardening validateItem normalization so validated fields are rebuilt after stripping invalid optional intent metadata. Branch is up to date with main (0 behind / 8 ahead).

Checks rerun locally:

make fmt ✅
make test-unit ✅
make test ✅
make recompile ✅

make lint and make agent-report-progress still fail on a pre-existing unrelated main-branch linter issue in pkg/workflow/arc_dind_artifacts.go (modernize/stringscutprefix).

I also reviewed prior CI runs via MCP logs; current PR-head CI is pending/stale after agent push and needs a maintainer re-trigger from the PR UI.

pelikhan · 2026-07-04T03:04:34Z

@copilot

use x-synonyms field that is already supported to specify the list of alternate field names
be lenient to parse values, case insensitive.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2026-07-04T03:21:31Z

@copilot

use x-synonyms field that is already supported to specify the list of alternate field names

be lenient to parse values, case insensitive.

Implemented in d128275.

- Added x-synonyms support in safe-output type validation and wired issue-intent synonyms in validation config (issueType, fieldName, fieldNodeId, issueNumber).
- Canonicalization now maps alternate field names case-insensitively before custom/field validation.
- Value parsing remains lenient and case-insensitive for enums (including trimmed confidence values).
- Added regression coverage in validator and NDJSON ingestion tests for synonym mapping + case-insensitive confidence parsing.
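
As a rough illustration of the canonicalization step, the sketch below maps alternate field names onto canonical ones before validation. The config shape, the snake_case canonical names (`issue_type`, `field_name`), and `canonicalizeFieldNames` are all assumptions made for this example; only the `x-synonyms` key and the camelCase alternates come from the comment above.

```javascript
// Assumed config shape: each canonical field lists its alternates under "x-synonyms".
const fieldConfig = {
  issue_type: { "x-synonyms": ["issueType"] },
  field_name: { "x-synonyms": ["fieldName"] },
};

// Hypothetical helper: renames known aliases to their canonical field name.
function canonicalizeFieldNames(item, config) {
  // Build a lowercase lookup so matching is case-insensitive.
  const lookup = new Map();
  for (const [canonical, validation] of Object.entries(config)) {
    lookup.set(canonical.toLowerCase(), canonical);
    for (const synonym of validation["x-synonyms"] ?? []) {
      lookup.set(synonym.toLowerCase(), canonical);
    }
  }

  const canonicalItem = {};
  for (const [key, value] of Object.entries(item)) {
    const canonical = lookup.get(key.toLowerCase()) ?? key;
    // If several keys map to the same canonical name, the first one seen is kept.
    if (!Object.hasOwn(canonicalItem, canonical)) {
      canonicalItem[canonical] = value;
    }
  }
  return canonicalItem;
}

const canonicalized = canonicalizeFieldNames({ type: "set_issue_field", FieldName: "Priority", value: "P1" }, fieldConfig);
console.log(JSON.stringify(canonicalized));
// {"type":"set_issue_field","field_name":"Priority","value":"P1"}
```

Running the rename before field validation means the rest of the validator only needs to know the canonical names.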

pelikhan · 2026-07-04T03:34:29Z

@copilot review all fields starting with "x-" and make sure there are not 2 concepts of field synonyms/aliases. Refactor and unify. Simplify changes.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2026-07-04T03:51:22Z

@copilot review all fields starting with "x-" and make sure there are not 2 concepts of field synonyms/aliases. Refactor and unify. Simplify changes.

Addressed in ac45852.

- Unified field-name alias handling around a single x-synonyms concept in the validator.
- Simplified synonym normalization via a dedicated helper (getFieldSynonyms) that trims/deduplicates values.
- Renamed Go config field to XSynonyms to align internal naming with the x-synonyms extension key.
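
A helper that trims and deduplicates synonyms might look like the sketch below. It reuses the name getFieldSynonyms from the comment above, but the signature, the case-insensitive deduplication, and the handling of non-string entries are guesses, not the code from ac45852.

```javascript
// Illustrative guess at a trim/deduplicate helper; the real one may differ.
function getFieldSynonyms(validation) {
  const raw = validation?.["x-synonyms"];
  if (!Array.isArray(raw)) {
    return []; // missing or malformed config yields no synonyms
  }

  const seen = new Set();
  const synonyms = [];
  for (const entry of raw) {
    if (typeof entry !== "string") continue; // ignore non-string entries
    const trimmed = entry.trim();
    const key = trimmed.toLowerCase(); // assumption: dedupe ignores case
    if (trimmed === "" || seen.has(key)) continue;
    seen.add(key);
    synonyms.push(trimmed);
  }
  return synonyms;
}

console.log(getFieldSynonyms({ "x-synonyms": [" fieldName ", "fieldname", "", 42, "field"] }));
// [ 'fieldName', 'field' ]
```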

Targeted checks passed (make fmt, make fmt-cjs, make lint-cjs, npm test -- safe_output_type_validator.test.cjs collect_ndjson_output.test.cjs, go test ./pkg/workflow -count=1). make agent-report-progress still hits the existing unrelated modernize/stringscutprefix lint issue in pkg/workflow/arc_dind_artifacts.go.

Initial plan

06eae8b

Copilot AI assigned Copilot and pelikhan Jul 3, 2026

Copilot started work on behalf of pelikhan July 3, 2026 10:26 View session

Copilot AI linked an issue Jul 3, 2026 that may be closed by this pull request

Harden issue-intent confidence handling: clearer suffix + lenient validation #43164

Open

Harden issue intent confidence handling

f50e626

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Harden issue-intent confidence handling with clearer suffix and lenient validation~~ Harden issue-intent confidence handling in safe-output ingestion Jul 3, 2026

Copilot finished work on behalf of pelikhan July 3, 2026 10:38

Copilot AI requested a review from pelikhan July 3, 2026 10:38

pelikhan marked this pull request as ready for review July 3, 2026 10:42

Copilot AI review requested due to automatic review settings July 3, 2026 10:42

Copilot started reviewing on behalf of pelikhan July 3, 2026 10:43 View session

Copilot AI reviewed Jul 3, 2026

View reviewed changes

Copilot started work on behalf of gh-aw-bot July 3, 2026 11:22 View session

Copilot finished work on behalf of gh-aw-bot July 3, 2026 11:34

Copilot AI requested a review from gh-aw-bot July 3, 2026 11:34

github-actions Bot mentioned this pull request Jul 3, 2026

[aw] No-Op Runs #39849

Open

github-actions Bot added pr-action:fast_track pr-agent:copilot-swe-agent pr-batch:safe-output-fixes pr-priority:medium pr-risk:low pr-type:bug labels Jul 3, 2026

Copilot started work on behalf of gh-aw-bot July 3, 2026 14:33 View session

Address issue-intent validator review feedback

9e1958c

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot finished work on behalf of gh-aw-bot July 3, 2026 14:51

Copilot started work on behalf of gh-aw-bot July 3, 2026 15:30 View session

Merge remote-tracking branch 'origin/main' into copilot/harden-issue-…

b5297f7

…intent-confidence-handling Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot finished work on behalf of gh-aw-bot July 3, 2026 15:47

Merge branch 'main' into copilot/harden-issue-intent-confidence-handling

d1a00b4

Copilot started work on behalf of gh-aw-bot July 3, 2026 16:48 View session

Harden normalizedItem construction in safe output validator

36366ea

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot finished work on behalf of gh-aw-bot July 3, 2026 17:03

This was referenced Jul 3, 2026

[PR Triage Report] PR Triage Report — 2026-07-03 (Run #28677894072) #43257

Closed

[Outcome Report] Outcome Report — 2026-07-04 #43274

Open

[PR Triage Report] PR Triage Report — 2026-07-04 (Run #28689979672) #43275

Open

Copilot started work on behalf of pelikhan July 4, 2026 03:04 View session

Support issue-intent x-synonyms and case-insensitive parsing

d128275

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan July 4, 2026 03:22

Copilot started work on behalf of pelikhan July 4, 2026 03:34 View session

Copilot AI and others added 2 commits July 4, 2026 03:38

chore: outline plan for review feedback

1f0020a

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

refactor: unify x-synonyms handling for safe-output field names

ac45852

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan July 4, 2026 03:51
