Harden issue-intent confidence handling in safe-output ingestion#43168

Open
pelikhan with Copilot wants to merge 10 commits into main from copilot/harden-issue-intent-confidence-handling

Conversation

Copilot AI commented Jul 3, 2026

Contributor

Issue-intent safe outputs were too brittle: malformed optional confidence / rationale metadata could cause otherwise valid set_issue_type and set_issue_field calls to be dropped during NDJSON ingestion. This change makes intent guidance more explicit and treats invalid optional intent metadata as removable enrichment instead of a hard failure.

  • Clarify issue-intent tool guidance

    • Update the generated INTENT suffix for issue-intent-aware tools to explicitly require:
      • rationale as a string
      • confidence as a string enum: LOW | MEDIUM | HIGH
  • Make optional intent metadata lenient

    • Strip invalid top-level rationale / confidence from:
      • set_issue_type
      • set_issue_field
    • Strip invalid structured-label rationale / confidence from issue-intent label payloads
    • Preserve the rest of the safe-output item when required fields are valid
  • Add regression coverage

    • Cover the updated description suffix
    • Cover validator behavior for malformed optional intent fields
    • Cover NDJSON collection preserving valid tool calls after intent-field stripping

Example of the new behavior:

{"type":"set_issue_field","field_name":"Priority","value":"P1","confidence":"0.95","rationale":{"why":"bad"}}

Now ingests as:

{"type":"set_issue_field","field_name":"Priority","value":"P1"}

instead of rejecting the entire message.
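
The before/after pair above can be reproduced with a minimal sketch. This is an illustration only, not the code in this PR: `stripInvalidIntentMetadata` and `ingestNdjson` are hypothetical names, and the sketch assumes that the only valid `confidence` values are the `LOW | MEDIUM | HIGH` strings and that `rationale` must be a string.

```javascript
// Hypothetical sketch: names and rules are assumptions, not the PR's implementation.
const CONFIDENCE_LEVELS = new Set(["LOW", "MEDIUM", "HIGH"]);
const ISSUE_INTENT_TYPES = new Set(["set_issue_type", "set_issue_field"]);

// Drop malformed optional intent metadata; keep every other field untouched.
function stripInvalidIntentMetadata(item) {
  if (!ISSUE_INTENT_TYPES.has(item.type)) return item;
  const cleaned = { ...item };
  if ("rationale" in cleaned && typeof cleaned.rationale !== "string") {
    delete cleaned.rationale;
  }
  if ("confidence" in cleaned) {
    const c = cleaned.confidence;
    const ok = typeof c === "string" && CONFIDENCE_LEVELS.has(c.trim().toUpperCase());
    if (!ok) delete cleaned.confidence;
  }
  return cleaned;
}

// Parse NDJSON text line by line, skipping blank lines and unparseable lines.
function ingestNdjson(text) {
  const items = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    let parsed;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // not valid JSON: skip this line only
    }
    if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) continue;
    items.push(stripInvalidIntentMetadata(parsed));
  }
  return items;
}

const sample =
  '{"type":"set_issue_field","field_name":"Priority","value":"P1","confidence":"0.95","rationale":{"why":"bad"}}';
console.log(JSON.stringify(ingestNdjson(sample)[0]));
// prints {"type":"set_issue_field","field_name":"Priority","value":"P1"}
```

The point of the design is that the required fields decide whether the item survives, while the enrichment fields can only be kept or dropped.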


Generated by 👨‍🍳 PR Sous Chef · 7.05 AIC · ⌖ 5.18 AIC · ⊞ 6.6K ·
Comment /souschef to run again


Generated by 👨‍🍳 PR Sous Chef · 7.23 AIC · ⌖ 3.11 AIC · ⊞ 6.6K ·
Comment /souschef to run again

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Harden issue-intent confidence handling with clearer suffix and lenient validation" to "Harden issue-intent confidence handling in safe-output ingestion" Jul 3, 2026
Copilot AI requested a review from pelikhan July 3, 2026 10:38
@pelikhan pelikhan marked this pull request as ready for review July 3, 2026 10:42
Copilot AI review requested due to automatic review settings July 3, 2026 10:42

Copilot AI left a comment

Contributor

Pull request overview

This PR hardens “issue intent” safe-output ingestion so malformed optional rationale / confidence metadata no longer causes otherwise-valid set_issue_type / set_issue_field (and structured label) outputs to be dropped during NDJSON collection. It also updates the generated tool-description INTENT suffix to specify the expected types for these optional fields.

Changes:

  • Make issue-intent enrichment fields (rationale, confidence) lenient by stripping them when malformed instead of rejecting the entire safe-output item.
  • Update structured-label validation to ignore invalid rationale/confidence values while preserving the rest of each label entry.
  • Add regression tests for the updated INTENT suffix, stripping behavior, and NDJSON preservation.
Show a summary per file
| File | Description |
| --- | --- |
| actions/setup/js/safe_output_type_validator.cjs | Implements lenient stripping for invalid optional issue-intent fields and structured-label normalization changes. |
| actions/setup/js/safe_output_type_validator.test.cjs | Adds validator regression coverage for structured labels and top-level stripping behavior. |
| actions/setup/js/generate_safe_outputs_tools.cjs | Updates the generated INTENT guidance text for issue-intent-aware tools. |
| actions/setup/js/generate_safe_outputs_tools.test.cjs | Updates tests to assert the new INTENT suffix string. |
| actions/setup/js/collect_ndjson_output.test.cjs | Adds NDJSON ingestion regression coverage to ensure valid items are preserved after stripping invalid intent fields. |

Review details

  • Files reviewed: 5/5 changed files
  • Comments generated: 2
  • Review effort level: Low

Comment on lines +88 to +97
/**
 * Optional issue-intent enrichment fields should be stripped rather than causing
 * the entire safe-output item to be rejected when they are malformed.
 * @param {string} fieldName
 * @param {FieldValidation} validation
 * @returns {boolean}
 */
function shouldStripInvalidIssueIntentField(fieldName, validation) {
  return !validation.required && LENIENT_ISSUE_INTENT_FIELDS.has(fieldName);
}
Comment on lines 498 to +505
// Handle enum validation
if (validation.enum) {
  const normalizedValue = value.toLowerCase ? value.toLowerCase() : value;
  const normalizedEnum = validation.enum.map(e => (e.toLowerCase ? e.toLowerCase() : e));
  if (!normalizedEnum.includes(normalizedValue)) {
    if (shouldStripInvalidIssueIntentField(fieldName, validation)) {
      return { isValid: true, removeField: true };
    }
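
For readers following the two excerpts above, here is a rough sketch of how a caller might act on a `{ isValid: true, removeField: true }` result. It is an assumption about the surrounding flow, not the validator's actual code: `validateEnumField`, `applyEnumValidation`, the schema shape, and the contents of `LENIENT_ISSUE_INTENT_FIELDS` are all hypothetical here.

```javascript
// Hypothetical sketch of the "strip instead of reject" flow; not the PR's code.
const LENIENT_ISSUE_INTENT_FIELDS = new Set(["rationale", "confidence"]);

// Case-insensitive enum check. Optional intent fields are flagged for removal
// instead of failing validation.
function validateEnumField(fieldName, value, validation) {
  const normalize = v => (typeof v === "string" ? v.toLowerCase() : v);
  const allowed = validation.enum.map(normalize);
  if (allowed.includes(normalize(value))) {
    return { isValid: true };
  }
  if (!validation.required && LENIENT_ISSUE_INTENT_FIELDS.has(fieldName)) {
    return { isValid: true, removeField: true };
  }
  return {
    isValid: false,
    error: `${fieldName} must be one of: ${validation.enum.join(", ")}`,
  };
}

// Rebuild the item from validated fields, omitting any field flagged removeField.
function applyEnumValidation(item, schema) {
  const result = {};
  for (const [fieldName, value] of Object.entries(item)) {
    const validation = schema[fieldName];
    if (!validation || !validation.enum) {
      result[fieldName] = value; // no enum rule for this field: pass it through
      continue;
    }
    const outcome = validateEnumField(fieldName, value, validation);
    if (!outcome.isValid) return { isValid: false, error: outcome.error };
    if (!outcome.removeField) result[fieldName] = value;
  }
  return { isValid: true, item: result };
}
```

Rebuilding the item from the fields that passed, rather than deleting keys in place, keeps the caller's input unmodified.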
@gh-aw-bot

Collaborator

@copilot please run the pr-finisher skill, verify whether this branch needs a refresh from main, and do a final pass on validation/check coverage before rerunning checks if needed.

Generated by 👨‍🍳 PR Sous Chef · 5.99 AIC · ⌖ 7.3 AIC · ⊞ 5.1K ·
Comment /souschef to run again

Copilot AI commented Jul 3, 2026

Contributor Author

@copilot please run the pr-finisher skill, verify whether this branch needs a refresh from main, and do a final pass on validation/check coverage before rerunning checks if needed.

...

Done. I ran a PR-finisher pass and validated readiness on f50e626.

  • Branch refresh check: this branch is currently 2 commits behind main (and 2 ahead), so it does need a refresh if up-to-date-with-base is required before merge.
  • Final local validation pass completed successfully: make fmt, make lint, make test-unit, make test, make recompile.
  • CI/check coverage review: recent action_required workflow runs on this head show no failed jobs via log inspection; they appear to be workflow/action-required outcomes rather than test/build failures.

Since agent pushes do not re-trigger CI, a maintainer should re-run/re-trigger checks after any refresh/update to confirm green status on the final head.

Copilot AI requested a review from gh-aw-bot July 3, 2026 11:34
@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

PR Code Quality Reviewer completed the code quality review.

@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #43168 does not have the 'implementation' label and has 0 new lines of code in business logic directories (threshold: 100).

@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

🧠 Matt Pocock Skills Reviewer failed during the skills-based review.

@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

Test Quality Sentinel completed test quality analysis.

@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

🤖 PR Triage

| Field | Value |
| --- | --- |
| Category | bug |
| Risk | 🟢 Low |
| Score | 58/100 (Impact 26 + Urgency 18 + Quality 14) |
| Action | fast_track |
| Batch | pr-batch:safe-output-fixes (with #43169) |

Score Breakdown

  • Impact (26/50): Hardens safe-output NDJSON ingestion — malformed optional confidence/rationale metadata no longer drops valid intent calls
  • Urgency (18/30): 6 issue comments including bot review; copilot-pull-request-reviewer COMMENTED; pr-finisher ping
  • Quality (14/20): 5 files, 100 adds, well-described defensive hardening

Batch candidate with #43169. Both are defensive safe-output pipeline fixes.

Generated by 🔧 PR Triage Agent · 92.3 AIC · ⌖ 6.73 AIC · ⊞ 5.5K ·

@gh-aw-bot

Collaborator

@copilot please run the pr-finisher skill, address unresolved review comments, and rerun checks once the branch is up to date.

Generated by 👨‍🍳 PR Sous Chef · 7.05 AIC · ⌖ 5.18 AIC · ⊞ 6.6K ·
Comment /souschef to run again

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot AI commented Jul 3, 2026

Contributor Author

@copilot please run the pr-finisher skill, address unresolved review comments, and rerun checks once the branch is up to date.

...

Addressed unresolved review feedback in 9e1958c0f4.

  • Scoped lenient stripping for rationale/confidence to top-level issue-intent types (set_issue_type, set_issue_field)
  • Trimmed enum input so values like " medium " normalize to "MEDIUM"
  • Applied hard-limit rationale normalization for top-level issue-intent rationale
  • Made structured-label confidence handling explicitly string-only
  • Removed the redundant mkdir in NDJSON test and expanded regression coverage
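
The trimming bullet above (" medium " normalizing to "MEDIUM") could look roughly like the following. This is a sketch under the assumption that the allowed values are exactly `LOW`, `MEDIUM`, and `HIGH`. `normalizeConfidence` is a hypothetical helper name, not necessarily what commit 9e1958c0f4 uses.

```javascript
// Hypothetical helper; the allowed values come from the INTENT guidance in this PR.
const CONFIDENCE_VALUES = ["LOW", "MEDIUM", "HIGH"];

// Returns the canonical upper-case value, or undefined when the input should be stripped.
function normalizeConfidence(value) {
  if (typeof value !== "string") return undefined; // string-only: numbers and objects are dropped
  const candidate = value.trim().toUpperCase();
  return CONFIDENCE_VALUES.includes(candidate) ? candidate : undefined;
}

console.log(normalizeConfidence(" medium ")); // prints MEDIUM
console.log(normalizeConfidence("0.95")); // prints undefined
```

Returning `undefined` for anything unrecognized lets the caller treat "missing" and "malformed" the same way.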

Branch status against main: 0 behind / 3 ahead after refresh.

I can’t reliably rerun CI on the updated head from this agent context; please re-trigger checks from the PR UI (or via maintainer push/workflow dispatch) to validate the latest commit.

@gh-aw-bot

Collaborator

@copilot please run the pr-finisher skill, address the latest review suggestions (especially any unresolved comment threads), and rerun checks once the branch is up to date.

Generated by 👨‍🍳 PR Sous Chef · 10.6 AIC · ⌖ 3.18 AIC · ⊞ 6.6K ·
Comment /souschef to run again

…intent-confidence-handling

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot AI commented Jul 3, 2026

Contributor Author

@copilot please run the pr-finisher skill, address the latest review suggestions (especially any unresolved comment threads), and rerun checks once the branch is up to date.

...

Completed a fresh PR-finisher pass.

  • Branch refresh: merged origin/main into this branch in b5297f7df5 (now 0 behind / 5 ahead).
  • Latest review suggestions: rechecked unresolved threads; no additional code changes were needed beyond the prior validator/test fixes in 9e1958c0f4.
  • Local validation on the updated head passed: make fmt, make lint, make test-unit, make test, make recompile.
  • CI/log review: inspected recent runs for this PR branch and fetched failed-job logs where applicable; no actionable product/test failure was reproducible from agent context.

CI for the current agent-pushed HEAD is still unverified until a maintainer re-triggers workflows from the PR UI (close/reopen, workflow_dispatch, or maintainer push).

@gh-aw-bot

Collaborator

@copilot please run the pr-finisher skill, address unresolved review comments, and rerun checks once the branch is up to date.

Generated by 👨‍🍳 PR Sous Chef · 7.23 AIC · ⌖ 3.11 AIC · ⊞ 6.6K ·
Comment /souschef to run again

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot AI commented Jul 3, 2026

Contributor Author

@copilot please run the pr-finisher skill, address unresolved review comments, and rerun checks once the branch is up to date.

...

Addressed the remaining actionable validator review feedback in 36366ea by hardening validateItem normalization so validated fields are rebuilt after stripping invalid optional intent metadata. Branch is up to date with main (0 behind / 8 ahead).

Checks rerun locally:

  • make fmt
  • make test-unit
  • make test
  • make recompile

make lint and make agent-report-progress still fail on a pre-existing unrelated main-branch linter issue in pkg/workflow/arc_dind_artifacts.go (modernize/stringscutprefix).

I also reviewed prior CI runs via MCP logs; current PR-head CI is pending/stale after agent push and needs a maintainer re-trigger from the PR UI.

@pelikhan

pelikhan commented Jul 4, 2026

Collaborator

@copilot

  • Use the already-supported x-synonyms field to specify the list of alternate field names.
  • Be lenient when parsing values: match them case-insensitively.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jul 4, 2026

Contributor Author

@copilot

  • Use the already-supported x-synonyms field to specify the list of alternate field names.
  • Be lenient when parsing values: match them case-insensitively.

Implemented in d128275.

  • Added x-synonyms support in safe-output type validation and wired issue-intent synonyms in validation config (issueType, fieldName, fieldNodeId, issueNumber).
  • Canonicalization now maps alternate field names case-insensitively before custom/field validation.
  • Value parsing remains lenient and case-insensitive for enums (including trimmed confidence values).
  • Added regression coverage in validator and NDJSON ingestion tests for synonym mapping + case-insensitive confidence parsing.
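
As an illustration of the canonicalization step described above, the sketch below maps alternate field names to their canonical name case-insensitively before any further validation. The schema shape, the canonical names `issue_type` and `field_name`, and the function names are assumptions made for the example. Only the `x-synonyms` key and the synonym names `issueType` / `fieldName` come from this thread.

```javascript
// Hypothetical validation config; canonical names here are assumed for illustration.
const exampleSchema = {
  issue_type: { "x-synonyms": ["issueType"] },
  field_name: { "x-synonyms": ["fieldName"] },
};

// Build a lower-cased lookup from every accepted spelling to its canonical name.
function buildSynonymIndex(schema) {
  const index = new Map();
  for (const [canonical, spec] of Object.entries(schema)) {
    index.set(canonical.toLowerCase(), canonical);
    for (const synonym of spec["x-synonyms"] ?? []) {
      index.set(String(synonym).trim().toLowerCase(), canonical);
    }
  }
  return index;
}

// Rename synonym keys to canonical keys. An exact canonical key always wins
// over a synonym when both are present on the same item.
function canonicalizeFieldNames(item, schema) {
  const index = buildSynonymIndex(schema);
  const result = {};
  for (const [key, value] of Object.entries(item)) {
    const canonical = index.get(key.toLowerCase()) ?? key;
    if (!(canonical in result) || key === canonical) {
      result[canonical] = value;
    }
  }
  return result;
}

console.log(
  JSON.stringify(
    canonicalizeFieldNames({ type: "set_issue_field", FieldName: "Priority", value: "P1" }, exampleSchema)
  )
);
// prints {"type":"set_issue_field","field_name":"Priority","value":"P1"}
```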

@pelikhan

pelikhan commented Jul 4, 2026

Collaborator

@copilot review all fields starting with "x-" and make sure there are not two concepts of field synonyms/aliases. Refactor and unify. Simplify changes.

Copilot AI and others added 2 commits July 4, 2026 03:38
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jul 4, 2026

Contributor Author

@copilot review all fields starting with "x-" and make sure there are not two concepts of field synonyms/aliases. Refactor and unify. Simplify changes.

Addressed in ac45852.

  • Unified field-name alias handling around a single x-synonyms concept in the validator.
  • Simplified synonym normalization via a dedicated helper (getFieldSynonyms) that trims/deduplicates values.
  • Renamed Go config field to XSynonyms to align internal naming with the x-synonyms extension key.
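
A helper along the lines of the `getFieldSynonyms` bullet above might be sketched as follows. The body is a guess at what "trims/deduplicates" could mean in practice (here: case-insensitive de-duplication, first spelling wins). The actual implementation in ac45852 may differ.

```javascript
// Hypothetical reconstruction; only the name and the trim/dedupe intent come from the thread.
function getFieldSynonyms(validation) {
  const raw = validation && validation["x-synonyms"];
  if (!Array.isArray(raw)) return [];
  const seen = new Set();
  const synonyms = [];
  for (const entry of raw) {
    if (typeof entry !== "string") continue; // ignore non-string entries
    const trimmed = entry.trim();
    const key = trimmed.toLowerCase();
    if (trimmed === "" || seen.has(key)) continue; // skip blanks and repeats
    seen.add(key);
    synonyms.push(trimmed);
  }
  return synonyms;
}

console.log(getFieldSynonyms({ "x-synonyms": [" fieldName ", "fieldname", "", "fieldNodeId"] }));
// prints [ 'fieldName', 'fieldNodeId' ]
```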

Targeted checks passed (make fmt, make fmt-cjs, make lint-cjs, npm test -- safe_output_type_validator.test.cjs collect_ndjson_output.test.cjs, go test ./pkg/workflow -count=1). make agent-report-progress still hits the existing unrelated modernize/stringscutprefix lint issue in pkg/workflow/arc_dind_artifacts.go.

Development

Successfully merging this pull request may close these issues.

Harden issue-intent confidence handling: clearer suffix + lenient validation

4 participants