Include last 5 tool calls in tool-denials failure issues #39122

Merged
pelikhan merged 5 commits into main from copilot/update-agentic-failure-issue
Jun 13, 2026

Copilot AI commented Jun 13, 2026

Contributor

Tool-denial failure issues currently show only the terminal denied request (guard.tool_denials_exceeded reason), which hides the lead-up to the guardrail trip. This change adds the preceding tool-call context so triage can quickly identify the permission mismatch pattern.

  • Scope

    • Extend tool_denials_exceeded issue context with the last 5 tool.execution_start calls seen before the guard event.
  • Event parsing updates (actions/setup/js/handle_agent_failure.cjs)

    • While scanning session events.jsonl, collect tool calls from tool.execution_start.
    • Persist recentToolCalls: recentToolCalls.slice(-5) on each captured guard.tool_denials_exceeded event.
    • Keep existing denied-reason normalization behavior unchanged.
  • Issue rendering updates (actions/setup/md/tool_denials_exceeded_context.md)

    • Add a new collapsible section:
      • Last 5 tool calls
    • Render calls as markdown list items for quick scanning in failure issues.
  • Test updates (actions/setup/js/handle_agent_failure.test.cjs)

    • Verify loader captures recent tool calls from session events.
    • Verify context rendering includes the new section and entries.
    • Verify truncation behavior keeps only the latest five calls.
events.push({
  denialCount,
  threshold,
  reason: parsed.data.reason?.trim() ?? "",
  recentToolCalls: recentToolCalls.slice(-5),
});
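
The loader change described above can be sketched roughly as follows. This is a minimal illustration, not the code in handle_agent_failure.cjs: the function name `collectToolDenialEvents`, the `toolName`, `denialCount`, and `threshold` data fields, and the in-memory `sampleLog` are assumptions made for the example. Only the two event type names, the reason trimming, and `recentToolCalls.slice(-5)` come from this PR.

```javascript
// Hypothetical sketch: the field names under `data` are assumed, not the
// confirmed events.jsonl schema.
function collectToolDenialEvents(jsonlText, windowSize = 5) {
  const recentToolCalls = [];
  const events = [];
  for (const line of jsonlText.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    let parsed;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // skip malformed lines instead of failing the whole scan
    }
    if (!parsed || typeof parsed !== "object") continue;
    if (!parsed.data || typeof parsed.data !== "object") continue;

    if (parsed.type === "tool.execution_start") {
      // Remember every tool call seen so far, in log order.
      if (typeof parsed.data.toolName === "string") {
        recentToolCalls.push(parsed.data.toolName);
      }
    } else if (parsed.type === "guard.tool_denials_exceeded") {
      // Snapshot the calls that preceded this guard event.
      events.push({
        denialCount: parsed.data.denialCount,
        threshold: parsed.data.threshold,
        reason: typeof parsed.data.reason === "string" ? parsed.data.reason.trim() : "",
        recentToolCalls: recentToolCalls.slice(-windowSize),
      });
    }
  }
  return events;
}

// Seven tool calls followed by one guard event, serialized as JSON Lines.
const sampleLog = [
  ...["view", "grep", "bash", "edit", "bash", "view", "bash"].map(toolName => ({
    type: "tool.execution_start",
    data: { toolName },
  })),
  { type: "guard.tool_denials_exceeded", data: { denialCount: 5, threshold: 5, reason: "  bash denied \n" } },
]
  .map(event => JSON.stringify(event))
  .join("\n");

const [guardEvent] = collectToolDenialEvents(sampleLog);
console.log(guardEvent.recentToolCalls);
```

Because `slice(-5)` copies the tail of the array, each captured guard event keeps its own snapshot even if more tool calls are appended later in the scan.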

Copilot AI and others added 4 commits June 13, 2026 16:34
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
@pelikhan pelikhan marked this pull request as ready for review June 13, 2026 16:53
Copilot AI review requested due to automatic review settings June 13, 2026 16:53

Copilot AI left a comment

Contributor

Pull request overview

This PR improves triage for guard.tool_denials_exceeded failures by capturing and rendering the last 5 tool calls that occurred before the guardrail trip, so failure issues show the lead-up context instead of only the final denied request.

Changes:

  • Extend loadToolDenialsExceededEvents() to collect recent tool.execution_start calls and attach recentToolCalls to each captured guard event.
  • Update the tool_denials_exceeded_context.md template and renderer to include a new “Last 5 tool calls” collapsible section.
  • Add/extend tests to validate capture, rendering, and truncation behavior.
Show a summary per file

| File | Description |
| --- | --- |
| actions/setup/md/tool_denials_exceeded_context.md | Adds a new collapsible section placeholder for rendering the last 5 tool calls. |
| actions/setup/js/handle_agent_failure.cjs | Collects tool call context from events.jsonl and renders it into the tool-denials-exceeded issue body. |
| actions/setup/js/handle_agent_failure.test.cjs | Adds coverage for capturing tool calls, rendering the new section, and truncating to the last 5. |
| .github/workflows/test-quality-sentinel.lock.yml | Updates generated lockfile content (includes a user-visible message emoji change). |
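
To make the rendering change concrete, here is a small sketch of how a "Last 5 tool calls" collapsible section could be built from `recentToolCalls`. The helper name `renderRecentToolCalls` and the exact markup are assumptions for illustration; the real template lives in tool_denials_exceeded_context.md and may differ.

```javascript
// Hypothetical helper: renders tool calls as a markdown list inside a
// collapsible <details> block, or returns "" when there is nothing to show.
function renderRecentToolCalls(recentToolCalls) {
  const calls = Array.isArray(recentToolCalls) ? recentToolCalls.slice(-5) : [];
  if (calls.length === 0) return "";

  // Collapse whitespace so a multi-line call still fits on one list item.
  const items = calls.map(call => `- ${String(call).replace(/\s+/g, " ").trim()}`);

  return [
    "<details>",
    "<summary>Last 5 tool calls</summary>",
    "", // blank line so the list inside <details> is parsed as markdown
    ...items,
    "",
    "</details>",
  ].join("\n");
}

console.log(renderRecentToolCalls(["view", "bash", "edit"]));
```

Returning an empty string for a missing or empty list lets the caller drop the section entirely rather than render an empty collapsible block.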

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 4

Comment thread actions/setup/js/handle_agent_failure.cjs
Comment on lines +1271 to +1275
@@ -1262,6 +1272,7 @@ function loadToolDenialsExceededEvents() {
denialCount,
threshold,
reason: typeof parsed.data.reason === "string" ? parsed.data.reason.trim() : "",
recentToolCalls: recentToolCalls.slice(-5),
Comment on lines +1304 to +1308
// Normalize the reason for display: multi-line programs (e.g. Python 3 heredocs) are
// collapsed to a single-line summary so the issue body renders cleanly.
const normalizedReason = normalizeDeniedPermissionCommand(reason);
const recentToolCallsList =
Array.isArray(latestEvent.recentToolCalls) && latestEvent.recentToolCalls.length > 0
env:
GH_AW_WORKFLOW_NAME: "Test Quality Sentinel"
GH_AW_SAFE_OUTPUT_MESSAGES: "{\"footer\":\"\\u003e 🧪 *Test quality analysis by [{workflow_name}]({run_url})*{ai_credits_suffix}{history_link}\",\"runStarted\":\"🔬 [{workflow_name}]({run_url}) is analyzing test quality on this {event_type}...\",\"runSuccess\":\"🧪 [{workflow_name}]({run_url}) completed test quality analysis.\",\"runFailure\":\"❌ [{workflow_name}]({run_url}) {status} during test quality analysis.\"}"
GH_AW_SAFE_OUTPUT_MESSAGES: "{\"footer\":\"\\u003e 🧪 *Test quality analysis by [{workflow_name}]({run_url})*{ai_credits_suffix}{history_link}\",\"runStarted\":\"🔬 [{workflow_name}]({run_url}) is analyzing test quality on this {event_type}...\",\"runSuccess\":\" [{workflow_name}]({run_url}) completed test quality analysis.\",\"runFailure\":\"❌ [{workflow_name}]({run_url}) {status} during test quality analysis.\"}"
@pelikhan

Collaborator

@copilot run pr-finisher skill

… revert lock emoji

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jun 13, 2026

Contributor Author

@copilot run pr-finisher skill

Done. Here's what was addressed in commit f1c0b4e:

recentToolCalls unbounded growth: the buffer is now capped at 5 with shift() on overflow, so it never grows beyond the window we need.

Guard event timestamp capture: loadToolDenialsExceededEvents now persists timestamp (ISO string) on each pushed event.

Nondeterministic latest-event selection: buildToolDenialsExceededContext now uses reduce to pick the event with the lexicographically greatest ISO timestamp instead of events[events.length - 1], which was order-dependent when multiple session directories were present.

Unrelated lock file change: reverted the 🧪 → ✅ runSuccess emoji change in test-quality-sentinel.lock.yml.

Tests updated to assert the new timestamp field in loader output. make fmt, make lint, make test-unit all pass locally.
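
A minimal sketch of the two fixes, the capped buffer and the timestamp-based selection, is shown below. The helper names `pushBounded` and `pickLatestEvent` are hypothetical and not taken from the commit. The comparison assumes every timestamp is an ISO 8601 string in the same UTC format (for example, the output of `Date.prototype.toISOString()`), which is what makes lexicographic order match chronological order.

```javascript
// Hypothetical helper: keeps at most `limit` items, dropping the oldest.
function pushBounded(buffer, item, limit = 5) {
  buffer.push(item);
  if (buffer.length > limit) buffer.shift();
  return buffer;
}

// Hypothetical helper: picks the event with the greatest ISO timestamp,
// independent of the order in which session directories were scanned.
function pickLatestEvent(events) {
  if (!Array.isArray(events) || events.length === 0) return undefined;
  return events.reduce((latest, event) =>
    String(event.timestamp ?? "") > String(latest.timestamp ?? "") ? event : latest
  );
}

const recent = [];
for (const call of ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"]) {
  pushBounded(recent, call);
}
console.log(recent);

const latest = pickLatestEvent([
  { timestamp: "2026-06-13T16:40:00.000Z", reason: "first session" },
  { timestamp: "2026-06-13T17:55:00.000Z", reason: "second session" },
  { timestamp: "2026-06-13T17:10:00.000Z", reason: "third session" },
]);
console.log(latest.reason);
```

With a strict `>` comparison, ties resolve to the earlier element in the array, so events without a timestamp never displace one that has a timestamp.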

CI needs a human re-trigger (close/reopen or workflow_dispatch) since agent pushes don't trigger workflows.

Copilot AI requested a review from pelikhan June 13, 2026 18:08
@github-actions

Contributor

@copilot review all comments and address unresolved review feedback. Then re-trigger CI after the open threads are resolved.

Generated by 👨‍🍳 PR Sous Chef · 143.8 AIC · ⌖ 1.04 AIC · ⊞ 17.4K ·

@pelikhan pelikhan merged commit 4bacc0a into main Jun 13, 2026
14 checks passed
@pelikhan pelikhan deleted the copilot/update-agentic-failure-issue branch June 13, 2026 18:27
3 participants