Include last 5 tool calls in tool-denials failure issues #39122
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Pull request overview
This PR improves triage for guard.tool_denials_exceeded failures by capturing and rendering the last 5 tool calls that occurred before the guardrail trip, so failure issues show the lead-up context instead of only the final denied request.
Changes:
- Extend `loadToolDenialsExceededEvents()` to collect recent `tool.execution_start` calls and attach `recentToolCalls` to each captured guard event.
- Update the `tool_denials_exceeded_context.md` template and renderer to include a new “Last 5 tool calls” collapsible section.
- Add/extend tests to validate capture, rendering, and truncation behavior.
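
The capture step in the first bullet can be pictured with a minimal sketch. It is not the PR's actual implementation. The function name `collectGuardEvents`, the top-level `type` field, and the `data.toolName` field are assumptions made for illustration; only the two event type names, `data.reason`, and the `recentToolCalls.slice(-5)` expression come from the PR.

```javascript
// Hypothetical sketch of the capture step. `collectGuardEvents`, `type`,
// and `data.toolName` are assumed names, not taken from the PR.
const MAX_RECENT_TOOL_CALLS = 5;

function collectGuardEvents(jsonlText) {
  const recentToolCalls = [];
  const guardEvents = [];

  for (const line of jsonlText.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;

    let parsed;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // Skip malformed lines instead of failing the whole scan.
    }
    if (!parsed || typeof parsed !== "object") continue;

    const data = parsed.data || {};
    if (parsed.type === "tool.execution_start") {
      // Remember every tool call in file order.
      recentToolCalls.push({
        toolName: typeof data.toolName === "string" ? data.toolName : "unknown",
      });
    } else if (parsed.type === "guard.tool_denials_exceeded") {
      // Snapshot only the calls seen so far, truncated to the last 5.
      guardEvents.push({
        reason: typeof data.reason === "string" ? data.reason.trim() : "",
        recentToolCalls: recentToolCalls.slice(-MAX_RECENT_TOOL_CALLS),
      });
    }
  }
  return guardEvents;
}
```

Taking the slice at the moment the guard event is seen means each captured event carries its own lead-up, even if more tool calls appear later in the file.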
Show a summary per file
| File | Description |
|---|---|
| actions/setup/md/tool_denials_exceeded_context.md | Adds a new collapsible section placeholder for rendering the last 5 tool calls. |
| actions/setup/js/handle_agent_failure.cjs | Collects tool call context from events.jsonl and renders it into the tool-denials-exceeded issue body. |
| actions/setup/js/handle_agent_failure.test.cjs | Adds coverage for capturing tool calls, rendering the new section, and truncating to the last 5. |
| .github/workflows/test-quality-sentinel.lock.yml | Updates generated lockfile content (includes a user-visible message emoji change). |
Copilot's findings
- Files reviewed: 4/4 changed files
- Comments generated: 4
| @@ -1262,6 +1272,7 @@ function loadToolDenialsExceededEvents() { | |||
| denialCount, | |||
| threshold, | |||
| reason: typeof parsed.data.reason === "string" ? parsed.data.reason.trim() : "", | |||
| recentToolCalls: recentToolCalls.slice(-5), | |||
| // Normalize the reason for display: multi-line programs (e.g. Python 3 heredocs) are | ||
| // collapsed to a single-line summary so the issue body renders cleanly. | ||
| const normalizedReason = normalizeDeniedPermissionCommand(reason); | ||
| const recentToolCallsList = | ||
| Array.isArray(latestEvent.recentToolCalls) && latestEvent.recentToolCalls.length > 0 |
| env: | ||
| GH_AW_WORKFLOW_NAME: "Test Quality Sentinel" | ||
| GH_AW_SAFE_OUTPUT_MESSAGES: "{\"footer\":\"\\u003e 🧪 *Test quality analysis by [{workflow_name}]({run_url})*{ai_credits_suffix}{history_link}\",\"runStarted\":\"🔬 [{workflow_name}]({run_url}) is analyzing test quality on this {event_type}...\",\"runSuccess\":\"🧪 [{workflow_name}]({run_url}) completed test quality analysis.\",\"runFailure\":\"❌ [{workflow_name}]({run_url}) {status} during test quality analysis.\"}" | ||
| GH_AW_SAFE_OUTPUT_MESSAGES: "{\"footer\":\"\\u003e 🧪 *Test quality analysis by [{workflow_name}]({run_url})*{ai_credits_suffix}{history_link}\",\"runStarted\":\"🔬 [{workflow_name}]({run_url}) is analyzing test quality on this {event_type}...\",\"runSuccess\":\"✅ [{workflow_name}]({run_url}) completed test quality analysis.\",\"runFailure\":\"❌ [{workflow_name}]({run_url}) {status} during test quality analysis.\"}" |
@copilot run pr-finisher skill
… revert lock emoji Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Done. Here's what was addressed in commit
- Guard event timestamp capture
- Nondeterministic latest-event selection
- Unrelated lock file change: reverted the …
- Tests updated to assert the new …
- CI needs a human re-trigger (close/reopen or …
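
For readers unfamiliar with the "nondeterministic latest-event selection" concern, the following is one possible way to pick the latest guard event deterministically once timestamps are captured. It is only a sketch: the helper name `selectLatestEvent` and the ISO-8601 `timestamp` field are assumptions, and the commit referenced above may resolve the issue differently.

```javascript
// Hypothetical sketch: `selectLatestEvent` and the `timestamp` field are
// assumed names. Events are expected in file order.
function selectLatestEvent(events) {
  let latest = null;
  let latestTime = -Infinity;

  for (const event of events) {
    const parsedTime = Date.parse(event.timestamp);
    // Events without a usable timestamp sort before all timestamped events.
    const time = Number.isNaN(parsedTime) ? -Infinity : parsedTime;

    // `>=` breaks ties in favor of the event that appears later in the file.
    if (time >= latestTime) {
      latest = event;
      latestTime = time;
    }
  }
  return latest; // null when `events` is empty
}
```

With an explicit tie-break rule, the same input always yields the same selected event, which is the property a test can assert.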
@copilot review all comments and address unresolved review feedback. Then re-trigger CI after the open threads are resolved.
Tool-denial failure issues currently show only the terminal denied request (`guard.tool_denials_exceeded` reason), which hides the lead-up to the guardrail trip. This change adds the preceding tool-call context so triage can quickly identify the permission mismatch pattern.

Scope
- `tool_denials_exceeded` issue context with the last 5 `tool.execution_start` calls seen before the guard event.

Event parsing updates (`actions/setup/js/handle_agent_failure.cjs`)
- `events.jsonl`, collect tool calls from `tool.execution_start`.
- `recentToolCalls: recentToolCalls.slice(-5)` on each captured `guard.tool_denials_exceeded` event.

Issue rendering updates (`actions/setup/md/tool_denials_exceeded_context.md`)

Test updates (`actions/setup/js/handle_agent_failure.test.cjs`)
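
As a rough illustration of the rendering side, the sketch below turns a `recentToolCalls` array into a collapsible "Last 5 tool calls" section. The helper name `renderRecentToolCalls`, the `toolName` field, and the exact markup are assumptions; the real template and renderer in the PR may differ. Only the `Array.isArray(...) && length > 0` guard mirrors the diff shown in the review.

```javascript
// Hypothetical sketch: `renderRecentToolCalls` and `toolName` are assumed
// names, and the markup is illustrative rather than the PR's template.
function renderRecentToolCalls(recentToolCalls) {
  // Render nothing when no tool-call context was captured.
  if (!Array.isArray(recentToolCalls) || recentToolCalls.length === 0) {
    return "";
  }

  const items = recentToolCalls
    .slice(-5) // Defensive truncation in case the caller passed more than 5.
    .map(call => "- `" + String(call.toolName).replace(/`/g, "'") + "`")
    .join("\n");

  return [
    "<details>",
    "<summary>Last 5 tool calls</summary>",
    "",
    items,
    "",
    "</details>",
  ].join("\n");
}
```

Returning an empty string for missing context keeps the issue body unchanged for older event logs that carry no `recentToolCalls`.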