fix: detect stream_options HTTP 400 as non-retryable Anthropic BYOK error #43127

Open
pelikhan with Copilot wants to merge 6 commits into main from copilot/aw-failures-fix-stream-options

Conversation

Copilot AI commented Jul 3, 2026

Contributor

When the Copilot SDK runs in BYOK mode against an Anthropic-format endpoint, it can send the OpenAI-only stream_options field, causing Anthropic to return the error "400 stream_options: Extra inputs are not permitted". Because HTTP_400_RESPONSE_ERROR_PATTERN didn't match this error shape, the harness silently exhausted all 4 retries with failureClass=partial_execution and zero agent output.

Changes

  • copilot_harness.cjs + detect_agent_errors.cjs — Add a third alternative to HTTP_400_RESPONSE_ERROR_PATTERN matching 400[^\n]*stream_options:\s*Extra inputs are not permitted. The harness now classifies this as isHTTP400ResponseError, retries once as a fresh run, then breaks with a clear diagnostic instead of burning all retries.
// Before
const HTTP_400_RESPONSE_ERROR_PATTERN = /(?:Response status code does not indicate success:\s*400(?:\s*\(Bad Request\))?|400[^\n]*no model endpoints available given user constraints)/i;

// After — adds stream_options alternative
const HTTP_400_RESPONSE_ERROR_PATTERN = /(?:Response status code does not indicate success:\s*400(?:\s*\(Bad Request\))?|400[^\n]*no model endpoints available given user constraints|400[^\n]*stream_options:\s*Extra inputs are not permitted)/i;
  • copilot_harness.test.cjs — Three regression tests: exact match on the full SDK error string, match embedded in larger output, and a false-positive guard for unrelated stream_options mentions.
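As a quick illustration of the three cases the regression tests are described as covering, here is a minimal sketch. The pattern is copied from the "After" line above; the `exact` and `unrelated` strings are quoted from the review thread, while the `embedded` sample and the `samples`/`results` names are assumptions made for this example, not the actual test fixtures.

```javascript
// Pattern copied verbatim from the "After" version in this PR.
const HTTP_400_RESPONSE_ERROR_PATTERN =
  /(?:Response status code does not indicate success:\s*400(?:\s*\(Bad Request\))?|400[^\n]*no model endpoints available given user constraints|400[^\n]*stream_options:\s*Extra inputs are not permitted)/i;

// Illustrative inputs. "embedded" is a hypothetical multi-line log, not a real fixture.
const samples = {
  exact: "[copilot-sdk-driver] [sdk-driver] error: 400 400 400 stream_options: Extra inputs are not permitted",
  embedded: "starting agent\nerror: 400 400 400 stream_options: Extra inputs are not permitted\nexiting",
  unrelated: "Configuring stream_options for the request",
};

// The regex has no "g" flag, so repeated .test() calls do not carry state.
const results = Object.fromEntries(
  Object.entries(samples).map(([name, text]) => [name, HTTP_400_RESPONSE_ERROR_PATTERN.test(text)])
);

console.log(results);
// → { exact: true, embedded: true, unrelated: false }
```

The unrelated message is rejected because the new alternative requires a literal "400" earlier on the same line.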

Root-cause note: inferProviderTypeForModel already short-circuits to "openai" for provider=copilot (line 421, awf_reflect.cjs), preventing the type mismatch that triggered the original failures. This PR adds the safety-net classifier for any residual cases where the 400 reaches the harness.

…ropic BYOK

Update HTTP_400_RESPONSE_ERROR_PATTERN in copilot_harness.cjs and
detect_agent_errors.cjs to match the Copilot SDK error
"400 400 400 stream_options: Extra inputs are not permitted".

This error is emitted when the SDK sends an OpenAI-only field
(stream_options) to an Anthropic-type provider endpoint. Previously
the error was not matched, so all 4 retry attempts failed identically
and silently (failureClass=partial_execution) instead of stopping
immediately with a clear diagnostic.

With this fix the harness detects the error as isHTTP400ResponseError,
retries once as a fresh run (in case conversation state is stale) and
then breaks with "HTTP 400 response error — not retrying".
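The retry behavior described above (one fresh retry, then stop) can be sketched roughly as follows. This is not the harness code: `runWithRetries`, `runAttempt`, `MAX_RETRIES`, and the log strings are hypothetical names invented for the example, and treating "4 retries" as four attempts after the first is an assumption.

```javascript
// Pattern copied from the PR; everything else in this block is a hypothetical sketch.
const HTTP_400_RESPONSE_ERROR_PATTERN =
  /(?:Response status code does not indicate success:\s*400(?:\s*\(Bad Request\))?|400[^\n]*no model endpoints available given user constraints|400[^\n]*stream_options:\s*Extra inputs are not permitted)/i;

const MAX_RETRIES = 4; // assumed meaning of "all 4 retries" in the description

function runWithRetries(runAttempt) {
  const log = [];
  let http400Retried = false;
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    const { ok, output } = runAttempt(attempt);
    if (ok) {
      log.push(`attempt ${attempt}: success`);
      return { ok: true, log };
    }
    const isHTTP400ResponseError = HTTP_400_RESPONSE_ERROR_PATTERN.test(output);
    if (isHTTP400ResponseError) {
      if (http400Retried) {
        // Second identical 400: stop instead of burning the remaining retries.
        log.push(`attempt ${attempt}: HTTP 400 again, stopping`);
        return { ok: false, log };
      }
      http400Retried = true;
      log.push(`attempt ${attempt}: HTTP 400, retrying once as a fresh run`);
      continue;
    }
    log.push(`attempt ${attempt}: other failure, retrying`);
  }
  return { ok: false, log };
}

// Simulate an endpoint that always rejects stream_options.
let calls = 0;
const outcome = runWithRetries(() => {
  calls += 1;
  return { ok: false, output: "error: 400 400 400 stream_options: Extra inputs are not permitted" };
});
console.log(calls, outcome.log);
```

With the always-failing stub the loop runs twice and then stops, rather than consuming every retry.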

Note: the root-cause (inferProviderTypeForModel always returning
"openai" for the "copilot" provider so the OpenAI-format client is
used against the Copilot LLM gateway) is already covered by the
check at awf_reflect.cjs line 421 and its companion tests in
awf_reflect.test.cjs.

Closes #43032

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix strip of OpenAI-only stream_options for type=anthropic providers" to "fix: detect stream_options HTTP 400 as non-retryable Anthropic BYOK error" Jul 3, 2026
Copilot AI requested a review from pelikhan July 3, 2026 06:48
@pelikhan pelikhan marked this pull request as ready for review July 3, 2026 06:59
Copilot AI review requested due to automatic review settings July 3, 2026 06:59
@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

⚠️ PR Code Quality Reviewer failed during code quality review.

@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

Test Quality Sentinel completed test quality analysis.

@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #43127 does not have the 'implementation' label and has 0 new lines of code in business logic directories (threshold: 100).

Copilot AI left a comment

Contributor

Pull request overview

This PR improves non-retryable error detection for Copilot SDK BYOK runs against Anthropic-format endpoints by recognizing a specific HTTP 400 validation error caused by the OpenAI-only stream_options field, preventing wasted retries and enabling clearer diagnostics.

Changes:

  • Extended HTTP_400_RESPONSE_ERROR_PATTERN in both the harness and action-side detector to match 400 ... stream_options: Extra inputs are not permitted.
  • Added regression tests in copilot_harness.test.cjs to validate matching (direct + embedded) and to guard against unrelated stream_options mentions.
Summary per file:

| File | Description |
| --- | --- |
| actions/setup/js/detect_agent_errors.cjs | Extends the shared HTTP 400 detection regex to include the stream_options Anthropic BYOK error shape. |
| actions/setup/js/copilot_harness.cjs | Extends the harness-side HTTP 400 detection regex to classify the stream_options error as non-retryable. |
| actions/setup/js/copilot_harness.test.cjs | Adds regression tests covering the new stream_options-related HTTP 400 detection behavior. |

Review details


  • Files reviewed: 3/3 changed files
  • Comments generated: 1
  • Review effort level: Low

Comment on lines 66 to +75
  // Pattern: Generic HTTP 400 Bad Request responses emitted by engine / SDK wrappers.
  // NOTE: keep in sync with HTTP_400_RESPONSE_ERROR_PATTERN in copilot_harness.cjs.
  // Also matches "400 400 400 no model endpoints available given user constraints" which is emitted
  // by the Copilot SDK when no model endpoints are available for the user's configured constraints.
- // The second alternative is anchored to a leading "400" to avoid false positives from unrelated
+ // Also matches "400 400 400 stream_options: Extra inputs are not permitted" which is emitted when
+ // the Copilot SDK sends an OpenAI-only field to an Anthropic-type provider.
+ // The non-first alternatives are anchored to a leading "400" to avoid false positives from unrelated
  // diagnostic or informational messages that might contain the phrase.
- const HTTP_400_RESPONSE_ERROR_PATTERN = /(?:Response status code does not indicate success:\s*400(?:\s*\(Bad Request\))?|400[^\n]*no model endpoints available given user constraints)/i;
+ const HTTP_400_RESPONSE_ERROR_PATTERN =
+   /(?:Response status code does not indicate success:\s*400(?:\s*\(Bad Request\))?|400[^\n]*no model endpoints available given user constraints|400[^\n]*stream_options:\s*Extra inputs are not permitted)/i;
@github-actions github-actions Bot mentioned this pull request Jul 3, 2026

@github-actions github-actions Bot left a comment

Contributor

Review: fix: detect stream_options HTTP 400 as non-retryable Anthropic BYOK error

Pattern change is correct and safe. Key checks:

  • Regex safety — New alternative 400[^\n]*stream_options:\s*Extra inputs are not permitted does not match empty string, has no zero-width match risk, and correctly false-negatives on unrelated stream_options mentions.
  • Sync maintained — Both copilot_harness.cjs and detect_agent_errors.cjs updated identically per the // NOTE: keep in sync contract.
  • Tests — Three regressions cover exact match, embedded output, and false-positive guard.
  • Retry path unchanged — The existing single-retry-then-stop logic already applies to all isHTTP400ResponseError cases; no additional wiring needed.
  • Comment accuracy — Updated from "second alternative" to "non-first alternatives" to reflect the 3-alternative pattern.
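To make the "regex safety" point concrete, the sketch below isolates just the new alternative and probes its boundaries. The input strings and the `checks` name are made up for illustration; only the regex fragment comes from the PR.

```javascript
// The third alternative on its own, copied from the PR's pattern.
const STREAM_OPTIONS_ALTERNATIVE = /400[^\n]*stream_options:\s*Extra inputs are not permitted/i;

// Hypothetical inputs chosen to probe the anchoring.
const sameLine = "error: 400 400 400 stream_options: Extra inputs are not permitted";
const splitAcrossLines = "HTTP 400 from upstream\nnote: stream_options: Extra inputs are not permitted";
const missingStatus = "stream_options: Extra inputs are not permitted";

const checks = {
  sameLine: STREAM_OPTIONS_ALTERNATIVE.test(sameLine),
  // [^\n]* cannot cross the newline, so a "400" on an earlier line does not count.
  splitAcrossLines: STREAM_OPTIONS_ALTERNATIVE.test(splitAcrossLines),
  // Without a leading "400" the phrase alone is not enough.
  missingStatus: STREAM_OPTIONS_ALTERNATIVE.test(missingStatus),
  // The alternative has mandatory literals, so it never matches the empty string.
  emptyString: STREAM_OPTIONS_ALTERNATIVE.test(""),
};

console.log(checks);
// → { sameLine: true, splitAcrossLines: false, missingStatus: false, emptyString: false }
```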

🧵 Reviewed using Impeccable skills by Impeccable Skills Reviewer · 23.2 AIC · ⌖ 6.06 AIC · ⊞ 4.9K

@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 80/100 — Excellent

Analyzed 3 test(s): 3 design, 0 implementation, 0 violation(s).

📊 Metrics (3 tests)
| Metric | Value |
| --- | --- |
| Analyzed | 3 (Go: 0, JS: 3) |
| ✅ Design | 3 (100%) |
| ⚠️ Implementation | 0 (0%) |
| Edge/error coverage | 2 (67%) |
| Duplicate clusters | 0 |
| Inflation | YES (test +13 lines vs prod +5 lines, ratio 2.6:1) |
| 🚨 Violations | 0 |

| Test | File | Classification | Issues |
| --- | --- | --- | --- |
| matches the stream_options error (basic positive) | copilot_harness.test.cjs:1042 | design_test | None (single positive assertion, happy path) |
| matches stream_options embedded in larger output | copilot_harness.test.cjs:1046 | design_test | None (covers multiline/embedded-output edge case) |
| does not false-positive on unrelated messages mentioning stream_options | copilot_harness.test.cjs:1051 | design_test | None (strong negative boundary assertion) |

Verdict

Passed. 0% implementation tests (threshold: 30%). Inflation ratio (2.6:1) exceeds 2:1, costing 10 pts, but all three tests verify behavioral contracts of the new regex pattern: positive match, embedded multiline match, and false-positive guard. No violations detected.

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

🧪 Test quality analysis by Test Quality Sentinel · 31.5 AIC · ⌖ 9.58 AIC · ⊞ 6.8K ·
Comment /review to run again

@github-actions github-actions Bot left a comment

Contributor

✅ Test Quality Sentinel: 80/100. 0% implementation tests (threshold: 30%).

@github-actions github-actions Bot left a comment

Contributor

Skills-Based Review 🧠

Applied /diagnosing-bugs and /tdd — one non-blocking suggestion, otherwise this is a clean, well-structured fix.

📋 Key Themes & Highlights

Key Themes

  • Test coverage gap (minor): The three new regression tests were added only to copilot_harness.test.cjs. detect_agent_errors.test.cjs has its own describe("HTTP_400_RESPONSE_ERROR_PATTERN", ...) block that tests the exported symbol directly — it should get the same stream_options cases so pattern drift is caught at the source.

Positive Highlights

  • Precise root-cause diagnosis — PR description clearly explains the BYOK mismatch, why retries were silently exhausted, and distinguishes the safety-net fix from the upstream inferProviderTypeForModel mitigation.
  • False-positive guard included: the "Configuring stream_options for the request" negative test is exactly the kind of discipline /diagnosing-bugs calls for.
  • Pattern kept in sync — both copilot_harness.cjs and detect_agent_errors.cjs updated in lock-step with clear NOTE: keep in sync comments.
  • Minimal change — only the regex literal and its comment are touched; no structural churn.
  • Comment quality — the updated comment accurately describes all three alternatives and why the anchoring is there.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 50.9 AIC · ⌖ 5.03 AIC · ⊞ 6.6K
Comment /matt to run again

  // diagnostic or informational messages that might contain the phrase.
- const HTTP_400_RESPONSE_ERROR_PATTERN = /(?:Response status code does not indicate success:\s*400(?:\s*\(Bad Request\))?|400[^\n]*no model endpoints available given user constraints)/i;
+ const HTTP_400_RESPONSE_ERROR_PATTERN =
+   /(?:Response status code does not indicate success:\s*400(?:\s*\(Bad Request\))?|400[^\n]*no model endpoints available given user constraints|400[^\n]*stream_options:\s*Extra inputs are not permitted)/i;

Contributor

[/tdd] detect_agent_errors.test.cjs directly tests the exported HTTP_400_RESPONSE_ERROR_PATTERN (lines 185–209), but no stream_options cases were added there — only copilot_harness.test.cjs got the new coverage. If the two pattern definitions drift in the future, the dedicated test file won't catch it.

💡 Suggested additions to detect_agent_errors.test.cjs

Add a parallel set of cases inside the existing describe("HTTP_400_RESPONSE_ERROR_PATTERN", ...) block:

it("matches the stream_options: Extra inputs are not permitted Anthropic BYOK error", () => {
  expect(
    HTTP_400_RESPONSE_ERROR_PATTERN.test(
      "[copilot-sdk-driver] [sdk-driver] error: 400 400 400 stream_options: Extra inputs are not permitted"
    )
  ).toBe(true);
});

it("does not false-positive on unrelated messages mentioning stream_options", () => {
  expect(
    HTTP_400_RESPONSE_ERROR_PATTERN.test("Configuring stream_options for the request")
  ).toBe(false);
});

This mirrors the new cases added to copilot_harness.test.cjs and directly guards the exported symbol against drift.
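A complementary way to guard against drift would be to compare the two regex objects directly. The following is only a sketch: it assumes both modules could expose their pattern for comparison, and it uses inline stand-ins (`harnessPattern`, `detectorPattern`) plus a hypothetical helper `patternsInSync` instead of real imports, since the export shape of copilot_harness.cjs is not shown in this thread.

```javascript
// Stand-ins for the two definitions; in the repository these would come from
// copilot_harness.cjs and detect_agent_errors.cjs (import shape assumed, not confirmed).
const harnessPattern =
  /(?:Response status code does not indicate success:\s*400(?:\s*\(Bad Request\))?|400[^\n]*no model endpoints available given user constraints|400[^\n]*stream_options:\s*Extra inputs are not permitted)/i;
const detectorPattern =
  /(?:Response status code does not indicate success:\s*400(?:\s*\(Bad Request\))?|400[^\n]*no model endpoints available given user constraints|400[^\n]*stream_options:\s*Extra inputs are not permitted)/i;

// Hypothetical helper: two RegExp objects are "in sync" when source and flags are identical.
function patternsInSync(a, b) {
  return a.source === b.source && a.flags === b.flags;
}

// A drifted copy (the pre-PR pattern) to show the check failing.
const stalePattern =
  /(?:Response status code does not indicate success:\s*400(?:\s*\(Bad Request\))?|400[^\n]*no model endpoints available given user constraints)/i;

console.log(patternsInSync(harnessPattern, detectorPattern)); // → true
console.log(patternsInSync(harnessPattern, stalePattern)); // → false
```

Comparing `source` and `flags` catches textual drift even when no test input happens to exercise the differing alternative.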

@copilot please address this.

Contributor Author

Added in ea44e72: detect_agent_errors.test.cjs now includes both the positive match case and the false-positive guard for stream_options, mirroring the coverage in copilot_harness.test.cjs.

@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

🤖 PR Triage

| Field | Value |
| --- | --- |
| Category | bug |
| Risk | 🟢 Low |
| Score | 60 / 100 |
| Action | fast_track |
| Batch | pr-batch:bug-fixes |

Score Breakdown: Impact 30 + Urgency 18 + Quality 12

Rationale: Detects HTTP 400 stream_options as non-retryable Anthropic BYOK error (23+/4−, 3 files). Targeted fix to JS harness error detection; has 5 active review comments. Low risk, focused on error classification path.

Recommended Action: Fast-track human review; small and well-targeted.

Generated by 🔧 PR Triage Agent · 86.3 AIC · ⌖ 6.42 AIC · ⊞ 5.5K ·

@pelikhan

pelikhan commented Jul 3, 2026

Collaborator

@copilot run pr-finisher skill

….cjs

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
@pelikhan

pelikhan commented Jul 4, 2026

Collaborator

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jul 4, 2026

Contributor Author

@copilot lint js https://github.com/github/gh-aw/actions/runs/28650178544/job/85021788462

Fixed in f60e097 — ran make fmt-cjs to apply Prettier formatting to detect_agent_errors.test.cjs.


Development

Successfully merging this pull request may close these issues.

[aw-failures] [aw-fix] Copilot SDK BYOK: strip OpenAI-only stream_options for type=anthropic providers (HTTP 400 kills agent runs)

3 participants