Start iterating on #1682 On branch edburns/1682-java-tool-ergonomics by edburns · Pull Request #1692 · github/copilot-sdk

edburns · 2026-06-16T22:22:19Z

new file: 1682-java-tool-ergonomics-prompts-remove-before-merge/20260615-prompts.md

new file: 1682-java-tool-ergonomics-prompts-remove-before-merge/20260615-prompts.md Signed-off-by: Ed Burns <edburns@microsoft.com>

Packages the knowledge of creating net-new Java E2E integration tests with handcrafted replay proxy YAML snapshots into a reusable Copilot skill. New files: - .github/skills/new-java-e2e-test-yaml-and-test/SKILL.md Main skill instructions covering: YAML snapshot format, proxy matching logic, Java IT test template, verification commands, key classes/files reference table, and common pitfalls. Includes the critical constraint that Java's CapiProxy always sets GITHUB_ACTIONS=true so snapshots must be handcrafted rather than recorded. - .github/skills/new-java-e2e-test-yaml-and-test/examples.md Two working examples from the codebase: a simple single-turn test (Botanica identity REPLACE) and a multi-turn test with tool calls (system message transform with view tool). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add a new Java failsafe integration test and replay snapshot that exercise the current explicit tool-definition APIs before ergonomic annotations are added. - add `LowLevelToolDefinitionIT` covering `create`, `createOverride`, `getArgumentsAs(record)`, `getArguments()`, and `ToolSet` available tools - add `tools/low_level_tool_definition.yaml` with multi-tool call and final response replay conversations - assert handler-driven state mutation (`currentPhase`) and expected response content Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Port the shared `tools/low_level_tool_definition` replay scenario to the remaining SDKs so every language validates the same low-level tool-definition behavior. - dotnet: add `Low_Level_Tool_Definition` E2E in `dotnet/test/E2E/ToolsE2ETests.cs` - go: add `low_level_tool_definition` subtest in `go/internal/e2e/tools_e2e_test.go` - nodejs: add `low_level_tool_definition` test in `nodejs/test/e2e/tools.e2e.test.ts` - python: add `test_low_level_tool_definition` in `python/e2e/test_tools_e2e.py` - rust: add `low_level_tool_definition` in `rust/tests/e2e/tools.rs` - snapshot: finalize `test/snapshots/tools/low_level_tool_definition.yaml` for consistent replay across languages Each test covers explicit tool registration, built-in tool override behavior, raw-argument access, available-tools filtering (`custom:*` + `builtin:web_fetch`), and mutable phase state (`analyzing`) asserted after tool execution. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions

Generated by SDK Consistency Review Agent for issue #1692 · sonnet46 1.6M

github-actions · 2026-06-17T17:08:46Z

+                    Description = "Custom grep override",
+                    AdditionalProperties = new Dictionary<string, object?>
+                    {
+                        ["is_override"] = true,


Cross-SDK consistency: hardcoded internal key "is_override"

This line uses the string "is_override" directly, but that key is declared as internal const string OverridesBuiltInToolKey = "is_override" in CopilotTool.cs — it isn't part of the public API surface.

Every other SDK exposes this intent through a typed, discoverable API:

Node.js: overridesBuiltInTool: true (option in defineTool)

Python: overrides_built_in_tool=True (kwarg in @define_tool)

Go: grepTool.OverridesBuiltInTool = true (struct field)

Java: ToolDefinition.createOverride(...) (factory method)

Rust: .with_overrides_built_in_tool(true) (builder method)

Two options to bring .NET into alignment:

Option A — Use CopilotTool.DefineTool (the idiomatic .NET helper):

CopilotTool.DefineTool( CustomGrep, new CopilotToolOptions { OverridesBuiltInTool = true }, new AIFunctionFactoryOptions { Name = "grep", Description = "Custom grep override" }),

This keeps the AIFunctionArguments-based parameter access (the "low-level" part of the test) while using the typed override flag.

Option B — Make OverridesBuiltInToolKey public so callers who intentionally use raw AIFunctionFactory.Create can reference a typed constant instead of a magic string:

["is_override"] = true, // ← becomes: [CopilotTool.OverridesBuiltInToolKey] = true

Option A is preferred for consistency, since it still demonstrates the low-level AIFunctionArguments pattern without leaking an internal key.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions

Generated by SDK Consistency Review Agent for issue #1692 · sonnet46 2.2M

github-actions · 2026-06-17T17:33:32Z

+                    Description = "Custom grep override",
+                    AdditionalProperties = new Dictionary<string, object?>
+                    {
+                        ["is_override"] = true,


Cross-SDK consistency note: This hardcodes the raw "is_override" string, which is the internal value of CopilotTool.OverridesBuiltInToolKey (currently internal const). In all other SDKs the low-level override path uses a named, public API element:

SDK Approach

Go grepTool.OverridesBuiltInTool = true (exported struct field)

Java ToolDefinition.createOverride(...) (dedicated factory)

Node.js overridesBuiltInTool: true (named option)

Python overrides_built_in_tool=True (named parameter)

Rust .with_overrides_built_in_tool(true) (builder method)

For .NET, callers doing the raw approach must know the magic string "is_override". Consider making CopilotTool.OverridesBuiltInToolKey public so this test (and any other low-level callers) can write:

[CopilotTool.OverridesBuiltInToolKey] = true,

instead of relying on the private string value. Alternatively, the test could use CopilotTool.DefineTool() with CopilotToolOptions { OverridesBuiltInTool = true } for the override tool — even in a "low-level" test, the other two tools still demonstrate the raw AIFunctionArguments path.

github-actions · 2026-06-17T17:33:32Z

+                        60_000).get(90, TimeUnit.SECONDS);
+
+                assertNotNull(response, "Expected a response from the assistant");
+                String content = response.getData().content().toLowerCase();


Minor null-safety concern: content() returns String and can be null if the field is absent in the JSON response. Calling .toLowerCase() directly on it would throw a NullPointerException.

All other SDK tests guard against this:

Node.js: const content = assistantMessage?.data.content ?? "";

Python: content = assistant_message.data.content or ""

Go: explicitly checks if content == "" { t.Fatalf(...) }

Rust: uses a helper assistant_message_content() that handles null

Suggest:

String content = response.getData().content(); assertNotNull(content, "Expected non-null content in assistant response"); assertFalse(content.isEmpty(), "Expected non-empty content"); assertTrue(content.toLowerCase().contains("analyzing"), ...);

After an abort during active streaming the CLI sometimes resets the session history. Add a second conversation entry covering the 2-message recovery case [system, user] so the replay proxy can match regardless of which code path runs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions

Generated by SDK Consistency Review Agent for issue #1692 · sonnet46 2.6M

github-actions · 2026-06-17T18:08:13Z

+                    Description = "Custom grep override",
+                    AdditionalProperties = new Dictionary<string, object?>
+                    {
+                        ["is_override"] = true,


Cross-SDK consistency note — .NET "low-level" override key discoverability

The magic string "is_override" here is intentionally testing the low-level AIFunctionFactory.Create path, and that's correct. However, CopilotTool.OverridesBuiltInToolKey (which equals "is_override") is declared internal, so any .NET consumer who wants to use AIFunctionFactory.Create directly (instead of CopilotTool.DefineTool) has no discoverable public constant to reference.

The equivalent APIs in every other SDK are public and discoverable:

Node.js: overridesBuiltInTool: true property on defineTool

Python: overrides_built_in_tool=True parameter on @define_tool

Go: grepTool.OverridesBuiltInTool = true public struct field

Java: ToolDefinition.createOverride() public factory method

Rust: .with_overrides_built_in_tool(true) public builder method

Suggestion: consider either (a) making OverridesBuiltInToolKey a public const to allow consumers to reference it without hardcoding the string, or (b) adding a comment here pointing to CopilotTool.DefineTool with new CopilotToolOptions { OverridesBuiltInTool = true } as the idiomatic alternative when AIFunctionFactory.Create isn't required.

Use the same second prompt (Say hi) as the lifecycle snapshot to avoid replay cache misses in the list-sessions test path on Windows CI. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-06-17T23:30:42Z

Cross-SDK Consistency Review

✅ Consistent: `low_level_tool_definition` test added to all 6 SDKs

This PR consistently adds an equivalent low_level_tool_definition E2E test across all six SDK implementations (Node.js, Python, Go, .NET, Java, Rust). The test scenarios are well-aligned:

Same prompt and assertions across all languages
ToolSet.AddCustom("*").AddBuiltIn("web_fetch") pattern used in all 6
All three tool behaviors tested: typed argument deserialization, raw argument map access, and built-in tool override

⚠️ Minor cross-SDK inconsistency: `.NET` override uses internal magic string

All SDKs expose a typed public API to mark a tool as overriding a built-in:

SDK	Public override API
Java	`ToolDefinition.createOverride(...)`
Go	`grepTool.OverridesBuiltInTool = true`
Node.js	`overridesBuiltInTool: true` in `defineTool` options
Python	`overrides_built_in_tool=True` in `@define_tool`
Rust	`.with_overrides_built_in_tool(true)`
.NET	`AdditionalProperties = new Dictionary<string, object?> { ["is_override"] = true }` ← uses internal key string

The "is_override" key is declared internal const string OverridesBuiltInToolKey in CopilotTool.cs. The .NET SDK provides CopilotTool.DefineTool() with CopilotToolOptions { OverridesBuiltInTool = true } as the public equivalent. Using it for the override tool in the low-level test would align the .NET SDK's documented approach with the other SDKs. (See inline comment on dotnet/test/E2E/ToolsE2ETests.cs for a suggested alternative.)

🗑️ Cleanup needed before merge

The 1682-java-tool-ergonomics-prompts-remove-before-merge/ directory (3 planning/prompt markdown files) should be removed before merging — the folder name itself indicates this. These files appear to be working notes from the development process.

Generated by SDK Consistency Review Agent for issue #1692 · sonnet46 2.2M · ◷

github-actions

Generated by SDK Consistency Review Agent for issue #1692 · sonnet46 2.2M

github-actions · 2026-06-17T23:30:45Z

+                    Description = "Custom grep override",
+                    AdditionalProperties = new Dictionary<string, object?>
+                    {
+                        ["is_override"] = true,


Cross-SDK consistency note: All other SDKs expose overridesBuiltInTool as a typed, public API in their low-level tool definition (e.g., Java's ToolDefinition.createOverride(), Go's grepTool.OverridesBuiltInTool = true, Node.js's overridesBuiltInTool: true, Python's overrides_built_in_tool=True, Rust's .with_overrides_built_in_tool(true)). Here the .NET test sets "is_override" via a raw AdditionalProperties dictionary key, which is an internal SDK constant (CopilotTool.OverridesBuiltInToolKey).

Using CopilotTool.DefineTool() with CopilotToolOptions { OverridesBuiltInTool = true } would be the equivalent public SDK API and would align better with how other languages expose this feature:

CopilotTool.DefineTool(CustomGrep, new CopilotToolOptions { OverridesBuiltInTool = true, }, new AIFunctionFactoryOptions { Name = "grep", Description = "Custom grep override", }),

This is especially relevant since "is_override" is internal const — SDK users reading this test as documentation would copy a magic string that's not part of the documented public surface.

edburns force-pushed the edburns/1682-java-tool-ergonomics branch from 6c3fe31 to 299c3e4 Compare June 16, 2026 23:52

This comment has been minimized.

Sign in to view

edburns and others added 6 commits June 17, 2026 11:25

Start iterating on #1682 On branch edburns/1682-java-tool-ergonomics

e7cb02f

new file: 1682-java-tool-ergonomics-prompts-remove-before-merge/20260615-prompts.md Signed-off-by: Ed Burns <edburns@microsoft.com>

WIP prompts

94af6cc

Plan to generate test for non-ergonomic-tool-definition

c25ef62

chore: trigger GPG passphrase cache

4cf9335

edburns force-pushed the edburns/1682-java-tool-ergonomics branch from 721eeb4 to ac67ecf Compare June 17, 2026 15:26

github-actions Bot mentioned this pull request Jun 17, 2026

[aw] SDK Consistency Review Agent failed #1708

Open

Prompt to create net-new test in other non-Java languages

8f67322

This comment has been minimized.

Sign in to view

github-actions Bot reviewed Jun 17, 2026

View reviewed changes

edburns and others added 2 commits June 17, 2026 13:18

Fix python ruff lint in low-level tool e2e test

4719550

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Fix Node e2e test prettier formatting

7138fe4

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

This comment has been minimized.

Sign in to view

github-actions Bot reviewed Jun 17, 2026

View reviewed changes

edburns and others added 2 commits June 17, 2026 13:57

Merge branch 'main' into edburns/1682-java-tool-ergonomics

b02b5aa

This comment has been minimized.

Sign in to view

github-actions Bot reviewed Jun 17, 2026

View reviewed changes

Fix Windows session lifecycle replay mismatch

150b841

Use the same second prompt (Say hi) as the lifecycle snapshot to avoid replay cache misses in the list-sessions test path on Windows CI. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions Bot reviewed Jun 17, 2026

View reviewed changes

SDK	Approach
Go	`grepTool.OverridesBuiltInTool = true` (exported struct field)
Java	`ToolDefinition.createOverride(...)` (dedicated factory)
Node.js	`overridesBuiltInTool: true` (named option)
Python	`overrides_built_in_tool=True` (named parameter)
Rust	`.with_overrides_built_in_tool(true)` (builder method)

Conversation

edburns commented Jun 16, 2026

Uh oh!

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

This comment has been minimized.

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

This comment has been minimized.

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Jun 17, 2026

Cross-SDK Consistency Review

✅ Consistent: low_level_tool_definition test added to all 6 SDKs

⚠️ Minor cross-SDK inconsistency: .NET override uses internal magic string

🗑️ Cleanup needed before merge

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

✅ Consistent: `low_level_tool_definition` test added to all 6 SDKs

⚠️ Minor cross-SDK inconsistency: `.NET` override uses internal magic string