Skip to content

Start iterating on #1682 On branch edburns/1682-java-tool-ergonomics#1692

Draft
edburns wants to merge 13 commits into
mainfrom
edburns/1682-java-tool-ergonomics
Draft

Start iterating on #1682 On branch edburns/1682-java-tool-ergonomics#1692
edburns wants to merge 13 commits into
mainfrom
edburns/1682-java-tool-ergonomics

Conversation

@edburns

@edburns edburns commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

new file: 1682-java-tool-ergonomics-prompts-remove-before-merge/20260615-prompts.md

@edburns edburns force-pushed the edburns/1682-java-tool-ergonomics branch from 6c3fe31 to 299c3e4 Compare June 16, 2026 23:52
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

edburns and others added 6 commits June 17, 2026 11:25
new file:   1682-java-tool-ergonomics-prompts-remove-before-merge/20260615-prompts.md

Signed-off-by: Ed Burns <edburns@microsoft.com>
Packages the knowledge of creating net-new Java E2E integration tests
with handcrafted replay proxy YAML snapshots into a reusable Copilot skill.

New files:

- .github/skills/new-java-e2e-test-yaml-and-test/SKILL.md
  Main skill instructions covering: YAML snapshot format, proxy matching
  logic, Java IT test template, verification commands, key classes/files
  reference table, and common pitfalls. Includes the critical constraint
  that Java's CapiProxy always sets GITHUB_ACTIONS=true so snapshots must
  be handcrafted rather than recorded.

- .github/skills/new-java-e2e-test-yaml-and-test/examples.md
  Two working examples from the codebase: a simple single-turn test
  (Botanica identity REPLACE) and a multi-turn test with tool calls
  (system message transform with view tool).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a new Java failsafe integration test and replay snapshot that exercise the
current explicit tool-definition APIs before ergonomic annotations are added.

- add `LowLevelToolDefinitionIT` covering `create`, `createOverride`,
  `getArgumentsAs(record)`, `getArguments()`, and `ToolSet` available tools
- add `tools/low_level_tool_definition.yaml` with multi-tool call and final
  response replay conversations
- assert handler-driven state mutation (`currentPhase`) and expected response
  content

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@edburns edburns force-pushed the edburns/1682-java-tool-ergonomics branch from 721eeb4 to ac67ecf Compare June 17, 2026 15:26
@github-actions

This comment has been minimized.

Port the shared `tools/low_level_tool_definition` replay scenario to the
remaining SDKs so every language validates the same low-level tool-definition
behavior.

- dotnet: add `Low_Level_Tool_Definition` E2E in
  `dotnet/test/E2E/ToolsE2ETests.cs`
- go: add `low_level_tool_definition` subtest in
  `go/internal/e2e/tools_e2e_test.go`
- nodejs: add `low_level_tool_definition` test in
  `nodejs/test/e2e/tools.e2e.test.ts`
- python: add `test_low_level_tool_definition` in
  `python/e2e/test_tools_e2e.py`
- rust: add `low_level_tool_definition` in `rust/tests/e2e/tools.rs`
- snapshot: finalize `test/snapshots/tools/low_level_tool_definition.yaml`
  for consistent replay across languages

Each test covers explicit tool registration, built-in tool override behavior,
raw-argument access, available-tools filtering (`custom:*` + `builtin:web_fetch`),
and mutable phase state (`analyzing`) asserted after tool execution.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

This comment has been minimized.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generated by SDK Consistency Review Agent for issue #1692 · sonnet46 1.6M

Description = "Custom grep override",
AdditionalProperties = new Dictionary<string, object?>
{
["is_override"] = true,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cross-SDK consistency: hardcoded internal key "is_override"

This line uses the string "is_override" directly, but that key is declared as internal const string OverridesBuiltInToolKey = "is_override" in CopilotTool.cs — it isn't part of the public API surface.

Every other SDK exposes this intent through a typed, discoverable API:

  • Node.js: overridesBuiltInTool: true (option in defineTool)
  • Python: overrides_built_in_tool=True (kwarg in @define_tool)
  • Go: grepTool.OverridesBuiltInTool = true (struct field)
  • Java: ToolDefinition.createOverride(...) (factory method)
  • Rust: .with_overrides_built_in_tool(true) (builder method)

Two options to bring .NET into alignment:

Option A — Use CopilotTool.DefineTool (the idiomatic .NET helper):

CopilotTool.DefineTool(
    CustomGrep,
    new CopilotToolOptions { OverridesBuiltInTool = true },
    new AIFunctionFactoryOptions { Name = "grep", Description = "Custom grep override" }),

This keeps the AIFunctionArguments-based parameter access (the "low-level" part of the test) while using the typed override flag.

Option B — Make OverridesBuiltInToolKey public so callers who intentionally use raw AIFunctionFactory.Create can reference a typed constant instead of a magic string:

["is_override"] = true,  // ← becomes: [CopilotTool.OverridesBuiltInToolKey] = true

Option A is preferred for consistency, since it still demonstrates the low-level AIFunctionArguments pattern without leaking an internal key.

edburns and others added 2 commits June 17, 2026 13:18
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

This comment has been minimized.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generated by SDK Consistency Review Agent for issue #1692 · sonnet46 2.2M

Description = "Custom grep override",
AdditionalProperties = new Dictionary<string, object?>
{
["is_override"] = true,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cross-SDK consistency note: This hardcodes the raw "is_override" string, which is the internal value of CopilotTool.OverridesBuiltInToolKey (currently internal const). In all other SDKs the low-level override path uses a named, public API element:

SDK Approach
Go grepTool.OverridesBuiltInTool = true (exported struct field)
Java ToolDefinition.createOverride(...) (dedicated factory)
Node.js overridesBuiltInTool: true (named option)
Python overrides_built_in_tool=True (named parameter)
Rust .with_overrides_built_in_tool(true) (builder method)

For .NET, callers doing the raw approach must know the magic string "is_override". Consider making CopilotTool.OverridesBuiltInToolKey public so this test (and any other low-level callers) can write:

[CopilotTool.OverridesBuiltInToolKey] = true,

instead of relying on the private string value. Alternatively, the test could use CopilotTool.DefineTool() with CopilotToolOptions { OverridesBuiltInTool = true } for the override tool — even in a "low-level" test, the other two tools still demonstrate the raw AIFunctionArguments path.

60_000).get(90, TimeUnit.SECONDS);

assertNotNull(response, "Expected a response from the assistant");
String content = response.getData().content().toLowerCase();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor null-safety concern: content() returns String and can be null if the field is absent in the JSON response. Calling .toLowerCase() directly on it would throw a NullPointerException.

All other SDK tests guard against this:

  • Node.js: const content = assistantMessage?.data.content ?? "";
  • Python: content = assistant_message.data.content or ""
  • Go: explicitly checks if content == "" { t.Fatalf(...) }
  • Rust: uses a helper assistant_message_content() that handles null

Suggest:

String content = response.getData().content();
assertNotNull(content, "Expected non-null content in assistant response");
assertFalse(content.isEmpty(), "Expected non-empty content");
assertTrue(content.toLowerCase().contains("analyzing"), ...);

edburns and others added 2 commits June 17, 2026 13:57
After an abort during active streaming the CLI sometimes resets the session history. Add a second conversation entry covering the 2-message recovery case [system, user] so the replay proxy can match regardless of which code path runs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

This comment has been minimized.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generated by SDK Consistency Review Agent for issue #1692 · sonnet46 2.6M

Description = "Custom grep override",
AdditionalProperties = new Dictionary<string, object?>
{
["is_override"] = true,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cross-SDK consistency note — .NET "low-level" override key discoverability

The magic string "is_override" here is intentionally testing the low-level AIFunctionFactory.Create path, and that's correct. However, CopilotTool.OverridesBuiltInToolKey (which equals "is_override") is declared internal, so any .NET consumer who wants to use AIFunctionFactory.Create directly (instead of CopilotTool.DefineTool) has no discoverable public constant to reference.

The equivalent APIs in every other SDK are public and discoverable:

  • Node.js: overridesBuiltInTool: true property on defineTool
  • Python: overrides_built_in_tool=True parameter on @define_tool
  • Go: grepTool.OverridesBuiltInTool = true public struct field
  • Java: ToolDefinition.createOverride() public factory method
  • Rust: .with_overrides_built_in_tool(true) public builder method

Suggestion: consider either (a) making OverridesBuiltInToolKey a public const to allow consumers to reference it without hardcoding the string, or (b) adding a comment here pointing to CopilotTool.DefineTool with new CopilotToolOptions { OverridesBuiltInTool = true } as the idiomatic alternative when AIFunctionFactory.Create isn't required.

Use the same second prompt (Say hi) as the lifecycle snapshot to avoid replay cache misses in the list-sessions test path on Windows CI.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

Copy link
Copy Markdown
Contributor

Cross-SDK Consistency Review

✅ Consistent: low_level_tool_definition test added to all 6 SDKs

This PR consistently adds an equivalent low_level_tool_definition E2E test across all six SDK implementations (Node.js, Python, Go, .NET, Java, Rust). The test scenarios are well-aligned:

  • Same prompt and assertions across all languages
  • ToolSet.AddCustom("*").AddBuiltIn("web_fetch") pattern used in all 6
  • All three tool behaviors tested: typed argument deserialization, raw argument map access, and built-in tool override

⚠️ Minor cross-SDK inconsistency: .NET override uses internal magic string

All SDKs expose a typed public API to mark a tool as overriding a built-in:

SDK Public override API
Java ToolDefinition.createOverride(...)
Go grepTool.OverridesBuiltInTool = true
Node.js overridesBuiltInTool: true in defineTool options
Python overrides_built_in_tool=True in @define_tool
Rust .with_overrides_built_in_tool(true)
.NET AdditionalProperties = new Dictionary<string, object?> { ["is_override"] = true } ← uses internal key string

The "is_override" key is declared internal const string OverridesBuiltInToolKey in CopilotTool.cs. The .NET SDK provides CopilotTool.DefineTool() with CopilotToolOptions { OverridesBuiltInTool = true } as the public equivalent. Using it for the override tool in the low-level test would align the .NET SDK's documented approach with the other SDKs. (See inline comment on dotnet/test/E2E/ToolsE2ETests.cs for a suggested alternative.)


🗑️ Cleanup needed before merge

The 1682-java-tool-ergonomics-prompts-remove-before-merge/ directory (3 planning/prompt markdown files) should be removed before merging — the folder name itself indicates this. These files appear to be working notes from the development process.

Generated by SDK Consistency Review Agent for issue #1692 · sonnet46 2.2M ·

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generated by SDK Consistency Review Agent for issue #1692 · sonnet46 2.2M

Description = "Custom grep override",
AdditionalProperties = new Dictionary<string, object?>
{
["is_override"] = true,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cross-SDK consistency note: All other SDKs expose overridesBuiltInTool as a typed, public API in their low-level tool definition (e.g., Java's ToolDefinition.createOverride(), Go's grepTool.OverridesBuiltInTool = true, Node.js's overridesBuiltInTool: true, Python's overrides_built_in_tool=True, Rust's .with_overrides_built_in_tool(true)). Here the .NET test sets "is_override" via a raw AdditionalProperties dictionary key, which is an internal SDK constant (CopilotTool.OverridesBuiltInToolKey).

Using CopilotTool.DefineTool() with CopilotToolOptions { OverridesBuiltInTool = true } would be the equivalent public SDK API and would align better with how other languages expose this feature:

CopilotTool.DefineTool(CustomGrep, new CopilotToolOptions
{
    OverridesBuiltInTool = true,
}, new AIFunctionFactoryOptions
{
    Name = "grep",
    Description = "Custom grep override",
}),

This is especially relevant since "is_override" is internal const — SDK users reading this test as documentation would copy a magic string that's not part of the documented public surface.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant