fix(kosong): strip JSON Schema metadata from Google GenAI tool parameters #739

Open
xiaoju111a wants to merge 2 commits into MoonshotAI:main from xiaoju111a:fix/google-genai-schema-metadata

@xiaoju111a (Contributor) commented Jan 28, 2026

Related Issue

Resolves #734

Description

This PR fixes a compatibility issue between the Google GenAI provider and MCP tools whose schemas include standard JSON Schema metadata fields.

Problem

When MCP tools (such as Exa MCP) are used with the Google GenAI provider, the following validation error occurs:

1 validation error for FunctionDeclaration
parameters.$schema
  Extra inputs are not permitted [type=extra_forbidden, input_value='http://json-schema.org/draft-07/schema#', input_type=str]

Root cause:

  • An MCP tool's inputSchema can include standard JSON Schema metadata fields such as $schema
  • The Google GenAI SDK's Pydantic model has extra='forbid', so it rejects any field it does not declare
  • Kimi CLI was passing the complete schema to the SDK without filtering out the metadata
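
To make the root cause concrete, here is a minimal sketch of how a strict validator treats an MCP-style inputSchema. It is a plain-Python stand-in, not the google-genai SDK or Pydantic itself: `validate_strict`, `ExtraForbiddenError`, and the `ALLOWED_KEYS` subset are hypothetical names chosen for illustration.

```python
# Hypothetical stand-in for a model configured with extra='forbid'.
# ALLOWED_KEYS is an assumed subset, not the SDK's real field list.
ALLOWED_KEYS = {"type", "properties", "required", "description", "items", "enum"}


class ExtraForbiddenError(ValueError):
    """Raised when a schema dict carries a key the strict model does not declare."""


def validate_strict(schema: dict) -> dict:
    # Collect every top-level key that the strict model does not know about.
    extra = sorted(key for key in schema if key not in ALLOWED_KEYS)
    if extra:
        raise ExtraForbiddenError(f"Extra inputs are not permitted: {extra}")
    return schema


# Shaped like the schema from the error above; the "query" property is made up.
mcp_input_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {"query": {"type": "string"}},
    "required": ["query"],
}

try:
    validate_strict(mcp_input_schema)
    rejected = False
    message = ""
except ExtraForbiddenError as exc:
    rejected = True
    message = str(exc)
```

The schema is a perfectly valid JSON Schema document; it is rejected only because the receiving model refuses undeclared keys.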

Solution

Strip JSON Schema metadata fields in the tool_to_google_genai() function before passing the schema to the Google GenAI SDK. We filter the four fields that the SDK rejects:

  • $schema, $id, $comment - JSON Schema metadata
  • examples - Example values (not part of validation schema)

Note: $defs and definitions are already removed by kosong's deref_json_schema() function, so we don't need to filter them here.

def tool_to_google_genai(tool: KosongTool) -> Tool:
    # Strip JSON Schema metadata fields (google-genai SDK has extra='forbid')
    # Note: $defs/definitions are already removed by kosong's deref_json_schema()
    parameters = {
        k: v
        for k, v in tool.parameters.items()
        if k not in ("$schema", "$id", "$comment", "examples")
    }
    
    return Tool(
        function_declarations=[
            FunctionDeclaration(
                name=tool.name,
                description=tool.description,
                parameters=parameters,
            )
        ]
    )
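
A quick usage sketch of the same filter, applied to a schema shaped like the one in the error above. `strip_top_level_metadata` and `exa_like_schema` are illustrative names and not part of kosong; the actual change inlines the comprehension in tool_to_google_genai().

```python
METADATA_KEYS = ("$schema", "$id", "$comment", "examples")


def strip_top_level_metadata(parameters: dict) -> dict:
    """Return a copy of `parameters` without the top-level metadata keys."""
    return {k: v for k, v in parameters.items() if k not in METADATA_KEYS}


# Hypothetical input; only the "$schema" value is taken from the reported error.
exa_like_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$comment": "generated by an MCP server",
    "type": "object",
    "properties": {"query": {"type": "string"}},
    "required": ["query"],
}

cleaned = strip_top_level_metadata(exa_like_schema)
```

Because the comprehension builds a new dict, the tool's original parameters are left untouched, which keeps other providers unaffected.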

Impact

  • ✅ Only affects Google GenAI provider
  • ✅ Backward compatible (tools without metadata work as before)
  • ✅ No performance impact (simple dict filtering)
  • ✅ Follows JSON Schema best practices (metadata fields are not parameter definitions)

Testing

Manually verified with the Google GenAI SDK that each of the four fields is rejected:

# Tested fields that cause ValidationError:
# ❌ $schema, $id, $comment, examples

# After fix: ✅ All metadata stripped automatically
# Tools work correctly with Google GenAI provider

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked the related issue, if any.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have run make gen-changelog to update the changelog.
  • I have run make gen-docs to update the user documentation.

…ters

Google GenAI SDK's Pydantic model has extra='forbid', which rejects
JSON Schema metadata fields like $schema, $id, and $comment.

This causes validation errors when using MCP tools that include
standard JSON Schema metadata in their inputSchema.

Fixes MoonshotAI#734

@devin-ai-integration (Bot, Contributor) left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

@xxchan requested a review from pvzheroes125 on January 28, 2026 05:10

@devin-ai-integration (Bot, Contributor) left a comment

Devin Review found 1 new potential issue.

Comment on lines +360 to +364
    parameters = {
        k: v
        for k, v in tool.parameters.items()
        if k not in ("$schema", "$id", "$comment", "examples")
    }

🟡 JSON Schema metadata stripping only applied at top level, not recursively into nested schemas

The new filtering at packages/kosong/src/kosong/contrib/chat_provider/google_genai.py:360-364 only strips metadata fields ($schema, $id, $comment, examples) from the top-level keys of tool.parameters. However, these fields—especially examples—can appear at any depth in a JSON Schema (e.g., inside properties entries when Pydantic's Field(examples=[...]) is used). Since the google-genai SDK uses extra='forbid' (as noted in the comment on line 358), it will recursively validate nested schema dicts and reject any that contain these forbidden keys. A tool whose parameter field uses Field(examples=["foo"]) would produce a schema like {"properties": {"name": {"type": "string", "examples": ["foo"]}}}, and the nested examples would still cause the SDK to raise a validation error.
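
The gap can be seen without the SDK. The sketch below applies the top-level filter to the schema from the example above and then lists where metadata keys remain. `strip_top_level` and `find_metadata_paths` are hypothetical helpers written for this illustration; whether the SDK actually rejects the leftover key depends on its nested validation, as described above.

```python
METADATA_KEYS = ("$schema", "$id", "$comment", "examples")


def strip_top_level(parameters: dict) -> dict:
    # Same comprehension as in the PR: only top-level keys are inspected.
    return {k: v for k, v in parameters.items() if k not in METADATA_KEYS}


def find_metadata_paths(node, path: str = "") -> list[str]:
    """List every dotted path at which a metadata key still appears."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}.{key}" if path else key
            if key in METADATA_KEYS:
                found.append(here)
            else:
                found.extend(find_metadata_paths(value, here))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_metadata_paths(item, f"{path}[{index}]"))
    return found


schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {"name": {"type": "string", "examples": ["foo"]}},
}

cleaned = strip_top_level(schema)
leftover = find_metadata_paths(cleaned)
```

Here the top-level $schema is gone, but the examples entry under properties.name survives the filter.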

Prompt for agents
In packages/kosong/src/kosong/contrib/chat_provider/google_genai.py, the tool_to_google_genai function at lines 360-364 needs to recursively strip the metadata fields from the entire JSON Schema tree, not just the top-level dict. Replace the top-level dict comprehension with a recursive helper function that walks the schema dict (and any nested dicts/lists) and removes keys in the forbidden set ("$schema", "$id", "$comment", "examples") at every level. For example:

def _strip_schema_metadata(schema: dict) -> dict:
    FORBIDDEN = {"$schema", "$id", "$comment", "examples"}
    result = {}
    for k, v in schema.items():
        if k in FORBIDDEN:
            continue
        if isinstance(v, dict):
            result[k] = _strip_schema_metadata(v)
        elif isinstance(v, list):
            result[k] = [_strip_schema_metadata(item) if isinstance(item, dict) else item for item in v]
        else:
            result[k] = v
    return result

Then use parameters = _strip_schema_metadata(tool.parameters) in tool_to_google_genai().
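
One caveat with a plain recursive walk: it also visits the properties mapping, so a tool parameter that happens to be named examples (or $id) would be dropped along with the metadata. A possible variant, sketched under the assumption that only schema keywords should be filtered, treats name-to-subschema mappings and literal data values separately. `strip_schema_metadata`, `NAME_MAPS`, and `DATA_KEYS` are illustrative names, not kosong or SDK APIs.

```python
from typing import Any

FORBIDDEN = frozenset({"$schema", "$id", "$comment", "examples"})
# Keywords whose value maps user-chosen names to subschemas.
NAME_MAPS = frozenset({"properties", "patternProperties"})
# Keywords whose value is literal data, not a schema; copied through untouched.
DATA_KEYS = frozenset({"enum", "const", "default"})


def strip_schema_metadata(node: Any) -> Any:
    """Return a copy of `node` with metadata keywords removed at every depth."""
    if isinstance(node, list):
        return [strip_schema_metadata(item) for item in node]
    if not isinstance(node, dict):
        return node
    result = {}
    for key, value in node.items():
        if key in FORBIDDEN:
            continue
        if key in DATA_KEYS:
            result[key] = value
        elif key in NAME_MAPS and isinstance(value, dict):
            # Keys at this level are property names, so keep every one of them
            # and only clean the subschema each name points to.
            result[key] = {name: strip_schema_metadata(sub) for name, sub in value.items()}
        else:
            result[key] = strip_schema_metadata(value)
    return result


# Hypothetical tool schema with a parameter that is itself called "examples".
schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "examples": {
            "type": "array",
            "items": {"type": "string", "$comment": "free text"},
            "examples": [["a", "b"]],
        },
        "name": {"type": "string", "examples": ["foo"]},
    },
    "required": ["examples"],
}

cleaned = strip_schema_metadata(schema)
```

The parameter named examples is kept, while the examples and $comment keywords inside each subschema are removed.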

Development

Successfully merging this pull request may close these issues.

[Bug]: Google GenAI provider fails with extra_forbidden for tool parameters containing $schema

2 participants