[LiteLLM] _is_thinking_blocks_format drops Gemini thinking_blocks (only matches Anthropic 'signature' key) #5712

@ThibaultCoudertSephora

Description

Summary

google.adk.models.lite_llm._is_thinking_blocks_format, introduced in 1.28.0 via
fc45fa6 (PR closing #4801), gates Anthropic thinking_blocks parsing on the presence of a
per-block signature key:

# src/google/adk/models/lite_llm.py (main)
def _is_thinking_blocks_format(reasoning_value: Any) -> bool:
    """Returns True if reasoning_value is Anthropic thinking_blocks format."""
    if not isinstance(reasoning_value, list) or not reasoning_value:
        return False
    first = reasoning_value[0]
    return isinstance(first, dict) and "signature" in first

LiteLLM's Gemini integration also emits thinking_blocks when thinking is enabled on Gemini 2.5 / 3 models, but its per-block dicts carry no signature key: the thought
signatures are returned once per message, as a parallel array under provider_specific_fields.thought_signatures. The detector therefore returns False and execution falls
through to _iter_reasoning_texts, which only matches the dict keys ("text", "content", "reasoning", "reasoning_content"). Gemini blocks carry only "type" and "thinking",
so nothing is yielded and the response surfaces zero thought Parts to the agent layer.

Net effect: a regression from <1.28.0 for any agent built on LiteLlm + a Gemini thinking model.

Affected versions

  • google-adk >= 1.28.0 (still present on main, 2026-05-15)

Environment

  • Python 3.12
  • google-adk 1.28.0+
  • litellm latest
  • Models reproduced on: gemini-3-flash-preview, gemini-2.5-pro (via LiteLLM proxy)

Actual LiteLLM response payload

Captured directly from LiteLLM with thought output enabled. Note the shape of choices[0].message.thinking_blocks and the separate message-level
provider_specific_fields.thought_signatures field:

{
  "model": "gemini-3-flash-preview",
  "choices": [{
    "finish_reason": "stop",
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "I am a large language model, trained by Google.",
      "reasoning_content": "**Understanding the User's Query and My Identity** ...",
      "thinking_blocks": [
        {
          "type": "thinking",
          "thinking": "**Understanding the User's Query and My Identity** ..."
        }
      ],
      "provider_specific_fields": {
        "thought_signatures": [
          "AY89a1/RGkcaRoJvGVOsj0pMpznJpT6OZESRZQF8ZYxB1+YHABJ+NjzLIb0fk8FOFQ..."
        ]
      }
    }
  }],
  "usage": {
    "completion_tokens": 73,
    "prompt_tokens": 5,
    "total_tokens": 78,
    "completion_tokens_details": {"reasoning_tokens": 62, "text_tokens": 11}
  }
}

Call trace through main

  1. _extract_reasoning_value(message) prefers thinking_blocks over reasoning_content — returns the Gemini list.
  2. _convert_reasoning_value_to_parts(reasoning_value) calls _is_thinking_blocks_format(...), which returns False (no per-block signature).
  3. Falls back to _iter_reasoning_texts, which for each dict only yields under keys ("text", "content", "reasoning", "reasoning_content") — none present → yields nothing.
  4. Returned thought parts: []. The thought is lost.

Expected behavior

Gemini-shaped thinking_blocks should be recognized as a thinking-blocks payload and surfaced as Part(thought=True, text=...). The parallel signatures from
provider_specific_fields.thought_signatures should be attached to the corresponding thought parts so they can be relayed back to the model on subsequent turns.

Suggested fix

Normalize Gemini-shaped thinking_blocks into the Anthropic shape inside _extract_reasoning_value, by zipping the message-level thought_signatures onto each block. The
existing Anthropic codepath in _convert_reasoning_value_to_parts then handles both providers unchanged.

PR / unit tests below. Happy to open the PR if it looks right.

PR diff

src/google/adk/models/lite_llm.py:

  @@ def _extract_reasoning_value(message: Message | Delta | None) -> Any:
     if message is None:
       return None
     # Anthropic models return thinking_blocks with type/thinking/signature fields.
     # This must be preserved to maintain thinking across tool call boundaries.
     thinking_blocks = message.get("thinking_blocks")
     if thinking_blocks is not None:
  +    # Gemini also emits thinking_blocks, but each block lacks a per-block
  +    # `signature`; signatures arrive in parallel under
  +    # `provider_specific_fields.thought_signatures`. Zip them in so the
  +    # downstream Anthropic codepath handles both providers uniformly.
  +    if (
  +        isinstance(thinking_blocks, list)
  +        and thinking_blocks
  +        and isinstance(thinking_blocks[0], dict)
  +        and "signature" not in thinking_blocks[0]
  +    ):
  +      provider_fields = message.get("provider_specific_fields") or {}
  +      signatures = provider_fields.get("thought_signatures") or []
  +      if signatures:
  +        merged: list[dict] = []
  +        for index, block in enumerate(thinking_blocks):
  +          if (
  +              isinstance(block, dict)
  +              and index < len(signatures)
  +              and signatures[index]
  +          ):
  +            merged.append({**block, "signature": signatures[index]})
  +          else:
  +            merged.append(block)
  +        thinking_blocks = merged
       return thinking_blocks
     reasoning_content = message.get("reasoning_content")
     if reasoning_content is not None:
       return reasoning_content
     return message.get("reasoning")

A note for maintainers (worth adding to the PR description, not the code): Anthropic per-block signature is treated as an opaque token and stored on Part.thought_signature via
signature.encode("utf-8"). Gemini signatures are base64-encoded bytes. If Part.thought_signature is expected to hold the decoded bytes (matching the outbound b64encode(...) path
in _extract_thought_signature_from_tool_call's counterpart), _convert_reasoning_value_to_parts should base64.b64decode(signature) when the source is Gemini. Left out of this PR to
keep the diff surgical — happy to address as a follow-up once you confirm the desired semantics.
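If maintainers confirm the decoded-bytes semantics, the follow-up would look roughly like this (a sketch only; the defensive re-padding is an assumption, since it is unconfirmed whether LiteLLM strips base64 padding in transit):

```python
import base64

sig = "AY89a1/RGkc"  # truncated signature from the payload above

# Gemini thought signatures are base64 text; decoding yields the raw bytes.
# Re-pad defensively in case the transported string lost its trailing '='.
padded = sig + "=" * (-len(sig) % 4)
raw = base64.b64decode(padded)
print(len(raw))  # 8 bytes for this truncated sample
```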


Unit tests

Append to tests/unittests/models/test_litellm.py:

  def test_extract_reasoning_value_gemini_thinking_blocks_zips_signatures():
    """Gemini emits thinking_blocks without per-block signatures; signatures
    arrive in parallel under provider_specific_fields.thought_signatures.
    _extract_reasoning_value should normalize them into the Anthropic shape."""
    message = {
        "role": "assistant",
        "content": "I am a large language model.",
        "thinking_blocks": [
            {"type": "thinking", "thinking": "Step one ..."},
            {"type": "thinking", "thinking": "Step two ..."},
        ],
        "provider_specific_fields": {
            "thought_signatures": ["sig-1", "sig-2"],
        },
    }
    result = _extract_reasoning_value(message)
    assert result == [
        {"type": "thinking", "thinking": "Step one ...", "signature": "sig-1"},
        {"type": "thinking", "thinking": "Step two ...", "signature": "sig-2"},
    ]


  def test_extract_reasoning_value_gemini_thinking_blocks_without_signatures():
    """If provider_specific_fields is absent, Gemini thinking_blocks pass
    through unchanged. Downstream detector should still accept them once
    broadened — covered separately."""
    message = {
        "role": "assistant",
        "content": "Answer",
        "thinking_blocks": [
            {"type": "thinking", "thinking": "Inner monologue"},
        ],
    }
    result = _extract_reasoning_value(message)
    assert result == [{"type": "thinking", "thinking": "Inner monologue"}]


  def test_extract_reasoning_value_anthropic_thinking_blocks_unchanged():
    """Regression guard: Anthropic-shaped blocks (already carrying signature)
    must not be re-zipped or otherwise modified."""
    blocks = [
        {"type": "thinking", "thinking": "Anthropic thought", "signature": "abc"},
    ]
    message = {
        "role": "assistant",
        "content": "Answer",
        "thinking_blocks": blocks,
        "provider_specific_fields": {"thought_signatures": ["should-be-ignored"]},
    }
    result = _extract_reasoning_value(message)
    assert result == blocks
  

  def test_message_to_generate_content_response_gemini_thinking_blocks():
    """End-to-end: a Gemini-shaped message should surface a thought Part and
    the visible text Part, with the thought signature attached as bytes."""
    message = {
        "role": "assistant",
        "content": "I am a large language model.",
        "thinking_blocks": [
            {"type": "thinking", "thinking": "Identity check ..."},
        ],
        "provider_specific_fields": {
            "thought_signatures": ["AY89a1/RGkc"],
        },
    }
    response = _message_to_generate_content_response(message)
    assert len(response.content.parts) == 2
    thought_part = response.content.parts[0]
    text_part = response.content.parts[1]
    assert thought_part.thought is True
    assert thought_part.text == "Identity check ..."
    assert thought_part.thought_signature == b"AY89a1/RGkc"
    assert text_part.text == "I am a large language model."
