Skip to content

Commit 5e8fe0b

Browse files
moonbox3CopilotCopilot
authored
Python: Stop emitting duplicate reasoning content from OpenAI response.reasoning_text.done and response.reasoning_summary_text.done events (#5162)
* Fix reasoning text done events duplicating streamed delta content (#5157) The OpenAI Responses API sends both reasoning_text.delta (incremental chunks) and reasoning_text.done (full accumulated text) events. The chat client was emitting Content for both, causing ag-ui to append the full done text onto already-accumulated delta text, producing duplicated reasoning output. Stop emitting Content for reasoning_text.done and reasoning_summary_text.done events, matching how output_text.done is already handled (not emitted). The deltas contain all the content; the done event is redundant. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(openai): emit reasoning done content as fallback when no deltas observed (#5157) Address PR review feedback: - Track item_ids that received reasoning deltas via seen_reasoning_delta_item_ids set - Emit content from done events only when no deltas were received for the item_id, preventing silent content loss on stream resumption - Add comment documenting code_interpreter done event asymmetry - Replace redundant ag-ui test with deduplication-focused test - Add integration test for delta+done sequence in OpenAI chat client tests - Add fallback path tests for done events without preceding deltas Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address review feedback for #5157: Python: [Bug]: "type": "response.reasoning_text.delta" and "response.reasoning_text.done" both get exposed as "text_reasoning" * Fix AG-UI reasoning streaming to use proper Start/End pattern (#5157) _emit_text_reasoning now follows the same streaming pattern as _emit_text: - Emits ReasoningStartEvent/ReasoningMessageStartEvent only on the first delta for a given message_id - Emits only ReasoningMessageContentEvent for subsequent deltas - Defers ReasoningMessageEndEvent/ReasoningEndEvent until _close_reasoning_block is called (on content type switch or end-of-run) This produces the correct protocol pattern: ReasoningStartEvent ReasoningMessageStartEvent ReasoningMessageContentEvent(delta1) ReasoningMessageContentEvent(delta2) ReasoningMessageEndEvent ReasoningEndEvent Instead of wrapping every delta in a full Start→End sequence. Backward compatibility is preserved: calling _emit_text_reasoning without a flow argument still produces the full sequence per call. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix import ordering lint error in AG-UI test file (#5157) Move inline import of TextMessageContentEvent to the top-level import block and ensure alphabetical ordering to satisfy ruff I001 rule. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix mypy error: rename loop variable to avoid type conflict with WorkflowEvent The 'event' variable was already typed as WorkflowEvent[Any] from the async for loop at line 590. Reusing it in the _close_reasoning_block loop (which returns list[BaseEvent]) caused an incompatible assignment error. Renamed to 'reasoning_evt' to avoid the conflict. Fixes #5162 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address review feedback for #5157: review comment fixes * narrow test result reporting to explicit pytest JUnit XML * Fix test args * Fix pytest-results-action in merge workflow and remove committed test artifacts Apply the same JUnit XML fix from python-tests.yml to python-merge-tests.yml: add --junitxml=pytest.xml to all test commands and narrow the results action path from ./python/**.xml to ./python/pytest.xml. Also remove accidentally committed pytest.xml and python-coverage.xml and add them to .gitignore. --------- Co-authored-by: Copilot <copilot@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 1dd828d commit 5e8fe0b

15 files changed

Lines changed: 412 additions & 89 deletions

File tree

.github/workflows/python-merge-tests.yml

Lines changed: 14 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -115,12 +115,13 @@ jobs:
115115
-m "not integration"
116116
--timeout=120 --session-timeout=900 --timeout_method thread
117117
--retries 2 --retry-delay 5
118+
--junitxml=pytest.xml
118119
working-directory: ./python
119120
- name: Surface failing tests
120121
if: always()
121122
uses: pmeier/pytest-results-action@v0.7.2
122123
with:
123-
path: ./python/**.xml
124+
path: ./python/pytest.xml
124125
summary: true
125126
display-options: fEX
126127
fail-on-empty: false
@@ -163,6 +164,7 @@ jobs:
163164
-n logical --dist worksteal
164165
--timeout=120 --session-timeout=900 --timeout_method thread
165166
--retries 2 --retry-delay 5
167+
--junitxml=pytest.xml
166168
working-directory: ./python
167169
- name: Test OpenAI samples
168170
timeout-minutes: 10
@@ -173,7 +175,7 @@ jobs:
173175
if: always()
174176
uses: pmeier/pytest-results-action@v0.7.2
175177
with:
176-
path: ./python/**.xml
178+
path: ./python/pytest.xml
177179
summary: true
178180
display-options: fEX
179181
fail-on-empty: false
@@ -225,6 +227,7 @@ jobs:
225227
-n logical --dist worksteal
226228
--timeout=120 --session-timeout=900 --timeout_method thread
227229
--retries 2 --retry-delay 5
230+
--junitxml=pytest.xml
228231
working-directory: ./python
229232
- name: Test Azure samples
230233
timeout-minutes: 10
@@ -235,7 +238,7 @@ jobs:
235238
if: always()
236239
uses: pmeier/pytest-results-action@v0.7.2
237240
with:
238-
path: ./python/**.xml
241+
path: ./python/pytest.xml
239242
summary: true
240243
display-options: fEX
241244
fail-on-empty: false
@@ -285,6 +288,7 @@ jobs:
285288
-n logical --dist worksteal
286289
--timeout=120 --session-timeout=900 --timeout_method thread
287290
--retries 2 --retry-delay 5
291+
--junitxml=pytest.xml
288292
working-directory: ./python
289293
- name: Stop local MCP server
290294
if: always()
@@ -310,7 +314,7 @@ jobs:
310314
if: always()
311315
uses: pmeier/pytest-results-action@v0.7.2
312316
with:
313-
path: ./python/**.xml
317+
path: ./python/pytest.xml
314318
summary: true
315319
display-options: fEX
316320
fail-on-empty: false
@@ -375,12 +379,13 @@ jobs:
375379
-x
376380
--timeout=360 --session-timeout=900 --timeout_method thread
377381
--retries 2 --retry-delay 5
382+
--junitxml=pytest.xml
378383
working-directory: ./python
379384
- name: Surface failing tests
380385
if: always()
381386
uses: pmeier/pytest-results-action@v0.7.2
382387
with:
383-
path: ./python/**.xml
388+
path: ./python/pytest.xml
384389
summary: true
385390
display-options: fEX
386391
fail-on-empty: false
@@ -430,12 +435,13 @@ jobs:
430435
-n logical --dist worksteal
431436
--timeout=120 --session-timeout=900 --timeout_method thread
432437
--retries 2 --retry-delay 5
438+
--junitxml=pytest.xml
433439
working-directory: ./python
434440
- name: Surface failing tests
435441
if: always()
436442
uses: pmeier/pytest-results-action@v0.7.2
437443
with:
438-
path: ./python/**.xml
444+
path: ./python/pytest.xml
439445
summary: true
440446
display-options: fEX
441447
fail-on-empty: false
@@ -489,13 +495,13 @@ jobs:
489495
echo "Cosmos DB emulator did not become ready in time." >&2
490496
exit 1
491497
- name: Test with pytest (Cosmos integration)
492-
run: uv run --directory packages/azure-cosmos poe integration-tests -n logical --dist worksteal --timeout=120 --session-timeout=900 --timeout_method thread --retries 2 --retry-delay 5
498+
run: uv run --directory packages/azure-cosmos poe integration-tests -n logical --dist worksteal --timeout=120 --session-timeout=900 --timeout_method thread --retries 2 --retry-delay 5 --junitxml=pytest.xml
493499
working-directory: ./python
494500
- name: Surface failing tests
495501
if: always()
496502
uses: pmeier/pytest-results-action@v0.7.2
497503
with:
498-
path: ./python/**.xml
504+
path: ./python/pytest.xml
499505
summary: true
500506
display-options: fEX
501507
fail-on-empty: false

.github/workflows/python-tests.yml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -40,15 +40,15 @@ jobs:
4040
UV_CACHE_DIR: /tmp/.uv-cache
4141
# Unit tests
4242
- name: Run all tests
43-
run: uv run poe test -A
43+
run: uv run poe test -A --junitxml=pytest.xml
4444
working-directory: ./python
4545

4646
# Surface failing tests
4747
- name: Surface failing tests
4848
if: always()
4949
uses: pmeier/pytest-results-action@v0.7.2
5050
with:
51-
path: ./python/**.xml
51+
path: ./python/pytest.xml
5252
summary: true
5353
display-options: fEX
5454
fail-on-empty: false

.gitignore

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -47,6 +47,8 @@ htmlcov/
4747
.cache
4848
nosetests.xml
4949
coverage.xml
50+
pytest.xml
51+
python-coverage.xml
5052
*.cover
5153
*.py,cover
5254
.hypothesis/

python/packages/ag-ui/agent_framework_ag_ui/_agent_run.py

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -46,6 +46,7 @@
4646
from ._run_common import (
4747
FlowState,
4848
_build_run_finished_event, # type: ignore
49+
_close_reasoning_block, # type: ignore
4950
_emit_content, # type: ignore
5051
_extract_resume_payload, # type: ignore
5152
_has_only_tool_calls, # type: ignore
@@ -1058,6 +1059,10 @@ async def run_agent_stream(
10581059
}
10591060
)
10601061

1062+
# Close any open reasoning block
1063+
for event in _close_reasoning_block(flow):
1064+
yield event
1065+
10611066
# Close any open message
10621067
if flow.message_id:
10631068
logger.debug(f"End of run: closing text message message_id={flow.message_id}")

python/packages/ag-ui/agent_framework_ag_ui/_run_common.py

Lines changed: 82 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -128,6 +128,7 @@ class FlowState:
128128
interrupts: list[dict[str, Any]] = field(default_factory=list) # pyright: ignore[reportUnknownVariableType]
129129
reasoning_messages: list[dict[str, Any]] = field(default_factory=list) # pyright: ignore[reportUnknownVariableType]
130130
accumulated_reasoning: dict[str, str] = field(default_factory=dict) # pyright: ignore[reportUnknownVariableType]
131+
reasoning_message_id: str | None = None
131132

132133
def get_tool_name(self, call_id: str | None) -> str | None:
133134
"""Get tool name by call ID."""
@@ -462,12 +463,39 @@ def _emit_mcp_tool_result(
462463
return _emit_tool_result_common(content.call_id, raw_output, flow, predictive_handler)
463464

464465

466+
def _close_reasoning_block(flow: FlowState) -> list[BaseEvent]:
467+
"""Close an open reasoning block, emitting end events.
468+
469+
Should be called when the reasoning block is complete -- e.g. when
470+
non-reasoning content arrives or at end of a run.
471+
"""
472+
if not flow.reasoning_message_id:
473+
return []
474+
message_id = flow.reasoning_message_id
475+
flow.reasoning_message_id = None
476+
return [
477+
ReasoningMessageEndEvent(message_id=message_id),
478+
ReasoningEndEvent(message_id=message_id),
479+
]
480+
481+
465482
def _emit_text_reasoning(content: Content, flow: FlowState | None = None) -> list[BaseEvent]:
466483
"""Emit AG-UI reasoning events for text_reasoning content.
467484
468485
Uses the protocol-defined reasoning event types so that AG-UI consumers
469486
such as CopilotKit can render reasoning natively.
470487
488+
When *flow* is provided the function follows the streaming pattern: it
489+
emits ``ReasoningStartEvent`` / ``ReasoningMessageStartEvent`` only on
490+
the first delta for a given ``message_id`` and just
491+
``ReasoningMessageContentEvent`` for subsequent deltas. The matching
492+
``ReasoningMessageEndEvent`` / ``ReasoningEndEvent`` are deferred until
493+
``_close_reasoning_block`` is called (e.g. when non-reasoning content
494+
arrives or at end-of-run).
495+
496+
Without *flow* (backward-compat) the full Start→Content→End sequence is
497+
emitted for every call.
498+
471499
Only ``content.text`` is used for the visible reasoning message. If
472500
``content.protected_data`` is present it is emitted as a
473501
``ReasoningEncryptedValueEvent`` so that consumers can persist encrypted
@@ -483,26 +511,49 @@ def _emit_text_reasoning(content: Content, flow: FlowState | None = None) -> lis
483511

484512
message_id = content.id or generate_event_id()
485513

486-
events: list[BaseEvent] = [
487-
ReasoningStartEvent(message_id=message_id),
488-
ReasoningMessageStartEvent(message_id=message_id, role="assistant"),
489-
]
514+
events: list[BaseEvent] = []
490515

491-
if text:
492-
events.append(ReasoningMessageContentEvent(message_id=message_id, delta=text))
516+
if flow is not None:
517+
# Streaming mode: track open reasoning block in flow state.
518+
if flow.reasoning_message_id != message_id:
519+
# Close any previously open reasoning block (different message_id).
520+
events.extend(_close_reasoning_block(flow))
521+
# Open new reasoning block.
522+
events.append(ReasoningStartEvent(message_id=message_id))
523+
events.append(ReasoningMessageStartEvent(message_id=message_id, role="assistant"))
524+
flow.reasoning_message_id = message_id
493525

494-
events.append(ReasoningMessageEndEvent(message_id=message_id))
526+
if text:
527+
events.append(ReasoningMessageContentEvent(message_id=message_id, delta=text))
528+
529+
if content.protected_data is not None:
530+
events.append(
531+
ReasoningEncryptedValueEvent(
532+
subtype="message",
533+
entity_id=message_id,
534+
encrypted_value=content.protected_data,
535+
)
536+
)
537+
else:
538+
# No flow -- backward-compatible full sequence per call.
539+
events.append(ReasoningStartEvent(message_id=message_id))
540+
events.append(ReasoningMessageStartEvent(message_id=message_id, role="assistant"))
495541

496-
if content.protected_data is not None:
497-
events.append(
498-
ReasoningEncryptedValueEvent(
499-
subtype="message",
500-
entity_id=message_id,
501-
encrypted_value=content.protected_data,
542+
if text:
543+
events.append(ReasoningMessageContentEvent(message_id=message_id, delta=text))
544+
545+
events.append(ReasoningMessageEndEvent(message_id=message_id))
546+
547+
if content.protected_data is not None:
548+
events.append(
549+
ReasoningEncryptedValueEvent(
550+
subtype="message",
551+
entity_id=message_id,
552+
encrypted_value=content.protected_data,
553+
)
502554
)
503-
)
504555

505-
events.append(ReasoningEndEvent(message_id=message_id))
556+
events.append(ReasoningEndEvent(message_id=message_id))
506557

507558
# Persist reasoning into flow state for MESSAGES_SNAPSHOT.
508559
# Accumulate reasoning text per message_id, similar to flow.accumulated_text,
@@ -546,23 +597,30 @@ def _emit_content(
546597
) -> list[BaseEvent]:
547598
"""Emit appropriate events for any content type."""
548599
content_type = getattr(content, "type", None)
600+
601+
# Close open reasoning block when switching to non-reasoning content.
602+
if content_type != "text_reasoning":
603+
events = _close_reasoning_block(flow)
604+
else:
605+
events = []
606+
549607
if content_type == "text":
550-
return _emit_text(content, flow, skip_text)
608+
return events + _emit_text(content, flow, skip_text)
551609
if content_type == "function_call":
552-
return _emit_tool_call(content, flow, predictive_handler)
610+
return events + _emit_tool_call(content, flow, predictive_handler)
553611
if content_type == "function_result":
554-
return _emit_tool_result(content, flow, predictive_handler)
612+
return events + _emit_tool_result(content, flow, predictive_handler)
555613
if content_type == "function_approval_request":
556-
return _emit_approval_request(content, flow, predictive_handler, require_confirmation)
614+
return events + _emit_approval_request(content, flow, predictive_handler, require_confirmation)
557615
if content_type == "usage":
558-
return _emit_usage(content)
616+
return events + _emit_usage(content)
559617
if content_type == "oauth_consent_request":
560-
return _emit_oauth_consent(content)
618+
return events + _emit_oauth_consent(content)
561619
if content_type == "mcp_server_tool_call":
562-
return _emit_mcp_tool_call(content, flow)
620+
return events + _emit_mcp_tool_call(content, flow)
563621
if content_type == "mcp_server_tool_result":
564-
return _emit_mcp_tool_result(content, flow, predictive_handler)
622+
return events + _emit_mcp_tool_result(content, flow, predictive_handler)
565623
if content_type == "text_reasoning":
566624
return _emit_text_reasoning(content, flow)
567625
logger.debug("Skipping unsupported content type in AG-UI emitter: %s", content_type)
568-
return []
626+
return events

python/packages/ag-ui/agent_framework_ag_ui/_workflow_run.py

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -29,6 +29,7 @@
2929
from ._run_common import (
3030
FlowState,
3131
_build_run_finished_event,
32+
_close_reasoning_block,
3233
_emit_content,
3334
_extract_resume_payload,
3435
_normalize_resume_interrupts,
@@ -729,6 +730,9 @@ def _drain_open_message() -> list[TextMessageEndEvent]:
729730
run_error_emitted = True
730731
terminal_emitted = True
731732

733+
for reasoning_evt in _close_reasoning_block(flow):
734+
yield reasoning_evt
735+
732736
for end_event in _drain_open_message():
733737
yield end_event
734738

0 commit comments

Comments
 (0)