Skip to content

fix(session): use server timestamps instead of IDs in runLoop exit condition#28637

Open
kagura-agent wants to merge 1 commit into
anomalyco:devfrom
kagura-agent:fix/runloop-clock-skew-exit
Open

fix(session): use server timestamps instead of IDs in runLoop exit condition#28637
kagura-agent wants to merge 1 commit into
anomalyco:devfrom
kagura-agent:fix/runloop-clock-skew-exit

Conversation

@kagura-agent
Copy link
Copy Markdown
Contributor

Issue for this PR

Closes #28618

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

The runLoop exit condition at prompt.ts:1272 uses lastUser.id < lastAssistant.id to check whether the user message predates the assistant response. Message IDs encode Date.now() in their first 12 hex chars. When the web client sends a client-generated messageID (timestamped with the browser clock) and the browser is ahead of the server, the user ID sorts after the assistant ID — the exit check fails, the loop fires a second LLM call with a <system-reminder> wrap.

The fix replaces the ID comparison with lastUser.time.created <= lastAssistant.time.created. Both fields are set server-side via Date.now() when the message is created, so they are immune to client clock skew.

How did you verify your code works?

  • Traced the data flow: User.time.created is set at line 722 (time: { created: Date.now() }), always server-side regardless of whether input.messageID is client-provided
  • Assistant.time.created is also set server-side
  • Ran bun test test/session/ — 39 pass, 16 fail (all pre-existing upstream failures, identical without this change)
  • bun test test/session/message-v2.test.ts — 36/36 pass

Screenshots / recordings

N/A — backend logic change, no UI impact.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

…ndition

The runLoop exit condition compared lastUser.id < lastAssistant.id to
determine message ordering. Since message IDs encode Date.now() in their
first 12 hex chars, this breaks when the web client provides a
client-generated messageID timestamped ahead of the server clock.

Replace the ID comparison with time.created comparison. Both User and
Assistant time.created are set server-side via Date.now(), making them
immune to client clock skew.

Closes anomalyco#28618
@github-actions
Copy link
Copy Markdown
Contributor

The following comment was made by an LLM, it may be inaccurate:

I found some potentially related PRs that addressed similar issues with message ID comparisons in the runLoop:

Related PRs:

  1. PR fix(session): use parentID instead of timestamp for loop exit condition #21365 - fix(session): use parentID instead of timestamp for loop exit condition

    • Also addresses the runLoop exit condition logic
    • May have explored similar timestamp/ID comparison issues
  2. PR fix(session): use transcript position instead of lexical ID compare in prompt loop #24379 - fix(session): use transcript position instead of lexical ID compare in prompt loop

    • Directly related to fixing ID comparison logic in the prompt loop
  3. PR fix(session): compare message positions instead of IDs in SessionPrompt.run #24544 - fix(session): compare message positions instead of IDs in SessionPrompt.run

    • Also addresses comparing message positions/IDs in the session prompt logic

These PRs all appear to be working on the same area of code (the session prompt loop and message comparison logic). While they may have used different solutions (parentID, transcript position, message positions vs. timestamps), they're related to the same problem domain of fixing the exit condition. You may want to check if these prior approaches were considered or if there's any historical context that explains why the current timestamp comparison approach is preferred.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

runLoop fails to exit when client-generated messageID has clock skew ahead of server, causing infinite continuation with <system-reminder> wrap

1 participant