
feat: add persisted whole-chat summary with background generation #26657

Draft
jaaydenh wants to merge 2 commits into main from chat-summary-62j9

@jaaydenh

Contributor

Adds the persisted whole-chat summary that backs the chat summary popover. A new nullable chats.summary column is populated in the background after a successful root-chat turn and pushed to clients via a new chat_summary_change watch event, so the popover reads chat.summary straight off the loaded Chat with no extra query.
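
The delivery path described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Chat` and `WatchEvent` shapes, the `applyEvent` and `popoverText` helpers, and the placeholder string are all hypothetical; only the `chat_summary_change` event name and the nullable summary come from the description.

```go
package main

import "fmt"

// Chat is a hypothetical, trimmed-down view of a loaded chat.
type Chat struct {
	ID      string
	Summary *string // nil until a summary has been generated (nullable column)
}

// WatchEvent is an assumed payload shape for illustration only.
type WatchEvent struct {
	Kind    string
	ChatID  string
	Summary *string
}

// Event name taken from the PR description.
const kindChatSummaryChange = "chat_summary_change"

// applyEvent copies the pushed summary onto the already-loaded chat and
// reports whether anything was applied. Other event kinds and other chats
// are ignored.
func applyEvent(chat *Chat, ev WatchEvent) bool {
	if ev.Kind != kindChatSummaryChange || ev.ChatID != chat.ID {
		return false
	}
	chat.Summary = ev.Summary
	return true
}

// popoverText reads the summary straight off the loaded chat, so no extra
// query is needed. The fallback text is an assumption.
func popoverText(chat Chat) string {
	if chat.Summary == nil {
		return "No summary yet"
	}
	return *chat.Summary
}

func main() {
	chat := Chat{ID: "c1"}
	fmt.Println(popoverText(chat))

	s := "Discussed the migration plan."
	applyEvent(&chat, WatchEvent{Kind: kindChatSummaryChange, ChatID: "c1", Summary: &s})
	fmt.Println(popoverText(chat))
}
```

Because the field is nullable, a consumer written this way behaves the same before and after the column exists, which is why merge order does not matter.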

This is the data source for the popover and per-chat cost UI built in #26649; the popover can consume chat.summary once this lands (the field is nullable, so merge order does not matter).

How it works

  • Generation runs in the existing successful-turn finalize hook, detached from the request so the user's turn is never blocked. A cadence gate generates the first summary after one completed turn, then regenerates every few turns, using a chats.summary_generated_at freshness marker. Generation reads compaction-aware history, renders it to a plain transcript, and asks for a 1-3 sentence summary via structured output.
  • Staleness is guarded by history_version (mirroring last_turn_summary): a background write that races a newer turn loses, while worker lifecycle transitions cannot reject a fresh write.
  • Cost attribution adds a nullable chat_messages.cost_source discriminator. Summary spend is recorded as a hidden, soft-deleted accounting row tagged 'summary'; the existing manual-title accounting row is tagged 'title'. Ordinary turn spend stays NULL, so existing cost/usage queries are unchanged. The per-feature breakdown is left to the /cost endpoint, which is out of scope here (see #26649).
  • Model selection defaults to the chat's configured model, with a new deployment-wide summary_generation model override (mirroring title_generation); a configured-but-unusable override skips generation.

Notes

  • Migration 000530 adds chats.summary, chats.summary_generated_at, and chat_messages.cost_source, and recreates chats_expanded to expose the new columns.
  • Root chats only; shared viewers pick up the summary on their next refetch (live watch events are owner-only).

Refs #26649

Add a persisted chats.summary populated by a background generator after
successful root-chat turns, delivered live via a new chat_summary_change
watch event, with per-feature cost attribution and a deployment-wide
summary-generation model override.
@github-actions


Docs preview

📖 View docs preview for docs/admin/security/audit-logs.md

The UnknownContextReturns400 subtest hardcodes the valid override
context list in its expected error. Adding the summary_generation
context changed the message, so update both assertions.