feat: consume pinned chat context, and surface drift, diff, refresh (UI + CLI)#26466
feat: consume pinned chat context, and surface drift, diff, refresh (UI + CLI)#26466kylecarbs wants to merge 21 commits into
Conversation
Add the first production reader of chat_context_resources, the per-chat pinned copy populated in #26438. When the chat-context-pin experiment is enabled and a chat has a pinned copy, prepareGeneration builds the system-prompt instruction block and workspace skills from that copy instead of scanning the per-turn chat history. The pinned and history paths are mutually exclusive, so with the experiment off behavior is unchanged. - ExperimentChatContextPin gates the behavior. chatd.Server now carries the deployment's experiments, wired through chatd.Config from coderd. - contextResourcesToPrompt maps the protojson resource bodies (instruction files and skills) back into the instruction block and skill metadata, skipping non-OK statuses and non-prompt body kinds. - pinnedWorkspaceContext reads the pin, falls back when the experiment is off or the chat has no pinned rows, and propagates read errors. The bound agent only decorates the instruction header with OS and directory, so the pin still resolves when the workspace is unreachable.
|
/coder-agents-review |
|
Chat: Spend limit reached | View chat Review historydeep-review v0.7.1 | Round 5 | Last posted: Round 5, 10 findings (5 P3, 2 P4, 3 Nit), APPROVE. Review Finding inventoryFindings
Round logRound 1Panel. 0 from Netero. 4 P3, 1 P4, 2 Nit from panel. Reviewed against 1c78bd8..22c682f. Round 2Churn guard: PROCEED. All 7 R1 findings addressed. Netero: no findings. Panel (Bisky, Mafuuu, Pariston, Meruem, Chopper, Gon, Kite): 1 P3, 1 Nit new. All R1 fixes verified. Reviewed against 1c78bd8..22df16f. Round 3Churn guard: PROCEED. Both R2 findings addressed (8fa6c3c). Netero: no findings. Panel (Mafuuu, Pariston, Gon, Razor): no new findings above Nit. All 9 fixes verified. APPROVE. Reviewed against 1c78bd8..8fa6c3c. Round 4Rebase: experiment gate removed, presence-based selection. Churn guard: PROCEED (0 open). Netero: no findings. Panel (Bisky, Mafuuu, Pariston, Meruem, Kite, Gon, Razor): 1 P4. APPROVE. Reviewed against 787392e..a0d6535. Round 5Churn guard: PROCEED. CRF-10 addressed (4735aa4). Netero: no findings. Panel (Bisky, Mafuuu, Pariston): no findings. All 10 fixes verified. APPROVE. Reviewed against 787392e..4735aa4. About deep-reviewCRF = Coder Review Finding (P0-P4, Nit, Note)
|
There was a problem hiding this comment.
Clean read-side consumer of the pinned context copy, well-gated and well-tested. The experiment flag, mutual exclusion between pinned and history paths, and the three-tier test coverage (pure function, mock, real Postgres) are all solid. Netero's mechanical scan found nothing; the panel found no design or architecture issues.
Severity summary: 4 P3, 1 P4, 2 Nit. No blockers.
The most convergent finding is the MetaFile divergence (CRF-1): the comment claims equivalence with the per-turn path, but the pinned path drops custom skill meta file names because SkillMetaBody has no meta_file field. The root cause is the proto definition (PR #26438), not this PR's logic, and both experiments are gated. Five reviewers flagged it independently.
Four reviewers also flagged the silent unmarshal skip (CRF-2), noting that if all bodies corrupt, the user loses context with ok=true and no log entry.
persistedSkills at generation_preparer.go:218 is computed unconditionally even when the pinned path discards it. The cost is negligible (in-memory scan, no DB call); noted for awareness when the pin becomes the default path.
"The
ExperimentDisabledsubtest uses a strict mock with noListChatContextResourcesByChatIDexpectation, so if the gate fails to short-circuit, gomock panics. That's the right way to test a gate." Bisky
🤖 This review was automatically generated with Coder Agents.
- Log a warn-level diagnostic when a status-OK pinned resource body fails to decode, so a proto or encoding regression cannot silently drop context (contextResourcesToPrompt now reports a malformed count). - Correct the skill MetaFile comment: the proto carries no meta file name, so a non-default CODER_AGENT_EXP_SKILL_META_FILE is not preserved on this path. - Drop the temporal "first production reader" sentence from the doc comment. - Add agentWorkingDir to consolidate the ExpandedDirectory/Directory fallback that recurred three times in chatd. - Cover the malformed skill body skip and remove a dead test assignment.
…rage Pull the pinned-vs-history selection out of prepareGeneration into resolveTurnWorkspaceContext so the dispatch is unit-testable: the agent binding refresh stays in prepareGeneration, and the helper takes the resolved agent plus prompt rows. This also drops the unconditional skillsFromParts scan, which the pinned path discarded. Add table tests covering pinned precedence, history fallback when the experiment is off, the empty case, and read-error propagation.
|
/coder-agents-review |
There was a problem hiding this comment.
All 7 R1 findings verified fixed across the panel. The resolveTurnWorkspaceContext extraction (CRF-5) is clean: the dispatch is testable, the pinned/history mutual exclusion is preserved, and the PinnedPathWins test proves precedence. The malformed count + Warn log (CRF-2) gives operators the signal they needed. The agentWorkingDir consolidation (CRF-7) removes all three inline copies.
Severity summary: 1 P3 (new), 1 Nit (new). No blockers.
The new P3 is in the history-fallback comment added with the extraction. It claims the history path matches the pinned path for skill gating, but the pinned path returns skills independently of instruction presence (the SkillsBuildMeta test proves this).
"I tried to build cases against this change and could not. The problem is correctly understood, the solution is proportional, and the fix is at the right level." Pariston
🤖 This review was automatically generated with Coder Agents.
- The history fallback comment claimed parity with the pinned path for skill gating, but the pinned path resolves skills independently of instruction presence; describe each path's actual behavior. - Replace the Config.Experiments comment that restated the field with a note on nilability and read-only-after-construction.
|
/coder-agents-review |
There was a problem hiding this comment.
All 9 findings from rounds 1 and 2 verified fixed across the panel. The code is well-structured: clean experiment gate, mutually exclusive pinned/history paths, thorough three-tier test coverage (pure function, mock-gated, real Postgres), and proportional scope.
One observation: several doc comments in context_prompt.go run 10-15 lines where 3-5 carry the contract (non-obvious ok semantics, MetaFile divergence, malformed-body intent). The rest restates what the function signature and code already show. Not blocking, but trimming to the traps and invariants would improve signal density.
"I tried to build a case against this change and could not. The problem is correctly understood, the solution is proportional, the experiment gate is clean, and the fix is at the right level. The premises hold." Pariston
🤖 This review was automatically generated with Coder Agents.
…t-consume-pin # Conflicts: # coderd/apidoc/docs.go # coderd/apidoc/swagger.json # codersdk/deployment.go # docs/reference/api/schemas.md
…iment The pinned workspace-context path was gated behind ExperimentChatContextPin. Replace the gate with presence-based selection: when a chat has pinned context rows (the agent reported context through the push system), build the prompt from the pin; otherwise fall back to the per-turn history path. Older agents that never report context get legacy compatibility automatically, with no experiment to enable. Removes ExperimentChatContextPin and the Server/Config experiments field that only fed the gate, and regenerates the API docs and types.
|
/coder-agents-review |
There was a problem hiding this comment.
Re-reviewed after the rebase that removed the experiment gate in favor of presence-based selection. The structural change is clean: when pinned rows exist the pin is used, otherwise the history fallback engages automatically. This gives older agents seamless backward compatibility with no experiment flag to manage.
The panel (7 reviewers) found no design, safety, or correctness issues with the new selection mechanism. One P4 observation on untested defensive guards (CRF-10).
Comment verbosity was raised again by Gon. The doc comments in context_prompt.go carry real invariants (ok semantics, MetaFile divergence, malformed counting) padded with restatements of the code. Trimming to the traps and contracts would improve density. Not blocking.
"The
ok boolreturn onpinnedWorkspaceContextmakes the pin-vs-fallback decision explicit and testable.resolveTurnWorkspaceContextthen reads like a decision table." Kite
🤖 This review was automatically generated with Coder Agents.
…guards Adds the two defensive-branch tests from review CRF-10: an OK skill resource with an empty name, and a whitespace-only instruction body, both filter out without being counted as malformed. Also tightens the pinnedWorkspaceContext and contextResourcesToPrompt doc comments per the verbosity note, trimming restatement while keeping the ok-semantics, malformed-count, and MetaFile-divergence invariants.
|
/coder-agents-review |
There was a problem hiding this comment.
All 10 findings across 4 rounds verified fixed. The two new test subtests (SkipsEmptyNameSkill, SkipsEmptyInstructionContent) correctly exercise the defensive guards. Doc comments trimmed. All 19 unit subtests pass.
This PR is clean. 185 production lines, 544 test lines (74.6% test density), presence-based pin selection with automatic legacy fallback, three-tier test coverage, and every finding addressed.
🤖 This review was automatically generated with Coder Agents.
Drops the agentWorkingDir helper and restores the inline ExpandedDirectory-or-Directory fallback at its three call sites.
Extend the workspace-context feature so a chat shows the context it is actually pinned to, flags when that context has drifted from the agent's latest snapshot, and lets the user inspect the edits and refresh. Backend - codersdk.ChatContext gains `resources` (pinned instruction files + skills, metadata only) and `changes` (per-source diff vs the agent's latest snapshot: added/removed/modified, with sanitized old/new bodies for instruction files capped at 64KiB/side; name+description for skills). - chatd.Server.ContextDetail computes both on read; the single-chat GET handler attaches them like files/diff_status, while list and watch payloads stay lightweight. Nothing is persisted. Body decoders are factored out of contextResourcesToPrompt so reported resources match what the prompt injects. Frontend - The context ring shows a yellow drift triangle (distinct error treatment), drives its file/skill list from the pinned resources (falling back to last_injected_context), and adds a "Refresh context" action plus a "View changes" dialog that renders instruction-file edits via the existing DiffViewer and lists skill changes. - experimental.refreshChatContext + a refreshChatContext mutation; mergeWatchedChatSummary now carries context so context_dirty watch events update the cache, and the open chat is refetched to pull the change set. Tests: diff/resource computation and ContextDetail gating (Go unit), the GET resources/changes end to end (integration), the context_dirty cache merge (unit), and indicator stories (clean/dirty/error play).
Docs preview📖 View docs preview for |
|
/coder-agents-review Re-requested by Coder Agents on behalf of @kylecarbs after pushing the full-stack scope (pinned context resources + drift diff on the chat GET, and the context-ring drift indicator / changes dialog / refresh). |
|
@kylecarbs ⛔ This review has reached its per-chat spend limit ($102.97 / $100.00). Further review rounds are paused. To raise the limit and continue, comment: This is a per-chat budget, separate from any account-level usage limit.
|
Add two user-facing subcommands under `coder exp chat context`: - `show <chat>` reports the workspace context a chat is pinned to (instruction files + skills), whether it has drifted from the agent's latest snapshot, and the per-source change set when dirty. Supports `-o text` (default, with resource/change tables) and `-o json` (the raw chat.context object). - `refresh <chat>` re-pins the chat to the agent's latest snapshot and clears the drift marker. Both use the experimental user client (GetChat / RefreshChatContext). renderChatContextText is factored to take an io.Writer and covered by a unit test (clean, dirty-with-changes, and no-context cases).
The chat context popover lists pinned instruction files and skills, falling back to the agent's last injected context when the pin has not loaded. That fallback can contain an empty context-file marker (no path), which rendered as a nameless row under a "Context files" heading. Filter out file entries with no path and skill entries with no name in both the pinned and fallback branches, so an empty marker no longer produces a blank row (and the "Context files" section is hidden when it would otherwise be empty). Add a regression play story covering the drifted, no-pinned-resources fallback case.
…t socket Aligns 'coder exp chat context' with the Workspace Context Sources RFC. Previously 'add' only did a one-shot inject into a chat; sources could not be listed, inspected, or removed, and there was no in-workspace path to the agent's context Manager. Agent socket (agent/agentsocket): - Extend the DRPC proto with ContextSources/GetContextSource/ AddContextSource/RemoveContextSource/GetContextSnapshot/ResyncContext. - Add a ContextManager interface + WithContextManager option; implement the RPCs against *agentcontext.Manager and wire it in initSocketServer. - Add typed client methods + display structs. CLI (coder exp chat context): - list / show <path> / add <path> / remove <path>: agent-local source CRUD over the socket (in-workspace; no coderd, no tailnet). - add <path> --chat <chat>: keeps the legacy one-shot inject. - refresh [<chat>]: <chat> refreshes that chat (anywhere); no-arg, in workspace, re-resolves the agent (resync barrier) then refreshes every drifted chat. - show <chat> (chat drift) is replaced by show <path> (source); chat drift remains visible in the UI and via refresh <chat>. Tests: agentsocket context RPC round-trip with a fake manager; parseChatID.
…sources coder exp chat context add/show/remove now resolve a relative source path (e.g. ./) to an absolute path before sending it to the agent, which requires canonical absolute paths. A leading ~ is preserved for the agent to expand against its own home. chatd now includes mcp_config and mcp_server rows in a chat's pinned context list and in the drift diff, instead of filtering to instruction files and skills only. MCP changes carry only source, kind, and status (no body diff). The context popover gains an MCP section (config by file basename, server by name) and the changes dialog lists MCP changes alongside skills.
…apshot Wire the MCP manager's live tool cache into the agentcontext resolver via a new ManagerOptions.MCP provider, so each connected MCP server becomes a KindMCPServer resource (with its tools) in the snapshot alongside instruction files and skills. Previously Resolver.MCP was never set, so only the .mcp.json file appeared (as an mcp_config resource) and servers/tools never reached coderd or the UI. agent/x/agentmcp.Manager: export CachedTools (a non-blocking copy of the tool cache) and add SetOnToolsChanged, fired after a reload writes the cache so the agentcontext manager re-resolves when the live tool set changes. Only servers exposing at least one tool are surfaced; a connected server with no tools, or a failed connection, is not shown. The wire/proto and coderd storage path for KindMCPServer already existed; this change only produces the resources.
Decode the pinned mcp_server body and report its tools (with the agent's "<server>__" name prefix stripped) on the single-chat GET, via a new codersdk.ChatContextMCPTool list on ChatContextResource. The context popover now lists each MCP server's tools beneath it (wrench icon, description tooltip), mirroring how skills are shown. Tool decoding is defensive: a missing or malformed body yields no tools rather than breaking the pinned-context list.
pinnedContextResources now reports non-OK resources (invalid, unreadable, oversize, excluded) with their Status and Error instead of dropping them silently, and stamps Status on healthy resources. codersdk.ChatContextResource gains Status and Error (ChatContextResourceStatus mirrors the agent resolver's per-resource status, already stored end-to-end). The context popover lists problem resources in a new Issues section (warning icon, kind and status, and the agent's error message). For example, a skill rejected because its front-matter name does not match its directory now shows that reason instead of silently vanishing from the list and read_skill.
What
The workspace-context feature now reads from the per-chat pinned copy
(
chat_context_resources, populated in #26438) end to end, and surfaces it inthe API, the UI, and the CLI:
workspace skills from the pin instead of scanning per-turn history (the
first production reader of the pin).
drifted from the agent's latest snapshot, a per-source change set.
adds View changes (a diff) and Refresh context.
coder exp chat context showandrefresh.How
Prompt generation (read side)
contextResourcesToPromptmaps the protojson resource bodies back intoprompt inputs:
instruction_filebecomes a context-file part with path =resource
source;skillbecomesSkillMeta{Name, Description, Dir}(
MetaFiledefaults toSKILL.md). Non-OK statuses and non-prompt bodykinds are skipped; malformed bodies are skipped rather than fatal.
pinnedWorkspaceContextreads the pin and reportsok:ok=false(fallback to the per-turn history path) when there are no pinned rows; read
errors propagate; rows present make the pin authoritative. Selection is
presence-based with no experiment, so older agents keep legacy behavior.
instruction header, so the pin still resolves when the workspace is
unreachable.
Pinned context + drift on the chat GET
codersdk.ChatContextgainsresources(pinned instruction files + skills,metadata only) and
changes(per-source diff vs the agent's latestsnapshot:
added/removed/modified, with sanitized old/new bodies forinstruction files capped at 64KiB/side, and name+description for skills).
chatd.Server.ContextDetailcomputes both on read (status from eachsource's
content_hash); changes are only computed while the chat is dirty.The single-chat GET handler attaches the detail like
files/diff_status,while list and watch payloads stay lightweight. Nothing is persisted, and
the body decoders are shared with the prompt path so reported resources
match what is injected.
UI (context ring)
treatment on snapshot error). The popover drives its file/skill list from
the pinned
resources(falling back tolast_injected_context), and addsRefresh context plus View changes, a dialog that renders
instruction-file edits via the existing
DiffViewerand lists skillchanges.
experimental.refreshChatContextand arefreshChatContextmutation;mergeWatchedChatSummarynow carriescontextsocontext_dirtywatchevents update the cache, and the open chat is refetched to pull the change
set.
CLI
coder exp chat context show <chat>reports the pinned context (instructionfiles + skills), the drift status, and the change set when dirty;
-o text(default, with tables) or-o json(the rawchat.context).coder exp chat context refresh <chat>re-pins to the latest snapshot andclears the drift marker. Both use the experimental user client.
Scope
The legacy per-turn pull (
buildWorkspaceContext/fetchWorkspaceContext)and the
last_injected_contextcache are intentionally left in place; removingthem is a follow-up.
Testing
contextResourcesToPromptmapping/skip/guard cases;pinnedWorkspaceContextandresolveTurnWorkspaceContextdispatch viadbmock; diff/resource computation andContextDetaildirty-gating;truncateUTF8; CLIrenderChatContextText(clean / dirty / no-context).push drifts it, and the chat GET is asserted to report the pinned
resourcesplus thechanges(modified instruction file with old/newbodies, added skill); refresh re-pins and clears the change set.
context_dirtycache-merge unit test;ContextUsageIndicatorplay stories (clean has no marker; dirty shows the change section, refresh,
and the changes dialog; snapshot error treatment).
Implementation plan and decisions
Design (computed-on-read, no new endpoint): deliver the pinned context and
the change set ON the existing
chat.contextobject, populated only on thesingle-chat
GET /api/experimental/chats/{chat}(likefiles/diff_status)and kept out of list/watch payloads. The diff is computed on read by comparing
chat_context_resources(pinned) againstworkspace_agent_context_resources(the agent's latest snapshot) per
source/content_hash; nothing ispersisted (the unused
context_dirty_resourcescolumn stays unused).Resolved decisions:
persistence).
(added = new only, removed = old only, modified = both), capped 64KiB/side;
skills carry name/description.
DiffViewer.resources, withlast_injected_contextas theloading fallback.
resourcesequalswhat the model actually sees.
show(read) andrefresh(re-pin) undercoder exp chat context, alongside the existing agent-sideadd/clear.Prompt-side mapping/semantics (earlier commits):
bodyis protojson ofthe agentproto body;
instruction_file->InstructionFileBody{content}(
source= file path);skill->SkillMetaBody{meta,name,description}(
source= skill directory,MetaFiledefaults toSKILL.md);mcp_*andreserved kinds are ignored. Zero pinned rows fall back to history; list errors
fail the turn; rows present make the pin authoritative even when all rows
filter out.
Out of scope (follow-up): removing the per-turn pull and
last_injected_context.This PR was created by Coder Agents on behalf of @kylecarbs. Builds on #26438.