feat(agent): serve context sources and live MCP servers over the agent socket#26526
feat(agent): serve context sources and live MCP servers over the agent socket#26526kylecarbs wants to merge 1 commit into
Conversation
|
/coder-agents-review |
|
Chat: Review in progress | View chat Review historydeep-review v0.8.0 | Round 3 | Status: Churn guard Last posted: Round 2, 18 findings (2 P2, 8 P3, 6 Nit, 2 Note), COMMENT. Review Finding inventoryFindings
Law analysisEffective LOC: 1476 (813 production, 663 test, 1429 generated). Head SHA: 0c574da. Verdict: Don't split. Enforcement: Advisory. Contested and acknowledgedCRF-13 (Note, contextmcp.go:39) - Comment verbosity pattern
Round logRound 1Panel (20 reviewers). 2 P2, 5 P3, 3 Nit, 2 Note new. 3 dropped. Reviewed against 023a1c9..0c574da. Round 2Churn guard: PROCEED. 12 addressed, 1 acknowledged. Panel (6 reviewers). 3 P3, 2 Nit new. 0 dropped. Reviewed against 023a1c9..a9283c6. About deep-reviewCRF = Coder Review Finding (P0-P4, Nit, Note)
|
There was a problem hiding this comment.
Well-structured agent-side feature. The provider seam, drift-hash exclusion, and feedback-loop dedup are clean. The test suite (44.9% density) covers every new code path including real subprocess failures and hash sensitivity. Particularly nice: the slices.Equal(paths, last) guard that prevents reload storms from MCP tool broadcasts, and the default-include switch in driftResources that makes new resource kinds hash-safe by default.
Severity count: 2 P2, 5 P3, 3 Nit, 2 Note.
The two P2s are (1) false claims in doc comments about aggregate-hash participation, and (2) a stale-data window between installServers and recordServerHealth that Resync can observe. Both are fixable in-place.
"Two tools identical in name and description but differing only in schema would hash identically, producing a false negative in drift detection." (Hisoka, on the silent schema drop)
agent/x/agentmcp/configwatcher_internal_test.go:35
Nit [CRF-1] The awaitTools doc comment still says cachedTools after the rename to CachedTools.
(Netero)
🤖
🤖 This review was automatically generated with Coder Agents.
025644d to
a9283c6
Compare
|
Reworked the MCP wiring and addressed the prior review (all 12 threads replied to and resolved). What changed since the first push
Force-pushed as a single clean commit ( This update was made by Coder Agents on behalf of @kylecarbs. |
|
/coder-agents-review |
There was a problem hiding this comment.
Clean rework. The agentmcp revert eliminates both R1 P2s (the false-claim comments and the cross-lock race), and the MCP-to-Resource conversion landing in agentcontext dissolves the duplication findings. All 13 R1 findings are resolved.
The reworked code is well-structured: the provider reads the manager's existing Tools cache via an atomic.Pointer, the sync loop deduplicates by both path-set and tool-set equality, and the marshal-error sentinel from CRF-7 is properly carried over.
R2 severity count: 3 P3, 2 Nit.
"When a tool changes which parameters are required without changing property definitions, hashMCPServer produces the same hash." (Meruem, on the Required gap)
🤖 This review was automatically generated with Coder Agents.
| if t.ServerName == "" { | ||
| continue | ||
| } | ||
| byServer[t.ServerName] = append(byServer[t.ServerName], MCPTool{ |
There was a problem hiding this comment.
P3 [CRF-17] buildMCPServerResources converts MCPToolInfo to MCPTool, mapping Name, Description, and Schema (properties) but dropping Required []string. hashMCPServer covers only MCPTool fields, so when a tool changes which parameters are required without any other schema change, the server's ContentHash and the aggregate hash stay the same.
The sync loop does detect the change: mcpToolsChanged uses reflect.DeepEqual on the full MCPToolInfo (which includes Required), so a re-resolve fires and the snapshot version increments. But the hashes don't flip, so hash-based consumers (e.g. coderd's push path in follow-up PRs) would miss the update.
Fix: add Required []string to MCPTool, populate it here, and include it in hashMCPServer's hash computation.
(Meruem P3, Razor P3)
🤖
| lastPaths []string | ||
| lastTools []workspacesdk.MCPToolInfo | ||
| ) | ||
| refresh := func() { |
There was a problem hiding this comment.
P3 [CRF-18] lastPaths = paths is committed before Reload at line 1623. If Reload fails (e.g. manager closing during a shutdown race), the next wake sees equal paths and short-circuits, permanently skipping the explicit reload for those paths. Self-healing requires a file edit on disk that changes the discovered path set.
In practice this bites only during shutdown races, and the agent is tearing down anyway. But the fix is trivial: set lastPaths after Reload succeeds, or clear it on error.
if !slices.Equal(paths, lastPaths) {
if err := a.mcpManager.Reload(ctx, paths); err != nil {
a.logger.Warn(ctx, "failed to reload workspace MCP servers", slog.Error(err))
} else {
lastPaths = paths
}
}(Hisoka)
🤖
| // contextSnapshotToProto converts an agentcontext.Snapshot to its on-wire | ||
| // form. Payload bytes are intentionally omitted; they reach coderd via the | ||
| // drpc PushContextState path. | ||
| func contextSnapshotToProto(s agentcontext.Snapshot) *proto.ContextSnapshot { |
There was a problem hiding this comment.
P3 [CRF-19] contextSnapshotToProto and snapshotResponse at agentcontext/api.go:180 are structurally identical converters from agentcontext.Snapshot. The Name field has already drifted: contextSnapshotToProto maps it (line 327), but snapshotResponse's SnapshotResource type lacks it. Adding a field to Resource now requires updating two converters in two packages.
A shared resourceToWire helper or at minimum syncing the field sets would prevent further drift.
(Robin)
🤖
| return nil | ||
| } | ||
|
|
||
| servers := make([]string, 0, len(byServer)) |
There was a problem hiding this comment.
Nit [CRF-20] slices.Sorted(maps.Keys(byServer)) replaces these four lines. CRF-10 fixed this pattern in mcpConfigPaths (same PR), but this new instance uses the old manual form.
(Robin)
🤖
| // entry is a structural error the MCP manager would reject. | ||
| // The top-level Unmarshal above already rejects malformed JSON, | ||
| // so a well-formed value starting with '{' is a complete object. | ||
| for name, raw := range shape.MCPServers { |
There was a problem hiding this comment.
Nit [CRF-21] validateMCPConfig iterates shape.MCPServers in nondeterministic map order. When a .mcp.json has multiple invalid entries, the error message varies between resolves. Status (StatusInvalid) and the aggregate hash are unaffected (hash does not include Error), so this is cosmetic only. Sorting the keys before iterating would make the error stable.
(Komugi)
🤖
…t socket Split from #26466 (agent-only changes); no behavior outside agent/. - agentsocket: add context source CRUD (ContextSources, GetContextSource, AddContextSource, RemoveContextSource) plus GetContextSnapshot and ResyncContext RPCs, with matching client methods and proto. The server takes the context Manager via WithContextManager and returns a clean error when it is absent. - agentcontext: take a non-blocking MCP tool source via ManagerOptions.MCPTools and validate .mcp.json structure (StatusInvalid on malformed). The resolver folds the tools into one KindMCPServer resource per server, grouping by server, sorting deterministically, and hashing the tool set so any change flips the snapshot's aggregate hash. The conversion from MCP tools to context resources lives in this package (no provider interface, since it is no longer cross-package). - agent: back the MCP tool source with a non-blocking cache that syncMCPServersFromContext refreshes from the MCP manager on each context change, deduping by the discovered .mcp.json path set and by the tool set so the re-resolve it triggers does not loop. No changes to agent/x/agentmcp.
a9283c6 to
5d4b950
Compare
|
Follow-up simplification (per review feedback): removed the This update was made by Coder Agents on behalf of @kylecarbs. |
|
/coder-agents-review |
Splits the agent-only changes out of #26466 so they can be reviewed and merged independently. There are no changes outside
agent/.This adds the agent-side capabilities the chat-context work in #26466 builds on:
ContextSources,GetContextSource,AddContextSource,RemoveContextSource,GetContextSnapshot, andResyncContextRPCs (proto, generated code, client, and service), wired throughWithContextManager. The socket returns a clean error when no context manager is attached.ManagerOptions.MCPTools, afunc() []workspacesdk.MCPToolInfo) and folds the tools into oneKindMCPServerresource per server: grouping by server, sorting servers and tools deterministically, and hashing each server's tool set so any change flips the snapshot's aggregate hash. The conversion from MCP tools to context resources lives inagentcontextitself. The agent backs the source with a small tool cache thatsyncMCPServersFromContextrefreshes from the MCP manager on each context change, deduping by both the discovered.mcp.jsonpath set and the tool set so the re-resolve it triggers cannot loop..mcp.jsonfiles get lightweight structural validation (surfaced asStatusInvalid), and MCP resources contribute to the snapshot aggregate hash like any other resource. Deciding when a connecting server should (or should not) re-pin a chat is left to the coderd layer in a follow-up PR.Notably, this needs no changes to
agent/x/agentmcp: the resolver reads the manager's existingToolscache rather than adding new accessors or callbacks.Part of #26466. The coderd, codersdk, CLI, and frontend pieces follow in separate PRs.
Design notes
Resourceconversion and hashing now live inagentcontext(mcp.go), there is no cross-package boundary to abstract: the resolver holds a plainMCPToolSource(func() []workspacesdk.MCPToolInfo) and callsbuildMCPServerResourcesdirectly. An earlier revision wrapped this in anMCPProviderinterface; that indirection was removed once the conversion moved in-package.agentmcp.Manager.Toolsreloads synchronously when a.mcp.jsonsnapshot changes, sosyncMCPServersFromContextcan refresh tools off the existing context-change signal instead of a callback or timer. It publishes new tools into anatomic.Pointercache andTriggers a re-resolve; deduping by tool-set equality stops the trigger from looping.writeLengthPrefixed) andresourceIDscheme so MCP resources sort and hash consistently with filesystem resources.agentcontextvalidation pieces are taken from feat: consume pinned chat context, and surface drift, diff, refresh (UI + CLI) #26466 (heada96951b); the MCP wiring was reworked for this split to drop theagentmcpchanges and the drift-hash exclusion.make pre-commit(gen, fmt, lint, build) passes locally with nomake gendiff.This PR was created by Coder Agents on behalf of @kylecarbs. Split from #26466.