Problem
When an MCP server configured in .mcp.json (or via the admin MCP config) fails to connect, the failure is logged at Warn level and the server is skipped. The error is never surfaced to the user; the server's tools simply don't appear, which makes it hard to debug why expected tools are missing.
Additionally, failed servers are not retried unless the config file changes on disk (triggering a SnapshotChanged reload). A transient failure (e.g. server not yet ready, network blip) requires a manual config file touch or workspace restart to recover.
Where this happens
Agent-side (agent/x/agentmcp/manager.go)
connectAll() catches the error from connectServer(), logs it, and returns nil from the errgroup:
    c, err := m.connectServer(ctx, cfg)
    if err != nil {
        m.logger.Warn(ctx, "skipping MCP server",
            slog.F("server", cfg.Name),
            slog.F("transport", cfg.Transport),
            slog.Error(err),
        )
        return nil // Don't fail the group.
    }
installServers() will retain a previous client on reconnect failure, but on the first connect there is no previous client — the server is simply absent.
Chatd-side (coderd/x/chatd/mcpclient/mcpclient.go)
ConnectAll() similarly logs and swallows:
    if connectErr != nil {
        logger.Warn(ctx,
            "skipping MCP server due to connection failure",
            slog.F("server_slug", cfg.Slug),
            slog.F("server_url", RedactURL(cfg.Url)),
            slog.F("error", redactErrorURL(connectErr)),
        )
        return nil
    }
Current API gap
ListMCPToolsResponse only contains Tools []MCPToolInfo — there is no field for failed servers. Callers cannot distinguish "no MCP servers configured" from "configured but all failed to connect."
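To make the gap concrete, here is a minimal sketch of how the response could carry failures alongside tools. The FailedServers field, the MCPServerFailure type, the JSON tags, and the describe helper are all assumptions for illustration, and MCPToolInfo is reduced to a single field; none of this reflects the existing API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MCPToolInfo is a simplified stand-in for the real type.
type MCPToolInfo struct {
	Name string `json:"name"`
}

// MCPServerFailure is a hypothetical record of one server that failed to connect.
type MCPServerFailure struct {
	Server string `json:"server"`
	Error  string `json:"error"`
}

// ListMCPToolsResponse extended with a hypothetical FailedServers field.
type ListMCPToolsResponse struct {
	Tools         []MCPToolInfo      `json:"tools"`
	FailedServers []MCPServerFailure `json:"failed_servers,omitempty"`
}

// describe shows how a caller could tell the cases apart once failures are reported.
func describe(resp ListMCPToolsResponse) string {
	switch {
	case len(resp.Tools) == 0 && len(resp.FailedServers) > 0:
		return "no tools: configured servers failed to connect"
	case len(resp.Tools) == 0:
		return "no tools and no failures reported"
	case len(resp.FailedServers) > 0:
		return "some tools available, some servers failed"
	default:
		return "all tools available"
	}
}

func main() {
	resp := ListMCPToolsResponse{
		FailedServers: []MCPServerFailure{
			{Server: "search", Error: "connection refused"},
		},
	}
	out, err := json.Marshal(resp)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	fmt.Println(describe(resp))
}
```

Any error string exposed this way would presumably need the same URL redaction the chatd logger already applies.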
Desired behavior
- Surface failures: Connection errors should be visible to the user somewhere (UI design TBD — could be a chat system message, workspace health indicator, agent status panel, etc.).
- Retry failed servers: Servers that fail to connect should be periodically retried rather than requiring a config file change to trigger a reload.
🤖 Generated with Coder Agents