fix: pin workspace agent API client to intended agent#26600
Open
ethanndickson wants to merge 3 commits into
Open
fix: pin workspace agent API client to intended agent#26600ethanndickson wants to merge 3 commits into
ethanndickson wants to merge 3 commits into
Conversation
The control-plane HTTP client used to talk to workspace agents (agentConn.apiClient) followed redirects with Go's default policy, and its DialContext dialed whatever host the post-redirect URL contained, checking only that the port was the agent HTTP API port (4). A malicious workspace agent could return a 3xx redirect to another agent's tailnet IP, and coderd's shared tailnet would replay the request (including replayable POST bodies on 307/308) against the victim agent's unauthenticated HTTP API, enabling cross-tenant file read/write and RCE. Refuse redirects via CheckRedirect and pin every dial to the intended agent address, rejecting any request host that differs from it. Apply the same redirect refusal to the task app client in aitasks.go. Closes CODAGT-668
This comment has been minimized.
This comment has been minimized.
Member
Author
|
@codex review |
|
Codex Review: Didn't find any major issues. What shall we delve into next? Reviewed commit: ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The control-plane HTTP client used to talk to workspace agents followed HTTP redirects and trusted the redirected host, letting a malicious workspace agent bounce a coderd request onto a different agent on the shared tailnet. Because the agent HTTP API on port 4 is unauthenticated (it relies on tailnet reachability plus control-plane authorization), this allowed cross-tenant file read/write and remote code execution. This PR refuses redirects and pins every dial to the intended agent.
Closes CODAGT-668.
Problem
agentConn.apiClientincodersdk/workspacesdk/agentconn.goconstructed anhttp.Clientwith noCheckRedirect, so Go's default policy followed up to 10 redirects. Its customTransport.DialContextparsed the host from the (post-redirect) request URL and dialed that IP over the shared tailnet, validating only that the port wasAgentHTTPAPIServerPort(4). It never pinned the connection to the intendedAgentID/agentAddress().A workspace owner (any regular org member, not just admins) controls their own agent and can make its port-4 handler return a
3xxLocationpointing at a victim agent's tailnet IP. When a control-plane action (for example a chat tool or the HTTP MCP server) sends an agent API request to the attacker's agent, coderd acts as a confused deputy and replays the request against the victim:301/302/303rewrite POST to GET, but307/308preserve method and body when the body is replayable. The real callers pass replayable bodies, so a redirectedPOST /api/v0/write-filewrites attacker-controlled content into the victim workspace and a redirectedPOST /api/v0/processes/startexecutes it, giving RCE on the victim agent.The dangerous callers run server-side on coderd's single deployment-wide
ServerTailnet, which is authorized to tunnel to any agent, so the blast radius is cross-tenant / cross-organization (limited in practice to victim agents coderd currently has a live tunnel to).Fix
In
agentConn.apiClient:CheckRedirect: http.ErrUseLastResponseso the client never follows a redirect. A3xxis surfaced to the caller as the response (which the existingReadBodyAsErrorpath turns into an error) instead of being replayed against another host.AgentID(agentAddr := netip.AddrPortFrom(c.agentAddress(), AgentHTTPAPIServerPort)), reject any dial whose host or port does not match it, and always dial that pinned address rather than the URL-derived host.In
coderd/aitasks.go, the task app proxy client (taskAppHTTPClient) also now setsCheckRedirect: http.ErrUseLastResponse. This client dials throughagentConn.DialContext, which already pins the host to the originating workspace's agent (it takes only the port from the dial address), so it was never cross-agent. The change is hardening for parity so a malicious app cannot bounce the request to a different port on the same agent.Hardening and defense in depth
The two layers are independent.
CheckRedirectremoves the redirect-following behavior entirely, and the dial pinning guarantees that even a request constructed with a foreign host can only ever reach the intended agent. Removing either one in the future cannot, on its own, reintroduce the cross-agent vector.Tests
codersdk/workspacesdk/agentconn_redirect_test.gobuilds a three-peer tailnet (client, attacker, victim). The attacker agent redirects to the victim's port-4 URL, and the test asserts thatGET302,POST307, andPOST308all return an error and that the victim is never contacted.coderd/aitasks_internal_test.goaddsTestTaskAppHTTPClient_RejectsRedirect, which verifies the task app client surfaces a307instead of following it to a stand-in victim.Why this closes the whole vulnerability class
apiClientis the only HTTP chokepoint to the agent port-4 API, so fixing it covers every server-side caller:agentConnfunnels throughapiClient, either viaapiRequest, a directapiClient(ctx).Do(...)(ExecuteDesktopAction), or as the websocketHTTPClient(WatchContainers,WatchGit,ConnectDesktopVNC). The websocket handshake matters here:coder/websocketfollows3xxduring the handshake by default and only requires101on the final hop, but it honors the underlying client'sCheckRedirect, so reusingapiClientcloses the websocket paths too./api/experimental/mcp/httpregisters tools (coder_workspace_bash,_write_file,_read_file,_edit_files, etc.) that reach the agent throughworkspacesdk.AgentConnmethods, so they go throughapiClientand are covered. The same is true for agent-hosted MCP, which coderd reaches only viaagentConn.CallMCPTool/ListMCPTools. coderd never opens an MCP client connection directly to an agent over the tailnet.DialContext) speak non-HTTP protocols and have no redirect surface. The workspace apps reverse proxy targets user app ports, not port 4, forwards3xxto the browser rather than following them, and pins its transport to the request's agent.provisionerddoes not talk to the agent HTTP API at all.No other server-side client follows redirects to an agent-controllable tailnet host, so no further redirect changes are required for this class.