feat: add in-memory transport for chatd -> aibridge routing#25576
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
df9c129 to
b3dd62e
Compare
5d05a00 to
7cef6fd
Compare
b3dd62e to
7aaf647
Compare
61fe7c4 to
c51efb0
Compare
7aaf647 to
7038f1b
Compare
c51efb0 to
ddec110
Compare
7038f1b to
063af1f
Compare
c2fa1c4 to
f1eaac8
Compare
|
/coder-agents-review |
There was a problem hiding this comment.
Clean in-memory transport implementation. The pipe-based streaming design is sound, test coverage is thorough (streaming, cancellation, concurrency, no-write handler all exercised), and the interface/implementation split across aibridge/aibridged keeps the dependency graph clean.
This is a first-pass (Netero) review only: 1 P2, 3 P3 mechanical findings. The full review panel has not yet reviewed this PR and will review after these are addressed.
The P2 is a goroutine leak on the happy path that accumulates one parked goroutine per successful request until the caller's context expires. The P3s are a malformed Status string and two dead-code observations (noting that this is a stacked PR, the consumers likely live in dependent PRs #24897/#25577, but the code is dead in this diff).
"Every successful RoundTrip leaves one goroutine alive for the remaining context lifetime. For LLM workloads where the request context has a multi-minute timeout but the request completes in seconds, goroutines accumulate." Netero
🤖 This review was automatically generated with Coder Agents.
|
/coder-agents-review |
There was a problem hiding this comment.
Well-designed in-memory transport with strong streaming semantics, clean AGPL/daemon package split, and thorough test coverage (377 test lines covering streaming, cancellation, concurrency, and edge cases). DEREM-1 (goroutine leak) and DEREM-2 (Status format) are correctly fixed. DEREM-3 and DEREM-4 (stacked-PR dead code) accepted unanimously by the panel.
One P2 remains: the handler goroutine has no recover(), so a panicking handler crashes the entire process instead of producing a 500. Eleven of eighteen reviewers flagged this independently. Through the HTTP route, net/http.Server catches handler panics; through this transport, the same panic is process-fatal.
Three P3s: require calls in handler goroutines violate Go's testing.T contract, no test guards the DEREM-2 Status format fix, and a header-map race window exists between WriteHeader and the Clone in RoundTrip.
"Two doors to the same handler, different mortality." Hisoka
Gon noted several doc comments carry architecture narration that may go stale, though Leorio specifically praised the doc quality in this PR. The panel was split on comment density; the four dropped Note-level findings reflect this tension.
🤖 This review was automatically generated with Coder Agents.
johnstcn
left a comment
There was a problem hiding this comment.
LGTM. PR description could use some words though, and there's a bit of a gotcha with ensureHeaders().
| go func() { | ||
| select { | ||
| case <-served.Context().Done(): | ||
| _ = pw.CloseWithError(served.Context().Err()) |
There was a problem hiding this comment.
review: this is explicitly called out as being safe to call concurrently.
There was a problem hiding this comment.
I suppose the defense-in-depth here is fine to leave as-is, right?
There was a problem hiding this comment.
Yep! My comment was for future readers wondering "is this concurrently safe"?
| // ensureHeaders closes gotHeaders if it has not already been closed, with the | ||
| // current status. Used to unblock RoundTrip on handler return-without-write. | ||
| func (w *pipeResponseWriter) ensureHeaders() { | ||
| w.once.Do(func() { | ||
| close(w.gotHeaders) | ||
| }) | ||
| } |
There was a problem hiding this comment.
Bit of a gotcha: after calling ensureHeaders() a subsequent call to WriteHeader is a silent no-op.
There was a problem hiding this comment.
Right now this only happens on defer. I think it's fine?
There was a problem hiding this comment.
Yep, non-blocking but just calling out.
Introduce an AGPL-side TransportFactory hook on coderd.API that is populated with an in-process http.RoundTripper when the aibridge daemon starts, letting coder-agent chatd traffic reach aibridged without going through the licensed HTTP route. The RoundTripper streams responses via io.Pipe so SSE/NDJSON/chunked bodies propagate token-by-token and context cancellation surfaces as a body-read error, matching real-network semantics. Refs AIGOV-357. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ortFactory TransportFor now takes an aibridge.Source (currently just SourceAgents) attached to the request context for downstream logging. The old isCoderAgent boolean and (nil, nil) fall-through are gone; the carve-out gating belongs to the caller's licensing decision, not the transport API. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Fix goroutine leak: the cancellation-watcher goroutine now selects on
both served.Context().Done() and a handlerDone channel, so it exits
promptly when the handler completes on the happy path (DEREM-1).
- Fix Response.Status format: use fmt.Sprintf("%d %s", ...) to produce
the full status line (e.g. "200 OK") per the net/http contract (DEREM-2).
- Add recover() in handler goroutine to prevent panicking handlers from crashing the process; produces a 500 response instead (DEREM-6). - Replace require calls in handler goroutines with assert + return to respect testing.T goroutine contract (DEREM-7). - Add regression test for resp.Status format (DEREM-8). - Snapshot headers in WriteHeader via frozenHeader to eliminate the race window between handler and RoundTrip after gotHeaders closes (DEREM-9). - Fix comment referencing nonexistent outer http server (DEREM-10). - Use wg.Go and range-over-int loops (Go 1.25+) (DEREM-11). - Fix article: a -> an before aibridge.TransportFactory (DEREM-12). - Replace captureWriter with httptest.NewRecorder (DEREM-5). - Add TestInMemoryRoundTripper_HandlerPanic test.
646dfa1 to
ee9ba43
Compare

TL;DR
Introduces an in-process
TransportFactoryfor aibridge so that chatd (coder-agent LLM traffic) can route requests through the aibridged handler without crossing the HTTP route or requiring a license entitlement check.What changed?
coderd/aibridgepackage with aTransportFactoryinterface and aSourcetype for tagging the call site on request contexts.SourceAgentsis defined as the constant for coder-agent traffic.NewTransportFactoryincoderd/aibridged/transport.go, which returns anhttp.RoundTripperthat dispatches requests to the aibridged handler in-process. The response body is streamed through anio.Pipeso SSE/NDJSON/chunked responses propagate token-by-token. Handler panics are recovered and surfaced as 500 responses, and context cancellation closes the pipe with the appropriate error.RegisterInMemoryAIBridgedHTTPHandlernow also constructs aTransportFactoryfrom the registered handler and stores it onAPI.AIBridgeTransportFactory(anatomic.Pointer), making it available to chatd without going through the license-gated HTTP route.API.AIBridgeTransportFactoryas a publicatomic.Pointer[aibridge.TransportFactory]field oncoderd.API.How to test?
coderd/aibridged/transport_test.gocovers: transport creation, nil-handler errors, source attachment to context, header/status passthrough, streaming (SSE-style chunked writes visible before handler completion), context cancellation closing the body with an error, concurrent requests, handler panics producing 500s, and handlers that return without writing.coderd/aibridge_test.goverifies thatAIBridgeTransportFactorystarts as nil on AGPL coderd, can be stored and loaded atomically, and that the stored factory correctly dispatches requests through the stub handler.Why make this change?
Chatd needs to send LLM requests through aibridge in-process rather than via the external HTTP route, which is license-gated. The
TransportFactoryabstraction provides a clean seam: the entitlement check remains on the HTTP route for external callers, while in-process coder-agent traffic bypasses it through the factory. TheSourcetype allows downstream handlers and logs to attribute traffic without gating behavior on the caller identity.