Skip to content

feat: add in-memory transport for chatd -> aibridge routing#25576

Merged
dannykopping merged 4 commits into
mainfrom
dk/aibridge-inmemory-pipe
May 22, 2026
Merged

feat: add in-memory transport for chatd -> aibridge routing#25576
dannykopping merged 4 commits into
mainfrom
dk/aibridge-inmemory-pipe

Conversation

@dannykopping
Copy link
Copy Markdown
Contributor

@dannykopping dannykopping commented May 21, 2026

TL;DR

Introduces an in-process TransportFactory for aibridge so that chatd (coder-agent LLM traffic) can route requests through the aibridged handler without crossing the HTTP route or requiring a license entitlement check.

What changed?

  • Added a new coderd/aibridge package with a TransportFactory interface and a Source type for tagging the call site on request contexts. SourceAgents is defined as the constant for coder-agent traffic.
  • Implemented NewTransportFactory in coderd/aibridged/transport.go, which returns an http.RoundTripper that dispatches requests to the aibridged handler in-process. The response body is streamed through an io.Pipe so SSE/NDJSON/chunked responses propagate token-by-token. Handler panics are recovered and surfaced as 500 responses, and context cancellation closes the pipe with the appropriate error.
  • RegisterInMemoryAIBridgedHTTPHandler now also constructs a TransportFactory from the registered handler and stores it on API.AIBridgeTransportFactory (an atomic.Pointer), making it available to chatd without going through the license-gated HTTP route.
  • Added API.AIBridgeTransportFactory as a public atomic.Pointer[aibridge.TransportFactory] field on coderd.API.

How to test?

  • coderd/aibridged/transport_test.go covers: transport creation, nil-handler errors, source attachment to context, header/status passthrough, streaming (SSE-style chunked writes visible before handler completion), context cancellation closing the body with an error, concurrent requests, handler panics producing 500s, and handlers that return without writing.
  • coderd/aibridge_test.go verifies that AIBridgeTransportFactory starts as nil on AGPL coderd, can be stored and loaded atomically, and that the stored factory correctly dispatches requests through the stub handler.

Why make this change?

Chatd needs to send LLM requests through aibridge in-process rather than via the external HTTP route, which is license-gated. The TransportFactory abstraction provides a clean seam: the entitlement check remains on the HTTP route for external callers, while in-process coder-agent traffic bypasses it through the factory. The Source type allows downstream handlers and logs to attribute traffic without gating behavior on the caller identity.

Copy link
Copy Markdown
Contributor Author

dannykopping commented May 21, 2026

@dannykopping dannykopping changed the base branch from dk/aibridged-to-agpl to graphite-base/25576 May 21, 2026 16:32
@dannykopping dannykopping force-pushed the dk/aibridge-inmemory-pipe branch from df9c129 to b3dd62e Compare May 21, 2026 16:35
@dannykopping dannykopping force-pushed the graphite-base/25576 branch from 5d05a00 to 7cef6fd Compare May 21, 2026 16:35
@dannykopping dannykopping changed the base branch from graphite-base/25576 to dk/aibridged-to-agpl May 21, 2026 16:35
@dannykopping dannykopping changed the base branch from dk/aibridged-to-agpl to graphite-base/25576 May 21, 2026 18:16
@dannykopping dannykopping force-pushed the dk/aibridge-inmemory-pipe branch 2 times, most recently from b3dd62e to 7aaf647 Compare May 21, 2026 18:19
@dannykopping dannykopping changed the base branch from graphite-base/25576 to dk/aibridged-to-agpl May 21, 2026 18:19
@dannykopping dannykopping force-pushed the dk/aibridged-to-agpl branch from 61fe7c4 to c51efb0 Compare May 21, 2026 18:38
@dannykopping dannykopping force-pushed the dk/aibridge-inmemory-pipe branch from 7aaf647 to 7038f1b Compare May 21, 2026 18:38
@dannykopping dannykopping changed the base branch from dk/aibridged-to-agpl to graphite-base/25576 May 22, 2026 06:58
@dannykopping dannykopping force-pushed the graphite-base/25576 branch from c51efb0 to ddec110 Compare May 22, 2026 07:11
@dannykopping dannykopping force-pushed the dk/aibridge-inmemory-pipe branch from 7038f1b to 063af1f Compare May 22, 2026 07:12
@graphite-app graphite-app Bot changed the base branch from graphite-base/25576 to main May 22, 2026 07:12
@dannykopping dannykopping force-pushed the dk/aibridge-inmemory-pipe branch 2 times, most recently from c2fa1c4 to f1eaac8 Compare May 22, 2026 08:58
Copy link
Copy Markdown
Contributor Author

/coder-agents-review

Copy link
Copy Markdown
Contributor

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean in-memory transport implementation. The pipe-based streaming design is sound, test coverage is thorough (streaming, cancellation, concurrency, no-write handler all exercised), and the interface/implementation split across aibridge/aibridged keeps the dependency graph clean.

This is a first-pass (Netero) review only: 1 P2, 3 P3 mechanical findings. The full review panel has not yet reviewed this PR and will review after these are addressed.

The P2 is a goroutine leak on the happy path that accumulates one parked goroutine per successful request until the caller's context expires. The P3s are a malformed Status string and two dead-code observations (noting that this is a stacked PR, the consumers likely live in dependent PRs #24897/#25577, but the code is dead in this diff).

"Every successful RoundTrip leaves one goroutine alive for the remaining context lifetime. For LLM workloads where the request context has a multi-minute timeout but the request completes in seconds, goroutines accumulate." Netero

🤖 This review was automatically generated with Coder Agents.

Comment thread coderd/aibridged/transport.go
Comment thread coderd/aibridged/transport.go Outdated
Comment thread coderd/aibridge/factory.go
Comment thread coderd/coderd.go
Copy link
Copy Markdown
Contributor Author

/coder-agents-review

Copy link
Copy Markdown
Contributor

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well-designed in-memory transport with strong streaming semantics, clean AGPL/daemon package split, and thorough test coverage (377 test lines covering streaming, cancellation, concurrency, and edge cases). DEREM-1 (goroutine leak) and DEREM-2 (Status format) are correctly fixed. DEREM-3 and DEREM-4 (stacked-PR dead code) accepted unanimously by the panel.

One P2 remains: the handler goroutine has no recover(), so a panicking handler crashes the entire process instead of producing a 500. Eleven of eighteen reviewers flagged this independently. Through the HTTP route, net/http.Server catches handler panics; through this transport, the same panic is process-fatal.

Three P3s: require calls in handler goroutines violate Go's testing.T contract, no test guards the DEREM-2 Status format fix, and a header-map race window exists between WriteHeader and the Clone in RoundTrip.

"Two doors to the same handler, different mortality." Hisoka

Gon noted several doc comments carry architecture narration that may go stale, though Leorio specifically praised the doc quality in this PR. The panel was split on comment density; the four dropped Note-level findings reflect this tension.

🤖 This review was automatically generated with Coder Agents.

Comment thread coderd/aibridged/transport.go
Comment thread coderd/aibridged/transport_test.go Outdated
Comment thread coderd/aibridged/transport_test.go
Comment thread coderd/aibridged/transport.go Outdated
Comment thread coderd/aibridged/transport.go Outdated
Comment thread coderd/aibridged/transport_test.go Outdated
Comment thread coderd/aibridged/transport.go Outdated
Comment thread coderd/aibridge_test.go Outdated
@dannykopping dannykopping marked this pull request as ready for review May 22, 2026 09:57
Copy link
Copy Markdown
Member

@johnstcn johnstcn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. PR description could use some words though, and there's a bit of a gotcha with ensureHeaders().

go func() {
select {
case <-served.Context().Done():
_ = pw.CloseWithError(served.Context().Err())
Copy link
Copy Markdown
Member

@johnstcn johnstcn May 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review: this is explicitly called out as being safe to call concurrently.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the defense-in-depth here is fine to leave as-is, right?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep! My comment was for future readers wondering "is this concurrently safe"?

Comment on lines +161 to +167
// ensureHeaders closes gotHeaders if it has not already been closed, with the
// current status. Used to unblock RoundTrip on handler return-without-write.
func (w *pipeResponseWriter) ensureHeaders() {
w.once.Do(func() {
close(w.gotHeaders)
})
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit of a gotcha: after calling ensureHeaders() a subsequent call to WriteHeader is a silent no-op.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now this only happens on defer. I think it's fine?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, non-blocking but just calling out.

dannykopping and others added 4 commits May 22, 2026 12:20
Introduce an AGPL-side TransportFactory hook on coderd.API that is
populated with an in-process http.RoundTripper when the aibridge daemon
starts, letting coder-agent chatd traffic reach aibridged without going
through the licensed HTTP route. The RoundTripper streams responses via
io.Pipe so SSE/NDJSON/chunked bodies propagate token-by-token and
context cancellation surfaces as a body-read error, matching real-network
semantics.

Refs AIGOV-357.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ortFactory

TransportFor now takes an aibridge.Source (currently just SourceAgents)
attached to the request context for downstream logging. The old
isCoderAgent boolean and (nil, nil) fall-through are gone; the carve-out
gating belongs to the caller's licensing decision, not the transport API.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Fix goroutine leak: the cancellation-watcher goroutine now selects on
  both served.Context().Done() and a handlerDone channel, so it exits
  promptly when the handler completes on the happy path (DEREM-1).
- Fix Response.Status format: use fmt.Sprintf("%d %s", ...) to produce
  the full status line (e.g. "200 OK") per the net/http contract (DEREM-2).
- Add recover() in handler goroutine to prevent panicking handlers from
  crashing the process; produces a 500 response instead (DEREM-6).
- Replace require calls in handler goroutines with assert + return to
  respect testing.T goroutine contract (DEREM-7).
- Add regression test for resp.Status format (DEREM-8).
- Snapshot headers in WriteHeader via frozenHeader to eliminate the race
  window between handler and RoundTrip after gotHeaders closes (DEREM-9).
- Fix comment referencing nonexistent outer http server (DEREM-10).
- Use wg.Go and range-over-int loops (Go 1.25+) (DEREM-11).
- Fix article: a -> an before aibridge.TransportFactory (DEREM-12).
- Replace captureWriter with httptest.NewRecorder (DEREM-5).
- Add TestInMemoryRoundTripper_HandlerPanic test.
@dannykopping dannykopping force-pushed the dk/aibridge-inmemory-pipe branch from 646dfa1 to ee9ba43 Compare May 22, 2026 10:20
@dannykopping dannykopping merged commit 5d40bac into main May 22, 2026
27 checks passed
@dannykopping dannykopping deleted the dk/aibridge-inmemory-pipe branch May 22, 2026 10:33
@github-actions github-actions Bot locked and limited conversation to collaborators May 22, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants