
Comparing changes

base repository: triggerdotdev/trigger.dev
base: 5693b62
...
head repository: triggerdotdev/trigger.dev
compare: c69e939
  • 7 commits
  • 73 files changed
  • 4 contributors

Commits on Apr 27, 2026

  1. chore: fix CONTRIBUTING.md setup steps and scope db:seed to webapp (#3450)
    
    ## Summary
    
    Two fixes that together get a fresh-machine setup working from
    `CONTRIBUTING.md` end-to-end with no manual workarounds:
    
    ### `CONTRIBUTING.md`
    
    - Fix wrong path in the migration walkthrough: `cd packages/database` →
    `cd internal-packages/database`. The current path doesn't exist; this
    breaks step 2 for every contributor adding a migration.
    - Renumber duplicate `4.` steps in **Adding migrations** and the skipped
    `5.` in the hello-world **Running** section.
    - Combine three sequential `pnpm run build --filter ...` calls into one
    (Turbo parallelizes filters): `pnpm run build --filter webapp --filter
    trigger.dev --filter @trigger.dev/sdk`.
    - Add a `pnpm run db:seed` step after migrate. The seed creates the
    local user, `References` org, and reference projects (including
    `hello-world` with the stable `proj_rrkpdguyagvsoktglnod`). Removes the
    manual instruction to edit the `externalRef` column in Postgres.
    - Mention ClickHouse and the ClickHouse migrator alongside
    Postgres/Redis in the Docker step (they're already part of `pnpm run
    docker`, just invisible in the docs).
    - Remove the V1-era **Add sample jobs** section.
    `references/job-catalog` no longer exists; the hello-world flow above
    replaces it.
    
    ### `turbo.json`
    
    Scope `db:seed` to `webapp#db:seed → webapp#build`. The previous
    root-level entry queued `build` for every workspace package — including
    `references-*`, `docs`, `kubernetes-provider`, `coordinator`, etc. Only
    `webapp` actually has a `db:seed` script, so the rest of those builds
    were dead weight. Worse: a single broken reference (today,
    `references-realtime-hooks-test` failing under Turbopack with
    `node:fs/promises`) kills the whole seed pipeline.
    
    After the change, the `turbo run db:seed --dry-run` plan drops from 27
    tasks to 20: only `webapp` and its real transitive workspace deps.
    Reference projects no longer block seeding.
    
    ## Test plan
    
    - [x] Fresh-machine setup followed end-to-end on a wiped Postgres +
    ClickHouse: migrate → seed → build → webapp → CLI login → `trigger dev`
    → triggered `hello-world`, run completed with `{"message":"Hello,
    world!"}`.
    - [x] `turbo run db:seed --dry-run=json` confirms 20 tasks, all webapp
    deps, no reference packages.
    - [ ] CI green on the renamed turbo task name.
    ericallam authored Apr 27, 2026
    4dced14
  2. fix(helm): expand CLICKHOUSE_PASSWORD in webapp CLICKHOUSE_URL via kubelet (#3449)
    
    ## Summary
    
    When the official Helm chart is deployed with an external ClickHouse and
    `clickhouse.external.existingSecret` set — the documented path for not
    committing secrets to `values.yaml` — the webapp pod crash-loops on
    startup:
    
    ```
    goose run: parse "http://default:${CLICKHOUSE_PASSWORD}@<host>:8123?secure=false": net/url: invalid userinfo
    ```
    
    Context in vouch request #3443. Re-opening in draft status per bot
    policy (previous attempt was #3445, closed by automation because it
    wasn't draft; no changes to the patch).
    
    ## Root cause
    
    Two pieces interact:
    
    1. `hosting/k8s/helm/templates/_helpers.tpl` renders `CLICKHOUSE_URL`
    (and `RUN_REPLICATION_CLICKHOUSE_URL`) with a shell-style literal
    `${CLICKHOUSE_PASSWORD}` expecting bash expansion at container start.
    2. `docker/scripts/entrypoint.sh` does `export
    GOOSE_DBSTRING="$CLICKHOUSE_URL"` — single-pass POSIX sh substitution,
    so the inner `${...}` survives as literal text and goose rejects it.
    
    Reproduces against the latest published chart
    (`oci://ghcr.io/triggerdotdev/charts/trigger:4.0.5`) and `main`.
    
    ## Fix
    
    Switch the two helpers (external + `existingSecret` branch) from
    shell-style `${CLICKHOUSE_PASSWORD}` to Kubernetes'
    `$(CLICKHOUSE_PASSWORD)`. Kubelet substitutes `$(VAR)` at pod-creation
    time from earlier env entries, and the chart already declares
    `CLICKHOUSE_PASSWORD` from the Secret immediately before
    `CLICKHOUSE_URL`, so the URL reaches the entrypoint with the real
    password already inlined. No entrypoint change, no image change. The
    plain-password branch (no `existingSecret`) is unchanged.
    
    Operator caveat added as template comments: `CLICKHOUSE_PASSWORD` must
    be URL-userinfo-safe since kubelet substitutes verbatim without
    percent-encoding. Hex-encoded passwords (e.g. `openssl rand -hex 32`)
    are safe by construction.
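    
    As a rough illustration of why the switch works, the sketch below
    mimics kubelet's dependent-variable expansion. `expandK8sEnv` and
    `isUserinfoSafe` are hypothetical helpers written for this note, not
    part of the chart or of Kubernetes, and the `clickhouse` host is a
    placeholder. The sketch assumes that `$(VAR)` resolves only against
    entries declared earlier in the same `env` list and that unresolved
    references are left as literal text.
    
    ```typescript
    // Hypothetical model of a container `env` list; not a chart or kubelet API.
    type EnvEntry = { name: string; value: string };
    
    // Expand $(VAR) references using only entries declared earlier in the list.
    // Shell-style ${VAR} is left untouched, and the $$(VAR) escape is not modelled.
    function expandK8sEnv(entries: EnvEntry[]): Map<string, string> {
      const resolved = new Map<string, string>();
      for (const { name, value } of entries) {
        const expanded = value.replace(
          /\$\(([A-Za-z_][A-Za-z0-9_]*)\)/g,
          (match: string, ref: string) => resolved.get(ref) ?? match
        );
        resolved.set(name, expanded);
      }
      return resolved;
    }
    
    // Conservative check: a password is treated as userinfo-safe here only if
    // percent-encoding would leave it unchanged (true for hex strings).
    function isUserinfoSafe(password: string): boolean {
      return encodeURIComponent(password) === password;
    }
    
    const exampleEnv = expandK8sEnv([
      { name: "CLICKHOUSE_PASSWORD", value: "0a1b2c3d" }, // would come from the Secret
      {
        name: "CLICKHOUSE_URL",
        value: "http://default:$(CLICKHOUSE_PASSWORD)@clickhouse:8123?secure=false",
      },
    ]);
    console.log(exampleEnv.get("CLICKHOUSE_URL"));
    // prints "http://default:0a1b2c3d@clickhouse:8123?secure=false"
    ```
    
    Declaration order matters in this model: if `CLICKHOUSE_URL` were
    listed before `CLICKHOUSE_PASSWORD`, the reference would stay as
    literal `$(CLICKHOUSE_PASSWORD)`.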
    
    ## Verification
    
    - `helm template` against `external.existingSecret` now renders `value:
    "http://default:$(CLICKHOUSE_PASSWORD)@<host>:8123?secure=false"` (was
    `${CLICKHOUSE_PASSWORD}`).
    - `helm template` against the plain-password branch is byte-identical to
    before.
    - Deployed end-to-end on a staging EKS cluster (Meistrari platform):
    webapp container reaches `goose: successfully migrated database to
    version: 6`, Node.js ClickHouse client connects at runtime.
    
    ## Alternatives considered
    
    - **Change `entrypoint.sh`** to `eval` / `envsubst` the URL — larger
    surface, touches every deployment mode (Docker Compose + k8s) and every
    container image.
    - **Mirror the Postgres pattern** (chart reads the full URL via
    `valueFrom.secretKeyRef`, as in `trigger-v4.postgres.useSecretUrl`) —
    cleaner long-term but requires a new `values.yaml` field and a migration
    path for existing users. Happy to follow up with that as a separate PR
    if the minimal fix here isn't the preferred direction.
    
    ## Changeset
    
    None added — the Helm chart isn't versioned through `@changesets/cli`
    (docs/chart-only PRs historically merge without a changeset, e.g.
    #2671). Happy to add one if the policy changed.
    
    Closes #3443.
    ThullyoCunha authored Apr 27, 2026
    e8f1a7a

Commits on Apr 28, 2026

  1. ci: skip privileged PR jobs on fork PRs (#3458)

    Fork PRs can't access org secrets or push to GHCR, so these two
    `pull_request` jobs hard-fail with no path to passing:
    
    - `claude-md-audit` - needs `CLAUDE_CODE_OAUTH_TOKEN`
    - `helm-pr-prerelease` `prerelease` job - needs `packages: write` to
    push the chart
    
    Hit this on #3449. Approving the run didn't help; the jobs ran and
    failed at the privileged step. The chart-validation `lint-and-test` job
    is fork-safe and stays untouched - that remains the merge gate for Helm
    changes.
    
    Gate both jobs on same-repo head:
    
    ```yaml
    if: github.event.pull_request.head.repo.full_name == github.repository
    ```
    
    Other PR workflows already handle forks fine: `pr_checks`
    (typecheck/units/e2e/sdk-compat) falls back to anonymous DockerHub pulls
    when secrets are missing.
    nicktrn authored Apr 28, 2026
    9e99c81
  2. chore(security): close dependabot alerts q2 (#3456)

    Closes ~80 dependabot alerts (3 critical, ~25 high, ~31 medium) by
    bumping direct deps where possible and narrowly overriding the rest.
    Cloud uses `resend` email transport and Node 20 - all bumps are safe for
    both cloud and self-hosters.
    
    ## Direct upgrades
    
    | Package | Where | From | To | Why |
    |---|---|---|---|---|
    | `vite` | root devDeps | ^5.4.21 | *(removed)* | dead pin; vitest pulls vite transitively |
    | `dompurify` | apps/webapp | ^3.2.6 | ^3.4.1 | XSS CVEs |
    | `effect` | apps/webapp | ^3.11.7 | ^3.21.2 | AsyncLocalStorage CVE in Effect fibers |
    | `nodemailer` | internal-packages/emails | ^7.0.11 | ^8.0.6 | SMTP CRLF injection (only affects self-hosters w/ smtp/aws-ses transport) |
    | `uuid` | apps/webapp | ^9.0.0 | ^14.0.0 | buffer bounds check; ESM-only but bundled by Remix |
    | `uuid` + `@types/uuid` | packages/trigger-sdk | ^9.0.0 | *(removed)* | dead deps, no usage |
    | `@types/uuid` | apps/webapp | ^9.0.0 | *(removed)* | uuid 14 ships its own types |
    | `tar` | packages/cli-v3 | ^7.5.4 | ^7.5.13 | path traversal CVEs |
    | `testcontainers` + `@testcontainers/postgresql` + `@testcontainers/redis` | internal-packages/testcontainers | ^10.28.0 | ^11.14.0 | dev/test cleanup; one-line API fix for `RedisContainer(image)` |
    | `rimraf` | webapp + 6 packages | ^3.0.2 / ^5.0.7 | ^6.0.1 | dev/build tool consolidation |
    
    ## Scoped overrides
    
    All bound by both `>=` and `<` to avoid major-version yanks.
    
    | Override | Closes |
    |---|---|
    | `tar@>=7 <7.5.11` → `^7.5.11` | supervisor's `@kubernetes/client-node 1.0.0` chain |
    | `axios@>=1.0.0 <1.15.0` → `^1.15.0` | replaces older 1.9.0 pin |
    | `systeminformation@>=5.0.0 <5.31.0` → `^5.31.0` | bumps existing 5.27.14 pin |
    | `lodash@>=4.0.0 <4.18.0` → `^4.18.0` | bumps existing 4.17.23 pin |
    | `lodash-es@>=4.0.0 <4.18.0` → `^4.18.0` | new (mirrors lodash) |
    | `dompurify@>=3 <3.4.0` → `^3.4.1` | catches transitive dompurify via mermaid |
    | `vite@>=5.0.0 <6.4.2` → `^6.4.2` | path traversal; vite 5 has no patch |
    | `rollup@>=4 <4.59.0` → `^4.59.0` | path traversal in vite/vitest chain |
    | `flatted@>=3 <3.4.2` → `^3.4.2` | prototype pollution in eslint flat-cache |
    | `picomatch@>=2 <2.3.2` → `^2.3.2` | ReDoS in 2.x branch (transitive) |
    | `picomatch@>=4 <4.0.4` → `^4.0.4` | ReDoS in 4.x branch (vitest/tinyglobby) |
    | `minimatch@>=3 <3.1.3` → `^3.1.3` | ReDoS in eslint 8 chain |
    | `protobufjs@>=7 <7.5.5` → `^7.5.5` | **critical** RCE via @opentelemetry/otlp-transformer |
    | `fast-xml-parser@>=4 <4.5.5` → `^4.5.5` | DOCTYPE bypass + others (4.x branch via aws-sdk in supervisor) |
    | `fast-xml-parser@>=5 <5.7.0` → `^5.7.0` | **critical** + others (5.x branch via aws-sdk in webapp) |
    | `path-to-regexp@>=0.1 <0.1.13` → `^0.1.13` | ReDoS in express 4 / @remix-run/express |
    | `ajv@>=8 <8.18.0` → `^8.18.0` | DoS |
    | `socket.io-parser@>=4 <4.2.6` → `^4.2.6` | DoS in @trigger.dev/core's socket.io |
    | `postcss@>=8 <8.5.10` → `^8.5.10` | XSS via stringify |
    | `yaml@>=2 <2.8.3` → `^2.8.3` | DoS |
    | `semver@>=5 <5.7.2` → `^5.7.2` | ReDoS in 5.x |
    | `defu@>=6 <6.1.5` → `^6.1.5` | prototype pollution via __proto__ in @prisma/config c12 chain |
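    
    To make the effect of bounding a selector on both sides concrete,
    here is a simplified sketch of which resolved versions a selector
    such as `tar@>=7 <7.5.11` would match. The helpers are hypothetical
    and handle only numeric `major.minor.patch` versions (no prerelease
    tags); the package manager applies full semver semantics, so treat
    this as an approximation.
    
    ```typescript
    // Parse "7.5.4" into [7, 5, 4]; missing parts default to 0, so "7" means 7.0.0.
    function parseVersion(v: string): [number, number, number] {
      const [major = 0, minor = 0, patch = 0] = v.split(".").map(Number);
      return [major, minor, patch];
    }
    
    // Return -1, 0, or 1 depending on how version a orders against version b.
    function compareVersions(a: string, b: string): number {
      const pa = parseVersion(a);
      const pb = parseVersion(b);
      for (let i = 0; i < 3; i++) {
        if (pa[i] !== pb[i]) return pa[i] < pb[i] ? -1 : 1;
      }
      return 0;
    }
    
    // True when `version` falls in the half-open range [lower, upper).
    function inBoundedRange(version: string, lower: string, upper: string): boolean {
      return compareVersions(version, lower) >= 0 && compareVersions(version, upper) < 0;
    }
    
    console.log(inBoundedRange("7.5.4", "7", "7.5.11"));  // true: vulnerable 7.x, gets overridden
    console.log(inBoundedRange("6.2.1", "7", "7.5.11"));  // false: other major, left alone
    console.log(inBoundedRange("7.5.13", "7", "7.5.11")); // false: already patched
    ```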
    
    ## Dismissed (~47)
    
    | Reason | Cluster | Count |
    |---|---|---|
    | `not_used` | langsmith + next 15.x in references/* | 10 |
    | `not_used` | minimatch 8.x via prisma-generator-ts-enums (references/prisma-6) | 3 |
    | `not_used` | basic-ftp via puppeteer in references/hello-world + references/seed | 2 |
    | `not_used` | hono / @hono/node-server / express-rate-limit / path-to-regexp 8.x / @modelcontextprotocol/sdk - all via mcp-sdk chain (dormant in webapp; dev-only localhost in cli-v3) | 22 |
    | `not_used` | fastify / @fastify/static / file-type via evalite devDep | 5 |
    | `tolerable_risk` | rollup 3 + minimatch 5/8/9/10 dev/build tooling | 13 |
    
    ## Notes
    
    - **mcp-sdk chain**: `@vercel/sdk` in webapp imports `Vercel` API client
    only; `mcp-server/*` subpath isn't loaded at runtime. cli-v3's MCP
    server runs only via `trigger mcp` on developer machines. Bumping
    `@modelcontextprotocol/sdk` to latest (1.29.0) wouldn't close these
    alerts anyway - it ships hono ^4.11.4 which is still vulnerable - so
    dismissal is the cleaner call.
    - **References ignore list**: confirmed with current dependabot ignore
    config; added `references/seed/package.json` (only gap).
    - **undici** alerts (CVE-2026-1527, 4 alerts) will auto-close: lockfile
    already at 6.25.0 > patched 6.24.0; just needs Dependabot rescan.
    - **Effect 3.20 fix** is a runtime-only scheduler fix, no public API
    changes - verified with research agent against our four `effect/*`
    imports.
    - **uuid 14** is ESM-only; we only call `validate`/`version` (no crypto
    needed) so Node 20 requirement isn't load-bearing for us.
    
    ## Public packages (`packages/*`)
    
    Minimal surface, deliberately. None of these change published runtime
    behaviour - all changesets-worthy public package changes are deferred to
    a regular release pass.
    
    | Package | Change | Runtime impact |
    |---|---|---|
    | `packages/trigger-sdk` | Removed dead `uuid` dep (no source imports) | None - dep was unused |
    | `packages/cli-v3` | `tar` ^7.5.4 → ^7.5.13 | Patch bump within already-allowed 7.x range; nothing CLI consumers see |
    | `packages/core` / `packages/build` / `packages/python` / `packages/rsc` / `packages/react-hooks` / `packages/schema-to-json` | `rimraf` ^3.0.2 → ^6.0.1 in devDeps | Build-time only, no runtime change |
    
    No changeset added because nothing in these packages affects what
    published consumers run.
    
    ## Validation
    
    - Webapp typecheck (forced, no cache) passes after every commit
    - Smoke-tested testcontainers v11 changes via real `postgresTest` +
    `redisTest` (sync.test.ts, releaseConcurrency.test.ts) - both pass
    - Webapp built + verified `require("uuid")` no longer in CJS server
    output (now bundled inline)
    - Test env webapp deployed at `dependabot-q2.rc0` (cloud#740) - no
    issues observed
    - Test suite run with package prerelease passed
    nicktrn authored Apr 28, 2026
    91fd8a8
  3. feat: add isReplay to run context (#3454)

    ## Summary
    
    Adds `isReplay` boolean to the run context (`ctx.run.isReplay`),
    following the same pattern as the existing `isTest`. The value is
    derived from the existing `replayedFromTaskRunFriendlyId` database
    field, so no schema migration is needed.
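    
    A minimal sketch of how the flag might be derived and consumed. The
    types below are trimmed-down, hypothetical stand-ins for the real
    schemas, and the exact derivation (non-null
    `replayedFromTaskRunFriendlyId` means replay) is an assumption based
    on the description above. `shouldSendEmail` is an invented example of
    task-side logic, not part of the SDK.
    
    ```typescript
    // Hypothetical, simplified shapes; the real schemas live in @trigger.dev/core.
    type TaskRunRow = {
      friendlyId: string;
      isTest: boolean;
      replayedFromTaskRunFriendlyId: string | null;
    };
    type RunContext = { run: { id: string; isTest: boolean; isReplay: boolean } };
    
    // Assumed derivation: a run is a replay when it records the run it was replayed from.
    function toRunContext(row: TaskRunRow): RunContext {
      return {
        run: {
          id: row.friendlyId,
          isTest: row.isTest,
          isReplay: row.replayedFromTaskRunFriendlyId !== null,
        },
      };
    }
    
    // Example task-side use: skip a non-idempotent side effect on replays and test runs.
    function shouldSendEmail(ctx: RunContext): boolean {
      return !ctx.run.isReplay && !ctx.run.isTest;
    }
    ```
    
    Inside a task this would read as a guard on `ctx.run.isReplay` before
    any side effect that must not happen twice.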
    
    ## ✅ Checklist
    
    - [x] I have followed every step in the [contributing
    guide](https://github.com/triggerdotdev/trigger.dev/blob/main/CONTRIBUTING.md)
    - [x] The PR title follows the convention.
    - [x] I ran and tested the code works
    
    ---
    
    ## Testing
    
    - Verified `@trigger.dev/core` builds successfully
    - Verified `webapp` typechecks successfully
    - All new fields use `default(false)` for backwards compatibility
    
    ---
    
    ## Changelog
    
    - Added `isReplay` to `TaskRun` and `V3TaskRun` schemas in `common.ts`
    - Added `RUN_IS_REPLAY` semantic attribute and wired it in `taskContext`
    - Propagated `isReplay` through the dequeue system, run attempt system,
    and all execution context construction paths (V1 + V2)
    - Added `isReplay` to `DequeuedMessage` and
    `TaskRunExecutionLazyAttemptPayload` schemas
    - Added patch changeset for `@trigger.dev/core`
    - Updated docs: added `isReplay` to context reference, added "Detecting
    replays" section to replaying page
    
    ---
    
    💯
    
    Link to Devin session:
    https://app.devin.ai/sessions/1d6f1b3cc39a4623b72d05bf00f2d70c
    
    ---------
    
    Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
    Co-authored-by: nick <55853254+nicktrn@users.noreply.github.com>
    devin-ai-integration[bot] and nicktrn authored Apr 28, 2026
    4b28080
  4. fix(run-engine): debounce hot-key lock contention and 5xx feedback loop (#3453)
    
    ## Changes
    
    Three changes in
    `internal-packages/run-engine/src/engine/systems/debounceSystem.ts`, in
    order of impact:
    
    1. **Fast-path skip before the lock.** In `handleExistingRun`, do an
    unlocked read of `delayUntil` (and `createdAt` for the max-duration
    check) from the run row before entering `runLock.lock("handleDebounce",
    ...)`. If `newDelayUntil <= currentDelayUntil` and the run is still
    within its max-duration window, return the existing run immediately
    without taking the lock. Safe because debounce is monotonic-forward only
    — a stale read either matches reality or undershoots, both of which
    decay correctly (re-checked properly inside the lock by whichever caller
    is actually pushing forward). Trailing-mode triggers carrying
    `updateData` still take the lock so the data update is applied.
    
    2. **Quantize `newDelayUntil`.** Round the computed `newDelayUntil` to
    1-second buckets (configurable via `quantizeNewDelayUntilMs`, set to 0
    to disable). Without quantization, every call has a slightly larger
    `newDelayUntil` than the last and they all pass the fast-path check.
    With it, concurrent callers on the same key share a target time and ~95%
    short-circuit. User-visible effect: a debounced run might fire up to 1s
    earlier than the strict spec — non-issue for typical debounce use cases
    (chat summarization, batched notifications, etc.).
    
    3. **Graceful lock-contention fallback.** Wrap the `runLock.lock(...)`
    call so `LockAcquisitionTimeoutError` and Redlock `ExecutionError` /
    `ResourceLockedError` return the existing run id with success instead of
    propagating a 5xx. Debounce is best-effort: if we can't take the lock,
    the herd is already updating it for us; fall in line. This kills the 5xx
    → SDK-retry feedback loop. With (1)+(2) this rarely fires; without them
    it's the difference between 5xx and 200.
    
    Defaults preserve current behaviour aside from quantization (1s) and
    fast-path (on). Both are configurable via `RunEngineOptions.debounce`.
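    
    The three changes can be sketched roughly as follows. This is not
    the code in `debounceSystem.ts`: the function and type names are
    hypothetical, rounding down is an assumption (consistent with "up to
    1s earlier"), the max-duration check is a guess at its form, and
    matching errors by `name` stands in for whatever check the real code
    uses.
    
    ```typescript
    // Hypothetical inputs to the fast-path decision.
    type DebounceCheck = {
      newDelayUntil: Date;
      currentDelayUntil: Date; // from an unlocked read of the run row
      createdAt: Date;
      maxDurationMs: number;
      hasUpdateData: boolean; // trailing-mode trigger carrying updateData
    };
    
    // (2) Round down to the bucket boundary; a bucket of 0 disables quantization.
    function quantizeDelayUntil(delayUntil: Date, bucketMs: number): Date {
      if (bucketMs <= 0) return delayUntil;
      return new Date(Math.floor(delayUntil.getTime() / bucketMs) * bucketMs);
    }
    
    // (1) Skip the lock when this call would not push the delay forward.
    function canSkipLock(c: DebounceCheck, now: Date): boolean {
      if (c.hasUpdateData) return false; // the data update still needs the lock
      const withinMaxDuration = now.getTime() - c.createdAt.getTime() < c.maxDurationMs;
      return withinMaxDuration && c.newDelayUntil.getTime() <= c.currentDelayUntil.getTime();
    }
    
    // (3) Treat lock contention as success and return the existing run id.
    const CONTENTION_ERRORS = new Set([
      "LockAcquisitionTimeoutError",
      "ExecutionError",
      "ResourceLockedError",
    ]);
    
    async function withContentionFallback(
      existingRunId: string,
      locked: () => Promise<string>
    ): Promise<string> {
      try {
        return await locked();
      } catch (err) {
        if (err instanceof Error && CONTENTION_ERRORS.has(err.name)) return existingRunId;
        throw err; // anything else still propagates
      }
    }
    ```
    
    With 1-second buckets, two callers that compute targets 100 ms and
    900 ms into the same second quantize to the same instant, so the
    second caller satisfies `newDelayUntil <= currentDelayUntil` and
    returns without touching the lock.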
    
    ## ✅ Checklist
    
    - [x] I have followed every step in the [contributing
    guide](https://github.com/triggerdotdev/trigger.dev/blob/main/CONTRIBUTING.md)
    - [x] The PR title follows the convention.
    - [x] I ran and tested the code works
    
    
    ---
    
    ## Changelog
    
    Reduce 5xx feedback loops on hot debounce keys by quantizing
    `delayUntil`, adding an unlocked fast-path skip before the redlock, and
    gracefully handling redlock contention in `handleDebounce` so the SDK no
    longer retries into a herd.
    
    ---------
    
    Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
    ericallam and devin-ai-integration[bot] authored Apr 28, 2026
    e134da7
  5. feat: Sessions - bidirectional durable agent streams (#3417)

    > ⚠️ **Not released yet.** This PR is the server-side foundation only.
    The SDK changes that customers will actually use (`chat.agent`
    migration, `chat.createStartSessionAction`, `useTriggerChatTransport`
    updates) live on a separate branch and ship together in an upcoming
    `@trigger.dev/sdk` prerelease. Until that prerelease is published, this
    surface is reachable only via direct HTTP.
    
    ## What this gives Trigger.dev users
    
    A new first-class primitive, **Session**, for durable, task-bound,
    bidirectional I/O that outlives any single run. Sessions are the run
    manager for `chat.agent` going forward, and they unblock anything else
    that needs "one identifier, many runs over time" with a stable channel
    pair the client can write to and subscribe to.
    
    ### Use cases unblocked
    
    - **Chat agents that persist across many runs.** One session per chat
    (keyed on your own `chatId` via `externalId`), turns 1..N attach to the
    same Session, the UI subscribes once and keeps receiving output as new
    runs take over.
    - **Approval loops and long-running tasks with user feedback.** The task
    waits on `.in`, the client writes to `.in`, the server enforces
    no-writes-after-close.
    - **Workflow progress streams that live past the run.** Subscribe to
    `.out` after the task finishes to replay history.
    - **Resume-next-day flows.** A session is a durable row, not a transient
    stream. Send a message a day later and the server triggers a fresh run
    on the same session.
    
    ### How it works (Session-as-run-manager)
    
    A Session row is task-bound (`taskIdentifier` + `triggerConfig` are
    required) and owns its current run via `currentRunId` +
    `currentRunVersion` for optimistic claim. Three trigger paths:
    
    1. **Session create** — `POST /api/v1/sessions` creates the row and
    triggers the first run synchronously.
    2. **Append-time probe** — `POST
    /realtime/v1/sessions/:session/in/append` checks if the current run is
    alive; if it has terminated (idle exit, crash, etc.), the server
    triggers a new run before processing the append.
    3. **End-and-continue handoff** — `POST
    /api/v1/sessions/:session/end-and-continue`, called by the running
    agent, triggers a fresh run and atomically swaps `currentRunId`. Used by
    `chat.requestUpgrade()` for version handoffs.
    
    Every triggered run is recorded in the `SessionRun` audit table with a
    reason (`initial`, `continuation`, `upgrade`, `manual`).
    
    ## Public API surface
    
    ### Control plane
    
    - `POST /api/v1/sessions` — create. Idempotent on `(env, externalId)`.
    Triggers the first run, returns the session and a session-scoped public
    access token. Returns 409 if the upserted row is already closed.
    - `GET /api/v1/sessions/:session` — retrieve by friendlyId
    (`session_abc...`) or by your own externalId (server disambiguates by
    prefix).
    - `GET /api/v1/sessions` — list with filters (`type`, `tag`,
    `taskIdentifier`, `externalId`, derived `status` ACTIVE/CLOSED/EXPIRED,
    created-at range) and cursor pagination. Backed by ClickHouse.
    - `PATCH /api/v1/sessions/:session` — update tags / metadata /
    externalId.
    - `POST /api/v1/sessions/:session/close` — terminate. Idempotent,
    hard-blocks new server-brokered writes.
    - `POST /api/v1/sessions/:session/end-and-continue` — agent-only handoff
    to a fresh run.
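    
    The "disambiguates by prefix" behaviour on the retrieve route could
    look something like the sketch below. `resolveSessionParam` is a
    hypothetical name, and the `session_` prefix is inferred from the
    `session_abc...` example above rather than confirmed from the
    implementation.
    
    ```typescript
    type SessionLookup =
      | { by: "friendlyId"; friendlyId: string }
      | { by: "externalId"; externalId: string };
    
    // Assumption: friendly IDs always start with "session_"; anything else is
    // treated as a caller-supplied externalId.
    function resolveSessionParam(param: string): SessionLookup {
      return param.startsWith("session_")
        ? { by: "friendlyId", friendlyId: param }
        : { by: "externalId", externalId: param };
    }
    ```
    
    Under this assumption, an `externalId` that itself begins with
    `session_` would be read as a friendlyId, so callers would want to
    avoid that prefix in their own identifiers.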
    
    ### Realtime
    
    - `PUT /realtime/v1/sessions/:session/:io` — initialize a channel.
    Returns S2 credentials in headers so high-throughput clients can write
    direct to S2.
    - `GET /realtime/v1/sessions/:session/:io` — SSE subscribe. Supports
    Last-Event-ID resume and an opt-in `X-Peek-Settled: 1` header that
    fast-closes the stream when the upstream is already settled
    (`trigger:turn-complete`), eliminating long-poll wait on
    reconnect-on-reload paths.
    - `POST /realtime/v1/sessions/:session/:io/append` — server-side
    appends.
    - `POST /api/v1/runs/:runFriendlyId/session-streams/wait` — runs wait on
    a session stream as a waitpoint, with a race-check to avoid suspending
    if data already landed.
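    
    For a client hitting the SSE subscribe route over direct HTTP, the
    request might be assembled along these lines. `buildSubscribeRequest`
    is a hypothetical helper, and sending the token as a bearer
    `Authorization` header is an assumption; only the route,
    `Last-Event-ID`, and `X-Peek-Settled: 1` come from the list above.
    
    ```typescript
    type SubscribeOptions = { lastEventId?: string; peekSettled?: boolean };
    
    // Build the URL and headers for GET /realtime/v1/sessions/:session/:io.
    function buildSubscribeRequest(
      baseUrl: string,
      session: string,
      io: "in" | "out",
      token: string,
      opts: SubscribeOptions = {}
    ): { url: string; headers: Record<string, string> } {
      const url = `${baseUrl}/realtime/v1/sessions/${encodeURIComponent(session)}/${io}`;
      const headers: Record<string, string> = {
        Accept: "text/event-stream",
        Authorization: `Bearer ${token}`, // assumed auth scheme
      };
      // Resume from the last event seen before a disconnect.
      if (opts.lastEventId) headers["Last-Event-ID"] = opts.lastEventId;
      // Opt in to fast-close when the upstream is already settled.
      if (opts.peekSettled) headers["X-Peek-Settled"] = "1";
      return { url, headers };
    }
    ```
    
    A reconnect-on-reload path would pass both `lastEventId` and
    `peekSettled: true`, then hand the result to `fetch` or an SSE
    client.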
    
    ### Auth scopes
    
    `sessions` is a new resource type. `read:sessions:{id}`,
    `write:sessions:{id}`, `admin:sessions:{id}` flow through the existing
    JWT validator. Session-scoped public access tokens minted by the server
    replace browser-held trigger-task tokens for chat-style flows — the
    browser never sees a run identifier or a run-scoped token in steady
    state.
    
    ## What's coming after this PR
    
    - **SDK + chat.agent migration**: separate branch, separate PR, ships in
    the next `@trigger.dev/sdk` prerelease alongside this server deploy.
    Customers using the prerelease `chat.agent` will follow the [upgrade
    guide](https://github.com/triggerdotdev/trigger.dev/blob/docs/tri-7532-ai-sdk-chat-transport-and-chat-task-system/docs/ai-chat/upgrade-guide.mdx).
    - **Dashboard surfaces**: dedicated agent list, agent playground, agent
    view on the run dashboard. Tracking separately.
    
    ## Implementation notes
    
    - **Postgres `Session` table**: scalar scoping columns (`projectId`,
    `runtimeEnvironmentId`, `environmentType`, `organizationId`) without
    FKs, matching the January TaskRun FK-removal decision. Point-lookup
    indexes only — list queries go to ClickHouse. Terminal markers
    (`closedAt`, `expiresAt`) are write-once.
    - **ClickHouse `sessions_v1`**: ReplacingMergeTree, partitioned by
    month, ordered by `(org_id, project_id, environment_id, created_at,
    session_id)`. Tags indexed via `tokenbf_v1` skip index.
    - **`SessionsReplicationService`**: mirrors `RunsReplicationService`
    exactly — leader-locked logical replication consumer,
    `ConcurrentFlushScheduler`, retry with exponential backoff + jitter,
    identical metric shape. Dedicated slot + publication so the two consume
    independently.
    - **S2 keys**: `sessions/{addressingKey}/{out|in}`. The existing
    `runs/{runId}/{streamId}` key format for run-scoped streams is
    untouched.
    - **Optimistic claim**: `ensureRunForSession` triggers a run upfront
    (cheap to cancel if it loses the race), then attempts an `updateMany`
    keyed on `currentRunVersion`. Loser cancels its triggered run and reuses
    the winner's. No DB lock held across the trigger.
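    
    The optimistic claim can be illustrated with an in-memory stand-in
    for the `Session` row. Everything here is a sketch under stated
    assumptions: `claimRun` plays the role of the versioned `updateMany`,
    `ensureRunForSessionSketch` is a hypothetical simplification of
    `ensureRunForSession`, and `triggerRun` / `cancelRun` are injected
    placeholders rather than real engine calls.
    
    ```typescript
    // In-memory stand-in for the relevant Session columns.
    type SessionRow = { currentRunId: string | null; currentRunVersion: number };
    
    // Compare-and-swap: succeeds only if the version is unchanged since it was read.
    function claimRun(row: SessionRow, expectedVersion: number, runId: string): boolean {
      if (row.currentRunVersion !== expectedVersion) return false;
      row.currentRunId = runId;
      row.currentRunVersion = expectedVersion + 1;
      return true;
    }
    
    type ClaimDeps = {
      triggerRun: () => Promise<string>;
      cancelRun: (runId: string) => Promise<void>;
    };
    
    async function ensureRunForSessionSketch(row: SessionRow, deps: ClaimDeps): Promise<string> {
      const seenVersion = row.currentRunVersion; // read before triggering
      const runId = await deps.triggerRun(); // trigger upfront, no lock held
      if (claimRun(row, seenVersion, runId)) return runId; // won the race
      await deps.cancelRun(runId); // lost: cancel our run
      return row.currentRunId!; // and reuse the winner's
    }
    ```
    
    When two callers race on the same row, both trigger a run, exactly
    one claim succeeds, and the other caller cancels its run and returns
    the winner's id.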
    
    ### What did NOT change
    
    Run-scoped `streams.pipe` / `streams.input` and the existing
    `/realtime/v1/streams/{runId}/...` routes are unchanged. Sessions are
    net-new — not a reshaping of the current streams API.
    
    ## Deploy notes
    
    - Set `SESSION_REPLICATION_CLICKHOUSE_URL` and
    `SESSION_REPLICATION_ENABLED=1` to enable the replication consumer.
    - The `Session` table needs `REPLICA IDENTITY FULL` set on the prod
    source DB before the publication is created (same one-time DDL we did
    for `TaskRun`). Required for delete events to carry full column values.
    - The `GET /api/v1/sessions/:session` loader authorizes across URL
    forms: a JWT minted for either form (friendlyId or externalId)
    authorizes both. Action routes are URL-form-specific, matching how
    the SDK mints PATs.
    
    ## Verification
    
    - Webapp typecheck clean (10/10).
    - `apps/webapp/test/sessionsReplicationService.test.ts` — round-trip
    tests for insert/update/delete through Postgres logical replication into
    ClickHouse via testcontainers.
    - Live end-to-end against local dev: create + retrieve (both forms) +
    update + close, `.out.initialize` + `.out.append` x2 + `.in.send` +
    `.out.subscribe` over SSE, list with all filter combinations +
    pagination, `end-and-continue` swap, `X-Peek-Settled` fast-close
    (verified in browser via reconnect-on-reload and via curl). Replicated
    row lands in ClickHouse within ~1s.
    - Multi-round Devin + CodeRabbit review feedback addressed
    (read-after-write paths use `prisma` writer, info-leak on auth-routes
    masked as 403, peek-settled discriminator parsing fix, etc.).
    
    ## Test plan
    
    - [ ] `pnpm run typecheck --filter webapp`
    - [ ] `pnpm run test --filter webapp
    ./test/sessionsReplicationService.test.ts --run`
    - [ ] Start the webapp with `SESSION_REPLICATION_CLICKHOUSE_URL` and
    `SESSION_REPLICATION_ENABLED=1`. Confirm the slot and publication
    auto-create on boot.
    - [ ] `POST /api/v1/sessions` and verify the row replicates to
    `trigger_dev.sessions_v1` within a couple of seconds.
    - [ ] `POST /api/v1/sessions/:id/close`, then confirm `POST
    /realtime/v1/sessions/:id/out/append` returns 400.
    - [ ] Reuse a closed session's `externalId` on `POST /api/v1/sessions`
    and confirm 409.
    - [ ] `GET /realtime/v1/sessions/:id/out` with `X-Peek-Settled: 1` after
    a turn completes and confirm `X-Session-Settled: true` response header +
    immediate close.
    ericallam authored Apr 28, 2026
    c69e939