From fd1a739a507b92e4db1d7832b239d2f4f41d9395 Mon Sep 17 00:00:00 2001 From: Daniel Sutton Date: Tue, 26 May 2026 11:31:12 +0100 Subject: [PATCH 01/12] docs: security fix pipeline design spec Design for an autonomous, containerised pipeline that validates and proposes production-grade fixes for ~250 deepsec security findings tracked in Linear, with local-only review and no GitHub exposure until disclosure timing is decided. Implementation lives in a separate repository. Co-Authored-By: Claude Opus 4.7 (1M context) --- ...2026-05-26-security-fix-pipeline-design.md | 561 ++++++++++++++++++ 1 file changed, 561 insertions(+) create mode 100644 docs/superpowers/specs/2026-05-26-security-fix-pipeline-design.md diff --git a/docs/superpowers/specs/2026-05-26-security-fix-pipeline-design.md b/docs/superpowers/specs/2026-05-26-security-fix-pipeline-design.md new file mode 100644 index 0000000000..320e3c0720 --- /dev/null +++ b/docs/superpowers/specs/2026-05-26-security-fix-pipeline-design.md @@ -0,0 +1,561 @@ +# Security Fix Pipeline — Design Spec + +**Date:** 2026-05-26 +**Status:** Approved, ready for implementation planning +**Author:** Daniel Sutton (with Claude) + +## Problem + +We have ~250 security findings filed as Linear tickets. Each needs to be +validated, then — if real — fixed with production-grade care: minimal blast +radius, no breaking changes, cautious rollout, migration considerations. We +want a dedicated machine to chew through this batch autonomously and produce +reviewable artifacts, without constant approval cycles. + +Strict constraints: + +- **Nothing leaks to GitHub.** Findings are exploitable; patches and commit + messages must be treated as sensitive until disclosure timing is decided. +- **Production-grade fixes only.** Multi-step rollouts, backwards-compatible + changes, migration plans where relevant. +- **Human review at the end.** Patches never auto-apply; the upstreaming + process is deliberate and separate. 
+ +## Scope + +In scope: validation, fix design, patch generation, regression testing, +verification, artifact packaging, queue management, review surface. + +Out of scope: actually upstreaming fixes (separate human process post-review); +disclosure / CVE / advisory workflow. + +## Architecture Overview + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Dedicated machine │ +│ │ +│ Linear ◄────── worker.ts ──────► docker compose -p sec-X │ +│ (queue) (serial, N=1) │ │ +│ ▲ ▼ │ +│ │ ┌──────────────┐ │ +│ │ │ per-issue │ │ +│ │ │ compose stack│ │ +│ │ │ postgres │ │ +│ │ │ redis │ │ +│ │ │ clickhouse │ │ +│ │ │ minio │ │ +│ │ │ electric │ │ +│ │ │ webapp ◄────┼──┐ │ +│ │ │ agent ◄────┼──┤ shared +│ │ └──────────────┘ │ /repo +│ │ ▼ volume +│ │ artifacts │ +│ │ vol │ +│ │ │ │ +│ │ ┌──────────┐ ┌──────┐ │ +│ └────────────┤dashboard │◄───────────────────┤MinIO │ │ +│ │localhost │ └──────┘ │ +│ │ :4000 │ │ +│ └──────────┘ │ +└─────────────────────────────────────────────────────────────┘ +``` + +Three components: **worker daemon**, **per-issue compose stack**, **review +dashboard**. Linear is the queue. MinIO is the artifact store. No GitHub, no +external services beyond Anthropic API and Linear API. + +## Queue & State (Linear) + +Linear is the queue. The worker reads from the **deepsec-findings** view: +`https://linear.app/triggerdotdev/view/deepsec-findings-c443c3c869c0` +All 250 issues already exist in this view. State lives in labels applied on +top of the existing ticket state: + +| Label | Meaning | +|---|---| +| `agent:queued` | Ready for the worker to pick up | +| `agent:in-progress` | Claimed; container running | +| `agent:done` | Artifact bundle uploaded, ready for review | +| `agent:false-positive` | Agent determined the finding is not real | +| `agent:runaway` | Soft-limit hit (turns, wall timeout, heartbeat). Resumable. 
| +| `agent:resume` | Operator queued for resumption from a runaway snapshot | +| `agent:failed` | Hard failure (crash, OOM, non-resumable error); needs human eyes | +| `agent:reviewed-accept` | Human approved the fix bundle (set by dashboard) | +| `agent:reviewed-reject` | Human rejected the fix bundle (set by dashboard) | + +Audit trail is free — every label change appears in Linear's activity feed. + +## Worker Daemon + +Single TS process running under systemd on the dedicated machine. **Serial, +processes one issue at a time.** + +### Main loop + +```ts +while (running) { + // Source: deepsec-findings Linear view; pick the highest-priority issue + // labelled agent:queued (or not yet labelled by us — first-pass adoption). + const issue = await linear.nextFromDeepsecView({ label: "agent:queued" }); + if (!issue) { await sleep(30_000); continue; } + + await claim(issue); // write state(claimed) → setLabel(in-progress) + await run(issue); // compose up → wait → compose down + await persistArtifacts(); // collect → upload to MinIO → verify + await notify(issue); // post Linear comment (idempotent) + await finalize(issue); // setLabel(done|failed|false-positive) +} +``` + +### Local state file is source of truth + +`/var/lib/sec-fix-worker/state/.json`: + +```json +{ + "phase": "claimed" | "running" | "uploaded" | "finalized", + "startedAt": "...", + "project": "sec-LIN-1234", + "containerExitCode": null, + "outcome": "done" | "failed" | "false-positive" | null +} +``` + +Each transition: write state file → fsync → do side effect → write state +file. State file leads side effects so a crash always leaves a recoverable +record. + +### Reconcile on every startup + +Before entering the loop: + +1. `docker compose ls --filter name=sec-*` → for each orphaned project: tail + logs, `compose down -v`, mark corresponding state file `phase: "crashed"`, + set Linear label `agent:failed`, post Linear comment with the tail. +2. 
Scan `./state/*.json` for non-finalized entries → finish whatever step was + interrupted (re-upload, re-comment, re-label as needed; all idempotent). +3. Scan Linear for issues stuck on `agent:in-progress` with no local state + file (worker was wiped) → reset to `agent:queued`. + +### Idempotency + +- Uploads check MinIO for existing artifact before re-uploading +- Linear comments include marker ``; + duplicate-comment detection skips re-posting + +## Per-Issue Compose Stack + +One `stack.yml` template instantiated per issue with `--project-name +sec-`. Compose's project name provides the isolation boundary: +isolated network, isolated named volumes, complete teardown via `compose down +-v`. + +### Services + +```yaml +services: + postgres: # ephemeral pgdata per project + redis: # ephemeral per project + clickhouse: # ephemeral per project + minio: # ephemeral; per-issue uploads go to the host MinIO, not this one + electric: # ephemeral per project + + webapp: + volumes: + - repo:/repo # shared with agent + - pnpm-store:/pnpm-store # shared read-mostly across all projects + command: pnpm --filter webapp dev + depends_on: [postgres, redis, clickhouse, electric] + healthcheck: curl http://localhost:3030/healthcheck + + agent: + volumes: + - repo:/repo # SAME volume as webapp + - artifacts:/artifacts # output drop + command: node /run-agent.mjs + depends_on: + webapp: { condition: service_healthy } + +volumes: + pgdata: # per project + repo: # per project, populated from a baked tar snapshot + artifacts: # per project + pnpm-store: + external: true + name: pnpm-store-shared # ONE shared volume across all runs +``` + +### Repo volume population + +**Pinned base SHA: `37eeaa36908fb1aad48fc43d04e5b4e8f474f957`** — `origin/main` +of `trigger.dev-mirror` as of 2026-05-25, the commit immediately preceding +the most recent deepsec revalidate run (2026-05-25 18:14). Findings were +produced against this revision of the codebase, so reproduction and fixes +target it. 
+ +A `repo.tar` snapshot at this SHA is baked into the base image; an init +container extracts it into the `repo` volume at stack startup (~5–10s). If a +Linear issue specifies a different base SHA in its body, the worker swaps in +a `git-clone`-from-local-bare-mirror init container instead. + +### Network isolation + +The compose stack's default network has egress restricted by an iptables init +container (or a Docker network policy plugin) to: + +- `api.anthropic.com` +- `api.linear.app` +- Local MinIO host + +Inbound: none. Belt-and-braces with the agent's `disallowedTools` list. + +## Agent Container + +### Image + +The agent image (`trigger-mirror-agent:pinned`) contains: + +- Repo tar snapshot at a pinned SHA +- Warm pnpm store (`pnpm fetch`), supplied at runtime by the shared external + volume rather than baked into the image +- `@anthropic-ai/claude-agent-sdk` installed globally +- `/run-agent.mjs` (the bridge script) +- `/prompts/security-fix.md` (system prompt encoding the methodology) +- MCP server configs (Linear read+write only; no GitHub MCP) + +API keys are passed via Docker secret files at `/run/secrets/anthropic` and +`/run/secrets/linear`, never as env vars. 
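
Reading a key from a secret file can be sketched as follows. This is a minimal illustration, not the pipeline's actual helper: the name `readSecret` mirrors the call made by the bridge script, while the `dir` parameter and the name validation are assumptions added here so the function can be exercised outside a container.

```typescript
import { readFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical helper: reads a Docker secret that is mounted as a file.
// `dir` defaults to the standard Docker secrets mount point; it is a
// parameter only so the function can be pointed at a temp directory.
function readSecret(name: string, dir = "/run/secrets"): string {
  // Assumption: secret names are simple identifiers. Rejecting anything
  // else keeps a malformed name from escaping the secrets directory.
  if (!/^[A-Za-z0-9_.-]+$/.test(name) || name.startsWith(".")) {
    throw new Error(`invalid secret name: ${name}`);
  }
  // Secret files commonly end with a newline; strip surrounding whitespace
  // so the value can be handed straight to an API client.
  return readFileSync(join(dir, name), "utf8").trim();
}
```

Keeping the lookup in one function means the key never has to pass through `process.env`, which is the property the paragraph above asks for.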
+ +### `run-agent.mjs` — the worker↔agent bridge + +```js +import { query } from "@anthropic-ai/claude-agent-sdk"; +import { LinearClient } from "@linear/sdk"; +import fs from "node:fs/promises"; + +// Helpers readSecret, renderIssueContext, startHeartbeat and extractText are not shown. +const issueId = process.env.LINEAR_ISSUE_ID; +const linear = new LinearClient({ apiKey: await readSecret("linear") }); +const issue = await linear.issue(issueId); + +const systemPrompt = await fs.readFile("/prompts/security-fix.md", "utf8"); +const userPrompt = renderIssueContext(issue, await issue.comments()); + +await fs.mkdir("/artifacts", { recursive: true }); +startHeartbeat("/artifacts/.heartbeat"); // updated every 60s + +const result = query({ + prompt: userPrompt, + options: { + systemPrompt, + cwd: "/repo", + permissionMode: "bypassPermissions", + allowedTools: ["Read", "Edit", "Write", "Bash", "Grep", "Glob"], + disallowedTools: ["WebFetch", "WebSearch"], + mcpServers: { linear: { /* read+write */ } }, + maxTurns: 200, // per-run ceiling; hitting it is a resumable `runaway`, not a hard failure + }, +}); + +const transcript = await fs.open("/artifacts/transcript.jsonl", "w"); +let finalText = ""; +for await (const msg of result) { + await transcript.write(JSON.stringify(msg) + "\n"); + if (msg.type === "assistant") finalText = extractText(msg); +} +await transcript.close(); + +await fs.writeFile("/artifacts/final-summary.md", finalText); +await fs.writeFile("/artifacts/status.json", + JSON.stringify({ issueId, endedAt: new Date().toISOString() })); + +process.exit(0); +``` + +The process exits when the SDK's async iterator finishes, so the container exits and +`docker compose wait agent` on the host returns. **Unix process lifecycle is +the completion signal: no IPC, no polling, no marker files.** The `.heartbeat` +file is a liveness signal for the watchdog only, not a completion marker. + +### Agent contract (encoded in system prompt) + +1. **Validate**: reproduce the finding or declare false positive. Write + `/artifacts/validation.md`. +2. **If false positive**: final assistant message is + `FALSE_POSITIVE: `. Stop. +3. 
**If real**: produce the bundle in `/artifacts/`: + - `design.md` — blast radius, public-API/DB impact, backwards-compat + strategy, alternatives considered, minimal-impact justification + - `rollout.md` — sequencing across PRs if non-atomic, feature flags, + migration order, monitoring/rollback signals + - `patches/01-*.patch`, `02-*.patch`, ... — ordered, applied with `git am` + - `tests/` — new/updated regression tests, referenced inside the patches + - `verification.log` — captured output of `pnpm typecheck` (apps/internal + packages) or `pnpm build` (public packages) per CLAUDE.md, plus + `pnpm test` for the affected package + - `changeset.md` — draft `.changeset/` entry if any public package touched +4. **Final assistant message**: `SUBMITTED: `. Agent stops + by simply not calling another tool; SDK loop exits. + +### Bias toward minimal impact + +System prompt explicitly instructs: prefer additive changes over modifying +existing surfaces; prefer flagged rollouts over direct ships; prefer multiple +small ordered patches over one big atomic change when the fix touches public +contracts or schema. The agent has agency to decide, but defaults are +cautious. + +## Resumable Runs + +The agent can hit a soft limit — maxTurns (200), wall timeout (90 min), or +heartbeat watchdog (10 min stall) — with valuable in-progress state. These +outcomes are **resumable**, not failures. + +### Classification of run outcomes + +| Outcome | Cause | Resumable? 
| +|---|---|---| +| `done` | Final message `SUBMITTED` | — | +| `false-positive` | Final message `FALSE_POSITIVE` | — | +| `runaway` | maxTurns hit, wall timeout, heartbeat stall | Yes | +| `failed` | Container crash, OOM, exit code from non-timeout cause | No (state suspect) | + +### What survives teardown + +Two named volumes are snapshotted before `compose down -v`: + +- **`repo`** volume → `./snapshots//run-N/repo/` (source files only; + `node_modules` and pnpm store excluded — rehydrated from the base image on + resume; snapshot stays under ~100 MB per run) +- **`agent-session`** volume (mount of `~/.claude/projects/` inside the agent + container, where the SDK persists session JSONL) → + `./session-snapshots//run-N/` + +Plus the partial `/artifacts/` contents (whatever the agent had written so +far) are collected exactly as for completed runs. + +Snapshots upload to MinIO under +`s3://security-artifacts//runaway-/`. Local copies retained 14 +days (longer than artifact cache because resumes can happen later); MinIO is +the long-term store. + +### Resume action + +Triggered from the dashboard (Resume button on a runaway issue) or +`worker-cli resume `: + +1. Dashboard sets `agent:resume` label and posts a Linear comment + (`` marker for idempotency) +2. Worker picks the issue up like a normal queued issue but branches: + ```ts + if (issue.labels.includes("agent:resume")) { + await restoreRepoVolume(issueId, project); // hydrate from snapshot + await restoreSessionVolume(issueId, project); // hydrate SDK session + await stack.up({ resumeRunNumber: priorRunCount + 1 }); + } else { + await stack.up({ fresh: true }); + } + ``` +3. Inside the container, `run-agent.mjs` checks for an existing session ID in + the mounted session volume. If present, calls + `query({ ..., resume: sessionId })` to continue the prior conversation + rather than starting fresh. The user prompt is prefixed with: "You are + resuming after hitting ``. 
Review `/artifacts/` for what you've + already produced and continue from there." + +### Resume budget + +- **Max 3 resumes per issue** (configurable). After the 3rd consecutive + runaway outcome, the issue auto-promotes to `agent:failed` with a comment + explaining the cap was hit. Prevents infinite resume loops on truly + unfixable issues. +- Each resume gets a **fresh 90-min wall budget** and a **fresh 200-turn + budget**. The whole point of resume is to extend the available compute, + so per-run budgets reset; only the attempt count is capped. +- Attempt count tracked in the local state file under + `runaways: [{ runNumber, reason, endedAt }, ...]` and reflected in the + dashboard. + +### State file additions + +```json +{ + "phase": "...", + "currentRun": 2, + "runaways": [ + { "runNumber": 1, "reason": "maxTurns", "endedAt": "2026-05-26T11:30:00Z" } + ], + "sessionId": "claude-session-abc123" +} +``` + +### Dashboard surface + +The Review tab shows runaway issues with: + +- Reason for runaway (turns / wall / heartbeat) +- Attempt count (e.g. "runaway 2 of 3") +- Partial artifacts produced so far (whatever the agent had written) +- **Resume** button (disabled at the cap) +- **Mark failed** button (operator can manually give up) +- **Mark false-positive** button (if the partial work is enough to make the + call without resuming) + +The Queue tab shows runaway issues queued for resume distinctly from +fresh-queued issues. + +## Artifact Storage + +- Worker collects `/artifacts/` from the per-project volume to + `./out//` on the host +- Uploads to local MinIO at `s3://security-artifacts//` +- Verifies ETags +- Local `./out//` cached for 7 days; MinIO is source of truth + +## Review Dashboard + +Local Remix app on `localhost:4000`, run on the dedicated machine. 
+ +### Queue tab (live operational view) + +- Counts: queued / in-progress / done / failed / false-positive +- Currently-running container (single, since serial): live log tail +- Recent failures with summaries + +### Review tab (per-issue review) + +For each `agent:done` issue: + +- Validation evidence (rendered Markdown) +- Design doc with blast-radius, alternatives, minimal-impact justification +- Rollout plan +- Ordered patches rendered with a proper diff component (Monaco / react-diff-view) +- Tests rendered alongside their patch +- Verification log +- Changeset draft +- Actions: **Approve** / **Reject** / **Needs changes** — writes + `agent:reviewed-accept` or `agent:reviewed-reject` back to Linear, plus a + review-notes comment + +State syncs to Linear on every action; dashboard is stateless beyond +short-lived UI state. + +## Durability (Unsupervised Operation) + +### Process supervision + +- systemd unit with `Restart=always`, `RestartSec=10`, `WatchdogSec=120`; + worker pings `sd_notify(WATCHDOG=1)` every 30s +- `flock /var/lock/sec-fix-worker.lock` at startup; double-instance prevented +- Kill switch: worker checks `/var/lib/sec-fix-worker/STOP` at top of loop; + drains current issue and exits cleanly if present + +### Timeouts + +- `compose up`: 5 min +- `compose wait agent`: 90 min +- `compose down`: 2 min +- Linear API call: 30s with exponential backoff, max 5 attempts +- MinIO upload: 5 min per file, retry 3x +- **Agent heartbeat watchdog**: agent writes `/artifacts/.heartbeat` every + 60s; if unchanged for 10 min, worker `compose kill agent` → mark + `runaway` (resumable; see "Resumable Runs"). Catches infinite tool-call + loops that don't trip `maxTurns`. + +The 90-min wall timeout and `maxTurns: 200` also produce `runaway` outcomes +rather than hard `failed`. Only container crashes, OOM kills, and non-timeout +non-zero exit codes produce `failed`. 
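
The outcome rules in this section can be sketched as a single pure function. This is an illustration under stated assumptions, not the worker's actual code: `RunResult`, `StopReason`, the `hitMaxTurns` flag, and `classifyRun` are hypothetical names, and treating a clean exit without a `SUBMITTED` or `FALSE_POSITIVE` marker as `failed` is a guess the spec does not settle.

```typescript
type Outcome = "done" | "false-positive" | "runaway" | "failed";

// Hypothetical: how the worker learned that the run ended.
type StopReason = "exited" | "wallTimeout" | "heartbeatStall";

interface RunResult {
  stopReason: StopReason;   // "exited" means the container stopped on its own
  exitCode: number | null;  // null if the exit code could not be read
  hitMaxTurns: boolean;     // assumed to be reported by the bridge script
  finalMessage: string;     // last assistant message, e.g. "SUBMITTED: ..."
}

function classifyRun(r: RunResult): Outcome {
  // Soft limits come first: a container killed by the worker for a timeout
  // or stall exits non-zero, but that must not be read as a crash.
  if (r.stopReason !== "exited" || r.hitMaxTurns) return "runaway";
  // Any other non-zero (or unknown) exit is a hard failure.
  if (r.exitCode !== 0) return "failed";
  if (r.finalMessage.startsWith("FALSE_POSITIVE")) return "false-positive";
  if (r.finalMessage.startsWith("SUBMITTED")) return "done";
  // Assumption: a clean exit without a recognised marker needs human eyes.
  return "failed";
}
```

Checking the soft limits before the exit code is the design choice that matters here, since it keeps worker-initiated kills in the resumable bucket.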
+ +### Circuit breakers + +- **Consecutive failures**: 5 in a row → write `/var/lib/sec-fix-worker/PAUSED`, + alert, stop picking up new work. Stays alive for status reporting. +- **Failure rate**: >40% over last 20 issues → same. +- **Disk**: before each issue, check free space on `/var/lib/docker` and + `./out/`; if under 20 GB, pause and alert. + +### Resource hygiene + +- Per-issue logs `./logs/.log` capped at 100 MB via streaming truncation +- `docker image prune -f` after every 10 issues +- `docker volume prune -f` at reconcile time +- `./out//` deleted after 7 days + +### Secret hygiene + +- API keys via Docker secrets (`/run/secrets/*`), not env vars +- Transcript scrubber runs over `transcript.jsonl` pre-upload: regex-strips + Bearer tokens, known key prefixes, common secret patterns +- System prompt forbids writing secrets to artifact files + +### Observability (dashboard-only, no external alerting) + +The review dashboard is the operator's single surface. No Slack, no email, no +webhooks — the operator checks the dashboard on their own cadence. + +The dashboard's queue tab shows: + +- Live queue counts (queued / in-progress / done / failed / false-positive) +- Currently-running issue with live log tail +- Last N completed and last N failed, with summaries +- Circuit-breaker state (running / paused-by-consecutive-failures / + paused-by-failure-rate / paused-by-disk) +- Free disk on `/var/lib/docker` and `./out/` +- Projected completion time based on rolling average + +Worker also writes a heartbeat to `/var/lib/sec-fix-worker/heartbeat.json` +every 60s; dashboard surfaces "last heartbeat" as a freshness indicator. If +the worker dies silently, the dashboard makes it obvious within a minute. 
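
The freshness check the dashboard would run against the worker heartbeat can be sketched as below. The spec does not define the contents of `heartbeat.json`, so the `WorkerHeartbeat` shape, the function name `isHeartbeatStale`, and the two-missed-beats threshold are all assumptions made for illustration.

```typescript
// Assumed shape of heartbeat.json; the spec does not specify it.
interface WorkerHeartbeat {
  at: string; // ISO-8601 timestamp of the last beat
}

// Returns true when the heartbeat is older than `missedBeats` intervals,
// or when the timestamp cannot be parsed at all.
function isHeartbeatStale(
  hb: WorkerHeartbeat,
  now: Date,
  intervalMs = 60_000, // the worker writes every 60s
  missedBeats = 2,     // hypothetical tolerance before flagging
): boolean {
  const last = Date.parse(hb.at);
  if (Number.isNaN(last)) return true; // unreadable heartbeat counts as stale
  return now.getTime() - last > intervalMs * missedBeats;
}
```

With a 60s write interval, a threshold of one interval would flag a healthy worker whenever a write lands slightly late, so some slack is usually needed; the cost is that a dead worker is flagged after roughly two intervals rather than one.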
+ +### Stop conditions + +The worker stops picking up new work in three cases. In the first two it +exits cleanly (systemd does not restart past this point; one-shot disable); +in the third it stays alive, paused, for status reporting: + +- Queue is empty AND no `agent:in-progress` issues remain +- `STOP` file touched +- Circuit breaker tripped (surfaced on the dashboard; no new work is picked up) + +## Upstreaming (Out of Pipeline) + +Explicitly out of scope for this pipeline. After review, a separate human +process: + +1. Decides disclosure timing for each accepted fix +2. Sanitizes commit messages if needed +3. Applies patches to a real branch with `git am` +4. Creates real PRs against `main` (now safe — fix is known good, no + reference to the vulnerability in commit messages until disclosure) +5. Coordinates with security advisories / CVE assignment as appropriate + +The pipeline's job ends at `agent:reviewed-accept`. + +## Wall-Clock Budget + +~30 min avg per issue × 250 issues = ~125 hours = ~5.2 days continuous serial +processing. Kick off Friday, review the following week. If too slow later, +lifting to N=2 concurrency is a one-line change to the worker semaphore. + +## What We Build + +1. **Base Docker image** (`trigger-mirror-base`) — repo tar, pnpm fetch +2. **Agent Docker image** (`trigger-mirror-agent`) — base + agent SDK + + `run-agent.mjs` + prompts + MCP configs +3. **`stack.yml`** — the per-issue compose template +4. **`worker.ts`** — the daemon (~150 lines incl. reconcile, durability, + circuit breakers, alerting) +5. **`run-agent.mjs`** — the in-container bridge script (~80 lines) +6. **`/prompts/security-fix.md`** — the system prompt encoding the + validation/fix/rollout methodology and minimal-impact bias +7. **Review dashboard** — local Remix app, queue + review tabs, diff renderer, + Linear writeback +8. 
**systemd unit** + **iptables egress policy** + **MinIO bucket setup** + + **status Linear issue setup** + +## Non-Goals + +- Auto-applying patches to `main` +- Public PR creation +- CVE / advisory automation +- Multi-machine orchestration (single dedicated machine) +- Parallel issue processing (serial, N=1, by design) +- Re-running an issue automatically after failure (retries are human-driven + via re-labeling to `agent:queued`) From eaae99ec5a2c623fc217c791f01acf22cdd88b7a Mon Sep 17 00:00:00 2001 From: Daniel Sutton Date: Tue, 26 May 2026 11:36:40 +0100 Subject: [PATCH 02/12] docs: phase 1 implementation plan for sec-fix-pipeline skeleton MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit End-to-end plumbing plan — new repo at ~/Development/sec-fix-pipeline/, worker daemon, stub agent container, MinIO artifact storage, Linear label state machine. Stub agent only; Phases 2-6 (real agent, full stack, durability, resumable runs, dashboard) tracked separately. Co-Authored-By: Claude Opus 4.7 (1M context) --- ...05-26-sec-fix-pipeline-phase-1-skeleton.md | 1832 +++++++++++++++++ 1 file changed, 1832 insertions(+) create mode 100644 docs/superpowers/plans/2026-05-26-sec-fix-pipeline-phase-1-skeleton.md diff --git a/docs/superpowers/plans/2026-05-26-sec-fix-pipeline-phase-1-skeleton.md b/docs/superpowers/plans/2026-05-26-sec-fix-pipeline-phase-1-skeleton.md new file mode 100644 index 0000000000..7090ce5869 --- /dev/null +++ b/docs/superpowers/plans/2026-05-26-sec-fix-pipeline-phase-1-skeleton.md @@ -0,0 +1,1832 @@ +# Sec Fix Pipeline — Phase 1: Skeleton Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. 
+ +**Goal:** End-to-end orchestration skeleton — worker daemon picks one Linear issue from the deepsec view, runs a stub agent in a docker-compose stack, collects an artifact, uploads it to MinIO, posts a Linear comment, flips the label to `agent:done`, tears down. Validates pipeline plumbing before adding any real agent logic. + +**Architecture:** New standalone repo at `~/Development/sec-fix-pipeline/`. Node + TypeScript worker daemon (no systemd yet — just a CLI you run manually), local Docker for the per-issue stack, local MinIO container for artifacts, Linear SDK for queue. Single-instance, serial, runs one issue then exits (loop comes in a later phase). + +**Tech Stack:** TypeScript, pnpm, Vitest, `@linear/sdk`, `@aws-sdk/client-s3`, `execa` (for shelling to docker compose), `zod` (state file schema), `testcontainers` (MinIO + Linear-mock in tests). + +**Scope:** Phase 1 of 6. Defers the full per-issue stack (Phase 3), real Claude Agent SDK (Phase 2), durability hardening (Phase 4), resumable runs (Phase 5), and the review dashboard (Phase 6). Ships when a real Linear test issue can be processed end-to-end by a manually-run worker against a stub agent. + +**Spec reference:** `docs/superpowers/specs/2026-05-26-security-fix-pipeline-design.md` in the `trigger.dev-mirror-2` repo. 
+ +**Pinned base SHA (for later phases, not Phase 1):** `37eeaa36908fb1aad48fc43d04e5b4e8f474f957` + +--- + +## File Structure + +``` +~/Development/sec-fix-pipeline/ +├── package.json +├── pnpm-workspace.yaml # for future packages +├── tsconfig.json +├── tsconfig.base.json +├── vitest.config.ts +├── .gitignore +├── .nvmrc +├── .env.example +├── README.md +├── docker/ +│ ├── agent-stub/ +│ │ ├── Dockerfile +│ │ └── run-agent.mjs # writes /artifacts/hello.txt, exits 0 +│ └── stack.yml # MinIO host service + per-issue agent service +├── src/ +│ ├── config.ts # env loading, zod-validated +│ ├── logger.ts # pino, json output +│ ├── state.ts # state file read/write/transition +│ ├── linear/ +│ │ ├── client.ts # wraps @linear/sdk + the deepsec view +│ │ ├── client.test.ts +│ │ └── labels.ts # label constants + state machine helpers +│ ├── storage/ +│ │ ├── minio.ts # S3 client for local MinIO +│ │ └── minio.test.ts +│ ├── stack/ +│ │ ├── compose.ts # `docker compose` wrapper with timeouts +│ │ └── compose.test.ts +│ ├── worker/ +│ │ ├── process-issue.ts # the per-issue flow +│ │ ├── process-issue.test.ts +│ │ └── main.ts # CLI entry — pick one issue, process, exit +│ └── types.ts +├── test/ +│ └── integration/ +│ └── end-to-end.test.ts # full flow against a mock Linear + real MinIO + stub agent +└── scripts/ + ├── setup-minio.sh # idempotent bucket creation + └── seed-test-issue.ts # creates a Linear test issue labelled agent:queued +``` + +**Responsibilities:** + +- `config.ts` — single place for env vars; fails fast if anything missing. +- `state.ts` — atomic writes to `./state/.json`. Source of truth for phase transitions. +- `linear/` — every Linear interaction. Mock-friendly. View ID hardcoded as a constant. +- `storage/` — MinIO/S3 only. No Linear coupling. +- `stack/` — docker compose only. No state, no Linear. +- `worker/process-issue.ts` — orchestrates: claim → run → collect → upload → notify → finalize. The integration point. 
+- `worker/main.ts` — CLI front-end. Loops come in Phase 4. + +--- + +## Task 1: Bootstrap the new repo + +**Files:** +- Create: `~/Development/sec-fix-pipeline/package.json` +- Create: `~/Development/sec-fix-pipeline/tsconfig.json` +- Create: `~/Development/sec-fix-pipeline/.gitignore` +- Create: `~/Development/sec-fix-pipeline/.nvmrc` +- Create: `~/Development/sec-fix-pipeline/.env.example` +- Create: `~/Development/sec-fix-pipeline/README.md` + +- [ ] **Step 1: Create the repo and initialize git** + +```bash +mkdir -p ~/Development/sec-fix-pipeline +cd ~/Development/sec-fix-pipeline +git init +git checkout -b main +``` + +- [ ] **Step 2: Write `.nvmrc`** + +``` +22.13.0 +``` + +- [ ] **Step 3: Write `package.json`** + +```json +{ + "name": "sec-fix-pipeline", + "private": true, + "version": "0.1.0", + "type": "module", + "packageManager": "pnpm@10.33.2", + "engines": { "node": ">=22.13.0" }, + "scripts": { + "build": "tsc -p tsconfig.json", + "typecheck": "tsc -p tsconfig.json --noEmit", + "test": "vitest run", + "test:watch": "vitest", + "worker": "tsx src/worker/main.ts", + "setup:minio": "bash scripts/setup-minio.sh", + "seed:issue": "tsx scripts/seed-test-issue.ts" + }, + "dependencies": { + "@aws-sdk/client-s3": "3.654.0", + "@aws-sdk/lib-storage": "3.654.0", + "@linear/sdk": "32.0.0", + "execa": "9.5.1", + "pino": "9.5.0", + "tsx": "4.19.2", + "zod": "3.25.76" + }, + "devDependencies": { + "@types/node": "22.10.2", + "testcontainers": "10.16.0", + "typescript": "5.7.2", + "vitest": "2.1.8" + } +} +``` + +- [ ] **Step 4: Write `tsconfig.json`** + +```json +{ + "compilerOptions": { + "target": "ES2022", + "module": "NodeNext", + "moduleResolution": "NodeNext", + "lib": ["ES2022"], + "strict": true, + "noUncheckedIndexedAccess": true, + "esModuleInterop": true, + "skipLibCheck": true, + "resolveJsonModule": true, + "outDir": "dist", + "rootDir": "src", + "declaration": false, + "sourceMap": true, + "forceConsistentCasingInFileNames": true + }, + "include": 
["src/**/*.ts"], + "exclude": ["node_modules", "dist", "**/*.test.ts"] +} +``` + +- [ ] **Step 5: Write `.gitignore`** + +``` +node_modules +dist +state/ +out/ +snapshots/ +logs/ +.env +.env.local +*.log +.DS_Store +``` + +- [ ] **Step 6: Write `.env.example`** + +``` +LINEAR_API_KEY= +LINEAR_DEEPSEC_VIEW_ID=c443c3c869c0 +MINIO_ENDPOINT=http://localhost:9000 +MINIO_ACCESS_KEY=minioadmin +MINIO_SECRET_KEY=minioadmin +MINIO_BUCKET=security-artifacts +WORKER_STATE_DIR=./state +WORKER_OUT_DIR=./out +WORKER_LOGS_DIR=./logs +``` + +- [ ] **Step 7: Write `README.md`** + +```markdown +# sec-fix-pipeline + +Autonomous pipeline for validating and proposing fixes for security findings tracked in the deepsec-findings Linear view. Reads issues, runs an isolated container per issue with Claude Code, produces patch bundles for human review. + +See design spec: `trigger.dev-mirror-2/docs/superpowers/specs/2026-05-26-security-fix-pipeline-design.md`. + +## Phase 1 status + +End-to-end skeleton. Stub agent only — writes `hello.txt`, no real fixing yet. + +## Setup + +1. `pnpm i` +2. `cp .env.example .env` and fill in `LINEAR_API_KEY` +3. `docker compose -f docker/stack.yml --profile services up -d minio` +4. `pnpm setup:minio` +5. `pnpm seed:issue` to create a Linear test issue +6. `pnpm worker` to process it + +## Layout + +See `docs/architecture.md` (to be written in a later phase). +``` + +- [ ] **Step 8: Install dependencies** + +```bash +cd ~/Development/sec-fix-pipeline +pnpm i +``` + +Expected: clean install, no errors. + +- [ ] **Step 9: Verify TypeScript compiles (empty src)** + +```bash +mkdir -p src +echo "export {};" > src/index.ts +pnpm typecheck +``` + +Expected: no output (success). 
+ +- [ ] **Step 10: Commit** + +```bash +git add -A +git commit -m "chore: bootstrap repo with TS + pnpm + vitest" +``` + +--- + +## Task 2: Config module + +**Files:** +- Create: `~/Development/sec-fix-pipeline/src/config.ts` +- Create: `~/Development/sec-fix-pipeline/src/config.test.ts` + +- [ ] **Step 1: Write the failing test** + +`src/config.test.ts`: + +```ts +import { describe, it, expect } from "vitest"; +import { loadConfig } from "./config.js"; + +describe("loadConfig", () => { + it("returns a parsed config when all required vars are set", () => { + const cfg = loadConfig({ + LINEAR_API_KEY: "lin_api_test", + LINEAR_DEEPSEC_VIEW_ID: "c443c3c869c0", + MINIO_ENDPOINT: "http://localhost:9000", + MINIO_ACCESS_KEY: "x", + MINIO_SECRET_KEY: "y", + MINIO_BUCKET: "security-artifacts", + WORKER_STATE_DIR: "./state", + WORKER_OUT_DIR: "./out", + WORKER_LOGS_DIR: "./logs", + }); + expect(cfg.linear.apiKey).toBe("lin_api_test"); + expect(cfg.minio.bucket).toBe("security-artifacts"); + }); + + it("throws when LINEAR_API_KEY is missing", () => { + expect(() => loadConfig({})).toThrow(/LINEAR_API_KEY/); + }); +}); +``` + +- [ ] **Step 2: Run test, verify it fails** + +```bash +pnpm test src/config.test.ts +``` + +Expected: FAIL — module `./config.js` not found. 
+ +- [ ] **Step 3: Write `config.ts`** + +```ts +import { z } from "zod"; + +const Schema = z.object({ + LINEAR_API_KEY: z.string().min(1), + LINEAR_DEEPSEC_VIEW_ID: z.string().min(1), + MINIO_ENDPOINT: z.string().url(), + MINIO_ACCESS_KEY: z.string().min(1), + MINIO_SECRET_KEY: z.string().min(1), + MINIO_BUCKET: z.string().min(1), + WORKER_STATE_DIR: z.string().min(1), + WORKER_OUT_DIR: z.string().min(1), + WORKER_LOGS_DIR: z.string().min(1), +}); + +export type Config = { + linear: { apiKey: string; viewId: string }; + minio: { + endpoint: string; + accessKey: string; + secretKey: string; + bucket: string; + }; + dirs: { state: string; out: string; logs: string }; +}; + +export function loadConfig(env: NodeJS.ProcessEnv | Record<string, string | undefined> = process.env): Config { + const parsed = Schema.parse(env); + return { + linear: { apiKey: parsed.LINEAR_API_KEY, viewId: parsed.LINEAR_DEEPSEC_VIEW_ID }, + minio: { + endpoint: parsed.MINIO_ENDPOINT, + accessKey: parsed.MINIO_ACCESS_KEY, + secretKey: parsed.MINIO_SECRET_KEY, + bucket: parsed.MINIO_BUCKET, + }, + dirs: { + state: parsed.WORKER_STATE_DIR, + out: parsed.WORKER_OUT_DIR, + logs: parsed.WORKER_LOGS_DIR, + }, + }; +} +``` + +- [ ] **Step 4: Run test, verify it passes** + +```bash +pnpm test src/config.test.ts +``` + +Expected: 2 tests pass. + +- [ ] **Step 5: Commit** + +```bash +git add src/config.ts src/config.test.ts +git commit -m "feat: config module with zod-validated env loading" +``` + +--- + +## Task 3: Logger module + +**Files:** +- Create: `~/Development/sec-fix-pipeline/src/logger.ts` + +- [ ] **Step 1: Write `logger.ts`** + +```ts +import pino from "pino"; + +export const logger = pino({ + level: process.env.LOG_LEVEL ?? 
"info", + base: { service: "sec-fix-worker" }, + timestamp: pino.stdTimeFunctions.isoTime, +}); + +export type Logger = typeof logger; +``` + +- [ ] **Step 2: Smoke test** + +```bash +node --experimental-strip-types -e "import('./src/logger.ts').then(m => m.logger.info({ hello: 'world' }, 'test'))" +``` + +Expected: a single JSON line on stdout containing `"hello":"world"`. + +- [ ] **Step 3: Commit** + +```bash +git add src/logger.ts +git commit -m "feat: pino logger module" +``` + +--- + +## Task 4: State file module + +State files live at `${WORKER_STATE_DIR}/.json`. Phase 1 only uses two phases (`claimed`, `finalized`) — the full state machine comes in Phase 4. We still write the file atomically (write to tmp, fsync, rename) because that pattern is load-bearing for Phase 4. + +**Files:** +- Create: `~/Development/sec-fix-pipeline/src/state.ts` +- Create: `~/Development/sec-fix-pipeline/src/state.test.ts` + +- [ ] **Step 1: Write the failing test** + +`src/state.test.ts`: + +```ts +import { describe, it, expect, beforeEach } from "vitest"; +import { mkdtemp, rm, readFile } from "node:fs/promises"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; +import { readState, writeState, IssueState } from "./state.js"; + +describe("state file", () => { + let dir: string; + beforeEach(async () => { + dir = await mkdtemp(join(tmpdir(), "sec-state-")); + return async () => rm(dir, { recursive: true, force: true }); + }); + + it("returns null for a missing issue", async () => { + expect(await readState(dir, "LIN-999")).toBeNull(); + }); + + it("round-trips a state object", async () => { + const state: IssueState = { + issueIdentifier: "LIN-1", + phase: "claimed", + project: "sec-lin-1", + startedAt: "2026-05-26T10:00:00Z", + outcome: null, + }; + await writeState(dir, state); + const read = await readState(dir, "LIN-1"); + expect(read).toEqual(state); + }); + + it("writes atomically via a tmp+rename", async () => { + const state: IssueState = { + 
issueIdentifier: "LIN-2", + phase: "claimed", + project: "sec-lin-2", + startedAt: "2026-05-26T10:00:00Z", + outcome: null, + }; + await writeState(dir, state); + const raw = await readFile(join(dir, "LIN-2.json"), "utf8"); + expect(JSON.parse(raw)).toEqual(state); + }); +}); +``` + +- [ ] **Step 2: Run test, verify it fails** + +```bash +pnpm test src/state.test.ts +``` + +Expected: FAIL — module not found. + +- [ ] **Step 3: Write `state.ts`** + +```ts +import { mkdir, readFile, writeFile, rename } from "node:fs/promises"; +import { join } from "node:path"; +import { z } from "zod"; + +const PhaseSchema = z.enum(["claimed", "running", "uploaded", "finalized"]); +const OutcomeSchema = z.enum(["done", "failed", "false-positive", "runaway"]).nullable(); + +export const IssueStateSchema = z.object({ + issueIdentifier: z.string().min(1), + phase: PhaseSchema, + project: z.string().min(1), + startedAt: z.string().min(1), + outcome: OutcomeSchema, +}); +export type IssueState = z.infer<typeof IssueStateSchema>; + +function pathFor(dir: string, issueIdentifier: string): string { + return join(dir, `${issueIdentifier}.json`); +} + +export async function readState(dir: string, issueIdentifier: string): Promise<IssueState | null> { + try { + const raw = await readFile(pathFor(dir, issueIdentifier), "utf8"); + return IssueStateSchema.parse(JSON.parse(raw)); + } catch (err: any) { + if (err?.code === "ENOENT") return null; + throw err; + } +} + +export async function writeState(dir: string, state: IssueState): Promise<void> { + await mkdir(dir, { recursive: true }); + const final = pathFor(dir, state.issueIdentifier); + const tmp = `${final}.tmp-${process.pid}-${Date.now()}`; + await writeFile(tmp, JSON.stringify(state, null, 2), { flag: "w" }); + await rename(tmp, final); +} +``` + +- [ ] **Step 4: Run test, verify it passes** + +```bash +pnpm test src/state.test.ts +``` + +Expected: 3 tests pass.
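
The Task 4 introduction describes the write as tmp, fsync, rename, while the `writeState` listing goes straight from `writeFile` to `rename`. If the fsync step matters for Phase 4, one possible variant is sketched below. It assumes Node's `FileHandle.sync()` from `node:fs/promises`; `writeJsonAtomic` is a hypothetical name used only for illustration, and the sketch does not fsync the parent directory.

```ts
import { mkdir, open, rename } from "node:fs/promises";
import { join } from "node:path";

// Hypothetical variant of writeState that adds the fsync step before the rename.
export async function writeJsonAtomic(dir: string, name: string, value: unknown): Promise<void> {
  await mkdir(dir, { recursive: true });
  const final = join(dir, `${name}.json`);
  const tmp = `${final}.tmp-${process.pid}-${Date.now()}`;

  const handle = await open(tmp, "w");
  try {
    await handle.writeFile(JSON.stringify(value, null, 2));
    // Flush the file contents to disk so the rename never exposes a partial file.
    await handle.sync();
  } finally {
    await handle.close();
  }

  // rename() replaces the destination atomically when tmp and final share a filesystem.
  await rename(tmp, final);
}
```

Because the temporary file sits next to the final path, the rename stays on one filesystem, which is what the atomicity argument relies on.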
+ +- [ ] **Step 5: Commit** + +```bash +git add src/state.ts src/state.test.ts +git commit -m "feat: state file module with atomic writes" +``` + +--- + +## Task 5: Linear labels module + +**Files:** +- Create: `~/Development/sec-fix-pipeline/src/linear/labels.ts` + +- [ ] **Step 1: Write `labels.ts`** + +```ts +export const LABELS = { + queued: "agent:queued", + inProgress: "agent:in-progress", + done: "agent:done", + falsePositive: "agent:false-positive", + runaway: "agent:runaway", + resume: "agent:resume", + failed: "agent:failed", + reviewedAccept: "agent:reviewed-accept", + reviewedReject: "agent:reviewed-reject", +} as const; + +export type AgentLabel = typeof LABELS[keyof typeof LABELS]; + +export const TERMINAL_LABELS: readonly AgentLabel[] = [ + LABELS.done, + LABELS.falsePositive, + LABELS.failed, + LABELS.runaway, +] as const; + +export const AGENT_OWNED_LABELS: readonly AgentLabel[] = Object.values(LABELS) as readonly AgentLabel[]; +``` + +- [ ] **Step 2: Commit** + +```bash +git add src/linear/labels.ts +git commit -m "feat: agent label constants" +``` + +--- + +## Task 6: Linear client wrapper + +**Files:** +- Create: `~/Development/sec-fix-pipeline/src/linear/client.ts` +- Create: `~/Development/sec-fix-pipeline/src/linear/client.test.ts` + +For Phase 1 we only need: `findNextQueuedIssue()`, `swapLabel(issueId, fromLabel, toLabel)`, `addComment(issueId, body)`. The `nextFromDeepsecView` filter is implemented as: fetch the view's issues filtered by label `agent:queued`. We'll stub the SDK in tests with a hand-rolled fake; we are not running against real Linear in unit tests. 
+ +- [ ] **Step 1: Write the failing test** + +`src/linear/client.test.ts`: + +```ts +import { describe, it, expect, vi } from "vitest"; +import { createLinearClient, LinearGateway } from "./client.js"; +import { LABELS } from "./labels.js"; + +function fakeGateway(overrides: Partial<LinearGateway> = {}): LinearGateway { + return { + listIssuesInViewWithLabel: vi.fn().mockResolvedValue([]), + listLabelIdsForIssue: vi.fn().mockResolvedValue([]), + findLabelIdByName: vi.fn().mockResolvedValue("label-id-fake"), + updateIssueLabels: vi.fn().mockResolvedValue(undefined), + createComment: vi.fn().mockResolvedValue(undefined), + ...overrides, + }; +} + +describe("LinearClient", () => { + it("findNextQueuedIssue returns the first issue in the view labelled queued", async () => { + const gateway = fakeGateway({ + listIssuesInViewWithLabel: vi.fn().mockResolvedValue([ + { id: "i1", identifier: "LIN-1", title: "fix sql injection in users", body: "details", labelIds: ["label-id-fake"] }, + { id: "i2", identifier: "LIN-2", title: "fix xss", body: "details", labelIds: ["label-id-fake"] }, + ]), + }); + const client = createLinearClient({ viewId: "view-x" }, gateway); + const next = await client.findNextQueuedIssue(); + expect(next?.identifier).toBe("LIN-1"); + expect(gateway.listIssuesInViewWithLabel).toHaveBeenCalledWith("view-x", LABELS.queued); + }); + + it("swapLabel removes the from label and adds the to label", async () => { + const update = vi.fn().mockResolvedValue(undefined); + const gateway = fakeGateway({ + listLabelIdsForIssue: vi.fn().mockResolvedValue(["from-id", "other-id"]), + findLabelIdByName: vi.fn().mockImplementation((name: string) => + Promise.resolve(name === LABELS.queued ?
"from-id" : "to-id"), + ), + updateIssueLabels: update, + }); + const client = createLinearClient({ viewId: "view-x" }, gateway); + await client.swapLabel("issue-1", LABELS.queued, LABELS.inProgress); + expect(update).toHaveBeenCalledWith("issue-1", ["other-id", "to-id"]); + }); + + it("addComment delegates to gateway", async () => { + const create = vi.fn().mockResolvedValue(undefined); + const gateway = fakeGateway({ createComment: create }); + const client = createLinearClient({ viewId: "view-x" }, gateway); + await client.addComment("issue-1", "hello"); + expect(create).toHaveBeenCalledWith("issue-1", "hello"); + }); +}); +``` + +- [ ] **Step 2: Run test, verify it fails** + +```bash +pnpm test src/linear/client.test.ts +``` + +Expected: FAIL — module not found. + +- [ ] **Step 3: Write `client.ts`** + +```ts +import { LinearClient as SDKClient } from "@linear/sdk"; +import { LABELS, type AgentLabel } from "./labels.js"; + +export type LinearIssue = { + id: string; + identifier: string; + title: string; + body: string; + labelIds: string[]; +}; + +export type LinearGateway = { + listIssuesInViewWithLabel(viewId: string, label: AgentLabel): Promise<LinearIssue[]>; + listLabelIdsForIssue(issueId: string): Promise<string[]>; + findLabelIdByName(name: AgentLabel): Promise<string>; + updateIssueLabels(issueId: string, labelIds: string[]): Promise<void>; + createComment(issueId: string, body: string): Promise<void>; +}; + +export type LinearClient = { + findNextQueuedIssue(): Promise<LinearIssue | null>; + swapLabel(issueId: string, from: AgentLabel, to: AgentLabel): Promise<void>; + addComment(issueId: string, body: string): Promise<void>; +}; + +export function createLinearClient(opts: { viewId: string }, gateway: LinearGateway): LinearClient { + return { + async findNextQueuedIssue() { + const issues = await gateway.listIssuesInViewWithLabel(opts.viewId, LABELS.queued); + return issues[0] ??
null; + }, + async swapLabel(issueId, from, to) { + const [current, fromId, toId] = await Promise.all([ + gateway.listLabelIdsForIssue(issueId), + gateway.findLabelIdByName(from), + gateway.findLabelIdByName(to), + ]); + const next = Array.from(new Set([...current.filter((id) => id !== fromId), toId])); + await gateway.updateIssueLabels(issueId, next); + }, + async addComment(issueId, body) { + await gateway.createComment(issueId, body); + }, + }; +} + +export function createRealGateway(apiKey: string): LinearGateway { + const sdk = new SDKClient({ apiKey }); + return { + async listIssuesInViewWithLabel(viewId, label) { + const view = await sdk.customView(viewId); + const issuesConnection = await view.issues({ + filter: { labels: { name: { eq: label } } }, + orderBy: undefined, + } as any); + return issuesConnection.nodes.map((n) => ({ + id: n.id, + identifier: n.identifier, + title: n.title, + body: n.description ?? "", + labelIds: (n as any)._labelIds ?? [], + })); + }, + async listLabelIdsForIssue(issueId) { + const issue = await sdk.issue(issueId); + const labels = await issue.labels(); + return labels.nodes.map((l) => l.id); + }, + async findLabelIdByName(name) { + const labels = await sdk.issueLabels({ filter: { name: { eq: name } } }); + const found = labels.nodes[0]; + if (!found) throw new Error(`Linear label not found: ${name}`); + return found.id; + }, + async updateIssueLabels(issueId, labelIds) { + await sdk.updateIssue(issueId, { labelIds }); + }, + async createComment(issueId, body) { + await sdk.createComment({ issueId, body }); + }, + }; +} +``` + +- [ ] **Step 4: Run test, verify it passes** + +```bash +pnpm test src/linear/client.test.ts +``` + +Expected: 3 tests pass. 
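
The label arithmetic inside `swapLabel` can be isolated as a pure function, which makes the de-duplication behaviour easy to check without a gateway fake. A minimal sketch follows; `swapLabelIds` is a hypothetical helper name and is not part of the plan's `client.ts`.

```ts
// Hypothetical pure helper mirroring the set arithmetic used by swapLabel:
// drop the "from" label id, append the "to" label id, and de-duplicate.
export function swapLabelIds(
  current: readonly string[],
  fromId: string,
  toId: string,
): string[] {
  const kept = current.filter((id) => id !== fromId);
  // A Set keeps first-insertion order, so existing labels keep their position.
  return Array.from(new Set([...kept, toId]));
}

// Usage: the same inputs as the swapLabel unit test above.
const swapped = swapLabelIds(["from-id", "other-id"], "from-id", "to-id");
console.log(swapped);
```

Note that the swap is a read-modify-write against Linear, so two writers touching the same issue could still overwrite each other; with a serial worker (N=1) that should not arise.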
+ +- [ ] **Step 5: Commit** + +```bash +git add src/linear/client.ts src/linear/client.test.ts +git commit -m "feat: linear client with gateway abstraction" +``` + +--- + +## Task 7: MinIO storage module + +**Files:** +- Create: `~/Development/sec-fix-pipeline/src/storage/minio.ts` +- Create: `~/Development/sec-fix-pipeline/src/storage/minio.test.ts` + +We use the AWS SDK against MinIO's S3-compatible endpoint. The test uses testcontainers to spin a real MinIO — no mocking S3 itself. + +- [ ] **Step 1: Write the failing test** + +`src/storage/minio.test.ts`: + +```ts +import { describe, it, expect, beforeAll, afterAll } from "vitest"; +import { GenericContainer, StartedTestContainer } from "testcontainers"; +import { mkdtemp, writeFile, mkdir, rm } from "node:fs/promises"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; +import { createStorage } from "./minio.js"; + +let minio: StartedTestContainer; +let endpoint: string; + +beforeAll(async () => { + minio = await new GenericContainer("minio/minio:RELEASE.2024-10-29T16-01-48Z") + .withCommand(["server", "/data"]) + .withEnvironment({ MINIO_ROOT_USER: "test", MINIO_ROOT_PASSWORD: "testtest" }) + .withExposedPorts(9000) + .start(); + endpoint = `http://${minio.getHost()}:${minio.getMappedPort(9000)}`; +}, 60_000); + +afterAll(async () => { + await minio.stop(); +}); + +describe("storage", () => { + it("uploads a directory and lists its keys", async () => { + const dir = await mkdtemp(join(tmpdir(), "upload-")); + try { + await mkdir(join(dir, "sub"), { recursive: true }); + await writeFile(join(dir, "a.txt"), "alpha"); + await writeFile(join(dir, "sub/b.txt"), "beta"); + + const storage = createStorage({ + endpoint, + accessKey: "test", + secretKey: "testtest", + bucket: "test-bucket", + }); + await storage.ensureBucket(); + await storage.uploadDirectory(dir, "LIN-1/"); + + const keys = await storage.list("LIN-1/"); + expect(keys.sort()).toEqual(["LIN-1/a.txt", "LIN-1/sub/b.txt"]); + } finally { 
+ await rm(dir, { recursive: true, force: true }); + } + }, 60_000); +}); +``` + +- [ ] **Step 2: Run test, verify it fails** + +```bash +pnpm test src/storage/minio.test.ts +``` + +Expected: FAIL — module not found. + +- [ ] **Step 3: Write `minio.ts`** + +```ts +import { S3Client, CreateBucketCommand, HeadBucketCommand, ListObjectsV2Command, PutObjectCommand } from "@aws-sdk/client-s3"; +import { readFile, readdir, stat } from "node:fs/promises"; +import { join, relative, posix } from "node:path"; + +export type Storage = { + ensureBucket(): Promise<void>; + uploadDirectory(localDir: string, keyPrefix: string): Promise<void>; + list(keyPrefix: string): Promise<string[]>; +}; + +export function createStorage(opts: { + endpoint: string; + accessKey: string; + secretKey: string; + bucket: string; +}): Storage { + const s3 = new S3Client({ + endpoint: opts.endpoint, + region: "us-east-1", + credentials: { accessKeyId: opts.accessKey, secretAccessKey: opts.secretKey }, + forcePathStyle: true, + }); + + return { + async ensureBucket() { + try { + await s3.send(new HeadBucketCommand({ Bucket: opts.bucket })); + } catch { + await s3.send(new CreateBucketCommand({ Bucket: opts.bucket })); + } + }, + + async uploadDirectory(localDir, keyPrefix) { + const files = await walk(localDir); + for (const abs of files) { + const rel = relative(localDir, abs).split(/[\\/]/).join("/"); + const key = posix.join(keyPrefix.replace(/\/+$/, ""), rel); + const body = await readFile(abs); + await s3.send(new PutObjectCommand({ Bucket: opts.bucket, Key: key, Body: body })); + } + }, + + async list(keyPrefix) { + const out: string[] = []; + let token: string | undefined; + do { + const resp = await s3.send( + new ListObjectsV2Command({ Bucket: opts.bucket, Prefix: keyPrefix, ContinuationToken: token }), + ); + for (const obj of resp.Contents ?? []) if (obj.Key) out.push(obj.Key); + token = resp.IsTruncated ?
resp.NextContinuationToken : undefined; + } while (token); + return out; + }, + }; +} + +async function walk(dir: string): Promise<string[]> { + const entries = await readdir(dir, { withFileTypes: true }); + const out: string[] = []; + for (const e of entries) { + const full = join(dir, e.name); + if (e.isDirectory()) out.push(...(await walk(full))); + else if (e.isFile()) out.push(full); + } + return out; +} +``` + +- [ ] **Step 4: Run test, verify it passes** + +```bash +pnpm test src/storage/minio.test.ts +``` + +Expected: 1 test passes (the testcontainer image pull may take a minute the first time). + +- [ ] **Step 5: Commit** + +```bash +git add src/storage/minio.ts src/storage/minio.test.ts +git commit -m "feat: minio storage module with directory upload" +``` + +--- + +## Task 8: Compose wrapper + +**Files:** +- Create: `~/Development/sec-fix-pipeline/src/stack/compose.ts` +- Create: `~/Development/sec-fix-pipeline/src/stack/compose.test.ts` + +The compose module wraps three operations: `up --wait`, `wait <service>`, `down -v`. Tests use a tiny fixture `compose-fixture.yml` with a single alpine service that sleeps then exits — no real stack needed.
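
The lifecycle these three operations form can be sketched without Docker by running one cycle against an in-memory fake. This is only an illustration of the intended ordering and of teardown in a `finally` block; `Lifecycle`, `runCycle`, and `fakeStack` are hypothetical names, and the real wrapper also takes per-call timeouts.

```ts
// Hypothetical, timeout-free view of the three wrapped operations.
type Lifecycle = {
  up(): Promise<void>;
  waitForService(service: string): Promise<number>;
  down(): Promise<void>;
};

// Runs one up -> wait -> down cycle; down() runs even if up() or wait throws.
export async function runCycle(stack: Lifecycle, service: string): Promise<number> {
  try {
    await stack.up();
    return await stack.waitForService(service);
  } finally {
    await stack.down();
  }
}

// In-memory fake that records call order instead of shelling out to docker.
export function fakeStack(exitCode: number, calls: string[]): Lifecycle {
  return {
    async up() {
      calls.push("up");
    },
    async waitForService(service) {
      calls.push(`wait ${service}`);
      return exitCode;
    },
    async down() {
      calls.push("down");
    },
  };
}
```

Keeping teardown in `finally` means a stack is not left running after a failed run, which matters when each issue gets its own compose project.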
+ +- [ ] **Step 1: Write a compose fixture for the test** + +Create `src/stack/__fixtures__/compose-fixture.yml`: + +```yaml +services: + agent: + image: alpine:3.20 + command: ["sh", "-c", "echo started; sleep 2; echo done; exit 0"] +``` + +- [ ] **Step 2: Write the failing test** + +`src/stack/compose.test.ts`: + +```ts +import { describe, it, expect } from "vitest"; +import { join } from "node:path"; +import { createCompose } from "./compose.js"; + +const fixture = join(__dirname, "__fixtures__/compose-fixture.yml"); + +describe("compose", () => { + it("up → wait → down cycles cleanly and reports exit code", async () => { + const compose = createCompose({ + file: fixture, + project: `sec-fix-test-${Date.now()}`, + }); + try { + await compose.up({ timeoutMs: 30_000 }); + const exitCode = await compose.waitForService("agent", { timeoutMs: 30_000 }); + expect(exitCode).toBe(0); + } finally { + await compose.down({ timeoutMs: 30_000 }); + } + }, 90_000); +}); +``` + +- [ ] **Step 3: Run test, verify it fails** + +```bash +pnpm test src/stack/compose.test.ts +``` + +Expected: FAIL — module not found. 
+ +- [ ] **Step 4: Write `compose.ts`** + +```ts +import { execa } from "execa"; + +export type Compose = { + up(opts: { timeoutMs: number }): Promise<void>; + waitForService(service: string, opts: { timeoutMs: number }): Promise<number>; + down(opts: { timeoutMs: number }): Promise<void>; +}; + +export function createCompose(opts: { file: string; project: string }): Compose { + const base = ["compose", "-p", opts.project, "-f", opts.file]; + + return { + async up({ timeoutMs }) { + await execa("docker", [...base, "up", "-d", "--wait"], { timeout: timeoutMs }); + }, + + async waitForService(service, { timeoutMs }) { + const result = await execa("docker", [...base, "wait", service], { timeout: timeoutMs }); + const code = parseInt(result.stdout.trim(), 10); + if (Number.isNaN(code)) { + throw new Error(`docker compose wait returned non-numeric: ${result.stdout}`); + } + return code; + }, + + async down({ timeoutMs }) { + await execa("docker", [...base, "down", "-v"], { timeout: timeoutMs }); + }, + }; +} +``` + +- [ ] **Step 5: Run test, verify it passes** + +```bash +pnpm test src/stack/compose.test.ts +``` + +Expected: 1 test passes. Requires Docker running locally. + +- [ ] **Step 6: Commit** + +```bash +git add src/stack/compose.ts src/stack/compose.test.ts src/stack/__fixtures__/compose-fixture.yml +git commit -m "feat: docker compose wrapper (up/wait/down)" +``` + +--- + +## Task 9: Stub agent container + +**Files:** +- Create: `~/Development/sec-fix-pipeline/docker/agent-stub/Dockerfile` +- Create: `~/Development/sec-fix-pipeline/docker/agent-stub/run-agent.mjs` + +The Phase 1 agent does nothing real: it writes a hello file containing the issue identifier into `/artifacts/`, then exits 0. The container interface is the contract Phase 2 will fill out.
+ +- [ ] **Step 1: Write `run-agent.mjs`** + +```js +import { writeFile, mkdir } from "node:fs/promises"; + +const issueId = process.env.LINEAR_ISSUE_ID; +if (!issueId) { + console.error("LINEAR_ISSUE_ID is required"); + process.exit(2); +} + +await mkdir("/artifacts", { recursive: true }); +await writeFile( + "/artifacts/hello.txt", + `Hello from sec-fix-pipeline phase 1 stub agent.\nIssue: ${issueId}\nTimestamp: ${new Date().toISOString()}\n`, +); +await writeFile( + "/artifacts/final-summary.md", + `# Stub agent run\n\nIssue: ${issueId}\n\nThis is a Phase 1 skeleton. No real validation or fix was performed.\n`, +); +await writeFile( + "/artifacts/status.json", + JSON.stringify({ issueId, endedAt: new Date().toISOString(), stub: true }, null, 2), +); + +console.log(`stub agent done for ${issueId}`); +process.exit(0); +``` + +- [ ] **Step 2: Write `Dockerfile`** + +```dockerfile +FROM node:22.13.0-alpine +WORKDIR /app +COPY run-agent.mjs ./run-agent.mjs +ENTRYPOINT ["node", "/app/run-agent.mjs"] +``` + +- [ ] **Step 3: Build the image** + +```bash +cd ~/Development/sec-fix-pipeline +docker build -t sec-fix/agent-stub:latest docker/agent-stub +``` + +Expected: build succeeds. + +- [ ] **Step 4: Smoke-run the image** + +```bash +docker run --rm -e LINEAR_ISSUE_ID=LIN-TEST -v $PWD/tmp-artifacts:/artifacts sec-fix/agent-stub:latest +ls tmp-artifacts/ +rm -rf tmp-artifacts/ +``` + +Expected: prints "stub agent done for LIN-TEST"; directory contains `hello.txt`, `final-summary.md`, `status.json`. + +- [ ] **Step 5: Commit** + +```bash +git add docker/agent-stub/ +git commit -m "feat: stub agent container that writes hello artifact" +``` + +--- + +## Task 10: Per-issue stack template + +For Phase 1 the stack contains only the agent service. Later phases add postgres, redis, clickhouse, webapp, electric, and the shared `repo` volume. The MinIO host service is run separately (long-lived, shared across all issues). 
+ +**Files:** +- Create: `~/Development/sec-fix-pipeline/docker/stack.yml` + +- [ ] **Step 1: Write `stack.yml`** + +```yaml +# Per-issue compose template. Instantiated with -p sec-<issue-identifier>. +# Phase 1: agent service only. Phase 3 will add the full trigger.dev stack. + +name: sec-fix-issue + +services: + agent: + image: sec-fix/agent-stub:latest + environment: + LINEAR_ISSUE_ID: "${LINEAR_ISSUE_ID:?LINEAR_ISSUE_ID must be set}" + volumes: + - artifacts:/artifacts + +volumes: + artifacts: +``` + +- [ ] **Step 2: Write the host services compose file** + +Create `docker/host-services.yml`: + +```yaml +# Long-lived host services. Brought up once with: +# docker compose -f docker/host-services.yml up -d + +name: sec-fix-host + +services: + minio: + image: minio/minio:RELEASE.2024-10-29T16-01-48Z + command: ["server", "/data", "--console-address", ":9001"] + environment: + MINIO_ROOT_USER: "${MINIO_ACCESS_KEY:-minioadmin}" + MINIO_ROOT_PASSWORD: "${MINIO_SECRET_KEY:-minioadmin}" + ports: + - "9000:9000" + - "9001:9001" + volumes: + - minio-data:/data + restart: unless-stopped + +volumes: + minio-data: +``` + +- [ ] **Step 3: Start host services** + +```bash +docker compose -f docker/host-services.yml up -d +docker ps | grep minio +``` + +Expected: MinIO container running on port 9000.
+ +- [ ] **Step 4: Commit** + +```bash +git add docker/stack.yml docker/host-services.yml +git commit -m "feat: per-issue stack template and host MinIO service" +``` + +--- + +## Task 11: MinIO bucket setup script + +**Files:** +- Create: `~/Development/sec-fix-pipeline/scripts/setup-minio.sh` + +- [ ] **Step 1: Write `setup-minio.sh`** + +```bash +#!/usr/bin/env bash +set -euo pipefail + +# Loads env from .env if present +if [ -f .env ]; then + set -a; source .env; set +a +fi + +: "${MINIO_ENDPOINT:?must be set}" +: "${MINIO_ACCESS_KEY:?must be set}" +: "${MINIO_SECRET_KEY:?must be set}" +: "${MINIO_BUCKET:?must be set}" + +docker run --rm --network host \ + -e MC_HOST_local="http://${MINIO_ACCESS_KEY}:${MINIO_SECRET_KEY}@${MINIO_ENDPOINT#http://}" \ + minio/mc:RELEASE.2024-10-29T15-34-59Z \ + mb -p "local/${MINIO_BUCKET}" + +echo "Bucket ready: ${MINIO_BUCKET}" +``` + +- [ ] **Step 2: Make executable and run** + +```bash +chmod +x scripts/setup-minio.sh +pnpm setup:minio +``` + +Expected: prints "Bucket ready: security-artifacts" (or reports the bucket already exists). + +- [ ] **Step 3: Commit** + +```bash +git add scripts/setup-minio.sh +git commit -m "feat: minio bucket setup script" +``` + +--- + +## Task 12: process-issue orchestration + +This is the integration point. It claims the issue, runs the stack, collects `/artifacts/` from the compose volume, uploads to MinIO, posts a Linear comment, and finalizes the label. 
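
The naming and outcome rules this orchestration depends on are small enough to write down as pure functions. The sketch below is illustrative only: the helper names are hypothetical, while the rules come from this plan (project `sec-<identifier>` lower-cased, Compose's `<project>_<volume>` volume naming, exit code 0 meaning success, and the `agent:done` / `agent:failed` labels from Task 5).

```ts
// Hypothetical helper names; the rules they encode are taken from this plan.
type Outcome = "done" | "failed";

// Compose project names must be lowercase, hence the toLowerCase().
export function projectFor(issueIdentifier: string): string {
  return `sec-${issueIdentifier.toLowerCase()}`;
}

// Docker Compose prefixes named volumes with the project name.
export function artifactsVolumeFor(project: string): string {
  return `${project}_artifacts`;
}

// Phase 1 rule: only exit code 0 counts as success.
export function outcomeFor(exitCode: number): Outcome {
  return exitCode === 0 ? "done" : "failed";
}

// Final Linear label for an outcome (label names as defined in Task 5).
export function finalLabelFor(outcome: Outcome): "agent:done" | "agent:failed" {
  return outcome === "done" ? "agent:done" : "agent:failed";
}

// Usage: derive everything the worker needs for one issue.
const project = projectFor("LIN-1");
console.log(project, artifactsVolumeFor(project), finalLabelFor(outcomeFor(1)));
```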
+ +**Files:** +- Create: `~/Development/sec-fix-pipeline/src/worker/process-issue.ts` +- Create: `~/Development/sec-fix-pipeline/src/worker/process-issue.test.ts` + +- [ ] **Step 1: Write the failing test** + +`src/worker/process-issue.test.ts`: + +```ts +import { describe, it, expect, vi } from "vitest"; +import { mkdtemp, rm, mkdir, writeFile, readFile } from "node:fs/promises"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; +import { processIssue, ProcessIssueDeps } from "./process-issue.js"; +import { LABELS } from "../linear/labels.js"; + +function deps(overrides: Partial<ProcessIssueDeps> = {}): ProcessIssueDeps { + return { + linear: { + findNextQueuedIssue: vi.fn(), + swapLabel: vi.fn().mockResolvedValue(undefined), + addComment: vi.fn().mockResolvedValue(undefined), + }, + compose: { + up: vi.fn().mockResolvedValue(undefined), + waitForService: vi.fn().mockResolvedValue(0), + down: vi.fn().mockResolvedValue(undefined), + }, + storage: { + ensureBucket: vi.fn().mockResolvedValue(undefined), + uploadDirectory: vi.fn().mockResolvedValue(undefined), + list: vi.fn().mockResolvedValue([]), + }, + collectArtifacts: vi.fn().mockResolvedValue(undefined), + stateDir: "", + outDir: "", + logger: { info: () => {}, error: () => {}, warn: () => {} } as any, + ...overrides, + }; +} + +describe("processIssue", () => { + it("happy path: claim → up → wait → collect → upload → comment → done", async () => { + const d = deps(); + const stateDir = await mkdtemp(join(tmpdir(), "state-")); + const outDir = await mkdtemp(join(tmpdir(), "out-")); + try { + await processIssue( + { id: "issue-1", identifier: "LIN-1", title: "test", body: "", labelIds: [] }, + { ...d, stateDir, outDir }, + ); + expect(d.linear.swapLabel).toHaveBeenNthCalledWith(1, "issue-1", LABELS.queued, LABELS.inProgress); + expect(d.compose.up).toHaveBeenCalledOnce(); + expect(d.compose.waitForService).toHaveBeenCalledWith("agent", expect.any(Object)); + expect(d.storage.uploadDirectory).toHaveBeenCalled();
+ expect(d.linear.addComment).toHaveBeenCalled(); + expect(d.linear.swapLabel).toHaveBeenNthCalledWith(2, "issue-1", LABELS.inProgress, LABELS.done); + expect(d.compose.down).toHaveBeenCalledOnce(); + + const state = JSON.parse(await readFile(join(stateDir, "LIN-1.json"), "utf8")); + expect(state.phase).toBe("finalized"); + expect(state.outcome).toBe("done"); + } finally { + await rm(stateDir, { recursive: true, force: true }); + await rm(outDir, { recursive: true, force: true }); + } + }); + + it("agent non-zero exit causes failed outcome and label", async () => { + const d = deps({ + compose: { + up: vi.fn().mockResolvedValue(undefined), + waitForService: vi.fn().mockResolvedValue(1), + down: vi.fn().mockResolvedValue(undefined), + }, + }); + const stateDir = await mkdtemp(join(tmpdir(), "state-")); + const outDir = await mkdtemp(join(tmpdir(), "out-")); + try { + await processIssue( + { id: "issue-2", identifier: "LIN-2", title: "test", body: "", labelIds: [] }, + { ...d, stateDir, outDir }, + ); + expect(d.linear.swapLabel).toHaveBeenLastCalledWith("issue-2", LABELS.inProgress, LABELS.failed); + const state = JSON.parse(await readFile(join(stateDir, "LIN-2.json"), "utf8")); + expect(state.outcome).toBe("failed"); + } finally { + await rm(stateDir, { recursive: true, force: true }); + await rm(outDir, { recursive: true, force: true }); + } + }); + + it("always tears down the stack, even if upload fails", async () => { + const d = deps({ + storage: { + ensureBucket: vi.fn().mockResolvedValue(undefined), + uploadDirectory: vi.fn().mockRejectedValue(new Error("minio down")), + list: vi.fn().mockResolvedValue([]), + }, + }); + const stateDir = await mkdtemp(join(tmpdir(), "state-")); + const outDir = await mkdtemp(join(tmpdir(), "out-")); + try { + await expect( + processIssue( + { id: "issue-3", identifier: "LIN-3", title: "test", body: "", labelIds: [] }, + { ...d, stateDir, outDir }, + ), + ).rejects.toThrow(/minio down/); + 
expect(d.compose.down).toHaveBeenCalledOnce(); + } finally { + await rm(stateDir, { recursive: true, force: true }); + await rm(outDir, { recursive: true, force: true }); + } + }); +}); +``` + +- [ ] **Step 2: Run test, verify it fails** + +```bash +pnpm test src/worker/process-issue.test.ts +``` + +Expected: FAIL — module not found. + +- [ ] **Step 3: Write `process-issue.ts`** + +```ts +import { mkdir } from "node:fs/promises"; +import { join } from "node:path"; +import type { LinearClient, LinearIssue } from "../linear/client.js"; +import { LABELS } from "../linear/labels.js"; +import type { Compose } from "../stack/compose.js"; +import type { Storage } from "../storage/minio.js"; +import type { Logger } from "../logger.js"; +import { writeState } from "../state.js"; + +export type ProcessIssueDeps = { + linear: LinearClient; + compose: Compose; + storage: Storage; + collectArtifacts: (project: string, destDir: string) => Promise<void>; + stateDir: string; + outDir: string; + logger: Logger; +}; + +const TIMEOUTS = { + up: 5 * 60_000, + wait: 90 * 60_000, + down: 2 * 60_000, +}; + +export async function processIssue(issue: LinearIssue, deps: ProcessIssueDeps): Promise<void> { + const project = `sec-${issue.identifier.toLowerCase()}`; + const issueOutDir = join(deps.outDir, issue.identifier); + await mkdir(issueOutDir, { recursive: true }); + + // Claim + await writeState(deps.stateDir, { + issueIdentifier: issue.identifier, + phase: "claimed", + project, + startedAt: new Date().toISOString(), + outcome: null, + }); + await deps.linear.swapLabel(issue.id, LABELS.queued, LABELS.inProgress); + deps.logger.info({ issue: issue.identifier, project }, "claimed"); + + let outcome: "done" | "failed" = "done"; + + try { + await writeState(deps.stateDir, { + issueIdentifier: issue.identifier, + phase: "running", + project, + startedAt: new Date().toISOString(), + outcome: null, + }); + await deps.compose.up({ timeoutMs: TIMEOUTS.up }); + const exitCode = await
deps.compose.waitForService("agent", { timeoutMs: TIMEOUTS.wait }); + deps.logger.info({ issue: issue.identifier, exitCode }, "agent exited"); + if (exitCode !== 0) outcome = "failed"; + + await deps.collectArtifacts(project, issueOutDir); + await deps.storage.ensureBucket(); + await deps.storage.uploadDirectory(issueOutDir, `${issue.identifier}/`); + await writeState(deps.stateDir, { + issueIdentifier: issue.identifier, + phase: "uploaded", + project, + startedAt: new Date().toISOString(), + outcome, + }); + + await deps.linear.addComment( + issue.id, + renderComment(issue.identifier, outcome), + ); + + await deps.linear.swapLabel( + issue.id, + LABELS.inProgress, + outcome === "done" ? LABELS.done : LABELS.failed, + ); + await writeState(deps.stateDir, { + issueIdentifier: issue.identifier, + phase: "finalized", + project, + startedAt: new Date().toISOString(), + outcome, + }); + deps.logger.info({ issue: issue.identifier, outcome }, "finalized"); + } finally { + try { + await deps.compose.down({ timeoutMs: TIMEOUTS.down }); + } catch (err) { + deps.logger.error({ issue: issue.identifier, err: String(err) }, "compose down failed"); + } + } +} + +function renderComment(issueIdentifier: string, outcome: "done" | "failed"): string { + return [ + ``, + `**sec-fix-worker — Phase 1 stub**`, + ``, + `Outcome: \`${outcome}\``, + ``, + `Artifacts uploaded to \`s3://security-artifacts/${issueIdentifier}/\`.`, + ].join("\n"); +} +``` + +- [ ] **Step 4: Run test, verify it passes** + +```bash +pnpm test src/worker/process-issue.test.ts +``` + +Expected: 3 tests pass. + +- [ ] **Step 5: Commit** + +```bash +git add src/worker/process-issue.ts src/worker/process-issue.test.ts +git commit -m "feat: process-issue orchestration with state transitions" +``` + +--- + +## Task 13: Artifact collection from compose volume + +`collectArtifacts(project, destDir)` copies the contents of the named `artifacts` volume out to a host directory. 
It runs `cp -a` inside a one-shot helper container that mounts the volume read-only and bind-mounts the destination directory. + +**Files:** +- Create: `~/Development/sec-fix-pipeline/src/stack/collect-artifacts.ts` +- Create: `~/Development/sec-fix-pipeline/src/stack/collect-artifacts.test.ts` + +- [ ] **Step 1: Write the failing test** + +`src/stack/collect-artifacts.test.ts`: + +```ts +import { describe, it, expect } from "vitest"; +import { mkdtemp, readFile, rm } from "node:fs/promises"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; +import { execa } from "execa"; +import { collectArtifacts } from "./collect-artifacts.js"; + +describe("collectArtifacts", () => { + it("copies files out of a named volume into the destination dir", async () => { + const project = `sec-fix-collect-test-${Date.now()}`; + const volume = `${project}_artifacts`; + // Pre-populate the volume by running a one-shot container. + await execa("docker", [ + "run", "--rm", + "-v", `${volume}:/artifacts`, + "alpine:3.20", + "sh", "-c", "echo hi > /artifacts/a.txt && mkdir -p /artifacts/sub && echo two > /artifacts/sub/b.txt", + ]); + const dest = await mkdtemp(join(tmpdir(), "collected-")); + try { + await collectArtifacts(project, dest); + expect(await readFile(join(dest, "a.txt"), "utf8")).toBe("hi\n"); + expect(await readFile(join(dest, "sub/b.txt"), "utf8")).toBe("two\n"); + } finally { + await rm(dest, { recursive: true, force: true }); + await execa("docker", ["volume", "rm", "-f", volume]).catch(() => {}); + } + }, 60_000); +}); +``` + +- [ ] **Step 2: Run test, verify it fails** + +```bash +pnpm test src/stack/collect-artifacts.test.ts +``` + +Expected: FAIL — module not found.
+ +- [ ] **Step 3: Write `collect-artifacts.ts`** + +```ts +import { execa } from "execa"; + +export async function collectArtifacts(project: string, destDir: string): Promise<void> { + const volume = `${project}_artifacts`; + await execa("docker", [ + "run", "--rm", + "-v", `${volume}:/src:ro`, + "-v", `${destDir}:/dst`, + "alpine:3.20", + "sh", "-c", "cp -a /src/. /dst/", + ]); +} +``` + +- [ ] **Step 4: Run test, verify it passes** + +```bash +pnpm test src/stack/collect-artifacts.test.ts +``` + +Expected: 1 test passes. + +- [ ] **Step 5: Commit** + +```bash +git add src/stack/collect-artifacts.ts src/stack/collect-artifacts.test.ts +git commit -m "feat: collect artifacts from compose named volume" +``` + +--- + +## Task 14: Worker CLI entry point + +**Files:** +- Create: `~/Development/sec-fix-pipeline/src/worker/main.ts` + +`main.ts` wires everything together for a single-issue dry-run: load config, build Linear/compose/storage clients, fetch one issue, call `processIssue`, exit. The continuous loop comes in Phase 4.
+ +- [ ] **Step 1: Write `main.ts`** + +```ts +import "dotenv/config"; +import { loadConfig } from "../config.js"; +import { logger } from "../logger.js"; +import { createLinearClient, createRealGateway } from "../linear/client.js"; +import { createStorage } from "../storage/minio.js"; +import { createCompose } from "../stack/compose.js"; +import { collectArtifacts } from "../stack/collect-artifacts.js"; +import { processIssue } from "./process-issue.js"; +import { join } from "node:path"; + +async function main(): Promise<number> { + const cfg = loadConfig(); + const linear = createLinearClient( + { viewId: cfg.linear.viewId }, + createRealGateway(cfg.linear.apiKey), + ); + const storage = createStorage(cfg.minio); + + const issue = await linear.findNextQueuedIssue(); + if (!issue) { + logger.info("no queued issue; exiting"); + return 0; + } + logger.info({ issue: issue.identifier, title: issue.title }, "picked issue"); + + const project = `sec-${issue.identifier.toLowerCase()}`; + const compose = createCompose({ + file: join(process.cwd(), "docker/stack.yml"), + project, + }); + + process.env.LINEAR_ISSUE_ID = issue.identifier; // consumed by compose stack.yml + + await processIssue(issue, { + linear, + compose, + storage, + collectArtifacts: (p, d) => collectArtifacts(p, d), + stateDir: cfg.dirs.state, + outDir: cfg.dirs.out, + logger, + }); + + return 0; +} + +main() + .then((code) => process.exit(code)) + .catch((err) => { + logger.error({ err: String(err), stack: err?.stack }, "worker crashed"); + process.exit(1); + }); +``` + +- [ ] **Step 2: Add `dotenv` to dependencies** + +Edit `package.json` to add to `dependencies`: + +```json + "dotenv": "16.4.5", +``` + +Then: + +```bash +pnpm i +``` + +- [ ] **Step 3: Typecheck** + +```bash +pnpm typecheck +``` + +Expected: no errors.
+ +- [ ] **Step 4: Commit** + +```bash +git add src/worker/main.ts package.json pnpm-lock.yaml +git commit -m "feat: worker CLI entry point" +``` + +--- + +## Task 15: Seed-test-issue script + +A helper script to create a Linear issue in the deepsec view labelled `agent:queued`, for end-to-end testing without touching real findings. + +**Files:** +- Create: `~/Development/sec-fix-pipeline/scripts/seed-test-issue.ts` + +- [ ] **Step 1: Write `seed-test-issue.ts`** + +```ts +import "dotenv/config"; +import { LinearClient } from "@linear/sdk"; +import { loadConfig } from "../src/config.js"; +import { LABELS } from "../src/linear/labels.js"; + +async function main() { + const cfg = loadConfig(); + const linear = new LinearClient({ apiKey: cfg.linear.apiKey }); + + // The deepsec view is on a specific team; we read the view to find the team ID. + const view = await linear.customView(cfg.linear.viewId); + const team = await view.team; + if (!team) throw new Error("could not resolve view's team"); + + const labels = await linear.issueLabels({ filter: { name: { eq: LABELS.queued } } }); + const queuedLabel = labels.nodes[0]; + if (!queuedLabel) throw new Error(`label not found: ${LABELS.queued}`); + + const created = await linear.createIssue({ + teamId: team.id, + title: `[sec-fix-test] phase 1 e2e probe — ${new Date().toISOString()}`, + description: "Synthetic issue created by sec-fix-pipeline phase 1 e2e test. Safe to close.", + labelIds: [queuedLabel.id], + }); + const issue = await created.issue; + console.log(`Created issue: ${issue?.identifier} (${issue?.id})`); + console.log(`URL: ${issue?.url}`); +} + +main().catch((e) => { + console.error(e); + process.exit(1); +}); +``` + +- [ ] **Step 2: Run it** + +```bash +pnpm seed:issue +``` + +Expected: prints a new issue identifier (e.g. `TRI-1234`) and URL. + +- [ ] **Step 3: Verify in Linear** + +Open the printed URL in a browser. 
Confirm the issue exists, has the `agent:queued` label, and appears in the deepsec-findings view. + +> If the label doesn't exist yet on the team, create the full label set (`agent:queued`, `agent:in-progress`, `agent:done`, `agent:false-positive`, `agent:runaway`, `agent:resume`, `agent:failed`) in the Linear UI under the team's label settings before continuing. Then re-run the seed script. + +- [ ] **Step 4: Commit** + +```bash +git add scripts/seed-test-issue.ts +git commit -m "feat: seed-test-issue script" +``` + +--- + +## Task 16: End-to-end integration test (manual) + +This is the Phase 1 ship gate. It is not automated: the operator runs it once and checks each step. + +- [ ] **Step 1: Ensure prerequisites are running** + +```bash +docker ps | grep minio # MinIO container up +docker images | grep sec-fix/agent-stub # Stub agent image built +``` + +If either is missing, redo Task 9 / Task 10. + +- [ ] **Step 2: Confirm `.env` is populated** + +```bash +test -f .env && grep LINEAR_API_KEY .env | grep -v '^LINEAR_API_KEY=$' +``` + +Expected: prints the line (key is set). + +- [ ] **Step 3: Set up the bucket and seed a test issue** + +```bash +pnpm setup:minio +pnpm seed:issue +``` + +Note the printed issue identifier (e.g. `TRI-1234`). + +- [ ] **Step 4: Run the worker** + +```bash +pnpm worker 2>&1 | tee /tmp/sec-fix-worker.log +``` + +Expected log lines (in order): +- `"picked issue"` with the seeded issue's identifier +- `"claimed"` +- `"agent exited"` with `exitCode: 0` +- `"finalized"` with `outcome: done` + +Process exits 0. + +- [ ] **Step 5: Verify Linear state** + +Open the seeded issue in Linear.
Confirm: +- Label `agent:queued` is gone +- Label `agent:done` is present +- A comment was added containing `s3://security-artifacts//` + +- [ ] **Step 6: Verify MinIO contents** + +```bash +docker run --rm --network host \ + -e MC_HOST_local="http://${MINIO_ACCESS_KEY}:${MINIO_SECRET_KEY}@localhost:9000" \ + minio/mc:RELEASE.2024-10-29T15-34-59Z \ + ls -r local/security-artifacts/ +``` + +Expected: lists `/hello.txt`, `/final-summary.md`, `/status.json`. + +- [ ] **Step 7: Verify local state file** + +```bash +cat state/.json +``` + +Expected: JSON with `"phase": "finalized"`, `"outcome": "done"`. + +- [ ] **Step 8: Verify no compose stack lingers** + +```bash +docker compose ls +``` + +Expected: no `sec-*` project listed. + +- [ ] **Step 9: Close the seeded test issue manually** + +In Linear, close the seeded test issue with a comment "phase 1 e2e probe complete". + +- [ ] **Step 10: Document the run** + +Append a line to `README.md`: + +```markdown +## Phase 1 e2e probe history + +- : — issue — PASS +``` + +- [ ] **Step 11: Commit** + +```bash +git add README.md +git commit -m "docs: record phase 1 e2e probe pass" +``` + +--- + +## Phase 1 done + +Pipeline plumbing validated. Next: Phase 2 — Real Claude Agent SDK integration. The interface between the worker and the agent container is now fixed (env var in, `/artifacts/` out, exit code as signal), so Phase 2 only changes `docker/agent-stub/` → `docker/agent/` with a real `run-agent.mjs` and a system prompt. 
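The "exit code as signal" half of that contract can be sketched as two small pure functions. This is only an illustration: the label strings come from the label set listed in Task 15, while `outcomeFromExitCode` and `finalLabel` are hypothetical helper names, not functions from the plan.

```ts
// Hypothetical helpers; the plan inlines this logic in process-issue.ts.
type Outcome = "done" | "failed";

// Label names as listed in the plan's label set.
const LABELS = {
  inProgress: "agent:in-progress",
  done: "agent:done",
  failed: "agent:failed",
} as const;

// The agent container's exit code is the only completion signal:
// 0 means the run succeeded, anything else is treated as a failure.
export function outcomeFromExitCode(exitCode: number): Outcome {
  return exitCode === 0 ? "done" : "failed";
}

// The worker swaps agent:in-progress for the label matching the outcome.
export function finalLabel(outcome: Outcome): string {
  return outcome === "done" ? LABELS.done : LABELS.failed;
}

console.log(finalLabel(outcomeFromExitCode(0))); // agent:done
console.log(finalLabel(outcomeFromExitCode(137))); // agent:failed
```

Keeping the mapping this narrow is what lets Phase 2 swap the stub for a real agent without touching the worker: as long as the new container still exits 0 on success, the label transition is unchanged.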
+ +--- + +## Self-Review (writing-plans skill) + +**Spec coverage (Phase 1 only):** +- Linear queue read from deepsec view → Task 6, Task 14 ✓ +- Label state machine (queued → in-progress → done/failed) → Tasks 5, 12 ✓ +- Per-issue docker compose with project name `sec-` → Tasks 8, 10, 12 ✓ +- Agent container with `/artifacts/` contract → Task 9 ✓ +- Container exit code as completion signal → Task 8 (`waitForService`), Task 12 ✓ +- MinIO artifact upload → Tasks 7, 11, 12 ✓ +- Linear comment with marker → Task 12 (`renderComment`) ✓ +- Local state file with atomic writes → Task 4 ✓ +- Teardown via `compose down -v` even on failure → Task 12 (`finally`) ✓ + +**Spec sections deferred to later phases (explicit, not gaps):** +- Real Agent SDK + system prompt + multi-artifact bundle → Phase 2 +- Full per-issue stack (postgres/redis/clickhouse/webapp) + repo volume + pnpm-store + egress policy → Phase 3 +- systemd + reconcile + circuit breakers + heartbeat → Phase 4 +- Resumable runs (snapshots, session resume, 3-resume cap) → Phase 5 +- Review dashboard → Phase 6 + +**Placeholder scan:** None. All steps have concrete code or commands. + +**Type consistency check:** `LinearIssue`, `LinearClient`, `LinearGateway`, `Compose`, `Storage`, `IssueState`, `ProcessIssueDeps` all named consistently across Tasks 4, 6, 7, 8, 12. + +**One known caveat:** the `findNextQueuedIssue` gateway implementation in Task 6 uses a `customView(viewId).issues(...)` call against the Linear SDK; the exact filter API may need adjustment after Phase 1 hits real Linear (Linear's SDK has changed view-issue filtering shape between releases). The integration test in Task 16 will surface this; fix is a small edit in the gateway, not a design change. 
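For the "local state file with atomic writes" item in the coverage list, one common approach is to write a temporary file next to the target and rename it into place. The sketch below is an assumption about how that could look, not the Task 4 implementation: `writeStateAtomic`, the `IssueState` fields shown, and the one-file-per-issue naming are all hypothetical, and it uses synchronous `node:fs` calls for brevity where the plan's `writeState` is awaited.

```ts
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical shape for illustration; the plan defines its own IssueState.
interface IssueState {
  issueIdentifier: string;
  phase: string;
  outcome?: "done" | "failed";
}

// Write the state to a temp file in the same directory, then rename it over
// the target. A rename within one filesystem replaces the target in a single
// step on POSIX systems, so readers never see a half-written file.
export function writeStateAtomic(stateDir: string, state: IssueState): string {
  // File naming is an assumption: one JSON file per issue identifier.
  const finalPath = join(stateDir, `${state.issueIdentifier}.json`);
  const tmpPath = `${finalPath}.${process.pid}.tmp`;
  writeFileSync(tmpPath, JSON.stringify(state, null, 2) + "\n");
  renameSync(tmpPath, finalPath);
  return finalPath;
}

// Usage: write a state file into a scratch directory and read it back.
const demoDir = mkdtempSync(join(tmpdir(), "state-demo-"));
const demoPath = writeStateAtomic(demoDir, {
  issueIdentifier: "TRI-1234",
  phase: "finalized",
  outcome: "done",
});
console.log(JSON.parse(readFileSync(demoPath, "utf8")).phase); // finalized
```

Putting the temp file in the same directory as the target matters: a rename across filesystems is not a single operation, so a temp file under `/tmp` would lose the guarantee.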
From 16ad47e981a6c919890484c5dd58d4cbfc985e87 Mon Sep 17 00:00:00 2001 From: Daniel Sutton Date: Mon, 1 Jun 2026 20:20:43 +0100 Subject: [PATCH 03/12] feat(redis-worker): per-env batched pop in MollifierDrainer MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds a drainBatchSize option (default 1, preserves existing behaviour) that lets the drainer pop up to N entries from each chosen env per tick and dispatch them all through the shared concurrency-bounded limiter. Org/env fairness is preserved — the per-tick env selection is unchanged, only the in-env pop count grows. Wires TRIGGER_MOLLIFIER_DRAIN_BATCH_SIZE through the webapp (default 50). For a single-env burst of K entries with K > 1, drain time drops from K × tick_time to ceil(K / drainBatchSize) × tick_time with handler parallelism capped at concurrency. Heavy single-tenant tails go from minutes to tens of seconds without changing PG load characteristics. Co-Authored-By: Claude Opus 4.7 --- .changeset/mollifier-drain-batch-size.md | 5 + .server-changes/mollifier-drain-batch-size.md | 6 + apps/webapp/app/env.server.ts | 10 + .../v3/mollifier/mollifierDrainer.server.ts | 2 + .../src/mollifier/drainer.test.ts | 397 ++++++++++++++++++ .../redis-worker/src/mollifier/drainer.ts | 99 +++-- 6 files changed, 489 insertions(+), 30 deletions(-) create mode 100644 .changeset/mollifier-drain-batch-size.md create mode 100644 .server-changes/mollifier-drain-batch-size.md diff --git a/.changeset/mollifier-drain-batch-size.md b/.changeset/mollifier-drain-batch-size.md new file mode 100644 index 0000000000..9aea66a96c --- /dev/null +++ b/.changeset/mollifier-drain-batch-size.md @@ -0,0 +1,5 @@ +--- +"@trigger.dev/redis-worker": patch +--- + +`MollifierDrainer` now accepts a `drainBatchSize` option that controls how many entries it pops from a single env per tick. Default remains 1 (one pop per env per tick — previous behaviour). 
Setting it higher lets a single-env burst drain at handler-parallelism speed instead of one entry per ~50ms tick: the drainer pops up to `drainBatchSize` from the picked env and dispatches all popped entries through the shared `concurrency`-bounded limiter. Org/env fairness is unchanged — the per-tick env selection is unaffected. diff --git a/.server-changes/mollifier-drain-batch-size.md b/.server-changes/mollifier-drain-batch-size.md new file mode 100644 index 0000000000..3216dde528 --- /dev/null +++ b/.server-changes/mollifier-drain-batch-size.md @@ -0,0 +1,6 @@ +--- +area: webapp +type: improvement +--- + +Wire `TRIGGER_MOLLIFIER_DRAIN_BATCH_SIZE` (default 50) into the drainer so single-env bursts drain at the full `DRAIN_CONCURRENCY` budget instead of one pop per ~50ms tick. For a 20k-trigger burst on one env this cuts drain time from minutes to ~tens of seconds; smaller bursts (e.g. 50 on one env) drop from ~2.5s to ~50–100ms tail. diff --git a/apps/webapp/app/env.server.ts b/apps/webapp/app/env.server.ts index dafd67124b..a3041f4c7e 100644 --- a/apps/webapp/app/env.server.ts +++ b/apps/webapp/app/env.server.ts @@ -1101,6 +1101,16 @@ const EnvironmentSchema = z TRIGGER_MOLLIFIER_DRAIN_MAX_ATTEMPTS: z.coerce.number().int().positive().default(3), TRIGGER_MOLLIFIER_DRAIN_SHUTDOWN_TIMEOUT_MS: z.coerce.number().int().positive().default(30_000), TRIGGER_MOLLIFIER_DRAIN_MAX_ORGS_PER_TICK: z.coerce.number().int().positive().default(500), + // Per-env per-tick pop cap. The drainer rotates one env per org per + // tick; this bounds how many entries it pops from that env before + // dispatching them through the shared `DRAIN_CONCURRENCY`-bounded + // limiter. Default matches `DRAIN_CONCURRENCY` so a single-env burst + // uses the full handler-parallelism budget — for 20k buffered on one + // env this is the difference between ~17m (one-pop-per-tick × ~50ms) + // and ~20s (400 ticks × concurrent engine.trigger). 
Org/env fairness + // is preserved because the per-tick env selection is unchanged; only + // the in-env pop count grows. + TRIGGER_MOLLIFIER_DRAIN_BATCH_SIZE: z.coerce.number().int().positive().default(50), // Periodic sweep that scans buffer queue LISTs for entries whose // dwell exceeds the stale threshold. Independent of the drainer — // its job is exactly to make a stuck/offline drainer visible to diff --git a/apps/webapp/app/v3/mollifier/mollifierDrainer.server.ts b/apps/webapp/app/v3/mollifier/mollifierDrainer.server.ts index 26ac60f180..1b64da3345 100644 --- a/apps/webapp/app/v3/mollifier/mollifierDrainer.server.ts +++ b/apps/webapp/app/v3/mollifier/mollifierDrainer.server.ts @@ -72,6 +72,7 @@ function initializeMollifierDrainer(): MollifierDrainer { logger.debug("Initializing mollifier drainer", { concurrency: env.TRIGGER_MOLLIFIER_DRAIN_CONCURRENCY, maxAttempts: env.TRIGGER_MOLLIFIER_DRAIN_MAX_ATTEMPTS, + drainBatchSize: env.TRIGGER_MOLLIFIER_DRAIN_BATCH_SIZE, }); const drainer = new MollifierDrainer({ @@ -81,6 +82,7 @@ function initializeMollifierDrainer(): MollifierDrainer { concurrency: env.TRIGGER_MOLLIFIER_DRAIN_CONCURRENCY, maxAttempts: env.TRIGGER_MOLLIFIER_DRAIN_MAX_ATTEMPTS, maxOrgsPerTick: env.TRIGGER_MOLLIFIER_DRAIN_MAX_ORGS_PER_TICK, + drainBatchSize: env.TRIGGER_MOLLIFIER_DRAIN_BATCH_SIZE, isRetryable: isRetryablePgError, }); diff --git a/packages/redis-worker/src/mollifier/drainer.test.ts b/packages/redis-worker/src/mollifier/drainer.test.ts index c6832e94c7..a33835033e 100644 --- a/packages/redis-worker/src/mollifier/drainer.test.ts +++ b/packages/redis-worker/src/mollifier/drainer.test.ts @@ -130,6 +130,403 @@ describe("MollifierDrainer.runOnce", () => { }); }); +describe("MollifierDrainer.drainBatchSize", () => { + // Default behaviour (drainBatchSize=1) is exercised by every other + // test in this file — one pop per env per tick. 
These tests pin the + // single-env batched-pop fast path: with drainBatchSize=N, a single + // env with K buffered entries drains in ceil(K / N) ticks instead of + // K ticks, capped by the shared `concurrency` for in-flight handlers. + + it("pops up to drainBatchSize entries from a single env in one tick", async () => { + const queue: string[] = Array.from({ length: 10 }, (_, i) => `run_${i}`); + const handled: string[] = []; + const buffer = makeStubBuffer({ + ...eachEnvAsOwnOrg(["env_a"]), + pop: async (envId: string) => { + if (envId !== "env_a") return null; + const runId = queue.shift(); + if (!runId) return null; + return { + runId, + envId: "env_a", + orgId: "org_1", + payload: "{}", + attempts: 0, + createdAt: new Date(), + } as any; + }, + }); + + const drainer = new MollifierDrainer({ + buffer, + handler: async (input) => { + handled.push(input.runId); + }, + concurrency: 5, + maxAttempts: 3, + isRetryable: () => false, + drainBatchSize: 5, + logger: new Logger("test-drainer", "log"), + }); + + const r1 = await drainer.runOnce(); + expect(r1.drained).toBe(5); + expect(handled).toHaveLength(5); + + const r2 = await drainer.runOnce(); + expect(r2.drained).toBe(5); + expect(handled).toHaveLength(10); + + // Queue now empty — next tick is a no-op. + const r3 = await drainer.runOnce(); + expect(r3.drained).toBe(0); + expect(r3.failed).toBe(0); + }); + + it("respects global concurrency cap when batch dispatch exceeds it", async () => { + // drainBatchSize=10 with concurrency=3 means each tick pops 10 + // entries but only 3 handlers run in parallel; the other 7 sit in + // pLimit's queue. The cap is on in-flight handlers, not on per-tick + // pop count. 
+ const queue: string[] = Array.from({ length: 10 }, (_, i) => `run_${i}`); + let inflight = 0; + let peak = 0; + const buffer = makeStubBuffer({ + ...eachEnvAsOwnOrg(["env_a"]), + pop: async (envId: string) => { + if (envId !== "env_a") return null; + const runId = queue.shift(); + if (!runId) return null; + return { + runId, + envId: "env_a", + orgId: "org_1", + payload: "{}", + attempts: 0, + createdAt: new Date(), + } as any; + }, + }); + + const drainer = new MollifierDrainer({ + buffer, + handler: async () => { + inflight += 1; + if (inflight > peak) peak = inflight; + await new Promise((r) => setTimeout(r, 25)); + inflight -= 1; + }, + concurrency: 3, + maxAttempts: 3, + isRetryable: () => false, + drainBatchSize: 10, + logger: new Logger("test-drainer", "log"), + }); + + const result = await drainer.runOnce(); + expect(result.drained).toBe(10); + expect(peak).toBeGreaterThan(1); // genuinely parallel + expect(peak).toBeLessThanOrEqual(3); // capped + }); + + it("a mid-batch pop failure aborts that env's batch and counts as one failure", async () => { + // Pin: when the third pop on env_bad throws, the drainer stops + // popping from that env for this tick (no infinite retry inside one + // tick), the two entries already popped still get processed, and + // the env contributes exactly one to the failed count. + let envBadPops = 0; + let envGoodPops = 0; + const handled: string[] = []; + const buffer = makeStubBuffer({ + ...eachEnvAsOwnOrg(["env_bad", "env_good"]), + pop: async (envId: string) => { + if (envId === "env_bad") { + envBadPops += 1; + if (envBadPops > 2) { + throw new Error("simulated pop failure mid-batch"); + } + return { + runId: `bad_${envBadPops}`, + envId: "env_bad", + orgId: "org_bad", + payload: "{}", + attempts: 0, + createdAt: new Date(), + } as any; + } + // env_good — one entry then empty. Track via pop-count rather + // than handler-side state so the pop loop's synchronous "is the + // queue empty?" 
check doesn't race against the parallel handler + // dispatch that runs after the whole batch is collected. + envGoodPops += 1; + if (envGoodPops > 1) return null; + return { + runId: "good_1", + envId: "env_good", + orgId: "org_good", + payload: "{}", + attempts: 0, + createdAt: new Date(), + } as any; + }, + }); + + const drainer = new MollifierDrainer({ + buffer, + handler: async (input) => { + handled.push(input.runId); + }, + concurrency: 5, + maxAttempts: 3, + isRetryable: () => false, + drainBatchSize: 5, + logger: new Logger("test-drainer", "log"), + }); + + const result = await drainer.runOnce(); + // env_bad: 2 successful pops processed (drained) + 1 pop failure (failed). + // env_good: 1 successful pop processed (drained). + expect(result.drained).toBe(3); + expect(result.failed).toBe(1); + expect(new Set(handled)).toEqual(new Set(["bad_1", "bad_2", "good_1"])); + // We stopped popping env_bad on the failure — no fourth attempt. + expect(envBadPops).toBe(3); + }); + + it("fans batched pops out across multiple envs in a single tick", async () => { + // Pin: with N envs each holding K entries and drainBatchSize=K, one + // tick pops N×K entries and dispatches them all through the shared + // pLimit. Closes the gap that all the other batch tests cover a + // single env in isolation. 
+ const envCount = 10; + const perEnv = 10; + const queues = new Map<string, string[]>(); + for (let i = 0; i < envCount; i++) { + queues.set( + `env_${i}`, + Array.from({ length: perEnv }, (_, j) => `env_${i}_run_${j}`), + ); + } + const handled: string[] = []; + const buffer = makeStubBuffer({ + ...eachEnvAsOwnOrg([...queues.keys()]), + pop: async (envId: string) => { + const q = queues.get(envId); + if (!q || q.length === 0) return null; + const runId = q.shift()!; + return { + runId, + envId, + orgId: envId, + payload: "{}", + attempts: 0, + createdAt: new Date(), + } as any; + }, + }); + + const drainer = new MollifierDrainer({ + buffer, + handler: async (input) => { + handled.push(input.runId); + }, + concurrency: 20, + maxAttempts: 3, + isRetryable: () => false, + drainBatchSize: perEnv, + logger: new Logger("test-drainer", "log"), + }); + + const r = await drainer.runOnce(); + expect(r.drained).toBe(envCount * perEnv); + expect(handled).toHaveLength(envCount * perEnv); + // Every env contributed exactly perEnv entries. + const perEnvCounts = handled.reduce<Record<string, number>>((acc, runId) => { + const env = runId.replace(/_run_\d+$/, ""); + acc[env] = (acc[env] ?? 0) + 1; + return acc; + }, {}); + for (let i = 0; i < envCount; i++) { + expect(perEnvCounts[`env_${i}`]).toBe(perEnv); + } + }); + + it("preserves org-level fairness with drainBatchSize > 1", async () => { + // Regression guard for the hierarchical rotation property at batch + // > 1: a heavy org with many envs still gets ~1 org-slot per tick, + // not N. The original test at line ~1066 only exercises batchSize=1; + // this re-runs the same shape with batchSize=5 to ensure batching + // doesn't somehow give the noisy tenant more slots.
+ const orgAEnvs = Array.from({ length: 6 }, (_, i) => `env_orgA_${i}`); + const orgBEnv = "env_orgB_only"; + const envOrg = new Map<string, string>(); + for (const e of orgAEnvs) envOrg.set(e, "org_A"); + envOrg.set(orgBEnv, "org_B"); + const queues = new Map<string, Array<{ runId: string; orgId: string }>>(); + for (const e of orgAEnvs) { + queues.set( + e, + Array.from({ length: 100 }, (_, i) => ({ + runId: `${e}_run_${i}`, + orgId: "org_A", + })), + ); + } + queues.set( + orgBEnv, + Array.from({ length: 100 }, (_, i) => ({ + runId: `${orgBEnv}_run_${i}`, + orgId: "org_B", + })), + ); + + const drainedByOrg: Record<string, number> = { org_A: 0, org_B: 0 }; + const buffer = makeStubBuffer({ + listOrgs: async () => { + const orgs = new Set<string>(); + for (const [envId, items] of queues.entries()) { + if (items.length > 0) orgs.add(envOrg.get(envId)!); + } + return [...orgs]; + }, + listEnvsForOrg: async (orgId: string) => { + const envs: string[] = []; + for (const [envId, items] of queues.entries()) { + if (items.length > 0 && envOrg.get(envId) === orgId) envs.push(envId); + } + return envs; + }, + pop: async (envId: string) => { + const q = queues.get(envId); + if (!q || q.length === 0) return null; + const entry = q.shift()!; + return { + runId: entry.runId, + envId, + orgId: entry.orgId, + payload: "{}", + attempts: 0, + createdAt: new Date(), + } as any; + }, + }); + + const drainer = new MollifierDrainer({ + buffer, + handler: async (input) => { + drainedByOrg[input.orgId] = (drainedByOrg[input.orgId] ?? 0) + 1; + }, + concurrency: 10, + maxAttempts: 3, + isRetryable: () => false, + maxOrgsPerTick: 100, + drainBatchSize: 5, + logger: new Logger("test-drainer", "log"), + }); + + for (let i = 0; i < 20; i++) { + await drainer.runOnce(); + } + + expect(drainedByOrg["org_A"]).toBeGreaterThan(0); + expect(drainedByOrg["org_B"]).toBeGreaterThan(0); + const ratio = drainedByOrg["org_A"]! / drainedByOrg["org_B"]!; + // Same fairness window as the batchSize=1 sibling test — batching + // multiplies per-tick throughput uniformly, not asymmetrically.
+ expect(ratio).toBeGreaterThan(0.7); + expect(ratio).toBeLessThan(1.5); + }); + + it("counts mixed handler success and failure within a batched tick correctly", async () => { + // 5 envs, one entry each, drainBatchSize=5. Three handlers succeed, + // two throw non-retryable → drained=3, failed=2. Pins that the batched + // dispatch's drained/failed accounting per entry is preserved when + // multiple outcomes interleave in one tick. + const envs = ["env_ok_1", "env_ok_2", "env_ok_3", "env_fail_1", "env_fail_2"]; + const popsByEnv = new Map(); + const buffer = makeStubBuffer({ + ...eachEnvAsOwnOrg(envs), + pop: async (envId: string) => { + if (!envs.includes(envId)) return null; + // One entry per env then empty. Track via a per-env pop counter + // so the batch loop terminates after the first hit even though + // drainBatchSize=5. + const popped = (popsByEnv.get(envId) ?? 0) + 1; + popsByEnv.set(envId, popped); + if (popped > 1) return null; + return { + runId: `run_${envId}`, + envId, + orgId: envId, + payload: "{}", + attempts: 0, + createdAt: new Date(), + } as any; + }, + }); + + const drainer = new MollifierDrainer({ + buffer, + handler: async (input) => { + if (input.envId.startsWith("env_fail")) { + throw new Error("simulated handler failure"); + } + }, + concurrency: 10, + maxAttempts: 3, + isRetryable: () => false, // non-retryable → terminal on first attempt + drainBatchSize: 5, + logger: new Logger("test-drainer", "log"), + }); + + const r = await drainer.runOnce(); + expect(r.drained).toBe(3); + expect(r.failed).toBe(2); + }); + + it("stops popping early when the env's queue empties before reaching drainBatchSize", async () => { + const queue = ["only_1", "only_2"]; + const handled: string[] = []; + let popCalls = 0; + const buffer = makeStubBuffer({ + ...eachEnvAsOwnOrg(["env_a"]), + pop: async (envId: string) => { + if (envId !== "env_a") return null; + popCalls += 1; + const runId = queue.shift(); + if (!runId) return null; + return { + runId, + 
envId: "env_a", + orgId: "org_1", + payload: "{}", + attempts: 0, + createdAt: new Date(), + } as any; + }, + }); + + const drainer = new MollifierDrainer({ + buffer, + handler: async (input) => { + handled.push(input.runId); + }, + concurrency: 5, + maxAttempts: 3, + isRetryable: () => false, + drainBatchSize: 10, + logger: new Logger("test-drainer", "log"), + }); + + const r = await drainer.runOnce(); + expect(r.drained).toBe(2); + expect(handled).toEqual(["only_1", "only_2"]); + // 2 successful pops + 1 sentinel pop that returned null and ended + // the batch loop — 3 calls, not 10. Bounding stops the Lua spam. + expect(popCalls).toBe(3); + }); +}); + describe("MollifierDrainer error handling", () => { redisTest("retryable error requeues and increments attempts", { timeout: 20_000 }, async ({ redisContainer }) => { const buffer = new MollifierBuffer({ diff --git a/packages/redis-worker/src/mollifier/drainer.ts b/packages/redis-worker/src/mollifier/drainer.ts index 20b5ee3ae1..a2a3737f47 100644 --- a/packages/redis-worker/src/mollifier/drainer.ts +++ b/packages/redis-worker/src/mollifier/drainer.ts @@ -52,6 +52,21 @@ export type MollifierDrainerOptions = { // as an org with 1 env — tenant-level drainage throughput is determined // by org count, not env count. maxOrgsPerTick?: number; + // Per-env per-tick pop cap. Default 1 preserves the original + // one-pop-per-env-per-tick behaviour. Setting it higher lets a single + // env drain at handler-parallelism speed: each tick the drainer pops + // up to `drainBatchSize` entries from the env's queue, then dispatches + // them all through the shared `concurrency`-bounded pLimit. For a + // single-env burst this turns N sequential ticks into one tick of N + // parallel handler calls, capped by `concurrency`. Org/env fairness + // still holds — each org still contributes exactly one env per tick. + // + // Memory: per-tick in-flight entries ≤ `maxOrgsPerTick × drainBatchSize`. 
+ // Operators sizing this should ensure their PG pool / engine handler + // can sustain `concurrency` parallel writes; popping more than the + // handler can process per tick just queues entries in JS waiting on + // pLimit. + drainBatchSize?: number; logger?: Logger; }; @@ -68,6 +83,7 @@ export class MollifierDrainer { private readonly isRetryable: (err: unknown) => boolean; private readonly pollIntervalMs: number; private readonly maxOrgsPerTick: number; + private readonly drainBatchSize: number; private readonly logger: Logger; private readonly limit: ReturnType<typeof pLimit>; // Rotation state. `orgCursor` advances through the active-orgs list. @@ -87,6 +103,7 @@ export class MollifierDrainer { this.isRetryable = options.isRetryable; this.pollIntervalMs = options.pollIntervalMs ?? 100; this.maxOrgsPerTick = options.maxOrgsPerTick ?? 500; + this.drainBatchSize = Math.max(1, options.drainBatchSize ?? 1); this.logger = options.logger ?? new Logger("MollifierDrainer", "debug"); this.limit = pLimit(options.concurrency); } @@ -116,15 +133,63 @@ export class MollifierDrainer { targets.push(envId); } - const inflight: Promise<"drained" | "failed" | "empty">[] = []; - for (const envId of targets) { - inflight.push(this.limit(() => this.processOneFromEnv(envId))); + // Pop a batch from each target env in parallel. Within an env we pop + // sequentially (each Lua `pop` is atomic; back-to-back pops on the + // same env can't be concurrent without a `popBatch` Lua, and Redis + // RTT × drainBatchSize is cheap compared to the engine.trigger work + // that follows). A pop failure mid-batch aborts only that env's + // batch and counts as one failure — same semantics as the previous + // one-pop-per-env path, generalised.
+ const envBatches = await Promise.all( + targets.map(async (envId) => { + const entries: BufferEntry[] = []; + let popFailed = false; + for (let i = 0; i < this.drainBatchSize; i++) { + let entry: BufferEntry | null; + try { + entry = await this.buffer.pop(envId); + } catch (err) { + this.logger.error("MollifierDrainer.pop failed", { envId, err }); + popFailed = true; + break; + } + if (!entry) break; + entries.push(entry); + } + return { entries, popFailed }; + }), + ); + + const popFailures = envBatches.reduce((n, b) => n + (b.popFailed ? 1 : 0), 0); + const allEntries = envBatches.flatMap((b) => b.entries); + if (allEntries.length === 0) { + return { drained: 0, failed: popFailures }; } + // Dispatch every popped entry through the shared pLimit so the + // global in-flight cap is `concurrency` regardless of how many envs + // contributed entries this tick. Per-entry errors are caught inside + // the closure so a single bad entry can't poison the tick — same + // safety net the old `processOneFromEnv` provided. + const inflight = allEntries.map((entry) => + this.limit(async () => { + try { + return await this.processEntry(entry); + } catch (err) { + this.logger.error("MollifierDrainer.processEntry failed", { + envId: entry.envId, + runId: entry.runId, + err, + }); + return "failed" as const; + } + }), + ); + const results = await Promise.all(inflight); return { drained: results.filter((r) => r === "drained").length, - failed: results.filter((r) => r === "failed").length, + failed: results.filter((r) => r === "failed").length + popFailures, }; } @@ -249,32 +314,6 @@ export class MollifierDrainer { return sorted[idx]!; } - // A failure for one env (e.g. a Redis hiccup mid-batch in `pop`, or in - // `requeue`/`fail` during error recovery inside `processEntry`) must not - // poison the rest of the batch — `Promise.all` would otherwise reject and - // bubble all the way to `loop()`. 
Catch both stages here so the failed env - // is just counted as "failed" for this tick and we move on. - private async processOneFromEnv(envId: string): Promise<"drained" | "failed" | "empty"> { - let entry: BufferEntry | null; - try { - entry = await this.buffer.pop(envId); - } catch (err) { - this.logger.error("MollifierDrainer.pop failed", { envId, err }); - return "failed"; - } - if (!entry) return "empty"; - try { - return await this.processEntry(entry); - } catch (err) { - this.logger.error("MollifierDrainer.processEntry failed", { - envId, - runId: entry.runId, - err, - }); - return "failed"; - } - } - private async processEntry(entry: BufferEntry): Promise<"drained" | "failed"> { try { const payload = deserialiseSnapshot(entry.payload); From 9eda05904f323ea0e1f3ca125a52b4a1f29eedca Mon Sep 17 00:00:00 2001 From: Daniel Sutton Date: Mon, 1 Jun 2026 20:25:55 +0100 Subject: [PATCH 04/12] chore: remove stale docs/superpowers scratch Co-Authored-By: Claude Opus 4.7 --- ...05-26-sec-fix-pipeline-phase-1-skeleton.md | 1832 ----------------- ...2026-05-26-security-fix-pipeline-design.md | 561 ----- 2 files changed, 2393 deletions(-) delete mode 100644 docs/superpowers/plans/2026-05-26-sec-fix-pipeline-phase-1-skeleton.md delete mode 100644 docs/superpowers/specs/2026-05-26-security-fix-pipeline-design.md diff --git a/docs/superpowers/plans/2026-05-26-sec-fix-pipeline-phase-1-skeleton.md b/docs/superpowers/plans/2026-05-26-sec-fix-pipeline-phase-1-skeleton.md deleted file mode 100644 index 7090ce5869..0000000000 --- a/docs/superpowers/plans/2026-05-26-sec-fix-pipeline-phase-1-skeleton.md +++ /dev/null @@ -1,1832 +0,0 @@ -# Sec Fix Pipeline — Phase 1: Skeleton Implementation Plan - -> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. 
- -**Goal:** End-to-end orchestration skeleton — worker daemon picks one Linear issue from the deepsec view, runs a stub agent in a docker-compose stack, collects an artifact, uploads it to MinIO, posts a Linear comment, flips the label to `agent:done`, tears down. Validates pipeline plumbing before adding any real agent logic. - -**Architecture:** New standalone repo at `~/Development/sec-fix-pipeline/`. Node + TypeScript worker daemon (no systemd yet — just a CLI you run manually), local Docker for the per-issue stack, local MinIO container for artifacts, Linear SDK for queue. Single-instance, serial, runs one issue then exits (loop comes in a later phase). - -**Tech Stack:** TypeScript, pnpm, Vitest, `@linear/sdk`, `@aws-sdk/client-s3`, `execa` (for shelling to docker compose), `zod` (state file schema), `testcontainers` (MinIO + Linear-mock in tests). - -**Scope:** Phase 1 of 6. Defers the full per-issue stack (Phase 3), real Claude Agent SDK (Phase 2), durability hardening (Phase 4), resumable runs (Phase 5), and the review dashboard (Phase 6). Ships when a real Linear test issue can be processed end-to-end by a manually-run worker against a stub agent. - -**Spec reference:** `docs/superpowers/specs/2026-05-26-security-fix-pipeline-design.md` in the `trigger.dev-mirror-2` repo. 
- -**Pinned base SHA (for later phases, not Phase 1):** `37eeaa36908fb1aad48fc43d04e5b4e8f474f957` - ---- - -## File Structure - -``` -~/Development/sec-fix-pipeline/ -├── package.json -├── pnpm-workspace.yaml # for future packages -├── tsconfig.json -├── tsconfig.base.json -├── vitest.config.ts -├── .gitignore -├── .nvmrc -├── .env.example -├── README.md -├── docker/ -│ ├── agent-stub/ -│ │ ├── Dockerfile -│ │ └── run-agent.mjs # writes /artifacts/hello.txt, exits 0 -│ └── stack.yml # MinIO host service + per-issue agent service -├── src/ -│ ├── config.ts # env loading, zod-validated -│ ├── logger.ts # pino, json output -│ ├── state.ts # state file read/write/transition -│ ├── linear/ -│ │ ├── client.ts # wraps @linear/sdk + the deepsec view -│ │ ├── client.test.ts -│ │ └── labels.ts # label constants + state machine helpers -│ ├── storage/ -│ │ ├── minio.ts # S3 client for local MinIO -│ │ └── minio.test.ts -│ ├── stack/ -│ │ ├── compose.ts # `docker compose` wrapper with timeouts -│ │ └── compose.test.ts -│ ├── worker/ -│ │ ├── process-issue.ts # the per-issue flow -│ │ ├── process-issue.test.ts -│ │ └── main.ts # CLI entry — pick one issue, process, exit -│ └── types.ts -├── test/ -│ └── integration/ -│ └── end-to-end.test.ts # full flow against a mock Linear + real MinIO + stub agent -└── scripts/ - ├── setup-minio.sh # idempotent bucket creation - └── seed-test-issue.ts # creates a Linear test issue labelled agent:queued -``` - -**Responsibilities:** - -- `config.ts` — single place for env vars; fails fast if anything missing. -- `state.ts` — atomic writes to `./state/.json`. Source of truth for phase transitions. -- `linear/` — every Linear interaction. Mock-friendly. View ID hardcoded as a constant. -- `storage/` — MinIO/S3 only. No Linear coupling. -- `stack/` — docker compose only. No state, no Linear. -- `worker/process-issue.ts` — orchestrates: claim → run → collect → upload → notify → finalize. The integration point. 
-- `worker/main.ts` — CLI front-end. Loops come in Phase 4. - ---- - -## Task 1: Bootstrap the new repo - -**Files:** -- Create: `~/Development/sec-fix-pipeline/package.json` -- Create: `~/Development/sec-fix-pipeline/tsconfig.json` -- Create: `~/Development/sec-fix-pipeline/.gitignore` -- Create: `~/Development/sec-fix-pipeline/.nvmrc` -- Create: `~/Development/sec-fix-pipeline/.env.example` -- Create: `~/Development/sec-fix-pipeline/README.md` - -- [ ] **Step 1: Create the repo and initialize git** - -```bash -mkdir -p ~/Development/sec-fix-pipeline -cd ~/Development/sec-fix-pipeline -git init -git checkout -b main -``` - -- [ ] **Step 2: Write `.nvmrc`** - -``` -22.13.0 -``` - -- [ ] **Step 3: Write `package.json`** - -```json -{ - "name": "sec-fix-pipeline", - "private": true, - "version": "0.1.0", - "type": "module", - "packageManager": "pnpm@10.33.2", - "engines": { "node": ">=22.13.0" }, - "scripts": { - "build": "tsc -p tsconfig.json", - "typecheck": "tsc -p tsconfig.json --noEmit", - "test": "vitest run", - "test:watch": "vitest", - "worker": "tsx src/worker/main.ts", - "setup:minio": "bash scripts/setup-minio.sh", - "seed:issue": "tsx scripts/seed-test-issue.ts" - }, - "dependencies": { - "@aws-sdk/client-s3": "3.654.0", - "@aws-sdk/lib-storage": "3.654.0", - "@linear/sdk": "32.0.0", - "execa": "9.5.1", - "pino": "9.5.0", - "tsx": "4.19.2", - "zod": "3.25.76" - }, - "devDependencies": { - "@types/node": "22.10.2", - "testcontainers": "10.16.0", - "typescript": "5.7.2", - "vitest": "2.1.8" - } -} -``` - -- [ ] **Step 4: Write `tsconfig.json`** - -```json -{ - "compilerOptions": { - "target": "ES2022", - "module": "NodeNext", - "moduleResolution": "NodeNext", - "lib": ["ES2022"], - "strict": true, - "noUncheckedIndexedAccess": true, - "esModuleInterop": true, - "skipLibCheck": true, - "resolveJsonModule": true, - "outDir": "dist", - "rootDir": "src", - "declaration": false, - "sourceMap": true, - "forceConsistentCasingInFileNames": true - }, - "include": 
["src/**/*.ts"], - "exclude": ["node_modules", "dist", "**/*.test.ts"] -} -``` - -- [ ] **Step 5: Write `.gitignore`** - -``` -node_modules -dist -state/ -out/ -snapshots/ -logs/ -.env -.env.local -*.log -.DS_Store -``` - -- [ ] **Step 6: Write `.env.example`** - -``` -LINEAR_API_KEY= -LINEAR_DEEPSEC_VIEW_ID=c443c3c869c0 -MINIO_ENDPOINT=http://localhost:9000 -MINIO_ACCESS_KEY=minioadmin -MINIO_SECRET_KEY=minioadmin -MINIO_BUCKET=security-artifacts -WORKER_STATE_DIR=./state -WORKER_OUT_DIR=./out -WORKER_LOGS_DIR=./logs -``` - -- [ ] **Step 7: Write `README.md`** - -```markdown -# sec-fix-pipeline - -Autonomous pipeline for validating and proposing fixes for security findings tracked in the deepsec-findings Linear view. Reads issues, runs an isolated container per issue with Claude Code, produces patch bundles for human review. - -See design spec: `trigger.dev-mirror-2/docs/superpowers/specs/2026-05-26-security-fix-pipeline-design.md`. - -## Phase 1 status - -End-to-end skeleton. Stub agent only — writes `hello.txt`, no real fixing yet. - -## Setup - -1. `pnpm i` -2. `cp .env.example .env` and fill in `LINEAR_API_KEY` -3. `docker compose -f docker/stack.yml --profile services up -d minio` -4. `pnpm setup:minio` -5. `pnpm seed:issue` to create a Linear test issue -6. `pnpm worker` to process it - -## Layout - -See `docs/architecture.md` (to be written in a later phase). -``` - -- [ ] **Step 8: Install dependencies** - -```bash -cd ~/Development/sec-fix-pipeline -pnpm i -``` - -Expected: clean install, no errors. - -- [ ] **Step 9: Verify TypeScript compiles (empty src)** - -```bash -mkdir -p src -echo "export {};" > src/index.ts -pnpm typecheck -``` - -Expected: no output (success). 
- -- [ ] **Step 10: Commit** - -```bash -git add -A -git commit -m "chore: bootstrap repo with TS + pnpm + vitest" -``` - ---- - -## Task 2: Config module - -**Files:** -- Create: `~/Development/sec-fix-pipeline/src/config.ts` -- Create: `~/Development/sec-fix-pipeline/src/config.test.ts` - -- [ ] **Step 1: Write the failing test** - -`src/config.test.ts`: - -```ts -import { describe, it, expect } from "vitest"; -import { loadConfig } from "./config.js"; - -describe("loadConfig", () => { - it("returns a parsed config when all required vars are set", () => { - const cfg = loadConfig({ - LINEAR_API_KEY: "lin_api_test", - LINEAR_DEEPSEC_VIEW_ID: "c443c3c869c0", - MINIO_ENDPOINT: "http://localhost:9000", - MINIO_ACCESS_KEY: "x", - MINIO_SECRET_KEY: "y", - MINIO_BUCKET: "security-artifacts", - WORKER_STATE_DIR: "./state", - WORKER_OUT_DIR: "./out", - WORKER_LOGS_DIR: "./logs", - }); - expect(cfg.linear.apiKey).toBe("lin_api_test"); - expect(cfg.minio.bucket).toBe("security-artifacts"); - }); - - it("throws when LINEAR_API_KEY is missing", () => { - expect(() => loadConfig({})).toThrow(/LINEAR_API_KEY/); - }); -}); -``` - -- [ ] **Step 2: Run test, verify it fails** - -```bash -pnpm test src/config.test.ts -``` - -Expected: FAIL — module `./config.js` not found. 
- -- [ ] **Step 3: Write `config.ts`** - -```ts -import { z } from "zod"; - -const Schema = z.object({ - LINEAR_API_KEY: z.string().min(1), - LINEAR_DEEPSEC_VIEW_ID: z.string().min(1), - MINIO_ENDPOINT: z.string().url(), - MINIO_ACCESS_KEY: z.string().min(1), - MINIO_SECRET_KEY: z.string().min(1), - MINIO_BUCKET: z.string().min(1), - WORKER_STATE_DIR: z.string().min(1), - WORKER_OUT_DIR: z.string().min(1), - WORKER_LOGS_DIR: z.string().min(1), -}); - -export type Config = { - linear: { apiKey: string; viewId: string }; - minio: { - endpoint: string; - accessKey: string; - secretKey: string; - bucket: string; - }; - dirs: { state: string; out: string; logs: string }; -}; - -export function loadConfig(env: NodeJS.ProcessEnv | Record<string, string | undefined> = process.env): Config { - const parsed = Schema.parse(env); - return { - linear: { apiKey: parsed.LINEAR_API_KEY, viewId: parsed.LINEAR_DEEPSEC_VIEW_ID }, - minio: { - endpoint: parsed.MINIO_ENDPOINT, - accessKey: parsed.MINIO_ACCESS_KEY, - secretKey: parsed.MINIO_SECRET_KEY, - bucket: parsed.MINIO_BUCKET, - }, - dirs: { - state: parsed.WORKER_STATE_DIR, - out: parsed.WORKER_OUT_DIR, - logs: parsed.WORKER_LOGS_DIR, - }, - }; -} -``` - -- [ ] **Step 4: Run test, verify it passes** - -```bash -pnpm test src/config.test.ts -``` - -Expected: 2 tests pass. - -- [ ] **Step 5: Commit** - -```bash -git add src/config.ts src/config.test.ts -git commit -m "feat: config module with zod-validated env loading" -``` - ---- - -## Task 3: Logger module - -**Files:** -- Create: `~/Development/sec-fix-pipeline/src/logger.ts` - -- [ ] **Step 1: Write `logger.ts`** - -```ts -import pino from "pino"; - -export const logger = pino({ - level: process.env.LOG_LEVEL ?? 
"info", - base: { service: "sec-fix-worker" }, - timestamp: pino.stdTimeFunctions.isoTime, -}); - -export type Logger = typeof logger; -``` - -- [ ] **Step 2: Smoke test** - -```bash -node --experimental-strip-types -e "import('./src/logger.ts').then(m => m.logger.info({ hello: 'world' }, 'test'))" -``` - -Expected: a single JSON line on stdout containing `"hello":"world"`. - -- [ ] **Step 3: Commit** - -```bash -git add src/logger.ts -git commit -m "feat: pino logger module" -``` - ---- - -## Task 4: State file module - -State files live at `${WORKER_STATE_DIR}/.json`. Phase 1 only uses two phases (`claimed`, `finalized`) — the full state machine comes in Phase 4. We still write the file atomically (write to tmp, fsync, rename) because that pattern is load-bearing for Phase 4. - -**Files:** -- Create: `~/Development/sec-fix-pipeline/src/state.ts` -- Create: `~/Development/sec-fix-pipeline/src/state.test.ts` - -- [ ] **Step 1: Write the failing test** - -`src/state.test.ts`: - -```ts -import { describe, it, expect, beforeEach } from "vitest"; -import { mkdtemp, rm, readFile } from "node:fs/promises"; -import { tmpdir } from "node:os"; -import { join } from "node:path"; -import { readState, writeState, IssueState } from "./state.js"; - -describe("state file", () => { - let dir: string; - beforeEach(async () => { - dir = await mkdtemp(join(tmpdir(), "sec-state-")); - return async () => rm(dir, { recursive: true, force: true }); - }); - - it("returns null for a missing issue", async () => { - expect(await readState(dir, "LIN-999")).toBeNull(); - }); - - it("round-trips a state object", async () => { - const state: IssueState = { - issueIdentifier: "LIN-1", - phase: "claimed", - project: "sec-lin-1", - startedAt: "2026-05-26T10:00:00Z", - outcome: null, - }; - await writeState(dir, state); - const read = await readState(dir, "LIN-1"); - expect(read).toEqual(state); - }); - - it("writes atomically via a tmp+rename", async () => { - const state: IssueState = { - 
issueIdentifier: "LIN-2", - phase: "claimed", - project: "sec-lin-2", - startedAt: "2026-05-26T10:00:00Z", - outcome: null, - }; - await writeState(dir, state); - const raw = await readFile(join(dir, "LIN-2.json"), "utf8"); - expect(JSON.parse(raw)).toEqual(state); - }); -}); -``` - -- [ ] **Step 2: Run test, verify it fails** - -```bash -pnpm test src/state.test.ts -``` - -Expected: FAIL — module not found. - -- [ ] **Step 3: Write `state.ts`** - -```ts -import { mkdir, readFile, writeFile, rename } from "node:fs/promises"; -import { join } from "node:path"; -import { z } from "zod"; - -const PhaseSchema = z.enum(["claimed", "running", "uploaded", "finalized"]); -const OutcomeSchema = z.enum(["done", "failed", "false-positive", "runaway"]).nullable(); - -export const IssueStateSchema = z.object({ - issueIdentifier: z.string().min(1), - phase: PhaseSchema, - project: z.string().min(1), - startedAt: z.string().min(1), - outcome: OutcomeSchema, -}); -export type IssueState = z.infer<typeof IssueStateSchema>; - -function pathFor(dir: string, issueIdentifier: string): string { - return join(dir, `${issueIdentifier}.json`); -} - -export async function readState(dir: string, issueIdentifier: string): Promise<IssueState | null> { - try { - const raw = await readFile(pathFor(dir, issueIdentifier), "utf8"); - return IssueStateSchema.parse(JSON.parse(raw)); - } catch (err: any) { - if (err?.code === "ENOENT") return null; - throw err; - } -} - -export async function writeState(dir: string, state: IssueState): Promise<void> { - await mkdir(dir, { recursive: true }); - const final = pathFor(dir, state.issueIdentifier); - const tmp = `${final}.tmp-${process.pid}-${Date.now()}`; - await writeFile(tmp, JSON.stringify(state, null, 2), { flag: "w" }); - await rename(tmp, final); -} -``` - -- [ ] **Step 4: Run test, verify it passes** - -```bash -pnpm test src/state.test.ts -``` - -Expected: 3 tests pass. 
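The Task 4 introduction describes the write as tmp, fsync, rename, while the `writeState` listing goes straight from `writeFile` to `rename`. A minimal sketch of the fsync variant follows, assuming Node's `node:fs/promises` `FileHandle` API. `writeFileAtomic` and `demo` are hypothetical names used only for illustration; they are not part of the plan.

```typescript
import { mkdir, mkdtemp, open, readFile, rename } from "node:fs/promises";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical helper: write `data` to `finalPath` so that readers only ever
// see the old file or the complete new one, never a partial write.
async function writeFileAtomic(finalPath: string, data: string): Promise<void> {
  await mkdir(dirname(finalPath), { recursive: true });
  const tmp = `${finalPath}.tmp-${process.pid}-${Date.now()}`;
  const handle = await open(tmp, "w");
  try {
    await handle.writeFile(data, "utf8");
    // Flush the file contents to disk before the rename publishes them.
    await handle.sync();
  } finally {
    await handle.close();
  }
  // The temp file sits next to the target, so the rename stays on one filesystem.
  await rename(tmp, finalPath);
}

// Usage sketch: write a state-like object into a temp dir and read it back.
async function demo(): Promise<{ issueIdentifier: string; phase: string }> {
  const dir = await mkdtemp(join(tmpdir(), "sec-state-"));
  const file = join(dir, "LIN-1.json");
  await writeFileAtomic(file, JSON.stringify({ issueIdentifier: "LIN-1", phase: "claimed" }));
  return JSON.parse(await readFile(file, "utf8"));
}
```

This sketch does not fsync the parent directory, which stricter durability setups sometimes add; whether Phase 4 needs that is left open here.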
- -- [ ] **Step 5: Commit** - -```bash -git add src/state.ts src/state.test.ts -git commit -m "feat: state file module with atomic writes" -``` - ---- - -## Task 5: Linear labels module - -**Files:** -- Create: `~/Development/sec-fix-pipeline/src/linear/labels.ts` - -- [ ] **Step 1: Write `labels.ts`** - -```ts -export const LABELS = { - queued: "agent:queued", - inProgress: "agent:in-progress", - done: "agent:done", - falsePositive: "agent:false-positive", - runaway: "agent:runaway", - resume: "agent:resume", - failed: "agent:failed", - reviewedAccept: "agent:reviewed-accept", - reviewedReject: "agent:reviewed-reject", -} as const; - -export type AgentLabel = typeof LABELS[keyof typeof LABELS]; - -export const TERMINAL_LABELS: readonly AgentLabel[] = [ - LABELS.done, - LABELS.falsePositive, - LABELS.failed, - LABELS.runaway, -] as const; - -export const AGENT_OWNED_LABELS: readonly AgentLabel[] = Object.values(LABELS) as readonly AgentLabel[]; -``` - -- [ ] **Step 2: Commit** - -```bash -git add src/linear/labels.ts -git commit -m "feat: agent label constants" -``` - ---- - -## Task 6: Linear client wrapper - -**Files:** -- Create: `~/Development/sec-fix-pipeline/src/linear/client.ts` -- Create: `~/Development/sec-fix-pipeline/src/linear/client.test.ts` - -For Phase 1 we only need: `findNextQueuedIssue()`, `swapLabel(issueId, fromLabel, toLabel)`, `addComment(issueId, body)`. The `nextFromDeepsecView` filter is implemented as: fetch the view's issues filtered by label `agent:queued`. We'll stub the SDK in tests with a hand-rolled fake; we are not running against real Linear in unit tests. 
- -- [ ] **Step 1: Write the failing test** - -`src/linear/client.test.ts`: - -```ts -import { describe, it, expect, vi } from "vitest"; -import { createLinearClient, LinearGateway } from "./client.js"; -import { LABELS } from "./labels.js"; - -function fakeGateway(overrides: Partial<LinearGateway> = {}): LinearGateway { - return { - listIssuesInViewWithLabel: vi.fn().mockResolvedValue([]), - listLabelIdsForIssue: vi.fn().mockResolvedValue([]), - findLabelIdByName: vi.fn().mockResolvedValue("label-id-fake"), - updateIssueLabels: vi.fn().mockResolvedValue(undefined), - createComment: vi.fn().mockResolvedValue(undefined), - ...overrides, - }; -} - -describe("LinearClient", () => { - it("findNextQueuedIssue returns the first issue in the view labelled queued", async () => { - const gateway = fakeGateway({ - listIssuesInViewWithLabel: vi.fn().mockResolvedValue([ - { id: "i1", identifier: "LIN-1", title: "fix sql injection in users", body: "details", labelIds: ["label-id-fake"] }, - { id: "i2", identifier: "LIN-2", title: "fix xss", body: "details", labelIds: ["label-id-fake"] }, - ]), - }); - const client = createLinearClient({ viewId: "view-x" }, gateway); - const next = await client.findNextQueuedIssue(); - expect(next?.identifier).toBe("LIN-1"); - expect(gateway.listIssuesInViewWithLabel).toHaveBeenCalledWith("view-x", LABELS.queued); - }); - - it("swapLabel removes the from label and adds the to label", async () => { - const update = vi.fn().mockResolvedValue(undefined); - const gateway = fakeGateway({ - listLabelIdsForIssue: vi.fn().mockResolvedValue(["from-id", "other-id"]), - findLabelIdByName: vi.fn().mockImplementation((name: string) => - Promise.resolve(name === LABELS.queued ? 
"from-id" : "to-id"), - ), - updateIssueLabels: update, - }); - const client = createLinearClient({ viewId: "view-x" }, gateway); - await client.swapLabel("issue-1", LABELS.queued, LABELS.inProgress); - expect(update).toHaveBeenCalledWith("issue-1", ["other-id", "to-id"]); - }); - - it("addComment delegates to gateway", async () => { - const create = vi.fn().mockResolvedValue(undefined); - const gateway = fakeGateway({ createComment: create }); - const client = createLinearClient({ viewId: "view-x" }, gateway); - await client.addComment("issue-1", "hello"); - expect(create).toHaveBeenCalledWith("issue-1", "hello"); - }); -}); -``` - -- [ ] **Step 2: Run test, verify it fails** - -```bash -pnpm test src/linear/client.test.ts -``` - -Expected: FAIL — module not found. - -- [ ] **Step 3: Write `client.ts`** - -```ts -import { LinearClient as SDKClient } from "@linear/sdk"; -import { LABELS, type AgentLabel } from "./labels.js"; - -export type LinearIssue = { - id: string; - identifier: string; - title: string; - body: string; - labelIds: string[]; -}; - -export type LinearGateway = { - listIssuesInViewWithLabel(viewId: string, label: AgentLabel): Promise<LinearIssue[]>; - listLabelIdsForIssue(issueId: string): Promise<string[]>; - findLabelIdByName(name: AgentLabel): Promise<string>; - updateIssueLabels(issueId: string, labelIds: string[]): Promise<void>; - createComment(issueId: string, body: string): Promise<void>; -}; - -export type LinearClient = { - findNextQueuedIssue(): Promise<LinearIssue | null>; - swapLabel(issueId: string, from: AgentLabel, to: AgentLabel): Promise<void>; - addComment(issueId: string, body: string): Promise<void>; -}; - -export function createLinearClient(opts: { viewId: string }, gateway: LinearGateway): LinearClient { - return { - async findNextQueuedIssue() { - const issues = await gateway.listIssuesInViewWithLabel(opts.viewId, LABELS.queued); - return issues[0] ?? 
null; - }, - async swapLabel(issueId, from, to) { - const [current, fromId, toId] = await Promise.all([ - gateway.listLabelIdsForIssue(issueId), - gateway.findLabelIdByName(from), - gateway.findLabelIdByName(to), - ]); - const next = Array.from(new Set([...current.filter((id) => id !== fromId), toId])); - await gateway.updateIssueLabels(issueId, next); - }, - async addComment(issueId, body) { - await gateway.createComment(issueId, body); - }, - }; -} - -export function createRealGateway(apiKey: string): LinearGateway { - const sdk = new SDKClient({ apiKey }); - return { - async listIssuesInViewWithLabel(viewId, label) { - const view = await sdk.customView(viewId); - const issuesConnection = await view.issues({ - filter: { labels: { name: { eq: label } } }, - orderBy: undefined, - } as any); - return issuesConnection.nodes.map((n) => ({ - id: n.id, - identifier: n.identifier, - title: n.title, - body: n.description ?? "", - labelIds: (n as any)._labelIds ?? [], - })); - }, - async listLabelIdsForIssue(issueId) { - const issue = await sdk.issue(issueId); - const labels = await issue.labels(); - return labels.nodes.map((l) => l.id); - }, - async findLabelIdByName(name) { - const labels = await sdk.issueLabels({ filter: { name: { eq: name } } }); - const found = labels.nodes[0]; - if (!found) throw new Error(`Linear label not found: ${name}`); - return found.id; - }, - async updateIssueLabels(issueId, labelIds) { - await sdk.updateIssue(issueId, { labelIds }); - }, - async createComment(issueId, body) { - await sdk.createComment({ issueId, body }); - }, - }; -} -``` - -- [ ] **Step 4: Run test, verify it passes** - -```bash -pnpm test src/linear/client.test.ts -``` - -Expected: 3 tests pass. 
- -- [ ] **Step 5: Commit** - -```bash -git add src/linear/client.ts src/linear/client.test.ts -git commit -m "feat: linear client with gateway abstraction" -``` - ---- - -## Task 7: MinIO storage module - -**Files:** -- Create: `~/Development/sec-fix-pipeline/src/storage/minio.ts` -- Create: `~/Development/sec-fix-pipeline/src/storage/minio.test.ts` - -We use the AWS SDK against MinIO's S3-compatible endpoint. The test uses testcontainers to spin a real MinIO — no mocking S3 itself. - -- [ ] **Step 1: Write the failing test** - -`src/storage/minio.test.ts`: - -```ts -import { describe, it, expect, beforeAll, afterAll } from "vitest"; -import { GenericContainer, StartedTestContainer } from "testcontainers"; -import { mkdtemp, writeFile, mkdir, rm } from "node:fs/promises"; -import { tmpdir } from "node:os"; -import { join } from "node:path"; -import { createStorage } from "./minio.js"; - -let minio: StartedTestContainer; -let endpoint: string; - -beforeAll(async () => { - minio = await new GenericContainer("minio/minio:RELEASE.2024-10-29T16-01-48Z") - .withCommand(["server", "/data"]) - .withEnvironment({ MINIO_ROOT_USER: "test", MINIO_ROOT_PASSWORD: "testtest" }) - .withExposedPorts(9000) - .start(); - endpoint = `http://${minio.getHost()}:${minio.getMappedPort(9000)}`; -}, 60_000); - -afterAll(async () => { - await minio.stop(); -}); - -describe("storage", () => { - it("uploads a directory and lists its keys", async () => { - const dir = await mkdtemp(join(tmpdir(), "upload-")); - try { - await mkdir(join(dir, "sub"), { recursive: true }); - await writeFile(join(dir, "a.txt"), "alpha"); - await writeFile(join(dir, "sub/b.txt"), "beta"); - - const storage = createStorage({ - endpoint, - accessKey: "test", - secretKey: "testtest", - bucket: "test-bucket", - }); - await storage.ensureBucket(); - await storage.uploadDirectory(dir, "LIN-1/"); - - const keys = await storage.list("LIN-1/"); - expect(keys.sort()).toEqual(["LIN-1/a.txt", "LIN-1/sub/b.txt"]); - } finally { 
- await rm(dir, { recursive: true, force: true }); - } - }, 60_000); -}); -``` - -- [ ] **Step 2: Run test, verify it fails** - -```bash -pnpm test src/storage/minio.test.ts -``` - -Expected: FAIL — module not found. - -- [ ] **Step 3: Write `minio.ts`** - -```ts -import { S3Client, CreateBucketCommand, HeadBucketCommand, ListObjectsV2Command, PutObjectCommand } from "@aws-sdk/client-s3"; -import { readFile, readdir, stat } from "node:fs/promises"; -import { join, relative, posix } from "node:path"; - -export type Storage = { - ensureBucket(): Promise<void>; - uploadDirectory(localDir: string, keyPrefix: string): Promise<void>; - list(keyPrefix: string): Promise<string[]>; -}; - -export function createStorage(opts: { - endpoint: string; - accessKey: string; - secretKey: string; - bucket: string; -}): Storage { - const s3 = new S3Client({ - endpoint: opts.endpoint, - region: "us-east-1", - credentials: { accessKeyId: opts.accessKey, secretAccessKey: opts.secretKey }, - forcePathStyle: true, - }); - - return { - async ensureBucket() { - try { - await s3.send(new HeadBucketCommand({ Bucket: opts.bucket })); - } catch { - await s3.send(new CreateBucketCommand({ Bucket: opts.bucket })); - } - }, - - async uploadDirectory(localDir, keyPrefix) { - const files = await walk(localDir); - for (const abs of files) { - const rel = relative(localDir, abs).split(/[\\/]/).join("/"); - const key = posix.join(keyPrefix.replace(/\/+$/, ""), rel); - const body = await readFile(abs); - await s3.send(new PutObjectCommand({ Bucket: opts.bucket, Key: key, Body: body })); - } - }, - - async list(keyPrefix) { - const out: string[] = []; - let token: string | undefined; - do { - const resp = await s3.send( - new ListObjectsV2Command({ Bucket: opts.bucket, Prefix: keyPrefix, ContinuationToken: token }), - ); - for (const obj of resp.Contents ?? []) if (obj.Key) out.push(obj.Key); - token = resp.IsTruncated ? 
resp.NextContinuationToken : undefined; - } while (token); - return out; - }, - }; -} - -async function walk(dir: string): Promise<string[]> { - const entries = await readdir(dir, { withFileTypes: true }); - const out: string[] = []; - for (const e of entries) { - const full = join(dir, e.name); - if (e.isDirectory()) out.push(...(await walk(full))); - else if (e.isFile()) out.push(full); - } - return out; -} -``` - -- [ ] **Step 4: Run test, verify it passes** - -```bash -pnpm test src/storage/minio.test.ts -``` - -Expected: 1 test passes (testcontainer pull may take a minute first time). - -- [ ] **Step 5: Commit** - -```bash -git add src/storage/minio.ts src/storage/minio.test.ts -git commit -m "feat: minio storage module with directory upload" -``` - ---- - -## Task 8: Compose wrapper - -**Files:** -- Create: `~/Development/sec-fix-pipeline/src/stack/compose.ts` -- Create: `~/Development/sec-fix-pipeline/src/stack/compose.test.ts` - -The compose module wraps three operations: `up --wait`, `wait `, `down -v`. Tests use a tiny fixture `compose-fixture.yml` with a single alpine service that sleeps then exits — no real stack needed. 
- -- [ ] **Step 1: Write a compose fixture for the test** - -Create `src/stack/__fixtures__/compose-fixture.yml`: - -```yaml -services: - agent: - image: alpine:3.20 - command: ["sh", "-c", "echo started; sleep 2; echo done; exit 0"] -``` - -- [ ] **Step 2: Write the failing test** - -`src/stack/compose.test.ts`: - -```ts -import { describe, it, expect } from "vitest"; -import { join } from "node:path"; -import { createCompose } from "./compose.js"; - -const fixture = join(__dirname, "__fixtures__/compose-fixture.yml"); - -describe("compose", () => { - it("up → wait → down cycles cleanly and reports exit code", async () => { - const compose = createCompose({ - file: fixture, - project: `sec-fix-test-${Date.now()}`, - }); - try { - await compose.up({ timeoutMs: 30_000 }); - const exitCode = await compose.waitForService("agent", { timeoutMs: 30_000 }); - expect(exitCode).toBe(0); - } finally { - await compose.down({ timeoutMs: 30_000 }); - } - }, 90_000); -}); -``` - -- [ ] **Step 3: Run test, verify it fails** - -```bash -pnpm test src/stack/compose.test.ts -``` - -Expected: FAIL — module not found. 
- -- [ ] **Step 4: Write `compose.ts`** - -```ts -import { execa } from "execa"; - -export type Compose = { - up(opts: { timeoutMs: number }): Promise<void>; - waitForService(service: string, opts: { timeoutMs: number }): Promise<number>; - down(opts: { timeoutMs: number }): Promise<void>; -}; - -export function createCompose(opts: { file: string; project: string }): Compose { - const base = ["compose", "-p", opts.project, "-f", opts.file]; - - return { - async up({ timeoutMs }) { - await execa("docker", [...base, "up", "-d", "--wait"], { timeout: timeoutMs }); - }, - - async waitForService(service, { timeoutMs }) { - const result = await execa("docker", [...base, "wait", service], { timeout: timeoutMs }); - const code = parseInt(result.stdout.trim(), 10); - if (Number.isNaN(code)) { - throw new Error(`docker compose wait returned non-numeric: ${result.stdout}`); - } - return code; - }, - - async down({ timeoutMs }) { - await execa("docker", [...base, "down", "-v"], { timeout: timeoutMs }); - }, - }; -} -``` - -- [ ] **Step 5: Run test, verify it passes** - -```bash -pnpm test src/stack/compose.test.ts -``` - -Expected: 1 test passes. Requires Docker running locally. - -- [ ] **Step 6: Commit** - -```bash -git add src/stack/compose.ts src/stack/compose.test.ts src/stack/__fixtures__/compose-fixture.yml -git commit -m "feat: docker compose wrapper (up/wait/down)" -``` - ---- - -## Task 9: Stub agent container - -**Files:** -- Create: `~/Development/sec-fix-pipeline/docker/agent-stub/Dockerfile` -- Create: `~/Development/sec-fix-pipeline/docker/agent-stub/run-agent.mjs` - -The Phase 1 agent does nothing real — writes a hello file with the issue identifier into `/artifacts/`, then exits 0. The container interface is the contract Phase 2 will fill out. 
- -- [ ] **Step 1: Write `run-agent.mjs`** - -```js -import { writeFile, mkdir } from "node:fs/promises"; - -const issueId = process.env.LINEAR_ISSUE_ID; -if (!issueId) { - console.error("LINEAR_ISSUE_ID is required"); - process.exit(2); -} - -await mkdir("/artifacts", { recursive: true }); -await writeFile( - "/artifacts/hello.txt", - `Hello from sec-fix-pipeline phase 1 stub agent.\nIssue: ${issueId}\nTimestamp: ${new Date().toISOString()}\n`, -); -await writeFile( - "/artifacts/final-summary.md", - `# Stub agent run\n\nIssue: ${issueId}\n\nThis is a Phase 1 skeleton. No real validation or fix was performed.\n`, -); -await writeFile( - "/artifacts/status.json", - JSON.stringify({ issueId, endedAt: new Date().toISOString(), stub: true }, null, 2), -); - -console.log(`stub agent done for ${issueId}`); -process.exit(0); -``` - -- [ ] **Step 2: Write `Dockerfile`** - -```dockerfile -FROM node:22.13.0-alpine -WORKDIR /app -COPY run-agent.mjs ./run-agent.mjs -ENTRYPOINT ["node", "/app/run-agent.mjs"] -``` - -- [ ] **Step 3: Build the image** - -```bash -cd ~/Development/sec-fix-pipeline -docker build -t sec-fix/agent-stub:latest docker/agent-stub -``` - -Expected: build succeeds. - -- [ ] **Step 4: Smoke-run the image** - -```bash -docker run --rm -e LINEAR_ISSUE_ID=LIN-TEST -v $PWD/tmp-artifacts:/artifacts sec-fix/agent-stub:latest -ls tmp-artifacts/ -rm -rf tmp-artifacts/ -``` - -Expected: prints "stub agent done for LIN-TEST"; directory contains `hello.txt`, `final-summary.md`, `status.json`. - -- [ ] **Step 5: Commit** - -```bash -git add docker/agent-stub/ -git commit -m "feat: stub agent container that writes hello artifact" -``` - ---- - -## Task 10: Per-issue stack template - -For Phase 1 the stack contains only the agent service. Later phases add postgres, redis, clickhouse, webapp, electric, and the shared `repo` volume. The MinIO host service is run separately (long-lived, shared across all issues). 
- -**Files:** -- Create: `~/Development/sec-fix-pipeline/docker/stack.yml` - -- [ ] **Step 1: Write `stack.yml`** - -```yaml -# Per-issue compose template. Instantiated with -p sec-. -# Phase 1: agent service only. Phase 3 will add the full trigger.dev stack. - -name: sec-fix-issue - -services: - agent: - image: sec-fix/agent-stub:latest - environment: - LINEAR_ISSUE_ID: "${LINEAR_ISSUE_ID:?LINEAR_ISSUE_ID must be set}" - volumes: - - artifacts:/artifacts - -volumes: - artifacts: -``` - -- [ ] **Step 2: Write the host services compose file** - -Create `docker/host-services.yml`: - -```yaml -# Long-lived host services. Brought up once with: -# docker compose -f docker/host-services.yml up -d - -name: sec-fix-host - -services: - minio: - image: minio/minio:RELEASE.2024-10-29T16-01-48Z - command: ["server", "/data", "--console-address", ":9001"] - environment: - MINIO_ROOT_USER: "${MINIO_ACCESS_KEY:-minioadmin}" - MINIO_ROOT_PASSWORD: "${MINIO_SECRET_KEY:-minioadmin}" - ports: - - "9000:9000" - - "9001:9001" - volumes: - - minio-data:/data - restart: unless-stopped - -volumes: - minio-data: -``` - -- [ ] **Step 3: Start host services** - -```bash -docker compose -f docker/host-services.yml up -d -docker ps | grep minio -``` - -Expected: MinIO container running on port 9000. 
- -- [ ] **Step 4: Commit** - -```bash -git add docker/stack.yml docker/host-services.yml -git commit -m "feat: per-issue stack template and host MinIO service" -``` - ---- - -## Task 11: MinIO bucket setup script - -**Files:** -- Create: `~/Development/sec-fix-pipeline/scripts/setup-minio.sh` - -- [ ] **Step 1: Write `setup-minio.sh`** - -```bash -#!/usr/bin/env bash -set -euo pipefail - -# Loads env from .env if present -if [ -f .env ]; then - set -a; source .env; set +a -fi - -: "${MINIO_ENDPOINT:?must be set}" -: "${MINIO_ACCESS_KEY:?must be set}" -: "${MINIO_SECRET_KEY:?must be set}" -: "${MINIO_BUCKET:?must be set}" - -docker run --rm --network host \ - -e MC_HOST_local="http://${MINIO_ACCESS_KEY}:${MINIO_SECRET_KEY}@${MINIO_ENDPOINT#http://}" \ - minio/mc:RELEASE.2024-10-29T15-34-59Z \ - mb -p "local/${MINIO_BUCKET}" - -echo "Bucket ready: ${MINIO_BUCKET}" -``` - -- [ ] **Step 2: Make executable and run** - -```bash -chmod +x scripts/setup-minio.sh -pnpm setup:minio -``` - -Expected: prints "Bucket ready: security-artifacts" (or reports the bucket already exists). - -- [ ] **Step 3: Commit** - -```bash -git add scripts/setup-minio.sh -git commit -m "feat: minio bucket setup script" -``` - ---- - -## Task 12: process-issue orchestration - -This is the integration point. It claims the issue, runs the stack, collects `/artifacts/` from the compose volume, uploads to MinIO, posts a Linear comment, and finalizes the label. 
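The flow described for Task 12 can be sketched as below. This is an illustration with hypothetical, simplified dependency types (`FlowDeps`, `runFlow`, `recordingDeps`), not the plan's `processIssue`. The one design point it tries to show is an assumption: the stack is torn down in `finally`, so a failed agent run does not leave containers behind.

```typescript
// Hypothetical, simplified dependencies; the plan's real interfaces are richer.
type FlowDeps = {
  claim(): Promise<void>;
  up(): Promise<void>;
  waitForAgent(): Promise<number>; // resolves to the agent's exit code
  collect(): Promise<void>;
  upload(): Promise<void>;
  comment(body: string): Promise<void>;
  finalize(outcome: "done" | "failed"): Promise<void>;
  down(): Promise<void>;
};

async function runFlow(deps: FlowDeps): Promise<"done" | "failed"> {
  await deps.claim();
  let outcome: "done" | "failed" = "failed";
  try {
    await deps.up();
    const exitCode = await deps.waitForAgent();
    if (exitCode === 0) {
      await deps.collect();
      await deps.upload();
      await deps.comment("Artifacts uploaded.");
      outcome = "done";
    }
  } finally {
    // Always tear the stack down, even if a step above threw.
    await deps.down();
  }
  await deps.finalize(outcome);
  return outcome;
}

// Usage sketch: in-memory fakes that record the order of calls.
function recordingDeps(exitCode: number, calls: string[]): FlowDeps {
  const step = (name: string) => async () => {
    calls.push(name);
  };
  return {
    claim: step("claim"),
    up: step("up"),
    waitForAgent: async () => {
      calls.push("wait");
      return exitCode;
    },
    collect: step("collect"),
    upload: step("upload"),
    comment: async () => {
      calls.push("comment");
    },
    finalize: async (outcome) => {
      calls.push(`finalize:${outcome}`);
    },
    down: step("down"),
  };
}
```

If a step throws, this sketch still runs `down` but then propagates the error without calling `finalize`; how a real worker should label the issue in that case is a decision the sketch leaves open.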
- -**Files:** -- Create: `~/Development/sec-fix-pipeline/src/worker/process-issue.ts` -- Create: `~/Development/sec-fix-pipeline/src/worker/process-issue.test.ts` - -- [ ] **Step 1: Write the failing test** - -`src/worker/process-issue.test.ts`: - -```ts -import { describe, it, expect, vi } from "vitest"; -import { mkdtemp, rm, mkdir, writeFile, readFile } from "node:fs/promises"; -import { tmpdir } from "node:os"; -import { join } from "node:path"; -import { processIssue, ProcessIssueDeps } from "./process-issue.js"; -import { LABELS } from "../linear/labels.js"; - -function deps(overrides: Partial<ProcessIssueDeps> = {}): ProcessIssueDeps { - return { - linear: { - findNextQueuedIssue: vi.fn(), - swapLabel: vi.fn().mockResolvedValue(undefined), - addComment: vi.fn().mockResolvedValue(undefined), - }, - compose: { - up: vi.fn().mockResolvedValue(undefined), - waitForService: vi.fn().mockResolvedValue(0), - down: vi.fn().mockResolvedValue(undefined), - }, - storage: { - ensureBucket: vi.fn().mockResolvedValue(undefined), - uploadDirectory: vi.fn().mockResolvedValue(undefined), - list: vi.fn().mockResolvedValue([]), - }, - collectArtifacts: vi.fn().mockResolvedValue(undefined), - stateDir: "", - outDir: "", - logger: { info: () => {}, error: () => {}, warn: () => {} } as any, - ...overrides, - }; -} - -describe("processIssue", () => { - it("happy path: claim → up → wait → collect → upload → comment → done", async () => { - const d = deps(); - const stateDir = await mkdtemp(join(tmpdir(), "state-")); - const outDir = await mkdtemp(join(tmpdir(), "out-")); - try { - await processIssue( - { id: "issue-1", identifier: "LIN-1", title: "test", body: "", labelIds: [] }, - { ...d, stateDir, outDir }, - ); - expect(d.linear.swapLabel).toHaveBeenNthCalledWith(1, "issue-1", LABELS.queued, LABELS.inProgress); - expect(d.compose.up).toHaveBeenCalledOnce(); - expect(d.compose.waitForService).toHaveBeenCalledWith("agent", expect.any(Object)); - expect(d.storage.uploadDirectory).toHaveBeenCalled();
- expect(d.linear.addComment).toHaveBeenCalled(); - expect(d.linear.swapLabel).toHaveBeenNthCalledWith(2, "issue-1", LABELS.inProgress, LABELS.done); - expect(d.compose.down).toHaveBeenCalledOnce(); - - const state = JSON.parse(await readFile(join(stateDir, "LIN-1.json"), "utf8")); - expect(state.phase).toBe("finalized"); - expect(state.outcome).toBe("done"); - } finally { - await rm(stateDir, { recursive: true, force: true }); - await rm(outDir, { recursive: true, force: true }); - } - }); - - it("agent non-zero exit causes failed outcome and label", async () => { - const d = deps({ - compose: { - up: vi.fn().mockResolvedValue(undefined), - waitForService: vi.fn().mockResolvedValue(1), - down: vi.fn().mockResolvedValue(undefined), - }, - }); - const stateDir = await mkdtemp(join(tmpdir(), "state-")); - const outDir = await mkdtemp(join(tmpdir(), "out-")); - try { - await processIssue( - { id: "issue-2", identifier: "LIN-2", title: "test", body: "", labelIds: [] }, - { ...d, stateDir, outDir }, - ); - expect(d.linear.swapLabel).toHaveBeenLastCalledWith("issue-2", LABELS.inProgress, LABELS.failed); - const state = JSON.parse(await readFile(join(stateDir, "LIN-2.json"), "utf8")); - expect(state.outcome).toBe("failed"); - } finally { - await rm(stateDir, { recursive: true, force: true }); - await rm(outDir, { recursive: true, force: true }); - } - }); - - it("always tears down the stack, even if upload fails", async () => { - const d = deps({ - storage: { - ensureBucket: vi.fn().mockResolvedValue(undefined), - uploadDirectory: vi.fn().mockRejectedValue(new Error("minio down")), - list: vi.fn().mockResolvedValue([]), - }, - }); - const stateDir = await mkdtemp(join(tmpdir(), "state-")); - const outDir = await mkdtemp(join(tmpdir(), "out-")); - try { - await expect( - processIssue( - { id: "issue-3", identifier: "LIN-3", title: "test", body: "", labelIds: [] }, - { ...d, stateDir, outDir }, - ), - ).rejects.toThrow(/minio down/); - 
expect(d.compose.down).toHaveBeenCalledOnce(); - } finally { - await rm(stateDir, { recursive: true, force: true }); - await rm(outDir, { recursive: true, force: true }); - } - }); -}); -``` - -- [ ] **Step 2: Run test, verify it fails** - -```bash -pnpm test src/worker/process-issue.test.ts -``` - -Expected: FAIL (module not found). - -- [ ] **Step 3: Write `process-issue.ts`** - -```ts -import { mkdir } from "node:fs/promises"; -import { join } from "node:path"; -import type { LinearClient, LinearIssue } from "../linear/client.js"; -import { LABELS } from "../linear/labels.js"; -import type { Compose } from "../stack/compose.js"; -import type { Storage } from "../storage/minio.js"; -import type { Logger } from "../logger.js"; -import { writeState } from "../state.js"; - -export type ProcessIssueDeps = { - linear: LinearClient; - compose: Compose; - storage: Storage; - collectArtifacts: (project: string, destDir: string) => Promise<void>; - stateDir: string; - outDir: string; - logger: Logger; -}; - -const TIMEOUTS = { - up: 5 * 60_000, - wait: 90 * 60_000, - down: 2 * 60_000, -}; - -export async function processIssue(issue: LinearIssue, deps: ProcessIssueDeps): Promise<void> { - const project = `sec-${issue.identifier.toLowerCase()}`; - const issueOutDir = join(deps.outDir, issue.identifier); - await mkdir(issueOutDir, { recursive: true }); - - // Claim - await writeState(deps.stateDir, { - issueIdentifier: issue.identifier, - phase: "claimed", - project, - startedAt: new Date().toISOString(), - outcome: null, - }); - await deps.linear.swapLabel(issue.id, LABELS.queued, LABELS.inProgress); - deps.logger.info({ issue: issue.identifier, project }, "claimed"); - - let outcome: "done" | "failed" = "done"; - - try { - await writeState(deps.stateDir, { - issueIdentifier: issue.identifier, - phase: "running", - project, - startedAt: new Date().toISOString(), - outcome: null, - }); - await deps.compose.up({ timeoutMs: TIMEOUTS.up }); - const exitCode = await
deps.compose.waitForService("agent", { timeoutMs: TIMEOUTS.wait }); - deps.logger.info({ issue: issue.identifier, exitCode }, "agent exited"); - if (exitCode !== 0) outcome = "failed"; - - await deps.collectArtifacts(project, issueOutDir); - await deps.storage.ensureBucket(); - await deps.storage.uploadDirectory(issueOutDir, `${issue.identifier}/`); - await writeState(deps.stateDir, { - issueIdentifier: issue.identifier, - phase: "uploaded", - project, - startedAt: new Date().toISOString(), - outcome, - }); - - await deps.linear.addComment( - issue.id, - renderComment(issue.identifier, outcome), - ); - - await deps.linear.swapLabel( - issue.id, - LABELS.inProgress, - outcome === "done" ? LABELS.done : LABELS.failed, - ); - await writeState(deps.stateDir, { - issueIdentifier: issue.identifier, - phase: "finalized", - project, - startedAt: new Date().toISOString(), - outcome, - }); - deps.logger.info({ issue: issue.identifier, outcome }, "finalized"); - } finally { - try { - await deps.compose.down({ timeoutMs: TIMEOUTS.down }); - } catch (err) { - deps.logger.error({ issue: issue.identifier, err: String(err) }, "compose down failed"); - } - } -} - -function renderComment(issueIdentifier: string, outcome: "done" | "failed"): string { - return [ - ``, - `**sec-fix-worker — Phase 1 stub**`, - ``, - `Outcome: \`${outcome}\``, - ``, - `Artifacts uploaded to \`s3://security-artifacts/${issueIdentifier}/\`.`, - ].join("\n"); -} -``` - -- [ ] **Step 4: Run test, verify it passes** - -```bash -pnpm test src/worker/process-issue.test.ts -``` - -Expected: 3 tests pass. - -- [ ] **Step 5: Commit** - -```bash -git add src/worker/process-issue.ts src/worker/process-issue.test.ts -git commit -m "feat: process-issue orchestration with state transitions" -``` - ---- - -## Task 13: Artifact collection from compose volume - -`collectArtifacts(project, destDir)` copies the contents of the named `artifacts` volume out to a host directory. 
It runs `cp -a` inside a one-shot helper container (`docker run --rm`) that mounts the volume read-only next to the destination directory. - -**Files:** -- Create: `~/Development/sec-fix-pipeline/src/stack/collect-artifacts.ts` -- Create: `~/Development/sec-fix-pipeline/src/stack/collect-artifacts.test.ts` - -- [ ] **Step 1: Write the failing test** - -`src/stack/collect-artifacts.test.ts`: - -```ts -import { describe, it, expect } from "vitest"; -import { mkdtemp, readFile, rm } from "node:fs/promises"; -import { tmpdir } from "node:os"; -import { join } from "node:path"; -import { execa } from "execa"; -import { collectArtifacts } from "./collect-artifacts.js"; - -describe("collectArtifacts", () => { - it("copies files out of a named volume into the destination dir", async () => { - const project = `sec-fix-collect-test-${Date.now()}`; - const volume = `${project}_artifacts`; - // Pre-populate the volume by running a one-shot container. - await execa("docker", [ - "run", "--rm", - "-v", `${volume}:/artifacts`, - "alpine:3.20", - "sh", "-c", "echo hi > /artifacts/a.txt && mkdir -p /artifacts/sub && echo two > /artifacts/sub/b.txt", - ]); - const dest = await mkdtemp(join(tmpdir(), "collected-")); - try { - await collectArtifacts(project, dest); - expect(await readFile(join(dest, "a.txt"), "utf8")).toBe("hi\n"); - expect(await readFile(join(dest, "sub/b.txt"), "utf8")).toBe("two\n"); - } finally { - await rm(dest, { recursive: true, force: true }); - await execa("docker", ["volume", "rm", "-f", volume]).catch(() => {}); - } - }, 60_000); -}); -``` - -- [ ] **Step 2: Run test, verify it fails** - -```bash -pnpm test src/stack/collect-artifacts.test.ts -``` - -Expected: FAIL (module not found).
- -- [ ] **Step 3: Write `collect-artifacts.ts`** - -```ts -import { resolve } from "node:path"; -import { execa } from "execa"; - -export async function collectArtifacts(project: string, destDir: string): Promise<void> { - const volume = `${project}_artifacts`; - // Bind mounts need an absolute host path; a relative path may be treated as a volume name or rejected. - await execa("docker", [ - "run", "--rm", - "-v", `${volume}:/src:ro`, - "-v", `${resolve(destDir)}:/dst`, - "alpine:3.20", - "sh", "-c", "cp -a /src/. /dst/", - ]); -} -``` - -- [ ] **Step 4: Run test, verify it passes** - -```bash -pnpm test src/stack/collect-artifacts.test.ts -``` - -Expected: 1 test passes. - -- [ ] **Step 5: Commit** - -```bash -git add src/stack/collect-artifacts.ts src/stack/collect-artifacts.test.ts -git commit -m "feat: collect artifacts from compose named volume" -``` - ---- - -## Task 14: Worker CLI entry point - -**Files:** -- Create: `~/Development/sec-fix-pipeline/src/worker/main.ts` - -`main.ts` wires everything together for a single-issue dry-run: load config, build Linear/compose/storage clients, fetch one issue, call `processIssue`, exit. The continuous loop comes in Phase 4.
- -- [ ] **Step 1: Write `main.ts`** - -```ts -import "dotenv/config"; -import { loadConfig } from "../config.js"; -import { logger } from "../logger.js"; -import { createLinearClient, createRealGateway } from "../linear/client.js"; -import { createStorage } from "../storage/minio.js"; -import { createCompose } from "../stack/compose.js"; -import { collectArtifacts } from "../stack/collect-artifacts.js"; -import { processIssue } from "./process-issue.js"; -import { join } from "node:path"; - -async function main(): Promise<number> { - const cfg = loadConfig(); - const linear = createLinearClient( - { viewId: cfg.linear.viewId }, - createRealGateway(cfg.linear.apiKey), - ); - const storage = createStorage(cfg.minio); - - const issue = await linear.findNextQueuedIssue(); - if (!issue) { - logger.info("no queued issue; exiting"); - return 0; - } - logger.info({ issue: issue.identifier, title: issue.title }, "picked issue"); - - const project = `sec-${issue.identifier.toLowerCase()}`; - const compose = createCompose({ - file: join(process.cwd(), "docker/stack.yml"), - project, - }); - - process.env.LINEAR_ISSUE_ID = issue.identifier; // consumed by compose stack.yml - - await processIssue(issue, { - linear, - compose, - storage, - collectArtifacts: (p, d) => collectArtifacts(p, d), - stateDir: cfg.dirs.state, - outDir: cfg.dirs.out, - logger, - }); - - return 0; -} - -main() - .then((code) => process.exit(code)) - .catch((err) => { - logger.error({ err: String(err), stack: err?.stack }, "worker crashed"); - process.exit(1); - }); -``` - -- [ ] **Step 2: Add `dotenv` to dependencies** - -Edit `package.json` to add to `dependencies`: - -```json - "dotenv": "16.4.5", -``` - -Then: - -```bash -pnpm i -``` - -- [ ] **Step 3: Typecheck** - -```bash -pnpm typecheck -``` - -Expected: no errors.
- -- [ ] **Step 4: Commit** - -```bash -git add src/worker/main.ts package.json pnpm-lock.yaml -git commit -m "feat: worker CLI entry point" -``` - ---- - -## Task 15: Seed-test-issue script - -A helper script to create a Linear issue in the deepsec view labelled `agent:queued`, for end-to-end testing without touching real findings. - -**Files:** -- Create: `~/Development/sec-fix-pipeline/scripts/seed-test-issue.ts` - -- [ ] **Step 1: Write `seed-test-issue.ts`** - -```ts -import "dotenv/config"; -import { LinearClient } from "@linear/sdk"; -import { loadConfig } from "../src/config.js"; -import { LABELS } from "../src/linear/labels.js"; - -async function main() { - const cfg = loadConfig(); - const linear = new LinearClient({ apiKey: cfg.linear.apiKey }); - - // The deepsec view is on a specific team; we read the view to find the team ID. - const view = await linear.customView(cfg.linear.viewId); - const team = await view.team; - if (!team) throw new Error("could not resolve view's team"); - - const labels = await linear.issueLabels({ filter: { name: { eq: LABELS.queued } } }); - const queuedLabel = labels.nodes[0]; - if (!queuedLabel) throw new Error(`label not found: ${LABELS.queued}`); - - const created = await linear.createIssue({ - teamId: team.id, - title: `[sec-fix-test] phase 1 e2e probe — ${new Date().toISOString()}`, - description: "Synthetic issue created by sec-fix-pipeline phase 1 e2e test. Safe to close.", - labelIds: [queuedLabel.id], - }); - const issue = await created.issue; - console.log(`Created issue: ${issue?.identifier} (${issue?.id})`); - console.log(`URL: ${issue?.url}`); -} - -main().catch((e) => { - console.error(e); - process.exit(1); -}); -``` - -- [ ] **Step 2: Run it** - -```bash -pnpm seed:issue -``` - -Expected: prints a new issue identifier (e.g. `TRI-1234`) and URL. - -- [ ] **Step 3: Verify in Linear** - -Open the printed URL in a browser. 
Confirm the issue exists, has the `agent:queued` label, and appears in the deepsec-findings view. - -> If the label doesn't exist yet on the team, create the full label set (`agent:queued`, `agent:in-progress`, `agent:done`, `agent:false-positive`, `agent:runaway`, `agent:resume`, `agent:failed`) in Linear UI under the team's label settings before continuing. Re-run the seed script. - -- [ ] **Step 4: Commit** - -```bash -git add scripts/seed-test-issue.ts -git commit -m "feat: seed-test-issue script" -``` - ---- - -## Task 16: End-to-end integration test (manual) - -This is the Phase 1 ship gate. Not automated — operator runs it once and checks each step. - -- [ ] **Step 1: Ensure prerequisites are running** - -```bash -docker ps | grep minio # MinIO container up -docker images | grep sec-fix/agent-stub # Stub agent image built -``` - -If either is missing, redo Task 9 / Task 10. - -- [ ] **Step 2: Confirm `.env` is populated** - -```bash -test -f .env && grep LINEAR_API_KEY .env | grep -v '^LINEAR_API_KEY=$' -``` - -Expected: prints the line (key is set). - -- [ ] **Step 3: Setup bucket and seed test issue** - -```bash -pnpm setup:minio -pnpm seed:issue -``` - -Note the printed issue identifier (e.g. `TRI-1234`). - -- [ ] **Step 4: Run the worker** - -```bash -pnpm worker 2>&1 | tee /tmp/sec-fix-worker.log -``` - -Expected log lines (in order): -- `"picked issue"` with the seeded issue's identifier -- `"claimed"` -- `"agent exited"` with `exitCode: 0` -- `"finalized"` with `outcome: done` - -Process exits 0. - -- [ ] **Step 5: Verify Linear state** - -Open the seeded issue in Linear. 
Confirm: -- Label `agent:queued` is gone -- Label `agent:done` is present -- A comment was added containing `s3://security-artifacts//` - -- [ ] **Step 6: Verify MinIO contents** - -```bash -docker run --rm --network host \ - -e MC_HOST_local="http://${MINIO_ACCESS_KEY}:${MINIO_SECRET_KEY}@localhost:9000" \ - minio/mc:RELEASE.2024-10-29T15-34-59Z \ - ls -r local/security-artifacts/ -``` - -Expected: lists `/hello.txt`, `/final-summary.md`, `/status.json`. - -- [ ] **Step 7: Verify local state file** - -```bash -cat state/.json -``` - -Expected: JSON with `"phase": "finalized"`, `"outcome": "done"`. - -- [ ] **Step 8: Verify no compose stack lingers** - -```bash -docker compose ls -``` - -Expected: no `sec-*` project listed. - -- [ ] **Step 9: Close the seeded test issue manually** - -In Linear, close the seeded test issue with a comment "phase 1 e2e probe complete". - -- [ ] **Step 10: Document the run** - -Append a line to `README.md`: - -```markdown -## Phase 1 e2e probe history - -- : — issue — PASS -``` - -- [ ] **Step 11: Commit** - -```bash -git add README.md -git commit -m "docs: record phase 1 e2e probe pass" -``` - ---- - -## Phase 1 done - -Pipeline plumbing validated. Next: Phase 2 — Real Claude Agent SDK integration. The interface between the worker and the agent container is now fixed (env var in, `/artifacts/` out, exit code as signal), so Phase 2 only changes `docker/agent-stub/` → `docker/agent/` with a real `run-agent.mjs` and a system prompt. 
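
The worker/agent contract described above (issue identifier in via `LINEAR_ISSUE_ID`, files out via `/artifacts/`, exit code as the signal) can be sketched as one small TypeScript function. This is a hypothetical illustration, not the stub built in Task 9: the name `runStubAgent`, its parameters, and the file contents are assumptions. Only the three file names come from the Task 16 verification step, and the `status.json` fields mirror the spec's bridge script.

```typescript
import { mkdir, writeFile } from "node:fs/promises";
import { join } from "node:path";

// Hypothetical stub agent. It returns the exit code instead of calling
// process.exit so the contract can be exercised from a test.
export async function runStubAgent(
  env: Record<string, string | undefined>,
  artifactsDir = "/artifacts",
): Promise<number> {
  const issueId = env["LINEAR_ISSUE_ID"];
  // Missing input: non-zero exit, which the worker maps to `failed`.
  if (!issueId) return 1;

  await mkdir(artifactsDir, { recursive: true });
  // File contents are placeholders (assumption); only the names are from the plan.
  await writeFile(join(artifactsDir, "hello.txt"), `hello from ${issueId}\n`);
  await writeFile(join(artifactsDir, "final-summary.md"), `# Stub run for ${issueId}\n`);
  await writeFile(
    join(artifactsDir, "status.json"),
    JSON.stringify({ issueId, endedAt: new Date().toISOString() }),
  );
  return 0; // zero exit: the worker maps this to `done`
}
```

A container entry point would call `runStubAgent(process.env)` and pass the result to `process.exit`. Keeping the exit code as a return value means Phase 2 could swap the body for the real agent without changing what the worker observes.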
- ---- - -## Self-Review (writing-plans skill) - -**Spec coverage (Phase 1 only):** -- Linear queue read from deepsec view → Task 6, Task 14 ✓ -- Label state machine (queued → in-progress → done/failed) → Tasks 5, 12 ✓ -- Per-issue docker compose with project name `sec-` → Tasks 8, 10, 12 ✓ -- Agent container with `/artifacts/` contract → Task 9 ✓ -- Container exit code as completion signal → Task 8 (`waitForService`), Task 12 ✓ -- MinIO artifact upload → Tasks 7, 11, 12 ✓ -- Linear comment with marker → Task 12 (`renderComment`) ✓ -- Local state file with atomic writes → Task 4 ✓ -- Teardown via `compose down -v` even on failure → Task 12 (`finally`) ✓ - -**Spec sections deferred to later phases (explicit, not gaps):** -- Real Agent SDK + system prompt + multi-artifact bundle → Phase 2 -- Full per-issue stack (postgres/redis/clickhouse/webapp) + repo volume + pnpm-store + egress policy → Phase 3 -- systemd + reconcile + circuit breakers + heartbeat → Phase 4 -- Resumable runs (snapshots, session resume, 3-resume cap) → Phase 5 -- Review dashboard → Phase 6 - -**Placeholder scan:** None. All steps have concrete code or commands. - -**Type consistency check:** `LinearIssue`, `LinearClient`, `LinearGateway`, `Compose`, `Storage`, `IssueState`, `ProcessIssueDeps` all named consistently across Tasks 4, 6, 7, 8, 12. - -**One known caveat:** the `findNextQueuedIssue` gateway implementation in Task 6 uses a `customView(viewId).issues(...)` call against the Linear SDK; the exact filter API may need adjustment after Phase 1 hits real Linear (Linear's SDK has changed view-issue filtering shape between releases). The integration test in Task 16 will surface this; fix is a small edit in the gateway, not a design change. 
diff --git a/docs/superpowers/specs/2026-05-26-security-fix-pipeline-design.md b/docs/superpowers/specs/2026-05-26-security-fix-pipeline-design.md deleted file mode 100644 index 320e3c0720..0000000000 --- a/docs/superpowers/specs/2026-05-26-security-fix-pipeline-design.md +++ /dev/null @@ -1,561 +0,0 @@ -# Security Fix Pipeline — Design Spec - -**Date:** 2026-05-26 -**Status:** Approved, ready for implementation planning -**Author:** Daniel Sutton (with Claude) - -## Problem - -We have ~250 security findings filed as Linear tickets. Each needs to be -validated, then — if real — fixed with production-grade care: minimal blast -radius, no breaking changes, cautious rollout, migration considerations. We -want a dedicated machine to chew through this batch autonomously and produce -reviewable artifacts, without constant approval cycles. - -Strict constraints: - -- **Nothing leaks to GitHub.** Findings are exploitable; patches and commit - messages must be treated as sensitive until disclosure timing is decided. -- **Production-grade fixes only.** Multi-step rollouts, backwards-compatible - changes, migration plans where relevant. -- **Human review at the end.** Patches never auto-apply; the upstreaming - process is deliberate and separate. - -## Scope - -In scope: validation, fix design, patch generation, regression testing, -verification, artifact packaging, queue management, review surface. - -Out of scope: actually upstreaming fixes (separate human process post-review); -disclosure / CVE / advisory workflow. 
- -## Architecture Overview - -``` -┌─────────────────────────────────────────────────────────────┐ -│ Dedicated machine │ -│ │ -│ Linear ◄────── worker.ts ──────► docker compose -p sec-X │ -│ (queue) (serial, N=1) │ │ -│ ▲ ▼ │ -│ │ ┌──────────────┐ │ -│ │ │ per-issue │ │ -│ │ │ compose stack│ │ -│ │ │ postgres │ │ -│ │ │ redis │ │ -│ │ │ clickhouse │ │ -│ │ │ minio │ │ -│ │ │ electric │ │ -│ │ │ webapp ◄────┼──┐ │ -│ │ │ agent ◄────┼──┤ shared -│ │ └──────────────┘ │ /repo -│ │ ▼ volume -│ │ artifacts │ -│ │ vol │ -│ │ │ │ -│ │ ┌──────────┐ ┌──────┐ │ -│ └────────────┤dashboard │◄───────────────────┤MinIO │ │ -│ │localhost │ └──────┘ │ -│ │ :4000 │ │ -│ └──────────┘ │ -└─────────────────────────────────────────────────────────────┘ -``` - -Three components: **worker daemon**, **per-issue compose stack**, **review -dashboard**. Linear is the queue. MinIO is the artifact store. No GitHub, no -external services beyond Anthropic API and Linear API. - -## Queue & State (Linear) - -Linear is the queue. The worker reads from the **deepsec-findings** view: -`https://linear.app/triggerdotdev/view/deepsec-findings-c443c3c869c0` -All 250 issues already exist in this view. State lives in labels applied on -top of the existing ticket state: - -| Label | Meaning | -|---|---| -| `agent:queued` | Ready for the worker to pick up | -| `agent:in-progress` | Claimed; container running | -| `agent:done` | Artifact bundle uploaded, ready for review | -| `agent:false-positive` | Agent determined the finding is not real | -| `agent:runaway` | Soft-limit hit (turns, wall timeout, heartbeat). Resumable. 
| -| `agent:resume` | Operator queued for resumption from a runaway snapshot | -| `agent:failed` | Hard failure (crash, OOM, non-resumable error); needs human eyes | -| `agent:reviewed-accept` | Human approved the fix bundle (set by dashboard) | -| `agent:reviewed-reject` | Human rejected the fix bundle (set by dashboard) | - -Audit trail is free — every label change appears in Linear's activity feed. - -## Worker Daemon - -Single TS process running under systemd on the dedicated machine. **Serial, -processes one issue at a time.** - -### Main loop - -```ts -while (running) { - // Source: deepsec-findings Linear view; pick the highest-priority issue - // labelled agent:queued (or not yet labelled by us — first-pass adoption). - const issue = await linear.nextFromDeepsecView({ label: "agent:queued" }); - if (!issue) { await sleep(30_000); continue; } - - await claim(issue); // write state(claimed) → setLabel(in-progress) - await run(issue); // compose up → wait → compose down - await persistArtifacts(); // collect → upload to MinIO → verify - await notify(issue); // post Linear comment (idempotent) - await finalize(issue); // setLabel(done|failed|false-positive) -} -``` - -### Local state file is source of truth - -`/var/lib/sec-fix-worker/state/.json`: - -```json -{ - "phase": "claimed" | "running" | "uploaded" | "finalized", - "startedAt": "...", - "project": "sec-LIN-1234", - "containerExitCode": null, - "outcome": "done" | "failed" | "false-positive" | null -} -``` - -Each transition: write state file → fsync → do side effect → write state -file. State file leads side effects so a crash always leaves a recoverable -record. - -### Reconcile on every startup - -Before entering the loop: - -1. `docker compose ls --filter name=sec-*` → for each orphaned project: tail - logs, `compose down -v`, mark corresponding state file `phase: "crashed"`, - set Linear label `agent:failed`, post Linear comment with the tail. -2. 
Scan `./state/*.json` for non-finalized entries → finish whatever step was - interrupted (re-upload, re-comment, re-label as needed; all idempotent). -3. Scan Linear for issues stuck on `agent:in-progress` with no local state - file (worker was wiped) → reset to `agent:queued`. - -### Idempotency - -- Uploads check MinIO for existing artifact before re-uploading -- Linear comments include marker ``; - duplicate-comment detection skips re-posting - -## Per-Issue Compose Stack - -One `stack.yml` template instantiated per issue with `--project-name -sec-`. Compose's project name provides the isolation boundary: -isolated network, isolated named volumes, complete teardown via `compose down --v`. - -### Services - -```yaml -services: - postgres: # ephemeral pgdata per project - redis: # ephemeral per project - clickhouse: # ephemeral per project - minio: # ephemeral; per-issue uploads go to the host MinIO, not this one - electric: # ephemeral per project - - webapp: - volumes: - - repo:/repo # shared with agent - - pnpm-store:/pnpm-store # shared read-mostly across all projects - command: pnpm --filter webapp dev - depends_on: [postgres, redis, clickhouse, electric] - healthcheck: curl http://localhost:3030/healthcheck - - agent: - volumes: - - repo:/repo # SAME volume as webapp - - artifacts:/artifacts # output drop - command: node /run-agent.mjs - depends_on: - webapp: { condition: service_healthy } - -volumes: - pgdata: # per project - repo: # per project, populated from a baked tar snapshot - artifacts: # per project - pnpm-store: - external: true - name: pnpm-store-shared # ONE shared volume across all runs -``` - -### Repo volume population - -**Pinned base SHA: `37eeaa36908fb1aad48fc43d04e5b4e8f474f957`** — `origin/main` -of `trigger.dev-mirror` as of 2026-05-25, the commit immediately preceding -the most recent deepsec revalidate run (2026-05-25 18:14). Findings were -produced against this revision of the codebase, so reproduction and fixes -target it. 
- -A `repo.tar` snapshot at this SHA is baked into the base image; an init -container extracts it into the `repo` volume at stack startup (~5–10s). If a -Linear issue specifies a different base SHA in its body, the worker swaps in -a `git-clone`-from-local-bare-mirror init container instead. - -### Network isolation - -Compose stack's default network has egress restricted via iptables init -container (or Docker network policy plugin) to: - -- `api.anthropic.com` -- `api.linear.app` -- Local MinIO host - -Inbound: none. Belt-and-braces with the agent's `disallowedTools` list. - -## Agent Container - -### Image - -Base image (`trigger-mirror-agent:pinned`) contains: - -- Repo tar snapshot at a pinned SHA -- pnpm store warm (`pnpm fetch`) — actually mounted as shared external volume -- `@anthropic-ai/claude-agent-sdk` installed globally -- `/run-agent.mjs` (the bridge script) -- `/prompts/security-fix.md` (system prompt encoding the methodology) -- MCP server configs (Linear read+write only; no GitHub MCP) - -API keys passed via Docker secret files at `/run/secrets/anthropic` and -`/run/secrets/linear`, never as env vars. 
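
The bridge script in the next section calls `readSecret`, which this spec does not define. A minimal sketch of what such a helper might look like, assuming each secret is a plain text file under `/run/secrets`. The signature, the name validation, and the injectable `read` parameter are assumptions made for illustration and testing, not part of the spec.

```typescript
import { readFile } from "node:fs/promises";
import { join } from "node:path";

type ReadFn = (path: string) => Promise<string>;

// Hypothetical helper: reads one secret file and returns its trimmed contents.
export async function readSecret(
  name: string,
  dir = "/run/secrets",
  read: ReadFn = (p) => readFile(p, "utf8"),
): Promise<string> {
  // Reject anything that could escape the secrets directory.
  if (!/^[A-Za-z0-9_-]+$/.test(name)) {
    throw new Error(`invalid secret name: ${name}`);
  }
  // Secret files often end with a trailing newline; strip surrounding whitespace.
  const value = (await read(join(dir, name))).trim();
  if (value.length === 0) {
    throw new Error(`secret is empty: ${name}`);
  }
  return value;
}
```

Failing fast on an empty file keeps a misconfigured mount from surfacing later as an opaque authentication error from the Anthropic or Linear API.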
- -### `run-agent.mjs` — the worker↔agent bridge - -```js -import { query } from "@anthropic-ai/claude-agent-sdk"; -import { LinearClient } from "@linear/sdk"; -import fs from "node:fs/promises"; - -const issueId = process.env.LINEAR_ISSUE_ID; -const linear = new LinearClient({ apiKey: await readSecret("linear") }); -const issue = await linear.issue(issueId); - -const systemPrompt = await fs.readFile("/prompts/security-fix.md", "utf8"); -const userPrompt = renderIssueContext(issue, await issue.comments()); - -await fs.mkdir("/artifacts", { recursive: true }); -startHeartbeat("/artifacts/.heartbeat"); // updated every 60s - -const result = query({ - prompt: userPrompt, - options: { - systemPrompt, - cwd: "/repo", - permissionMode: "bypassPermissions", - allowedTools: ["Read", "Edit", "Write", "Bash", "Grep", "Glob"], - disallowedTools: ["WebFetch", "WebSearch"], - mcpServers: { linear: { /* read+write */ } }, - maxTurns: 200, // hard ceiling; runaway loops fail rather than burn budget - }, -}); - -const transcript = await fs.open("/artifacts/transcript.jsonl", "w"); -let finalText = ""; -for await (const msg of result) { - await transcript.write(JSON.stringify(msg) + "\n"); - if (msg.type === "assistant") finalText = extractText(msg); -} -await transcript.close(); - -await fs.writeFile("/artifacts/final-summary.md", finalText); -await fs.writeFile("/artifacts/status.json", - JSON.stringify({ issueId, endedAt: new Date().toISOString() })); - -process.exit(0); -``` - -Process exits when the SDK's async iterator finishes. Container exits. -`docker compose wait agent` on the host returns. **Unix process lifecycle is -the synchronization primitive — no IPC, no polling, no marker files.** - -### Agent contract (encoded in system prompt) - -1. **Validate**: reproduce the finding or declare false positive. Write - `/artifacts/validation.md`. -2. **If false positive**: final assistant message is - `FALSE_POSITIVE: `. Stop. -3. 
**If real**: produce the bundle in `/artifacts/`: - - `design.md` — blast radius, public-API/DB impact, backwards-compat - strategy, alternatives considered, minimal-impact justification - - `rollout.md` — sequencing across PRs if non-atomic, feature flags, - migration order, monitoring/rollback signals - - `patches/01-*.patch`, `02-*.patch`, ... — ordered, applied with `git am` - - `tests/` — new/updated regression tests, referenced inside the patches - - `verification.log` — captured output of `pnpm typecheck` (apps/internal - packages) or `pnpm build` (public packages) per CLAUDE.md, plus - `pnpm test` for the affected package - - `changeset.md` — draft `.changeset/` entry if any public package touched -4. **Final assistant message**: `SUBMITTED: `. Agent stops - by simply not calling another tool; SDK loop exits. - -### Bias toward minimal impact - -System prompt explicitly instructs: prefer additive changes over modifying -existing surfaces; prefer flagged rollouts over direct ships; prefer multiple -small ordered patches over one big atomic change when the fix touches public -contracts or schema. The agent has agency to decide, but defaults are -cautious. - -## Resumable Runs - -The agent can hit a soft limit — maxTurns (200), wall timeout (90 min), or -heartbeat watchdog (10 min stall) — with valuable in-progress state. These -outcomes are **resumable**, not failures. - -### Classification of run outcomes - -| Outcome | Cause | Resumable? 
| -|---|---|---| -| `done` | Final message `SUBMITTED` | — | -| `false-positive` | Final message `FALSE_POSITIVE` | — | -| `runaway` | maxTurns hit, wall timeout, heartbeat stall | Yes | -| `failed` | Container crash, OOM, exit code from non-timeout cause | No (state suspect) | - -### What survives teardown - -Two named volumes are snapshotted before `compose down -v`: - -- **`repo`** volume → `./snapshots//run-N/repo/` (source files only; - `node_modules` and pnpm store excluded — rehydrated from the base image on - resume; snapshot stays under ~100 MB per run) -- **`agent-session`** volume (mount of `~/.claude/projects/` inside the agent - container, where the SDK persists session JSONL) → - `./session-snapshots//run-N/` - -Plus the partial `/artifacts/` contents (whatever the agent had written so -far) are collected exactly as for completed runs. - -Snapshots upload to MinIO under -`s3://security-artifacts//runaway-/`. Local copies retained 14 -days (longer than artifact cache because resumes can happen later); MinIO is -the long-term store. - -### Resume action - -Triggered from the dashboard (Resume button on a runaway issue) or -`worker-cli resume `: - -1. Dashboard sets `agent:resume` label and posts a Linear comment - (`` marker for idempotency) -2. Worker picks the issue up like a normal queued issue but branches: - ```ts - if (issue.labels.includes("agent:resume")) { - await restoreRepoVolume(issueId, project); // hydrate from snapshot - await restoreSessionVolume(issueId, project); // hydrate SDK session - await stack.up({ resumeRunNumber: priorRunCount + 1 }); - } else { - await stack.up({ fresh: true }); - } - ``` -3. Inside the container, `run-agent.mjs` checks for an existing session ID in - the mounted session volume. If present, calls - `query({ ..., resume: sessionId })` to continue the prior conversation - rather than starting fresh. The user prompt is prefixed with: "You are - resuming after hitting ``. 
Review `/artifacts/` for what you've - already produced and continue from there." - -### Resume budget - -- **Max 3 resumes per issue** (configurable). After the 3rd consecutive - runaway outcome, the issue auto-promotes to `agent:failed` with a comment - explaining the cap was hit. Prevents infinite resume loops on truly - unfixable issues. -- Each resume gets a **fresh 90-min wall budget** and a **fresh 200-turn - budget**. The whole point of resume is to extend the available compute, - so per-run budgets reset; only the attempt count is capped. -- Attempt count tracked in the local state file under - `runaways: [{ runNumber, reason, endedAt }, ...]` and reflected in the - dashboard. - -### State file additions - -```json -{ - "phase": "...", - "currentRun": 2, - "runaways": [ - { "runNumber": 1, "reason": "maxTurns", "endedAt": "2026-05-26T11:30:00Z" } - ], - "sessionId": "claude-session-abc123" -} -``` - -### Dashboard surface - -The Review tab shows runaway issues with: - -- Reason for runaway (turns / wall / heartbeat) -- Attempt count (e.g. "runaway 2 of 3") -- Partial artifacts produced so far (whatever the agent had written) -- **Resume** button (disabled at the cap) -- **Mark failed** button (operator can manually give up) -- **Mark false-positive** button (if the partial work is enough to make the - call without resuming) - -The Queue tab shows runaway issues queued for resume distinctly from -fresh-queued issues. - -## Artifact Storage - -- Worker collects `/artifacts/` from the per-project volume to - `./out//` on the host -- Uploads to local MinIO at `s3://security-artifacts//` -- Verifies ETags -- Local `./out//` cached for 7 days; MinIO is source of truth - -## Review Dashboard - -Local Remix app on `localhost:4000`, run on the dedicated machine. 
- -### Queue tab (live operational view) - -- Counts: queued / in-progress / done / failed / false-positive -- Currently-running container (single, since serial): live log tail -- Recent failures with summaries - -### Review tab (per-issue review) - -For each `agent:done` issue: - -- Validation evidence (rendered Markdown) -- Design doc with blast-radius, alternatives, minimal-impact justification -- Rollout plan -- Ordered patches rendered with a proper diff component (Monaco / react-diff-view) -- Tests rendered alongside their patch -- Verification log -- Changeset draft -- Actions: **Approve** / **Reject** / **Needs changes** — writes - `agent:reviewed-accept` or `agent:reviewed-reject` back to Linear, plus a - review-notes comment - -State syncs to Linear on every action; dashboard is stateless beyond -short-lived UI state. - -## Durability (Unsupervised Operation) - -### Process supervision - -- systemd unit with `Restart=always`, `RestartSec=10`, `WatchdogSec=120`; - worker pings `sd_notify(WATCHDOG=1)` every 30s -- `flock /var/lock/sec-fix-worker.lock` at startup; double-instance prevented -- Kill switch: worker checks `/var/lib/sec-fix-worker/STOP` at top of loop; - drains current issue and exits cleanly if present - -### Timeouts - -- `compose up`: 5 min -- `compose wait agent`: 90 min -- `compose down`: 2 min -- Linear API call: 30s with exponential backoff, max 5 attempts -- MinIO upload: 5 min per file, retry 3x -- **Agent heartbeat watchdog**: agent writes `/artifacts/.heartbeat` every - 60s; if unchanged for 10 min, worker `compose kill agent` → mark - `runaway` (resumable; see "Resumable Runs"). Catches infinite tool-call - loops that don't trip `maxTurns`. - -The 90-min wall timeout and `maxTurns: 200` also produce `runaway` outcomes -rather than hard `failed`. Only container crashes, OOM kills, and non-timeout -non-zero exit codes produce `failed`. 
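The timeout rules above reduce to a small outcome classifier. What follows is a minimal sketch under stated assumptions, not the worker's actual code: `RunSignals`, `classifyRun`, and every field name are hypothetical. Only the thresholds (200 turns, 90 min wall budget, 10 min heartbeat stall) and the four outcome names come from this spec.

```ts
// Hypothetical sketch: type and function names are illustrative assumptions.
type Outcome = "done" | "false-positive" | "runaway" | "failed";

interface RunSignals {
  finalMessage?: string;  // last assistant message, if the agent produced one
  turns: number;          // agent turns consumed in this run
  elapsedMs: number;      // wall-clock time since the agent started
  heartbeatAgeMs: number; // time since /artifacts/.heartbeat last changed
}

// Thresholds taken from the spec.
const MAX_TURNS = 200;
const WALL_TIMEOUT_MS = 90 * 60_000;
const HEARTBEAT_STALL_MS = 10 * 60_000;

function classifyRun(s: RunSignals): Outcome {
  // Terminal messages win over any limit that was hit along the way.
  if (s.finalMessage?.startsWith("SUBMITTED")) return "done";
  if (s.finalMessage?.startsWith("FALSE_POSITIVE")) return "false-positive";

  // Soft limits are resumable, so they map to `runaway`.
  const hitSoftLimit =
    s.turns >= MAX_TURNS ||
    s.elapsedMs >= WALL_TIMEOUT_MS ||
    s.heartbeatAgeMs >= HEARTBEAT_STALL_MS;
  if (hitSoftLimit) return "runaway";

  // Assumption: anything else (crash, OOM, non-timeout non-zero exit) is `failed`.
  return "failed";
}
```

Checking terminal messages first is a design choice in this sketch: a run that submits on its 200th turn is still treated as `done` rather than `runaway`.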
- -### Circuit breakers - -- **Consecutive failures**: 5 in a row → write `/var/lib/sec-fix-worker/PAUSED`, - alert, stop picking up new work. Stays alive for status reporting. -- **Failure rate**: >40% over last 20 issues → same. -- **Disk**: before each issue, check free space on `/var/lib/docker` and - `./out/`; if under 20 GB, pause and alert. - -### Resource hygiene - -- Per-issue logs `./logs/.log` capped at 100 MB via streaming truncation -- `docker image prune -f` after every 10 issues -- `docker volume prune -f` at reconcile time -- `./out//` deleted after 7 days - -### Secret hygiene - -- API keys via Docker secrets (`/run/secrets/*`), not env vars -- Transcript scrubber runs over `transcript.jsonl` pre-upload: regex-strips - Bearer tokens, known key prefixes, common secret patterns -- System prompt forbids writing secrets to artifact files - -### Observability (dashboard-only, no external alerting) - -The review dashboard is the operator's single surface. No Slack, no email, no -webhooks — the operator checks the dashboard on their own cadence. - -The dashboard's queue tab shows: - -- Live queue counts (queued / in-progress / done / failed / false-positive) -- Currently-running issue with live log tail -- Last N completed and last N failed, with summaries -- Circuit-breaker state (running / paused-by-consecutive-failures / - paused-by-failure-rate / paused-by-disk) -- Free disk on `/var/lib/docker` and `./out/` -- Projected completion time based on rolling average - -Worker also writes a heartbeat to `/var/lib/sec-fix-worker/heartbeat.json` -every 60s; dashboard surfaces "last heartbeat" as a freshness indicator. If -the worker dies silently, the dashboard makes it obvious within a minute. 
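The circuit-breaker thresholds listed under Durability can be sketched as one pure function that the worker might call before picking up each issue. This is an illustrative sketch only: `evaluateBreaker` and its parameters are hypothetical names. The precedence between rules, and the choice to apply the failure-rate rule only once a full window of 20 issues exists, are assumptions the spec does not settle. The state names match those the queue tab displays.

```ts
// Hypothetical sketch: only the thresholds and state names come from the spec.
type BreakerState =
  | "running"
  | "paused-by-consecutive-failures"
  | "paused-by-failure-rate"
  | "paused-by-disk";

const CONSECUTIVE_FAILURE_LIMIT = 5;
const RATE_WINDOW = 20;
const RATE_THRESHOLD = 0.4;
const MIN_FREE_DISK_GB = 20;

// `failedFlags` is the issue history, oldest first; true means the issue failed.
// `freeDiskGb` is the smaller of the free space on /var/lib/docker and ./out/.
function evaluateBreaker(failedFlags: boolean[], freeDiskGb: number): BreakerState {
  if (freeDiskGb < MIN_FREE_DISK_GB) return "paused-by-disk";

  // Count the trailing run of failures.
  let streak = 0;
  for (let i = failedFlags.length - 1; i >= 0 && failedFlags[i]; i--) streak++;
  if (streak >= CONSECUTIVE_FAILURE_LIMIT) return "paused-by-consecutive-failures";

  // Assumption: the rate rule needs a full window before it can trip.
  const recent = failedFlags.slice(-RATE_WINDOW);
  if (recent.length === RATE_WINDOW) {
    const failures = recent.filter(Boolean).length;
    if (failures / RATE_WINDOW > RATE_THRESHOLD) return "paused-by-failure-rate";
  }
  return "running";
}
```

Keeping the decision pure (history and free space in, state out) would let the dashboard and the worker share the same logic without either touching Docker or the filesystem.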
- -### Stop conditions - -Worker exits cleanly (systemd does not restart past this point — one-shot -disable) when: - -- Queue is empty AND no `agent:in-progress` issues remain -- `STOP` file touched -- Circuit breaker tripped (alerts; stays alive but does not pick up new work) - -## Upstreaming (Out of Pipeline) - -Explicitly out of scope for this pipeline. After review, a separate human -process: - -1. Decides disclosure timing for each accepted fix -2. Sanitizes commit messages if needed -3. Applies patches to a real branch with `git am` -4. Creates real PRs against `main` (now safe — fix is known good, no - reference to the vulnerability in commit messages until disclosure) -5. Coordinates with security advisories / CVE assignment as appropriate - -The pipeline's job ends at `agent:reviewed-accept`. - -## Wall-Clock Budget - -~30 min avg per issue × 250 issues = ~125 hours = ~5.2 days continuous serial -processing. Kick off Friday, review the following week. If too slow later, -lifting to N=2 concurrency is a one-line change to the worker semaphore. - -## What We Build - -1. **Base Docker image** (`trigger-mirror-base`) — repo tar, pnpm fetch -2. **Agent Docker image** (`trigger-mirror-agent`) — base + agent SDK + - `run-agent.mjs` + prompts + MCP configs -3. **`stack.yml`** — the per-issue compose template -4. **`worker.ts`** — the daemon (~150 lines incl. reconcile, durability, - circuit breakers, alerting) -5. **`run-agent.mjs`** — the in-container bridge script (~80 lines) -6. **`/prompts/security-fix.md`** — the system prompt encoding the - validation/fix/rollout methodology and minimal-impact bias -7. **Review dashboard** — local Remix app, queue + review tabs, diff renderer, - Linear writeback -8. 
**systemd unit** + **iptables egress policy** + **MinIO bucket setup** + - **status Linear issue setup** - -## Non-Goals - -- Auto-applying patches to `main` -- Public PR creation -- CVE / advisory automation -- Multi-machine orchestration (single dedicated machine) -- Parallel issue processing (serial, N=1, by design) -- Re-running an issue automatically after failure (retries are human-driven - via re-labeling to `agent:queued`) From 9378033a57b50c2081c1787f7f3d6e15fb7f50be Mon Sep 17 00:00:00 2001 From: Daniel Sutton Date: Mon, 1 Jun 2026 20:27:46 +0100 Subject: [PATCH 05/12] docs: shorten changelog entries to one line Co-Authored-By: Claude Opus 4.7 --- .changeset/mollifier-drain-batch-size.md | 2 +- .server-changes/mollifier-drain-batch-size.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/.changeset/mollifier-drain-batch-size.md b/.changeset/mollifier-drain-batch-size.md index 9aea66a96c..c9ca0dcd86 100644 --- a/.changeset/mollifier-drain-batch-size.md +++ b/.changeset/mollifier-drain-batch-size.md @@ -2,4 +2,4 @@ "@trigger.dev/redis-worker": patch --- -`MollifierDrainer` now accepts a `drainBatchSize` option that controls how many entries it pops from a single env per tick. Default remains 1 (one pop per env per tick — previous behaviour). Setting it higher lets a single-env burst drain at handler-parallelism speed instead of one entry per ~50ms tick: the drainer pops up to `drainBatchSize` from the picked env and dispatches all popped entries through the shared `concurrency`-bounded limiter. Org/env fairness is unchanged — the per-tick env selection is unaffected. +`MollifierDrainer` accepts a `drainBatchSize` option (default 1) that lets a single env drain at full `concurrency`-parallelism per tick. 
diff --git a/.server-changes/mollifier-drain-batch-size.md b/.server-changes/mollifier-drain-batch-size.md index 3216dde528..2b0e1697b8 100644 --- a/.server-changes/mollifier-drain-batch-size.md +++ b/.server-changes/mollifier-drain-batch-size.md @@ -3,4 +3,4 @@ area: webapp type: improvement --- -Wire `TRIGGER_MOLLIFIER_DRAIN_BATCH_SIZE` (default 50) into the drainer so single-env bursts drain at the full `DRAIN_CONCURRENCY` budget instead of one pop per ~50ms tick. For a 20k-trigger burst on one env this cuts drain time from minutes to ~tens of seconds; smaller bursts (e.g. 50 on one env) drop from ~2.5s to ~50–100ms tail. +Wire `TRIGGER_MOLLIFIER_DRAIN_BATCH_SIZE` (default 50) so single-env bursts drain at the full `DRAIN_CONCURRENCY` budget per tick instead of one entry per tick. From 3bceb549fd7064a3ea5ba0cbf8f4200b8a658a53 Mon Sep 17 00:00:00 2001 From: Daniel Sutton Date: Mon, 1 Jun 2026 20:37:58 +0100 Subject: [PATCH 06/12] fix(redis-worker): bound DRAINING blast radius with worker-pool drain MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Addresses CodeRabbit review: the prefetched-pop tick design moved every popped entry into DRAINING before any of them got a pLimit slot, so a process crash mid-tick stranded up to maxOrgsPerTick × drainBatchSize entries for stale-sweep to recover (~25k at defaults). Replaces it with a worker-pool: spawn min(concurrency, totalBudget) workers; each worker round-robin-picks an env with budget remaining, pops one entry, processes it, releases its slot. At any moment, the count of popped-but-not-acked entries is bounded by `concurrency` — identical safety to the pre-batch one-pop-per-env path — while a single-env burst still uses the full concurrency budget (all workers can pull from the same env). Adds a regression test pinning the bound: never has more than `concurrency` entries popped-but-not-acked at any moment. 
Two existing batch tests now use concurrency=1 to isolate the break-on-empty/error semantic from the worker-pool's parallel-pick race (the race semantic itself is covered by the new safety test). Co-Authored-By: Claude Opus 4.7 --- .../src/mollifier/drainer.test.ts | 78 +++++++++- .../redis-worker/src/mollifier/drainer.ts | 133 +++++++++++------- 2 files changed, 159 insertions(+), 52 deletions(-) diff --git a/packages/redis-worker/src/mollifier/drainer.test.ts b/packages/redis-worker/src/mollifier/drainer.test.ts index a33835033e..1640f4fd0a 100644 --- a/packages/redis-worker/src/mollifier/drainer.test.ts +++ b/packages/redis-worker/src/mollifier/drainer.test.ts @@ -276,7 +276,13 @@ describe("MollifierDrainer.drainBatchSize", () => { handler: async (input) => { handled.push(input.runId); }, - concurrency: 5, + // Concurrency=1 so the worker pool runs sequentially and pop calls + // can't race past the `skip.add(envId)` that fires after a pop + // failure. The semantic this test pins (one env's pop blowup + // aborts its batch and counts as one failure) is the deterministic + // case; multi-worker race semantics are exercised by the safety + // property test below. + concurrency: 1, maxAttempts: 3, isRetryable: () => false, drainBatchSize: 5, @@ -484,6 +490,69 @@ describe("MollifierDrainer.drainBatchSize", () => { expect(r.failed).toBe(2); }); + it("never has more than `concurrency` entries popped-but-not-acked at any moment", async () => { + // Regression guard for the DRAINING blast radius. Each pop+process + // happens inside a single pLimit slot, so at any instant the number + // of entries that have been popped (and therefore marked DRAINING in + // a real buffer) but not yet acked is bounded by `concurrency`. This + // matters because the stale sweep only catches DRAINING entries + // visibly after a threshold — a process crash with thousands of + // mid-flight entries would mean a long detection/recovery window. 
+ const envCount = 10; + const perEnv = 20; + const queues = new Map(); + for (let i = 0; i < envCount; i++) { + queues.set( + `env_${i}`, + Array.from({ length: perEnv }, (_, j) => `env_${i}_run_${j}`), + ); + } + let inflightPoppedNotAcked = 0; + let peak = 0; + const concurrency = 4; + const buffer = makeStubBuffer({ + ...eachEnvAsOwnOrg([...queues.keys()]), + pop: async (envId: string) => { + const q = queues.get(envId); + if (!q || q.length === 0) return null; + const runId = q.shift()!; + inflightPoppedNotAcked += 1; + if (inflightPoppedNotAcked > peak) peak = inflightPoppedNotAcked; + return { + runId, + envId, + orgId: envId, + payload: "{}", + attempts: 0, + createdAt: new Date(), + } as any; + }, + ack: async () => { + inflightPoppedNotAcked -= 1; + }, + }); + + const drainer = new MollifierDrainer({ + buffer, + handler: async () => { + // Force handler overlap if scheduling allowed it — without a + // tight per-slot bound the peak would visibly exceed `concurrency`. + await new Promise((r) => setTimeout(r, 15)); + }, + concurrency, + maxAttempts: 3, + isRetryable: () => false, + drainBatchSize: perEnv, + logger: new Logger("test-drainer", "log"), + }); + + const r = await drainer.runOnce(); + expect(r.drained).toBe(envCount * perEnv); + expect(peak).toBeGreaterThan(1); // concurrency is real, not serialised + expect(peak).toBeLessThanOrEqual(concurrency); // and bounded by it + expect(inflightPoppedNotAcked).toBe(0); // everything settled + }); + it("stops popping early when the env's queue empties before reaching drainBatchSize", async () => { const queue = ["only_1", "only_2"]; const handled: string[] = []; @@ -511,7 +580,12 @@ describe("MollifierDrainer.drainBatchSize", () => { handler: async (input) => { handled.push(input.runId); }, - concurrency: 5, + // Concurrency=1 isolates the "stop on empty" semantic from the + // worker pool's parallel-pick race: with multiple workers, several + // can pick env_a simultaneously and pop in parallel before any 
of + // them can `skip.add(envId)`, so the empty-pop count would be + // >1 nondeterministically. + concurrency: 1, maxAttempts: 3, isRetryable: () => false, drainBatchSize: 10, diff --git a/packages/redis-worker/src/mollifier/drainer.ts b/packages/redis-worker/src/mollifier/drainer.ts index a2a3737f47..9ad0f26f1b 100644 --- a/packages/redis-worker/src/mollifier/drainer.ts +++ b/packages/redis-worker/src/mollifier/drainer.ts @@ -1,5 +1,4 @@ import { Logger } from "@trigger.dev/core/logger"; -import pLimit from "p-limit"; import { MollifierBuffer } from "./buffer.js"; import { BufferEntry, deserialiseSnapshot } from "./schemas.js"; @@ -84,8 +83,8 @@ export class MollifierDrainer { private readonly pollIntervalMs: number; private readonly maxOrgsPerTick: number; private readonly drainBatchSize: number; + private readonly concurrency: number; private readonly logger: Logger; - private readonly limit: ReturnType; // Rotation state. `orgCursor` advances through the active-orgs list. // Each org has its own internal cursor in `perOrgEnvCursors` for // cycling through that org's envs. Both reset on `start()`. @@ -104,8 +103,8 @@ export class MollifierDrainer { this.pollIntervalMs = options.pollIntervalMs ?? 100; this.maxOrgsPerTick = options.maxOrgsPerTick ?? 500; this.drainBatchSize = Math.max(1, options.drainBatchSize ?? 1); + this.concurrency = Math.max(1, options.concurrency); this.logger = options.logger ?? new Logger("MollifierDrainer", "debug"); - this.limit = pLimit(options.concurrency); } async runOnce(): Promise { @@ -133,64 +132,98 @@ export class MollifierDrainer { targets.push(envId); } - // Pop a batch from each target env in parallel. Within an env we pop - // sequentially (each Lua `pop` is atomic; back-to-back pops on the - // same env can't be concurrent without a `popBatch` Lua, and Redis - // RTT × drainBatchSize is cheap compared to the engine.trigger work - // that follows). 
A pop failure mid-batch aborts only that env's - // batch and counts as one failure — same semantics as the previous - // one-pop-per-env path, generalised. - const envBatches = await Promise.all( - targets.map(async (envId) => { - const entries: BufferEntry[] = []; - let popFailed = false; - for (let i = 0; i < this.drainBatchSize; i++) { - let entry: BufferEntry | null; - try { - entry = await this.buffer.pop(envId); - } catch (err) { - this.logger.error("MollifierDrainer.pop failed", { envId, err }); - popFailed = true; - break; - } - if (!entry) break; - entries.push(entry); + if (targets.length === 0) return { drained: 0, failed: 0 }; + + // Worker-pool draining. We spawn up to `concurrency` workers; each + // worker repeatedly: + // 1. Picks the next env with budget remaining (round-robin), + // atomically claiming one slot of that env's per-tick budget. + // 2. Pops one entry and processes it. + // 3. Repeats until pickNextEnv returns null. + // + // This pattern gives us both invariants the prior two designs traded + // off: + // - Single-env bursts use the full `concurrency` budget. All + // workers can pull from one env, processing `concurrency` entries + // in parallel. + // - The number of entries in "popped-but-not-acked" (DRAINING) + // state at any moment is bounded by the worker count, i.e. + // `concurrency` — same blast radius as the pre-batch + // one-pop-per-env model. A process crash mid-tick strands at + // most `concurrency` entries for stale-sweep to recover, not + // `maxOrgsPerTick × drainBatchSize`. + // + // Fairness: pickNextEnv advances a cursor by 1 each successful pick, + // so workers round-robin across envs at the entry level. Combined + // with the per-env budget cap, an env contributes at most + // `drainBatchSize` entries per tick regardless of how many workers + // are free — a heavy env can't starve siblings within a tick. 
+ const remaining = new Map(); + const skip = new Set(); // envs with empty queue or pop failure this tick + for (const envId of targets) remaining.set(envId, this.drainBatchSize); + + let cursor = 0; + const pickNextEnv = (): string | null => { + for (let i = 0; i < targets.length; i++) { + const idx = (cursor + i) % targets.length; + const envId = targets[idx]!; + if (skip.has(envId)) continue; + const r = remaining.get(envId) ?? 0; + if (r > 0) { + remaining.set(envId, r - 1); + cursor = (idx + 1) % targets.length; + return envId; } - return { entries, popFailed }; - }), - ); + } + return null; + }; - const popFailures = envBatches.reduce((n, b) => n + (b.popFailed ? 1 : 0), 0); - const allEntries = envBatches.flatMap((b) => b.entries); - if (allEntries.length === 0) { - return { drained: 0, failed: popFailures }; - } + let drained = 0; + let failed = 0; - // Dispatch every popped entry through the shared pLimit so the - // global in-flight cap is `concurrency` regardless of how many envs - // contributed entries this tick. Per-entry errors are caught inside - // the closure so a single bad entry can't poison the tick — same - // safety net the old `processOneFromEnv` provided. - const inflight = allEntries.map((entry) => - this.limit(async () => { + const worker = async (): Promise => { + while (true) { + const envId = pickNextEnv(); + if (envId === null) return; + let entry: BufferEntry | null; try { - return await this.processEntry(entry); + entry = await this.buffer.pop(envId); + } catch (err) { + // A pop failure on one env aborts that env's batch for this + // tick (don't keep hammering a broken Redis) and counts as + // exactly one failure — same as the pre-batch path on a pop + // blowup. Other envs continue. + this.logger.error("MollifierDrainer.pop failed", { envId, err }); + skip.add(envId); + failed += 1; + continue; + } + if (!entry) { + // Queue exhausted between scheduling and this pop. 
Mark the + // env skipped so siblings aren't held up by repeated empty pops. + skip.add(envId); + continue; + } + try { + const outcome = await this.processEntry(entry); + if (outcome === "drained") drained += 1; + else failed += 1; } catch (err) { this.logger.error("MollifierDrainer.processEntry failed", { - envId: entry.envId, + envId, runId: entry.runId, err, }); - return "failed" as const; + failed += 1; } - }), - ); - - const results = await Promise.all(inflight); - return { - drained: results.filter((r) => r === "drained").length, - failed: results.filter((r) => r === "failed").length + popFailures, + } }; + + const totalBudget = targets.length * this.drainBatchSize; + const workerCount = Math.min(this.concurrency, totalBudget); + await Promise.all(Array.from({ length: workerCount }, () => worker())); + + return { drained, failed }; } start(): void { From 18e64f66a0a1713e90951728724ed6521055b630 Mon Sep 17 00:00:00 2001 From: Daniel Sutton Date: Mon, 1 Jun 2026 20:42:55 +0100 Subject: [PATCH 07/12] test(redis-worker): restore real concurrency in batch tests with race-tolerant assertions MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The two batch tests that exercise pop-failure / queue-empty behaviour were temporarily set to concurrency=1 to dodge the worker pool's parallel-pick race. That collapsed the worker pool and stopped the tests from validating their semantics under genuine concurrency. Restored concurrency=5 (matching the rest of the suite) and switched the non-deterministic counts to bounded-range assertions: - mid-batch pop failure: actual drained entries are deterministic (the two bad pops + one good pop); failure count is in [1, concurrency] because workers that loop after a sibling's empty/null pop can re-pop the broken env before skip.add propagates. envBadPops is bounded by drainBatchSize + concurrency — the property is "bounded retry", not "exactly one". 
- stops popping early: popCalls in [3, concurrency + 2] and strictly less than drainBatchSize — the property is "we don't pop all the way to the batch ceiling once the queue empties". These bounds are tight enough to catch a regression to unbounded pops while tolerating the legitimate race between worker iterations. Co-Authored-By: Claude Opus 4.7 --- .../src/mollifier/drainer.test.ts | 56 +++++++++++-------- 1 file changed, 32 insertions(+), 24 deletions(-) diff --git a/packages/redis-worker/src/mollifier/drainer.test.ts b/packages/redis-worker/src/mollifier/drainer.test.ts index 1640f4fd0a..6fccba0ee8 100644 --- a/packages/redis-worker/src/mollifier/drainer.test.ts +++ b/packages/redis-worker/src/mollifier/drainer.test.ts @@ -271,32 +271,38 @@ describe("MollifierDrainer.drainBatchSize", () => { }, }); + const concurrency = 5; + const drainBatchSize = 5; const drainer = new MollifierDrainer({ buffer, handler: async (input) => { handled.push(input.runId); }, - // Concurrency=1 so the worker pool runs sequentially and pop calls - // can't race past the `skip.add(envId)` that fires after a pop - // failure. The semantic this test pins (one env's pop blowup - // aborts its batch and counts as one failure) is the deterministic - // case; multi-worker race semantics are exercised by the safety - // property test below. - concurrency: 1, + concurrency, maxAttempts: 3, isRetryable: () => false, - drainBatchSize: 5, + drainBatchSize, logger: new Logger("test-drainer", "log"), }); const result = await drainer.runOnce(); - // env_bad: 2 successful pops processed (drained) + 1 pop failure (failed). - // env_good: 1 successful pop processed (drained). + // The actual ENTRIES drained are deterministic regardless of races: + // env_bad's pop returns bad_1 then bad_2 (the only two snapshots it + // ever produces) and env_good's pop returns good_1 (its only entry). 
expect(result.drained).toBe(3); - expect(result.failed).toBe(1); expect(new Set(handled)).toEqual(new Set(["bad_1", "bad_2", "good_1"])); - // We stopped popping env_bad on the failure — no fourth attempt. - expect(envBadPops).toBe(3); + // At least one failure is recorded (env_bad's throwing pop). With + // concurrency > 1 the race between "worker loops after empty/null" + // and "skip.add(envBad) propagates" can re-pop the broken env, so + // the upper bound is concurrency. The property we're pinning is + // bounded retry, not "exactly one". + expect(result.failed).toBeGreaterThanOrEqual(1); + expect(result.failed).toBeLessThanOrEqual(concurrency); + // env_bad's pop call count is bounded too — at most concurrency + // retries after the first throw — definitely never reaches the + // drainBatchSize ceiling. + expect(envBadPops).toBeGreaterThanOrEqual(3); + expect(envBadPops).toBeLessThan(drainBatchSize + concurrency); }); it("fans batched pops out across multiple envs in a single tick", async () => { @@ -575,29 +581,31 @@ describe("MollifierDrainer.drainBatchSize", () => { }, }); + const concurrency = 5; + const drainBatchSize = 10; const drainer = new MollifierDrainer({ buffer, handler: async (input) => { handled.push(input.runId); }, - // Concurrency=1 isolates the "stop on empty" semantic from the - // worker pool's parallel-pick race: with multiple workers, several - // can pick env_a simultaneously and pop in parallel before any of - // them can `skip.add(envId)`, so the empty-pop count would be - // >1 nondeterministically. - concurrency: 1, + concurrency, maxAttempts: 3, isRetryable: () => false, - drainBatchSize: 10, + drainBatchSize, logger: new Logger("test-drainer", "log"), }); const r = await drainer.runOnce(); expect(r.drained).toBe(2); - expect(handled).toEqual(["only_1", "only_2"]); - // 2 successful pops + 1 sentinel pop that returned null and ended - // the batch loop — 3 calls, not 10. Bounding stops the Lua spam. 
- expect(popCalls).toBe(3); + expect(new Set(handled)).toEqual(new Set(["only_1", "only_2"])); + // The property we're pinning: pop calls are bounded by concurrency + // (plus the original two successes) once the queue empties — they + // never run all the way up to drainBatchSize. With concurrency > 1 + // multiple workers can race to pop env_a before `skip.add` lands, + // so the upper bound is the worker count, not a tight "3". + expect(popCalls).toBeGreaterThanOrEqual(3); // 2 success + ≥1 sentinel null + expect(popCalls).toBeLessThanOrEqual(concurrency + 2); + expect(popCalls).toBeLessThan(drainBatchSize); // bounded — the actual safety property }); }); From 1c6fc450f3b6f6230d0a08b1b0cd0beb7d7a937e Mon Sep 17 00:00:00 2001 From: Daniel Sutton Date: Mon, 1 Jun 2026 20:44:21 +0100 Subject: [PATCH 08/12] docs: clarify drainBatchSize changeset wording Co-Authored-By: Claude Opus 4.7 --- .changeset/mollifier-drain-batch-size.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.changeset/mollifier-drain-batch-size.md b/.changeset/mollifier-drain-batch-size.md index c9ca0dcd86..ee9d308ad4 100644 --- a/.changeset/mollifier-drain-batch-size.md +++ b/.changeset/mollifier-drain-batch-size.md @@ -2,4 +2,4 @@ "@trigger.dev/redis-worker": patch --- -`MollifierDrainer` accepts a `drainBatchSize` option (default 1) that lets a single env drain at full `concurrency`-parallelism per tick. +`MollifierDrainer` accepts a `drainBatchSize` option (default 1) that controls how many entries are popped per env per tick — in-flight handlers remain capped by the global `concurrency`. 
From 4bdd70d78577e7a193ff73fa0880760382b0239a Mon Sep 17 00:00:00 2001 From: Daniel Sutton Date: Mon, 1 Jun 2026 20:46:31 +0100 Subject: [PATCH 09/12] fix(redis-worker): make per-env pop-failure increment idempotent MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Multiple workers can race past pickNextEnv into the same env before skip propagates from the first failing pop. With the prior unguarded `failed += 1` each racing worker bumped the count, so a single broken env could contribute up to `concurrency` failures in one tick — drifting from the documented "one failure per env batch" contract. Guard the increment on `!skip.has(envId)` so the per-env failure count is exactly one regardless of race. Tightens the test assertion from "in [1, concurrency]" to "=== 1". Co-Authored-By: Claude Opus 4.7 --- packages/redis-worker/src/mollifier/drainer.test.ts | 12 +++++------- packages/redis-worker/src/mollifier/drainer.ts | 13 +++++++++++-- 2 files changed, 16 insertions(+), 9 deletions(-) diff --git a/packages/redis-worker/src/mollifier/drainer.test.ts b/packages/redis-worker/src/mollifier/drainer.test.ts index 6fccba0ee8..d4250641ee 100644 --- a/packages/redis-worker/src/mollifier/drainer.test.ts +++ b/packages/redis-worker/src/mollifier/drainer.test.ts @@ -291,13 +291,11 @@ describe("MollifierDrainer.drainBatchSize", () => { // ever produces) and env_good's pop returns good_1 (its only entry). expect(result.drained).toBe(3); expect(new Set(handled)).toEqual(new Set(["bad_1", "bad_2", "good_1"])); - // At least one failure is recorded (env_bad's throwing pop). With - // concurrency > 1 the race between "worker loops after empty/null" - // and "skip.add(envBad) propagates" can re-pop the broken env, so - // the upper bound is concurrency. The property we're pinning is - // bounded retry, not "exactly one". 
- expect(result.failed).toBeGreaterThanOrEqual(1); - expect(result.failed).toBeLessThanOrEqual(concurrency); + // Exactly one failure recorded for env_bad, even though multiple + // workers can race into a broken env before skip propagates — the + // catch guards the increment on `!skip.has(envId)`, so the documented + // "one failure per env batch" contract holds. + expect(result.failed).toBe(1); // env_bad's pop call count is bounded too — at most concurrency // retries after the first throw — definitely never reaches the // drainBatchSize ceiling. diff --git a/packages/redis-worker/src/mollifier/drainer.ts b/packages/redis-worker/src/mollifier/drainer.ts index 9ad0f26f1b..b5b90cdb2a 100644 --- a/packages/redis-worker/src/mollifier/drainer.ts +++ b/packages/redis-worker/src/mollifier/drainer.ts @@ -193,9 +193,18 @@ export class MollifierDrainer { // tick (don't keep hammering a broken Redis) and counts as // exactly one failure — same as the pre-batch path on a pop // blowup. Other envs continue. + // + // `pickNextEnv` decrements `remaining` before the pop settles, + // so multiple workers can race into the same env and all hit + // a throwing pop before the first catch lands. Guarding the + // failure increment on `!skip.has(envId)` keeps the per-env + // failure count at exactly one even under that race — + // matching the documented contract. 
this.logger.error("MollifierDrainer.pop failed", { envId, err }); - skip.add(envId); - failed += 1; + if (!skip.has(envId)) { + skip.add(envId); + failed += 1; + } continue; } if (!entry) { From 2d185abdc0649626c5d8d9b8f259b1ebfa19ea19 Mon Sep 17 00:00:00 2001 From: Daniel Sutton Date: Tue, 2 Jun 2026 09:22:23 +0100 Subject: [PATCH 10/12] feat(redis-worker,webapp): observability index for in-flight DRAINING entries MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds a Redis sorted set `mollifier:draining` mirroring entries currently in DRAINING state (popped by the drainer, not yet acked/failed/requeued), scored by pop wall-clock millis. Maintained atomically with the existing per-entry status transitions: - popAndMarkDraining → ZADD score=now-ms - ackMollifierEntry → ZREM - failMollifierEntry → ZREM - requeueMollifierEntry → ZREM Each pre-existing Lua picks up one extra Redis op; ack/fail also gain a runId arg so they can ZREM without a hash read. Buffer exposes: - getDrainingCount(): ZCARD — gauge value - listStaleDraining(olderThanMs, limit): ZRANGEBYSCORE — forensics after an ECS OOM ("which entries were stranded?") NOT load-bearing for correctness — per-entry hash still carries status, stale-sweep still scans queue LISTs. The set is a fast top-level index so a wiped/out-of-date set just over-reports the gauge; recovery paths are untouched. A test pins this graceful-degradation invariant. Wires `mollifier.draining.current` ObservableGauge polled every 15s on the drainer worker pods. unref'd setInterval so it can't block graceful shutdown; idempotent under dev hot-reload. Test seam exported for unit testing without spinning a real OTel meter. 
Tests: - 7 redisTest cases in buffer.test.ts (lifecycle on every Lua boundary, requeue-and-repop score replacement, listStaleDraining cutoff/limit, graceful-degradation when set is wiped) - 6 unit tests in webapp for the gauge poller (eager fire, cadence, null buffer no-op, transient-error survives, idempotent start, stop halts loop) Co-Authored-By: Claude Opus 4.7 --- .../mollifierDrainingGauge.server.ts | 63 +++++ .../v3/mollifier/mollifierTelemetry.server.ts | 33 +++ .../app/v3/mollifierDrainerWorker.server.ts | 7 + .../test/mollifierDrainingGauge.test.ts | 116 ++++++++ .../redis-worker/src/mollifier/buffer.test.ts | 248 ++++++++++++++++++ packages/redis-worker/src/mollifier/buffer.ts | 94 ++++++- 6 files changed, 557 insertions(+), 4 deletions(-) create mode 100644 apps/webapp/app/v3/mollifier/mollifierDrainingGauge.server.ts create mode 100644 apps/webapp/test/mollifierDrainingGauge.test.ts diff --git a/apps/webapp/app/v3/mollifier/mollifierDrainingGauge.server.ts b/apps/webapp/app/v3/mollifier/mollifierDrainingGauge.server.ts new file mode 100644 index 0000000000..eda8f45ebf --- /dev/null +++ b/apps/webapp/app/v3/mollifier/mollifierDrainingGauge.server.ts @@ -0,0 +1,63 @@ +import { logger } from "~/services/logger.server"; +import { getMollifierBuffer } from "./mollifierBuffer.server"; +import { reportDrainingCount } from "./mollifierTelemetry.server"; + +// How often we ZCARD the draining-tracker set. Each poll is a single +// O(1) Redis call, so cadence is bounded by "how fresh do we want the +// gauge?" rather than cost. 15s gives a tight-enough window to spot a +// brief OOM-induced spike without burning RTTs, and lines up well with +// typical Prometheus scrape intervals. +const POLL_INTERVAL_MS = 15_000; + +let intervalHandle: ReturnType<typeof setInterval> | null = null; + +// Polls `mollifier:draining` cardinality on an interval and feeds the +// gauge in `mollifierTelemetry.server.ts`.
Started from the drainer +// worker bootstrap (alongside `drainer.start()`) so it runs on the same +// pods that actually pop/ack entries — observability is colocated with +// the lifecycle. +// +// Idempotent: a second call is a no-op (Remix dev hot-reload re-runs +// the bootstrap; the existing interval keeps ticking). +export function startMollifierDrainingGauge(opts: { + intervalMs?: number; + getBuffer?: typeof getMollifierBuffer; +} = {}): void { + if (intervalHandle !== null) return; + + const intervalMs = opts.intervalMs ?? POLL_INTERVAL_MS; + const getBuffer = opts.getBuffer ?? getMollifierBuffer; + + // Fire one poll immediately so the gauge populates before the first + // scrape rather than reading 0 for a full interval after boot. + const tick = async () => { + const buffer = getBuffer(); + if (!buffer) return; + try { + const count = await buffer.getDrainingCount(); + reportDrainingCount(count); + } catch (err) { + // Transient Redis blip — don't tank the loop, just leave the + // gauge at its last-known value. A sustained Redis outage will + // surface via the drainer's own alerts long before this gauge + // staleness becomes a primary signal. + logger.warn("Mollifier draining gauge poll failed; keeping previous value", { err }); + } + }; + + void tick(); + // unref so the interval doesn't keep the process alive past + // graceful shutdown — the gauge is best-effort, not a flush boundary. + intervalHandle = setInterval(() => { + void tick(); + }, intervalMs); + intervalHandle.unref?.(); +} + +// Test seam. Production code never calls this; lifecycle is implicitly +// process-end. 
+export function stopMollifierDrainingGauge(): void { + if (intervalHandle === null) return; + clearInterval(intervalHandle); + intervalHandle = null; +} diff --git a/apps/webapp/app/v3/mollifier/mollifierTelemetry.server.ts b/apps/webapp/app/v3/mollifier/mollifierTelemetry.server.ts index f9c7ca72f1..deaa32bb74 100644 --- a/apps/webapp/app/v3/mollifier/mollifierTelemetry.server.ts +++ b/apps/webapp/app/v3/mollifier/mollifierTelemetry.server.ts @@ -90,6 +90,39 @@ meter.addBatchObservableCallback( [staleEntriesGauge], ); +// Observability gauge for entries currently in DRAINING state — popped +// by the drainer but not yet acked/failed/requeued. Backed by the +// `mollifier:draining` ZSET (see `MollifierBuffer.getDrainingCount`) +// and polled by the loop in `mollifierDrainingGauge.server.ts`. +// +// Useful for: +// - "Is anything mid-drain right now?" panels +// - Post-crash forensics ("how many entries got stranded by that ECS OOM?") +// - Alerting: a sustained non-zero with no drainer progress is a stall +// +// No `envId` attribute — same high-cardinality constraint as the other +// mollifier gauges. The per-entry hash carries env/org for drill-down. +export const drainingCountGauge = meter.createObservableGauge( + "mollifier.draining.current", + { + description: + "Mollifier buffer entries currently in DRAINING state (popped but not yet acked/failed/requeued)", + }, +); + +let latestDrainingCount = 0; + +export function reportDrainingCount(count: number): void { + latestDrainingCount = count; +} + +meter.addBatchObservableCallback( + (result) => { + result.observe(drainingCountGauge, latestDrainingCount); + }, + [drainingCountGauge], +); + // Electric SQL's shape-stream protocol adds a `handle=` query param on // every reconnect after the initial GET.
Gating the realtime-buffered // log/counter on its absence keeps the signal at one tick per diff --git a/apps/webapp/app/v3/mollifierDrainerWorker.server.ts b/apps/webapp/app/v3/mollifierDrainerWorker.server.ts index e571344141..bd348f8112 100644 --- a/apps/webapp/app/v3/mollifierDrainerWorker.server.ts +++ b/apps/webapp/app/v3/mollifierDrainerWorker.server.ts @@ -5,6 +5,7 @@ import { getMollifierDrainer, MollifierConfigurationError, } from "./mollifier/mollifierDrainer.server"; +import { startMollifierDrainingGauge } from "./mollifier/mollifierDrainingGauge.server"; declare global { // eslint-disable-next-line no-var @@ -92,6 +93,12 @@ export function initMollifierDrainerWorker( signalsEmitter.on("SIGINT", stopDrainer); global.__mollifierShutdownRegistered__ = true; drainer.start(); + // Spin up the observability-only gauge poller for the + // `mollifier:draining` ZSET cardinality. Colocated with the + // drainer because that's the loop creating the DRAINING entries + // — same pod, same Redis client lifecycle. Idempotent + unref'd + // so it's safe under dev hot-reload and doesn't block shutdown. + startMollifierDrainingGauge(); } } catch (error) { // Deterministic misconfig (shutdown-timeout vs GRACEFUL_SHUTDOWN_TIMEOUT, diff --git a/apps/webapp/test/mollifierDrainingGauge.test.ts b/apps/webapp/test/mollifierDrainingGauge.test.ts new file mode 100644 index 0000000000..18251e310b --- /dev/null +++ b/apps/webapp/test/mollifierDrainingGauge.test.ts @@ -0,0 +1,116 @@ +import { describe, expect, it, vi, beforeEach, afterEach } from "vitest"; + +// Same defensive mocks as mollifierDrainerWorker.test.ts: importing +// the gauge module transitively loads telemetry → meter → OTel +// initialisation, plus the buffer singleton's runtime resolution. 
+vi.mock("~/db.server", () => ({ prisma: {}, $replica: {} })); +vi.mock("~/services/logger.server", () => ({ + logger: { warn: vi.fn(), error: vi.fn(), info: vi.fn(), debug: vi.fn() }, +})); + +const reportDrainingCount = vi.fn(); +vi.mock("~/v3/mollifier/mollifierTelemetry.server", () => ({ + reportDrainingCount: (count: number) => reportDrainingCount(count), +})); + +import { + startMollifierDrainingGauge, + stopMollifierDrainingGauge, +} from "~/v3/mollifier/mollifierDrainingGauge.server"; + +// The gauge poller reads `mollifier:draining` cardinality on a cadence +// and forwards it to `reportDrainingCount`. These tests pin the +// observable contract: the gauge value is the buffer's count, transient +// errors keep the last value, and the loop never blocks the main thread +// (unref'd interval — verified implicitly because Vitest exits cleanly). +describe("startMollifierDrainingGauge", () => { + beforeEach(() => { + reportDrainingCount.mockReset(); + stopMollifierDrainingGauge(); + }); + + afterEach(() => { + stopMollifierDrainingGauge(); + }); + + it("fires an immediate poll on start so the gauge populates before the first scrape", async () => { + const buffer = { getDrainingCount: vi.fn().mockResolvedValue(7) } as any; + startMollifierDrainingGauge({ + intervalMs: 100_000, // long — we're checking the immediate fire, not the interval + getBuffer: () => buffer, + }); + + // Wait one microtask tick so the eager poll resolves. + await new Promise((r) => setImmediate(r)); + expect(reportDrainingCount).toHaveBeenCalledWith(7); + expect(buffer.getDrainingCount).toHaveBeenCalledTimes(1); + }); + + it("polls on the configured cadence", async () => { + const buffer = { getDrainingCount: vi.fn().mockResolvedValue(3) } as any; + startMollifierDrainingGauge({ + intervalMs: 20, + getBuffer: () => buffer, + }); + + // Eager tick + at least one interval tick. 
+ await new Promise((r) => setTimeout(r, 80)); + expect(buffer.getDrainingCount.mock.calls.length).toBeGreaterThanOrEqual(2); + expect(reportDrainingCount).toHaveBeenCalledWith(3); + }); + + it("no-ops when the buffer singleton returns null (mollifier disabled)", async () => { + startMollifierDrainingGauge({ + intervalMs: 20, + getBuffer: () => null, + }); + await new Promise((r) => setTimeout(r, 60)); + expect(reportDrainingCount).not.toHaveBeenCalled(); + }); + + it("swallows a transient ZCARD failure so the loop keeps running", async () => { + let calls = 0; + const buffer = { + getDrainingCount: vi.fn(async () => { + calls += 1; + if (calls === 1) throw new Error("transient redis blip"); + return 4; + }), + } as any; + startMollifierDrainingGauge({ + intervalMs: 20, + getBuffer: () => buffer, + }); + + await new Promise((r) => setTimeout(r, 80)); + // First call threw → no report. Second call succeeded → reported. + // The gauge keeps its previous value (stale-but-non-zero) between + // the failed poll and the next successful one — better than + // crashing the loop and going silent forever. + expect(reportDrainingCount).toHaveBeenCalledWith(4); + expect(buffer.getDrainingCount.mock.calls.length).toBeGreaterThanOrEqual(2); + }); + + it("is idempotent: a second start does not spawn a parallel loop", async () => { + const buffer = { getDrainingCount: vi.fn().mockResolvedValue(1) } as any; + startMollifierDrainingGauge({ intervalMs: 25, getBuffer: () => buffer }); + startMollifierDrainingGauge({ intervalMs: 25, getBuffer: () => buffer }); + + await new Promise((r) => setTimeout(r, 90)); + // One eager + a small number of interval ticks. Doubled-loop would + // produce ~2× the calls in the same window. Upper bound is generous + // for CI jitter; the property is "single loop", not exact count. 
+ expect(buffer.getDrainingCount.mock.calls.length).toBeLessThan(8); + }); + + it("stop halts the polling loop", async () => { + const buffer = { getDrainingCount: vi.fn().mockResolvedValue(2) } as any; + startMollifierDrainingGauge({ intervalMs: 20, getBuffer: () => buffer }); + await new Promise((r) => setTimeout(r, 50)); + const callsAtStop = buffer.getDrainingCount.mock.calls.length; + stopMollifierDrainingGauge(); + + await new Promise((r) => setTimeout(r, 80)); + expect(buffer.getDrainingCount.mock.calls.length).toBe(callsAtStop); + }); +}); diff --git a/packages/redis-worker/src/mollifier/buffer.test.ts b/packages/redis-worker/src/mollifier/buffer.test.ts index b47e41589e..3a775bbb8f 100644 --- a/packages/redis-worker/src/mollifier/buffer.test.ts +++ b/packages/redis-worker/src/mollifier/buffer.test.ts @@ -3,6 +3,7 @@ import { BufferEntrySchema, serialiseSnapshot, deserialiseSnapshot } from "./sch import { redisTest } from "@internal/testcontainers"; import { Logger } from "@trigger.dev/core/logger"; import { + DRAINING_SET_KEY, MollifierBuffer, idempotencyLookupKeyFor, makeIdempotencyClaimKey, @@ -2724,3 +2725,250 @@ describe("MollifierBuffer pre-gate claim — ownership token safety", () => { }, ); }); + +// The DRAINING set is observability-only: a sorted set keyed by the +// pop wall-clock millis whose membership mirrors entries currently in +// DRAINING state (popped, not yet acked/failed/requeued). The gauge in +// `mollifierDrainerWorker.server.ts` polls `getDrainingCount` and emits +// `mollifier.draining.current` for ops dashboards / post-crash +// forensics. Tests pin the lifecycle transitions on every Lua boundary +// so a regression that breaks the gauge surfaces here, not at 03:00. 
+describe("MollifierBuffer.draining tracker (observability)", () => { + redisTest( + "pop ZADDs to the draining set with a positive recent score", + { timeout: 20_000 }, + async ({ redisContainer }) => { + const buffer = new MollifierBuffer({ + redisOptions: { + host: redisContainer.getHost(), + port: redisContainer.getPort(), + password: redisContainer.getPassword(), + }, + logger: new Logger("test", "log"), + }); + + try { + const before = Date.now(); + await buffer.accept({ runId: "drn_1", envId: "env_a", orgId: "org_1", payload: "{}" }); + await buffer.pop("env_a"); + + const count = await buffer.getDrainingCount(); + expect(count).toBe(1); + + // Score is the pop wall-clock in millis (Redis TIME, computed + // inside the Lua). Sanity-check it's within a tight window of + // the test's wall-clock so a future bug substituting createdAt + // or zero would surface. + const score = await buffer["redis"].zscore(DRAINING_SET_KEY, "drn_1"); + expect(score).not.toBeNull(); + const scoreMs = Number(score); + expect(scoreMs).toBeGreaterThanOrEqual(before - 1_000); + expect(scoreMs).toBeLessThanOrEqual(Date.now() + 1_000); + } finally { + await buffer.close(); + } + }, + ); + + redisTest( + "ack ZREMs from the draining set", + { timeout: 20_000 }, + async ({ redisContainer }) => { + const buffer = new MollifierBuffer({ + redisOptions: { + host: redisContainer.getHost(), + port: redisContainer.getPort(), + password: redisContainer.getPassword(), + }, + logger: new Logger("test", "log"), + }); + + try { + await buffer.accept({ runId: "drn_ack", envId: "env_a", orgId: "org_1", payload: "{}" }); + await buffer.pop("env_a"); + expect(await buffer.getDrainingCount()).toBe(1); + + await buffer.ack("drn_ack"); + expect(await buffer.getDrainingCount()).toBe(0); + expect(await buffer["redis"].zscore(DRAINING_SET_KEY, "drn_ack")).toBeNull(); + } finally { + await buffer.close(); + } + }, + ); + + redisTest( + "fail ZREMs from the draining set even though the entry hash is torn 
down", + { timeout: 20_000 }, + async ({ redisContainer }) => { + const buffer = new MollifierBuffer({ + redisOptions: { + host: redisContainer.getHost(), + port: redisContainer.getPort(), + password: redisContainer.getPassword(), + }, + logger: new Logger("test", "log"), + }); + + try { + await buffer.accept({ runId: "drn_fail", envId: "env_a", orgId: "org_1", payload: "{}" }); + await buffer.pop("env_a"); + expect(await buffer.getDrainingCount()).toBe(1); + + await buffer.fail("drn_fail", { code: "X", message: "y" }); + expect(await buffer.getDrainingCount()).toBe(0); + } finally { + await buffer.close(); + } + }, + ); + + redisTest( + "requeue ZREMs from the draining set so the entry is no longer counted as in-flight", + { timeout: 20_000 }, + async ({ redisContainer }) => { + const buffer = new MollifierBuffer({ + redisOptions: { + host: redisContainer.getHost(), + port: redisContainer.getPort(), + password: redisContainer.getPassword(), + }, + logger: new Logger("test", "log"), + }); + + try { + await buffer.accept({ runId: "drn_rq", envId: "env_a", orgId: "org_1", payload: "{}" }); + await buffer.pop("env_a"); + expect(await buffer.getDrainingCount()).toBe(1); + + await buffer.requeue("drn_rq"); + // Back in QUEUED — not currently "draining", so the tracker is + // empty even though the entry hash still exists. + expect(await buffer.getDrainingCount()).toBe(0); + const entry = await buffer.getEntry("drn_rq"); + expect(entry!.status).toBe("QUEUED"); + } finally { + await buffer.close(); + } + }, + ); + + redisTest( + "the same entry going through pop → requeue → pop is tracked at the latest pop's score", + { timeout: 20_000 }, + async ({ redisContainer }) => { + // The pop Lua does ZADD (no NX/XX/GT flags) so the second pop + // overwrites the score with the new wall-clock. 
listStaleDraining + // therefore measures "time since the most recent pop", which is + // what an operator wants — a requeued entry that just got picked + // up again isn't stale, only one that was popped and stayed there. + const buffer = new MollifierBuffer({ + redisOptions: { + host: redisContainer.getHost(), + port: redisContainer.getPort(), + password: redisContainer.getPassword(), + }, + logger: new Logger("test", "log"), + }); + + try { + await buffer.accept({ runId: "drn_re", envId: "env_a", orgId: "org_1", payload: "{}" }); + await buffer.pop("env_a"); + const firstScore = Number(await buffer["redis"].zscore(DRAINING_SET_KEY, "drn_re")); + + await buffer.requeue("drn_re"); + // ZREMed; not counted as draining between pops. + expect(await buffer.getDrainingCount()).toBe(0); + + // Tiny sleep so the second pop's TIME is observably later. + await new Promise((r) => setTimeout(r, 25)); + await buffer.pop("env_a"); + const secondScore = Number(await buffer["redis"].zscore(DRAINING_SET_KEY, "drn_re")); + expect(secondScore).toBeGreaterThanOrEqual(firstScore); + } finally { + await buffer.close(); + } + }, + ); + + redisTest( + "listStaleDraining returns runIds popped before the cutoff and respects the limit", + { timeout: 20_000 }, + async ({ redisContainer }) => { + const buffer = new MollifierBuffer({ + redisOptions: { + host: redisContainer.getHost(), + port: redisContainer.getPort(), + password: redisContainer.getPassword(), + }, + logger: new Logger("test", "log"), + }); + + try { + // Pop two entries, wait, then pop a third. With the cutoff set + // to the gap, only the first two should come back. 
+ await buffer.accept({ runId: "old_1", envId: "env_a", orgId: "org_1", payload: "{}" }); + await buffer.accept({ runId: "old_2", envId: "env_a", orgId: "org_1", payload: "{}" }); + await buffer.pop("env_a"); + await buffer.pop("env_a"); + + await new Promise((r) => setTimeout(r, 75)); + + await buffer.accept({ runId: "new_1", envId: "env_b", orgId: "org_1", payload: "{}" }); + await buffer.pop("env_b"); + + // 50ms cutoff: old_1 and old_2 qualify, new_1 is too fresh. + const stale = await buffer.listStaleDraining(50, 10); + expect(stale.sort()).toEqual(["old_1", "old_2"]); + + // Limit caps the result set. + const capped = await buffer.listStaleDraining(50, 1); + expect(capped.length).toBe(1); + expect(["old_1", "old_2"]).toContain(capped[0]); + } finally { + await buffer.close(); + } + }, + ); + + redisTest( + "draining set is not load-bearing for correctness: stale-sweep & entry hash still drive recovery", + { timeout: 20_000 }, + async ({ redisContainer }) => { + // Documents the invariant we rely on for graceful degradation: if + // a deploy somehow regresses the ZREM-on-ack (or the set is + // manually wiped), correctness still holds because the per-entry + // hash carries `status` and the stale-sweep scans the queue LISTs. + // The gauge would just over-report — an ops issue, not a data-loss + // bug. Pinning this with a direct DEL keeps the principle visible + // in test output. + const buffer = new MollifierBuffer({ + redisOptions: { + host: redisContainer.getHost(), + port: redisContainer.getPort(), + password: redisContainer.getPassword(), + }, + logger: new Logger("test", "log"), + }); + + try { + await buffer.accept({ runId: "deg_1", envId: "env_a", orgId: "org_1", payload: "{}" }); + const popped = await buffer.pop("env_a"); + expect(popped!.status).toBe("DRAINING"); + + // Wipe the tracker out-of-band. Correctness must not regress. 
+ await buffer["redis"].del(DRAINING_SET_KEY); + + // ack still succeeds, entry hash still flips to materialised, + // grace TTL still applied. The next gauge poll just sees a + // count of 0 instead of 1 — an observability blip, not a bug. + await buffer.ack("deg_1"); + const after = await buffer.getEntry("deg_1"); + expect(after).not.toBeNull(); + expect(after!.materialised).toBe(true); + } finally { + await buffer.close(); + } + }, + ); +}); diff --git a/packages/redis-worker/src/mollifier/buffer.ts b/packages/redis-worker/src/mollifier/buffer.ts index 71920bb4ff..d2fff14dfc 100644 --- a/packages/redis-worker/src/mollifier/buffer.ts +++ b/packages/redis-worker/src/mollifier/buffer.ts @@ -18,6 +18,17 @@ export type MollifierBufferOptions = { // have a safety net while PG replica lag settles. const ACK_GRACE_TTL_SECONDS = 30; +// Observability-only sorted set of entries currently in DRAINING state +// (popped by the drainer, not yet acked/failed/requeued). Score is the +// pop wall-clock in milliseconds — `ZRANGEBYSCORE 0 ` gives +// the entries stuck mid-drain for longer than X. NOT load-bearing for +// correctness: the per-entry hash already carries `status` and the +// stale-sweep would catch stranded entries via the queue LISTs. This +// set is a fast top-level index for ops (gauge cardinality, post-crash +// forensics after an ECS OOM) — see `mollifierDrainerWorker` for the +// gauge wiring. +export const DRAINING_SET_KEY = "mollifier:draining"; + // ioredis reconnect backoff for the mollifier buffer client. The base // grows linearly with the attempt count and is capped at 1s (the same // envelope as the previous fixed `Math.min(times * 50, 1000)` schedule). 
@@ -204,6 +215,7 @@ export class MollifierBuffer { const encoded = (await this.redis.popAndMarkDraining( queueKey, orgsKey, + DRAINING_SET_KEY, entryPrefix, envId, "mollifier:org-envs:", @@ -493,7 +505,9 @@ export class MollifierBuffer { async ack(runId: string): Promise<void> { await this.redis.ackMollifierEntry( `mollifier:entries:${runId}`, + DRAINING_SET_KEY, String(ACK_GRACE_TTL_SECONDS), + runId, ); } @@ -501,6 +515,7 @@ export class MollifierBuffer { await this.redis.requeueMollifierEntry( `mollifier:entries:${runId}`, "mollifier:orgs", + DRAINING_SET_KEY, "mollifier:queue:", runId, "mollifier:org-envs:", @@ -516,11 +531,39 @@ export class MollifierBuffer { async fail(runId: string, error: { code: string; message: string }): Promise<boolean> { const result = await this.redis.failMollifierEntry( `mollifier:entries:${runId}`, + DRAINING_SET_KEY, JSON.stringify(error), + runId, ); return result === 1; } + // Observability-only: number of entries currently in DRAINING state + // (popped, not yet acked/failed/requeued). The gauge in the webapp + // drainer worker polls this on a short interval and emits it as + // `mollifier.draining.current` for ops dashboards and post-crash + // forensics. Cheap (single ZCARD). + async getDrainingCount(): Promise<number> { + return this.redis.zcard(DRAINING_SET_KEY); + } + + // Observability-only: list runIds that have been DRAINING longer than + // `olderThanMs` (i.e. popped before `now - olderThanMs`). Bounded by + // `limit` to keep the result set tractable when something has gone + // very wrong. ZRANGEBYSCORE is O(log N + K). Score is the pop wall-clock + // in milliseconds as written by the popAndMarkDraining Lua. + async listStaleDraining(olderThanMs: number, limit: number): Promise<string[]> { + const maxScore = Date.now() - Math.max(0, olderThanMs); + return this.redis.zrangebyscore( + DRAINING_SET_KEY, + "-inf", + String(maxScore), + "LIMIT", + 0, + Math.max(0, limit), + ); + } + // Returns Redis-side TTL on the entry hash.
Returns -1 for entries // with no TTL — the steady state under the current design, where // entries persist until drainer ack/fail. The ack grace TTL (30s @@ -630,10 +673,11 @@ export class MollifierBuffer { }); this.redis.defineCommand("requeueMollifierEntry", { - numberOfKeys: 2, + numberOfKeys: 3, lua: ` local entryKey = KEYS[1] local orgsKey = KEYS[2] + local drainingSetKey = KEYS[3] local queuePrefix = ARGV[1] local runId = ARGV[2] local orgEnvsPrefix = ARGV[3] @@ -661,19 +705,32 @@ export class MollifierBuffer { redis.call('SADD', orgsKey, orgId) redis.call('SADD', orgEnvsPrefix .. orgId, envId) end + -- Observability-only: leaving DRAINING state, so drop the + -- entry from the draining-tracker set. ZREM on absent member + -- is a no-op. + redis.call('ZREM', drainingSetKey, runId) return 1 `, }); this.redis.defineCommand("popAndMarkDraining", { - numberOfKeys: 2, + numberOfKeys: 3, lua: ` local queueKey = KEYS[1] local orgsKey = KEYS[2] + local drainingSetKey = KEYS[3] local entryPrefix = ARGV[1] local envId = ARGV[2] local orgEnvsPrefix = ARGV[3] + -- Wall-clock millis used as the ZADD score on the draining-tracker + -- set. Computed once per script invocation so all observers see + -- the same pop instant. redis.call('TIME') is deterministic per + -- script execution (Lua sees it as a single read), satisfying the + -- write-determinism contract on replicas/AOF replay. + local timeArr = redis.call('TIME') + local nowMs = tonumber(timeArr[1]) * 1000 + math.floor(tonumber(timeArr[2]) / 1000) + -- Helper: prune org-level membership when an env's queue empties. -- Called only from the success branch where we know orgId from the -- popped entry. The no-runId branch below can't reach this because @@ -706,6 +763,14 @@ export class MollifierBuffer { local entryKey = entryPrefix .. 
runId if redis.call('EXISTS', entryKey) == 1 then redis.call('HSET', entryKey, 'status', 'DRAINING') + -- Observability-only: track the runId in the draining set + -- with the pop wall-clock as score. Acked/failed/requeued + -- in the corresponding Lua scripts. The set is NOT + -- load-bearing for correctness — the per-entry hash carries + -- status — so a missed ZREM on a partial Lua execution is + -- recoverable via the stale-sweep + entry hash, not a + -- correctness bug. + redis.call('ZADD', drainingSetKey, nowMs, runId) local raw = redis.call('HGETALL', entryKey) local result = {} for i = 1, #raw, 2 do @@ -957,10 +1022,18 @@ export class MollifierBuffer { }); this.redis.defineCommand("ackMollifierEntry", { - numberOfKeys: 1, + numberOfKeys: 2, lua: ` local entryKey = KEYS[1] + local drainingSetKey = KEYS[2] local graceTtlSeconds = tonumber(ARGV[1]) + local runId = ARGV[2] + + -- Always ZREM from the draining-tracker — even if the entry hash + -- has been concurrently torn down, the runId might still be in + -- the set (e.g. fail() ran first and cleared the hash but a + -- delayed ack races in). Idempotent: ZREM on absent is a no-op. + redis.call('ZREM', drainingSetKey, runId) -- Guard: never create a partial entry. If the hash is gone between -- pop and ack (concurrent fail or eviction — QUEUED entries carry @@ -984,10 +1057,17 @@ export class MollifierBuffer { }); this.redis.defineCommand("failMollifierEntry", { - numberOfKeys: 1, + numberOfKeys: 2, lua: ` local entryKey = KEYS[1] + local drainingSetKey = KEYS[2] local errorPayload = ARGV[1] + local runId = ARGV[2] + + -- Always ZREM from the draining-tracker (idempotent on absent). + -- Mirrors ack: the runId may be in the set even if the entry hash + -- has been raced away. + redis.call('ZREM', drainingSetKey, runId) -- Guard: nothing to mark FAILED if the hash is gone (concurrent -- ack/manual cleanup). 
Returning 0 lets the caller distinguish @@ -1077,6 +1157,7 @@ declare module "@internal/redis" { popAndMarkDraining( queueKey: string, orgsKey: string, + drainingSetKey: string, entryPrefix: string, envId: string, orgEnvsPrefix: string, @@ -1085,6 +1166,7 @@ declare module "@internal/redis" { requeueMollifierEntry( entryKey: string, orgsKey: string, + drainingSetKey: string, queuePrefix: string, runId: string, orgEnvsPrefix: string, @@ -1129,12 +1211,16 @@ declare module "@internal/redis" { ): Result; ackMollifierEntry( entryKey: string, + drainingSetKey: string, graceTtlSeconds: string, + runId: string, callback?: Callback, ): Result; failMollifierEntry( entryKey: string, + drainingSetKey: string, errorPayload: string, + runId: string, callback?: Callback, ): Result; delMollifierKeyIfEquals( From 667218ad89b35147a9f4bcd1a727e7f8fcd60aa4 Mon Sep 17 00:00:00 2001 From: Daniel Sutton Date: Tue, 2 Jun 2026 09:22:50 +0100 Subject: [PATCH 11/12] docs: extend changelog entries to cover the draining tracker / gauge Co-Authored-By: Claude Opus 4.7 --- .changeset/mollifier-drain-batch-size.md | 2 +- .server-changes/mollifier-drain-batch-size.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/.changeset/mollifier-drain-batch-size.md b/.changeset/mollifier-drain-batch-size.md index ee9d308ad4..9e848b5011 100644 --- a/.changeset/mollifier-drain-batch-size.md +++ b/.changeset/mollifier-drain-batch-size.md @@ -2,4 +2,4 @@ "@trigger.dev/redis-worker": patch --- -`MollifierDrainer` accepts a `drainBatchSize` option (default 1) that controls how many entries are popped per env per tick — in-flight handlers remain capped by the global `concurrency`. +`MollifierDrainer` accepts a `drainBatchSize` option (default 1) that controls how many entries are popped per env per tick — in-flight handlers remain capped by the global `concurrency`. 
`MollifierBuffer` also gains `getDrainingCount()` / `listStaleDraining()`, backed by a new `mollifier:draining` ZSET maintained atomically with pop/ack/fail/requeue (observability-only). diff --git a/.server-changes/mollifier-drain-batch-size.md b/.server-changes/mollifier-drain-batch-size.md index 2b0e1697b8..ddb6845f63 100644 --- a/.server-changes/mollifier-drain-batch-size.md +++ b/.server-changes/mollifier-drain-batch-size.md @@ -3,4 +3,4 @@ area: webapp type: improvement --- -Wire `TRIGGER_MOLLIFIER_DRAIN_BATCH_SIZE` (default 50) so single-env bursts drain at the full `DRAIN_CONCURRENCY` budget per tick instead of one entry per tick. +Wire `TRIGGER_MOLLIFIER_DRAIN_BATCH_SIZE` (default 50) so single-env bursts drain at the full `DRAIN_CONCURRENCY` budget per tick instead of one entry per tick. Also expose `mollifier.draining.current` ObservableGauge (polled every 15s on drainer pods) for in-flight DRAINING entries. From d38b449bff56725c1985a2970360d35ef3fafd1b Mon Sep 17 00:00:00 2001 From: Daniel Sutton Date: Tue, 2 Jun 2026 12:56:05 +0100 Subject: [PATCH 12/12] chore: drop stale changesets from earlier mollifier PRs `mollifier-buffer-extensions.md` and `mollifier-drainer-terminal-failure-callback.md` describe changes that have already shipped on prior merged PRs; carrying them on this branch would double-publish in the next release. 
Co-Authored-By: Claude Opus 4.7 --- .changeset/mollifier-buffer-extensions.md | 5 ----- .changeset/mollifier-drainer-terminal-failure-callback.md | 5 ----- 2 files changed, 10 deletions(-) delete mode 100644 .changeset/mollifier-buffer-extensions.md delete mode 100644 .changeset/mollifier-drainer-terminal-failure-callback.md diff --git a/.changeset/mollifier-buffer-extensions.md b/.changeset/mollifier-buffer-extensions.md deleted file mode 100644 index c2a3b1a0e8..0000000000 --- a/.changeset/mollifier-buffer-extensions.md +++ /dev/null @@ -1,5 +0,0 @@ ---- -"@trigger.dev/redis-worker": minor ---- - -Mollifier buffer extensions: idempotency dedup, an atomic `mutateSnapshot` API, metadata CAS, claim primitives, and a `MollifierSnapshot` type. The buffer's Redis client now reconnects with jittered backoff so a fleet of clients doesn't stampede Redis in lockstep after a blip. diff --git a/.changeset/mollifier-drainer-terminal-failure-callback.md b/.changeset/mollifier-drainer-terminal-failure-callback.md deleted file mode 100644 index e0ac3400ff..0000000000 --- a/.changeset/mollifier-drainer-terminal-failure-callback.md +++ /dev/null @@ -1,5 +0,0 @@ ---- -"@trigger.dev/redis-worker": minor ---- - -Add `onTerminalFailure` callback to `MollifierDrainerOptions` so the customer's run lands a SYSTEM_FAILURE PG row even when the drainer exhausts `maxAttempts` on a retryable PG error. Previously, retryable-error exhaustion called `buffer.fail()` directly, which atomically marks FAILED + DELs the entry hash with no PG write — silent data loss when PG was unreachable across the full retry budget. The callback fires before `buffer.fail()` on any terminal path (`cause: "non-retryable"` or `"max-attempts-exhausted"`); throwing a retryable error from the callback causes the drainer to requeue rather than fail.