fix(run-store,run-engine): fix run-ops split hangs from wrong-store reads on the resume path #4163

Merged
d-cs merged 16 commits into main from fix/waitpoint-pending-check-primary
Jul 5, 2026
@d-cs d-cs commented Jul 5, 2026

Collaborator

Summary

On the run-ops split, NEW-residency runs could hang. Time-based waits (wait.for, wait.until, delay, waitpoint tokens), batchTriggerAndWait, and attempt starts stalled and never resumed. Each was a run-ops read or update that hit the wrong database: either the owning store's read replica when it needed read-your-writes, or the wrong store entirely because it routed by an id that does not encode residency.

Fixes

Waitpoint resume (the main hang). The managed resume path reads a run's completed waitpoints by snapshot id (findSnapshotCompletedWaitpointIds). Snapshot ids are cuids, which always classify to the legacy store, so a NEW run's join rows (which live on the new store) were never found. The resumed run saw zero completed waitpoints and hung. It now fans out across both stores and merges, like its sibling readers.
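
The fan-out and merge described above can be sketched as follows. This is a minimal illustration that assumes two in-memory stores standing in for the legacy and new databases; `JoinRow`, `SnapshotStore`, `createStore`, and `findCompletedWaitpointIdsFanOut` are hypothetical names, and the real `findSnapshotCompletedWaitpointIds` is asynchronous and queries Postgres.

```typescript
// Hypothetical in-memory stand-ins for the legacy and new run-ops stores.
type JoinRow = { snapshotId: string; waitpointId: string };

type SnapshotStore = {
  // Returns completed waitpoint ids joined to a snapshot on this store only.
  findCompletedWaitpointIds: (snapshotId: string) => string[];
};

function createStore(rows: JoinRow[]): SnapshotStore {
  return {
    findCompletedWaitpointIds: (snapshotId) =>
      rows.filter((row) => row.snapshotId === snapshotId).map((row) => row.waitpointId),
  };
}

// The snapshot id does not say which store holds the join rows, so ask every
// store and merge the results, skipping ids that were already collected.
function findCompletedWaitpointIdsFanOut(stores: SnapshotStore[], snapshotId: string): string[] {
  const merged: string[] = [];
  stores.forEach((store) => {
    store.findCompletedWaitpointIds(snapshotId).forEach((id) => {
      if (merged.indexOf(id) === -1) {
        merged.push(id);
      }
    });
  });
  return merged;
}

// A NEW-residency run: its join row lives on the new store only.
const legacyStore = createStore([]);
const newStore = createStore([{ snapshotId: "snap_1", waitpointId: "wp_1" }]);

// Routing by snapshot id alone lands on the legacy store and finds nothing.
const legacyOnly = legacyStore.findCompletedWaitpointIds("snap_1");
// Fanning out finds the completed waitpoint on the new store.
const fannedOut = findCompletedWaitpointIdsFanOut([legacyStore, newStore], "snap_1");
```

The cost of this shape is one extra query per lookup, which is the trade the PR accepts for ids that carry no residency information.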

Batch completion. Batch item completion (updateManyBatchTaskRunItems) routed by the item id, which is also a cuid, so a NEW batch's items were updated on the wrong store, matched zero rows, and the batch was treated as already complete (its parent's batchTriggerAndWait then hung). It now routes by the batch id, which does encode residency, matching the sibling countBatchTaskRunItems.
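
A minimal sketch of why the routing key matters. The source does not say how a batch id encodes residency, so the `new_` prefix and `classifyResidency` below are assumptions made purely for illustration; `completeItem` and the in-memory tables are likewise hypothetical.

```typescript
type Residency = "LEGACY" | "NEW";

// Assumption for illustration only: ids minted for the new store carry a
// recognisable marker, and anything else (such as a plain cuid) is legacy.
function classifyResidency(id: string): Residency {
  return id.startsWith("new_") ? "NEW" : "LEGACY";
}

type BatchItem = { id: string; batchTaskRunId: string; status: string };

// The item id is a plain cuid; only the batch id carries the residency marker.
const item: BatchItem = {
  id: "cjld2cjxh0000qzrmn831i7rn",
  batchTaskRunId: "new_batch_1",
  status: "PENDING",
};

// One in-memory table per store. The item lives on the new store.
const tables: Record<Residency, BatchItem[]> = {
  LEGACY: [],
  NEW: [item],
};

// Marks the item completed on whichever store `routingId` classifies to and
// returns the number of rows matched, like an updateMany count.
function completeItem(routingId: string, itemId: string): number {
  let updated = 0;
  tables[classifyResidency(routingId)].forEach((row) => {
    if (row.id === itemId) {
      row.status = "COMPLETED";
      updated += 1;
    }
  });
  return updated;
}

// Routing by the item id hits the legacy table and matches zero rows.
const routedByItemId = completeItem(item.id, item.id);
// Routing by the batch id hits the new table and matches the row.
const routedByBatchId = completeItem(item.batchTaskRunId, item.id);
```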

Read-your-writes on the resume path. The block-time pending-waitpoint check (countPendingWaitpoints) and the attempt-start lock check (findRun in startRunAttempt) both read the owning store's replica with no read-your-writes guarantee, so a just-committed waitpoint completion or dequeue lock could be missed under replica lag and strand the run. Both now read the owning primary.
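
The replica-lag race can be sketched with two in-memory maps. This is an illustration under stated assumptions, not the engine's code: `Client`, `primaryRows`, `replicaRows`, and this `countPendingWaitpoints` signature are hypothetical stand-ins for the Prisma writer and replica clients.

```typescript
type WaitpointStatus = "PENDING" | "COMPLETED";

// Hypothetical minimal client: looks up a waitpoint's status on one database.
type Client = { statusOf: (waitpointId: string) => WaitpointStatus | undefined };

const primaryRows = new Map<string, WaitpointStatus>();
const replicaRows = new Map<string, WaitpointStatus>();

// Both databases start in sync with the waitpoint still pending.
primaryRows.set("wp_1", "PENDING");
replicaRows.set("wp_1", "PENDING");

const primary: Client = { statusOf: (id) => primaryRows.get(id) };
const replica: Client = { statusOf: (id) => replicaRows.get(id) };

// Counts how many of the given waitpoints the chosen client still sees as pending.
function countPendingWaitpoints(waitpointIds: string[], client: Client): number {
  return waitpointIds.filter((id) => client.statusOf(id) === "PENDING").length;
}

// The waitpoint completes on the primary just before the run blocks on it.
// The replica has not applied that write yet.
primaryRows.set("wp_1", "COMPLETED");

// Stale read: the run would be marked blocked with nothing left to resume it.
const pendingOnReplica = countPendingWaitpoints(["wp_1"], replica);
// Read-your-writes: the completion is visible, so the run can continue.
const pendingOnPrimary = countPendingWaitpoints(["wp_1"], primary);
```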

Each fix ships with a two-database store or engine test that reproduces the hang and passes with the fix.

…primary

blockRunWithWaitpoint confirms whether a run is still blocked with a
separate countPendingWaitpoints query. Under the run-ops split that read
passed no client, so it resolved to the owning store's read replica. When a
waitpoint completes on the primary just before the run blocks on it (the
wait, token, or batchTriggerAndWait-child race), a lagging replica still
reports it PENDING, the run is marked blocked, and no continue job is
enqueued, so the run never resumes.

Pass the writer so the pending re-read is read-your-writes on the owning
primary. Adds a two-database engine test that reproduces the strand and
passes with the fix.
@changeset-bot

changeset-bot Bot commented Jul 5, 2026


⚠️ No Changeset found

Latest commit: 19c2759

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@coderabbitai

coderabbitai Bot commented Jul 5, 2026

Contributor


Walkthrough

This change updates run-engine reads to use the transaction-bound Prisma client for waitpoint pending checks and initial run lookup. It also changes run-store routing so completed waitpoint lookups query both stores and batch-item updates route by batchTaskRunId before id. Additional changes make waitpoint joins and blocking edges work without foreign-key checks, update webapp presenters to use split-read dependencies, and add integration tests for replica lag, routing, and cross-DB waitpoint cases.

Related PRs: None specified.
Suggested labels: run-engine, run-store, webapp, tests, bug
Suggested reviewers: Not specified.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The description is missing required template sections like Closes #issue, checklist, testing steps, changelog, and screenshots. | Add the missing template sections, including Closes #issue, checklist items, testing steps, changelog, and screenshots placeholder. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title is concise and accurately summarizes the main fix: wrong-store reads causing run-ops split hangs on the resume path. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

d-cs added 2 commits July 5, 2026 21:20
…owning store

Two run-ops split reads routed by an id that does not encode residency, so
NEW-residency data on the new store was queried on the legacy store and came
back empty:

- findSnapshotCompletedWaitpointIds routed by the snapshot id, which is a cuid
  (always classifies legacy). A resuming NEW run saw zero completed waitpoints
  and hung. It now fans out across both stores and merges.
- updateManyBatchTaskRunItems routed by the item id, also a cuid, so a NEW
  batch's items were updated on the wrong store and matched zero rows, leaving
  the batch stuck and its batchTriggerAndWait parent hanging. It now routes by
  the batch id.

Both covered by two-database store tests that reproduce the miss.
…tempt

The attempt-start lock check read the run with no client, so under the split it
hit the owning store's replica. Dequeue had just written lockedById on the
primary, so a lagging replica reported the run unlocked and rejected the start
with "Task run is not locked". It now threads the writer so the read is
read-your-writes on the owning primary, matching the sibling snapshot read.

Covered by a two-database engine test.
@d-cs d-cs changed the title from "fix(run-engine): route the block-time pending waitpoint check to the primary" to "fix(run-store,run-engine): fix run-ops split hangs from wrong-store reads on the resume path" Jul 5, 2026
…ed waitpoints

Covers getExecutionSnapshotsSince end to end for a NEW-residency run: the
resumed snapshot must carry the completed waitpoint whose join lives on the new
store, which the snapshot-id-routed lookup used to miss.
coderabbitai[bot]

This comment was marked as resolved.

@d-cs d-cs marked this pull request as ready for review July 5, 2026 20:38
@d-cs d-cs self-assigned this Jul 5, 2026

@devin-ai-integration devin-ai-integration Bot left a comment

Contributor


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.


… methods in test proxies

oxfmt the wrapped method signature, and make the lagging-replica test proxies
bind any forwarded method to the real client. Prisma delegates are proxy-based
and not pre-bound, so an unbound forwarded method could trip a this/private-field
brand check when called.
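
The brand-check failure and the binding fix can be reproduced in plain TypeScript. This is a generic sketch, not the PR's test proxy: `RealClient`, `wrapUnbound`, and `wrapBound` are hypothetical names, and a class with a private field stands in for a client whose methods check `this`.

```typescript
// Stand-in for a client whose methods only work when `this` is the real object.
class RealClient {
  #rows = ["run_1"];

  findRuns(): string[] {
    // Accessing a private field throws a TypeError if `this` is a Proxy.
    return this.#rows;
  }
}

// Forwards property reads as-is, so methods run with `this` set to the proxy.
function wrapUnbound<T extends object>(target: T): T {
  return new Proxy(target, {
    get: (obj, prop, receiver) => Reflect.get(obj, prop, receiver),
  });
}

// Binds every forwarded method to the real target before returning it.
function wrapBound<T extends object>(target: T): T {
  return new Proxy(target, {
    get: (obj, prop) => {
      const value = Reflect.get(obj, prop, obj);
      return typeof value === "function" ? value.bind(obj) : value;
    },
  });
}

let unboundFailed = false;
try {
  wrapUnbound(new RealClient()).findRuns();
} catch (error) {
  // The private-field brand check rejects the proxy as `this`.
  unboundFailed = true;
}

// With binding, the method sees the real client and succeeds.
const boundResult = wrapBound(new RealClient()).findRuns();
```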
d-cs added 2 commits July 5, 2026 21:45
… a NEW batch item

Broadens the batch-item misroute regression guard: completing a NEW-resident item must update #new and leave #legacy with zero matching rows.
… resolves for NEW runs

The run-result route built ApiRunResultPresenter with no run-ops read clients,
so under the split it read a NEW-residency run's row on the control-plane
database, found nothing, and returned 404. triggerAndWait and
runs.retrieve().result then never resolved for NEW runs, stalling the parent.
Wire the run-ops read clients the same way the batch-results route does.
devin-ai-integration[bot]

This comment was marked as resolved.

d-cs added 5 commits July 5, 2026 22:20
…store

clearBlockingWaitpoints deleted edges via the caller's control-plane tx, so a
NEW run's TaskRunWaitpoint edges (on the run-ops database) were orphaned and
re-blocked the run after a retry. Route the delete through the store, which fans
across both databases and applies the tx only to the legacy leg.

blockRunWithWaitpointEdges' legacy branch joined FROM "Waitpoint", so a LEGACY
run blocking on a NEW-resident token (whose Waitpoint row lives on the run-ops
database) matched no rows and wrote no edge, silently stranding the run. Source
the edge rows from the id array via unnest, matching the dedicated branch. Two
migrations drop the now-cross-DB TaskRunWaitpoint and _WaitpointRunConnections
foreign keys to Waitpoint; integrity is app-enforced, matching the split's
existing control-plane FK-removal pattern.
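
The difference between joining against the local "Waitpoint" table and sourcing rows from the id array can be modelled in memory. This is an illustrative sketch only: the real change is in raw SQL, and `localWaitpointIds`, `edgesFromJoin`, and `edgesFromIdArray` are hypothetical names.

```typescript
type Edge = { taskRunId: string; waitpointId: string };

// Waitpoint rows that exist on the legacy database. A NEW-resident token's
// row lives on the run-ops database and is therefore absent here.
const localWaitpointIds = ["wp_legacy"];

// Old shape: rows come from the local Waitpoint table, so an id with no
// local row silently produces no edge.
function edgesFromJoin(taskRunId: string, waitpointIds: string[]): Edge[] {
  return localWaitpointIds
    .filter((id) => waitpointIds.indexOf(id) !== -1)
    .map((waitpointId) => ({ taskRunId, waitpointId }));
}

// New shape: rows come from the id array itself (the role unnest plays in
// SQL), so every requested id yields an edge regardless of where its
// Waitpoint row lives.
function edgesFromIdArray(taskRunId: string, waitpointIds: string[]): Edge[] {
  return waitpointIds.map((waitpointId) => ({ taskRunId, waitpointId }));
}

// A LEGACY run blocking on a NEW-resident token.
const joinedEdges = edgesFromJoin("run_legacy", ["wp_new"]);
const unnestedEdges = edgesFromIdArray("run_legacy", ["wp_new"]);
```

Because the second shape no longer proves the waitpoint exists locally, it only works once the corresponding foreign keys are gone, which is why the commit pairs it with the FK-drop migrations.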
…nsert

createExecutionSnapshot and lockRunToWorker recorded completed waitpoints with a
Prisma connect on the implicit _completedWaitpoints M2M, which ORM-validates the
Waitpoint exists locally and rejects a cross-DB (NEW-resident) token. A LEGACY
parent that triggerAndWaits a NEW child then hangs when the resume snapshot
connects the NEW token. Insert the join rows FK-free after create, mirroring the
dedicated schema, and drop the _completedWaitpoints to Waitpoint FK by migration.

SpanPresenter built WaitpointPresenter with no run-ops read clients, so the trace
panel read a NEW-residency waitpoint on the control-plane database and showed
"Waitpoint not found". Wire the run-ops read clients like the other split-aware
routes.
…grations

Match the split's other FK-drop migrations: fail fast on the ACCESS EXCLUSIVE
lock instead of queueing behind a long transaction or VACUUM.
devin-ai-integration[bot]

This comment was marked as resolved.

d-cs added 2 commits July 5, 2026 22:41
…tpoint connect

The legacy raw-insert helper only calls $executeRaw but typed its client
parameter narrower than its dedicated sibling, so callers passing the store's
own client failed the build. Widen it to the same client type.

The span-panel waitpoint presenter selected connectedRuns as a Prisma relation.
That field does not exist on the dedicated run-ops Waitpoint model, so with the
split read enabled the lookup threw a validation error; it also could only ever
see connections whose join row lived on the waitpoint's own store, missing any
run connected across the two databases.

Read the run<->waitpoint join from each store instead (the explicit
WaitpointRunConnection table on the dedicated schema, the implicit
_WaitpointRunConnections M2M on the control plane), resolve each run's friendlyId
on its own store, and union the results.
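
A minimal sketch of the per-store read and union described above, assuming each store exposes its join rows and its runs as plain in-memory data. `StoreData` and `connectedRunFriendlyIds` are hypothetical; the real presenter reads two differently shaped join tables through Prisma.

```typescript
type StoreData = {
  // Run<->waitpoint join rows held on this store.
  connections: { waitpointId: string; runId: string }[];
  // Runs resident on this store, keyed by run id, valued by friendlyId.
  runs: { [runId: string]: string };
};

function connectedRunFriendlyIds(stores: StoreData[], waitpointId: string): string[] {
  // Step 1: collect connected run ids from every store's join rows.
  const runIds: string[] = [];
  stores.forEach((store) => {
    store.connections.forEach((row) => {
      if (row.waitpointId === waitpointId && runIds.indexOf(row.runId) === -1) {
        runIds.push(row.runId);
      }
    });
  });

  // Step 2: resolve each run's friendlyId on whichever store holds the run,
  // then union the results without duplicates.
  const friendlyIds: string[] = [];
  runIds.forEach((runId) => {
    stores.forEach((store) => {
      const friendlyId = store.runs[runId];
      if (friendlyId !== undefined && friendlyIds.indexOf(friendlyId) === -1) {
        friendlyIds.push(friendlyId);
      }
    });
  });
  return friendlyIds;
}

const controlPlane: StoreData = {
  connections: [{ waitpointId: "wp_1", runId: "run_a" }],
  runs: { run_a: "run_friendly_a" },
};
const dedicated: StoreData = {
  connections: [{ waitpointId: "wp_1", runId: "run_b" }],
  runs: { run_b: "run_friendly_b" },
};

const connected = connectedRunFriendlyIds([controlPlane, dedicated], "wp_1");
```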
devin-ai-integration[bot]

This comment was marked as resolved.

…edge-delete fan-out

The taskRunId-keyed deleteManyTaskRunWaitpoints fan-out dropped the caller's
transaction for both legs. The new (dedicated) leg can't join a control-plane
transaction, but the legacy leg can, so pass it through: a legacy run's blocking
edges are again deleted atomically with the caller's operation (e.g. an attempt
failure) instead of auto-committing, matching the waitpointId-keyed path.
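
The per-leg transaction handling can be sketched as below. `Tx`, `deleteOnLeg`, and `fanOutDeleteEdges` are hypothetical names used for illustration; the sketch records which transaction each leg receives rather than touching a database.

```typescript
// Hypothetical handle for a control-plane transaction.
type Tx = { name: string };

type DeleteCall = { leg: "legacy" | "new"; taskRunId: string; tx: Tx | undefined };

// Records each delete so the example can show which leg received the tx.
const calls: DeleteCall[] = [];

function deleteOnLeg(leg: "legacy" | "new", taskRunId: string, tx: Tx | undefined): void {
  calls.push({ leg, taskRunId, tx });
}

// Fan the edge delete out to both stores, passing the caller's transaction
// only where it can actually be joined.
function fanOutDeleteEdges(taskRunId: string, callerTx?: Tx): void {
  // The legacy leg shares a database with the control plane, so it can run
  // inside the caller's transaction and commit atomically with it.
  deleteOnLeg("legacy", taskRunId, callerTx);
  // The new leg lives on a separate database and cannot join that
  // transaction, so it runs on its own connection.
  deleteOnLeg("new", taskRunId, undefined);
}

const callerTx: Tx = { name: "attempt-failure" };
fanOutDeleteEdges("run_1", callerTx);

const legacyCallsInTx = calls.filter((call) => call.leg === "legacy" && call.tx === callerTx);
const newCallsWithoutTx = calls.filter((call) => call.leg === "new" && call.tx === undefined);
```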
devin-ai-integration[bot]

This comment was marked as resolved.

… createExecutionSnapshot

RoutingRunStore.createExecutionSnapshot accepted a caller transaction but never
forwarded it to the routed store. Forward it when the owning store is legacy so
a legacy-resident snapshot stays atomic with the caller's operation; a new
(cross-DB) write still cannot join a control-plane transaction, so it is dropped
there and relies on runInTransaction for atomicity.

@coderabbitai coderabbitai Bot left a comment

Contributor


🧹 Nitpick comments (1)
internal-packages/run-store/src/PostgresRunStore.ts (1)

936-949: 🧹 Nitpick | 🔵 Trivial

FK-free raw inserts depend on the sibling FK-drop migrations landing first.

#connectCompletedWaitpointsLegacy (and the legacy blockRunWithWaitpointEdges rewrite at Lines 1727-1750) rely on _completedWaitpoints_B_fkey / _WaitpointRunConnections_B_fkey / TaskRunWaitpoint_waitpointId_fkey being dropped. ON CONFLICT DO NOTHING does not absorb FK violations, so if this code reaches production before migrations 20260705210000/20260705220000/20260705230000 are applied, a cross-DB (NEW-resident) token insert will throw and re-strand the parent run. Same-DB tokens are unaffected (FK satisfied), so this only surfaces for the exact scenario this PR enables.

Please confirm the deploy pipeline applies these migrations before rolling out the run-store change.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a542549c-19cf-40fe-8a9d-1c8838bc2053

📥 Commits

Reviewing files that changed from the base of the PR and between 62247f4 and 0a2ed96.

📒 Files selected for processing (15)
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/database/prisma/migrations/20260705210000_drop_waitpoint_run_connections_waitpoint_fk/migration.sql
  • internal-packages/database/prisma/migrations/20260705220000_drop_task_run_waitpoint_waitpoint_fk/migration.sql
  • internal-packages/database/prisma/migrations/20260705230000_drop_completed_waitpoints_waitpoint_fk/migration.sql
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • internal-packages/run-store/src/PostgresRunStore.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
  • internal-packages/run-store/src/runOpsStore.batchItemMisroute.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/runOpsStore.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • internal-packages/run-store/src/runOpsStore.batchItemMisroute.test.ts
  • internal-packages/run-store/src/runOpsStore.ts
📜 Review details
⏰ Context from checks skipped due to timeout. (11)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (11, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (10, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (6, 12)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (8, 10)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (9, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (3, 12)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (7, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (5, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (3, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (1, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (4, 10)
⚠️ CI failures not shown inline (6)

GitHub Actions: 📝 Agent Instructions Audit / audit: fix(run-store,run-engine): fix run-ops split hangs from wrong-store reads on the resume path

Conclusion: failure

View job details

##[group]Run anthropics/claude-code-action@428971d2ecd6e3a7cb0ee0da2a3a8b33fdb3678d
 with:
   anthropic_***REDACTED***
   use_sticky_comment: true
   allowed_bots: devin-ai-integration[bot]
   claude_args: --max-turns 25
--model claude-opus-4-8
--allowedTools "Read,Glob,Grep,Bash(git diff:*)"
   prompt: You are reviewing a PR to check whether any agent instruction files need updating.
In this repo:
- Root shared agent guidance lives in `AGENTS.md`.
- Root `CLAUDE.md` is only a Claude Code adapter that imports `AGENTS.md`.
- Subdirectories may still have scoped `CLAUDE.md` files.
- `.claude/rules/` contains additional Claude Code guidance.
## Your task
1. Run `git diff origin/main...HEAD --name-only` to see which files changed in this PR.
2. For each changed directory, check the applicable instruction files: root `AGENTS.md`, any `CLAUDE.md` in that directory or a parent directory, and relevant `.claude/rules/` files.
3. Determine if any instruction file should be updated based on the changes. Consider:
   - New files/directories that aren't covered by existing documentation
   - Changed architecture or patterns that contradict current agent guidance
   - New dependencies, services, or infrastructure that agents should know about
   - Renamed or moved files that are referenced in an instruction file
   - Changes to build commands, test patterns, or development workflows
## Response format
If NO updates are needed, respond with exactly:
✅ Agent instruction files look current for this PR.
If updates ARE needed, respond with a short list:
📝 **Agent instruction updates suggested:**
- `AGENTS.md`: [what should be added/changed]
- `path/to/CLAUDE.md`: [what should be added/changed]
- `.claude/rules/file.md`: [what should be added/changed]
Keep suggestions specific and brief. Only flag things that would actually mislead agents in future sessions.
Do NOT suggest updates for trivial changes (bug fixes, small refactors within existing patterns).
Do NOT suggest creating new...

GitHub Actions: 📝 Agent Instructions Audit / 0_audit.txt: fix(run-store,run-engine): fix run-ops split hangs from wrong-store reads on the resume path

Conclusion: failure

View job details


GitHub Actions: 🔎 REVIEW.md Drift Audit / audit: fix(run-store,run-engine): fix run-ops split hangs from wrong-store reads on the resume path

Conclusion: failure

View job details

##[group]Run anthropics/claude-code-action@428971d2ecd6e3a7cb0ee0da2a3a8b33fdb3678d
 with:
   anthropic_***REDACTED***
   use_sticky_comment: true
   allowed_bots: devin-ai-integration[bot]
   claude_args: --max-turns 30
--allowedTools "Read,Glob,Grep,Bash(git diff:*)"
   prompt: You are auditing this PR for drift against `.claude/REVIEW.md`.
## Context
`.claude/REVIEW.md` is the repo's source of truth for what AI / agent code reviewers should treat as critical findings (rolling-deploy safety, hot-table indexes, recovery-path queries, testcontainers usage, Lua versioning, etc.). It is consumed by review agents to calibrate severity. If REVIEW.md goes stale, every future agent review degrades.
## Strategy — read this first
You have a hard turn budget. Spend it on signal, not coverage. The audit is allowed to miss things; it is NOT allowed to time out.
1. Read `.claude/REVIEW.md` once, in full.
2. Run `git diff origin/main...HEAD --name-only` to get the list of changed files. Do NOT read the diff content yet.
3. Scan the file-list for relevance to REVIEW.md scope. Relevance signals: changes to Prisma schema, Redis / queue / Lua code, hot tables, recovery / restart loops, new packages, deletions of paths REVIEW.md cites. Skim everything else.
4. Open at most **5 files** total — only the ones most likely to surface a real signal. If nothing in the file-list looks relevant to any REVIEW.md rule, do NOT read any files; go straight to the verdict.
5. Form a verdict and stop. Do not exhaust the turn budget exploring.
Large PRs (>50 files changed) are a strong signal to be MORE selective, not more thorough. Pick 3-5 files at most.
## What to look for
- **Stale references** — does any REVIEW.md rule cite a file, directory, function, table, Prisma model, or package name that has been removed or renamed in this PR (or is already gone from `main`)?
- **Contradictions** — does code in this PR clearly violate a current REVIEW.md rule? (Don't re-review the PR. Only flag if REVIE...

GitHub Actions: 🔎 REVIEW.md Drift Audit / 0_audit.txt: fix(run-store,run-engine): fix run-ops split hangs from wrong-store reads on the resume path

Conclusion: failure

View job details


GitHub Actions: 🛡️ E2E Tests: Webapp Auth (full) / 0_🛡️ E2E Auth Tests (full).txt: fix(run-store,run-engine): fix run-ops split hangs from wrong-store reads on the resume path

Conclusion: failure

View job details

mq2y1","http":{"requestId":"jLlxTn-hIvuAS-yCVXdU0","path":"/api/v2/runs/run_fna52uiprdbfmx24mq2y1/cancel","host":"localhost","method":"POST","abortController":{}},"timestamp":"","name":"webapp","message":"FinalizeTaskRunService: Resumed dependent parents","level":"log"}
 POST /api/v2/runs/run_fna52uiprdbfmx24mq2y1/cancel 200 - - 11.391 ms
 {"messageId":"f8ysautr5shh15h1b2cut","service":"marqs","reason":"FinalTaskRunService call","http":{"requestId":"wfTu2wEyHVb9Xh-t1m7__","path":"/api/v2/runs/run_f8ysautr5shh15h1b2cut/cancel","host":"localhost","method":"POST","abortController":{}},"timestamp":"","name":"webapp","message":"[marqs].acknowledgeMessage() message not found","level":"log"}
 {"runId":"f8ysautr5shh15h1b2cut","status":"CANCELED","http":{"requestId":"wfTu2wEyHVb9Xh-t1m7__","path":"/api/v2/runs/run_f8ysautr5shh15h1b2cut/cancel","host":"localhost","method":"POST","abortController":{}},"timestamp":"","name":"webapp","message":"FinalizeTaskRunService: No lockedById, so can't get the BackgroundWorkerTask. Not creating an attempt.","level":"info"}
 {"runId":"f8ysautr5shh15h1b2cut","dependency":null,"http":{"requestId":"wfTu2wEyHVb9Xh-t1m7__","path":"/api/v2/runs/run_f8ysautr5shh15h1b2cut/cancel","host":"localhost","method":"POST","abortController":{}},"timestamp":"","name":"webapp","message":"ResumeDependentParentsService: tried to find dependency","level":"log"}
 {"runId":"f8ysautr5shh15h1b2cut","http":{"requestId":"wfTu2wEyHVb9Xh-t1m7__","path":"/api/v2/runs/run_f8ysautr5shh15h1b2cut/cancel","host":"localhost","method":"POST","abortController":{}},"timestamp":"","name":"webapp","message":"ResumeDependentParentsService: dependency not found","level":"log"}
 {"result":{"success":true,"action":"no-dependencies"},"run":"f8ysautr5shh15h1b2cut","http":{"requestId":"wfTu2wEyHVb9Xh-t1m7__","path":"/api/v2/runs/run_f8ysautr5shh15h1b2cut/cancel","host":"localhost","method":"POST","abortController":{}},"timestamp":"","name":"webapp","message":"FinalizeTaskRunService: Re...

GitHub Actions: 🛡️ E2E Tests: Webapp Auth (full) / 🛡️ E2E Auth Tests (full): fix(run-store,run-engine): fix run-ops split hangs from wrong-store reads on the resume path

Conclusion: failure

View job details

2294.381,"attributes":{"http.url":"http://localhost:46159/api/v2/runs/run_rvq3dkvnkggnfko1vt6f3/cancel","http.host":"localhost:46159","net.host.name":"localhost","http.method":"POST","http.scheme":"http","http.target":"/api/v2/runs/run_rvq3dkvnkggnfko1vt6f3/cancel","http.user_agent":"node","http.request_content_length_uncompressed":2,"http.flavor":"1.1","net.transport":"ip_tcp","db.datasource":"writer","net.host.ip":"::ffff:127.0.0.1","net.host.port":46159,"net.peer.ip":"::ffff:127.0.0.1","net.peer.port":32952,"http.status_code":200,"http.status_text":"OK"},"status":{"code":0},"events":[],"links":[],"level":"trace"}
 POST /api/v2/runs/run_bmikiwzm55228rpgkp17a/cancel 403 - - 4.568 ms
 POST /api/v2/runs/run_e9ev2rkbuvys9x48nee2n/cancel 403 - - 4.587 ms
 {"messageId":"fna52uiprdbfmx24mq2y1","service":"marqs","reason":"FinalTaskRunService call","http":{"requestId":"jLlxTn-hIvuAS-yCVXdU0","path":"/api/v2/runs/run_fna52uiprdbfmx24mq2y1/cancel","host":"localhost","method":"POST","abortController":{}},"timestamp":"","name":"webapp","message":"[marqs].acknowledgeMessage() message not found","level":"log"}
 {"runId":"fna52uiprdbfmx24mq2y1","status":"CANCELED","http":{"requestId":"jLlxTn-hIvuAS-yCVXdU0","path":"/api/v2/runs/run_fna52uiprdbfmx24mq2y1/cancel","host":"localhost","method":"POST","abortController":{}},"timestamp":"","name":"webapp","message":"FinalizeTaskRunService: No lockedById, so can't get the BackgroundWorkerTask. Not creating an attempt.","level":"info"}
 {"runId":"fna52uiprdbfmx24mq2y1","dependency":null,"http":{"requestId":"jLlxTn-hIvuAS-yCVXdU0","path":"/api/v2/runs/run_fna52uiprdbfmx24mq2y1/cancel","host":"localhost","method":"POST","abortController":{}},"timestamp":"","name":"webapp","message":"ResumeDependentParentsService: tried to find dependency","level":"log"}
 {"runId":"fna52uiprdbfmx24mq2y1","http":{"requestId":"jLlxTn-hIvuAS-yCVXdU0","path":"/api/v2/runs/run_fna52uiprdbfmx24mq2y1/cancel","host":"localhost","method":"POST","abortController":{}...
🧰 Additional context used
📓 Path-based instructions (13)
internal-packages/database/**/prisma/migrations/*/*.sql

📄 CodeRabbit inference engine (internal-packages/database/CLAUDE.md)

internal-packages/database/**/prisma/migrations/*/*.sql: Clean up generated Prisma migrations by removing extraneous lines for junction tables (_BackgroundWorkerToBackgroundWorkerFile, _BackgroundWorkerToTaskQueue, _TaskRunToTaskRunTag, _WaitpointRunConnections, _completedWaitpoints) and indexes (SecretStore_key_idx, various TaskRun indexes) unless explicitly added
When adding indexes to existing tables, use CREATE INDEX CONCURRENTLY IF NOT EXISTS to avoid table locks in production, and place each concurrent index in its own separate migration file
Indexes on newly created tables can use CREATE INDEX without CONCURRENTLY and can be combined in the same migration file as the CREATE TABLE statement
When adding an index on a new column in an existing table, use two separate migrations: first for ALTER TABLE ... ADD COLUMN IF NOT EXISTS ..., then for CREATE INDEX CONCURRENTLY IF NOT EXISTS ... in its own file

Files:

  • internal-packages/database/prisma/migrations/20260705210000_drop_waitpoint_run_connections_waitpoint_fk/migration.sql
  • internal-packages/database/prisma/migrations/20260705230000_drop_completed_waitpoints_waitpoint_fk/migration.sql
  • internal-packages/database/prisma/migrations/20260705220000_drop_task_run_waitpoint_waitpoint_fk/migration.sql
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

Files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
  • internal-packages/run-store/src/PostgresRunStore.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use zod for validation in packages/core and apps/webapp

Files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

**/*.{ts,tsx,js,jsx}: Prefer static imports over dynamic import(); only use dynamic imports when resolving circular dependencies, enabling real code splitting, or conditionally loading a module at runtime.
Always import from @trigger.dev/sdk; never import from @trigger.dev/sdk/v3 or use deprecated client.defineJob.
In code that imports @trigger.dev/core, use subpath imports only and never import from the package root.

Files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
  • internal-packages/run-store/src/PostgresRunStore.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

Files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
  • internal-packages/run-store/src/PostgresRunStore.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

apps/webapp/**/*.{ts,tsx}: Access environment variables through the env export of env.server.ts instead of directly accessing process.env
Use subpath exports from @trigger.dev/core package instead of importing from the root @trigger.dev/core path

Always use findFirst instead of findUnique for Prisma queries.

Files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
apps/webapp/app/routes/**/*.ts

📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)

Use Remix flat-file route naming with dot-separated segments in app/routes/ (for example, api.v1.tasks.$taskId.trigger.ts maps to /api/v1/tasks/:taskId/trigger).

Files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
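As a rough illustration of the flat-file naming rule above, the mapping from a route file name to a URL path can be sketched as below. This is a simplified sketch only: `flatRouteToPath` is a hypothetical helper, not part of Remix or this repository, and it ignores Remix features such as pathless layout segments, `_index` routes, and escaped dots.

```typescript
// Hypothetical helper: converts a flat-file route name into a URL path.
// Simplifying assumption: only plain segments and `$param` segments occur.
function flatRouteToPath(fileName: string): string {
  // Drop the source file extension.
  const withoutExt = fileName.replace(/\.(ts|tsx|js|jsx)$/, "");

  // Each dot-separated segment becomes one URL segment;
  // a leading `$` marks a dynamic parameter.
  const segments = withoutExt
    .split(".")
    .map((segment) => (segment.startsWith("$") ? `:${segment.slice(1)}` : segment));

  return "/" + segments.join("/");
}

console.log(flatRouteToPath("api.v1.tasks.$taskId.trigger.ts"));
console.log(flatRouteToPath("api.v1.runs.$runParam.result.ts"));
```

Under these assumptions, the route file touched in this PR would correspond to /api/v1/runs/:runParam/result.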
**/*.{test,spec}.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use vitest for all tests in the Trigger.dev repository

Files:

  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
internal-packages/run-engine/src/engine/tests/**/*.test.ts

📄 CodeRabbit inference engine (internal-packages/run-engine/CLAUDE.md)

Implement tests for RunEngine in src/engine/tests/ using testcontainers for Redis and PostgreSQL containerization

Files:

  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
**/*.test.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.test.{ts,tsx,js,jsx}: Place test files next to their source files (for example, MyService.ts -> MyService.test.ts).
Use Vitest exclusively for tests, and do not mock dependencies; use testcontainers instead.

Files:

  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
apps/webapp/**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Do not import env.server.ts directly or indirectly into test files; instead pass environment-dependent values through options/parameters to make code testable

Files:

  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
apps/webapp/**/*.{test,spec}.{ts,tsx}

📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)

In test files, never import env.server.ts; pass configuration as options instead.

Files:

  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
internal-packages/run-engine/src/engine/systems/**/*.ts

📄 CodeRabbit inference engine (internal-packages/run-engine/CLAUDE.md)

Integrate OpenTelemetry tracer and meter instrumentation in RunEngine systems for observability

Files:

  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
🧠 Learnings (21)
📚 Learning: 2026-02-03T18:48:31.790Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 2994
File: internal-packages/database/prisma/migrations/20260129162810_add_integration_deployment/migration.sql:14-18
Timestamp: 2026-02-03T18:48:31.790Z
Learning: For Prisma migrations targeting PostgreSQL: - When adding indexes to existing tables, create the index in a separate migration file and include CONCURRENTLY to avoid locking the table. - For indexes on newly created tables (in CREATE TABLE statements), you can create the index in the same migration file without CONCURRENTLY. This reduces rollout complexity for new objects while protecting uptime for existing structures.

Applied to files:

  • internal-packages/database/prisma/migrations/20260705210000_drop_waitpoint_run_connections_waitpoint_fk/migration.sql
  • internal-packages/database/prisma/migrations/20260705230000_drop_completed_waitpoints_waitpoint_fk/migration.sql
  • internal-packages/database/prisma/migrations/20260705220000_drop_task_run_waitpoint_waitpoint_fk/migration.sql
📚 Learning: 2026-03-22T13:49:20.068Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: internal-packages/database/prisma/migrations/20260318114244_add_prompt_friendly_id/migration.sql:5-5
Timestamp: 2026-03-22T13:49:20.068Z
Learning: For Prisma migration SQL files under `internal-packages/database/prisma/migrations/`, it is acceptable to create indexes with `CREATE INDEX` / `CREATE UNIQUE INDEX` (i.e., without `CONCURRENTLY`) when the parent table is introduced in the same PR and has no existing production rows yet. Only require `CREATE INDEX CONCURRENTLY` (or otherwise account for existing production data/locks) when the table already exists in production with data.

Applied to files:

  • internal-packages/database/prisma/migrations/20260705210000_drop_waitpoint_run_connections_waitpoint_fk/migration.sql
  • internal-packages/database/prisma/migrations/20260705230000_drop_completed_waitpoints_waitpoint_fk/migration.sql
  • internal-packages/database/prisma/migrations/20260705220000_drop_task_run_waitpoint_waitpoint_fk/migration.sql
📚 Learning: 2026-03-22T13:26:12.060Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).

Applied to files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
  • internal-packages/run-store/src/PostgresRunStore.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
📚 Learning: 2026-03-22T19:24:14.403Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.

Applied to files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
  • internal-packages/run-store/src/PostgresRunStore.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
📚 Learning: 2026-05-18T08:21:27.694Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3632
File: apps/webapp/sentry.server.ts:4-21
Timestamp: 2026-05-18T08:21:27.694Z
Learning: When handling Prisma error P1001 ("Can't reach database server") in TypeScript, don’t assume a single error shape. Prisma can surface P1001 via two different error classes/fields: `PrismaClientKnownRequestError` exposes it as `err.code === "P1001"` (common during mid-query connection drops), while `PrismaClientInitializationError` exposes it as `err.errorCode === "P1001"` (common on client startup failure). Therefore, predicates should use `err.code === "P1001" || err.errorCode === "P1001"`. Do not flag `err.code === "P1001"` as “unreachable/never matches,” as it is expected in production.

Applied to files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
  • internal-packages/run-store/src/PostgresRunStore.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
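The predicate described in this learning can be sketched as a small structural check. This is a minimal sketch, not the repository's implementation: `isDatabaseUnreachable` and `MaybePrismaError` are hypothetical names, and the function deliberately avoids importing Prisma's error classes so that it accepts either error shape.

```typescript
// Hypothetical shape covering both Prisma error classes named in the learning:
// PrismaClientKnownRequestError exposes `code`,
// PrismaClientInitializationError exposes `errorCode`.
type MaybePrismaError = { code?: unknown; errorCode?: unknown };

// Hypothetical helper: true when the error carries P1001 under either field.
function isDatabaseUnreachable(err: unknown): boolean {
  if (typeof err !== "object" || err === null) {
    return false;
  }
  const candidate = err as MaybePrismaError;
  return candidate.code === "P1001" || candidate.errorCode === "P1001";
}

console.log(isDatabaseUnreachable({ code: "P1001" }));
console.log(isDatabaseUnreachable({ errorCode: "P1001" }));
console.log(isDatabaseUnreachable({ code: "P2002" }));
```

Checking both fields in one place keeps call sites from depending on which Prisma error class happened to surface the failure.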
📚 Learning: 2026-05-18T08:21:27.694Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3632
File: apps/webapp/sentry.server.ts:4-21
Timestamp: 2026-05-18T08:21:27.694Z
Learning: When handling Prisma errors for P1001 ("Can't reach database server"), do not assume it only appears under a single property name. Prisma may surface P1001 via either `PrismaClientKnownRequestError` (`err.code === "P1001"`, e.g., mid-query connection drops) or `PrismaClientInitializationError` (`err.errorCode === "P1001"`, e.g., client startup connection failure). To reliably detect the condition, check `err.code === "P1001" || err.errorCode === "P1001"`, and avoid review rules that would incorrectly flag `err.code === "P1001"` as unreachable/never-matching.

Applied to files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
  • internal-packages/run-store/src/PostgresRunStore.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
📚 Learning: 2026-06-13T19:53:13.759Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3937
File: packages/trigger-sdk/skills/realtime-and-frontend/SKILL.md:258-260
Timestamp: 2026-06-13T19:53:13.759Z
Learning: When reviewing code that uses `trigger.dev/react-hooks`’s `useRealtimeRun`, preserve the call signature where the first argument is the full realtime handle object (not `handle.id`). This is intentional to maintain type-safety and is consistent with the official docs; do not suggest changing the first argument from the handle object to `handle.id`.

Applied to files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
  • internal-packages/run-store/src/PostgresRunStore.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
📚 Learning: 2026-06-17T17:13:49.929Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3948
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.bulk-actions.$bulkActionParam/route.tsx:48-62
Timestamp: 2026-06-17T17:13:49.929Z
Learning: In triggerdotdev/trigger.dev, within `dashboardLoader`/`dashboardAction` (or similar context resolver code) whenever you resolve an organization ID from an organization slug for RBAC/enterprise authorization scope, always read from the primary Prisma client (`prisma`), not `$replica`. Using `$replica` can hit replica-lag and cause the RBAC lookup/authorization to run without the correct org scope (bypassing intended role enforcement). Implement the slug→org lookup with `prisma.organization.findFirst(...)` (or equivalent primary-client query) and add an inline comment documenting why the primary client is required (replica lag could lead to unscoped RBAC checks).

Applied to files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
  • internal-packages/run-store/src/PostgresRunStore.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
📚 Learning: 2026-06-23T13:04:21.413Z
Learnt from: carderne
Repo: triggerdotdev/trigger.dev PR: 4023
File: apps/webapp/app/services/upsertBranch.server.ts:14-18
Timestamp: 2026-06-23T13:04:21.413Z
Learning: In TypeScript, it’s valid to `import { type X }` and then use `typeof X` in a type-only position, e.g. `type Alias = z.infer<typeof X>`. The `type` modifier suppresses the runtime import, but the type checker still has the full exported type so `z.infer<typeof X>` can resolve correctly. In code reviews, don’t flag this as a TypeScript compile error as long as `typeof X` is used in a type context (e.g., with `z.infer`, `type` aliases, generics), not as a runtime value.

Applied to files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
  • internal-packages/run-store/src/PostgresRunStore.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
📚 Learning: 2026-05-12T21:04:05.815Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3542
File: apps/webapp/app/components/sessions/v1/SessionStatus.tsx:1-3
Timestamp: 2026-05-12T21:04:05.815Z
Learning: In this Remix + TypeScript codebase, do not flag a server/client boundary violation when a file imports only types from a module matching `*.server`.

Specifically, it’s safe to import types using `import type { Foo } from "*.server"` or `import { type Foo } from "*.server"` because TypeScript erases type-only imports at compile time and they emit no JavaScript, so they won’t cross the Remix server/client bundle boundary.

Only raise the boundary concern for value imports (e.g., `import { Foo }` without `type`, or `import Foo`), since those produce JavaScript output.

Applied to files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
📚 Learning: 2026-06-25T18:21:51.905Z
Learnt from: carderne
Repo: triggerdotdev/trigger.dev PR: 4039
File: apps/webapp/app/routes/invite-revoke.tsx:0-0
Timestamp: 2026-06-25T18:21:51.905Z
Learning: During the Zod v4 migration in the triggerdotdev/trigger.dev webapp, ensure any imports from `conform-to/zod` use the Zod-4 subpath: `conform-to/zod/v4` (e.g., `import { parseWithZod } from "conform-to/zod/v4"`). Do not import from the package root `conform-to/zod`, because it is the Zod 3 implementation and may load Zod-3-only symbols (e.g., `ZodBranded`, `ZodEffects`), which can throw at module load (notably with `zod4.4.3`). This should be enforced across `apps/webapp/**/*` where helpers like `parseWithZod` and `conformZodMessage` are used.

Applied to files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
📚 Learning: 2026-07-03T17:10:21.498Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 4148
File: apps/webapp/app/models/orgMember.server.ts:149-168
Timestamp: 2026-07-03T17:10:21.498Z
Learning: In triggerdotdev/trigger.dev, `User.email` (Prisma schema: `internal-packages/database/prisma/schema.prisma`) currently does NOT use `citext` and does NOT have a `lower(email)` functional unique index. Therefore, do not introduce Prisma queries like `where: { email: { equals: <value>, mode: "insensitive" } }` (or any case-insensitive lookup) against `User.email`, because it can force sequential scans of the `users` table under load. During review, ensure email is normalized (e.g., lowercased/trimmed) before both writes and subsequent lookups, and if true case-insensitive behavior/uniqueness is required, implement it via a separate app-wide migration (e.g., switch to `citext` and/or add a functional unique index with backfill) rather than bolting it onto individual feature PRs.

Applied to files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
📚 Learning: 2026-06-04T18:16:35.386Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 3836
File: apps/supervisor/src/backpressure/backpressureMonitor.ts:3-5
Timestamp: 2026-06-04T18:16:35.386Z
Learning: When reviewing TypeScript in this repo, apply the rule “prefer type aliases over interfaces” only to data/object shapes and union/intersection type modeling. If an interface is being used as a behavioral contract for collaborators to implement (e.g., method-shape interfaces that define required behavior, such as `BackpressureLogger` / `BackpressureSignalSource` in `apps/supervisor/src/backpressure/backpressureMonitor.ts`), keep it as an `interface` and do not flag it as a type-alias-vs-interface violation.

Applied to files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
  • internal-packages/run-store/src/PostgresRunStore.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
📚 Learning: 2026-06-09T17:58:04.699Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 3879
File: apps/webapp/app/models/vercelIntegration.server.ts:619-630
Timestamp: 2026-06-09T17:58:04.699Z
Learning: In this codebase, outbound raw `fetch` calls should typically rely on Node/undici’s default request timeout (about ~300s) rather than adding a per-call `AbortController` + `setTimeout` wrapper inside individual functions (e.g. in files like `apps/webapp/app/models/vercelIntegration.server.ts`). During code review, do not flag the absence of a per-call timeout on a single `fetch` as an issue; if per-call timeouts are needed, they should be implemented via a codebase-wide convention (e.g., a shared fetch wrapper or documented pattern) rather than ad-hoc per-function changes.

Applied to files:

  • apps/webapp/app/routes/api.v1.runs.$runParam.result.ts
  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
  • internal-packages/run-store/src/PostgresRunStore.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
📚 Learning: 2026-05-18T14:40:02.173Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3658
File: packages/core/src/v3/realtimeStreams/manager.test.ts:1-147
Timestamp: 2026-05-18T14:40:02.173Z
Learning: In the triggerdotdev/trigger.dev repo, the policy “Never mock anything — use testcontainers instead” should only be enforced for integration tests that interact with real external services (e.g., Redis, Postgres) via actual infrastructure. For unit tests that exercise pure in-memory logic (e.g., cache semantics) it is OK to stub collaborators such as `ApiClient` using Vitest (`vi.fn()`) to assert call counts or control behavior. Do not flag `vi.fn()`-based `ApiClient` stubs in unit tests as violations of the testcontainers policy.

Applied to files:

  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
📚 Learning: 2026-06-16T09:19:47.637Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3960
File: apps/webapp/test/prismaInfrastructureErrorCapture.test.ts:0-0
Timestamp: 2026-06-16T09:19:47.637Z
Learning: In this repo’s Vitest setup, `vitest.config.ts` uses `globals: true`, so identifiers like `vi`, `describe`, `it`, and `expect` are available as globals in Vitest test files. During code review, do not flag missing `vi`/`describe`/`it`/`expect` imports as a runtime error or correctness issue when they’re used in `*.test.ts/tsx` or `*.spec.ts/tsx` files. Explicit imports are still preferred for consistency, but they’re not required for runtime behavior.

Applied to files:

  • internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts
  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts
  • internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
📚 Learning: 2026-05-07T12:25:18.271Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3531
File: apps/webapp/test/sentryTraceContext.server.test.ts:9-47
Timestamp: 2026-05-07T12:25:18.271Z
Learning: In the triggerdotdev/trigger.dev webapp test suite, it is acceptable to leave `createInMemoryTracing()` calls that register a global `NodeTracerProvider` without `afterEach`/`afterAll` teardown. Do not flag this as a test-ordering risk when the code follows the established pattern used across webapp tests (e.g., replication service/benchmark/backfiller tests). This is considered safe because `trace.getActiveSpan()` when called outside a `context.with(...)` block reads `AsyncLocalStorage.getStore()` (undefined when no `run()` scope exists), so it falls back to `ROOT_CONTEXT` with no attached span—regardless of which provider is registered.

Applied to files:

  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
📚 Learning: 2026-05-28T20:02:10.647Z
Learnt from: myftija
Repo: triggerdotdev/trigger.dev PR: 3772
File: apps/webapp/test/findOrCreateBackgroundWorker.test.ts:1-1
Timestamp: 2026-05-28T20:02:10.647Z
Learning: In the triggerdotdev/trigger.dev monorepo, for the `apps/webapp` package use the established convention of storing Vitest tests (unit, integration, and e2e) under `apps/webapp/test/` rather than colocating them next to source files. Do not flag files located in `apps/webapp/test/` as violating any rule that says to colocate tests with source.

Applied to files:

  • apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts
📚 Learning: 2026-02-06T19:53:38.843Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 2994
File: apps/webapp/app/presenters/v3/DeploymentListPresenter.server.ts:233-237
Timestamp: 2026-02-06T19:53:38.843Z
Learning: When constructing Vercel dashboard URLs from deployment IDs, always strip the dpl_ prefix from the ID. Implement this by transforming the ID with .replace(/^dpl_/, "") before concatenating into the URL: https://vercel.com/${teamSlug}/${projectName}/${cleanedDeploymentId}. Consider centralizing this logic in a small helper (e.g., getVercelDeploymentId(id) or a URL builder) and add tests to verify both prefixed and non-prefixed inputs.

Applied to files:

  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
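The centralized helper suggested in this learning might look like the sketch below. The name `getVercelDeploymentId` comes from the learning itself; `buildVercelDeploymentUrl` and the sample values are assumptions made for illustration.

```typescript
// Strips the `dpl_` prefix if present; IDs without the prefix pass through unchanged.
function getVercelDeploymentId(id: string): string {
  return id.replace(/^dpl_/, "");
}

// Hypothetical URL builder using the URL shape quoted in the learning.
function buildVercelDeploymentUrl(
  teamSlug: string,
  projectName: string,
  deploymentId: string
): string {
  const cleanedDeploymentId = getVercelDeploymentId(deploymentId);
  return `https://vercel.com/${teamSlug}/${projectName}/${cleanedDeploymentId}`;
}

// Illustrative values only.
console.log(buildVercelDeploymentUrl("acme", "web", "dpl_abc123"));
console.log(buildVercelDeploymentUrl("acme", "web", "abc123"));
```

Both calls yield the same URL, which is the property the learning asks tests to verify for prefixed and non-prefixed inputs.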
📚 Learning: 2026-05-05T09:38:02.512Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3523
File: apps/webapp/app/routes/api.v3.batches.ts:178-181
Timestamp: 2026-05-05T09:38:02.512Z
Learning: When reviewing code that catches `ServiceValidationError` in `*.server.ts` files, do not blindly forward `error.status` to HTTP responses, because SVEs may be thrown with non-default statuses (e.g., 400/500) and forwarding them can cause client-visible behavioral regressions (e.g., surfacing 500s to clients). Prefer a safe default response status of `error.status ?? 422`, but only after confirming via the reachable call graph that the caught `ServiceValidationError` instances are expected to carry those non-default statuses; otherwise, normalize to `422` to avoid unexpected client-visible 5xx behavior.

Applied to files:

  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
📚 Learning: 2026-06-21T05:35:23.468Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 4005
File: apps/webapp/app/presenters/v3/ApiErrorListPresenter.server.ts:29-30
Timestamp: 2026-06-21T05:35:23.468Z
Learning: For triggerdotdev/trigger.dev list endpoints (and their presenters/handlers that implement list pagination), it is an established shared convention to allow both cursor query params `page[after]` and `page[before]` to be provided at the same time. When both are present, `page[before]` must take precedence (i.e., it should be used/wins). During code review, do NOT flag missing per-endpoint mutual-exclusion validation between `page[after]` and `page[before]` as a problem; if stricter enforcement is ever desired, it should be implemented as a codebase-wide shared convention (not individually per endpoint).

Applied to files:

  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
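The precedence rule in this learning can be illustrated with a small resolver. This is a sketch under stated assumptions: `PageParams`, `ResolvedCursor`, and `resolveCursor` are hypothetical names, and the repository's presenters may represent cursors differently.

```typescript
// Hypothetical input: the already-parsed values of page[after] and page[before].
type PageParams = { after?: string; before?: string };

// Hypothetical result type describing which direction the list is paged in.
type ResolvedCursor =
  | { direction: "backward"; cursor: string }
  | { direction: "forward"; cursor: string }
  | { direction: "first-page" };

function resolveCursor(params: PageParams): ResolvedCursor {
  // Both cursors may be supplied together; page[before] takes precedence.
  if (params.before) {
    return { direction: "backward", cursor: params.before };
  }
  if (params.after) {
    return { direction: "forward", cursor: params.after };
  }
  return { direction: "first-page" };
}

console.log(resolveCursor({ after: "cursor_a", before: "cursor_b" }));
```

Accepting both parameters and resolving the conflict in one shared place avoids per-endpoint validation, which matches the convention the learning describes.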
🔇 Additional comments (15)
internal-packages/run-engine/src/engine/systems/waitpointSystem.ts (2)

62-68: LGTM!


509-511: LGTM!

apps/webapp/app/presenters/v3/SpanPresenter.server.ts (1)

726-730: LGTM!

apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts (1)

77-136: LGTM!

apps/webapp/app/routes/api.v1.runs.$runParam.result.ts (1)

31-35: LGTM!

apps/webapp/test/waitpointPresenter.dedicatedConnectedRuns.readthrough.test.ts (1)

1-253: LGTM!

internal-packages/run-engine/src/engine/tests/clearBlockingWaitpointsResidency.test.ts (2)

137-182: LGTM!


17-17: 🩺 Stability & Availability

describe is available via Vitest globals, so the missing import is fine.

> Likely an incorrect or invalid review comment.

internal-packages/run-store/src/PostgresRunStore.ts (1)

993-1013: LGTM!

Also applies to: 1575-1592, 1722-1750

internal-packages/database/prisma/migrations/20260705210000_drop_waitpoint_run_connections_waitpoint_fk/migration.sql (1)

1-8: LGTM!

internal-packages/database/prisma/migrations/20260705220000_drop_task_run_waitpoint_waitpoint_fk/migration.sql (1)

1-9: LGTM!

internal-packages/database/prisma/migrations/20260705230000_drop_completed_waitpoints_waitpoint_fk/migration.sql (1)

1-9: LGTM!

internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts (1)

315-397: LGTM!

internal-packages/run-store/src/runOpsStore.crossDbCompletedWaitpoint.test.ts (1)

1-162: LGTM!

internal-packages/run-store/src/runOpsStore.crossDbTokenBlock.test.ts (1)

1-144: LGTM!

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts (1)

402-459: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Consider adding a symmetric #new (cross-DB) case for full branch coverage.

Both new tests exercise only the #legacy owning-store branch (store === this.#legacy ? tx : undefined). Per the commit summary, the #new branch intentionally drops the tx and relies on runInTransaction for atomicity — a companion test asserting that a NEW-residency run's createExecutionSnapshot call with a caller tx still persists (non-atomically) would close out coverage of the conditional introduced in runOpsStore.ts.

internal-packages/run-store/src/runOpsStore.ts (1)

1497-1497: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Stale doc comment: priority order no longer matches the code.

The comment still reads "Route by item id or batchTaskRunId" but the code below now checks batchTaskRunId first (Line 1505) precisely because leading with the cuid id misroutes NEW batch items. Worth correcting so the load-bearing routing-order rationale doesn't mislead future changes to this method.

✏️ Proposed fix
-  // Route by item `id` or `batchTaskRunId` when scalar; else fan out to both and sum.
+  // Route by `batchTaskRunId` (residency-encoding) or item `id` when scalar; else fan out to both and sum.
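
The routing order in the corrected comment can be illustrated with a small sketch. Everything here is an assumption made for illustration: `classifyResidency` and its `new_` marker are hypothetical (the thread does not say how a batch id encodes residency), and `routeBatchItemUpdate` is not the actual method in `runOpsStore.ts`.

```typescript
type StoreName = "legacy" | "new";
type BatchItemWhere = { id?: string; batchTaskRunId?: string };

// Hypothetical classifier: ids carrying an assumed "new_" marker live on the
// new store; anything else (including plain cuids) classifies to legacy.
function classifyResidency(id: string): StoreName {
  return id.startsWith("new_") ? "new" : "legacy";
}

// Prefer the residency-encoding batch id, fall back to the item id, and fan
// out to both stores when neither is a scalar.
function routeBatchItemUpdate(where: BatchItemWhere): StoreName | "both" {
  if (typeof where.batchTaskRunId === "string") {
    return classifyResidency(where.batchTaskRunId);
  }
  if (typeof where.id === "string") {
    return classifyResidency(where.id);
  }
  return "both";
}
```

Checking `batchTaskRunId` first matters because a cuid item id always classifies to legacy, so leading with it would send a NEW batch's update to the wrong store.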

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 01631dde-f86d-4773-919e-f01f6c143f76

📥 Commits

Reviewing files that changed from the base of the PR and between 0a2ed96 and 19c2759.

📒 Files selected for processing (2)
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
  • internal-packages/run-store/src/runOpsStore.ts
📜 Review details
⏰ Context from checks skipped due to timeout. (11)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (7, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (8, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (5, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (4, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (3, 10)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (10, 12)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (1, 10)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (6, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (11, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (9, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (3, 12)
⚠️ CI failures not shown inline (6)

GitHub Actions: 📝 Agent Instructions Audit / audit: fix(run-store,run-engine): fix run-ops split hangs from wrong-store reads on the resume path

Conclusion: failure

View job details

##[group]Run anthropics/claude-code-action@428971d2ecd6e3a7cb0ee0da2a3a8b33fdb3678d
 with:
   anthropic_***REDACTED***
   use_sticky_comment: true
   allowed_bots: devin-ai-integration[bot]
   claude_args: --max-turns 25
--model claude-opus-4-8
--allowedTools "Read,Glob,Grep,Bash(git diff:*)"
   prompt: You are reviewing a PR to check whether any agent instruction files need updating.
In this repo:
- Root shared agent guidance lives in `AGENTS.md`.
- Root `CLAUDE.md` is only a Claude Code adapter that imports `AGENTS.md`.
- Subdirectories may still have scoped `CLAUDE.md` files.
- `.claude/rules/` contains additional Claude Code guidance.
## Your task
1. Run `git diff origin/main...HEAD --name-only` to see which files changed in this PR.
2. For each changed directory, check the applicable instruction files: root `AGENTS.md`, any `CLAUDE.md` in that directory or a parent directory, and relevant `.claude/rules/` files.
3. Determine if any instruction file should be updated based on the changes. Consider:
   - New files/directories that aren't covered by existing documentation
   - Changed architecture or patterns that contradict current agent guidance
   - New dependencies, services, or infrastructure that agents should know about
   - Renamed or moved files that are referenced in an instruction file
   - Changes to build commands, test patterns, or development workflows
## Response format
If NO updates are needed, respond with exactly:
✅ Agent instruction files look current for this PR.
If updates ARE needed, respond with a short list:
📝 **Agent instruction updates suggested:**
- `AGENTS.md`: [what should be added/changed]
- `path/to/CLAUDE.md`: [what should be added/changed]
- `.claude/rules/file.md`: [what should be added/changed]
Keep suggestions specific and brief. Only flag things that would actually mislead agents in future sessions.
Do NOT suggest updates for trivial changes (bug fixes, small refactors within existing patterns).
Do NOT suggest creating new...

GitHub Actions: 📝 Agent Instructions Audit / 0_audit.txt: fix(run-store,run-engine): fix run-ops split hangs from wrong-store reads on the resume path

Conclusion: failure

View job details


GitHub Actions: 🔎 REVIEW.md Drift Audit / audit: fix(run-store,run-engine): fix run-ops split hangs from wrong-store reads on the resume path

Conclusion: failure

View job details

##[group]Run anthropics/claude-code-action@428971d2ecd6e3a7cb0ee0da2a3a8b33fdb3678d
 with:
   anthropic_***REDACTED***
   use_sticky_comment: true
   allowed_bots: devin-ai-integration[bot]
   claude_args: --max-turns 30
--allowedTools "Read,Glob,Grep,Bash(git diff:*)"
   prompt: You are auditing this PR for drift against `.claude/REVIEW.md`.
## Context
`.claude/REVIEW.md` is the repo's source of truth for what AI / agent code reviewers should treat as critical findings (rolling-deploy safety, hot-table indexes, recovery-path queries, testcontainers usage, Lua versioning, etc.). It is consumed by review agents to calibrate severity. If REVIEW.md goes stale, every future agent review degrades.
## Strategy — read this first
You have a hard turn budget. Spend it on signal, not coverage. The audit is allowed to miss things; it is NOT allowed to time out.
1. Read `.claude/REVIEW.md` once, in full.
2. Run `git diff origin/main...HEAD --name-only` to get the list of changed files. Do NOT read the diff content yet.
3. Scan the file-list for relevance to REVIEW.md scope. Relevance signals: changes to Prisma schema, Redis / queue / Lua code, hot tables, recovery / restart loops, new packages, deletions of paths REVIEW.md cites. Skim everything else.
4. Open at most **5 files** total — only the ones most likely to surface a real signal. If nothing in the file-list looks relevant to any REVIEW.md rule, do NOT read any files; go straight to the verdict.
5. Form a verdict and stop. Do not exhaust the turn budget exploring.
Large PRs (>50 files changed) are a strong signal to be MORE selective, not more thorough. Pick 3-5 files at most.
## What to look for
- **Stale references** — does any REVIEW.md rule cite a file, directory, function, table, Prisma model, or package name that has been removed or renamed in this PR (or is already gone from `main`)?
- **Contradictions** — does code in this PR clearly violate a current REVIEW.md rule? (Don't re-review the PR. Only flag if REVIE...

GitHub Actions: 🔎 REVIEW.md Drift Audit / 0_audit.txt: fix(run-store,run-engine): fix run-ops split hangs from wrong-store reads on the resume path

Conclusion: failure

View job details


GitHub Actions: 🛡️ E2E Tests: Webapp Auth (full) / 🛡️ E2E Auth Tests (full): fix(run-store,run-engine): fix run-ops split hangs from wrong-store reads on the resume path

Conclusion: failure

View job details

0 - - 9.632 ms
 GET /api/v1/runs 403 - - 7.238 ms
 GET /api/v1/runs 403 - - 6.684 ms
 [][ERROR][`@clickhouse/client`][Connection] Query: HTTP request error.
 Arguments: {
   query: 'SELECT run_id, toUnixTimestamp64Milli(created_at) AS created_at_ms FROM trigger_dev.task_runs_v2 FINAL WHERE organization_id = {organizationId: String} AND project_id = {projectId: String} AND environment_id = {environmentId: String} AND task_identifier IN {tasks: Array(String)} AND created_at >= fromUnixTimestamp64Milli({period: Int64}) ORDER BY created_at DESC, run_id DESC LIMIT 26 \n' +
     'FORMAT JSONEachRow',
   search_params: 'query_id=aa1a7286-4d1b-4ce3-9a7a-8140ab25e92c&param_organizationId=cmr8cr73x007bqnc4d2t3vix4&param_projectId=cmr8cr73y007dqnc406ze1j22&param_environmentId=cmr8cr73z007fqnc4glt7mflx&param_tasks=%5B%27task_a%27%2C%27task_b%27%5D&param_period=1782685086923&output_format_json_quote_64bit_integers=0&output_format_json_quote_64bit_floats=0&cancel_http_readonly_queries_on_client_close=1',
   with_abort_signal: false,
   session_id: undefined,
   query_id: 'aa1a7286-4d1b-4ce3-9a7a-8140ab25e92c',
   decompress_response: false,
   clickhouse_settings: {
     output_format_json_quote_64bit_integers: 0,
     output_format_json_quote_64bit_floats: 0,
     cancel_http_readonly_queries_on_client_close: 1
   }
 }
 Caused by: Error: connect ECONNREFUSED 127.0.0.1:19123
     at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1638:16)
     at TCPConnectWrap.callbackTrampoline (node:internal/async_hooks:130:17) {
   errno: -111,
   code: 'ECONNREFUSED',
   syscall: 'connect',
   address: '127.0.0.1',
   port: 19123
 }
 {"name":"ClickHouse","error":{"message":"connect ECONNREFUSED 127.0.0.1:19123","stack":"Error: connect ECONNREFUSED 127.0.0.1:19123\n    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1638:16)\n    at TCPConnectWrap.callbackTrampoline (node:internal/async_hooks:130:17)","name":"Error"},"query":"SELECT run_id, toUnixTimestamp64Milli(created_at) A...

GitHub Actions: 🛡️ E2E Tests: Webapp Auth (full) / 0_🛡️ E2E Auth Tests (full).txt: fix(run-store,run-engine): fix run-ops split hangs from wrong-store reads on the resume path

Conclusion: failure

View job details

ask_identifier IN {tasks: Array(String)} AND created_at >= fromUnixTimestamp64Milli({period: Int64}) ORDER BY created_at DESC, run_id DESC LIMIT 26","params":{"organizationId":"cmr8cr73x007bqnc4d2t3vix4","projectId":"cmr8cr73y007dqnc406ze1j22","environmentId":"cmr8cr73z007fqnc4glt7mflx","tasks":["task_a","task_b"],"period":1782685086923},"queryId":"aa1a7286-4d1b-4ce3-9a7a-8140ab25e92c","timestamp":"","message":"Error querying clickhouse","level":"error"}
 {"error":{"name":"QueryError","message":"Unable to query clickhouse: connect ECONNREFUSED 127.0.0.1:19123","stack":"QueryError: Unable to query clickhouse: connect ECONNREFUSED 127.0.0.1:19123\n    at /home/runner/_work/trigger.dev/trigger.dev/apps/webapp/build/index.js:75633:17\n    at process.processTicksAndRejections (node:internal/process/task_queues:103:5)\n    at async /home/runner/_work/trigger.dev/trigger.dev/apps/webapp/build/index.js:260:14\n    at async /home/runner/_work/trigger.dev/trigger.dev/apps/webapp/build/index.js:75584:18\n    at async ClickHouseRunsRepository.listRunRows (/home/runner/_work/trigger.dev/trigger.dev/apps/webapp/build/index.js:117771:36)\n    at async ClickHouseRunsRepository.listRunIds (/home/runner/_work/trigger.dev/trigger.dev/apps/webapp/build/index.js:117784:20)\n    at async ClickHouseRunsRepository.listRuns (/home/runner/_work/trigger.dev/trigger.dev/apps/webapp/build/index.js:117836:38)\n    at async /home/runner/_work/trigger.dev/trigger.dev/apps/webapp/build/index.js:35449:14\n    at async NextRunListPresenter.call (/home/runner/_work/trigger.dev/trigger.dev/apps/webapp/build/index.js:157003:39)\n    at async /home/runner/_work/trigger.dev/trigger.dev/apps/webapp/build/index.js:259121:21"},"url":"http://localhost:42911/api/v1/runs?filter%5BtaskIdentifier%5D=task_a%2Ctask_b","http":{"requestId":"jfyIOveU3VtQHmGVw-MvC","path":"/api/v1/runs?filter%5BtaskIdentifier%5D=task_a%2Ctask_b","host":"localhost","method":"GET","abortController":{}},"timestamp":"","name":"webapp","...
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

Files:

  • internal-packages/run-store/src/runOpsStore.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

**/*.{ts,tsx,js,jsx}: Prefer static imports over dynamic import(); only use dynamic imports when resolving circular dependencies, enabling real code splitting, or conditionally loading a module at runtime.
Always import from @trigger.dev/sdk; never import from @trigger.dev/sdk/v3 or use deprecated client.defineJob.
In code that imports @trigger.dev/core, use subpath imports only and never import from the package root.

Files:

  • internal-packages/run-store/src/runOpsStore.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

Files:

  • internal-packages/run-store/src/runOpsStore.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
**/*.{test,spec}.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use vitest for all tests in the Trigger.dev repository

Files:

  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
**/*.test.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.test.{ts,tsx,js,jsx}: Place test files next to their source files (for example, MyService.ts -> MyService.test.ts).
Use Vitest exclusively for tests, and do not mock dependencies; use testcontainers instead.

Files:

  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
🧠 Learnings (11)
📚 Learning: 2026-03-22T13:26:12.060Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).

Applied to files:

  • internal-packages/run-store/src/runOpsStore.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
📚 Learning: 2026-03-22T19:24:14.403Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.

Applied to files:

  • internal-packages/run-store/src/runOpsStore.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
📚 Learning: 2026-05-18T08:21:27.694Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3632
File: apps/webapp/sentry.server.ts:4-21
Timestamp: 2026-05-18T08:21:27.694Z
Learning: When handling Prisma error P1001 ("Can't reach database server") in TypeScript, don’t assume a single error shape. Prisma can surface P1001 via two different error classes/fields: `PrismaClientKnownRequestError` exposes it as `err.code === "P1001"` (common during mid-query connection drops), while `PrismaClientInitializationError` exposes it as `err.errorCode === "P1001"` (common on client startup failure). Therefore, predicates should use `err.code === "P1001" || err.errorCode === "P1001"`. Do not flag `err.code === "P1001"` as “unreachable/never matches,” as it is expected in production.

Applied to files:

  • internal-packages/run-store/src/runOpsStore.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
📚 Learning: 2026-05-18T08:21:27.694Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3632
File: apps/webapp/sentry.server.ts:4-21
Timestamp: 2026-05-18T08:21:27.694Z
Learning: When handling Prisma errors for P1001 ("Can't reach database server"), do not assume it only appears under a single property name. Prisma may surface P1001 via either `PrismaClientKnownRequestError` (`err.code === "P1001"`, e.g., mid-query connection drops) or `PrismaClientInitializationError` (`err.errorCode === "P1001"`, e.g., client startup connection failure). To reliably detect the condition, check `err.code === "P1001" || err.errorCode === "P1001"`, and avoid review rules that would incorrectly flag `err.code === "P1001"` as unreachable/never-matching.

Applied to files:

  • internal-packages/run-store/src/runOpsStore.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
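
The predicate from the two P1001 learnings above, written as a small structural check. `isPrismaUnreachableError` is a hypothetical helper name; the sketch inspects plain properties rather than importing Prisma's error classes.

```typescript
// Returns true when an error carries Prisma's P1001 code under either
// property name: `code` (known request error) or `errorCode`
// (initialization error).
function isPrismaUnreachableError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) {
    return false;
  }
  const candidate = err as { code?: unknown; errorCode?: unknown };
  return candidate.code === "P1001" || candidate.errorCode === "P1001";
}
```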
📚 Learning: 2026-06-13T19:53:13.759Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3937
File: packages/trigger-sdk/skills/realtime-and-frontend/SKILL.md:258-260
Timestamp: 2026-06-13T19:53:13.759Z
Learning: When reviewing code that uses `trigger.dev/react-hooks`’s `useRealtimeRun`, preserve the call signature where the first argument is the full realtime handle object (not `handle.id`). This is intentional to maintain type-safety and is consistent with the official docs; do not suggest changing the first argument from the handle object to `handle.id`.

Applied to files:

  • internal-packages/run-store/src/runOpsStore.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
📚 Learning: 2026-06-17T17:13:49.929Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3948
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.bulk-actions.$bulkActionParam/route.tsx:48-62
Timestamp: 2026-06-17T17:13:49.929Z
Learning: In triggerdotdev/trigger.dev, within `dashboardLoader`/`dashboardAction` (or similar context resolver code) whenever you resolve an organization ID from an organization slug for RBAC/enterprise authorization scope, always read from the primary Prisma client (`prisma`), not `$replica`. Using `$replica` can hit replica-lag and cause the RBAC lookup/authorization to run without the correct org scope (bypassing intended role enforcement). Implement the slug→org lookup with `prisma.organization.findFirst(...)` (or equivalent primary-client query) and add an inline comment documenting why the primary client is required (replica lag could lead to unscoped RBAC checks).

Applied to files:

  • internal-packages/run-store/src/runOpsStore.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
📚 Learning: 2026-06-23T13:04:21.413Z
Learnt from: carderne
Repo: triggerdotdev/trigger.dev PR: 4023
File: apps/webapp/app/services/upsertBranch.server.ts:14-18
Timestamp: 2026-06-23T13:04:21.413Z
Learning: In TypeScript, it’s valid to `import { type X }` and then use `typeof X` in a type-only position, e.g. `type Alias = z.infer<typeof X>`. The `type` modifier suppresses the runtime import, but the type checker still has the full exported type so `z.infer<typeof X>` can resolve correctly. In code reviews, don’t flag this as a TypeScript compile error as long as `typeof X` is used in a type context (e.g., with `z.infer`, `type` aliases, generics), not as a runtime value.

Applied to files:

  • internal-packages/run-store/src/runOpsStore.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
📚 Learning: 2026-06-04T18:16:35.386Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 3836
File: apps/supervisor/src/backpressure/backpressureMonitor.ts:3-5
Timestamp: 2026-06-04T18:16:35.386Z
Learning: When reviewing TypeScript in this repo, apply the rule “prefer type aliases over interfaces” only to data/object shapes and union/intersection type modeling. If an interface is being used as a behavioral contract for collaborators to implement (e.g., method-shape interfaces that define required behavior, such as `BackpressureLogger` / `BackpressureSignalSource` in `apps/supervisor/src/backpressure/backpressureMonitor.ts`), keep it as an `interface` and do not flag it as a type-alias-vs-interface violation.

Applied to files:

  • internal-packages/run-store/src/runOpsStore.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
📚 Learning: 2026-06-09T17:58:04.699Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 3879
File: apps/webapp/app/models/vercelIntegration.server.ts:619-630
Timestamp: 2026-06-09T17:58:04.699Z
Learning: In this codebase, outbound raw `fetch` calls should typically rely on Node/undici’s default request timeout (about ~300s) rather than adding a per-call `AbortController` + `setTimeout` wrapper inside individual functions (e.g. in files like `apps/webapp/app/models/vercelIntegration.server.ts`). During code review, do not flag the absence of a per-call timeout on a single `fetch` as an issue; if per-call timeouts are needed, they should be implemented via a codebase-wide convention (e.g., a shared fetch wrapper or documented pattern) rather than ad-hoc per-function changes.

Applied to files:

  • internal-packages/run-store/src/runOpsStore.ts
  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
📚 Learning: 2026-05-18T14:40:02.173Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3658
File: packages/core/src/v3/realtimeStreams/manager.test.ts:1-147
Timestamp: 2026-05-18T14:40:02.173Z
Learning: In the triggerdotdev/trigger.dev repo, the policy “Never mock anything — use testcontainers instead” should only be enforced for integration tests that interact with real external services (e.g., Redis, Postgres) via actual infrastructure. For unit tests that exercise pure in-memory logic (e.g., cache semantics) it is OK to stub collaborators such as `ApiClient` using Vitest (`vi.fn()`) to assert call counts or control behavior. Do not flag `vi.fn()`-based `ApiClient` stubs in unit tests as violations of the testcontainers policy.

Applied to files:

  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
📚 Learning: 2026-06-16T09:19:47.637Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3960
File: apps/webapp/test/prismaInfrastructureErrorCapture.test.ts:0-0
Timestamp: 2026-06-16T09:19:47.637Z
Learning: In this repo’s Vitest setup, `vitest.config.ts` uses `globals: true`, so identifiers like `vi`, `describe`, `it`, and `expect` are available as globals in Vitest test files. During code review, do not flag missing `vi`/`describe`/`it`/`expect` imports as a runtime error or correctness issue when they’re used in `*.test.ts/tsx` or `*.spec.ts/tsx` files. Explicit imports are still preferred for consistency, but they’re not required for runtime behavior.

Applied to files:

  • internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts
🔇 Additional comments (4)
internal-packages/run-store/src/PostgresRunStore.writeAtomicity.test.ts (1)

398-459: LGTM!

The rollback/commit pair correctly exercises the new "#legacy owning store gets the caller's tx" behavior for createExecutionSnapshot, mirroring the existing deleteManyTaskRunWaitpoints pattern above.

internal-packages/run-store/src/runOpsStore.ts (3)

826-854: LGTM!

createExecutionSnapshot's tx-forwarding is correctly scoped to the #legacy owning store, and findSnapshotCompletedWaitpointIds's both-store fan-out correctly handles the fact that snapshot ids are cuids and can't be routed to a single store.
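
A rough sketch of the both-store fan-out described above, assuming each store exposes a reader with the shape below. The `SnapshotReader` type, the method signature, and the helper names are assumptions for illustration, not the real `runOpsStore.ts` code.

```typescript
// Assumed minimal reader shape; the real store interface is larger.
type SnapshotReader = {
  findSnapshotCompletedWaitpointIds(snapshotId: string): Promise<string[]>;
};

// Merge the ids from both stores, dropping duplicates and keeping order.
function mergeWaitpointIds(legacyIds: string[], newIds: string[]): string[] {
  return Array.from(new Set(legacyIds.concat(newIds)));
}

// A snapshot id is a cuid and cannot be routed to one store, so query both
// stores in parallel and merge the results.
async function findCompletedWaitpointIdsAcrossStores(
  legacyStore: SnapshotReader,
  newStore: SnapshotReader,
  snapshotId: string
): Promise<string[]> {
  const [legacyIds, newIds] = await Promise.all([
    legacyStore.findSnapshotCompletedWaitpointIds(snapshotId),
    newStore.findSnapshotCompletedWaitpointIds(snapshotId),
  ]);
  return mergeWaitpointIds(legacyIds, newIds);
}
```

Reading only the legacy store would return an empty list for a NEW run, which is the condition that left resumed runs with zero completed waitpoints.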


1298-1316: LGTM!

Dropping the caller's tx on the #new leg of deleteManyTaskRunWaitpoints while still forwarding it to #legacy is correct — a control-plane tx cannot span the cross-DB #new connection, and forwarding it there previously (per the line-range change note) looks like exactly the class of bug that could cause a hang on the resume path.
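
The tx-scoping rule can be sketched as a tiny helper that mirrors the `store === this.#legacy ? tx : undefined` conditional quoted earlier in this review. `StoreName`, `Tx`, and `txForStore` are hypothetical stand-ins; the real code works with store instances and a Prisma transaction client.

```typescript
type StoreName = "legacy" | "new";

// Placeholder for a transaction handle opened on the legacy (control-plane) DB.
type Tx = { id: string };

// Forward the caller's transaction only to the store that shares its
// database; a legacy-DB transaction cannot span the new store's connection.
function txForStore(
  owningStore: StoreName,
  callerTx: Tx | undefined
): Tx | undefined {
  return owningStore === "legacy" ? callerTx : undefined;
}
```

On the new store the write then runs outside the caller's transaction, which is why the review notes that this leg is not atomic with the caller's work.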


1498-1516: 🎯 Functional Correctness

No action needed: every current caller passes batchTaskRunId (including apps/webapp/app/v3/services/batchTriggerV3.server.ts:1083), so the id fallback here is not exercised by existing call sites.

> Likely an incorrect or invalid review comment.

@d-cs d-cs enabled auto-merge (squash) July 5, 2026 22:26
@d-cs d-cs merged commit f101983 into main Jul 5, 2026
46 checks passed
@d-cs d-cs deleted the fix/waitpoint-pending-check-primary branch July 5, 2026 22:39
3 participants