|
| 1 | +# Mollifier idempotency-key claim — race fix |
| 2 | + |
| 3 | +**Branch:** `mollifier-phase-3` |
| 4 | +**Date:** 2026-05-21 |
| 5 | +**Status:** Design locked. Implementation pending. |
| 6 | +**Companion:** [`2026-05-19-mollifier-idempotency-design.md`](2026-05-19-mollifier-idempotency-design.md) (Q5) — this extends it. |
| 7 | + |
| 8 | +## Problem |
| 9 | + |
| 10 | +Q5 assumed two simultaneous same-key triggers either both reach PG or both reach the buffer. The gate-transition window violates that: during the burst that trips the gate, the first 1..N triggers (where N = `TRIGGER_MOLLIFIER_TRIP_THRESHOLD`) pass through to PG, and triggers N+1..M get mollified. With the same idempotency key across all of them: |
| 11 | + |
| 12 | +- PG path: engine.trigger races; one inserts, others get `RunDuplicateIdempotencyKeyError` → return the PG winner. ✓ inside-store dedup. |
| 13 | +- Buffer path: accept Lua SETNX races; one wins the buffer SETNX, others get `duplicate_idempotency`. ✓ inside-store dedup. |
| 14 | +- **Across stores: no coordination.** The system produces *two* distinct race-winners for the same key. |
| 15 | + |
| 16 | +Customer-visible damage: |
| 17 | + |
| 18 | +- Caller A receives `{ id: "run_PG" }` |
| 19 | +- Caller B receives `{ id: "run_BUF" }` from a different point in the burst |
| 20 | +- Both are isCached:false (both think they triggered for the first time) |
| 21 | +- Caller B stores `run_BUF` in their DB / log / pipeline |
| 22 | +- Drainer eventually pops `run_BUF` → engine.trigger → P2002 against `run_PG` → drainer marks buffer entry FAILED |
| 23 | +- Caller B's subsequent operations on `run_BUF`: |
| 24 | + - mutations (tags, metadata) queued in the buffered window: silently lost |
| 25 | + - reads via API: work for ~10min via buffer fallback, then 404 forever |
| 26 | +- Caller B has no signal that `run_BUF` was a ghost. Silent data corruption surfacing minutes later. |
| 27 | + |
| 28 | +Found while running `scripts/mollifier-challenge/04-idempotency-collision.sh` without pre-warming the gate. The script was updated to pre-warm so the suite passes, but the underlying race is still there for real customer traffic during natural burst-transitions. |
| 29 | + |
| 30 | +## The customer's contract |
| 31 | + |
| 32 | +> "Same idempotency key → same runId, always." |
| 33 | +
|
| 34 | +That's what makes idempotency keys useful. Internal self-correction (drainer P2002) only cleans up internal state — it doesn't recover the customer's expectation that they have one canonical runId to track. |
| 35 | + |
| 36 | +## Design |
| 37 | + |
| 38 | +A **pre-gate Redis claim** that all same-key triggers serialise through, before the trigger pipeline decides PG vs buffer. |
| 39 | + |
| 40 | +- PG's unique constraint remains the only mechanism the system *requires* for correctness. |
| 41 | +- Redis becomes the **performance / coordination layer** for cross-store dedup. When Redis is up, no duplicate runIds. When Redis is down, the system degrades to today's behaviour (race may briefly produce a buffered duplicate, P2002 catches it). |
| 42 | +- The mollifier already has the lookup infrastructure from B6a (`mollifier:idempotency:{env}:{task}:{key}`). This proposal repurposes it as the pre-gate claim instead of a buffer-only SETNX. |
| 43 | + |
| 44 | +### Flow |
| 45 | + |
| 46 | +``` |
| 47 | +Trigger arrives with idempotencyKey K: |
| 48 | +
|
| 49 | +1. runFriendlyId = generate() // existing, triggerTask.server.ts:131 |
| 50 | +
|
| 51 | +2. SETNX mollifier:idempotency:{env}:{task}:{K} = "pending" EX 30s |
| 52 | +
|
| 53 | +3. If we won the claim: |
| 54 | + try { |
| 55 | + result = runTriggerPipeline() // gate → PG or buffer |
| 56 | + SET ...K = runFriendlyId EX <idempotencyKeyExpiresAt - now> |
| 57 | + return { id: runFriendlyId, isCached: false } |
| 58 | + } catch (err) { |
| 59 | + DEL ...K // free the claim for waiters |
| 60 | + throw err |
| 61 | + } |
| 62 | +
|
| 63 | +4. If we lost the claim: |
| 64 | + poll the key on ~20ms interval, up to safetyNetMs (default 5s) |
| 65 | + - value "pending" → keep polling |
| 66 | + - value is a runId → return { id: <that>, isCached: true } |
| 67 | + - key vanished → retry from step 2 (claimant errored) |
| 68 | + - safetyNet hit → return 503 "Idempotency claim resolution timed out" |
| 69 | +``` |
| 70 | + |
| 71 | +Subsequent same-key triggers (after the burst settles) hit step 2 and find the key already populated with the winner's runId → return cached without ever blocking. |
| 72 | + |
| 73 | +### Why this closes the race |
| 74 | + |
| 75 | +- Same-key triggers serialise through SETNX. Only one trigger ever runs the pipeline; everyone else waits for its runId. |
| 76 | +- Buffer accept and PG insert remain their own race-winners *within* their store (defence in depth), but only one of them is on the path for any given key — the winner of the upstream SETNX. |
| 77 | +- The window between "claimant calls SETNX" and "subsequent caller polls" is nanoseconds (Redis serialises). The window between "claimant SETs runId" and "waiters see it" is one poll-interval (~20ms). |
| 78 | + |
| 79 | +### Failure modes |
| 80 | + |
| 81 | +| Scenario | Behaviour | |
| 82 | +|---|---| |
| 83 | +| Claimant crashes mid-pipeline | Claim TTL (30s) expires → waiters time out, return 503 → SDK retries → new SETNX winner | |
| 84 | +| Claimant's pipeline errors → DEL fires | Next polling waiter sees key vanished → retries SETNX → one of them wins → proceeds | |
| 85 | +| Redis SETNX fails (Redis down) | Log warn, skip the claim machinery → trigger pipeline runs unguarded → today's race may briefly produce a duplicate → P2002 backstop catches it | |
| 86 | +| Redis GET fails for a waiter | Log warn, fall through to running the pipeline → may produce a duplicate but P2002 backstop applies | |
| 87 | +| Claimant finishes, Redis SET (publishing the runId) fails | Waiters time out, return 503 → SDK retries → next claimant finds PG row via existing `IdempotencyKeyConcern` PG findFirst → returns cached | |
| 88 | + |
| 89 | +The system is *correct* without Redis (PG unique constraint is the source of truth); Redis is the path to *perfect customer-visible dedup*. |
| 90 | + |
| 91 | +### Performance |
| 92 | + |
| 93 | +- Every same-key trigger: 1 Redis SETNX (~1ms locally). |
| 94 | +- The winner: + 1 Redis SET on success (~1ms). |
| 95 | +- Losers: a few `GET` polls (~20ms wait each, ~1-2 polls typical = 20-40ms added latency). |
| 96 | +- Triggers WITHOUT an idempotency key: zero change. |
| 97 | + |
| 98 | +For real customer burst patterns, the typical wait is a single poll cycle: the claimant's PG insert (or buffer accept) is fast, the SET happens, the next poll-tick on each waiter resolves. |
| 99 | + |
| 100 | +## Implementation |
| 101 | + |
| 102 | +### Files to touch |
| 103 | + |
| 104 | +**Modify:** |
| 105 | + |
| 106 | +- `apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts` — `IdempotencyKeyConcern.handleTriggerRequest`. After the existing PG findFirst + buffer.lookupIdempotency checks (which still run first for the post-burst settled case), insert the claim machinery. |
| 107 | +- `apps/webapp/app/v3/mollifier/mollifierMollify.server.ts` — on successful `accept`, the existing SETNX behaviour in `acceptMollifierEntry` Lua becomes redundant if the claim wins. Decision: keep the inner SETNX as a belt-and-braces; on `duplicate_idempotency` the mollify path returns the inner winner. Should never fire if the pre-gate claim is working, but cheap to keep. |
| 108 | +- `apps/webapp/app/runEngine/services/triggerTask.server.ts` — on successful `engine.trigger` PG insert, publish the runId to the claim key (best-effort). |
| 109 | + |
| 110 | +**New:** |
| 111 | + |
| 112 | +- `apps/webapp/app/v3/mollifier/idempotencyClaim.server.ts` — claim/publish/wait helpers. Mirror `mutateWithFallback`'s discriminated-outcome shape: |
| 113 | + |
| 114 | +```ts |
| 115 | +export type ClaimOutcome = |
| 116 | + | { kind: "claimed"; runFriendlyId: string } // we own it, proceed |
| 117 | + | { kind: "cached"; runId: string } // someone else's winner, return it |
| 118 | + | { kind: "timed_out" }; // safety net exceeded |
| 119 | + |
| 120 | +export async function claimOrAwait( |
| 121 | + redis: Redis, |
| 122 | + key: string, |
| 123 | + runFriendlyId: string, |
| 124 | + ttl: number, |
| 125 | + opts?: { safetyNetMs?: number; pollStepMs?: number }, |
| 126 | +): Promise<ClaimOutcome>; |
| 127 | + |
| 128 | +export async function publishClaim( |
| 129 | + redis: Redis, |
| 130 | + key: string, |
| 131 | + runId: string, |
| 132 | + ttl: number, |
| 133 | +): Promise<void>; |
| 134 | + |
| 135 | +export async function releaseClaim(redis: Redis, key: string): Promise<void>; |
| 136 | +``` |
| 137 | + |
| 138 | +### Wiring inside `IdempotencyKeyConcern.handleTriggerRequest` |
| 139 | + |
| 140 | +```ts |
| 141 | +if (idempotencyKey) { |
| 142 | + const pgRun = await this.prisma.taskRun.findFirst({ ... }); // existing |
| 143 | + if (pgRun) return { isCached: true, run: pgRun }; |
| 144 | + |
| 145 | + if (!request.body.options?.resumeParentOnCompletion) { |
| 146 | + const buffered = await findBufferedRunWithIdempotency(...); // existing |
| 147 | + if (buffered) return { isCached: true, run: buffered }; |
| 148 | + } |
| 149 | + |
| 150 | + // NEW: pre-gate claim. Skip if buffer/redis unavailable. |
| 151 | + const buffer = getMollifierBuffer(); |
| 152 | + if (buffer) { |
| 153 | + const outcome = await claimOrAwait( |
| 154 | + buffer.redis, |
| 155 | + makeIdempotencyClaimKey(...), |
| 156 | + runFriendlyId, |
| 157 | + ttl, |
| 158 | + ); |
| 159 | + if (outcome.kind === "cached") { |
| 160 | + // Synthesise a cached-run response shaped like the PG/buffer paths |
| 161 | + // return so the rest of the trigger pipeline can short-circuit. |
| 162 | + const synthetic = await resolveCachedRun(outcome.runId, ...); |
| 163 | + return synthetic |
| 164 | + ? { isCached: true, run: synthetic } |
| 165 | + : { isCached: false, idempotencyKey, idempotencyKeyExpiresAt }; |
| 166 | + } |
| 167 | + if (outcome.kind === "timed_out") { |
| 168 | + throw new ServiceValidationError("Idempotency claim resolution timed out", 503); |
| 169 | + } |
| 170 | + // outcome.kind === "claimed" → continue to existing pipeline below |
| 171 | + request._idempotencyClaimOwned = true; // signal for publish on success |
| 172 | + } |
| 173 | +} |
| 174 | +return { isCached: false, idempotencyKey, idempotencyKeyExpiresAt }; |
| 175 | +``` |
| 176 | + |
| 177 | +### Wiring for the publish |
| 178 | + |
| 179 | +After successful `engine.trigger` in `triggerTask.server.ts` (V2 path), AND after successful `mollifyTrigger.accept`: |
| 180 | + |
| 181 | +```ts |
| 182 | +if (request._idempotencyClaimOwned) { |
| 183 | + await publishClaim(redis, claimKey, runFriendlyId, ttl) |
| 184 | + .catch((err) => logger.warn("idempotency claim publish failed", { err })); |
| 185 | +} |
| 186 | +``` |
| 187 | + |
| 188 | +On any pipeline error before publish: |
| 189 | + |
| 190 | +```ts |
| 191 | +if (request._idempotencyClaimOwned) { |
| 192 | + await releaseClaim(redis, claimKey).catch((err) => |
| 193 | + logger.warn("idempotency claim release failed", { err }) |
| 194 | + ); |
| 195 | +} |
| 196 | +``` |
| 197 | + |
| 198 | +### Tests |
| 199 | + |
| 200 | +Unit tests in `apps/webapp/test/mollifierIdempotencyClaim.test.ts`: |
| 201 | + |
| 202 | +1. SETNX wins → `claimed` returned. |
| 203 | +2. SETNX loses, value is already a runId → `cached` returned immediately. |
| 204 | +3. SETNX loses, value is "pending" → poll until it flips → `cached` returned. |
| 205 | +4. SETNX loses, key TTLs out mid-poll → retry SETNX → win → `claimed`. |
| 206 | +5. SETNX loses, never resolves → `timed_out` after safetyNetMs. |
| 207 | +6. publishClaim writes the runId. |
| 208 | +7. releaseClaim DELs the key. |
| 209 | + |
| 210 | +Integration test in `apps/webapp/test/api/idempotency-claim-burst.test.ts` — fire N same-key triggers under various gate states, assert all responses converge on a single runId. |
| 211 | + |
| 212 | +Bash regression in `scripts/mollifier-challenge/04-idempotency-collision.sh` — remove the pre-warm hack; assert that N same-key triggers during a cold-gate burst still produce one runId. |
| 213 | + |
| 214 | +## Sub-decisions |
| 215 | + |
| 216 | +| # | Question | Resolution | |
| 217 | +|---|---|---| |
| 218 | +| 1 | Claim TTL | 30s. Bounded by typical PG insert + buffer accept time + small margin. Shorter risks claimants legitimately taking longer than the TTL; longer risks waiters hanging on crashed claimants. | |
| 219 | +| 2 | Wait safetyNetMs | 5s. Matches the upper bound a customer SDK would tolerate before retry. | |
| 220 | +| 3 | Pre-publish "pending" value vs publishing runId immediately | "pending". Two-stage state lets a waiter distinguish "someone is working on this" from "the answer is this runId". A claimant can DEL the key on error and the next polling waiter retries SETNX cleanly. | |
| 221 | +| 4 | What about `resumeParentOnCompletion` (triggerAndWait)? | Skip the claim machinery. triggerAndWait already bypasses the buffer gate (F4), so it goes to PG; its existing PG-side dedup handles concurrent triggerAndWait calls with the same key. Adding the claim there opens a different rabbit hole. | |
| 222 | +| 5 | What happens to the buffer-side SETNX inside `acceptMollifierEntry` Lua (B6a)? | Keep it. Defence in depth — if the pre-gate claim somehow misses, the inner SETNX still serialises buffer-side accepts. Should never observe a `duplicate_idempotency` outcome from accept in practice. | |
| 223 | + |
| 224 | +## What this does *not* fix |
| 225 | + |
| 226 | +- The PG `findFirst` replica-lag race: the existing `IdempotencyKeyConcern` PG check uses `this.prisma` (writer). Already correct. |
| 227 | +- Cross-environment / cross-task idempotency: not a thing today, not introduced. |
| 228 | +- Customer's own client-side retries with backoff that exceeds claim TTL: SDK retries within TTL hit cached fine; retries outside TTL race like fresh requests (rare and bounded). |
| 229 | + |
| 230 | +## Out of scope |
| 231 | + |
| 232 | +- Distributed-coordination scenarios (multiple Redis instances, cluster mode) — claim key is per-env so hash-tag co-location is straightforward when needed. |
| 233 | +- Observability (metrics) — Phase F1 tightening can add `mollifier.idempotency_claim_{wins,waits,timeouts}` counters. |
| 234 | + |
| 235 | +## Resume guidance for a future session |
| 236 | + |
| 237 | +1. Read this doc. |
| 238 | +2. Read the Q5 doc to understand the existing buffer-side idempotency lookup (`MollifierBuffer.lookupIdempotency`, `resetIdempotency`). |
| 239 | +3. Implement `idempotencyClaim.server.ts` per the sketch above. |
| 240 | +4. Wire `IdempotencyKeyConcern` to use it. |
| 241 | +5. Wire publish/release in the trigger pipeline + mollifyTrigger. |
| 242 | +6. Tests per the section above. |
| 243 | +7. Validate by removing the pre-warm hack from `scripts/mollifier-challenge/04-idempotency-collision.sh` and confirming the script still passes with the gate in a cold state. |
| 244 | + |
| 245 | +Estimated effort: 1-2 days of focused work. Risk: low (Redis-side primitives all exist; the integration is the work). |
0 commit comments