feat(coderd/autobuild): notify users before workspace autostop#26576
feat(coderd/autobuild): notify users before workspace autostop#26576sreya wants to merge 1 commit into
Conversation
|
/coder-agents-review |
|
Chat: Review posted | View chat Review historydeep-review v0.9.0 | Round 2 | Last posted: Round 2, 9 findings (1 P2, 3 P3, 1 P4, 2 Nit, 2 Note), APPROVE. Review Finding inventoryFindings
Round logRound 1Panel (17 reviewers: Bisky, Hisoka, Mafu-san, Mafuuu, Pariston, Komugi, Gon, Leorio, Chopper, Knuckle, Takumi, Meruem, Ging-Go, Kite, Melody, Knov, Zoro). Netero: no findings. 1 P2, 3 P3, 2 Nit, 2 Note. Reviewed against 2537503..ce3eae6. Round 2CRF-1 through CRF-7 addressed. CRF-8 dropped. Panel (9 reviewers: Bisky, Hisoka, Mafu-san, Mafuuu, Pariston, Komugi, Chopper, Meruem, Kite). Netero: no findings. 1 new P4. Reviewed against 2537503..14fef35. About deep-reviewCRF = Coder Review Finding (P0-P4, Nit, Note)
|
There was a problem hiding this comment.
Well-engineered feature with clean idempotence design and thorough test coverage. The marker-based deduplication, HA safety via advisory locks + RepeatableRead, and the six-subtest suite covering window boundaries, idempotence, re-arm, and the unbounded-window edge case are all strong.
Severity summary: 1 P2, 3 P3, 2 Nit, 2 Note.
The P2 (marker-before-enqueue) is the only finding that affects the feature's delivery guarantee. Seven reviewers flagged it independently; Mafuuu's framing is sharpest: "The notification IS the only user-visible outcome. The marker serves solely to deduplicate the notification. When the notification is lost, the marker has prevented the one thing it was protecting."
"CRF-4 is not a GitHub/Linear ticket and carries no inline substance. It reads as a plan-document or review-round identifier. Strip it. The invariant itself ('exactly one reminder per deadline') earns its lines; the CRF-4 label does not." (Gon)
🤖 This review was automatically generated with Coder Agents.
Add a reminder-dispatch pass to the lifecycle executor that notifies a workspace owner a template-configured duration (time_til_autostop_notify) before their workspace is automatically stopped. Per tick, a separate scan (GetWorkspacesEligibleForAutostopReminder) finds running workspaces whose deadline is within the lead window and not yet reminded for that deadline. Inside a per-workspace advisory-locked RepeatableRead transaction it re-validates volatile conditions and stamps workspace_builds.notified_autostop_deadline, then enqueues TemplateWorkspaceAutostopReminder after commit. If the enqueue fails, the marker is re-armed so a later tick retries. Idempotent per deadline and HA-safe; at most one reminder even when the lead exceeds a workspace's remaining lifetime.
ce3eae6 to
14fef35
Compare
|
/coder-agents-review |
There was a problem hiding this comment.
All 7 substantive findings from round 1 were addressed in 14fef35. Each fix verified by multiple reviewers:
CRF-1 (P2): Re-arm logic added at line 716 clears the marker on enqueue failure so the next tick retries. SelfHealsOnEnqueueFailure integration test exercises the full failure-then-retry path.
CRF-2 (P3): Comment rewritten to "volatile eligibility conditions"; dormancy/deletion guards added.
CRF-3 (P3): TestShouldRemindAutostop with 7 table-driven cases, 100% branch coverage.
CRF-4 (P3): SQL now sets updated_at; Go passes dbtime.Now().
CRF-5, CRF-6 (Nit): CRF-4 prefix removed, implementation motivation dropped.
CRF-7 (Note): Tests assert deadline values via WithinDuration and label format via Equal.
One new P4 inline (CRF-9), a narrow edge case during shutdown. Not blocking.
"Three layers of idempotency (marker, lock, dedupe hash). Ran SelfHealsOnEnqueueFailure 3x with -race; clean." (Hisoka)
🤖 This review was automatically generated with Coder Agents.
| // marker would permanently suppress this deadline's only notification. | ||
| // The notifications subsystem's own dedupe_hash prevents same-day | ||
| // duplicates if the enqueue actually partially succeeded. | ||
| if rearmErr := e.db.UpdateWorkspaceBuildNotifiedAutostopDeadline(e.ctx, database.UpdateWorkspaceBuildNotifiedAutostopDeadlineParams{ |
There was a problem hiding this comment.
P4 [CRF-9] The re-arm uses e.ctx. If the executor is shutting down (context cancelled), the enqueue fails with context.Canceled, then the re-arm also fails with context.Canceled. The committed marker stays stamped, and no future tick re-selects this workspace for this deadline. The reminder is silently lost.
The window is extremely narrow (milliseconds between transaction commit and enqueue, during process shutdown). The consequence is one missed courtesy notification. The dormancy notification path has the same pattern without any re-arm at all, so this PR is strictly more robust than the status quo.
A detached context (e.g., context.WithTimeout(context.Background(), 5*time.Second)) for the re-arm would close this gap without introducing new risks, since the re-arm is a single idempotent UPDATE.
(Hisoka)
🤖
Summary
Branch 4 of the workspace autostop reminder stack. This is the branch that turns the feature on: it adds the dispatcher in the AGPL lifecycle executor that sends a reminder to a workspace owner a template-configured duration before their workspace is automatically stopped.
The configuration (
templates.time_til_autostop_notify), schema, and notification template all landed in earlier branches. This PR consumes them.What this does
On each lifecycle tick, for every running workspace whose autostop
deadlineis within its template'stime_til_autostop_notifywindow, the executor enqueues oneTemplateWorkspaceAutostopRemindernotification to the owner ("Your workspace will stop soon"). Each reminder fires at most once per deadline and is safe under high availability.How it works
GetWorkspacesEligibleForAutostopReminder): selects workspaces whose latest build is a runningstartbuild with a realdeadlinethat is still in the future (deadline > now) and within the lead window (deadline <= now + time_til_autostop_notify), whose template has the field enabled (> 0), whose owner is active, and which has not already been reminded for the current deadline (notified_autostop_deadline != deadline).remindUpcomingAutostopsinrunOnce): runs as a separate scan after the transition loop. For each candidate, inside aRepeatableReadtransaction it takes the per-workspace advisory lock (lifecycle-executor:<id>), re-fetches and re-validates every condition, then stampsworkspace_builds.notified_autostop_deadline = deadline. The notification is enqueued only after the transaction commits, with errors logged rather than failing the tick. This mirrors the existing dormancy / auto-update notification pattern in the same file.Idempotence and HA safety
The
notified_autostop_deadlinecolumn stores the deadline value a reminder was last sent for:There is intentionally no upper bound on
time_til_autostop_notify. If the configured lead exceeds a workspace's remaining lifetime, the workspace still receives exactly one reminder (not a per-tick flood), because we requiredeadline > nowand the marker filters repeats. This is documented in the query and the executor, and covered by theExceedsLifetimetest.Changes
coderd/database/queries/workspaces.sql,workspacebuilds.sql:GetWorkspacesEligibleForAutostopReminderandUpdateWorkspaceBuildNotifiedAutostopDeadline(plus regenerated query code).coderd/database/dbauthz/dbauthz.go: authorization for both queries (scan is passthrough likeGetWorkspacesEligibleForTransition; the marker update authorizesActionUpdateon the build's workspace likeUpdateWorkspaceBuildDeadlineByID), withMethodTestSuitecoverage.coderd/autobuild/lifecycle_executor.go: the reminder scan, theshouldRemindAutostopre-validation, the after-commit enqueue, and anAutostopRemindersfield onStats.coderd/autobuild/lifecycle_executor_test.go:TestExecutorAutostopReminderwith six subtests (sent in window, not before window, disabled whenttl = 0, no duplicate on a second in-window tick, re-arm after a deadline bump, and at-most-one when the lead exceeds the workspace lifetime).Validation
make gen(no drift),make fmt,make lint,go build ./..., andgo test ./coderd/autobuild/ -run TestExecutorAutostopReminder(all six subtests pass) plus the new dbauthz suite cases.Stack
01-db(merged) - schema columns + notification template seed02-template(feat: add workspace autostop reminder template #26429) - notification template Go wiring + goldens + docs03-schedule-sdk(feat: plumb time_til_autostop_notify template field #26439) -time_til_autostop_notifyconfig plumbing (schedule store, SDK, CLI)04-executor(this PR) - lifecycle-executor dispatcher05-ui(next) - template schedule-settings form field