feat(coderd/autobuild): notify users before workspace autostop#26576

Open
sreya wants to merge 1 commit into autostop-notif/03-schedule-sdk from autostop-notif/04-executor
Conversation

@sreya sreya commented Jun 22, 2026

Collaborator

Summary

Branch 4 of the workspace autostop reminder stack. This is the branch that turns the feature on: it adds the dispatcher in the AGPL lifecycle executor that sends a reminder to a workspace owner a template-configured duration before their workspace is automatically stopped.

The configuration (templates.time_til_autostop_notify), schema, and notification template all landed in earlier branches. This PR consumes them.

What this does

On each lifecycle tick, for every running workspace whose autostop deadline is within its template's time_til_autostop_notify window, the executor enqueues one TemplateWorkspaceAutostopReminder notification to the owner ("Your workspace will stop soon"). Each reminder fires at most once per deadline and is safe under high availability.

How it works

  • Eligibility query (GetWorkspacesEligibleForAutostopReminder): selects workspaces whose latest build is a running start build with a real deadline that is still in the future (deadline > now) and within the lead window (deadline <= now + time_til_autostop_notify), whose template has the field enabled (> 0), whose owner is active, and which has not already been reminded for the current deadline (notified_autostop_deadline != deadline).
  • Dispatch pass (remindUpcomingAutostops in runOnce): runs as a separate scan after the transition loop. For each candidate, inside a RepeatableRead transaction it takes the per-workspace advisory lock (lifecycle-executor:<id>), re-fetches and re-validates every condition, then stamps workspace_builds.notified_autostop_deadline = deadline. The notification is enqueued only after the transaction commits, with errors logged rather than failing the tick. This mirrors the existing dormancy / auto-update notification pattern in the same file.
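The eligibility conditions above can be read as a single predicate. The following is a minimal sketch of that predicate as a pure Go function, not the PR's actual SQL or its shouldRemindAutostop implementation; the `candidate` struct, its field names, and `eligibleForReminder` are hypothetical stand-ins chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// candidate is a hypothetical, simplified stand-in for the data the
// eligibility query inspects. Field names are illustrative only.
type candidate struct {
	Deadline         time.Time     // autostop deadline of the latest build
	NotifiedDeadline time.Time     // deadline a reminder was last sent for
	NotifyLead       time.Duration // template's time_til_autostop_notify
	OwnerActive      bool          // owner account is active
	Running          bool          // latest build is a running start build
}

// eligibleForReminder restates the listed conditions as one predicate.
func eligibleForReminder(c candidate, now time.Time) bool {
	if !c.Running || !c.OwnerActive {
		return false
	}
	if c.NotifyLead <= 0 { // field not enabled on the template
		return false
	}
	if c.Deadline.IsZero() || !c.Deadline.After(now) { // needs a real, future deadline
		return false
	}
	if c.Deadline.After(now.Add(c.NotifyLead)) { // not yet inside the lead window
		return false
	}
	// Skip workspaces already reminded for this exact deadline.
	return !c.NotifiedDeadline.Equal(c.Deadline)
}

// demo evaluates the predicate before and after the marker is stamped.
func demo() (before, after bool) {
	now := time.Date(2026, 6, 22, 12, 0, 0, 0, time.UTC)
	c := candidate{
		Deadline:    now.Add(20 * time.Minute),
		NotifyLead:  30 * time.Minute,
		OwnerActive: true,
		Running:     true,
	}
	before = eligibleForReminder(c, now)
	c.NotifiedDeadline = c.Deadline // simulate the marker stamp
	after = eligibleForReminder(c, now)
	return before, after
}

func main() {
	before, after := demo()
	fmt.Println(before) // prints true: in window, not yet reminded
	fmt.Println(after)  // prints false: marker equals the deadline
}
```

In the PR the same checks run twice, once in the scan query and again inside the locked transaction, so the sketch corresponds to either step only loosely.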

Idempotence and HA safety

The notified_autostop_deadline column stores the deadline value a reminder was last sent for:

  • Once a reminder is sent, the marker equals the deadline, so subsequent ticks are filtered out: no duplicates.
  • If the deadline moves (activity bump, manual extend, schedule change), the marker no longer matches and a fresh reminder fires for the new deadline.
  • The per-workspace advisory lock plus the in-transaction re-validation make concurrent replicas safe: only one replica stamps and sends for a given deadline.

There is intentionally no upper bound on time_til_autostop_notify. If the configured lead exceeds a workspace's remaining lifetime, the workspace still receives exactly one reminder (not a per-tick flood), because we require deadline > now and the marker filters repeats. This is documented in the query and the executor, and covered by the ExceedsLifetime test.
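To make the marker behavior concrete, here is a small in-memory simulation, assuming a one-minute tick and a 24-hour lead that exceeds the workspace's remaining lifetime. The `build` struct, `tick`, and `simulate` are hypothetical names for this sketch and do not appear in the PR; the real executor persists the marker in workspace_builds.notified_autostop_deadline.

```go
package main

import (
	"fmt"
	"time"
)

// build is a hypothetical in-memory stand-in for a workspace build row.
type build struct {
	Deadline         time.Time
	NotifiedDeadline time.Time // marker: the deadline last reminded for
}

// tick sends at most one reminder per deadline and reports whether it sent.
func tick(b *build, now time.Time, lead time.Duration) bool {
	inWindow := b.Deadline.After(now) && !b.Deadline.After(now.Add(lead))
	if lead <= 0 || !inWindow || b.NotifiedDeadline.Equal(b.Deadline) {
		return false
	}
	b.NotifiedDeadline = b.Deadline // stamp the marker for this deadline
	return true
}

// simulate runs 60 one-minute ticks with a 24h lead, which exceeds the
// 45-minute remaining lifetime. If bumpAt is in [0, 60), the deadline is
// extended by one hour at that tick (for example, an activity bump).
func simulate(bumpAt int) int {
	start := time.Date(2026, 6, 22, 12, 0, 0, 0, time.UTC)
	b := &build{Deadline: start.Add(45 * time.Minute)}
	sent := 0
	for i := 0; i < 60; i++ {
		now := start.Add(time.Duration(i) * time.Minute)
		if i == bumpAt {
			b.Deadline = b.Deadline.Add(time.Hour)
		}
		if tick(b, now, 24*time.Hour) {
			sent++
		}
	}
	return sent
}

func main() {
	fmt.Println(simulate(-1)) // prints 1: no bump, one reminder, no per-tick flood
	fmt.Println(simulate(30)) // prints 2: the bump re-arms a second reminder
}
```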

Changes

  • coderd/database/queries/workspaces.sql, workspacebuilds.sql: GetWorkspacesEligibleForAutostopReminder and UpdateWorkspaceBuildNotifiedAutostopDeadline (plus regenerated query code).
  • coderd/database/dbauthz/dbauthz.go: authorization for both queries (scan is passthrough like GetWorkspacesEligibleForTransition; the marker update authorizes ActionUpdate on the build's workspace like UpdateWorkspaceBuildDeadlineByID), with MethodTestSuite coverage.
  • coderd/autobuild/lifecycle_executor.go: the reminder scan, the shouldRemindAutostop re-validation, the after-commit enqueue, and an AutostopReminders field on Stats.
  • coderd/autobuild/lifecycle_executor_test.go: TestExecutorAutostopReminder with six subtests (sent in window, not before window, disabled when ttl = 0, no duplicate on a second in-window tick, re-arm after a deadline bump, and at-most-one when the lead exceeds the workspace lifetime).

Validation

make gen (no drift), make fmt, make lint, go build ./..., and go test ./coderd/autobuild/ -run TestExecutorAutostopReminder (all six subtests pass) plus the new dbauthz suite cases.

Stack

  1. 01-db (merged) - schema columns + notification template seed
  2. 02-template (feat: add workspace autostop reminder template #26429) - notification template Go wiring + goldens + docs
  3. 03-schedule-sdk (feat: plumb time_til_autostop_notify template field #26439) - time_til_autostop_notify config plumbing (schedule store, SDK, CLI)
  4. 04-executor (this PR) - lifecycle-executor dispatcher
  5. 05-ui (next) - template schedule-settings form field

datadog-coder Bot commented Jun 22, 2026


Pipelines

⚠️ Warnings

🚦 1 Pipeline job failed

contrib | title (GitHub Actions)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 14fef35

sreya commented Jun 22, 2026

Collaborator Author

/coder-agents-review

coder-agents-review Bot commented Jun 22, 2026

Contributor

Chat: Review posted | View chat
Requested: 2026-06-22 19:04 UTC by @sreya
Spend: $74.26 / $100.00

Review history
  • R1 (2026-06-22): 17 reviewers, 2 Nit, 2 Note, 1 P2, 3 P3, COMMENT. Review
  • R2 (2026-06-22): 9 reviewers, 2 Nit, 2 Note, 1 P2, 3 P3, 1 P4, APPROVE. Review

deep-review v0.9.0 | Round 2 | 2537503..14fef35

Last posted: Round 2, 9 findings (1 P2, 3 P3, 1 P4, 2 Nit, 2 Note), APPROVE. Review

Finding inventory

Findings

| # | Sev | Status | Location | Summary | Round | Reviewer | Posted |
|---|---|---|---|---|---|---|---|
| CRF-1 | P2 | Author fixed (14fef35) | lifecycle_executor.go:690 | Marker commits before enqueue; failed enqueue permanently suppresses reminder | R1 | Mafuuu P2, Meruem P2, Chopper P3 | Yes |
| CRF-2 | P3 | Author fixed (14fef35) | lifecycle_executor.go:650 | Re-validation comment claims "all" but shouldRemindAutostop omits SQL conditions | R1 | Hisoka P3, Meruem P3, Knov P3, Zoro P3 | Yes |
| CRF-3 | P3 | Author fixed (14fef35) | lifecycle_executor.go:717 | shouldRemindAutostop rejection paths have 0% branch coverage | R1 | Bisky P3, Mafuuu Note | Yes |
| CRF-4 | P3 | Author fixed (14fef35) | workspacebuilds.sql:149 | UpdateWorkspaceBuildNotifiedAutostopDeadline does not set updated_at | R1 | Knuckle | Yes |
| CRF-5 | Nit | Author fixed (14fef35) | lifecycle_executor.go:711 | CRF-4 is an opaque reference with no discoverable target | R1 | Gon, Leorio, Mafu-san, Zoro | Yes |
| CRF-6 | Nit | Author fixed (14fef35) | lifecycle_executor.go:66 | AutostopReminders doc comment carries implementation motivation | R1 | Gon | Yes |
| CRF-7 | Note | Author fixed (14fef35) | lifecycle_executor_test.go:1956 | Stats.AutostopReminders value (deadline) never asserted | R1 | Bisky | Yes |
| CRF-8 | Note | Dropped by orchestrator (informational Note about pre-existing pattern; no action required) | workspaces.sql:889 | LEFT JOIN workspace_builds is functionally INNER JOIN | R1 | Chopper, Meruem, Knuckle, Zoro | Yes |
| CRF-9 | P4 | Open | lifecycle_executor.go:716 | Re-arm uses e.ctx; double failure on shutdown permanently suppresses reminder | R2 | Hisoka | Yes |

Round log

Round 1

Panel (17 reviewers: Bisky, Hisoka, Mafu-san, Mafuuu, Pariston, Komugi, Gon, Leorio, Chopper, Knuckle, Takumi, Meruem, Ging-Go, Kite, Melody, Knov, Zoro). Netero: no findings. 1 P2, 3 P3, 2 Nit, 2 Note. Reviewed against 2537503..ce3eae6.

Round 2

CRF-1 through CRF-7 addressed. CRF-8 dropped. Panel (9 reviewers: Bisky, Hisoka, Mafu-san, Mafuuu, Pariston, Komugi, Chopper, Meruem, Kite). Netero: no findings. 1 new P4. Reviewed against 2537503..14fef35.

About deep-review

CRF = Coder Review Finding (P0-P4, Nit, Note)

Reviewer Focus
Bisky tests
Chopper ops/errors
Churn-guard change verification
Ging language modernization
Gon naming
Hisoka edge cases
Killua perf
Kite change integrity
Knov contracts
Knuckle SQL
Komugi flake/determinism
Kurapika security
Law decomposition
Leorio docs
Luffy product
Mafu-san process
Mafuuu contracts
Melody dispatch/pairing
Meruem structural
Nami frontend
Netero mechanical checks
Pariston premise testing
Pen-botter product gaps
Razor verification
Robin duplication
Ryosuke Go arch
Takumi concurrency
Zoro shape

🤖 Managed by Coder Agents.

@coder-agents-review coder-agents-review Bot left a comment

Contributor

Well-engineered feature with clean idempotence design and thorough test coverage. The marker-based deduplication, HA safety via advisory locks + RepeatableRead, and the six-subtest suite covering window boundaries, idempotence, re-arm, and the unbounded-window edge case are all strong.

Severity summary: 1 P2, 3 P3, 2 Nit, 2 Note.

The P2 (marker-before-enqueue) is the only finding that affects the feature's delivery guarantee. Seven reviewers flagged it independently; Mafuuu's framing is sharpest: "The notification IS the only user-visible outcome. The marker serves solely to deduplicate the notification. When the notification is lost, the marker has prevented the one thing it was protecting."

"CRF-4 is not a GitHub/Linear ticket and carries no inline substance. It reads as a plan-document or review-round identifier. Strip it. The invariant itself ('exactly one reminder per deadline') earns its lines; the CRF-4 label does not." (Gon)

🤖 This review was automatically generated with Coder Agents.

Comment thread coderd/autobuild/lifecycle_executor.go
Comment thread coderd/autobuild/lifecycle_executor.go Outdated
Comment thread coderd/autobuild/lifecycle_executor.go
Comment thread coderd/database/queries/workspacebuilds.sql Outdated
Comment thread coderd/autobuild/lifecycle_executor.go Outdated
Comment thread coderd/autobuild/lifecycle_executor.go
Comment thread coderd/autobuild/lifecycle_executor_test.go
Comment thread coderd/database/queries/workspaces.sql
Add a reminder-dispatch pass to the lifecycle executor that notifies a workspace owner a template-configured duration (time_til_autostop_notify) before their workspace is automatically stopped.

Per tick, a separate scan (GetWorkspacesEligibleForAutostopReminder) finds running workspaces whose deadline is within the lead window and not yet reminded for that deadline. Inside a per-workspace advisory-locked RepeatableRead transaction it re-validates volatile conditions and stamps workspace_builds.notified_autostop_deadline, then enqueues TemplateWorkspaceAutostopReminder after commit. If the enqueue fails, the marker is re-armed so a later tick retries. Idempotent per deadline and HA-safe; at most one reminder even when the lead exceeds a workspace's remaining lifetime.
@sreya sreya force-pushed the autostop-notif/04-executor branch from ce3eae6 to 14fef35 Compare June 22, 2026 18:42
sreya commented Jun 22, 2026

Collaborator Author

/coder-agents-review

@coder-agents-review coder-agents-review Bot left a comment

Contributor

All 7 substantive findings from round 1 were addressed in 14fef35. Each fix verified by multiple reviewers:

CRF-1 (P2): Re-arm logic added at line 716 clears the marker on enqueue failure so the next tick retries. SelfHealsOnEnqueueFailure integration test exercises the full failure-then-retry path.
CRF-2 (P3): Comment rewritten to "volatile eligibility conditions"; dormancy/deletion guards added.
CRF-3 (P3): TestShouldRemindAutostop with 7 table-driven cases, 100% branch coverage.
CRF-4 (P3): SQL now sets updated_at; Go passes dbtime.Now().
CRF-5, CRF-6 (Nit): CRF-4 prefix removed, implementation motivation dropped.
CRF-7 (Note): Tests assert deadline values via WithinDuration and label format via Equal.
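
The CRF-1 fix (stamp the marker, enqueue after commit, clear the marker if the enqueue fails) can be sketched in memory as follows. This is an illustration of the ordering only, under the assumption that clearing the marker is enough for a later tick to re-select the workspace; `build`, `enqueueFunc`, `remind`, and `retryDemo` are hypothetical names, and the real code performs the stamp and the re-arm as database updates.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// build is a hypothetical in-memory stand-in for a workspace build row.
type build struct {
	Deadline         time.Time
	NotifiedDeadline time.Time // marker: the deadline last reminded for
}

// enqueueFunc is a hypothetical stand-in for the notification enqueuer.
type enqueueFunc func() error

// remind stamps the marker first (standing in for the committed
// transaction), then enqueues. If the enqueue fails it clears the marker
// so that a later tick retries instead of losing the reminder.
func remind(b *build, enqueue enqueueFunc) bool {
	if b.NotifiedDeadline.Equal(b.Deadline) {
		return false // already reminded for this deadline
	}
	b.NotifiedDeadline = b.Deadline // "commit" the marker
	if err := enqueue(); err != nil {
		b.NotifiedDeadline = time.Time{} // re-arm: clear the marker
		return false
	}
	return true
}

// retryDemo runs three ticks against an enqueuer that fails only once.
func retryDemo() (first, second, third bool) {
	b := &build{Deadline: time.Date(2026, 6, 22, 13, 0, 0, 0, time.UTC)}
	calls := 0
	flaky := func() error {
		calls++
		if calls == 1 {
			return errors.New("transient enqueue failure")
		}
		return nil
	}
	first = remind(b, flaky)  // enqueue fails, marker is re-armed
	second = remind(b, flaky) // retry succeeds
	third = remind(b, flaky)  // marker matches, no duplicate
	return first, second, third
}

func main() {
	fmt.Println(retryDemo()) // prints false true false
}
```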

One new P4 inline (CRF-9), a narrow edge case during shutdown. Not blocking.

"Three layers of idempotency (marker, lock, dedupe hash). Ran SelfHealsOnEnqueueFailure 3x with -race; clean." (Hisoka)

🤖 This review was automatically generated with Coder Agents.

// marker would permanently suppress this deadline's only notification.
// The notifications subsystem's own dedupe_hash prevents same-day
// duplicates if the enqueue actually partially succeeded.
if rearmErr := e.db.UpdateWorkspaceBuildNotifiedAutostopDeadline(e.ctx, database.UpdateWorkspaceBuildNotifiedAutostopDeadlineParams{

Contributor

P4 [CRF-9] The re-arm uses e.ctx. If the executor is shutting down (context cancelled), the enqueue fails with context.Canceled, then the re-arm also fails with context.Canceled. The committed marker stays stamped, and no future tick re-selects this workspace for this deadline. The reminder is silently lost.

The window is extremely narrow (milliseconds between transaction commit and enqueue, during process shutdown). The consequence is one missed courtesy notification. The dormancy notification path has the same pattern without any re-arm at all, so this PR is strictly more robust than the status quo.

A detached context (e.g., context.WithTimeout(context.Background(), 5*time.Second)) for the re-arm would close this gap without introducing new risks, since the re-arm is a single idempotent UPDATE.

(Hisoka)
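
A minimal sketch of the suggestion, assuming the re-arm is a single call that fails once its context is done. The `rearm` function below is a hypothetical stand-in for the idempotent UPDATE and only checks the context; `demo` is likewise illustrative. It contrasts a cancelled executor context with a detached, time-bounded one.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// rearm is a hypothetical stand-in for the single idempotent UPDATE that
// clears the marker. Like a real database call, it fails if ctx is done.
func rearm(ctx context.Context) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	return nil
}

// demo compares a re-arm on a cancelled executor context with a re-arm
// on a detached context that has its own short timeout.
func demo() (withExecutorCtx, withDetachedCtx error) {
	execCtx, cancel := context.WithCancel(context.Background())
	cancel() // simulate the executor shutting down

	withExecutorCtx = rearm(execCtx)

	// Detached from the executor's lifetime, but still bounded in time.
	detached, cancelDetached := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancelDetached()
	withDetachedCtx = rearm(detached)
	return withExecutorCtx, withDetachedCtx
}

func main() {
	a, b := demo()
	fmt.Println(errors.Is(a, context.Canceled)) // prints true
	fmt.Println(b == nil)                       // prints true
}
```

On Go 1.21 or later, context.WithoutCancel applied to the executor context and then wrapped in a timeout is another option; it keeps any values carried by the parent context while dropping its cancellation.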

🤖
