Skip to content

fix: align autostart tests with persisted next_start_at#26037

Open
zedkipp wants to merge 1 commit into
mainfrom
zedkipp/fix-autostart-next-start-at-race
Open

fix: align autostart tests with persisted next_start_at#26037
zedkipp wants to merge 1 commit into
mainfrom
zedkipp/fix-autostart-next-start-at-race

Conversation

@zedkipp
Copy link
Copy Markdown
Contributor

@zedkipp zedkipp commented Jun 3, 2026

TestExecutorAutostartOK and its sibling positive autostart tests compute the autobuild tick from sched.Next(workspace.LatestBuild.CreatedAt), but the server persists next_start_at from the build's completion time. When build creation and completion straddle the schedule's next fire time, the persisted value advances past the test's tick, the executor's eligibility query (next_start_at <= tick) drops the workspace, and the test fails with an empty transitions map. This surfaced in flaky test runs.

Add coderdtest.NextAutostartTick(t, workspace) which returns *workspace.NextStartAt, and use it across the affected positive autostart paths in coderd/autobuild, coderd, and enterprise/coderd.

Closes coder/internal#1456

Generated with assistance from Coder Agents.

@zedkipp
Copy link
Copy Markdown
Contributor Author

zedkipp commented Jun 4, 2026

/coder-agents-review

@coder-agents-review
Copy link
Copy Markdown
Contributor

coder-agents-review Bot commented Jun 4, 2026

Chat: Review in progress | View chat
Requested: 2026-06-04 16:44 UTC by @zedkipp

deep-review v0.7.1 | Round 1 | 167ac7b..387fc54

Last posted: Round 1, 4 findings (1 P3, 2 Nit, 1 Note), COMMENT. Review

Finding inventory

Findings

# Sev Status Location Summary Round Reviewer Posted
CRF-1 P3 Open lifecycle_executor_test.go:1243 Missed sibling: TestExecutorRequireActiveVersion still uses old sched.Next pattern R1 Netero, Bisky P3, Mafu-san P2, Ryosuke P3, Kite P3 Yes
CRF-2 Nit Open coderdtest.go:1846 Doc comment on NextAutostartTick is verbose for a 3-line function R1 Gon P2 (downgraded: comment is correct and prevents race reintroduction; verbosity is style) Yes
CRF-3 Nit Open enterprise/coderd/workspaces_test.go:1520 Pre-existing typo "Kick of" should be "Kick off" R1 Gon Yes
CRF-4 Note Open coderdtest.go:1860 require.NotNil called from non-test goroutines at 6 call sites; follows existing pattern R1 Meruem Yes

Contested and acknowledged

(none)

Round log

Round 1

Panel. 1 P3, 2 Nit, 1 Note. Reviewed against 167ac7b..387fc54.

About deep-review

CRF = Coder Review Finding (P0-P4, Nit, Note)

Reviewer Focus
Bisky tests
Chopper ops/errors
Churn-guard change verification
Ging language modernization
Gon naming
Hisoka edge cases
Killua perf
Kite change integrity
Knov contracts
Knuckle SQL
Kurapika security
Law decomposition
Leorio docs
Luffy product
Mafu-san process
Mafuuu contracts
Melody dispatch/pairing
Meruem structural
Nami frontend
Netero mechanical checks
Pariston premise testing
Pen-botter product gaps
Razor verification
Robin duplication
Ryosuke Go arch
Takumi concurrency
Zoro shape

🤖 Managed by Coder Agents.

Copy link
Copy Markdown
Contributor

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean, well-targeted fix. The core insight is right: tests should consume the system's computed schedule state rather than recomputing it from a different base time. The NextAutostartTick helper is minimal, well-guarded, and the enterprise fix at line 1548 (capturing ws from MustTransitionWorkspace) shows the author traced the data dependency correctly. The root cause analysis in the commit message is precise.

As Ryosuke put it: "don't recompute what the system already computed."

1 P3, 2 Nit, 1 Note. The P3 is a missed sibling that's subject to the same race this PR fixes.


coderd/autobuild/lifecycle_executor_test.go:1243

P3 [CRF-1] TestExecutorRequireActiveVersion still uses the old sched.Next(ws.LatestBuild.CreatedAt) pattern in a positive autostart test. Line 1247 asserts require.Len(t, stats.Transitions, 1), it uses a real provisioner with no mocked clock, and it's subject to the same straddling race as the eight sites this PR fixes. When the stop build at line 1235 straddles the hour boundary, the executor filters the workspace out and the test fails with an empty transitions map.

Fix: tickTime := coderdtest.NextAutostartTick(t, ws). The workspace is refreshed from MustTransitionWorkspace at line 1235, so ws.NextStartAt is populated.

This is the gap between "fix siblings" and "fix the one I found."

(Netero P3, Bisky P3, Mafu-san P2, Ryosuke P3, Kite P3)

🤖

🤖 This review was automatically generated with Coder Agents.

Comment thread coderd/coderdtest/coderdtest.go Outdated
Comment thread enterprise/coderd/workspaces_test.go
Comment thread coderd/coderdtest/coderdtest.go
TestExecutorAutostartOK and its sibling positive autostart tests
computed the autobuild tick from sched.Next(workspace.LatestBuild.CreatedAt).
The server sets workspace.next_start_at from the build's completion time,
and sched.Next is strictly monotonic, so when build creation and completion
straddle the schedule's next fire time the test's tick is strictly less
than the persisted next_start_at. The executor's eligibility query
(next_start_at <= currentTick) then filters the workspace out and the
test fails with an empty transitions map.

Add coderdtest.NextAutostartTick(t, workspace) which returns
*workspace.NextStartAt with a non-nil assertion, and use it in the
affected positive autostart paths in coderd/autobuild, coderd, and
enterprise/coderd. Negative tests, mocked-clock prebuild paths, and
paths that assert no transition are unchanged.

Closes coder/internal#1456

Generated with assistance from Coder Agents.
@zedkipp zedkipp force-pushed the zedkipp/fix-autostart-next-start-at-race branch from 387fc54 to 90cfbf4 Compare June 4, 2026 21:11
@zedkipp zedkipp requested review from Emyrk and johnstcn June 4, 2026 22:06
@zedkipp zedkipp marked this pull request as ready for review June 4, 2026 22:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

flake: TestExecutorAutostartOK

1 participant