Skip to content

chore(lint): ban offset-paginated full-set scan loops over db.fns.*#8591

Open
petemoore wants to merge 2 commits into
mainfrom
pmoore/lint-rule-offset-pagination
Open

chore(lint): ban offset-paginated full-set scan loops over db.fns.*#8591
petemoore wants to merge 2 commits into
mainfrom
pmoore/lint-rule-offset-pagination

Conversation

@petemoore
Copy link
Copy Markdown
Member

@petemoore petemoore commented May 7, 2026

Adds a no-restricted-syntax ESLint rule matching the exact AST shape of the bug class fixed in #8586 / #8587 / #8588: a loop containing both an await db.fns.X(...) call and an <id> += <literal> counter increment. The rule message points new contributors at paginatedIterator from @taskcluster/lib-postgres as the correct keyset alternative.

Selector

:matches(WhileStatement, DoWhileStatement, ForStatement, ForInStatement, ForOfStatement)
  :has(AwaitExpression CallExpression MemberExpression[property.name="fns"])
  :has(AssignmentExpression[operator="+="][right.type="Literal"])

A loop is flagged only when both conditions hold simultaneously inside the loop body:

  1. An await db.fns.X(...) call (member access via .fns).
  2. An <id> += <literal> increment (the counter advancement).

The combined match has near-zero false-positive rate: this combination is exactly the offset-pagination bug shape and is virtually never the right thing to write deliberately.

Verified across the codebase

After running yarn lint against all services/, libraries/, and db/ JS, only one site matched: db/test/versions/0122_test.js, where the test deliberately exercises the offset-paginated get_tasks_by_task_group_projid function in a no-concurrent-writes environment to verify stable ORDER BY ordering. Suppressed inline with a one-line eslint-disable-next-line comment explaining the intent.

The rule was tested manually against fixtures covering:

  • the bug pattern (fires correctly)
  • healthy keyset pagination using paginatedIterator (no false positive)
  • counter-only loop with += literal but no db.fns.* call (no false positive)
  • db.fns.* call with non-literal += (e.g. cumulativeWeight += config.weight from launch-config-selector.js — no false positive)

Stacked on #8590

This branch is based on the commit from #8590 (the scoperesolver fix). Until #8590 is merged, this PR's diff includes both commits; once #8590 lands the diff will clean up to just this rule + suppression. Do not merge this until #8590 is merged.

References

Test plan

  • yarn lint — full repo passes
  • Manual fixture tests confirm rule fires on bug pattern, not on healthy variants
  • CI green
  • Pete review

🤖 Generated with Claude Code

@github-project-automation github-project-automation Bot moved this to Backlog / Inbox in TC intake board May 7, 2026
@petemoore petemoore moved this from Backlog / Inbox to In progress in TC intake board May 7, 2026
@petemoore petemoore self-assigned this May 7, 2026
@petemoore petemoore marked this pull request as ready for review May 7, 2026 11:59
@petemoore petemoore requested a review from a team as a code owner May 7, 2026 11:59
@petemoore petemoore requested review from Eijebong, lotas and matt-boris and removed request for a team May 7, 2026 11:59
@petemoore petemoore moved this from In progress to In review in TC intake board May 7, 2026
@lotas
Copy link
Copy Markdown
Contributor

lotas commented May 12, 2026

I don't think we need to change anything here, existing query uses page size 1000 , and currently community-tc has 29 clients, and fxci has 424

Plus I believe that whole architecture around those full reloads should be likely changed, since auth service is the most heavy one, we could have >50 pods running at the same time, and on each client change it would fire those expensive full reads. Which makes it even worse when several clients are updated at once through config change.

So I'd probably not touch this for now

petemoore added 2 commits May 18, 2026 16:21
Offset-based pagination over the clients table during scoperesolver's
in-memory cache reload could silently skip or duplicate clients when
concurrent inserts/expirations shifted the result set mid-scan, causing
transient authentication failures for newly-created clients until the
next reload. Same root cause as #8586 / #8587 in worker-manager.

Switch the scoperesolver scan loop, the auth listClients API handler,
and the static-clients sync to a new keyset-paginated DB function
get_clients_after in version 0126. The deprecated offset variant
get_clients is retained for backwards-compatibility but is no longer
called.

Fixes #8588
Add a `no-restricted-syntax` rule matching the AST shape of the bug
class fixed in #8586 / #8587 / #8588: a loop that contains both an
`await db.fns.X(...)` call and an `<id> += <literal>` increment.
The rule message points at `paginatedIterator` from
`@taskcluster/lib-postgres` as the correct keyset alternative.

Across the entire codebase only one site matched after the
scoperesolver fix in #8590: `db/test/versions/0122_test.js`, where
the test intentionally exercises offset pagination of the
`get_tasks_by_task_group_projid` function in a no-concurrent-writes
environment to verify stable ORDER BY ordering. That single case is
suppressed inline with a one-line `eslint-disable-next-line` comment
explaining the intent.

Fixes #8589
@petemoore petemoore force-pushed the pmoore/lint-rule-offset-pagination branch from 3cc43b9 to 5a28907 Compare May 18, 2026 14:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In review

Development

Successfully merging this pull request may close these issues.

2 participants