Bulldozer DB#1285

Open
N2D4 wants to merge 38 commits into dev from
bulldozer-db

Conversation


@N2D4 commented Mar 24, 2026

Note

Medium Risk
Medium risk: this PR introduces a new persistent Postgres table (BulldozerStorageEngine) with constraints and seed data, plus a large new dev-only studio server that issues raw SQL mutations. Failures could impact migrations or local data during development.

Overview
Adds a new Postgres-backed hierarchical storage layer for Bulldozer via a BulldozerStorageEngine table (generated parent-path column, self-referential FK with cascade, indexes) and a matching Prisma model, plus migration tests validating query patterns and constraints.
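
As a minimal sketch of the hierarchy model (names and shapes are assumptions, not the actual migration code): each row's location is a key path, and the parent path falls out by dropping the last segment, which is what a generated column can compute on the Postgres side.

```typescript
// Hypothetical illustration of the keyPath/parent relationship described
// above. The real table stores keyPath as jsonb[] and derives the parent
// in a generated column; this mirrors that derivation in TypeScript.
type KeyPath = unknown[];

function parentPath(keyPath: KeyPath): KeyPath | null {
  // Root paths (length 0) have no parent, matching a NULL parent column.
  if (keyPath.length === 0) return null;
  // The parent is the same path minus its last segment, e.g.
  // ["table", "users", "row-42"] has parent ["table", "users"].
  return keyPath.slice(0, -1);
}
```

A self-referential FK on such a derived parent column is what gives the cascade-delete behavior: removing a node removes its whole subtree.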

Introduces a new local-only run-bulldozer-studio dev tool (started from pnpm dev) that serves an interactive UI and API for inspecting table graphs, running table init/delete and row mutations, and browsing/upserting/deleting raw storage paths with an auth token/origin guard; adds elkjs for graph layout.

Adds Bulldozer sort-helper SQL (BULLDOZER_SORT_HELPERS_SQL) and an example fungible-ledger schema wiring multiple Bulldozer table operators. It also includes a couple of small robustness tweaks: cron job startup is delayed, passkey registration hints are stripped more safely, and registrationInfo presence is now asserted.

Reviewed by Cursor Bugbot for commit 5b69a18. Bugbot is set up for automated code reviews on this repo.

Summary by CodeRabbit

  • New Features

    • Added Bulldozer Studio — interactive web UI to browse table graphs, inspect/edit raw hierarchical storage, and run table-level actions.
  • Infrastructure

    • Introduced persistent hierarchical storage with a DB migration and server API to expose schema/table details.
  • Tests

    • Added extensive integration, fuzz and performance test suites for the new table system.
  • Chores

    • Updated dev start scripts and added a backend dependency to support the studio.
  • Documentation

    • Clarified lint usage and added guidance to document tradeoffs.


vercel bot commented Mar 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

  • stack-auth-hosted-components: Ready (Preview, Comment), updated Apr 8, 2026 5:41pm (UTC)
  • stack-backend: Error, updated Apr 8, 2026 5:41pm (UTC)
  • stack-dashboard: Ready (Preview, Comment), updated Apr 8, 2026 5:41pm (UTC)
  • stack-demo: Ready (Preview, Comment), updated Apr 8, 2026 5:41pm (UTC)
  • stack-docs: Error, updated Apr 8, 2026 5:41pm (UTC)


coderabbitai bot commented Mar 24, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds a Bulldozer feature suite: new storage engine table and migration, extensive Bulldozer table operators and SQL helpers, a standalone "Bulldozer Studio" dev HTTP server + UI, fuzz/perf tests, and supporting scripts/packaging and minor editor/docs tweaks.

Changes

  • Editor / Docs (.vscode/settings.json, AGENTS.md, claude/CLAUDE-KNOWLEDGE.md): spell-check dictionary edits; lint/dev guidance extended; added Bulldozer/Postgres operational Q&A.
  • Backend dev tooling (apps/backend/package.json, apps/dev-launchpad/public/index.html, apps/backend/scripts/run-cron-jobs.ts): added the run-bulldozer-studio script and the elkjs dependency; launchpad UI entry for Bulldozer Studio; cron workers delay their first run by 30s.
  • Database migration & schema (apps/backend/prisma/migrations/.../migration.sql, .../tests/ltree-queries.ts, apps/backend/prisma/schema.prisma): new BulldozerStorageEngine table (jsonb[] keyPath, generated parent, unique constraint, FK, index) with seeded roots; migration tests validating behavior and constraints; Prisma model added (with unsupported jsonb[] field).
  • Standalone Studio server (apps/backend/scripts/run-bulldozer-studio.ts, apps/dev-launchpad/public/index.html): new TypeScript HTTP server serving the SPA and JSON API endpoints for schema, table details, raw storage inspect/upsert/delete, and table mutations (init/delete/set-row/delete-row).
  • Bulldozer core SQL helpers (apps/backend/src/lib/bulldozer/db/bulldozer-sort-helpers-sql.ts, .../utilities.ts, .../index.ts): added a large SQL helper blob for pg_temp sort helpers and a typed SQL-construction utilities module; new central index exposing Table types, toExecutableSql* helpers, and re-exports of table factories.
  • Bulldozer table operators (apps/backend/src/lib/bulldozer/db/tables/*: stored-, group-by, map, flat-map, filter, concat, limit, sort, l-fold, left-join): added many new table factory implementations (declareStoredTable, declareGroupByTable, declareMapTable, declareFlatMapTable, declareFilterTable, declareConcatTable, declareLimitTable, declareSortTable, declareLFoldTable, declareLeftJoinTable) implementing init/delete/isInitialized/list/registerRowChangeTrigger and incremental change-propagation SQL.
  • Example schema & tests (apps/backend/src/lib/bulldozer/db/example-schema.ts, .../index.fuzz.test.ts, .../index.perf.test.ts): added an example fungible-ledger schema plus large fuzz and perf test suites that create ephemeral DBs, provision the storage engine, run randomized mutations, and measure/validate correctness and performance.
  • Prisma client change (apps/backend/src/prisma-client.tsx): replaced a static import with a dynamic await import("@/stack") in connection-string resolution to defer module load.
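
The deferred-import change above follows a common pattern; here is a hedged sketch of it, using the built-in "node:crypto" module as a stand-in for the project-specific "@/stack" module, with memoization so the import cost is paid at most once.

```typescript
// Sketch of the dynamic-import deferral pattern (stand-in module: the
// real code imports "@/stack"). The module is loaded on first use and
// the resolved module object is cached for subsequent calls.
let cached: typeof import("node:crypto") | undefined;

async function getCrypto() {
  // ??= only evaluates the import the first time through.
  cached ??= await import("node:crypto");
  return cached;
}

async function makeId(): Promise<string> {
  const crypto = await getCrypto();
  return crypto.randomUUID();
}
```

Deferring the import this way avoids pulling in a heavy module (and its transitive imports) at startup when the value is only needed on some code paths.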

Sequence Diagram(s)

sequenceDiagram
  participant Browser
  participant StudioServer as "Bulldozer Studio\n(HTTP Server)"
  participant BulldozerLib as "Bulldozer Module\n(SQL helpers & Table ops)"
  participant Postgres

  Browser->>StudioServer: GET / (SPA) / API requests
  StudioServer->>BulldozerLib: toExecutableSqlTransaction / handler (init/list/set/delete)
  BulldozerLib->>Postgres: Execute SQL (storage engine reads/writes, temp helpers)
  Postgres-->>BulldozerLib: Rows / success / errors
  BulldozerLib-->>StudioServer: Results / errors
  StudioServer-->>Browser: JSON response / HTML

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Suggested reviewers

  • BilalG1

Poem

"I hopped into code at break of day,
Keys clacking like carrots in a fray.
New tables dug, and trees arranged,
Studio lights flicker, graphs exchanged.
Tiny paws applaud the dev bouquet 🐇"

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 2.06%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Title check: ✅ Passed. The title 'Bulldozer DB' is concise and clearly describes the main feature being added: a persistent database storage layer for Bulldozer with associated tooling and tests.
  • Description check: ✅ Passed. The PR description provides a clear overview of major changes, including database schema additions, new development tooling, and testing infrastructure with risk assessment.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 13

🧹 Nitpick comments (7)
apps/backend/prisma/schema.prisma (1)

1250-1258: Consider adding createdAt/updatedAt timestamps for observability.

Most models in this schema include createdAt and updatedAt fields. If omitted intentionally for performance reasons (high-frequency writes), that's acceptable—just flagging for consistency with other models.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/prisma/schema.prisma` around lines 1250 - 1258, The
BulldozerStorageEngine model is missing createdAt/updatedAt timestamps used
elsewhere for observability; add createdAt DateTime `@default`(now()) and
updatedAt DateTime `@updatedAt` fields to the BulldozerStorageEngine model
(referencing the model name and field identifiers) so new records record
creation time and updates auto-update, or if omission is intentional for
performance, add a comment in the model noting that rationale instead of leaving
them out.
apps/backend/src/lib/bulldozer/db/index.ts (1)

11-36: Keep Table in one place.

apps/backend/src/lib/bulldozer/db/table-type.ts already defines this public interface. Duplicating it here gives the module two sources of truth, which is easy to drift the next time the contract changes. Re-export the canonical type instead.

♻️ Suggested cleanup
-import type { Json, RowData, RowIdentifier, SqlExpression, SqlQuery, SqlStatement, TableId } from "./utilities";
+import type { SqlQuery, SqlStatement } from "./utilities";
 import { quoteSqlIdentifier } from "./utilities";
+export type { Table } from "./table-type";
 
-// ====== Table implementations ======
-// IMPORTANT NOTE: For every new table implementation, we should also add tests (unit, fuzzing, & perf; including an entry in the "hundreds of thousands" perf test), an example in the example schema, and support in Bulldozer Studio.
-
-export type Table<GK extends Json, SK extends Json, RD extends RowData> = {
-  tableId: TableId,
-  inputTables: Table<any, any, any>[],
-  debugArgs: Record<string, unknown>,
-  ...
-};
+// ====== Table implementations ======
+// IMPORTANT NOTE: For every new table implementation, we should also add tests (unit, fuzzing, & perf; including an entry in the "hundreds of thousands" perf test), an example in the example schema, and support in Bulldozer Studio.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/src/lib/bulldozer/db/index.ts` around lines 11 - 36, The Table
type is duplicated here; remove the local duplicate definition and re-export the
canonical interface from the existing declaration in table-type.ts instead.
Replace the local type block in this file with a re-export of the Table symbol
from the module that defines it (table-type.ts) so there is a single source of
truth, and update any local references in this file to use that re-exported
Table type (keep function names like listGroups, listRowsInGroup,
compareGroupKeys, compareSortKeys, init, delete, isInitialized,
registerRowChangeTrigger unchanged). Ensure exports remain the same shape so
other modules importing Table from this file keep working.
apps/backend/src/lib/bulldozer/db/tables/concat-table.ts (1)

32-32: rawExpression bypasses SQL template safety - use with caution.

This helper creates an SqlExpression from a raw string without the template literal safety checks. While necessary for programmatic SQL building, it shifts injection-safety responsibility to callers.

Consider adding a brief comment noting that callers must ensure the input is safe, or rename to unsafeRawExpression to make the contract explicit.

Suggested documentation
-  const rawExpression = <T>(sql: string): SqlExpression<T> => ({ type: "expression", sql });
+  // SAFETY: Callers must ensure `sql` is sanitized or derived from trusted sources
+  const rawExpression = <T>(sql: string): SqlExpression<T> => ({ type: "expression", sql });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/src/lib/bulldozer/db/tables/concat-table.ts` at line 32, The
function rawExpression currently constructs an SqlExpression<T> from an
arbitrary string and thus bypasses template literal SQL safety; update the
implementation by either renaming rawExpression to unsafeRawExpression to make
the unsafe contract explicit or add a clear inline comment above rawExpression
calling out that callers are responsible for sanitizing inputs to avoid SQL
injection, referencing the SqlExpression type and any callers that use
rawExpression so they can be audited for safety.
apps/backend/src/lib/bulldozer/db/tables/flat-map-table.ts (2)

213-214: Minor: Extra whitespace in compareSortKeys expression.

The expression sqlExpression` 0 ` has a leading space. This is functionally harmless in SQL but inconsistent with concat-table.ts line 165 which uses sqlExpression`0`.

Suggested fix for consistency
-    compareSortKeys: (a, b) => sqlExpression` 0 `,
+    compareSortKeys: (a, b) => sqlExpression`0`,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/src/lib/bulldozer/db/tables/flat-map-table.ts` around lines 213
- 214, The compareSortKeys assignment uses sqlExpression with an extra leading
space (sqlExpression` 0 `); update the compareSortKeys lambda to use
sqlExpression`0` (no leading/trailing space) so it matches the style used in
concat-table.ts and removes the inconsistent whitespace around the literal;
locate the compareSortKeys definition and replace the template literal
accordingly.

29-30: Position-based row identifiers may cause unnecessary churn.

The expanded row identifier "sourceId:flatIndex" uses ordinal position from WITH ORDINALITY. If the mapper output shifts (e.g., [A,B,C][A,X,B,C]), rows at indices 2+ get new identifiers even if their content is unchanged. This triggers downstream change propagation for stable content.

This may be intentional for simplicity, but if stable content tracking matters, consider using content-based deduplication or stable keys from the mapper.
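The content-based alternative mentioned above could be sketched as follows (the function name and shape are hypothetical, not the PR's actual API): derive the identifier from a hash of the mapped element, so identical content keeps the same identifier regardless of ordinal position.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch: a flat-map row identifier derived from the mapped
// element's content rather than its WITH ORDINALITY index, so inserting an
// element earlier in the mapper output does not rename the rows after it.
function stableRowIdentifier(sourceRowIdentifier: string, mappedElement: unknown): string {
  const contentHash = createHash("sha256")
    .update(JSON.stringify(mappedElement))
    .digest("hex")
    .slice(0, 16);
  return `${sourceRowIdentifier}:${contentHash}`;
}
```

One caveat a real implementation would need to handle: two identical mapped elements from the same source row would collide under a pure content hash, so a disambiguating suffix among equal elements would still be required.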

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/src/lib/bulldozer/db/tables/flat-map-table.ts` around lines 29 -
30, The current createExpandedRowIdentifier builds identifiers by concatenating
sourceRowIdentifier and the ordinality-based flatIndex (generated via WITH
ORDINALITY), which causes identifier churn when mapper output shifts; change
this to produce stable identifiers by deriving a content-based key instead of
using positional index—e.g., compute a deterministic hash of the mapped element
(or use a stable key field provided by the mapper) and combine that with
sourceRowIdentifier; update createExpandedRowIdentifier to accept a
content-based SqlExpression (e.g., mappedElement or mappedElementHash) instead
of flatIndex and ensure callers that invoke createExpandedRowIdentifier supply
the stable key/hash rather than the ordinal.
apps/backend/src/lib/bulldozer/db/utilities.ts (2)

40-42: Consider additional character escaping for edge cases.

The single-quote escaping is correct for standard PostgreSQL string literals. However, null bytes (\0) and certain Unicode sequences could potentially cause issues in some configurations. For a storage engine that may handle arbitrary user data, consider whether additional sanitization is warranted.
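
A minimal sketch of the suggested hardening (the actual utility's signature may differ): keep the standard single-quote doubling, but reject null bytes up front, since Postgres string literals cannot contain \0.

```typescript
// Hedged sketch, not the repo's implementation: quote a value as a
// Postgres string literal. Single quotes are doubled per standard SQL;
// null bytes are rejected because Postgres cannot store \0 in text.
function quoteSqlStringLiteral(value: string): string {
  if (value.includes("\0")) {
    throw new Error("SQL string literals cannot contain null bytes");
  }
  return `'${value.replace(/'/g, "''")}'`;
}
```

Whether to also normalize Unicode (e.g. NFC) is a policy choice; rejecting null bytes is the part that prevents outright broken literals.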

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/src/lib/bulldozer/db/utilities.ts` around lines 40 - 42,
quoteSqlStringLiteral currently only escapes single quotes which can miss edge
cases like null bytes and problematic Unicode; update it to normalize and
explicitly handle such characters. Modify the quoteSqlStringLiteral function to:
1) reject or encode null bytes (e.g., throw on '\0' or replace with an explicit
escape sequence) and 2) normalize input (e.g., NFC) and optionally escape or
validate noncharacter code points and surrogate pairs before building the
SqlExpression; ensure the resulting sql string still uses the existing
single-quote doubling logic and that the function signature
(quoteSqlStringLiteral and the returned SqlExpression<string>) remains
unchanged.

78-87: Add a docstring clarifying the null sort key semantics.

This function lacks documentation explaining its behavior. When either bound is exclusive and not unbounded, it returns false (no matches), which is mathematically correct—null cannot satisfy > x or < x—but may surprise callers. A docstring explaining this edge case would clarify the intentional design.
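
The documented semantics could be captured roughly like this (a sketch under assumptions: the real function emits an always-true or always-false SQL literal, and the inclusive-bound behavior shown here is inferred from the description above rather than read from the source).

```typescript
type Bound<T> = { value: T, inclusive: boolean } | "unbounded";

/**
 * Computes whether a NULL sort key can fall inside the given range.
 * "unbounded" means no constraint on that side. If either side has a
 * finite exclusive bound, the result is false: NULL can never satisfy
 * a strict > or < comparison. Sketch only; the real helper returns an
 * SqlExpression<boolean> that is an always-true/always-false literal.
 */
function nullSortKeyInRange<T>(start: Bound<T>, end: Bound<T>): boolean {
  if (start !== "unbounded" && !start.inclusive) return false;
  if (end !== "unbounded" && !end.inclusive) return false;
  return true;
}
```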

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/src/lib/bulldozer/db/utilities.ts` around lines 78 - 87, Add a
docstring to singleNullSortKeyRangePredicate that explains it computes whether a
NULL sort key can satisfy the provided range: document the options parameters
(start, end, startInclusive, endInclusive), state that when a bound is the
literal "start" or "end" it means unbounded, and explicitly call out the edge
case where either bound is exclusive and not unbounded—this function will return
false because NULL cannot satisfy strict > or < comparisons; also mention the
returned SqlExpression<boolean> semantics (always-true or always-false SQL
literal).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/scripts/run-bulldozer-studio.ts`:
- Around line 2038-2046: Reject raw delete requests that target the reserved
root paths by validating pathSegments before running retryTransaction: if
pathSegments is an empty array or exactly ["table"] then return a client error
(400) / throw an appropriate API error and do not call retryTransaction or
tx.$executeRawUnsafe. Add this check immediately after pathSegments is extracted
(the variable named pathSegments in the POST "/api/raw/delete" handler) and
include a clear error message; leave all existing DB deletion logic
(keyPathSqlLiteral, retryTransaction, globalPrismaClient, tx.$executeRawUnsafe)
unchanged otherwise.
- Around line 2022-2046: Both /api/raw/upsert and /api/raw/delete perform direct
mutations on "BulldozerStorageEngine" without acquiring the same advisory lock
used by executeStatements(), allowing interleaved tree edits; update both
request handlers to acquire the same Bulldozer advisory lock before running the
retryTransaction/globalPrismaClient transaction (i.e., call the same lock
acquisition used by executeStatements() at the start of the transaction scope),
then perform the existing tx.$executeRawUnsafe(...) logic (which uses
keyPathSqlLiteral and quoteSqlJsonbLiteral) and release automatically at
transaction end so raw upserts/deletes are serialized with other tree mutations.
- Around line 136-141: The call in executeStatements currently passes the whole
multi-statement script from toExecutableSqlStatements() into
tx.$executeRawUnsafe which fails because Prisma accepts a single statement per
call; change executeStatements to split the script returned by
toExecutableSqlStatements() into individual SQL statements (e.g., by
semicolon-aware parsing or the existing splitter utility) and then run each
statement sequentially inside the retryTransaction using tx.$executeRawUnsafe
for each one (keeping the SET LOCAL jit and SELECT pg_advisory_xact_lock calls
as individual executions); alternatively, route the full script through a driver
that supports multi-statement scripts if present, but prefer per-statement
execution to fix failures when pg_temp.bulldozer_sort_* is involved.

In `@apps/backend/scripts/run-cron-jobs.ts`:
- Line 28: The inline comment for the wait call is misleading: update the
comment next to the await wait(30_000) call in run-cron-jobs.ts so it reflects
the actual delay (e.g., "Wait 30 seconds to make sure the server is fully
started") or, if the intent was a shorter pause, change the numeric literal
(30_000) to the intended duration (e.g., 3_000) and keep the comment consistent;
ensure the comment and the await wait(...) value match.

In `@apps/backend/src/lib/bulldozer/db/bulldozer-sort-helpers-sql.ts`:
- Around line 573-616: bulldozer_sort_bulk_init_from_table currently hard-codes
ORDER BY "rowSortKey" ASC which conflicts with custom comparators; update the
function signature to accept compare_sort_keys_sql and use it in the dynamic
SELECT (instead of the fixed 'ORDER BY "rowSortKey" ASC, "rowIdentifier" ASC')
so ordered_rows respects the provided comparator, or alternatively change the
implementation to build the initial tree by iterating rows and calling
pg_temp.bulldozer_sort_insert for each row (keeping
pg_temp.bulldozer_sort_ensure_group, pg_temp.bulldozer_sort_build_balanced_group
and pg_temp.bulldozer_sort_put_group_metadata interactions intact) so ordering
logic matches bulldozer_sort_insert/bulldozer_sort_delete.

In `@apps/backend/src/lib/bulldozer/db/example-schema.ts`:
- Around line 103-108: The filter used to build accountEntriesWithCounterparty
currently uses the JSON operator -> which leaves JSON null values as a
non-SQL-NULL and lets rows with counterparty: null through; update the predicate
in the declareFilterTable call (accountEntriesWithCounterparty, fromTable
entriesByAccount) to use the text operator ->> instead (e.g.
("rowData"->>'counterparty') IS NOT NULL) so JSON null and missing keys are
treated as SQL NULL and correctly filtered out.

In `@apps/backend/src/lib/bulldozer/db/index.ts`:
- Around line 54-67: In toExecutableSqlStatements, always use the sequential
temp-table execution path (the sequential temp-table approach currently guarded
by requiresSortSequentialExecutor) instead of the sibling-CTE branch; remove the
conditional branch that returns the sibling-CTE string and ensure the function
builds and returns the sequential executor SQL (the temp-table / sequential
execution block that currently lives in the other branch), eliminating
requiresSortSequentialExecutor or making it always true so data-modifying
statements (DELETE/INSERT) execute in order and can observe prior changes;
update any related variables
(requiresSortHelpers/requiresSortSequentialExecutor) and tests accordingly.

In `@apps/backend/src/lib/bulldozer/db/tables/group-by-table.ts`:
- Around line 187-188: The compareGroupKeys() (and the duplicate at lines
~271-283) currently returns a constant 0 causing all groups to be treated equal;
replace the stub with a real SQL JSONB comparator and thread it through
listGroups() by updating singleNullSortKeyRangePredicate() to use the same
documented JSONB default comparator, and similarly implement compareSortKeys()
to perform proper JSONB sort-key comparison; specifically, update the
compareGroupKeys and compareSortKeys functions to emit a sqlExpression that
compares JSONB values deterministically (the same expression used by
singleNullSortKeyRangePredicate/listGroups) so group range queries and
downstream operators get a correct ordering.
- Around line 16-24: The declareGroupByTable signature and implementation are
unsound: NewRD is declared but the function always returns OldRD, fromTable
erases the sort-key type with any, and compareGroupKeys returns 0 making all
groups equal. Fix by changing the generic parameters to clearly separate input
row type (OldRD) and output/group row type (NewRD), remove the any by preserving
fromTable's sort-key generic (e.g., Table<GK, SK, OldRD>), and update
declareGroupByTable to return Table<GK, SK, NewRD> (or appropriate nullability)
while transforming OldRD -> NewRD using the provided groupBy mapper; implement
compareGroupKeys to perform a real comparison based on GK (or delegate to a
provided comparator) and throw or assert if assumptions about key shape fail
instead of silently returning 0. Ensure you update uses of fromTable, groupBy,
and compareGroupKeys to match the new generics and transformation contract.

In `@apps/backend/src/lib/bulldozer/db/tables/l-fold-table.ts`:
- Around line 716-740: listRowsInGroup currently orders results with a
hard-coded "ORDER BY ... rowSortKey ASC" which can disagree with the comparator
used when folding (options.fromTable.compareSortKeys); update listRowsInGroup to
use the same ordering expression that the fold computation uses (or an explicit
persisted ordinal) instead of raw "rowSortKey ASC". Specifically, replace the
ASC ordering in both query branches with the sort expression derived from
options.fromTable.compareSortKeys (or a stored emitted ordinal), ensuring the
sortRangePredicate and ordering use the identical comparator logic; refer to
listRowsInGroup, sortRangePredicate, getGroupRowsPath, groupsPath, rowSortKey
and options.fromTable.compareSortKeys to locate where to apply the change.

In `@apps/backend/src/lib/bulldozer/db/tables/left-join-table.ts`:
- Around line 435-457: The queries in listRowsInGroup lack ORDER BY, causing
non-deterministic row order; update both branches of listRowsInGroup to append
deterministic ORDER BY clauses: for the single-group branch (the branch with
only "row") ORDER BY the computed rowIdentifier (the
("row"."keyPath"[cardinality("row"."keyPath")] #>> '{}') expression), and for
the multi-group branch ORDER BY groupKey then rowIdentifier (groupPath keyPath
expression then ("rows"."keyPath"[cardinality("rows"."keyPath")] #>> '{}') ), so
that compareSortKeys' constant sort key is disambiguated by these stable
tie-breakers. Ensure the ORDER BY uses the same expressions/aliases used in the
SELECT so they remain correct after any refactor.

In `@apps/backend/src/lib/bulldozer/db/tables/limit-table.ts`:
- Around line 118-124: The ranking/read queries in limit-table.ts use raw JSONB
ordering ("rows"."rowSortKey") which ignores the table's declared comparator;
update the ORDER BY clauses (the ones producing "rank" and the subsequent read
queries that reference "rankedRows") to use the table's comparator by invoking
options.fromTable.compareSortKeys(...) (or the equivalent helper that emits the
comparator-aware SQL expression) instead of ordering by the raw JSONB column,
and ensure the tie-breaker "rows"."rowIdentifier" remains; apply the same fix to
the other occurrences noted (the blocks around lines 138-145, 277-283, 376-404)
so that normalizedLimit and rankedRows reflect the comparator-aware ordering.

In `@apps/backend/src/lib/bulldozer/db/tables/sort-table.ts`:
- Around line 312-334: The recursive CTE uses sort-key bounds
(options.fromTable.compareGroupKeys) to prune groups in the all-groups path (the
"groupMetadatas" block), which can incorrectly drop groups or compare the wrong
JSON shape when groupKey is omitted; update the WHERE clauses that inject
start/end comparisons so they skip calling options.fromTable.compareGroupKeys
(use a no-op condition like 1 = 1) when the query is for the all-groups path
(i.e., when groupKey is not present/omitted), and ensure any actual sort-key
filtering is deferred to the existing sortRangePredicate()/sort-range handling
later; refer to "groupMetadatas", "groupPath", "groupMetadata", and
options.fromTable.compareGroupKeys to locate and change the two conditional
SQLExpression injections for start and end.

---

Nitpick comments:
In `@apps/backend/prisma/schema.prisma`:
- Around line 1250-1258: The BulldozerStorageEngine model is missing
createdAt/updatedAt timestamps used elsewhere for observability; add createdAt
DateTime `@default`(now()) and updatedAt DateTime `@updatedAt` fields to the
BulldozerStorageEngine model (referencing the model name and field identifiers)
so new records record creation time and updates auto-update, or if omission is
intentional for performance, add a comment in the model noting that rationale
instead of leaving them out.

In `@apps/backend/src/lib/bulldozer/db/index.ts`:
- Around line 11-36: The Table type is duplicated here; remove the local
duplicate definition and re-export the canonical interface from the existing
declaration in table-type.ts instead. Replace the local type block in this file
with a re-export of the Table symbol from the module that defines it
(table-type.ts) so there is a single source of truth, and update any local
references in this file to use that re-exported Table type (keep function names
like listGroups, listRowsInGroup, compareGroupKeys, compareSortKeys, init,
delete, isInitialized, registerRowChangeTrigger unchanged). Ensure exports
remain the same shape so other modules importing Table from this file keep
working.

In `@apps/backend/src/lib/bulldozer/db/tables/concat-table.ts`:
- Line 32: The function rawExpression currently constructs an SqlExpression<T>
from an arbitrary string and thus bypasses template literal SQL safety; update
the implementation by either renaming rawExpression to unsafeRawExpression to
make the unsafe contract explicit or add a clear inline comment above
rawExpression calling out that callers are responsible for sanitizing inputs to
avoid SQL injection, referencing the SqlExpression type and any callers that use
rawExpression so they can be audited for safety.

In `@apps/backend/src/lib/bulldozer/db/tables/flat-map-table.ts`:
- Around line 213-214: The compareSortKeys assignment uses sqlExpression with an
extra leading space (sqlExpression` 0 `); update the compareSortKeys lambda to
use sqlExpression`0` (no leading/trailing space) so it matches the style used in
concat-table.ts and removes the inconsistent whitespace around the literal;
locate the compareSortKeys definition and replace the template literal
accordingly.
- Around line 29-30: The current createExpandedRowIdentifier builds identifiers
by concatenating sourceRowIdentifier and the ordinality-based flatIndex
(generated via WITH ORDINALITY), which causes identifier churn when mapper
output shifts; change this to produce stable identifiers by deriving a
content-based key instead of using positional index—e.g., compute a
deterministic hash of the mapped element (or use a stable key field provided by
the mapper) and combine that with sourceRowIdentifier; update
createExpandedRowIdentifier to accept a content-based SqlExpression (e.g.,
mappedElement or mappedElementHash) instead of flatIndex and ensure callers that
invoke createExpandedRowIdentifier supply the stable key/hash rather than the
ordinal.

In `@apps/backend/src/lib/bulldozer/db/utilities.ts`:
- Around line 40-42: quoteSqlStringLiteral currently only escapes single quotes
which can miss edge cases like null bytes and problematic Unicode; update it to
normalize and explicitly handle such characters. Modify the
quoteSqlStringLiteral function to: 1) reject or encode null bytes (e.g., throw
on '\0' or replace with an explicit escape sequence) and 2) normalize input
(e.g., NFC) and optionally escape or validate noncharacter code points and
surrogate pairs before building the SqlExpression; ensure the resulting sql
string still uses the existing single-quote doubling logic and that the function
signature (quoteSqlStringLiteral and the returned SqlExpression<string>) remains
unchanged.
- Around line 78-87: Add a docstring to singleNullSortKeyRangePredicate that
explains it computes whether a NULL sort key can satisfy the provided range:
document the options parameters (start, end, startInclusive, endInclusive),
state that when a bound is the literal "start" or "end" it means unbounded, and
explicitly call out the edge case where either bound is exclusive and not
unbounded—this function will return false because NULL cannot satisfy strict >
or < comparisons; also mention the returned SqlExpression<boolean> semantics
(always-true or always-false SQL literal).
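The edge case the docstring should call out can be modeled in plain TypeScript. This models only the rule stated above: "start"/"end" mean unbounded, and an exclusive, non-unbounded bound makes the predicate false because NULL cannot satisfy strict > or < comparisons. How NULL compares against inclusive bounds is up to the real SQL implementation; this sketch optimistically returns true there.

```typescript
// Plain-TypeScript model of singleNullSortKeyRangePredicate's documented
// edge case (the real function returns an always-true/always-false
// SqlExpression<boolean>, not a JS boolean).
function nullSortKeyCanSatisfyRange(options: {
  start: unknown, // a sort-key bound, or the literal "start" meaning unbounded
  end: unknown,   // a sort-key bound, or the literal "end" meaning unbounded
  startInclusive: boolean,
  endInclusive: boolean,
}): boolean {
  // An exclusive bound that is actually bounded can never admit NULL.
  if (options.start !== "start" && !options.startInclusive) return false;
  if (options.end !== "end" && !options.endInclusive) return false;
  return true;
}
```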
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6843e08f-a066-48c7-8364-28bebef70f53

📥 Commits

Reviewing files that changed from the base of the PR and between b8ea06f and 037d20f.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (30)
  • .vscode/settings.json
  • AGENTS.md
  • apps/backend/package.json
  • apps/backend/prisma/migrations/20260323120000_add_bulldozer_data/migration.sql
  • apps/backend/prisma/migrations/20260323120000_add_bulldozer_data/tests/ltree-queries.ts
  • apps/backend/prisma/schema.prisma
  • apps/backend/scripts/run-bulldozer-studio.ts
  • apps/backend/scripts/run-cron-jobs.ts
  • apps/backend/src/lib/bulldozer/bulldozer-schema.ts
  • apps/backend/src/lib/bulldozer/db/bulldozer-sort-helpers-sql.ts
  • apps/backend/src/lib/bulldozer/db/example-schema.ts
  • apps/backend/src/lib/bulldozer/db/index.fuzz.test.ts
  • apps/backend/src/lib/bulldozer/db/index.perf.test.ts
  • apps/backend/src/lib/bulldozer/db/index.test.ts
  • apps/backend/src/lib/bulldozer/db/index.ts
  • apps/backend/src/lib/bulldozer/db/table-type.ts
  • apps/backend/src/lib/bulldozer/db/tables/concat-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/filter-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/flat-map-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/group-by-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/l-fold-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/left-join-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/limit-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/map-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/sort-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/stored-table.ts
  • apps/backend/src/lib/bulldozer/db/utilities.ts
  • apps/backend/src/prisma-client.tsx
  • apps/dev-launchpad/public/index.html
  • claude/CLAUDE-KNOWLEDGE.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
apps/backend/src/lib/bulldozer/db/index.ts (2)

13-13: Add comment explaining any usage per coding guidelines.

Line 13 uses Table<any, any, any>[] without explanation. Per coding guidelines, when using any, leave a comment explaining why it's necessary and how type safety is preserved.

Suggested documentation
-  inputTables: Table<any, any, any>[],
+  // `any` is required here because input tables can have heterogeneous generic parameters;
+  // TypeScript lacks existential types to express "Table with some GK/SK/RD". Type safety is
+  // maintained because these tables are only used for dependency tracking and lifecycle management.
+  inputTables: Table<any, any, any>[],

As per coding guidelines: "Try to avoid the any type. Whenever you need to use any, leave a comment explaining why you're using it."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/src/lib/bulldozer/db/index.ts` at line 13, Add an inline comment
next to the inputTables parameter that explains why Table<any, any, any>[] uses
any (e.g., dynamic table schemas, third-party types, or runtime-validated
shapes) and describe how type safety is preserved (runtime validation, narrowing
later in functions like the code that processes inputTables, or use of
discriminated unions). Reference the symbol inputTables and the Table<any, any,
any> type in your comment so reviewers see the justification and the mitigation
strategy.

93-93: Consider validating statementTimeout format before interpolation.

Direct string interpolation into SQL without validation could cause issues if unexpected input is passed. While SET LOCAL statement_timeout will likely reject malformed values, validating the format defensively aligns with "fail early, fail loud" principles.

Suggested validation
 export function toExecutableSqlTransaction(statements: SqlStatement[], options: { statementTimeout?: string } = {}): string {
+  if (options.statementTimeout != null && !/^\d+\s?(ms|s|min|h)?$/.test(options.statementTimeout)) {
+    throw new Error(`Invalid statement_timeout format: ${options.statementTimeout}`);
+  }
   return deindent`
     BEGIN;

As per coding guidelines: "Fail early, fail loud. Fail fast with an error instead of silently continuing."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/src/lib/bulldozer/db/index.ts` at line 93, Validate
options.statementTimeout before interpolating into the SQL string: before the
template that builds `${options.statementTimeout ? \`SET LOCAL statement_timeout
= '\${options.statementTimeout}';\` : ""}`, add a defensive check that
options.statementTimeout is a non-empty string matching an allowed format (for
example /^\d+(ms|s|min|h)?$/ or whichever units your DB accepts) and throw a
clear error (e.g., new Error("Invalid statementTimeout format")) if it fails;
keep the rest of the interpolation unchanged so only validated values are used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/src/lib/bulldozer/db/tables/limit-table.ts`:
- Line 374: The ORDER BY in listRowsInGroup is using raw JSONB ordering (ORDER
BY rowSortKey ASC) but the WHERE bounds use options.fromTable.compareSortKeys,
causing inconsistent ordering; update the query so ordering uses the same
comparator expression used for bounds—e.g., project the comparator result (or
call the same compareSortKeys expression/function) in the SELECT as an alias and
ORDER BY that alias, or if the comparator matches JSONB semantics, document that
constraint instead; ensure references to rowSortKey and rowIdentifier remain for
stable tie-breaking.

---

Nitpick comments:
In `@apps/backend/src/lib/bulldozer/db/index.ts`:
- Line 13: Add an inline comment next to the inputTables parameter that explains
why Table<any, any, any>[] uses any (e.g., dynamic table schemas, third-party
types, or runtime-validated shapes) and describe how type safety is preserved
(runtime validation, narrowing later in functions like the code that processes
inputTables, or use of discriminated unions). Reference the symbol inputTables
and the Table<any, any, any> type in your comment so reviewers see the
justification and the mitigation strategy.
- Line 93: Validate options.statementTimeout before interpolating into the SQL
string: before the template that builds `${options.statementTimeout ? \`SET
LOCAL statement_timeout = '\${options.statementTimeout}';\` : ""}`, add a
defensive check that options.statementTimeout is a non-empty string matching an
allowed format (for example /^\d+(ms|s|min|h)?$/ or whichever units your DB
accepts) and throw a clear error (e.g., new Error("Invalid statementTimeout
format")) if it fails; keep the rest of the interpolation unchanged so only
validated values are used.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2f44a517-e596-4dcb-bfc4-95ac1d1151be

📥 Commits

Reviewing files that changed from the base of the PR and between 2403c17 and d3a2daa.

📒 Files selected for processing (12)
  • apps/backend/src/lib/bulldozer/db/index.perf.test.ts
  • apps/backend/src/lib/bulldozer/db/index.ts
  • apps/backend/src/lib/bulldozer/db/tables/concat-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/filter-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/flat-map-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/group-by-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/l-fold-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/left-join-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/limit-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/map-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/sort-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/stored-table.ts
✅ Files skipped from review due to trivial changes (2)
  • apps/backend/src/lib/bulldozer/db/tables/l-fold-table.ts
  • apps/backend/src/lib/bulldozer/db/index.perf.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/backend/src/lib/bulldozer/db/tables/group-by-table.ts

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (1)
apps/backend/src/lib/bulldozer/db/tables/group-by-table.ts (1)

15-22: ⚠️ Potential issue | 🟠 Major

Decouple the input-table generics from the derived group key.

groupBy only reads rowIdentifier and rowData, but this signature forces the source table to already use the derived GK type and erases its sort key with any. That makes the exported API narrower than the runtime behavior and hides upstream sort-key mistakes.

🔧 Suggested type-only fix
 export function declareGroupByTable<
-  GK extends Json,
-  RD extends RowData,
+  SourceGK extends Json,
+  SourceSK extends Json,
+  OutputGK extends Json,
+  RD extends RowData,
 >(options: {
   tableId: TableId,
-  fromTable: Table<GK, any, RD>,
-  groupBy: SqlMapper<{ rowIdentifier: RowIdentifier, rowData: RD }, { groupKey: GK }>,
-}): Table<GK, null, RD> {
+  fromTable: Table<SourceGK, SourceSK, RD>,
+  groupBy: SqlMapper<{ rowIdentifier: RowIdentifier, rowData: RD }, { groupKey: OutputGK }>,
+}): Table<OutputGK, null, RD> {

As per coding guidelines, "Try to avoid the any type. Whenever you need to use any, leave a comment explaining why you're using it and how you can be certain that errors would still be flagged."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/src/lib/bulldozer/db/tables/group-by-table.ts` around lines 15 -
22, The declareGroupByTable signature currently forces the source table key
generic to GK and uses any for the sort-key, narrowing the API; change the
generics to decouple the input table key types by adding separate generics for
the source partition and sort keys (e.g. FK extends Json, SK extends Json |
null) and use those on fromTable (fromTable: Table<FK, SK, RD>) while keeping
the returned Table<GK, null, RD> and the groupBy mapper type the same
(SqlMapper<{ rowIdentifier: RowIdentifier, rowData: RD }, { groupKey: GK }>);
this avoids using any for sort-key and lets the group key GK be derived only
from the mapper rather than forcing the source table to already use GK.
🧹 Nitpick comments (2)
apps/backend/src/lib/bulldozer/db/tables/left-join-table.ts (2)

307-308: Accept and explicitly ignore parameters for interface clarity.

While TypeScript allows functions with fewer parameters, explicitly accepting and ignoring them makes the intent clearer and matches the interface signature.

     compareGroupKeys: options.leftTable.compareGroupKeys,
-    compareSortKeys: () => sqlExpression`0`,
+    compareSortKeys: (_a, _b) => sqlExpression`0`,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/src/lib/bulldozer/db/tables/left-join-table.ts` around lines 307
- 308, The compareSortKeys callback currently has no parameters; update it to
accept and explicitly ignore the interface's parameters (e.g., (_leftSortKeys,
_rightSortKeys) => sqlExpression`0`) so the signature matches the expected
interface and intent in left-join-table.ts (referencing compareSortKeys and
compareGroupKeys).

21-22: Document why any is used for input table sort keys.

Per coding guidelines, any types should include a comment explaining the rationale. Here, any is appropriate since the left-join operation doesn't depend on input sort keys, but this reasoning should be documented.

-  leftTable: Table<GK, any, OldRD>,
-  rightTable: Table<GK, any, NewRD>,
+  // Sort key type is `any` because left-join ignores input sort keys; output uses null sort key
+  leftTable: Table<GK, any, OldRD>,
+  rightTable: Table<GK, any, NewRD>,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/src/lib/bulldozer/db/tables/left-join-table.ts` around lines 21
- 22, Add a brief inline comment explaining why the sort-key type parameter is
set to any for both leftTable and rightTable: state that the LeftJoinTable (or
left-join operation implemented in this module) does not depend on or inspect
the input tables' sort keys, so the sort-key generic is intentionally
unconstrained and set to any to avoid coupling; add this comment adjacent to the
Table<GK, any, OldRD> and Table<GK, any, NewRD> type annotations (referencing
leftTable and rightTable) so future readers understand the rationale.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/src/lib/bulldozer/db/tables/group-by-table.ts`:
- Around line 42-75: Add a predicate to the SQL built in group-by-table.ts (the
sqlQuery that selects from ${fromChangesTable} and produces
mappedChangesTableName) to filter out no-op source changes: extend the existing
WHERE ${isInitializedExpression} to also exclude rows where both old and new row
objects exist and the derived groupKey and row payload are identical (e.g. AND
NOT (("changes"."oldRowData" IS NOT NULL AND
jsonb_typeof("changes"."oldRowData") = 'object') AND ("changes"."newRowData" IS
NOT NULL AND jsonb_typeof("changes"."newRowData") = 'object') AND
"oldGroup"."groupKey" = "newGroup"."groupKey" AND "changes"."oldRowData" =
"changes"."newRowData")). This uses the same symbols from the diff
(fromChangesTable, options.groupBy, oldGroup/newGroup, oldRowData/newRowData,
isInitializedExpression) so upstream reorder/no-op events are skipped before
storage writes.

In `@apps/backend/src/lib/bulldozer/db/tables/left-join-table.ts`:
- Around line 407-433: The listGroups query is missing an ORDER BY, causing
nondeterministic group ordering; modify the SQL template in the listGroups
function to append an ORDER BY that uses the same key expression and comparator
used elsewhere—e.g., ORDER BY
options.leftTable.compareGroupKeys(sqlExpression`"groupPath"."keyPath"[cardinality("groupPath"."keyPath")]`,
/*right*/) ASC (or wrap the comparator call to produce a sortable value) so rows
are consistently ordered by the computed groupKey; place this ORDER BY just
before the closing backtick of the sqlQuery in listGroups and ensure it
references the same groupKey expression (AS groupKey) and compareGroupKeys
helper for deterministic ordering.
- Around line 275-276: The CASE expressions converting SQL NULLs to JSON
'null'::jsonb should be removed so NULL semantics are preserved; replace the
CASE WHEN "oldRows"."rowData" ... and CASE WHEN "newRows"."rowData" ... with
direct projections of "oldRows"."rowData" AS "oldRowData" and
"newRows"."rowData" AS "newRowData" (matching behavior in
sort-table/limit-table/stored-table) so downstream triggers can distinguish SQL
NULL (row absent) from an actual JSON null value.

---

Duplicate comments:
In `@apps/backend/src/lib/bulldozer/db/tables/group-by-table.ts`:
- Around line 15-22: The declareGroupByTable signature currently forces the
source table key generic to GK and uses any for the sort-key, narrowing the API;
change the generics to decouple the input table key types by adding separate
generics for the source partition and sort keys (e.g. FK extends Json, SK
extends Json | null) and use those on fromTable (fromTable: Table<FK, SK, RD>)
while keeping the returned Table<GK, null, RD> and the groupBy mapper type the
same (SqlMapper<{ rowIdentifier: RowIdentifier, rowData: RD }, { groupKey: GK
}>); this avoids using any for sort-key and lets the group key GK be derived
only from the mapper rather than forcing the source table to already use GK.

---

Nitpick comments:
In `@apps/backend/src/lib/bulldozer/db/tables/left-join-table.ts`:
- Around line 307-308: The compareSortKeys callback currently has no parameters;
update it to accept and explicitly ignore the interface's parameters (e.g.,
(_leftSortKeys, _rightSortKeys) => sqlExpression`0`) so the signature matches
the expected interface and intent in left-join-table.ts (referencing
compareSortKeys and compareGroupKeys).
- Around line 21-22: Add a brief inline comment explaining why the sort-key type
parameter is set to any for both leftTable and rightTable: state that the
LeftJoinTable (or left-join operation implemented in this module) does not
depend on or inspect the input tables' sort keys, so the sort-key generic is
intentionally unconstrained and set to any to avoid coupling; add this comment
adjacent to the Table<GK, any, OldRD> and Table<GK, any, NewRD> type annotations
(referencing leftTable and rightTable) so future readers understand the
rationale.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 50cea181-25dc-4a4d-8268-7d63166fc335

📥 Commits

Reviewing files that changed from the base of the PR and between d3a2daa and b459240.

📒 Files selected for processing (7)
  • apps/backend/scripts/run-bulldozer-studio.ts
  • apps/backend/scripts/run-cron-jobs.ts
  • apps/backend/src/lib/bulldozer/db/example-schema.ts
  • apps/backend/src/lib/bulldozer/db/index.perf.test.ts
  • apps/backend/src/lib/bulldozer/db/index.test.ts
  • apps/backend/src/lib/bulldozer/db/tables/group-by-table.ts
  • apps/backend/src/lib/bulldozer/db/tables/left-join-table.ts
✅ Files skipped from review due to trivial changes (1)
  • apps/backend/scripts/run-cron-jobs.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • apps/backend/scripts/run-bulldozer-studio.ts
  • apps/backend/src/lib/bulldozer/db/index.perf.test.ts

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
apps/backend/src/lib/bulldozer/db/tables/concat-table.ts (1)

21-27: Please narrow or document the any escape hatch.

Table<GK, any, RD> on Lines 26 and 41 erases the input sort-key type for this operator. If the "start" / "end" sentinels force that, please isolate it behind a narrower helper type and document why it's still safe; otherwise sort-key regressions here become type-silent.

As per coding guidelines, "Try to avoid the any type. Whenever you need to use any, leave a comment explaining why you're using it and how you can be certain that errors would still be flagged."

Also applies to: 41-41

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/src/lib/bulldozer/db/tables/concat-table.ts` around lines 21 -
27, The current use of Table<GK, any, RD> in declareConcatTable erases the
tables' sort-key type and silently allows regressions; replace the broad any
with a focused helper type alias (e.g., ConcatInputTable<GK, SK, RD> or
SentinelizedSortKey<T>) that expresses "original SK or the sentinel-widened SK"
and use that alias wherever Table<GK, any, RD> appears (including the tables
parameter and return-site uses in declareConcatTable and related helpers); add a
short inline comment above the alias explaining why the sentinel forces a
widened sort-key, why this particular alias is safe, and when/where callers must
preserve the original SK, and update any type tests or assertions to reflect the
narrower alias so type errors remain visible.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/src/lib/bulldozer/db/tables/concat-table.ts`:
- Around line 42-50: The concat table currently uses firstTable.compareGroupKeys
as the public comparator but forwards group-range args to every input (see
getUnionedListGroupsSql and options.tables), which can break pagination if
comparators differ; update the concat-table construction to verify comparator
compatibility: iterate options.tables and assert each table.compareGroupKeys ===
firstTable.compareGroupKeys (or provide an explicit comparator parameter and use
that) and throw a clear error if any mismatch is found so the PR fails fast
instead of silently using the first comparator.
- Around line 24-27: The code stores and returns the mutable options.tables
reference which later mutations can break registered triggers; fix by making and
using a local shallow copy of options.tables (e.g., const tables =
[...options.tables]) at the start of the function, use that copied tables
variable for registering source triggers (the closures that currently read
options.tables) and for any returned metadata, and return or expose that copy
(or an immutable/frozen version) instead of the original options.tables so
future reorders/appends won't change the registered fan-out behavior.

---

Nitpick comments:
In `@apps/backend/src/lib/bulldozer/db/tables/concat-table.ts`:
- Around line 21-27: The current use of Table<GK, any, RD> in declareConcatTable
erases the tables' sort-key type and silently allows regressions; replace the
broad any with a focused helper type alias (e.g., ConcatInputTable<GK, SK, RD>
or SentinelizedSortKey<T>) that expresses "original SK or the sentinel-widened
SK" and use that alias wherever Table<GK, any, RD> appears (including the tables
parameter and return-site uses in declareConcatTable and related helpers); add a
short inline comment above the alias explaining why the sentinel forces a
widened sort-key, why this particular alias is safe, and when/where callers must
preserve the original SK, and update any type tests or assertions to reflect the
narrower alias so type errors remain visible.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d0ee6b43-c180-4bba-adf8-bc3dc6666486

📥 Commits

Reviewing files that changed from the base of the PR and between b459240 and c01d931.

📒 Files selected for processing (3)
  • apps/backend/src/lib/bulldozer/db/index.perf.test.ts
  • apps/backend/src/lib/bulldozer/db/index.test.ts
  • apps/backend/src/lib/bulldozer/db/tables/concat-table.ts
✅ Files skipped from review due to trivial changes (1)
  • apps/backend/src/lib/bulldozer/db/index.perf.test.ts

@vercel

vercel bot commented Mar 31, 2026

Deployment failed with the following error:

Invalid request: `attribution.gitUser` should NOT have additional property `isBot`.

chunks.push(chunkBuffer);
} else if (typeof chunk === "string") {
chunks.push(chunkBuffer);
}

Chunk data silently dropped for non-Buffer stream chunks

Low Severity

In readRequestBody, the chunkBuffer is always created (line 474) and its size always counted in totalBytes, but lines 482–486 only push chunkBuffer into chunks when the original chunk is a Buffer or string. Both branches do the identical thing (chunks.push(chunkBuffer)), yet a chunk that is e.g. a plain Uint8Array would be counted toward the size limit but never collected — silently producing a truncated request body. The conditional can just be an unconditional chunks.push(chunkBuffer).

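The fix Bugbot suggests is to normalize every chunk once and push it unconditionally, so every byte counted toward the limit is also collected. A self-contained sketch (readRequestBody's signature and the 1 MiB limit are stand-ins for the studio server's actual code):

```typescript
// Sketch: accumulate a request body from a stream of mixed chunk types.
// Every chunk is converted to a Buffer exactly once, counted, and pushed,
// so no chunk type is counted but silently dropped.
async function readRequestBody(
  stream: Iterable<Buffer | string | Uint8Array> | AsyncIterable<Buffer | string | Uint8Array>,
  maxBytes: number = 1024 * 1024,
): Promise<Buffer> {
  const chunks: Buffer[] = [];
  let totalBytes = 0;
  for await (const chunk of stream) {
    const chunkBuffer = typeof chunk === "string" ? Buffer.from(chunk, "utf8") : Buffer.from(chunk);
    totalBytes += chunkBuffer.length;
    if (totalBytes > maxBytes) throw new Error("Request body too large");
    chunks.push(chunkBuffer); // unconditional push: Buffer, string, and Uint8Array all land here
  }
  return Buffer.concat(chunks);
}
```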

@nams1570 nams1570 self-requested a review April 6, 2026 16:53
Collaborator

@nams1570 nams1570 left a comment


Partial review: some bugs in the ConcatTable implementation

(gen_random_uuid(), ${getStorageEnginePath(options.tableId, [])}::jsonb[], 'null'::jsonb),
(gen_random_uuid(), ${getStorageEnginePath(options.tableId, ["metadata"])}::jsonb[], '{ "version": 1 }'::jsonb)
`],
delete: () => [sqlStatement`
Collaborator

Potential bugs: I see the recursive CTE here deletes the Bulldozer engine objects, but the triggers aren't deleted, right?

Suppose I create a ConcatTable c_table_a, and two downstream tables p and q have it as an input table. That means p and q would call c_table_a's registerRowChangeTriggers, right?
Now, let's delete c_table_a. Would anything break on p and q?
And if I create a new ConcatTable c_table_a (reusing the same name), would p and q now listen for changes to it?

From my limited testing, I see that the above case will happen.

Also, say c_table_a had sourceA as an input table, so sourceA -> c_table_a -> p. Naturally, changes to sourceA would affect p via triggers. If you delete c_table_a, and then create another c_table_a that DOESN'T take sourceA as an input, changes to sourceA would still affect the v2 of c_table_a AND p.

Collaborator

The same bug applies across all tables.

Comment on lines +33 to +43
const referenceCompareSortKeysSql = firstTable.compareSortKeys(sqlExpression`$1`, sqlExpression`$2`).sql;
for (const table of tables) {
const compareGroupKeysSql = table.compareGroupKeys(sqlExpression`$1`, sqlExpression`$2`).sql;
const compareSortKeysSql = table.compareSortKeys(sqlExpression`$1`, sqlExpression`$2`).sql;
if (compareGroupKeysSql !== referenceCompareGroupKeysSql || compareSortKeysSql !== referenceCompareSortKeysSql) {
throw new StackAssertionError("declareConcatTable requires comparator-compatible input tables", {
tableId: options.tableId,
tableDebugId: tableIdToDebugString(table.tableId),
});
}
}
Collaborator

Discussion: isn't the compareSortKeys check here a little too strict? As I see it, later in the file you set the rows' sortKeys to null anyway, and differing sorting upstream isn't going to affect a Union's output, right?

RD extends RowData,
>(options: {
tableId: TableId,
tables: Table<GK, any, RD>[],
Collaborator

Potential bug: if the same table is entered twice here, wouldn't there be duplicated rows?

Also, the same change event would be subscribed to twice, right? You'd have two callbacks run for the same change to the same table.

Comment on lines +69 to +79
"sourceRows"."groupkey" AS "groupKey",
${createConcatenatedRowIdentifierSql(tableIndex, `"sourceRows"."rowidentifier"`)} AS "rowIdentifier",
'null'::jsonb AS "rowSortKey",
"sourceRows"."rowdata" AS "rowData"
FROM (${table.listRowsInGroup({
start: "start",
end: "end",
startInclusive: true,
endInclusive: true,
}).sql}) AS "sourceRows"
WHERE ${getInputInitializedSql(table)}
Collaborator

Potential bug: here, we read sourceRows from each inputTable.listRowsInGroup. But the listRowsInGroup defined in index.ts doesn't require it to return the groupKey. The type looks like this:

listRowsInGroup(options: { groupKey?: SqlExpression<GK>, start: SqlExpression<SK> | "start", end: SqlExpression<SK> | "end", startInclusive: boolean, endInclusive: boolean }): SqlQuery<Iterable<{ rowIdentifier: RowIdentifier, rowSortKey: SK, rowData: RD }>>,

For example, the StoredTable implementation of listRowsInGroup doesn't return groupKey.

model BulldozerStorageEngine {
id String @id @default(uuid()) @db.Uuid
keyPath Json[]
keyPathParent Unsupported("jsonb[]")
Collaborator

nit: if this is nullable, shouldn't it be `Unsupported("jsonb[]")?` (with the trailing `?`)?

)
DELETE FROM "BulldozerStorageEngine"
WHERE "keyPath" IN (SELECT "path" FROM "pathsToDelete")
`],

Filter and map table delete omits trigger deregistration

Medium Severity

The delete() methods in declareFilterTable and declareMapTable only return SQL statements to delete storage rows but never call deregisterFromTableTrigger() on the nested flat-map table. Every other derived table (concat, sort, groupBy, limit, leftJoin, and flatMap itself) explicitly deregisters its input trigger in delete(). This leaves stale trigger registrations on the input table that fire on every subsequent change, wastefully evaluating the isInitialized SQL check each time even after the table is deleted.

Additional Locations (1)

Reviewed by Cursor Bugbot for commit 4b400e2. Configure here.

tableId: TableId,
tables: Table<GK, any, RD>[],
}): Table<GK, null, RD> {
const tables = [...options.tables];

Duplicate input tables cause double trigger processing

Medium Severity

declareConcatTable accepts the tables array without checking for duplicate table object references. If the same table is passed twice, ensureInputTriggerRegistrations registers two separate triggers on that single input table. Every change to the duplicated source fires both callbacks, each generating a full set of SQL change-propagation statements for downstream tables. This doubles processing cost and, while the row identifiers differ by tableIndex prefix, the semantics may surprise callers expecting idempotent union behavior.


Reviewed by Cursor Bugbot for commit 4b400e2. Configure here.
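The duplicate-input issue can be caught fail-fast before any triggers are registered. A sketch against a minimal Table shape (the helper name and the string tableId are stand-ins; the real Table uses a structured TableId):

```typescript
// Minimal stand-in for the parts of Table this check needs.
type MinimalTable = { tableId: string };

// Sketch: reject duplicate input tables up front so a source change never
// fires two registered triggers for the same concat table.
function assertNoDuplicateInputTables(tables: readonly MinimalTable[]): void {
  const seen = new Set<string>();
  for (const table of tables) {
    if (seen.has(table.tableId)) {
      throw new Error(`declareConcatTable: duplicate input table ${table.tableId}`);
    }
    seen.add(table.tableId);
  }
}
```

An alternative design would deduplicate silently, but throwing matches the repo's stated "fail early, fail loud" guideline and avoids surprising union semantics.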

`);
});
sendJson(response, 200, { ok: true });
return;

Raw delete only removes single node, not subtree

Low Severity

The /api/raw/delete endpoint's reserved-path check only blocks deleting [] and ["table"], but allows deleting any deeper path like ["table", "external:some-table"]. Because the FK has ON DELETE CASCADE, this silently destroys an entire table's storage subtree without going through the table's delete() method, leaving stale JavaScript-side trigger registrations that will generate broken SQL on subsequent upstream changes.


Reviewed by Cursor Bugbot for commit 4b400e2. Configure here.


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 5 total unresolved issues (including 4 from previous reviews).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 5b69a18. Configure here.

if (current === "\"") inDoubleQuote = false;
index++;
continue;
}

splitSqlStatements mishandles escaped double-quote identifiers

Low Severity

The splitSqlStatements parser treats every " as toggling inDoubleQuote mode, but PostgreSQL allows "" inside double-quoted identifiers to represent a literal quote character. A SQL identifier like "column""name" would be incorrectly parsed as two separate quoted regions, causing any semicolons following the premature quote-exit to be treated as statement terminators. No currently generated SQL triggers this, but it breaks the parser contract if future SQL uses escaped identifiers.


Reviewed by Cursor Bugbot for commit 5b69a18. Configure here.
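The doubled-quote rule Bugbot describes can be handled by treating "" (inside a double-quoted identifier) and '' (inside a string literal) as escapes rather than quote toggles. A minimal splitter sketch; it deliberately ignores dollar-quoting and comments, which the real splitSqlStatements would also need to handle:

```typescript
// Sketch: split SQL on top-level semicolons while respecting quoted
// regions, including PostgreSQL's doubled-quote escapes ("" and '').
function splitSqlStatements(sql: string): string[] {
  const statements: string[] = [];
  let current = "";
  let quote: '"' | "'" | null = null;
  for (let index = 0; index < sql.length; index++) {
    const ch = sql[index];
    if (quote !== null) {
      current += ch;
      if (ch === quote) {
        if (sql[index + 1] === quote) {
          current += quote; // doubled quote is an escape: stay in quoted mode
          index++;
        } else {
          quote = null; // genuine closing quote
        }
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      current += ch;
    } else if (ch === ";") {
      if (current.trim() !== "") statements.push(current.trim());
      current = "";
    } else {
      current += ch;
    }
  }
  if (current.trim() !== "") statements.push(current.trim());
  return statements;
}
```

With this lookahead, an identifier like "column""name" stays inside one quoted region, so a semicolon after it is still treated as part of the quoted text rather than a statement terminator.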

