Populate ClickHouse analytics tables when seeding preview projects#1471

Merged
BilalG1 merged 6 commits into dev from fix/preview-mode-animation
May 22, 2026

@BilalG1 BilalG1 commented May 22, 2026

Summary

In preview-mode deployments (NEXT_PUBLIC_STACK_IS_PREVIEW=true) the project overview dashboard reported 0 total users, 0 monthly active users, and no live users on the globe. The internal metrics endpoint reads user/team totals from the ClickHouse analytics_internal.* tables and "live users" from recent $token-refresh events — but those tables are normally filled by the external-db-sync pipeline, which does not run in preview deployments, so they were empty.

This makes the preview/demo dummy-data seeder populate ClickHouse directly:

  • seedDummyAnalyticsMirrorTables — mirrors the seeded users / teams / contact channels into analytics_internal.users / teams / contact_channels so the metrics endpoint reports real totals.
  • seedDummyLiveTokenRefreshEvents — emits recent $token-refresh events across distinct countries so the overview globe shows live users.
  • Timestamp clamping: bulkRandomTimestampOnDay and the page-view/click timestamps are clamped so seeded events are never dated in the future (future-dated events permanently matched the unbounded "live users" query).
  • buildTokenRefreshClickhouseRow — shared helper for the $token-refresh ClickHouse row shape.
  • create-project — pre-warms the ClickHouse connection so the seeding inserts don't pay the cold-start cost.
  • projects-metrics — types the ClickHouse .json() results (fixes a tsc error).

Also bundles a seeding performance optimization that skips redundant idempotency lookups when seeding a brand-new project.
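
As a rough illustration of the timestamp clamping listed above, here is a minimal sketch. The function name `randomTimestampOnDayClamped` and its injectable `random` parameter are hypothetical and are not taken from the PR; the actual `bulkRandomTimestampOnDay` may differ.

```typescript
// Hypothetical helper (not the PR's code): picks a random instant on `day` (UTC)
// and never returns a timestamp later than `now`.
function randomTimestampOnDayClamped(
  day: Date,
  now: Date,
  random: () => number = Math.random,
): Date {
  const msPerDay = 24 * 60 * 60 * 1000;
  const dayStart = Date.UTC(day.getUTCFullYear(), day.getUTCMonth(), day.getUTCDate());
  const candidate = dayStart + Math.floor(random() * msPerDay);
  // For "today" the random offset can land after the current moment, so clamp it.
  return new Date(Math.min(candidate, now.getTime()));
}

const seedNow = new Date("2026-05-22T10:00:00.000Z");
// An offset of 0.99 of a day would be about 23:45 UTC, which is later than `seedNow`.
const lateToday = randomTimestampOnDayClamped(seedNow, seedNow, () => 0.99);
// A past day is unaffected by the clamp.
const yesterday = randomTimestampOnDayClamped(new Date("2026-05-21T00:00:00.000Z"), seedNow, () => 0.5);
console.log(lateToday.toISOString(), yesterday.toISOString());
```

Only timestamps on the current day can be affected, since any instant on an earlier day is already before `now`.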

Notes:

  • Seeded mirror rows use sync_sequence_id = 0 so that if the external-db-sync pipeline ever does run for the project, any real update supersedes the seeded placeholder under ReplacingMergeTree + FINAL.
  • "Live users" naturally decays out of the ~2-minute window a couple of minutes after project creation; preview creates a fresh project per visit, so the initial overview always shows them.
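
The sync_sequence_id note can be pictured with a small in-memory model. This is only a sketch: it assumes sync_sequence_id acts as the ReplacingMergeTree version column, and `MirrorRow` and `readFinal` are hypothetical names that stand in for the table and a FINAL read.

```typescript
// Hypothetical row shape for a mirrored table; the real columns are not shown here.
type MirrorRow = { id: string; displayName: string; syncSequenceId: number };

// In-memory stand-in for reading a ReplacingMergeTree table with FINAL:
// per key, keep the row with the highest version; on a tie, the later insert wins.
function readFinal(rows: MirrorRow[]): MirrorRow[] {
  const latest = new Map<string, MirrorRow>();
  for (const row of rows) {
    const current = latest.get(row.id);
    if (current === undefined || row.syncSequenceId >= current.syncSequenceId) {
      latest.set(row.id, row);
    }
  }
  return Array.from(latest.values());
}

const mirrorRows: MirrorRow[] = [
  { id: "user-1", displayName: "Seeded placeholder", syncSequenceId: 0 },
  { id: "user-1", displayName: "Synced by pipeline", syncSequenceId: 42 },
  { id: "user-2", displayName: "Seeded only", syncSequenceId: 0 },
];
const finalRows = readFinal(mirrorRows);
console.log(finalRows);
```

Because the seeded rows carry the lowest possible version, any row written later by the sync pipeline wins for the same key, while keys that were only seeded keep their placeholder.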

Test plan

  • pnpm --filter @stackframe/backend typecheck passes
  • pnpm --filter @stackframe/backend lint passes
  • Created fresh preview projects; overview shows non-zero Total Users / Monthly Active Users
  • analytics_internal.users / teams / contact_channels populated for the seeded project
  • Globe shows 8 live users across 8 distinct countries (verified via the metrics 2-minute query)
  • No future-dated $token-refresh events in analytics_internal.events
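
To make the last two checks concrete, here is a sketch of why a future-dated event never leaves a live-users window that has only a lower bound. `TokenRefreshEvent`, `countLiveUsers`, and the exact window length are illustrative assumptions, not the metrics endpoint's actual query.

```typescript
type TokenRefreshEvent = { userId: string; eventAt: Date };

const LIVE_WINDOW_MS = 2 * 60 * 1000; // the ~2-minute window, assumed exact here

// Counts distinct users with an event newer than `now - window`. There is no
// upper bound, so a future-dated event passes the filter on every call.
function countLiveUsers(events: TokenRefreshEvent[], now: Date): number {
  const cutoff = now.getTime() - LIVE_WINDOW_MS;
  const live = new Set<string>();
  for (const e of events) {
    if (e.eventAt.getTime() >= cutoff) live.add(e.userId);
  }
  return live.size;
}

const seededAt = new Date("2026-05-22T10:00:00.000Z");
const sampleEvents: TokenRefreshEvent[] = [
  { userId: "user-a", eventAt: seededAt },
  { userId: "user-b", eventAt: new Date("2026-05-23T10:00:00.000Z") }, // future-dated
];
const tenMinutesLater = new Date(seededAt.getTime() + 10 * 60 * 1000);
const liveAtSeed = countLiveUsers(sampleEvents, seededAt); // both users are inside the window
const liveLater = countLiveUsers(sampleEvents, tenMinutesLater); // only the future-dated event still matches
console.log(liveAtSeed, liveLater);
```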

Summary by CodeRabbit

  • Refactor
    • Faster preview project creation by pre-warming the analytics database and reusing the warmed connection.
    • Reduced initialization delays and redundant checks when seeding brand-new projects; creation paths now skip needless probes.
    • More efficient, parallelized seeding of teams/users/events with deterministic handling of token-refresh and session-replay data.
    • Safer timestamp generation to avoid future-dated events and deferred background processing for long-running tasks like payments.

Preview-mode deployments don't run the external-db-sync pipeline, so the
ClickHouse analytics_internal.* tables stayed empty and the project
overview dashboard reported 0 total users, 0 monthly active users, and no
live users on the globe.

- seedDummyAnalyticsMirrorTables: mirror seeded users / teams / contact
  channels into analytics_internal.users/teams/contact_channels so the
  metrics endpoint reports real totals.
- seedDummyLiveTokenRefreshEvents: emit recent $token-refresh events
  across distinct countries so the overview globe shows live users.
- bulkRandomTimestampOnDay and the page-view/click timestamps: clamp so
  seeded events are never dated in the future.
- buildTokenRefreshClickhouseRow: shared helper for the $token-refresh
  ClickHouse row shape.
- create-project: pre-warm the ClickHouse connection so the seeding
  inserts don't pay the cold-start cost.
- projects-metrics: type the ClickHouse .json() results.

Also includes a seeding performance optimization that skips redundant
idempotency lookups when seeding a brand-new project.

vercel Bot commented May 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| stack-auth-hosted-components | Ready | Preview, Comment | May 22, 2026 10:19pm |
| stack-auth-mcp | Ready | Preview, Comment | May 22, 2026 10:19pm |
| stack-auth-skills | Ready | Preview, Comment | May 22, 2026 10:19pm |
| stack-backend | Ready | Preview, Comment | May 22, 2026 10:19pm |
| stack-dashboard | Ready | Preview, Comment | May 22, 2026 10:19pm |
| stack-demo | Ready | Preview, Comment | May 22, 2026 10:19pm |
| stack-docs | Ready | Preview, Comment | May 22, 2026 10:19pm |
| stack-preview-backend | Ready | Preview, Comment | May 22, 2026 10:19pm |
| stack-preview-dashboard | Ready | Preview, Comment | May 22, 2026 10:19pm |


coderabbitai Bot commented May 22, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f88625af-92b2-4f2a-b551-fd07116c3d08

📥 Commits

Reviewing files that changed from the base of the PR and between 550bcd3 and dc30afc.

📒 Files selected for processing (1)
  • apps/backend/src/app/api/latest/internal/preview/create-project/route.tsx

📝 Walkthrough

Adds a freshProject flag across seeders, reuses and pre-warms a ClickHouse admin client for preview seeding, batches team creation, centralizes token-refresh ClickHouse row construction, clamps generated timestamps to avoid future events, and reorders orchestration to defer payments.
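
The freshProject fast path and the batched team creation might be sketched as follows. `FakeTeamStore` and `seedTeams` are hypothetical stand-ins for the Postgres access and for seedDummyTeams; this is not the PR's implementation.

```typescript
type SeedTeam = { displayName: string };
type StoredTeam = { id: string; displayName: string };

// Hypothetical in-memory store standing in for the Postgres team table.
class FakeTeamStore {
  private teams: StoredTeam[] = [];
  public lookups = 0;

  async findMany(names: string[]): Promise<StoredTeam[]> {
    this.lookups += 1;
    const wanted = new Set(names);
    return this.teams.filter((t) => wanted.has(t.displayName));
  }

  async create(team: SeedTeam): Promise<StoredTeam> {
    const stored: StoredTeam = { id: `team-${this.teams.length + 1}`, displayName: team.displayName };
    this.teams.push(stored);
    return stored;
  }
}

async function seedTeams(
  store: FakeTeamStore,
  teams: SeedTeam[],
  freshProject: boolean,
): Promise<Map<string, string>> {
  // A brand-new project cannot contain teams yet, so the existence lookup is skipped.
  const existing: StoredTeam[] = freshProject
    ? []
    : await store.findMany(teams.map((t) => t.displayName));
  const nameToId = new Map<string, string>(
    existing.map((t): [string, string] => [t.displayName, t.id]),
  );
  const toCreate = teams.filter((t) => !nameToId.has(t.displayName));
  // Create all missing teams concurrently instead of one at a time.
  const created = await Promise.all(toCreate.map((t) => store.create(t)));
  for (const t of created) nameToId.set(t.displayName, t.id);
  return nameToId;
}
```

On a reseed (`freshProject === false`) the single `findMany` replaces one existence check per team, and only the teams that are missing get created.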

Changes

Preview seeding performance and idempotency optimization

| Layer / File(s) | Summary |
| --- | --- |
| Type contracts and ClickHouse typing for freshProject<br>apps/backend/src/lib/clickhouse.tsx, apps/backend/src/lib/seed-dummy-data.ts (imports, types) | Extends seeding option types with freshProject and optional clickhouseClient; re-exports ClickHouseClient type; updates imports for preview utilities and ClickHouse admin access. |
| Teams and users seeding optimization<br>apps/backend/src/lib/seed-dummy-data.ts (seedDummyTeams, seedDummyUsers) | seedDummyTeams replaces per-team existence checks with a single findMany (skipped when freshProject), builds teamsToCreate and creates missing teams concurrently. seedDummyUsers accepts freshProject and skips contact-channel and team-membership probes when true. |
| Event timestamp clamping and ClickHouse row helper<br>apps/backend/src/lib/seed-dummy-data.ts (timestamp generation, token-refresh rows, session activity) | Adds buildTokenRefreshClickhouseRow helper for $token-refresh ClickHouse rows; shifts/clamps timestamps to avoid future-dated events; refactors session activity seeding to use the helper and conditionally delete seeded rows only when !freshProject. |
| Bulk signup/activity seeding with shared client<br>apps/backend/src/lib/seed-dummy-data.ts (seedBulkSignupsAndActivity) | Updates seedBulkSignupsAndActivity to accept freshProject and a shared clickhouseClient; skips contact-channel probes on fresh projects and clamps bulk timestamps relative to now. |
| seedDummyProject orchestration and concurrency reorder<br>apps/backend/src/lib/seed-dummy-data.ts (main seedDummyProject flow) | Computes freshProject, creates/reuses a single ClickHouse client, conditionally deletes ClickHouse events only when reseeding, passes freshProject and client through downstream seeders, seeds analytics token-refresh tables after parallel work, and defers payments to the end (background in preview mode). |
| Preview endpoint ClickHouse pre-warm<br>apps/backend/src/app/api/latest/internal/preview/create-project/route.tsx | Creates admin ClickHouse client, starts an unawaited SELECT 1 warm-up promise (ignores failures), passes client into seedDummyProject, and awaits the warm-up promise after seeding completes. |
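
The pre-warm flow summarized in the walkthrough could look roughly like this. `WarmableClient` is a hypothetical minimal interface and `createPreviewProject` is an illustrative function; the route's real code and the ClickHouse client's full API are not reproduced here.

```typescript
// Hypothetical minimal client interface; the real ClickHouse client has a larger API.
interface WarmableClient {
  command(args: { query: string }): Promise<unknown>;
}

async function createPreviewProject(
  client: WarmableClient,
  seed: (client: WarmableClient) => Promise<string>,
): Promise<string> {
  // Start the warm-up without awaiting it so it overlaps with the seeding work.
  // Both outcomes map to undefined: the warm-up is only an optimization, so a
  // failure here is deliberately ignored.
  const warmup = client
    .command({ query: "SELECT 1" })
    .then(() => undefined, () => undefined);

  // The same client instance is handed to the seeders.
  const projectId = await seed(client);

  // Make sure the warm-up promise has settled before returning.
  await warmup;
  return projectId;
}
```

Attaching the rejection handler immediately also avoids an unhandled rejection if the warm-up fails while seeding is still running.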

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • hexclave/stack-auth#1437: Modifies the same dummy seeding flow and ClickHouse-backed seed steps, closely related to these changes.

Suggested reviewers

  • N2D4

Poem

🐰 I warmed ClickHouse while seeds took flight,
Fresh projects skip probes and everything’s light,
Timestamps stay honest, no future in sight,
Teams spin up quick, replays keep right,
Payments wait calmly until preview’s night.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 53.85% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: populating ClickHouse analytics tables during preview project seeding. |
| Description check | ✅ Passed | The description is comprehensive, covering the problem statement, implementation details, design rationale, and test plan verification. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Comment thread apps/backend/src/lib/seed-dummy-data.ts

greptile-apps Bot commented May 22, 2026

Greptile Summary

This PR fixes the preview-mode dashboard showing zero users/activity by directly seeding ClickHouse analytics_internal.* tables during project creation, bypassing the external-db-sync pipeline that preview deployments don't run.

  • seedDummyAnalyticsMirrorTables mirrors the Postgres-seeded users, teams, and contact channels into ClickHouse with sync_sequence_id = 0 so the metrics endpoint reports real totals; real pipeline rows with higher sequence IDs will supersede these under ReplacingMergeTree FINAL.
  • seedDummyLiveTokenRefreshEvents inserts 8 $token-refresh events timestamped at ~now across 8 distinct countries, populating the overview globe's live-user display (events decay naturally out of the ~2-minute window).
  • Timestamp clamping (bulkRandomTimestampOnDay, pvTime, clickTime) prevents future-dated events that would permanently inflate the live count; the ClickHouse pre-warm in create-project overlaps the service wake-up with Postgres seeding to reduce perceived latency.

Confidence Score: 4/5

Safe to merge; the new seeding functions are preview/demo-only and don't touch production user data paths.

The implementation is well-structured: mirror table inserts use sync_sequence_id = 0 so real pipeline rows always win under FINAL, timestamp clamping prevents the future-dated-event bug it was designed to fix, and the fresh-project fast path correctly skips idempotency overhead. Two minor quality concerns: the ClickHouse pre-warm creates a throwaway client (the service wake-up benefit is real, but per-connection TLS is not amortized), and the pvTime/clickTime clamp to now.getTime() clusters multiple demo events at the exact same millisecond.

seed-dummy-data.ts — the two new seeding functions and the timestamp clamping logic are the most behaviorally dense additions and worth a careful read.

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/backend/src/app/api/latest/internal/preview/create-project/route.tsx | Adds a ClickHouse Cloud pre-warm step before seedDummyProject; warmup wakes the service correctly but the client used for warmup is discarded (not a singleton), so per-connection TLS savings are not realized. |
| apps/backend/src/app/api/latest/internal/projects-metrics/route.tsx | Adds TypeScript generics to ClickHouse .json() calls, fixing a tsc error; the aliased column names in the SQL (project_id AS projectId, etc.) correctly match the supplied generic types. |
| apps/backend/src/lib/seed-dummy-data.ts | Adds two new ClickHouse seeders (mirror tables and live token-refresh events), a shared row-builder helper, timestamp future-clamping, and a fresh-project fast path that skips idempotency DB queries; one P2: pvTime/clickTime clamping causes event timestamp clustering at now. |

Sequence Diagram

sequenceDiagram
    participant Route as create-project/route.tsx
    participant Seed as seedDummyProject
    participant PG as Postgres
    participant CH as ClickHouse

    Route->>CH: SELECT 1 (pre-warm, unawaited)
    Route->>Seed: seedDummyProject()
    Seed->>PG: seedDummyTeams / seedDummyUsers
    Seed->>PG: seedBulkSignupsAndActivity (async)
    Seed->>PG: seedDummyEmails / SessionActivity / SessionReplays
    Note over CH: service already awake
    Seed->>CH: bulk $token-refresh / $page-view / $click events
    Seed->>PG: await bulkSignupsPromise
    par mirror tables
        Seed->>PG: findMany users/teams/contacts
        Seed->>CH: INSERT analytics_internal.users/teams/contact_channels
    and live events
        Seed->>PG: findMany projectUsers (non-anon, take 8)
        Seed->>CH: INSERT 8 $token-refresh events at ~now
    end
    Seed->>Seed: seedDummyTransactions (fire-and-forget in preview)
    Seed-->>Route: projectId
    Route->>CH: await clickhouseWarmup (already resolved)
    Route-->>Route: return project_id
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
apps/backend/src/app/api/latest/internal/preview/create-project/route.tsx:44-46
**Warmup discards the warmed client**

`getClickhouseAdminClient()` calls `createClient(...)` on every invocation, so the client used for `SELECT 1` is immediately abandoned. Every subsequent call inside `seedDummyProject` (`seedDummyAnalyticsMirrorTables`, `seedDummyLiveTokenRefreshEvents`, etc.) creates its own fresh HTTP client and pays its own TLS negotiation. The warmup still triggers the ClickHouse Cloud service wake-up (which is indeed the dominant ~0.7 s cost), so the net effect is positive; but the per-client TLS round-trips are not amortized the way the comment implies. If the full TLS cost matters, consider caching a singleton client and passing it into `seedDummyProject`.

### Issue 2 of 2
apps/backend/src/lib/seed-dummy-data.ts:1932-1934
**Event timestamp clustering at `now`**

Both page-view and click offsets can push a same-day `visitTime` past the current moment, so the `Math.min` clamp collapses them to the exact same `now.getTime()` value. In practice this means dozens of `$page-view` and `$click` events share one millisecond, producing an unnatural spike at the seeding instant in the analytics time-series. Clamping to `now - 1` (or a small random jitter below `now`) would preserve the visual spread without ever producing future-dated events.

```suggestion
        // Clamp to `now - 1ms`: visitTime is already clamped, but adding the
        // offset can push a same-day event past `now` into the future. Using
        // `now - 1` keeps the timestamp strictly in the past so it doesn't
        // appear as a "current" spike in the analytics time-series.
        const pvTime = new Date(Math.min(visitTime.getTime() + pvOffset, now.getTime() - 1));
```
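
The "small random jitter below `now`" alternative mentioned in Issue 2 might be sketched like this. `clampWithJitter` and its parameters are hypothetical and are not part of the suggestion above.

```typescript
// Hypothetical variant of the clamp: instead of pinning every overflowing
// timestamp to the same instant, place it a random number of milliseconds
// (between 1 and maxJitterMs) before `now`.
function clampWithJitter(
  candidateMs: number,
  nowMs: number,
  maxJitterMs: number,
  random: () => number = Math.random,
): number {
  if (candidateMs < nowMs) return candidateMs; // already in the past: keep as-is
  const jitter = 1 + Math.floor(random() * maxJitterMs); // 1..maxJitterMs
  return nowMs - jitter;
}

const untouched = clampWithJitter(1_000, 2_000, 60_000);
const minJitter = clampWithJitter(5_000, 2_000, 1_000, () => 0);
const maxJitter = clampWithJitter(5_000, 2_000, 1_000, () => 0.999);
console.log(untouched, minJitter, maxJitter);
```

Every overflowing timestamp stays strictly before `now`, and the events spread over the jitter range rather than sharing one millisecond.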

Reviews (1): Last reviewed commit: "Populate ClickHouse analytics tables whe..." | Re-trigger Greptile

Comment thread apps/backend/src/app/api/latest/internal/preview/create-project/route.tsx Outdated
Comment thread apps/backend/src/lib/seed-dummy-data.ts

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/backend/src/app/api/latest/internal/preview/create-project/route.tsx`:
- Around line 39-46: The clickhouseWarmup promise currently swallows errors via
.then(() => undefined, () => undefined) so failures never surface to the later
await and seedDummyProject can't detect ClickHouse issues; change the warm-up to
propagate failures (e.g. remove the rejection handler so the rejected promise
bubbles) or explicitly log and rethrow the error from the rejection handler
returned by getClickhouseAdminClient().command({ query: "SELECT 1" }) so await
clickhouseWarmup will fail and upstream code (seedDummyProject) can handle the
error.

In `@apps/backend/src/lib/seed-dummy-data.ts`:
- Around line 2049-2053: freshProject is used to skip idempotency but ClickHouse
writes remain append-only, so rerunning seeds duplicates analytics; modify
seed-dummy-data.ts to make reseeds idempotent by deleting or replacing the
previously-seeded rows in analytics_internal.events when freshProject is false:
add a targeted DELETE (or invoke a ReplacingMergeTree/replace path) for the
seeded event signatures/tokens before inserting in
seedDummySessionActivityEvents and seedBulkSignupsAndActivity, using the same
identifying fields/tokens those functions generate so you only remove the prior
seed rows for that project rather than all events.
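
The delete-before-insert idea in the second comment can be modeled in memory. This is a sketch under assumptions: `FakeEventsTable`, `seedEvents`, and the `seeded` marker field are hypothetical, and the real seeders would identify their earlier rows by whatever fields they actually write.

```typescript
type SeedEventRow = { projectId: string; eventType: string; seeded: boolean };

// Hypothetical append-only table: inserting the same rows twice stores them twice.
class FakeEventsTable {
  rows: SeedEventRow[] = [];

  insert(rows: SeedEventRow[]): void {
    this.rows.push(...rows);
  }

  // Targeted delete: only this project's previously seeded rows are removed.
  deleteSeeded(projectId: string): void {
    this.rows = this.rows.filter((r) => !(r.projectId === projectId && r.seeded));
  }
}

function seedEvents(table: FakeEventsTable, projectId: string, freshProject: boolean): void {
  // A fresh project has nothing to clear; a reseed removes its earlier seed rows first.
  if (!freshProject) table.deleteSeeded(projectId);
  table.insert([
    { projectId, eventType: "$token-refresh", seeded: true },
    { projectId, eventType: "$page-view", seeded: true },
  ]);
}

const eventsTable = new FakeEventsTable();
eventsTable.insert([{ projectId: "other", eventType: "$click", seeded: false }]);
seedEvents(eventsTable, "demo", true); // first seed of a fresh project
seedEvents(eventsTable, "demo", false); // reseed replaces the earlier seed rows
console.log(eventsTable.rows.length);
```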

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 25e6e66f-9d5a-4737-83f1-08daebd95936

📥 Commits

Reviewing files that changed from the base of the PR and between 0c6e135 and d437255.

📒 Files selected for processing (3)
  • apps/backend/src/app/api/latest/internal/preview/create-project/route.tsx
  • apps/backend/src/app/api/latest/internal/projects-metrics/route.tsx
  • apps/backend/src/lib/seed-dummy-data.ts

Comment thread apps/backend/src/app/api/latest/internal/preview/create-project/route.tsx Outdated
Comment thread apps/backend/src/lib/seed-dummy-data.ts
- Reuse a single ClickHouse client across the preview create-project
  route's warm-up and every analytics seeder, so the connection/TLS
  handshake is established once instead of per seeder.
- Format $token-refresh event_at consistently in
  buildTokenRefreshClickhouseRow so the historical and live seeders
  write identical timestamp strings.
- Clear a project's previously-seeded analytics_internal.events rows
  before reseeding an existing project, so reseeds refresh analytics
  instead of duplicating them (the events table is append-only).
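
A shared row builder keeps the timestamp format in one place, which is the point of the second bullet. The sketch below is illustrative only: the row shape, the column names, and the `YYYY-MM-DD HH:MM:SS.mmm` UTC format are assumptions, since the real buildTokenRefreshClickhouseRow is not shown in this thread.

```typescript
// Hypothetical row shape; the real helper's columns are not shown in this thread.
type TokenRefreshRow = {
  project_id: string;
  user_id: string;
  event_type: "$token-refresh";
  event_at: string;
};

// Assumed format: "2026-05-22T10:19:00.123Z" becomes "2026-05-22 10:19:00.123" (UTC).
function formatEventAt(date: Date): string {
  return date.toISOString().replace("T", " ").replace("Z", "");
}

// Both the historical and the live seeder would call this one builder, so the
// two code paths cannot drift apart in how they serialize event_at.
function buildTokenRefreshRow(projectId: string, userId: string, eventAt: Date): TokenRefreshRow {
  return {
    project_id: projectId,
    user_id: userId,
    event_type: "$token-refresh",
    event_at: formatEventAt(eventAt),
  };
}

const sampleInstant = new Date("2026-05-22T10:19:00.123Z");
const historicalRow = buildTokenRefreshRow("demo", "user-1", sampleInstant);
const liveRow = buildTokenRefreshRow("demo", "user-2", sampleInstant);
console.log(historicalRow.event_at, liveRow.event_at);
```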

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
apps/backend/src/lib/seed-dummy-data.ts (1)

430-432: 💤 Low value

Consider defensive check per coding guidelines.

The non-null assertion is structurally safe since Promise.all preserves array length, but per coding guidelines, explicit checks are preferred.

🔧 Optional defensive fix
  teamsToCreate.forEach((team, index) => {
-   teamNameToId.set(team.displayName, createdTeams[index]!.id);
+   const created = createdTeams[index] ?? throwErr(`Team creation result missing at index ${index}`);
+   teamNameToId.set(team.displayName, created.id);
  });

As per coding guidelines: "Code defensively; prefer ?? throwErr(...) over non-null assertions with explicit error messages"
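
A self-contained sketch of the `?? throwErr(...)` pattern follows. The local `throwErr` is a stand-in for the helper the guideline names, and `mapTeamNamesToIds` is a hypothetical wrapper around the loop from the diff.

```typescript
// Local stand-in for the `throwErr` helper named in the guideline. Returning
// `never` lets it appear on the right-hand side of `??` without widening the type.
function throwErr(message: string): never {
  throw new Error(message);
}

function mapTeamNamesToIds(
  teamsToCreate: { displayName: string }[],
  createdTeams: ({ id: string } | undefined)[],
): Map<string, string> {
  const teamNameToId = new Map<string, string>();
  teamsToCreate.forEach((team, index) => {
    // Fails loudly with a specific message instead of relying on `!`.
    const created = createdTeams[index] ?? throwErr(`Team creation result missing at index ${index}`);
    teamNameToId.set(team.displayName, created.id);
  });
  return teamNameToId;
}

const mapped = mapTeamNamesToIds([{ displayName: "Alpha" }], [{ id: "team-1" }]);
console.log(mapped.get("Alpha"));
```

If the two arrays ever got out of step, the error message would name the failing index rather than surfacing later as a property access on undefined.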

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/backend/src/lib/seed-dummy-data.ts` around lines 430 - 432, The loop
using teamsToCreate.forEach sets teamNameToId with createdTeams[index]!.id using
a non-null assertion; replace that with a defensive nullish-coalescing check
that throws a clear error if createdTeams[index] is missing (e.g., use
createdTeams[index] ?? throwErr("…") before accessing .id), referring to
teamsToCreate, createdTeams, teamNameToId, displayName and id so the assignment
never uses the ! operator and fails loudly with an explicit message when the
created team is absent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@apps/backend/src/lib/seed-dummy-data.ts`:
- Around line 430-432: The loop using teamsToCreate.forEach sets teamNameToId
with createdTeams[index]!.id using a non-null assertion; replace that with a
defensive nullish-coalescing check that throws a clear error if
createdTeams[index] is missing (e.g., use createdTeams[index] ?? throwErr("…")
before accessing .id), referring to teamsToCreate, createdTeams, teamNameToId,
displayName and id so the assignment never uses the ! operator and fails loudly
with an explicit message when the created team is absent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6e8721cf-1448-4e13-8572-404f6c9f5eee

📥 Commits

Reviewing files that changed from the base of the PR and between d437255 and 72f4d05.

📒 Files selected for processing (3)
  • apps/backend/src/app/api/latest/internal/preview/create-project/route.tsx
  • apps/backend/src/lib/clickhouse.tsx
  • apps/backend/src/lib/seed-dummy-data.ts

@BilalG1 BilalG1 requested a review from N2D4 May 22, 2026 17:34
@BilalG1 BilalG1 assigned N2D4 and unassigned BilalG1 May 22, 2026
Comment thread apps/backend/src/app/api/latest/internal/preview/create-project/route.tsx Outdated
@github-actions github-actions Bot assigned BilalG1 and unassigned N2D4 May 22, 2026
The function was used on line 48 but never imported, causing a runtime
ReferenceError. Add the import alongside the other stack-shared imports.