chore: add nats benchmarking pkg #26396

Open
sreya wants to merge 20 commits into main from nats-benchmarking

@sreya (Collaborator) commented Jun 16, 2026

Just an FYI: I let the AI run with this one a bit more than in previous PRs that touched production code, since this package is only for internal use.

Here's a comparison of the prototype and the current implementation. These numbers were collected prior to the message queue change, so we should rerun to make sure that change didn't cause a serious regression.

High Volume Single Subject

| Row | Old Pubs/sec | New Pubs/sec | Old Dels/sec | New Dels/sec |
| --- | --- | --- | --- | --- |
| 100P x 10S x 8KB | 397,993 | 529,403 | 3,415,410 | 4,516,496 |
| 100P x 10S x 64KB | 51,683 | 55,435 | 516,818 | 554,347 |
| 10P x 100S x 8KB | 353,889 | 441,064 | 6,745,004 | 5,478,560 |
| 10P x 100S x 64KB | 61,924 | 65,222 | 4,263,735 | 5,794,167 |
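
When rerunning, a quick way to spot regressions is to compute the relative change per cell. The sketch below is only an illustration (the `pctChange` helper is hypothetical, not part of the package) and uses two cells copied from the table above: the first improves by roughly 33%, while the 10P x 100S x 8KB deliveries cell drops by roughly 19%.

```go
package main

import "fmt"

// pctChange returns the relative change from prev to cur, in percent.
// Hypothetical helper for reading the tables; not part of natsbench.
func pctChange(prev, cur float64) float64 {
	return (cur - prev) / prev * 100
}

func main() {
	// Values copied from the "High Volume Single Subject" table.
	cells := []struct {
		name      string
		prev, cur float64
	}{
		{"100P x 10S x 8KB pubs/sec", 397993, 529403},
		{"10P x 100S x 8KB dels/sec", 6745004, 5478560},
	}
	for _, c := range cells {
		fmt.Printf("%-28s %+.1f%%\n", c.name, pctChange(c.prev, c.cur))
	}
}
```
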

High Volume Multi Subject (10 Subjects)

| Row | Old Pubs/sec | New Pubs/sec | Old Dels/sec | New Dels/sec |
| --- | --- | --- | --- | --- |
| 100P x 10S x 8KB | 614,206 | 647,814 | 560,404 | 578,822 |
| 100P x 10S x 64KB | 61,162 | 69,478 | 61,161 | 69,478 |
| 10P x 100S x 8KB | 613,889 | 658,842 | 5,354,673 | 5,722,741 |
| 10P x 100S x 64KB | 76,577 | 87,521 | 765,762 | 875,149 |

High Cardinality Publish Fan In

| Row | Old Pubs/sec | New Pubs/sec | Old Dels/sec | New Dels/sec |
| --- | --- | --- | --- | --- |
| 100P x 100S x 8KB | 610,315 | 728,160 | 559,369 | 628,184 |
| 100P x 100S x 64KB | 57,750 | 65,128 | 57,747 | 65,122 |

High Cardinality Fanout

| Row | Old Pubs/sec | New Pubs/sec | Old Dels/sec | New Dels/sec |
| --- | --- | --- | --- | --- |
| 300P x 100S x 8KB | 575,351 | 664,537 | 522,973 | 599,190 |
| 300P x 100S x 64KB | 46,218 | 59,080 | 46,217 | 59,080 |

High Cardinality Fanout

| Row | Old Pubs/sec | New Pubs/sec | Old Dels/sec | New Dels/sec |
| --- | --- | --- | --- | --- |
| 100P x 300S x 8KB | 605,637 | 676,330 | 1,609,660 | 1,801,120 |
| 100P x 300S x 64KB | 51,456 | 65,459 | 154,358 | 196,373 |

Global Broadcast

| Row | Messages | Old Pubs/sec | New Pubs/sec | Old Dels/sec | New Dels/sec |
| --- | --- | --- | --- | --- | --- |
| 10P x 10S x 8KB | 100,000 | 70,358 | 73,329 | 679,927 | 704,078 |
| 10P x 10S x 64KB | 20,000 | 9,053 | 8,618 | 79,210 | 73,614 |

Global Broadcast Subscriber Fanout

| Row | Old Pubs/sec | New Pubs/sec | Old Dels/sec | New Dels/sec |
| --- | --- | --- | --- | --- |
| 10P x 100S x 8KB | 67,993 | 71,440 | 6,467,409 | 6,676,229 |
| 10P x 100S x 64KB | 8,568 | 8,466 | 766,898 | 737,429 |

Sharded Broadcast, 10 Subjects

| Row | Old Pubs/sec | New Pubs/sec | Old Dels/sec | New Dels/sec |
| --- | --- | --- | --- | --- |
| 10P x 100S x 8KB | 688,390 | 804,489 | 5,548,640 | 6,615,891 |
| 10P x 100S x 64KB | 40,044 | 44,122 | 400,347 | 441,221 |

Sharded High-Cardinality Thin Fanout

| Row | Messages | Old Pubs/sec | New Pubs/sec | Old Dels/sec | New Dels/sec |
| --- | --- | --- | --- | --- | --- |
| 100P x 100S x 8KB | 1,000,000 | 759,281 | 800,269 | 625,637 | 763,617 |
| 100P x 100S x 64KB | 200,000 | 39,539 | 45,424 | 39,535 | 45,424 |

sreya added 20 commits June 11, 2026 22:03
Add an importable benchmark library for the NATS-backed pubsub that
measures Pubs/sec and Deliveries/sec under high fan-out load across
configurable subjects, payload sizes, publishers, subscribers, and
replica counts.

- Deterministic plan maps publishers/subscribers to subjects and
  replicas and precomputes exact per-subscriber delivery counts.
- Probe-based readiness gate proves cross-route subscription interest
  has propagated before the measured phase, since routed drops are
  silent.
- Workload-derived sizing for listener queues and server max pending
  prevents slow-consumer drops; any drop signal invalidates the run.
- Bounded phases fail with shortfall, server-stats, and goroutine-dump
  diagnostics instead of hanging.
- TestBenchMatrix (gated behind CODER_TEST_NATS_BENCH=1) runs the
  8 KiB / 64 KiB x 1/5/10 replica matrix and renders grouped markdown
  tables; invalid runs never report a throughput number.
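
To make the "deterministic plan" bullet concrete, here is a minimal sketch of precomputing exact per-subscriber delivery counts. It assumes a round-robin assignment of publishers and subscribers to subjects, which may not match the real policy, and `sketchPlan` / `buildSketchPlan` are hypothetical names rather than the package's actual types.

```go
package main

import "fmt"

// sketchPlan is a hypothetical, simplified stand-in for the benchmark plan.
type sketchPlan struct {
	pubSubject []int // pubSubject[i] is the subject index publisher i writes to
	subSubject []int // subSubject[j] is the subject index subscriber j listens on
	expected   []int // expected[j] is the exact delivery count for subscriber j
}

// buildSketchPlan assigns publishers and subscribers to subjects round-robin
// (an assumed policy) and precomputes per-subscriber delivery counts, given
// that every publisher sends msgsPerPublisher messages.
func buildSketchPlan(publishers, subscribers, subjects, msgsPerPublisher int) sketchPlan {
	p := sketchPlan{
		pubSubject: make([]int, publishers),
		subSubject: make([]int, subscribers),
		expected:   make([]int, subscribers),
	}
	perSubject := make([]int, subjects) // total messages published per subject
	for i := range p.pubSubject {
		p.pubSubject[i] = i % subjects
		perSubject[i%subjects] += msgsPerPublisher
	}
	for j := range p.subSubject {
		p.subSubject[j] = j % subjects
		p.expected[j] = perSubject[j%subjects]
	}
	return p
}

func main() {
	p := buildSketchPlan(3, 3, 2, 100)
	fmt.Println(p.pubSubject, p.subSubject, p.expected)
	// prints "[0 1 0] [0 1 0] [200 100 200]"
}
```

Because the expected counts are fixed before anything is published, a run can be declared complete (or short) by comparing observed deliveries against `expected` rather than waiting on a timer.
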
…ndings

Address code review findings:

- Derive MaxPending from the per-node sum of subject volumes with local
  subscribers, since MaxPending is a per-connection budget and one
  subscribe connection carries every coalesced subscription on its
  node. The previous per-subscriber derivation undersized multi-subject
  nodes.
- Derive the per-subscription pending byte limit (new LocalQueueBytes
  knob) alongside the message limit; previously the 512 MiB default
  could trip before the derived message limit.
- Pad message-count budgets with probe headroom so in-flight readiness
  probes cannot consume capacity sized for the benchmark burst.
- Warn when the derived local queue hits its cap and can no longer
  guarantee a drop-free run.
- Return partial Results on publish and flush errors for diagnostics,
  matching the documented Run contract.
- Register subscriber cleanup before subscribing so partial subscribe
  failures are cleaned up by the workload itself.
- Remove a no-op subscriber-node flush whose comment misattributed the
  interest guarantee; SubscribeWithErr flushes the SUB itself.
- Record effective (overridden) configs in matrix report rows.
- Replace the env-gated TestBenchMatrix test with a cmd/natsbench CLI:
  no flags runs the default matrix, -scenario runs one named scenario,
  and shape flags (-payload/-subjects/-publishers/-subscribers/
  -replicas) run a custom configuration. Markdown goes to stdout, logs
  to stderr, and a failed run exits nonzero. The report-to-file env var
  is gone; redirect stdout instead.
- Remove Config.withDefaults: Run now requires a fully populated
  config and validates that Timeout is positive. The CLI defaults the
  timeout to 2 minutes.
- Collapse the readiness gate's two plan inversions into a single
  subjectNodes mapping that serves as both the probe schedule and each
  subscriber's required probe set.
- Document why startPublishers parks on a barrier and when the
  zero-expectation pre-close of allDone applies.
- Drop digit-separator underscores from numeric literals.
- Encode readiness probes as a sentinel byte plus the decimal node
  index instead of a BigEndian uint64, dropping the encoding/binary and
  math dependencies and the overflow guard.
- Return publisher errors over a buffered, closed-on-completion channel
  instead of writing into a shared slice, removing any question of a
  data race on the error collection.
- Move the CLI driver into the natsbench package as the exported Main
  plus a testable cliRun.scenarios; cmd/natsbench is now a thin
  entrypoint. Adds unit coverage for scenario selection.
- Expand the plan doc comment with a concrete worked example of the
  publisher/subscriber to subject/node assignment and expected counts.
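
As an illustration of the "buffered, closed-on-completion channel" bullet above, a minimal sketch of the pattern follows. `runPublishers`, `publish`, and `errBoom` are hypothetical names; the real code may be structured differently.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errBoom is a placeholder error used only by this example.
var errBoom = errors.New("boom")

// runPublishers starts n publisher goroutines and returns a channel that
// yields each publisher error and is closed once all publishers finish.
// The channel is buffered to n, so a failing publisher never blocks on send
// and no shared slice is written from multiple goroutines.
func runPublishers(n int, publish func(id int) error) <-chan error {
	errs := make(chan error, n)
	var wg sync.WaitGroup
	for id := 0; id < n; id++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			if err := publish(id); err != nil {
				errs <- fmt.Errorf("publisher %d: %w", id, err)
			}
		}(id)
	}
	go func() {
		wg.Wait()
		close(errs) // closing lets the caller simply range over the channel
	}()
	return errs
}

func main() {
	errs := runPublishers(4, func(id int) error {
		if id%2 == 1 {
			return errBoom
		}
		return nil
	})
	var all []error
	for err := range errs {
		all = append(all, err)
	}
	fmt.Println(len(all), errors.Is(errors.Join(all...), errBoom)) // prints "2 true"
}
```
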
Replace the 0x5b sentinel byte with a 'natsbench-probe:' string prefix.
Both distinguish probes from the all-zero benchmark payloads equally
well, but the prefix is self-documenting in packet captures and
debuggers. Decode with strings.CutPrefix.
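
A minimal sketch of the prefix-based probe encoding described in this commit, assuming the payload is the prefix followed by the decimal node index. `encodeProbe` and `decodeProbe` are hypothetical names. Note that `string(payload)` copies the payload, which a later commit in this PR addresses.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// probePrefix marks a payload as a readiness probe. Benchmark payloads are
// all zero bytes, so they can never start with this prefix.
const probePrefix = "natsbench-probe:"

// encodeProbe builds a probe payload carrying the decimal node index.
func encodeProbe(node int) []byte {
	return []byte(probePrefix + strconv.Itoa(node))
}

// decodeProbe returns the node index if payload is a well-formed probe.
func decodeProbe(payload []byte) (node int, ok bool) {
	rest, found := strings.CutPrefix(string(payload), probePrefix)
	if !found {
		return 0, false
	}
	n, err := strconv.Atoi(rest)
	if err != nil || n < 0 {
		return 0, false
	}
	return n, true
}

func main() {
	fmt.Printf("%s\n", encodeProbe(7)) // prints "natsbench-probe:7"
	fmt.Println(decodeProbe(encodeProbe(7)))
	fmt.Println(decodeProbe(make([]byte, 8)))
}
```
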
Collapse the library and its cmd/natsbench entrypoint into a single
package main with a main() that calls runCLI. The benchmark is now run
directly with 'go run ./coderd/x/nats/natsbench/'. Tests still live in
the same directory and continue to pass.
…comments

- Rename awaitReadiness -> awaitTopologyReady, readinessConverged ->
  isReady, readinessShortfall -> unreadySubscribers.
- Give each plan field its own comment line.
- Note why probe flushing dedupes pubNode (it is indexed by publisher,
  so multiple publishers share a node).
Compute the sorted distinct publisher and subscriber node sets once in
buildPlan (plan.pubNodes / plan.subNodes) instead of recomputing
uniqueInts at each call site, including on every iteration of the
readiness gate loop. Several publishers or subscribers can share a node,
so per-node work (flushing, burst sizing) needs the deduped set.
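
For readers unfamiliar with the helper, a sorted-distinct function along these lines can be written with the standard `slices` package. This is a sketch of what `uniqueInts` plausibly does, not its actual source.

```go
package main

import (
	"fmt"
	"slices"
)

// uniqueInts returns the sorted distinct values of in without modifying it.
// Sketch only; the helper in the PR may be implemented differently.
func uniqueInts(in []int) []int {
	out := slices.Clone(in)
	slices.Sort(out)
	return slices.Compact(out)
}

func main() {
	// pubNode is indexed by publisher, so several publishers share a node.
	pubNode := []int{2, 0, 2, 1, 0}
	fmt.Println(uniqueInts(pubNode)) // prints "[0 1 2]"
}
```

Computing this once at plan-build time means per-node loops iterate over three entries here instead of five, and the readiness gate does not sort on every iteration.
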
Drop the redundant Scenario column (the payload group header and the
Replicas column already identify each row) and the always-zero Drops
and always-empty Notes columns. A Status column is now included only
for groups that contain an invalid run, so clean matrices render as a
compact four-column table.
Pad every table cell to its column's widest value so the raw markdown
also lines up in a fixed-width terminal, instead of relying on a
markdown viewer to align ragged pipes. Numeric columns stay
right-aligned and the Status column is left-aligned.
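
A minimal sketch of the padding approach, assuming ASCII cells and rows that have exactly as many cells as the header. `renderTable` is a hypothetical name and the data in `main` is placeholder data, not benchmark output.

```go
package main

import (
	"fmt"
	"strings"
)

// renderTable pads each cell to its column's widest value. Columns listed in
// leftAlign are left-aligned; all other columns are right-aligned.
// Assumes every row has len(header) cells and all cells are ASCII.
func renderTable(header []string, rows [][]string, leftAlign map[int]bool) string {
	widths := make([]int, len(header))
	for c, h := range header {
		widths[c] = len(h)
		if widths[c] < 3 {
			widths[c] = 3 // keep separator cells at least "---" wide
		}
	}
	for _, row := range rows {
		for c, cell := range row {
			if len(cell) > widths[c] {
				widths[c] = len(cell)
			}
		}
	}
	var b strings.Builder
	writeRow := func(cells []string) {
		b.WriteString("|")
		for c, cell := range cells {
			if leftAlign[c] {
				fmt.Fprintf(&b, " %-*s |", widths[c], cell)
			} else {
				fmt.Fprintf(&b, " %*s |", widths[c], cell)
			}
		}
		b.WriteString("\n")
	}
	writeRow(header)
	sep := make([]string, len(header))
	for c := range sep {
		sep[c] = strings.Repeat("-", widths[c])
	}
	writeRow(sep)
	for _, row := range rows {
		writeRow(row)
	}
	return b.String()
}

func main() {
	// Placeholder values for illustration only.
	fmt.Print(renderTable(
		[]string{"Replicas", "Pubs/sec", "Status"},
		[][]string{{"1", "1,000", "ok"}, {"3", "20", "invalid"}},
		map[int]bool{2: true},
	))
}
```
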
The standard matrix now runs with 3 publisher and 3 subscriber
connections (DefaultConns) to match the prior natsbench harness, which
spreads same-subject hashing across connections and raises single-node
throughput over the production 1/1 default. New -publish-conns and
-subscribe-conns flags apply to every run, so 1/1 production behavior is
still reproducible with -publish-conns 1 -subscribe-conns 1.
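
A minimal sketch of how the two flags could be wired with the standard `flag` package. The flag names and the default of 3 come from the commit message above; `parseConnFlags`, `connFlags`, and the usage strings are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
)

// defaultConns mirrors the DefaultConns value of 3 described above.
const defaultConns = 3

// connFlags holds the parsed connection counts (hypothetical type).
type connFlags struct {
	publishConns   int
	subscribeConns int
}

// parseConnFlags parses -publish-conns and -subscribe-conns from args.
func parseConnFlags(args []string) (connFlags, error) {
	fs := flag.NewFlagSet("natsbench", flag.ContinueOnError)
	var c connFlags
	fs.IntVar(&c.publishConns, "publish-conns", defaultConns, "number of publisher connections")
	fs.IntVar(&c.subscribeConns, "subscribe-conns", defaultConns, "number of subscriber connections")
	err := fs.Parse(args)
	return c, err
}

func main() {
	def, _ := parseConnFlags(nil)
	prod, _ := parseConnFlags([]string{"-publish-conns", "1", "-subscribe-conns", "1"})
	fmt.Println(def.publishConns, def.subscribeConns)   // prints "3 3"
	fmt.Println(prod.publishConns, prod.subscribeConns) // prints "1 1"
}
```
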
Drop the trailing colons from table separator rows. Cells are already
padded for terminal alignment, so the GitHub markdown alignment hints
added visual noise without changing the rendered terminal output.
Add Subjects, Publishers, and Subscribers columns to the report so the
workload shape is explicit. The default matrix holds these constant, but
named-scenario overrides and custom runs vary them, and a table that
hides the shape is easy to misread.
probeNode ran string(payload) on every delivered message, allocating a
full copy of the (up to 64 KiB) payload per delivery. At high fan-out
this dominated runtime via GC pressure and understated throughput by up
to ~10x. Compare the probe prefix as bytes against a package-level byte
slice and convert only the tiny trailing node index to a string, and
only for actual probes, so benchmark payloads cost no allocation.
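
A minimal sketch of the byte-level comparison, under the assumption that the probe payload is the prefix followed by a decimal node index. This is a simplified stand-in for `probeNode`, not its actual source.

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
)

// probePrefix is a package-level byte slice so the prefix check never
// converts the (possibly 64 KiB) payload to a string.
var probePrefix = []byte("natsbench-probe:")

// probeNode returns the node index if payload is a probe. Benchmark payloads
// fail the prefix check and return before any string conversion happens.
func probeNode(payload []byte) (int, bool) {
	rest, ok := bytes.CutPrefix(payload, probePrefix)
	if !ok {
		return 0, false
	}
	// Only the short trailing node index is converted to a string.
	n, err := strconv.Atoi(string(rest))
	if err != nil || n < 0 {
		return 0, false
	}
	return n, true
}

func main() {
	fmt.Println(probeNode([]byte("natsbench-probe:12"))) // prints "12 true"
	fmt.Println(probeNode(make([]byte, 64*1024)))        // prints "0 false"
}
```
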
Run validates a fully populated config and applies no defaults, so the
Messages comment claiming 'Zero means DefaultMessages' was a false
contract (validate rejects Messages < 1). Likewise PublishConns and
SubscribeConns do not default in natsbench; zero passes through to
nats.Options, which applies the single-connection default. Clarify on
the Config type that defaulting happens in the CLI, required fields
must be set, and only LocalQueue*/MaxPending are derived when zero.
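
The contract described here (no defaulting in Run, required fields rejected when unset) can be sketched as below. `sketchConfig` is a hypothetical, trimmed-down stand-in for the real Config type and covers only the two checks named in these commit messages.

```go
package main

import (
	"fmt"
	"time"
)

// sketchConfig is a hypothetical subset of the benchmark Config.
type sketchConfig struct {
	Messages int           // required; must be at least 1
	Timeout  time.Duration // required; must be positive
}

// validate rejects unset required fields instead of applying defaults.
func (c sketchConfig) validate() error {
	if c.Messages < 1 {
		return fmt.Errorf("messages must be at least 1, got %d", c.Messages)
	}
	if c.Timeout <= 0 {
		return fmt.Errorf("timeout must be positive, got %s", c.Timeout)
	}
	return nil
}

func main() {
	fmt.Println(sketchConfig{}.validate())
	// The CLI, not Run, supplies defaults such as the 2 minute timeout.
	fmt.Println(sketchConfig{Messages: 1000, Timeout: 2 * time.Minute}.validate())
}
```
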
- Default matrix now uses replica counts (1, 3, 9) coprime with the
  subject count (10) so cluster scenarios actually exercise cross-node
  routing; previously divisor counts co-located every pub/sub pair and
  the readiness gate proved nothing. TestRunCluster likewise uses
  coprime Subjects/Replicas for cross-node integration coverage.
- applySizing now warns when an explicit LocalQueueBytes is below the
  derived size, matching LocalQueueMsgs and MaxPending.
- Wire SIGINT/SIGTERM cancellation through the CLI; the run loop stops
  launching scenarios once interrupted instead of emitting confusing
  topology errors. Move os.Exit out of the deferred-stop scope.
- Replace hand-rolled formatInt with humanize.Comma.
- Add unit tests for the drop-invalidation path (dropState, listener
  drop accounting, awaitPhase fail-fast).
- Trim probe comments to the why; use wg.Go for publisher goroutines.
- Document that DefaultScenarios leaves Timeout unset for the caller.
- Drop vestigial Status/Scenario NotContains assertions in the
  clean-group render test.
- Clarify closeAll comment refers to Pubsub.Close.
- De-stutter the subscriber registration error message.
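
To illustrate the coprime-replica point from the first bullet, the sketch below assumes publishers and subscribers are assigned by index, with subject `i % subjects` and node `i % replicas`. That assignment is an assumption made for illustration and `crossNodeSubjects` is a hypothetical helper. Under it, a replica count that divides the subject count pins every subject to one node, while a coprime count spreads each subject across nodes.

```go
package main

import "fmt"

// crossNodeSubjects counts subjects whose participants land on more than one
// node, assuming participant i uses subject i%subjects and node i%replicas.
func crossNodeSubjects(participants, subjects, replicas int) int {
	nodes := make([]map[int]bool, subjects)
	for i := 0; i < participants; i++ {
		s := i % subjects
		if nodes[s] == nil {
			nodes[s] = make(map[int]bool)
		}
		nodes[s][i%replicas] = true
	}
	count := 0
	for _, set := range nodes {
		if len(set) > 1 {
			count++
		}
	}
	return count
}

func main() {
	for _, replicas := range []int{5, 10, 3, 9} {
		fmt.Printf("replicas=%d cross-node subjects=%d\n",
			replicas, crossNodeSubjects(100, 10, replicas))
	}
	// With 10 subjects: replicas 5 and 10 give 0, replicas 3 and 9 give 10.
}
```
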
Call signal stop() explicitly instead of deferring it, so os.Exit no
longer skips it and the two-function split added only to satisfy the
exitAfterDefer lint is unnecessary.
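
A minimal sketch of the shutdown shape described in the last two commits: a run loop that stops launching scenarios once interrupted, and an explicit `stop()` call before `os.Exit`, since `os.Exit` does not run deferred functions. `runScenarios` and the scenario names are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// runScenarios launches scenarios in order and stops launching new ones once
// done is closed. It returns how many scenarios were launched.
func runScenarios(done <-chan struct{}, scenarios []string, launch func(name string)) int {
	ran := 0
	for _, name := range scenarios {
		select {
		case <-done:
			return ran // interrupted: skip the remaining scenarios
		default:
		}
		launch(name)
		ran++
	}
	return ran
}

func main() {
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)

	scenarios := []string{"example-a", "example-b"} // placeholder names
	ran := runScenarios(ctx.Done(), scenarios, func(name string) {
		fmt.Fprintln(os.Stderr, "running", name)
	})

	// Call stop explicitly rather than deferring it, because os.Exit below
	// would skip any deferred call.
	stop()
	if ran != len(scenarios) {
		os.Exit(1)
	}
}
```
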
…rops

The clean-group render test asserted NotContains 'Drops', which never
appears in any output and so passed trivially. Restore the meaningful
NotContains 'Status' assertion: Status is a conditional column header
added only for groups with an invalid run, and this test exists to
verify clean groups omit it.
@sreya requested a review from spikecurtis June 16, 2026 05:30
@sreya changed the title from "Nats benchmarking" to "chore: add nats benchmarking pkg" Jun 16, 2026