Multi-machine scanning via Chapel's coforall and locale-based distribution.
Extends panic-attack's single-machine rayon parallelism (assemblyline) to
datacenter-scale scanning across Chapel locales.
Locale 0 (coordinator)          Locale 1..N (workers)
┌──────────────────────┐        ┌──────────────────────┐
│ Discover repos       │        │ Receive repo paths   │
│ Partition round-robin│───────►│ Run panic-attack     │
│ Collect results      │◄───────│ BLAKE3 fingerprint   │
│ Build SystemImage    │        │ Stream RepoResult    │
│ Write temporal snap  │        └──────────────────────┘
└──────────────────────┘
- Chapel 2.8.0+ (matches chapel/Mason.toml)
- panic-attack binary on PATH (or specify via --panicAttackBin)
cd chapel
chpl src/MassPanic.chpl src/Protocol.chpl src/Imaging.chpl src/Temporal.chpl -o mass-panic
./mass-panic --repoDirectory=/path/to/repos
./mass-panic --repoDirectory=/shared/repos --numLocales=32
./mass-panic --repoDirectory=/path/to/repos --mode=full --attackTimeout=60

| Mode | Functions | Speed | Use case |
|---|---|---|---|
| assail | Static analysis | Fast | Risk mapping, imaging |
| assault | assail + stress test | Slow | Full stress testing |
| ambush | Timeline-driven stress | Slow | Choreographed attacks |
| adjudicate | assail + logic verdict | Medium | Bug inference |
| full | assail + attack + adjudicate | Slowest | Complete pipeline |
| Flag | Default | Description |
|---|---|---|
| --repoManifest | | File with one repo path per line |
| --repoDirectory | | Directory to scan for .git repos |
| --panicAttackBin | panic-attack | Path to panic-attack binary |
| --mode | assail | Operation mode (see above) |
| --scheduler | static | static (fast, not resumable) or queue (resumable, ~5–15% slower; unmeasured estimate, see panic-attack#87 Wave-3 benchmark followup) |
| --resume | false | Requires --scheduler=queue; combining with --scheduler=static exits with an error (static mode has no journal). Skips repos already marked "done" in the journal |
| --journalDir | <outputDir>/journal | Directory for queue-scheduler JSONL shards |
| --incremental | true | Skip unchanged repos via BLAKE3 |
| --cacheFile | | Fingerprint cache file path |
| --outputDir | mass-panic-results | Output directory |
| --verisimdbDir | verisimdb-data | VeriSimDB data directory |
| --snapshotLabel | | Label for temporal snapshot |
| --attackTimeout | 30 | Seconds per attack axis |
| --attackAxes | all | Comma-separated axes |
| --intensity | medium | Attack intensity |
| --notify | false | Generate notification summary |
| --panllExport | false | Generate PanLL export files |
| --quiet | false | Suppress progress output (also suppresses the scheduler banner) |
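The interaction between --scheduler, --resume and --journalDir described in the table can be sketched as a small validation step. This is a Python model, not the Chapel implementation: the function name `resolve_scheduler_options` and the error wording are hypothetical, and only the rules themselves (resume requires queue, journal defaults to <outputDir>/journal) come from the table.

```python
import posixpath


def resolve_scheduler_options(scheduler="static", resume=False,
                              output_dir="mass-panic-results",
                              journal_dir=None):
    """Model of the documented flag rules (names and messages are illustrative)."""
    if scheduler not in ("static", "queue"):
        raise ValueError(f"unknown scheduler: {scheduler!r}")
    # Static mode keeps no journal, so there is nothing to resume from.
    if resume and scheduler != "queue":
        raise ValueError("--resume requires --scheduler=queue")
    # --journalDir defaults to <outputDir>/journal.
    if journal_dir is None:
        journal_dir = posixpath.join(output_dir, "journal")
    return {"scheduler": scheduler, "resume": resume, "journal_dir": journal_dir}


opts = resolve_scheduler_options(scheduler="queue", resume=True)
```

With the defaults above, `opts["journal_dir"]` resolves to mass-panic-results/journal, which matches the path shown in the queue-mode banner.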
The --scheduler flag is the first decision every mass-panic run
implicitly makes. It controls how work is distributed across
locales, and the tradeoff matters enough that the tool prints a
banner in both modes at startup (unless --quiet) so operators
don't lose overnight sweeps to a Ctrl+C they could have survived.
Round-robin partition up-front, then coforall over Locales. Each
locale gets its fixed list of repos and scans them in order. This is
the existing implementation and what every previous mass-panic
release has done.
- Fast. No per-repo overhead beyond the existing BLAKE3 fingerprint cache. Chapel's coforall amortises scheduling cost across the whole range.
- Not resumable. A locale crash, a Ctrl+C, or a single failed repo halfway through all force a restart of the whole run. The completed repos are in mass-panic-results/assemblyline-*.json but the coordinator hasn't yet merged them into the SystemImage.
- Right for: scheduled nightly sweeps over a stable corpus, where the run finishes before anyone touches the terminal.
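The up-front round-robin partition can be illustrated with a short Python model. It is a sketch rather than the Chapel code: `partition_round_robin` is a hypothetical name, and it assumes every locale (including the coordinator) receives a bucket, which the text above does not state either way.

```python
def partition_round_robin(repos, num_locales):
    """Assign repo i to locale i % num_locales, keeping input order per locale."""
    if num_locales < 1:
        raise ValueError("num_locales must be at least 1")
    buckets = [[] for _ in range(num_locales)]
    for i, repo in enumerate(repos):
        buckets[i % num_locales].append(repo)
    return buckets


# Illustrative paths only.
repos = [f"/shared/repos/repo-{i}" for i in range(7)]
buckets = partition_round_robin(repos, 3)
```

Because the assignment is fixed before any scan starts, a slow repo delays only its own locale's list, and no locale can pick up another's leftover work. That is the property the queue scheduler trades away for resumability.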
Dynamic work-pull via a shared atomic counter plus a per-locale
JSONL journal shard. Each locale claims the next unclaimed repo
from a shared counter, writes a {"state":"claim", …} entry to
its shard, runs the scan, writes {"state":"done", …} with the
full RepoResult payload (weak-point count, severities, fingerprint,
verdict, error).
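The claim/scan/done loop can be modelled in Python with threads standing in for locales. This is a sketch under assumptions: `SharedCounter` imitates an atomic fetch-add with a lock, shards are in-memory lists instead of JSONL files, and the `repo` and `result` field names are hypothetical (only the `state` values appear in the description above).

```python
import json
import threading


class SharedCounter:
    """Lock-based stand-in for an atomic counter; fetch_add returns the old value."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def fetch_add(self, n=1):
        with self._lock:
            old = self._value
            self._value += n
            return old


def worker(locale_id, repos, counter, shards, scan):
    shard = shards[locale_id]  # each locale writes only to its own shard
    while True:
        idx = counter.fetch_add()  # claim the next unclaimed repo
        if idx >= len(repos):
            return
        repo = repos[idx]
        shard.append(json.dumps({"state": "claim", "repo": repo}))
        result = scan(repo)
        shard.append(json.dumps({"state": "done", "repo": repo, "result": result}))


def run_queue(repos, num_locales, scan):
    counter = SharedCounter()
    shards = [[] for _ in range(num_locales)]
    threads = [
        threading.Thread(target=worker, args=(i, repos, counter, shards, scan))
        for i in range(num_locales)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shards


repos = [f"repo-{i}" for i in range(10)]
shards = run_queue(repos, 3, scan=lambda repo: {"weak_points": 0})
```

Which locale ends up scanning which repo is not deterministic here, but every repo is claimed exactly once, and a claim with no matching done entry marks work that was in flight when the run stopped.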
--resume reads every shard in <journalDir>, extracts the latest
done entry per repo path, reconstructs the RepoResult records,
and skips those repos on the new run — so an interrupted run picks
up where it left off and the final report covers both the
previously-completed repos and the freshly-scanned ones.
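A minimal Python model of the replay step follows. The function name and the `repo`/`result` field names are hypothetical, "latest" is taken to mean the last done entry encountered in iteration order, and skipping an unparseable trailing line (for example one torn by a crash) is an assumption about how a robust reader would behave, not documented behaviour.

```python
import json


def replay_journal(shards):
    """Return {repo: result} for the latest done entry of each repo across all shards."""
    done = {}
    for lines in shards:
        for line in lines:
            line = line.strip()
            if not line:
                continue
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # assumption: ignore a partially written line
            if entry.get("state") == "done":
                done[entry["repo"]] = entry.get("result")
    return done


# Two illustrative shards: repo "b" was claimed but never finished,
# and the second shard ends in a torn line.
example_shards = [
    [
        '{"state": "claim", "repo": "a"}',
        '{"state": "done", "repo": "a", "result": {"weak_points": 2}}',
        '{"state": "claim", "repo": "b"}',
    ],
    [
        '{"state": "claim", "repo": "c"}',
        '{"state": "done", "repo": "c", "result": {"weak_points": 0}}',
        '{"state": "cla',
    ],
]
done = replay_journal(example_shards)
remaining = [r for r in ["a", "b", "c", "d"] if r not in done]
```

Here `remaining` contains only the in-flight repo and the never-claimed one, so a resumed run would rescan those two and merge their results with the two recovered from the journal.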
- Resumable. Ctrl+C at t=3h drops ~1 repo of work; the next invocation with --resume reuses everything completed so far. A locale crash during a multi-day sweep loses only the currently-in-flight repo on that locale.
- ~5–15% slower (UNMEASURED ESTIMATE) on clean runs. Not yet benchmarked against any real corpus; this is a back-of-envelope number from the per-task dispatch overhead (atomic fetch-add + one journal write per repo, vs amortised across a coforall range). On a clean 10k-repo sweep, expect queue mode to finish in roughly 1.10× the time of static. A defensible empirical measurement is tracked as panic-attack#87 Wave-3 followup (needs a beefier/self-hosted runner; default GH runners are too noisy for stable scheduler-overhead measurement).
- Right for: long interactive sweeps (GitHub-account scale or larger), sweeps where at least one locale is on spot/preemptible infrastructure, or any run where you expect to want to pause and come back.
Static mode is expected to be faster on clean runs (the gap has not been measured yet) and doesn't require any durable state. If your run always finishes cleanly, the journal writes are wasted I/O. Making the default explicit ("you are in static mode; here is what you're giving up") lets operators make that call consciously instead of paying for resilience they don't need.
Both schedulers are implemented. --scheduler=static is the default
and preserves the previous behaviour exactly — selecting queue does
not make static slower. --scheduler=queue writes per-run shards
(locale-<id>-<runId>.jsonl) so a crashed run's partial shard stays
isolated from the next run's writes; --resume replays every shard
in the journal directory and merges prior results with fresh ones.
The atomic work counter lives on the coordinator (Locale 0); every claim is one remote fetchAdd (microseconds) against a scan cost of 100ms–60s, so the dispatch overhead is well under 1% on any real workload. The ~5–15% figure above (still unmeasured) accounts for the per-repo journal write + flush, not the atomic itself.
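The "well under 1%" claim is simple arithmetic, worked through below in Python. The 10 µs cost for a remote fetchAdd is an assumed illustrative value (the text says only "microseconds"); the 100 ms and 60 s scan costs are the bounds quoted above.

```python
def dispatch_overhead(fetch_add_seconds, scan_seconds):
    """Fraction of per-repo time spent on the remote claim."""
    return fetch_add_seconds / scan_seconds


FETCH_ADD_SECONDS = 10e-6  # assumption: 10 microseconds per remote fetchAdd

fastest_scan = dispatch_overhead(FETCH_ADD_SECONDS, 0.100)  # 100 ms scan
slowest_scan = dispatch_overhead(FETCH_ADD_SECONDS, 60.0)   # 60 s scan
```

Even for the fastest scan the ratio is about one part in ten thousand, two orders of magnitude below 1%, which is why the estimated slowdown is attributed to the journal write and flush and not to the counter.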
When you run ./mass-panic …, the scheduler banner appears before
repo discovery:
mass-panic: scheduler=static (default)
fastest on clean runs; no --resume support.
A crash or Ctrl+C loses all progress.
Use --scheduler=queue for resumable runs (~5-15% slower, unmeasured).
Or for queue mode:
mass-panic: scheduler=queue
resumable via --resume; per-locale JSONL shards at mass-panic-results/journal
~5-15% slower than static on clean runs (unmeasured; one atomic + one journal write per repo).
A crash or Ctrl+C loses only the in-flight repo per locale — everything already
marked "done" is skipped on the next invocation with --resume.
The banner is suppressed under --quiet.
- mass-panic-results/assemblyline-<timestamp>.json: aggregated report
- mass-panic-results/system-image-<timestamp>.json: fNIRS-style health map
- verisimdb-data/: temporal snapshots (VeriSimDB hexads)
The Chapel layer is optional — a detachable harness on top of the standalone Rust binary. For single-machine scanning, use:
panic-attack assemblyline /path/to/repos # rayon parallel
panic-attack image /path/to/repos # + imaging + temporal

Chapel adds multi-machine distribution for scanning at GitHub-account or
datacenter scale, where hundreds of machines each scan their partition of
repositories simultaneously. Removing chapel/ entirely leaves the Rust
build green and the single-machine USB-stick experience intact.
The Chapel↔Rust contract is exposed via panic-attack describe-contract
(introduced for the chapel-cli-contract CI gate). Any external orchestrator
— Chapel mass-panic, Nextflow, Airflow, Slurm, a hand-rolled shell script —
can call it to discover accepted flags per mode and the report
schema_version without coupling itself to panic-attack source.
panic-attack applies functional Near-Infrared Spectroscopy (fNIRS) concepts
to codebase health mapping. The canonical mapping lives in
the src/Imaging.chpl header (lines 4-27) and is mirrored
here so the metaphor doesn't drift:
| fNIRS term | panic-attack equivalent |
|---|---|
| Cortical region | Repository / directory / file |
| Blood oxygenation | Health score (inverse of risk) |
| Neural activation | Weak point density (findings per KLOC) |
| Hemodynamic response | Change velocity (how fast risk is changing) |
| Optode placement | Scanner coverage (which files were analysed) |
| Channel | Dependency / taint flow edge |
| Functional map | SystemImage |
| Time series | Temporal snapshot sequence in VeriSimDB |
When a new health metric is added, update both Imaging.chpl and this
table; CI does not enforce the mapping but reviewers should.