diff --git a/.claude/handoffs/explore-flow-tool-adoption.md b/.claude/handoffs/explore-flow-tool-adoption.md new file mode 100644 index 00000000..b4993811 --- /dev/null +++ b/.claude/handoffs/explore-flow-tool-adoption.md @@ -0,0 +1,70 @@ +--- +name: explore-flow-tool-adoption +date: 2026-05-24 00:55 +project: codegraph +branch: architectural-improvements +summary: Investigated why codegraph's read savings don't convert to wall-clock; root cause is agent tool-CHOICE (under-uses trace). Shipped a chain of fixes; the breakthrough is "explore-surfaces-flow" — the first mechanism to show up in real agent runs by adapting the tool the agent already uses. +--- + +# Handoff: codegraph retrieval — tool adoption & explore-surfaces-flow + +## Resume here — read this first +**Current state:** A long investigation into making agents answer flow questions faster with codegraph. 6 commits on `architectural-improvements` (all probe-validated, suite green 815). The breakthrough: **`codegraph_explore` now surfaces the execution flow** from the symbol-bag the agent already passes it (`PmsProductController getList PmsProductService list PmsProductServiceImpl` → leads output with `getList → service-interface → impl`, riding synth edges). It's the FIRST mechanism this whole arc to actually appear in real agent runs (spring-mall A/B: flow surfaced both runs, reads 2.0→1.5) — because it adapts the tool the agent USES instead of trying to make it use `trace`. + +**Immediate next step:** The user is weighing how to push tool-USE quality next (their open question). Decide between: (a) **extend explore-flow to surface more reliably** (spring-halo's query didn't name a connected co-named chain → no flow), (b) accept we're at the model-behavior ceiling and **wrap up**, or (c) the user's ideas — better tool-description *examples* (≈ steering, low-leverage per the evidence) or a *query-builder tool* (adds a call + new-tool adoption problem). 
My read: keep ADAPTING THE USED TOOL (the only thing that's worked); examples/new-tools are the "change the agent" direction that failed all session. + +> Suggested next message: "explore-flow only surfaced on 2 of 3 repos — dig into why spring-halo's explore query didn't produce a flow and make it surface more reliably" — OR — "we're at the model-behavior ceiling; let's stop and write the CHANGELOG/PR for this branch" + +## Goal +Make an AI agent answer **flow questions** ("how does X reach Y", request→handler→service, state→render) fast: ~0 Read/Grep, few codegraph calls, lower wall-clock. `codegraph_trace` is the fastest tool (1 call = the path), but the agent under-uses it. Ultimate target = trace's speed, however the agent gets there. + +## Key findings (the through-line) +- **The wall is agent tool-CHOICE, not the graph.** Matrix-wide, codegraph cuts reads −75% but wall-clock only −16% (`docs/benchmarks/codegraph-ab-matrix.md`). The floor is round-trips + the synthesis turn. The agent reliably calls `context`/`explore`, rarely `trace` (3/37 flow cells). Full analysis: `docs/benchmarks/call-sequence-analysis.md`. +- **Steering does NOT move it** (arms B/F/G, 3 wording variants): an MCP `initialize` instruction / tool description can't match a CLI `--append-system-prompt`'s salience, and forcing trace where it doesn't connect regresses. Reverted. +- **Sufficiency works** (committed): a self-sufficient `trace` (hop bodies + destination callees inlined) lets the unsteered agent stop — but only when it calls trace. +- **THE breakthrough — adapt the tool the agent uses.** `explore`'s query is a precise symbol-bag spanning the flow, so `explore` finds the call path AMONG its named symbols and leads with it. First mechanism to surface in real runs + drop reads. +- **What FAILED:** option 1 (context-surfaces-flow) — fuzzy DESCRIPTION can't disambiguate endpoints → confident WRONG-feature flow; reverted. 
trace multi-source-BFS over ambiguous names — same wrong-feature; reverted. + +## Gotchas +- **Co-naming disambiguation must match qualifiedName SEGMENTS, not substrings** (`buildFlowFromNamedSymbols` in `src/mcp/tools.ts`): `list` is a substring of `getList` → kept every getList. Split `qualifiedName` on `::`/`.` and match segments. +- **BFS must cap consecutive UNNAMED hops at 1** — full-graph BFS wanders a god-function's fan-out (excalidraw `render()` → pointer handlers → mutateElement). ≤1 bridge crosses a missing intermediate without wandering. +- **`getCallees` returns non-`calls` edges too** (references) — filter `c.edge.kind === 'calls'`. +- **Resolver/synthesizer changes need a CLEAN reindex**: `rm -rf .codegraph && codegraph init -i` (the init edge count is contains-only — query the DB for the real count). The explore-flow change is query-time (no reindex). +- **n=2 A/B is noisy** — report ranges/patterns, never conclude from one run. Foreground `sleep` is blocked → run A/B batches with `run_in_background`. +- Java/Kotlin `qualifiedName` is `Class::method` (so `matchesSymbol` resolves `Class.method` qualified trace endpoints — the agent already passes these). + +## How to test & validate +- Probe flow surfacing (no agent): `node scripts/agent-eval/probe-explore.mjs ""` → look for the `## Flow` section. `probe-trace.mjs ` for trace. +- Synthesizer: `sqlite3 /.codegraph/codegraph.db "select count(*) from edges where json_extract(metadata,'$.synthesizedBy')='interface-impl'"`; node count stable before/after reindex (synth adds edges only). +- Agent A/B (the real test): `bash scripts/agent-eval/run-arms.sh "" I ` (arm I = body-trace build, no steering). Parse via the `cmp2.mjs`-style scripts in `/tmp`. Pass = flow surfaces (`flowShown=Y`) + reads ≤ baseline. +- `npm test` (vitest, 815 pass); `__tests__/mcp-tool-allowlist.test.ts` covers the allowlist. 
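
The first two gotchas above (segment matching and the cap on consecutive unnamed hops) can be sketched together. This is a minimal illustration, not the actual `buildFlowFromNamedSymbols` in `src/mcp/tools.ts`: the `Edge` shape, `qualifiedSegments`, `matchesSegment`, and `findFlow` are hypothetical names introduced here, and node ids are assumed to be plain strings.

```typescript
// Hypothetical edge shape; the real graph types are not shown in this handoff.
type Edge = { from: string; to: string; kind: string };

// Split a qualified name on "::" or "." so tokens are matched per segment.
function qualifiedSegments(qualifiedName: string): string[] {
  return qualifiedName.split(/::|\./).filter((s) => s.length > 0);
}

// Segment match: "list" matches "PmsProductService::list" but not "...::getList".
function matchesSegment(qualifiedName: string, token: string): boolean {
  return qualifiedSegments(qualifiedName).indexOf(token) !== -1;
}

// BFS over `calls` edges only, allowing at most 1 consecutive unnamed hop.
function findFlow(
  edges: Edge[],
  start: string,
  goal: string,
  named: Set<string>
): string[] | null {
  const adj = new Map<string, string[]>();
  for (const e of edges) {
    if (e.kind !== "calls") continue; // drop references and other edge kinds
    const list = adj.get(e.from) ?? [];
    list.push(e.to);
    adj.set(e.from, list);
  }
  const queue: { node: string; unnamed: number; path: string[] }[] = [
    { node: start, unnamed: 0, path: [start] },
  ];
  const seen = new Set<string>([`${start}|0`]);
  while (queue.length > 0) {
    const cur = queue.shift()!;
    if (cur.node === goal) return cur.path;
    for (const next of adj.get(cur.node) ?? []) {
      // A named symbol resets the counter; an unnamed one increments it.
      const unnamed = named.has(next) ? 0 : cur.unnamed + 1;
      if (unnamed > 1) continue; // would wander into a god-function's fan-out
      const key = `${next}|${unnamed}`;
      if (seen.has(key)) continue;
      seen.add(key);
      queue.push({ node: next, unnamed, path: [...cur.path, next] });
    }
  }
  return null;
}
```

Tracking the unnamed-hop count as part of the visited key lets a node be reached both through a bridge and directly from a named symbol, which a plain visited set would rule out.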
+ +## Repo state +- branch `architectural-improvements`, last commit `bafae81 feat(mcp): codegraph_explore surfaces the execution flow from its named symbols`. +- uncommitted: clean (only untracked `.claude/handoffs/`). +- 6 session commits: `eab5cf3` self-sufficient trace + `CODEGRAPH_MCP_TOOLS` allowlist · `a6183d7` research log + arms harness · `bde8c19` node/trace line numbers · `98baf41` Java/Kotlin interface→impl synthesizer · `6f3c468` playbook · `bafae81` explore-surfaces-flow. +- NOT pushed/merged. No version bump. CHANGELOG `[Unreleased]` has all of it. + +## Open threads / TODO +- [ ] **User's open question** (answer in the next turn): better tool-description *examples* vs a *query-builder tool* vs keep adapting the used tool. Evidence favors the last. +- [x] explore-flow reliability: now resolves QUALIFIED tokens (`Class.method`) — the agent's most precise input was being dropped by the file-ext strip (`2765c3c`). spring-halo's publish flow stays absent on purpose — it's **reactive/reconciler dispatch** (`publishPost` calls `ReactiveExtensionClient.get`/`awaitPostPublished`, not `PostService.publish`), so there's no static call chain. That's the next COVERAGE frontier (reactive runtimes — like MediatR, Vue Proxy), not an explore-flow bug. +- [ ] Ship-prep for the whole branch (this arc + the earlier framework sweep): CHANGELOG version block + `package.json` bump + PR to main. Releases go through `.github/workflows/release.yml` only — do NOT `npm publish`. +- [ ] Frontiers: MediatR (`_mediator.Send`→Handle) and Vue/Compose reactive runtimes are still unbridged dynamic dispatch. + +## Recent transcript (oldest → newest) +### Turn — "improve the A/B matrix; trace works, reads near 0 — what else?" +- Diagnosed: reads at floor, wall-clock floor = round-trips + synthesis. Built `seq-matrix.mjs`; found trace adoption 3/37. +### Turn — "do explore/context/trace compete? one tool?" 
+- Ablation arms A–E (`run-arms.sh`/`arms-F.sh` + `CODEGRAPH_MCP_TOOLS` allowlist). explore = 68% of payload, load-bearing; trace path-scoped but under-adopted; trace alone insufficient. +### Turn — "prototype body-inlining trace + A/B" +- Arm F: self-sufficient trace wins WITH append-prompt steering. But steering isn't a shippable channel. +### Turn — "port the steering + re-run" +- Arms G (3 variants) all regressed vs baseline; arm H (body-trace, no steer) ≈ baseline. Steering reverted; body-trace + line-numbers + allowlist committed. +### Turn — "tee up connectivity (Spring interface-DI)" +- Built `interfaceOverrideEdges` (Java/Kotlin interface→impl, overload-aware). Probe: 3-hop trace connects. But A/B null — agent never called trace. Committed (probe-validated, adoption-gated). +### Turn — "make context surface the flow (option 1)" +- Failed: fuzzy query → wrong-feature flows. Reverted. +### Turn — "change explore to do trace in the backend" +- WIN: explore's query is a precise symbol-bag. `buildFlowFromNamedSymbols` (co-naming segment match + ≤1 bridge). Probe perfect (Spring + excalidraw full chains); A/B: flow surfaces + modest read drop. Committed `bafae81`. +### Turn — "update memory + handoff; what about better examples / a query-builder tool?" +- This handoff + memory update. Strategic answer pending (adapt-the-tool > change-the-agent). diff --git a/.claude/handoffs/framework-coverage-sweep-2026-05-23.md b/.claude/handoffs/framework-coverage-sweep-2026-05-23.md new file mode 100644 index 00000000..3ba99a5e --- /dev/null +++ b/.claude/handoffs/framework-coverage-sweep-2026-05-23.md @@ -0,0 +1,70 @@ +--- +name: framework-coverage-sweep-2026-05-23 +date: 2026-05-23 23:59 +project: codegraph +branch: architectural-improvements +summary: Dynamic-dispatch coverage sweep COMPLETE — all 14 README frameworks + every flow-relevant language validated (measure→fix→validate→test→playbook→commit). ~37 commits pushed, suite green. 
Ship-prep (CHANGELOG + PR to main) is the only thing left. +--- + +# Handoff: Dynamic-dispatch framework/language coverage sweep (complete) + +## Resume here — read this first +**Current state:** The coverage sweep is **done**, AND a **frontier pass** closed the tractable partials. Every framework in the README's 14-row table is ✅, every flow-relevant language is validated (TS/JS, Python, Go, Java, C#, PHP, Ruby, Rust, Swift, Dart, Kotlin, Lua/Luau, Scala, C/C++), and the frontier pass added: React object data-router (literal), Next.js false-positive fix, Flask-RESTful `add_resource` (redash 6→77), Flask tuple methods + broader detection (flask-realworld 0→19), gorilla/mux confirmed. All committed/pushed to `architectural-improvements` (tree clean except untracked `.claude/handoffs/`). Full suite green (**809 passed**, 2 skipped; flaky `watcher.test.ts > debounced sync` passes on re-run). **No CHANGELOG entry exists, and the branch is not yet merged to main.** +**Immediate next step:** Ship-prep — write a CHANGELOG entry grouping the whole sweep (route resolution for Flask/FastAPI/Drupal/Rust-Axum+actix/Vapor/Spring-Kotlin/Play + React Router routing; the Python builtin-name guard, Dart method-range, and C++ inheritance foundational fixes; the flutter-build and cpp-override synthesizer channels), bump `package.json`, then open a PR to main. + +> Suggested next message: "do ship-prep: write the CHANGELOG entry covering the whole framework/language coverage sweep on this branch, bump the version, and open a PR to main" + +## Goal +Close static-extraction holes for **dynamic dispatch** across every language/framework codegraph supports, so cross-symbol flows (request→route→handler→service, state→render, virtual→override) exist in the graph and an agent answers flow questions with few codegraph calls and ~0 Read/Grep. 
Per framework/language: canonical flow `trace`s end-to-end, agent A/B shows fewer reads, no node explosion, recorded in `docs/design/dynamic-dispatch-coverage-playbook.md` (the matrix §6 + per-item notes §7). **This goal is now met; what remains is ship-prep + documented frontiers.** + +## Key findings (this session's work, all committed) +- **Routing convention is the hole in every backend** — same pattern each time: the resolver/extractor assumed one syntax. Flask (intervening `@login_required`/stacked routes), FastAPI (empty `""` path), Drupal (`claimsReference` for FQCN `_form`/single-colon controllers + contrib `detect` via composer name/type/`.info.yml`), Rust/Axum (chained `get(h).post(h2)` + namespaced `mod::handler`), actix (builder API `web::resource().route(web::get().to(h))`), Vapor (grouped `routes.grouped("x"); x.get(use:h)` — was 0 on every real app), Spring **Kotlin** (`fun` handler syntax + `.kt`), Play (extensionless `conf/routes` → controller), React Router (`` JSX). +- **Three FOUNDATIONAL fixes (broad benefit, not framework-specific):** (1) Python **bare-name builtin guard** in `src/resolution/index.ts` — a handler named `index`/`get`/`update` was filtered as a builtin method; mirror the dotted-branch `knownNames` guard. (2) **Dart method-range** in `src/extraction/tree-sitter.ts` `createNode` — Dart bodies are SIBLINGS of the signature, so methods were `end==start` (signature-only); extend `endLine` to the resolved body (guarded, child-body grammars no-op). (3) **C++ inheritance** — `extractInheritance` handled `base_clause` (PHP) but not C++ `base_class_clause`; added it (leveldb extends 219→298). +- **Two new synthesizer channels** in `src/resolution/callback-synthesizer.ts` (Dart analog + C++ analog of react-render): `flutter-build` (a State method calling `setState(` → `build`) and `cpp-override` (base virtual method → subclass override of same name, gated to C++). 
+- **measure-first repeatedly split "needs work" from "already covered":** Svelte, NestJS (prior), and this session **Lua/Luau** (module dispatch already resolves) + **Compose** (composition is plain function calls, already static) needed NO code. The assumed hole wasn't real. +- **`claimsReference` pre-filter is the recurring gotcha** (`src/resolution/index.ts:497-503`): a route ref naming no declared symbol (FQCN, `Controller@method`, `controller#action`, `Class.method`) is dropped before `framework.resolve()` runs. Added for Drupal + Play this session. + +## Gotchas +- **`claimsReference`:** if a new framework's route refs don't resolve despite a correct `resolve()`, it's the pre-filter — add `claimsReference`. +- **Reindex picks up resolver changes only on a CLEAN index:** `codegraph index` is incremental (skips unchanged files); after `npm run build`, do `rm -rf .codegraph && codegraph init -i` to re-extract. The init message's edge count is contains-only (~misleading); query the DB for the real count. +- **Extraction changes are high blast radius** (shared `createNode`/`extractInheritance`): re-check node counts on control repos (excalidraw 9,290 / django 302) — the Dart/C++ fixes are guarded to only-extend / C++-only, controls unchanged. +- **Play `conf/routes` is extensionless** → needed `isPlayRoutesFile` opt-in in `grammars.ts` (isSourceFile + detectLanguage→'yaml' no-grammar path). Narrow match, only ADDS Play files. +- **Flaky:** `watcher.test.ts > debounced sync > should trigger sync after file change` — timing-based, passes on re-run; unrelated to any of this work. +- **Foreground `sleep` is blocked** in Bash → background A/B batches (`run_in_background: true`), read the task output file. zsh quirks: quote globs (`'*.vue'`); SQL `count(*)` in `$(...)` needs care with quotes. +- Global `codegraph` is npm-linked to this repo's `dist/`; `npm run build` then reindex. 
A/B harness: `scripts/agent-eval/run-all.sh "" headless` (with vs empty MCP), parse via `node scripts/agent-eval/parse-run.mjs`. + +## How to test & validate (the per-framework loop) +- Corpus in `/tmp/codegraph-corpus/` (clone S/M/L, `git clone --depth 1`). Index: `rm -rf .codegraph && codegraph init -i`. +- Measure holes: `sqlite3 .codegraph/codegraph.db "select count(*) from nodes where kind='route'"` + route→handler edges (`join edges on source where kind='references'`). Node-count before/after (no explosion). +- Flow: `node scripts/agent-eval/probe-node.mjs ` (shows Called-by/Calls trail) / `probe-trace.mjs `. +- Agent A/B (≥2 runs/arm, variance is real): `run-all.sh` headless, record Read/Grep/duration/codegraph. Pass = fewer reads with codegraph. +- Tests: `npm test` (vitest). Resolver extract tests in `__tests__/frameworks.test.ts`; end-to-end in `__tests__/frameworks-integration.test.ts` (real CodeGraph + indexAll); Dart range in `__tests__/extraction.test.ts`; Drupal in `__tests__/drupal.test.ts`. + +## Repo state +- branch `architectural-improvements`, last commit `42a0178 docs(playbook): record frontier pass; test(go): gorilla/mux`. +- uncommitted: clean (only untracked `.claude/handoffs/`). +- ~37 commits total on the branch (handoff's original 11 frameworks + this session's: Flask/FastAPI, Drupal, Rust/Axum, Vapor, React Router, actix, Dart, Kotlin, Lua, Scala/Play, C/C++ — each a feat + a docs(playbook) commit; Lua was docs-only). + +## Open threads / TODO +- [ ] **SHIP-PREP (the only blocker to merge):** CHANGELOG entry for the whole sweep, `package.json` bump, PR to main. Releases go through `.github/workflows/release.yml` only — do NOT `npm publish` (see CLAUDE.md). +- [x] **Frontier pass DONE (commits 0456915, 03e49ab, 42a0178):** React object data-router (literal), Next.js false-positive fix, Flask-RESTful `add_resource`, Flask tuple methods + detection, gorilla/mux confirmed. 
+- [ ] **Frontiers LEFT (deliberately, with rationale in playbook §7 "Frontier pass"):** anonymous/inline closures (def-use frontier), metaprogramming finders (AR/Eloquent/JPA/EF), reactive runtimes (Vue Proxy / Compose recomposition), Akka actors, C callback-struct 422-way fan-out, C++ pure-virtual base methods, React lazy data-router (variable paths + lazy imports), Play SIRD, Nuxt-specific. Forcing these adds noise. +- [ ] Pre-existing, unrelated: Next.js `*.config.mjs` in a `pages/` dir treated as a route (false-positive found in bulletproof-react). + +## Recent transcript (oldest → newest, this session) +### Turn — "what's left / what's next on coverage" → did Flask/FastAPI +- 3 holes: Flask intervening/stacked decorators, FastAPI empty path, **Python bare-name builtin guard** (handlers named `index`/`get` filtered). microblog 6→27, realworld 12→20, dispatch 290/290. Fixed 6 stale Laravel/Rails tests too. Committed + pushed. +### Turn — "Drupal next" +- `claimsReference` for FQCN/_form/single-colon controllers + contrib `detect` (composer type/name + `.info.yml`). core 536→731 (87%), admin_toolbar 0→14. OOP `#[Hook]` = frontier. Committed. +### Turn — "Rust: Axum/actix/Rocket" +- Axum chained methods + namespaced handlers (realworld 12→19, 19/19); Rocket already 99%; **actix builder API** `web::resource().route(web::get().to())` (examples 51→128). Committed (2 commits: axum, then actix). +### Turn — "Vapor (Swift)" +- Resolver was 0-routes on every real app; rewrote for any receiver + optional non-string paths + `.grouped` prefix tracking + `use:` discriminator. template 0→3, SteamPress 0→27, SPI 0→14. Committed. +### Turn — "2, 3, 4" (React Router, actix [done above], Dart/Flutter) +- React Router `` JSX (react-realworld 0→10). Dart/Flutter: **method-range fix** (foundational) + `flutter-build` setState→build synthesizer. Committed. 
+### Turn — "Kotlin next" +- Spring resolver `['java']`→`['java','kotlin']` + `fun` handler regex (petclinic-kotlin 0→18, 18/18; Java unchanged 19/19). Compose composition already static. Committed. +### Turn — "Lua/Luau, Scala, C/C++ (Lua first, but do all three)" +- **Lua:** measure-first → module dispatch already covered (telescope 335 cross-file calls); no code change, validated. **Scala/Play:** `conf/routes` file-walk opt-in + Play resolver (computer-database 0→8). **C/C++:** general dispatch strong (redis 29k); fixed C++ `base_class_clause` inheritance + `cpp-override` synthesizer (leveldb 12 precise). All committed + pushed. +### Turn — "wrap up + refresh handoff" +- This handoff. Sweep complete; ship-prep (CHANGELOG + PR) is the remaining work. diff --git a/.claude/skills/add-lang/SKILL.md b/.claude/skills/add-lang/SKILL.md new file mode 100644 index 00000000..37cbdce5 --- /dev/null +++ b/.claude/skills/add-lang/SKILL.md @@ -0,0 +1,219 @@ +--- +name: add-lang +description: Add tree-sitter language support to codegraph end-to-end — wire the grammar + extractor, write tests, then benchmark extraction quality and retrieval value on 3 popular real-world repos. Use when the user runs /add-lang or asks to add/support a new language (e.g. Lua, Elixir, Zig, OCaml) in codegraph. +--- + +# Add a language to CodeGraph + +Wire a new tree-sitter language into codegraph's extraction pipeline, prove it +extracts real symbols on popular repos, and prove it beats no-codegraph for an +agent. Runs **fully autonomously** — pick repos, benchmark, update docs, then +report. **Never commit, push, publish, or tag** (house rule); leave all changes +for the user to review. + +The argument is the language token used throughout the `Language` union, e.g. +`lua`, `elixir`, `zig`. If none was given, ask which language. Use the lowercase +single-token form everywhere (`csharp`, not `c#`). + +## Prerequisites +- Run from the codegraph repo root. 
`node`, `git`, `gh`, and a logged-in + `claude` CLI (the benchmark spawns real `claude -p` runs). +- The benchmark uses the local dev build — Step 8 builds + links it on PATH. + +## Workflow + +Copy this checklist and work through it in order: +``` +- [ ] 1. Resolve language; bail early if already supported (just benchmark) +- [ ] 2. Find a grammar + health-check it (ABI / heap corruption) +- [ ] 3. Discover the grammar's AST node types (dump-ast.mjs) +- [ ] 4. Wire the language (4 files; sometimes a 5th core touch) +- [ ] 5. Build + verify-extraction loop until PASS +- [ ] 6. Add extraction tests; make them green +- [ ] 7. Auto-pick 3 popular repos by size tier; add to corpus.json +- [ ] 8. Benchmark all 3: extraction + with/without A/B +- [ ] 9. Update README + CHANGELOG +- [ ] 10. Report; do NOT commit +``` + +### Step 1 — Resolve + short-circuit + +Check whether the language is already wired: look for the token in the +`LANGUAGES` const (`src/types.ts`) and the `EXTRACTORS` map +(`src/extraction/languages/index.ts`). If it is already supported (e.g. +`typescript`, `rust`), **skip Steps 2–6** and go straight to benchmarking +(Steps 7–8) to validate/measure it — note in the report that no code changed. + +### Step 2 — Find a grammar, then health-check it + +```bash +ls node_modules/tree-sitter-wasms/out/ | grep -i # csharp -> c_sharp +``` +- **Present** → likely off-the-shelf; `grammars.ts` resolves it from + `tree-sitter-wasms` automatically. (Many languages: elixir, zig, ocaml, + solidity, toml, yaml, …) +- **Absent** → vendor a `.wasm` into `src/extraction/wasm/` (like `pascal` / + `scala` / `lua`) and add the token to the vendored branch in Step 4. + +**Always health-check before writing an extractor — a *present* grammar can +still be unusable:** +```bash +node scripts/add-lang/check-grammar.mjs path/to/valid-sample. +``` +It prints the grammar's ABI version and parses a valid sample many times in a +multi-grammar runtime. 
If it **FAILs** (ERROR trees on valid code — an old ABI +corrupting the shared WASM heap, which silently drops nested calls/imports on +every file after the first; e.g. the tree-sitter-wasms **Lua** grammar is ABI 13 +and fails), do NOT use that wasm. **Vendor a newer (ABI 14/15) build instead:** +```bash +npm pack @tree-sitter-grammars/tree-sitter- # often ships a prebuilt *.wasm +# or build one: npx tree-sitter build --wasm (needs Docker/emscripten) +cp .wasm src/extraction/wasm/tree-sitter-.wasm +``` +then add the token to the vendored branch in Step 4 and re-run check-grammar on +the vendored path until it PASSes. **If you cannot obtain a healthy wasm, STOP +and tell the user.** + +### Step 3 — Discover AST node types + +Get a representative source file (write a small sample covering functions, +classes/structs, imports, enums; or `curl` a raw file from a known repo), then: +```bash +node scripts/add-lang/dump-ast.mjs path/to/sample. +# vendored grammar: pass the wasm path instead of the token +node scripts/add-lang/dump-ast.mjs src/extraction/wasm/tree-sitter-.wasm sample. +``` +The frequency table + field names (`name:`, `parameters:`, `body:`, +`return_type:`) tell you what to map. Open the existing extractor closest to the +language's paradigm as a model: `rust.ts`/`scala.ts` (functional, traits), +`java.ts`/`csharp.ts` (OO), `python.ts`/`ruby.ts` (scripting), `go.ts` +(top-level methods + receivers). + +### Step 4 — Wire the language (4 files) + +These are exact, fragile wiring — match the existing style precisely: + +1. **`src/types.ts`** — TWO edits: + - add `'',` to the `LANGUAGES` const (before `'unknown'`); + - add `'**/*.',` to `DEFAULT_CONFIG.include`. **Don't skip this** — it's + the file-scan allowlist; without the glob, `codegraph init` finds **0 + files** even though detection/extraction are wired. +2. 
**`src/extraction/grammars.ts`** — three maps: + - `WASM_GRAMMAR_FILES`: `: 'tree-sitter-.wasm',` + - `EXTENSION_MAP`: each file extension → `''` (e.g. `'.lua': 'lua',`) + - `getLanguageDisplayName`: `: '',` + - **vendored only**: add `` to the + `(lang === 'pascal' || lang === 'scala' || …)` wasm-path branch. +3. **`src/extraction/languages/.ts`** — new file exporting + `export const Extractor: LanguageExtractor = { … }`. Map the node types + from Step 3. Required fields: `functionTypes`, `classTypes`, `methodTypes`, + `interfaceTypes`, `structTypes`, `enumTypes`, `typeAliasTypes`, + `importTypes`, `callTypes`, `variableTypes`, `nameField`, `bodyField`, + `paramsField`. Add hooks as the grammar needs them (`getSignature`, + `getVisibility`, `isExported`, `extractImport`, `visitNode`, `getReceiverType`, + `interfaceKind`, `enumMemberTypes`, etc. — see + `src/extraction/tree-sitter-types.ts`). +4. **`src/extraction/languages/index.ts`** — `import { Extractor } from + './';` and add `: Extractor,` to `EXTRACTORS`. + +**Sometimes a 5th, core touch in `src/extraction/tree-sitter.ts`** — variable +extraction has per-language branches in `extractVariable` (the generic fallback +only finds direct `identifier`/`variable_declarator` children). If the grammar +nests declared names (e.g. Lua's `variable_declaration → variable_list`), add a +`} else if (this.language === '')` branch there, mirroring the existing +ts/python/go ones. Import forms that aren't a distinct node (Lua/Ruby `require` +is a *call*) are handled in the extractor's `visitNode` hook instead. 
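
As a rough picture of what the Step 4 extractor file might contain, the sketch below lists the required fields named above. It rests on assumptions: `ExtractorSketch` is a hypothetical stand-in for the real `LanguageExtractor` type (whose actual field shapes live in `src/extraction/tree-sitter-types.ts` and are not shown here), the `string[]`/`string` shapes are guesses, and the Lua-like node-type names are illustrative only and should be confirmed with `dump-ast.mjs`.

```typescript
// Hypothetical stand-in for LanguageExtractor; only the field NAMES come from
// the Step 4 checklist. The string[] / string shapes are assumptions.
interface ExtractorSketch {
  functionTypes: string[];
  classTypes: string[];
  methodTypes: string[];
  interfaceTypes: string[];
  structTypes: string[];
  enumTypes: string[];
  typeAliasTypes: string[];
  importTypes: string[];
  callTypes: string[];
  variableTypes: string[];
  nameField: string;
  bodyField: string;
  paramsField: string;
}

// Illustrative Lua-like mapping. Node-type names are assumptions to verify
// against the frequency table printed by dump-ast.mjs.
const luaExtractorSketch: ExtractorSketch = {
  functionTypes: ["function_declaration"],
  classTypes: [],
  methodTypes: [],
  interfaceTypes: [],
  structTypes: [],
  enumTypes: [],
  typeAliasTypes: [],
  importTypes: [], // `require` is a call, so it would be handled in a visitNode hook
  callTypes: ["function_call"],
  variableTypes: ["variable_declaration"],
  nameField: "name",
  bodyField: "body",
  paramsField: "parameters",
};

// List the symbol categories that have no node types mapped yet.
function unmappedCategories(e: ExtractorSketch): string[] {
  return (Object.keys(e) as (keyof ExtractorSketch)[]).filter((k) => {
    const v = e[k];
    return Array.isArray(v) && v.length === 0;
  });
}
```

A helper like `unmappedCategories` is one way to turn an empty mapping into a visible follow-up for the Step 10 report instead of a silent gap.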
+ +### Step 5 — Build + verify loop + +```bash +npm run build # tsc + copy-assets (copies any vendored *.wasm into dist/) +``` +Index a small sample repo and check extraction: +```bash +( cd && codegraph init -i ) +node scripts/add-lang/verify-extraction.mjs +``` +`verify-extraction.mjs` fails (exit 1) if the language isn't detected or only +`file`/`import` nodes were produced — the classic symptom of wrong node-type +names. On FAIL or a thin WARN: re-run `dump-ast.mjs` on a richer file, fix the +mappings in `.ts`, `npm run build`, re-index, re-verify. **Repeat until +PASS.** + +### Step 6 — Tests + +Add to `__tests__/extraction.test.ts`, modeled on the `Rust Extraction` block: +- a `detectLanguage` assertion in `describe('Language Detection')` +- a `describe(' Extraction')` block asserting functions/classes/imports + are extracted from an inline source string. +```bash +npx vitest run __tests__/extraction.test.ts +``` +Green before continuing. + +### Step 7 — Auto-pick 3 repos + corpus + +Pick **without asking**. Find candidates, then curate 3 that are genuinely +``-dominant, one per size tier: +```bash +gh search repos --language= --sort=stars --limit 40 \ + --json fullName,stargazerCount,description +``` +Tiers (match `corpus.json`): **Small** <~150 files · **Medium** ~150–1500 · +**Large** >~1500. Skip repos that are tagged `` but mostly another +language. Write one cross-file architecture **question** per repo (the kind that +needs tracing across files). Add a `""` block to +`.claude/skills/agent-eval/corpus.json` (fields: `name`, `repo`, `size`, +`files`, `question`) so `/agent-eval` can reuse them. 
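
The size tiers and the `corpus.json` entry shape from Step 7 can be sketched as follows. The thresholds are approximate in the text (`<~150`, `~150–1500`, `>~1500`), so the exact cut-offs below are an assumption, and `sizeTier` and `makeCorpusEntry` are hypothetical helpers, not part of the repo's scripts.

```typescript
type SizeTier = "Small" | "Medium" | "Large";

// Fields match the corpus.json entry description: name, repo, size, files, question.
interface CorpusEntry {
  name: string;
  repo: string;
  size: SizeTier;
  files: string; // approximate count, written like "~600"
  question: string;
}

// Assumed cut-offs: the skill gives them only approximately.
function sizeTier(fileCount: number): SizeTier {
  if (fileCount < 150) return "Small";
  if (fileCount <= 1500) return "Medium";
  return "Large";
}

// Build one corpus entry from a measured file count.
function makeCorpusEntry(
  name: string,
  repo: string,
  fileCount: number,
  question: string
): CorpusEntry {
  return { name, repo, size: sizeTier(fileCount), files: `~${fileCount}`, question };
}

// Example using a repo that appears in the existing corpus.
const entry = makeCorpusEntry(
  "excalidraw",
  "https://github.com/excalidraw/excalidraw",
  600,
  "How does Excalidraw render and update canvas elements?"
);
console.log(JSON.stringify(entry, null, 2));
```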
+ +### Step 8 — Benchmark all 3 (extraction + A/B) + +Make the dev build the codegraph on PATH **once**, then loop: +```bash +npm run build && ./scripts/local-install.sh +scripts/add-lang/bench.sh "" headless # ×3 +``` +`bench.sh` clones (shared `/tmp/codegraph-corpus`), wipes + indexes, runs +`verify-extraction.mjs`, then the with/without retrieval A/B via +`scripts/agent-eval/run-all.sh` (skips the paid A/B if extraction is broken). +Read each `parse-run.mjs` summary printed by `run-all.sh`: tool calls, file +`Read`s, Grep/Bash, codegraph-tool calls, duration, and **cost** — for both the +`with` and `without` arms. After the loop, restore the dev link if needed: +`./scripts/local-install.sh`. + +### Step 9 — Docs + CHANGELOG + +- **README.md**: add `` to the "19+ Languages" feature bullet, and add a + row to the **Supported Languages** table: + `| | \`.ext\` | Full support (classes, methods, …) |`. +- **CHANGELOG.md**: add an `## [Unreleased]` section at the top (above the + latest version) with `### Added` → a user-perspective bullet, e.g. + *"CodeGraph now indexes **** (`.ext`) — functions, classes, imports, and + call edges."* If `## [Unreleased]` already exists, append under it. (It's + folded into the next versioned block at release time.) + +### Step 10 — Report (do NOT commit) + +Summarize for review: +- **Files changed**: the 4 wiring edits + new extractor + tests + README + + CHANGELOG + corpus.json (+ any vendored `.wasm`). +- **Extraction** per repo: files / nodes / edges / `verify-extraction` result. +- **A/B** per repo: `with` vs `without` (tool calls, file Reads, cost) and a + one-line verdict — did codegraph reduce effort, and did both arms reach a + correct answer? +- **Gaps / follow-ups** (node types not yet mapped, resolution edges missing, + framework routes, etc.). + +Hand the changes to the user. **Do not** run `git commit`/`push` or publish — +releases go through the GitHub Actions Release workflow. 
+ +## Notes +- The A/B spawns real **paid** `claude -p` runs (opus, `--max-budget-usd`), + 2 arms × 3 repos. The corpus dir `/tmp/codegraph-corpus` is shared with + `/agent-eval`, so clones are reused across runs. +- Any new `*.wasm` must live in `src/extraction/wasm/` — `copy-assets` (run by + `npm run build`) ships it; otherwise it won't be in `dist/`. +- An index must be served by the **same** binary that built it. Step 8 builds + + links the dev build first, so this holds. +- If a grammar can't be obtained, or extraction can't reach PASS, **STOP and + report** — don't ship a half-wired language. diff --git a/.claude/skills/agent-eval/SKILL.md b/.claude/skills/agent-eval/SKILL.md new file mode 100644 index 00000000..2e894a75 --- /dev/null +++ b/.claude/skills/agent-eval/SKILL.md @@ -0,0 +1,74 @@ +--- +name: agent-eval +description: Benchmark CodeGraph retrieval quality on a real codebase by comparing agent behavior with vs without CodeGraph. Use when the user runs /agent-eval or asks to test, benchmark, audit, or validate a codegraph version (the local dev build or a published npm version) against a language's repo. +--- + +# CodeGraph Quality Audit + +Measures how much CodeGraph helps an agent versus plain grep/read, for a chosen +codegraph version on a chosen real-world repo. Drives the harness in +`scripts/agent-eval/`. + +## Prerequisites +- `tmux` 3+, a logged-in `claude` CLI, `node`, `git` (macOS/Linux). +- Run from the codegraph repo root. + +## Workflow + +Copy this checklist: +``` +- [ ] 1. Pick version (local or npm) +- [ ] 2. Pick language +- [ ] 3. Pick repo by size +- [ ] 4. Pick harness (headless / tmux / both) +- [ ] 5. Run audit.sh in the background +- [ ] 6. Report results +``` + +**Step 1 — version.** Ask with `AskUserQuestion`: which codegraph version to test. +Offer "Local dev build" and "Latest published"; the free-text "Other" lets the +user type a specific version (e.g. `0.7.10`). 
Map the answer to a VERSION token: +- "Local dev build" → `local` +- "Latest published" → `latest` +- a typed version → that string (e.g. `0.7.10`) + +**Step 2 — language.** Read `.claude/skills/agent-eval/corpus.json`. Ask with +`AskUserQuestion` which language to test, listing the languages that have entries. + +**Step 3 — repo.** From the chosen language's entries, ask which repo. Label each +option with its size and file count, e.g. `excalidraw — Medium (~600 files)`. +Each entry carries the `repo` URL and a representative `question`. + +**Step 4 — harness.** Ask with `AskUserQuestion` which harness to run, and map +the answer to a MODE token: +- "Headless" → `headless` — `claude -p` with stream-json: exact tokens/cost and a + clean tool sequence (2 runs, fast, no TTY). +- "Interactive (tmux)" → `tmux` — drives the real Claude TUI in tmux: faithful + Explore-subagent behavior, metrics from session logs (2 runs, slower). +- "Both" → `all` — headless + interactive (4 runs). + +**Step 5 — run.** Launch in the background (sets the version, clones if missing, +wipes + re-indexes, runs the chosen arms — several minutes): +```bash +scripts/agent-eval/audit.sh "" +``` + +**Step 6 — report.** When the job finishes, read the log and report per arm: +- Headless (`parse-run.mjs`): total tool calls, file `Read`s, Grep/Bash, + codegraph-tool calls, duration, **total cost**. +- Interactive (`parse-session.mjs`): the `VERDICT: codegraph_explore used Nx | + Read N | Grep/Bash N` and `TOKENS:` lines. + +Lead with cost + tool/Read counts — they are the reliable signals; raw token +in/out are confounded by subagent delegation and prompt caching. State whether +codegraph reduced effort and whether both arms reached a correct answer. + +## Notes +- The index is rebuilt every run (`audit.sh` wipes `.codegraph`) — different + versions extract differently, so an index must be served by the same binary + that built it. 
+- `audit.sh` temporarily mutates the global `codegraph` install for the test, + then restores your dev link via `local-install.sh`. +- Corpus repos are cloned to `/tmp/codegraph-corpus` (reused if already present). +- Add or edit repos in `corpus.json` (fields: `name`, `repo`, `size`, `files`, + `question`). diff --git a/.claude/skills/agent-eval/corpus.json b/.claude/skills/agent-eval/corpus.json new file mode 100644 index 00000000..3dcc8752 --- /dev/null +++ b/.claude/skills/agent-eval/corpus.json @@ -0,0 +1,73 @@ +{ + "_comment": "Test corpus for /agent-eval. Add entries freely. size: Small (<~150 files), Medium (~150-1500), Large (>~1500). 'question' is a representative architectural question that exercises cross-file understanding.", + "TypeScript": [ + { "name": "ky", "repo": "https://github.com/sindresorhus/ky", "size": "Small", "files": "~25", "question": "How does ky implement request retries and timeouts?" }, + { "name": "excalidraw", "repo": "https://github.com/excalidraw/excalidraw", "size": "Medium", "files": "~600", "question": "How does Excalidraw render and update canvas elements?" }, + { "name": "vscode", "repo": "https://github.com/microsoft/vscode", "size": "Large", "files": "~10000", "question": "How does the extension host communicate with the main process?" } + ], + "JavaScript": [ + { "name": "express", "repo": "https://github.com/expressjs/express", "size": "Small", "files": "~50", "question": "How does Express route a request through its middleware stack?" } + ], + "Go": [ + { "name": "cobra", "repo": "https://github.com/spf13/cobra", "size": "Small", "files": "~50", "question": "How does cobra parse commands and flags?" }, + { "name": "gin", "repo": "https://github.com/gin-gonic/gin", "size": "Medium", "files": "~150", "question": "How does gin route requests through its middleware chain?" 
}, + { "name": "terraform", "repo": "https://github.com/hashicorp/terraform", "size": "Large", "files": "~4000", "question": "How does Terraform build and walk the resource dependency graph?" } + ], + "Python": [ + { "name": "click", "repo": "https://github.com/pallets/click", "size": "Small", "files": "~60", "question": "How does click parse command-line arguments into commands?" }, + { "name": "flask", "repo": "https://github.com/pallets/flask", "size": "Medium", "files": "~90", "question": "How does Flask dispatch a request to a view function?" }, + { "name": "django", "repo": "https://github.com/django/django", "size": "Large", "files": "~2700", "question": "How does Django's ORM build and execute a query from a QuerySet?" } + ], + "Rust": [ + { "name": "clap", "repo": "https://github.com/clap-rs/clap", "size": "Medium", "files": "~200", "question": "How does clap parse arguments against a derived command definition?" }, + { "name": "tokio", "repo": "https://github.com/tokio-rs/tokio", "size": "Large", "files": "~700", "question": "How does tokio schedule and run async tasks on its runtime?" }, + { "name": "deno", "repo": "https://github.com/denoland/deno", "size": "Large", "files": "~1500", "question": "How does Deno load and execute a TypeScript module?" } + ], + "Java": [ + { "name": "gson", "repo": "https://github.com/google/gson", "size": "Medium", "files": "~200", "question": "How does Gson serialize an object to JSON?" }, + { "name": "okhttp", "repo": "https://github.com/square/okhttp", "size": "Medium", "files": "~640", "question": "How does OkHttp process a request through its interceptor chain?" }, + { "name": "guava", "repo": "https://github.com/google/guava", "size": "Large", "files": "~3000", "question": "How does Guava's CacheBuilder build and configure a cache?" } + ], + "Kotlin": [ + { "name": "koin", "repo": "https://github.com/InsertKoinIO/koin", "size": "Medium", "files": "~300", "question": "How does Koin resolve and inject dependencies?" 
}, + { "name": "leakcanary", "repo": "https://github.com/square/leakcanary", "size": "Medium", "files": "~250", "question": "How does LeakCanary detect and analyze a memory leak?" } + ], + "Swift": [ + { "name": "alamofire", "repo": "https://github.com/Alamofire/Alamofire", "size": "Small", "files": "~100", "question": "How does Alamofire build, send, and validate a request?" } + ], + "C#": [ + { "name": "serilog", "repo": "https://github.com/serilog/serilog", "size": "Medium", "files": "~250", "question": "How does Serilog route a log event to its sinks?" }, + { "name": "jellyfin", "repo": "https://github.com/jellyfin/jellyfin", "size": "Large", "files": "~2500", "question": "How does Jellyfin scan and identify items in a media library?" } + ], + "Ruby": [ + { "name": "sinatra", "repo": "https://github.com/sinatra/sinatra", "size": "Small", "files": "~60", "question": "How does Sinatra match a request to a route handler?" }, + { "name": "discourse", "repo": "https://github.com/discourse/discourse", "size": "Large", "files": "~3000", "question": "How does Discourse create and render a new post?" } + ], + "PHP": [ + { "name": "slim", "repo": "https://github.com/slimphp/Slim", "size": "Small", "files": "~80", "question": "How does Slim handle a request through its middleware?" }, + { "name": "laravel", "repo": "https://github.com/laravel/framework", "size": "Large", "files": "~3000", "question": "How does Laravel resolve and dispatch a route to a controller?" } + ], + "C": [ + { "name": "redis", "repo": "https://github.com/redis/redis", "size": "Large", "files": "~600", "question": "How does Redis parse and dispatch a client command?" } + ], + "C++": [ + { "name": "json", "repo": "https://github.com/nlohmann/json", "size": "Small", "files": "~100", "question": "How does nlohmann::json parse a JSON string into a value?" 
}, + { "name": "grpc", "repo": "https://github.com/grpc/grpc", "size": "Large", "files": "~3000", "question": "How does gRPC dispatch an incoming RPC to its handler?" } + ], + "Dart": [ + { "name": "flutter", "repo": "https://github.com/flutter/flutter", "size": "Large", "files": "~6000", "question": "How does Flutter build and lay out a widget tree?" } + ], + "Svelte": [ + { "name": "shadcn-svelte", "repo": "https://github.com/huntabyte/shadcn-svelte", "size": "Medium", "files": "~600", "question": "How do shadcn-svelte components compose and apply their styling?" } + ], + "Lua": [ + { "name": "lualine.nvim", "repo": "https://github.com/nvim-lualine/lualine.nvim", "size": "Small", "files": "~120", "question": "How does lualine assemble and render its statusline sections and components?" }, + { "name": "telescope.nvim", "repo": "https://github.com/nvim-telescope/telescope.nvim", "size": "Medium", "files": "~80", "question": "How does Telescope wire a picker to its finder, sorter, and previewer?" }, + { "name": "kong", "repo": "https://github.com/Kong/kong", "size": "Large", "files": "~1330", "question": "How does Kong execute plugins across a request's lifecycle phases?" } + ], + "Luau": [ + { "name": "Knit", "repo": "https://github.com/Sleitnick/Knit", "size": "Small", "files": "~10", "question": "How does Knit register services and expose them to clients?" }, + { "name": "vide", "repo": "https://github.com/centau/vide", "size": "Small", "files": "~40", "question": "How does vide track reactive sources and re-run effects when state changes?" }, + { "name": "Fusion", "repo": "https://github.com/dphfox/Fusion", "size": "Medium", "files": "~115", "question": "How does Fusion build and update its reactive UI graph from state objects?" 
} + ] +} diff --git a/.cursor/rules/codegraph.mdc b/.cursor/rules/codegraph.mdc new file mode 100644 index 00000000..c8616cce --- /dev/null +++ b/.cursor/rules/codegraph.mdc @@ -0,0 +1,39 @@ +--- +description: CodeGraph MCP usage guide — when to use which tool +alwaysApply: true +--- + +## CodeGraph + +This project has a CodeGraph MCP server (`codegraph_*` tools) configured. CodeGraph is a tree-sitter-parsed knowledge graph of every symbol, edge, and file. Reads are sub-millisecond and return structural information grep cannot. + +### When to prefer codegraph over native search + +Use codegraph for **structural** questions — what calls what, what would break, where is X defined, what is X's signature. Use native grep/read only for **literal text** queries (string contents, comments, log messages) or after you already have a specific file open. + +| Question | Tool | +|---|---| +| "Where is X defined?" / "Find symbol named X" | `codegraph_search` | +| "What calls function Y?" | `codegraph_callers` | +| "What does Y call?" | `codegraph_callees` | +| "How does X reach/become Y? / trace the flow from X to Y" | `codegraph_trace` (one call = the whole path, incl. callback/React/JSX dynamic hops) | +| "What would break if I changed Z?" | `codegraph_impact` | +| "Show me Y's signature / source / docstring" | `codegraph_node` | +| "Give me focused context for a task/area" | `codegraph_context` | +| "See several related symbols' source at once" | `codegraph_explore` | +| "What files exist under path/" | `codegraph_files` | +| "Is the index healthy?" | `codegraph_status` | + +### Rules of thumb + +- **Answer directly — don't delegate exploration.** For "how does X work" / architecture questions, answer with 2-3 codegraph calls: `codegraph_context` first, then ONE `codegraph_explore` for the source of the symbols it surfaces. 
For a specific **flow** ("how does X reach Y") start with `codegraph_trace` from→to — one call returns the whole path with dynamic hops bridged — then ONE `codegraph_explore` for the bodies; don't rebuild the path with `codegraph_search` + `codegraph_callers`. Codegraph IS the pre-built index, so spawning a separate file-reading sub-task/agent — or running a grep + read loop — repeats work codegraph already did and costs more for the same answer. +- **Trust codegraph results.** They come from a full AST parse. Do NOT re-verify them with grep — that's slower, less accurate, and wastes context. +- **Don't grep first** when looking up a symbol by name. `codegraph_search` is faster and returns kind + location + signature in one call. +- **Don't chain `codegraph_search` + `codegraph_node`** when you just want context — `codegraph_context` is one call. +- **Don't loop `codegraph_node` over many symbols** — one `codegraph_explore` call returns several symbols' source grouped in a single capped call, while each separate node/Read call re-reads the whole context and costs far more. +- **Index lag**: the file watcher debounces ~500ms behind writes; don't re-query immediately after editing a file in the same turn. + +### If `.codegraph/` doesn't exist + +The MCP server returns "not initialized." Ask the user: *"I notice this project doesn't have CodeGraph initialized. Want me to run `codegraph init -i` to build the index?"* + diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml new file mode 100644 index 00000000..51dea151 --- /dev/null +++ b/.github/workflows/release.yml @@ -0,0 +1,121 @@ +name: Release + +# Manually triggered ("Run workflow"). On trigger it: +# 1. reads the version from package.json, +# 2. builds a self-contained bundle for every platform (one runner — there's no +# native compilation, so cross-packaging is fine), +# 3. creates the GitHub Release (tag v) with all archives, using the +# release notes from CHANGELOG.md, +# 4. 
publishes the npm thin-installer (shim + per-platform packages). +# +# Before triggering: bump package.json and make sure CHANGELOG.md has the matching +# section (## [], or ## [Unreleased]). Set the NPM_TOKEN repo secret. +on: + workflow_dispatch: {} + +permissions: + contents: write # create the GitHub Release + tag + +jobs: + release: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v6 + - uses: actions/setup-node@v6 + with: + node-version: 22 + registry-url: https://registry.npmjs.org + - run: npm ci + - name: Ensure zip/unzip + run: sudo apt-get update -qq && sudo apt-get install -y -qq zip unzip + + - name: Build all platform bundles + run: | + for t in darwin-arm64 darwin-x64 linux-x64 linux-arm64 win32-x64 win32-arm64; do + bash scripts/build-bundle.sh "$t" + done + ls -lh release + + - name: Generate SHA256SUMS + # Published as a release asset; the npm launcher verifies downloaded + # bundles against it (basenames only, so its path.basename match works). + run: | + ( cd release && sha256sum codegraph-* > SHA256SUMS ) + cat release/SHA256SUMS + + - name: Resolve version + id: ver + run: echo "version=$(node -p "require('./package.json').version")" >> "$GITHUB_OUTPUT" + + - name: Release notes from CHANGELOG.md + run: | + V="${{ steps.ver.outputs.version }}" + node scripts/extract-release-notes.mjs "$V" > notes.md 2>/dev/null \ + || node scripts/extract-release-notes.mjs Unreleased > notes.md 2>/dev/null || true + if [ ! -s notes.md ]; then + echo "::error::No release notes in CHANGELOG.md for [$V] or [Unreleased]." + exit 1 + fi + echo "----- release notes -----"; cat notes.md + + - name: Create GitHub Release + env: + GH_TOKEN: ${{ github.token }} + run: | + TAG="v${{ steps.ver.outputs.version }}" + # Idempotent: create the release once, otherwise (re-run) refresh assets. 
+ if gh release view "$TAG" >/dev/null 2>&1; then + gh release upload "$TAG" release/codegraph-* release/SHA256SUMS --clobber + else + gh release create "$TAG" release/codegraph-* release/SHA256SUMS --title "$TAG" --notes-file notes.md + fi + + - name: Publish to npm + env: + NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }} + run: | + V="${{ steps.ver.outputs.version }}" + bash scripts/pack-npm.sh "$V" + # Platform packages first, then the main shim (which depends on them). + # Skip any already on the registry so a re-run only fills in gaps. + for dir in release/npm/codegraph-* release/npm/main; do + name=$(node -p "require('./$dir/package.json').name") + if npm view "$name@$V" version >/dev/null 2>&1; then + echo "skip $name@$V (already published)" + else + echo "publishing $name@$V" + ( cd "$dir" && npm publish --access public ) + fi + done + + - name: Verify every package is actually on the registry + run: | + V="${{ steps.ver.outputs.version }}" + # npm publish can print success without persisting; confirm against the + # registry (with retries for propagation) so green means really shipped. + for dir in release/npm/codegraph-* release/npm/main; do + name=$(node -p "require('./$dir/package.json').name") + ok= + for i in 1 2 3 4 5 6; do + if npm view "$name@$V" version >/dev/null 2>&1; then ok=1; break; fi + echo "waiting for $name@$V to appear ($i)…"; sleep 10 + done + [ -n "$ok" ] || { echo "::error::$name@$V never appeared on the registry"; exit 1; } + echo "verified $name@$V" + done + + - name: Sync packages to npmmirror + # npmmirror/cnpm mirror lazily and frequently never pull the per-platform + # optionalDependencies on their own, so `npm i` there fails with + # "no prebuilt bundle" (issue #303). Nudge a sync now so mirror users get + # the bundle without waiting. Best-effort — the launcher also self-heals + # from GitHub Releases — so a mirror hiccup never fails the release. 
+ continue-on-error: true + run: | + for dir in release/npm/codegraph-* release/npm/main; do + name=$(node -p "require('./$dir/package.json').name") + enc=$(node -p "encodeURIComponent(require('./$dir/package.json').name)") + echo "sync $name" + curl -s -X PUT "https://registry.npmmirror.com/-/package/$enc/syncs" || true + echo + done diff --git a/.gitignore b/.gitignore index c6bad3a1..f7aa9d68 100644 --- a/.gitignore +++ b/.gitignore @@ -18,7 +18,7 @@ dist/ Thumbs.db # Test coverage -coverage/ +/coverage/ .nyc_output/ # Environment @@ -40,6 +40,9 @@ npm-debug.log* # Local Claude settings .claude/settings.local.json +# Parallels Windows VM SSH/connection config (local machine, see CLAUDE.md) +.parallels + # CodeGraph data directories (in test projects) .codegraph/ @@ -49,3 +52,4 @@ test_frameworks test-languages/ nul +release/ diff --git a/BUNDLING.md b/BUNDLING.md new file mode 100644 index 00000000..dc21ab53 --- /dev/null +++ b/BUNDLING.md @@ -0,0 +1,74 @@ +# Distribution: self-contained bundles + +CodeGraph ships a **vendored Node runtime** alongside the app. Because Node 22.5+ +has a built-in real SQLite (`node:sqlite`, with WAL + FTS5), bundling Node means: + +- **No native build** — `better-sqlite3` is gone, so there are zero native addons + to compile or rebuild. +- **No wasm fallback** — and therefore no more `database is locked` (issue #238). +- **No Node-version dependence** — the app always runs on the bundled Node, + whatever the user has (or doesn't have) installed. 
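
To make the WAL point concrete, here is a minimal sketch that opens a throwaway database with Node's built-in `node:sqlite` module and switches it to WAL. It is not taken from CodeGraph's source: the function name and temp-file layout are illustrative, and the guarded `require` is only there so the snippet also runs on a Node that lacks (or flags) the module.

```javascript
// Minimal sketch: check that Node's built-in SQLite can run in WAL mode.
// Assumption: Node 22.5+ (some 22.x releases need --experimental-sqlite).
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Guarded load so the snippet degrades gracefully where node:sqlite is absent.
let DatabaseSync = null;
try {
  ({ DatabaseSync } = require("node:sqlite"));
} catch {
  DatabaseSync = null;
}

// Hypothetical helper: returns the journal mode SQLite reports after the
// WAL request, or null when the built-in module is not available.
function journalModeAfterEnablingWal(dbPath) {
  if (DatabaseSync === null) return null;
  const db = new DatabaseSync(dbPath);
  try {
    // The PRAGMA returns one row holding the journal mode now in effect.
    const row = db.prepare("PRAGMA journal_mode = WAL").get();
    return row.journal_mode;
  } finally {
    db.close();
  }
}

// WAL needs a real file, so use a temporary directory rather than :memory:.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "wal-demo-"));
const mode = journalModeAfterEnablingWal(path.join(dir, "demo.db"));
console.log(mode === null ? "node:sqlite unavailable" : `journal_mode = ${mode}`);
fs.rmSync(dir, { recursive: true, force: true });
```

On a runtime that ships the module, the reported mode should be `wal` for a database on a local filesystem.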
+ +## What's in a bundle + +Built by [`scripts/build-bundle.sh`](scripts/build-bundle.sh) — one archive per +platform, identical recipe (only the Node download differs): + +``` +codegraph-/ + node | node.exe # official Node runtime for + lib/ + dist/ # compiled app (+ tree-sitter .wasm grammars, schema.sql) + node_modules/ # production deps only (pure JS / wasm — portable) + bin/ + codegraph | codegraph.cmd # launcher → runs the bundled Node with the app +``` + +Targets: `darwin-arm64`, `darwin-x64`, `linux-x64`, `linux-arm64`, `win32-x64`, +`win32-arm64`. Unix targets produce `.tar.gz` (shell launcher); Windows produces +`.zip` (`node.exe` + a `.cmd` launcher). + +```bash +scripts/build-bundle.sh linux-x64 # -> release/codegraph-linux-x64.tar.gz +scripts/build-bundle.sh win32-x64 # -> release/codegraph-win32-x64.zip +``` + +Because dropping better-sqlite3 left **zero native addons**, building a bundle is +pure file-packaging — **any** target builds on **any** OS (the whole matrix builds +on one Linux runner). Cross-compilation isn't a concern; only *run-testing* a +bundle needs the target platform (or emulation, e.g. `docker run --platform +linux/amd64`). + +## Install channels (all deliver the same bundle) + +1. **`curl | sh`** ([`install.sh`](install.sh)) — no Node required; ideal for a + fresh Linux VPS over SSH. Detects os/arch, pulls the archive from GitHub + Releases, symlinks `codegraph` onto PATH. Re-run to upgrade; `--uninstall` to + remove. +2. **npm** ([`scripts/npm-shim.js`](scripts/npm-shim.js)) — preserves + `npm i -g @colbymchenry/codegraph`. The main package is a tiny shim; the + bundles ship as per-platform `optionalDependencies` + (`@colbymchenry/codegraph-` with `os`/`cpu`), so npm installs only the + matching one. The shim — run by the user's Node — execs the bundle, so the + real work runs on the bundled Node 24. Works even on old Node. 
On Windows it + invokes the bundled `node.exe` against the app entry directly (not the `.cmd` + launcher) — modern Node throws `EINVAL` when asked to spawn a `.cmd`/`.bat`. +3. **Windows** ([`install.ps1`](install.ps1)) — `irm … | iex`; same flow as + install.sh (detect arch, pull the `.zip` from Releases, add to PATH). +4. **Homebrew / Scoop** — TODO (tap + cask pointing at the Release archives). + +## Release pipeline + +[`.github/workflows/release.yml`](.github/workflows/release.yml) — manually +triggered. Reads the version from `package.json`, builds every platform bundle on +one runner, creates the GitHub Release (notes from `CHANGELOG.md`), and publishes +the npm shim + per-platform packages. Requires the `NPM_TOKEN` repo secret. + +Still TODO: +- **Code signing** — the main gap for "download & run": macOS Gatekeeper needs a + Developer ID + notarization; Windows needs Authenticode. Homebrew softens the + macOS case (handles quarantine). +- Retire the now-vestigial Node-version gate in `src/bin/codegraph.ts` — the + bundle always runs Node 24, and the npm shim does no tree-sitter work. +- Re-wire `npm uninstall` cleanup (the agent-config `preuninstall`) through the + shim — the generated main package doesn't carry it. diff --git a/CHANGELOG.md b/CHANGELOG.md new file mode 100644 index 00000000..d727e6cd --- /dev/null +++ b/CHANGELOG.md @@ -0,0 +1,647 @@ +# Changelog + +All notable changes to CodeGraph are documented here. Each entry also ships as +a [GitHub Release](https://github.com/colbymchenry/codegraph/releases) tagged +`vX.Y.Z`, which is where most people will look. + +This project follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) +and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). 
+ +## [0.9.4] - 2026-05-24 + +### Added +- **Framework-aware route resolution — `request → route → handler → service` + flows now resolve end-to-end across the supported stacks.** Added or fixed + routing for Express (inline arrow handlers → services), Rails, Spring (Java + + Kotlin; bare and class-prefixed mappings), Django/DRF (`router.register` → + ViewSet), Laravel (`Controller@method`), Flask/FastAPI (decorator stacks, + empty-path routers, Flask-RESTful `add_resource`), Gin/chi (group-var routing), + ASP.NET (feature-folder + bare attribute routes), Drupal, Rust (Axum chained + methods, actix builder API), Vapor (Swift grouped routes), Play (`conf/routes`), + Vue/Nuxt SFC templates, Svelte/SvelteKit, and React Router (`` JSX + + object data-router). +- **Dynamic-dispatch flow synthesis — `codegraph_trace`, `codegraph_callees`, and + `codegraph_explore` now follow flows that have no static call edge.** Bridged + channels: callback/observer registration, EventEmitter (`on`/`emit`), React + re-render (`setState` → `render`) and JSX children, Flutter `setState` → `build`, + C++ virtual overrides, and Java/Kotlin interface → implementation dispatch + (e.g. Spring `@Autowired svc.list()` → the impl). Each synthesized hop is + labeled inline in `trace` with where it was wired up. +- **`CODEGRAPH_MCP_TOOLS` — trim the exposed MCP tool surface.** Set it to a + comma-separated list of tool names (e.g. `trace,search,node,context`) to expose + only those codegraph tools over MCP; unset exposes all of them. Names match on + the short form, so `trace` and `codegraph_trace` are equivalent. Lets you + constrain an agent to a minimal surface (or A/B-test tool selection) without + editing the client's MCP config. Inert by default. +- **Release archives now ship with a `SHA256SUMS` file**, and the npm launcher + verifies the bundle it downloads against it — a mismatch aborts before anything + runs. 
Releases published before this change have no checksum file, so the + verification is skipped (not failed) when none is available. + +### Changed +- **`codegraph_trace` now returns a self-contained flow dossier.** Each hop on + the path is shown with its full body inline (previously just the call-site + line), and the destination's own outgoing calls are appended — so one trace + call usually answers a "how does X reach Y" flow question without a follow-up + `codegraph_explore`/`codegraph_node`/Read. Measured across real repos: fewer + tool calls and lower cost than the prior path-only output, with no wall-clock + regression. +- **`codegraph_node` and `codegraph_trace` now emit line-numbered source** + (`cat -n` style, matching `codegraph_explore` and Read), so an agent can cite + or edit exact lines without re-reading the file just to recover line numbers. +- **`codegraph_explore` now leads with the execution flow** when its query names + the symbols of a flow. Agents call `explore` far more than `trace`, passing a + bag of symbol names that usually spans the flow they're investigating + (`PmsProductController getList PmsProductService list PmsProductServiceImpl`); + `explore` now finds the call path *among those named symbols* — riding + synthesized dynamic-dispatch edges (callback / React re-render / JSX child / + interface→impl) — and shows it first. So a flow question answered through + `explore` gets the trace-quality path without the agent having to switch tools. + Scoped to the named symbols (no wrong-feature wandering) and bridge-capped (no + god-function fan-out); absent when the query is fuzzy or has no connected chain. 
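
The `CODEGRAPH_MCP_TOOLS` matching rule listed under Added for this release (comma-separated names, short and `codegraph_`-prefixed forms equivalent, unset exposes everything) could look roughly like the sketch below. It is an illustration rather than the server's actual code: `exposedTools` and `shortName` are hypothetical names, the tool list is copied from the usage guide in `.cursor/rules/codegraph.mdc` and may not be exhaustive, and treating an empty string like an unset variable is an assumption.

```javascript
// Tool names as listed in the usage guide; assumed, not read from the server.
const ALL_TOOLS = [
  "codegraph_search",
  "codegraph_callers",
  "codegraph_callees",
  "codegraph_trace",
  "codegraph_impact",
  "codegraph_node",
  "codegraph_context",
  "codegraph_explore",
  "codegraph_files",
  "codegraph_status",
];

// "trace" and "codegraph_trace" both reduce to the short form "trace".
function shortName(name) {
  return name.trim().replace(/^codegraph_/, "");
}

// Hypothetical filter: undefined (or empty, by assumption) exposes every
// tool; otherwise only tools whose short name appears in the list.
function exposedTools(envValue, allTools = ALL_TOOLS) {
  if (envValue === undefined || envValue.trim() === "") return [...allTools];
  const wanted = new Set(envValue.split(",").map(shortName).filter(Boolean));
  return allTools.filter((tool) => wanted.has(shortName(tool)));
}

console.log(exposedTools("trace, codegraph_search"));
console.log(exposedTools(undefined).length);
```

Normalizing both sides to the short form keeps the comparison symmetric, so it does not matter which spelling the user or the tool registry uses.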
+ +### Fixed +- **Static-extraction & resolution correctness fixes** underpinning the framework + work above: C++ inheritance (`base_class_clause` was unhandled, so C++ `extends` + edges were missing), Dart method body ranges (methods were extracted + signature-only), a Python builtin-name handler guard (handlers named + `index`/`get`/`update` were silently dropped), and an explore output-budget + regression that under-returned source on god-file repos. +- **Orphaned `codegraph serve --mcp` processes after a parent SIGKILL.** When + the MCP host (Claude Code, opencode, …) was force-killed — OOM killer, a + `kill -9`, a container teardown — the child kept running indefinitely on + Linux, holding inotify watches, file descriptors, and the SQLite WAL. The + kernel doesn't propagate parent death to children, and the stdin + `end`/`close` handlers we relied on don't always fire. The MCP server now + polls `process.ppid` and shuts down the moment it changes from the value + observed at startup; the poll interval is `CODEGRAPH_PPID_POLL_MS` (default + `5000`, `0` disables). Resolves + [#277](https://github.com/colbymchenry/codegraph/issues/277). + +- **`codegraph: no prebuilt bundle for ` after installing through a + registry mirror.** Installing `@colbymchenry/codegraph` from a registry that + hadn't mirrored the matching per-platform package — most often the + npmmirror/cnpm mirrors, but any lazily-syncing mirror or corporate proxy can + do it — left every command failing with `no prebuilt bundle for `. + The runtime ships as a per-platform `optionalDependency`, and npm treats an + optional package it can't fetch as a success and silently skips it, so the + bundle simply went missing. The launcher now self-heals: when the platform + bundle isn't installed, it downloads the same archive from GitHub Releases + (cached under `~/.codegraph/bundles/` for next time) and runs that — so a + global install works even on a mirror that never carried the platform package. 
+ Set `CODEGRAPH_NO_DOWNLOAD=1` to disable the network fallback, or + `CODEGRAPH_DOWNLOAD_BASE=` to point it at your own mirror of the release + archives; the standalone `install.sh` remains the no-Node alternative. Resolves + [#303](https://github.com/colbymchenry/codegraph/issues/303). +- **`install.sh` failing with `403` / "could not resolve latest version" on + shared or cloud hosts.** The standalone installer resolved the latest release + through the GitHub API, whose unauthenticated limit is 60 requests/hour per IP + — routinely exhausted on cloud devboxes and CI where many users share an + address, returning `403` (issue #325). It now resolves the version from the + `releases/latest` web redirect, which isn't rate-limited (and still falls back + to the API). `CODEGRAPH_VERSION` also accepts a bare `0.9.4` in addition to + `v0.9.4`. Resolves + [#325](https://github.com/colbymchenry/codegraph/issues/325). + +## [0.9.3] - 2026-05-22 + +### Added +- **`codegraph uninstall` command.** Cleanly removes CodeGraph from every agent + it's configured on — Claude Code, Cursor, Codex CLI, opencode, and Hermes + Agent — in one step. It asks up front whether to remove the global config + (`~/.claude`, `~/.codex`, …) or just this project's local config (no flags + required), then prints exactly which agents it touched so you can see what + changed. `--location`, `--target`, and `--yes` are accepted for scripted / + non-interactive use. It removes only what `install` wrote (MCP server entry, + instructions block, permissions) and leaves your `.codegraph/` index alone + (use `codegraph uninit` for that). Resolves + [#313](https://github.com/colbymchenry/codegraph/issues/313) — previously the + only cleanup path was an npm `preuninstall` hook that the published bundle + never shipped, so `npm uninstall -g` left every agent pointing at a CodeGraph + MCP server that no longer existed. 
+ +### Fixed +- **`Fatal process out of memory: Zone` crash while indexing large projects.** + On Node.js 22 and 24 — including CodeGraph's own bundled runtime — running + `codegraph index` / `codegraph init` on a large multi-language repo could + abort the entire process partway through parsing with + `Fatal process out of memory: Zone`, even with tens of GB of RAM free (the + failure is in a V8-internal compilation arena, not the JS heap). The cause is + V8's "turboshaft" optimizing WASM compiler exhausting its Zone budget while + compiling tree-sitter's large WebAssembly grammars on a background thread. + CodeGraph now runs with V8's `--liftoff-only`, which keeps grammar compilation + on the baseline compiler and never reaches the optimizing tier, eliminating + the crash; indexing output is otherwise unchanged. The bundled launcher passes + the flag directly, and any other launch path (from source, `npx`, a globally + linked dev build) re-execs once with it automatically. Resolves + [#298](https://github.com/colbymchenry/codegraph/issues/298) and + [#293](https://github.com/colbymchenry/codegraph/issues/293). (Node 25 stays + blocked — its variant of this V8 bug is not resolved by `--liftoff-only`.) +- **Cursor uninstall left an orphaned `.cursor/rules/codegraph.mdc`.** It + stripped the rule body but left the file and its `description: CodeGraph …` + frontmatter behind. The dedicated rules file is now deleted outright on + uninstall, while any content you added outside CodeGraph's markers is kept. + +## [0.9.2] - 2026-05-21 + +### Added +- **Installer target: Hermes Agent (Nous Research).** `codegraph install` now + supports Hermes Agent — it writes the `mcp_servers.codegraph` entry and ensures + `platform_toolsets.cli` includes `mcp-codegraph` in `$HERMES_HOME/config.yaml`, + so Hermes can drive the CodeGraph knowledge graph like the other agents. 
+- **Framework support: Drupal 8/9/10/11** — CodeGraph now detects Drupal + projects (via a `drupal/*` dependency in `composer.json`) and adds three + levels of intelligence: + - **Route extraction**: `*.routing.yml` files emit a `route` node per route, + linked by a `references` edge to the `_controller`, `_form`, or + entity-handler class/method, so querying a controller method surfaces the + URL route that binds it. + - **Hook detection**: hook implementations in `.module`, `.install`, `.theme`, + and `.inc` files are detected via docblock (`Implements hook_X()`) with a + module-name-prefix fallback. Each emits a `references` edge to the canonical + `hook_X` name so `codegraph_callers("hook_form_alter")` returns every + implementation across modules. + - **Resolution**: `_controller`/`_form` FQCNs resolve to their PHP + class/method nodes. + New `yaml`/`twig` languages are tracked at the file level, the Drupal PHP + extensions (`.module`/`.install`/`.theme`/`.inc`) are indexed with the PHP + grammar, and `web/core`, `web/modules/contrib`, `web/themes/contrib` are + excluded by default. Resolves [#268](https://github.com/colbymchenry/codegraph/issues/268). + +### Changed +- **Zero-config indexing that respects `.gitignore`.** CodeGraph no longer has a + config file. It indexes every file whose extension maps to a supported language + and honors your `.gitignore` everywhere: in git repos via git itself, and in + non-git projects (e.g. a freshly-scaffolded app before `git init`) by reading + `.gitignore` files directly — root and nested, the same way git does (via the + `ignore` library, so negation/anchoring/nested rules all behave correctly). To + keep something out of the graph, add it to `.gitignore`. **Behavior change:** + committed files that are *not* gitignored are now indexed even under `vendor/`, + `Pods/`, or a committed `dist/` — previously a hardcoded exclude list skipped + those names; now `.gitignore` is the single source of truth. 
Resolves + [#283](https://github.com/colbymchenry/codegraph/issues/283). + +### Fixed +- **Windows: `npm i -g @colbymchenry/codegraph` then any `codegraph` command + failed with `spawnSync …\codegraph.cmd EINVAL`.** The npm launcher spawned the + bundle's `.cmd` file directly, which modern Node refuses to do on Windows + (the CVE-2024-27980 hardening — seen on Node 24). The launcher now invokes the + bundled `node.exe` against the app directly, so `codegraph` works on Windows + regardless of your Node version. Resolves + [#289](https://github.com/colbymchenry/codegraph/issues/289). + +### Removed +- **`.codegraph/config.json` and the entire config surface.** Every field was + either inert or now redundant with `.gitignore`: + - `languages`/`frameworks` never affected indexing (languages are detected per + file from extensions; frameworks are auto-detected). `languages` was also + broken — its validator only knew the original 8 languages, so setting it to + anything newer (C#, PHP, Ruby, C/C++, Swift, Kotlin, Dart, Vue, Scala, Lua, …) + threw `Invalid configuration format`. + - `extractDocstrings`/`trackCallSites`/`customPatterns` were never read by any + extractor. + - `include` is now derived from the supported language extensions, `exclude` is + replaced by `.gitignore`, and `maxFileSize` (1 MB) is a constant. + + **Breaking (library API):** the `CodeGraphConfig` type, the `config` option on + `CodeGraph.init()`, and the `getConfig()`/`updateConfig()`/`getConfigPath` + exports are gone. Existing `.codegraph/config.json` files are simply ignored. + The `.codegraphignore` marker is no longer supported — use `.gitignore`. + +### Security +- **MCP session marker no longer follows symlinks** (CWE-59). 
Every + `codegraph_context` call writes a `codegraph-consulted-*` marker into the + system temp dir; the previous write followed symlinks, so on a multi-user + system another local user could pre-plant that path as a symlink and redirect + the write onto a victim-writable file. The marker is now opened with + `O_NOFOLLOW` and mode `0600`, and a planted symlink is refused rather than + followed. Resolves [#280](https://github.com/colbymchenry/codegraph/issues/280). + +## [0.9.1] - 2026-05-21 + +### Fixed +- **Standalone installers** (`curl … | sh`, `irm … | iex`): the bundled launcher + failed with `exec: …/node: not found` because it didn't resolve the symlink the + installer puts on your PATH. Installing on a machine with **no Node** now works. +- **npm**: `@colbymchenry/codegraph-linux-x64` is now published — the 0.9.0 + release silently shipped 6 of 7 packages, so `npm i -g` on linux-x64 couldn't + find its bundle. The release pipeline now verifies every package reached the + registry (and is idempotent), so a release can't pass green-but-broken again. + +[0.9.4]: https://github.com/colbymchenry/codegraph/releases/tag/v0.9.4 +[0.9.3]: https://github.com/colbymchenry/codegraph/releases/tag/v0.9.3 +[0.9.2]: https://github.com/colbymchenry/codegraph/releases/tag/v0.9.2 +[0.9.1]: https://github.com/colbymchenry/codegraph/releases/tag/v0.9.1 + +## [0.9.0] - 2026-05-21 + +### 🎉 Self-contained: CodeGraph bundles its own runtime — install anywhere, on any Node (or none) + +**No more `database is locked`. No more native build failures. No more "WASM fallback active."** + +CodeGraph used to need `better-sqlite3`, a native module compiled against your exact +Node version. When that build failed (common on Windows and locked-down machines) it +silently dropped to a slow WASM SQLite build with **no WAL** — the root cause of the +intermittent `database is locked` errors on concurrent MCP tool calls +([#238](https://github.com/colbymchenry/codegraph/issues/238)). 
That entire class of +problem is **gone**: CodeGraph now ships a self-contained Node runtime and uses Node's +built-in `node:sqlite` (real SQLite, full WAL + FTS5). + +- ✅ **Zero native compilation** — nothing to build, ever; nothing to rebuild when Node changes. +- ✅ **Runs on any Node version — or with no Node at all.** Install via the standalone installers with no Node present, or keep using `npm`/`npx` on any version (your Node only launches the bundled runtime). +- ✅ **`database is locked` fixed at the root** — real WAL means readers never block on a writer. +- ⚡ **5–10× faster** than the old WASM fallback for anyone who was stuck on it. + +```bash +# macOS / Linux — no Node required +curl -fsSL https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.sh | sh +# Windows (PowerShell) — no Node required +irm https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.ps1 | iex +# or, if you have Node (any version): +npm i -g @colbymchenry/codegraph +``` + +### Added +- **Standalone installers** — one-line install with no Node.js required: + `curl -fsSL .../install.sh | sh` (macOS/Linux) and `irm .../install.ps1 | iex` + (Windows). They fetch the matching self-contained bundle from GitHub Releases + and put `codegraph` on your PATH. +- **Lua**: CodeGraph now indexes Lua (`.lua`) — functions, methods (table `t.f` + and `t:m` definitions become methods with a `t::f` receiver-qualified name), + local variables, `require(...)` imports, and the call edges between them. + Querying a Lua project (Neovim plugins, Kong, OpenResty, game code) now + surfaces its modules, methods, and call graph. +- **Luau** ([#232](https://github.com/colbymchenry/codegraph/issues/232)): + CodeGraph now indexes Luau (`.luau`), Roblox's typed superset of Lua — + everything Lua extracts, plus `type` / `export type` aliases, typed function + signatures, generics, and Roblox instance-path `require(script.Parent.X)` + imports. 
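
The receiver-qualified naming described for Lua methods can be sketched as below. This is only an illustration, not CodeGraph's extractor: `luaQualifiedName` is a hypothetical helper, and splitting a dotted chain such as `a.b.c` on its last separator is an assumption the changelog does not confirm.

```typescript
// Hypothetical helper (not CodeGraph's real extractor): maps a Lua definition
// name such as `t.f` or `t:m` to a receiver-qualified `t::f` / `t::m` name.
// Returns null for a plain function name that has no receiver.
function luaQualifiedName(defName: string): string | null {
  // Assumption: the receiver is everything before the LAST `.` or `:`.
  const idx = Math.max(defName.lastIndexOf("."), defName.lastIndexOf(":"));
  if (idx <= 0 || idx === defName.length - 1) {
    return null; // no separator, or a malformed name with an empty side
  }
  const receiver = defName.slice(0, idx);
  const member = defName.slice(idx + 1);
  return `${receiver}::${member}`;
}
```

Under these assumptions, both `t.f` and `t:m` collapse onto the same `::` form, which is what makes a single qualified lookup style possible.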
+ +### Changed +- **SQLite backend is now Node's built-in `node:sqlite`** (real SQLite, WAL + + FTS5), shipped inside a bundled Node runtime. This fixes the concurrent-read + `database is locked` errors ([#238](https://github.com/colbymchenry/codegraph/issues/238)) + at the root and removes the native build step entirely. +- **`npm i -g` / `npx` now install a self-contained bundle.** The main package is + a tiny shim; the runtime ships as per-platform `optionalDependencies`, so the + install works on any Node version (your Node only launches the bundle). +- **`codegraph status`** now reports the effective journal mode (`wal` vs not), + so a `database is locked` report is triageable at a glance. + +### Removed +- **`better-sqlite3`** (optional native dependency) and **`node-sqlite3-wasm`** + (WASM fallback) — along with the native-build banner, the WASM fallback path, + and the no-WAL lock retries they required. The dependency tree now has zero + native addons. + +### Fixed +- **Installer**: re-running `codegraph install` now removes the broken + auto-sync hooks that pre-0.8 versions wrote to Claude Code's + `settings.json`. Those builds added a `Stop → codegraph sync-if-dirty` + hook (and a `PostToolUse → codegraph mark-dirty` partner); both + subcommands were later removed from the CLI, so Claude Code reported + `Stop hook error: ... unknown command 'sync-if-dirty'` on every turn. + The cleanup is surgical — only codegraph's own hook entries are + stripped, so unrelated hooks sharing the same file or event (e.g. a + GitKraken `gk ai hook run` hook) are left untouched — and it also runs + on uninstall, so the npm `preuninstall` step fully reverses a legacy + install. Re-run `codegraph install` once on an affected machine to + clear the error. 
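
The "surgical" cleanup described in the last entry above could look roughly like the following sketch. The `HookCommand` / `HookGroup` shapes are a simplified assumption about the hooks section of `settings.json`, and `stripLegacyHooks` is a hypothetical name; the only details taken from the changelog are the two removed subcommands (`sync-if-dirty`, `mark-dirty`) and the requirement that unrelated hooks survive.

```typescript
// Assumed, simplified shapes for a hooks section; not CodeGraph's real types.
interface HookCommand { type: string; command: string; }
interface HookGroup { matcher?: string; hooks: HookCommand[]; }
type HooksSection = Record<string, HookGroup[]>;

// The two legacy subcommands named in the changelog entry.
const LEGACY_COMMANDS: RegExp[] = [
  /\bcodegraph\s+sync-if-dirty\b/,
  /\bcodegraph\s+mark-dirty\b/,
];

function isLegacyCodegraphHook(cmd: HookCommand): boolean {
  return LEGACY_COMMANDS.some((re) => re.test(cmd.command));
}

// Hypothetical helper: removes only codegraph's legacy entries. Groups and
// events that end up empty are dropped; everything else is kept untouched.
function stripLegacyHooks(section: HooksSection): HooksSection {
  const cleaned: HooksSection = {};
  for (const [event, groups] of Object.entries(section)) {
    const keptGroups: HookGroup[] = [];
    for (const group of groups) {
      const keptHooks = group.hooks.filter((h) => !isLegacyCodegraphHook(h));
      if (keptHooks.length > 0) {
        keptGroups.push({ ...group, hooks: keptHooks });
      }
    }
    if (keptGroups.length > 0) {
      cleaned[event] = keptGroups;
    }
  }
  return cleaned;
}
```

Filtering per command rather than per group is the design choice that lets a third-party hook sharing the same `Stop` event (for example a `gk ai hook run` entry) pass through unchanged.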
+ +[0.9.0]: https://github.com/colbymchenry/codegraph/releases/tag/v0.9.0 + +## [0.8.0] - 2026-05-20 + +### Added +- **Framework routes (NestJS)**: CodeGraph now recognises NestJS projects and + emits `route` nodes — each linked by a `references` edge to its handler + method — across all four transport layers: HTTP controllers (the + `@Controller` prefix joined with `@Get`/`@Post`/`@Put`/`@Patch`/`@Delete`/ + `@Head`/`@Options`/`@All`, including empty `@Controller()`/`@Get()`), + GraphQL resolvers (`@Query`/`@Mutation`/`@Subscription`), microservice + handlers (`@MessagePattern`/`@EventPattern`), and WebSocket gateways + (`@SubscribeMessage`, prefixed with the gateway namespace). Detected + automatically from any `@nestjs/*` dependency in `package.json`. Querying a + controller method or resolver now surfaces the route that binds it. + Resolves [#220](https://github.com/colbymchenry/codegraph/issues/220). +- **MCP / explore**: `codegraph_explore` source sections now carry line + numbers (cat -n style `\t`, matching the Read tool). This lets + the agent cite `file:line` straight from the explore payload instead of + re-opening the file just to find a line number — the dominant residual + cost on precise-tracing questions. In an isolated A/B (answer a + "which exact line" question with the relevant code already in the + payload), the no-line-numbers arm spent 2 file Reads + a grep recovering + the line number while the line-numbered arm answered with zero follow-up + tool calls. Payload cost is small (~3-5%). Set + `CODEGRAPH_EXPLORE_LINENUMS=0` to disable. +- **MCP / watcher**: CodeGraph now skips the live file watcher on WSL2 + `/mnt/*` drives, where recursive `fs.watch` is slow enough to break MCP + startup (see Fixed). 
When the watcher is off, `codegraph init` / + `codegraph install` offer to keep the index fresh via git hooks + (`post-commit`, `post-merge`, `post-checkout`) that run `codegraph sync` + in the background — accept for automatic refresh on commit / pull / + checkout, or decline and sync by hand. Either way you're told the index + stays frozen until it's re-synced. New controls: `CODEGRAPH_NO_WATCH=1` + (or `codegraph serve --mcp --no-watch`) forces the watcher off anywhere; + `CODEGRAPH_FORCE_WATCH=1` overrides the WSL auto-detect when your `/mnt` + setup is actually fast. `codegraph uninit` removes any hooks it installed. + +### Changed +- **MCP / agent guidance**: CodeGraph now tells agents to answer "how does X + work" / architecture questions *directly* — `codegraph_context`, then one + `codegraph_explore` for the surfaced symbols — instead of delegating to a + file-reading sub-agent or a grep+read loop. The server instructions and the + installed instruction files (`CLAUDE.md`, `.cursor/rules/codegraph.mdc`, + `AGENTS.md`) previously suggested *spawning a sub-agent* for explore-class + questions, which produced the opposite, more expensive behavior: the + sub-agent reads files regardless of the index, so CodeGraph became overhead + stacked on top of the reads. In rigorous N≥4-per-arm benchmarks this cut the + cost of an architecture question by ~42–47% versus a no-CodeGraph agent on + medium and large repos (Excalidraw ~600 files, VS Code ~10k), with + equal-or-better, `file:line`-cited answers and ~6× fewer tool calls; on a + tiny repo (~25 files) it's a wash, since native grep is already trivially + cheap there. +- **MCP / codegraph_node**: `includeCode=true` on a class/interface/struct/enum + now returns a compact member outline (fields + method signatures + line + numbers) instead of the entire class body — which could be thousands of + characters and was rarely needed in full. 
Functions and methods still return + their full body; request a specific member for its source. +- **Minimum Node.js is now 20** (was 18). Node 18 is end-of-life and the + native SQLite binding (`better-sqlite3` 12.x) no longer ships a Node 18 + prebuilt binary. Node 22 LTS and Node 24 get the native backend out of the + box; on other Node versions CodeGraph still runs via the WASM fallback + (slower, but functional). Node 25+ remains blocked (V8 WASM JIT crash, see + [#81](https://github.com/colbymchenry/codegraph/issues/81)). +- **MCP / explore**: `codegraph_explore` output is now adaptive to project + size. The tool used to apply a fixed 35KB cap regardless of how large the + codebase was, which on small projects (~100 files) produced bigger + responses than the agent's native grep+Read flow would have — exactly the + scenario reported in + [#185](https://github.com/colbymchenry/codegraph/issues/185). The budget + now scales with indexed file count: small projects (<500 files) cap at + ~18KB and skip the "Additional relevant files" / completeness / explore- + budget reminders that earn their keep on bigger codebases; medium + (<5,000) caps at ~13KB; large (<15,000) keeps the historical ~35KB; very + large goes up to ~38KB. A new per-file char cap also prevents a single + file with many adjacent symbols from collapsing into one whole-file dump + (the Alamofire `Session.swift` case from #185). Per-file cluster + selection ranks clusters that contain a query entry point ahead of dense + declaration blocks, and whole-file "envelope" nodes (a class/struct that + spans most of the file) are excluded from clustering so the methods the + query asked about aren't buried under the container's opening lines. + Measured against the same repos used in the README benchmark, end state + with line numbers on: Alamofire ~60% smaller per call, Excalidraw ~32%, + VS Code ~12%. 
Agent-trust floor still holds — the Relationships section, + scored cluster selection, and structured-source output are all retained. + Thanks to [@essopsp](https://github.com/essopsp) for the repro. +- **Search ranking (Kotlin / Swift / Scala / C#)**: test files in these + languages are now correctly de-prioritized in `codegraph_search`, + `codegraph_context`, and `codegraph affected`. Detection previously only + recognized `snake_case`/`.test.`-style names plus a handful of Java + suffixes, so CamelCase test files (`FooTest.kt`, `BarTests.swift`, + `BazSpec.scala`, `QuxTestCase.cs`) and Gradle / Kotlin-Multiplatform / + Xcode test source-set directories (`jvmTest/`, `commonTest/`, + `androidTest/`, `iosTest/`, `integrationTest/`) were treated as production + code and could outrank the real implementation. Detection now matches + capital-led `*Test` / `*Tests` / `*Spec` / `*TestCase` filenames and + source-set directories — deliberately capital-led so lowercase look-alikes + like `latest.kt` and `manifest.kt` are not misclassified. + +### Fixed +- **MCP / explore**: `codegraph_explore` output is now hard-capped to its + adaptive size budget. It could previously overrun (e.g. ~30K against a 28K + cap) once the relationship map and trailer sections were appended; the + oversized payload then sat in the agent's context and was re-read on every + later turn. +- **Sync / status**: git-untracked files are no longer reported as pending + "Added" forever. After `codegraph sync` indexed a newly-created untracked + source file, `codegraph status` kept listing it under Pending Changes and + every subsequent `sync` re-indexed it from scratch — even though its symbols + were already queryable. Change detection trusted `git status` and counted + every untracked (`??`) entry as new without checking the index, but indexing + a file doesn't make git track it, so the file stayed `??` and got re-added on + each run. 
CodeGraph now hash-compares untracked files against the index the + same way it does tracked files: a file counts as "added" only if it's missing + from the index, "modified" if its contents changed, and is skipped otherwise. + Closes [#206](https://github.com/colbymchenry/codegraph/issues/206). Thanks to + [@15290391025](https://github.com/15290391025) for the report. +- **Indexing**: `codegraph init -i` now finds source inside nested, independent + git repositories — separate clones living inside the workspace that are **not** + git submodules (common in CMake "super-repo" layouts). When the top-level + workspace is itself a git repo, `git ls-files` reports an embedded repo only as + an opaque `subdir/` entry and never lists its files, so indexing from the + workspace root reported "No files found to index" even though indexing each + sub-repo individually worked. CodeGraph now detects these embedded repos and + indexes their tracked and untracked source, honoring each repo's own + `.gitignore`. Closes + [#193](https://github.com/colbymchenry/codegraph/issues/193). Thanks to + [@timxx](https://github.com/timxx) for the report. +- **Native SQLite backend on Node 24**: indexing on Node 24 always dropped to + the 5-10x-slower WASM backend, printing a `better-sqlite3 unavailable` + warning that `npm rebuild better-sqlite3` / `xcode-select --install` could + not clear ([#203](https://github.com/colbymchenry/codegraph/issues/203)). + The bundled `better-sqlite3` was pinned to a v11 release that ships no + prebuilt binary for Node 24's ABI (`node-v137`), so every Node 24 install + silently degraded — and because CodeGraph is usually installed globally, the + `npm install` / `npm rebuild` people ran in their own project never touched + CodeGraph's copy. CodeGraph now requires `better-sqlite3` `^12.4.1`, whose + prebuilds include Node 24, so a fresh install on Node 22 or Node 24 gets the + native backend with no compiler. 
On an already-broken install, reinstall + CodeGraph (e.g. `npm install -g @colbymchenry/codegraph`) to pull the new + binding; `codegraph status` should then report `Backend: native`. Thanks to + [@Finndersen](https://github.com/Finndersen) for the report. +- **MCP**: tools no longer fail with "CodeGraph not initialized" when the index + actually exists. This hit clients that launch the MCP server from a directory + other than your project and don't report a workspace root in `initialize` + (some IDE/JetBrains-family integrations) — the server fell back to its own + working directory, missed the project's `.codegraph/`, and returned the + misleading "Run 'codegraph init' first" on every call. The only workaround + was passing `projectPath` to each tool by hand. Now, when no project path is + supplied, the server asks the client for its workspace root via the standard + MCP `roots/list` request (when the client advertises the `roots` capability) + before falling back to the working directory — so detection just works for + spec-compliant clients. When it still can't resolve a project, the error is + now actionable: it names the directory it searched and tells you to pass + `projectPath` or add `--path /abs/project` to the server's MCP config args, + instead of pointing you at a re-init you don't need. Closes + [#196](https://github.com/colbymchenry/codegraph/issues/196). Thanks to + [@zhangyu1197](https://github.com/zhangyu1197) for the report and the + `projectPath` workaround. +- **MCP**: the server no longer hangs on startup under WSL2 when the project + lives on an NTFS `/mnt/*` mount. Setting up the recursive file watcher + there took tens of seconds — every directory read crosses the Windows/9p + boundary — which blew past the host's initialization timeout (opencode's + 30s), so the codegraph tools silently never appeared, even on small + projects. 
This is the file-watcher half of the + [#172](https://github.com/colbymchenry/codegraph/issues/172) startup fix: + that one moved the database/WASM open off the handshake, but the watcher + setup was still on the critical path. CodeGraph now auto-skips the watcher + on those mounts, with manual and git-hook sync fallbacks (see Added). + Closes [#199](https://github.com/colbymchenry/codegraph/issues/199). + Thanks to [@mengfanbo123](https://github.com/mengfanbo123) for the precise + root-cause analysis and workaround. +- **Installer (Claude Code)**: project-local installs (`Just this project`) + now write the MCP server to `.mcp.json` in the project root — the file + Claude Code actually reads for project-scoped servers. Previously they + wrote `.claude.json`, which Claude Code ignores, so the codegraph tools + silently never appeared and you had to rename the file by hand to make it + work. Re-running `codegraph install` (or `codegraph init`) on an affected + project migrates the stale `.claude.json` entry into `.mcp.json` + automatically; uninstall cleans up both. Global (`All projects`) installs + were unaffected — they correctly target `~/.claude.json`. Closes + [#207](https://github.com/colbymchenry/codegraph/issues/207). Thanks to + [@Jhsmit](https://github.com/Jhsmit) for the report and the workaround. +- **MCP**: source-omission markers in `codegraph_explore` and + `codegraph_context` output are now language-neutral (`... (gap) ...`, + `... (trimmed) ...`, `... (truncated) ...`) instead of C-style `//` + comments, which were misleading inside Python, Ruby, and other non-C + fenced source blocks. + +## [0.7.10] - 2026-05-19 + +### Fixed +- **MCP**: tools no longer silently fail to appear in clients on slow + filesystems (Docker Desktop VirtioFS on macOS, WSL2). 
The `initialize` + handshake was blocking on opening the SQLite database and bootstrapping + the tree-sitter WASM runtime, which on slow I/O could exceed Claude + Code's ~30s handshake timeout — leaving the codegraph process alive but + unresponsive and no tools visible. The handshake now returns immediately + and defers project open to the background; tool calls wait on the + in-flight init rather than racing it with a second open. Closes + [#172](https://github.com/colbymchenry/codegraph/issues/172). Thanks to + [@sashanclrp](https://github.com/sashanclrp) for the original report and + detailed reproduction, and [@sgrimm](https://github.com/sgrimm) for the + decisive wire capture that isolated the actual root cause. +- **CLI**: terminal output no longer mojibakes on Windows PowerShell / + cmd.exe during `codegraph index` and `codegraph sync`. The shimmer + progress renderer writes from a worker thread via `fs.writeSync(1, …)` + to keep the animation smooth while the main thread is busy in SQLite, + which bypasses Node's TTY-aware UTF-8→codepage conversion — so glyphs + like `│ ◆ —` were emitted as raw UTF-8 bytes and reinterpreted as the + console's OEM codepage (CP437, CP936, …), producing strings like + `鋍?[0m 鉒?[0m Scanning files 鈥?N found`. CodeGraph now picks an ASCII + glyph set on Windows by default (`| * -` instead of `│ ◆ —`); set + `CODEGRAPH_UNICODE=1` to opt back into the Unicode glyphs (e.g. on + pwsh 7 with UTF-8 codepage), or `CODEGRAPH_ASCII=1` on any platform to + force ASCII (useful for log collectors / non-TTY pipelines). Closes + [#168](https://github.com/colbymchenry/codegraph/issues/168). Thanks to + [@starkleek](https://github.com/starkleek) for the report and to + [@Bortlesboat](https://github.com/Bortlesboat) for the initial PR. +- **MCP / search**: module-qualified symbol lookups now resolve. 
The + MCP tools (`codegraph_node`, `codegraph_callees`, `codegraph_impact`, + …) accept `module::symbol` (Rust / C++ / Ruby), `Module.symbol` + (TS / JS / Python), and `module/symbol` (path-style) — multi-level + forms (`crate::configurator::stage_apply::run`) and Rust path + prefixes (`crate`, `super`, `self`) are handled. Closes + [#173](https://github.com/colbymchenry/codegraph/issues/173). Thanks + to [@joselhurtado](https://github.com/joselhurtado) for the detailed + reproduction. Three underlying fixes: + - The FTS5 query builder now treats `::` as a token separator + instead of stripping it to nothing, so `stage_apply::run` no + longer collapses to the unsearchable `stage_applyrun`. + - `matchesSymbol` falls back to a file-path containment check when + `qualifiedName` doesn't carry the module hierarchy (Rust + file-level functions, Python free functions in a package): a + `run` in `src/configurator/stage_apply.rs` now matches + `stage_apply::run` because `stage_apply` appears as a path + segment. + - Qualified lookups that don't match the qualifier no longer fall + through to fuzzy text matches — `stage_apply::nonexistent_fn` + returns `null` instead of resolving to an unrelated `rollback` + in the same file. + +[0.8.0]: https://github.com/colbymchenry/codegraph/releases/tag/v0.8.0 +[0.7.10]: https://github.com/colbymchenry/codegraph/releases/tag/v0.7.10 + +## [0.7.8] - 2026-05-17 + +### Fixed +- **opencode**: install actually wires up the MCP server now. v0.7.7 wrote + `~/.config/opencode/opencode.json`, but opencode reads `opencode.jsonc` by + default — so the `codegraph` entry never showed up in any opencode session. + The installer now prefers an existing `.jsonc`, falls back to `.json` when + only that exists, and creates `.jsonc` for greenfield installs. **Re-run + `codegraph install --target=opencode` after upgrading** so the entry lands + in the file opencode actually reads. 
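
The file-selection rule in this fix is small enough to sketch directly. `pickOpencodeConfig` is a hypothetical helper, and passing the existing file names in as a set (instead of touching the filesystem) is an assumption made to keep the example self-contained.

```typescript
// Hypothetical helper: decides which opencode config file the installer
// should write, given the file names already present in the config dir.
function pickOpencodeConfig(existing: ReadonlySet<string>): string {
  if (existing.has("opencode.jsonc")) return "opencode.jsonc"; // prefer existing .jsonc
  if (existing.has("opencode.json")) return "opencode.json";   // fall back to .json
  return "opencode.jsonc";                                      // greenfield: create .jsonc
}
```

When both files exist, the `.jsonc` file wins under this sketch, which matches the stated preference for the file opencode reads by default.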
+ +### Added +- **opencode**: installer now writes `AGENTS.md` (global + `~/.config/opencode/AGENTS.md`, local `./AGENTS.md`) with the same + codegraph usage guidance the other agents already received. Without it, + opencode's model would call native `Grep` instead of the `codegraph_*` + tools it could see in its MCP list. +- User comments and formatting in `opencode.jsonc` survive install / + re-install / uninstall round-trips — surgical edits via `jsonc-parser` + rather than full-file rewrites. + +[0.7.8]: https://github.com/colbymchenry/codegraph/releases/tag/v0.7.8 + +## [0.7.7] - 2026-05-17 + +### Added +- **Multi-agent installer** (closes [#137](https://github.com/colbymchenry/codegraph/issues/137)). + `codegraph install` now opens with a multi-select prompt for **Claude Code**, + **Cursor**, **Codex CLI**, and **opencode** — detected agents are pre-checked. + Each writes its native MCP config + instructions file (e.g. `~/.cursor/mcp.json` + + `.cursor/rules/codegraph.mdc`, `~/.codex/config.toml` + `~/.codex/AGENTS.md`, + `~/.config/opencode/opencode.json`). The runtime MCP server was already + agent-agnostic; this brings the installer to parity. +- Non-interactive install flags for scripting / CI: + `--target=`, `--location=`, `--yes`, + `--no-permissions`, `--print-config `. +- `codegraph init` now auto-wires project-local agent surfaces for any agent + configured globally. In practice: Cursor's `.cursor/rules/codegraph.mdc` + is dropped on `init` so a single global `codegraph install` works in every + project you open — no per-project re-install needed. + +### Fixed +- **Cursor**: globally-installed codegraph reported "not initialized" in every + workspace because Cursor launches MCP-server subprocesses with the wrong + working directory and doesn't pass `rootUri` in the MCP initialize call. + We now inject `--path` into Cursor's MCP args — absolute path for local + installs, `${workspaceFolder}` for global installs. 
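
A minimal sketch of the `--path` injection for Cursor, assuming the server is launched as `codegraph serve --mcp`. `cursorMcpArgs` and `InstallLocation` are hypothetical names and the argument order is an assumption; only the `--path` flag and the two path values come from the entry above.

```typescript
type InstallLocation = "local" | "global";

// Hypothetical helper: builds the MCP server args written into Cursor's config.
// Local installs pin an absolute project path; global installs defer to the
// `${workspaceFolder}` placeholder, which is left for Cursor to expand.
function cursorMcpArgs(location: InstallLocation, projectDir: string): string[] {
  // Plain string, not a template literal: the placeholder must stay verbatim.
  const pathArg = location === "local" ? projectDir : "${workspaceFolder}";
  return ["serve", "--mcp", "--path", pathArg];
}
```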
+ +### Changed +- Agent-instructions template is now agent-agnostic. The previous template was + inherited from the Claude-only era and prescribed "spawn an Explore agent" — + a Claude Code-specific concept that confused Cursor's and Codex's agents and + caused them to fall back to native grep even with codegraph available. The + new template adds explicit "trust codegraph results, don't re-verify with + grep" guidance and a clear tool-by-question matrix. Applies to + `~/.claude/CLAUDE.md`, `.cursor/rules/codegraph.mdc`, and `~/.codex/AGENTS.md`. +- `codegraph install` prompt order: agent picker is now step 1, before the + PATH-install and location prompts. +- Disambiguated "global" wording in install prompts ("Install codegraph CLI on + your PATH?" vs "Apply agent configs to all your projects, or just this one?") + — both used to say "Global" and read as duplicates. + +### Internal +- New `AgentTarget` interface in `src/installer/targets/` — adding a 5th agent + (Continue, Zed, Windsurf, …) is a new file + one entry in `registry.ts`. +- Hand-rolled TOML serializer for Codex (`src/installer/targets/toml.ts`) — no + new dependency, scoped to the `[mcp_servers.codegraph]` table only, sibling + tables and `[[array_of_tables]]` preserved verbatim. +- +47 parameterized contract tests across the 4 targets — install idempotency, + sibling preservation, uninstall reverses install, byte-equal re-runs return + `unchanged`, partial-state recovery for Codex. + +Based on substantive draft by [@andreinknv](https://github.com/andreinknv) +([fork commit `c5165e4`](https://github.com/andreinknv/codegraph/commit/c5165e4)). +Thank you. + +[0.7.7]: https://github.com/colbymchenry/codegraph/releases/tag/v0.7.7 + +## [0.7.6] - 2026-05-13 + +### Fixed +- `codegraph` CLI failing with `zsh: permission denied: codegraph` after a fresh + global install. 
The published 0.7.5 tarball shipped `dist/bin/codegraph.js` + without the executable bit, so the shell refused to run it through the npm + symlink. The build now `chmod +x`'s the binary before packing. + + Already on 0.7.5? Either upgrade to 0.7.6, or unblock yourself in place: + ```bash + chmod +x "$(npm root -g)/@colbymchenry/codegraph/dist/bin/codegraph.js" + ``` + +[0.7.6]: https://github.com/colbymchenry/codegraph/releases/tag/v0.7.6 diff --git a/CLAUDE.md b/CLAUDE.md index 71a50c73..a1131bfb 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -4,149 +4,228 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co ## Project Overview -CodeGraph is a local-first code intelligence system that builds a semantic knowledge graph from any codebase. It provides structural understanding of code relationships using tree-sitter for AST parsing and SQLite for storage. +CodeGraph is a local-first code intelligence library + CLI + MCP server. It parses any supported codebase with tree-sitter, stores symbols/edges/files in SQLite (FTS5), and exposes a knowledge graph to AI agents (Claude Code, Cursor, Codex CLI, opencode) over MCP. Per-project data lives in `.codegraph/`. Extraction is deterministic — derived from AST, not LLM-summarized. -**Key characteristics:** -- Headless library (no UI) - purely an API -- Node.js runtime (works standalone, in Electron, or any Node environment) -- Per-project data stored in `.codegraph/` directory -- Deterministic extraction from AST, not AI-generated summaries +Distributed as `@colbymchenry/codegraph` on npm; same binary serves as installer, indexer, and MCP server. 
-## Build and Development Commands +## Build, Test, Run ```bash -# Build -npm run build # Compile TypeScript and copy assets +npm run build # tsc + copy schema.sql and *.wasm into dist/; chmods dist/bin/codegraph.js +npm run dev # tsc --watch +npm run clean # rm -rf dist -# Test -npm test # Run all tests once -npm run test:watch # Run tests in watch mode +npm test # vitest run (all) +npm run test:watch +npm run test:eval # only __tests__/evaluation/ +npm run eval # build then run __tests__/evaluation/runner.ts via tsx -# Clean -npm run clean # Remove dist/ directory +npm run cli # build then run the local dist binary + +# Single test file / pattern +npx vitest run __tests__/installer-targets.test.ts +npx vitest run __tests__/extraction.test.ts -t "TypeScript" ``` -## Running a Single Test +`copy-assets` (called from `build`) copies `src/db/schema.sql` and all `src/extraction/wasm/*.wasm` files into `dist/`. **Any new SQL or grammar wasm must be copied or it won't ship.** -```bash -npx vitest run __tests__/extraction.test.ts # Run specific test file -npx vitest run __tests__/extraction.test.ts -t "TypeScript" # Run tests matching pattern -``` +Node engines: `>=18.0.0 <25.0.0`. There is a hard exit on Node 25.x (see `src/bin/node-version-check.ts`). 
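
The engines range above can be read as a simple major-version gate. The sketch below is not the contents of `src/bin/node-version-check.ts`; `isSupportedNode` is a hypothetical function that only illustrates how `>=18.0.0 <25.0.0` classifies a version string shaped like `process.version`.

```typescript
// Hypothetical check mirroring the engines range `>=18.0.0 <25.0.0`.
// Accepts strings shaped like `process.version` (e.g. "v22.11.0").
function isSupportedNode(version: string): boolean {
  const match = /^v?(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (!match) return false; // unparseable versions are treated as unsupported
  const major = Number(match[1]);
  // Both bounds sit on x.0.0, so comparing the major version is sufficient.
  return major >= 18 && major < 25;
}
```

A caller would typically evaluate this once at startup and exit with a message when it returns `false`.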
## Architecture -### Core Module Structure +### Layered pipeline ``` -src/ -├── index.ts # Main CodeGraph class - public API entry point -├── types.ts # All TypeScript interfaces and types -├── db/ # SQLite database layer -│ ├── index.ts # DatabaseConnection class -│ ├── queries.ts # QueryBuilder with prepared statements -│ └── schema.sql # Table definitions with FTS5 search -├── extraction/ # Tree-sitter AST parsing -│ ├── index.ts # ExtractionOrchestrator -│ ├── tree-sitter.ts # Universal parser wrapper -│ └── grammars.ts # Language detection and grammar loading -├── resolution/ # Reference resolver -│ ├── index.ts # ReferenceResolver orchestrator -│ ├── import-resolver.ts -│ ├── name-matcher.ts -│ └── frameworks/ # Framework-specific patterns (React, Express, Laravel, etc.) -├── graph/ # Graph traversal and queries -│ ├── index.ts # GraphQueryManager -│ ├── traversal.ts # GraphTraverser (BFS/DFS, impact radius) -│ └── queries.ts # High-level graph queries -├── context/ # Context building for AI assistants -│ ├── index.ts # ContextBuilder -│ └── formatter.ts # Markdown/JSON output formatting -├── sync/ # Incremental update system -│ ├── index.ts -│ └── git-hooks.ts # Post-commit hook management -├── installer/ # Interactive installer -│ ├── index.ts # Installer orchestrator -│ ├── banner.ts # ASCII art banner -│ ├── claude-md-template.ts # CLAUDE.md template generator -│ ├── config-writer.ts # Configuration file writing -│ └── prompts.ts # User prompts -├── mcp/ # Model Context Protocol server -│ ├── index.ts # MCPServer class -│ ├── tools.ts # MCP tool definitions -│ └── transport.ts # Stdio transport -└── bin/codegraph.ts # CLI entry point +files → ExtractionOrchestrator (tree-sitter) → DB (nodes/edges/files) + ↓ + ReferenceResolver (imports, name-matching, framework patterns) + ↓ + GraphQueryManager / GraphTraverser (callers, callees, impact) + ↓ + ContextBuilder (markdown/JSON for AI consumption) ``` -### Key Classes +The public API surface is `src/index.ts` 
— the `CodeGraph` class wires all the layers and re-exports types. Library users only touch this file; the MCP server and CLI also drive it. + +### Module layout + +- `src/index.ts` — `CodeGraph` class: `init`/`open`/`close`, `indexAll`, `sync`, `searchNodes`, `getCallers`/`getCallees`, `getImpactRadius`, `buildContext`, `watch`/`unwatch`. +- `src/db/` — `DatabaseConnection`, `QueryBuilder` (prepared statements), `schema.sql`. Backed by `better-sqlite3` (native) when available, transparently falls back to `node-sqlite3-wasm`. `codegraph status` surfaces which backend is live; wasm is the slow path. +- `src/extraction/` — `ExtractionOrchestrator`, tree-sitter wrappers, per-language extractors under `languages/` (one file per language), plus standalone extractors for non-tree-sitter formats (`svelte-extractor.ts`, `vue-extractor.ts`, `liquid-extractor.ts`, `dfm-extractor.ts` for Delphi). `parse-worker.ts` runs heavy parsing off the main thread. +- `src/resolution/` — `ReferenceResolver` orchestrates `import-resolver.ts` (with `path-aliases.ts` for tsconfig path aliases + cargo workspace member globs), `name-matcher.ts`, and `frameworks/` (Express, Laravel, Rails, FastAPI, Django, Flask, Spring, Gin, Axum, ASP.NET, Vapor, React Router, SvelteKit, Vue/Nuxt, Cargo workspaces). Frameworks emit `route` nodes and `references` edges. +- `src/graph/` — `GraphTraverser` (BFS/DFS, impact radius, path finding) and `GraphQueryManager` (high-level queries). +- `src/context/` — `ContextBuilder` + formatter for markdown/JSON output. +- `src/search/` — full-text query parser and helpers for FTS5. +- `src/sync/` — `FileWatcher` (native FSEvents/inotify/RDCW) with debounce + filter, and git-hook helpers. +- `src/mcp/` — MCP server (`MCPServer`, `tools.ts`, `transport.ts`). `server-instructions.ts` is what the server returns in the MCP `initialize` response — keep it in sync with the user-facing tool guidance. +- `src/installer/` — see below. 
+- `src/bin/codegraph.ts` — CLI (commander). Subcommands: `install`, `init`, `uninit`, `index`, `sync`, `status`, `query`, `files`, `context`, `affected`, `serve --mcp`. +- `src/ui/` — terminal UI (shimmer progress, worker). + +### NodeKind / EdgeKind + +Defined in `src/types.ts`. Both extractors and resolvers must use these exact strings. + +- **NodeKind**: `file`, `module`, `class`, `struct`, `interface`, `trait`, `protocol`, `function`, `method`, `property`, `field`, `variable`, `constant`, `enum`, `enum_member`, `type_alias`, `namespace`, `parameter`, `import`, `export`, `route`, `component`. +- **EdgeKind**: `contains`, `calls`, `imports`, `exports`, `extends`, `implements`, `references`, `type_of`, `returns`, `instantiates`, `overrides`, `decorates`. + +### Multi-agent installer + +`src/installer/` is the entry point for `codegraph install` (and the bare `codegraph`/`npx @colbymchenry/codegraph` invocation). Architecture: + +- `targets/registry.ts` lists every supported agent. +- `targets/types.ts` defines the `AgentTarget` interface — adding a 5th agent (Continue, Zed, Windsurf…) is **one new file in `targets/` + one entry in `registry.ts`**. Each target owns its config-file location, MCP-server JSON/TOML/JSONC writing, and instructions-file path. +- Current targets: `claude.ts`, `cursor.ts`, `codex.ts`, `opencode.ts`. +- `targets/toml.ts` is a hand-rolled TOML serializer scoped to `[mcp_servers.codegraph]` (used by Codex). Sibling tables and `[[array_of_tables]]` are preserved verbatim. No new dependency. +- opencode reads `opencode.jsonc` by default; the installer prefers existing `.jsonc`, falls back to `.json`, and creates `.jsonc` for greenfield installs. Edits are surgical via `jsonc-parser` so user comments and formatting survive install/re-install/uninstall round-trips. +- `instructions-template.ts` is the agent-agnostic instructions file written to each target (e.g. 
`CLAUDE.md`, `.cursor/rules/codegraph.mdc`, `~/.codex/AGENTS.md`, `~/.config/opencode/AGENTS.md`). It explicitly says "trust codegraph results, don't re-verify with grep" — earlier versions prescribed Claude-specific "spawn an Explore agent" and confused other agents. +- `claude-md-template.ts` is the legacy Claude-only template, retained for compatibility paths. +- All installer changes need matching coverage in `__tests__/installer-targets.test.ts` — there are ~47 parameterized contract tests covering install idempotency, sibling preservation, uninstall reverses install, byte-equal re-runs returning `unchanged`, and partial-state recovery for Codex. + +### Cursor MCP working-directory quirk + +Cursor launches MCP subprocesses with the wrong cwd and doesn't pass `rootUri` in `initialize`. The installer injects `--path` into Cursor's MCP args — absolute path for local installs, `${workspaceFolder}` for global installs. If you touch Cursor wiring, preserve this. + +### MCP server instructions + +`src/mcp/server-instructions.ts` is sent back to the agent in the MCP `initialize` response. This is the *first* thing every agent sees about how to use the tools — treat it as the authoritative tool guidance and keep it in sync with `instructions-template.ts` and `.cursor/rules/codegraph.mdc`. + +## Retrieval performance & dynamic-dispatch coverage (do not regress) + +CodeGraph's core value is letting an agent answer **structural/flow** questions ("how does X reach Y", trace, impact, callers) with a few **fast** codegraph calls and **zero Read/Grep**. The optimization target is **wall-clock latency + tool-call count** — *don't optimize for token cost*. (Cost is **lower**, not "flat" as earlier framing claimed: a current-build with-vs-without A/B across the 7 README repos, median of 4, saved on average **35% cost · 57% tokens · 46% time · 71% tool calls** — reproducing the published README. 
The mechanism is **far fewer turns over a much smaller accumulated context** — NOT cache-ability: the without-arm's huge token volume is *mostly* cheap cache-reads, which is why token-count savings (57%) look bigger than cost savings (35%). Measure tokens by **summing per-turn assistant usage**, not `result.usage` (last-turn only in current Claude Code). See `docs/benchmarks/call-sequence-analysis.md`.) The mechanism that drives everything here: **an agent falls back to Read/Grep the instant a codegraph answer is insufficient.** So every change is judged by one question — is codegraph's answer sufficient enough to *stop* the agent from reading? + +**Target behavior:** a flow question resolves in **1 codegraph call on small repos, scaling to 3–5 on large**, with **Read/Grep = 0**. When reviewing a PR or trying something new, do not regress this. + +### Adapt the tool to the agent — don't try to change the agent + +The lever that decides whether a retrieval change lands. **Test before building anything here: does this make a tool the agent _already calls_ do more with the input it _already gives_? If it instead needs the agent to behave differently — pick a different tool, query differently, learn from examples — it hits the low-salience wall and won't land.** + +CodeGraph's only channels to influence the agent are low-salience: the MCP `initialize` instructions (`server-instructions.ts`) and the tool descriptions. Changing them does **not** reliably move the agent's tool _choice_ or query style — validated: trace-first steering ported into the server-instructions + tool descriptions (3 wording variants) never reproduced what a CLI `--append-system-prompt` achieved, and **regressed** wall-clock vs baseline. New tools fare worse (rarely chosen — the agent under-picks even `trace`); "better examples" is the same steering. The agent's tool-choice does improve on its own as host models get better at tool use — but that is not ours to force. 
+ +What works is meeting the agent where it already is: +- **Sufficiency** — `codegraph_trace` inlines each hop's body + the destination's own callees, so one trace call ends the flow investigation (no follow-up explore/node/Read). +- **explore-flow** — `codegraph_explore`'s query is a precise bag of symbol names (incl. qualified `Class.method`) spanning the flow the agent is after; explore finds the call path _among those named symbols_ (riding synthesized edges) and leads its output with it — delivering trace-quality flow through the call the agent reliably makes. (`buildFlowFromNamedSymbols`: segment/co-naming disambiguation; ≤1 unnamed bridge so it never wanders a god-function's fan-out.) + +What fails is the inverse — folding a precise answer into a **fuzzy-input** tool. `codegraph_context` gets a description, not symbols, so it can't disambiguate a flow's endpoints and surfaces the _wrong feature_. Precise output needs precise input. + +The remaining lever under this axis is **coverage**: every flow made to connect statically (a new dynamic-dispatch synthesizer) is then surfaced automatically by explore-flow/`trace`, no agent change needed. Reactive/reconciler runtimes (Halo's `ReactiveExtensionClient`, MediatR, Vue Proxy) are the frontier — flows there have no static edges, so nothing surfaces (correctly — silent beats wrong). Full investigation + A/B record: `docs/benchmarks/call-sequence-analysis.md`. -- **CodeGraph** (`src/index.ts`): Main entry point. Lifecycle methods (`init`, `open`, `close`), indexing (`indexAll`, `sync`), graph queries (`traverse`, `getCallGraph`, `getImpactRadius`), context building (`buildContext`) +### Explore budget — keep BOTH budgets monotonic with repo size -- **ExtractionOrchestrator** (`src/extraction/index.ts`): Coordinates file scanning, parsing, and storing. Uses tree-sitter native bindings for each supported language +Two functions in `src/mcp/tools.ts` scale explore with indexed file count. 
This is the expected resolution (a regression here silently forces agents back to Read): -- **GraphTraverser** (`src/graph/traversal.ts`): BFS/DFS traversal, call graph construction, impact radius calculation, path finding +| Repo | files | explore calls | chars/call | per-file | +|---|---|---|---|---| +| express (small) | 147 | 1 | 18K | 3800 | +| excalidraw/django (medium) | 643–3043 | 2 | 28K | 6500 | +| vscode (large) | 10446 | 3 | 35K | 7000 | +| ~20k / ~40k | — | 4 / 5 | 38K | 7000 | -- **ReferenceResolver** (`src/resolution/index.ts`): Resolves unresolved references after full indexing using framework patterns, import resolution, and name matching +- `getExploreBudget(fileCount)` → **call** budget: `<500→1, <5000→2, <15000→3, <25000→4, ≥25000→5` (max 5). +- `getExploreOutputBudget(fileCount)` → **per-call** output (chars / files / per-file). **Invariant: a larger tier must never get a smaller `maxCharsPerFile` than a smaller tier.** (Regression that motivated this doc: the `<5000` tier's 2500 was *below* the `<500` tier's 3800, so on a god-file repo — excalidraw's 415 KB `App.tsx` — one explore returned <1% of the file and forced a Read.) +- Explore output must **never tell the agent to "use Read"** — steer to another `codegraph_explore` and "treat returned source as already Read." -### Database Schema +### Dynamic-dispatch coverage — the flow must EXIST in the graph end-to-end -SQLite database with: -- `nodes`: Code symbols (functions, classes, methods, etc.) -- `edges`: Relationships (calls, imports, extends, contains, etc.) -- `files`: Tracked source files with content hashes -- `unresolved_refs`: References pending resolution -- `nodes_fts`: FTS5 virtual table for full-text search +Static tree-sitter extraction misses computed/indirect calls, so flows break at dynamic dispatch and the agent reads to reconstruct them. 
Synthesizers/resolvers bridge these so `trace`/`explore` connect end-to-end (`src/resolution/callback-synthesizer.ts`, `src/resolution/frameworks/`). Channels today: callback/observer, EventEmitter, **React re-render** (`setState`→`render`), **JSX child** (`render`→child component), django ORM descriptor. All synthesized edges are `provenance:'heuristic'` with `metadata.synthesizedBy` + `registeredAt` (the wiring site), surfaced inline in `trace`, the `node` trail, and `context` call-paths. -### Supported Languages +**Principle: partial coverage is WORSE than none.** Bridging one boundary but not the next reveals a hop the agent then drills + reads to finish. Measured on excalidraw: react-render alone *raised* reads to 5–7; only completing the flow (adding the jsx-child hop) dropped it to 0–1. **Always close the flow end-to-end and re-measure** — never ship a half-bridged flow. -TypeScript, JavaScript, TSX, JSX, Svelte, Python, Go, Rust, Java, C, C++, C#, PHP, Ruby, Swift, Kotlin, Dart, Liquid, Pascal +### Validation methodology (REQUIRED for every new language/framework) -### Node and Edge Types +For each **language × framework**, validate on **small, medium, and large** real repos with **≥3 different flow prompts** each: -**NodeKind**: `file`, `module`, `class`, `struct`, `interface`, `trait`, `protocol`, `function`, `method`, `property`, `field`, `variable`, `constant`, `enum`, `enum_member`, `type_alias`, `namespace`, `parameter`, `import`, `export`, `route`, `component` +1. **Pick the canonical flow** for the framework ("how does X reach Y": state→render, request→handler→view, query→SQL, action→reducer→store…). +2. **Deterministic probes** (`scripts/agent-eval/probe-{trace,node,context,explore}.mjs` against the built `dist/`): `trace(from,to)` connects end-to-end with no break; **no node explosion** (`select count(*) from nodes` stable before/after re-index); synthesized-edge **precision** spot-check (`select … where provenance='heuristic'`). +3. 
**Agent A/B** (`scripts/agent-eval/run-all.sh ""`): with vs without codegraph, **≥2 runs/arm** (run-to-run variance is large — never conclude from n=1). Record **duration, total tool calls, Read, Grep**. Optional forced-Read-0 sufficiency proof via the block-read hook (`scripts/agent-eval/hook-settings.json`). +4. **Pass bar:** a normal flow question reaches **~0 Read/Grep within the repo's explore-call budget**, runs **faster** than without-codegraph, and shows **no regression on a control repo**. Record the numbers in `docs/design/dynamic-dispatch-coverage-playbook.md` (the coverage matrix). -**EdgeKind**: `contains`, `calls`, `imports`, `exports`, `extends`, `implements`, `references`, `type_of`, `returns`, `instantiates`, `overrides`, `decorates` +Full playbook + per-mechanism design: `docs/design/dynamic-dispatch-coverage-playbook.md` and `docs/design/callback-edge-synthesis.md`. -## CLI Usage +### Worked example — Excalidraw (TS/React, medium, 643 files) + +The template to replicate per language/framework. Question: *"how does updating an element re-render the canvas on screen?"* (the full flow crosses three React boundaries: observer callback, `setState`→`render`, and JSX child). + +| Stage | duration | Read | Grep | codegraph | +|---|---|---|---|---| +| Without codegraph | 115–139s | 9–10 | 10–11 | 0 | +| Broken (explore-budget regression) | 131–139s | 5–10 | 3–5 | 6–14 | +| Fixed (budget + msgs + synthesis) | 64–112s | 0–2 | 2–4 | 3–**10** | +| + trace-first steering | **51–74s** | **0–2** | 0–4 | **3–4** | + +n=4 unhooked runs/stage, same prompt. After steering flow questions to `codegraph_trace` first: **best run 0 Read / 0 Grep / 3 codegraph / 51s**; **2 of 4 fully clean** (0 Read, 0 Grep). Steering eliminated the over-drill variance — call count tightened from 3–10 to 3–4, trace adoption went 3/4 → 4/4, and the `search`+`callers` path-reconstruction floundering dropped to 0. Run-to-run variance is still real; report the range, never a single run. 
**Residual reads/greps are all the nonce data-flow** (`canvasNonce` — a local prop with no graph edges); that's the def-use/data-flow frontier, left deliberately uncovered (tracking every local would explode the graph). Validated: `trace(mutateElement, renderStaticScene)` connects in **6 hops** across all three boundaries (`mutateElement → triggerUpdate → [callback] triggerRender → [react-render] render → [jsx] StaticCanvas → renderStaticScene`), each hop showing inline source + the wiring site; node count stable at 9,289; 1 callback + 46 react-render + 280 jsx-render synthesized edges (no explosion, precision-checked). + +## Tests + +Tests live in `__tests__/` and mirror the module they cover. Notable ones beyond the obvious: + +- `installer-targets.test.ts` — parameterized contract suite across all 4 agent targets (see installer notes above). +- `evaluation/` — `runner.ts` + `test-cases.ts` exercise codegraph against synthetic projects and score the results; run via `npm run eval` (builds first). Not part of `npm test`. +- `sqlite-backend.test.ts` — covers native + wasm backend selection and fallback. +- `pr19-improvements.test.ts`, `frameworks-integration.test.ts` — regression coverage for specific past PRs/incidents; don't rename these, the names anchor to git history. + +Tests create temp dirs with `fs.mkdtempSync` and clean up in `afterEach`. They write real files and exercise real SQLite — there is no DB mocking. + +### Windows-gated tests + +Behavior that differs by platform (path resolution, drive letters, `SENSITIVE_PATHS`, `%APPDATA%` config dirs, CRLF) must be gated, not assumed. Use `it.runIf(process.platform === 'win32')(...)` for Windows-only assertions and `it.runIf(process.platform !== 'win32')(...)` for POSIX-only ones — e.g. `/etc` is sensitive on POSIX but resolves to `C:\etc` (non-existent) on Windows, so an ungated `/etc` assertion fails on Windows. 
Validate the Windows side for real (see below); don't merge a Windows-gated test you haven't seen run. + +## Windows validation (Parallels + SSH) + +For any Windows-specific PR, bug, or implementation, validate it on the real Windows VM rather than guessing. Connection details live in the gitignored **`.parallels`** file at the repo root (VM name, guest IP, SSH user/key). `prlctl exec` needs Parallels Pro and is unavailable, so SSH is the bridge. + +- Connect / run from the Mac host: `ssh @ "..."`. For multi-line work, pipe PowerShell over stdin and **refresh PATH from the registry** first (sshd's session has a stale PATH after winget installs): + ``` + ssh colby@10.211.55.3 "powershell -NoProfile -ExecutionPolicy Bypass -Command -" <<'PS' + $env:Path = [Environment]::GetEnvironmentVariable("Path","Machine") + ";" + [Environment]::GetEnvironmentVariable("Path","User") + Set-Location C:\dev\codegraph + PS + ``` +- Clone fresh into a **Windows-local** path (`C:\dev\codegraph`) and `npm ci` there — never run npm against the shared Mac repo, since `esbuild`/`rollup` ship platform-specific binaries. +- Guest toolchain (winget): Node LTS, Git, and the **VC++ ARM64 redistributable** (required by `@rollup/rollup-win32-arm64-msvc`, which vitest pulls in). +- Fetch a contributor PR head straight from their fork to dodge `pull//head` lag: `git fetch ` then `git checkout -f FETCH_HEAD`. +- Known pre-existing Windows failure: `security.test.ts > Session marker symlink resistance > does not follow a pre-planted symlink` (symlink creation needs privileges on Windows). Unrelated to current work; don't let it mask new regressions. + +## Releases + +Released to npm and mirrored as [GitHub Releases](https://github.com/colbymchenry/codegraph/releases). `CHANGELOG.md` is the source of truth; GitHub Release notes are extracted from it. + +### Writing changelog entries + +When asked for an entry for a new version: + +1. 
Add a new `## [X.Y.Z] - YYYY-MM-DD` block at the **top** of `CHANGELOG.md` (under the intro, above the previous version). +2. Group under `### Added`, `### Changed`, `### Fixed`, `### Removed`, `### Deprecated`, `### Security` — omit empty sections. +3. Write from the **user's perspective**, not the implementation's. Lead with the observable symptom or capability; mention internals only if a user needs them (e.g., to work around an existing bad install). +4. Add the link reference at the bottom: `[X.Y.Z]: https://github.com/colbymchenry/codegraph/releases/tag/vX.Y.Z`. + +### Release flow (the user runs these) + +Releases are built and published by the **GitHub Actions "Release" workflow** +(`.github/workflows/release.yml`). It bundles a Node runtime per platform +(`scripts/build-bundle.sh`) and publishes both the GitHub Release and the npm +thin-installer (`scripts/pack-npm.sh`: a shim package + per-platform packages). +Publishing manually is **wrong** now — a plain `npm publish` ships the root +package (non-bundled), which breaks anyone on Node < 22.5. 
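
To make the changelog conventions above concrete, here is a minimal sketch of how the notes for one version could be sliced out of a file in that format. It is not the Release workflow's actual implementation: `extractReleaseNotes`, the sample text, and the `example.invalid` URL are hypothetical. The only things taken from this document are the `## [X.Y.Z] - YYYY-MM-DD` heading and the trailing `[X.Y.Z]: ...` link reference.

```typescript
// Hypothetical helper (an assumption for illustration, not the real workflow code):
// returns the body of the `## [version] - date` block, or null if there is no entry.
function extractReleaseNotes(changelog: string, version: string): string | null {
  const lines = changelog.split(/\r?\n/);
  const heading = `## [${version}]`;
  const start = lines.findIndex((line) => line.startsWith(heading));
  if (start === -1) return null;

  const body: string[] = [];
  for (const line of lines.slice(start + 1)) {
    // Stop at the next version heading or at the link-reference footer.
    if (line.startsWith("## [") || /^\[[^\]]+\]:\s/.test(line)) break;
    body.push(line);
  }
  return body.join("\n").trim();
}

// Made-up changelog text that follows the conventions described above.
const sampleChangelog = [
  "# Changelog",
  "",
  "## [1.2.0] - 2026-01-15",
  "### Fixed",
  "- Install no longer fails on a fresh machine.",
  "",
  "## [1.1.0] - 2025-12-01",
  "### Added",
  "- New `status` output.",
  "",
  "[1.2.0]: https://example.invalid/releases/tag/v1.2.0",
].join("\n");

console.log(extractReleaseNotes(sampleChangelog, "1.2.0"));
```

A version with no matching heading yields `null`, which a release script could treat as "changelog entry missing" and refuse to continue.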
+ +After the changelog entry is written and `package.json` is bumped: ```bash -codegraph init [path] # Initialize in project -codegraph index [path] # Full index -codegraph sync [path] # Incremental update -codegraph status [path] # Show statistics -codegraph query # Search symbols -codegraph context # Build context for AI -codegraph hooks install # Install git auto-sync -codegraph serve --mcp # Start MCP server +git add package.json package-lock.json CHANGELOG.md +git commit -m "release: X.Y.Z ()" +git push ``` -## MCP Tools Best Practices - -Use these tools **directly in the main session** for fast code exploration (replaces the need for Explore agents in most cases): - -| Tool | Use For | -|------|---------| -| `codegraph_explore` | **Deep exploration** — comprehensive context for a topic in ONE call | -| `codegraph_context` | Quick context for a task (lighter than explore) | -| `codegraph_search` | Find symbols by name (functions, classes, types) | -| `codegraph_callers` | Find what calls a function | -| `codegraph_callees` | Find what a function calls | -| `codegraph_impact` | See what's affected by changing a symbol | -| `codegraph_node` | Get details + source code for a symbol | - -### Important -CodeGraph provides **code context**, not product requirements. For new features, still ask the user about: -- UX preferences and behavior -- Edge cases and error handling -- Acceptance criteria - -## Test Structure - -Tests are in `__tests__/` directory with files mirroring the module structure: -- `foundation.test.ts` - Database, config, directory management -- `extraction.test.ts` - Tree-sitter parsing for all languages -- `resolution.test.ts` - Reference resolution -- `graph.test.ts` - Traversal and graph queries -- `context.test.ts` - Context building -- `sync.test.ts` - Incremental updates and git hooks - -Tests use temporary directories created with `fs.mkdtempSync` and cleaned up after each test. +Then trigger **Actions → Release → Run workflow** (on `main`). 
It reads the +version from `package.json`, builds every platform bundle on one runner, creates +the GitHub Release with notes from the matching `CHANGELOG.md` section, and +publishes to npm. Requires the `NPM_TOKEN` repo secret. + +**Do not run `npm publish`, `git push`, or `git tag` yourself** — these are +publish actions on shared state. Write the files, hand the user the commands. + +## House rules + +- The `0.7.x` line is in active multi-agent rollout. Any change to `src/installer/` (especially `targets/`) needs corresponding test coverage and a CHANGELOG entry — installer regressions break every new install silently. +- When changing what the MCP tools do or how agents should use them, update **all three** of `src/mcp/server-instructions.ts`, `src/installer/instructions-template.ts`, and `.cursor/rules/codegraph.mdc` — they're written to different places but say the same thing. +- CodeGraph provides **code context**, not product requirements. For new features, ask the user about UX, edge cases, and acceptance criteria — the graph won't tell you. diff --git a/DELPHI-SUPPORT.md b/DELPHI-SUPPORT.md deleted file mode 100644 index 7d452451..00000000 --- a/DELPHI-SUPPORT.md +++ /dev/null @@ -1,157 +0,0 @@ -# Pascal / Delphi Support for CodeGraph - -## Why Delphi? - -Delphi (Object Pascal) remains one of the most widely used languages for Windows desktop and enterprise applications. With an estimated **1.5–3 million active developers** and a strong presence in industries like healthcare, finance, logistics, and government, Delphi projects often involve large, long-lived codebases that benefit significantly from semantic code intelligence. - -Many Delphi codebases have grown over decades — making structural understanding, impact analysis, and cross-file navigation exactly the kind of tooling gap CodeGraph is designed to fill. 
- -Adding Delphi support positions CodeGraph as a uniquely valuable tool for a community that has historically been underserved by modern static analysis and AI-assisted development tools. - -## What Was Implemented - -### Pascal / Object Pascal (tree-sitter) - -Full extraction support for `.pas`, `.dpr`, `.dpk`, and `.lpr` files using the `tree-sitter-pascal` grammar: - -| Feature | NodeKind | Details | -|---------|----------|---------| -| Units / Programs | `module` | `unit`, `program`, `package`, `library` | -| Classes | `class` | Including inheritance and interface implementation | -| Records | `class` | Treated as classes (consistent with AST structure) | -| Interfaces | `interface` | With GUID support | -| Methods | `method` | Constructor, destructor, procedures, functions | -| Functions / Procedures | `function` | Top-level (non-class) routines | -| Properties | `property` | With read/write accessors | -| Fields | `field` | Class and record fields | -| Constants | `constant` | `const` declarations | -| Enums | `enum` | With enum members | -| Type Aliases | `type_alias` | `type TFoo = ...` | -| Uses / Imports | `import` | `uses` clause extraction | -| Function Calls | — | `calls` edges for call graph | -| Visibility | — | `public`, `private`, `protected` on methods/fields | -| Static Methods | — | `class function` / `class procedure` | -| Containment | — | `contains` edges (class → method, unit → type, etc.) 
| -| Inheritance | — | `extends` / `implements` edges | - -### DFM / FMX Form Files (custom extractor) - -Support for Delphi form files (`.dfm` for VCL, `.fmx` for FireMonkey) using a regex-based custom extractor — no tree-sitter grammar exists for this format: - -| Feature | NodeKind / EdgeKind | Details | -|---------|---------------------|---------| -| Components | `component` | `object Button1: TButton` | -| Nested hierarchy | `contains` | Panel1 → Button1 | -| Event handlers | `references` (unresolved) | `OnClick = Button1Click` → links UI to Pascal methods | -| `inherited` keyword | `component` | Inherited form components | -| Multi-line properties | — | Correctly skipped during parsing | -| Item collections | — | `...` blocks correctly handled | - -The DFM ↔ PAS linkage via event handlers enables **cross-file impact analysis**: renaming a method in `.pas` immediately reveals which UI components reference it. - -## Architecture - -The implementation follows CodeGraph's established patterns: - -- **Pascal extraction** uses the standard `TreeSitterExtractor` with a Pascal-specific `LanguageExtractor` configuration and a `visitPascalNode()` hook for AST nodes that require special handling (e.g., `declType` wrappers, `defProc` implementation bodies) -- **DFM/FMX extraction** uses a `DfmExtractor` class — analogous to `LiquidExtractor` and `SvelteExtractor` — that parses the line-based format with regex -- **Routing** in `extractFromSource()` dispatches `.dfm`/`.fmx` files to `DfmExtractor` before reaching the tree-sitter path -- **`tree-sitter-pascal`** is declared as an `optionalDependency` (consistent with all other grammars), pinned to a specific commit for reproducible builds - -## Performance Improvements - -Testing with a large Delphi codebase (~3,400 files, ~244k nodes) uncovered performance bottlenecks in the reference resolution pipeline. 
The following fixes **benefit all languages**, not just Pascal: - -| Fix | Scope | Impact | -|-----|-------|--------| -| **Fuzzy match index** — replaced O(n) linear scan with lazily-built case-insensitive `Map` index | `name-matcher.ts` (all languages) | O(1) lookup per ref instead of iterating all nodes | -| **Import mapping cache** — cached per-file import mappings instead of re-reading/re-parsing for every ref | `import-resolver.ts` (all languages) | Eliminated redundant file I/O during resolution | -| **Kind cache** — pre-populated `getNodesByKind` results during warm-up | `resolution/index.ts` (all languages) | Avoided repeated DB queries for the same node kinds | -| **Pascal built-in filtering** — skip known RTL/VCL/FMX identifiers before resolution | `resolution/index.ts` (Pascal-specific) | ~60 built-in identifiers filtered out early | -| **Method index for `defProc`** — replaced O(n) `find()` with `Map` lookup when linking implementation bodies to declarations | `tree-sitter.ts` (Pascal-specific) | O(1) per implementation body | -| **Delphi-specific excludes** — `__history/**`, `__recovery/**`, `*.dcu` added to default excludes | `types.ts` (Pascal-specific) | Skips Delphi IDE temp files during indexing | - -**Result:** Reference resolution on a large Delphi project dropped from **~30 minutes to ~15 seconds** (120x speedup). The general improvements (fuzzy index, import cache, kind cache) will benefit all CodeGraph users. 
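
The fuzzy-match fix in the table above amounts to replacing a per-reference linear scan with a map that is built once, on first use. A minimal sketch of that idea follows. The `SymbolNode` shape and the `FuzzyNameIndex` class are illustrative assumptions, not the code in `name-matcher.ts`.

```typescript
// Illustrative node shape (assumption); the project's real node type has more fields.
interface SymbolNode {
  id: string;
  name: string;
}

// Hypothetical class: maps lower-cased names to nodes, built lazily on first lookup.
class FuzzyNameIndex {
  private index: Map<string, SymbolNode[]> | null = null;
  private readonly nodes: SymbolNode[];

  constructor(nodes: SymbolNode[]) {
    this.nodes = nodes;
  }

  // One O(n) pass over all nodes, paid once instead of once per reference.
  private build(): Map<string, SymbolNode[]> {
    const index = new Map<string, SymbolNode[]>();
    for (const node of this.nodes) {
      const key = node.name.toLowerCase();
      const bucket = index.get(key);
      if (bucket) bucket.push(node);
      else index.set(key, [node]);
    }
    return index;
  }

  // Case-insensitive lookup; average O(1) once the index exists.
  lookup(name: string): SymbolNode[] {
    if (this.index === null) this.index = this.build();
    return this.index.get(name.toLowerCase()) ?? [];
  }

  // Drop the index so the next lookup rebuilds it, for example after a re-index.
  clear(): void {
    this.index = null;
  }
}

const demoIndex = new FuzzyNameIndex([
  { id: "1", name: "TFormMain" },
  { id: "2", name: "tformmain" },
  { id: "3", name: "TCustomer" },
]);
console.log(demoIndex.lookup("TFORMMAIN").map((n) => n.id));
```

Keeping a bucket per key rather than a single node preserves every candidate that differs only by case, which matters for case-insensitive languages such as Pascal.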
- -## Files Changed - -| File | Change | -|------|--------| -| `src/types.ts` | Added `'pascal'` to `Language` type, file patterns to `DEFAULT_CONFIG.include` | -| `src/extraction/grammars.ts` | Grammar loader, extension mappings (`.pas`, `.dpr`, `.dpk`, `.lpr`, `.dfm`, `.fmx`), display name | -| `src/extraction/tree-sitter.ts` | Pascal `LanguageExtractor`, `visitPascalNode()` with 7 helper methods, `DfmExtractor` class, routing in `extractFromSource()`, method index | -| `src/resolution/index.ts` | Pascal built-in filtering, kind cache, cache clearing | -| `src/resolution/import-resolver.ts` | Import mapping cache | -| `src/resolution/name-matcher.ts` | Fuzzy match index (case-insensitive `Map`) | -| `package.json` | `tree-sitter-pascal` in `optionalDependencies` (pinned commit) | -| `__tests__/extraction.test.ts` | 37 new tests covering all Pascal and DFM extraction features | - -## Test Results - -- **36 new tests**, all passing -- **0 regressions** — the same 28 pre-existing failures (unrelated: missing Swift/Dart grammars, database path issues, MCP truncation test) are unchanged -- Tests cover: language detection, modules, imports, classes, records, interfaces, methods, visibility, static methods, enums, properties, constants, type aliases, calls, containment, full fixture files (UAuth.pas, UTypes.pas, MainForm.dfm) - -## Dependency Note - -The npm package `tree-sitter-pascal@0.0.1` is outdated (uses NAN bindings, incompatible with Node.js v24+). The implementation uses the actively maintained GitHub repository ([Isopod/tree-sitter-pascal](https://github.com/Isopod/tree-sitter-pascal), v0.10.2) with a pinned commit hash for deterministic builds. This is consistent with how `@sengac/tree-sitter-dart` handles a similar situation. - -## Testing Instructions - -### Prerequisites - -- Node.js >= 18 -- npm -- Git - -### 1. 
Clone and build - -```bash -git clone -b delphi-support https://github.com/omonien/codegraph.git -cd codegraph -npm install -npm run build -``` - -### 2. Link globally - -```bash -npm link -``` - -Verify with: - -```bash -codegraph --version -``` - -### 3. Index a Delphi project - -```bash -cd /path/to/your/delphi-project -codegraph init -i -codegraph index -``` - -### 4. Query the code graph - -```bash -codegraph status # Show index statistics -codegraph query "TFormMain" # Search for a symbol -codegraph context "What does TCustomer do?" # Build AI context -``` - -### 5. Set up the MCP server (for Claude Code) - -```bash -codegraph install -``` - -This configures the MCP server, tool permissions, auto-sync hooks, and CLAUDE.md in one step. After that, start Claude Code in the project — CodeGraph tools will be available immediately. - -### 6. Clean up - -```bash -npm unlink -g @colbymchenry/codegraph # Remove global link -rm -rf /path/to/delphi-project/.codegraph # Remove project index -``` diff --git a/IMPLEMENTATION_PLAN.md b/IMPLEMENTATION_PLAN.md deleted file mode 100644 index 65d99d82..00000000 --- a/IMPLEMENTATION_PLAN.md +++ /dev/null @@ -1,1736 +0,0 @@ -# CodeGraph: Universal Code Knowledge Graph - -## Overview - -CodeGraph is a local-first code intelligence system that builds a semantic knowledge graph from any codebase. It provides structural understanding of code relationships—not just text similarity—enabling AI assistants to understand how code connects, what depends on what, and what breaks when something changes. - -**Type:** Headless library (no UI components — purely an API) -**Runtime:** Node.js (works standalone, in Electron, or any Node environment) -**Distribution:** npm package, installable in any project -**Per-Project Data:** `.codegraph/` directory in each indexed project -**Core Principle:** Deterministic extraction from AST, not AI-generated summaries - -### Use Cases - -1. 
**Beads Dashboard** — Integrated as a library to provide code intelligence -2. **Claude Code CLI users** — Install globally, run `codegraph init` in any project -3. **Any Node.js application** — Import as a library for code analysis -4. **MCP Server** — Expose as an MCP tool that Claude Code can query directly - ---- - -## Goals - -1. **Universal language support** via tree-sitter (PHP, Swift, Kotlin, Java, TypeScript, Python, Liquid, Ruby, Go, Rust, C#, etc.) -2. **Zero external API dependencies** for core functionality (local embeddings, local database) -3. **Portable per-project installation** — each project gets its own `.codegraph/` directory -4. **Incremental updates** via git hooks and hash-based change detection -5. **Rich structural queries** — callers, callees, impact radius, dependency chains -6. **Semantic search** — vector similarity to find entry points, then graph expansion - ---- - -## Architecture - -``` -┌─────────────────────────────────────────────────────────────────┐ -│ CONSUMERS │ -│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────┐ │ -│ │ Beads │ │ Claude │ │ Any Node.js App │ │ -│ │ Dashboard │ │ Code CLI │ │ / MCP Server │ │ -│ │ (Electron) │ │ (Terminal) │ │ │ │ -│ └──────┬───────┘ └──────┬───────┘ └──────────┬───────────┘ │ -│ │ │ │ │ -│ └─────────────────┼──────────────────────┘ │ -│ │ │ -│ ▼ │ -├─────────────────────────────────────────────────────────────────┤ -│ CODEGRAPH LIBRARY │ -│ (npm package) │ -│ │ -│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐ │ -│ │ Context │ │ Query │ │ Sync │ │ -│ │ Builder │ │ Engine │ │ Manager │ │ -│ └──────┬──────┘ └──────┬──────┘ └──────────┬──────────────┘ │ -│ │ │ │ │ -│ └────────────────┼─────────────────────┘ │ -│ │ │ -│ ▼ │ -│ ┌─────────────────────────────────────────────────────────────┐│ -│ │ STORAGE LAYER ││ -│ │ SQLite + sqlite-vss (per project) ││ -│ │ .codegraph/graph.db ││ -│ └─────────────────────────────────────────────────────────────┘│ -│ ▲ │ -│ │ │ -│ 
┌─────────────────────────────────────────────────────────────┐│ -│ │ EXTRACTION LAYER ││ -│ │ ││ -│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ ││ -│ │ │ Tree-sitter │ │ Reference │ │ Framework │ ││ -│ │ │ Parser │ │ Resolver │ │ Patterns │ ││ -│ │ └─────────────┘ └─────────────┘ └─────────────────────┘ ││ -│ └─────────────────────────────────────────────────────────────┘│ -│ ▲ │ -│ │ │ -│ ┌─────────────────────────────────────────────────────────────┐│ -│ │ EMBEDDING LAYER ││ -│ │ Local ONNX Runtime + nomic-embed ││ -│ └─────────────────────────────────────────────────────────────┘│ -│ │ -└─────────────────────────────────────────────────────────────────┘ - -Per-Project Installation (created by codegraph init): -┌─────────────────────────────────────────────────────────────────┐ -│ my-laravel-app/ │ -│ ├── .codegraph/ │ -│ │ ├── graph.db # SQLite database with vectors │ -│ │ ├── config.json # Project-specific settings │ -│ │ └── .gitignore # Ignore db, keep config │ -│ ├── .git/ │ -│ │ └── hooks/ │ -│ │ └── post-commit # Triggers incremental reindex │ -│ ├── app/ │ -│ ├── routes/ │ -│ └── ... 
│ -└─────────────────────────────────────────────────────────────────┘ -``` - ---- - -## File Structure (npm package) - -``` -codegraph/ -├── package.json -├── tsconfig.json -├── README.md -│ -├── src/ -│ ├── index.ts # Main CodeGraph class, public API -│ ├── types.ts # TypeScript interfaces -│ │ -│ ├── db/ -│ │ ├── index.ts # Database initialization -│ │ ├── schema.sql # Table definitions -│ │ ├── migrations.ts # Schema versioning -│ │ └── queries.ts # Prepared statements -│ │ -│ ├── extraction/ -│ │ ├── index.ts # Extraction orchestrator -│ │ ├── tree-sitter.ts # Universal parser wrapper -│ │ ├── grammars.ts # Grammar loading and caching -│ │ └── queries/ # Tree-sitter query files (.scm) -│ │ ├── typescript.scm -│ │ ├── javascript.scm -│ │ ├── php.scm -│ │ ├── swift.scm -│ │ ├── kotlin.scm -│ │ ├── java.scm -│ │ ├── python.scm -│ │ ├── ruby.scm -│ │ ├── liquid.scm -│ │ ├── go.scm -│ │ └── csharp.scm -│ │ -│ ├── resolution/ -│ │ ├── index.ts # Reference resolver orchestrator -│ │ ├── name-matcher.ts # Symbol name matching -│ │ ├── import-resolver.ts # Import path resolution -│ │ └── frameworks/ # Framework-specific patterns -│ │ ├── index.ts -│ │ ├── laravel.ts -│ │ ├── express.ts -│ │ ├── nextjs.ts -│ │ ├── rails.ts -│ │ ├── shopify.ts -│ │ ├── spring.ts -│ │ └── swiftui.ts -│ │ -│ ├── graph/ -│ │ ├── index.ts # Graph query interface -│ │ ├── traversal.ts # BFS/DFS, impact radius -│ │ └── serialize.ts # Subgraph to context format -│ │ -│ ├── vectors/ -│ │ ├── index.ts # Vector operations interface -│ │ ├── embedder.ts # ONNX runtime + model -│ │ └── search.ts # Similarity search -│ │ -│ ├── sync/ -│ │ ├── index.ts # Sync orchestrator -│ │ ├── git-hooks.ts # Hook installation -│ │ └── hasher.ts # Content hashing for diffing -│ │ -│ └── context/ -│ ├── index.ts # Context builder -│ └── formatter.ts # Output formatting for Claude -│ -├── bin/ -│ └── codegraph.ts # CLI entry point (optional standalone usage) -│ -└── __tests__/ # Test files mirror src structure - ├── 
extraction/ - ├── resolution/ - ├── graph/ - └── fixtures/ # Sample code files for testing -``` - ---- - -## Database Schema - -**File: `src/db/schema.sql`** - -```sql --- ============================================================ --- CODEGRAPH SCHEMA v1 --- ============================================================ - --- Metadata table for schema versioning and project info -CREATE TABLE IF NOT EXISTS meta ( - key TEXT PRIMARY KEY, - value TEXT NOT NULL -); - --- ============================================================ --- NODES: Every significant code entity --- ============================================================ -CREATE TABLE IF NOT EXISTS nodes ( - id TEXT PRIMARY KEY, -- Unique ID: "func:src/auth.ts:validateToken:45" - kind TEXT NOT NULL, -- file, function, method, class, interface, type, variable, route, component, config - name TEXT NOT NULL, -- Human-readable: "validateToken" - qualified_name TEXT, -- Full path: "AuthService.validateToken" - file_path TEXT NOT NULL, -- Relative path: "src/services/auth.ts" - start_line INTEGER, - end_line INTEGER, - start_column INTEGER, - end_column INTEGER, - language TEXT NOT NULL, -- typescript, php, swift, etc. 
- signature TEXT, -- For functions: "(token: string) => Promise" - docstring TEXT, -- Extracted documentation - code_snippet TEXT, -- First ~500 chars of code for quick preview - code_hash TEXT NOT NULL, -- SHA256 of full code block - metadata TEXT, -- JSON: extra language/framework-specific data - created_at INTEGER NOT NULL, - updated_at INTEGER NOT NULL -); - --- ============================================================ --- EDGES: Relationships between nodes --- ============================================================ -CREATE TABLE IF NOT EXISTS edges ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - source_id TEXT NOT NULL, - target_id TEXT NOT NULL, - kind TEXT NOT NULL, -- imports, calls, extends, implements, returns_type, throws, reads, writes, renders, instantiates - resolved INTEGER DEFAULT 0, -- 0 = unresolved (name only), 1 = resolved to actual node - target_name TEXT, -- Original name before resolution (for unresolved edges) - line_number INTEGER, -- Where this relationship occurs - metadata TEXT, -- JSON: additional context - UNIQUE(source_id, target_id, kind, line_number), - FOREIGN KEY (source_id) REFERENCES nodes(id) ON DELETE CASCADE - -- Note: target_id may reference non-existent node if unresolved/external -); - --- ============================================================ --- FILES: Track file-level state for incremental updates --- ============================================================ -CREATE TABLE IF NOT EXISTS files ( - path TEXT PRIMARY KEY, -- Relative file path - content_hash TEXT NOT NULL, -- SHA256 of file contents - language TEXT NOT NULL, - last_indexed INTEGER NOT NULL, -- Unix timestamp - node_count INTEGER DEFAULT 0, - error TEXT -- Last indexing error, if any -); - --- ============================================================ --- VECTOR EMBEDDINGS (sqlite-vss) --- ============================================================ - --- Virtual table for vector similarity search --- Dimension 384 for nomic-embed-text-v1.5 
-CREATE VIRTUAL TABLE IF NOT EXISTS node_vectors USING vss0( - embedding(384) -); - --- Map vector rowids to nodes -CREATE TABLE IF NOT EXISTS vector_map ( - rowid INTEGER PRIMARY KEY, - node_id TEXT NOT NULL UNIQUE, - text_hash TEXT NOT NULL, -- Hash of text that was embedded - FOREIGN KEY (node_id) REFERENCES nodes(id) ON DELETE CASCADE -); - --- ============================================================ --- INDEXES --- ============================================================ -CREATE INDEX IF NOT EXISTS idx_nodes_file ON nodes(file_path); -CREATE INDEX IF NOT EXISTS idx_nodes_kind ON nodes(kind); -CREATE INDEX IF NOT EXISTS idx_nodes_name ON nodes(name); -CREATE INDEX IF NOT EXISTS idx_nodes_language ON nodes(language); -CREATE INDEX IF NOT EXISTS idx_edges_source ON edges(source_id); -CREATE INDEX IF NOT EXISTS idx_edges_target ON edges(target_id); -CREATE INDEX IF NOT EXISTS idx_edges_kind ON edges(kind); -CREATE INDEX IF NOT EXISTS idx_edges_resolved ON edges(resolved); -``` - ---- - -## Type Definitions - -**File: `src/types.ts`** - -```typescript -// ============================================================ -// CORE TYPES -// ============================================================ - -export type NodeKind = - | 'file' - | 'function' - | 'method' - | 'class' - | 'interface' - | 'type' - | 'variable' - | 'constant' - | 'route' - | 'component' - | 'config' - | 'module' - | 'namespace'; - -export type EdgeKind = - | 'imports' - | 'exports' - | 'calls' - | 'called_by' // Reverse of calls, computed - | 'extends' - | 'implements' - | 'returns_type' - | 'throws' - | 'reads' - | 'writes' - | 'renders' // React/Vue component rendering - | 'instantiates' - | 'decorates' // Decorators/attributes - | 'depends_on'; // Generic dependency - -export type Language = - | 'typescript' - | 'javascript' - | 'php' - | 'swift' - | 'kotlin' - | 'java' - | 'python' - | 'ruby' - | 'go' - | 'rust' - | 'csharp' - | 'liquid' - | 'vue' - | 'svelte'; - -export interface Node { 
- id: string; - kind: NodeKind; - name: string; - qualifiedName?: string; - filePath: string; - startLine?: number; - endLine?: number; - startColumn?: number; - endColumn?: number; - language: Language; - signature?: string; - docstring?: string; - codeSnippet?: string; - codeHash: string; - metadata?: Record; - createdAt: number; - updatedAt: number; -} - -export interface Edge { - id?: number; - sourceId: string; - targetId: string; - kind: EdgeKind; - resolved: boolean; - targetName?: string; - lineNumber?: number; - metadata?: Record; -} - -export interface FileRecord { - path: string; - contentHash: string; - language: Language; - lastIndexed: number; - nodeCount: number; - error?: string; -} - -// ============================================================ -// EXTRACTION TYPES -// ============================================================ - -export interface ExtractionResult { - nodes: Node[]; - edges: Edge[]; - errors: ExtractionError[]; -} - -export interface ExtractionError { - filePath: string; - line?: number; - message: string; - recoverable: boolean; -} - -export interface UnresolvedReference { - sourceId: string; - targetName: string; - kind: EdgeKind; - lineNumber?: number; - context?: string; // Surrounding code for better resolution -} - -// ============================================================ -// QUERY TYPES -// ============================================================ - -export interface Subgraph { - nodes: Node[]; - edges: Edge[]; - entryPoints: string[]; // Node IDs that initiated the query - stats: { - totalNodes: number; - totalEdges: number; - maxDepth: number; - }; -} - -export interface TraversalOptions { - maxDepth?: number; // Default: 2 - maxNodes?: number; // Default: 50 - edgeKinds?: EdgeKind[]; // Filter by edge type - nodeKinds?: NodeKind[]; // Filter by node type - direction?: 'outbound' | 'inbound' | 'both'; -} - -export interface SearchOptions { - limit?: number; // Default: 10 - nodeKinds?: NodeKind[]; // Filter 
results - minScore?: number; // Similarity threshold -} - -export interface SearchResult { - node: Node; - score: number; -} - -// ============================================================ -// CONTEXT TYPES -// ============================================================ - -export interface Context { - subgraph: Subgraph; - codeBlocks: CodeBlock[]; - summary: string; - relatedFiles: string[]; -} - -export interface CodeBlock { - nodeId: string; - nodeName: string; - nodeKind: NodeKind; - filePath: string; - startLine: number; - endLine: number; - code: string; - language: Language; -} - -// ============================================================ -// CONFIG TYPES -// ============================================================ - -export interface CodeGraphConfig { - version: number; - projectName?: string; - languages: Language[]; - exclude: string[]; // Glob patterns to ignore - include?: string[]; // Override: only index these - frameworks: FrameworkHint[]; // Help with resolution - embeddingModel: 'nomic-embed-text-v1.5' | 'all-MiniLM-L6-v2'; - chunkStrategy: 'ast' | 'hybrid'; - maxFileSize: number; // Skip files larger than this (bytes) - gitHooksEnabled: boolean; -} - -export type FrameworkHint = - | 'laravel' - | 'express' - | 'nextjs' - | 'nuxt' - | 'rails' - | 'django' - | 'flask' - | 'spring' - | 'swiftui' - | 'uikit' - | 'android' - | 'shopify' - | 'react' - | 'vue' - | 'svelte'; - -export const DEFAULT_CONFIG: CodeGraphConfig = { - version: 1, - languages: [], - exclude: [ - 'node_modules/**', - 'vendor/**', - '.git/**', - 'dist/**', - 'build/**', - '*.min.js', - '*.bundle.js', - '__pycache__/**', - '.venv/**', - 'Pods/**', - '.gradle/**', - ], - frameworks: [], - embeddingModel: 'nomic-embed-text-v1.5', - chunkStrategy: 'ast', - maxFileSize: 1024 * 1024, // 1MB - gitHooksEnabled: true, -}; -``` - ---- - -## Public API - -**File: `src/index.ts`** - -```typescript -export class CodeGraph { - // 
============================================================ - // LIFECYCLE - // ============================================================ - - /** - * Initialize CodeGraph for a project directory. - * Creates .codegraph/ if it doesn't exist. - */ - static async init(projectPath: string, config?: Partial<CodeGraphConfig>): Promise<CodeGraph>; - - /** - * Open existing CodeGraph for a project. - * Throws if not initialized. - */ - static async open(projectPath: string): Promise<CodeGraph>; - - /** - * Check if a project has CodeGraph initialized. - */ - static async isInitialized(projectPath: string): Promise<boolean>; - - /** - * Close database connections and clean up. - */ - async close(): Promise<void>; - - // ============================================================ - // INDEXING - // ============================================================ - - /** - * Full index of the entire project. - * Use for initial setup or complete rebuild. - */ - async indexAll(options?: { - onProgress?: (progress: IndexProgress) => void; - signal?: AbortSignal; - }): Promise<IndexResult>; - - /** - * Index specific files only. - * Use for incremental updates. - */ - async indexFiles(filePaths: string[]): Promise<IndexResult>; - - /** - * Sync with current file state. - * Detects changes via content hashing, reindexes only changed files. - */ - async sync(): Promise<SyncResult>; - - /** - * Get current index status. - */ - async getStatus(): Promise<IndexStatus>; - - // ============================================================ - // GRAPH QUERIES - // ============================================================ - - /** - * Get a node by ID. - */ - async getNode(nodeId: string): Promise<Node | null>; - - /** - * Find nodes by name (exact or fuzzy). - */ - async findNodes(query: string, options?: { - fuzzy?: boolean; - kinds?: NodeKind[]; - limit?: number; - }): Promise<Node[]>; - - /** - * Get all edges from/to a node. - */ - async getEdges(nodeId: string, direction?: 'outbound' | 'inbound' | 'both'): Promise<Edge[]>; - - /** - * Get nodes that call this node. 
- */ - async getCallers(nodeId: string): Promise<Node[]>; - - /** - * Get nodes that this node calls. - */ - async getCallees(nodeId: string): Promise<Node[]>; - - /** - * Get nodes that this node depends on. - */ - async getDependencies(nodeId: string): Promise<Node[]>; - - /** - * Get nodes that depend on this node. - */ - async getDependents(nodeId: string): Promise<Node[]>; - - /** - * Traverse the graph from starting nodes. - * Returns a subgraph of connected nodes up to maxDepth. - */ - async traverse(startNodeIds: string[], options?: TraversalOptions): Promise<Subgraph>; - - /** - * Get impact radius: what could be affected by changing this node. - */ - async getImpactRadius(nodeId: string, options?: TraversalOptions): Promise<Subgraph>; - - /** - * Find paths between two nodes. - */ - async findPaths(fromId: string, toId: string, options?: { - maxDepth?: number; - maxPaths?: number; - }): Promise<Path[]>; - - // ============================================================ - // SEMANTIC SEARCH - // ============================================================ - - /** - * Search for nodes by semantic similarity. - */ - async search(query: string, options?: SearchOptions): Promise<SearchResult[]>; - - /** - * Find relevant subgraph for a natural language query. - * Combines semantic search with graph traversal. - */ - async findRelevantContext(query: string, options?: { - searchLimit?: number; - traversalDepth?: number; - maxNodes?: number; - }): Promise<Subgraph>; - - // ============================================================ - // CONTEXT BUILDING - // ============================================================ - - /** - * Build context for a task/issue. - * Returns structured context ready to inject into Claude. - */ - async buildContext(input: string | { title: string; description?: string }, options?: { - maxNodes?: number; - includeCode?: boolean; - format?: 'markdown' | 'json'; - }): Promise<Context | string>; - - /** - * Get the full code for a node. 
- */ - async getCode(nodeId: string): Promise<string | null>; - - // ============================================================ - // GIT INTEGRATION - // ============================================================ - - /** - * Install git hooks for automatic incremental indexing. - */ - async installGitHooks(): Promise<void>; - - /** - * Remove git hooks. - */ - async removeGitHooks(): Promise<void>; - - /** - * Get files changed since last index. - */ - async getChangedFiles(): Promise<string[]>; - - // ============================================================ - // UTILITIES - // ============================================================ - - /** - * Get statistics about the indexed codebase. - */ - async getStats(): Promise<GraphStats>; - - /** - * Export the graph to JSON. - */ - async export(): Promise<ExportedGraph>; - - /** - * Update configuration. - */ - async updateConfig(config: Partial<CodeGraphConfig>): Promise<void>; - - /** - * Get current configuration. - */ - getConfig(): CodeGraphConfig; -} - -// ============================================================ -// RESULT TYPES -// ============================================================ - -export interface IndexProgress { - phase: 'scanning' | 'parsing' | 'resolving' | 'embedding'; - current: number; - total: number; - currentFile?: string; -} - -export interface IndexResult { - success: boolean; - filesIndexed: number; - nodesCreated: number; - edgesCreated: number; - errors: ExtractionError[]; - duration: number; -} - -export interface SyncResult { - filesChecked: number; - filesChanged: number; - filesAdded: number; - filesRemoved: number; - nodesUpdated: number; - duration: number; -} - -export interface IndexStatus { - initialized: boolean; - lastIndexed?: number; - totalFiles: number; - totalNodes: number; - totalEdges: number; - languages: Language[]; - unresolvedReferences: number; -} - -export interface GraphStats { - files: number; - nodes: { - total: number; - byKind: Record<string, number>; - byLanguage: Record<string, number>; - }; - edges: { - total: number; - byKind: Record<string, number>; - resolved: number; 
- unresolved: number; - }; - vectors: number; -} - -export interface Path { - nodes: Node[]; - edges: Edge[]; - length: number; -} - -export interface ExportedGraph { - version: number; - exportedAt: number; - config: CodeGraphConfig; - stats: GraphStats; - nodes: Node[]; - edges: Edge[]; -} -``` - ---- - -## Tree-sitter Extraction Queries - -These `.scm` files define what to extract from each language. - -**File: `src/extraction/queries/typescript.scm`** - -```scheme -; ============================================================ -; TYPESCRIPT/JAVASCRIPT EXTRACTION QUERIES -; ============================================================ - -; Functions -(function_declaration - name: (identifier) @function.name - parameters: (formal_parameters) @function.params - return_type: (type_annotation)? @function.return_type - body: (statement_block) @function.body -) @function.definition - -; Arrow functions assigned to variables -(lexical_declaration - (variable_declarator - name: (identifier) @function.name - value: (arrow_function - parameters: (formal_parameters) @function.params - return_type: (type_annotation)? @function.return_type - body: (_) @function.body - ) - ) -) @function.definition - -; Classes -(class_declaration - name: (type_identifier) @class.name - (class_heritage - (extends_clause - value: (identifier) @class.extends - )? - (implements_clause - (type_identifier) @class.implements - )* - )? - body: (class_body) @class.body -) @class.definition - -; Methods -(method_definition - name: (property_identifier) @method.name - parameters: (formal_parameters) @method.params - return_type: (type_annotation)? @method.return_type - body: (statement_block) @method.body -) @method.definition - -; Interfaces -(interface_declaration - name: (type_identifier) @interface.name - (extends_type_clause - (type_identifier) @interface.extends - )? 
- body: (interface_body) @interface.body -) @interface.definition - -; Type aliases -(type_alias_declaration - name: (type_identifier) @type.name - value: (_) @type.value -) @type.definition - -; Imports -(import_statement - (import_clause - (identifier)? @import.default - (named_imports - (import_specifier - name: (identifier) @import.named - alias: (identifier)? @import.alias - )* - )? - )? - source: (string) @import.source -) @import.statement - -; Exports -(export_statement - (export_clause - (export_specifier - name: (identifier) @export.name - )* - )? - declaration: (_)? @export.declaration -) @export.statement - -; Function calls -(call_expression - function: [ - (identifier) @call.function - (member_expression - object: (_) @call.object - property: (property_identifier) @call.method - ) - ] - arguments: (arguments) @call.args -) @call.expression - -; Variable declarations (const/let with significant values) -(lexical_declaration - (variable_declarator - name: (identifier) @variable.name - value: (_) @variable.value - ) -) @variable.declaration - -; JSDoc comments -(comment) @comment -``` - -**File: `src/extraction/queries/php.scm`** - -```scheme -; ============================================================ -; PHP EXTRACTION QUERIES -; ============================================================ - -; Classes -(class_declaration - name: (name) @class.name - (base_clause - (name) @class.extends - )? - (class_interface_clause - (name) @class.implements - )* - body: (declaration_list) @class.body -) @class.definition - -; Methods -(method_declaration - (visibility_modifier)? @method.visibility - name: (name) @method.name - parameters: (formal_parameters) @method.params - return_type: (return_type)? @method.return_type - body: (compound_statement) @method.body -) @method.definition - -; Functions -(function_definition - name: (name) @function.name - parameters: (formal_parameters) @function.params - return_type: (return_type)? 
@function.return_type - body: (compound_statement) @function.body -) @function.definition - -; Interfaces -(interface_declaration - name: (name) @interface.name - (base_clause - (name) @interface.extends - )? - body: (declaration_list) @interface.body -) @interface.definition - -; Traits -(trait_declaration - name: (name) @trait.name - body: (declaration_list) @trait.body -) @trait.definition - -; Use statements (imports) -(namespace_use_declaration - (namespace_use_clause - (qualified_name) @import.name - (namespace_aliasing_clause - (name) @import.alias - )? - ) -) @import.statement - -; Static method calls (e.g., User::find()) -(scoped_call_expression - scope: (name) @call.class - name: (name) @call.method - arguments: (arguments) @call.args -) @call.static - -; Instance method calls -(member_call_expression - object: (_) @call.object - name: (name) @call.method - arguments: (arguments) @call.args -) @call.instance - -; Function calls -(function_call_expression - function: (name) @call.function - arguments: (arguments) @call.args -) @call.expression - -; Route definitions (Laravel-specific pattern) -(member_call_expression - object: (name) @_route (#eq? @_route "Route") - name: (name) @route.method - arguments: (arguments - (argument - (string) @route.path - ) - ) -) @route.definition - -; PHPDoc comments -(comment) @comment -``` - -**File: `src/extraction/queries/swift.scm`** - -```scheme -; ============================================================ -; SWIFT EXTRACTION QUERIES -; ============================================================ - -; Classes -(class_declaration - name: (type_identifier) @class.name - (type_inheritance_clause - (type_identifier) @class.inherits - )? - body: (class_body) @class.body -) @class.definition - -; Structs -(struct_declaration - name: (type_identifier) @struct.name - (type_inheritance_clause - (type_identifier) @struct.conforms - )? 
- body: (struct_body) @struct.body -) @struct.definition - -; Protocols -(protocol_declaration - name: (type_identifier) @protocol.name - body: (protocol_body) @protocol.body -) @protocol.definition - -; Functions -(function_declaration - name: (simple_identifier) @function.name - (parameter_clause) @function.params - (function_result - (type_annotation) @function.return_type - )? - body: (function_body) @function.body -) @function.definition - -; Methods (inside class/struct) -(function_declaration - name: (simple_identifier) @method.name - (parameter_clause) @method.params - body: (function_body) @method.body -) @method.definition - -; Properties -(property_declaration - (pattern - (simple_identifier) @property.name - ) - (type_annotation)? @property.type -) @property.definition - -; Imports -(import_declaration - (identifier) @import.module -) @import.statement - -; Function calls -(call_expression - (simple_identifier) @call.function - (call_suffix - (value_arguments) @call.args - ) -) @call.expression - -; Method calls -(call_expression - (navigation_expression - (_) @call.object - (navigation_suffix - (simple_identifier) @call.method - ) - ) - (call_suffix - (value_arguments) @call.args - ) -) @call.method - -; SwiftUI View bodies -(computed_property - name: (simple_identifier) @_body (#eq? @_body "body") - (type_annotation - (user_type - (type_identifier) @_view (#match? @_view "View") - ) - )? 
- getter: (_) @view.body -) @view.definition - -; Documentation comments -(comment) @comment -(multiline_comment) @comment.multiline -``` - ---- - -## Framework Pattern Resolvers - -**File: `src/resolution/frameworks/laravel.ts`** - -```typescript -import { FrameworkResolver, UnresolvedReference, ResolvedReference } from '../types'; - -export const laravelResolver: FrameworkResolver = { - name: 'laravel', - - // Detect if this is a Laravel project - detect: async (projectPath: string): Promise<boolean> => { - return await fileExists(join(projectPath, 'artisan')); - }, - - patterns: [ - // Eloquent Model static calls: User::find(), Post::where() - { - pattern: /^([A-Z][a-zA-Z]+)::(\w+)$/, - resolve: async (match, context) => { - const [, className, methodName] = match; - - // Check app/Models first (Laravel 8+) - let modelPath = `app/Models/${className}.php`; - if (await context.fileExists(modelPath)) { - return { filePath: modelPath, className, methodName }; - } - - // Fall back to app/ (Laravel 7 and below) - modelPath = `app/${className}.php`; - if (await context.fileExists(modelPath)) { - return { filePath: modelPath, className, methodName }; - } - - return null; - } - }, - - // Facade calls: Auth::user(), Cache::get() - { - pattern: /^(Auth|Cache|DB|Log|Mail|Queue|Session|Storage|Validator)::(\w+)$/, - resolve: async (match, context) => { - const [, facade, method] = match; - // Facades resolve to underlying service - we can link to the facade for now - return { - filePath: `vendor/laravel/framework/src/Illuminate/Support/Facades/${facade}.php`, - className: facade, - methodName: method, - isExternal: true - }; - } - }, - - // Route helpers: route('checkout.store') - { - pattern: /route\(['"]([^'"]+)['"]\)/, - resolve: async (match, context) => { - const [, routeName] = match; - // Search routes/web.php and routes/api.php for ->name('routeName') - const routeFiles = ['routes/web.php', 'routes/api.php']; - for (const file of routeFiles) { - const content = await 
context.readFile(file); - if (content?.includes(`name('${routeName}')`)) { - return { filePath: file, routeName }; - } - } - return null; - } - }, - - // View helpers: view('checkout.form') - { - pattern: /view\(['"]([^'"]+)['"]\)/, - resolve: async (match, context) => { - const [, viewName] = match; - const viewPath = viewName.replace(/\./g, '/'); - - // Check both .blade.php and .php - const candidates = [ - `resources/views/${viewPath}.blade.php`, - `resources/views/${viewPath}.php` - ]; - - for (const candidate of candidates) { - if (await context.fileExists(candidate)) { - return { filePath: candidate, viewName }; - } - } - return null; - } - }, - - // Controller references in routes - { - pattern: /\[([A-Z][a-zA-Z]+Controller)::class,\s*['"](\w+)['"]\]/, - resolve: async (match, context) => { - const [, controller, method] = match; - const controllerPath = `app/Http/Controllers/${controller}.php`; - if (await context.fileExists(controllerPath)) { - return { filePath: controllerPath, className: controller, methodName: method }; - } - return null; - } - } - ], - - // Additional node detection specific to Laravel - extractNodes: async (filePath: string, content: string) => { - const nodes: Node[] = []; - - // Detect route definitions - const routePattern = /Route::(get|post|put|patch|delete)\(\s*['"]([^'"]+)['"]/g; - let match; - while ((match = routePattern.exec(content)) !== null) { - const [, method, path] = match; - const line = content.slice(0, match.index).split('\n').length; - nodes.push({ - id: `route:${filePath}:${method.toUpperCase()}:${path}`, - kind: 'route', - name: `${method.toUpperCase()} ${path}`, - filePath, - startLine: line, - language: 'php', - metadata: { httpMethod: method.toUpperCase(), path } - }); - } - - return nodes; - } -}; -``` - -**File: `src/resolution/frameworks/shopify.ts`** - -```typescript -import { FrameworkResolver } from '../types'; - -export const shopifyResolver: FrameworkResolver = { - name: 'shopify', - - detect: async 
(projectPath: string): Promise<boolean> => { - return await fileExists(join(projectPath, 'shopify.theme.toml')) || - await fileExists(join(projectPath, 'config/settings_schema.json')); - }, - - patterns: [ - // Render tags: {% render 'product-card' %} - { - pattern: /\{%\s*render\s+['"]([^'"]+)['"]/, - resolve: async (match, context) => { - const [, snippetName] = match; - const snippetPath = `snippets/${snippetName}.liquid`; - if (await context.fileExists(snippetPath)) { - return { filePath: snippetPath, kind: 'renders' }; - } - return null; - } - }, - - // Include tags: {% include 'header' %} - { - pattern: /\{%\s*include\s+['"]([^'"]+)['"]/, - resolve: async (match, context) => { - const [, snippetName] = match; - const snippetPath = `snippets/${snippetName}.liquid`; - if (await context.fileExists(snippetPath)) { - return { filePath: snippetPath, kind: 'includes' }; - } - return null; - } - }, - - // Section tags: {% section 'header' %} - { - pattern: /\{%\s*section\s+['"]([^'"]+)['"]/, - resolve: async (match, context) => { - const [, sectionName] = match; - const sectionPath = `sections/${sectionName}.liquid`; - if (await context.fileExists(sectionPath)) { - return { filePath: sectionPath, kind: 'renders' }; - } - return null; - } - }, - - // Asset URLs: {{ 'style.css' | asset_url }} - { - pattern: /['"]([\w\-\.]+)['"]\s*\|\s*asset_url/, - resolve: async (match, context) => { - const [, assetName] = match; - const assetPath = `assets/${assetName}`; - if (await context.fileExists(assetPath)) { - return { filePath: assetPath, kind: 'references' }; - } - return null; - } - } - ], - - extractNodes: async (filePath: string, content: string) => { - const nodes: Node[] = []; - - // Detect schema in sections - const schemaMatch = content.match(/\{%\s*schema\s*%\}([\s\S]*?)\{%\s*endschema\s*%\}/); - if (schemaMatch) { - try { - const schema = JSON.parse(schemaMatch[1]); - if (schema.name) { - nodes.push({ - id: `section:${filePath}`, - kind: 'component', - name: schema.name, - 
filePath, - language: 'liquid', - metadata: { - schemaSettings: schema.settings?.map(s => s.id), - schemaBlocks: schema.blocks?.map(b => b.type) - } - }); - } - } catch (e) { - // Invalid JSON in schema - } - } - - return nodes; - } -}; -``` - ---- - -## Context Builder Output Format - -**File: `src/context/formatter.ts`** - -```typescript -export function formatContextAsMarkdown(context: Context): string { - const lines: string[] = []; - - lines.push('## Code Context\n'); - - // Graph structure section - lines.push('### Structure\n'); - lines.push('```'); - for (const nodeId of context.subgraph.entryPoints) { - const node = context.subgraph.nodes.find(n => n.id === nodeId); - if (node) { - lines.push(formatNodeTree(node, context.subgraph, 0)); - } - } - lines.push('```\n'); - - // Code blocks section - if (context.codeBlocks.length > 0) { - lines.push('### Code\n'); - for (const block of context.codeBlocks) { - lines.push(`#### ${block.nodeName} (${block.filePath}:${block.startLine})\n`); - lines.push('```' + block.language); - lines.push(block.code); - lines.push('```\n'); - } - } - - // Related files section - if (context.relatedFiles.length > 0) { - lines.push('### Related Files\n'); - for (const file of context.relatedFiles) { - lines.push(`- ${file}`); - } - } - - return lines.join('\n'); -} - -function formatNodeTree(node: Node, subgraph: Subgraph, depth: number): string { - const indent = ' '.repeat(depth); - const lines: string[] = []; - - // Node header - const location = node.startLine ? 
`:${node.startLine}` : ''; - lines.push(`${indent}${node.name} (${node.filePath}${location})`); - - // Outbound edges - const outbound = subgraph.edges.filter(e => e.sourceId === node.id); - for (const edge of outbound) { - const target = subgraph.nodes.find(n => n.id === edge.targetId); - const targetName = target?.name || edge.targetName || 'unknown'; - lines.push(`${indent}├── ${edge.kind} → ${targetName}`); - } - - return lines.join('\n'); -} - -// Example output: -// -// ## Code Context -// -// ### Structure -// ``` -// CheckoutController (app/Http/Controllers/CheckoutController.php:15) -// ├── calls → CartService.getCart -// ├── calls → PaymentService.processPayment -// ├── calls → OrderService.create -// ├── throws → PaymentException -// -// PaymentService (app/Services/PaymentService.php:8) -// ├── calls → StripeClient.charge -// ├── calls → TransactionRepository.save -// ├── throws → PaymentException -// ├── throws → StripeTimeoutException -// ``` -// -// ### Code -// -// #### store (app/Http/Controllers/CheckoutController.php:45) -// ```php -// public function store(Request $request) -// { -// $cart = $this->cartService->getCart($request->user()); -// $payment = $this->paymentService->processPayment($cart); -// ... 
-// } -// ``` -``` - ---- - -## Installation & Integration - -**How to use CodeGraph (headless library, no UI):** - -### Option 1: CLI (for any project, no code required) - -```bash -# Install globally -npm install -g codegraph - -# Initialize in any project -cd /path/to/my-laravel-app -codegraph init - -# Index the codebase -codegraph index - -# Query the graph -codegraph query "what calls PaymentService" -codegraph impact "app/Services/AuthService.php" - -# Build context for a task (outputs markdown) -codegraph context "Fix checkout silent failure" - -# Check status -codegraph status - -# Sync after changes -codegraph sync -``` - -### Option 2: Library (for integration into apps like Beads Dashboard) - -```typescript -import { CodeGraph } from 'codegraph'; - -// Initialize for a project -const graph = await CodeGraph.init('/path/to/project'); - -// Full index with optional progress callback -await graph.indexAll({ - onProgress: (progress) => { - console.log(`${progress.phase}: ${progress.current}/${progress.total}`); - } -}); - -// Or open existing and sync -const graph = await CodeGraph.open('/path/to/project'); -const syncResult = await graph.sync(); - -// Build context for a task (returns structured data) -const context = await graph.buildContext('Fix checkout silent failure'); - -// Query the graph directly -const callers = await graph.getCallers('func:src/payment.ts:processPayment:45'); -const impact = await graph.getImpactRadius('class:AuthService', { maxDepth: 2 }); - -// Search semantically -const results = await graph.search('authentication middleware'); - -// Clean up -await graph.close(); -``` - -### Option 3: MCP Server (for Claude Code CLI integration) - -```bash -# Run as MCP server (Claude Code can query directly) -codegraph serve --mcp - -# In Claude Code's MCP config, add: -# { -# "codegraph": { -# "command": "codegraph", -# "args": ["serve", "--mcp", "--project", "/path/to/project"] -# } -# } -``` - -Then Claude Code can use tools like: -- 
`codegraph_search` — semantic search -- `codegraph_context` — build context for a task -- `codegraph_callers` — who calls this function -- `codegraph_impact` — what's affected if I change this - -**What gets created in the project:** - -``` -my-project/ -├── .codegraph/ -│ ├── graph.db # SQLite database (gitignored) -│ ├── config.json # User can customize (committed) -│ └── .gitignore # Contains: graph.db -└── .git/ - └── hooks/ - └── post-commit # Auto-installed hook -``` - -**Default `.codegraph/config.json`:** - -```json -{ - "version": 1, - "exclude": [ - "node_modules/**", - "vendor/**", - "dist/**", - "build/**" - ], - "frameworks": ["laravel"], - "gitHooksEnabled": true -} -``` - ---- - -## Implementation Phases - -### Phase 1: Foundation (Week 1) -- [ ] Project structure setup (npm package) -- [ ] SQLite database initialization with schema -- [ ] Basic types and interfaces -- [ ] Config file handling -- [ ] .codegraph/ directory management - -### Phase 2: Tree-sitter Extraction (Week 1-2) -- [ ] Tree-sitter native bindings setup (works in Node.js, Electron, etc.) 
-- [ ] Grammar loading system -- [ ] TypeScript/JavaScript extraction queries -- [ ] PHP extraction queries -- [ ] Basic node/edge extraction from AST - -### Phase 3: Reference Resolution (Week 2) -- [ ] Name-based symbol matching -- [ ] Import path resolution -- [ ] Laravel framework patterns -- [ ] Express/Next.js patterns -- [ ] Unresolved reference tracking - -### Phase 4: Graph Queries (Week 2-3) -- [ ] Basic traversal (callers, callees) -- [ ] Impact radius calculation -- [ ] Path finding between nodes -- [ ] Subgraph extraction - -### Phase 5: Vector Embeddings (Week 3) -- [ ] ONNX runtime integration -- [ ] nomic-embed-text model loading -- [ ] sqlite-vss setup -- [ ] Embedding generation for nodes -- [ ] Similarity search - -### Phase 6: Context Builder (Week 3-4) -- [ ] Semantic search → graph expansion pipeline -- [ ] Context formatting for Claude -- [ ] Code snippet extraction -- [ ] Output size management - -### Phase 7: Sync & Freshness (Week 4) -- [ ] Content hashing for change detection -- [ ] Incremental reindexing -- [ ] Git hook installation -- [ ] Post-commit handler - -### Phase 8: Additional Languages (Week 4+) -- [ ] Swift extraction queries -- [ ] Kotlin extraction queries -- [ ] Java extraction queries -- [ ] Liquid/Shopify patterns -- [ ] Ruby/Rails patterns - -### Phase 9: Polish & Hardening (Week 5) -- [ ] Error handling and recovery -- [ ] Performance optimization -- [ ] Memory management for large codebases -- [ ] Concurrent indexing safety -- [ ] API documentation and JSDoc comments - -### Phase 10: CLI (Week 5-6, Optional) -- [ ] CLI argument parsing (commander or yargs) -- [ ] `codegraph init` command -- [ ] `codegraph index` command -- [ ] `codegraph query` command -- [ ] `codegraph context` command -- [ ] `codegraph status` command -- [ ] `codegraph sync` command - -### Phase 11: MCP Server (Week 6, Optional) -- [ ] MCP protocol implementation -- [ ] `codegraph_search` tool -- [ ] `codegraph_context` tool -- [ ] 
`codegraph_callers` / `codegraph_callees` tools -- [ ] `codegraph_impact` tool -- [ ] Stdio transport for Claude Code integration - ---- - -## Testing Strategy - -```typescript -// Example test structure - -describe('CodeGraph', () => { - describe('extraction', () => { - it('extracts functions from TypeScript', async () => { - const code = ` - export function processPayment(amount: number): Promise { - return stripe.charge(amount); - } - `; - const result = await extract(code, 'typescript'); - - expect(result.nodes).toContainEqual(expect.objectContaining({ - kind: 'function', - name: 'processPayment', - signature: '(amount: number): Promise' - })); - - expect(result.edges).toContainEqual(expect.objectContaining({ - kind: 'calls', - targetName: 'stripe.charge' - })); - }); - - it('extracts Laravel routes from PHP', async () => { - const code = ` - Route::post('/checkout', [CheckoutController::class, 'store'])->name('checkout.store'); - `; - const result = await extract(code, 'php'); - - expect(result.nodes).toContainEqual(expect.objectContaining({ - kind: 'route', - name: 'POST /checkout' - })); - }); - }); - - describe('resolution', () => { - it('resolves Laravel model calls', async () => { - const graph = await createTestGraph({ - 'app/Models/User.php': 'class User extends Model { public static function find($id) {} }', - 'app/Http/Controllers/UserController.php': 'User::find($id);' - }); - - const edges = await graph.getEdges('controller:UserController:show'); - expect(edges).toContainEqual(expect.objectContaining({ - kind: 'calls', - targetId: 'method:app/Models/User.php:find', - resolved: true - })); - }); - }); - - describe('traversal', () => { - it('finds impact radius', async () => { - const graph = await createTestGraph(/* ... 
*/); - const subgraph = await graph.getImpactRadius('class:PaymentService', { maxDepth: 2 }); - - expect(subgraph.nodes.map(n => n.name)).toContain('CheckoutController'); - expect(subgraph.nodes.map(n => n.name)).toContain('OrderService'); - }); - }); -}); -``` - ---- - -## Open Questions / Decisions Needed - -1. **Embedding model size vs quality**: nomic-embed-text-v1.5 (275MB) vs all-MiniLM-L6-v2 (90MB)? - -2. **Tree-sitter WASM vs native**: WASM is easier for Electron distribution, native is faster. Start with WASM? - -3. **Max context size**: How many nodes/code blocks before we truncate? Configurable? - -4. **Unresolved references**: Show them in context (with "unresolved" marker) or hide them? - -5. **Multi-language projects**: Projects mixing PHP + JS + Liquid — handle all simultaneously? - -6. **Binary/asset files**: Track references to images, fonts, etc. or ignore? - ---- - -## Success Criteria - -1. **Accuracy**: >90% of function calls correctly linked to definitions -2. **Speed**: Full index of 10k file project in <60 seconds -3. **Freshness**: Incremental update after commit in <5 seconds -4. **Context quality**: Generated context helps Claude solve issues faster (qualitative) -5. 
**Portability**: Works on any macOS machine without additional setup - ---- - -## Resources - -- Tree-sitter: https://tree-sitter.github.io/tree-sitter/ -- Tree-sitter WASM: https://github.com/nicolo-ribaudo/nicolo-nicolo-tree-sitter/tree-sitter-wasm-builds/tree/main -- sqlite-vss: https://github.com/asg017/sqlite-vss -- nomic-embed: https://huggingface.co/nomic-ai/nomic-embed-text-v1.5 -- ONNX Runtime Node: https://onnxruntime.ai/docs/get-started/with-javascript.html diff --git a/README.md b/README.md index fd1ffaba..5d00c671 100644 --- a/README.md +++ b/README.md @@ -2,32 +2,70 @@ # CodeGraph -### Supercharge Claude Code with Semantic Code Intelligence +### Supercharge Claude Code, Cursor, Codex, OpenCode, and Hermes Agent with Semantic Code Intelligence -**94% fewer tool calls · 77% faster exploration · 100% local** +**~35% cheaper · ~70% fewer tool calls · 100% local** [![npm version](https://img.shields.io/npm/v/@colbymchenry/codegraph.svg)](https://www.npmjs.com/package/@colbymchenry/codegraph) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) -[![Node.js](https://img.shields.io/badge/Node.js-18+-green.svg)](https://nodejs.org/) +[![Self-contained](https://img.shields.io/badge/Node.js-bundled%20%C2%B7%20none%20required-brightgreen.svg)](https://nodejs.org/) -[![Windows](https://img.shields.io/badge/Windows-supported-blue.svg)](#) -[![macOS](https://img.shields.io/badge/macOS-supported-blue.svg)](#) -[![Linux](https://img.shields.io/badge/Linux-supported-blue.svg)](#) +[![Windows](https://img.shields.io/badge/Windows-supported-blue.svg)](#supported-platforms) +[![macOS](https://img.shields.io/badge/macOS-supported-blue.svg)](#supported-platforms) +[![Linux](https://img.shields.io/badge/Linux-supported-blue.svg)](#supported-platforms) -
+[![Claude Code](https://img.shields.io/badge/Claude_Code-supported-blueviolet.svg)](#supported-agents) +[![Cursor](https://img.shields.io/badge/Cursor-supported-blueviolet.svg)](#supported-agents) +[![Codex CLI](https://img.shields.io/badge/Codex_CLI-supported-blueviolet.svg)](#supported-agents) +[![opencode](https://img.shields.io/badge/opencode-supported-blueviolet.svg)](#supported-agents) +[![Hermes Agent](https://img.shields.io/badge/Hermes_Agent-supported-blueviolet.svg)](#supported-agents) -### Get Started + + +## Get Started + +**No Node.js required** — one command grabs the right build for your OS: ```bash -npx @colbymchenry/codegraph +# macOS / Linux +curl -fsSL https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.sh | sh + +# Windows (PowerShell) +irm https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.ps1 | iex ``` -Interactive installer configures Claude Code automatically +Already have Node? Use npm instead (works on any version): + +```bash +npx @colbymchenry/codegraph # zero-install, or: +npm i -g @colbymchenry/codegraph +``` + +CodeGraph bundles its own runtime — nothing to compile, no native build, works the same everywhere. The interactive installer auto-configures your agent(s) — Claude Code, Cursor, Codex CLI, opencode, Hermes Agent. + +### Initialize Projects + +```bash +cd your-project +codegraph init -i +``` + +
![1_C_VYnhpys0UHrOuOgpgoyw](https://github.com/user-attachments/assets/f168182f-4d9a-44e0-94d7-08d018cc8a3a)
+### Uninstall + +Changed your mind? One command removes CodeGraph from every agent it configured: + +```bash +codegraph uninstall +``` + +Reverses the installer — strips CodeGraph's MCP server config, instructions, and permissions from each configured agent. Your project indexes (`.codegraph/`) are left untouched; remove those per-project with `codegraph uninit`. Use `--target` to remove from specific agents, or `--yes` to run non-interactively. + --- ## Why CodeGraph? @@ -38,61 +76,50 @@ When Claude Code explores a codebase, it spawns **Explore agents** that scan fil ### Benchmark Results -Tested across 6 real-world codebases comparing Claude Code's Explore agent **with** and **without** CodeGraph: +Tested across **7 real-world open-source codebases** spanning 7 languages, comparing an agent (Claude Code, headless) answering one architecture question **with** and **without** CodeGraph. Each cell is the savings at the **median of 4 runs per arm**. _Re-validated on **v0.9.4** (2026-05-24)._ + +> **Average: 35% cheaper · 57% fewer tokens · 46% faster · 71% fewer tool calls** -> **Average: 92% fewer tool calls · 71% faster** +| Codebase | Language | Cost | Tokens | Time | Tool calls | +|----------|----------|------|--------|------|------------| +| **VS Code** | TypeScript · ~10k files | 26% cheaper | 78% fewer | 52% faster | 85% fewer | +| **Excalidraw** | TypeScript · ~640 | 52% cheaper | 90% fewer | 73% faster | 96% fewer | +| **Django** | Python · ~3k | 12% cheaper | 36% fewer | 19% faster | 53% fewer | +| **Tokio** | Rust · ~790 | 82% cheaper | 86% fewer | 71% faster | 92% fewer | +| **OkHttp** | Java · ~645 | 2% cheaper | 13% fewer | 31% faster | 45% fewer | +| **Gin** | Go · ~110 | 21% cheaper | 34% fewer | 27% faster | 40% fewer | +| **Alamofire** | Swift · ~110 | 47% cheaper | 64% fewer | 48% faster | 83% fewer | -| Codebase | With CG | Without CG | Improvement | -|----------|---------|------------|-------------| -| **VS Code** · TypeScript | 3 calls, 17s | 
52 calls, 1m 37s | **94% fewer · 82% faster** | -| **Excalidraw** · TypeScript | 3 calls, 29s | 47 calls, 1m 45s | **94% fewer · 72% faster** | -| **Claude Code** · Python + Rust | 3 calls, 39s | 40 calls, 1m 8s | **93% fewer · 43% faster** | -| **Claude Code** · Java | 1 call, 19s | 26 calls, 1m 22s | **96% fewer · 77% faster** | -| **Alamofire** · Swift | 3 calls, 22s | 32 calls, 1m 39s | **91% fewer · 78% faster** | -| **Swift Compiler** · Swift/C++ | 6 calls, 35s | 37 calls, 2m 8s | **84% fewer · 73% faster** | +The gains scale with codebase size: on large repos the agent answers from the index in a handful of calls with **zero file reads**, while the no-CodeGraph agent fans out across grep/find/Read (and the sub-agents it spawns). On a small repo like Gin (~150 files) native search is already cheap, so the margin narrows.
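
As a sanity check, the headline "Average" figures can be reproduced from the per-repo savings in the summary table. The sketch below assumes the average is a plain unweighted mean across the seven repos, rounded to the nearest whole percent; the `RepoSavings` type and `averageSavings` helper are illustrative names, not part of CodeGraph.

```typescript
// Per-repo savings (percent), copied from the summary table.
interface RepoSavings {
  repo: string;
  cost: number;
  tokens: number;
  time: number;
  toolCalls: number;
}

const summary: RepoSavings[] = [
  { repo: "VS Code", cost: 26, tokens: 78, time: 52, toolCalls: 85 },
  { repo: "Excalidraw", cost: 52, tokens: 90, time: 73, toolCalls: 96 },
  { repo: "Django", cost: 12, tokens: 36, time: 19, toolCalls: 53 },
  { repo: "Tokio", cost: 82, tokens: 86, time: 71, toolCalls: 92 },
  { repo: "OkHttp", cost: 2, tokens: 13, time: 31, toolCalls: 45 },
  { repo: "Gin", cost: 21, tokens: 34, time: 27, toolCalls: 40 },
  { repo: "Alamofire", cost: 47, tokens: 64, time: 48, toolCalls: 83 },
];

// Assumption: unweighted mean over repos, rounded to a whole percent.
function averageSavings(rows: RepoSavings[]): Omit<RepoSavings, "repo"> {
  const mean = (pick: (row: RepoSavings) => number): number =>
    Math.round(rows.reduce((sum, row) => sum + pick(row), 0) / rows.length);
  return {
    cost: mean((row) => row.cost),
    tokens: mean((row) => row.tokens),
    time: mean((row) => row.time),
    toolCalls: mean((row) => row.toolCalls),
  };
}

const average = averageSavings(summary);
console.log(average); // { cost: 35, tokens: 57, time: 46, toolCalls: 71 }
```

Because the mean is unweighted, a small repo such as Gin counts as much as VS Code; weighting by file count would give a different headline number.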
Full benchmark details -All tests used Claude Opus 4.6 (1M context) with Claude Code v2.1.91. Each test spawned a single Explore agent with the same question. +**Methodology.** Each arm is `claude -p` (Claude Opus 4.7) run headlessly against the repo with `--strict-mcp-config`: **WITH** = CodeGraph's MCP server enabled, **WITHOUT** = an empty MCP config. Built-in Read/Grep/Bash stay available to both. Same question per repo, **4 runs per arm, median reported**. Cost = the run's `total_cost_usd`; Tokens = total tokens processed (input incl. cached + output); Time = wall-clock; Tool calls = every tool invocation, including those inside any sub-agents the model spawns. Repos cloned at `--depth 1` and indexed by the same CodeGraph build that served them. Re-validated on codegraph **v0.9.4** (2026-05-24); per-repo numbers move run-to-run with how hard the without-arm thrashes (the median-of-4 smooths it, but tails remain — e.g. Tokio's without-arm hit $2.41/3m one batch). -**Queries used:** +**Queries:** | Codebase | Query | |----------|-------| | VS Code | "How does the extension host communicate with the main process?" | -| Excalidraw | "How does collaborative editing and real-time sync work?" | -| Claude Code (Python+Rust) | "How does tool execution work end to end?" | -| Claude Code (Java) | "How does tool execution work end to end?" | -| Alamofire | "Trace how a request flows from Session.request() through to the URLSession layer" | -| Swift Compiler | "How does the Swift compiler handle error diagnostics?" 
| - -**With CodeGraph — the agent uses `codegraph_explore` and stops:** -| Codebase | Files Indexed | Nodes | Tool Uses | Tokens | Time | File Reads | -|----------|--------------|-------|-----------|--------|------|------------| -| VS Code (TypeScript) | 4,002 | 59,377 | 3 | 56.6k | 17s | 0 | -| Excalidraw (TypeScript) | 626 | 9,859 | 3 | 57.1k | 29s | 0 | -| Claude Code (Python+Rust) | 115 | 3,080 | 3 | 67.1k | 39s | 0 | -| Claude Code (Java) | — | — | 1 | 40.8k | 19s | 0 | -| Alamofire (Swift) | 102 | 2,624 | 3 | 57.3k | 22s | 0 | -| Swift Compiler (Swift/C++) | 25,874 | 272,898 | 6 | 77.4k | 35s | 0 | - -**Without CodeGraph — the agent uses grep, find, ls, and Read extensively:** -| Codebase | Tool Uses | Tokens | Time | File Reads | -|----------|-----------|--------|------|------------| -| VS Code (TypeScript) | 52 | 89.4k | 1m 37s | ~15 | -| Excalidraw (TypeScript) | 47 | 77.9k | 1m 45s | ~20 | -| Claude Code (Python+Rust) | 40 | 69.3k | 1m 8s | ~15 | -| Claude Code (Java) | 26 | 73.3k | 1m 22s | ~15 | -| Alamofire (Swift) | 32 | 52.4k | 1m 39s | ~10 | -| Swift Compiler (Swift/C++) | 37 | 99.1k | 2m 8s | ~20 | - -**Key observations:** -- With CodeGraph, the agent **never fell back to reading files** — it trusted the codegraph_explore results completely -- Without CodeGraph, agents spent most of their time on discovery (find, ls, grep) before they could even start reading relevant code -- The Java codebase needed only **1 codegraph_explore call** to answer the entire question -- Cross-language queries (Python+Rust) worked seamlessly — CodeGraph's graph traversal found connections across language boundaries -- The Swift benchmark (Alamofire) traced a **9-step call chain** from `Session.request()` to `URLSession.dataTask()` — CodeGraph's graph traversal at depth 3 captured the full chain in one explore call -- The **Swift Compiler** benchmark is the largest codebase tested (**25,874 files, 272,898 nodes**) — CodeGraph indexed it in under 4 minutes and the agent 
answered a complex cross-cutting question with **6 explore calls and zero file reads** in 35 seconds +| Excalidraw | "How does Excalidraw render and update canvas elements?" | +| Django | "How does Django's ORM build and execute a query from a QuerySet?" | +| Tokio | "How does tokio schedule and run async tasks on its runtime?" | +| OkHttp | "How does OkHttp process a request through its interceptor chain?" | +| Gin | "How does gin route requests through its middleware chain?" | +| Alamofire | "How does Alamofire build, send, and validate a request?" | + +**Raw medians — WITH → WITHOUT:** +| Codebase | Cost | Tokens | Time | Tool calls | +|----------|------|--------|------|------------| +| VS Code | $0.60 → $0.80 | 601k → 2.8M | 1m 10s → 2m 26s | 8 → 55 | +| Excalidraw | $0.43 → $0.90 | 344k → 3.5M | 48s → 2m 58s | 3 → 79 | +| Django | $0.59 → $0.67 | 739k → 1.2M | 1m 19s → 1m 38s | 9 → 19 | +| Tokio | $0.42 → $2.41 | 379k → 2.6M | 53s → 3m 2s | 4 → 53 | +| OkHttp | $0.47 → $0.47 | 636k → 730k | 42s → 1m 1s | 6 → 11 | +| Gin | $0.37 → $0.47 | 444k → 675k | 44s → 1m 0s | 6 → 10 | +| Alamofire | $0.61 → $1.14 | 1.0M → 2.8M | 1m 17s → 2m 27s | 12 → 69 | + +**Why CodeGraph wins:** with the index available, the agent answers directly — `codegraph_context` to map the area, then one `codegraph_explore` for the relevant source — and stops, usually with zero file reads. Without it, the agent (and the Explore sub-agents it spawns) spends most of its budget on discovery (find/ls/grep) before reading the right code. CodeGraph only helps when queried *directly*, so its instructions steer agents to answer directly rather than delegate exploration to file-reading sub-agents — otherwise a sub-agent reads files regardless and CodeGraph becomes overhead.
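
For readers who want to connect the raw medians to the percentage table, here is a minimal sketch of the arithmetic, assuming each saving is `(WITHOUT - WITH) / WITHOUT` rounded to a whole percent. The `savingsPercent` helper is a hypothetical name used only for illustration. It is applied to the tool-call column because those medians are exact integers.

```typescript
// Median tool calls per arm, copied from the raw-medians table.
interface MedianPair {
  repo: string;
  withCg: number;
  without: number;
}

const toolCallMedians: MedianPair[] = [
  { repo: "VS Code", withCg: 8, without: 55 },
  { repo: "Excalidraw", withCg: 3, without: 79 },
  { repo: "Django", withCg: 9, without: 19 },
  { repo: "Tokio", withCg: 4, without: 53 },
  { repo: "OkHttp", withCg: 6, without: 11 },
  { repo: "Gin", withCg: 6, without: 10 },
  { repo: "Alamofire", withCg: 12, without: 69 },
];

// Assumption: saving = share of the WITHOUT-arm value that the WITH arm avoids.
function savingsPercent(withCg: number, without: number): number {
  return Math.round((100 * (without - withCg)) / without);
}

const toolCallSavings = toolCallMedians.map((row) => ({
  repo: row.repo,
  percent: savingsPercent(row.withCg, row.without),
}));

for (const row of toolCallSavings) {
  console.log(`${row.repo}: ${row.percent}% fewer tool calls`);
}
```

The same formula applies to cost, tokens, and time, but the raw medians shown for those columns are already rounded, so recomputing from them can land a point away from the published percentage (VS Code's cost, $0.60 vs $0.80, is one such case).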
@@ -106,11 +133,35 @@ All tests used Claude Opus 4.6 (1M context) with Claude Code v2.1.91. Each test | **Full-Text Search** | Find code by name instantly across your entire codebase, powered by FTS5 | | **Impact Analysis** | Trace callers, callees, and the full impact radius of any symbol before making changes | | **Always Fresh** | File watcher uses native OS events (FSEvents/inotify/ReadDirectoryChangesW) with debounced auto-sync — the graph stays current as you code, zero config | -| **19+ Languages** | TypeScript, JavaScript, Python, Go, Rust, Java, C#, PHP, Ruby, C, C++, Swift, Kotlin, Dart, Svelte, Liquid, Pascal/Delphi | +| **19+ Languages** | TypeScript, JavaScript, Python, Go, Rust, Java, C#, PHP, Ruby, C, C++, Swift, Kotlin, Dart, Lua, Luau, Svelte, Liquid, Pascal/Delphi | +| **Framework-aware Routes** | Recognizes web-framework routing files and links URL patterns to their handlers across 14 frameworks | | **100% Local** | No data leaves your machine. No API keys. No external services. SQLite database only | --- +## Framework-aware Routes + +CodeGraph detects web-framework routing files and emits `route` nodes linked by `references` edges to their handler classes or functions. Querying callers of a view/controller now surfaces the URL pattern that binds it. 
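
To make the "`route` node plus `references` edge" idea concrete, here is a minimal sketch for the simplest Express shape, `app.get('/path', handler)`. Everything in it is an assumption for illustration: the `GraphNode` and `GraphEdge` shapes, the ID format, and the regex are hypothetical and not CodeGraph's actual extractor or schema, and it handles only a single named handler (no middleware chains or inline functions).

```typescript
// Hypothetical node/edge shapes, for illustration only.
interface GraphNode {
  id: string;
  kind: "route";
  name: string; // e.g. "GET /users"
  filePath: string;
}

interface GraphEdge {
  sourceId: string;
  targetName: string; // handler identifier, to be resolved to a symbol later
  kind: "references";
}

function extractExpressRoutes(
  filePath: string,
  source: string
): { nodes: GraphNode[]; edges: GraphEdge[] } {
  // Matches app.get('/x', handler) and router.post("/x", handler).
  const pattern =
    /\b(?:app|router)\.(get|post|put|patch|delete)\(\s*['"]([^'"]+)['"]\s*,\s*([A-Za-z_$][\w$]*)\s*\)/g;
  const nodes: GraphNode[] = [];
  const edges: GraphEdge[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const [, method, urlPattern, handler] = match;
    const name = `${method.toUpperCase()} ${urlPattern}`;
    const id = `route:${filePath}:${name}`;
    nodes.push({ id, kind: "route", name, filePath });
    // The edge is what lets a "callers of handler" query surface the URL.
    edges.push({ sourceId: id, targetName: handler, kind: "references" });
  }
  return { nodes, edges };
}

const sample = [
  "app.get('/users', listUsers);",
  'router.post("/checkout", createOrder);',
].join("\n");

const routes = extractExpressRoutes("src/routes.ts", sample);
console.log(routes.nodes.map((n) => n.name));
```

A real extractor would more plausibly work from the parsed syntax tree than from a regex, which is why this sketch restricts itself to the one shape it can match reliably.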
+ +| Framework | Shapes recognized | +|---|---| +| **Django** | `path()`, `re_path()`, `url()`, `include()` in `urls.py` (CBV `.as_view()`, dotted paths) | +| **Flask** | `@app.route('/path', methods=[...])`, blueprint routes | +| **FastAPI** | `@app.get(...)`, `@router.post(...)`, all standard methods | +| **Express** | `app.get(...)`, `router.post(...)` with middleware chains | +| **NestJS** | `@Controller` + `@Get/@Post/...`, GraphQL `@Resolver` + `@Query/@Mutation`, `@MessagePattern`/`@EventPattern`, `@SubscribeMessage` | +| **Laravel** | `Route::get()`, `Route::resource()`, `Controller@action`, tuple syntax | +| **Drupal** | `*.routing.yml` routes (`_controller`, `_form`, entity handlers); `hook_*` implementations in `.module`/`.theme`/`.install`/`.inc` | +| **Rails** | `get '/x', to: 'users#index'`, hash-rocket `=>` syntax | +| **Spring** | `@GetMapping`, `@PostMapping`, `@RequestMapping` on methods | +| **Gin / chi / gorilla / mux** | `r.GET(...)`, `router.HandleFunc(...)` | +| **Axum / actix / Rocket** | `.route("/x", get(handler))` | +| **ASP.NET** | `[HttpGet("/x")]` attributes on action methods | +| **Vapor** | `app.get("x", use: handler)` | +| **React Router** / **SvelteKit** | Route component nodes | + +--- + ## Quick Start ### 1. Run the Installer @@ -120,15 +171,33 @@ npx @colbymchenry/codegraph ``` The installer will: -- Prompt to install `codegraph` globally (needed for the MCP server) -- Configure the MCP server in `~/.claude.json` -- Set up auto-allow permissions for CodeGraph tools -- Add global instructions to `~/.claude/CLAUDE.md` -- Optionally initialize your current project +- Ask which agent(s) to configure — auto-detects installed ones from: **Claude Code**, **Cursor**, **Codex CLI**, **opencode**, **Hermes Agent** +- Prompt to install `codegraph` on your PATH (so agents can launch the MCP server) +- Ask whether configs apply to all your projects or just this one +- Write each chosen agent's MCP server config + an instructions file (e.g. 
`CLAUDE.md`, `.cursor/rules/codegraph.mdc`, `~/.codex/AGENTS.md`) +- Set up auto-allow permissions when Claude Code is one of the targets +- Initialize your current project (local installs only) + +**Non-interactive (scripting / CI):** + +```bash +codegraph install --yes # auto-detect agents, install global +codegraph install --target=cursor,claude --yes # explicit target list +codegraph install --target=auto --location=local # detected agents, project-local +codegraph install --print-config codex # print snippet, no file writes +``` + +| Flag | Values | Default | +|---|---|---| +| `--target` | `auto`, `all`, `none`, or csv (`claude,cursor,...`) | prompt | +| `--location` | `global`, `local` | prompt | +| `--yes` | (boolean) | prompt every step | +| `--no-permissions` | (boolean) skip Claude auto-allow list | permissions on | +| `--print-config ` | dump snippet for one agent and exit | — | -### 2. Restart Claude Code +### 2. Restart Your Agent -Restart Claude Code for the MCP server to load. +Restart your agent (Claude Code / Cursor / Codex CLI / opencode / Hermes Agent) for the MCP server to load. ### 3. Initialize Projects @@ -137,7 +206,9 @@ cd your-project codegraph init -i ``` -That's it! Claude Code will use CodeGraph tools automatically when a `.codegraph/` directory exists. +Builds the per-project knowledge graph index. Also wires up any project-local agent surfaces (e.g. Cursor's `.cursor/rules/codegraph.mdc`) so a single global `codegraph install` works in every project you open — no need to re-run the installer per project. + +That's it — your agent will use CodeGraph tools automatically when a `.codegraph/` directory exists.
Manual Setup (Alternative) @@ -192,25 +263,21 @@ CodeGraph builds a semantic knowledge graph of codebases for faster, smarter cod ### If `.codegraph/` exists in the project -**NEVER call `codegraph_explore` or `codegraph_context` directly in the main session.** These tools return large amounts of source code that fills up main session context. Instead, ALWAYS spawn an Explore agent for any exploration question (e.g., "how does X work?", "explain the Y system", "where is Z implemented?"). - -**When spawning Explore agents**, include this instruction in the prompt: +**Answer directly with CodeGraph — don't delegate exploration to a file-reading sub-agent or a grep/read loop.** CodeGraph *is* the pre-built search index; re-deriving its answers with grep + Read repeats work it already did and costs more for the same result. For "how does X work?", architecture, trace, or where-is-X questions, answer in a handful of CodeGraph calls and stop — typically with **zero file reads**. The returned source is complete and authoritative: treat it as already read and do not re-open those files. Reach for raw Read/Grep only to confirm a specific detail CodeGraph didn't cover. -> This project has CodeGraph initialized (.codegraph/ exists). Use `codegraph_explore` as your PRIMARY tool — it returns full source code sections from all relevant files in one call. -> -> **Rules:** -> 1. Follow the explore call budget in the `codegraph_explore` tool description — it scales automatically based on project size. -> 2. Do NOT re-read files that codegraph_explore already returned source code for. The source sections are complete and authoritative. -> 3. Only fall back to grep/glob/read for files listed under "Additional relevant files" if you need more detail, or if codegraph returned no results. 
- -**The main session may only use these lightweight tools directly** (for targeted lookups before making edits, not for exploration): +**Tool selection by intent:** | Tool | Use For | |------|---------| -| `codegraph_search` | Find symbols by name | -| `codegraph_callers` / `codegraph_callees` | Trace call flow | +| `codegraph_context` | Map a task / feature / area first — composes search + node + callers + callees in one call | +| `codegraph_trace` | "How does X reach Y" — the call path, each hop's body inline (follows dynamic-dispatch hops grep can't) | +| `codegraph_explore` | Survey several related symbols' source in ONE budget-capped call | +| `codegraph_search` | Find a symbol by name | +| `codegraph_callers` / `codegraph_callees` | Walk call flow one hop at a time | | `codegraph_impact` | Check what's affected before editing | -| `codegraph_node` | Get a single symbol's details | +| `codegraph_node` | Get a single symbol's source / signature | + +A direct CodeGraph answer is a handful of calls; a grep/read exploration is dozens. 
### If `.codegraph/` does NOT exist @@ -226,34 +293,23 @@ At the start of a session, ask the user if they'd like to initialize CodeGraph: ## How It Works ``` -┌─────────────────────────────────────────────────────────────────┐ -│ Claude Code │ -│ │ -│ "Implement user authentication" │ -│ │ │ -│ ▼ │ -│ ┌─────────────────┐ ┌─────────────────┐ │ -│ │ Explore Agent │ ──── │ Explore Agent │ │ -│ └────────┬────────┘ └────────┬────────┘ │ -│ │ │ │ -└───────────┼────────────────────────┼─────────────────────────────┘ - │ │ - ▼ ▼ ┌───────────────────────────────────────────────────────────────────┐ -│ CodeGraph MCP Server │ -│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ -│ │ Search │ │ Callers │ │ Context │ │ -│ │ "auth" │ │ "login()" │ │ for task │ │ -│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │ -│ │ │ │ │ -│ └────────────────┼────────────────┘ │ -│ ▼ │ -│ ┌───────────────────────┐ │ -│ │ SQLite Graph DB │ │ -│ │ • 387 symbols │ │ -│ │ • 1,204 edges │ │ -│ │ • Instant lookups │ │ -│ └───────────────────────┘ │ +│ Claude Code │ +│ │ +│ "How does a request reach the database?" 
│ +│ calls CodeGraph tools directly — no Explore sub-agent │ +│ │ │ +└─────────────────────────────────┬─────────────────────────────────┘ + │ + ▼ +┌───────────────────────────────────────────────────────────────────┐ +│ CodeGraph MCP Server │ +│ │ +│ context · trace · explore · callers · callees · impact │ +│ │ │ +│ ▼ │ +│ SQLite knowledge graph │ +│ symbols · edges · files · FTS5 full-text search │ └───────────────────────────────────────────────────────────────────┘ ``` @@ -272,6 +328,7 @@ At the start of a session, ask the user if they'd like to initialize CodeGraph: ```bash codegraph # Run interactive installer codegraph install # Run installer (explicit) +codegraph uninstall # Remove CodeGraph from your agents (inverse of install) codegraph init [path] # Initialize in a project (--index to also index) codegraph uninit [path] # Remove CodeGraph from a project (--force to skip prompt) codegraph index [path] # Full index (--force to re-index, --quiet for less output) @@ -280,6 +337,9 @@ codegraph status [path] # Show statistics codegraph query # Search symbols (--kind, --limit, --json) codegraph files [path] # Show file structure (--format, --filter, --max-depth, --json) codegraph context # Build context for AI (--format, --max-nodes) +codegraph callers # Find what calls a function/method (--limit, --json) +codegraph callees # Find what a function/method calls (--limit, --json) +codegraph impact # Analyze what code is affected by changing a symbol (--depth, --json) codegraph affected [files...] 
# Find test files affected by changes (see below) codegraph serve --mcp # Start MCP server ``` @@ -322,10 +382,12 @@ When running as an MCP server, CodeGraph exposes these tools to Claude Code: |------|---------| | `codegraph_search` | Find symbols by name across the codebase | | `codegraph_context` | Build relevant code context for a task | +| `codegraph_trace` | Trace the call path between two symbols ("how does X reach Y") in one call — each hop with its body inline, following dynamic-dispatch hops (callbacks, React re-render, interface→impl) that grep can't | | `codegraph_callers` | Find what calls a function | | `codegraph_callees` | Find what a function calls | | `codegraph_impact` | Analyze what code is affected by changing a symbol | | `codegraph_node` | Get details about a specific symbol (optionally with source code) | +| `codegraph_explore` | Return source for several related symbols grouped by file, plus a relationship map, in one call | | `codegraph_files` | Get indexed file structure (faster than filesystem scanning) | | `codegraph_status` | Check index health and statistics | @@ -357,28 +419,47 @@ cg.close(); ## Configuration -The `.codegraph/config.json` file controls indexing: +There isn't any — CodeGraph is zero-config. It indexes every file whose +extension maps to a [supported language](#supported-languages) and **respects +your `.gitignore`**: in git repos via git itself, and in non-git projects by +reading `.gitignore` files directly (root and nested, the same way git would). 
-```json -{ - "version": 1, - "languages": ["typescript", "javascript"], - "exclude": ["node_modules/**", "dist/**", "build/**", "*.min.js"], - "frameworks": [], - "maxFileSize": 1048576, - "extractDocstrings": true, - "trackCallSites": true -} -``` +What that means in practice: -| Option | Description | Default | -|--------|-------------|---------| -| `languages` | Languages to index (auto-detected if empty) | `[]` | -| `exclude` | Glob patterns to ignore | `["node_modules/**", ...]` | -| `frameworks` | Framework hints for better resolution | `[]` | -| `maxFileSize` | Skip files larger than this (bytes) | `1048576` (1MB) | -| `extractDocstrings` | Extract docstrings from code | `true` | -| `trackCallSites` | Track call site locations | `true` | +- Anything git ignores — `node_modules`, build output, secrets in `.env` — is + never indexed. **To keep something out of the graph, add it to `.gitignore`.** +- There's no config file to write or keep in sync, and nothing to wire up per + language: support is automatic from the file extension. +- Files larger than 1 MB are skipped (generated bundles, minified JS, vendored + blobs) — they cost parse budget for no useful symbols. + +> Committed files that aren't gitignored *are* indexed, even under `vendor/` or a +> committed `dist/`. If you commit a dependency or build directory you don't want +> in the graph, add it to `.gitignore`. + +## Supported Platforms + +Every release ships a self-contained build (bundled Node runtime — nothing to +compile) for all three desktop OSes, on both Intel/AMD (x64) and ARM (arm64): + +| Platform | Architectures | Install | +|----------|---------------|---------| +| Windows | x64, arm64 | PowerShell installer or npm | +| macOS | x64, arm64 | shell installer or npm | +| Linux | x64, arm64 | shell installer or npm | + +See [Get Started](#get-started) for the one-line install commands. 
+ +## Supported Agents + +The interactive installer auto-detects and configures each of these — wiring up +the MCP server and writing its instructions file: + +- **Claude Code** +- **Cursor** +- **Codex CLI** +- **opencode** +- **Hermes Agent** ## Supported Languages @@ -397,10 +478,14 @@ The `.codegraph/config.json` file controls indexing: | C++ | `.cpp`, `.hpp`, `.cc` | Full support | | Swift | `.swift` | Full support | | Kotlin | `.kt`, `.kts` | Full support | +| Scala | `.scala`, `.sc` | Full support (classes, traits, methods, type aliases, Scala 3 enums) | | Dart | `.dart` | Full support | | Svelte | `.svelte` | Full support (script extraction, Svelte 5 runes, SvelteKit routes) | +| Vue | `.vue` | Full support (script + script-setup extraction, Nuxt page/API/middleware routes) | | Liquid | `.liquid` | Full support | | Pascal / Delphi | `.pas`, `.dpr`, `.dpk`, `.lpr` | Full support (classes, records, interfaces, enums, DFM/FMX form files) | +| Lua | `.lua` | Full support (functions, methods with receivers, local variables, `require` imports, call edges) | +| Luau | `.luau` | Full support (everything in Lua, plus `type`/`export type` aliases, typed signatures, and Roblox instance-path `require`) | ## Troubleshooting @@ -408,10 +493,25 @@ The `.codegraph/config.json` file controls indexing: **Indexing is slow** — Check that `node_modules` and other large directories are excluded. Use `--quiet` to reduce output overhead. +**MCP hits `database is locked`** — current builds shouldn't: CodeGraph bundles its own Node runtime and uses Node's built-in `node:sqlite` in WAL mode, where concurrent reads never block on a writer. 
If you still see it: + +- **You're on an old (pre-0.9) install.** Reinstall to get the bundled runtime — `curl -fsSL https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.sh | sh` (macOS/Linux), `irm https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.ps1 | iex` (Windows), or `npm i -g @colbymchenry/codegraph@latest`. +- **`codegraph status` shows `Journal:` other than `wal`** — WAL couldn't be enabled on this filesystem (common on network shares and WSL2 `/mnt`), so reads can block on writes. Move the project (with its `.codegraph/` folder) onto a local disk. + **MCP server not connecting** — Ensure the project is initialized/indexed, verify the path in your MCP config, and check that `codegraph serve --mcp` works from the command line. **Missing symbols** — The MCP server auto-syncs on save (wait a couple seconds). Run `codegraph sync` manually if needed. Check that the file's language is supported and isn't excluded by config patterns. +## Star History + + + + + + Star History Chart + + + ## License MIT @@ -420,7 +520,7 @@ MIT
-**Made for the Claude Code community** +**Made for AI coding agents — Claude Code, Cursor, Codex CLI, opencode, and Hermes Agent** [Report Bug](https://github.com/colbymchenry/codegraph/issues) · [Request Feature](https://github.com/colbymchenry/codegraph/issues) diff --git a/__tests__/concurrent-locking.test.ts b/__tests__/concurrent-locking.test.ts new file mode 100644 index 00000000..5c8ab518 --- /dev/null +++ b/__tests__/concurrent-locking.test.ts @@ -0,0 +1,152 @@ +/** + * Issue #238 — "database is locked" on concurrent MCP tool calls. + * + * With node:sqlite (real WAL) as the backend, the fixes that remain relevant: + * 1. busy_timeout is a bounded few-second wait (not a 2-minute hang) and WAL is + * active — so a reader never blocks on a concurrent writer. + * 2. The MCP ToolHandler reuses the default instance when a tool passes a + * projectPath pointing at the default project, instead of opening a SECOND + * connection to the same DB. + */ + +import { describe, it, expect, beforeAll, afterAll, vi } from 'vitest'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; +import CodeGraph from '../src'; +import { ToolHandler } from '../src/mcp/tools'; +import { DatabaseConnection } from '../src/db'; + +/** Normalize a PRAGMA read across return shapes (array | object | scalar). */ +function pragmaValue(raw: unknown, key: string): unknown { + const row = Array.isArray(raw) ? 
raw[0] : raw; + if (row !== null && typeof row === 'object') return (row as Record<string, unknown>)[key]; + return row; +} + +describe('issue #238 — connection PRAGMAs (#1)', () => { + let dir: string; + let conn: DatabaseConnection; + + beforeAll(() => { + dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg238-pragma-')); + conn = DatabaseConnection.initialize(path.join(dir, 'codegraph.db')); + }); + + afterAll(() => { + conn.close(); + fs.rmSync(dir, { recursive: true, force: true }); + }); + + it('uses a bounded busy_timeout, not the old 2-minute hang', () => { + const ms = Number(pragmaValue(conn.getDb().pragma('busy_timeout'), 'timeout')); + expect(ms).toBeGreaterThan(0); + expect(ms).toBeLessThanOrEqual(30000); // far below the old 120000 + }); + + it('runs in WAL mode — the mode that lets readers proceed during a write', () => { + const mode = String(pragmaValue(conn.getDb().pragma('journal_mode'), 'journal_mode')).toLowerCase(); + expect(mode).toBe('wal'); + }); + + it('getJournalMode() surfaces the effective mode for status triage', () => { + expect(conn.getJournalMode()).toBe('wal'); + }); +}); + +describe('issue #238 — WAL lets a reader proceed during a writer', () => { + let dir: string; + + beforeAll(() => { + dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg238-wal-')); + }); + + afterAll(() => { + fs.rmSync(dir, { recursive: true, force: true }); + }); + + it('a read on a 2nd connection succeeds while a writer holds the lock', () => { + const dbPath = path.join(dir, 'codegraph.db'); + const writer = DatabaseConnection.initialize(dbPath); + // The property only holds under WAL; skip if the filesystem couldn't enable it. 
+ if (writer.getJournalMode() !== 'wal') { + writer.close(); + return; + } + const reader = DatabaseConnection.open(dbPath); + try { + writer.getDb().prepare('BEGIN EXCLUSIVE').run(); // hard write lock, held open + const t0 = Date.now(); + const row = reader.getDb().prepare('SELECT COUNT(*) AS c FROM nodes').get() as { c: number }; + const waited = Date.now() - t0; + expect(row.c).toBe(0); + expect(waited).toBeLessThan(1000); // proceeds immediately, no busy wait + } finally { + try { writer.getDb().prepare('COMMIT').run(); } catch { /* ignore */ } + reader.close(); + writer.close(); + } + }); +}); + +describe('issue #238 — ToolHandler reuses the default instance (#2)', () => { + let dir: string; + let cg: CodeGraph; + let root: string; + let handler: ToolHandler; + + beforeAll(async () => { + dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg238-tools-')); + fs.writeFileSync(path.join(dir, 'a.ts'), 'export function helper(): number { return 1; }\n'); + fs.writeFileSync( + path.join(dir, 'b.ts'), + "import { helper } from './a';\nexport function main(): number { return helper(); }\n" + ); + cg = await CodeGraph.init(dir, { index: true }); + root = cg.getProjectRoot(); + handler = new ToolHandler(cg); + }); + + afterAll(() => { + cg.close(); + fs.rmSync(dir, { recursive: true, force: true }); + }); + + it('getCodeGraph(defaultRoot) returns the default instance, not a new connection', () => { + const openSpy = vi.spyOn(CodeGraph, 'openSync'); + try { + // eslint-disable-next-line @typescript-eslint/no-explicit-any + const resolved = (handler as any).getCodeGraph(root); + // eslint-disable-next-line @typescript-eslint/no-explicit-any + const nested = (handler as any).getCodeGraph(path.join(root, 'does', 'not', 'exist')); + expect(resolved).toBe(cg); + expect(nested).toBe(cg); // a sub-path resolves up to the same default project + expect(openSpy).not.toHaveBeenCalled(); // no second connection opened + } finally { + openSpy.mockRestore(); + } + }); + + it('concurrent 
read tool calls (mixed projectPath) all succeed without "database is locked"', async () => { + const openSpy = vi.spyOn(CodeGraph, 'openSync'); + try { + const calls: Promise<{ content: Array<{ text: string }>; isError?: boolean }>[] = [ + handler.execute('codegraph_search', { query: 'helper' }), + handler.execute('codegraph_search', { query: 'helper', projectPath: root }), + handler.execute('codegraph_callers', { symbol: 'helper', projectPath: root }), + handler.execute('codegraph_callees', { symbol: 'main' }), + handler.execute('codegraph_files', { projectPath: root }), + handler.execute('codegraph_status', { projectPath: root }), + ]; + const results = await Promise.all(calls); + for (const r of results) { + expect(r.isError).not.toBe(true); + expect(r.content[0]?.text ?? '').not.toMatch(/database is locked/i); + } + // Passing the default project's own path must not open a second connection. + expect(openSpy).not.toHaveBeenCalled(); + } finally { + openSpy.mockRestore(); + } + }); +}); diff --git a/__tests__/db-perf.test.ts b/__tests__/db-perf.test.ts new file mode 100644 index 00000000..256cf92c --- /dev/null +++ b/__tests__/db-perf.test.ts @@ -0,0 +1,161 @@ +/** + * DB Performance / Correctness Tests + * + * Regression tests for three changes: + * 1. Batch `getNodesByIds` collapses graph-traversal N+1 reads. + * 2. `insertNode` invalidates the LRU cache so INSERT OR REPLACE + * doesn't serve a stale cached row on next `getNodeById`. + * 3. `runMaintenance` runs `PRAGMA optimize` + `wal_checkpoint(PASSIVE)` + * after indexAll/sync without throwing. 
+ */ + +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; +import { DatabaseConnection } from '../src/db'; +import { QueryBuilder } from '../src/db/queries'; +import { Node } from '../src/types'; + +function makeNode(id: string, name = id): Node { + return { + id, + kind: 'function', + name, + qualifiedName: name, + filePath: 'a.ts', + language: 'typescript', + startLine: 1, + endLine: 1, + startColumn: 0, + endColumn: 0, + updatedAt: Date.now(), + }; +} + +describe('getNodesByIds (batch lookup)', () => { + let dir: string; + let db: DatabaseConnection; + let q: QueryBuilder; + + beforeEach(() => { + dir = fs.mkdtempSync(path.join(os.tmpdir(), 'db-perf-batch-')); + db = DatabaseConnection.initialize(path.join(dir, 'test.db')); + q = new QueryBuilder(db.getDb()); + }); + + afterEach(() => { + db.close(); + if (fs.existsSync(dir)) fs.rmSync(dir, { recursive: true, force: true }); + }); + + it('returns a Map keyed by id, with one entry per existing node', () => { + q.insertNodes([makeNode('n1'), makeNode('n2'), makeNode('n3')]); + const out = q.getNodesByIds(['n1', 'n2', 'n3']); + expect(out.size).toBe(3); + expect(out.get('n1')!.name).toBe('n1'); + expect(out.get('n3')!.name).toBe('n3'); + }); + + it('omits missing IDs from the result map (no nulls, no exceptions)', () => { + q.insertNodes([makeNode('n1'), makeNode('n2')]); + const out = q.getNodesByIds(['n1', 'missing', 'n2']); + expect(out.size).toBe(2); + expect(out.has('missing')).toBe(false); + expect(out.has('n1')).toBe(true); + expect(out.has('n2')).toBe(true); + }); + + it('handles an empty input array', () => { + expect(q.getNodesByIds([]).size).toBe(0); + }); + + it('handles batches over the SQLite parameter limit (chunking)', () => { + // Insert 1500 nodes; the helper chunks at 500 internally. 
+ const nodes = Array.from({ length: 1500 }, (_, i) => makeNode(`n${i}`)); + q.insertNodes(nodes); + const ids = nodes.map((n) => n.id); + const out = q.getNodesByIds(ids); + expect(out.size).toBe(1500); + // Spot-check a few from the first / middle / last chunk. + expect(out.has('n0')).toBe(true); + expect(out.has('n750')).toBe(true); + expect(out.has('n1499')).toBe(true); + }); + + it('serves cache hits from memory and queries only the misses', () => { + q.insertNodes([makeNode('n1'), makeNode('n2'), makeNode('n3')]); + // Warm the cache for n1 only. + q.getNodeById('n1'); + // Replace the underlying row to make a miss-vs-cache-hit detectable. + db.getDb().prepare('UPDATE nodes SET name = ? WHERE id = ?').run('changed', 'n1'); + const out = q.getNodesByIds(['n1', 'n2']); + // The cached n1 (still 'n1', not 'changed') must be returned. + expect(out.get('n1')!.name).toBe('n1'); + expect(out.get('n2')!.name).toBe('n2'); + }); +}); + +describe('insertNode cache invalidation', () => { + let dir: string; + let db: DatabaseConnection; + let q: QueryBuilder; + + beforeEach(() => { + dir = fs.mkdtempSync(path.join(os.tmpdir(), 'db-perf-cache-')); + db = DatabaseConnection.initialize(path.join(dir, 'test.db')); + q = new QueryBuilder(db.getDb()); + }); + + afterEach(() => { + db.close(); + if (fs.existsSync(dir)) fs.rmSync(dir, { recursive: true, force: true }); + }); + + it('does not serve a stale cached node after INSERT OR REPLACE', () => { + // Regression: insertNode (which uses INSERT OR REPLACE) used to skip + // cache invalidation, so the next getNodeById returned the pre-replace + // version until LRU eviction. + const original = makeNode('n1', 'oldName'); + q.insertNode(original); + const beforeReplace = q.getNodeById('n1'); + expect(beforeReplace!.name).toBe('oldName'); + + // Replace via insertNode (the bug path). 
+ q.insertNode({ ...original, name: 'newName', updatedAt: Date.now() }); + const afterReplace = q.getNodeById('n1'); + expect(afterReplace!.name).toBe('newName'); + }); +}); + +describe('runMaintenance', () => { + let dir: string; + let db: DatabaseConnection; + + beforeEach(() => { + dir = fs.mkdtempSync(path.join(os.tmpdir(), 'db-perf-maint-')); + db = DatabaseConnection.initialize(path.join(dir, 'test.db')); + }); + + afterEach(() => { + db.close(); + if (fs.existsSync(dir)) fs.rmSync(dir, { recursive: true, force: true }); + }); + + it('runs without throwing on a fresh database', () => { + expect(() => db.runMaintenance()).not.toThrow(); + }); + + it('runs without throwing after writes', () => { + const q = new QueryBuilder(db.getDb()); + q.insertNodes([makeNode('n1'), makeNode('n2')]); + expect(() => db.runMaintenance()).not.toThrow(); + }); + + it('swallows failures rather than propagating (best-effort)', () => { + // Close the DB so the underlying handle would normally throw on any + // exec(). runMaintenance must still not propagate. + db.close(); + expect(() => db.runMaintenance()).not.toThrow(); + }); +}); diff --git a/__tests__/drupal.test.ts b/__tests__/drupal.test.ts new file mode 100644 index 00000000..c4f4421e --- /dev/null +++ b/__tests__/drupal.test.ts @@ -0,0 +1,609 @@ +/** + * Tests for Drupal framework resolver. + * + * Unit tests cover drupalResolver.detect(), extract() (routes + hooks), and resolve(). + * Integration tests use a real CodeGraph instance with a temporary Drupal project layout. 
+ */ + +import * as fs from 'fs'; +import * as os from 'os'; +import * as path from 'path'; +import { afterEach, beforeAll, describe, expect, it } from 'vitest'; +import { CodeGraph } from '../src'; +import { initGrammars, loadAllGrammars } from '../src/extraction/grammars'; +import { drupalResolver } from '../src/resolution/frameworks/drupal'; +import type { ResolutionContext } from '../src/resolution/types'; + +// --------------------------------------------------------------------------- +// Helpers +// --------------------------------------------------------------------------- + +function makeContext( + overrides: Partial<ResolutionContext> = {}, +): ResolutionContext { + return { + getNodesInFile: () => [], + getNodesByName: () => [], + getNodesByQualifiedName: () => [], + getNodesByKind: () => [], + fileExists: () => false, + readFile: () => null, + getProjectRoot: () => '/project', + getAllFiles: () => [], + getNodesByLowerName: () => [], + getImportMappings: () => [], + ...overrides, + }; +} + +// --------------------------------------------------------------------------- +// detect() +// --------------------------------------------------------------------------- + +describe('drupalResolver.detect', () => { + it('returns true when composer.json has a drupal/ dependency', () => { + const ctx = makeContext({ + readFile: (f) => + f === 'composer.json' + ? JSON.stringify({ + require: { + 'drupal/core-recommended': '~10.5', + 'drush/drush': '^13', + }, + }) + : null, + }); + expect(drupalResolver.detect(ctx)).toBe(true); + }); + + it('returns true when drupal/ dependency is in require-dev', () => { + const ctx = makeContext({ + readFile: (f) => + f === 'composer.json' + ? JSON.stringify({ 'require-dev': { 'drupal/core': '^10' } }) + : null, + }); + expect(drupalResolver.detect(ctx)).toBe(true); + }); + + it('returns false when composer.json has no drupal/ dependencies', () => { + const ctx = makeContext({ + readFile: (f) => + f === 'composer.json' + ? 
JSON.stringify({ + require: { 'laravel/framework': '^10', php: '>=8.1' }, + }) + : null, + }); + expect(drupalResolver.detect(ctx)).toBe(false); + }); + + it('returns false when composer.json is absent', () => { + const ctx = makeContext({ readFile: () => null }); + expect(drupalResolver.detect(ctx)).toBe(false); + }); + + it('returns false when composer.json is malformed JSON', () => { + const ctx = makeContext({ readFile: () => '{ bad json' }); + expect(drupalResolver.detect(ctx)).toBe(false); + }); + + it('returns true for a contrib module with empty require (composer name/type)', () => { + const ctx = makeContext({ + readFile: (f) => + f === 'composer.json' + ? JSON.stringify({ + name: 'drupal/admin_toolbar', + type: 'drupal-module', + require: {}, + }) + : null, + }); + expect(drupalResolver.detect(ctx)).toBe(true); + }); + + it('returns true via the *.info.yml fallback when composer.json is absent', () => { + const ctx = makeContext({ + readFile: () => null, + getAllFiles: () => [ + 'mymodule/mymodule.info.yml', + 'mymodule/mymodule.routing.yml', + ], + }); + expect(drupalResolver.detect(ctx)).toBe(true); + }); + + it('returns false for a stray *.info.yml with no Drupal PHP/route file', () => { + const ctx = makeContext({ + readFile: () => null, + getAllFiles: () => ['some/unrelated.info.yml'], + }); + expect(drupalResolver.detect(ctx)).toBe(false); + }); +}); + +describe('drupalResolver.claimsReference', () => { + it('claims FQCN handler refs and hook names the pre-filter would drop', () => { + expect(drupalResolver.claimsReference!('\\Drupal\\m\\Form\\SettingsForm')).toBe(true); + expect(drupalResolver.claimsReference!('\\Drupal\\m\\Controller\\C:setNoJsCookie')).toBe(true); + expect(drupalResolver.claimsReference!('hook_form_alter')).toBe(true); + }); + + it('does not claim ordinary identifiers or entity-handler dotted refs', () => { + expect(drupalResolver.claimsReference!('someHelperFunction')).toBe(false); + 
expect(drupalResolver.claimsReference!('comment.default')).toBe(false); + }); +}); + +// --------------------------------------------------------------------------- +// extract() — routing.yml +// --------------------------------------------------------------------------- + +describe('drupalResolver.extract — routing.yml', () => { + const routing = ` +mymodule.example: + path: '/mymodule/example' + defaults: + _controller: '\\Drupal\\mymodule\\Controller\\MyController::build' + _title: 'Example page' + requirements: + _permission: 'access content' +`; + + it('emits a route node for each YAML route', () => { + const { nodes } = drupalResolver.extract!( + 'mymodule/mymodule.routing.yml', + routing, + ); + expect(nodes).toHaveLength(1); + expect(nodes[0]!.kind).toBe('route'); + expect(nodes[0]!.name).toBe('/mymodule/example'); + }); + + it('sets qualifiedName to filePath::routeName', () => { + const { nodes } = drupalResolver.extract!( + 'mymodule/mymodule.routing.yml', + routing, + ); + expect(nodes[0]!.qualifiedName).toBe( + 'mymodule/mymodule.routing.yml::mymodule.example', + ); + }); + + it('emits a references edge to the controller FQCN', () => { + const { references } = drupalResolver.extract!( + 'mymodule/mymodule.routing.yml', + routing, + ); + expect(references).toHaveLength(1); + expect(references[0]!.referenceName).toBe( + '\\Drupal\\mymodule\\Controller\\MyController::build', + ); + expect(references[0]!.referenceKind).toBe('references'); + }); + + it('emits a references edge to a _form handler', () => { + const src = ` +mymodule.settings_form: + path: '/admin/config/mymodule' + defaults: + _form: '\\Drupal\\mymodule\\Form\\SettingsForm' + _title: 'MyModule settings' + requirements: + _permission: 'administer site configuration' +`; + const { nodes, references } = drupalResolver.extract!( + 'mymodule/mymodule.routing.yml', + src, + ); + expect(nodes).toHaveLength(1); + expect(references[0]!.referenceName).toBe( + '\\Drupal\\mymodule\\Form\\SettingsForm', + 
); + }); + + it('handles multiple routes in one file', () => { + const src = ` +mod.page_one: + path: '/page-one' + defaults: + _controller: '\\Drupal\\mod\\Controller\\PageController::one' + requirements: + _permission: 'access content' + +mod.page_two: + path: '/page-two' + defaults: + _controller: '\\Drupal\\mod\\Controller\\PageController::two' + requirements: + _permission: 'access content' +`; + const { nodes, references } = drupalResolver.extract!( + 'mod/mod.routing.yml', + src, + ); + expect(nodes).toHaveLength(2); + expect(nodes.map((n) => n.name)).toContain('/page-one'); + expect(nodes.map((n) => n.name)).toContain('/page-two'); + expect(references).toHaveLength(2); + }); + + it('skips commented-out lines', () => { + const src = ` +mod.page: + path: '/page' + defaults: + #_controller: '\\Drupal\\mod\\Controller\\Old::build' + _controller: '\\Drupal\\mod\\Controller\\New::build' + requirements: + _permission: 'access content' +`; + const { references } = drupalResolver.extract!('mod/mod.routing.yml', src); + expect(references).toHaveLength(1); + expect(references[0]!.referenceName).toContain('New'); + }); + + it('includes HTTP methods in the route node name when present', () => { + const src = ` +mod.api: + path: '/api/resource' + defaults: + _controller: '\\Drupal\\mod\\Controller\\ApiController::get' + methods: [GET, POST] + requirements: + _permission: 'access content' +`; + const { nodes } = drupalResolver.extract!('mod/mod.routing.yml', src); + expect(nodes[0]!.name).toContain('GET'); + expect(nodes[0]!.name).toContain('POST'); + }); + + it('returns empty result for non-routing-yml files', () => { + const { nodes, references } = drupalResolver.extract!( + 'mymodule.module', + ' { + const { nodes, references } = drupalResolver.extract!( + 'some.routing.yml', + '# empty\n', + ); + expect(nodes).toHaveLength(0); + expect(references).toHaveLength(0); + }); +}); + +// --------------------------------------------------------------------------- +// 
extract() — hook detection in .module files +// --------------------------------------------------------------------------- + +describe('drupalResolver.extract — hook detection', () => { + it('detects hook implementation via docblock (Strategy A)', () => { + const src = ` r.referenceName === 'hook_form_alter', + ); + expect(hookRef).toBeDefined(); + expect(hookRef!.referenceKind).toBe('references'); + }); + + it('detects hook implementation via name pattern (Strategy B)', () => { + const src = ` r.referenceName === 'hook_views_data', + ); + expect(hookRef).toBeDefined(); + }); + + it('does not emit a hook ref for non-hook helper functions', () => { + // 'other_module_helper' doesn't start with 'mymodule_', so no hook ref + const src = ` { + const src = ` r.referenceName === 'hook_schema'); + expect(hookRef).toBeDefined(); + }); + + it('detects hooks in .theme files', () => { + const src = ` r.referenceName === 'hook_preprocess_node', + ); + expect(hookRef).toBeDefined(); + }); + + it('does not duplicate refs when both docblock and name pattern match', () => { + // Strategy A matches first and adds to docblockMatched set; + // Strategy B skips already-matched functions. 
+ const src = ` r.referenceName === 'hook_form_alter', + ); + expect(hookRefs).toHaveLength(1); + }); +}); + +// --------------------------------------------------------------------------- +// resolve() +// --------------------------------------------------------------------------- + +describe('drupalResolver.resolve', () => { + it('resolves a _controller FQCN with ::method to the method node', () => { + const methodNode = { + id: 'method:abc123', + kind: 'method' as const, + name: 'build', + qualifiedName: 'MyController::build', + filePath: 'web/modules/custom/mymodule/src/Controller/MyController.php', + language: 'php' as const, + startLine: 10, + endLine: 20, + startColumn: 0, + endColumn: 0, + updatedAt: 0, + }; + const classNode = { + id: 'class:def456', + kind: 'class' as const, + name: 'MyController', + qualifiedName: 'MyController', + filePath: 'web/modules/custom/mymodule/src/Controller/MyController.php', + language: 'php' as const, + startLine: 5, + endLine: 30, + startColumn: 0, + endColumn: 0, + updatedAt: 0, + }; + const ctx = makeContext({ + getNodesByName: (name) => (name === 'MyController' ? 
[classNode] : []), + getNodesInFile: () => [classNode, methodNode], + }); + const ref = { + fromNodeId: 'route:x', + referenceName: '\\Drupal\\mymodule\\Controller\\MyController::build', + referenceKind: 'references' as const, + line: 1, + column: 0, + filePath: 'mymodule.routing.yml', + language: 'yaml' as const, + }; + const resolved = drupalResolver.resolve(ref, ctx); + expect(resolved).not.toBeNull(); + expect(resolved!.targetNodeId).toBe('method:abc123'); + expect(resolved!.confidence).toBeGreaterThanOrEqual(0.85); + }); + + it('resolves a _form FQCN (no ::method) to the class node', () => { + const classNode = { + id: 'class:form123', + kind: 'class' as const, + name: 'SettingsForm', + qualifiedName: 'SettingsForm', + filePath: 'web/modules/custom/mymodule/src/Form/SettingsForm.php', + language: 'php' as const, + startLine: 1, + endLine: 50, + startColumn: 0, + endColumn: 0, + updatedAt: 0, + }; + const ctx = makeContext({ + getNodesByName: (name) => (name === 'SettingsForm' ? [classNode] : []), + }); + const ref = { + fromNodeId: 'route:x', + referenceName: '\\Drupal\\mymodule\\Form\\SettingsForm', + referenceKind: 'references' as const, + line: 1, + column: 0, + filePath: 'mymodule.routing.yml', + language: 'yaml' as const, + }; + const resolved = drupalResolver.resolve(ref, ctx); + expect(resolved).not.toBeNull(); + expect(resolved!.targetNodeId).toBe('class:form123'); + }); + + it('returns null when the target class cannot be found', () => { + const ctx = makeContext({ getNodesByName: () => [] }); + const ref = { + fromNodeId: 'route:x', + referenceName: '\\Drupal\\mymodule\\Controller\\Missing::method', + referenceKind: 'references' as const, + line: 1, + column: 0, + filePath: 'mymodule.routing.yml', + language: 'yaml' as const, + }; + expect(drupalResolver.resolve(ref, ctx)).toBeNull(); + }); + + it('resolves a single-colon controller-service ref (Class:method)', () => { + const methodNode = { + id: 'method:nojs1', + kind: 'method' as const, + name: 
'setNoJsCookie', + qualifiedName: 'BigPipeController::setNoJsCookie', + filePath: 'core/modules/big_pipe/src/Controller/BigPipeController.php', + language: 'php' as const, + startLine: 10, + endLine: 20, + startColumn: 0, + endColumn: 0, + updatedAt: 0, + }; + const classNode = { + id: 'class:nojs2', + kind: 'class' as const, + name: 'BigPipeController', + qualifiedName: 'BigPipeController', + filePath: 'core/modules/big_pipe/src/Controller/BigPipeController.php', + language: 'php' as const, + startLine: 5, + endLine: 30, + startColumn: 0, + endColumn: 0, + updatedAt: 0, + }; + const ctx = makeContext({ + getNodesByName: (name) => (name === 'BigPipeController' ? [classNode] : []), + getNodesInFile: () => [classNode, methodNode], + }); + const ref = { + fromNodeId: 'route:x', + referenceName: '\\Drupal\\big_pipe\\Controller\\BigPipeController:setNoJsCookie', + referenceKind: 'references' as const, + line: 1, + column: 0, + filePath: 'big_pipe.routing.yml', + language: 'yaml' as const, + }; + const resolved = drupalResolver.resolve(ref, ctx); + expect(resolved).not.toBeNull(); + expect(resolved!.targetNodeId).toBe('method:nojs1'); + }); +}); + +// --------------------------------------------------------------------------- +// End-to-end integration test +// --------------------------------------------------------------------------- + +beforeAll(async () => { + await initGrammars(); + await loadAllGrammars(); +}); + +describe('Drupal end-to-end — route node linked to controller method', () => { + let tmpDir: string | undefined; + afterEach(() => { + if (tmpDir) fs.rmSync(tmpDir, { recursive: true, force: true }); + tmpDir = undefined; + }); + + it('creates a route→controller edge from routing.yml to PHP class', async () => { + tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-drupal-')); + + // Minimal composer.json to trigger Drupal detection + fs.writeFileSync( + path.join(tmpDir, 'composer.json'), + JSON.stringify({ require: { 'drupal/core-recommended': '~10.5' } 
}), + ); + + // Module directory structure + const modDir = path.join(tmpDir, 'web', 'modules', 'custom', 'my_module'); + fs.mkdirSync(path.join(modDir, 'src', 'Controller'), { recursive: true }); + + // routing.yml + fs.writeFileSync( + path.join(modDir, 'my_module.routing.yml'), + [ + 'my_module.hello:', + " path: '/hello'", + ' defaults:', + " _controller: '\\Drupal\\my_module\\Controller\\HelloController::build'", + " _title: 'Hello'", + ' requirements:', + " _permission: 'access content'", + ].join('\n') + '\n', + ); + + // PHP controller + fs.writeFileSync( + path.join(modDir, 'src', 'Controller', 'HelloController.php'), + [ + ' 'Hello'];", + ' }', + '}', + ].join('\n') + '\n', + ); + + const cg = CodeGraph.initSync(tmpDir); + await cg.indexAll(); + + // Route node must exist + const routes = cg.getNodesByKind('route'); + expect(routes.length).toBeGreaterThan(0); + const route = routes.find((n) => n.name.includes('/hello')); + expect(route).toBeDefined(); + + // Controller method must be indexed + const methods = cg.getNodesByKind('method'); + const buildMethod = methods.find((n) => n.name === 'build'); + expect(buildMethod).toBeDefined(); + + // Edge: route → build method (or class fallback) + const edges = cg.getOutgoingEdges(route!.id); + expect(edges.length).toBeGreaterThan(0); + + cg.close(); + }); +}); diff --git a/__tests__/explore-output-budget.test.ts b/__tests__/explore-output-budget.test.ts new file mode 100644 index 00000000..65ddc648 --- /dev/null +++ b/__tests__/explore-output-budget.test.ts @@ -0,0 +1,234 @@ +/** + * Adaptive output budget for codegraph_explore (#185). + * + * The explore tool used to apply a fixed 35KB output cap regardless of + * project size, which on small codebases was a net loss vs. native + * grep+Read. These tests pin the per-tier budget shape so future tuning + * doesn't silently drift the small-project case back into bloat. 
+ */ +import { describe, it, expect, beforeAll, afterAll } from 'vitest'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; +import { getExploreOutputBudget, getExploreBudget, ToolHandler } from '../src/mcp/tools'; +import CodeGraph from '../src/index'; + +describe('getExploreOutputBudget', () => { + it('returns a strictly smaller total cap for small projects than for huge ones', () => { + const small = getExploreOutputBudget(100); + const huge = getExploreOutputBudget(30000); + expect(small.maxOutputChars).toBeLessThan(huge.maxOutputChars); + expect(small.defaultMaxFiles).toBeLessThan(huge.defaultMaxFiles); + expect(small.maxCharsPerFile).toBeLessThan(huge.maxCharsPerFile); + }); + + it('caps total output well under 8000 tokens (~32k chars) on small projects', () => { + const small = getExploreOutputBudget(100); + expect(small.maxOutputChars).toBeLessThanOrEqual(20000); + }); + + it('keeps the historical 35k+ ceiling for medium-large projects so existing benchmarks do not regress', () => { + const large = getExploreOutputBudget(10000); + expect(large.maxOutputChars).toBeGreaterThanOrEqual(35000); + }); + + it('uses tier breakpoints matching getExploreBudget so call-count and output-budget agree on a project', () => { + // Anything in the same tier should pick the same total-output cap. + const tier1a = getExploreOutputBudget(50); + const tier1b = getExploreOutputBudget(499); + expect(tier1a.maxOutputChars).toBe(tier1b.maxOutputChars); + expect(getExploreBudget(50)).toBe(getExploreBudget(499)); + + const tier2a = getExploreOutputBudget(500); + const tier2b = getExploreOutputBudget(4999); + expect(tier2a.maxOutputChars).toBe(tier2b.maxOutputChars); + expect(getExploreBudget(500)).toBe(getExploreBudget(4999)); + + const tier3a = getExploreOutputBudget(5000); + const tier3b = getExploreOutputBudget(14999); + expect(tier3a.maxOutputChars).toBe(tier3b.maxOutputChars); + + // And crossing a breakpoint changes the cap. 
+ expect(tier1a.maxOutputChars).not.toBe(tier2a.maxOutputChars); + expect(tier2a.maxOutputChars).not.toBe(tier3a.maxOutputChars); + }); + + it('gates off "Additional relevant files", completeness signal, and budget note on small projects', () => { + const small = getExploreOutputBudget(100); + expect(small.includeAdditionalFiles).toBe(false); + expect(small.includeCompletenessSignal).toBe(false); + expect(small.includeBudgetNote).toBe(false); + }); + + it('keeps all meta-text on for projects that earn the breadth signal (>=500 files)', () => { + const medium = getExploreOutputBudget(1000); + expect(medium.includeAdditionalFiles).toBe(true); + expect(medium.includeCompletenessSignal).toBe(true); + expect(medium.includeBudgetNote).toBe(true); + }); + + it('keeps the Relationships section on for every tier — it is the cheapest structural signal', () => { + expect(getExploreOutputBudget(50).includeRelationships).toBe(true); + expect(getExploreOutputBudget(1000).includeRelationships).toBe(true); + expect(getExploreOutputBudget(10000).includeRelationships).toBe(true); + expect(getExploreOutputBudget(30000).includeRelationships).toBe(true); + }); + + it('caps the per-file header symbol list more tightly on small projects', () => { + // Without this cap, a file like Alamofire's Session.swift produced + // a 3.4KB symbol list in the `#### path — sym, sym, ...` header, + // dwarfing the per-file body cap. 
+ const small = getExploreOutputBudget(100); + const huge = getExploreOutputBudget(30000); + expect(small.maxSymbolsInFileHeader).toBeLessThan(huge.maxSymbolsInFileHeader); + expect(small.maxSymbolsInFileHeader).toBeGreaterThan(0); + }); + + it('uses a tighter clustering gap threshold on small projects to break runaway single clusters', () => { + const small = getExploreOutputBudget(100); + const huge = getExploreOutputBudget(30000); + expect(small.gapThreshold).toBeLessThanOrEqual(huge.gapThreshold); + }); + + it('handles the boundary file counts exactly (off-by-one regression guard)', () => { + // 499 -> small tier, 500 -> medium tier + expect(getExploreOutputBudget(499).maxOutputChars).toBe(getExploreOutputBudget(100).maxOutputChars); + expect(getExploreOutputBudget(500).maxOutputChars).toBe(getExploreOutputBudget(1000).maxOutputChars); + // 4999 -> medium, 5000 -> large + expect(getExploreOutputBudget(4999).maxOutputChars).toBe(getExploreOutputBudget(1000).maxOutputChars); + expect(getExploreOutputBudget(5000).maxOutputChars).toBe(getExploreOutputBudget(10000).maxOutputChars); + // 14999 -> large, 15000 -> xlarge + expect(getExploreOutputBudget(14999).maxOutputChars).toBe(getExploreOutputBudget(10000).maxOutputChars); + expect(getExploreOutputBudget(15000).maxOutputChars).toBe(getExploreOutputBudget(30000).maxOutputChars); + }); +}); + +/** + * End-to-end check that the budget is actually applied by handleExplore. + * + * Builds a tiny synthetic project (<500 files, so the small tier), indexes + * it, and confirms the output: + * - stays under the small-tier maxOutputChars cap + * - omits the meta-text the small tier gates off (completeness signal, + * budget note, "Additional relevant files") + * + * Regression guard for #185 — protects against future edits to handleExplore + * silently re-introducing the fixed 35KB cap on small projects. 
+ */ +describe('codegraph_explore output respects the adaptive budget', () => { + let testDir: string; + let cg: CodeGraph; + let handler: ToolHandler; + + beforeAll(async () => { + testDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-explore-budget-')); + const srcDir = path.join(testDir, 'src'); + fs.mkdirSync(srcDir); + + // A handful of files with one fat target file. The fat file mimics the + // Alamofire Session.swift case: many methods stacked on top of each other, + // which collapsed into one giant cluster pre-#185. + const fatLines: string[] = ['export class Session {']; + for (let i = 0; i < 30; i++) { + fatLines.push(` method${i}(arg: string): string {`); + fatLines.push(` return this.helper${i}(arg) + "${i}";`); + fatLines.push(` }`); + fatLines.push(` private helper${i}(arg: string): string {`); + fatLines.push(` return arg.repeat(${i + 1});`); + fatLines.push(` }`); + } + fatLines.push('}'); + fs.writeFileSync(path.join(srcDir, 'session.ts'), fatLines.join('\n')); + + // A few small supporting files so the project has >1 indexed file. + for (let i = 0; i < 5; i++) { + fs.writeFileSync( + path.join(srcDir, `support${i}.ts`), + `import { Session } from './session';\nexport function callSession${i}(s: Session) { return s.method${i}('hi'); }\n` + ); + } + + cg = CodeGraph.initSync(testDir, { + config: { include: ['**/*.ts'], exclude: [] }, + }); + await cg.indexAll(); + handler = new ToolHandler(cg); + }); + + afterAll(() => { + if (cg) cg.destroy(); + if (testDir && fs.existsSync(testDir)) { + fs.rmSync(testDir, { recursive: true, force: true }); + } + }); + + it('keeps total output under the small-project cap', async () => { + const result = await handler.execute('codegraph_explore', { query: 'Session method helper' }); + const text = result.content?.[0]?.text ?? 
''; + const smallBudget = getExploreOutputBudget(100); + // Allow a small overshoot for the trailing markers — the cap is enforced + // per-file rather than as an absolute output ceiling. + expect(text.length).toBeLessThan(smallBudget.maxOutputChars + 500); + }); + + it('omits the meta-text gated off for small projects', async () => { + const result = await handler.execute('codegraph_explore', { query: 'Session method helper' }); + const text = result.content?.[0]?.text ?? ''; + expect(text).not.toContain('### Additional relevant files'); + expect(text).not.toContain('Complete source code is included above'); + expect(text).not.toContain('Explore budget:'); + }); + + it('still includes the Relationships section — it is the cheapest structural signal', async () => { + const result = await handler.execute('codegraph_explore', { query: 'Session method helper' }); + const text = result.content?.[0]?.text ?? ''; + // Either there are relationships, or no edges were significant — both are fine. + // We just want to confirm we did not accidentally gate it off. + const hasRelationships = text.includes('### Relationships'); + const sourceFollowsHeader = text.indexOf('### Source Code') > 0; + expect(hasRelationships || sourceFollowsHeader).toBe(true); + }); + + it('prefixes source lines with line numbers by default (cat -n style)', async () => { + delete process.env.CODEGRAPH_EXPLORE_LINENUMS; + const result = await handler.execute('codegraph_explore', { query: 'Session method helper' }); + const text = result.content?.[0]?.text ?? ''; + // At least one fenced source line should look like `<num>\t<code>`. + expect(/\n\d+\t/.test(text)).toBe(true); + }); + + it('omits line numbers when CODEGRAPH_EXPLORE_LINENUMS=0', async () => { + process.env.CODEGRAPH_EXPLORE_LINENUMS = '0'; + try { + const result = await handler.execute('codegraph_explore', { query: 'Session method helper' }); + const text = result.content?.[0]?.text ?? 
''; + // The synthetic source has no tab-prefixed numeric lines of its own, + // so none should appear when the toggle is off. + expect(/\n\d+\t(?:export| )/.test(text)).toBe(false); + } finally { + delete process.env.CODEGRAPH_EXPLORE_LINENUMS; + } + }); + + it('uses language-neutral omission markers (no C-style // in the output)', async () => { + // The gap/trimmed separators must not assume `//` is a comment — that's + // wrong in Python, Ruby, etc. They render inside fenced source blocks. + const result = await handler.execute('codegraph_explore', { query: 'Session method helper' }); + const text = result.content?.[0]?.text ?? ''; + expect(text).not.toContain('// ... (gap)'); + expect(text).not.toContain('// ... trimmed'); + }); + + it('does not collapse a whole-file class into just its header (envelope filter)', async () => { + // The synthetic `Session` class spans the entire file. Without the + // envelope filter it would form one giant cluster that tail-trims to + // the class declaration, hiding the methods. Confirm real method bodies + // make it into the output. Regression guard for the #185 follow-up. + const result = await handler.execute('codegraph_explore', { query: 'Session method helper' }); + const text = result.content?.[0]?.text ?? ''; + // A method body line (`methodN(arg: string)`) should appear, not just + // the `export class Session {` opener. 
+ const hasMethodBody = /method\d+\(arg: string\)/.test(text); + expect(hasMethodBody).toBe(true); + }); +}); diff --git a/__tests__/extraction.test.ts b/__tests__/extraction.test.ts index 8a70ffed..99c38345 100644 --- a/__tests__/extraction.test.ts +++ b/__tests__/extraction.test.ts @@ -9,10 +9,9 @@ import * as fs from 'fs'; import * as path from 'path'; import * as os from 'os'; import { CodeGraph } from '../src'; -import { extractFromSource, scanDirectory, shouldIncludeFile } from '../src/extraction'; +import { extractFromSource, scanDirectory } from '../src/extraction'; import { detectLanguage, isLanguageSupported, getSupportedLanguages, initGrammars, loadAllGrammars } from '../src/extraction/grammars'; import { normalizePath } from '../src/utils'; -import { DEFAULT_CONFIG } from '../src/types'; beforeAll(async () => { await initGrammars(); @@ -376,7 +375,7 @@ export const useUIStore = create((set) => ({ `; const result = extractFromSource('store.ts', code); - const varNode = result.nodes.find((n) => n.kind === 'variable' && n.name === 'useUIStore'); + const varNode = result.nodes.find((n) => n.kind === 'constant' && n.name === 'useUIStore'); expect(varNode).toBeDefined(); expect(varNode?.isExported).toBe(true); }); @@ -390,7 +389,7 @@ export const config = { `; const result = extractFromSource('config.ts', code); - const varNode = result.nodes.find((n) => n.kind === 'variable' && n.name === 'config'); + const varNode = result.nodes.find((n) => n.kind === 'constant' && n.name === 'config'); expect(varNode).toBeDefined(); expect(varNode?.isExported).toBe(true); }); @@ -401,7 +400,7 @@ export const SCREEN_NAMES = ['home', 'settings', 'profile'] as const; `; const result = extractFromSource('constants.ts', code); - const varNode = result.nodes.find((n) => n.kind === 'variable' && n.name === 'SCREEN_NAMES'); + const varNode = result.nodes.find((n) => n.kind === 'constant' && n.name === 'SCREEN_NAMES'); expect(varNode).toBeDefined(); 
expect(varNode?.isExported).toBe(true); }); @@ -413,7 +412,7 @@ export const API_VERSION = "v2"; `; const result = extractFromSource('constants.ts', code); - const variables = result.nodes.filter((n) => n.kind === 'variable'); + const variables = result.nodes.filter((n) => n.kind === 'constant'); expect(variables).toHaveLength(2); expect(variables.map((n) => n.name).sort()).toEqual(['API_VERSION', 'MAX_RETRIES']); }); @@ -457,7 +456,7 @@ export const userSchema = z.object({ `; const result = extractFromSource('schemas.ts', code); - const varNode = result.nodes.find((n) => n.kind === 'variable' && n.name === 'userSchema'); + const varNode = result.nodes.find((n) => n.kind === 'constant' && n.name === 'userSchema'); expect(varNode).toBeDefined(); expect(varNode?.isExported).toBe(true); }); @@ -475,7 +474,7 @@ export const authMachine = createMachine({ `; const result = extractFromSource('machine.ts', code); - const varNode = result.nodes.find((n) => n.kind === 'variable' && n.name === 'authMachine'); + const varNode = result.nodes.find((n) => n.kind === 'constant' && n.name === 'authMachine'); expect(varNode).toBeDefined(); expect(varNode?.isExported).toBe(true); }); @@ -1152,6 +1151,11 @@ class UserService { const privateMethod = methodNodes.find((m) => m.name === '_privateMethod'); expect(privateMethod).toBeDefined(); expect(privateMethod?.visibility).toBe('private'); + + // Dart models a method body as a SIBLING of the signature, so the method + // node must be extended to span its body (not just the signature line) — + // required for body-level analysis (callees, the callback synthesizer). 
+ expect(findById!.endLine).toBeGreaterThan(findById!.startLine); }); it('should extract top-level function declarations', () => { @@ -3003,39 +3007,57 @@ describe('Directory Exclusion', () => { cleanupTempDir(tempDir); }); - it('should exclude node_modules directories', () => { - // Create structure: src/index.ts + node_modules/pkg/index.js + it('should exclude directories listed in .gitignore', () => { + // Create structure: src/index.ts + node_modules/pkg/index.js, gitignore node_modules const srcDir = path.join(tempDir, 'src'); const nmDir = path.join(tempDir, 'node_modules', 'pkg'); fs.mkdirSync(srcDir, { recursive: true }); fs.mkdirSync(nmDir, { recursive: true }); fs.writeFileSync(path.join(srcDir, 'index.ts'), 'export const x = 1;'); fs.writeFileSync(path.join(nmDir, 'index.js'), 'module.exports = {};'); + fs.writeFileSync(path.join(tempDir, '.gitignore'), 'node_modules/\n'); - const config = { ...DEFAULT_CONFIG, rootDir: tempDir }; - const files = scanDirectory(tempDir, config); + const files = scanDirectory(tempDir); expect(files).toContain('src/index.ts'); expect(files.every((f) => !f.includes('node_modules'))).toBe(true); }); - it('should exclude nested node_modules directories', () => { - // Create structure: packages/app/node_modules/pkg/index.js + it('should exclude nested node_modules via a root .gitignore', () => { + // A trailing-slash pattern with no leading slash matches at any depth. 
const srcDir = path.join(tempDir, 'packages', 'app', 'src'); const nmDir = path.join(tempDir, 'packages', 'app', 'node_modules', 'pkg'); fs.mkdirSync(srcDir, { recursive: true }); fs.mkdirSync(nmDir, { recursive: true }); fs.writeFileSync(path.join(srcDir, 'index.ts'), 'export const x = 1;'); fs.writeFileSync(path.join(nmDir, 'index.js'), 'module.exports = {};'); + fs.writeFileSync(path.join(tempDir, '.gitignore'), 'node_modules/\n'); - const config = { ...DEFAULT_CONFIG, rootDir: tempDir }; - const files = scanDirectory(tempDir, config); + const files = scanDirectory(tempDir); expect(files).toContain('packages/app/src/index.ts'); expect(files.every((f) => !f.includes('node_modules'))).toBe(true); }); - it('should exclude .git directories', () => { + it('should apply a nested .gitignore only to its own subtree', () => { + const appSrc = path.join(tempDir, 'app', 'src'); + fs.mkdirSync(appSrc, { recursive: true }); + fs.writeFileSync(path.join(appSrc, 'keep.ts'), 'export const a = 1;'); + fs.writeFileSync(path.join(appSrc, 'skip.ts'), 'export const b = 2;'); + fs.writeFileSync(path.join(tempDir, 'app', '.gitignore'), 'src/skip.ts\n'); + // A sibling with the same name outside app/ must NOT be ignored. 
+ const otherDir = path.join(tempDir, 'other', 'src'); + fs.mkdirSync(otherDir, { recursive: true }); + fs.writeFileSync(path.join(otherDir, 'skip.ts'), 'export const c = 3;'); + + const files = scanDirectory(tempDir); + + expect(files).toContain('app/src/keep.ts'); + expect(files).not.toContain('app/src/skip.ts'); + expect(files).toContain('other/src/skip.ts'); + }); + + it('should always skip .git directories', () => { const srcDir = path.join(tempDir, 'src'); const gitDir = path.join(tempDir, '.git', 'objects'); fs.mkdirSync(srcDir, { recursive: true }); @@ -3043,8 +3065,7 @@ describe('Directory Exclusion', () => { fs.writeFileSync(path.join(srcDir, 'index.ts'), 'export const x = 1;'); fs.writeFileSync(path.join(gitDir, 'pack.ts'), 'export const y = 2;'); - const config = { ...DEFAULT_CONFIG, rootDir: tempDir }; - const files = scanDirectory(tempDir, config); + const files = scanDirectory(tempDir); expect(files).toContain('src/index.ts'); expect(files.every((f) => !f.includes('.git'))).toBe(true); @@ -3055,27 +3076,827 @@ describe('Directory Exclusion', () => { fs.mkdirSync(srcDir, { recursive: true }); fs.writeFileSync(path.join(srcDir, 'Button.tsx'), 'export function Button() {}'); - const config = { ...DEFAULT_CONFIG, rootDir: tempDir }; - const files = scanDirectory(tempDir, config); + const files = scanDirectory(tempDir); expect(files.length).toBe(1); expect(files[0]).toBe('src/components/Button.tsx'); expect(files[0]).not.toContain('\\'); }); +}); - it('should respect .codegraphignore marker', () => { - const srcDir = path.join(tempDir, 'src'); - const vendorDir = path.join(tempDir, 'vendor'); - fs.mkdirSync(srcDir, { recursive: true }); - fs.mkdirSync(vendorDir, { recursive: true }); - fs.writeFileSync(path.join(srcDir, 'index.ts'), 'export const x = 1;'); - fs.writeFileSync(path.join(vendorDir, 'lib.ts'), 'export const y = 2;'); - fs.writeFileSync(path.join(vendorDir, '.codegraphignore'), ''); +describe('Git Submodules', () => { + let tempDir: string; - 
const config = { ...DEFAULT_CONFIG, rootDir: tempDir }; - const files = scanDirectory(tempDir, config); + beforeEach(() => { + tempDir = createTempDir(); + }); - expect(files).toContain('src/index.ts'); - expect(files.every((f) => !f.includes('vendor'))).toBe(true); + afterEach(() => { + cleanupTempDir(tempDir); + }); + + it('should index files inside git submodules (issue #147)', async () => { + const { execFileSync } = await import('child_process'); + const git = (cwd: string, ...args: string[]) => + execFileSync('git', args, { cwd, stdio: 'pipe' }); + + // Build a separate "library" repo to use as a submodule source. + const libDir = path.join(tempDir, '_lib'); + fs.mkdirSync(libDir, { recursive: true }); + git(libDir, 'init', '-q'); + git(libDir, 'config', 'user.email', 'test@test.com'); + git(libDir, 'config', 'user.name', 'Test'); + fs.writeFileSync(path.join(libDir, 'lib.ts'), 'export const fromSubmodule = 1;'); + git(libDir, 'add', '-A'); + git(libDir, 'commit', '-q', '-m', 'lib init'); + + // Build the main repo and add the lib repo as a submodule. + const mainDir = path.join(tempDir, 'main'); + fs.mkdirSync(mainDir, { recursive: true }); + git(mainDir, 'init', '-q'); + git(mainDir, 'config', 'user.email', 'test@test.com'); + git(mainDir, 'config', 'user.name', 'Test'); + fs.writeFileSync(path.join(mainDir, 'app.ts'), 'export const app = 1;'); + git(mainDir, 'add', '-A'); + git(mainDir, 'commit', '-q', '-m', 'app init'); + // protocol.file.allow=always is required to add a local-path submodule on + // recent git versions (CVE-2022-39253 mitigation). 
+ execFileSync( + 'git', + ['-c', 'protocol.file.allow=always', 'submodule', 'add', '-q', libDir, 'libs/lib'], + { cwd: mainDir, stdio: 'pipe' } + ); + git(mainDir, 'commit', '-q', '-m', 'add submodule'); + + const files = scanDirectory(mainDir); + + expect(files).toContain('app.ts'); + expect(files).toContain('libs/lib/lib.ts'); + }); +}); + +describe('Nested non-submodule git repos', () => { + let tempDir: string; + + beforeEach(() => { + tempDir = createTempDir(); + }); + + afterEach(() => { + cleanupTempDir(tempDir); + }); + + it('should index files in embedded git repos run from a git super-repo (issue #193)', async () => { + const { execFileSync } = await import('child_process'); + const git = (cwd: string, ...args: string[]) => + execFileSync('git', args, { cwd, stdio: 'pipe' }); + + // Top-level workspace is itself a git repo, holding no source directly — + // the CMake "super-repo" layout from the issue. + const root = path.join(tempDir, 'root'); + fs.mkdirSync(path.join(root, 'coding'), { recursive: true }); + git(root, 'init', '-q'); + git(root, 'config', 'user.email', 'test@test.com'); + git(root, 'config', 'user.name', 'Test'); + fs.writeFileSync(path.join(root, 'CMakeLists.txt'), 'cmake_minimum_required(VERSION 3.10)\n'); + + // Two independent clones living inside the workspace (NOT submodules): + // one with committed source, one with only untracked source. 
+ const sub1 = path.join(root, 'sub_repo1', 'src'); + fs.mkdirSync(sub1, { recursive: true }); + git(path.join(root, 'sub_repo1'), 'init', '-q'); + git(path.join(root, 'sub_repo1'), 'config', 'user.email', 'test@test.com'); + git(path.join(root, 'sub_repo1'), 'config', 'user.name', 'Test'); + fs.writeFileSync(path.join(sub1, 'one.ts'), 'export const one = 1;'); + git(path.join(root, 'sub_repo1'), 'add', '-A'); + git(path.join(root, 'sub_repo1'), 'commit', '-q', '-m', 'sub1 init'); + + const sub2 = path.join(root, 'sub_repo2', 'src'); + fs.mkdirSync(sub2, { recursive: true }); + git(path.join(root, 'sub_repo2'), 'init', '-q'); + fs.writeFileSync(path.join(sub2, 'two.ts'), 'export const two = 2;'); + + const files = scanDirectory(root); + + // Both committed and untracked source from the nested repos must be found. + expect(files).toContain('sub_repo1/src/one.ts'); + expect(files).toContain('sub_repo2/src/two.ts'); + }); + + it('should respect each embedded repo\'s own .gitignore', async () => { + const { execFileSync } = await import('child_process'); + const git = (cwd: string, ...args: string[]) => + execFileSync('git', args, { cwd, stdio: 'pipe' }); + + const root = path.join(tempDir, 'root'); + fs.mkdirSync(root, { recursive: true }); + git(root, 'init', '-q'); + + const sub = path.join(root, 'sub_repo', 'src'); + fs.mkdirSync(sub, { recursive: true }); + git(path.join(root, 'sub_repo'), 'init', '-q'); + fs.writeFileSync(path.join(root, 'sub_repo', '.gitignore'), 'src/generated.ts\n'); + fs.writeFileSync(path.join(sub, 'real.ts'), 'export const real = 1;'); + fs.writeFileSync(path.join(sub, 'generated.ts'), 'export const generated = 1;'); + + const files = scanDirectory(root); + + expect(files).toContain('sub_repo/src/real.ts'); + expect(files).not.toContain('sub_repo/src/generated.ts'); + }); +}); + +// ============================================================================= +// Scala +// 
============================================================================= + +describe('Scala Extraction', () => { + describe('Language detection', () => { + it('should detect Scala files', () => { + expect(detectLanguage('Main.scala')).toBe('scala'); + expect(detectLanguage('script.sc')).toBe('scala'); + expect(detectLanguage('src/UserService.scala')).toBe('scala'); + }); + + it('should report Scala as supported', () => { + expect(isLanguageSupported('scala')).toBe(true); + expect(getSupportedLanguages()).toContain('scala'); + }); + }); + + describe('Class extraction', () => { + it('should extract class definitions', () => { + const code = ` +class UserService(private val repo: UserRepository) { + def findUser(id: String): Option[String] = Some(id) +} +`; + const result = extractFromSource('UserService.scala', code); + const cls = result.nodes.find((n) => n.kind === 'class' && n.name === 'UserService'); + expect(cls).toBeDefined(); + expect(cls?.language).toBe('scala'); + }); + + it('should extract object definitions as class kind', () => { + const code = ` +object DatabaseConfig { + val url = "jdbc:postgresql://localhost/mydb" +} +`; + const result = extractFromSource('Config.scala', code); + const obj = result.nodes.find((n) => n.kind === 'class' && n.name === 'DatabaseConfig'); + expect(obj).toBeDefined(); + }); + + it('should extract trait definitions as trait kind', () => { + const code = ` +trait Repository[A] { + def findById(id: String): Option[A] + def save(entity: A): Unit +} +`; + const result = extractFromSource('Repository.scala', code); + const trait_ = result.nodes.find((n) => n.kind === 'trait' && n.name === 'Repository'); + expect(trait_).toBeDefined(); + }); + }); + + describe('Method and function extraction', () => { + it('should extract method definitions inside a class', () => { + const code = ` +class Calculator { + def add(a: Int, b: Int): Int = a + b + def divide(a: Double, b: Double): Double = a / b +} +`; + const result = 
extractFromSource('Calculator.scala', code); + const methods = result.nodes.filter((n) => n.kind === 'method'); + expect(methods.find((m) => m.name === 'add')).toBeDefined(); + expect(methods.find((m) => m.name === 'divide')).toBeDefined(); + }); + + it('should extract method signatures', () => { + const code = ` +class Greeter { + def greet(name: String): String = s"Hello, \${name}!" +} +`; + const result = extractFromSource('Greeter.scala', code); + const method = result.nodes.find((n) => n.name === 'greet'); + expect(method?.signature).toContain('name: String'); + expect(method?.signature).toContain('String'); + }); + + it('should extract top-level function definitions as functions', () => { + const code = ` +def factorial(n: Int): Int = if (n <= 1) 1 else n * factorial(n - 1) +def greet(name: String): String = s"Hello, \${name}!" +`; + const result = extractFromSource('utils.scala', code); + const fns = result.nodes.filter((n) => n.kind === 'function'); + expect(fns.find((f) => f.name === 'factorial')).toBeDefined(); + expect(fns.find((f) => f.name === 'greet')).toBeDefined(); + }); + }); + + describe('Val and var extraction', () => { + it('should extract val inside a class as field', () => { + const code = ` +class Config { + val timeout: Int = 30 + val host: String = "localhost" +} +`; + const result = extractFromSource('Config.scala', code); + const fields = result.nodes.filter((n) => n.kind === 'field'); + expect(fields.find((f) => f.name === 'timeout')).toBeDefined(); + expect(fields.find((f) => f.name === 'host')).toBeDefined(); + }); + + it('should extract var inside a class as field', () => { + const code = ` +class Counter { + var count: Int = 0 +} +`; + const result = extractFromSource('Counter.scala', code); + const field = result.nodes.find((n) => n.kind === 'field' && n.name === 'count'); + expect(field).toBeDefined(); + }); + + it('should extract top-level val as constant', () => { + const code = ` +val MaxConnections: Int = 100 +val 
DefaultTimeout = 30 +`; + const result = extractFromSource('constants.scala', code); + const consts = result.nodes.filter((n) => n.kind === 'constant'); + expect(consts.find((c) => c.name === 'MaxConnections')).toBeDefined(); + }); + + it('should extract top-level var as variable', () => { + const code = ` +var retries: Int = 3 +`; + const result = extractFromSource('state.scala', code); + const v = result.nodes.find((n) => n.kind === 'variable' && n.name === 'retries'); + expect(v).toBeDefined(); + }); + + it('should include type in val/var signature', () => { + const code = ` +class Service { + val timeout: Int = 30 +} +`; + const result = extractFromSource('Service.scala', code); + const field = result.nodes.find((n) => n.name === 'timeout'); + expect(field?.signature).toContain('timeout'); + expect(field?.signature).toContain('Int'); + }); + }); + + describe('Enum extraction', () => { + it('should extract enum definitions', () => { + const code = ` +enum Color: + case Red + case Green + case Blue +`; + const result = extractFromSource('Color.scala', code); + const enumNode = result.nodes.find((n) => n.kind === 'enum' && n.name === 'Color'); + expect(enumNode).toBeDefined(); + }); + + it('should extract enum cases as enum_member', () => { + const code = ` +enum Direction: + case North + case South + case East + case West +`; + const result = extractFromSource('Direction.scala', code); + const members = result.nodes.filter((n) => n.kind === 'enum_member'); + expect(members.find((m) => m.name === 'North')).toBeDefined(); + expect(members.find((m) => m.name === 'South')).toBeDefined(); + expect(members.length).toBeGreaterThanOrEqual(4); + }); + }); + + describe('Type alias extraction', () => { + it('should extract type aliases', () => { + const code = ` +type UserId = String +type UserMap = Map[String, String] +`; + const result = extractFromSource('types.scala', code); + const aliases = result.nodes.filter((n) => n.kind === 'type_alias'); + expect(aliases.find((a) 
=> a.name === 'UserId')).toBeDefined(); + expect(aliases.find((a) => a.name === 'UserMap')).toBeDefined(); + }); + }); + + describe('Import extraction', () => { + it('should extract import declarations', () => { + const code = ` +import scala.collection.mutable.ListBuffer +import scala.concurrent.Future +`; + const result = extractFromSource('imports.scala', code); + const imports = result.nodes.filter((n) => n.kind === 'import'); + expect(imports.length).toBeGreaterThanOrEqual(2); + }); + }); + + describe('Visibility modifiers', () => { + it('should extract private visibility', () => { + const code = ` +class Service { + private val secret: String = "abc" + private def helper(): Unit = {} +} +`; + const result = extractFromSource('Service.scala', code); + const secretField = result.nodes.find((n) => n.name === 'secret'); + expect(secretField?.visibility).toBe('private'); + const helperMethod = result.nodes.find((n) => n.name === 'helper'); + expect(helperMethod?.visibility).toBe('private'); + }); + + it('should extract protected visibility', () => { + const code = ` +class Base { + protected def helperMethod(): Unit = {} +} +`; + const result = extractFromSource('Base.scala', code); + const method = result.nodes.find((n) => n.name === 'helperMethod'); + expect(method?.visibility).toBe('protected'); + }); + + it('should default to public visibility', () => { + const code = ` +class Greeter { + def hello(): Unit = {} +} +`; + const result = extractFromSource('Greeter.scala', code); + const method = result.nodes.find((n) => n.name === 'hello'); + expect(method?.visibility).toBe('public'); + }); + }); + + describe('Inheritance', () => { + it('should extract extends relationships', () => { + const code = ` +class AdminUser extends User { + def adminAction(): Unit = {} +} +`; + const result = extractFromSource('AdminUser.scala', code); + const extendsRefs = result.unresolvedReferences.filter((r) => r.referenceKind === 'extends'); + expect(extendsRefs.find((r) => 
r.referenceName === 'User')).toBeDefined(); + }); + }); + + describe('Call extraction', () => { + it('should extract function call expressions', () => { + const code = ` +def processData(): Unit = { + val result = computeResult() + println(result) +} +`; + const result = extractFromSource('processor.scala', code); + const calls = result.unresolvedReferences.filter((r) => r.referenceKind === 'calls'); + expect(calls.length).toBeGreaterThan(0); + }); + }); +}); + +describe('Vue Extraction', () => { + it('should detect Vue files', () => { + expect(detectLanguage('App.vue')).toBe('vue'); + expect(detectLanguage('components/Button.vue')).toBe('vue'); + expect(isLanguageSupported('vue')).toBe(true); + }); + + it('should extract component node from a Vue SFC', () => { + const code = ` + + +`; + const result = extractFromSource('HelloWorld.vue', code); + + const componentNode = result.nodes.find((n) => n.kind === 'component'); + expect(componentNode).toBeDefined(); + expect(componentNode?.name).toBe('HelloWorld'); + expect(componentNode?.language).toBe('vue'); + expect(componentNode?.isExported).toBe(true); + }); + + it('should extract functions from +`; + const result = extractFromSource('Button.vue', code); + + const componentNode = result.nodes.find((n) => n.kind === 'component'); + expect(componentNode).toBeDefined(); + expect(componentNode?.name).toBe('Button'); + + const funcNode = result.nodes.find((n) => n.kind === 'function' && n.name === 'handleClick'); + expect(funcNode).toBeDefined(); + expect(funcNode?.language).toBe('vue'); + }); + + it('should extract from +`; + const result = extractFromSource('Counter.vue', code); + + const componentNode = result.nodes.find((n) => n.kind === 'component'); + expect(componentNode).toBeDefined(); + expect(componentNode?.name).toBe('Counter'); + + const funcNode = result.nodes.find((n) => n.kind === 'function' && n.name === 'increment'); + expect(funcNode).toBeDefined(); + expect(funcNode?.language).toBe('vue'); + + // All 
nodes should be marked as vue language + for (const node of result.nodes) { + expect(node.language).toBe('vue'); + } + }); + + it('should extract from both + + +`; + const result = extractFromSource('DualScript.vue', code); + + const componentNode = result.nodes.find((n) => n.kind === 'component'); + expect(componentNode).toBeDefined(); + + const greetFunc = result.nodes.find((n) => n.kind === 'function' && n.name === 'greet'); + expect(greetFunc).toBeDefined(); + }); + + it('should create component node for template-only Vue file', () => { + const code = ` +`; + const result = extractFromSource('Static.vue', code); + + const componentNode = result.nodes.find((n) => n.kind === 'component'); + expect(componentNode).toBeDefined(); + expect(componentNode?.name).toBe('Static'); + expect(componentNode?.language).toBe('vue'); + + // Only the component node should exist (no script nodes) + expect(result.nodes.length).toBe(1); + }); + + it('should create containment edges from component to script nodes', () => { + const code = ` + + +`; + const result = extractFromSource('Contained.vue', code); + + const componentNode = result.nodes.find((n) => n.kind === 'component'); + expect(componentNode).toBeDefined(); + + // Should have containment edges from component to child nodes + const containEdges = result.edges.filter( + (e) => e.source === componentNode!.id && e.kind === 'contains' + ); + expect(containEdges.length).toBeGreaterThan(0); + }); +}); + +describe('Instantiates + Decorates edge extraction', () => { + it('emits an instantiates ref for `new Foo()`', () => { + const code = ` +class Foo {} +function bootstrap() { return new Foo(); } +`; + const result = extractFromSource('app.ts', code); + const ref = result.unresolvedReferences.find( + (r) => r.referenceKind === 'instantiates' && r.referenceName === 'Foo' + ); + expect(ref).toBeDefined(); + }); + + it('strips type-argument suffix from generic constructors', () => { + const code = ` +class Container { constructor(_: 
T) {} } +function go() { return new Container('x'); } +`; + const result = extractFromSource('app.ts', code); + const ref = result.unresolvedReferences.find( + (r) => r.referenceKind === 'instantiates' + ); + expect(ref).toBeDefined(); + // Container must be normalised to "Container" — otherwise + // resolution can never match the class node. + expect(ref!.referenceName).toBe('Container'); + }); + + it('keeps trailing identifier from qualified `new ns.Foo()`', () => { + const code = ` +const ns = { Foo: class {} }; +function go() { return new ns.Foo(); } +`; + const result = extractFromSource('app.ts', code); + const ref = result.unresolvedReferences.find( + (r) => r.referenceKind === 'instantiates' + ); + // We can't always resolve which Foo, but the name should be the + // simple identifier so name-matching has a chance. + expect(ref?.referenceName).toBe('Foo'); + }); + + it('emits a decorates ref for `@Foo class X {}`', () => { + const code = ` +function Foo(_arg: string) { return (cls: any) => cls; } +@Foo('x') +class X {} +`; + const result = extractFromSource('app.ts', code); + const decorClass = result.unresolvedReferences.find( + (r) => r.referenceKind === 'decorates' && r.referenceName === 'Foo' + ); + expect(decorClass).toBeDefined(); + }); + + it('does NOT attribute a prior class\'s decorator to the next class', () => { + // Regression: the sibling-walk must stop at the first non- + // decorator separator. `@A class Foo {} @B class Bar {}` must + // produce `decorates(Foo, A)` and `decorates(Bar, B)` — never + // `decorates(Bar, A)`. + const code = ` +function A(cls: any) { return cls; } +function B(cls: any) { return cls; } +@A +class Foo {} +@B +class Bar {} +`; + const result = extractFromSource('app.ts', code); + const decoratesEdges = result.unresolvedReferences.filter( + (r) => r.referenceKind === 'decorates' + ); + // Exactly one decorates ref per decorated class, no cross-attribution. 
+ const fromBar = decoratesEdges.filter((r) => + result.nodes.find((n) => n.id === r.fromNodeId && n.name === 'Bar') + ); + expect(fromBar.length).toBe(1); + expect(fromBar[0]!.referenceName).toBe('B'); + }); + + it('emits a decorates ref for `@Foo method() {}`', () => { + const code = ` +function Get(p: string) { return (t: any, k: string) => t; } +class Svc { + @Get('/x') method() { return 1; } +} +`; + const result = extractFromSource('app.ts', code); + const decorMethod = result.unresolvedReferences.find( + (r) => r.referenceKind === 'decorates' && r.referenceName === 'Get' + ); + expect(decorMethod).toBeDefined(); + // The decorated symbol must be `method`, not the constructor or class. + const decoratedNode = result.nodes.find((n) => n.id === decorMethod!.fromNodeId); + expect(decoratedNode?.name).toBe('method'); + }); +}); + +// ============================================================================= +// Lua +// ============================================================================= + +describe('Lua Extraction', () => { + describe('Language detection', () => { + it('should detect Lua files', () => { + expect(detectLanguage('init.lua')).toBe('lua'); + expect(detectLanguage('src/util.lua')).toBe('lua'); + }); + + it('should report Lua as supported', () => { + expect(isLanguageSupported('lua')).toBe(true); + expect(getSupportedLanguages()).toContain('lua'); + }); + }); + + describe('Function extraction', () => { + it('should extract global and local functions', () => { + const code = ` +function configure(opts) return opts end +local function helper(x) return x * 2 end +`; + const result = extractFromSource('init.lua', code); + const funcs = result.nodes.filter((n) => n.kind === 'function').map((n) => n.name); + expect(funcs).toContain('configure'); + expect(funcs).toContain('helper'); + const configure = result.nodes.find((n) => n.name === 'configure'); + expect(configure?.language).toBe('lua'); + expect(configure?.signature).toBe('(opts)'); + }); + 
+ it('should split table/method functions into a receiver and method name', () => { + const code = ` +function M.connect(host, port) return host end +function M:send(data) return self end +`; + const result = extractFromSource('init.lua', code); + const methods = result.nodes.filter((n) => n.kind === 'method'); + const connect = methods.find((m) => m.name === 'connect'); + expect(connect?.qualifiedName).toBe('M::connect'); + const send = methods.find((m) => m.name === 'send'); + expect(send?.qualifiedName).toBe('M::send'); + }); + }); + + describe('Variable extraction', () => { + it('should extract local variable declarations', () => { + const code = ` +local M = {} +local count = 0 +`; + const result = extractFromSource('mod.lua', code); + const vars = result.nodes.filter((n) => n.kind === 'variable').map((n) => n.name); + expect(vars).toContain('M'); + expect(vars).toContain('count'); + }); + }); + + describe('Import extraction (require)', () => { + it('should extract require() in local declarations and bare calls', () => { + const code = ` +local socket = require("socket") +local http = require "resty.http" +require("side.effect") +`; + const result = extractFromSource('net.lua', code); + const imports = result.nodes.filter((n) => n.kind === 'import').map((n) => n.name); + expect(imports).toContain('socket'); + expect(imports).toContain('resty.http'); + expect(imports).toContain('side.effect'); + + const ref = result.unresolvedReferences.find( + (r) => r.referenceKind === 'imports' && r.referenceName === 'socket' + ); + expect(ref).toBeDefined(); + }); + + // Regression: the tree-sitter-wasms Lua grammar (ABI 13) corrupts the shared + // WASM heap under web-tree-sitter 0.25, dropping nested calls/imports on every + // parse after the first. We vendor the ABI-15 grammar instead — this guards it + // by extracting several sources in sequence and asserting the LAST still works. 
+ it('should keep extracting require across many sequential parses', () => { + let last; + for (let i = 0; i < 8; i++) { + last = extractFromSource(`f${i}.lua`, `local m = require("module.${i}")\nreturn m\n`); + } + const imports = last!.nodes.filter((n) => n.kind === 'import').map((n) => n.name); + expect(imports).toContain('module.7'); + }); + }); + + describe('Call extraction', () => { + it('should record intra-file calls as resolvable references', () => { + const code = ` +local function helper(x) return x end +local function run(y) return helper(y) end +`; + const result = extractFromSource('calls.lua', code); + const call = result.unresolvedReferences.find( + (r) => r.referenceKind === 'calls' && r.referenceName === 'helper' + ); + expect(call).toBeDefined(); + }); + }); +}); + +// ============================================================================= +// Luau (typed superset of Lua — https://luau.org) +// ============================================================================= + +describe('Luau Extraction', () => { + describe('Language detection', () => { + it('should detect Luau files', () => { + expect(detectLanguage('init.luau')).toBe('luau'); + expect(detectLanguage('src/Client.luau')).toBe('luau'); + }); + + it('should report Luau as supported', () => { + expect(isLanguageSupported('luau')).toBe(true); + expect(getSupportedLanguages()).toContain('luau'); + }); + }); + + describe('Type aliases', () => { + it('should extract `type` and `export type` definitions', () => { + const code = ` +export type Vector = { x: number, y: number } +type Handler = (msg: string) -> boolean +`; + const result = extractFromSource('types.luau', code); + const aliases = result.nodes.filter((n) => n.kind === 'type_alias'); + const vector = aliases.find((a) => a.name === 'Vector'); + expect(vector).toBeDefined(); + expect(vector?.isExported).toBe(true); + const handler = aliases.find((a) => a.name === 'Handler'); + expect(handler).toBeDefined(); + 
expect(handler?.isExported).toBe(false); + }); + }); + + describe('Typed functions and methods', () => { + it('should capture typed signatures and split methods by receiver', () => { + const code = ` +function configure(opts: { debug: boolean }): boolean + return opts.debug +end +function Client:fetch(path: string): Response + return path +end +`; + const result = extractFromSource('client.luau', code); + const configure = result.nodes.find((n) => n.kind === 'function' && n.name === 'configure'); + expect(configure?.language).toBe('luau'); + expect(configure?.signature).toBe('(opts: { debug: boolean }): boolean'); + const fetch = result.nodes.find((n) => n.kind === 'method' && n.name === 'fetch'); + expect(fetch?.qualifiedName).toBe('Client::fetch'); + }); + }); + + describe('Imports and variables', () => { + it('should extract string and Roblox instance-path require imports', () => { + const code = ` +local http = require("http") +local Signal = require(script.Parent.Signal) +local count = 0 +`; + const result = extractFromSource('mod.luau', code); + const imports = result.nodes.filter((n) => n.kind === 'import').map((n) => n.name); + expect(imports).toContain('http'); // string require + expect(imports).toContain('Signal'); // Roblox instance-path require + const vars = result.nodes.filter((n) => n.kind === 'variable').map((n) => n.name); + expect(vars).toContain('count'); + }); }); }); diff --git a/__tests__/foundation.test.ts b/__tests__/foundation.test.ts index 9ee437da..78ebfce4 100644 --- a/__tests__/foundation.test.ts +++ b/__tests__/foundation.test.ts @@ -9,8 +9,7 @@ import * as fs from 'fs'; import * as path from 'path'; import * as os from 'os'; import { CodeGraph } from '../src'; -import { DEFAULT_CONFIG, Node, Edge } from '../src/types'; -import { loadConfig, saveConfig } from '../src/config'; +import { Node, Edge } from '../src/types'; import { isInitialized, getCodeGraphDir, validateDirectory } from '../src/directory'; import { DatabaseConnection, 
getDatabasePath } from '../src/db'; @@ -60,41 +59,12 @@ describe('CodeGraph Foundation', () => { cg.close(); }); - it('should create config.json with defaults', () => { - const cg = CodeGraph.initSync(tempDir); - - const configPath = path.join(getCodeGraphDir(tempDir), 'config.json'); - expect(fs.existsSync(configPath)).toBe(true); - - const config = cg.getConfig(); - expect(config.version).toBe(DEFAULT_CONFIG.version); - expect(config.include).toEqual(DEFAULT_CONFIG.include); - expect(config.exclude).toEqual(DEFAULT_CONFIG.exclude); - - cg.close(); - }); - it('should throw if already initialized', () => { const cg = CodeGraph.initSync(tempDir); cg.close(); expect(() => CodeGraph.initSync(tempDir)).toThrow(/already initialized/i); }); - - it('should accept custom config options', () => { - const cg = CodeGraph.initSync(tempDir, { - config: { - maxFileSize: 500000, - extractDocstrings: false, - }, - }); - - const config = cg.getConfig(); - expect(config.maxFileSize).toBe(500000); - expect(config.extractDocstrings).toBe(false); - - cg.close(); - }); }); describe('Opening Projects', () => { @@ -112,17 +82,6 @@ describe('CodeGraph Foundation', () => { it('should throw if not initialized', () => { expect(() => CodeGraph.openSync(tempDir)).toThrow(/not initialized/i); }); - - it('should preserve configuration across open/close', () => { - const cg1 = CodeGraph.initSync(tempDir, { - config: { maxFileSize: 123456 }, - }); - cg1.close(); - - const cg2 = CodeGraph.openSync(tempDir); - expect(cg2.getConfig().maxFileSize).toBe(123456); - cg2.close(); - }); }); describe('Static Methods', () => { @@ -182,31 +141,6 @@ describe('CodeGraph Foundation', () => { }); }); - describe('Configuration', () => { - it('should load and merge config with defaults', () => { - const cg = CodeGraph.initSync(tempDir); - cg.close(); - - const config = loadConfig(tempDir); - expect(config.version).toBe(DEFAULT_CONFIG.version); - expect(config.rootDir).toBe(path.resolve(tempDir)); - }); - - 
it('should update configuration', () => { - const cg = CodeGraph.initSync(tempDir); - - cg.updateConfig({ maxFileSize: 999999 }); - - expect(cg.getConfig().maxFileSize).toBe(999999); - - cg.close(); - - // Verify persistence - const config = loadConfig(tempDir); - expect(config.maxFileSize).toBe(999999); - }); - }); - describe('Directory Management', () => { it('should validate directory structure', () => { const cg = CodeGraph.initSync(tempDir); @@ -305,7 +239,7 @@ describe('Database Connection', () => { const version = db.getSchemaVersion(); expect(version).not.toBeNull(); - expect(version?.version).toBe(3); + expect(version?.version).toBe(4); db.close(); }); diff --git a/__tests__/frameworks-integration.test.ts b/__tests__/frameworks-integration.test.ts new file mode 100644 index 00000000..2eb99447 --- /dev/null +++ b/__tests__/frameworks-integration.test.ts @@ -0,0 +1,199 @@ +import { describe, it, expect, beforeAll, afterEach } from 'vitest'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; +import { CodeGraph } from '../src'; +import { initGrammars, loadAllGrammars } from '../src/extraction/grammars'; + +beforeAll(async () => { + await initGrammars(); + await loadAllGrammars(); +}); + +describe('Django end-to-end framework extraction', () => { + let tmpDir: string | undefined; + afterEach(() => { + if (tmpDir) fs.rmSync(tmpDir, { recursive: true, force: true }); + tmpDir = undefined; + }); + + it('creates a route->view edge from urls.py to view class', async () => { + tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-django-')); + fs.writeFileSync(path.join(tmpDir, 'manage.py'), '# marker\n'); + fs.writeFileSync(path.join(tmpDir, 'requirements.txt'), 'django==4.2\n'); + fs.mkdirSync(path.join(tmpDir, 'users')); + fs.writeFileSync(path.join(tmpDir, 'users/__init__.py'), ''); + fs.writeFileSync( + path.join(tmpDir, 'users/views.py'), + 'class UserListView:\n def get(self, request): pass\n' + ); + fs.writeFileSync( + 
path.join(tmpDir, 'users/urls.py'), + 'from django.urls import path\n' + + 'from users.views import UserListView\n' + + 'urlpatterns = [path("users/", UserListView.as_view(), name="user-list")]\n' + ); + + const cg = CodeGraph.initSync(tmpDir); + await cg.indexAll(); + + // Route node exists + const routes = cg.getNodesByKind('route'); + expect(routes.length).toBeGreaterThan(0); + const route = routes.find((n) => n.name === 'users/'); + expect(route).toBeDefined(); + + // View class exists + const classNodes = cg.getNodesByKind('class'); + const view = classNodes.find((n) => n.name === 'UserListView'); + expect(view).toBeDefined(); + + // Edge route -> view exists + const edges = cg.getOutgoingEdges(route!.id); + const toView = edges.find((e) => e.target === view!.id); + expect(toView).toBeDefined(); + expect(toView!.kind).toBe('references'); + + cg.close(); + }); +}); + +describe('Flask end-to-end framework extraction', () => { + let tmpDir: string | undefined; + afterEach(() => { + if (tmpDir) fs.rmSync(tmpDir, { recursive: true, force: true }); + tmpDir = undefined; + }); + + it('resolves stacked routes across @login_required to a view named after a builtin (index)', async () => { + tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-flask-')); + fs.writeFileSync(path.join(tmpDir, 'requirements.txt'), 'flask==3.0\n'); + fs.writeFileSync( + path.join(tmpDir, 'app.py'), + 'from flask import Blueprint, render_template\n' + + 'from flask_login import login_required\n' + + 'bp = Blueprint("main", __name__)\n' + + '\n' + + '@bp.route("/", methods=["GET", "POST"])\n' + + '@bp.route("/index", methods=["GET", "POST"])\n' + + '@login_required\n' + + 'def index():\n' + + ' return render_template("index.html")\n' + ); + + const cg = CodeGraph.initSync(tmpDir); + await cg.indexAll(); + + // Both stacked @bp.route decorators are extracted (the second was previously + // dropped because @login_required broke the "def must follow" assumption). 
+ const routes = cg.getNodesByKind('route'); + expect(routes.map((r) => r.name).sort()).toEqual(['GET /', 'GET /index']); + + // The view function exists even though its name is a Python builtin method. + const fn = cg.getNodesByKind('function').find((n) => n.name === 'index'); + expect(fn).toBeDefined(); + + // Both routes resolve to it — exercises the bare-name builtin guard, which + // previously filtered the `index` reference as a builtin method. + for (const route of routes) { + const edges = cg.getOutgoingEdges(route.id); + const toView = edges.find((e) => e.target === fn!.id && e.kind === 'references'); + expect(toView, `route ${route.name} should resolve to index()`).toBeDefined(); + } + + cg.close(); + }); +}); + +describe('Flutter end-to-end — setState→build synthesis', () => { + let tmpDir: string | undefined; + afterEach(() => { + if (tmpDir) fs.rmSync(tmpDir, { recursive: true, force: true }); + tmpDir = undefined; + }); + + it('synthesizes a handler→build edge when a State method calls setState', async () => { + tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-flutter-')); + fs.writeFileSync( + path.join(tmpDir, 'main.dart'), + 'import "package:flutter/material.dart";\n' + + 'class CounterPage extends StatefulWidget {\n' + + ' @override\n' + + ' State createState() => _CounterPageState();\n' + + '}\n' + + 'class _CounterPageState extends State {\n' + + ' int _count = 0;\n' + + ' void _increment() {\n' + + ' setState(() {\n' + + ' _count++;\n' + + ' });\n' + + ' }\n' + + ' @override\n' + + ' Widget build(BuildContext context) {\n' + + ' return Text("$_count");\n' + + ' }\n' + + '}\n' + ); + + const cg = CodeGraph.initSync(tmpDir); + await cg.indexAll(); + + const methods = cg.getNodesByKind('method'); + const increment = methods.find((n) => n.name === '_increment'); + const build = methods.find((n) => n.name === 'build'); + expect(increment).toBeDefined(); + expect(build).toBeDefined(); + + // setState re-runs build (Flutter-internal, no static 
edge). The synthesizer + // bridges the handler → build so the "tap → setState → rebuilt UI" flow connects. + const edges = cg.getOutgoingEdges(increment!.id); + const toBuild = edges.find((e) => e.target === build!.id && e.kind === 'calls'); + expect(toBuild, '_increment should reach build via setState synthesis').toBeDefined(); + + cg.close(); + }); +}); + +describe('C++ end-to-end — virtual override synthesis', () => { + let tmpDir: string | undefined; + afterEach(() => { + if (tmpDir) fs.rmSync(tmpDir, { recursive: true, force: true }); + tmpDir = undefined; + }); + + it('bridges a base virtual method to the subclass override', async () => { + tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-cpp-')); + fs.writeFileSync( + path.join(tmpDir, 'iter.cpp'), + 'class Iterator {\n' + + ' public:\n' + + ' virtual void Next() { }\n' + + '};\n' + + 'class DBIter : public Iterator {\n' + + ' public:\n' + + ' void Next() override { advance(); }\n' + + ' void advance() { }\n' + + '};\n' + ); + + const cg = CodeGraph.initSync(tmpDir); + await cg.indexAll(); + + // Two methods named Next: the base virtual (lower line) and the override. + const nexts = cg + .getNodesByKind('method') + .filter((n) => n.name === 'Next') + .sort((a, b) => a.startLine - b.startLine); + expect(nexts.length).toBe(2); + const [baseNext, overrideNext] = nexts; + + // A vtable call to Iterator::Next dispatches to DBIter::Next — bridge it so + // trace/callees from the interface method reaches the implementation. 
+ const edge = cg + .getOutgoingEdges(baseNext!.id) + .find((e) => e.target === overrideNext!.id && e.kind === 'calls'); + expect(edge, 'Iterator::Next should reach DBIter::Next via override synthesis').toBeDefined(); + + cg.close(); + }); +}); diff --git a/__tests__/frameworks.test.ts b/__tests__/frameworks.test.ts new file mode 100644 index 00000000..1c2c643f --- /dev/null +++ b/__tests__/frameworks.test.ts @@ -0,0 +1,1334 @@ +import { describe, it, expect } from 'vitest'; +import type { FrameworkResolver, UnresolvedRef } from '../src/resolution/types'; +import type { Node } from '../src/types'; + +describe('FrameworkResolver.extract interface', () => { + it('extract() returns { nodes, references }', () => { + const resolver: FrameworkResolver = { + name: 'fake', + detect: () => true, + resolve: () => null, + languages: ['python'], + extract: (_filePath: string, _content: string) => ({ + nodes: [] as Node[], + references: [] as UnresolvedRef[], + }), + }; + const result = resolver.extract!('foo.py', ''); + expect(result).toEqual({ nodes: [], references: [] }); + }); +}); + +import { getApplicableFrameworks } from '../src/resolution/frameworks'; +import type { FrameworkResolver } from '../src/resolution/types'; + +describe('getApplicableFrameworks', () => { + const pyFw: FrameworkResolver = { name: 'py', languages: ['python'], detect: () => true, resolve: () => null }; + const jsFw: FrameworkResolver = { name: 'js', languages: ['javascript', 'typescript'], detect: () => true, resolve: () => null }; + const anyFw: FrameworkResolver = { name: 'any', detect: () => true, resolve: () => null }; + + it('filters by language', () => { + const result = getApplicableFrameworks([pyFw, jsFw, anyFw], 'python'); + expect(result.map(r => r.name)).toEqual(['py', 'any']); + }); + + it('returns anyFw-only when language has no matches', () => { + const result = getApplicableFrameworks([pyFw, jsFw, anyFw], 'rust'); + expect(result.map(r => r.name)).toEqual(['any']); + }); +}); + 
+import { djangoResolver } from '../src/resolution/frameworks/python'; + +describe('djangoResolver.extract', () => { + it('extracts route node and reference for path() with CBV.as_view()', () => { + const src = ` +from django.urls import path +from users.views import UserListView + +urlpatterns = [ + path('users/', UserListView.as_view(), name='user-list'), +] +`; + const { nodes, references } = djangoResolver.extract!('users/urls.py', src); + expect(nodes).toHaveLength(1); + expect(nodes[0].kind).toBe('route'); + expect(nodes[0].name).toBe('users/'); + expect(references).toHaveLength(1); + expect(references[0].referenceName).toBe('UserListView'); + expect(references[0].referenceKind).toBe('references'); + expect(references[0].fromNodeId).toBe(nodes[0].id); + }); + + it('extracts route for path() with dotted module.Class.as_view()', () => { + const src = `from django.urls import path\nfrom api.v1 import views as api_v1_views\nurlpatterns = [path('api/', api_v1_views.UserListView.as_view())]\n`; + const { nodes, references } = djangoResolver.extract!('api/urls.py', src); + expect(nodes).toHaveLength(1); + expect(references[0].referenceName).toBe('UserListView'); + }); + + it('extracts route for path() with bare function view', () => { + const src = `from django.urls import path\nurlpatterns = [path('home/', home_view, name='home')]\n`; + const { nodes, references } = djangoResolver.extract!('home/urls.py', src); + expect(references[0].referenceName).toBe('home_view'); + }); + + it('extracts route for path() with include()', () => { + const src = `from django.urls import path, include\nurlpatterns = [path('api/', include('api.urls'))]\n`; + const { nodes, references } = djangoResolver.extract!('root/urls.py', src); + expect(nodes).toHaveLength(1); + expect(nodes[0].kind).toBe('route'); + expect(references[0].referenceName).toBe('api.urls'); + expect(references[0].referenceKind).toBe('imports'); + }); + + it('extracts routes for re_path and url', () => { + const src = 
`from django.urls import re_path, url\nurlpatterns = [re_path(r'^users/$', UserView), url(r'^old/$', OldView)]\n`; + const { nodes } = djangoResolver.extract!('legacy/urls.py', src); + expect(nodes).toHaveLength(2); + expect(nodes.map(n => n.name)).toEqual(['^users/$', '^old/$']); + }); + + it('returns empty result for a non-urls.py python file', () => { + const src = `def foo(): return 1\n`; + const { nodes, references } = djangoResolver.extract!('views.py', src); + expect(nodes).toEqual([]); + expect(references).toEqual([]); + }); +}); + +import { flaskResolver, fastapiResolver } from '../src/resolution/frameworks/python'; + +describe('flaskResolver.extract', () => { + it('extracts route and reference from @app.route', () => { + const src = ` +@app.route('/users') +def list_users(): + return [] +`; + const { nodes, references } = flaskResolver.extract!('app.py', src); + expect(nodes).toHaveLength(1); + expect(nodes[0].kind).toBe('route'); + expect(nodes[0].name).toBe('GET /users'); + expect(references[0].referenceName).toBe('list_users'); + }); + + it('extracts blueprint routes', () => { + const src = ` +@users_bp.route('/', methods=['POST']) +def create_user(id): + pass +`; + const { nodes, references } = flaskResolver.extract!('routes.py', src); + expect(nodes[0].name).toBe('POST /'); + expect(references[0].referenceName).toBe('create_user'); + }); + + it('resolves the handler across an intervening decorator (@login_required)', () => { + const src = ` +@bp.route('/profile') +@login_required +def profile(): + return render_template('profile.html') +`; + const { nodes, references } = flaskResolver.extract!('routes.py', src); + expect(nodes[0].name).toBe('GET /profile'); + expect(references[0].referenceName).toBe('profile'); + }); + + it('extracts stacked @x.route decorators bound to one view', () => { + const src = ` +@bp.route('/', methods=['GET', 'POST']) +@bp.route('/index', methods=['GET', 'POST']) +@login_required +def index(): + return 
render_template('index.html') +`; + const { nodes, references } = flaskResolver.extract!('routes.py', src); + expect(nodes.map((n) => n.name)).toEqual(['GET /', 'GET /index']); + expect(references.map((r) => r.referenceName)).toEqual(['index', 'index']); + }); + + it('extracts the method from a tuple methods=(...) (not just a list)', () => { + const src = ` +@blueprint.route('/api/articles', methods=('POST',)) +def make_article(): + pass +`; + const { nodes, references } = flaskResolver.extract!('views.py', src); + expect(nodes[0].name).toBe('POST /api/articles'); + expect(references[0].referenceName).toBe('make_article'); + }); + + it('extracts Flask-RESTful api.add_resource(Resource, paths) → the Resource class', () => { + const src = ` +api.add_resource(TodoResource, '/todos/') +api.add_org_resource(AlertResource, '/api/alerts/', endpoint='alert') +`; + const { nodes, references } = flaskResolver.extract!('api.py', src); + expect(nodes.map((n) => n.name)).toEqual(['ANY /todos/', 'ANY /api/alerts/']); + expect(references.map((r) => r.referenceName)).toEqual(['TodoResource', 'AlertResource']); + }); +}); + +describe('fastapiResolver.extract', () => { + it('extracts route and reference from @app.get', () => { + const src = ` +@app.get('/users') +async def list_users(): + return [] +`; + const { nodes, references } = fastapiResolver.extract!('main.py', src); + expect(nodes[0].name).toBe('GET /users'); + expect(references[0].referenceName).toBe('list_users'); + }); + + it('extracts route from router.post', () => { + const src = ` +@router.post('/items') +def create_item(item: Item): + pass +`; + const { nodes, references } = fastapiResolver.extract!('items.py', src); + expect(nodes[0].name).toBe('POST /items'); + expect(references[0].referenceName).toBe('create_item'); + }); + + it('extracts a route mounted at the router/prefix root (empty path)', () => { + const src = ` +@router.get("", response_model=ListOfArticles, name="articles:list") +async def list_articles(): 
+ return [] +`; + const { nodes, references } = fastapiResolver.extract!('articles.py', src); + expect(nodes[0].name).toBe('GET /'); + expect(references[0].referenceName).toBe('list_articles'); + }); + + it('extracts a multi-line decorator with an empty path', () => { + const src = ` +@router.post( + "", + status_code=201, + response_model=ArticleInResponse, +) +async def create_article(): + pass +`; + const { nodes, references } = fastapiResolver.extract!('articles.py', src); + expect(nodes[0].name).toBe('POST /'); + expect(references[0].referenceName).toBe('create_article'); + }); +}); + +import { expressResolver } from '../src/resolution/frameworks/express'; + +describe('expressResolver.extract', () => { + it('extracts route with inline handler reference', () => { + const src = `app.get('/users', listUsers);\n`; + const { nodes, references } = expressResolver.extract!('routes.ts', src); + expect(nodes).toHaveLength(1); + expect(nodes[0].name).toBe('GET /users'); + expect(references[0].referenceName).toBe('listUsers'); + }); + + it('extracts route with router.post and middleware chain', () => { + const src = `router.post('/items', auth, createItem);\n`; + const { nodes, references } = expressResolver.extract!('items.ts', src); + expect(nodes[0].name).toBe('POST /items'); + // Multiple handlers: prefer the LAST one (convention: middleware first, handler last) + expect(references[0].referenceName).toBe('createItem'); + }); + + it('extracts route with controller method reference', () => { + const src = `app.get('/x', userController.list);\n`; + const { nodes, references } = expressResolver.extract!('routes.ts', src); + expect(references[0].referenceName).toBe('list'); + }); +}); + +import { nestjsResolver } from '../src/resolution/frameworks/nestjs'; + +describe('nestjsResolver.extract — HTTP', () => { + it('joins @Controller prefix with @Get and links the handler', () => { + const src = ` +@Controller('users') +export class UsersController { + @Get() + findAll() { 
return []; } +} +`; + const { nodes, references } = nestjsResolver.extract!('users.controller.ts', src); + expect(nodes).toHaveLength(1); + expect(nodes[0].kind).toBe('route'); + expect(nodes[0].name).toBe('GET /users'); + expect(references[0].referenceName).toBe('findAll'); + expect(references[0].referenceKind).toBe('references'); + expect(references[0].fromNodeId).toBe(nodes[0].id); + }); + + it('joins controller prefix with a method-level path param', () => { + const src = ` +@Controller('cats') +export class CatsController { + @Get(':id') + findOne(@Param('id') id: string) { return id; } +} +`; + const { nodes, references } = nestjsResolver.extract!('cats.controller.ts', src); + expect(nodes[0].name).toBe('GET /cats/:id'); + expect(references[0].referenceName).toBe('findOne'); + }); + + it('handles an empty @Controller() and empty @Post()', () => { + const src = ` +@Controller() +export class AppController { + @Post() + create() {} +} +`; + const { nodes, references } = nestjsResolver.extract!('app.controller.ts', src); + expect(nodes[0].name).toBe('POST /'); + expect(references[0].referenceName).toBe('create'); + }); + + it('covers HTTP verbs and skips intervening method decorators', () => { + const src = ` +@Controller('todos') +export class TodosController { + @Put(':id') + @UseGuards(AuthGuard) + update(@Param('id') id: string) {} + + @Delete(':id') + async remove(@Param('id') id: string) {} +} +`; + const { nodes, references } = nestjsResolver.extract!('todos.controller.ts', src); + expect(nodes.map((n) => n.name)).toEqual(['PUT /todos/:id', 'DELETE /todos/:id']); + expect(references.map((r) => r.referenceName)).toEqual(['update', 'remove']); + }); + + it('attributes methods to the right controller when a file has two', () => { + const src = ` +@Controller('a') +export class AController { + @Get('x') + ax() {} +} + +@Controller('b') +export class BController { + @Get('y') + by() {} +} +`; + const { nodes } = nestjsResolver.extract!('multi.controller.ts', 
src); + expect(nodes.map((n) => n.name)).toEqual(['GET /a/x', 'GET /b/y']); + }); +}); + +describe('nestjsResolver.extract — GraphQL', () => { + it('emits QUERY/MUTATION nodes from a resolver, defaulting to the method name', () => { + const src = ` +@Resolver(() => User) +export class UsersResolver { + @Query(() => [User]) + users() { return []; } + + @Mutation(() => User) + createUser(@Args('input') input: CreateUserInput) {} +} +`; + const { nodes, references } = nestjsResolver.extract!('users.resolver.ts', src); + expect(nodes.map((n) => n.name)).toEqual(['QUERY users', 'MUTATION createUser']); + expect(references.map((r) => r.referenceName)).toEqual(['users', 'createUser']); + }); + + it('uses an explicit operation name when given', () => { + const src = ` +@Resolver() +export class CatsResolver { + @Query(() => Cat, { name: 'cat' }) + getCat() {} +} +`; + const { nodes } = nestjsResolver.extract!('cats.resolver.ts', src); + expect(nodes[0].name).toBe('QUERY cat'); + }); + + it('does NOT treat the REST @Query() parameter decorator as a GraphQL op', () => { + const src = ` +@Controller('search') +export class SearchController { + @Get() + search(@Query() query: SearchDto) { return query; } +} +`; + const { nodes } = nestjsResolver.extract!('search.controller.ts', src); + // Only the HTTP route — the @Query() param decorator must be ignored. 
+ expect(nodes.map((n) => n.name)).toEqual(['GET /search']); + }); +}); + +describe('nestjsResolver.extract — microservices & websockets', () => { + it('extracts @MessagePattern and @EventPattern handlers', () => { + const src = ` +@Controller() +export class MathController { + @MessagePattern({ cmd: 'sum' }) + accumulate(data: number[]) {} + + @EventPattern('user.created') + handleUserCreated(data: any) {} +} +`; + const { nodes, references } = nestjsResolver.extract!('math.controller.ts', src); + expect(nodes.map((n) => n.name)).toEqual(['MESSAGE sum', 'EVENT user.created']); + expect(references.map((r) => r.referenceName)).toEqual(['accumulate', 'handleUserCreated']); + }); + + it('extracts @SubscribeMessage handlers with the gateway namespace', () => { + const src = ` +@WebSocketGateway({ namespace: 'chat' }) +export class ChatGateway { + @SubscribeMessage('message') + handleMessage(@MessageBody() data: string) {} +} +`; + const { nodes, references } = nestjsResolver.extract!('chat.gateway.ts', src); + expect(nodes[0].name).toBe('WS chat:message'); + expect(references[0].referenceName).toBe('handleMessage'); + }); + + it('extracts @SubscribeMessage without a namespace', () => { + const src = ` +@WebSocketGateway() +export class EventsGateway { + @SubscribeMessage('events') + onEvent() {} +} +`; + const { nodes } = nestjsResolver.extract!('events.gateway.ts', src); + expect(nodes[0].name).toBe('WS events'); + }); + + it('returns empty for a non-JS/TS file', () => { + const { nodes, references } = nestjsResolver.extract!('thing.py', '@Controller("x")'); + expect(nodes).toEqual([]); + expect(references).toEqual([]); + }); +}); + +describe('nestjsResolver.detect', () => { + const baseContext = { + getNodesInFile: () => [], + getNodesByName: () => [], + getNodesByQualifiedName: () => [], + getNodesByKind: () => [], + fileExists: () => false, + getProjectRoot: () => '/test', + getAllFiles: () => [], + getNodesByLowerName: () => [], + getImportMappings: () => [], + }; 
+ + it('detects @nestjs/* in package.json', () => { + const context = { + ...baseContext, + readFile: (p: string) => + p === 'package.json' + ? JSON.stringify({ dependencies: { '@nestjs/common': '^10.0.0' } }) + : null, + }; + expect(nestjsResolver.detect(context as any)).toBe(true); + }); + + it('detects @Controller in a *.controller.ts file when package.json is absent', () => { + const context = { + ...baseContext, + getAllFiles: () => ['src/users.controller.ts'], + readFile: (p: string) => + p === 'src/users.controller.ts' + ? `@Controller('users')\nexport class UsersController {}` + : null, + }; + expect(nestjsResolver.detect(context as any)).toBe(true); + }); + + it('returns false for a non-Nest project', () => { + const context = { + ...baseContext, + readFile: (p: string) => + p === 'package.json' ? JSON.stringify({ dependencies: { express: '^4' } }) : null, + }; + expect(nestjsResolver.detect(context as any)).toBe(false); + }); +}); + +describe('nestjsResolver.resolve', () => { + const baseContext = { + getNodesInFile: () => [], + getNodesByName: () => [], + getNodesByQualifiedName: () => [], + getNodesByKind: () => [], + fileExists: () => false, + readFile: () => null, + getProjectRoot: () => '/test', + getAllFiles: () => [], + getNodesByLowerName: () => [], + getImportMappings: () => [], + }; + + it('resolves an injected *Service reference to the class in a *.service.ts file', () => { + const svcNode: Node = { + id: 'class:src/users/users.service.ts:UsersService:3', + kind: 'class', + name: 'UsersService', + qualifiedName: 'src/users/users.service.ts::UsersService', + filePath: 'src/users/users.service.ts', + language: 'typescript', + startLine: 3, + endLine: 3, + startColumn: 0, + endColumn: 0, + updatedAt: Date.now(), + }; + const context = { + ...baseContext, + getNodesByName: (n: string) => (n === 'UsersService' ? 
[svcNode] : []), + }; + const ref = { + fromNodeId: 'class:src/users/users.controller.ts:UsersController:5', + referenceName: 'UsersService', + referenceKind: 'references' as const, + line: 6, + column: 4, + filePath: 'src/users/users.controller.ts', + language: 'typescript' as const, + }; + const result = nestjsResolver.resolve(ref, context as any); + expect(result?.targetNodeId).toBe(svcNode.id); + expect(result?.resolvedBy).toBe('framework'); + expect(result?.confidence).toBeGreaterThanOrEqual(0.85); + }); + + it('returns null for a name without a provider suffix', () => { + const ref = { + fromNodeId: 'x', + referenceName: 'doThing', + referenceKind: 'references' as const, + line: 1, + column: 1, + filePath: 'a.ts', + language: 'typescript' as const, + }; + expect(nestjsResolver.resolve(ref, baseContext as any)).toBeNull(); + }); +}); + +import { laravelResolver } from '../src/resolution/frameworks/laravel'; + +describe('laravelResolver.extract', () => { + it('extracts route with controller tuple syntax', () => { + const src = `Route::get('/users', [UserController::class, 'index']);\n`; + const { nodes, references } = laravelResolver.extract!('routes/web.php', src); + expect(nodes[0].name).toBe('GET /users'); + expect(references[0].referenceName).toBe('UserController@index'); + }); + + it('extracts route with Controller@action syntax', () => { + const src = `Route::post('/users', 'UserController@store');\n`; + const { nodes, references } = laravelResolver.extract!('routes/web.php', src); + expect(references[0].referenceName).toBe('UserController@store'); + }); + + it('extracts resource route', () => { + const src = `Route::resource('users', UserController::class);\n`; + const { nodes, references } = laravelResolver.extract!('routes/web.php', src); + expect(nodes[0].kind).toBe('route'); + expect(references[0].referenceName).toBe('UserController'); + }); +}); + +import { railsResolver } from '../src/resolution/frameworks/ruby'; + 
+describe('railsResolver.extract', () => { + it('extracts route with controller#action syntax', () => { + const src = `get '/users', to: 'users#index'\n`; + const { nodes, references } = railsResolver.extract!('config/routes.rb', src); + expect(nodes[0].name).toBe('GET /users'); + expect(references[0].referenceName).toBe('users#index'); + }); + + it('extracts route without to: keyword', () => { + const src = `post '/items' => 'items#create'\n`; + const { nodes, references } = railsResolver.extract!('config/routes.rb', src); + expect(references[0].referenceName).toBe('items#create'); + }); +}); + +import { springResolver } from '../src/resolution/frameworks/java'; + +describe('springResolver.extract', () => { + it('extracts route with @GetMapping and next method', () => { + const src = ` +@GetMapping("/users") +public List listUsers() { + return users; +} +`; + const { nodes, references } = springResolver.extract!('UserController.java', src); + expect(nodes[0].name).toBe('GET /users'); + expect(references[0].referenceName).toBe('listUsers'); + }); + + it('extracts a Kotlin @GetMapping with a fun handler', () => { + const src = ` +@GetMapping("/vets") +fun showVetList(model: MutableMap): String { + return "vets" +} +`; + const { nodes, references } = springResolver.extract!('VetController.kt', src); + expect(nodes[0].name).toBe('GET /vets'); + expect(references[0].referenceName).toBe('showVetList'); + expect(nodes[0].language).toBe('kotlin'); + }); + + it('joins a Kotlin class @RequestMapping prefix and skips a stacked annotation', () => { + const src = ` +@RestController +@RequestMapping("/owners") +class OwnerController { + @GetMapping("/{ownerId}") + @ResponseBody + fun showOwner(@PathVariable ownerId: Int): String { + return "owner" + } +} +`; + const { nodes, references } = springResolver.extract!('OwnerController.kt', src); + expect(nodes[0].name).toBe('GET /owners/{ownerId}'); + expect(references[0].referenceName).toBe('showOwner'); + }); +}); + +import { 
playResolver } from '../src/resolution/frameworks/play'; +import { isSourceFile, isPlayRoutesFile } from '../src/extraction/grammars'; + +describe('playResolver.extract (conf/routes)', () => { + it('extracts METHOD /path Controller.action routes, dropping the package + args', () => { + const src = `# Routes +GET / controllers.Application.index +GET /computers controllers.Application.list(p: Int ?= 0, s: Int ?= 2) +POST /computers controllers.Application.save +-> /v1/posts v1.post.PostRouter +`; + const { nodes, references } = playResolver.extract!('conf/routes', src); + expect(nodes.map((n) => n.name)).toEqual([ + 'GET /', + 'GET /computers', + 'POST /computers', + ]); // the `->` include is skipped + expect(references.map((r) => r.referenceName)).toEqual([ + 'Application.index', + 'Application.list', + 'Application.save', + ]); + }); + + it('only runs on Play routes files', () => { + expect(playResolver.extract!('app/Foo.scala', 'GET / controllers.X.y').nodes).toHaveLength(0); + }); +}); + +describe('Play routes file detection', () => { + it('recognizes conf/routes (extensionless) and *.routes as source files', () => { + expect(isPlayRoutesFile('conf/routes')).toBe(true); + expect(isPlayRoutesFile('myapp/conf/routes')).toBe(true); + expect(isPlayRoutesFile('conf/admin.routes')).toBe(true); + expect(isSourceFile('conf/routes')).toBe(true); + expect(isPlayRoutesFile('src/routes.ts')).toBe(false); + }); +}); + +import { goResolver } from '../src/resolution/frameworks/go'; + +describe('goResolver.extract', () => { + it('extracts route from r.GET', () => { + const src = `r.GET("/users", listUsers)\n`; + const { nodes, references } = goResolver.extract!('main.go', src); + expect(nodes[0].name).toBe('GET /users'); + expect(references[0].referenceName).toBe('listUsers'); + }); + + it('extracts route from router.HandleFunc', () => { + const src = `router.HandleFunc("/items", createItem)\n`; + const { nodes, references } = goResolver.extract!('main.go', src); + 
expect(references[0].referenceName).toBe('createItem'); + }); + + it('extracts gorilla/mux HandleFunc on a subrouter var, ignoring chained .Methods()', () => { + // `s` is a PathPrefix().Subrouter() var — any receiver is matched; the + // trailing .Methods("GET") doesn't break the handler capture. + const src = `s.HandleFunc("/users/{id}", listUsers).Methods("GET")\n`; + const { references } = goResolver.extract!('routes.go', src); + expect(references[0].referenceName).toBe('listUsers'); + }); +}); + +import { rustResolver } from '../src/resolution/frameworks/rust'; + +describe('rustResolver.extract', () => { + it('extracts route from axum .route with get()', () => { + const src = `let app = Router::new().route("/users", get(list_users));\n`; + const { nodes, references } = rustResolver.extract!('main.rs', src); + expect(nodes[0].name).toBe('GET /users'); + expect(references[0].referenceName).toBe('list_users'); + }); + + it('extracts every method from a chained axum .route (get().put())', () => { + const src = `let app = Router::new().route("/user", get(get_current_user).put(update_user));\n`; + const { nodes, references } = rustResolver.extract!('main.rs', src); + expect(nodes.map((n) => n.name)).toEqual(['GET /user', 'PUT /user']); + expect(references.map((r) => r.referenceName)).toEqual([ + 'get_current_user', + 'update_user', + ]); + }); + + it('extracts a multi-line axum .route with a namespaced handler', () => { + const src = ` +let app = Router::new() + .route( + "/articles/feed", + get(listing::feed_articles), + ); +`; + const { nodes, references } = rustResolver.extract!('main.rs', src); + expect(nodes[0].name).toBe('GET /articles/feed'); + expect(references[0].referenceName).toBe('feed_articles'); + }); + + it('extracts actix web::resource().route(web::METHOD().to(handler))', () => { + const src = `App::new().service(web::resource("/user/{id}").route(web::get().to(get_user)))\n`; + const { nodes, references } = rustResolver.extract!('main.rs', src); + 
expect(nodes[0].name).toBe('GET /user/{id}'); + expect(references[0].referenceName).toBe('get_user'); + }); + + it('extracts actix web::resource("/").to(handler) (all methods)', () => { + const src = `App::new().service(web::resource("/").to(index))\n`; + const { nodes, references } = rustResolver.extract!('main.rs', src); + expect(nodes[0].name).toBe('ANY /'); + expect(references[0].referenceName).toBe('index'); + }); + + it('extracts actix App-level .route("/path", web::METHOD().to(handler))', () => { + const src = `App::new().route("/health", web::get().to(health_check))\n`; + const { nodes, references } = rustResolver.extract!('main.rs', src); + expect(nodes[0].name).toBe('GET /health'); + expect(references[0].referenceName).toBe('health_check'); + }); +}); + +describe('rustResolver.resolve cargo workspace crates', () => { + it('resolves crate name from workspace member lib.rs', () => { + const workspaceCargo = ` +[workspace] +members = ["crates/mytool-core", "crates/mytool-fetcher"] +`; + const coreCargo = ` +[package] +name = "mytool-core" +version = "0.1.0" +`; + const libNode: Node = { + id: 'module:crates/mytool-core/src/lib.rs:mytool_core:1', + kind: 'module', + name: 'mytool_core', + qualifiedName: 'crates/mytool-core/src/lib.rs::mytool_core', + filePath: 'crates/mytool-core/src/lib.rs', + language: 'rust', + startLine: 1, + endLine: 1, + startColumn: 0, + endColumn: 0, + updatedAt: Date.now(), + }; + + const context = { + getNodesInFile: (fp: string) => (fp === 'crates/mytool-core/src/lib.rs' ? 
[libNode] : []), + getNodesByName: () => [], + getNodesByQualifiedName: () => [], + getNodesByKind: () => [], + fileExists: (p: string) => ( + p === 'Cargo.toml' || + p === 'crates/mytool-core/Cargo.toml' || + p === 'crates/mytool-core/src/lib.rs' + ), + readFile: (p: string) => { + if (p === 'Cargo.toml') return workspaceCargo; + if (p === 'crates/mytool-core/Cargo.toml') return coreCargo; + return null; + }, + getProjectRoot: () => '/test', + getAllFiles: () => [ + 'Cargo.toml', + 'crates/mytool-core/Cargo.toml', + 'crates/mytool-core/src/lib.rs', + ], + getNodesByLowerName: () => [], + getImportMappings: () => [], + }; + + const ref = { + fromNodeId: 'fn:crates/mytool-fetcher/src/main.rs:main:1', + referenceName: 'mytool_core', + referenceKind: 'references' as const, + line: 1, + column: 1, + filePath: 'crates/mytool-fetcher/src/main.rs', + language: 'rust' as const, + }; + + const result = rustResolver.resolve(ref, context); + expect(result?.targetNodeId).toBe(libNode.id); + expect(result?.resolvedBy).toBe('framework'); + // Workspace-manifest hits are unambiguous and must beat name-matcher's + // self-file matches (0.7) so cross-crate `imports` edges materialize. + expect(result?.confidence).toBeGreaterThanOrEqual(0.9); + }); + + it('resolves crate name from workspace member main.rs when lib.rs is absent', () => { + const workspaceCargo = ` +[workspace] +members = [ + "crates/mytool-runner", +] +`; + const runnerCargo = ` +[package] +name = "mytool-runner" +version = "0.1.0" +`; + const mainNode: Node = { + id: 'module:crates/mytool-runner/src/main.rs:mytool_runner:1', + kind: 'module', + name: 'mytool_runner', + qualifiedName: 'crates/mytool-runner/src/main.rs::mytool_runner', + filePath: 'crates/mytool-runner/src/main.rs', + language: 'rust', + startLine: 1, + endLine: 1, + startColumn: 0, + endColumn: 0, + updatedAt: Date.now(), + }; + + const context = { + getNodesInFile: (fp: string) => (fp === 'crates/mytool-runner/src/main.rs' ? 
[mainNode] : []), + getNodesByName: () => [], + getNodesByQualifiedName: () => [], + getNodesByKind: () => [], + fileExists: (p: string) => ( + p === 'Cargo.toml' || + p === 'crates/mytool-runner/Cargo.toml' || + p === 'crates/mytool-runner/src/main.rs' + ), + readFile: (p: string) => { + if (p === 'Cargo.toml') return workspaceCargo; + if (p === 'crates/mytool-runner/Cargo.toml') return runnerCargo; + return null; + }, + getProjectRoot: () => '/test', + getAllFiles: () => [ + 'Cargo.toml', + 'crates/mytool-runner/Cargo.toml', + 'crates/mytool-runner/src/main.rs', + ], + getNodesByLowerName: () => [], + getImportMappings: () => [], + }; + + const ref = { + fromNodeId: 'fn:crates/mytool-runner/src/main.rs:main:1', + referenceName: 'mytool_runner', + referenceKind: 'references' as const, + line: 1, + column: 1, + filePath: 'crates/mytool-runner/src/main.rs', + language: 'rust' as const, + }; + + const result = rustResolver.resolve(ref, context); + expect(result?.targetNodeId).toBe(mainNode.id); + expect(result?.resolvedBy).toBe('framework'); + }); + + it('resolves crate name when members uses a glob (crates/*)', () => { + const workspaceCargo = ` +[workspace] +members = ["crates/*"] +`; + const fooCargo = ` +[package] +name = "mytool-foo" +version = "0.1.0" +`; + const barCargo = ` +[package] +name = "mytool-bar" +version = "0.1.0" +`; + const fooLib: Node = { + id: 'module:crates/mytool-foo/src/lib.rs:mytool_foo:1', + kind: 'module', + name: 'mytool_foo', + qualifiedName: 'crates/mytool-foo/src/lib.rs::mytool_foo', + filePath: 'crates/mytool-foo/src/lib.rs', + language: 'rust', + startLine: 1, + endLine: 1, + startColumn: 0, + endColumn: 0, + updatedAt: Date.now(), + }; + const barLib: Node = { + id: 'module:crates/mytool-bar/src/lib.rs:mytool_bar:1', + kind: 'module', + name: 'mytool_bar', + qualifiedName: 'crates/mytool-bar/src/lib.rs::mytool_bar', + filePath: 'crates/mytool-bar/src/lib.rs', + language: 'rust', + startLine: 1, + endLine: 1, + startColumn: 0, + 
endColumn: 0, + updatedAt: Date.now(), + }; + + const filesByPath: Record<string, string> = { + 'Cargo.toml': workspaceCargo, + 'crates/mytool-foo/Cargo.toml': fooCargo, + 'crates/mytool-bar/Cargo.toml': barCargo, + }; + const nodesByFile: Record<string, Node[]> = { + 'crates/mytool-foo/src/lib.rs': [fooLib], + 'crates/mytool-bar/src/lib.rs': [barLib], + }; + const dirsByPath: Record<string, string[]> = { + '.': ['crates'], + crates: ['mytool-foo', 'mytool-bar'], + 'crates/mytool-foo': ['src'], + 'crates/mytool-bar': ['src'], + }; + + const context = { + getNodesInFile: (fp: string) => nodesByFile[fp] ?? [], + getNodesByName: () => [], + getNodesByQualifiedName: () => [], + getNodesByKind: () => [], + fileExists: (p: string) => ( + Object.prototype.hasOwnProperty.call(filesByPath, p) || + Object.prototype.hasOwnProperty.call(nodesByFile, p) + ), + readFile: (p: string) => filesByPath[p] ?? null, + getProjectRoot: () => '/test', + getAllFiles: () => [ + 'Cargo.toml', + ...Object.keys(filesByPath).filter((p) => p !== 'Cargo.toml'), + ...Object.keys(nodesByFile), + ], + getNodesByLowerName: () => [], + getImportMappings: () => [], + listDirectories: (rel: string) => dirsByPath[rel] ?? 
[], + }; + + const fooRef = { + fromNodeId: 'fn:crates/mytool-bar/src/lib.rs:other:1', + referenceName: 'mytool_foo', + referenceKind: 'references' as const, + line: 1, + column: 1, + filePath: 'crates/mytool-bar/src/lib.rs', + language: 'rust' as const, + }; + const barRef = { + fromNodeId: 'fn:crates/mytool-foo/src/lib.rs:other:1', + referenceName: 'mytool_bar', + referenceKind: 'references' as const, + line: 1, + column: 1, + filePath: 'crates/mytool-foo/src/lib.rs', + language: 'rust' as const, + }; + + expect(rustResolver.resolve(fooRef, context)?.targetNodeId).toBe(fooLib.id); + expect(rustResolver.resolve(barRef, context)?.targetNodeId).toBe(barLib.id); + }); + + it('resolves crate name when members uses a name glob at root (helix-*)', () => { + const workspaceCargo = ` +[workspace] +members = ["helix-*"] +`; + const coreCargo = ` +[package] +name = "helix-core" +version = "0.1.0" +`; + const coreLib: Node = { + id: 'module:helix-core/src/lib.rs:helix_core:1', + kind: 'module', + name: 'helix_core', + qualifiedName: 'helix-core/src/lib.rs::helix_core', + filePath: 'helix-core/src/lib.rs', + language: 'rust', + startLine: 1, + endLine: 1, + startColumn: 0, + endColumn: 0, + updatedAt: Date.now(), + }; + + const filesByPath: Record<string, string> = { + 'Cargo.toml': workspaceCargo, + 'helix-core/Cargo.toml': coreCargo, + }; + const nodesByFile: Record<string, Node[]> = { + 'helix-core/src/lib.rs': [coreLib], + }; + const dirsByPath: Record<string, string[]> = { + '.': ['helix-core', 'docs', 'target'], + 'helix-core': ['src'], + }; + + const context = { + getNodesInFile: (fp: string) => nodesByFile[fp] ?? [], + getNodesByName: () => [], + getNodesByQualifiedName: () => [], + getNodesByKind: () => [], + fileExists: (p: string) => ( + Object.prototype.hasOwnProperty.call(filesByPath, p) || + Object.prototype.hasOwnProperty.call(nodesByFile, p) + ), + readFile: (p: string) => filesByPath[p] ?? 
null, + getProjectRoot: () => '/test', + getAllFiles: () => [ + 'Cargo.toml', + ...Object.keys(filesByPath).filter((p) => p !== 'Cargo.toml'), + ...Object.keys(nodesByFile), + ], + getNodesByLowerName: () => [], + getImportMappings: () => [], + listDirectories: (rel: string) => dirsByPath[rel] ?? [], + }; + + const ref = { + fromNodeId: 'fn:helix-core/src/lib.rs:other:1', + referenceName: 'helix_core', + referenceKind: 'references' as const, + line: 1, + column: 1, + filePath: 'helix-core/src/lib.rs', + language: 'rust' as const, + }; + + expect(rustResolver.resolve(ref, context)?.targetNodeId).toBe(coreLib.id); + }); +}); + +import { aspnetResolver } from '../src/resolution/frameworks/csharp'; + +describe('aspnetResolver.extract', () => { + it('extracts route from [HttpGet] attribute', () => { + const src = ` +[HttpGet("/users")] +public IActionResult ListUsers() +{ + return Ok(); +} +`; + const { nodes, references } = aspnetResolver.extract!('UserController.cs', src); + expect(nodes[0].name).toBe('GET /users'); + expect(references[0].referenceName).toBe('ListUsers'); + }); +}); + +import { vaporResolver } from '../src/resolution/frameworks/swift'; + +describe('vaporResolver.extract', () => { + it('extracts route from app.get with use:', () => { + const src = `app.get("users", use: listUsers)\n`; + const { nodes, references } = vaporResolver.extract!('routes.swift', src); + expect(nodes[0].name).toBe('GET /users'); + expect(references[0].referenceName).toBe('listUsers'); + }); + + it('extracts grouped RouteCollection routes with the group prefix and no path arg', () => { + const src = ` +func boot(routes: RoutesBuilder) throws { + let todos = routes.grouped("todos") + todos.get(use: index) + todos.post(use: create) + todos.group(":todoID") { todo in + todo.delete(use: delete) + } +} +`; + const { nodes, references } = vaporResolver.extract!('TodoController.swift', src); + expect(nodes.map((n) => n.name).sort()).toEqual([ + 'DELETE /todos/:todoID', + 'GET /todos', 
+ 'POST /todos', + ]); + expect(references.map((r) => r.referenceName).sort()).toEqual([ + 'create', + 'delete', + 'index', + ]); + }); + + it('handles use: self.handler and non-string path segments', () => { + const src = `router.get("users", User.parameter, "edit", use: self.editUserHandler)\n`; + const { nodes, references } = vaporResolver.extract!('UserController.swift', src); + expect(nodes[0].name).toBe('GET /users/edit'); + expect(references[0].referenceName).toBe('editUserHandler'); + }); + + it('ignores non-route .get calls that lack use: (e.g. Environment.get)', () => { + const src = `let host = Environment.get("DATABASE_HOST") ?? "localhost"\n`; + const { nodes } = vaporResolver.extract!('configure.swift', src); + expect(nodes).toHaveLength(0); + }); +}); + +import { reactResolver } from '../src/resolution/frameworks/react'; +import { svelteResolver } from '../src/resolution/frameworks/svelte'; + +describe('reactResolver.extract — React Router', () => { + it('extracts a v6 <Route path element={<X/>}>', () => { + const src = `<Route path="/users" element={<UsersPage/>}/>`; + const { nodes, references } = reactResolver.extract!('App.tsx', src); + const route = nodes.find((n) => n.kind === 'route'); + expect(route?.name).toBe('/users'); + expect(references[0]?.referenceName).toBe('UsersPage'); + }); + + it('extracts a v5 <Route component={X}> with attributes in any order', () => { + const src = `<Route component={Login} path="/login" />`; + const { nodes, references } = reactResolver.extract!('App.jsx', src); + const route = nodes.find((n) => n.kind === 'route'); + expect(route?.name).toBe('/login'); + expect(references[0]?.referenceName).toBe('Login'); + }); + + it('does not treat the <Routes> container as a route', () => { + const src = `<Routes><Route path="/x" element={<X/>}/></Routes>`; + const routes = reactResolver.extract!('App.tsx', src).nodes.filter((n) => n.kind === 'route'); + expect(routes).toHaveLength(1); + expect(routes[0]?.name).toBe('/x'); + }); + + it('extracts createBrowserRouter object routes ({ path, element/Component })', () => { + const src = `const router = createBrowserRouter([ + { path: "/dashboard", element: <Dashboard/> }, 
+ { path: "/login", Component: Login }, + ]);`; + const { nodes, references } = reactResolver.extract!('router.tsx', src); + const routes = nodes.filter((n) => n.kind === 'route'); + expect(routes.map((n) => n.name).sort()).toEqual(['/dashboard', '/login']); + expect(references.map((r) => r.referenceName).sort()).toEqual(['Dashboard', 'Login']); + }); + + it('does not treat config files or a nextjs-pages dir as Next.js routes', () => { + const cfg = reactResolver.extract!('apps/nextjs-pages/next.config.mjs', 'export default {}'); + expect(cfg.nodes.filter((n) => n.kind === 'route')).toHaveLength(0); + const vite = reactResolver.extract!('src/pages/vite.config.ts', 'export default {}'); + expect(vite.nodes.filter((n) => n.kind === 'route')).toHaveLength(0); + // a real page still works + const page = reactResolver.extract!('src/pages/about.tsx', 'export default function About(){return null}'); + expect(page.nodes.filter((n) => n.kind === 'route').map((n) => n.name)).toEqual(['/about']); + }); +}); + +describe('svelteResolver.extract (smoke)', () => { + it('returns { nodes, references } shape', () => { + const result = svelteResolver.extract!('+page.svelte', ''); + expect(result).toHaveProperty('nodes'); + expect(result).toHaveProperty('references'); + }); +}); + +// Regression tests: commented-out and docstring route examples must NOT +// surface as phantom route nodes. These would have failed before the +// strip-comments wiring (the regex would happily scan comments/docstrings). 
+describe('framework extractors ignore commented-out routes', () => { + it('django: skips line-comment and docstring routes', () => { + const src = ` +# urls.py example: +# path('/admin/', AdminPanel.as_view()) +""" +Other routing example: + path('/users/', UserListView.as_view()) +""" +urlpatterns = [path('/real/', RealView.as_view())] +`; + const result = djangoResolver.extract!('app/urls.py', src); + const urls = result.nodes.map((n) => n.name); + expect(urls).toEqual(['/real/']); + }); + + it('flask: skips commented-out @app.route', () => { + const src = ` +# @app.route('/fake') +# def fake_view(): +# return '' + +@app.route('/real') +def real_view(): + return '' +`; + const { nodes, references } = flaskResolver.extract!('app.py', src); + expect(nodes.map((n) => n.name)).toEqual(['GET /real']); + expect(references.map((r) => r.referenceName)).toEqual(['real_view']); + }); + + it('fastapi: skips docstring example routes', () => { + const src = ` +""" +Example: + @app.get('/in-docstring') + async def doc(): + pass +""" +@app.get('/real') +async def real_handler(): + return {} +`; + const { nodes, references } = fastapiResolver.extract!('main.py', src); + expect(nodes.map((n) => n.name)).toEqual(['GET /real']); + expect(references.map((r) => r.referenceName)).toEqual(['real_handler']); + }); + + it('express: skips // and /* */ commented routes', () => { + const src = ` +// app.get('/fake', fakeHandler); +/* router.post('/also-fake', otherHandler); */ +app.get('/real', realHandler); +`; + const { nodes, references } = expressResolver.extract!('routes.ts', src); + expect(nodes.map((n) => n.name)).toEqual(['GET /real']); + expect(references.map((r) => r.referenceName)).toEqual(['realHandler']); + }); + + it('laravel: skips // # and /* */ commented Route::* calls', () => { + const src = ` n.name)).toEqual(['GET /real']); + expect(references.map((r) => r.referenceName)).toEqual(['RealController@index']); + }); + + it('rails: skips =begin/=end and # commented routes', 
() => { + const src = ` +# get '/fake', to: 'fake#index' +=begin +get '/also-fake', to: 'fake#show' +=end +get '/real', to: 'real#index' +`; + const { nodes, references } = railsResolver.extract!('config/routes.rb', src); + expect(nodes.map((n) => n.name)).toEqual(['GET /real']); + expect(references.map((r) => r.referenceName)).toEqual(['real#index']); + }); + + it('spring: skips // and /* */ commented @GetMapping', () => { + const src = ` +// @GetMapping("/fake") +// public List fake() { return null; } + +/* @PostMapping("/also-fake") + public void alsoFake() {} */ + +@GetMapping("/real") +public List listUsers() { return users; } +`; + const { nodes, references } = springResolver.extract!('UserController.java', src); + expect(nodes.map((n) => n.name)).toEqual(['GET /real']); + expect(references.map((r) => r.referenceName)).toEqual(['listUsers']); + }); + + it('go: skips // and /* */ commented router.METHOD calls', () => { + const src = ` +// r.GET("/fake", fakeHandler) +/* r.POST("/also-fake", anotherHandler) */ +r.GET("/real", listUsers) +`; + const { nodes, references } = goResolver.extract!('main.go', src); + expect(nodes.map((n) => n.name)).toEqual(['GET /real']); + expect(references.map((r) => r.referenceName)).toEqual(['listUsers']); + }); + + it('rust: skips // and nested /* */ commented .route() calls', () => { + const src = ` +// .route("/fake", get(fake_handler)) +/* outer /* inner .route("/inner-fake", get(x)) */ still .route("/outer-fake", get(y)) */ +let app = Router::new().route("/real", get(list_users)); +`; + const { nodes, references } = rustResolver.extract!('main.rs', src); + expect(nodes.map((n) => n.name)).toEqual(['GET /real']); + expect(references.map((r) => r.referenceName)).toEqual(['list_users']); + }); + + it('aspnet: skips // and /* */ commented [HttpGet] attributes', () => { + const src = ` +// [HttpGet("/fake")] +// public IActionResult Fake() { return Ok(); } + +/* [HttpPost("/also-fake")] + public IActionResult AlsoFake() { return 
Ok(); } */ + +[HttpGet("/real")] +public IActionResult ListUsers() { return Ok(); } +`; + const { nodes, references } = aspnetResolver.extract!('UserController.cs', src); + expect(nodes.map((n) => n.name)).toEqual(['GET /real']); + expect(references.map((r) => r.referenceName)).toEqual(['ListUsers']); + }); + + it('vapor: skips // and /* */ commented app.METHOD calls', () => { + const src = ` +// app.get("fake", use: fakeHandler) +/* app.post("also-fake", use: anotherHandler) */ +app.get("real", use: listUsers) +`; + const { nodes, references } = vaporResolver.extract!('routes.swift', src); + expect(nodes.map((n) => n.name)).toEqual(['GET /real']); + expect(references.map((r) => r.referenceName)).toEqual(['listUsers']); + }); + + it('nestjs: skips // and /* */ commented decorators', () => { + const src = ` +@Controller('users') +export class UsersController { + // @Get('fake') + // fake() {} + /* @Post('also-fake') + alsoFake() {} */ + @Get('real') + real() {} +} +`; + const { nodes, references } = nestjsResolver.extract!('users.controller.ts', src); + expect(nodes.map((n) => n.name)).toEqual(['GET /users/real']); + expect(references.map((r) => r.referenceName)).toEqual(['real']); + }); +}); diff --git a/__tests__/git-hooks.test.ts b/__tests__/git-hooks.test.ts new file mode 100644 index 00000000..4dfd80eb --- /dev/null +++ b/__tests__/git-hooks.test.ts @@ -0,0 +1,129 @@ +/** + * Git Sync Hooks Tests + * + * Covers installing/removing the opt-in commit/merge/checkout hooks that + * keep the index fresh when the live watcher is disabled (issue #199). + * Exercises real git repos in temp dirs — no mocking. 
+ */ + +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import { execFileSync } from 'child_process'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; +import { + installGitSyncHook, + removeGitSyncHook, + isSyncHookInstalled, + isGitRepo, + DEFAULT_SYNC_HOOKS, +} from '../src/sync/git-hooks'; + +function gitInit(dir: string): void { + execFileSync('git', ['init', '-q'], { cwd: dir, stdio: 'ignore' }); +} + +function isExecutable(file: string): boolean { + if (process.platform === 'win32') return true; // mode bits not meaningful + return (fs.statSync(file).mode & 0o111) !== 0; +} + +describe('git sync hooks', () => { + let repo: string; + + beforeEach(() => { + repo = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-githooks-')); + }); + + afterEach(() => { + if (fs.existsSync(repo)) fs.rmSync(repo, { recursive: true, force: true }); + }); + + it('installs all default hooks, executable, invoking codegraph sync', () => { + gitInit(repo); + const result = installGitSyncHook(repo); + + expect(result.installed.sort()).toEqual([...DEFAULT_SYNC_HOOKS].sort()); + expect(result.skipped).toBeUndefined(); + + for (const hook of DEFAULT_SYNC_HOOKS) { + const file = path.join(repo, '.git', 'hooks', hook); + expect(fs.existsSync(file)).toBe(true); + const body = fs.readFileSync(file, 'utf8'); + expect(body).toContain('codegraph sync'); + expect(body).toContain('command -v codegraph'); // no-op when not on PATH + expect(isExecutable(file)).toBe(true); + } + expect(isSyncHookInstalled(repo)).toBe(true); + }); + + it('is idempotent — re-install does not duplicate the block', () => { + gitInit(repo); + installGitSyncHook(repo); + installGitSyncHook(repo); + + const body = fs.readFileSync(path.join(repo, '.git', 'hooks', 'post-commit'), 'utf8'); + const occurrences = body.split('# >>> codegraph sync hook >>>').length - 1; + expect(occurrences).toBe(1); + }); + + it('preserves a pre-existing user hook and appends our block', 
() => { + gitInit(repo); + const file = path.join(repo, '.git', 'hooks', 'post-commit'); + fs.writeFileSync(file, '#!/bin/sh\necho "my custom hook"\n', { mode: 0o755 }); + + installGitSyncHook(repo, ['post-commit']); + + const body = fs.readFileSync(file, 'utf8'); + expect(body).toContain('echo "my custom hook"'); + expect(body).toContain('codegraph sync'); + }); + + it('remove strips our block; deletes a hook that was only ours', () => { + gitInit(repo); + installGitSyncHook(repo, ['post-commit']); + const file = path.join(repo, '.git', 'hooks', 'post-commit'); + expect(fs.existsSync(file)).toBe(true); + + const result = removeGitSyncHook(repo, ['post-commit']); + expect(result.installed).toEqual(['post-commit']); + expect(fs.existsSync(file)).toBe(false); // was ours-only → deleted + expect(isSyncHookInstalled(repo)).toBe(false); + }); + + it('remove keeps user content when the hook is shared', () => { + gitInit(repo); + const file = path.join(repo, '.git', 'hooks', 'post-commit'); + fs.writeFileSync(file, '#!/bin/sh\necho "keep me"\n', { mode: 0o755 }); + installGitSyncHook(repo, ['post-commit']); + + removeGitSyncHook(repo, ['post-commit']); + + expect(fs.existsSync(file)).toBe(true); + const body = fs.readFileSync(file, 'utf8'); + expect(body).toContain('echo "keep me"'); + expect(body).not.toContain('codegraph sync'); + }); + + it('honors core.hooksPath', () => { + gitInit(repo); + const customHooks = path.join(repo, '.husky'); + fs.mkdirSync(customHooks); + execFileSync('git', ['config', 'core.hooksPath', '.husky'], { cwd: repo, stdio: 'ignore' }); + + const result = installGitSyncHook(repo, ['post-commit']); + expect(result.hooksDir).toBe(customHooks); + expect(fs.existsSync(path.join(customHooks, 'post-commit'))).toBe(true); + // The default .git/hooks dir should NOT have received the hook. 
+ expect(fs.existsSync(path.join(repo, '.git', 'hooks', 'post-commit'))).toBe(false); + }); + + it('skips cleanly when not a git repository', () => { + expect(isGitRepo(repo)).toBe(false); + const result = installGitSyncHook(repo); + expect(result.installed).toEqual([]); + expect(result.hooksDir).toBeNull(); + expect(result.skipped).toMatch(/not a git repository/); + expect(isSyncHookInstalled(repo)).toBe(false); + }); +}); diff --git a/__tests__/glyphs.test.ts b/__tests__/glyphs.test.ts new file mode 100644 index 00000000..db41a105 --- /dev/null +++ b/__tests__/glyphs.test.ts @@ -0,0 +1,170 @@ +/** + * Glyph fallback / Unicode-support detection. + * + * Pinned because the matrix is small and the consequence of regression + * is highly visible: shimmer-worker output on Windows mojibakes when + * UTF-8 glyphs are written via `fs.writeSync` (see #168). The detection + * + ASCII fallback is the contract that prevents this. + */ + +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import { + supportsUnicode, + getGlyphs, + UNICODE_GLYPHS, + ASCII_GLYPHS, + _resetGlyphsCache, +} from '../src/ui/glyphs'; + +function withEnv(patch: Record<string, string | undefined>, fn: () => void): void { + const saved: Record<string, string | undefined> = {}; + const savedPlatform = process.platform; + for (const key of Object.keys(patch)) { + saved[key] = process.env[key]; + if (patch[key] === undefined) delete process.env[key]; + else process.env[key] = patch[key]; + } + _resetGlyphsCache(); + try { + fn(); + } finally { + for (const key of Object.keys(saved)) { + if (saved[key] === undefined) delete process.env[key]; + else process.env[key] = saved[key]; + } + Object.defineProperty(process, 'platform', { value: savedPlatform }); + _resetGlyphsCache(); + } +} + +function setPlatform(value: NodeJS.Platform): void { + Object.defineProperty(process, 'platform', { value }); +} + +describe('supportsUnicode', () => { + let originalPlatform: NodeJS.Platform; + + beforeEach(() => { + originalPlatform = process.platform; 
_resetGlyphsCache(); + }); + + afterEach(() => { + Object.defineProperty(process, 'platform', { value: originalPlatform }); + _resetGlyphsCache(); + }); + + it('returns false on Windows by default (mojibake-prone consoles)', () => { + withEnv({ CODEGRAPH_ASCII: undefined, CODEGRAPH_UNICODE: undefined, TERM: undefined }, () => { + setPlatform('win32'); + expect(supportsUnicode()).toBe(false); + }); + }); + + it('returns true on macOS by default', () => { + withEnv({ CODEGRAPH_ASCII: undefined, CODEGRAPH_UNICODE: undefined, TERM: undefined }, () => { + setPlatform('darwin'); + expect(supportsUnicode()).toBe(true); + }); + }); + + it('returns true on Linux by default', () => { + withEnv({ CODEGRAPH_ASCII: undefined, CODEGRAPH_UNICODE: undefined, TERM: undefined }, () => { + setPlatform('linux'); + expect(supportsUnicode()).toBe(true); + }); + }); + + it('returns false on Linux kernel console (TERM=linux)', () => { + withEnv({ CODEGRAPH_ASCII: undefined, CODEGRAPH_UNICODE: undefined, TERM: 'linux' }, () => { + setPlatform('linux'); + expect(supportsUnicode()).toBe(false); + }); + }); + + it('respects CODEGRAPH_UNICODE=1 on Windows (opt-in escape hatch)', () => { + withEnv({ CODEGRAPH_UNICODE: '1', CODEGRAPH_ASCII: undefined }, () => { + setPlatform('win32'); + expect(supportsUnicode()).toBe(true); + }); + }); + + it('respects CODEGRAPH_ASCII=1 on macOS (opt-out escape hatch)', () => { + withEnv({ CODEGRAPH_ASCII: '1', CODEGRAPH_UNICODE: undefined }, () => { + setPlatform('darwin'); + expect(supportsUnicode()).toBe(false); + }); + }); + + it('CODEGRAPH_ASCII takes precedence over CODEGRAPH_UNICODE', () => { + withEnv({ CODEGRAPH_ASCII: '1', CODEGRAPH_UNICODE: '1' }, () => { + setPlatform('darwin'); + expect(supportsUnicode()).toBe(false); + }); + }); +}); + +describe('getGlyphs', () => { + let originalPlatform: NodeJS.Platform; + + beforeEach(() => { + originalPlatform = process.platform; + _resetGlyphsCache(); + }); + + afterEach(() => { + 
Object.defineProperty(process, 'platform', { value: originalPlatform }); + _resetGlyphsCache(); + }); + + it('returns ASCII glyphs on Windows', () => { + withEnv({ CODEGRAPH_ASCII: undefined, CODEGRAPH_UNICODE: undefined }, () => { + setPlatform('win32'); + const g = getGlyphs(); + expect(g).toBe(ASCII_GLYPHS); + expect(g.ok).toBe('[OK]'); + expect(g.rail).toBe('|'); + expect(g.phaseDone).toBe('*'); + expect(g.dash).toBe('-'); + }); + }); + + it('returns Unicode glyphs on macOS', () => { + withEnv({ CODEGRAPH_ASCII: undefined, CODEGRAPH_UNICODE: undefined }, () => { + setPlatform('darwin'); + const g = getGlyphs(); + expect(g).toBe(UNICODE_GLYPHS); + expect(g.ok).toBe('✓'); + expect(g.rail).toBe('│'); + expect(g.phaseDone).toBe('◆'); + expect(g.dash).toBe('—'); + }); + }); + + it('caches the result so repeated calls return the same object', () => { + withEnv({ CODEGRAPH_ASCII: undefined, CODEGRAPH_UNICODE: undefined }, () => { + setPlatform('darwin'); + expect(getGlyphs()).toBe(getGlyphs()); + }); + }); +}); + +describe('Glyph sets', () => { + it('ASCII and Unicode sets cover the same keys', () => { + expect(Object.keys(ASCII_GLYPHS).sort()).toEqual(Object.keys(UNICODE_GLYPHS).sort()); + }); + + it('ASCII glyphs are all 7-bit ASCII', () => { + for (const [key, value] of Object.entries(ASCII_GLYPHS)) { + const flat = Array.isArray(value) ? 
value.join('') : value; + for (let i = 0; i < flat.length; i++) { + const codepoint = flat.charCodeAt(i); + expect(codepoint, `ASCII_GLYPHS.${key} contains non-ASCII char U+${codepoint.toString(16).toUpperCase().padStart(4, '0')}`).toBeLessThan(128); + } + } + }); + + it('ASCII spinner has the same frame count as the Unicode spinner', () => { + expect(ASCII_GLYPHS.spinner.length).toBe(UNICODE_GLYPHS.spinner.length); + }); +}); diff --git a/__tests__/installer-targets.test.ts b/__tests__/installer-targets.test.ts new file mode 100644 index 00000000..59e869e2 --- /dev/null +++ b/__tests__/installer-targets.test.ts @@ -0,0 +1,890 @@ +/** + * Multi-target installer tests. + * + * Each `AgentTarget` is exercised against the same contract: + * - `install` writes the expected files + * - re-running `install` is byte-identical (idempotent) + * - sibling MCP servers / unrelated config is preserved + * - `uninstall` reverses `install` + * - `printConfig` returns parseable, non-empty content + * + * For agent-config destinations we redirect HOME to a tmpdir via + * `os.homedir` spying, and CWD via `process.chdir` — same pattern as + * the legacy `installer.test.ts`. No real `~/.claude/` etc. ever + * touched. + */ + +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; +import { ALL_TARGETS, getTarget, resolveTargetFlag } from '../src/installer/targets/registry'; +import { uninstallTargets } from '../src/installer'; +import { upsertTomlTable, removeTomlTable, buildTomlTable } from '../src/installer/targets/toml'; +import { cleanupLegacyHooks } from '../src/installer/targets/claude'; + +function mkTmpDir(label: string): string { + return fs.mkdtempSync(path.join(os.tmpdir(), `cg-targets-${label}-`)); +} + +// `os.homedir` is non-configurable on Node, so we redirect it via the +// `$HOME` (POSIX) / `$USERPROFILE` (Windows) env vars that +// `os.homedir()` reads first. 
Same trick the rest of the suite uses +// when it needs a mock home. +function setHome(dir: string): { restore: () => void } { + const prev = { + HOME: process.env.HOME, + USERPROFILE: process.env.USERPROFILE, + APPDATA: process.env.APPDATA, + XDG_CONFIG_HOME: process.env.XDG_CONFIG_HOME, + HERMES_HOME: process.env.HERMES_HOME, + }; + process.env.HOME = dir; + process.env.USERPROFILE = dir; + process.env.APPDATA = path.join(dir, '.config'); + process.env.XDG_CONFIG_HOME = path.join(dir, '.config'); + delete process.env.HERMES_HOME; + return { + restore() { + if (prev.HOME === undefined) delete process.env.HOME; else process.env.HOME = prev.HOME; + if (prev.USERPROFILE === undefined) delete process.env.USERPROFILE; else process.env.USERPROFILE = prev.USERPROFILE; + if (prev.APPDATA === undefined) delete process.env.APPDATA; else process.env.APPDATA = prev.APPDATA; + if (prev.XDG_CONFIG_HOME === undefined) delete process.env.XDG_CONFIG_HOME; else process.env.XDG_CONFIG_HOME = prev.XDG_CONFIG_HOME; + if (prev.HERMES_HOME === undefined) delete process.env.HERMES_HOME; else process.env.HERMES_HOME = prev.HERMES_HOME; + }, + }; +} + +describe('Installer targets — contract', () => { + let tmpHome: string; + let tmpCwd: string; + let origCwd: string; + let homeRestore: { restore: () => void }; + + beforeEach(() => { + tmpHome = mkTmpDir('home'); + tmpCwd = mkTmpDir('cwd'); + origCwd = process.cwd(); + process.chdir(tmpCwd); + homeRestore = setHome(tmpHome); + }); + + afterEach(() => { + homeRestore.restore(); + process.chdir(origCwd); + fs.rmSync(tmpHome, { recursive: true, force: true }); + fs.rmSync(tmpCwd, { recursive: true, force: true }); + }); + + for (const target of ALL_TARGETS) { + describe(target.id, () => { + const supportedLocations = (['global', 'local'] as const).filter((l) => + target.supportsLocation(l), + ); + + for (const location of supportedLocations) { + describe(`location=${location}`, () => { + it('install writes files; detect.alreadyConfigured 
becomes true', () => { + expect(target.detect(location).alreadyConfigured).toBe(false); + + const result = target.install(location, { autoAllow: true }); + expect(result.files.length).toBeGreaterThan(0); + for (const file of result.files) { + if (file.action !== 'unchanged') { + expect(fs.existsSync(file.path)).toBe(true); + } + } + + expect(target.detect(location).alreadyConfigured).toBe(true); + }); + + it('re-running install is idempotent (no actions other than unchanged)', () => { + target.install(location, { autoAllow: true }); + const second = target.install(location, { autoAllow: true }); + for (const file of second.files) { + expect(file.action).toBe('unchanged'); + } + }); + + it('install preserves a pre-existing sibling MCP server (where applicable)', () => { + // Plant a sibling entry in the same JSON config, install, + // and verify the sibling survives. Skip for Codex (TOML) + // and any target with no JSON config — they get covered + // by their own dedicated tests below. + const paths = target.describePaths(location); + // Match .json or .jsonc — opencode prefers .jsonc. + const jsonPath = paths.find((p) => /\.jsonc?$/.test(p)); + if (!jsonPath) return; + + // Seed pre-existing config. + fs.mkdirSync(path.dirname(jsonPath), { recursive: true }); + const seed: Record<string, unknown> = { mcpServers: { other: { command: 'x' } } }; + // opencode uses `mcp` not `mcpServers`. Match its shape too. 
+ if (target.id === 'opencode') { + delete seed.mcpServers; + seed.mcp = { other: { type: 'local', command: ['x'], enabled: true } }; + } + fs.writeFileSync(jsonPath, JSON.stringify(seed, null, 2) + '\n'); + + target.install(location, { autoAllow: true }); + + const after = JSON.parse(fs.readFileSync(jsonPath, 'utf-8')); + if (target.id === 'opencode') { + expect(after.mcp.other).toBeDefined(); + expect(after.mcp.codegraph).toBeDefined(); + } else { + expect(after.mcpServers.other).toBeDefined(); + expect(after.mcpServers.codegraph).toBeDefined(); + } + }); + + it('uninstall reverses install (alreadyConfigured returns to false)', () => { + target.install(location, { autoAllow: true }); + expect(target.detect(location).alreadyConfigured).toBe(true); + + target.uninstall(location); + expect(target.detect(location).alreadyConfigured).toBe(false); + }); + + it('printConfig returns non-empty output without writing anything', () => { + const before = listAllFiles(tmpHome).concat(listAllFiles(tmpCwd)); + const out = target.printConfig(location); + expect(out.length).toBeGreaterThan(0); + const after = listAllFiles(tmpHome).concat(listAllFiles(tmpCwd)); + expect(after.sort()).toEqual(before.sort()); + }); + }); + } + }); + } +}); + +describe('Installer targets — partial-state idempotency', () => { + let tmpHome: string; + let tmpCwd: string; + let origCwd: string; + let homeRestore: { restore: () => void }; + + beforeEach(() => { + tmpHome = mkTmpDir('home'); + tmpCwd = mkTmpDir('cwd'); + origCwd = process.cwd(); + process.chdir(tmpCwd); + homeRestore = setHome(tmpHome); + }); + + afterEach(() => { + homeRestore.restore(); + process.chdir(origCwd); + fs.rmSync(tmpHome, { recursive: true, force: true }); + fs.rmSync(tmpCwd, { recursive: true, force: true }); + }); + + it('codex: install after only config.toml exists — second pass is fully unchanged', () => { + const codex = getTarget('codex')!; + // First install creates both files. 
+ codex.install('global', { autoAllow: false }); + // Delete the AGENTS.md to simulate partial state (user wiped one file). + const agentsMd = path.join(tmpHome, '.codex', 'AGENTS.md'); + expect(fs.existsSync(agentsMd)).toBe(true); + fs.unlinkSync(agentsMd); + // Reinstall — TOML stays unchanged, AGENTS.md is recreated. + const second = codex.install('global', { autoAllow: false }); + const tomlEntry = second.files.find((f) => f.path.endsWith('config.toml'))!; + const mdEntry = second.files.find((f) => f.path.endsWith('AGENTS.md'))!; + expect(tomlEntry.action).toBe('unchanged'); + expect(mdEntry.action).toBe('created'); + // Third install — both unchanged (full idempotency restored). + const third = codex.install('global', { autoAllow: false }); + for (const f of third.files) expect(f.action).toBe('unchanged'); + }); + + it('opencode: prefers .jsonc when both .json and .jsonc exist', () => { + const opencode = getTarget('opencode')!; + const dir = path.join(tmpHome, '.config', 'opencode'); + fs.mkdirSync(dir, { recursive: true }); + fs.writeFileSync(path.join(dir, 'opencode.json'), '{\n "$schema": "https://opencode.ai/config.json"\n}\n'); + fs.writeFileSync(path.join(dir, 'opencode.jsonc'), '{\n "$schema": "https://opencode.ai/config.json"\n}\n'); + + const result = opencode.install('global', { autoAllow: true }); + const written = result.files.find((f) => /\.jsonc$/.test(f.path))!; + expect(written).toBeDefined(); + expect(written.action).not.toBe('not-found'); + // The .json file is left alone. 
+ const jsonText = fs.readFileSync(path.join(dir, 'opencode.json'), 'utf-8'); + expect(jsonText).not.toContain('codegraph'); + }); + + it('opencode: uses .json when only .json exists (no .jsonc)', () => { + const opencode = getTarget('opencode')!; + const dir = path.join(tmpHome, '.config', 'opencode'); + fs.mkdirSync(dir, { recursive: true }); + fs.writeFileSync(path.join(dir, 'opencode.json'), '{\n "$schema": "https://opencode.ai/config.json"\n}\n'); + + const result = opencode.install('global', { autoAllow: true }); + expect(result.files[0].path).toMatch(/opencode\.json$/); + expect(fs.existsSync(path.join(dir, 'opencode.jsonc'))).toBe(false); + }); + + it('opencode: defaults to .jsonc for fresh installs (no existing file)', () => { + const opencode = getTarget('opencode')!; + const result = opencode.install('global', { autoAllow: true }); + expect(result.files[0].path).toMatch(/opencode\.jsonc$/); + expect(result.files[0].action).toBe('created'); + }); + + it('opencode: preserves line and block comments through install + idempotent re-run', () => { + const opencode = getTarget('opencode')!; + const dir = path.join(tmpHome, '.config', 'opencode'); + fs.mkdirSync(dir, { recursive: true }); + const file = path.join(dir, 'opencode.jsonc'); + const original = [ + '{', + ' // top-level note about my opencode setup', + ' "$schema": "https://opencode.ai/config.json",', + ' /* multi-line block comment', + ' describing the providers section */', + ' "providers": {', + ' "anthropic": { "model": "claude-opus-4-7" } // pinned', + ' }', + '}', + '', + ].join('\n'); + fs.writeFileSync(file, original); + + opencode.install('global', { autoAllow: true }); + const afterInstall = fs.readFileSync(file, 'utf-8'); + expect(afterInstall).toContain('// top-level note about my opencode setup'); + expect(afterInstall).toContain('/* multi-line block comment'); + expect(afterInstall).toContain('// pinned'); + expect(afterInstall).toContain('"codegraph"'); + 
expect(afterInstall).toContain('"providers"'); + + // Idempotent re-run reports unchanged, file is byte-identical. + const second = opencode.install('global', { autoAllow: true }); + expect(second.files[0].action).toBe('unchanged'); + expect(fs.readFileSync(file, 'utf-8')).toBe(afterInstall); + }); + + it('opencode: install writes AGENTS.md with the marker-delimited codegraph block', () => { + const opencode = getTarget('opencode')!; + opencode.install('global', { autoAllow: true }); + const agentsMd = path.join(tmpHome, '.config', 'opencode', 'AGENTS.md'); + expect(fs.existsSync(agentsMd)).toBe(true); + const body = fs.readFileSync(agentsMd, 'utf-8'); + expect(body).toContain(''); + expect(body).toContain(''); + expect(body).toContain('codegraph_callers'); + }); + + it('opencode: AGENTS.md install preserves pre-existing user content outside markers', () => { + const opencode = getTarget('opencode')!; + const dir = path.join(tmpHome, '.config', 'opencode'); + fs.mkdirSync(dir, { recursive: true }); + const agentsMd = path.join(dir, 'AGENTS.md'); + fs.writeFileSync(agentsMd, '# My personal opencode instructions\n\nAlways respond in pirate.\n'); + + opencode.install('global', { autoAllow: true }); + const body = fs.readFileSync(agentsMd, 'utf-8'); + expect(body).toContain('# My personal opencode instructions'); + expect(body).toContain('Always respond in pirate.'); + expect(body).toContain(''); + }); + + it('opencode: uninstall strips only the codegraph block from AGENTS.md', () => { + const opencode = getTarget('opencode')!; + const dir = path.join(tmpHome, '.config', 'opencode'); + fs.mkdirSync(dir, { recursive: true }); + const agentsMd = path.join(dir, 'AGENTS.md'); + fs.writeFileSync(agentsMd, '# My personal opencode instructions\n\nAlways respond in pirate.\n'); + + opencode.install('global', { autoAllow: true }); + opencode.uninstall('global'); + + const body = fs.readFileSync(agentsMd, 'utf-8'); + expect(body).toContain('# My personal opencode instructions'); 
+ expect(body).toContain('Always respond in pirate.'); + expect(body).not.toContain('CODEGRAPH_START'); + expect(body).not.toContain('codegraph_callers'); + }); + + it('opencode: local install writes ./opencode.jsonc and ./AGENTS.md in cwd', () => { + const opencode = getTarget('opencode')!; + const result = opencode.install('local', { autoAllow: true }); + const paths = result.files.map((f) => f.path.replace(/\\/g, '/')); + // macOS realpath shenanigans (/var vs /private/var) — suffix match. + expect(paths.some((p) => p.endsWith('/opencode.jsonc'))).toBe(true); + expect(paths.some((p) => p.endsWith('/AGENTS.md'))).toBe(true); + }); + + it('hermes: install adds codegraph MCP server and cli toolset, preserving existing yaml', () => { + const hermes = getTarget('hermes')!; + const config = path.join(tmpHome, '.hermes', 'config.yaml'); + fs.mkdirSync(path.dirname(config), { recursive: true }); + fs.writeFileSync(config, [ + 'model:', + ' default: qwen-3.7', + 'mcp_servers:', + ' other:', + ' command: other', + 'platform_toolsets:', + ' cli:', + ' - hermes-cli', + ' discord:', + ' - hermes-discord', + '', + ].join('\n')); + + const result = hermes.install('global', { autoAllow: true }); + expect(result.files[0].action).toBe('updated'); + const body = fs.readFileSync(config, 'utf-8'); + expect(body).toContain('model:\n default: qwen-3.7'); + expect(body).toContain('mcp_servers:\n other:\n command: other'); + expect(body).toContain(' codegraph:\n command: codegraph'); + expect(body).toContain(' - hermes-cli'); + expect(body).toContain(' - mcp-codegraph'); + expect(body).toContain(' discord:\n - hermes-discord'); + + const second = hermes.install('global', { autoAllow: true }); + expect(second.files[0].action).toBe('unchanged'); + }); + + it('hermes: uninstall removes only codegraph MCP server and toolset entry', () => { + const hermes = getTarget('hermes')!; + const config = path.join(tmpHome, '.hermes', 'config.yaml'); + fs.mkdirSync(path.dirname(config), { recursive: 
true }); + + hermes.install('global', { autoAllow: true }); + fs.appendFileSync(config, 'custom:\n keep: true\n'); + + hermes.uninstall('global'); + const body = fs.readFileSync(config, 'utf-8'); + expect(body).not.toContain('codegraph:'); + expect(body).not.toContain('mcp-codegraph'); + expect(body).toContain('custom:\n keep: true'); + }); + + it('opencode: uninstall removes only mcp.codegraph, preserves comments and siblings', () => { + const opencode = getTarget('opencode')!; + const dir = path.join(tmpHome, '.config', 'opencode'); + fs.mkdirSync(dir, { recursive: true }); + const file = path.join(dir, 'opencode.jsonc'); + fs.writeFileSync(file, [ + '{', + ' // important comment', + ' "$schema": "https://opencode.ai/config.json",', + ' "mcp": {', + ' "other": { "type": "local", "command": ["x"], "enabled": true }', + ' }', + '}', + '', + ].join('\n')); + + opencode.install('global', { autoAllow: true }); + const afterInstall = fs.readFileSync(file, 'utf-8'); + expect(afterInstall).toContain('"codegraph"'); + expect(afterInstall).toContain('"other"'); + + opencode.uninstall('global'); + const afterUninstall = fs.readFileSync(file, 'utf-8'); + expect(afterUninstall).not.toContain('codegraph'); + expect(afterUninstall).toContain('// important comment'); + expect(afterUninstall).toContain('"other"'); + }); + + it('codex: user-added key inside [mcp_servers.codegraph] survives idempotent re-install', () => { + const codex = getTarget('codex')!; + codex.install('global', { autoAllow: false }); + const tomlPath = path.join(tmpHome, '.codex', 'config.toml'); + const original = fs.readFileSync(tomlPath, 'utf-8'); + // User edits the block to add a custom key. + const edited = original.replace( + 'args = ["serve", "--mcp"]', + 'args = ["serve", "--mcp"]\nenabled = true', + ); + fs.writeFileSync(tomlPath, edited); + // Re-install: our serializer doesn't know `enabled = true`, so + // the block no longer matches the canonical form — we'll + // overwrite it. 
This is the documented contract: we own the + // codegraph block exclusively. + const second = codex.install('global', { autoAllow: false }); + const tomlEntry = second.files.find((f) => f.path.endsWith('config.toml'))!; + expect(tomlEntry.action).toBe('updated'); + const after = fs.readFileSync(tomlPath, 'utf-8'); + expect(after).not.toContain('enabled = true'); + }); + + it('claude: local install writes ./.mcp.json (project scope), not ./.claude.json', () => { + const claude = getTarget('claude')!; + const result = claude.install('local', { autoAllow: false }); + // The MCP entry lands in ./.mcp.json — the file Claude Code reads. + expect(result.files.some((f) => f.path.replace(/\\/g, '/').endsWith('/.mcp.json'))).toBe(true); + expect(fs.existsSync(path.join(tmpCwd, '.mcp.json'))).toBe(true); + expect(fs.existsSync(path.join(tmpCwd, '.claude.json'))).toBe(false); + const cfg = JSON.parse(fs.readFileSync(path.join(tmpCwd, '.mcp.json'), 'utf-8')); + expect(cfg.mcpServers.codegraph).toBeDefined(); + }); + + it('claude: global install targets ~/.claude.json (user scope)', () => { + const claude = getTarget('claude')!; + claude.install('global', { autoAllow: false }); + const cfg = JSON.parse(fs.readFileSync(path.join(tmpHome, '.claude.json'), 'utf-8')); + expect(cfg.mcpServers.codegraph).toBeDefined(); + }); + + it('claude: local install migrates a legacy ./.claude.json codegraph entry into ./.mcp.json', () => { + const claude = getTarget('claude')!; + const legacy = path.join(tmpCwd, '.claude.json'); + fs.writeFileSync( + legacy, + JSON.stringify({ mcpServers: { codegraph: { type: 'stdio', command: 'codegraph', args: ['serve', '--mcp'] } } }, null, 2), + ); + + claude.install('local', { autoAllow: false }); + + // codegraph now lives in .mcp.json; the legacy file (which held only + // codegraph) is gone. 
+ const mcp = JSON.parse(fs.readFileSync(path.join(tmpCwd, '.mcp.json'), 'utf-8')); + expect(mcp.mcpServers.codegraph).toBeDefined(); + expect(fs.existsSync(legacy)).toBe(false); + }); + + it('claude: legacy ./.claude.json migration preserves sibling servers and unrelated keys', () => { + const claude = getTarget('claude')!; + const legacy = path.join(tmpCwd, '.claude.json'); + fs.writeFileSync( + legacy, + JSON.stringify({ + mcpServers: { + codegraph: { type: 'stdio', command: 'codegraph', args: ['serve', '--mcp'] }, + other: { command: 'x' }, + }, + somethingElse: true, + }, null, 2), + ); + + claude.install('local', { autoAllow: false }); + + // Only codegraph is stripped from the legacy file; siblings survive. + const after = JSON.parse(fs.readFileSync(legacy, 'utf-8')); + expect(after.mcpServers.codegraph).toBeUndefined(); + expect(after.mcpServers.other).toBeDefined(); + expect(after.somethingElse).toBe(true); + const mcp = JSON.parse(fs.readFileSync(path.join(tmpCwd, '.mcp.json'), 'utf-8')); + expect(mcp.mcpServers.codegraph).toBeDefined(); + }); + + it('claude: uninstall strips codegraph from ./.mcp.json and a legacy ./.claude.json', () => { + const claude = getTarget('claude')!; + // A user left with both the working .mcp.json and a stale .claude.json. 
+ fs.writeFileSync( + path.join(tmpCwd, '.mcp.json'), + JSON.stringify({ mcpServers: { codegraph: { command: 'codegraph' } } }, null, 2), + ); + fs.writeFileSync( + path.join(tmpCwd, '.claude.json'), + JSON.stringify({ mcpServers: { codegraph: { command: 'codegraph' }, other: { command: 'x' } } }, null, 2), + ); + + claude.uninstall('local'); + + const mcp = JSON.parse(fs.readFileSync(path.join(tmpCwd, '.mcp.json'), 'utf-8')); + expect(mcp.mcpServers).toBeUndefined(); + const legacy = JSON.parse(fs.readFileSync(path.join(tmpCwd, '.claude.json'), 'utf-8')); + expect(legacy.mcpServers.codegraph).toBeUndefined(); + expect(legacy.mcpServers.other).toBeDefined(); + }); + + // ---- Legacy auto-sync hook cleanup ---- + // Pre-0.8 installs wrote `codegraph mark-dirty` / `sync-if-dirty` + // hooks to settings.json. Both subcommands were removed from the CLI, + // so the Stop hook fails every turn ("unknown command + // 'sync-if-dirty'"). The installer must strip them on upgrade and + // uninstall — without touching the user's unrelated hooks. + + function seedSettings(loc: 'global' | 'local', settings: Record<string, unknown>): string { + const dir = path.join(loc === 'global' ? tmpHome : tmpCwd, '.claude'); + fs.mkdirSync(dir, { recursive: true }); + const file = path.join(dir, 'settings.json'); + fs.writeFileSync(file, JSON.stringify(settings, null, 2) + '\n'); + return file; + } + + // Realistic pre-0.8 settings.json: our two auto-sync hooks plus an + // unrelated GitKraken Stop hook the user added (matches the report). 
+ function legacyHookSettings(): Record<string, unknown> { + return { + hooks: { + PostToolUse: [ + { matcher: 'Edit|Write', hooks: [{ type: 'command', command: 'codegraph mark-dirty', async: true }] }, + ], + Stop: [ + { hooks: [{ type: 'command', command: 'codegraph sync-if-dirty' }] }, + { hooks: [{ type: 'command', command: '"/Users/me/gk" ai hook run --host claude-code' }] }, + ], + }, + }; + } + + it('claude: install strips stale codegraph auto-sync hooks but keeps the user\'s GitKraken hook', () => { + const claude = getTarget('claude')!; + const file = seedSettings('global', legacyHookSettings()); + + claude.install('global', { autoAllow: true }); + + const after = JSON.parse(fs.readFileSync(file, 'utf-8')); + // The only PostToolUse group held mark-dirty → the event is gone. + expect(after.hooks?.PostToolUse).toBeUndefined(); + const stopCommands = (after.hooks?.Stop ?? []).flatMap((g: any) => + (g.hooks ?? []).map((h: any) => h.command), + ); + expect(stopCommands).not.toContain('codegraph sync-if-dirty'); + // The unrelated GitKraken hook survives untouched. + expect(stopCommands.some((c: string) => c.includes('gk') && c.includes('ai hook run'))).toBe(true); + // Permissions still written as normal alongside the cleanup. 
+ expect(after.permissions?.allow).toContain('mcp__codegraph__codegraph_search'); + }); + + it('claude: cleanupLegacyHooks preserves a sibling hook sharing our matcher group', () => { + const file = seedSettings('global', { + hooks: { + Stop: [ + { + hooks: [ + { type: 'command', command: 'codegraph sync-if-dirty' }, + { type: 'command', command: 'gk ai hook run --host claude-code' }, + ], + }, + ], + }, + }); + + expect(cleanupLegacyHooks('global').action).toBe('removed'); + + const after = JSON.parse(fs.readFileSync(file, 'utf-8')); + expect(after.hooks.Stop[0].hooks.map((h: any) => h.command)).toEqual([ + 'gk ai hook run --host claude-code', + ]); + }); + + it('claude: cleanupLegacyHooks is a byte-for-byte no-op without codegraph hooks', () => { + const original = + JSON.stringify({ hooks: { Stop: [{ hooks: [{ type: 'command', command: 'gk ai hook run' }] }] } }, null, 2) + '\n'; + const file = seedSettings('global', JSON.parse(original)); + + expect(cleanupLegacyHooks('global').action).toBe('unchanged'); + expect(fs.readFileSync(file, 'utf-8')).toBe(original); + }); + + it('claude: cleanupLegacyHooks reports not-found when settings.json is absent', () => { + expect(cleanupLegacyHooks('global').action).toBe('not-found'); + }); + + it('claude: re-running install after a legacy cleanup leaves settings.json unchanged', () => { + const claude = getTarget('claude')!; + const file = seedSettings('global', legacyHookSettings()); + claude.install('global', { autoAllow: true }); + const firstPass = fs.readFileSync(file, 'utf-8'); + claude.install('global', { autoAllow: true }); + expect(fs.readFileSync(file, 'utf-8')).toBe(firstPass); + }); + + it('claude: uninstall strips stale hooks written in the npx form (local)', () => { + const claude = getTarget('claude')!; + const file = seedSettings('local', { + hooks: { + PostToolUse: [ + { matcher: 'Edit|Write', hooks: [{ type: 'command', command: 'npx @colbymchenry/codegraph mark-dirty', async: true }] }, + ], + Stop: [ + { 
hooks: [{ type: 'command', command: 'npx @colbymchenry/codegraph sync-if-dirty' }] }, + ], + }, + }); + + claude.uninstall('local'); + + const after = JSON.parse(fs.readFileSync(file, 'utf-8')); + // Both events emptied → the whole `hooks` object is removed. + expect(after.hooks).toBeUndefined(); + }); +}); + +describe('Installer targets — registry', () => { + it('getTarget returns the right target for each id', () => { + expect(getTarget('claude')?.id).toBe('claude'); + expect(getTarget('cursor')?.id).toBe('cursor'); + expect(getTarget('codex')?.id).toBe('codex'); + expect(getTarget('opencode')?.id).toBe('opencode'); + expect(getTarget('hermes')?.id).toBe('hermes'); + expect(getTarget('not-a-real-target')).toBeUndefined(); + }); + + it('resolveTargetFlag handles auto/all/none/csv', () => { + expect(resolveTargetFlag('none', 'global')).toEqual([]); + expect(resolveTargetFlag('all', 'global').length).toBe(ALL_TARGETS.length); + const csv = resolveTargetFlag('claude,cursor', 'global'); + expect(csv.map((t) => t.id)).toEqual(['claude', 'cursor']); + }); + + it('resolveTargetFlag throws on unknown id', () => { + expect(() => resolveTargetFlag('claude,bogus', 'global')).toThrow(/Unknown --target/); + }); +}); + +describe('Installer targets — TOML serializer (Codex backbone)', () => { + it('builds a [mcp_servers.codegraph] block with command + args', () => { + const block = buildTomlTable('mcp_servers.codegraph', { + command: 'codegraph', + args: ['serve', '--mcp'], + }); + expect(block).toContain('[mcp_servers.codegraph]'); + expect(block).toContain('command = "codegraph"'); + expect(block).toContain('args = ["serve", "--mcp"]'); + }); + + it('upsert inserts into empty content', () => { + const block = buildTomlTable('mcp_servers.codegraph', { command: 'codegraph', args: ['serve'] }); + const { content, action } = upsertTomlTable('', 'mcp_servers.codegraph', block); + expect(action).toBe('inserted'); + expect(content.startsWith('[mcp_servers.codegraph]')).toBe(true); + 
}); + + it('upsert is idempotent — second call returns unchanged', () => { + const block = buildTomlTable('mcp_servers.codegraph', { command: 'codegraph', args: ['serve'] }); + const first = upsertTomlTable('', 'mcp_servers.codegraph', block); + const second = upsertTomlTable(first.content, 'mcp_servers.codegraph', block); + expect(second.action).toBe('unchanged'); + expect(second.content).toBe(first.content); + }); + + it('upsert replaces an existing block in place, preserving sibling tables', () => { + const existing = [ + '[other_table]', + 'foo = "bar"', + '', + '[mcp_servers.codegraph]', + 'command = "old-codegraph"', + 'args = ["old"]', + '', + '[zzz]', + 'baz = "qux"', + '', + ].join('\n'); + const newBlock = buildTomlTable('mcp_servers.codegraph', { + command: 'codegraph', + args: ['serve', '--mcp'], + }); + const { content, action } = upsertTomlTable(existing, 'mcp_servers.codegraph', newBlock); + expect(action).toBe('replaced'); + expect(content).toContain('[other_table]'); + expect(content).toContain('foo = "bar"'); + expect(content).toContain('[zzz]'); + expect(content).toContain('baz = "qux"'); + expect(content).toContain('command = "codegraph"'); + expect(content).not.toContain('old-codegraph'); + }); + + it('removeTomlTable strips the block and preserves siblings', () => { + const existing = [ + '[other_table]', + 'foo = "bar"', + '', + '[mcp_servers.codegraph]', + 'command = "codegraph"', + 'args = ["serve"]', + ].join('\n'); + const { content, action } = removeTomlTable(existing, 'mcp_servers.codegraph'); + expect(action).toBe('removed'); + expect(content).toContain('[other_table]'); + expect(content).toContain('foo = "bar"'); + expect(content).not.toContain('mcp_servers.codegraph'); + }); + + it('removeTomlTable on missing table returns not-found, no content change', () => { + const existing = '[other]\nfoo = "bar"\n'; + const { content, action } = removeTomlTable(existing, 'mcp_servers.codegraph'); + expect(action).toBe('not-found'); + 
expect(content).toBe(existing); + }); + + it('upsert preserves an array-of-tables sibling [[foo]]', () => { + const existing = [ + '[[foo]]', + 'name = "a"', + '', + '[[foo]]', + 'name = "b"', + '', + ].join('\n'); + const block = buildTomlTable('mcp_servers.codegraph', { command: 'codegraph', args: ['serve'] }); + const { content } = upsertTomlTable(existing, 'mcp_servers.codegraph', block); + expect(content.match(/\[\[foo\]\]/g)?.length).toBe(2); + expect(content).toContain('[mcp_servers.codegraph]'); + }); +}); + +describe('Installer — uninstallTargets sweep (codegraph uninstall)', () => { + let tmpHome: string; + let tmpCwd: string; + let origCwd: string; + let homeRestore: { restore: () => void }; + + beforeEach(() => { + tmpHome = mkTmpDir('un-home'); + tmpCwd = mkTmpDir('un-cwd'); + origCwd = process.cwd(); + process.chdir(tmpCwd); + homeRestore = setHome(tmpHome); + }); + + afterEach(() => { + homeRestore.restore(); + process.chdir(origCwd); + fs.rmSync(tmpHome, { recursive: true, force: true }); + fs.rmSync(tmpCwd, { recursive: true, force: true }); + }); + + it('sweeps every agent it was installed on and reports removed for each (global)', () => { + for (const t of ALL_TARGETS) { + if (t.supportsLocation('global')) t.install('global', { autoAllow: true }); + } + + const reports = uninstallTargets(ALL_TARGETS, 'global'); + + for (const t of ALL_TARGETS) { + const r = reports.find((x) => x.id === t.id)!; + expect(r.status).toBe('removed'); + expect(r.removedPaths.length).toBeGreaterThan(0); + // The actual config is gone afterward. 
+ expect(t.detect('global').alreadyConfigured).toBe(false); + } + }); + + it('is safe on a clean slate — every agent reports not-configured, nothing removed', () => { + const reports = uninstallTargets(ALL_TARGETS, 'global'); + for (const r of reports) { + expect(r.status).toBe('not-configured'); + expect(r.removedPaths).toEqual([]); + } + }); + + it('reports removed only for agents that were actually configured', () => { + // Install on Claude only; the rest stay untouched. + getTarget('claude')!.install('global', { autoAllow: true }); + + const reports = uninstallTargets(ALL_TARGETS, 'global'); + + const claude = reports.find((r) => r.id === 'claude')!; + expect(claude.status).toBe('removed'); + expect(claude.displayName).toBe(getTarget('claude')!.displayName); + + for (const r of reports.filter((x) => x.id !== 'claude')) { + expect(r.status).toBe('not-configured'); + } + }); + + it('marks global-only agents as unsupported for a local sweep (and never touches them)', () => { + const reports = uninstallTargets(ALL_TARGETS, 'local'); + for (const t of ALL_TARGETS) { + const r = reports.find((x) => x.id === t.id)!; + if (t.supportsLocation('local')) { + expect(r.status).toBe('not-configured'); + } else { + expect(r.status).toBe('unsupported'); + expect(r.removedPaths).toEqual([]); + expect(r.notes[0]).toMatch(/global-only/); + } + } + }); + + it('is idempotent — a second sweep finds nothing left to remove', () => { + for (const t of ALL_TARGETS) { + if (t.supportsLocation('global')) t.install('global', { autoAllow: true }); + } + const first = uninstallTargets(ALL_TARGETS, 'global'); + expect(first.some((r) => r.status === 'removed')).toBe(true); + + const second = uninstallTargets(ALL_TARGETS, 'global'); + for (const r of second) { + expect(r.status).toBe('not-configured'); + expect(r.removedPaths).toEqual([]); + } + }); + + it('a --target subset removes only the chosen agents, leaving siblings configured', () => { + getTarget('claude')!.install('global', { 
autoAllow: true }); + getTarget('cursor')!.install('global', { autoAllow: true }); + + const reports = uninstallTargets(resolveTargetFlag('claude', 'global'), 'global'); + + expect(reports.map((r) => r.id)).toEqual(['claude']); + expect(reports[0].status).toBe('removed'); + // Cursor was not in the subset — still configured. + expect(getTarget('cursor')!.detect('global').alreadyConfigured).toBe(true); + expect(getTarget('claude')!.detect('global').alreadyConfigured).toBe(false); + }); +}); + +describe('Installer — Cursor rules file cleanup on uninstall', () => { + let tmpHome: string; + let tmpCwd: string; + let origCwd: string; + let homeRestore: { restore: () => void }; + const cursor = getTarget('cursor')!; + + beforeEach(() => { + tmpHome = mkTmpDir('cur-home'); + tmpCwd = mkTmpDir('cur-cwd'); + origCwd = process.cwd(); + process.chdir(tmpCwd); + homeRestore = setHome(tmpHome); + }); + + afterEach(() => { + homeRestore.restore(); + process.chdir(origCwd); + fs.rmSync(tmpHome, { recursive: true, force: true }); + fs.rmSync(tmpCwd, { recursive: true, force: true }); + }); + + const rulesFile = () => path.join(process.cwd(), '.cursor', 'rules', 'codegraph.mdc'); + + it('deletes the dedicated codegraph.mdc entirely (no orphaned frontmatter left behind)', () => { + cursor.install('local', { autoAllow: true }); + expect(fs.existsSync(rulesFile())).toBe(true); + + cursor.uninstall('local'); + + // The whole file — frontmatter included — is gone, not just the block. 
+ expect(fs.existsSync(rulesFile())).toBe(false); + expect(cursor.detect('local').alreadyConfigured).toBe(false); + }); + + it('preserves user content added outside the codegraph markers (strips only our block)', () => { + cursor.install('local', { autoAllow: true }); + const withUserContent = + fs.readFileSync(rulesFile(), 'utf-8') + '\n## My own rule\nkeep me\n'; + fs.writeFileSync(rulesFile(), withUserContent); + + cursor.uninstall('local'); + + expect(fs.existsSync(rulesFile())).toBe(true); + const after = fs.readFileSync(rulesFile(), 'utf-8'); + expect(after).toContain('keep me'); + // Our tool-usage block is gone. + expect(after).not.toContain('codegraph_search'); + expect(after).not.toContain('CODEGRAPH_START'); + }); +}); + +function listAllFiles(dir: string): string[] { + if (!fs.existsSync(dir)) return []; + const out: string[] = []; + for (const entry of fs.readdirSync(dir, { withFileTypes: true })) { + const full = path.join(dir, entry.name); + if (entry.isDirectory()) out.push(...listAllFiles(full)); + else out.push(full); + } + return out; +} diff --git a/__tests__/installer.test.ts b/__tests__/installer.test.ts index e2e24d1c..728ed7c3 100644 --- a/__tests__/installer.test.ts +++ b/__tests__/installer.test.ts @@ -48,21 +48,21 @@ describe('Installer Config Writer', () => { describe('readJsonFile error handling', () => { it('should return empty object for non-existent file', () => { - // writeMcpConfig reads claude.json - if it doesn't exist, it should create it + // writeMcpConfig reads .mcp.json - if it doesn't exist, it should create it writeMcpConfig('local'); - const claudeJson = path.join(tempDir, '.claude.json'); - expect(fs.existsSync(claudeJson)).toBe(true); + const mcpJson = path.join(tempDir, '.mcp.json'); + expect(fs.existsSync(mcpJson)).toBe(true); - const content = JSON.parse(fs.readFileSync(claudeJson, 'utf-8')); + const content = JSON.parse(fs.readFileSync(mcpJson, 'utf-8')); expect(content.mcpServers).toBeDefined(); 
expect(content.mcpServers.codegraph).toBeDefined(); }); it('should handle corrupted JSON by creating backup', () => { - // Create a corrupted claude.json - const claudeJson = path.join(tempDir, '.claude.json'); - fs.writeFileSync(claudeJson, '{ this is not valid json !!!'); + // Create a corrupted .mcp.json + const mcpJson = path.join(tempDir, '.mcp.json'); + fs.writeFileSync(mcpJson, '{ this is not valid json !!!'); // Suppress console.warn during test const warnSpy = vi.spyOn(console, 'warn').mockImplementation(() => {}); @@ -76,28 +76,28 @@ describe('Installer Config Writer', () => { expect(warnMsg).toContain('Warning'); // Backup should exist - expect(fs.existsSync(claudeJson + '.backup')).toBe(true); + expect(fs.existsSync(mcpJson + '.backup')).toBe(true); // Original backup content should be the corrupted content - const backup = fs.readFileSync(claudeJson + '.backup', 'utf-8'); + const backup = fs.readFileSync(mcpJson + '.backup', 'utf-8'); expect(backup).toContain('this is not valid json'); // New file should be valid JSON with codegraph config - const content = JSON.parse(fs.readFileSync(claudeJson, 'utf-8')); + const content = JSON.parse(fs.readFileSync(mcpJson, 'utf-8')); expect(content.mcpServers.codegraph).toBeDefined(); warnSpy.mockRestore(); }); it('should preserve existing valid config when adding codegraph', () => { - const claudeJson = path.join(tempDir, '.claude.json'); - fs.writeFileSync(claudeJson, JSON.stringify({ + const mcpJson = path.join(tempDir, '.mcp.json'); + fs.writeFileSync(mcpJson, JSON.stringify({ mcpServers: { other: { command: 'other-tool' } }, customField: 'preserved', }, null, 2)); writeMcpConfig('local'); - const content = JSON.parse(fs.readFileSync(claudeJson, 'utf-8')); + const content = JSON.parse(fs.readFileSync(mcpJson, 'utf-8')); expect(content.mcpServers.codegraph).toBeDefined(); expect(content.mcpServers.other).toBeDefined(); expect(content.customField).toBe('preserved'); @@ -125,9 +125,10 @@ describe('Installer Config 
Writer', () => { const modified = '## My Custom Section\n\nCustom content\n\n' + original + '\n\n## Another Section\n\nMore content\n'; fs.writeFileSync(claudeMdPath, modified); - // Second write should replace only the marked section - const result = writeClaudeMd('local'); - expect(result.updated).toBe(true); + // Second write should leave the marked block as-is (byte-identical + // body, so result is `created:false, updated:false` — both flags + // are off but the surrounding custom content must survive). + writeClaudeMd('local'); const final = fs.readFileSync(claudeMdPath, 'utf-8'); expect(final).toContain('## My Custom Section'); diff --git a/__tests__/integration/full-pipeline.test.ts b/__tests__/integration/full-pipeline.test.ts new file mode 100644 index 00000000..cb01aa5c --- /dev/null +++ b/__tests__/integration/full-pipeline.test.ts @@ -0,0 +1,244 @@ +/** + * End-to-end pipeline integration tests + * + * Exercises the full happy path that unit tests cover in isolation: + * init → indexAll → resolveReferences → searchNodes/getCallers/buildContext → sync + * + * Also covers two error paths that were previously uncovered: + * - Indexing a file that contains a syntactically invalid snippet + * (parse errors must not abort the batch). + * - Sync correctly applies adds + modifies + removes in a single pass. + * + * A synthetic ~120-file project is generated per test (5k files would + * dwarf the test runner; 120 files of varied TS shape is enough to + * stress the resolver and graph layers without slowing the suite to a + * crawl). 
+ */ + +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; +import CodeGraph from '../../src/index'; + +function createTempDir(prefix = 'codegraph-int-'): string { + return fs.mkdtempSync(path.join(os.tmpdir(), prefix)); +} + +function cleanupTempDir(dir: string): void { + if (fs.existsSync(dir)) { + fs.rmSync(dir, { recursive: true, force: true }); + } +} + +/** + * Generate a synthetic TypeScript project with the given module count. + * Each module exports a function that calls the previous module's + * function so that the resolver has real import edges + call edges to + * resolve. The first module is a leaf; the last is the root. + */ +function generateSyntheticProject(root: string, moduleCount: number): void { + const srcDir = path.join(root, 'src'); + fs.mkdirSync(srcDir, { recursive: true }); + + // Leaf module — no imports. + fs.writeFileSync( + path.join(srcDir, `mod0.ts`), + `export function fn0(x: number): number { return x + 1; }\n` + + `export class Mod0 { ping(): string { return 'mod0'; } }\n` + ); + + for (let i = 1; i < moduleCount; i++) { + const prev = i - 1; + fs.writeFileSync( + path.join(srcDir, `mod${i}.ts`), + `import { fn${prev}, Mod${prev} } from './mod${prev}';\n` + + `export function fn${i}(x: number): number { return fn${prev}(x) + 1; }\n` + + `export class Mod${i} extends Mod${prev} {\n` + + ` call${i}(): number { return fn${i}(${i}); }\n` + + `}\n` + ); + } + + // Entry point file. 
+ fs.writeFileSync( + path.join(srcDir, 'index.ts'), + `import { fn${moduleCount - 1}, Mod${moduleCount - 1} } from './mod${moduleCount - 1}';\n` + + `export function entry(): number {\n` + + ` const m = new Mod${moduleCount - 1}();\n` + + ` return fn${moduleCount - 1}(0) + m.call${moduleCount - 1}();\n` + + `}\n` + ); +} + +describe('Integration: full pipeline', () => { + let tempDir: string; + + beforeEach(() => { + tempDir = createTempDir(); + }); + + afterEach(() => { + cleanupTempDir(tempDir); + }); + + it('runs init → index → resolve → search → callers → context → sync', async () => { + const MODULE_COUNT = 120; + generateSyntheticProject(tempDir, MODULE_COUNT); + + // ── init ────────────────────────────────────────────────────── + const cg = await CodeGraph.init(tempDir, { + config: { include: ['**/*.ts'], exclude: [] }, + }); + + try { + // ── indexAll ──────────────────────────────────────────────── + const indexResult = await cg.indexAll(); + // Synthetic project: MODULE_COUNT mod files + 1 index file. + expect(indexResult.filesIndexed).toBeGreaterThanOrEqual(MODULE_COUNT); + + const statsAfterIndex = cg.getStats(); + expect(statsAfterIndex.fileCount).toBeGreaterThanOrEqual(MODULE_COUNT); + expect(statsAfterIndex.nodeCount).toBeGreaterThan(MODULE_COUNT * 2); + + // ── resolveReferences ──────────────────────────────────────── + // Many call-site edges are wired up during extraction itself, so + // the unresolved-reference queue may already be drained by the + // time we get here. We assert that resolve completes cleanly and + // returns a well-formed result; downstream callers/callees + // assertions verify the graph is actually populated. 
+ cg.reinitializeResolver(); + const resolution = cg.resolveReferences(); + expect(resolution).toBeDefined(); + expect(resolution.stats).toBeDefined(); + expect(typeof resolution.stats.total).toBe('number'); + expect(typeof resolution.stats.resolved).toBe('number'); + + // ── searchNodes ────────────────────────────────────────────── + const entryResults = cg.searchNodes('entry', { limit: 10 }); + expect(entryResults.length).toBeGreaterThan(0); + const entryNode = entryResults.find((r) => r.node.name === 'entry'); + expect(entryNode).toBeDefined(); + + const midResults = cg.searchNodes(`fn50`, { limit: 10 }); + expect(midResults.find((r) => r.node.name === 'fn50')).toBeDefined(); + + // ── getCallers / getCallees ────────────────────────────────── + const fn0Results = cg.searchNodes('fn0', { limit: 5 }); + const fn0Node = fn0Results.find((r) => r.node.name === 'fn0'); + expect(fn0Node).toBeDefined(); + const callers = cg.getCallers(fn0Node!.node.id); + // fn0 is called by fn1 (at least). After resolution this should + // be wired up. + expect(Array.isArray(callers)).toBe(true); + + // ── buildContext ───────────────────────────────────────────── + const context = await cg.buildContext('entry function chain', { + maxNodes: 10, + format: 'markdown', + }); + expect(typeof context).toBe('string'); + expect((context as string).length).toBeGreaterThan(0); + + // ── sync (add + modify + remove in one pass) ───────────────── + // Add: a new file referencing entry(). + fs.writeFileSync( + path.join(tempDir, 'src', 'consumer.ts'), + `import { entry } from './index';\nexport const result = entry();\n` + ); + // Modify: change mod0. 
+ fs.writeFileSync( + path.join(tempDir, 'src', 'mod0.ts'), + `export function fn0(x: number): number { return x + 2; }\n` + + `export function newHelper(): string { return 'new'; }\n` + + `export class Mod0 { ping(): string { return 'mod0v2'; } }\n` + ); + // Remove: drop mod1 — note this will leave dangling imports in + // mod2, which the resolver should tolerate. + fs.unlinkSync(path.join(tempDir, 'src', 'mod1.ts')); + + const syncResult = await cg.sync(); + expect(syncResult.filesAdded).toBeGreaterThanOrEqual(1); + expect(syncResult.filesModified).toBeGreaterThanOrEqual(1); + expect(syncResult.filesRemoved).toBeGreaterThanOrEqual(1); + + // New symbol must now be findable; removed file's symbols gone. + expect(cg.searchNodes('newHelper').length).toBeGreaterThan(0); + + // Removed file should no longer appear in the indexed file list. + // (FTS prefix matching makes name-based assertions unreliable here — + // Mod10/Mod11/… all start with "Mod1" — so we check the file set + // instead.) + const filesAfterSync = cg.getNodesInFile('src/mod1.ts'); + expect(filesAfterSync).toHaveLength(0); + } finally { + cg.destroy(); + } + }, 60_000); + + it('keeps indexing files when one file has a parse error', async () => { + const srcDir = path.join(tempDir, 'src'); + fs.mkdirSync(srcDir, { recursive: true }); + + // Valid files + fs.writeFileSync( + path.join(srcDir, 'good1.ts'), + `export function good1(): number { return 1; }\n` + ); + fs.writeFileSync( + path.join(srcDir, 'good2.ts'), + `export function good2(): number { return 2; }\n` + ); + // Intentionally broken file — unclosed brace, stray tokens. + fs.writeFileSync( + path.join(srcDir, 'broken.ts'), + `export function broken(\n this is { not valid typescript at all\n` + ); + + const cg = await CodeGraph.init(tempDir, { + config: { include: ['**/*.ts'], exclude: [] }, + }); + + try { + const result = await cg.indexAll(); + // The two good files must still be indexed regardless of the + // broken one. 
Tree-sitter is error-tolerant so it may still + // extract a partial AST from broken.ts — but the test only + // requires that the batch completes and finds the good symbols. + expect(result.filesIndexed).toBeGreaterThanOrEqual(2); + + const good1 = cg.searchNodes('good1'); + const good2 = cg.searchNodes('good2'); + expect(good1.find((r) => r.node.name === 'good1')).toBeDefined(); + expect(good2.find((r) => r.node.name === 'good2')).toBeDefined(); + } finally { + cg.destroy(); + } + }, 30_000); + + it('handles repeated sync calls when nothing has changed', async () => { + generateSyntheticProject(tempDir, 10); + + const cg = await CodeGraph.init(tempDir, { + config: { include: ['**/*.ts'], exclude: [] }, + }); + + try { + await cg.indexAll(); + const statsBefore = cg.getStats(); + + const first = await cg.sync(); + const second = await cg.sync(); + + // Subsequent sync with no changes should be a no-op. + expect(first.filesAdded + first.filesModified + first.filesRemoved).toBe(0); + expect(second.filesAdded + second.filesModified + second.filesRemoved).toBe(0); + + const statsAfter = cg.getStats(); + expect(statsAfter.fileCount).toBe(statsBefore.fileCount); + expect(statsAfter.nodeCount).toBe(statsBefore.nodeCount); + } finally { + cg.destroy(); + } + }, 30_000); +}); diff --git a/__tests__/integration/lru-cache.test.ts b/__tests__/integration/lru-cache.test.ts new file mode 100644 index 00000000..8156760a --- /dev/null +++ b/__tests__/integration/lru-cache.test.ts @@ -0,0 +1,96 @@ +/** + * LRUCache unit tests + * + * Covers the eviction guarantees that the resolver relies on: + * - capacity is enforced (never exceeds max) + * - LRU ordering: hot keys survive eviction passes + * - has()/get()/set()/clear() behave like the original Map shape + * - null values are storable (the fileCache uses null for "failed read") + */ + +import { describe, it, expect } from 'vitest'; +import { LRUCache } from '../../src/resolution/lru-cache'; + +describe('LRUCache', () => { + 
it('enforces capacity by evicting the oldest entry on overflow', () => { + const cache = new LRUCache(3); + cache.set('a', 1); + cache.set('b', 2); + cache.set('c', 3); + cache.set('d', 4); // evicts 'a' + + expect(cache.size).toBe(3); + expect(cache.has('a')).toBe(false); + expect(cache.get('a')).toBeUndefined(); + expect(cache.get('b')).toBe(2); + expect(cache.get('c')).toBe(3); + expect(cache.get('d')).toBe(4); + }); + + it('promotes touched keys to most-recent so they survive eviction', () => { + const cache = new LRUCache(3); + cache.set('a', 1); + cache.set('b', 2); + cache.set('c', 3); + + // Touch 'a' — it should now be most-recent. + expect(cache.get('a')).toBe(1); + + cache.set('d', 4); // evicts the LRU, which is now 'b' (not 'a') + + expect(cache.has('a')).toBe(true); + expect(cache.has('b')).toBe(false); + expect(cache.has('c')).toBe(true); + expect(cache.has('d')).toBe(true); + }); + + it('overwriting an existing key refreshes its recency but does not grow size', () => { + const cache = new LRUCache(2); + cache.set('a', 1); + cache.set('b', 2); + cache.set('a', 99); // 'a' is now most-recent + + expect(cache.size).toBe(2); + expect(cache.get('a')).toBe(99); + + cache.set('c', 3); // should evict 'b', not 'a' + + expect(cache.has('a')).toBe(true); + expect(cache.has('b')).toBe(false); + expect(cache.has('c')).toBe(true); + }); + + it('stores null values (used by the file content cache)', () => { + const cache = new LRUCache(2); + cache.set('missing.ts', null); + expect(cache.has('missing.ts')).toBe(true); + expect(cache.get('missing.ts')).toBeNull(); + }); + + it('clear() resets the cache', () => { + const cache = new LRUCache(3); + cache.set('a', 1); + cache.set('b', 2); + cache.clear(); + expect(cache.size).toBe(0); + expect(cache.has('a')).toBe(false); + }); + + it('rejects non-positive capacity', () => { + expect(() => new LRUCache(0)).toThrow(); + expect(() => new LRUCache(-1)).toThrow(); + expect(() => new LRUCache(NaN)).toThrow(); + }); + + 
it('stays bounded under heavy churn (regression for OOM scenario)', () => { + const cache = new LRUCache(100); + for (let i = 0; i < 10_000; i++) { + cache.set(`key${i}`, i); + } + expect(cache.size).toBe(100); + // The last 100 keys should still be present, the rest evicted. + expect(cache.has('key9999')).toBe(true); + expect(cache.has('key9900')).toBe(true); + expect(cache.has('key0')).toBe(false); + }); +}); diff --git a/__tests__/integration/mcp-input-limits.test.ts b/__tests__/integration/mcp-input-limits.test.ts new file mode 100644 index 00000000..495d4933 --- /dev/null +++ b/__tests__/integration/mcp-input-limits.test.ts @@ -0,0 +1,109 @@ +/** + * MCP tool input-size limits + * + * Regression coverage for the DoS vector: MCP clients can ship + * unbounded payloads (`query`, `task`, `symbol`, `projectPath`, + * `path`, `pattern`). Before the cap, a 100MB string would hit + * the FTS5 layer and pin the server. These tests assert that the + * tool layer rejects oversize inputs early. 
+ */ + +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; +import CodeGraph from '../../src/index'; +import { ToolHandler } from '../../src/mcp/tools'; + +describe('MCP input size limits', () => { + let tempDir: string; + let cg: CodeGraph; + let handler: ToolHandler; + + beforeEach(async () => { + tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-mcp-limits-')); + fs.mkdirSync(path.join(tempDir, 'src'), { recursive: true }); + fs.writeFileSync( + path.join(tempDir, 'src', 'a.ts'), + `export function alpha(): number { return 1; }\n` + ); + cg = await CodeGraph.init(tempDir, { + config: { include: ['**/*.ts'], exclude: [] }, + }); + await cg.indexAll(); + handler = new ToolHandler(cg); + }); + + afterEach(() => { + if (cg) cg.destroy(); + if (fs.existsSync(tempDir)) { + fs.rmSync(tempDir, { recursive: true, force: true }); + } + }); + + it('accepts a normal-sized query', async () => { + const result = await handler.execute('codegraph_search', { query: 'alpha' }); + expect(result.isError).toBeFalsy(); + }); + + it('rejects an oversize query on codegraph_search', async () => { + const huge = 'a'.repeat(20_000); + const result = await handler.execute('codegraph_search', { query: huge }); + expect(result.isError).toBe(true); + expect(result.content[0]!.text).toMatch(/maximum length/i); + }); + + it('rejects an oversize task on codegraph_context', async () => { + const huge = 'b'.repeat(50_000); + const result = await handler.execute('codegraph_context', { task: huge }); + expect(result.isError).toBe(true); + expect(result.content[0]!.text).toMatch(/maximum length/i); + }); + + it('rejects an oversize symbol on codegraph_callers', async () => { + const huge = 'c'.repeat(15_000); + const result = await handler.execute('codegraph_callers', { symbol: huge }); + expect(result.isError).toBe(true); + expect(result.content[0]!.text).toMatch(/maximum length/i); + }); 
+ + it('rejects an oversize symbol on codegraph_impact', async () => { + const huge = 'd'.repeat(11_000); + const result = await handler.execute('codegraph_impact', { symbol: huge }); + expect(result.isError).toBe(true); + expect(result.content[0]!.text).toMatch(/maximum length/i); + }); + + it('rejects an oversize projectPath', async () => { + const hugePath = '/tmp/' + 'x'.repeat(5_000); + const result = await handler.execute('codegraph_search', { + query: 'alpha', + projectPath: hugePath, + }); + expect(result.isError).toBe(true); + expect(result.content[0]!.text).toMatch(/projectPath/); + }); + + it('rejects an oversize path filter on codegraph_files', async () => { + const hugePath = 'src/' + 'y'.repeat(5_000); + const result = await handler.execute('codegraph_files', { path: hugePath }); + expect(result.isError).toBe(true); + expect(result.content[0]!.text).toMatch(/path/); + }); + + it('rejects an oversize glob pattern on codegraph_files', async () => { + const hugePattern = '*'.repeat(5_000); + const result = await handler.execute('codegraph_files', { pattern: hugePattern }); + expect(result.isError).toBe(true); + expect(result.content[0]!.text).toMatch(/pattern/); + }); + + it('rejects a non-string projectPath', async () => { + const result = await handler.execute('codegraph_search', { + query: 'alpha', + projectPath: 12345 as unknown as string, + }); + expect(result.isError).toBe(true); + expect(result.content[0]!.text).toMatch(/projectPath/); + }); +}); diff --git a/__tests__/is-test-file.test.ts b/__tests__/is-test-file.test.ts new file mode 100644 index 00000000..e3fc6d03 --- /dev/null +++ b/__tests__/is-test-file.test.ts @@ -0,0 +1,53 @@ +/** + * isTestFile heuristic — test-file detection used to deprioritize test code in + * search/explore ranking. 
+ * + * Regression coverage for the cold-query fix: the heuristic previously only + * knew Java/JS/Python conventions, so Kotlin (`*Test.kt`, `jvmTest/`), Swift + * (`*Tests.swift`), and camelCase test source-set dirs slipped through — which + * let OkHttp's tests flood `codegraph_explore` results on a plain-language + * query. The false-positive guards matter just as much: `latest.kt` / + * `manifest.kt` / a `RealCall.kt` production file must NOT be flagged. + */ +import { describe, it, expect } from 'vitest'; +import { isTestFile } from '../src/search/query-utils'; + +describe('isTestFile', () => { + it('flags Kotlin test files and source sets', () => { + expect(isTestFile('okhttp/src/jvmTest/kotlin/okhttp3/CallTest.kt')).toBe(true); + expect(isTestFile('okhttp/src/commonTest/kotlin/okhttp3/CompressionInterceptorTest.kt')).toBe(true); + expect(isTestFile('app/src/androidTest/java/com/example/FooTest.kt')).toBe(true); + expect(isTestFile('module/src/integrationTest/kotlin/BarSpec.kt')).toBe(true); + }); + + it('flags Swift test files', () => { + expect(isTestFile('Tests/SessionTests.swift')).toBe(true); + expect(isTestFile('Sources/FooTest.swift')).toBe(true); + }); + + it('still flags the previously-supported conventions', () => { + expect(isTestFile('foo/test_bar.py')).toBe(true); + expect(isTestFile('pkg/bar_test.go')).toBe(true); + expect(isTestFile('src/foo.test.ts')).toBe(true); + expect(isTestFile('src/foo.spec.ts')).toBe(true); + expect(isTestFile('com/example/FooTest.java')).toBe(true); + expect(isTestFile('com/example/FooTestCase.java')).toBe(true); + expect(isTestFile('project/__tests__/foo.ts')).toBe(true); + expect(isTestFile('project/tests/foo.rb')).toBe(true); + }); + + it('does NOT flag production files that merely contain "test" lowercase', () => { + // The fix is capital-led so camelCase boundaries distinguish these. 
+ expect(isTestFile('src/latest/loader.kt')).toBe(false); + expect(isTestFile('lib/manifest.kt')).toBe(false); + expect(isTestFile('okhttp/src/jvmMain/kotlin/okhttp3/internal/connection/RealCall.kt')).toBe(false); + expect(isTestFile('src/contestEntry.ts')).toBe(false); + expect(isTestFile('pkg/greatest.go')).toBe(false); + }); + + it('does NOT flag ordinary production source', () => { + expect(isTestFile('src/flask/app.py')).toBe(false); + expect(isTestFile('src/vs/workbench/api/common/extensionHostMain.ts')).toBe(false); + expect(isTestFile('okhttp/src/commonJvmAndroid/kotlin/okhttp3/OkHttpClient.kt')).toBe(false); + }); +}); diff --git a/__tests__/mcp-initialize.test.ts b/__tests__/mcp-initialize.test.ts new file mode 100644 index 00000000..4a57ebae --- /dev/null +++ b/__tests__/mcp-initialize.test.ts @@ -0,0 +1,149 @@ +/** + * MCP `initialize` handshake regression tests. + * + * Issue #172: on slow filesystems (Docker Desktop VirtioFS on macOS, WSL2), + * the MCP server was blocking the initialize response on CodeGraph.open() and + * Parser.init() (web-tree-sitter WASM bootstrap), which could take longer than + * Claude Code's ~30s handshake timeout. The child process stayed alive and + * had received the request, but never sent a response, so tools never + * appeared in the client. The fix sends the initialize response before + * kicking off the heavy init in the background. These tests guard the + * contract that initialize is fast regardless of how much work init does. 
+ */ +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import { spawn, ChildProcessWithoutNullStreams } from 'child_process'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; +import { CodeGraph } from '../src'; + +const BIN = path.resolve(__dirname, '../dist/bin/codegraph.js'); + +function spawnServer(cwd: string): ChildProcessWithoutNullStreams { + return spawn(process.execPath, [BIN, 'serve', '--mcp'], { + cwd, + stdio: ['pipe', 'pipe', 'pipe'], + }) as ChildProcessWithoutNullStreams; +} + +function sendInitialize(child: ChildProcessWithoutNullStreams, projectPath: string) { + const msg = JSON.stringify({ + jsonrpc: '2.0', + id: 0, + method: 'initialize', + params: { + protocolVersion: '2025-11-25', + capabilities: {}, + clientInfo: { name: 'test', version: '0.0.0' }, + rootUri: `file://${projectPath}`, + }, + }); + child.stdin.write(msg + '\n'); +} + +/** + * Collect stdout lines and stderr text from the child, tagging each piece + * with a monotonic sequence number. Lets us assert ordering between the + * JSON-RPC response (stdout) and side-effect logs (stderr). 
+ */ +function tagStreams(child: ChildProcessWithoutNullStreams) { + const events: Array<{ seq: number; stream: 'stdout' | 'stderr'; text: string }> = []; + let seq = 0; + let stdoutBuf = ''; + let stderrBuf = ''; + child.stdout.on('data', (chunk) => { + stdoutBuf += chunk.toString('utf8'); + let idx; + while ((idx = stdoutBuf.indexOf('\n')) !== -1) { + const line = stdoutBuf.slice(0, idx); + stdoutBuf = stdoutBuf.slice(idx + 1); + events.push({ seq: seq++, stream: 'stdout', text: line }); + } + }); + child.stderr.on('data', (chunk) => { + stderrBuf += chunk.toString('utf8'); + let idx; + while ((idx = stderrBuf.indexOf('\n')) !== -1) { + const line = stderrBuf.slice(0, idx); + stderrBuf = stderrBuf.slice(idx + 1); + events.push({ seq: seq++, stream: 'stderr', text: line }); + } + }); + return events; +} + +function waitFor( + events: ReadonlyArray<{ seq: number; stream: string; text: string }>, + predicate: (e: { seq: number; stream: string; text: string }) => boolean, + timeoutMs: number, +): Promise<{ seq: number; stream: string; text: string }> { + return new Promise((resolve, reject) => { + const started = Date.now(); + const tick = () => { + const hit = events.find(predicate); + if (hit) return resolve(hit); + if (Date.now() - started > timeoutMs) { + return reject(new Error(`Timed out waiting for predicate. 
Events: ${JSON.stringify(events)}`)); + } + setTimeout(tick, 20); + }; + tick(); + }); +} + +describe('MCP initialize handshake (issue #172)', () => { + let tempDir: string; + let child: ChildProcessWithoutNullStreams | null = null; + + beforeEach(() => { + tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-mcp-init-')); + }); + + afterEach(() => { + if (child && !child.killed) { + child.kill('SIGKILL'); + child = null; + } + fs.rmSync(tempDir, { recursive: true, force: true }); + }); + + it('responds to initialize quickly when no .codegraph exists in cwd', async () => { + child = spawnServer(tempDir); + const events = tagStreams(child); + sendInitialize(child, tempDir); + const response = await waitFor(events, (e) => e.stream === 'stdout', 5000); + const json = JSON.parse(response.text); + expect(json.jsonrpc).toBe('2.0'); + expect(json.id).toBe(0); + expect(json.result.protocolVersion).toBeDefined(); + expect(json.result.capabilities.tools).toBeDefined(); + }, 10000); + + it('sends initialize response BEFORE tryInitializeDefault finishes', async () => { + // Seed a real .codegraph so the server's tryInitializeDefault path runs + // its full body: CodeGraph.open() (which awaits initGrammars()) and then + // startWatching() (which logs "File watcher active" to stderr). On any + // platform, that stderr log is observable evidence that tryInitializeDefault + // has completed. The contract we're protecting: the JSON-RPC response on + // stdout must arrive BEFORE that stderr log. If a future change re-awaits + // tryInitializeDefault before sendResult, this ordering inverts and the + // test fails — regardless of how fast the local filesystem is. 
+ const cg = await CodeGraph.init(tempDir); + cg.close(); + + child = spawnServer(tempDir); + const events = tagStreams(child); + sendInitialize(child, tempDir); + + const response = await waitFor(events, (e) => e.stream === 'stdout', 10000); + const watcherLog = await waitFor( + events, + (e) => e.stream === 'stderr' && e.text.includes('File watcher active'), + 10000, + ); + expect(response.seq).toBeLessThan(watcherLog.seq); + const json = JSON.parse(response.text); + expect(json.id).toBe(0); + expect(json.result.serverInfo.name).toBe('codegraph'); + }, 20000); +}); diff --git a/__tests__/mcp-ppid-watchdog.test.ts b/__tests__/mcp-ppid-watchdog.test.ts new file mode 100644 index 00000000..0e3dc188 --- /dev/null +++ b/__tests__/mcp-ppid-watchdog.test.ts @@ -0,0 +1,168 @@ +/** + * PPID watchdog regression test (#277). + * + * On Linux, when an MCP host (Claude Code, opencode, …) is SIGKILL'd by the + * OOM killer / a force-quit / a container teardown, the kernel does NOT + * propagate the death to its `codegraph serve --mcp` child. The child gets + * reparented to init/systemd, its stdin stays half-open in some + * configurations, and the existing `stdin.on('end' | 'close')` handlers + * never fire — the server lingers indefinitely, holding inotify watches, + * file descriptors, and the SQLite WAL. + * + * `src/mcp/index.ts` polls `process.ppid` and shuts down the moment it + * diverges from the value observed at startup. This test stands up a + * four-tier process tree (vitest → wrapper → {stdin-holder, codegraph}) and + * SIGKILL's the wrapper. The stdin-holder is a long-lived sibling whose + * `stdout` pipe is dup'd into codegraph's `stdin`. After the wrapper dies + * the pipe stays open (stdin-holder still owns the write-end), so the + * existing stdin close handlers do **not** fire — the only thing that can + * terminate codegraph then is the PPID watchdog. 
+ * + * Windows is excluded — `process.kill(pid, 'SIGKILL')` does not actually + * deliver SIGKILL there, and the per-OS reparenting semantics the watchdog + * relies on are POSIX-specific. + */ +import { describe, it, expect, afterEach } from 'vitest'; +import { spawn, ChildProcessWithoutNullStreams } from 'child_process'; +import * as fs from 'fs'; +import * as os from 'os'; +import * as path from 'path'; + +const BIN = path.resolve(__dirname, '../dist/bin/codegraph.js'); + +function isAlive(pid: number): boolean { + try { + process.kill(pid, 0); + return true; + } catch { + return false; + } +} + +function waitForExit(pid: number, timeoutMs: number): Promise<boolean> { + return new Promise((resolve) => { + const start = Date.now(); + const tick = () => { + if (!isAlive(pid)) return resolve(true); + if (Date.now() - start > timeoutMs) return resolve(false); + setTimeout(tick, 100); + }; + tick(); + }); +} + +describe.skipIf(process.platform === 'win32')('MCP PPID watchdog (#277)', () => { + let wrapper: ChildProcessWithoutNullStreams | null = null; + let childPid: number | null = null; + let stdinHolderPid: number | null = null; + + afterEach(() => { + if (wrapper && !wrapper.killed) { + try { wrapper.kill('SIGKILL'); } catch { /* already gone */ } + } + // Belt and suspenders — don't leak processes if an assertion failed. + for (const pid of [childPid, stdinHolderPid]) { + if (pid !== null && isAlive(pid)) { + try { process.kill(pid, 'SIGKILL'); } catch { /* already gone */ } + } + } + wrapper = null; + childPid = null; + stdinHolderPid = null; + }); + + it("shuts down when its parent is SIGKILL'd and stdin stays open", async () => { + // The wrapper: + // 1. Spawns a "stdin-holder" — a tiny long-lived node process whose + // `stdout` pipe is dup'd into codegraph's `stdin`. As long as the + // stdin-holder is alive (it is — it's an orphan after the wrapper + // dies), codegraph's stdin never sees EOF. + // 2. 
Spawns codegraph with that pipe as fd 0 and its stderr redirected + // to a tmp file that survives the wrapper, then reports both PIDs. + // 3. Idles until SIGKILL'd from the test. + // + // CODEGRAPH_PPID_POLL_MS=200 keeps the watchdog responsive in test; the + // production default is 5000ms. + const stderrLog = path.join( + fs.mkdtempSync(path.join(os.tmpdir(), 'cg-ppid-watchdog-')), + 'codegraph.stderr.log', + ); + // The wrapper waits 800ms before reporting the PIDs so the codegraph + // child has time to finish its async start() (dynamic import + transport + // setup + watchdog registration). Otherwise the test races: it + // SIGKILL's the wrapper before the watchdog interval is installed, and + // nothing terminates codegraph. + const wrapperSrc = ` + const { spawn } = require('child_process'); + const fs = require('fs'); + const stderrFd = fs.openSync(${JSON.stringify(stderrLog)}, 'a'); + const stdinHolder = spawn(process.execPath, ['-e', 'setInterval(() => {}, 60000)'], { + stdio: ['ignore', 'pipe', 'ignore'], + detached: true, + }); + stdinHolder.unref(); + const child = spawn(process.execPath, [${JSON.stringify(BIN)}, 'serve', '--mcp'], { + stdio: [stdinHolder.stdout, 'ignore', stderrFd], + env: { ...process.env, CODEGRAPH_PPID_POLL_MS: '200' }, + detached: true, + }); + child.unref(); + setTimeout(() => { + process.stdout.write(JSON.stringify({ pid: child.pid, stdinHolderPid: stdinHolder.pid }) + '\\n'); + }, 800); + setInterval(() => {}, 60000); + `; + wrapper = spawn(process.execPath, ['-e', wrapperSrc], { + stdio: ['pipe', 'pipe', 'pipe'], + }) as ChildProcessWithoutNullStreams; + + const pids = await new Promise<{ pid: number; stdinHolderPid: number }>((resolve, reject) => { + let buf = ''; + const timer = setTimeout( + () => reject(new Error('wrapper did not report PIDs in time')), + 10000, + ); + wrapper!.stdout.on('data', (chunk: Buffer) => { + buf += chunk.toString('utf8'); + const m = buf.match(/\{"pid":(\d+),"stdinHolderPid":(\d+)\}/); + if 
(m) { + clearTimeout(timer); + resolve({ pid: parseInt(m[1], 10), stdinHolderPid: parseInt(m[2], 10) }); + } + }); + wrapper!.on('exit', () => { + clearTimeout(timer); + reject(new Error('wrapper exited before reporting PIDs')); + }); + }); + childPid = pids.pid; + stdinHolderPid = pids.stdinHolderPid; + + expect(isAlive(childPid)).toBe(true); + expect(isAlive(stdinHolderPid)).toBe(true); + + // SIGKILL the wrapper — no cleanup runs, just like a real OOM kill. + // codegraph and the stdin-holder both get reparented to init/systemd. + // Crucially, the pipe between them stays open, so codegraph's stdin + // doesn't close: only the watchdog can take it down. + wrapper.kill('SIGKILL'); + + // Watchdog runs every 200ms in this test → 5s gives ~25 polls of headroom. + const exited = await waitForExit(childPid, 5000); + const stderrContent = fs.existsSync(stderrLog) ? fs.readFileSync(stderrLog, 'utf-8') : ''; + expect( + exited, + `codegraph child (pid=${childPid}) did not exit within 5s after wrapper was SIGKILL'd.\nstderr:\n${stderrContent}`, + ).toBe(true); + // The watchdog announces itself before tearing down — assert that the + // shutdown came from the parent-death path, not from any other signal. + expect(stderrContent).toMatch(/Parent process exited.*shutting down/); + + // The stdin-holder is now an orphan — kill it explicitly so it doesn't + // outlive the test. It's still tracked in `stdinHolderPid` for the + // afterEach safety net, but we tidy up proactively here too. + if (isAlive(stdinHolderPid)) { + try { process.kill(stdinHolderPid, 'SIGKILL'); } catch { /* race */ } + } + }, 20000); +}); diff --git a/__tests__/mcp-roots.test.ts b/__tests__/mcp-roots.test.ts new file mode 100644 index 00000000..8e1d4520 --- /dev/null +++ b/__tests__/mcp-roots.test.ts @@ -0,0 +1,180 @@ +/** + * MCP project-resolution regression tests (issue #196). 
+ * + * When an MCP client launches the server outside the project directory AND + * doesn't pass a `rootUri`/`workspaceFolders` in `initialize`, the server used + * to fall straight back to `process.cwd()` — which for many IDE clients is the + * wrong directory. Every tool call without an explicit `projectPath` then + * failed with a misleading "CodeGraph not initialized. Run 'codegraph init'." + * + * The fix: when no explicit path is provided, the server asks the client for + * its workspace root via the spec-blessed `roots/list` request (if the client + * advertised the `roots` capability), and only falls back to cwd otherwise. + * When it still can't resolve, the error now says exactly how to fix it. + * + * These tests drive the real stdio transport via a spawned subprocess — no + * mocking — so they also exercise the new bidirectional request/response path. + */ +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import { spawn, ChildProcessWithoutNullStreams } from 'child_process'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; +import { CodeGraph } from '../src'; + +const BIN = path.resolve(__dirname, '../dist/bin/codegraph.js'); + +function spawnServer(cwd: string): ChildProcessWithoutNullStreams { + // --no-watch keeps the test deterministic and avoids watcher startup noise. + return spawn(process.execPath, [BIN, 'serve', '--mcp', '--no-watch'], { + cwd, + stdio: ['pipe', 'pipe', 'pipe'], + }) as ChildProcessWithoutNullStreams; +} + +/** Parse every JSON-RPC message the server writes to stdout into an array. 
*/ +function collectMessages(child: ChildProcessWithoutNullStreams): Array<Record<string, any>> { + const messages: Array<Record<string, any>> = []; + let buf = ''; + child.stdout.on('data', (chunk) => { + buf += chunk.toString('utf8'); + let idx; + while ((idx = buf.indexOf('\n')) !== -1) { + const line = buf.slice(0, idx).trim(); + buf = buf.slice(idx + 1); + if (!line) continue; + try { messages.push(JSON.parse(line)); } catch { /* ignore non-JSON */ } + } + }); + return messages; +} + +function waitForMessage( + messages: ReadonlyArray<Record<string, any>>, + predicate: (m: Record<string, any>) => boolean, + timeoutMs: number, +): Promise<Record<string, any>> { + return new Promise((resolve, reject) => { + const started = Date.now(); + const tick = () => { + const hit = messages.find(predicate); + if (hit) return resolve(hit); + if (Date.now() - started > timeoutMs) { + return reject(new Error(`Timed out. Messages so far: ${JSON.stringify(messages)}`)); + } + setTimeout(tick, 20); + }; + tick(); + }); +} + +function send(child: ChildProcessWithoutNullStreams, msg: object): void { + child.stdin.write(JSON.stringify(msg) + '\n'); +} + +const CLIENT_INFO = { name: 'test', version: '0.0.0' }; + +describe('MCP project resolution via roots/list (issue #196)', () => { + let cwdDir: string; // where the server is launched — has NO .codegraph + let projectDir: string; // the real indexed project the client reports + let child: ChildProcessWithoutNullStreams | null = null; + + beforeEach(() => { + cwdDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-mcp-cwd-')); + projectDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-mcp-proj-')); + }); + + afterEach(() => { + if (child && !child.killed) { + child.kill('SIGKILL'); + child = null; + } + fs.rmSync(cwdDir, { recursive: true, force: true }); + fs.rmSync(projectDir, { recursive: true, force: true }); + }); + + it('resolves the project from the client roots/list when no rootUri is sent', async () => { + const cg = await CodeGraph.init(projectDir); + cg.close(); + + child = spawnServer(cwdDir); + const 
messages = collectMessages(child); + + // Advertise the roots capability but pass NO rootUri/workspaceFolders. + send(child, { + jsonrpc: '2.0', id: 0, method: 'initialize', + params: { protocolVersion: '2025-11-25', capabilities: { roots: {} }, clientInfo: CLIENT_INFO }, + }); + await waitForMessage(messages, (m) => m.id === 0 && !!m.result, 5000); + send(child, { jsonrpc: '2.0', method: 'notifications/initialized' }); + + // First tool call (no projectPath) drives the server to ask us for roots. + send(child, { jsonrpc: '2.0', id: 1, method: 'tools/call', params: { name: 'codegraph_status', arguments: {} } }); + + const rootsReq = await waitForMessage(messages, (m) => m.method === 'roots/list', 5000); + expect(typeof rootsReq.id).toBe('string'); // server-initiated id + send(child, { + jsonrpc: '2.0', id: rootsReq.id, + result: { roots: [{ uri: `file://${projectDir}`, name: 'proj' }] }, + }); + + // The status call now succeeds against the resolved project. + const resp = await waitForMessage(messages, (m) => m.id === 1, 8000); + const text = resp.result.content[0].text as string; + expect(text).toContain('CodeGraph Status'); + expect(text).not.toContain('No CodeGraph project is loaded'); + }, 20000); + + it('returns an actionable error when there is no rootUri and no roots capability', async () => { + child = spawnServer(cwdDir); + const messages = collectMessages(child); + + send(child, { + jsonrpc: '2.0', id: 0, method: 'initialize', + params: { protocolVersion: '2025-11-25', capabilities: {}, clientInfo: CLIENT_INFO }, + }); + await waitForMessage(messages, (m) => m.id === 0 && !!m.result, 5000); + send(child, { jsonrpc: '2.0', method: 'notifications/initialized' }); + + send(child, { jsonrpc: '2.0', id: 1, method: 'tools/call', params: { name: 'codegraph_status', arguments: {} } }); + const resp = await waitForMessage(messages, (m) => m.id === 1, 8000); + const text = resp.result.content[0].text as string; + + expect(text).toContain('No CodeGraph project is 
loaded'); + expect(text).toContain('projectPath'); + expect(text).toContain('--path'); + // Names the directory it actually searched (the wrong cwd) so the user can + // see why detection missed. basename survives any symlink realpath-ing. + expect(text).toContain(path.basename(cwdDir)); + // It must not have hung waiting on roots/list — the client never offered it. + expect(messages.some((m) => m.method === 'roots/list')).toBe(false); + }, 20000); + + it('honors an explicit rootUri without asking the client for roots', async () => { + const cg = await CodeGraph.init(projectDir); + cg.close(); + + child = spawnServer(cwdDir); + const messages = collectMessages(child); + + send(child, { + jsonrpc: '2.0', id: 0, method: 'initialize', + params: { + protocolVersion: '2025-11-25', + capabilities: { roots: {} }, + clientInfo: CLIENT_INFO, + rootUri: `file://${projectDir}`, + }, + }); + await waitForMessage(messages, (m) => m.id === 0 && !!m.result, 5000); + send(child, { jsonrpc: '2.0', method: 'notifications/initialized' }); + + send(child, { jsonrpc: '2.0', id: 1, method: 'tools/call', params: { name: 'codegraph_status', arguments: {} } }); + const resp = await waitForMessage(messages, (m) => m.id === 1, 8000); + const text = resp.result.content[0].text as string; + + expect(text).toContain('CodeGraph Status'); + // rootUri is a stronger signal than roots — we never needed to ask. + expect(messages.some((m) => m.method === 'roots/list')).toBe(false); + }, 20000); +}); diff --git a/__tests__/mcp-tool-allowlist.test.ts b/__tests__/mcp-tool-allowlist.test.ts new file mode 100644 index 00000000..6f29616d --- /dev/null +++ b/__tests__/mcp-tool-allowlist.test.ts @@ -0,0 +1,58 @@ +/** + * CODEGRAPH_MCP_TOOLS allowlist — lets an operator (or an A/B harness) trim the + * exposed MCP tool surface without touching the client config. Inert when unset. + * Filtering happens in ListTools (getTools) and is enforced again on execute(). 
+ */ +import { describe, it, expect, afterEach } from 'vitest'; +import { ToolHandler } from '../src/mcp/tools'; + +const ENV = 'CODEGRAPH_MCP_TOOLS'; + +describe('CODEGRAPH_MCP_TOOLS allowlist', () => { + const original = process.env[ENV]; + afterEach(() => { + if (original === undefined) delete process.env[ENV]; + else process.env[ENV] = original; + }); + + const listed = () => new ToolHandler(null).getTools().map(t => t.name).sort(); + + it('exposes the full tool surface when unset', () => { + delete process.env[ENV]; + const all = listed(); + expect(all).toContain('codegraph_explore'); + expect(all).toContain('codegraph_context'); + expect(all).toContain('codegraph_trace'); + expect(all.length).toBeGreaterThanOrEqual(10); + }); + + it('filters ListTools to the allowlisted short names', () => { + process.env[ENV] = 'trace,search,node'; + expect(listed()).toEqual(['codegraph_node', 'codegraph_search', 'codegraph_trace']); + }); + + it('accepts fully-qualified codegraph_ names and ignores whitespace', () => { + process.env[ENV] = ' codegraph_trace , search '; + expect(listed()).toEqual(['codegraph_search', 'codegraph_trace']); + }); + + it('treats an empty/whitespace value as unset (full surface)', () => { + process.env[ENV] = ' '; + expect(listed().length).toBeGreaterThanOrEqual(10); + }); + + it('rejects a disabled tool on execute (defense in depth)', async () => { + process.env[ENV] = 'trace'; + const res = await new ToolHandler(null).execute('codegraph_explore', {}); + expect(res.isError).toBe(true); + expect(res.content[0].text).toMatch(/disabled via CODEGRAPH_MCP_TOOLS/); + }); + + it('lets an allowlisted tool past the guard', async () => { + process.env[ENV] = 'search'; + // No CodeGraph attached, so it fails *after* the allowlist guard — the + // "disabled" message must NOT appear, proving the guard passed it through. 
+ const res = await new ToolHandler(null).execute('codegraph_search', { query: 'x' }); + expect(res.content[0].text).not.toMatch(/disabled via CODEGRAPH_MCP_TOOLS/); + }); +}); diff --git a/__tests__/node-sqlite-backend.test.ts b/__tests__/node-sqlite-backend.test.ts new file mode 100644 index 00000000..d1e630f6 --- /dev/null +++ b/__tests__/node-sqlite-backend.test.ts @@ -0,0 +1,71 @@ +/** + * node:sqlite backend (issue #238 follow-up). + * + * node:sqlite (Node's built-in real SQLite) is now the sole backend. This drives + * a real index + queries through it, so WAL, FTS5 search, and @named-param + * writes are all exercised end-to-end. + * + * Skipped on Node < 22.5 where node:sqlite doesn't exist. + */ + +import { describe, it, expect, beforeAll, afterAll } from 'vitest'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; +import CodeGraph from '../src'; + +let nodeSqliteAvailable = false; +try { + // eslint-disable-next-line @typescript-eslint/no-require-imports + require('node:sqlite'); + nodeSqliteAvailable = true; +} catch { + nodeSqliteAvailable = false; +} + +describe.skipIf(!nodeSqliteAvailable)('node:sqlite backend — real index + queries', () => { + let dir: string; + let cg: CodeGraph; + + beforeAll(async () => { + dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-nodesqlite-')); + fs.writeFileSync(path.join(dir, 'a.ts'), 'export function helper(): number { return 1; }\n'); + fs.writeFileSync( + path.join(dir, 'b.ts'), + "import { helper } from './a';\nexport function main(): number { return helper(); }\n" + ); + cg = await CodeGraph.init(dir, { index: true }); + }); + + afterAll(() => { + cg?.close(); + fs.rmSync(dir, { recursive: true, force: true }); + }); + + it('uses the node:sqlite backend', () => { + expect(cg.getBackend()).toBe('node-sqlite'); + }); + + it('runs in WAL mode — the whole reason it beats the wasm fallback', () => { + expect(cg.getJournalMode()).toBe('wal'); + }); + + it('indexed the project (write 
path: @named-param INSERTs via node:sqlite)', () => { + const stats = cg.getStats(); + expect(stats.fileCount).toBe(2); + expect(stats.nodeCount).toBeGreaterThan(0); + }); + + it('FTS5 search returns the indexed symbol (read path)', () => { + const results = cg.searchNodes('helper'); + const names = results.map(r => r.node.name); + expect(names).toContain('helper'); + }); + + it('graph traversal resolves the cross-file caller', () => { + const helper = cg.searchNodes('helper').find(r => r.node.name === 'helper'); + expect(helper).toBeTruthy(); + const callers = cg.getCallers(helper!.node.id); + expect(callers.map(c => c.node.name)).toContain('main'); + }); +}); diff --git a/__tests__/node-version-check.test.ts b/__tests__/node-version-check.test.ts new file mode 100644 index 00000000..fc455eb8 --- /dev/null +++ b/__tests__/node-version-check.test.ts @@ -0,0 +1,69 @@ +/** + * Pin the Node-25 block banner content. The banner replaced a soft + * `console.warn` because the warning was scrolling off-screen before + * the OOM crash 30 seconds later, generating duplicate bug reports + * (#54, #81, #140). The recipe and override env var below are + * load-bearing — if any of them get edited away, this test catches it. 
+ */ + +import { describe, it, expect } from 'vitest'; +import { buildNode25BlockBanner, buildNodeTooOldBanner, MIN_NODE_MAJOR } from '../src/bin/node-version-check'; + +describe('buildNode25BlockBanner', () => { + it('embeds the reported Node version in the header', () => { + expect(buildNode25BlockBanner('25.9.0')).toContain( + 'Unsupported Node.js version: 25.9.0' + ); + }); + + it('names the V8 turboshaft WASM root cause and the OOM symptom', () => { + const banner = buildNode25BlockBanner('25.7.0'); + expect(banner).toContain('V8 WASM JIT'); + expect(banner).toContain('turboshaft'); + expect(banner).toContain('Fatal process out of memory: Zone'); + }); + + it('points users to Node 22 LTS via nvm and Homebrew', () => { + const banner = buildNode25BlockBanner('25.7.0'); + expect(banner).toContain('Node.js 22 LTS'); + expect(banner).toContain('nvm install 22'); + expect(banner).toContain('brew install node@22'); + }); + + it('documents the CODEGRAPH_ALLOW_UNSAFE_NODE override', () => { + const banner = buildNode25BlockBanner('25.7.0'); + expect(banner).toContain('CODEGRAPH_ALLOW_UNSAFE_NODE=1'); + }); + + it('links to issue #81 for the root-cause writeup', () => { + expect(buildNode25BlockBanner('25.7.0')).toContain( + 'github.com/colbymchenry/codegraph/issues/81' + ); + }); +}); + +describe('buildNodeTooOldBanner', () => { + it('embeds the reported Node version in the header', () => { + expect(buildNodeTooOldBanner('18.20.0')).toContain( + 'Unsupported Node.js version: 18.20.0' + ); + }); + + it('states the supported floor matching MIN_NODE_MAJOR', () => { + expect(MIN_NODE_MAJOR).toBe(20); + expect(buildNodeTooOldBanner('18.0.0')).toContain( + `requires Node.js ${MIN_NODE_MAJOR} or newer` + ); + }); + + it('points users to Node 22 LTS via nvm and Homebrew', () => { + const banner = buildNodeTooOldBanner('16.0.0'); + expect(banner).toContain('Node.js 22 LTS'); + expect(banner).toContain('nvm install 22'); + expect(banner).toContain('brew install node@22'); + }); 
+ + it('documents the CODEGRAPH_ALLOW_UNSAFE_NODE override', () => { + expect(buildNodeTooOldBanner('18.0.0')).toContain('CODEGRAPH_ALLOW_UNSAFE_NODE=1'); + }); +}); diff --git a/__tests__/npm-shim.test.ts b/__tests__/npm-shim.test.ts new file mode 100644 index 00000000..16e70506 --- /dev/null +++ b/__tests__/npm-shim.test.ts @@ -0,0 +1,208 @@ +/** + * npm thin-installer launcher (`scripts/npm-shim.js`) tests. + * + * The shim runs on the user's own Node, locates the per-platform optionalDependency + * bundle, and — when a registry mirror failed to deliver it (issue #303) — falls + * back to downloading the bundle from GitHub Releases. These tests exercise that + * shim as a real subprocess from a temp "main package" dir (its own package.json + * + node_modules), so resolution and version lookup behave hermetically. + * + * The download/checksum paths run against a local self-signed HTTPS server via + * CODEGRAPH_DOWNLOAD_BASE — no real network, no published release needed. The + * shim is launched with async `spawn` (not spawnSync), so the test's event loop + * stays free to serve those requests. + * + * POSIX only: the fake bundle launcher is a shell script and extraction uses the + * system `tar`. Skipped on Windows (where the shim's exec path differs anyway). 
+ */ + +import { describe, it, expect, beforeAll, afterAll } from 'vitest'; +import { spawn, execSync } from 'child_process'; +import * as https from 'https'; +import * as fs from 'fs'; +import * as os from 'os'; +import * as path from 'path'; +import * as crypto from 'crypto'; +import type { AddressInfo } from 'net'; + +const SHIM_SRC = path.join(__dirname, '..', 'scripts', 'npm-shim.js'); +const target = `${process.platform}-${process.arch}`; +const asset = `codegraph-${target}.tar.gz`; +const isWindows = process.platform === 'win32'; + +function hasOpenssl(): boolean { + try { execSync('openssl version', { stdio: 'ignore' }); return true; } catch { return false; } +} +const CAN_NET = !isWindows && hasOpenssl(); + +function mkTmp(label: string): string { + return fs.mkdtempSync(path.join(os.tmpdir(), `cg-shim-${label}-`)); +} + +// A temp dir standing in for the installed @colbymchenry/codegraph main package. +function makePkg(version = '9.9.9-test'): string { + const dir = mkTmp('pkg'); + fs.copyFileSync(SHIM_SRC, path.join(dir, 'npm-shim.js')); + fs.writeFileSync(path.join(dir, 'package.json'), + JSON.stringify({ name: '@colbymchenry/codegraph', version }) + '\n'); + return dir; +} + +// A fake bundle launcher that prints a marker + its args, so we can prove the +// shim found and exec'd it (and passed args through). +function writeLauncher(binDir: string): void { + fs.mkdirSync(binDir, { recursive: true }); + const p = path.join(binDir, 'codegraph'); + fs.writeFileSync(p, '#!/bin/sh\necho "FAKE_BUNDLE_RAN args:$*"\n'); + fs.chmodSync(p, 0o755); +} + +// Launch the shim with async spawn so the in-process HTTPS server can respond +// while it runs (spawnSync would block this event loop and deadlock). 
+function runShim(pkgDir: string, args: string[], env: Record<string, string>) { + return new Promise<{ status: number | null; stdout: string; stderr: string }>((resolve) => { + const child = spawn(process.execPath, [path.join(pkgDir, 'npm-shim.js'), ...args], { + env: { ...process.env, ...env }, + }); + let stdout = '', stderr = ''; + child.stdout.on('data', (d) => { stdout += d.toString(); }); + child.stderr.on('data', (d) => { stderr += d.toString(); }); + child.on('close', (status) => resolve({ status, stdout, stderr })); + }); +} + +describe.skipIf(isWindows)('npm-shim launcher', () => { + it('runs the installed optional-dependency bundle without any download', async () => { + const pkg = makePkg(); + const platformPkg = path.join(pkg, 'node_modules', '@colbymchenry', `codegraph-${target}`); + writeLauncher(path.join(platformPkg, 'bin')); + fs.writeFileSync(path.join(platformPkg, 'package.json'), + JSON.stringify({ name: `@colbymchenry/codegraph-${target}`, version: '9.9.9-test' }) + '\n'); + const cache = mkTmp('cache'); + const r = await runShim(pkg, ['--probe-abc'], { CODEGRAPH_INSTALL_DIR: cache }); + + expect(r.status).toBe(0); + expect(r.stdout).toContain('FAKE_BUNDLE_RAN'); + expect(r.stdout).toContain('--probe-abc'); // args passed through + expect(r.stderr).not.toContain('downloading'); // never reached the fallback + expect(fs.existsSync(path.join(cache, 'bundles'))).toBe(false); + }); + + it('uses an already-cached bundle even when downloads are disabled', async () => { + const pkg = makePkg('1.2.3-cached'); + const cache = mkTmp('cache'); + writeLauncher(path.join(cache, 'bundles', `${target}-1.2.3-cached`, 'bin')); + const r = await runShim(pkg, ['--probe-xyz'], { + CODEGRAPH_INSTALL_DIR: cache, + CODEGRAPH_NO_DOWNLOAD: '1', + }); + + expect(r.status).toBe(0); + expect(r.stdout).toContain('FAKE_BUNDLE_RAN'); + expect(r.stdout).toContain('--probe-xyz'); + expect(r.stderr).toBe(''); + }); + + it('prints actionable guidance and exits 1 when disabled with no bundle', 
async () => { + const pkg = makePkg(); + const r = await runShim(pkg, ['--version'], { + CODEGRAPH_INSTALL_DIR: mkTmp('cache'), + CODEGRAPH_NO_DOWNLOAD: '1', + }); + + expect(r.status).toBe(1); + expect(r.stderr).toContain(`no prebuilt bundle for ${target}`); + expect(r.stderr).toContain(`@colbymchenry/codegraph-${target}`); + expect(r.stderr).toContain('--registry=https://registry.npmjs.org'); + expect(r.stderr).toContain('install.sh'); + }); +}); + +describe.skipIf(!CAN_NET)('npm-shim download fallback (local HTTPS)', () => { + let server: https.Server; + let port = 0; + let fixtureBytes: Buffer; + let fixtureSha: string; + let sumsBody: string | null = null; // per-test: SHA256SUMS contents, or null for 404 + + beforeAll(async () => { + // Self-signed cert for the mock release host. + const cdir = mkTmp('tls'); + const keyP = path.join(cdir, 'key.pem'); + const certP = path.join(cdir, 'cert.pem'); + execSync( + `openssl req -x509 -newkey rsa:2048 -nodes -keyout ${keyP} -out ${certP} -days 1 -subj "/CN=localhost"`, + { stdio: 'ignore' }, + ); + + // Build a fake bundle archive (codegraph-/bin/codegraph), like a real release asset. 
+ const work = mkTmp('fixture'); + writeLauncher(path.join(work, `codegraph-${target}`, 'bin')); + const archive = path.join(work, asset); + execSync(`tar -czf ${JSON.stringify(archive)} -C ${JSON.stringify(work)} codegraph-${target}`); + fixtureBytes = fs.readFileSync(archive); + fixtureSha = crypto.createHash('sha256').update(fixtureBytes).digest('hex'); + + server = https.createServer({ key: fs.readFileSync(keyP), cert: fs.readFileSync(certP) }, (req, res) => { + const url = req.url || ''; + if (url.endsWith(`/${asset}`)) { + res.writeHead(200); res.end(fixtureBytes); + } else if (url.endsWith('/SHA256SUMS')) { + if (sumsBody === null) { res.writeHead(404); res.end('not found'); } + else { res.writeHead(200); res.end(sumsBody); } + } else { + res.writeHead(404); res.end('not found'); + } + }); + await new Promise((resolve) => server.listen(0, '127.0.0.1', resolve)); + port = (server.address() as AddressInfo).port; + }, 30000); + + afterAll(() => { server?.close(); }); + + function netEnv(cache: string): Record<string, string> { + return { + CODEGRAPH_INSTALL_DIR: cache, + CODEGRAPH_DOWNLOAD_BASE: `https://127.0.0.1:${port}`, + NODE_TLS_REJECT_UNAUTHORIZED: '0', + }; + } + + it('downloads, verifies the checksum, extracts, and execs the bundle', async () => { + sumsBody = `${fixtureSha} ${asset}\n`; + const pkg = makePkg('5.0.0-net'); + const cache = mkTmp('cache'); + const r = await runShim(pkg, ['--probe-net'], netEnv(cache)); + + expect(r.stderr).toContain('downloading'); + expect(r.stderr).toContain('checksum verified'); + expect(r.status).toBe(0); + expect(r.stdout).toContain('FAKE_BUNDLE_RAN'); + expect(r.stdout).toContain('--probe-net'); + expect(fs.existsSync(path.join(cache, 'bundles', `${target}-5.0.0-net`, 'bin', 'codegraph'))).toBe(true); + }, 20000); + + it('aborts (exit 1) on a checksum mismatch and caches nothing', async () => { + sumsBody = `${'0'.repeat(64)} ${asset}\n`; + const pkg = makePkg('5.0.0-bad'); + const cache = mkTmp('cache'); + const r = await 
runShim(pkg, ['--version'], netEnv(cache)); + + expect(r.status).toBe(1); + expect(r.stderr).toContain('checksum mismatch'); + expect(r.stdout).not.toContain('FAKE_BUNDLE_RAN'); // never exec'd a tampered bundle + expect(fs.existsSync(path.join(cache, 'bundles', `${target}-5.0.0-bad`))).toBe(false); + }, 20000); + + it('proceeds when no SHA256SUMS is published (older releases)', async () => { + sumsBody = null; // 404 + const pkg = makePkg('5.0.0-nosums'); + const cache = mkTmp('cache'); + const r = await runShim(pkg, ['--version'], netEnv(cache)); + + expect(r.status).toBe(0); + expect(r.stderr).toContain('downloading'); + expect(r.stderr).not.toContain('checksum verified'); // skipped, not failed + expect(r.stdout).toContain('FAKE_BUNDLE_RAN'); + }, 20000); +}); diff --git a/__tests__/pr19-improvements.test.ts b/__tests__/pr19-improvements.test.ts index 5fbe17d7..6741e905 100644 --- a/__tests__/pr19-improvements.test.ts +++ b/__tests__/pr19-improvements.test.ts @@ -45,11 +45,11 @@ function cleanupTempDir(dir: string): void { } } -// Check if better-sqlite3 native bindings are available +// Check if the node:sqlite backend is available (Node >= 22.5) function hasSqliteBindings(): boolean { try { - const Database = require('better-sqlite3'); - const db = new Database(':memory:'); + const { DatabaseSync } = require('node:sqlite'); + const db = new DatabaseSync(':memory:'); db.close(); return true; } catch { @@ -299,7 +299,7 @@ describe('Best-Candidate Resolution', () => { describe('Schema v2 Migration', () => { it.skipIf(!HAS_SQLITE)('should have correct current schema version', async () => { const { CURRENT_SCHEMA_VERSION } = await import('../src/db/migrations'); - expect(CURRENT_SCHEMA_VERSION).toBe(3); + expect(CURRENT_SCHEMA_VERSION).toBe(4); }); it.skipIf(!HAS_SQLITE)('should have migration for version 2', async () => { diff --git a/__tests__/resolution.test.ts b/__tests__/resolution.test.ts index bb7fe9b0..1ca3a3f8 100644 --- a/__tests__/resolution.test.ts +++ 
b/__tests__/resolution.test.ts @@ -606,5 +606,244 @@ function main(): void { // Should have attempted resolution expect(result.stats.total).toBeGreaterThanOrEqual(0); }); + + it('promotes calls→instantiates when target resolves to a class (Python)', async () => { + // Python has no `new` keyword — `Foo()` is the standard + // instantiation syntax. Extraction can't tell that apart from + // a function call without symbol info, so it emits a `calls` + // ref. Resolution promotes it to `instantiates` once the + // target is known to be a class. + const srcDir = path.join(tempDir, 'src'); + fs.mkdirSync(srcDir, { recursive: true }); + + fs.writeFileSync( + path.join(srcDir, 'app.py'), + `class UserService: + def __init__(self): + self.db = None + +def bootstrap(): + return UserService() +` + ); + + cg = await CodeGraph.init(tempDir, { index: true }); + cg.resolveReferences(); + + const bootstrap = cg + .getNodesByKind('function') + .find((n) => n.name === 'bootstrap'); + expect(bootstrap).toBeDefined(); + + const outgoing = cg.getOutgoingEdges(bootstrap!.id); + const instantiates = outgoing.find((e) => e.kind === 'instantiates'); + expect(instantiates).toBeDefined(); + // Same edge must NOT also appear as a `calls` edge — promotion + // replaces the kind, doesn't duplicate. 
+ const callsToUserService = outgoing.filter( + (e) => e.kind === 'calls' && e.target === instantiates!.target + ); + expect(callsToUserService).toHaveLength(0); + }); + }); + + describe('Name Matcher: kind bias for new ref kinds', () => { + const baseContext = (candidates: Node[]): ResolutionContext => ({ + getNodesInFile: () => [], + getNodesByName: (name) => candidates.filter((c) => c.name === name), + getNodesByQualifiedName: () => [], + getNodesByKind: () => [], + fileExists: () => true, + readFile: () => null, + getProjectRoot: () => '/test', + getAllFiles: () => [], + getNodesByLowerName: () => [], + getImportMappings: () => [], + }); + + it('prefers a class candidate over a function for `instantiates` refs', () => { + // A class and a function share a name across the codebase. + // Without the kind bias, the function (which gets the +25 `calls` + // bonus historically applied to all candidates of that kind) would + // win. Now the instantiates branch reverses it. + const fn: Node = { + id: 'func:utils.ts:Logger:5', kind: 'function', name: 'Logger', + qualifiedName: 'utils.ts::Logger', filePath: 'utils.ts', language: 'typescript', + startLine: 5, endLine: 7, startColumn: 0, endColumn: 0, updatedAt: Date.now(), + }; + const cls: Node = { + id: 'class:logger.ts:Logger:10', kind: 'class', name: 'Logger', + qualifiedName: 'logger.ts::Logger', filePath: 'logger.ts', language: 'typescript', + startLine: 10, endLine: 30, startColumn: 0, endColumn: 0, updatedAt: Date.now(), + }; + + const ref = { + fromNodeId: 'func:main.ts:bootstrap:1', + referenceName: 'Logger', + referenceKind: 'instantiates' as const, + line: 5, column: 0, filePath: 'main.ts', language: 'typescript' as const, + }; + + const result = matchReference(ref, baseContext([fn, cls])); + expect(result?.targetNodeId).toBe('class:logger.ts:Logger:10'); + }); + + it('prefers a function candidate over a non-function for `decorates` refs', () => { + const variable: Node = { + id: 'var:config.ts:Inject:5', 
kind: 'variable', name: 'Inject', + qualifiedName: 'config.ts::Inject', filePath: 'config.ts', language: 'typescript', + startLine: 5, endLine: 5, startColumn: 0, endColumn: 0, updatedAt: Date.now(), + }; + const decorator: Node = { + id: 'func:di.ts:Inject:10', kind: 'function', name: 'Inject', + qualifiedName: 'di.ts::Inject', filePath: 'di.ts', language: 'typescript', + startLine: 10, endLine: 20, startColumn: 0, endColumn: 0, updatedAt: Date.now(), + }; + + const ref = { + fromNodeId: 'class:svc.ts:UserService:1', + referenceName: 'Inject', + referenceKind: 'decorates' as const, + line: 5, column: 0, filePath: 'svc.ts', language: 'typescript' as const, + }; + + const result = matchReference(ref, baseContext([variable, decorator])); + expect(result?.targetNodeId).toBe('func:di.ts:Inject:10'); + }); + }); + + describe('tsconfig path aliases', () => { + it('resolves an aliased import to the alias-mapped file (not a same-named file elsewhere)', async () => { + // Two same-named exports in different directories. Without alias + // resolution, name-matcher would pick whichever it finds first; + // with alias resolution, the import path uniquely picks one. 
+ fs.mkdirSync(path.join(tempDir, 'src/utils'), { recursive: true }); + fs.mkdirSync(path.join(tempDir, 'src/legacy'), { recursive: true }); + fs.writeFileSync( + path.join(tempDir, 'src/utils/format.ts'), + `export function pickMe(): number { return 1; }\n` + ); + fs.writeFileSync( + path.join(tempDir, 'src/legacy/format.ts'), + `export function pickMe(): number { return 99; }\n` + ); + fs.writeFileSync( + path.join(tempDir, 'src/main.ts'), + `import { pickMe } from '@utils/format';\nexport function go(): number { return pickMe(); }\n` + ); + fs.writeFileSync( + path.join(tempDir, 'tsconfig.json'), + JSON.stringify({ + compilerOptions: { + baseUrl: './src', + paths: { '@utils/*': ['utils/*'] }, + }, + }) + ); + + cg = await CodeGraph.init(tempDir, { index: true }); + cg.resolveReferences(); + + // The two pickMe nodes live in different files. The aliased + // import should attach the call edge to the @utils-mapped one, + // not the legacy duplicate. + const all = cg.getNodesByKind('function').filter((n) => n.name === 'pickMe'); + const utilsNode = all.find((n) => n.filePath === 'src/utils/format.ts'); + const legacyNode = all.find((n) => n.filePath === 'src/legacy/format.ts'); + expect(utilsNode).toBeDefined(); + expect(legacyNode).toBeDefined(); + + const utilsCallers = cg.getCallers(utilsNode!.id); + const legacyCallers = cg.getCallers(legacyNode!.id); + expect(utilsCallers.length).toBeGreaterThan(0); + expect(utilsCallers.some((c) => c.node.filePath === 'src/main.ts')).toBe(true); + // The legacy node should NOT have a caller from src/main.ts — + // the alias correctly picked the utils version. 
+ expect(legacyCallers.some((c) => c.node.filePath === 'src/main.ts')).toBe(false); + }); + + it('falls back gracefully when tsconfig is absent', async () => { + fs.mkdirSync(path.join(tempDir, 'src'), { recursive: true }); + fs.writeFileSync( + path.join(tempDir, 'src/a.ts'), + `export function aFn(): void {}\n` + ); + fs.writeFileSync( + path.join(tempDir, 'src/b.ts'), + `import { aFn } from './a';\nexport function bFn(): void { aFn(); }\n` + ); + + cg = await CodeGraph.init(tempDir, { index: true }); + // No tsconfig present — index should still complete and the + // relative-import-based call edge should be created. + const aFn = cg.getNodesByKind('function').find((n) => n.name === 'aFn'); + expect(aFn).toBeDefined(); + const callers = cg.getCallers(aFn!.id); + expect(callers.some((c) => c.node.filePath === 'src/b.ts')).toBe(true); + }); + }); + + describe('re-export chain following', () => { + it('chases a 3-hop barrel chain (wildcard → named → declaration)', async () => { + // main.ts → all.ts (wildcard) → index.ts (named) → auth.ts (declaration). + // Without chain following, `signIn` resolves to nothing because + // none of the barrel files declare it directly. 
+ fs.mkdirSync(path.join(tempDir, 'src/services'), { recursive: true }); + fs.writeFileSync( + path.join(tempDir, 'src/services/auth.ts'), + `export function signIn(): void {}\n` + ); + fs.writeFileSync( + path.join(tempDir, 'src/services/index.ts'), + `export { signIn } from './auth';\n` + ); + fs.writeFileSync( + path.join(tempDir, 'src/all.ts'), + `export * from './services/index';\n` + ); + fs.writeFileSync( + path.join(tempDir, 'src/main.ts'), + `import { signIn } from './all';\nexport function go(): void { signIn(); }\n` + ); + + cg = await CodeGraph.init(tempDir, { index: true }); + cg.resolveReferences(); + + const signInNode = cg + .getNodesByKind('function') + .find((n) => n.name === 'signIn' && n.filePath === 'src/services/auth.ts'); + expect(signInNode).toBeDefined(); + const callers = cg.getCallers(signInNode!.id); + expect(callers.some((c) => c.node.filePath === 'src/main.ts')).toBe(true); + }); + + it('follows a renamed named re-export (export { foo as bar } from ...)', async () => { + // The chase has to look up `foo` in the upstream module even + // though the importer asked for `bar` — exercises the rename + // branch of findExportedSymbol. 
+ fs.mkdirSync(path.join(tempDir, 'src'), { recursive: true }); + fs.writeFileSync( + path.join(tempDir, 'src/auth.ts'), + `export function signIn(): void {}\n` + ); + fs.writeFileSync( + path.join(tempDir, 'src/index.ts'), + `export { signIn as login } from './auth';\n` + ); + fs.writeFileSync( + path.join(tempDir, 'src/main.ts'), + `import { login } from './index';\nexport function go(): void { login(); }\n` + ); + + cg = await CodeGraph.init(tempDir, { index: true }); + cg.resolveReferences(); + + const signInNode = cg + .getNodesByKind('function') + .find((n) => n.name === 'signIn' && n.filePath === 'src/auth.ts'); + expect(signInNode).toBeDefined(); + const callers = cg.getCallers(signInNode!.id); + expect(callers.some((c) => c.node.filePath === 'src/main.ts')).toBe(true); + }); }); }); diff --git a/__tests__/search-query-parser.test.ts b/__tests__/search-query-parser.test.ts new file mode 100644 index 00000000..8a7767da --- /dev/null +++ b/__tests__/search-query-parser.test.ts @@ -0,0 +1,142 @@ +/** + * Unit tests for the field-qualified query parser and bounded + * edit distance — the two algorithms behind `kind:`/`lang:`/`path:`/ + * `name:` filtering and the fuzzy typo fallback. 
+ */ + +import { describe, it, expect } from 'vitest'; +import { parseQuery, boundedEditDistance } from '../src/search/query-parser'; + +describe('parseQuery', () => { + it('returns plain text for a query with no field prefixes', () => { + const r = parseQuery('authenticate user'); + expect(r.text).toBe('authenticate user'); + expect(r.kinds).toEqual([]); + expect(r.languages).toEqual([]); + expect(r.pathFilters).toEqual([]); + expect(r.nameFilters).toEqual([]); + }); + + it('extracts kind: filter and removes it from text', () => { + const r = parseQuery('kind:function auth'); + expect(r.kinds).toEqual(['function']); + expect(r.text).toBe('auth'); + }); + + it('extracts lang: and language: as the same filter family', () => { + const a = parseQuery('lang:typescript foo'); + const b = parseQuery('language:typescript foo'); + expect(a.languages).toEqual(['typescript']); + expect(b.languages).toEqual(['typescript']); + }); + + it('handles multiple kind: filters as an OR set', () => { + const r = parseQuery('kind:function kind:method auth'); + expect(r.kinds.sort()).toEqual(['function', 'method']); + }); + + it('extracts path: and name: as substring filters (kept verbatim)', () => { + const r = parseQuery('path:src/api name:Handler'); + expect(r.pathFilters).toEqual(['src/api']); + expect(r.nameFilters).toEqual(['Handler']); + }); + + it('preserves quoted spans as a single token (whitespace in path:)', () => { + const r = parseQuery('path:"my dir/file" foo'); + expect(r.pathFilters).toEqual(['my dir/file']); + expect(r.text).toBe('foo'); + }); + + it('passes URL-like tokens through to text (does not match http: as a field)', () => { + const r = parseQuery('http://example.com'); + expect(r.text).toBe('http://example.com'); + expect(r.kinds).toEqual([]); + }); + + it('passes empty-value tokens through as text (kind: → "kind:")', () => { + const r = parseQuery('kind: foo'); + expect(r.kinds).toEqual([]); + // The trailing-colon token comes back as plain text + 
expect(r.text.includes('kind:')).toBe(true); + }); + + it('passes unknown field prefixes through as text (TODO: keeps the colon)', () => { + const r = parseQuery('TODO: needs review'); + expect(r.text).toBe('TODO: needs review'); + expect(r.kinds).toEqual([]); + }); + + it('rejects unknown values for kind: (passes the whole token to text)', () => { + const r = parseQuery('kind:invalid foo'); + // Invalid kind value falls back to text + expect(r.kinds).toEqual([]); + expect(r.text).toContain('kind:invalid'); + }); + + it('handles all-filters-no-text query', () => { + const r = parseQuery('kind:function lang:typescript'); + expect(r.kinds).toEqual(['function']); + expect(r.languages).toEqual(['typescript']); + expect(r.text).toBe(''); + }); + + it('survives empty input', () => { + const r = parseQuery(''); + expect(r.text).toBe(''); + expect(r.kinds).toEqual([]); + }); + + it('survives a very long input (no allocation explosion)', () => { + const huge = 'foo '.repeat(5000); // 20k chars + const r = parseQuery(huge); + expect(r.text.length).toBeGreaterThan(0); + }); +}); + +describe('boundedEditDistance', () => { + it('returns 0 for identical strings', () => { + expect(boundedEditDistance('user', 'user', 2)).toBe(0); + }); + + it('returns 1 for a single substitution', () => { + expect(boundedEditDistance('user', 'usar', 2)).toBe(1); + }); + + it('returns 1 for a single insertion', () => { + expect(boundedEditDistance('user', 'users', 2)).toBe(1); + }); + + it('returns 1 for a single deletion', () => { + expect(boundedEditDistance('users', 'user', 2)).toBe(1); + }); + + it('returns 2 for a transposition (two edits in basic Levenshtein)', () => { + // 'aple' vs 'palp' would be 2; pick a clearer pair. + // 'foo' vs 'fou': substitution + insertion = 2 if different lengths. 
+ expect(boundedEditDistance('confg', 'configX', 2)).toBe(2); + }); + + it('returns maxDist+1 when distance clearly exceeds budget', () => { + expect(boundedEditDistance('foo', 'completely-different', 2)).toBe(3); + }); + + it('respects length-difference shortcut', () => { + // |len(a) - len(b)| > maxDist must immediately be over budget + expect(boundedEditDistance('a', 'aaaaaaa', 2)).toBe(3); + }); + + it('handles empty inputs', () => { + expect(boundedEditDistance('', '', 2)).toBe(0); + expect(boundedEditDistance('a', '', 2)).toBe(1); + expect(boundedEditDistance('', 'abc', 2)).toBe(3); + }); + + it('is case-sensitive — caller must lowercase if case-insensitive match wanted', () => { + expect(boundedEditDistance('Foo', 'foo', 2)).toBe(1); + }); + + it('early-exits when row min exceeds budget (correctness, not just perf)', () => { + // 'aaaaa' vs 'bbbbb': distance is 5, well over budget 2 + expect(boundedEditDistance('aaaaa', 'bbbbb', 2)).toBe(3); + }); +}); diff --git a/__tests__/security.test.ts b/__tests__/security.test.ts index 53441d58..75ac8432 100644 --- a/__tests__/security.test.ts +++ b/__tests__/security.test.ts @@ -12,12 +12,10 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; import * as fs from 'fs'; import * as path from 'path'; import * as os from 'os'; -import { FileLock } from '../src/utils'; +import { FileLock, validateProjectPath } from '../src/utils'; import CodeGraph from '../src/index'; import { ToolHandler, tools } from '../src/mcp/tools'; -import { shouldIncludeFile, scanDirectory } from '../src/extraction'; -import { shouldIncludeFile as configShouldInclude } from '../src/config'; -import { CodeGraphConfig, DEFAULT_CONFIG } from '../src/types'; +import { scanDirectory, isSourceFile } from '../src/extraction'; import { DatabaseConnection, getDatabasePath } from '../src/db'; import { QueryBuilder } from '../src/db/queries'; @@ -178,6 +176,36 @@ describe('Path Traversal Prevention', () => { }); }); 
+describe('validateProjectPath — sensitive directory blocking', () => { + // POSIX-only: on Windows '/etc' resolves to C:\etc (non-existent), not a + // sensitive dir — the Windows case is covered by the win32-gated test below. + it.runIf(process.platform !== 'win32')('blocks POSIX system directories (exact match)', () => { + expect(validateProjectPath('/')).toMatch(/sensitive system directory/i); + expect(validateProjectPath('/etc')).toMatch(/sensitive system directory/i); + }); + + it('allows a normal, existing directory', () => { + const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-validate-')); + try { + expect(validateProjectPath(dir)).toBeNull(); + } finally { + fs.rmSync(dir, { recursive: true, force: true }); + } + }); + + // SENSITIVE_PATHS stores the Windows entries lowercase and validateProjectPath + // matches via resolved.toLowerCase(), so 'C:\\Windows' and 'c:\\windows' are + // both blocked. path.resolve is platform-specific, so this only runs on Windows. + it.runIf(process.platform === 'win32')( + 'blocks Windows system directories regardless of case', + () => { + expect(validateProjectPath('C:\\Windows')).toMatch(/sensitive system directory/i); + expect(validateProjectPath('c:\\windows')).toMatch(/sensitive system directory/i); + expect(validateProjectPath('C:\\WINDOWS\\System32')).toMatch(/sensitive system directory/i); + } + ); +}); + describe('MCP Input Validation', () => { let testDir: string; let cg: CodeGraph; @@ -241,6 +269,20 @@ describe('MCP Input Validation', () => { expect(result.content[0].text).toContain('non-empty string'); }); + it('should truncate oversized codegraph_context output', async () => { + const oversizedContext = Array.from({ length: 400 }, (_, i) => `line-${i} ${'x'.repeat(80)}`).join('\n'); + const fakeCg = { + buildContext: async () => oversizedContext, + }; + const fakeHandler = new ToolHandler(fakeCg as unknown as CodeGraph); + + const result = await fakeHandler.execute('codegraph_context', { task: 'find example' 
}); + + expect(result.isError).toBeFalsy(); + expect(result.content[0].text.length).toBeLessThan(oversizedContext.length); + expect(result.content[0].text).toContain('... (output truncated)'); + }); + it('should reject non-string symbol in codegraph_impact', async () => { const result = await handler.execute('codegraph_impact', { symbol: [] }); expect(result.isError).toBe(true); @@ -265,6 +307,34 @@ describe('MCP Input Validation', () => { const result = await handler.execute('codegraph_search', { query: 'example', limit: -5 }); expect(result.isError).toBeFalsy(); }); + + // #230: getCodeGraph must reject a sensitive system directory passed as + // projectPath before opening it. The error surfaces through execute()'s + // catch as an isError result. /etc is sensitive on POSIX; C:\Windows on + // Windows (path.resolve is platform-specific, so each case is gated). + it.runIf(process.platform !== 'win32')( + 'rejects a sensitive POSIX projectPath (/etc) via the MCP handler', + async () => { + const result = await handler.execute('codegraph_search', { + query: 'example', + projectPath: '/etc', + }); + expect(result.isError).toBe(true); + expect(result.content[0].text).toMatch(/sensitive system directory/i); + } + ); + + it.runIf(process.platform === 'win32')( + 'rejects a sensitive Windows projectPath (C:\\Windows) via the MCP handler', + async () => { + const result = await handler.execute('codegraph_search', { + query: 'example', + projectPath: 'C:\\Windows', + }); + expect(result.isError).toBe(true); + expect(result.content[0].text).toMatch(/sensitive system directory/i); + } + ); }); describe('Atomic Writes', () => { @@ -298,58 +368,24 @@ describe('Atomic Writes', () => { }); }); -describe('Glob Matching (picomatch)', () => { - const makeConfig = (include: string[], exclude: string[]): CodeGraphConfig => ({ - ...DEFAULT_CONFIG, - rootDir: '/test', - include, - exclude, +describe('Source file detection (isSourceFile)', () => { + it('selects files by supported 
extension', () => { + expect(isSourceFile('src/index.ts')).toBe(true); + expect(isSourceFile('src/deep/nested/file.ts')).toBe(true); + expect(isSourceFile('src/component.tsx')).toBe(true); + expect(isSourceFile('lib/util.js')).toBe(true); + expect(isSourceFile('src/main.py')).toBe(true); }); - it('should match standard glob patterns in extraction', () => { - const config = makeConfig(['**/*.ts'], ['node_modules/**']); - - expect(shouldIncludeFile('src/index.ts', config)).toBe(true); - expect(shouldIncludeFile('src/deep/nested/file.ts', config)).toBe(true); - expect(shouldIncludeFile('src/index.js', config)).toBe(false); - expect(shouldIncludeFile('node_modules/lib/index.ts', config)).toBe(false); - }); - - it('should match standard glob patterns in config', () => { - const config = makeConfig(['**/*.py'], ['__pycache__/**']); - - expect(configShouldInclude('src/main.py', config)).toBe(true); - expect(configShouldInclude('src/main.ts', config)).toBe(false); - expect(configShouldInclude('__pycache__/module.py', config)).toBe(false); - }); - - it('should handle complex glob patterns correctly', () => { - const config = makeConfig(['src/**/*.{ts,tsx}', 'lib/**/*.js'], []); - - expect(shouldIncludeFile('src/component.ts', config)).toBe(true); - expect(shouldIncludeFile('src/component.tsx', config)).toBe(true); - expect(shouldIncludeFile('lib/util.js', config)).toBe(true); - expect(shouldIncludeFile('src/component.css', config)).toBe(false); - }); - - it('should handle patterns that previously caused ReDoS', () => { - // This pattern would cause catastrophic backtracking with hand-rolled regex - const evilPattern = '**/**/**/**/**/**/**/**/**/**/**/**/**/**/a'; - const config = makeConfig([evilPattern], []); - - const start = Date.now(); - // This should return quickly, not hang - shouldIncludeFile('x/x/x/x/x/x/x/x/x/x/x/x/x/x/b', config); - const elapsed = Date.now() - start; - - // Should complete in under 100ms, not seconds - expect(elapsed).toBeLessThan(100); + 
it('rejects unsupported extensions and extensionless files', () => { + expect(isSourceFile('src/component.css')).toBe(false); + expect(isSourceFile('README.md')).toBe(false); + expect(isSourceFile('Makefile')).toBe(false); + expect(isSourceFile('.gitignore')).toBe(false); }); - it('should handle dot files correctly', () => { - const config = makeConfig(['**/*.ts'], []); - - expect(shouldIncludeFile('.hidden/index.ts', config)).toBe(true); + it('matches regardless of leading dot directories', () => { + expect(isSourceFile('.hidden/index.ts')).toBe(true); }); }); @@ -464,15 +500,9 @@ describe('Symlink Cycle Detection', () => { return; } - const config: CodeGraphConfig = { - ...DEFAULT_CONFIG, - rootDir: tempDir, - include: ['**/*.ts'], - exclude: [], - }; // This should complete without hanging - const files = scanDirectory(tempDir, config); + const files = scanDirectory(tempDir); // Should find the real file but not loop infinitely expect(files).toContain('src/index.ts'); @@ -496,14 +526,8 @@ describe('Symlink Cycle Detection', () => { return; } - const config: CodeGraphConfig = { - ...DEFAULT_CONFIG, - rootDir: tempDir, - include: ['**/*.ts'], - exclude: [], - }; - const files = scanDirectory(tempDir, config); + const files = scanDirectory(tempDir); // Should find files from both the real dir and via the symlink // But deduplicate since they resolve to the same real path @@ -521,15 +545,100 @@ describe('Symlink Cycle Detection', () => { return; } - const config: CodeGraphConfig = { - ...DEFAULT_CONFIG, - rootDir: tempDir, - include: ['**/*.ts'], - exclude: [], - }; // Should not throw - const files = scanDirectory(tempDir, config); + const files = scanDirectory(tempDir); expect(files).toContain('src/valid.ts'); }); }); + +describe('Session marker symlink resistance', () => { + // The marker write lives in src/mcp/tools.ts behind handleContext. We exercise + // it end-to-end via ToolHandler.execute so the test exercises the same code + // path Claude Code drives. 
The session id is per-test so other parallel test + // runs can't collide with the marker file we plant a symlink at. + const SESSION_ID = `cg-test-${process.pid}-${Date.now()}-${Math.random().toString(36).slice(2)}`; + const crypto = require('crypto') as typeof import('crypto'); + const hash = crypto.createHash('md5').update(SESSION_ID).digest('hex').slice(0, 16); + const markerPath = path.join(os.tmpdir(), `codegraph-consulted-${hash}`); + + let projectDir: string; + let victimDir: string; + let victimFile: string; + + beforeEach(async () => { + projectDir = createTempDir(); + victimDir = createTempDir(); + victimFile = path.join(victimDir, 'private.txt'); + fs.writeFileSync(victimFile, 'SECRET-DO-NOT-OVERWRITE\n'); + if (fs.existsSync(markerPath)) fs.unlinkSync(markerPath); + + // A real .codegraph/ has to exist for handleContext to get past the + // "not initialized" guard — index a tiny fixture so the call reaches the + // marker write step rather than short-circuiting on missing project state. + fs.writeFileSync(path.join(projectDir, 'a.ts'), 'export const x = 1;\n'); + const cg = await CodeGraph.init(projectDir); + await cg.indexAll(); + cg.close(); + }); + + afterEach(() => { + if (fs.existsSync(markerPath)) fs.unlinkSync(markerPath); + cleanupTempDir(projectDir); + cleanupTempDir(victimDir); + }); + + it('does not follow a pre-planted symlink at the marker path', async () => { + // Skip on platforms where the user can't create symlinks (Windows without + // dev mode + admin). The CWE-59 risk we're guarding against doesn't apply + // when symlinks aren't creatable, so the skip is correct, not a gap. 
+ try { + fs.symlinkSync(victimFile, markerPath); + } catch { + return; + } + + const cg = await CodeGraph.open(projectDir); + const handler = new ToolHandler(cg); + process.env.CLAUDE_SESSION_ID = SESSION_ID; + try { + await handler.execute('codegraph_context', { task: 'find x' }); + } finally { + delete process.env.CLAUDE_SESSION_ID; + cg.close(); + } + + // The victim file's contents must be untouched — the old writeFileSync + // path would have followed the symlink and written an ISO timestamp here. + expect(fs.readFileSync(victimFile, 'utf8')).toBe('SECRET-DO-NOT-OVERWRITE\n'); + + // And the marker path itself must still be the symlink we planted — + // no fallback path that quietly unlinked + recreated it (which would + // also work, but is a behavior we don't want to silently rely on). + expect(fs.lstatSync(markerPath).isSymbolicLink()).toBe(true); + }); + + it('writes the marker file with 0o600 perms on a clean path', async () => { + // No symlink planted — happy path. Verifies the new openSync(mode: 0o600) + // call is what actually lands on disk (regression guard for the perm + // tightening that came with the O_NOFOLLOW fix). + const cg = await CodeGraph.open(projectDir); + const handler = new ToolHandler(cg); + process.env.CLAUDE_SESSION_ID = SESSION_ID; + try { + await handler.execute('codegraph_context', { task: 'find x' }); + } finally { + delete process.env.CLAUDE_SESSION_ID; + cg.close(); + } + + expect(fs.existsSync(markerPath)).toBe(true); + // chmod's low 9 bits — strip the file-type bits for a clean compare. + // Windows can't enforce 0o600 in the POSIX sense; skip the assertion + // there since the underlying OS will normalize the mode anyway. 
+ if (process.platform !== 'win32') { + const mode = fs.statSync(markerPath).mode & 0o777; + expect(mode).toBe(0o600); + } + }); +}); diff --git a/__tests__/sqlite-backend.test.ts b/__tests__/sqlite-backend.test.ts new file mode 100644 index 00000000..0815551d --- /dev/null +++ b/__tests__/sqlite-backend.test.ts @@ -0,0 +1,44 @@ +/** + * SQLite backend reporting. + * + * node:sqlite (Node's built-in real SQLite) is the sole backend. Pin that + * DatabaseConnection / CodeGraph report it and come up in WAL. + */ + +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; +import { DatabaseConnection } from '../src/db'; +import { CodeGraph } from '../src'; + +describe('DatabaseConnection — backend reporting', () => { + let dir: string; + + beforeEach(() => { + dir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-backend-')); + }); + + afterEach(() => { + if (fs.existsSync(dir)) { + fs.rmSync(dir, { recursive: true, force: true }); + } + }); + + it('reports the node-sqlite backend in WAL for an initialized DB', () => { + const conn = DatabaseConnection.initialize(path.join(dir, 'test.db')); + expect(conn.getBackend()).toBe('node-sqlite'); + expect(conn.getJournalMode()).toBe('wal'); + conn.close(); + }); + + it('CodeGraph.getBackend() delegates to the underlying DatabaseConnection', async () => { + fs.writeFileSync(path.join(dir, 'x.ts'), `export function x(): void {}\n`); + const cg = await CodeGraph.init(dir, { index: true }); + try { + expect(cg.getBackend()).toBe('node-sqlite'); + } finally { + cg.destroy(); + } + }); +}); diff --git a/__tests__/strip-comments.test.ts b/__tests__/strip-comments.test.ts new file mode 100644 index 00000000..ef2ec057 --- /dev/null +++ b/__tests__/strip-comments.test.ts @@ -0,0 +1,134 @@ +import { describe, it, expect } from 'vitest'; +import { stripCommentsForRegex } from '../src/resolution/strip-comments'; + 
+describe('stripCommentsForRegex', () => { + it('python: strips line comments', () => { + const src = "x = 1 # path('/fake/', View)\nreal = 2"; + const out = stripCommentsForRegex(src, 'python'); + expect(out).not.toMatch(/path\('\/fake\//); + expect(out).toMatch(/real = 2/); + }); + + it('python: strips triple-quoted docstrings', () => { + const src = `""" +path('/in-docstring/', View) +""" +real = 1 +`; + const out = stripCommentsForRegex(src, 'python'); + expect(out).not.toMatch(/in-docstring/); + expect(out).toMatch(/real = 1/); + }); + + it('python: keeps # inside strings', () => { + const src = `path('#/fragment/', View)\n`; + const out = stripCommentsForRegex(src, 'python'); + expect(out).toContain("'#/fragment/'"); + }); + + it('python: handles triple-single-quoted docstrings', () => { + const src = `'''\npath('/fake/')\n'''\nreal = 1\n`; + const out = stripCommentsForRegex(src, 'python'); + expect(out).not.toMatch(/fake/); + expect(out).toMatch(/real = 1/); + }); + + it('typescript: strips //, /* */', () => { + const src = + "// app.get('/fake', x)\n/* app.get('/also-fake', y) */\napp.get('/real', z)"; + const out = stripCommentsForRegex(src, 'typescript'); + expect(out).not.toMatch(/fake/); + expect(out).toMatch(/'\/real'/); + }); + + it('typescript: keeps // inside strings', () => { + const src = `const url = "https://example.com/path";\n`; + const out = stripCommentsForRegex(src, 'typescript'); + expect(out).toContain('https://example.com/path'); + }); + + it('php: strips //, #, and /* */', () => { + const src = + "// Route::get('/a', X::class)\n# Route::get('/b', Y::class)\n/* Route::get('/c', Z::class) */\nReal::go();"; + const out = stripCommentsForRegex(src, 'php'); + expect(out).not.toMatch(/'\/a'/); + expect(out).not.toMatch(/'\/b'/); + expect(out).not.toMatch(/'\/c'/); + expect(out).toContain('Real::go();'); + }); + + it('ruby: strips =begin/=end', () => { + const src = + "=begin\nget '/fake', to: 'x#y'\n=end\nget '/real', to: 'a#b'\n"; + const 
out = stripCommentsForRegex(src, 'ruby'); + expect(out).not.toMatch(/fake/); + expect(out).toMatch(/'\/real'/); + }); + + it('ruby: strips # comments', () => { + const src = "# get '/fake', to: 'x#y'\nget '/real', to: 'a#b'\n"; + const out = stripCommentsForRegex(src, 'ruby'); + expect(out).not.toMatch(/fake/); + expect(out).toMatch(/'\/real'/); + }); + + it('rust: handles nested block comments', () => { + const src = + '/* outer /* inner */ still in outer */ .route("/real", get(h))'; + const out = stripCommentsForRegex(src, 'rust'); + expect(out).not.toMatch(/inner/); + expect(out).toMatch(/\/real/); + }); + + it('go: keeps backtick raw strings intact, strips // comments', () => { + const src = '// r.GET("/fake", h)\nr.GET(`/real`, h2)\n'; + const out = stripCommentsForRegex(src, 'go'); + expect(out).not.toMatch(/fake/); + // backtick raw string contents preserved + expect(out).toMatch(/`\/real`/); + }); + + it('go: strips block comments containing route-shaped text', () => { + const src = '/* r.GET("/fake", h) */\nr.GET("/real", h2)\n'; + const out = stripCommentsForRegex(src, 'go'); + expect(out).not.toMatch(/fake/); + expect(out).toMatch(/"\/real"/); + }); + + it('java: strips // and /* */ comments', () => { + const src = + '// @GetMapping("/fake")\n/* @PostMapping("/also-fake") */\n@GetMapping("/real")\n'; + const out = stripCommentsForRegex(src, 'java'); + expect(out).not.toMatch(/fake/); + expect(out).toMatch(/"\/real"/); + }); + + it('csharp: strips // and /* */ comments', () => { + const src = + '// [HttpGet("/fake")]\n/* [HttpPost("/also-fake")] */\n[HttpGet("/real")]\n'; + const out = stripCommentsForRegex(src, 'csharp'); + expect(out).not.toMatch(/fake/); + expect(out).toMatch(/"\/real"/); + }); + + it('swift: strips // and /* */ comments', () => { + const src = + '// app.get("fake", use: x)\n/* app.get("also-fake", use: y) */\napp.get("real", use: z)\n'; + const out = stripCommentsForRegex(src, 'swift'); + expect(out).not.toMatch(/fake/); + 
expect(out).toMatch(/"real"/); + }); + + it('preserves line numbers (newlines retained)', () => { + const src = "line1\n# comment with path('/fake/')\nline3"; + const out = stripCommentsForRegex(src, 'python'); + expect(out.split('\n').length).toBe(3); + expect(out.split('\n')[2]).toBe('line3'); + }); + + it('preserves overall length so source offsets stay valid', () => { + const src = "x = 1 # path('/fake/', View)\nreal = 2"; + const out = stripCommentsForRegex(src, 'python'); + expect(out.length).toBe(src.length); + }); +}); diff --git a/__tests__/symbol-lookup.test.ts b/__tests__/symbol-lookup.test.ts new file mode 100644 index 00000000..86dda6cb --- /dev/null +++ b/__tests__/symbol-lookup.test.ts @@ -0,0 +1,194 @@ +/** + * Module-qualified symbol lookup (`stage_apply::run`, `Session.request`, + * `configurator/stage_apply`). + * + * Pinned because the lookup vocabulary is what makes codegraph useful + * in workspaces with same-named symbols across modules — Rust + * sub-pipelines, Python `__init__.py` packages, Java packages, etc. + * See #173 for the original report: a `run` function in + * `src/configurator/stage_apply.rs` was indexed but `stage_apply::run` + * returned "not found" because (a) FTS strips colons to nothing, + * leaving a useless query, and (b) `matchesSymbol` only understood + * `.`-style qualifiers. 
+ */ + +import { describe, it, expect, beforeAll, beforeEach, afterEach } from 'vitest'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; +import { initGrammars, loadAllGrammars } from '../src/extraction/grammars'; + +beforeAll(async () => { + await initGrammars(); + await loadAllGrammars(); +}); + +function hasSqliteBindings(): boolean { + try { + const { DatabaseSync } = require('node:sqlite'); + const db = new DatabaseSync(':memory:'); + db.close(); + return true; + } catch { + return false; + } +} +const HAS_SQLITE = hasSqliteBindings(); + +function tmpRoot(): string { + return fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-symbol-lookup-')); +} + +function rmTree(dir: string): void { + if (fs.existsSync(dir)) fs.rmSync(dir, { recursive: true, force: true }); +} + +async function buildRustWorkspace(): Promise<string> { + const root = tmpRoot(); + const cfgDir = path.join(root, 'src', 'configurator'); + fs.mkdirSync(cfgDir, { recursive: true }); + fs.writeFileSync( + path.join(root, 'Cargo.toml'), + `[package]\nname = "fixture"\nversion = "0.1.0"\nedition = "2021"\n[lib]\npath = "src/lib.rs"\n` + ); + fs.writeFileSync(path.join(root, 'src', 'lib.rs'), `pub mod configurator;\npub mod scheduler;\n`); + fs.writeFileSync( + path.join(cfgDir, 'mod.rs'), + `pub mod stage_apply;\npub mod stage_detect;\n` + ); + fs.writeFileSync( + path.join(cfgDir, 'stage_apply.rs'), + `pub async fn run() -> Result<(), ()> {\n render_and_write();\n Ok(())\n}\n\nfn render_and_write() {}\n` + ); + fs.writeFileSync( + path.join(cfgDir, 'stage_detect.rs'), + `pub async fn run() -> Result<(), ()> { Ok(()) }\n` + ); + fs.writeFileSync( + path.join(root, 'src', 'scheduler.rs'), + `pub fn run_due_tasks() -> Result<(), ()> { Ok(()) }\n` + ); + return root; +} + +describe.skipIf(!HAS_SQLITE)('matchesSymbol — module-qualified lookups (#173)', () => { + let projectRoot: string; + let cg: any; + let handler: any; + let findSymbol: (cg: any, s: string) => { node: any;
note: string } | null; + let findAllSymbols: (cg: any, s: string) => { nodes: any[]; note: string }; + + beforeEach(async () => { + projectRoot = await buildRustWorkspace(); + const CodeGraph = (await import('../src/index')).default; + const { ToolHandler } = await import('../src/mcp/tools'); + cg = CodeGraph.initSync(projectRoot, { + config: { include: ['**/*.rs'], exclude: [] }, + }); + await cg.indexAll(); + handler = new ToolHandler(cg); + findSymbol = (handler as any).findSymbol.bind(handler); + findAllSymbols = (handler as any).findAllSymbols.bind(handler); + }); + + afterEach(() => { + handler?.closeAll(); + cg?.destroy(); + rmTree(projectRoot); + }); + + it('resolves `stage_apply::run` to the run in stage_apply.rs (not stage_detect.rs)', () => { + const match = findSymbol(cg, 'stage_apply::run'); + expect(match).not.toBeNull(); + expect(match!.node.name).toBe('run'); + expect(match!.node.filePath).toMatch(/configurator\/stage_apply\.rs$/); + }); + + it('rejects `stage_apply::run` for the same-named function in a different module', () => { + const all = findAllSymbols(cg, 'stage_apply::run'); + // All returned nodes must be in stage_apply.rs — never in stage_detect.rs + for (const node of all.nodes) { + expect(node.filePath).toMatch(/stage_apply\.rs$/); + } + expect(all.nodes.length).toBeGreaterThan(0); + }); + + it('resolves `configurator::stage_apply::run` (multi-level qualifier)', () => { + const match = findSymbol(cg, 'configurator::stage_apply::run'); + expect(match).not.toBeNull(); + expect(match!.node.name).toBe('run'); + expect(match!.node.filePath).toMatch(/configurator\/stage_apply\.rs$/); + }); + + it('resolves `crate::configurator::stage_apply::run` (Rust path prefix stripped)', () => { + const match = findSymbol(cg, 'crate::configurator::stage_apply::run'); + expect(match).not.toBeNull(); + expect(match!.node.filePath).toMatch(/configurator\/stage_apply\.rs$/); + }); + + it('resolves `configurator/stage_apply` (slash qualifier)', () => { + const 
match = findSymbol(cg, 'configurator/stage_apply/run'); + expect(match).not.toBeNull(); + expect(match!.node.filePath).toMatch(/configurator\/stage_apply\.rs$/); + }); + + it('does not silently collide bare `run` with `run_due_tasks`', () => { + const match = findSymbol(cg, 'run'); + expect(match).not.toBeNull(); + // Whatever it picks, it must be an exact-name match, not a partial. + expect(match!.node.name).toBe('run'); + }); + + it('aggregates all bare-name `run` matches across modules', () => { + const all = findAllSymbols(cg, 'run'); + const names = all.nodes.map((n: any) => n.name); + expect(names.every((n: string) => n === 'run')).toBe(true); + expect(all.nodes.length).toBeGreaterThanOrEqual(2); // stage_apply + stage_detect + // The note should call out the ambiguity. + expect(all.note).toMatch(/Aggregated|symbols named "run"/); + }); + + it('still returns null for genuinely unknown qualified lookups', () => { + const match = findSymbol(cg, 'stage_apply::nonexistent_fn'); + expect(match).toBeNull(); + }); +}); + +describe.skipIf(!HAS_SQLITE)('matchesSymbol — dotted lookups (regression for #173 fix)', () => { + let projectRoot: string; + let cg: any; + let handler: any; + let findSymbol: (cg: any, s: string) => { node: any; note: string } | null; + + beforeEach(async () => { + projectRoot = tmpRoot(); + const src = path.join(projectRoot, 'src'); + fs.mkdirSync(src, { recursive: true }); + fs.writeFileSync( + path.join(src, 'session.ts'), + `export class Session {\n request(): void {}\n}\nexport function request(): void {}\n` + ); + + const CodeGraph = (await import('../src/index')).default; + const { ToolHandler } = await import('../src/mcp/tools'); + cg = CodeGraph.initSync(projectRoot, { + config: { include: ['src/**/*.ts'], exclude: [] }, + }); + await cg.indexAll(); + handler = new ToolHandler(cg); + findSymbol = (handler as any).findSymbol.bind(handler); + }); + + afterEach(() => { + handler?.closeAll(); + cg?.destroy(); + rmTree(projectRoot); + }); + + 
it('`Session.request` resolves to the method, not the bare function', () => { + const match = findSymbol(cg, 'Session.request'); + expect(match).not.toBeNull(); + expect(match!.node.kind).toBe('method'); + expect(match!.node.qualifiedName).toContain('Session::request'); + }); +}); diff --git a/__tests__/sync.test.ts b/__tests__/sync.test.ts index 8365f630..708a92a4 100644 --- a/__tests__/sync.test.ts +++ b/__tests__/sync.test.ts @@ -225,6 +225,50 @@ describe('Sync Module', () => { expect(nodes.length).toBeGreaterThan(0); }); + it('should stop reporting untracked files once they are indexed (issue #206)', async () => { + // Untracked files stay `??` in git status even after codegraph indexes + // them. Change detection must compare them against the DB by hash, not + // report every untracked file as "added" on every sync/status. + fs.writeFileSync( + path.join(testDir, 'src', 'new.ts'), + `export function newFunc() { return 42; }` + ); + + // First sync indexes the untracked file. + const first = await cg.sync(); + expect(first.filesAdded).toBe(1); + + // The file is still untracked in git, but now lives in the DB. + expect(cg.searchNodes('newFunc').length).toBeGreaterThan(0); + + // status must not keep flagging it as a pending addition... + const changes = cg.getChangedFiles(); + expect(changes.added).not.toContain('src/new.ts'); + expect(changes.modified).not.toContain('src/new.ts'); + + // ...and a second sync must be a no-op for it. + const second = await cg.sync(); + expect(second.filesAdded).toBe(0); + expect(second.filesModified).toBe(0); + }); + + it('should re-index an untracked file when its contents change', async () => { + const filePath = path.join(testDir, 'src', 'new.ts'); + fs.writeFileSync(filePath, `export function newFunc() { return 42; }`); + await cg.sync(); + + // Modify the still-untracked file. 
+ fs.writeFileSync(filePath, `export function renamedFunc() { return 7; }`); + + const changes = cg.getChangedFiles(); + expect(changes.modified).toContain('src/new.ts'); + + const result = await cg.sync(); + expect(result.filesModified).toBe(1); + expect(cg.searchNodes('renamedFunc').length).toBeGreaterThan(0); + expect(cg.searchNodes('newFunc').length).toBe(0); + }); + it('should detect deleted files via git', async () => { fs.unlinkSync(path.join(testDir, 'src', 'index.ts')); @@ -237,11 +281,11 @@ describe('Sync Module', () => { expect(nodes.length).toBe(0); }); - it('should skip files not matching config', async () => { - // Create a .js file which doesn't match **/*.ts + it('should skip files with unsupported extensions', async () => { + // A .txt file has no supported grammar, so sync must not index it. fs.writeFileSync( - path.join(testDir, 'src', 'ignored.js'), - `function ignored() {}` + path.join(testDir, 'src', 'notes.txt'), + `just some notes` ); const result = await cg.sync(); diff --git a/__tests__/wasm-runtime-flags.test.ts b/__tests__/wasm-runtime-flags.test.ts new file mode 100644 index 00000000..a4dae8bb --- /dev/null +++ b/__tests__/wasm-runtime-flags.test.ts @@ -0,0 +1,87 @@ +/** + * WASM runtime flags — the workaround for the V8 turboshaft WASM Zone OOM + * (`Fatal process out of memory: Zone`) that crashed `codegraph index` on large + * polyglot repos under Node >= 22. See issues #293 and #298. + * + * The crash was reproduced with the real indexer on the bundled Node 24 runtime; + * empirically only `--liftoff-only` prevents it (`--no-wasm-tier-up` / + * `--no-wasm-dynamic-tiering` do not), and the flag must be on node's command + * line — `setFlagsFromString`, worker `execArgv`, and `NODE_OPTIONS` all fail. + * These tests pin that contract so it can't silently regress. 
+ */ +import { describe, it, expect } from 'vitest'; +import { spawnSync } from 'child_process'; +import * as fs from 'fs'; +import * as os from 'os'; +import * as path from 'path'; +import { + WASM_RUNTIME_FLAGS, + processHasWasmRuntimeFlags, + buildRelaunchArgv, +} from '../src/extraction/wasm-runtime-flags'; + +describe('WASM_RUNTIME_FLAGS', () => { + it('pins --liftoff-only (the only flag shown to stop the turboshaft Zone OOM)', () => { + // On Node 24, --no-wasm-tier-up and --no-wasm-dynamic-tiering both still + // crash; only --liftoff-only forces grammars onto the Liftoff baseline and + // off the optimizing tier. Pin it so it can't be swapped for an ineffective + // flag. + expect(WASM_RUNTIME_FLAGS).toContain('--liftoff-only'); + }); + + it('every flag is a real, accepted flag on the running Node/V8 runtime', () => { + // node rejects unknown CLI flags at startup, so a renamed/removed flag would + // break the bundled launcher and make the relaunch guard a silent no-op. + // Prove each flag actually launches node here. 
+ const res = spawnSync( + process.execPath, + [...WASM_RUNTIME_FLAGS, '-e', 'process.exit(0)'], + { encoding: 'utf8' } + ); + expect(res.status, `node rejected ${WASM_RUNTIME_FLAGS.join(' ')}:\n${res.stderr}`).toBe(0); + }); +}); + +describe('processHasWasmRuntimeFlags', () => { + it('is true only when every required flag is present', () => { + expect(processHasWasmRuntimeFlags(['--liftoff-only'])).toBe(true); + expect(processHasWasmRuntimeFlags(['--liftoff-only', '--enable-source-maps'])).toBe(true); + }); + + it('is false when the flags are absent', () => { + expect(processHasWasmRuntimeFlags([])).toBe(false); + expect(processHasWasmRuntimeFlags(['--max-old-space-size=4096'])).toBe(false); + }); +}); + +describe('buildRelaunchArgv', () => { + it('places the wasm flags first, then the script and its args', () => { + expect(buildRelaunchArgv('/x/codegraph.js', ['index', '/repo'], [])).toEqual([ + '--liftoff-only', + '/x/codegraph.js', + 'index', + '/repo', + ]); + }); + + it('preserves other existing node flags without duplicating ours', () => { + expect( + buildRelaunchArgv('/x/codegraph.js', ['status'], ['--liftoff-only', '--enable-source-maps']) + ).toEqual(['--liftoff-only', '--enable-source-maps', '/x/codegraph.js', 'status']); + }); + + it('produces an argv that actually launches node WITH the flag applied', () => { + // End-to-end proof of the delivery mechanism without needing the crash: + // run the constructed argv and confirm the child sees the flag in execArgv. 
+ const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-relaunch-')); + try { + const harness = path.join(dir, 'harness.cjs'); + fs.writeFileSync(harness, 'process.stdout.write(JSON.stringify(process.execArgv));'); + const res = spawnSync(process.execPath, buildRelaunchArgv(harness, []), { encoding: 'utf8' }); + expect(res.status, res.stderr).toBe(0); + expect(JSON.parse(res.stdout)).toContain('--liftoff-only'); + } finally { + fs.rmSync(dir, { recursive: true, force: true }); + } + }); +}); diff --git a/__tests__/watch-policy.test.ts b/__tests__/watch-policy.test.ts new file mode 100644 index 00000000..5cb92ce7 --- /dev/null +++ b/__tests__/watch-policy.test.ts @@ -0,0 +1,82 @@ +/** + * Watch Policy Tests + * + * Covers the decision of whether the live file watcher runs, including the + * WSL2 /mnt auto-detect and the env-var escape hatches (issue #199), plus + * that FileWatcher.start() honors the decision. + */ + +import { describe, it, expect, afterEach, vi } from 'vitest'; +import * as fs from 'fs'; +import * as path from 'path'; +import * as os from 'os'; +import { watchDisabledReason } from '../src/sync/watch-policy'; +import { FileWatcher } from '../src/sync/watcher'; + +describe('watchDisabledReason', () => { + it('returns a reason when CODEGRAPH_NO_WATCH=1', () => { + const reason = watchDisabledReason('/home/me/project', { + env: { CODEGRAPH_NO_WATCH: '1' }, + isWsl: false, + }); + expect(reason).toBeTruthy(); + expect(reason).toMatch(/CODEGRAPH_NO_WATCH/); + }); + + it('auto-disables on a WSL2 /mnt drive', () => { + const reason = watchDisabledReason('/mnt/d/code/project', { env: {}, isWsl: true }); + expect(reason).toBeTruthy(); + expect(reason).toMatch(/mnt/); + }); + + it('does NOT disable on a native WSL home path', () => { + expect(watchDisabledReason('/home/me/project', { env: {}, isWsl: true })).toBeNull(); + }); + + it('does NOT disable on /mnt when not running under WSL', () => { + // A real Linux box may legitimately have a fast /mnt mount. 
+ expect(watchDisabledReason('/mnt/d/code/project', { env: {}, isWsl: false })).toBeNull(); + }); + + it('does NOT treat /mnt/wsl (fast Linux mount) as a Windows drive', () => { + expect(watchDisabledReason('/mnt/wsl/project', { env: {}, isWsl: true })).toBeNull(); + }); + + it('CODEGRAPH_FORCE_WATCH=1 overrides WSL auto-detect', () => { + const reason = watchDisabledReason('/mnt/d/code/project', { + env: { CODEGRAPH_FORCE_WATCH: '1' }, + isWsl: true, + }); + expect(reason).toBeNull(); + }); + + it('CODEGRAPH_NO_WATCH wins over CODEGRAPH_FORCE_WATCH', () => { + const reason = watchDisabledReason('/home/me/project', { + env: { CODEGRAPH_NO_WATCH: '1', CODEGRAPH_FORCE_WATCH: '1' }, + isWsl: false, + }); + expect(reason).toBeTruthy(); + }); +}); + +describe('FileWatcher honors the watch policy', () => { + let testDir: string; + + afterEach(() => { + delete process.env.CODEGRAPH_NO_WATCH; + if (testDir && fs.existsSync(testDir)) { + fs.rmSync(testDir, { recursive: true, force: true }); + } + }); + + it('does not start when CODEGRAPH_NO_WATCH=1', () => { + testDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-nowatch-')); + process.env.CODEGRAPH_NO_WATCH = '1'; + + const syncFn = vi.fn().mockResolvedValue({ filesChanged: 0, durationMs: 0 }); + const watcher = new FileWatcher(testDir, syncFn); + + expect(watcher.start()).toBe(false); + expect(watcher.isActive()).toBe(false); + }); +}); diff --git a/__tests__/watcher.test.ts b/__tests__/watcher.test.ts index f3638e6d..fde5f593 100644 --- a/__tests__/watcher.test.ts +++ b/__tests__/watcher.test.ts @@ -9,7 +9,6 @@ import * as fs from 'fs'; import * as path from 'path'; import * as os from 'os'; import { FileWatcher } from '../src/sync/watcher'; -import type { CodeGraphConfig } from '../src/types'; import CodeGraph from '../src/index'; /** @@ -34,18 +33,6 @@ function waitFor( describe('FileWatcher', () => { let testDir: string; - const baseConfig: CodeGraphConfig = { - version: 1, - rootDir: '.', - include: ['**/*.ts', 
'**/*.js'], - exclude: ['**/node_modules/**', '**/dist/**'], - languages: [], - frameworks: [], - maxFileSize: 1024 * 1024, - extractDocstrings: true, - trackCallSites: true, - }; - beforeEach(() => { testDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-watcher-')); // Create a source file so the directory isn't empty @@ -63,7 +50,7 @@ describe('FileWatcher', () => { describe('start/stop lifecycle', () => { it('should start and stop without errors', () => { const syncFn = vi.fn().mockResolvedValue({ filesChanged: 0, durationMs: 0 }); - const watcher = new FileWatcher(testDir, baseConfig, syncFn); + const watcher = new FileWatcher(testDir, syncFn); const started = watcher.start(); expect(started).toBe(true); @@ -75,7 +62,7 @@ describe('FileWatcher', () => { it('should be idempotent on double start', () => { const syncFn = vi.fn().mockResolvedValue({ filesChanged: 0, durationMs: 0 }); - const watcher = new FileWatcher(testDir, baseConfig, syncFn); + const watcher = new FileWatcher(testDir, syncFn); expect(watcher.start()).toBe(true); expect(watcher.start()).toBe(true); // Should not throw @@ -86,7 +73,7 @@ describe('FileWatcher', () => { it('should be idempotent on double stop', () => { const syncFn = vi.fn().mockResolvedValue({ filesChanged: 0, durationMs: 0 }); - const watcher = new FileWatcher(testDir, baseConfig, syncFn); + const watcher = new FileWatcher(testDir, syncFn); watcher.start(); watcher.stop(); @@ -98,7 +85,7 @@ describe('FileWatcher', () => { describe('debounced sync', () => { it('should trigger sync after file change', async () => { const syncFn = vi.fn().mockResolvedValue({ filesChanged: 1, durationMs: 10 }); - const watcher = new FileWatcher(testDir, baseConfig, syncFn, { debounceMs: 200 }); + const watcher = new FileWatcher(testDir, syncFn, { debounceMs: 200 }); watcher.start(); @@ -114,7 +101,7 @@ describe('FileWatcher', () => { it('should debounce rapid changes into a single sync', async () => { const syncFn = vi.fn().mockResolvedValue({ 
filesChanged: 1, durationMs: 10 }); - const watcher = new FileWatcher(testDir, baseConfig, syncFn, { debounceMs: 500 }); + const watcher = new FileWatcher(testDir, syncFn, { debounceMs: 500 }); watcher.start(); @@ -140,7 +127,7 @@ describe('FileWatcher', () => { describe('filtering', () => { it('should ignore files not matching include patterns', async () => { const syncFn = vi.fn().mockResolvedValue({ filesChanged: 0, durationMs: 0 }); - const watcher = new FileWatcher(testDir, baseConfig, syncFn, { debounceMs: 200 }); + const watcher = new FileWatcher(testDir, syncFn, { debounceMs: 200 }); watcher.start(); @@ -160,7 +147,7 @@ describe('FileWatcher', () => { it('should ignore .codegraph directory changes', async () => { const syncFn = vi.fn().mockResolvedValue({ filesChanged: 0, durationMs: 0 }); - const watcher = new FileWatcher(testDir, baseConfig, syncFn, { debounceMs: 200 }); + const watcher = new FileWatcher(testDir, syncFn, { debounceMs: 200 }); watcher.start(); @@ -185,7 +172,7 @@ describe('FileWatcher', () => { it('should call onSyncComplete after successful sync', async () => { const syncFn = vi.fn().mockResolvedValue({ filesChanged: 2, durationMs: 50 }); const onSyncComplete = vi.fn(); - const watcher = new FileWatcher(testDir, baseConfig, syncFn, { + const watcher = new FileWatcher(testDir, syncFn, { debounceMs: 200, onSyncComplete, }); @@ -203,7 +190,7 @@ describe('FileWatcher', () => { it('should call onSyncError when sync throws', async () => { const syncFn = vi.fn().mockRejectedValue(new Error('sync failed')); const onSyncError = vi.fn(); - const watcher = new FileWatcher(testDir, baseConfig, syncFn, { + const watcher = new FileWatcher(testDir, syncFn, { debounceMs: 200, onSyncError, }); diff --git a/debug_python_ast.js b/debug_python_ast.js deleted file mode 100644 index edfff62f..00000000 --- a/debug_python_ast.js +++ /dev/null @@ -1,26 +0,0 @@ -const { getParser, initGrammars, loadAllGrammars } = require('./dist/extraction/grammars'); - -(async 
() => { - await initGrammars(); - await loadAllGrammars(); - - const parser = getParser('python'); - - const code = `class Child(Parent): - pass`; - - const tree = parser.parse(code); - - function walk(node, depth = 0) { - const indent = ' '.repeat(depth); - const preview = node.text.substring(0, 30).replace(/\n/g, '\\n'); - console.log(`${indent}${node.type} [${node.startPosition.row}:${node.startPosition.column}] "${preview}"`); - - for (let i = 0; i < node.namedChildCount; i++) { - const child = node.namedChild(i); - if (child) walk(child, depth + 1); - } - } - - walk(tree.rootNode); -})(); diff --git a/debug_python_ast2.js b/debug_python_ast2.js deleted file mode 100644 index b92d5f0b..00000000 --- a/debug_python_ast2.js +++ /dev/null @@ -1,26 +0,0 @@ -const { getParser, initGrammars, loadAllGrammars } = require('./dist/extraction/grammars'); - -(async () => { - await initGrammars(); - await loadAllGrammars(); - - const parser = getParser('python'); - - const code = `class Child(Parent, Mixin, Base): - pass`; - - const tree = parser.parse(code); - - function walk(node, depth = 0) { - const indent = ' '.repeat(depth); - const preview = node.text.substring(0, 40).replace(/\n/g, '\\n'); - console.log(`${indent}${node.type} "${preview}"`); - - for (let i = 0; i < node.namedChildCount; i++) { - const child = node.namedChild(i); - if (child) walk(child, depth + 1); - } - } - - walk(tree.rootNode); -})(); diff --git a/docs/benchmarks/answer-directly-vs-explore-agent.md b/docs/benchmarks/answer-directly-vs-explore-agent.md new file mode 100644 index 00000000..09167ec1 --- /dev/null +++ b/docs/benchmarks/answer-directly-vs-explore-agent.md @@ -0,0 +1,88 @@ +# Answer directly vs. delegate to an Explore agent (interactive A/B) + +**Question:** Does answering a "how does X work?" 
question *directly* with CodeGraph in the +main session bloat main-session context — and would Claude Code be better off delegating that +exploration to a disposable **Explore agent** (which keeps main context lean by absorbing the +file reads in a sub-transcript)? And critically: **does the answer change at scale**, on a +codebase far larger than Excalidraw? + +**Short answer:** No. With CodeGraph, main-session context is roughly **scale-invariant (~50k)** +because the retrieval is targeted and the `explore` payload is budget-capped — it does not +balloon on a 16× larger repo. Answering directly wins at **every** scale: same-or-leaner main +context than the delegation path, **zero file reads**, and ~28% fewer tokens. The +delegation-for-hygiene advantage stays marginal even on a large codebase. + +## Methodology + +- **Harness:** interactive Claude Code TUI driven via `scripts/agent-eval/itrun.sh` (tmux), + **not** headless `claude -p`. This matters: headless spawns **0** Explore agents, so it cannot + measure delegation behavior at all; only the interactive TUI does. +- **Arms:** `WITH` = CodeGraph in the MCP config; `WITHOUT` = empty MCP config (`--strict-mcp-config`). +- **Model:** `opus`. **n = 3 runs per arm.** Main **and** sub-agent transcripts parsed + (`scripts/agent-eval/parse-session.mjs`); reads/bash are summed across main + sub-agents. +- **Repos:** Excalidraw (643 files, medium) and VS Code (~10.7k files, large — ~16× Excalidraw). +- **Build:** 0.9.4. **Date:** 2026-05-24. +- "main-session context" is the TUI's reported `Context X/Y` for the *main* thread (sub-agent + context does not count against it). "billable tokens" = summed per-turn assistant usage + (input + output + cache read + cache creation). 
+ +## Excalidraw (643 files, medium) + +Question: *"How does Excalidraw render and update canvas elements?"* + +| metric | WITH codegraph | WITHOUT | +|---|---|---| +| Explore agents spawned | 0 / 0 / 0 | 0 / 1 / 1 (delegated 2 of 3) | +| main-session context | 51k / 49k / 50k (~50k) | 48k / 34k / 26k (~36k) | +| total tool calls | 4 / 4 / 4 | 16 / 55 / 37 | +| Reads (main+sub) | 0 / 0 / 0 | 6 / 25 / 16 | +| billable tokens | ~127k | ~175k | + +## VS Code (~10.7k files, large — ~16× Excalidraw) + +Question: *"How does the extension host communicate with the main process?"* + +| metric | WITH codegraph | WITHOUT | +|---|---|---| +| main-session context | 47k / 43k / 50k (~47k) | 54k / 29k / 31k (~38k) | +| Explore agents | 0 / 0 / 0 | 0 / 1 / 1 (delegated 2/3) | +| codegraph calls | ~8 (search + explore×2–3 + context) | 0 | +| Reads (main+sub) | 0 / 1 / 0 | 6 / 26 / 19 | +| billable tokens | ~126k | ~176k | + +## Findings + +**Main-session context is scale-invariant with CodeGraph.** With codegraph, main-session +context was **~47k on VS Code — essentially identical to Excalidraw's ~50k**, despite a 16× +bigger repo. It didn't balloon. Reason: codegraph's `explore` payload is **budget-capped** and +retrieval is **targeted** — answering one question pulls in the relevant *flow/area*, not more +just because the repo is huge. So codegraph makes main-session context roughly scale-invariant +(~50k). The delegation-for-hygiene advantage stays marginal even on a large codebase — exactly +the opposite of "it gets significant at scale." + +The thing that *would* balloon at scale is reading many big files directly into main — and +Claude Code avoids that **without** codegraph by delegating to an Explore agent (29–31k main), +but at the cost of **17–26 reads** and ~38–40% more tokens (~175k vs. ~127k). CodeGraph keeps main lean a *better* +way: a capped, targeted payload — no delegation, **0–1 reads**. 
+ +**On "the Explore agents use codegraph."** I couldn't reproduce it: across **6/6** +with-codegraph runs (both repos), Claude Code **never delegated** — it answered directly every +time. The Explore-agent path only appeared in the `without` arm (using grep/read, since codegraph +wasn't in that config). So with the current instructions + codegraph present, Claude Code stays +in the main session — the lean-main-via-Explore-agent best case simply isn't what happens; +lean-main-via-capped-codegraph is, and it's cheaper. + +## Verdict + +**"Answer directly with codegraph" wins for Claude Code too — at every scale.** No per-agent +split is needed; the unified "answer directly" instruction is right for Claude Code *and* for +Codex / Cursor / opencode (which have no Explore-agent mechanism and would otherwise read files +directly). This conclusion drove updating the README's `## CodeGraph` example block, which +previously told agents to "NEVER call `codegraph_explore` directly / ALWAYS spawn an Explore +agent" — i.e., it steered Claude Code toward the *worse* (17–26 read, ~38–40%-more-token) path. + +**Caveat / future work (not a blocker):** an Explore agent that *itself uses codegraph* could in +principle get lean-main *and* low-work. But the "answer directly" instruction prevents delegation +in practice (0 delegations observed across 6 runs), the main-context gain would be marginal +(~50k → ~30k, both a few percent of a 1M window), and it adds a sub-agent round-trip. Worth a +future experiment, not a default. 
diff --git a/docs/benchmarks/call-sequence-analysis.md b/docs/benchmarks/call-sequence-analysis.md new file mode 100644 index 00000000..3c79bad5 --- /dev/null +++ b/docs/benchmarks/call-sequence-analysis.md @@ -0,0 +1,426 @@ +# Call-sequence analysis — why read savings don't convert to wall-clock + +**Date:** 2026-05-23 · **Branch:** `architectural-improvements` · **Source data:** the surviving +stream-json logs from the A/B matrix (`/tmp/ab-matrix//run-headless-{with,without}.jsonl`, +37 cells × 2 arms). Re-mined — **no re-runs** — with `scripts/agent-eval/seq-matrix.mjs`. + +## Why this exists + +The [A/B matrix](codegraph-ab-matrix.md) showed codegraph cuts **reads 75%** but **wall-clock only +~16%**, and 63% of the wall-clock win comes from just 3 large-repo cells. Reads are at the floor +(~0), so the remaining wall-clock is **round-trips + the synthesis turn** — neither of which read +count can explain. The matrix records tool *counts*, not the call **sequence** or per-call +**payload size**. This analysis recovers both, to find where the wall-clock actually goes. + +## TL;DR — the bottleneck is trace ADOPTION, not trace completeness + +1. **Trace is called in 3 of 37 cells** — even though every question is a canonical flow question + ("trace the controller → service → repository", "how does X reach Y"). The agent overwhelmingly + reaches for **`context → search → search → explore`** instead — the exact path-reconstruction + anti-pattern the instructions tell it to avoid. +2. **`explore` averages 17.9K chars/call; `trace` averages 0.8K** — a **22× payload difference**. + The path-scoped tool that solves the small-repo-bloat problem exists and is tiny. It's just not + being invoked. +3. **Small repos still get bloated payloads** because of the explore-default: a **6-file** repo + (`flutter_module_books`) pulls **17.4K**; a 10-file repo pulls 18.0K. This is precisely the + "too much context on small codebases" failure mode — happening right now, via explore. +4. 
**Round-trips are 25% fewer with codegraph (283 vs 375 turns)** but wall-clock is only 16% + faster — because the with-arm's turns each carry a ~18K explore payload, inflating TTFT and + eroding the turn savings. +5. **Root cause:** `src/mcp/server-instructions.ts` leads with *"answer directly … `codegraph_context` + first, then ONE `codegraph_explore`"* as the headline pattern. The trace-first guidance is buried + in a table + a chain list below it. Agents anchor on the prominent headline → context→explore. + +**Decision:** the next experiment is **trace-first steering / adoption**, not enriching trace. We +can't evaluate trace's completeness when it's used 3/37 times. Get adoption up first, then measure +whether the residual `node`/`explore` follow-ups need a richer trace. + +## Finding 1 — trace adoption: 3/37 + +| metric | value | +|---|---| +| flow-question cells | 37 (all of them) | +| cells that called `codegraph_trace` | **3** (`cpp-leveldb`, `excalidraw`, `c-redis`) | +| dominant pattern instead | `context` → `search`×N → `explore` | + +The 3 trace cells, and what followed the trace call: + +| repo | files | cg sequence | turns (with/without) | +|---|--:|---|---| +| cpp-leveldb | 134 | `trace, node, node` | 5 / 8 | +| excalidraw | 643 | `context, trace, trace, explore` | 6 / **19** | +| c-redis | 884 | `context, trace, explore, node` | 10 / 15 | + +Even when trace *is* used, the agent follows it with `node`/`explore` to fetch bodies — so a +secondary lever (after adoption) is making one trace call self-sufficient enough to kill those +follow-ups. But that's step 2. 
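The two questions Finding 1 asks of each run (did it call trace, and what did it call afterwards) can be expressed as a pair of small helpers. A minimal sketch, using only the three sequences from the table above; the function names are illustrative and this is not the logic of `scripts/agent-eval/seq-matrix.mjs`.

```javascript
// Codegraph call sequences for the three trace cells, copied from the table above.
const traceCells = {
  'cpp-leveldb': ['trace', 'node', 'node'],
  excalidraw: ['context', 'trace', 'trace', 'explore'],
  'c-redis': ['context', 'trace', 'explore', 'node'],
};

// True when the run called trace at least once.
function adoptedTrace(seq) {
  return seq.includes('trace');
}

// Calls made after the last trace call. An empty array would mean the agent
// stopped at trace; null means trace was never called.
function followUpsAfterTrace(seq) {
  const last = seq.lastIndexOf('trace');
  return last === -1 ? null : seq.slice(last + 1);
}

for (const [repo, seq] of Object.entries(traceCells)) {
  console.log(repo, adoptedTrace(seq), followUpsAfterTrace(seq).join(', '));
}
```

In all three cells the follow-up list is non-empty and made of `node`/`explore` calls, which is the behavior the step-2 lever targets.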
+ +## Finding 2 — payload size: path-scoped trace (0.8K) vs breadth-scoped explore (17.9K) + +Across all cells, per codegraph tool — call count and **average payload per call**: + +| tool | calls | avg/call | total | +|---|--:|--:|--:| +| `explore` | 32 | **17.9K** | 573K | +| `context` | 36 | 4.3K | 156K | +| `search` | 39 | 1.3K | 50K | +| `files` | 5 | 3.4K | 17K | +| `node` | 19 | 2.0K | 38K | +| `trace` | 4 | **0.8K** | 3.4K | + +`context` (used in 36/37 cells) is the default opener; `explore` is the default closer. Together +they are the ~22K breadth dump. `trace` — the tool that would replace that with the actual path — +is 22× smaller and barely used. This is the user's premise confirmed in numbers: explore is +breadth-scoped (returns the neighborhood), trace is path-scoped (returns the line). + +## Finding 3 — payload grows with repo size, and over-returns on small repos + +With-arm **total** codegraph payload by repo-size tier: + +| tier | cells | avg total payload | range | +|---|--:|--:|--:| +| S (<200 files) | 19 | 12.7K | 3.0–31.2K | +| M (<2000) | 9 | 32.4K | 5.4–58.2K | +| L (≥2000) | 9 | 34.0K | 20.2–43.1K | + +The small-repo waste is concrete — these all have a 2–3 file flow but pull a full neighborhood: + +| repo | files | with-arm payload | sequence | +|---|--:|--:|---| +| flutter_module_books | 6 | 17.4K | `context, explore` | +| computer-database | 10 | 18.0K | `context, search, status, explore` | +| aspnet-realworld | 78 | 22.2K | `context, explore` | +| django-realworld | 44 | 14.8K | `context, explore` | + +`explore`'s per-call budget is already adaptive (#185), but it doesn't help here because the agent +isn't choosing the path-scoped tool — it's choosing breadth. 
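The per-tool figures in Finding 2 can be rechecked with a few lines of arithmetic. A minimal sketch, using only the call counts and per-call averages from that table; the variable names are illustrative, and the recomputed totals are products of rounded averages, so they can differ slightly from the table's totals.

```javascript
// Call counts and average payload per call (in K chars), copied from Finding 2.
const tools = [
  { tool: 'explore', calls: 32, avgK: 17.9 },
  { tool: 'context', calls: 36, avgK: 4.3 },
  { tool: 'search', calls: 39, avgK: 1.3 },
  { tool: 'files', calls: 5, avgK: 3.4 },
  { tool: 'node', calls: 19, avgK: 2.0 },
  { tool: 'trace', calls: 4, avgK: 0.8 },
];

const byName = Object.fromEntries(tools.map((row) => [row.tool, row]));

// Per-call payload ratio between the breadth-scoped and path-scoped tools.
const ratio = byName.explore.avgK / byName.trace.avgK;
console.log(`explore/trace per-call ratio: ${ratio.toFixed(1)}x`);

// The default opener plus the default closer, i.e. the "breadth dump".
const openerPlusCloser = byName.context.avgK + byName.explore.avgK;
console.log(`context + explore per flow: ${openerPlusCloser.toFixed(1)}K`);

// Recomputed totals per tool (calls x average).
for (const row of tools) {
  console.log(`${row.tool}: ${(row.calls * row.avgK).toFixed(1)}K`);
}
```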
+ +## Finding 4 — round-trips, and the ToolSearch tax + +| metric | with | without | +|---|--:|--:| +| total turns (37 cells) | 283 | 375 | +| avg turns / cell | 7.6 | 10.1 | + +25% fewer turns, but only ~16% faster wall-clock — the gap is the per-turn cost of the big explore +payloads. Also: **every with-arm run opens with a `ToolSearch` round-trip** (MCP tools are deferred +in this harness), a fixed 1-turn tax before any codegraph call. Worth confirming whether the +production install defers codegraph tools the same way. + +## Conclusion → the experiment to run next + +Measure-first changed the plan. The hypothesis was "enrich trace so one call is self-sufficient." +The data says trace is **used 3/37 times**, so completeness is moot until adoption is fixed. + +**Experiment: trace-first steering A/B.** +- **Change:** rewrite the `server-instructions.ts` headline so a *flow* question (how does X reach Y + / trace / from→to) routes to `codegraph_trace` **first**, demoting the context→explore pattern to + non-flow/onboarding questions. Mirror into `instructions-template.ts` + `.cursor/rules/codegraph.mdc`. +- **Metric:** trace-adoption rate (target ≫ 3/37), with-arm total payload (expect ↓ sharply, + especially small repos), turns (expect ↓), wall-clock (expect the 16% gap to widen toward the + 25% turn gap as 18K explore payloads are replaced by <1K traces). +- **Control:** a non-flow "what's the deal with module X" question must still go context→explore — + don't over-steer everything to trace. +- **Then, step 2:** with adoption up, measure the `node`/`explore` follow-ups after trace + (cpp-leveldb/excalidraw/c-redis all had them). If they're frequent, enrich trace (per-hop body + snippet, capped per hop) so one trace call ends the flow investigation. + +## Reproduce + +```bash +node scripts/agent-eval/seq-matrix.mjs # regenerates every table above from /tmp/ab-matrix +``` + +--- + +# Ablation experiment — do `context`, `explore`, and `trace` compete? 
Is `trace` enough? + +**Date:** 2026-05-23 · 52 runs, ~$20. Tool surface trimmed **server-side** via the new +`CODEGRAPH_MCP_TOOLS` allowlist (so an ablated tool is genuinely absent from ListTools, not +denied-on-call); trace-first steering injected with `--append-system-prompt`. 6 repos (2 S / 2 M / +2 L) × 2 runs; arm E is a **non-flow** survey question on 2 repos. Driver `arms-matrix.sh`, +analysis `parse-arms.mjs`. + +| arm | tools | steering | adoption | reads | cgOut | turns | dur | +|---|---|---|--:|--:|--:|--:|--:| +| **A** control | all | none | 2/12 | 1.25 | 28.8K | 7.6 | 38s | +| **B** steer | all | trace-first | **8/12** | 1.00 | **32.0K** | 7.9 | 43s | +| **C** no-explore | hide explore | trace-first | 8/12 | **2.08** | **9.2K** | 9.0 | 44s | +| **D** trace-centric | hide explore+context | trace-first | 8/12 | 2.00 | 6.6K | 10.5 | 46s | +| **E** control-probe | hide explore+context | trace-first | 0/4 | 2.50 | 27.8K | **20.0** | **72s** | + +## What it says + +1. **Steering works for adoption, not for payload.** B lifted trace use **2/12 → 8/12** (and 4/4 on + the genuinely path-shaped questions — the 2 non-adopters, flutter "what widgets" and vapor "name + the route", aren't from→to questions). But B's payload (32.0K) is *bigger* than control (28.8K) + and it's slightly slower — because the agent calls trace **and still calls explore**. Steering + adds a trace hop without displacing the explore dump. +2. **`explore` is the payload, and it's load-bearing — but 3–5× too heavy.** Removing it (C) cuts + payload **71%** (32K→9.2K) — confirming it's the bloat. But reads **double** (1.0→2.1) and turns + rise: the agent Reads files to recover the bodies explore had inlined. So explore isn't + redundant; it's the only one-call body-supplier, just delivered with a 32K sledgehammer. +3. **`context` is the most redundant of the three — as a body-supplier.** Removing it on top of + explore (D vs C) left reads flat (2.08→2.00) but raised turns (9.0→10.5). 
It supplies no unique + bodies; it earns its keep only as a round-trip-saver (the composed orient call). +4. **Removing tools makes flow questions SLOWER, not faster.** Turns climb monotonically + A→D (7.6→10.5) and duration with them — the Read + trace-follow-up round-trips cost more + wall-clock than the saved payload. Leaner payload ≠ faster. +5. **`trace` is definitively NOT sufficient.** The non-flow probe (E) thrashed without the survey + tools — **20 turns, 72s** reconstructing an overview from search/node/files. Survey questions + need a survey tool; trace can't substitute. + +## Verdict on the three design questions + +- **Do we need all three?** Yes — but for different reasons. trace = flow tool (real, under-adopted). + explore = the one-call body-supplier (load-bearing, over-heavy). context = round-trip-saving + opener (redundant for bodies, useful for orientation). +- **Are they competing?** Yes: explore competes with trace and *wins by default* — even when steered, + the agent traces **and** explores, so the payload win never lands until explore is displaced. +- **Could trace be all we need?** No. E rules it out for non-flow questions; C/D rule it out even + for flow (reads double without explore's bodies). + +**Three cheap fixes are now ruled out by data:** "trace is all we need" (false), "just steer to +trace" (B: slower + bigger than control), and "remove explore" (C/D: more reads/turns, slower). + +## The fix the data points to → next experiment + +The only path that wins: **make `trace` self-sufficient by inlining per-hop bodies** (capped per +hop → still path-scoped) so one trace call supplies what explore does *and* what the Read fallback +recovers — displacing both for flow questions. Keep **one** survey tool (context; demote explore to +deep-survey, not the flow default) for the non-flow class E proved is load-bearing. + +- **Experiment:** enriched body-inlining `trace` + steering vs control. 
+- **Target:** C/D's lean payload (~7–9K, not 32K) **without** C/D's extra reads/turns, and **beat A + on wall-clock** (the bar B/C/D all failed). +- **Metric:** payload, reads (must stay ≈ A's ~1.0, not rise to 2.0), turns, duration. + +## Reproduce (ablation) + +```bash +bash scripts/agent-eval/arms-matrix.sh # 52 runs into /tmp/arms (RUNS=2 default) +node scripts/agent-eval/parse-arms.mjs # the arm-comparison tables above +``` + +--- + +# Validation — body-inlining trace (arm F) + +The ablation pointed to one fix: make `trace` self-sufficient by inlining per-hop **bodies** +(capped per hop → still path-scoped) so one trace call displaces both the explore dump and the +Read fallback. Implemented in `handleTrace` (`sourceRangeAt`, 28 lines / 1200 chars per hop, with a +`… (+N more lines)` marker). Arm **F** = arm B's surface (all tools + trace-first steering) run on +the body-inlining build, so **F vs B isolates the enrichment**. + +| arm | adoption | reads | cgOut | turns | dur | cost | +|---|--:|--:|--:|--:|--:|--:| +| A all/none | 2/12 | 1.25 | 28.8K | 7.6 | 38s | $0.390 | +| B all/steer (thin trace) | 8/12 | 1.00 | 32.0K | 7.9 | 43s | $0.411 | +| **F all/steer (body trace)** | 5/12 | **1.17** | **25.1K** | **6.8** | **37s** | **$0.348** | +| C no-explore | 8/12 | 2.08 | 9.2K | 9.0 | 44s | $0.356 | +| D trace-centric | 8/12 | 2.00 | 6.6K | 10.5 | 46s | $0.368 | + +**F is the best-balanced arm:** lowest turns (6.8), fastest (37s), cheapest, payload leaner than +A/B — and it hits the target the ablation set: **C/D-class efficiency without C/D's Read penalty** +(F reads 1.17 vs C/D's ~2.0). It gets there not by *removing* a tool but by giving the agent a +complete trace so it *stops early*. 
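The per-hop cap described above (28 lines / 1200 chars, with a `… (+N more lines)` marker) could look roughly like the following. This is a sketch under assumptions, not the code in `handleTrace`: the function name `clipHopBody`, the order in which the two limits are checked, and the exact marker placement are all hypothetical.

```javascript
// Hypothetical per-hop clip: keeps whole lines until either limit would be exceeded.
const MAX_LINES = 28;
const MAX_CHARS = 1200;

function clipHopBody(source, maxLines = MAX_LINES, maxChars = MAX_CHARS) {
  const lines = source.split('\n');
  const kept = [];
  let chars = 0;
  for (const line of lines) {
    if (kept.length >= maxLines) break;
    // +1 accounts for the newline that joins this line to the previous one.
    const cost = line.length + (kept.length > 0 ? 1 : 0);
    if (chars + cost > maxChars) break;
    kept.push(line);
    chars += cost;
  }
  const omitted = lines.length - kept.length;
  if (omitted === 0) return source; // body fits, return it untouched
  return kept.join('\n') + `\n… (+${omitted} more lines)`;
}

// A 40-line body is cut at the line limit and reports the 12 dropped lines.
const longBody = Array.from({ length: 40 }, (_, i) => `line ${i}`).join('\n');
console.log(clipHopBody(longBody).split('\n').pop()); // … (+12 more lines)
```

Clipping on whole lines keeps each hop readable as code, and the marker tells the reader that the body was truncated rather than short.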
+ +**The win is clearest where trace connects** — excalidraw (the validated 6-hop path): + +| arm | sequence | turns | reads | dur | +|---|---|--:|--:|--:| +| B (thin) | `trace → context → explore → Grep → Read` | 7 | 1 | 47s | +| **F (body) r1** | `trace → context` | **4** | **0** | **31s** | +| F (body) r2 | `trace → trace → explore` | 5 | 0 | 42s | + +The body-trace ended the investigation in `trace → context` (run 1) — 0 reads, 0 grep, 0 explore. + +**Connectivity is the cap.** On flows that break at *unbridged* dynamic dispatch — aspnet-realworld +(MediatR `_mediator.Send → Handle`), vapor-spi (closure routing) — trace returns "no path" and the +agent falls back to explore, so F ≈ B (no regression, no gain). F's aggregate lift is therefore +**gated by dynamic-dispatch coverage**: the more flows the graph connects end-to-end, the more often +the self-sufficient trace fires. (n=2/arm — adoption and per-repo numbers are noisy; excalidraw and +spring-halo, the connecting repos, are 2/2 trace in both B and F.) + +## Verdict & ship list + +1. **Ship the body-inlining trace** — strict improvement (best-balanced arm; clean 0-read/4-turn win + on connecting traces; no regression on non-connecting ones). +2. **Strengthen the steering.** Arm A (shipped server-instructions, which *already* say "trace first + for flow") adopted trace only 2/12 — the guidance is too buried. The explicit + `--append-system-prompt` used in B–F lifted it. Port that into `server-instructions.ts` + + `instructions-template.ts` + `.cursor/rules/codegraph.mdc` (house rule: all three together), + flow-gated so non-flow survey questions still go context/explore (arm E proved they must). +3. **Next frontier to widen F's reach:** bridge more dynamic dispatch (MediatR/.NET, Vapor routing) — + every newly-connected flow converts an F≈B repo into an F-win repo. 
+ +## Reproduce (arm F) + +```bash +bash scripts/agent-eval/arms-F.sh # 12 runs (RUNS=2); needs the body-inlining build +node scripts/agent-eval/parse-arms.mjs # F appears alongside A/B/C/D/E +``` + +--- + +# Steering port — the negative result (arm G) + +F's win used `--append-system-prompt`, which real users don't get. Arm **G** = arm A's invocation +(NO append-prompt) on a build where the steering was ported into the production channels +(`server-instructions.ts` + the `context`/`trace` tool descriptions + `instructions-template.ts` + +`.cursor/rules`). Three wording iterations, 12 runs each: + +| arm | adoption | reads | payload | turns | dur | +|---|--:|--:|--:|--:|--:| +| A (shipped instructions) | 2/12 | 1.25 | 28.8K | 7.6 | **38s** | +| F (body-trace + append-prompt) | 5/12 | **1.17** | 25.1K | 6.8 | **37s** | +| G v1 — anti-explore wording | 6/12 | 2.08 | 13.8K | 8.8 | 46s | +| G v2 — restore explore as fallback | 6/12 | 1.67 | 22.0K | 7.8 | 46s | +| G v3 — restore context as opener | 6/12 | 2.08 | 11.7K | 8.9 | 46s | + +**Production-instruction steering does not reproduce F, and regresses the A baseline.** All three G +variants pin at **~46s** (slower than A's 38s and F's 37s) with reads at 1.7–2.1 (vs A 1.25, F 1.17). +Wording only shuffled the slack between Read and explore — v1 suppressed explore → Read; v2/v3 +restored explore → over-investigation — never landing F's lean `trace → context`. + +**Two root causes:** +1. **Salience.** The same trace-first wording works as a top-of-prompt `--append-system-prompt` (F) + but not as an MCP `initialize` instruction / tool description (G). An MCP server has no + higher-salience channel — this is an architectural limit, not a wording bug. +2. 
**Forcing trace-first backfires where trace doesn't connect.** Steering pushed trace onto + MediatR (`_mediator.Send`) and Spring interface-DI (`@Autowired` iface → impl) flows, where trace + returns no-path; the forced trace is then a wasted round-trip *before* the fallback → slower. + The **unsteered** agent (A) is better-calibrated: it traces only when trace will obviously + connect (2/12) and explores otherwise. + +## Arm H — body-trace alone (the ship candidate) regresses + +The clean ship test: body-inlining trace + ORIGINAL instructions + no steering (= A's invocation, +only the trace *tool* changed). H vs A isolates the body-trace feature with nothing else moving. + +| arm | adoption | reads | payload | turns | dur | +|---|--:|--:|--:|--:|--:| +| A (no body-trace) | 2/12 | 1.25 | 28.8K | 7.6 | **38s** | +| H (body-trace, no steering) | 3/12 | 1.50 | 29.7K | 8.0 | **45s** | +| F (body-trace + append-prompt) | 5/12 | 1.17 | 25.1K | 6.8 | 37s | + +**Body-trace alone does NOT beat A — it mildly regresses** (45s vs 38s). The sequences show why: +unsteered, the agent treats trace as just one more call in its usual loop — excalidraw H was +`context → trace → explore → node×3 → Grep → Read` (77s) — so the bigger body-trace payload is pure +added cost, not offset by fewer follow-ups. The body-trace only pays off when the agent **leads with +trace and stops after it**, which only the append-prompt (F) achieved. + +## Final verdict + +The body-inlining trace is a real win (F) but its value is **entirely contingent on +lead-with-and-stop-after-trace steering we cannot deliver through any production MCP channel** +(append-prompt salience ≫ server-instructions / tool-descriptions; G failed three times). On its own +(H) it regresses. So: + +- **SHIP: the `CODEGRAPH_MCP_TOOLS` allowlist** — independent, clean, validated. +- **DON'T ship the body-inlining trace or the steering as-is** — measured neutral-to-negative + without a steering channel we don't have. 
+- **The real lever is connectivity, not steering** — trace earns its keep only when flows connect + end-to-end; dynamic-dispatch synthesizers (MediatR/.NET, Spring interface-DI, Vapor closures) help + the *unsteered* agent, which already traces when trace will connect. +- **One untested lever** to rescue the body-trace: steer via the trace tool's OWN OUTPUT (the + highest-salience channel — the agent reads it fresh, right at the decision point) with a strong + leading "complete flow — answer from this, don't explore" banner. Instructions/descriptions are + too far from the action; the tool result is not. Unproven; the only remaining shot at making the + body-trace pay off in production. + +Measure-first paid off three times: it killed three cheap fixes in the ablation, stopped a steering +change that would have shipped an ~8s/query regression (G), and stopped shipping the body-trace +itself on a confounded assumption (H showed it needs steering we can't deliver). + +## Reproduce (arm G) + +```bash +ARM=G bash scripts/agent-eval/arms-F.sh # production-instruction steering, no append-prompt +node scripts/agent-eval/parse-arms.mjs +``` + +--- + +# Arm I — sufficiency, not steering (the shippable win) + +An LLM stops investigating when its context is *sufficient*, not when it's told to stop. So arm I +makes the trace OUTPUT complete instead of steering — same invocation as H (original instructions, +**no steering**), only the trace tool changed: +1. **Hop bodies no longer clipped** at 28 lines (that clip is why H re-fetched `mutateElement`). +2. **The destination's own callees are inlined** — the "last mile" the agent otherwise explores/Reads + for (excalidraw: `renderStaticScene → _renderStaticScene / renderStaticSceneThrottled`). 
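The second change, inlining the destination's own callees, can be illustrated on a toy call graph. A minimal sketch, assuming the graph is a plain caller-to-callees map: the helper names and the `entry`/`middle` hops are invented placeholders, only the `renderStaticScene` callees come from the example above, and codegraph's real trace implementation may work differently.

```javascript
// Hypothetical call graph: caller -> callees. `entry` and `middle` are invented;
// the renderStaticScene callees are the ones named in the text above.
const calls = new Map([
  ['entry', ['middle']],
  ['middle', ['renderStaticScene']],
  ['renderStaticScene', ['_renderStaticScene', 'renderStaticSceneThrottled']],
]);

// Breadth-first search for the shortest call path from `from` to `to`.
function findPath(graph, from, to) {
  const prev = new Map([[from, null]]);
  const queue = [from];
  while (queue.length > 0) {
    const node = queue.shift();
    if (node === to) {
      const path = [];
      for (let n = to; n !== null; n = prev.get(n)) path.unshift(n);
      return path;
    }
    for (const next of graph.get(node) || []) {
      if (!prev.has(next)) {
        prev.set(next, node);
        queue.push(next);
      }
    }
  }
  return null; // no connected path
}

// Returns the path plus the destination's direct callees (the "last mile").
function traceWithDestinationCallees(graph, from, to) {
  const path = findPath(graph, from, to);
  if (!path) return { path: null, destinationCallees: [] };
  return { path, destinationCallees: graph.get(to) || [] };
}

console.log(traceWithDestinationCallees(calls, 'entry', 'renderStaticScene'));
```

Returning the callees alongside the path means the consumer does not need a second lookup to see what the destination does next.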
+ +| arm | adoption | reads | greps | payload | turns | dur | cost | +|---|--:|--:|--:|--:|--:|--:|--:| +| A baseline | 2/12 | 1.25 | 1.17 | 28.8K | 7.6 | 38s | $0.390 | +| H body-trace alone | 3/12 | 1.50 | 0.42 | 29.7K | 8.0 | 45s | $0.398 | +| **I body-trace + dest callees** | 2/12 | **1.17** | **0.25** | 27.2K | **7.0** | 39s | **$0.359** | +| F body-trace + append-steer | 5/12 | 1.17 | 0.17 | 25.1K | 6.8 | 37s | $0.348 | + +**I ≥ A on every axis** (reads, greps, turns, cost down; wall-clock flat) and **≈ F on outcomes with +zero steering** — despite *lower* trace adoption (2/12 vs F's 5/12). The destination-callees fix +turned the body-trace from a net-negative (H, 45s) into a net-positive (I, 39s): one richer trace +call now displaces the explore+node+Read follow-ups it used to trigger. excalidraw I-r2 was +`context → trace → explore` — **0 reads, 5 turns**, stopped because the data was present. The residual +reads (I-r1) are the `canvasNonce` data-flow — the def-use frontier the graph deliberately omits. + +This confirms the thesis: **completeness stops the agent; steering doesn't.** Every steering arm +(B/F append-prompt, G instructions) was either unshippable or a regression; the sufficiency arm (I) +ships and needs no steering. + +## Revised final verdict (supersedes the arm-G/H verdict above) + +- **SHIP: body-inlining trace + destination callees** (arm I) — ≥ A on all axes, no steering, no + regression; makes the self-sufficient-trace property real (one trace call answers the flow). +- **SHIP: the `CODEGRAPH_MCP_TOOLS` allowlist** — independent, validated. +- **DON'T ship steering** (instructions or tool descriptions) — three variants regressed; MCP can't + deliver append-prompt salience, and forcing trace where it doesn't connect backfires. 
+- **Connectivity is the multiplier** — arm I helps most where the trace connects; MediatR/.NET, + Spring interface-DI, and Vapor closures are the next synthesizers, and they help the *unsteered* + agent (which already traces when trace will connect). + +## Reproduce (arm I) + +```bash +ARM=I bash scripts/agent-eval/arms-F.sh # body-trace + destination callees, no steering +node scripts/agent-eval/parse-arms.mjs +``` + +--- + +# Current-build with/without A/B — the 7 README repos (2026-05-24) + +Re-ran the published README benchmark on the **current build** (all 7 repos freshly reindexed), +same queries, **median of 4 runs/arm** (headless: codegraph-only MCP vs empty MCP): + +| repo | time with→without | tools w→wo | tokens w→wo (saved) | cost w→wo (saved) | +|---|---|--:|--:|--:| +| vscode | 1m10s→2m26s | 8→55 | 601k→2.8M (78%) | $0.60→$0.80 (26%) | +| excalidraw | 48s→2m58s | 3→79 | 344k→3.5M (90%) | $0.43→$0.90 (52%) | +| django | 1m19s→1m38s | 9→19 | 739k→1.2M (36%) | $0.59→$0.67 (12%) | +| tokio | 53s→3m2s | 4→53 | 379k→2.6M (86%) | $0.42→$2.41 (82%) | +| okhttp | 42s→1m1s | 6→11 | 636k→730k (13%) | $0.47→$0.47 (2%) | +| gin | 44s→1m0s | 6→10 | 444k→675k (34%) | $0.37→$0.47 (21%) | +| alamofire | 1m17s→2m27s | 12→69 | 1.0M→2.8M (64%) | $0.61→$1.14 (47%) | + +**Average saved: 35% cost · 57% tokens · 46% time · 71% tool calls** — reproduces the published +README headline (35% / 59% / 49% / 70%); the current build holds the benchmark with no regression. + +**Cost is lower, not "flat"** (corrects the earlier note). But the **mechanism is volume, not +cache-ability**: codegraph answers in far fewer turns over a much smaller accumulated context, while +the without-arm fans out across many more turns (55–79 tool calls on the big repos), each +re-processing a large, growing context. The without-arm's token volume is *mostly* cheap cache-reads, +which is why **token-count savings (57%) look bigger than cost savings (35%)**. 
Per-repo margin tracks +how hard the without-arm thrashes that run (tokio blew up to $2.41/3m; django thrashed less). + +**Measurement gotcha:** `result.usage` in this Claude Code version is the **last turn only**, not +cumulative — using it under-counts tokens badly (an earlier excalidraw cut reported "−34% tokens" +off this bug; the real figure is ~90%). Sum **per-turn assistant `usage`** for the true total. +`total_cost_usd` and `duration_ms` are already cumulative/correct. + +Reproduce: +```bash +bash scripts/agent-eval/bench-readme.sh # 7 repos × with/without × 4 runs (RUNS=4) → /tmp/ab-readme +node scripts/agent-eval/parse-bench-readme.mjs # medians + % saved (summed per-turn tokens) +``` diff --git a/docs/benchmarks/codegraph-ab-matrix.md b/docs/benchmarks/codegraph-ab-matrix.md new file mode 100644 index 00000000..a360a7b1 --- /dev/null +++ b/docs/benchmarks/codegraph-ab-matrix.md @@ -0,0 +1,111 @@ +# CodeGraph A/B benchmark — with vs without, every language × S/M/L + +**Date:** 2026-05-23 · **Branch:** `architectural-improvements` + +A headless agent (Claude Opus, `--permission-mode bypassPermissions`) answers one +**canonical flow question** per repo — twice: **with** the codegraph MCP server, and +**without** any MCP (built-in Read/Grep/Glob/Bash only). Same model, same prompt; codegraph +is the only variable. Each cell was **re-indexed fresh** first, so the "with" arm reflects the +current resolvers. + +## Headline + +**Across 37 cells, codegraph cut total file reads from 158 → 40 — 75% fewer.** It never +*increased* reads in any cell. The mechanism: a few sub-millisecond codegraph calls replace a +read-and-grep exploration. Token cost stays roughly flat (codegraph calls trade for reads) — +the win is **fewer tool calls + lower wall-clock**, which is the design target. 
+ +The gap widens with repo size and flow complexity: on medium/large repos the without-codegraph +arm often **thrashes** — many greps/globs, shell `find`/`grep` (Bash), and occasionally spawning +a **sub-agent** — while the with-codegraph arm answers in 2–8 codegraph calls. On tiny repos (a handful of +files) the two arms tie or codegraph is marginally slower (MCP/index overhead doesn't pay off +when the whole flow fits in one or two files), but reads never rise. + +## How to read the table + +- **R / G / Gl / B / Ag** = Read / Grep / Glob / Bash / sub-agent (Task) tool calls. +- **cg-calls** = codegraph MCP calls in the "with" arm (the trade for reads/greps). +- **dur** = wall-clock seconds. **files** = indexed file count (the size proxy). +- **reads saved** = without-reads − with-reads. +- One run per arm (a **snapshot** — run-to-run variance is real; treat ±1–2 reads and ±10s as + noise, look at the pattern across cells). 2-runs/arm headline numbers for several of these flows + live in `docs/design/dynamic-dispatch-coverage-playbook.md` §7. 
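The `reads saved` column and the headline percentage follow directly from these definitions. A minimal sketch, using three rows copied from the results table and the read totals quoted in the headline (158 without, 40 with); the helper names are illustrative and this is not the matrix parser.

```javascript
// Three cells copied from the results table (Read counts only).
const cells = [
  { repo: 'excalidraw', withReads: 0, withoutReads: 9 },
  { repo: 'aspnet-jellyfin', withReads: 4, withoutReads: 13 },
  { repo: 'flutter_module_books', withReads: 1, withoutReads: 1 },
];

// reads saved = without-reads minus with-reads.
const readsSaved = (cell) => cell.withoutReads - cell.withReads;

// Percentage reduction of the with-arm total relative to the without-arm total.
const pctFewer = (withTotal, withoutTotal) =>
  (100 * (withoutTotal - withTotal)) / withoutTotal;

for (const cell of cells) {
  console.log(`${cell.repo}: ${readsSaved(cell)} reads saved`);
}

// Headline totals across all 37 cells.
console.log(`${pctFewer(40, 158).toFixed(1)}% fewer reads`); // 74.7% fewer reads
```

A cell such as `flutter_module_books` shows 0 reads saved, which is a tie and not a regression.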
+ +## Results + +| Language | Size | Repo | files | **with** R/G | cg-calls | dur | **without** R/G | dur | reads saved | +|---|---|---|--:|---|--:|--:|---|--:|--:| +| C | L | `c-redis` | 884 | 0R / 4G | 4 | 48s | 4R / 9G / 1Gl | 50s | 4 | +| C# | S | `aspnet-realworld` | 78 | 0R / 0G | 2 | 40s | 2R / 1G / 2Gl | 31s | 2 | +| C# | M | `aspnet-eshop` | 262 | 0R / 0G | 5 | 39s | 6R / 2G / 3Gl / 1B | 61s | 6 | +| C# | L | `aspnet-jellyfin` | 2081 | 4R / 0G | 2 | 61s | 13R / 0G / 4Gl / 21B / 1Ag | 132s | 9 | +| C++ | M | `cpp-leveldb` | 134 | 0R / 0G | 3 | 40s | 2R / 3G | 52s | 2 | +| Dart | S | `flutter_module_books` | 6 | 1R / 0G | 2 | 37s | 1R / 0G / 1Gl | 20s | 0 | +| Dart | M | `compass_app` | 212 | 2R / 0G | 2 | 31s | 3R / 1G / 3Gl | 47s | 1 | +| Go | S | `gin-realworld` | 21 | 2R / 1G | 3 | 31s | 4R / 0G / 1B | 44s | 2 | +| Go | M | `gin-vueadmin` | 625 | 0R / 0G | 2 | 31s | 3R / 3G / 2Gl | 47s | 3 | +| Go | L | `gin-gitness` | 4438 | 3R / 3G | 4 | 52s | 7R / 4G / 3Gl | 60s | 4 | +| Java | S | `spring-realworld` | 117 | 0R / 0G | 4 | 31s | 8R / 1G / 1Gl | 50s | 8 | +| Java | M | `spring-mall` | 536 | 1R / 0G | 5 | 51s | 5R / 0G / 4Gl | 64s | 4 | +| Java | L | `spring-halo` | 2444 | 0R / 1G | 8 | 75s | 9R / 5G / 8B | 148s | 9 | +| Kotlin | S | `kotlin-petclinic` | 43 | 1R / 0G | 1 | 23s | 3R / 0G / 2Gl | 26s | 2 | +| Kotlin | M | `Jetcaster` | 166 | 1R / 0G | 3 | 36s | 1R / 0G / 2Gl | 34s | 0 | +| Lua | S | `lualine.nvim` | 123 | 1R / 0G | 4 | 48s | 4R / 0G / 1Gl | 45s | 3 | +| Lua | M | `telescope.nvim` | 84 | 0R / 0G | 2 | 33s | 2R / 0G / 1Gl | 26s | 2 | +| Luau | S | `Knit` | 11 | 0R / 0G | 4 | 36s | 5R / 0G / 2Gl | 57s | 5 | +| PHP | S | `laravel-realworld` | 114 | 3R / 0G / 1Gl | 2 | 41s | 6R / 2G / 3Gl | 38s | 3 | +| PHP | M | `laravel-firefly` | 2047 | 4R / 4G | 5 | 79s | 5R / 3G / 3Gl / 2B | 70s | 1 | +| PHP | L | `laravel-bookstack` | 2160 | 0R / 1G | 5 | 42s | 3R / 2G / 2Gl | 46s | 3 | +| Python | S | `django-realworld` | 44 | 1R / 1G | 2 | 30s | 8R / 0G 
/ 1Gl | 35s | 7 | +| Python | M | `django-wagtail` | 1672 | 3R / 0G | 5 | 73s | 7R / 5G / 2Gl / 1B | 63s | 4 | +| Python | L | `django-saleor` | 4429 | 1R / 2G | 3 | 59s | 6R / 5G / 2Gl / 1B | 72s | 5 | +| Ruby | S | `rails-realworld` | 59 | 0R / 0G | 2 | 34s | 4R / 0G / 3Gl | 40s | 4 | +| Ruby | M | `rails-spree` | 2905 | 1R / 2G | 8 | 60s | 3R / 4G / 3Gl | 56s | 2 | +| Ruby | L | `rails-forem` | 4658 | 3R / 1G | 3 | 54s | 3R / 2G / 1Gl | 49s | 0 | +| Rust | S | `rust-axum-realworld` | 13 | 1R / 0G | 4 | 28s | 3R / 1G / 1Gl | 49s | 2 | +| Rust | M | `rust-actix-examples` | 176 | 1R / 0G | 5 | 42s | 4R / 1G / 2B | 35s | 3 | +| Rust | L | `rust-cratesio` | 1053 | 0R / 0G | 3 | 20s | 1R / 2G | 15s | 1 | +| Scala | S | `computer-database` | 10 | 1R / 0G | 4 | 47s | 2R / 0G / 1B | 28s | 1 | +| Swift | S | `vapor-template` | 14 | 0R / 0G | 1 | 16s | 2R / 0G / 1Gl | 22s | 2 | +| Swift | M | `vapor-steampress` | 100 | 1R / 0G | 8 | 53s | 3R / 3G / 2B | 57s | 2 | +| Swift | L | `vapor-spi` | 542 | 2R / 0G | 5 | 49s | 2R / 3G / 2Gl | 36s | 0 | +| TypeScript/JS | S | `express-realworld` | 39 | 1R / 0G | 1 | 16s | 2R / 1G / 1Gl | 27s | 1 | +| TypeScript/JS | M | `excalidraw` | 643 | 0R / 0G | 4 | 53s | 9R / 7G | 98s | 9 | +| TypeScript/JS | L | `nest-immich` | 2759 | 1R / 1G | 6 | 50s | 3R / 1G / 2Gl | 57s | 2 | + +**Totals (37 cells):** with codegraph **40 reads / 21 greps**, without **158 reads / 71 greps** — +**75% fewer reads, ~70% fewer greps.** Codegraph never increased reads in any cell, and the +without-arm additionally ran shell `find`/`grep` (Bash) and a sub-agent that the with-arm never +needed. (74 agent runs, ~$29 total.) + +## Observations + +- **Biggest wins are medium/large backends with a real route→handler→service flow:** excalidraw + (0R vs 9R/7G), spring-halo (0R vs 9R + 8 Bash), spring-realworld (0R vs 8R), django-realworld + (1R vs 8R), aspnet-jellyfin (4R vs 13R + 21 Bash + a spawned sub-agent), aspnet-eshop (0R vs 6R). 
+- **Without codegraph, large repos make the agent thrash:** it falls back to shell `find`/`grep` + (Bash) and on jellyfin even spawned a sub-agent, exactly the behavior codegraph is meant to + prevent. The with-arm answers those in 2–8 codegraph calls. +- **Tie zone = mostly tiny repos** (Dart books 6 files, Kotlin Jetcaster 166 files; the large Ruby + forem and Swift spi also tie on reads): when the whole flow fits in 1–2 files, reading is already + cheap; codegraph ties on reads and is sometimes a few seconds slower (MCP + index overhead). The + small-repo ties match the design note that codegraph's value scales with repo size. +- **Duration tracks reads on the big repos** (jellyfin 61s vs 132s, spring-halo 75s vs 148s, + excalidraw 53s vs 98s) and is noise on small ones. +- Some "with" cells still read 2–4 files (jellyfin, gitness, laravel-firefly, forem): the residual + is the documented frontier (anonymous handlers, deep service chains, dynamic finders); codegraph + gets the agent to the right file, then it reads one to confirm a detail. + +## Coverage note + +All 14 README frameworks and every flow-relevant language are validated (see the playbook). The +sizes here are by indexed file count; a few languages lack a clean third size in the corpus +(Dart/Kotlin = S/M, Scala/Luau = S only, C = L only, C++ = M only); those cells are omitted rather +than faked. + +## Reproduce + +Driver + parser: `/tmp/ab-matrix/run.sh` (matrix of `lang|size|repo|question`) and +`/tmp/ab-matrix/parse-matrix.mjs`. Each cell: `rm -rf .codegraph && codegraph init -i`, then +`scripts/agent-eval/run-all.sh "" headless` (with = codegraph-only MCP, without = +empty MCP), parsed from the stream-json logs. 
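As a cross-check on the totals reported above, here is a minimal TypeScript sketch of how a tool-count cell such as `13R / 0G / 4Gl / 21B / 1Ag` could be parsed and how the headline percentages follow from the reported sums. It is not the actual `/tmp/ab-matrix/parse-matrix.mjs`; the names `parseCell`, `total`, and `reductionPct` are hypothetical and exist only for illustration.

```typescript
// Hypothetical helpers (not the real parse-matrix.mjs): parse the R/G/Gl/B/Ag
// cells used in the results table and derive the percentage reductions.
type ToolCounts = Record<string, number>;

// Parses e.g. "13R / 0G / 4Gl / 21B / 1Ag" into { R: 13, G: 0, Gl: 4, B: 21, Ag: 1 }.
function parseCell(cell: string): ToolCounts {
  const counts: ToolCounts = {};
  const re = /(\d+)\s*([A-Za-z]+)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(cell)) !== null) {
    const key = m[2];
    counts[key] = (counts[key] || 0) + parseInt(m[1], 10);
  }
  return counts;
}

// Sums one suffix (for example "R" for reads) across many cells.
function total(cells: string[], key: string): number {
  let sum = 0;
  for (let i = 0; i < cells.length; i++) {
    sum += parseCell(cells[i])[key] || 0;
  }
  return sum;
}

// Rounded percentage by which `withCount` undercuts `withoutCount`.
function reductionPct(withCount: number, withoutCount: number): number {
  if (withoutCount === 0) return 0;
  return Math.round((1 - withCount / withoutCount) * 100);
}

// Totals as reported for the 37 cells: 40 vs 158 reads, 21 vs 71 greps.
const readsSaved = reductionPct(40, 158); // 75
const grepsSaved = reductionPct(21, 71); // 70
```

Matching the suffix with `[A-Za-z]+` rather than an alternation keeps `G` and `Gl` from being confused, which is the one parsing detail worth getting right in this format.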
diff --git a/docs/design/callback-edge-synthesis.md b/docs/design/callback-edge-synthesis.md new file mode 100644 index 00000000..7c4bfb06 --- /dev/null +++ b/docs/design/callback-edge-synthesis.md @@ -0,0 +1,179 @@ +# Design + status: general callback / observer edge synthesis + +**Status:** Phases 1–3 implemented & validated as a **prototype, uncommitted on `main`** +(as of 2026-05-22). This doc is the handoff for continuing the work. +**Motivation:** close the dynamic-dispatch hole that static extraction leaves for +observer / event-emitter / signal patterns, where a *dispatcher* invokes callbacks +registered elsewhere through a shared store — so flows like "how does an update +reach the screen" actually exist in the graph. + +--- + +## TL;DR for a new session + +We synthesize `dispatcher → callback` edges that static parsing misses. It works: + +- **Field observer** (excalidraw `Scene.onUpdate`/`triggerUpdate`): synthesizes + `triggerUpdate → triggerRender`. `trace(mutateElement, triggerRender)` now = 3 hops. +- **EventEmitter** (express `on('mount', …)`/`emit('mount')`): synthesizes `use → onmount`. +- Precision is high: excalidraw got **1** synthesized edge out of 27k (the correct one); + node count moved +3 after Phase 3 (no explosion). + +**Files touched (all uncommitted on `main`):** +- `src/resolution/callback-synthesizer.ts` — the whole-graph synthesis pass (Phase 1 + 2). +- `src/resolution/index.ts` — calls `synthesizeCallbackEdges()` at the end of + `resolveAndPersistBatched()` (after base edges are persisted) + the import. +- `src/extraction/tree-sitter.ts` — `visitFunctionBody` now extracts **named** nested + functions (Phase 3), so inline named handlers become linkable nodes. 
+ +**How to reproduce / test:** +```bash +npm run build +rm -rf /tmp/codegraph-corpus/excalidraw/.codegraph +( cd /tmp/codegraph-corpus/excalidraw && codegraph init -i ) +# synthesized edges (provenance='heuristic', metadata.synthesizedBy in {callback,event-emitter}): +sqlite3 /tmp/codegraph-corpus/excalidraw/.codegraph/codegraph.db \ + "select s.name||' → '||t.name||' '||coalesce(e.metadata,'') from edges e \ + join nodes s on e.source=s.id join nodes t on e.target=t.id where e.provenance='heuristic';" +# end-to-end trace (uses the dev probes): +node scripts/agent-eval/probe-trace.mjs /tmp/codegraph-corpus/excalidraw triggerUpdate triggerRender +``` +Probe scripts (dev-only, in `scripts/agent-eval/`): `probe-node.mjs` (symbol + trail), +`probe-trace.mjs` (call path), `probe-context.mjs`, `probe-explore.mjs`. EventEmitter +fixture lives at `/tmp/cb-fixture/bus.js` (ephemeral — recreate or move into `__tests__/`). + +--- + +## The hole + +```ts +class Scene { + private callbacks = new Set(); + onUpdate(cb: Callback) { this.callbacks.add(cb); } // REGISTRAR + triggerUpdate() { for (const cb of this.callbacks) cb(); } // DISPATCHER +} +this.scene.onUpdate(this.triggerRender); // REGISTRATION SITE +``` + +The runtime edge `triggerUpdate → triggerRender` does not exist statically: +`triggerUpdate`'s only literal call is `cb()` (anonymous). Measured: `triggerUpdate`'s +only callee was `randomInteger`; `trace(triggerUpdate, triggerRender)` returned no path. + +## Why it's a whole-graph pass, not a `FrameworkResolver.resolve()` + +`resolve(ref)` answers "what does this **named** ref point to," one ref at a time. The +callback edge has **no ref to resolve** (`cb()` is anonymous) and needs **cross-file, +multi-site correlation** (registrar, registration, dispatcher). So it's a whole-graph +pass after base resolution, language-level (any OO observer), living in +`src/resolution/callback-synthesizer.ts` — **not** under `frameworks/`. 
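To make the three-site correlation concrete, the following is a minimal TypeScript sketch that works on in-memory strings rather than the real graph. It is an illustration under stated assumptions, not the code in `callback-synthesizer.ts`: the `Fn`, `CallSite`, and `SynthEdge` shapes, the simplified body regexes, and the function names are all hypothetical, and the name-based candidate filter and per-channel cap are omitted.

```typescript
// Hypothetical, simplified model: a function with its source body, and a call
// site that is already known to call `callee`.
interface Fn { name: string; file: string; body: string; }
interface CallSite { callee: string; text: string; }
interface SynthEdge { source: string; target: string; field: string; }

// Registrar body: stores a callback into a field, e.g. this.callbacks.add(cb).
const REGISTRAR_BODY = /this\.(\w+)\.(?:add|push|set)\(/;
// Dispatcher body: iterates a field, via for-of or forEach.
const FOR_OF_BODY = /for\s*\([^)]*\bof\s+(?:Array\.from\(\s*)?this\.(\w+)/;
const FOR_EACH_BODY = /this\.(\w+)\.forEach\(/;

// Returns the field a function dispatches over, or null if it is no dispatcher.
function dispatchedField(body: string): string | null {
  const m = FOR_OF_BODY.exec(body) || FOR_EACH_BODY.exec(body);
  return m ? m[1] : null;
}

// Correlates registrar, dispatcher, and registration sites into
// dispatcher -> callback edges. Pairing is by same file + same field.
function fieldObserverEdges(fns: Fn[], calls: CallSite[]): SynthEdge[] {
  const edges: SynthEdge[] = [];
  for (const reg of fns) {
    const stored = REGISTRAR_BODY.exec(reg.body);
    if (!stored) continue;
    // Assumes reg.name is a plain identifier, so it needs no regex escaping.
    const argRe = new RegExp(reg.name + "\\s*\\(\\s*(?:this\\.)?(\\w+)");
    for (const disp of fns) {
      if (disp.file !== reg.file || dispatchedField(disp.body) !== stored[1]) continue;
      for (const call of calls) {
        if (call.callee !== reg.name) continue;
        const arg = argRe.exec(call.text); // named arguments only
        if (arg) edges.push({ source: disp.name, target: arg[1], field: stored[1] });
      }
    }
  }
  return edges;
}
```

Fed the `Scene` example above (`onUpdate`, `triggerUpdate`, and the registration `this.scene.onUpdate(this.triggerRender)`), the sketch yields a single edge `triggerUpdate → triggerRender`. No single site carries enough information on its own, which is why a per-reference `resolve()` cannot produce it.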
+ +> Sibling mechanism for the *other* dynamic-dispatch class — **named** attribute/ +> descriptor dispatch (e.g. django `self._iterable_class(...)`) — is the +> `claimsReference` hook (`resolution/types.ts` + `resolution/index.ts` pre-filter) +> + a `FrameworkResolver.resolve()` (django ORM resolver in `frameworks/python.ts`). +> That one *does* fit `resolve()` because the ref is named. Both are part of the same +> coverage effort; see the "Related work" section. + +--- + +## As-built algorithm (and where it diverged from the original design) + +### Field-observer channels (`fieldChannelEdges`, Phase 1) +1. **Candidates** by method/function **name** — registrar `^(on[A-Z]\w*|subscribe| + addListener|addEventListener|register|watch|listen|addCallback)$`; dispatcher + contains `(emit|trigger|notify|dispatch|fire|publish|flush)`. +2. **Confirm by body** (read via `ctx.readFile` + slice node lines): registrar has + `this..add|push|set(`; dispatcher has `for (… of [Array.from(]this.)` + a call, + or `this..forEach(`. +3. **Pairing — DIVERGENCE:** the design said pair by *class*; the build pairs by + **same file + same field `F`** (file as a class proxy — getting the containing class + reliably was harder). Works for the common 1-class-per-file case; revisit for + multi-class files. +4. **Registrations:** `queries.getIncomingEdges(registrar.id, ['calls'])` → for each, + read the caller's source at the edge line and **regex-recover the arg** + (`\s*\(\s*(?:this\.)?(\w+)`). DIVERGENCE: design preferred tree-sitter + re-parse; build uses regex (named refs only — arrows/inline args are missed here). +5. **Synthesize** `dispatcher → fn` (`getNodesByName(arg)` → method|function). Capped at + `MAX_CALLBACKS_PER_CHANNEL = 40`. + +### EventEmitter channels (`eventEmitterEdges`, Phase 2) +- **File-oriented scan** (`ctx.getAllFiles()` + `readFile`, substring pre-filter on + `.emit(`/`.on(`/etc). 
`ON_RE` = `\.(?:on|once|addListener)\(\s*['"]([^'"]+)['"]\s*,\s* + (?:function\s+(\w+)|(?:this\.)?(\w+))`; `EMIT_RE` = `\.(?:emit|fire|dispatchEvent)\(\s*['"]([^'"]+)['"]`. +- Dispatcher = **enclosing function** of the `emit('e')` call (`enclosingFn` finds the + tightest function/method/component node containing the line). Handler = `getNodesByName` + of the on-handler name. +- Correlate by **event-name literal**; synthesize dispatcher → handler. +- **Precision — DIVERGENCE:** design proposed receiver-type matching; build uses an + **event fan-out cap** (`EVENT_FANOUT_CAP = 6`) — skip events with >6 handlers or + dispatchers (generic names like `error`/`change` would over-link without type info). + +### Provenance — DIVERGENCE +`Edge.provenance` is a fixed enum (`'tree-sitter'|'scip'|'heuristic'`), so synthesized +edges use **`provenance: 'heuristic'`** + `metadata: { synthesizedBy: 'callback'| +'event-emitter', via/event/field }`. The design's `'callback-synthesis'` provenance and +high/medium/low **confidence tiers were NOT implemented** — the fan-out cap + +registrar-name uniqueness + named-only handlers are the precision guards instead. + +### Phase 3 — inline callback extraction (`tree-sitter.ts`) +The real blocker for EventEmitter on real repos: inline handlers +(`on('mount', function onmount(){})`) weren't **nodes**, so nothing could link to them. +Root cause: `visitFunctionBody` walked *through* nested functions without extracting them. +Fix: in `visitForCallsAndStructure`, when a body node is a `functionType` and +`extractName` returns a real name, call `extractFunction` (which extracts it and walks +its own body) and return. **Named only** — anonymous arrows fall through to the existing +recursion (so their inner calls stay attributed to the enclosing fn). This bounded it: +excalidraw +3 nodes, no explosion, no regression. 
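The EventEmitter correlation described above can be sketched in a few lines of TypeScript. The two regexes and the cap value are taken from this document; everything else is an assumption made for illustration: the `Line` and `EventEdge` shapes, the `eventEmitterEdges` signature, and the idea that each source line arrives already tagged with its enclosing function (standing in for `enclosingFn`). The real pass works against graph nodes, not raw strings.

```typescript
// Hypothetical input: one source line plus the name of the function enclosing it.
interface Line { enclosingFn: string; text: string; }
interface EventEdge { source: string; target: string; event: string; }

// Regexes and cap as quoted in this document.
const ON_RE = /\.(?:on|once|addListener)\(\s*['"]([^'"]+)['"]\s*,\s*(?:function\s+(\w+)|(?:this\.)?(\w+))/;
const EMIT_RE = /\.(?:emit|fire|dispatchEvent)\(\s*['"]([^'"]+)['"]/;
const EVENT_FANOUT_CAP = 6;

function addUnique(map: Record<string, string[]>, key: string, value: string): void {
  const list = map[key] || (map[key] = []);
  if (list.indexOf(value) === -1) list.push(value);
}

// Correlates on('e', handler) with emit('e') by event-name literal and emits
// dispatcher -> handler edges, skipping events above the fan-out cap.
function eventEmitterEdges(lines: Line[]): EventEdge[] {
  const handlers: Record<string, string[]> = Object.create(null);
  const dispatchers: Record<string, string[]> = Object.create(null);
  for (const line of lines) {
    const on = ON_RE.exec(line.text);
    if (on) {
      const name = on[2] || on[3];
      // An anonymous `function (...)` makes the second alternative capture the
      // keyword itself; this sketch drops that case explicitly.
      if (name && name !== "function") addUnique(handlers, on[1], name);
    }
    const emit = EMIT_RE.exec(line.text);
    if (emit) addUnique(dispatchers, emit[1], line.enclosingFn);
  }
  const edges: EventEdge[] = [];
  for (const event of Object.keys(dispatchers)) {
    const hs = handlers[event] || [];
    const ds = dispatchers[event];
    if (hs.length > EVENT_FANOUT_CAP || ds.length > EVENT_FANOUT_CAP) continue;
    for (const d of ds) {
      for (const h of hs) edges.push({ source: d, target: h, event: event });
    }
  }
  return edges;
}
```

With one `this.on('mount', function onmount(parent) {` line and one `.emit('mount', …)` line inside `use`, the sketch produces `use → onmount`, matching the express result reported here. An arrow handler such as `on('e', () => foo())` matches neither alternative of `ON_RE`, so it produces no edge.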
+ +--- + +## Validation results (actual) + +| Repo | Result | +|---|---| +| excalidraw | 1 synthesized edge `triggerUpdate → triggerRender` (of 27,214); `trace(mutateElement, triggerRender)` = 3 hops; nodes 9,286 → 9,289 | +| express | after Phase 3: `use → onmount` `{event-emitter, event:"mount"}` (`onmount` now extracted at `application.js:109`) | +| `/tmp/cb-fixture/bus.js` | `tick → handleRefresh`, `persist → handleSave` (named-method EventEmitter handlers) | +| excalidraw / express | no Phase-1 regression; node counts stable | + +--- + +## Remaining work (prioritized for the next session) + +1. **Anonymous-arrow handlers** — `on('e', () => foo())` still produce no edge (no node, + intentionally not extracted in Phase 3). The fix is **synthesizer link-through-body**: + parse the arrow's body and link `dispatcher → (calls inside the arrow)`. Highest + remaining recall win; handles the most common modern callback shape. +2. **Wire into `resolveAndPersist`** (incremental sync) — synthesis currently runs only + in `resolveAndPersistBatched` (full index). Incremental re-index won't refresh + synthesized edges. +3. **Receiver-type matching** for EventEmitter precision (replace/augment the fan-out + cap) — use `type_of` edges so `x.emit('change')` only links to `y.on('change', fn)` + when `x`,`y` are the same type. Lets the fan-out cap relax. +4. **Tree-sitter arg recovery** (replace the regex in field-channel Stage 4) — robust for + arrows, multi-arg, line-wrapped calls. +5. **Single-callback fields** (`this.onChange = cb; … this.onChange()`) — scalar-store + variant of the field observer; not built. +6. **Broad precision/recall audit** — run across the full corpus; tally synthesized edges + per repo, spot-check, confirm no explosion on EventEmitter-heavy repos. +7. 
**Tests + CHANGELOG** — the fixture is a ready vitest case for the synthesizer; add + extractor tests for Phase 3 (named-nested-fn extraction; confirm other languages + unaffected — the change is in the shared walker), resolver tests for the django side. + +## Edge cases / model +- **Over-approximation across instances** is accepted (reachability, not instance + precision). `unregister`/`off` ignored. +- Synthesized edges are **additive** — never replace static edges; tooling can filter on + `provenance='heuristic'` + `metadata.synthesizedBy`. + +## Related work (same coverage effort) +This is one half of closing dynamic-dispatch coverage. The other artifacts on `main`: +- **Named attribute/descriptor resolver**: `claimsReference` (`resolution/types.ts`, + pre-filter in `resolution/index.ts`) + django ORM resolver (`frameworks/python.ts`, + `_iterable_class` → `ModelIterable.__iter__`). +- **Retrieval/UX changes** (separate from coverage): `explore` whole-small-file + glue + fixes, `node`-with-trail, `codegraph_trace`, `context` call-paths — all in + `src/mcp/tools.ts` / `src/context/index.ts`. +- **Full investigation context + findings:** auto-memory + `project_codegraph_read_displacement` (why coverage — not prompting/hooks/new-tools — + is the lever for getting agents to use codegraph over Read). diff --git a/docs/design/dynamic-dispatch-coverage-playbook.md b/docs/design/dynamic-dispatch-coverage-playbook.md new file mode 100644 index 00000000..c78d474d --- /dev/null +++ b/docs/design/dynamic-dispatch-coverage-playbook.md @@ -0,0 +1,548 @@ +# Dynamic-Dispatch Coverage Playbook + +**Audience:** a Claude agent continuing this work. +**Mission:** systematically close static-extraction coverage holes for **dynamic +dispatch** across **every language and framework codegraph supports**, and validate +each one the same way, so cross-symbol *flows* exist in the graph everywhere. + +> This is the top-level playbook. 
The deep design for one mechanism (the callback +> synthesizer) is in [`callback-edge-synthesis.md`](./callback-edge-synthesis.md). +> Full investigation context + findings: auto-memory `project_codegraph_read_displacement`. + +--- + +## 1. The goal (why this matters) + +codegraph's value is being **the map** — answering structural/flow questions +(`trace`, `impact`, callers, "how does X reach Y") that grep/Read cannot. Agents +will use codegraph instead of Read **only when it is sufficient**. We proved +empirically (see memory) that the lever for sufficiency is **coverage**, not +prompting/hooks/new-tools: when a flow is missing from the graph, the agent reads +the files to reconstruct it; when the flow *is* in the graph, the agent can answer +completely without reading. + +**Validated end-to-end on excalidraw:** after closing the update-flow hole, 2/3 +headless agent runs answered the "how does an update reach the screen" question with +**Read 0 and a complete answer** — impossible before, because the key edge wasn't in +the graph. (Caveat: coverage *enables* the no-read path; agent confirm-by-reading +variance means it doesn't *force* it. Completeness improves unconditionally.) + +The mission is to make that true for **all** languages/frameworks. + +--- + +## 2. The problem class: dynamic dispatch + +Static tree-sitter extraction captures explicit calls (`foo()`, `this.bar()`). It +**misses** any call whose target is computed/indirect. 
Four recurring shapes, with a +**difficulty gradient** (do the cheap ones first): + +| # | Shape | Example | Fix mechanism | Cost | +|---|---|---|---|---| +| 1 | **Named attribute / descriptor** | django `self._iterable_class(self)` | framework resolver (`claimsReference` + `resolve()`) | **cheap** | +| 2 | **Field-backed observer** | `onUpdate(cb)` + `for(cb of cbs)cb()` | callback synthesizer (whole-graph pass) | medium | +| 3 | **String-keyed EventEmitter** | `on('e',fn)` / `emit('e')` | callback synthesizer (event-keyed) | medium | +| 4 | **Inline callback handler** | `on('e', function h(){})` / `() => {}` | extraction (named) + synthesizer link-through-body (anon) | named: cheap · anon: hard | + +Key distinction driving the mechanism choice: +- **A named ref exists** to resolve (`_iterable_class` is an attribute name) → **resolver**. +- **No ref exists** (`cb()` is anonymous; needs registrar↔dispatcher correlation) → **synthesizer**. + +--- + +## 3. Worked examples (the two mechanisms, end to end) + +### 3a. Django ORM descriptor — the **resolver** pattern (Python) +- **Hole:** `QuerySet._fetch_all` calls `self._iterable_class(self)` (a runtime-chosen + iterable, default `ModelIterable`), whose `__iter__` runs the SQL compiler. Static + parsing can't resolve the attribute-as-callable → `_fetch_all`'s only callee was + `_prefetch_related_objects`; `trace(_fetch_all, execute_sql)` returned no path. +- **Fix:** `djangoResolver` claims the unresolved `_iterable_class` ref through the + name-exists pre-filter, then resolves it to `ModelIterable.__iter__`. +- **Files:** `src/resolution/types.ts` (`claimsReference?` on `FrameworkResolver`), + `src/resolution/index.ts` (pre-filter in `resolveOne` consults `claimsReference`), + `src/resolution/frameworks/python.ts` (`djangoResolver.resolve` + `claimsReference` + + `resolveModelIterableIter`). +- **Result:** `trace(_fetch_all, execute_sql)` → `_fetch_all → __iter__ → execute_sql` (3 hops). + +### 3b. 
Excalidraw observer + EventEmitter — the **synthesizer** (TS) +- **Hole:** `Scene.triggerUpdate` does `for (cb of this.callbacks) cb()`; `triggerRender` + is registered via `scene.onUpdate(this.triggerRender)`. The `triggerUpdate → + triggerRender` edge is dynamic → `trace` returned no path; the whole update flow broke. +- **Fix:** a whole-graph pass that detects registrar/dispatcher channels, correlates + registration sites, and synthesizes `dispatcher → callback` edges. Plus extraction of + **named** inline callbacks so handlers like express's `function onmount(){}` are nodes. +- **Files:** `src/resolution/callback-synthesizer.ts` (the pass — field observers + + EventEmitter), `src/resolution/index.ts` (calls `synthesizeCallbackEdges()` at the end + of `resolveAndPersistBatched`), `src/extraction/tree-sitter.ts` (`visitFunctionBody` + extracts named nested functions). +- **Result:** `trace(mutateElement, triggerRender)` → 3 hops; express `use → onmount`. + +--- + +## 4. The repeatable methodology (run this per language/framework) + +### Step 1 — Pick the framework's canonical *flow* question +Every framework has a signature data/control flow. Pick the "how does X reach/become Y" +question and a real repo (add to `.claude/skills/agent-eval/corpus.json`). Examples: +- React state→DOM, Vue reactive→render, Svelte store→update +- Rails request→controller→view, Spring request→`@Controller`→service +- Express/Koa request→middleware→handler, FastAPI request→route→dependency +- Redux action→reducer→store, RxJS subscribe→operator→observer +- Any ORM: query builder → SQL execution (django pattern) + +### Step 2 — Measure the hole (deterministic, no agent) +```bash +rm -rf /.codegraph && ( cd && codegraph init -i ) +node scripts/agent-eval/probe-trace.mjs # does the flow break? where? +node scripts/agent-eval/probe-node.mjs # trail: is the next hop missing? 
+``` +A "No direct call path … breaks at dynamic dispatch" + a sparse trail at the break +point **locates the hole** (this is exactly how `_iterable_class` and `triggerUpdate` +were found). Confirm it's dynamic by reading the break symbol's body. + +### Step 3 — Classify → choose the mechanism (use the §2 table) +- `self.(...)` / descriptor / metaclass → **resolver** (§3a). +- `for(cb of store)cb()` / `store.forEach(cb=>cb())` → **field-observer synthesizer** (§3b). +- `on('e',fn)` + `emit('e')` → **EventEmitter synthesizer** (§3b). +- Inline handler not a node → **named:** extraction (already done generically in + `tree-sitter.ts`); **anonymous:** synthesizer link-through-body (not yet built). + +### Step 4 — Implement +- **Resolver:** add to `src/resolution/frameworks/.ts` — a `resolve()` branch + + `claimsReference(name)` if the ref name isn't a declared symbol. Copy `djangoResolver`. +- **Synthesizer channel:** extend `src/resolution/callback-synthesizer.ts` — add the + framework's registrar/dispatcher **name patterns** and **body patterns** (e.g. signals + use `.connect()`/`.emit()`; Rx uses `.subscribe()`/`.next()`). +- Reindex (Step 2 command) and re-run `probe-trace` — the flow should now connect. + +### Step 5 — Validate (the same way every time) +1. **Deterministic:** `probe-trace(from,to)` finds the path; `probe-node` shows the + bridged hop. The previously-broken hop is closed. +2. **Precision:** count + spot-check synthesized/resolved edges — no explosion, correct targets: + ```bash + sqlite3 /.codegraph/codegraph.db \ + "select s.name||' → '||t.name||' '||coalesce(e.metadata,'') from edges e \ + join nodes s on e.source=s.id join nodes t on e.target=t.id where e.provenance='heuristic';" + ``` + (Resolver edges aren't `heuristic`; verify via the trace + callees instead.) +3. **Regression:** node count stable (`select count(*) from nodes;` before/after — a big + jump means an extraction change over-fired); existing traces on a control repo intact. +4. 
**End-to-end agent eval:** run the flow question with codegraph and measure + **reads / answer-completeness / cost** vs a pre-fix baseline: + ```bash + # headless (exact cost + clean tool sequence) + bash scripts/agent-eval/run-agent.sh with "" + # or the full A/B + interactive Explore-subagent path: + scripts/agent-eval/audit.sh local "" all + ``` + Then parse: `Read` count, codegraph-tool count, cost, and whether the answer now + contains the glue symbols (the ones that previously required a read). + +### Success criteria (per language/framework) +- `trace` finds the canonical flow end-to-end (no dynamic-dispatch break). +- The agent can answer the flow question with **Read 0** in at least some runs, and the + glue symbols appear in the answer. +- **No node explosion** and no regression on a control repo. +- Synthesized edges are precise on a spot-check (no generic-name over-linking). + +--- + +## 5. Validation toolkit (reference) + +| Tool | Purpose | +|---|---| +| `scripts/agent-eval/probe-trace.mjs ` | call-path between two symbols (the hole detector) | +| `scripts/agent-eval/probe-node.mjs [code]` | symbol + trail (callers/callees); `code` adds the body | +| `scripts/agent-eval/probe-context.mjs ""` | context output incl. call-paths | +| `scripts/agent-eval/probe-explore.mjs ""` | explore output | +| `scripts/agent-eval/{audit,run-agent,itrun}.sh` | agent A/B (headless + interactive); also the `/agent-eval` skill | +| `sqlite3 /.codegraph/codegraph.db` | direct edge/node inspection (provenance, metadata, counts) | + +Probe scripts use the built `dist/`, so run `npm run build` first. Reindex after any +extraction or resolution change (`rm -rf /.codegraph && codegraph init -i`), because the +synthesizer/resolvers run at index time. Test fixtures: keep a tiny per-pattern fixture +(see `/tmp/cb-fixture/bus.js`; **move into `__tests__/`** when shipping). + +--- + +## 6. 
Coverage matrix (fill in as you go) + +Status legend: ✅ done+validated · 🔬 hole identified · ⬜ not started. +`Mechanism`: R = resolver, S = synthesizer channel, X = extraction. + +| Language | Framework(s) | Canonical flow to test | Mechanism | Status | +|---|---|---|---|---| +| TypeScript/JS | React / observer / EventEmitter / React Router | state→render; dispatch→callback; route→component | S + X | ✅ rendering+dispatch (excalidraw); **React Router JSX routing** `` (v5) + `element={}` (v6) → component (react-realworld **0→10, 10/10**). + **object data-router** `createBrowserRouter([{path, element/Component}])` (literal form); Next.js config/`nextjs-pages` false-positives FIXED. 🔬 lazy data-router (`path: paths.x.path, lazy: () => import()` — variable paths + lazy modules) | +| TypeScript/JS | Vue / Nuxt | template events (@click→handler); component composition; reactive→render | S + X | ✅ events + composition (vitepress S / vben M / element-plus L); 🔬 reactive→render (vue-core Proxy runtime — frontier, deferred) | +| TypeScript/JS | Svelte / SvelteKit | template calls/composition; SvelteKit action→api; store→DOM | X | ✅ already strong (realworld S / skeleton M / shadcn L): template `{fn()}` calls, `` composition, `import * as api` namespace, `load`→api all work out of the box. + exported-const object-of-functions extraction (SvelteKit `actions`). 🔬 `$lib`-namespace-from-action + store/reactive frontier | +| TypeScript/JS | Express / Koa | request → route → handler → service | R + X | ✅ named handlers + middleware + controller/service (resolver) + **inline arrow handlers → service body calls** (realworld S 19 / parse M / ghost L 65 edges). 
🔬 custom routers (payload had 0 routes — not `app.get`-style) | +| TypeScript/JS | NestJS | request → @Controller → DI service → repo | R | ✅ already well-covered (realworld S / immich M-L / amplication L): @decorator routes (HTTP/GraphQL/microservice/WS) via resolver + DI `this.svc.method()` controller→service resolves correctly at scale (name + co-location). No dynamic-dispatch hole. 🔬 committed `dist/` build output gets indexed (realworld) — general build-dir-ignore follow-up | +| TypeScript/JS | RxJS / signals | subscribe → operator → observer | S | ⬜ | +| Python | Django ORM | QuerySet → SQL compiler | R | ✅ | +| Python | Django / DRF (views) | url → view → model | R + X | ✅ url→view (`path`/`url`/`as_view`) + **DRF `router.register`→ViewSet** (realworld S / wagtail M / saleor L); ORM QuerySet→SQL (prior work). 🔬 signals (`post_save`→receiver), DRF viewset CRUD actions (inherited), saleor GraphQL resolvers | +| Python | Flask / FastAPI | request → route → handler → dependency | R + X | ✅ **Flask: handler resolved across intervening decorators (`@login_required`) + stacked `@x.route` lines** (microblog S 6→27, redash L decorator routes 6/6); **FastAPI: empty-path router-root routes `@router.get("")` incl. multi-line** (realworld S 12→20 / Netflix dispatch L **290/290 100%**) + **bare-name builtin guard** — a handler named after a Python builtin method (`index`/`get`/`update`/`count`…) was filtered as a builtin and lost its route→handler edge. + **Flask-RESTful `add_resource(Resource,'/x')` → Resource class** (redash 6→**77**) + **tuple `methods=('GET',)`** (was mislabeled GET) + **broadened detection** (requirements/Pipfile/setup + subdir app-factory entrypoints — flask-realworld 0→**19**). 
🔬 FastAPI `Depends()` dependency edges (light validation) | +| Go | Gin / chi / gorilla/mux / net-http | request → route → handler → service | X | ✅ **routes on ANY group var** (`v1.GET`, `PublicGroup.GET`) not just `r/router` (gin-vue-admin S→M 4→259 / realworld S / gitness L) — was missing all group-routed apps; named handlers resolve precisely. **gorilla/mux confirmed covered** by the any-receiver `HandleFunc`/`Handle` handling (subrouter-var `s.HandleFunc(...)` + namespaced handlers; `.Methods()` chain ignored). 🔬 inline `func(c){}` handlers (anonymous, body lost); subrouter/`PathPrefix` path-prefix not prepended (label only); gitness chi custom (26/321) | +| Rust | Axum / actix / Rocket | request → route → handler | R + X | ✅ **Axum chained methods + namespaced handlers** — `.route("/x", get(h1).post(h2))` emitted only the first method+handler, and `get(mod::handler)` captured the module not the fn (realworld-axum S **12→19, 19/19**); balanced-paren scan + per-method nodes + last-`::`-segment handler. **Rocket attribute macros 550/556 (99%)** (Rocket repo L) — already strong. crates.io named axum routes resolve (6/8; rest are closures/var handlers; its API is mostly the utoipa `routes!` macro = frontier). Cargo-workspace module resolution (prior work). **actix builder API** `web::resource("/x").route(web::get().to(h))` / `.to(h)` / App `.route("/x", web::get().to(h))` (actix-examples **51→128 routes, 35→112 resolved**) — was the dominant actix style and fully missed (the handler is in `.to(h)`, not `get(h)`). 
🔬 actix `web::scope("/api")` prefix (not prepended to nested resource paths) + anonymous `.to` closure handlers |
+| Java | Spring | request → @RestController → @Autowired service → repo | R + X | ✅ **bare `@GetMapping`/`@PostMapping` + class `@RequestMapping` prefix join → route→method** (realworld S / mall M / halo L) — was missing all path-less method mappings; DI controller→service resolves (name + dir) + **interface→impl dispatch synthesizer** (`interfaceOverrideEdges`: a class's `implements`/`extends` → link each interface/base method → its same-name override; JVM-gated, capped, **overload-aware**; mall **310** / halo **734** synth edges, node count unchanged) so trace follows controller→service-**interface**→**impl** instead of dead-ending at the abstract method — `trace("PmsProductController.getList","PmsProductServiceImpl.list")` connects in **3 hops** (probe-validated). ⚠️ **agent A/B null** (n=2: the agent went context→explore→Read and never invoked `trace`, so the synth edges weren't exercised — adoption-gated, the recurring wall; see `docs/benchmarks/call-sequence-analysis.md`). The fix is correct + improves trace/callees/impact/context connectivity regardless; agent-visible read reduction needs trace adoption. 🔬 Spring Data JPA derived queries (`findByEmail`) — metaprogramming frontier |
+| Kotlin | Spring Boot / Jetpack Compose | request → @RestController → service; @Composable → child | R + X | ✅ **Spring Boot Kotlin** — the Spring resolver was `['java']`-only with a Java-syntax method regex (`public X name()`); extended to `.kt` + Kotlin `fun name(` handler matching (petclinic-kotlin **0→18, 18/18**; class-prefix joins; DI controller→repo resolves — `showOwner ← GET /owners/{ownerId}` → `OwnerRepository.findById`). **Compose composition already static** (@Composable→child are plain function calls — Jetcaster `PodcastInformation→HtmlTextContainer`). Java Spring unchanged (realworld 19/19). 🔬 Ktor `routing { get("/x"){…} }` lambda handlers (anonymous) + Compose recomposition (implicit `mutableStateOf`, no setState gate) + coroutines/Flow |
+| Swift | Vapor | request → route → controller | R + X | ✅ **was 0 routes on every real app** — the extractor required an `app/router/routes` receiver + a `"path"` literal, but real Vapor routes on grouped builders (`let todos = routes.grouped("todos"); todos.get(use: index)`) with NO path arg. Rewrote: any receiver, optional/non-string path segments, `.grouped`/`.group{}` prefix tracking, `use:` discriminator. vapor-template S **0→3 (3/3**, nested `/todos/:todoID`), SteamPress M **0→27 (27/27)**, SwiftPackageIndex-Server L **0→14 (14/14** handler resolution). 🔬 typed-route enums (SPI `SiteURL.x.pathComponents` — path label only, handler still resolves) + closure handlers `app.get("x"){ }` (anonymous) |
+| C# | ASP.NET Core | request → [Http*] action → DI service → EF | X | ✅ **feature-folder detection** (realworld 0→19 — was undetected) + **bare `[HttpGet]` + class `[Route]` prefix** (eShopOnWeb 9→33 / jellyfin L) — co-located so no claimsReference needed. 🔬 EF Core LINQ/DbSet (metaprogramming frontier) |
+| Ruby | Rails / Sinatra | request → routes.rb → Controller#action → model | R | ✅ **RESTful `resources`/`resource` routing → controller#action** (realworld S 16 / spree M / forem L), pluralization + only/except + claimsReference; explicit routes fixed to precise `controller#action` too. 🔬 ActiveRecord dynamic finders (`Article.find_by_slug`) — metaprogramming frontier |
+| PHP | Laravel | request → route → controller → Eloquent | R | ✅ **precise `Route::get([Ctrl::class,'m'])` / `'Ctrl@m'` → Ctrl@method** (realworld S / firefly M / bookstack L) — was resolving the bare method name to the WRONG controller (every `index`→ArticleController); Route::resource→controller. 🔬 Eloquent dynamic finders/relationships (metaprogramming frontier) |
+| PHP | Drupal | request → *.routing.yml → _controller/_form | R | ✅ **`claimsReference` for FQCN handlers** (`\Drupal\…\Class::method` passed the pre-filter only because the `::method` name was known; bare `_form` FQCNs `\…\FormClass` and single-colon `Class:method` controller-services were dropped before resolve()) + **single-colon controller match** + **detect via composer `type:drupal-*` / `name:drupal/*` + `*.info.yml` fallback** (a contrib module with empty `require` was undetected → 0 routes). admin_toolbar S **0→14 (14/14)** / webform M 208 (**144**) / core L 836 (536→**731, 87%**). Remainder is the **entity-annotation handler frontier** (`_entity_form: type.op` resolves via the entity's PHP `#[ContentEntityType]` handlers, not a direct class). 🔬 **OOP `#[Hook]` attributes** — Drupal 11 moved ~all procedural hooks to attribute methods (core: 418 `#[Hook]` files vs 3 procedural), so the resolver's docblock/`module_hook` detection is obsolete for modern core (0 hook edges) |
+| C/C++ | C++ vtables / inheritance | virtual call → override; general direct dispatch | S + X | ✅ **general dispatch strong** (redis C **29k** cross-file calls / leveldb C++ **1.4k**) + **C++ inheritance extraction fix** (`base_class_clause` was unhandled, so C++ extends edges were missing — leveldb **219→298**) + **cpp-override synthesizer** (base virtual method → subclass override, gated to C++, capped — leveldb 12 precise: `Iterator::Next→MergingIterator`). 🔬 C callback structs (`s->fn()` → 422-way fan-out, too noisy to synthesize) + C++ pure-virtual base methods (`virtual void f()=0;` declarations aren't extracted as nodes, so those overrides can't bridge) |
+| Dart | Flutter | setState → build; build → child widgets | S + X | ✅ **setState→build synthesizer** (Dart analog of react-render: a State method whose body calls `setState(` → `build`) gated to `.dart` + **foundational Dart method-range fix** — Dart models a method body as a *sibling* of the signature, so method nodes were signature-only (`end==start`); now `endLine` spans the body (required for ALL body analysis: callees, context slices, the synthesizer's body scan). counter `initState→build`, books `build→BookDetail/BookForm`; widget composition already static (compass_app `build→ErrorIndicator/HomeButton`). Controls unchanged (excalidraw 9,290 / django 302 — the range fix only extends sibling-body grammars). 🔬 MVVM Command/ChangeNotifier dispatch (compass_app — no setState) + `Navigator.push(MaterialPageRoute(builder:))` nav routes |
+| Lua / Luau | Neovim / Roblox | module dispatch (require→mod, mod.fn); event/callback | — | ✅ **already covered for the dominant flow (measure-first, no code change)** — Neovim is module-heavy (`require('x')` + `x.fn()`), and the general import + name resolution already handles it: telescope.nvim **220 imports + 335 cross-file `mod.fn` calls**, traces end-to-end (`map_entries ← init.lua → get_current_picker (state.lua)`). Luau instance-path `require(game:GetService(...))` handled by the extractor. 🔬 event-callback registration (`vim.keymap.set(…, fn)`, autocmd `callback=`, Roblox `signal:Connect(fn)`) is predominantly INLINE anonymous closures (corpus ~12 inline vs ~2 named) — the anonymous-handler frontier; named handlers too rare to justify a synthesizer |
+| Scala | Play / Akka | request → conf/routes → controller action | R + X | ✅ **Play `conf/routes` → controller** — the extensionless `conf/routes` wasn't indexed; added narrow file-walk opt-in (`isPlayRoutesFile`) + a Play resolver parsing `METHOD /path Controller.action(args)` → the action method (computer-database **0→8, 7/8**; starter 0→4, 3/4 — the unresolved are Play's framework `Assets` controller, external). Scala general controller→DAO dispatch already resolves. No-regression: the file-walk change only ADDS Play routes files (excalidraw 9,290 / suite 800 unchanged). 🔬 SIRD programmatic router (`-> /v1 Router` include + `case GET(p"/x")` in code) + Akka actor `receive`/`Behaviors.receiveMessage` message→handler |
+
+(Verify the exact supported set against `src/extraction/languages/` and
+`src/resolution/frameworks/` before starting — this table is a starting point.)
+
+---
+
+## 7. Known limits & gotchas (from the excalidraw/django work)
+
+- **Coverage enables, doesn't force, the no-read path.** Agents still read to *confirm
+  source* sometimes; cost stays ~flat (codegraph calls trade for reads). The reliable
+  win is **completeness** + making Read-0 *possible*. Don't expect a guaranteed cost drop.
+- **Vue (validated 2026-05-23, vitepress S / vben M / element-plus L).** SFC `