
Comparing changes

base repository: codewpf/codegraph
base: main
...
head repository: colbymchenry/codegraph
compare: main
  • 10 commits
  • 50 files changed
  • 6 contributors

Commits on May 27, 2026

  1. fix(cli): include resolution + synthesizer edges in indexAll report (colbymchenry#413)
    
    The orchestrator's per-file counter only sees extraction-phase edges, so the `X nodes, Y edges` line printed after `codegraph init -i` / `codegraph index` undercounts the graph — often by more than half on repos with heavy cross-file resolution (mall: 20 047 reported vs 45 629 actually in the DB).
    
    Snapshot (nodes, edges) before/after the full pipeline in `indexAll` and write the true delta back to the result. New lightweight `QueryBuilder.getNodeAndEdgeCount()` is one round-trip with no per-kind breakdowns. `indexFiles` (no resolution) and `sync` (uses `nodesUpdated`, not `nodesCreated`) are unaffected.
    
    Regression test added: `__tests__/integration/full-pipeline.test.ts > reports edgesCreated including resolution + synthesizer phases`.
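
The before/after snapshot described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `getNodeAndEdgeCount()` is named in the commit, but its return shape, the `CountSource` interface, and the `runPipeline` callback are assumptions made for the example.

```typescript
// Hypothetical return shape for the count query named in the commit.
interface GraphCounts {
  nodes: number;
  edges: number;
}

// Hypothetical stand-in for the query layer; only the count call matters here.
interface CountSource {
  getNodeAndEdgeCount(): GraphCounts;
}

interface IndexResult {
  nodesCreated: number;
  edgesCreated: number;
}

// Snapshot counts around the whole pipeline and report the true delta,
// instead of trusting a per-file counter that only sees extraction-phase edges.
function indexAllWithTrueCounts(db: CountSource, runPipeline: () => void): IndexResult {
  const before = db.getNodeAndEdgeCount();
  runPipeline(); // extraction + resolution + synthesizers
  const after = db.getNodeAndEdgeCount();
  return {
    nodesCreated: after.nodes - before.nodes,
    edgesCreated: after.edges - before.edges,
  };
}

// Toy in-memory "database" to exercise the sketch.
const fake = { nodes: 10, edges: 5 };
const db: CountSource = {
  getNodeAndEdgeCount: () => ({ nodes: fake.nodes, edges: fake.edges }),
};
const result = indexAllWithTrueCounts(db, () => {
  fake.nodes += 3; // nodes from extraction
  fake.edges += 4; // extraction-phase edges
  fake.edges += 6; // resolution + synthesizer edges a per-file counter would miss
});
console.log(result); // { nodesCreated: 3, edgesCreated: 10 }
```

A per-file counter in this toy run would have reported 4 edges; the snapshot delta reports all 10.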
    arttttt authored May 27, 2026
    3808b4d
  2. feat(jvm): resolve Java/Kotlin imports by fully-qualified name (colbymchenry#412)
    
    Wrap top-level declarations of `.kt` / `.java` files in an implicit `namespace` node carrying the file's `package`, then resolve `import com.example.foo.Bar` through that qualifiedName index — so a Bar in Models.kt resolves correctly regardless of filename, a top-level function import binds to its declaration, Java↔Kotlin interop crosses cleanly, and same-name classes across packages no longer collide. Wildcard imports still go through name-matcher.
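
A rough sketch of the lookup this paragraph describes, assuming a flat list of declarations that each carry their file's package. The `Decl` shape, both function names, and the `Criteria` file names are hypothetical; only the idea of keying on `package + "." + name` comes from the commit message.

```typescript
// Hypothetical declaration record: a top-level symbol plus its file's package.
interface Decl {
  name: string;
  pkg: string;
  filePath: string;
}

// Index declarations by fully-qualified name, e.g. "com.example.foo.Bar".
function buildQualifiedIndex(decls: Decl[]): Map<string, Decl> {
  const index = new Map<string, Decl>();
  for (const d of decls) {
    index.set(`${d.pkg}.${d.name}`, d);
  }
  return index;
}

// Resolve an import path through the index. Wildcard imports are not handled
// here; per the commit they still go through name matching.
function resolveImport(index: Map<string, Decl>, importPath: string): Decl | undefined {
  if (importPath.endsWith(".*")) return undefined;
  return index.get(importPath);
}

const index = buildQualifiedIndex([
  { name: "Bar", pkg: "com.example.foo", filePath: "Models.kt" },
  { name: "Criteria", pkg: "com.example.order", filePath: "OrderExample.java" },
  { name: "Criteria", pkg: "com.example.user", filePath: "UserExample.java" },
]);

const bar = resolveImport(index, "com.example.foo.Bar");
const orderCriteria = resolveImport(index, "com.example.order.Criteria");
const userCriteria = resolveImport(index, "com.example.user.Criteria");
const wildcard = resolveImport(index, "com.example.foo.*");
console.log(bar, orderCriteria, userCriteria, wildcard);
```

Because the key includes the package, `Bar` resolves even though its file is `Models.kt`, and the two `Criteria` classes stay distinct.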
    
    Also extracts Java/C# anonymous-class overrides (`new T() { ... }`) as first-class class nodes with their override methods. Phase 5.5 interface-impl then bridges T's abstract methods to the anonymous overrides automatically — including the lambda-returned `new T() { ... }` pattern common in guava (Splitter, CacheBuilder).
    
    Concrete impact on macrozheng/mall (524 .java files, multi-module Spring + MyBatis): 524 namespace nodes, 862 imports edges newly resolve to Java symbols, 76 distinct `Criteria` classes preserved across packages with no merge. On google/guava (3,227 .java): 3,608 anonymous classes extracted, +2,534 interface-impl edges reach overrides hidden in `new T() { ... }` blocks.
    
    Agent A/B playbook on small (spring-petclinic-kotlin, 38 .kt), medium (mall, 524 .java), large (guava, 3,227 .java) — 3 flow prompts × 2 runs/arm × 2 arms = 36 runs, claude-opus, headless. Spring repos: 0/0 Read/Grep with-arm, −27% wall-clock vs no-codegraph. Guava: 1.8 Read avg with-arm (vs 2.0 without) — improved by the anon-class extraction; residual is a lambda→SAM coverage gap orthogonal to FQN imports (filing follow-up).
    arttttt authored May 27, 2026
    34240eb
  3. test(vitest): unblock subprocess MCP tests on Node >= 25 dev machines (colbymchenry#478) (colbymchenry#479)
    
    Vitest already inherits process.env into every spawned `codegraph serve --mcp`
    child, but on Node >= 25 the CLI's hard-block (src/bin/codegraph.ts) kills the
    child before it can respond. Set CODEGRAPH_ALLOW_UNSAFE_NODE=1 via test.env so
    the test suite is green regardless of the contributor's Node version; the
    runtime guard itself is unchanged for end users.
    eddieran authored May 27, 2026
    02935d7

Commits on May 28, 2026

  1. feat(mcp): multi-module Go trace-quality + small-repo retrieval tuning (colbymchenry#494)
    
    * feat(go): generated-file down-rank + gRPC stub-impl bridge + trace-failure inlining
    
    Multi-pronged fix to make codegraph competitive on Go multi-module repos
    (cosmos-sdk, etcd) where it previously lost or tied. Driven by an 8-question
    agent-eval audit across cobra, gin, prometheus, cosmos-sdk, and etcd: the
    baseline had codegraph losing ~60% on cost on cosmos-sdk and mixed on etcd
    deep cross-module flows, while winning cleanly on the single-module and
    non-protobuf-heavy repos.
    
    Diagnostics ruled OUT `go.work` parsing as the gap (prometheus crushes
    without it). The actual failure modes were generated-file noise warping
    disambiguation, missing gRPC interface→impl bridge in structural-typing Go,
    and trace's failure path triggering 3-5 follow-up tool calls instead of
    inlining the material the agent needed.
    
    Changes:
    
    - New `src/extraction/generated-detection.ts` — path-pattern classifier
      for `.pb.go`, `.pulsar.go`, `_grpc.pb.go`, `_mock.go`, `_mocks.go`,
      `mock_*.go`, `.generated.[jt]sx?`, `_pb2(_grpc)?.py`, `.pb.{cc,h}`,
      `.g.dart`, `.freezed.dart`. Applied as a stable sort tiebreaker in
      `findSymbol`, `findAllSymbols`, `codegraph_search` (MCP + CLI),
      `codegraph_explore` file ranking, and context formatter Entry Points /
      Related Symbols / Code blocks. Cosmos's `msgServer.Send` now ranks #3
      instead of #9 on a `Send` search.
    
    - New `goGrpcStubImplEdges` synthesizer in `callback-synthesizer.ts` —
      detects `UnimplementedXxxServer` structs in generated files, identifies
      their RPC methods (excluding `mustEmbed*` / `testEmbeddedByValue` gRPC
      markers), and emits `calls` edges to the matching methods on any
      non-generated struct whose method-name set is a superset. Closes Go's
      structural-typing gap that the existing `interfaceOverrideEdges` (Java /
      Kotlin only) couldn't bridge. 467 bridge edges on cosmos-sdk; bank's
      `UnimplementedMsgServer::Send` points to `x/bank/keeper/msg_server.go`
      only, not to `msgClient` siblings or mock files.
    
    - Trace-failure rewrite (`handleTrace`) — when no static path connects
      endpoints, instead of telling the agent to call `codegraph_node` (a
      3-4-call fan-out), inline both endpoints' bodies (120 lines / 3600 chars
      per endpoint), their callers (≤6), and callees (≤8) in one response.
    
    - Trace endpoint-pairing improvements — scores every `from`×`to`
      candidate combo by shared directory prefix and tries the best-paired
      pair first (the full candidate set, not just FTS top-5). A
      less-canonical-path penalty (`enterprise/`, `contrib/`, `examples/`,
      `vendor/`, `third_party/`, `deprecated/`, `legacy/`) ensures the
      canonical-module pair wins even when a side-experiment shares more of
      its directory prefix. Find-path probe budget capped at 20 pairs.
    
    - Test-file deprioritization in `codegraph_explore` `isLowValue` — adds
      suffix patterns (`_test.go`, `_spec.rb`, `.test.ts`, `.spec.tsx`,
      `Test.java`, `Spec.kt`) alongside the existing directory-style patterns.
      Otherwise etcd's `watchable_store_test.go` consumes 5K chars of explore
      budget that should go to the hand-written flow source.
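
The first change in the list above (a path-pattern classifier used as a stable sort tiebreaker) might look roughly like this. The suffixes are the ones listed in the commit; the regexes, the `Hit` shape, `rankHits`, and the two generated file paths are illustrative assumptions, not the contents of `generated-detection.ts`.

```typescript
// Regexes are an illustrative rendering of the suffix list in the commit message.
const GENERATED_PATTERNS: RegExp[] = [
  /\.pb\.go$/, // also matches _grpc.pb.go
  /\.pulsar\.go$/,
  /_mocks?\.go$/,
  /(^|\/)mock_[^/]*\.go$/,
  /\.generated\.[jt]sx?$/,
  /_pb2(_grpc)?\.py$/,
  /\.pb\.(cc|h)$/,
  /\.(g|freezed)\.dart$/,
];

function isGeneratedPath(filePath: string): boolean {
  return GENERATED_PATTERNS.some((re) => re.test(filePath));
}

// Hypothetical search-hit shape.
interface Hit {
  name: string;
  filePath: string;
  score: number;
}

// Higher score first. On equal scores, hand-written files sort before generated
// ones. Array.prototype.sort is stable, so remaining ties keep their input order.
function rankHits(hits: Hit[]): Hit[] {
  return hits.slice().sort((a, b) => {
    if (b.score !== a.score) return b.score - a.score;
    return Number(isGeneratedPath(a.filePath)) - Number(isGeneratedPath(b.filePath));
  });
}

const ranked = rankHits([
  { name: "Send", filePath: "api/bank/tx.pulsar.go", score: 10 }, // made-up path
  { name: "Send", filePath: "x/bank/types/tx.pb.go", score: 10 }, // made-up path
  { name: "Send", filePath: "x/bank/keeper/msg_server.go", score: 10 },
]);
console.log(ranked.map((h) => h.filePath));
```

Using the generated flag only as a tiebreaker means a generated file with a strictly better score still ranks first; the classifier never hides results.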
    
    Tests:
    
    - New `__tests__/generated-detection.test.ts` (4 unit tests) pins the
      suffix patterns.
    - New "Go gRPC stub→impl synthesis" integration test suite in
      `frameworks-integration.test.ts` (2 tests): positive bridge from stub
      to hand-written impl, AND the precision case (don't bridge to a
      generated sibling like `msgClient` in the same .pb.go).
    - Full suite: 1076/1076 pass.
    
    Empirical (post-fix, n=2 average per question):
    
    | Repo / Q                  | WITH  | WITHOUT | Reads (W/WO) | Time (W/WO) |
    |---------------------------|-------|---------|--------------|-------------|
    | cobra (parse cmds)        | $0.27 | $0.27   | 0 / 4        | 39s / 60s   |
    | prometheus (scrape→TSDB)  | $0.63 | $0.70   | 0 / 6        | 106s / 143s |
    | cosmos-sdk Q1 (MsgSend)   | $0.41 | $0.26   | 1 / 2        | 67s / 64s   |
    | cosmos-sdk Q2 (Delegate)  | $0.47 | $0.46   | 0 / 5        | 50s / 73s   |
    | cosmos-sdk Q3 (gov tally) | $0.34 | $0.31   | 1.5 / 3      | 54s / 76s   |
    | etcd Q1 (Put→raft)        | $0.65 | $0.78   | 0 / 4        | 98s / 129s  |
    | etcd Q2 (watch)           | $0.36 | $0.50   | 0 / 4+       | 58s / 89s   |
    
    Codegraph wins on reads on every question and on time on all but cosmos-sdk
    Q1 (67s vs 64s). Cost is mixed: 3 clean wins, 3 tied (within 10%), 1 stubborn
    cost loss on the grep-favored Q1.
    Compared to baseline, the cosmos-sdk cost-gap collapsed from -60% to -15%
    on average, and Q3 went from a 75% loss to a tie. Raw run artifacts in
    `/tmp/cg-finalv2-*/` and `/tmp/cg-final-*/`.
    
    Memory written at `project_go_multi_module_audit.md` for the methodology
    + before/after numbers.
    
    🤖 Generated with [Claude Code](https://claude.com/claude-code)
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    
    * feat(mcp): auto-inline trace in codegraph_context for flow queries
    
    When a codegraph_context task contains a flow keyword ("trace", "from",
    "reach", "flow", "propagat", "how does", "how do") AND at least two
    distinct PascalCase / camelCase identifiers, internally invoke trace
    between the first two extracted symbols and splice the trace body into
    the context response. Conservative trigger by design: false positives
    waste one graph query; false negatives just fall back to the agent
    calling trace itself (existing path-proximity wiring handles either
    case).
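
The trigger described above could be approximated as below. The keyword list is taken from the commit; the identifier regex is an assumption (it requires an interior capital letter, so plain capitalized words such as "How" do not count), and `extractSymbols` / `shouldAutoTrace` are hypothetical names.

```typescript
// Flow keywords as listed in the commit message.
const FLOW_KEYWORDS = ["trace", "from", "reach", "flow", "propagat", "how does", "how do"];

// Assumed identifier shape: a word with at least one interior uppercase letter,
// e.g. msgServer or SendCoins. Plain words like "How" or "cobra" do not match.
function extractSymbols(task: string): string[] {
  const matches = task.match(/\b[A-Za-z][a-z0-9]*[A-Z][A-Za-z0-9]*\b/g) ?? [];
  return Array.from(new Set(matches)); // distinct, in order of first appearance
}

// Returns the endpoint pair to trace, or null when the trigger should not fire.
function shouldAutoTrace(task: string): { from: string; to: string } | null {
  const lower = task.toLowerCase();
  const hasFlowKeyword = FLOW_KEYWORDS.some((kw) => lower.includes(kw));
  const [from, to] = extractSymbols(task);
  if (!hasFlowKeyword || from === undefined || to === undefined) return null;
  return { from, to };
}

const cobraQuery = shouldAutoTrace("How does cobra parse commands and flags?");
const bankQuery = shouldAutoTrace("How does msgServer reach setBalance?");
console.log(cobraQuery, bankQuery);
```

The cobra question contains a flow keyword but no qualifying identifiers, so it falls through to the normal context path, matching the "Doesn't fire on" note in the commit.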
    
    Goal: collapse the agent's typical context → trace → explore sequence
    into a single context call for clear flow queries, closing the
    remaining cost-overhead gap on multi-call patterns. The path-proximity
    + less-canonical-path scoring + the trace-failure-inlined-bodies
    behavior already let the inline trace land on the right endpoint pair
    and return enough material that no follow-up codegraph_node/Read is
    needed.
    
    Doesn't fire on:
    - cobra's "How does cobra parse commands and flags?" (no PascalCase
      symbols) — verified in regression run, no behavior change ($0.260
      WITH vs $0.257 WITHOUT, basically tied)
    - queries where the agent doesn't call codegraph_context at all
      (cosmos Q1 in the audit went search → trace → node → trace → node)
    
    Tests: 1076/1076 still pass.
    
    🤖 Generated with [Claude Code](https://claude.com/claude-code)
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    
    * feat(mcp): trace failure inlines TO file siblings to displace node fan-out
    
    The cosmos-Q1 audit revealed a static-resolution gap: msgServer.Send's
    *real* next hop is `k.Keeper.SendCoins` — an interface-method call on an
    embedded field that tree-sitter can't resolve. The static getCallees list
    for msgServer.Send is all utility/error functions (StringToBytes, Wrapf,
    …). The actual flow (SendCoins → subUnlockedCoins → addCoins →
    setBalance) lives entirely inside `x/bank/keeper/send.go`, which is also
    where the TO endpoint (setBalance) lives.
    
    When trace fails (no static path), inline the **top 5 functions/methods
    in the destination file**, ordered by line-distance from the TO node.
    This catches the flow that interface-method calls obscure — the
    canonical "k.<Iface>.<Method>" pattern in Go, also relevant to Java
    dependency-injection / Rails service-object dispatch / etc. where
    interface dispatch hides the real call.
    
    Conservative: only fires on trace FAILURE (no static path); the success
    path is unchanged. Per-body cap (40 lines / 1200 chars), top 5 siblings.
    Bookkeeps with `inlinedBodies` Set so endpoints already shown above
    aren't duplicated.
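
A minimal sketch of the sibling selection and per-body cap described above. The limits (5 siblings, 40 lines / 1200 chars) and the `inlinedBodies` set come from the commit; the `FnNode` shape, the function names, and every line number in the example are assumptions.

```typescript
// Hypothetical node shape for functions/methods in the destination file.
interface FnNode {
  id: string;
  name: string;
  filePath: string;
  startLine: number;
}

const MAX_SIBLINGS = 5;

// Pick the functions in the TO node's file that sit closest to it by line
// distance, skipping anything whose body was already inlined.
function pickSiblings(to: FnNode, fileFunctions: FnNode[], inlinedBodies: Set<string>): FnNode[] {
  return fileFunctions
    .filter((n) => n.filePath === to.filePath && !inlinedBodies.has(n.id))
    .sort((a, b) => Math.abs(a.startLine - to.startLine) - Math.abs(b.startLine - to.startLine))
    .slice(0, MAX_SIBLINGS);
}

// Per-body cap: at most 40 lines and 1200 characters.
function capBody(body: string, maxLines = 40, maxChars = 1200): string {
  return body.split("\n").slice(0, maxLines).join("\n").slice(0, maxChars);
}

const file = "x/bank/keeper/send.go";
const fn = (name: string, startLine: number): FnNode => ({ id: name, name, filePath: file, startLine });
const setBalance = fn("setBalance", 300); // line numbers are invented
const candidates = [
  fn("helperA", 10),
  fn("helperB", 20),
  fn("helperC", 30),
  fn("SendCoins", 200),
  fn("subUnlockedCoins", 250),
  fn("addCoins", 280),
  setBalance,
];
const picked = pickSiblings(setBalance, candidates, new Set([setBalance.id]));
console.log(picked.map((n) => n.name));
```

Ordering by distance from the TO node is a locality heuristic: it costs no extra graph queries and tends to surface the helpers defined next to the destination.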
    
    Result: cosmos-Q1 — historically the most stubborn cost loss (-2.2× to
    -39% across the audit) — flipped to a clean WIN: $0.257 WITH vs $0.449
    WITHOUT (-43%), 34s vs 79s, 0 Reads vs 2 Reads + 5 Greps, 5 codegraph
    calls vs 12. Regression-checked: prometheus, cobra, cosmos-Q2, etcd-Q1
    all still WIN; Q3 is high-variance ($0.30-$0.45 range historically) and
    fell within that on this run.
    
    🤖 Generated with [Claude Code](https://claude.com/claude-code)
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    
    * feat: extend coverage to all supported languages, not just Go
    
    PR review feedback: the audit was Go-driven, so the patterns I added
    were Go-flavored. Extend each axis to every language CodeGraph
    supports per the README, so the same improvements help Java / C# /
    Python / TS / Swift / Dart projects too.
    
    **generated-detection.ts** — Added patterns for:
    - TS/JS: `.gen.[jt]sx?`, `.pb.[jt]s`, `_pb.[jt]s`, `_grpc_pb.[jt]s`
      (ts-proto, gRPC-web, Apollo / GraphQL codegen, Hasura).
    - Python: `_pb2.pyi` (mypy stubs from protobuf).
    - C#: `.g.cs` (T4 / Razor codegen), `Grpc.cs` (protoc-gen-csharp).
    - Java: `OuterClass.java` (protoc-gen-java), `Grpc.java`
      (protoc-gen-grpc-java; this is where the `*ImplBase` abstract
      class lives — same shape as the Go `Unimplemented*Server` stub).
    - Swift: `.pb.swift` (protoc-gen-swift).
    - Dart: `.pb.dart`, `.pbgrpc.dart`, `.chopper.dart`.
    - Rust: `.generated.rs`.
    
    **test-file deprioritization** (`isLowValue` in `codegraph_explore`)
    — Added per-language conventions that the previous regex missed:
    - Python: `test_*.py` (pytest discovery) and `*_test.py`.
    - Ruby: `*_test.rb` (minitest) — `*_spec.rb` already covered.
    - C#: `*Tests.cs`, `*Test.cs`, `*Spec.cs`.
    - Swift: `*Tests.swift` (XCTest).
    - Dart: `*_test.dart`.
    
    **IFACE_OVERRIDE_LANGS** in `callback-synthesizer.ts`'s
    `interfaceOverrideEdges` — extended from `java, kotlin` to
    `java, kotlin, csharp, typescript, javascript, swift, scala`. Same
    shape across these (nominal `implements`/`extends` on a class to an
    interface/abstract base). Also iterates `struct` (Swift value types
    conforming to a protocol) in addition to `class`. The existing
    matchesSymbol-style logic and `getOutgoingEdges(..., ['implements',
    'extends'])` work unchanged.
    
    **CLAUDE.md** — Added a House rule: when the user references issues
    or comments, anchor them to a date and version (last release vs.
    last main commit vs. current branch tip) BEFORE concluding a fix is
    incomplete. Issue colbymchenry#388 comments from May 25-27 were responding to
    the released v0.9.5 / merged-PR-469 state — not to this branch's
    in-flight work. The new rule walks through the disambiguation:
    `grep -m1 '^## \[' CHANGELOG.md` for release version, `git log
    --first-parent main -1` for main tip.
    
    Tests: 1076/1076 still pass.
    
    🤖 Generated with [Claude Code](https://claude.com/claude-code)
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    
    * feat(mcp): tiny-repo tool gating + shorter tool descriptions
    
    Two cumulative changes targeting the small-repo cost gap surfaced by
    the cross-language audit:
    
    1. **Tool descriptions trimmed** (~2.1KB total saved across 10 tools).
       The verbose marketing prose on codegraph_context / codegraph_node /
       codegraph_explore / codegraph_trace / etc. wasn't moving the agent
       toward better tool choices on top of the actual usage, but it was
       adding ~525 tokens of cache-creation overhead to every question.
       The trimmed descriptions keep the operational hints (e.g. "Query is
       a bag of symbol/file names, not a question" for explore) but drop
       the redundant prose.
    
    2. **Dynamic tiny-repo tool gating** in `ToolHandler.getTools()`. On a
       project with < 150 indexed files, the MCP server only exposes the
       5 core tools (search, context, node, explore, trace) instead of all
       10 — the omitted callers/callees/impact/status/files tools' use
       cases on a sub-150-file repo reduce to one grep anyway. The MCP
       tool-defs overhead is the #1 source of cost loss on tiny repos
       (~$0.10-0.15 fixed cache-creation per question); cutting 5 tools
       drops that by ~50%.
    
       Effect on ky (~25 files, the worst pre-fix offender):
         - Before: $0.59 WITH vs $0.42 WITHOUT (+42% loss, n=1)
         - After:  $0.32 WITH vs $0.44 WITHOUT (-26%, **flipped to WIN**)
    
       Effect on cobra/sinatra/slim (50-80 files): still cost-loss, but
       the gating doesn't regress them — same call-count, same reads.
       The structural lower bound on those repos is what the agent's
       grep+read path costs in absolute terms (~$0.20-0.30).
    
       Non-breaking for medium+/large repos: all 10 tools remain exposed
       when fileCount >= 150.
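
The gating rule in item 2 can be sketched as a single threshold check. The threshold (150 in this commit) and the split into five core and five extra tools come from the text above; the standalone `getTools` function and the `codegraph_` prefix on the five extra tool names are assumptions.

```typescript
// Threshold as of this commit; a later commit in this range raises it to 500.
const TINY_REPO_FILE_THRESHOLD = 150;

const CORE_TOOLS = ["search", "context", "node", "explore", "trace"];
const EXTRA_TOOLS = ["callers", "callees", "impact", "status", "files"];

// Expose only the core tools on tiny repos, all ten otherwise.
function getTools(fileCount: number): string[] {
  const names = fileCount < TINY_REPO_FILE_THRESHOLD ? CORE_TOOLS : CORE_TOOLS.concat(EXTRA_TOOLS);
  return names.map((n) => `codegraph_${n}`);
}

console.log(getTools(25)); // ky-sized repo: core tools only
console.log(getTools(150).length);
```

The saving comes from the tool definitions themselves: fewer exposed tools means fewer definition tokens in every request, independent of what the agent then calls.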
    
    Tests: 1076/1076 still pass.
    
    🤖 Generated with [Claude Code](https://claude.com/claude-code)
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    
    * feat(mcp): combined tiny-tier — smaller explore + tool gating (cobra/ky flip to WIN)
    
    Combines the tool gating from the previous commit with a matching
    explore-budget cut for projects under 150 files. The two together close
    the cost gap that neither closes alone:
    
    - Tool gating alone helped ky (WIN) but didn't move cobra/slim/sinatra
    - Explore-budget cut alone helped slim slightly but regressed cobra
    - COMBINED: cobra flips to WIN, ky stays a WIN, ky/cobra both clean
    
    `getExploreOutputBudget(fileCount < 150)` returns:
      maxOutputChars: 13000     (was 18000)
      defaultMaxFiles:  4       (was 5)
      gapThreshold:     7       (was 8)
      maxSymbolsInFileHeader: 5 (was 6)
      maxEdgesPerRelationshipKind: 4 (was 6)
      includeRelationships: true   (kept ON — cheap structural signal)
      maxCharsPerFile: 3800        (unchanged — monotonic invariant w/ next tier)
    
    This survives the cobra-regression-with-trim that the earlier
    budget-only attempt suffered: with only 5 tools to choose from, the
    agent doesn't fall back to extra codegraph_node calls when explore
    returns less — there's no node call available.
    
    Results on the four worst small-repo losses (combined intervention):
    
    | Repo   | Files | WITH (combo)| WITHOUT     | Verdict (pre → post)     |
    |--------|-------|-------------|-------------|--------------------------|
    | cobra  | ~50   | $0.25       | $0.31       | loss → **WIN** (-19%)    |
    | ky     | ~25   | $0.39       | $0.39       | LOSS 42% → tied          |
    | slim   | ~80   | $0.31       | $0.24       | LOSS 31% → still LOSS    |
    | sinatra| ~60   | $0.30       | $0.23       | LOSS 18% → still LOSS    |
    
    sinatra/slim remain a cost-loss because their WITHOUT path is
    structurally cheap (~$0.20 — fewer than 4 cheap grep+read calls).
    Codegraph can't beat that absolute floor with any meaningful response.
    Both still WIN on time + reads + tool-call count.
    
    Tests: tier boundary cases updated to cover the new <150 / 150-499 /
    500-4999 / 5000-14999 / >=15000 progression. Off-by-one guard updated
    to include the new 149↔150 boundary. All 1076 tests pass.
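
For illustration, the tier progression named in the test note can be written as a chain of boundary checks. The boundaries and the tiny-tier budget values are from this commit; the tier names and the `tierFor` function are hypothetical, and the budgets of the larger tiers are omitted because the commit message does not state them.

```typescript
// Tier names are invented for this sketch; only the boundaries come from the commit.
type Tier = "tiny" | "small" | "medium" | "large" | "huge";

function tierFor(fileCount: number): Tier {
  if (fileCount < 150) return "tiny";
  if (fileCount < 500) return "small";
  if (fileCount < 5000) return "medium";
  if (fileCount < 15000) return "large";
  return "huge";
}

// Tiny-tier explore budget as listed in the commit message.
const TINY_BUDGET = {
  maxOutputChars: 13000,
  defaultMaxFiles: 4,
  gapThreshold: 7,
  maxSymbolsInFileHeader: 5,
  maxEdgesPerRelationshipKind: 4,
  includeRelationships: true,
  maxCharsPerFile: 3800,
};

// The off-by-one guard amounts to checking both sides of each boundary.
const boundaryChecks: Array<[number, Tier]> = [
  [149, "tiny"],
  [150, "small"],
  [499, "small"],
  [500, "medium"],
  [4999, "medium"],
  [5000, "large"],
  [14999, "large"],
  [15000, "huge"],
];
for (const [count, expected] of boundaryChecks) {
  console.log(count, tierFor(count) === expected);
}
console.log(TINY_BUDGET.maxOutputChars);
```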
    
    🤖 Generated with [Claude Code](https://claude.com/claude-code)
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    
    * feat(context): trim maxNodes default to 8 on tiny repos
    
    On a <150-file project the entire repo is grep-able in one turn, so the
    20-node default `codegraph_context` was paying for a graph subset that
    exceeds the agent's actual question. Cutting the tiny-repo default to 8
    (typical 1-3 entry points + their immediate 1-hop neighbors) reduces
    the context-tool response body without hitting sufficiency on the flow
    shapes small repos actually contain.
    
    Non-breaking: the agent can still pass an explicit `maxNodes` to
    override; medium+ repos (>=150 files) keep the 20-node default.
    
    🤖 Generated with [Claude Code](https://claude.com/claude-code)
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    
    * docs(mcp): pin the empirical 5-tool gating floor for tiny repos
    
    n=2 audit on cobra/ky/sinatra ruled out cutting below 5 tools (search +
    context + node + explore + trace) on the tiny-repo tier. The smaller
    3-tool gate (search + context + trace) saved ~$0.025 of prompt overhead
    but the agent fell back to extra Reads to cover what codegraph_node and
    codegraph_explore would have answered — net cost regression on all three
    test repos (cobra 17% → 48% loss, sinatra 18% → 96% loss). Documented
    inline so future tuners don't re-try this dead-end.
    
    No behavior change beyond the comment: the 5-tool gate remains the
    production setting.
    
    🤖 Generated with [Claude Code](https://claude.com/claude-code)
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    
    * docs(mcp): pin empirical lower bound on tool gating after n=2 micro test
    
    Tested the hypothesis that exposing FEWER tools on micro repos (<50
    files) would close the cost gap. Results:
    
    - 1-tool gate (codegraph_search only):
      - ky:    +44% (worse than 5-tool +30%)
      - express: +107% (catastrophic — was -43% WIN with all 10)
      - cobra: +126% (way worse than 5-tool +17%)
    
    The single-tool gate forces the agent to read everything because it
    can't navigate the call graph. The 4 omitted tools (context, node,
    explore, trace) were doing real work that grep+Read can't replicate.
    
    Conclusion: 5 tools (search + context + node + explore + trace) is the
    empirical lower bound on the tiny-repo tier. Cutting below regresses
    EVERY tested repo. The remaining ~$0.04-0.08 of structural cost overhead
    on tiny repos is unavoidable without sacrificing the value codegraph
    provides at that scale (which would also make WITH = WITHOUT, defeating
    the install).
    
    Comment documents the dead-ends so future tuners don't relitigate.
    
    🤖 Generated with [Claude Code](https://claude.com/claude-code)
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    
    * feat(mcp): iter3/iter4 — raise tool-gate to 500, sufficiency steering in context, hard-exclude low-value files
    
    Three layered changes targeting the sinatra/slim/small-repo cost gap
    that iter2's body-shrink failed to close (smaller bodies just pushed
    the agent to Read instead):
    
    1. **Tool-gate threshold 150 → 500** (`TINY_REPO_FILE_THRESHOLD`).
       Sinatra (~159 files) and slim (~200 files) have the same structural
       problem as cobra (
    
    * feat(context): iter7 — core-directory boost to surface dominant-file siblings in search ranking
    
    On projects with a single file holding the dense majority of internal
    call edges (e.g. sinatra's `lib/sinatra/base.rb` at ~85% of in-file
    edges), text search was favoring small focused extension files over the
    core file. A small focused file like `multi_route.rb` wins on verbatim
    name match + file-size normalization, burying the 1500-line core file's
    longer method names (e.g. `route!` vs `route`).
    
    Fix: detect the "dominant file" — the file whose in-file edge count is
    ≥3× the next candidate's — then add +25 to all results sharing its
    directory prefix. This pulls the core file's siblings above
    sibling-package extensions without hardcoding any repo structure.
    
    `getDominantFile()` excludes test/spec files and generated files
    (e.g. etcd's `rpc.pb.go` has 4× the in-file edges of `server.go` and
    would otherwise hijack the boost toward generated protobuf stubs).
    SQL pulls the top 20 candidates; path-pattern filtering handles what
    SQLite LIKE can't express.
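
A sketch of the dominant-file check and the +25 boost, under stated assumptions. The 3× ratio, the +25 boost, and the exclusion of test/spec/generated files come from the commit; the data shapes, `applyCoreBoost`, the edge counts, and all file paths other than `lib/sinatra/base.rb` are illustrative. This sketch also treats a lone candidate as dominant, which the commit does not specify.

```typescript
interface FileEdgeCount {
  filePath: string;
  inFileEdges: number;
}

interface SearchResult {
  filePath: string;
  score: number;
}

const DOMINANCE_RATIO = 3;
const CORE_DIR_BOOST = 25;

// Return the file whose in-file edge count is at least 3x the runner-up's,
// ignoring excluded (test/spec/generated) paths; null if no file dominates.
function getDominantFile(candidates: FileEdgeCount[], isExcluded: (p: string) => boolean): string | null {
  const ranked = candidates
    .filter((c) => !isExcluded(c.filePath))
    .sort((a, b) => b.inFileEdges - a.inFileEdges);
  const top = ranked[0];
  const next = ranked[1];
  if (top === undefined) return null;
  if (next !== undefined && top.inFileEdges < DOMINANCE_RATIO * next.inFileEdges) return null;
  return top.filePath;
}

function dirOf(filePath: string): string {
  const i = filePath.lastIndexOf("/");
  return i === -1 ? "" : filePath.slice(0, i + 1);
}

// Add the boost to every result that shares the dominant file's directory prefix.
function applyCoreBoost(results: SearchResult[], dominant: string | null): SearchResult[] {
  if (dominant === null) return results;
  const prefix = dirOf(dominant);
  return results.map((r) =>
    r.filePath.startsWith(prefix) ? { filePath: r.filePath, score: r.score + CORE_DIR_BOOST } : r
  );
}

const isTestPath = (p: string): boolean => /(^|\/)(test|spec)\//.test(p);
const dominant = getDominantFile(
  [
    { filePath: "test/routing_test.rb", inFileEdges: 900 }, // excluded
    { filePath: "lib/sinatra/base.rb", inFileEdges: 850 },
    { filePath: "lib/sinatra/show_exceptions.rb", inFileEdges: 60 },
  ],
  isTestPath
);
const boosted = applyCoreBoost(
  [
    { filePath: "lib/sinatra/base.rb", score: 40 },
    { filePath: "sinatra-contrib/lib/sinatra/multi_route.rb", score: 50 },
  ],
  dominant
);
console.log(dominant, boosted);
```

In this toy run the core file moves from 40 to 65 and overtakes the extension file at 50, which is the reordering the commit is after.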
    
    * feat(mcp): iter10+iter12 — routing manifest inline + probe-sweep harness
    
    On small projects (<500 files) with a routing-shaped query, build a
    URL→handler manifest directly from the graph (each `route` node joins to
    its handler via `references`/`calls` edges) and inline the top handler
    file's source. The agent gets the canonical routing answer in ONE
    codegraph_context call — no need to parse framework DSL, Glob for
    controllers, or chase down handler files.
    
    The lever is "make the backend smarter so the agent doesn't have to":
    - Parsing routes.rb / routes/api.php / urls.py DSL is the agent's job
      in the WITHOUT arm. Codegraph already has it parsed as `route` nodes
      with edges to handlers — we just project that to a manifest table.
    - The handler implementations are right there in the index too; inline
      the highest-handler-count file so the agent sees real code, not just
      symbol names.
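
One way to picture the manifest projection described above, assuming simple node and edge records. The `route` node kind and the `references` / `calls` edge kinds are from the commit; the record shapes, `buildRouteManifest`, and the example routes and file paths are hypothetical.

```typescript
interface GNode {
  id: string;
  kind: string;
  name: string;
  filePath: string;
}

interface GEdge {
  source: string;
  target: string;
  kind: string;
}

interface ManifestRow {
  route: string;
  handler: string;
  file: string;
}

// Join each route node to its handler through references/calls edges, and
// track which file holds the most handlers so its source can be inlined.
function buildRouteManifest(
  nodes: GNode[],
  edges: GEdge[]
): { rows: ManifestRow[]; topHandlerFile: string | null } {
  const byId = new Map<string, GNode>();
  for (const n of nodes) byId.set(n.id, n);

  const rows: ManifestRow[] = [];
  const handlersPerFile = new Map<string, number>();
  for (const e of edges) {
    if (e.kind !== "references" && e.kind !== "calls") continue;
    const route = byId.get(e.source);
    const handler = byId.get(e.target);
    if (route === undefined || handler === undefined || route.kind !== "route") continue;
    rows.push({ route: route.name, handler: handler.name, file: handler.filePath });
    handlersPerFile.set(handler.filePath, (handlersPerFile.get(handler.filePath) ?? 0) + 1);
  }

  let topHandlerFile: string | null = null;
  let best = 0;
  for (const [file, count] of Array.from(handlersPerFile)) {
    if (count > best) {
      best = count;
      topHandlerFile = file;
    }
  }
  return { rows, topHandlerFile };
}

const manifest = buildRouteManifest(
  [
    { id: "r1", kind: "route", name: "GET /api/articles", filePath: "config/routes.rb" },
    { id: "r2", kind: "route", name: "POST /api/articles", filePath: "config/routes.rb" },
    { id: "r3", kind: "route", name: "POST /api/users/login", filePath: "config/routes.rb" },
    { id: "h1", kind: "method", name: "index", filePath: "app/controllers/articles_controller.rb" },
    { id: "h2", kind: "method", name: "create", filePath: "app/controllers/articles_controller.rb" },
    { id: "h3", kind: "method", name: "login", filePath: "app/controllers/users_controller.rb" },
  ],
  [
    { source: "r1", target: "h1", kind: "references" },
    { source: "r2", target: "h2", kind: "references" },
    { source: "r3", target: "h3", kind: "calls" },
    { source: "h1", target: "h2", kind: "calls" }, // not a route edge, ignored
  ]
);
console.log(manifest.rows, manifest.topHandlerFile);
```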
    
    Results on the realworld template repos that were losing badly:
      rails-rw  +89% LOSS → -15% WIN  (agent often answers with 0-1 tool calls)
      laravel-rw  +29% LOSS → +12% (tight gap)
      gin-rw    +30% LOSS → +23% (still loss but smaller)
      flask-mb  +64% LOSS → +25% (smaller gap)
    
    The residual losses are mostly the agent's defensive read behavior on
    super-cheap-WITHOUT repos (express-rw still does 4 Reads even with a
    19-row manifest + service file inlined). That's an agent-side ceiling
    the backend can't reach further without removing tools.
    
    Also lands `scripts/agent-eval/probe-sweep.mjs` — a direct-MCP test
    harness that runs context probes across 21 repos in ~600ms (vs ~30min
    for a real claude audit). Enables rapid iteration on backend changes:
    edit tools.ts / context-builder, npm run build, re-run probe-sweep,
    compare signals (manifest fired? handler file inlined? response size?)
    before paying for a claude run.
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    
    * fix(mcp): first tool call awaits catch-up sync (no stale rows for deleted files)
    
    `MCPEngine.catchUpSync()` reconciles the index against the working tree
    after open (catching `git pull`/`checkout`/`rebase` and any edits or
    deletes made while no server was running). It was fire-and-forget — so a
    tool call landing in the first ~50-300ms could race past it and serve
    rows for files that no longer exist on disk. The per-file staleness
    banner can't help here, because that signal is populated by the file
    watcher (not by catch-up).
    
    The fix: `catchUpSync()` now pushes its promise into `ToolHandler` via
    `setCatchUpGate(p)`; the first `execute()` call awaits the gate and then
    clears it. Subsequent calls pay nothing. Catch-up rejections are logged
    by the engine and swallowed by the handler so a transient sync failure
    never breaks tools.
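
The gate mechanism described above can be sketched as follows. `setCatchUpGate` and `execute` are named in the commit, but the class body here is an assumption about the general pattern (await once, then clear), not the project's implementation; `execute` takes a plain callback only to keep the example self-contained.

```typescript
class ToolHandler {
  private catchUpGate: Promise<void> | null = null;

  setCatchUpGate(p: Promise<void>): void {
    // Swallow rejections so a transient sync failure never breaks a tool call.
    this.catchUpGate = p.catch(() => undefined);
  }

  async execute(run: () => string): Promise<string> {
    const gate = this.catchUpGate;
    if (gate !== null) {
      await gate; // only calls that arrive before catch-up finishes wait here
      if (this.catchUpGate === gate) this.catchUpGate = null; // later calls pay nothing
    }
    return run();
  }
}

// Demo: the tool body must not run until the simulated catch-up sync resolves.
const events: string[] = [];
const handler = new ToolHandler();
let release: () => void = () => {};
const sync = new Promise<void>((resolve) => {
  release = () => resolve();
});
handler.setCatchUpGate(sync);

const done = handler.execute(() => {
  events.push("tool ran");
  return "ok";
});
events.push("sync done");
release();

done.then((out) => console.log(out, events));
```

Storing the already-caught promise is what makes the failure path safe: the handler awaits a promise that can no longer reject, while the engine remains free to log the original error.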
    
    Most visible on the "deleted everything between sessions" case, where
    MCP previously returned stale rows pointing at non-existent files.
    Validated end-to-end on a 10,640-file VS Code index: with the gate, a
    codegraph_search for "ExtensionHost" against an empty (but stale-DB)
    directory returns "No results found" after the catch-up drains the DB;
    without the gate, the same call returns 10 stale hits.
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    
    * docs(changelog): cover small-repo retrieval tuning + auto-trace + iface-override expansion
    
    Add entries for work that landed on this branch but wasn't yet in
    [Unreleased]: tiny-repo tool gating + sufficiency steering + budget
    tier, auto-inline trace in codegraph_context, routing manifest inline,
    core-directory ranking boost, JVM-only interfaceOverrideEdges extended
    to C#/TS/JS/Swift/Scala, and the shorter tool descriptions.
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    
    ---------
    
    Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    colbymchenry and claude authored May 28, 2026
    71935e3
  2. fix(windows): suppress console popup on child_process calls (colbymchenry#498)
    
    On Windows, v0.9.5's detached shared daemon (colbymchenry#411) has no inherited console,
    so any console-subsystem child it spawns gets a fresh visible console window
    unless the spawn passes `windowsHide: true`. The fix adds the flag to all
    ten `spawnSync` / `execFileSync` / `execSync` call sites across extraction,
    sync, installer, and the WASM-flags relaunch. macOS/Linux ignore the option,
    so this is a no-op elsewhere.
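
For reference, this is what passing the flag looks like at a single call site. `windowsHide` is a documented option of Node's `child_process` functions; the command being run here is only a placeholder, not one of the project's ten call sites.

```typescript
import { execFileSync } from "node:child_process";

// windowsHide suppresses the console window a child would otherwise get on
// Windows; on macOS and Linux the option has no effect.
const output = execFileSync(process.execPath, ["-e", "console.log('ok')"], {
  encoding: "utf8",
  windowsHide: true,
});
console.log(output.trim()); // ok
```

The same option is accepted by `spawnSync` and `execSync`, so the fix is the same one-line addition to each call's options object.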
    
    Fixes colbymchenry#485, colbymchenry#510, colbymchenry#530.
    
    Co-authored work:
    - colbymchenry#498 (csw-chen) — full sweep across extraction, sync, installer, and wasm-runtime. **This is the change being merged.**
    - colbymchenry#505 (yushengruohui) — independently identified and fixed the 7 git execFileSync sites. Superseded by colbymchenry#498's broader sweep; same diagnosis.
    - colbymchenry#521 (JirA44) — independently identified and fixed the WASM-runtime spawnSync re-exec. Superseded by colbymchenry#498's broader sweep; same diagnosis.
    
    Validated on Windows 11 ARM64 (Parallels): a detached parent's 15 git spawns produce 15 visible black flash-windows without the fix and 0 with it.
    csw-chen authored May 28, 2026
    cea78ce
  3. fix(installer): stop duplicating agent instructions; MCP server is the single source of truth (colbymchenry#529) (colbymchenry#538)
    
    The installer wrote a `## CodeGraph` usage block into each agent's
    instructions file (CLAUDE.md / AGENTS.md / GEMINI.md / .cursor/rules /
    Kiro steering) that duplicated, almost verbatim, the guidance the MCP
    server already emits in its `initialize` response — so agents that
    surface MCP instructions (Claude Code) read the same playbook twice
    every turn.
    
    All 6 instruction-writing targets (claude, cursor, codex, opencode,
    gemini, kiro) now stop writing the block. install self-heals by
    stripping a block a previous version wrote (uninstall already did), so
    the next `codegraph install`/`uninstall` cleans up existing installs;
    upgrading the package alone does not (the leftover block is harmless).
    server-instructions.ts is now the single source of truth — the two
    steers unique to the old template ("trust codegraph, don't re-verify
    with grep" and the not-initialized -> `init -i` hint) are ported there.
    
    Removes the now-dead INSTRUCTIONS_TEMPLATE / CLAUDE_MD_TEMPLATE,
    claude-md-template.ts, writeClaudeMd / hasClaudeMdSection, and the
    Cursor-only wireProjectSurfaces bootstrap. The install log learned a
    "Removed" verb. Tests rewritten to the new contract + self-heal
    coverage (140/140 installer tests pass).
    
    Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
    colbymchenry and claude authored May 28, 2026
    a9c9e76
  4. chore(release): bump version to 0.9.7

    Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
    colbymchenry and claude committed May 28, 2026
    15dbcdb
  5. docs(changelog): promote [Unreleased] into [0.9.7]

    [skip ci] Auto-generated by Release workflow.
    github-actions[bot] committed May 28, 2026
    f29825c
  6. docs(changelog): rewrite all release notes into friendly New Features / Fixes format
    
    Distill every release's engineer-facing entry into plain-language, user-readable
    notes (New Features / Fixes, with Breaking Changes / Security surfaced where they
    apply). Also reconcile the mislabeled [0.7.8] block to [0.7.9] to match the
    published GitHub release tag and fix its dead link reference.
    
    Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
    colbymchenry and claude committed May 28, 2026
    2e19234
  7. docs(claude): codify friendly New Features / Fixes changelog format

    Update the 'Writing changelog entries' guidance to match the rewritten CHANGELOG:
    friendly New Features / Fixes sections (Breaking Changes / Security only when
    present), one plain sentence per bullet, strip internal paths/symbols/benchmarks,
    keep #PR refs + contributor thanks. Notes why multi-word headings are safe on the
    normal release path.
    
    Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
    colbymchenry and claude committed May 28, 2026
    54cacea