
Commit ddb1a8f

colbymchenry and claude authored
fix: issue-triage quick wins (extraction, MCP probes, gitignore, CJK, impact) (colbymchenry#654)
Batch of small, localized fixes from an open-issue triage:

- .codegraph/.gitignore now ignores everything but itself, so the database, daemon.pid, sockets, and logs stop showing up in git status (colbymchenry#492, colbymchenry#484)
- MCP server answers resources/list and prompts/list with empty lists instead of -32601, clearing scary log lines in opencode/Codex (colbymchenry#621)
- index SAP HANA .xsjs/.xsjslib as JavaScript (colbymchenry#556) and TS .mts/.cts (colbymchenry#366)
- visit anonymous AMD/CommonJS/IIFE wrapper bodies so their inner functions and calls are indexed instead of coming up empty (colbymchenry#528)
- batch the changed-file lookup so a huge first sync no longer hits "too many SQL variables" (colbymchenry#540)
- list files with `git ls-files -z` so non-ASCII/CJK paths survive core.quotepath and are no longer silently skipped (colbymchenry#541)
- attach Go methods on generic receivers (*T[P]) to their type (colbymchenry#583, RC1)
- impact no longer climbs the structural `contains` edge, so a leaf symbol stops dragging in its sibling methods (colbymchenry#536)
- README: explicit `codegraph install` step, run in a new shell (colbymchenry#631)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 2a22f9f commit ddb1a8f

15 files changed

Lines changed: 197 additions & 45 deletions

CHANGELOG.md

Lines changed: 9 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -13,6 +13,15 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
1313

1414
- The background file watcher no longer exhausts your machine's file-descriptor budget. On macOS it previously kept **one open file handle per watched file**, so on a large project the running MCP server could pile up tens of thousands of handles and blow past the system-wide limit — at which point *unrelated* apps (your shell, editor, Docker, browser) started failing with "too many open files" until the codegraph process was killed. The watcher now uses a single recursive watch on macOS and Windows, and bounded per-directory watches on Linux, so its cost stays flat no matter how large the project is. (#644, #496, #555, #628, #579)
1515
- Indexing a project with very symbol-dense files (tens of thousands of functions or methods in a single file) no longer runs out of memory. The step that links dynamic call relationships used to load every function and method into memory at once, which could exhaust the heap and abort indexing with "JavaScript heap out of memory" on large or generated codebases; it now streams them, so memory stays flat no matter how many symbols the project has. (#610)
16+
- Indexing a very large repository no longer aborts during its first sync with a "too many SQL variables" error. (#540)
17+
- Files under directories with non-ASCII names (for example CJK characters) are no longer silently skipped during indexing. (#541)
18+
- The `.codegraph/` index folder no longer clutters `git status`: its generated ignore file now excludes everything in the folder except itself, so the database, `daemon.pid`, sockets, and logs stop showing up as untracked changes. (#492, #484)
19+
- SAP HANA `.xsjs` / `.xsjslib` files are now indexed as JavaScript. (#556)
20+
- TypeScript `.mts` and `.cts` module files are now indexed instead of being skipped. (#366)
21+
- JavaScript modules that wrap their code in an anonymous function — AMD/RequireJS, NetSuite SuiteScript, IIFE bundles — now have their inner functions and calls indexed, instead of the file coming up nearly empty. (#528)
22+
- Go methods declared on generic types (e.g. `func (s *Stack[T]) Push(...)`) are now correctly attached to their type, so callers, callees, and impact include them. (#583)
23+
- Asking what a symbol impacts no longer drags in every unrelated sibling method of its class — impact now follows real dependencies instead of the structural "contains" relationship, keeping the result focused on what actually depends on the symbol. (#536)
24+
- CodeGraph's MCP server now answers an agent's `resources/list` and `prompts/list` probes with an empty list instead of an error, clearing the `-32601` messages some clients (opencode, Codex) logged on connect. (#621)
1625

1726
## [0.9.9] - 2026-06-02
1827
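For the Go generic-receiver entry in this changelog hunk (#583), the idea can be sketched as reducing a receiver type such as `*Stack[T]` to its base type name before building the qualified name. This is a minimal sketch, not CodeGraph's actual implementation: `receiverBaseType` and `qualifiedMethodName` are hypothetical helpers, and the `Type::Method` format is taken from the regression test in this commit.

```typescript
// Hypothetical helper (not from the CodeGraph source): reduce a Go receiver
// type expression to the name of the type it is declared on.
function receiverBaseType(receiverType: string): string {
  let t = receiverType.trim();
  // Drop the pointer marker: "*Stack[T]" -> "Stack[T]"
  if (t.startsWith('*')) t = t.slice(1).trim();
  // Drop the type-parameter list: "Stack[T]" -> "Stack"
  const bracket = t.indexOf('[');
  if (bracket !== -1) t = t.slice(0, bracket);
  return t.trim();
}

// Hypothetical helper: build the "Type::Method" qualified name the test expects.
function qualifiedMethodName(receiverType: string, method: string): string {
  return `${receiverBaseType(receiverType)}::${method}`;
}

console.log(qualifiedMethodName('*Stack[T]', 'Push'));
console.log(qualifiedMethodName('Stack[T]', 'Len'));
```

Pointer and value receivers resolve to the same type name here, which matches the two cases (`Push` and `Len`) covered by the test.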

README.md

Lines changed: 14 additions & 3 deletions
@@ -29,6 +29,8 @@
2929

3030
## Get Started
3131

32+
### 1. Install the CLI
33+
3234
**No Node.js required** — one command grabs the right build for your OS:
3335

3436
```bash
@@ -42,13 +44,22 @@ irm https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.ps1 |
4244
Already have Node? Use npm instead (works on any version):
4345

4446
```bash
45-
npx @colbymchenry/codegraph # zero-install, or:
4647
npm i -g @colbymchenry/codegraph
4748
```
4849

49-
<sub>CodeGraph bundles its own runtime — nothing to compile, no native build, works the same everywhere. The interactive installer auto-configures your agent(s) — Claude Code, Cursor, Codex CLI, opencode, Hermes Agent, Gemini CLI, Antigravity IDE, Kiro.</sub>
50+
<sub>CodeGraph bundles its own runtime — nothing to compile, no native build, works the same everywhere. The installer puts `codegraph` on your PATH but **doesn't change your current shell** — open a new terminal before the next step so the command resolves.</sub>
51+
52+
### 2. Wire up your agent(s)
53+
54+
In a **new terminal**, run the installer to connect CodeGraph to the agents you use:
55+
56+
```bash
57+
codegraph install
58+
```
59+
60+
<sub>Detects and auto-configures Claude Code, Cursor, Codex CLI, opencode, Hermes Agent, Gemini CLI, Antigravity IDE, and Kiro — wiring the CodeGraph MCP server into each. **This is the step that connects CodeGraph to your agent;** installing the CLI in step 1 does not do it on its own. (Shortcut: `npx @colbymchenry/codegraph` downloads and runs this in one go.)</sub>
5061

51-
### Initialize Projects
62+
### 3. Initialize each project
5263

5364
```bash
5465
cd your-project

__tests__/extraction.test.ts

Lines changed: 47 additions & 1 deletion
@@ -10,7 +10,7 @@ import * as path from 'path';
1010
import * as os from 'os';
1111
import { CodeGraph } from '../src';
1212
import { extractFromSource, scanDirectory } from '../src/extraction';
13-
import { detectLanguage, isLanguageSupported, getSupportedLanguages, initGrammars, loadAllGrammars } from '../src/extraction/grammars';
13+
import { detectLanguage, isLanguageSupported, getSupportedLanguages, initGrammars, loadAllGrammars, isSourceFile } from '../src/extraction/grammars';
1414
import { normalizePath } from '../src/utils';
1515

1616
beforeAll(async () => {
@@ -4387,3 +4387,49 @@ void helperFunction(int count) {
43874387
expect(getSupportedLanguages()).toContain('objc');
43884388
});
43894389
});
4390+
4391+
describe('Regression: issue-specific extraction fixes', () => {
4392+
it('indexes inner functions of an anonymous AMD/CommonJS module wrapper (#528)', () => {
4393+
const code = `
4394+
define(['dep'], function (dep) {
4395+
function innerHelper(x) { return x + 1; }
4396+
function compute(y) { return innerHelper(y); }
4397+
return { compute: compute };
4398+
});
4399+
`;
4400+
const result = extractFromSource('amd-module.js', code);
4401+
const fns = result.nodes.filter((n) => n.kind === 'function').map((n) => n.name);
4402+
expect(fns).toContain('innerHelper');
4403+
expect(fns).toContain('compute');
4404+
});
4405+
4406+
it('attaches Go methods on generic receivers to their type (#583)', () => {
4407+
const code = `
4408+
package main
4409+
4410+
type Stack[T any] struct { items []T }
4411+
4412+
func (s *Stack[T]) Push(v T) { s.items = append(s.items, v) }
4413+
func (s Stack[T]) Len() int { return len(s.items) }
4414+
`;
4415+
const result = extractFromSource('stack.go', code);
4416+
const methods = result.nodes.filter((n) => n.kind === 'method');
4417+
expect(methods.find((m) => m.name === 'Push')?.qualifiedName).toBe('Stack::Push');
4418+
expect(methods.find((m) => m.name === 'Len')?.qualifiedName).toBe('Stack::Len');
4419+
});
4420+
4421+
it('indexes new module extensions: .mts/.cts (TS) and .xsjs/.xsjslib (JS) (#366, #556)', () => {
4422+
expect(isSourceFile('mod.mts')).toBe(true);
4423+
expect(isSourceFile('mod.cts')).toBe(true);
4424+
expect(isSourceFile('service.xsjs')).toBe(true);
4425+
expect(isSourceFile('lib.xsjslib')).toBe(true);
4426+
expect(detectLanguage('mod.mts')).toBe('typescript');
4427+
expect(detectLanguage('service.xsjs')).toBe('javascript');
4428+
4429+
// End-to-end: a .mts file is parsed as TS, a .xsjs file as JS.
4430+
const ts = extractFromSource('mod.mts', 'export function hello(): number { return 1; }');
4431+
expect(ts.nodes.find((n) => n.name === 'hello' && n.kind === 'function')).toBeDefined();
4432+
const js = extractFromSource('service.xsjs', 'function handleRequest() { return 1; }');
4433+
expect(js.nodes.find((n) => n.name === 'handleRequest' && n.kind === 'function')).toBeDefined();
4434+
});
4435+
});
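The AMD wrapper case in the first regression test (#528) comes down to descending into the body of a function that has no name of its own. The sketch below illustrates that on a deliberately simplified tree; `ToyNode` is a hypothetical shape for illustration, not the tree-sitter node type the extractor actually walks.

```typescript
// Hypothetical, simplified AST node (not tree-sitter's real node shape).
interface ToyNode {
  type: 'program' | 'call' | 'function' | 'other';
  name?: string; // present only for named functions
  children: ToyNode[];
}

// Collect named functions, descending into every child, including the body
// of an anonymous function that contributes no name itself.
function collectNamedFunctions(node: ToyNode, out: string[] = []): string[] {
  if (node.type === 'function' && node.name) out.push(node.name);
  for (const child of node.children) collectNamedFunctions(child, out);
  return out;
}

// Roughly: define(['dep'], function (dep) { function innerHelper... function compute... })
const amdModule: ToyNode = {
  type: 'program',
  children: [
    {
      type: 'call',
      children: [
        {
          type: 'function', // the anonymous wrapper
          children: [
            { type: 'function', name: 'innerHelper', children: [] },
            { type: 'function', name: 'compute', children: [] },
          ],
        },
      ],
    },
  ],
};

console.log(collectNamedFunctions(amdModule));
```

A walker that stopped at unnamed functions would return an empty list for this tree, which is the "file comes up nearly empty" symptom the changelog describes.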

__tests__/foundation.test.ts

Lines changed: 4 additions & 1 deletion
@@ -54,7 +54,10 @@ describe('CodeGraph Foundation', () => {
5454
expect(fs.existsSync(gitignorePath)).toBe(true);
5555

5656
const content = fs.readFileSync(gitignorePath, 'utf-8');
57-
expect(content).toContain('*.db');
57+
// Ignore everything in .codegraph/ except this file itself, so transient
58+
// files (db, daemon.pid, sockets, logs) never show up in git. (#492, #484)
59+
expect(content).toContain('*');
60+
expect(content).toContain('!.gitignore');
5861

5962
cg.close();
6063
});
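One observation on the assertion in this hunk: a substring check for `'*'` would also pass against the old ignore file, since `*.db` contains an asterisk. A stricter check could compare whole lines instead. The sketch below assumes a hypothetical helper `hasRule`; it is not part of the test suite in this commit.

```typescript
// Hypothetical helper: true when the ignore file has `rule` as a complete line
// (trailing whitespace ignored), rather than as a substring of another pattern.
function hasRule(content: string, rule: string): boolean {
  return content
    .split(/\r?\n/)
    .some((line) => line.replace(/\s+$/, '') === rule);
}

// Abbreviated stand-ins for the old and new generated files.
const oldContent = '# Database\n*.db\n*.db-wal\n*.db-shm\n';
const newContent = '# Ignore everything except this file\n*\n!.gitignore\n';

console.log(oldContent.includes('*'), hasRule(oldContent, '*'));
console.log(newContent.includes('*'), hasRule(newContent, '*'));
```

In a Jest test this might read `expect(hasRule(content, '*')).toBe(true)`, which distinguishes the new file from the old one.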

__tests__/graph.test.ts

Lines changed: 13 additions & 0 deletions
@@ -309,6 +309,19 @@ export { main };
309309
expect(impact.nodes.size).toBeGreaterThan(0);
310310
expect(impact.nodes.has(formatValue.id)).toBe(true);
311311
});
312+
313+
it('does not drag in sibling members via the structural contains edge (#536)', () => {
314+
const getName = cg.getNodesByKind('method').find((n) => n.name === 'getName');
315+
const derived = cg.getNodesByKind('class').find((n) => n.name === 'DerivedClass');
316+
expect(getName).toBeDefined();
317+
expect(derived).toBeDefined();
318+
319+
const impact = cg.getImpactRadius(getName!.id, 3);
320+
// The containing class must NOT be pulled into impact just because it
321+
// *contains* getName — climbing that contains edge would re-expand every
322+
// sibling method and explode impact for a leaf symbol. (#536)
323+
expect(impact.nodes.has(derived!.id)).toBe(false);
324+
});
312325
});
313326

314327
describe('findPath()', () => {
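The behaviour the new impact test (#536) pins down can be sketched as a breadth-first walk over incoming edges that skips the structural `contains` kind. This is a minimal sketch under stated assumptions: the `Edge` shape, the edge kinds, and the choice to include the start node in the result are all hypothetical and may differ from the real `getImpactRadius`.

```typescript
// Hypothetical edge model for illustration only.
type EdgeKind = 'calls' | 'references' | 'contains';
interface Edge {
  from: string;
  to: string;
  kind: EdgeKind;
}

// Walk "who depends on me" edges up to maxDepth hops, never climbing `contains`.
function impactRadius(edges: Edge[], startId: string, maxDepth: number): Set<string> {
  const impacted = new Set<string>([startId]);
  let frontier = [startId];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const edge of edges) {
        if (edge.to !== id) continue;
        // A class containing a method does not depend on it in the impact sense.
        if (edge.kind === 'contains') continue;
        if (!impacted.has(edge.from)) {
          impacted.add(edge.from);
          next.push(edge.from);
        }
      }
    }
    frontier = next;
  }
  return impacted;
}

const edges: Edge[] = [
  { from: 'DerivedClass', to: 'getName', kind: 'contains' },
  { from: 'DerivedClass', to: 'getAge', kind: 'contains' },
  { from: 'main', to: 'getName', kind: 'calls' },
];
const impact = impactRadius(edges, 'getName', 3);
console.log([...impact]);
```

With the `contains` check removed, `DerivedClass` would enter the result, and any traversal that then expanded the class's members would pull in the sibling `getAge` as well.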

__tests__/mcp-initialize.test.ts

Lines changed: 26 additions & 0 deletions
@@ -154,4 +154,30 @@ describe('MCP initialize handshake (issue #172)', () => {
154154
expect(json.id).toBe(0);
155155
expect(json.result.serverInfo.name).toBe('codegraph');
156156
}, 20000);
157+
158+
it('answers resources/list and prompts/list with empty lists, not -32601 (issue #621)', async () => {
159+
child = spawnServer(tempDir);
160+
const events = tagStreams(child);
161+
sendInitialize(child, tempDir);
162+
await waitFor(events, (e) => e.stream === 'stdout', 5000); // initialize reply
163+
164+
child.stdin.write(JSON.stringify({ jsonrpc: '2.0', id: 1, method: 'resources/list', params: {} }) + '\n');
165+
child.stdin.write(JSON.stringify({ jsonrpc: '2.0', id: 2, method: 'prompts/list', params: {} }) + '\n');
166+
167+
const replyFor = async (id: number) => {
168+
const ev = await waitFor(events, (e) => {
169+
if (e.stream !== 'stdout') return false;
170+
try { return JSON.parse(e.text).id === id; } catch { return false; }
171+
}, 5000);
172+
return JSON.parse(ev.text);
173+
};
174+
175+
const resources = await replyFor(1);
176+
expect(resources.error).toBeUndefined();
177+
expect(resources.result.resources).toEqual([]);
178+
179+
const prompts = await replyFor(2);
180+
expect(prompts.error).toBeUndefined();
181+
expect(prompts.result.prompts).toEqual([]);
182+
}, 15000);
157183
});
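The server-side behaviour this test expects (#621) can be sketched as a small JSON-RPC dispatcher that answers the two list probes with empty results and reserves `-32601` (JSON-RPC "Method not found") for methods it really does not know. The `handleRequest` function and its types are hypothetical; the real MCP server handles many more methods.

```typescript
// Hypothetical request/response shapes following JSON-RPC 2.0.
interface JsonRpcRequest {
  jsonrpc: '2.0';
  id: number | string;
  method: string;
  params?: unknown;
}
interface JsonRpcResponse {
  jsonrpc: '2.0';
  id: number | string;
  result?: unknown;
  error?: { code: number; message: string };
}

function handleRequest(req: JsonRpcRequest): JsonRpcResponse {
  switch (req.method) {
    case 'resources/list':
      // Nothing to offer, but an empty list is a valid answer.
      return { jsonrpc: '2.0', id: req.id, result: { resources: [] } };
    case 'prompts/list':
      return { jsonrpc: '2.0', id: req.id, result: { prompts: [] } };
    default:
      return {
        jsonrpc: '2.0',
        id: req.id,
        error: { code: -32601, message: `Method not found: ${req.method}` },
      };
  }
}

console.log(JSON.stringify(handleRequest({ jsonrpc: '2.0', id: 1, method: 'resources/list', params: {} })));
console.log(JSON.stringify(handleRequest({ jsonrpc: '2.0', id: 2, method: 'prompts/list', params: {} })));
```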

src/db/queries.ts

Lines changed: 13 additions & 4 deletions
@@ -1588,10 +1588,19 @@ export class QueryBuilder {
15881588
getUnresolvedReferencesByFiles(filePaths: string[]): UnresolvedReference[] {
15891589
if (filePaths.length === 0) return [];
15901590

1591-
const placeholders = filePaths.map(() => '?').join(',');
1592-
const rows = this.db
1593-
.prepare(`SELECT * FROM unresolved_refs WHERE file_path IN (${placeholders})`)
1594-
.all(...filePaths) as UnresolvedRefRow[];
1591+
// Chunk under SQLite's parameter limit: the first sync of a very large repo
1592+
// passes every changed file here, which an unbounded `IN (...)` would bind
1593+
// as one parameter each — exceeding MAX_VARIABLE_NUMBER and aborting with
1594+
// "too many SQL variables". (#540)
1595+
const rows: UnresolvedRefRow[] = [];
1596+
for (let i = 0; i < filePaths.length; i += SQLITE_PARAM_CHUNK_SIZE) {
1597+
const chunk = filePaths.slice(i, i + SQLITE_PARAM_CHUNK_SIZE);
1598+
const placeholders = chunk.map(() => '?').join(',');
1599+
const chunkRows = this.db
1600+
.prepare(`SELECT * FROM unresolved_refs WHERE file_path IN (${placeholders})`)
1601+
.all(...chunk) as UnresolvedRefRow[];
1602+
rows.push(...chunkRows);
1603+
}
15951604

15961605
return rows.map((row) => ({
15971606
fromNodeId: row.from_node_id,
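The chunking in this hunk (#540) can be isolated from the database call. The sketch below only builds the per-chunk SQL text and parameter lists; it does not touch SQLite. `CHUNK_SIZE` is a hypothetical stand-in, since the value of `SQLITE_PARAM_CHUNK_SIZE` is not shown in this diff, and `chunk` and `buildInQueries` are illustrative helpers.

```typescript
// Hypothetical value; the real SQLITE_PARAM_CHUNK_SIZE is not shown in the diff.
const CHUNK_SIZE = 500;

// Split a list into consecutive slices of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (size <= 0) throw new RangeError('size must be positive');
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// One parameterized query per chunk, so no statement binds more than `size` values.
function buildInQueries(
  filePaths: readonly string[],
  size: number = CHUNK_SIZE,
): { sql: string; params: string[] }[] {
  return chunk(filePaths, size).map((params) => ({
    sql: `SELECT * FROM unresolved_refs WHERE file_path IN (${params.map(() => '?').join(',')})`,
    params,
  }));
}

console.log(buildInQueries(['a.ts', 'b.ts', 'c.ts'], 2));
```

Each returned entry would then be prepared and executed in turn, with the rows concatenated, as the loop in the diff does.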

src/directory.ts

Lines changed: 6 additions & 17 deletions
@@ -83,22 +83,11 @@ export function createDirectory(projectRoot: string): void {
8383
// Create .gitignore inside .codegraph (if it doesn't exist)
8484
const gitignorePath = path.join(codegraphDir, '.gitignore');
8585
if (!fs.existsSync(gitignorePath)) {
86-
const gitignoreContent = `# CodeGraph data files
87-
# These are local to each machine and should not be committed
88-
89-
# Database
90-
*.db
91-
*.db-wal
92-
*.db-shm
93-
94-
# Cache
95-
cache/
96-
97-
# Logs
98-
*.log
99-
100-
# Hook markers
101-
.dirty
86+
const gitignoreContent = `# CodeGraph data files — local to each machine, not for committing.
87+
# Ignore everything in .codegraph/ except this file itself, so transient
88+
# files (the database, daemon.pid, sockets, logs) never show up in git.
89+
*
90+
!.gitignore
10291
`;
10392

10493
fs.writeFileSync(gitignorePath, gitignoreContent, 'utf-8');
@@ -245,7 +234,7 @@ export function validateDirectory(projectRoot: string): {
245234
const gitignorePath = path.join(codegraphDir, '.gitignore');
246235
if (!fs.existsSync(gitignorePath)) {
247236
try {
248-
const gitignoreContent = `# CodeGraph data files\n# These are local to each machine and should not be committed\n\n# Database\n*.db\n*.db-wal\n*.db-shm\n\n# Cache\ncache/\n\n# Logs\n*.log\n\n# Hook markers\n.dirty\n`;
237+
const gitignoreContent = `# CodeGraph data files — local to each machine, not for committing.\n# Ignore everything in .codegraph/ except this file itself, so transient\n# files (the database, daemon.pid, sockets, logs) never show up in git.\n*\n!.gitignore\n`;
249238
fs.writeFileSync(gitignorePath, gitignoreContent, 'utf-8');
250239
} catch {
251240
// Non-fatal: warn but don't block
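The same ignore-file body is written in two places in this diff (`createDirectory` and `validateDirectory`). One way to keep the two copies from drifting is to build the content from a single list of lines. This is only a sketch of that design option, not what the commit does; `CODEGRAPH_GITIGNORE_LINES` and `buildGitignoreContent` are hypothetical names and the comment wording is abbreviated.

```typescript
// Hypothetical single source of truth for the generated .codegraph/.gitignore.
const CODEGRAPH_GITIGNORE_LINES: readonly string[] = [
  '# CodeGraph data files: local to each machine, not for committing.',
  '# Ignore everything in .codegraph/ except this file itself.',
  '*',
  '!.gitignore',
];

// Join with newlines and end with a trailing newline, as the diff's strings do.
function buildGitignoreContent(): string {
  return CODEGRAPH_GITIGNORE_LINES.join('\n') + '\n';
}

console.log(buildGitignoreContent());
```

Both call sites could then pass `buildGitignoreContent()` to `fs.writeFileSync`, so a wording change only has to be made once.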

src/extraction/grammars.ts

Lines changed: 6 additions & 0 deletions
@@ -46,9 +46,15 @@ const WASM_GRAMMAR_FILES: Record<GrammarLanguage, string> = {
4646
export const EXTENSION_MAP: Record<string, Language> = {
4747
'.ts': 'typescript',
4848
'.tsx': 'tsx',
49+
// ESM/CJS TypeScript module extensions — parsed as TS (no JSX). (#366)
50+
'.mts': 'typescript',
51+
'.cts': 'typescript',
4952
'.js': 'javascript',
5053
'.mjs': 'javascript',
5154
'.cjs': 'javascript',
55+
// SAP HANA XS Classic server-side JavaScript. (#556)
56+
'.xsjs': 'javascript',
57+
'.xsjslib': 'javascript',
5258
'.jsx': 'jsx',
5359
'.py': 'python',
5460
'.pyw': 'python',
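How a map like `EXTENSION_MAP` turns a file name into a language can be sketched with a small lookup. The map below is a hypothetical subset, and `languageFor` is an illustrative stand-in for `detectLanguage`; the lowercase normalization in particular is an assumption and may not match the real function.

```typescript
// Hypothetical subset of the extension map, including the entries added here.
const TOY_EXTENSION_MAP: Record<string, string> = {
  '.ts': 'typescript',
  '.mts': 'typescript',
  '.cts': 'typescript',
  '.js': 'javascript',
  '.xsjs': 'javascript',
  '.xsjslib': 'javascript',
};

// Look up the language by the final extension of the file's base name.
function languageFor(filePath: string): string | undefined {
  const base = filePath.slice(filePath.lastIndexOf('/') + 1);
  const dot = base.lastIndexOf('.');
  // No extension, or a dotfile such as ".gitignore".
  if (dot <= 0) return undefined;
  // Assumption: extensions are matched case-insensitively.
  return TOY_EXTENSION_MAP[base.slice(dot).toLowerCase()];
}

console.log(languageFor('src/mod.mts'));
console.log(languageFor('service.xsjs'));
console.log(languageFor('README'));
```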

src/extraction/index.ts

Lines changed: 14 additions & 14 deletions
@@ -198,32 +198,32 @@ function collectGitFiles(repoDir: string, prefix: string, files: Set<string>): v
198198
// Without this, monorepos using submodules index 0 files. (See issue #147.)
199199
// Note: --recurse-submodules only supports -c/--cached and --stage modes — it
200200
// can't be combined with -o, so untracked files are gathered separately below.
201-
const tracked = execFileSync('git', ['ls-files', '-c', '--recurse-submodules'], gitOpts);
202-
for (const line of tracked.split('\n')) {
203-
const trimmed = line.trim();
204-
if (trimmed) {
205-
files.add(normalizePath(prefix + trimmed));
206-
}
201+
// -z gives NUL-separated, unquoted output so non-ASCII (e.g. CJK) paths
202+
// survive verbatim. Without it git octal-escapes and double-quotes such paths
203+
// (the core.quotepath default), and the quoted form never matches a real file
204+
// on disk → those files are silently dropped from the index. (#541)
205+
const tracked = execFileSync('git', ['ls-files', '-z', '-c', '--recurse-submodules'], gitOpts);
206+
for (const rel of tracked.split('\0')) {
207+
if (rel) files.add(normalizePath(prefix + rel));
207208
}
208209

209210
// Untracked files (submodules manage their own untracked state). Embedded git
210211
// repos surface here as a single "subdir/" entry that git refuses to descend
211212
// into — recurse into those as their own repos so their source gets indexed.
212-
const untracked = execFileSync('git', ['ls-files', '-o', '--exclude-standard'], gitOpts);
213-
for (const line of untracked.split('\n')) {
214-
const trimmed = line.trim();
215-
if (!trimmed) continue;
216-
if (trimmed.endsWith('/')) {
213+
const untracked = execFileSync('git', ['ls-files', '-z', '-o', '--exclude-standard'], gitOpts);
214+
for (const rel of untracked.split('\0')) {
215+
if (!rel) continue;
216+
if (rel.endsWith('/')) {
217217
// git only emits a trailing-slash directory entry for an embedded repo.
218218
// Guard with a .git check anyway, and skip anything else exactly as git
219219
// itself skips it (we never descend into a non-repo opaque dir).
220-
const childDir = path.join(repoDir, trimmed);
220+
const childDir = path.join(repoDir, rel);
221221
if (fs.existsSync(path.join(childDir, '.git'))) {
222-
collectGitFiles(childDir, prefix + trimmed, files);
222+
collectGitFiles(childDir, prefix + rel, files);
223223
}
224224
continue;
225225
}
226-
files.add(normalizePath(prefix + trimmed));
226+
files.add(normalizePath(prefix + rel));
227227
}
228228
}
229229
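The parsing side of the `-z` change (#541) can be shown without running git. With `-z`, `git ls-files` terminates each path with a NUL byte and emits it unquoted, so splitting on `'\0'` and dropping the empty trailing entry recovers the paths verbatim. `parseNulSeparated` is a hypothetical helper, and the sample string stands in for real command output.

```typescript
// Hypothetical helper: split NUL-terminated `git ls-files -z` output into paths.
function parseNulSeparated(output: string): string[] {
  // The final terminator leaves one empty entry after the split; drop it.
  return output.split('\0').filter((entry) => entry.length > 0);
}

// Stand-in for command output: an ASCII path, a CJK path, and a path that
// contains a newline (which line-based splitting would break in two).
const sampleOutput = 'src/index.ts\0docs/日本語/メモ.ts\0odd\nname.ts\0';

const paths = parseNulSeparated(sampleOutput);
console.log(paths.length);
console.log(paths[1]);
```

No trimming is applied, so leading or trailing spaces in a file name are preserved too, which the earlier `line.trim()` approach would have removed.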
