Commit e1eb13c

colbymchenry and claude authored
fix(mcp): normalize root-ish path filters in codegraph_files (colbymchenry#426) (colbymchenry#466)
The agent (opencode/Gemini Flash on Windows) called codegraph_files with path="/" and got "No files found matching the criteria.", which pushed it straight back to Read/Glob. Indexed file paths are stored as project-relative POSIX (e.g. "src/foo.py"), and the old startsWith filter matched nothing for any of the root-ish or platform-flavored shapes an agent might guess: "/", ".", "./", "", "\\", leading-slash and leading-./ subpaths, or Windows backslash subpaths.

Normalize the filter (strip leading "/", "./", "\", bare "."; convert "\" to "/"; trim trailing "/"), then match by exact equality or a "<filter>/" boundary. This also kills a sibling-prefix bleed where filter "src" used to match "src-utils/...".

Validated on macOS + Linux (Docker) + Windows (Parallels) with 13 new unit tests plus the existing mcp-input-limits/concurrent-locking suites, and end-to-end through opencode in tmux (Big Pickle/OpenCode Zen): codegraph_files [path=/] now returns the project tree and the agent answers directly instead of falling back to Read.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent c0cf9c1 commit e1eb13c

3 files changed

Lines changed: 128 additions & 3 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 ## [Unreleased]

 ### Fixed
+- **`codegraph_files` now returns the whole project when an agent passes `path="/"`, `"."`, `"./"`, `""`, or a Windows-style `"\\"` — instead of "No files found matching the criteria."** Indexed file paths are stored as project-relative POSIX (e.g. `src/foo.ts`), but the path filter used a plain `startsWith`, so a leading slash or any of the other root-ish shapes an agent might guess matched nothing and pushed the agent back to Read/Glob — the exact opencode + Gemini Flash regression reported on Windows 11. Subdirectory filters are now equally forgiving: `"/src"`, `"./src"`, `"src/"`, `"src\\components"`, etc. all resolve correctly. Sibling-prefix bleed (`"src"` was previously matching `src-utils/...`) is also fixed — the filter now requires either an exact match or a `<filter>/` boundary. Closes #426.
 - **File watcher no longer marks edited files as fresh when another process holds the index lock.** When a second writer (concurrent `codegraph index`, a git hook, another MCP daemon) held `.codegraph/codegraph.lock`, `CodeGraph.sync()` returned a zero-shape no-op instead of throwing. The file watcher took that as a successful sync and cleared `pendingFiles` — so the per-file staleness signal MCP tools surface to agents (issue #403) dropped immediately, even though the edit was never indexed. `CodeGraph.watch()` now converts that no-op into a typed `LockUnavailableError` thrown into the watcher; the existing retry path preserves `pendingFiles` and reschedules until the lock becomes available. The error is logged at debug only (no `onSyncError` callback) so a long-running external indexer doesn't spam stderr every debounce cycle. Closes #449.
 - **Watch sync no longer aborts with `FOREIGN KEY constraint failed`.** PR #62 plugged this FK violation at the extraction layer (empty-named nodes whose containment edges had no target), but the same violation kept reappearing on v0.9.5 during the daemon's *watch sync* — not on initial index. Once an agent's daemon had been running long enough to accumulate edits, a resolver lookup that crossed a framework-specific cache could hand back a node whose row had been removed by a recent file rewrite, and the FK check then aborted the entire resolution batch, leaving the user's daemon log filling with `Watch sync failed { error: 'FOREIGN KEY constraint failed' }`. `QueryBuilder.insertEdges` now validates every batch's endpoints against the `nodes` table directly (one fresh `SELECT id IN (...)` per batch, no cache) and silently skips edges with missing source or target — so a stale lookup result drops one edge instead of aborting the whole sync. Surfaces as a fresh `codegraph init`/`index` cycle now surviving its first watch-sync cycle without the FK error, and the daemon recovering naturally instead of compounding into further failures. Closes #455.
 - **Hermes Agent: `codegraph install --target hermes` no longer corrupts `~/.hermes/config.yaml`.** Hermes serializes its config with PyYAML's default block style, which writes list items at the *same* indent as the parent mapping key (`cli:` and `- hermes-cli` both at column 2). The previous line-based YAML patcher mistook that first ` - hermes-cli` for the next sibling key, truncated the `cli:` block, and then spliced `- mcp-codegraph` at indent 4 *before* the existing items — leaving subsequent entries (`- browser`, `- clarify`, …) and even other platforms (`telegram:`, `discord:`) appearing at the `platform_toolsets:` level, which is no longer parseable YAML. The installer now recognizes the same-indent list style, finds the real end of the block at the next sibling key, and appends `- mcp-codegraph` at whatever indent the existing items already use. Re-installing on an already-corrupted file (or a 4-space-nested config that worked before) still produces a clean, parseable result. Closes #456.

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
+/**
+ * codegraph_files path-filter normalization (#426)
+ *
+ * Stored file paths are project-relative POSIX (e.g. "src/foo.ts"). Some
+ * agents pass project-root variants like "/", ".", "./" or "" when they want
+ * "the whole project", and Windows-style backslashes or leading "/" / "./"
+ * prefixes when they want a subtree. The old filter used a plain
+ * `startsWith(pathFilter)`, so any of those buried the agent at "no files
+ * found" and pushed it back to Read/Glob — the exact opencode regression in
+ * #426. These tests pin every branch of the normalization.
+ */
+
+import { describe, it, expect, beforeEach, afterEach } from 'vitest';
+import * as fs from 'fs';
+import * as path from 'path';
+import * as os from 'os';
+import CodeGraph from '../src/index';
+import { ToolHandler } from '../src/mcp/tools';
+
+describe('codegraph_files path normalization', () => {
+  let tempDir: string;
+  let cg: CodeGraph;
+  let handler: ToolHandler;
+
+  beforeEach(async () => {
+    tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-files-paths-'));
+    fs.mkdirSync(path.join(tempDir, 'src', 'components'), { recursive: true });
+    fs.mkdirSync(path.join(tempDir, 'tests'), { recursive: true });
+    fs.writeFileSync(path.join(tempDir, 'src', 'index.ts'), `export const x = 1;\n`);
+    fs.writeFileSync(
+      path.join(tempDir, 'src', 'components', 'Button.ts'),
+      `export const Button = () => 1;\n`
+    );
+    fs.writeFileSync(path.join(tempDir, 'tests', 'a.test.ts'), `export const t = 1;\n`);
+    cg = await CodeGraph.init(tempDir, {
+      config: { include: ['**/*.ts'], exclude: [] },
+    });
+    await cg.indexAll();
+    handler = new ToolHandler(cg);
+  });
+
+  afterEach(() => {
+    if (cg) cg.destroy();
+    if (fs.existsSync(tempDir)) {
+      fs.rmSync(tempDir, { recursive: true, force: true });
+    }
+  });
+
+  async function listed(pathFilter: string | undefined): Promise<string> {
+    const result = await handler.execute('codegraph_files', {
+      ...(pathFilter !== undefined ? { path: pathFilter } : {}),
+      format: 'flat',
+      includeMetadata: false,
+    });
+    expect(result.isError).toBeFalsy();
+    return result.content[0]!.text as string;
+  }
+
+  // Root-ish filters: every shape an agent might guess for "whole project"
+  // must list the same files as no filter at all.
+  for (const rootish of ['/', '.', './', '', '\\', '//', './/']) {
+    it(`treats path=${JSON.stringify(rootish)} as project root`, async () => {
+      const output = await listed(rootish);
+      expect(output).toContain('src/index.ts');
+      expect(output).toContain('src/components/Button.ts');
+      expect(output).toContain('tests/a.test.ts');
+    });
+  }
+
+  it('matches a real subdirectory prefix', async () => {
+    const output = await listed('src');
+    expect(output).toContain('src/index.ts');
+    expect(output).toContain('src/components/Button.ts');
+    expect(output).not.toContain('tests/a.test.ts');
+  });
+
+  it('tolerates a leading slash on a real subdirectory', async () => {
+    const output = await listed('/src');
+    expect(output).toContain('src/index.ts');
+    expect(output).not.toContain('tests/a.test.ts');
+  });
+
+  it('tolerates a leading "./" on a real subdirectory', async () => {
+    const output = await listed('./src');
+    expect(output).toContain('src/index.ts');
+    expect(output).not.toContain('tests/a.test.ts');
+  });
+
+  it('tolerates a trailing slash on a real subdirectory', async () => {
+    const output = await listed('src/');
+    expect(output).toContain('src/index.ts');
+    expect(output).not.toContain('tests/a.test.ts');
+  });
+
+  it('normalizes Windows backslashes', async () => {
+    const output = await listed('src\\components');
+    expect(output).toContain('src/components/Button.ts');
+    expect(output).not.toContain('src/index.ts');
+  });
+
+  // Old code matched on raw `startsWith`, so a filter "src" would also
+  // return a sibling like "src-utils/...". The new code requires either an
+  // exact match or a "<filter>/" boundary, so prefixes don't bleed.
+  it('does not match sibling directories that share a prefix', async () => {
+    fs.mkdirSync(path.join(tempDir, 'src-utils'), { recursive: true });
+    fs.writeFileSync(path.join(tempDir, 'src-utils', 'helper.ts'), `export const h = 1;\n`);
+    await cg.indexAll();
+
+    const output = await listed('src');
+    expect(output).toContain('src/index.ts');
+    expect(output).not.toContain('src-utils/helper.ts');
+  });
+});

src/mcp/tools.ts

Lines changed: 14 additions & 3 deletions
@@ -2248,9 +2248,20 @@ export class ToolHandler {
       return this.textResult('No files indexed. Run `codegraph index` first.');
     }

-    // Filter by path prefix
-    let files = pathFilter
-      ? allFiles.filter(f => f.path.startsWith(pathFilter) || f.path.startsWith('./' + pathFilter))
+    // Filter by path prefix. Stored paths are project-relative POSIX (e.g.
+    // "src/foo.ts"), but agents commonly pass project-root variants like "/",
+    // ".", "./", "" or Windows-style "src\foo" — and prefixes with leading
+    // "/", "./" or "\". Normalize all of those before matching so the agent
+    // gets results instead of falling back to Read/Glob (see #426).
+    const normalizedFilter = pathFilter
+      ? pathFilter
+          .replace(/\\/g, '/')
+          .replace(/^(?:\.?\/+)+/, '')
+          .replace(/^\.$/, '')
+          .replace(/\/+$/, '')
+      : '';
+    let files = normalizedFilter
+      ? allFiles.filter(f => f.path === normalizedFilter || f.path.startsWith(normalizedFilter + '/'))
       : allFiles;

     // Filter by glob pattern
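
The normalization in this hunk can be read as two small pure functions. The following is a minimal sketch for illustration, not the code in `src/mcp/tools.ts`: `normalizePathFilter` and `matchesPathFilter` are hypothetical names (the commit inlines this logic in `ToolHandler`), and the sample `indexed` paths are made up. The four regexes are copied from the diff in the same order.

```typescript
// Hypothetical helper: mirrors the inline chain in the diff, in the same order.
function normalizePathFilter(pathFilter: string | undefined): string {
  if (!pathFilter) return '';
  return pathFilter
    .replace(/\\/g, '/')          // Windows backslashes -> POSIX slashes
    .replace(/^(?:\.?\/+)+/, '')  // strip leading "/", "./", "//", "././" ...
    .replace(/^\.$/, '')          // a bare "." means the project root
    .replace(/\/+$/, '');         // trim trailing slashes
}

// Hypothetical helper: an empty filter matches everything; otherwise require
// an exact match or a "<filter>/" boundary so "src" does not match "src-utils/...".
function matchesPathFilter(filePath: string, normalizedFilter: string): boolean {
  if (normalizedFilter === '') return true;
  return filePath === normalizedFilter || filePath.startsWith(normalizedFilter + '/');
}

// Made-up sample of project-relative POSIX paths, as the index stores them.
const indexed = [
  'src/index.ts',
  'src/components/Button.ts',
  'src-utils/helper.ts',
  'tests/a.test.ts',
];

const filter = normalizePathFilter('.\\src\\'); // -> "src"
console.log(indexed.filter(p => matchesPathFilter(p, filter)));
// [ 'src/index.ts', 'src/components/Button.ts' ]
```

Backslash conversion runs first so that a leading `\` is handled by the same leading-slash rule as `/`. Note that the leading-prefix regex only strips `/` and `./` runs, so under this sketch a filter such as `../src` or `.hidden` passes through unchanged.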
