Commit 8eed243

andreinknv, colbymchenry, and claude authored
feat(extraction): instantiates + decorates graph edges (colbymchenry#134)
* feat(extraction): instantiates + decorates graph edges

Two new structural edges that fill gaps in the call graph for modern JS/TS / Java / C# / Python / Kotlin codebases.

1) `instantiates` edges from `new Foo(...)`:

The bulk-extraction and visitFunctionBody dispatchers only recognised `call_expression`; `new_expression` (and the equivalent `object_creation_expression` / `instance_creation_expression` in other grammars) was silently ignored. Adds INSTANTIATION_KINDS, extractInstantiation(), and dispatch from BOTH the top-level visitNode and the per-function-body walker. Children are still descended so nested calls inside constructor args (`new Foo(bar())`) get their own `calls` refs.

Output: a `bootstrap` function that does `new UserService(); new UserController(svc)` now produces two `instantiates` edges to those class nodes (previously zero edges).

2) `decorates` edges from `@Decorator` annotations:

Tree-sitter places decorator nodes BEFORE the symbol they apply to in the AST, so the original walk-time dispatch saw the wrong nodeStack head (file/class instead of class/method). Replaced with extractDecoratorsFor(declNode, decoratedId) that runs from inside extractClass / extractFunction / extractMethod after the symbol's node id is known. Looks for decorator nodes in two places:

- Direct named children of the declaration (method/property style)
- Preceding siblings in the parent (TypeScript class style: `@Foo class X {}` parses as parent { decorator, class_decl })

Sibling check uses startIndex comparison rather than reference identity: tree-sitter web bindings return fresh JS wrappers from parent/namedChild navigation, so `===` is unreliable. Took a debug session to spot this; flagging in the comment so the next reader doesn't re-introduce the bug.

Output: a `@Controller` class decorator + `@Get` method decorator on a NestJS-style controller now produce two `decorates` edges (class→Controller, method→Get) with the correct source nodes.

Verified live on a synthetic NestJS-shape fixture; all 380 existing tests pass.

* fix(extraction): address reviewer findings: decorator boundary, generic constructors, property/field decorators, marker_annotation, tests

Five fixes from independent semantic review:

- extractDecoratorsFor sibling walk now iterates BACKWARD from the declaration and stops at the first non-decorator/annotation separator. Previous version walked forward up to declStart and consumed every decorator-typed sibling, so two adjacent decorated classes (`@A class Foo {} @B class Bar {}`) had `@A` spuriously attributed to `Bar`.
- extractInstantiation strips the type-argument suffix from the constructor field text. `new Map<K, V>()` was producing referenceName 'Map<K, V>' (the constructor field is a generic_type node) and resolution always failed.
- extractProperty and extractField now call extractDecoratorsFor after their createNode calls. NestJS-style `@Inject() private svc: Foo` and Java field annotations were being silently dropped.
- consider() in extractDecoratorsFor recognises 'marker_annotation' in addition to 'decorator'/'annotation'. Java's tree-sitter grammar emits marker_annotation for arg-less annotations like @Override and @Deprecated; without this every Java marker annotation was silently skipped.
- 6 new extraction tests covering: instantiates ref for `new Foo()`, generic-type stripping (`new Container<string>()` -> 'Container'), qualified-new keeps trailing identifier (`new ns.Foo()` -> 'Foo'), decorates ref for `@Foo class X {}`, regression for adjacent decorated classes (each gets its OWN decorator), decorates ref for `@Foo method()`.

Full test suite: 386 passed (was 380, +6 new extraction tests).

* feat(resolution): kind-aware scoring + Python instantiation promotion

Two follow-ups to the new instantiates/decorates ref kinds, surfaced during review:

1) name-matcher previously only had a kind bonus for `calls` (preferring function/method). When a class and a function share a name across modules, an `instantiates` ref would tie or pick the wrong candidate. Adds:

- `instantiates` → +25 for class/struct/interface
- `decorates` → +25 for function/method, +15 for class (Python class decorators, Java annotation interfaces)

2) Python (and Ruby) have no `new` keyword: `Foo()` is the standard instantiation syntax, indistinguishable from a function call at extraction time. Resolution can tell the difference once the target is known: when a `calls` ref resolves to a class/struct, promote it to `instantiates`. Mirrors the existing extends→implements promotion in createEdges.

Verified: 386 → 389 passing (+3 tests covering the kind biases and the Python promotion).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Colby McHenry <me@colbymchenry.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 2dc4bc3 commit 8eed243

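The backward sibling walk described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `SyntaxNodeLike`, `DECORATOR_TYPES`, and `precedingDecorators` are hypothetical names, and the sibling list is a plain array standing in for what tree-sitter navigation would return. It shows the two points the message stresses: the declaration is located by `startIndex` rather than by `===`, and the walk stops at the first non-decorator sibling.

```typescript
// Hypothetical, minimal stand-in for a tree-sitter node (assumption: the real
// bindings expose at least `type` and `startIndex`, as the commit message implies).
interface SyntaxNodeLike {
  type: string;
  startIndex: number;
  text: string;
}

// Node types treated as decorators, taken from the commit message.
const DECORATOR_TYPES = new Set(['decorator', 'annotation', 'marker_annotation']);

/**
 * Returns the decorators that immediately precede `decl` among `siblings`,
 * in source order. Walks backward and stops at the first non-decorator, so a
 * decorator belonging to an earlier declaration is never picked up.
 */
function precedingDecorators(
  siblings: readonly SyntaxNodeLike[],
  decl: SyntaxNodeLike,
): SyntaxNodeLike[] {
  // Find the declaration by position, not identity: navigation may hand back
  // fresh wrapper objects, so `===` against `decl` would fail.
  const declPos = siblings.findIndex(
    (s) => s.startIndex === decl.startIndex && s.type === decl.type,
  );
  if (declPos === -1) return [];

  const found: SyntaxNodeLike[] = [];
  for (let i = declPos - 1; i >= 0; i--) {
    const sib = siblings[i];
    if (sib === undefined || !DECORATOR_TYPES.has(sib.type)) break;
    found.unshift(sib); // keep source order
  }
  return found;
}
```

For `@A class Foo {} @B class Bar {}` modelled as four siblings, calling this with the `Bar` declaration yields only `@B`, because the walk hits the `Foo` class node before it could reach `@A`.
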
5 files changed

Lines changed: 444 additions & 2 deletions

File tree

__tests__/extraction.test.ts

Lines changed: 98 additions & 0 deletions
@@ -3079,3 +3079,101 @@ describe('Directory Exclusion', () => {
       expect(files.every((f) => !f.includes('vendor'))).toBe(true);
     });
   });
+
+describe('Instantiates + Decorates edge extraction', () => {
+  it('emits an instantiates ref for `new Foo()`', () => {
+    const code = `
+      class Foo {}
+      function bootstrap() { return new Foo(); }
+    `;
+    const result = extractFromSource('app.ts', code);
+    const ref = result.unresolvedReferences.find(
+      (r) => r.referenceKind === 'instantiates' && r.referenceName === 'Foo'
+    );
+    expect(ref).toBeDefined();
+  });
+
+  it('strips type-argument suffix from generic constructors', () => {
+    const code = `
+      class Container<T> { constructor(_: T) {} }
+      function go() { return new Container<string>('x'); }
+    `;
+    const result = extractFromSource('app.ts', code);
+    const ref = result.unresolvedReferences.find(
+      (r) => r.referenceKind === 'instantiates'
+    );
+    expect(ref).toBeDefined();
+    // Container<string> must be normalised to "Container"; otherwise
+    // resolution can never match the class node.
+    expect(ref!.referenceName).toBe('Container');
+  });
+
+  it('keeps trailing identifier from qualified `new ns.Foo()`', () => {
+    const code = `
+      const ns = { Foo: class {} };
+      function go() { return new ns.Foo(); }
+    `;
+    const result = extractFromSource('app.ts', code);
+    const ref = result.unresolvedReferences.find(
+      (r) => r.referenceKind === 'instantiates'
+    );
+    // We can't always resolve which Foo, but the name should be the
+    // simple identifier so name-matching has a chance.
+    expect(ref?.referenceName).toBe('Foo');
+  });
+
+  it('emits a decorates ref for `@Foo class X {}`', () => {
+    const code = `
+      function Foo(_arg: string) { return (cls: any) => cls; }
+      @Foo('x')
+      class X {}
+    `;
+    const result = extractFromSource('app.ts', code);
+    const decorClass = result.unresolvedReferences.find(
+      (r) => r.referenceKind === 'decorates' && r.referenceName === 'Foo'
+    );
+    expect(decorClass).toBeDefined();
+  });
+
+  it('does NOT attribute a prior class\'s decorator to the next class', () => {
+    // Regression: the sibling-walk must stop at the first non-
+    // decorator separator. `@A class Foo {} @B class Bar {}` must
+    // produce `decorates(Foo, A)` and `decorates(Bar, B)`, never
+    // `decorates(Bar, A)`.
+    const code = `
+      function A(cls: any) { return cls; }
+      function B(cls: any) { return cls; }
+      @A
+      class Foo {}
+      @B
+      class Bar {}
+    `;
+    const result = extractFromSource('app.ts', code);
+    const decoratesEdges = result.unresolvedReferences.filter(
+      (r) => r.referenceKind === 'decorates'
+    );
+    // Exactly one decorates ref per decorated class, no cross-attribution.
+    const fromBar = decoratesEdges.filter((r) =>
+      result.nodes.find((n) => n.id === r.fromNodeId && n.name === 'Bar')
+    );
+    expect(fromBar.length).toBe(1);
+    expect(fromBar[0]!.referenceName).toBe('B');
+  });
+
+  it('emits a decorates ref for `@Foo method() {}`', () => {
+    const code = `
+      function Get(p: string) { return (t: any, k: string) => t; }
+      class Svc {
+        @Get('/x') method() { return 1; }
+      }
+    `;
+    const result = extractFromSource('app.ts', code);
+    const decorMethod = result.unresolvedReferences.find(
+      (r) => r.referenceKind === 'decorates' && r.referenceName === 'Get'
+    );
+    expect(decorMethod).toBeDefined();
+    // The decorated symbol must be `method`, not the constructor or class.
+    const decoratedNode = result.nodes.find((n) => n.id === decorMethod!.fromNodeId);
+    expect(decoratedNode?.name).toBe('method');
+  });
+});

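The two name-normalisation tests above (`Container<string>` to `Container`, `ns.Foo` to `Foo`) suggest a small string transformation. The sketch below is one way to get that behaviour on the raw constructor text; `normalizeConstructorName` is a hypothetical helper, and the real extractInstantiation may well work on AST fields instead of strings.

```typescript
/**
 * Hypothetical helper (not taken from the commit): reduces the source text of
 * a constructor expression to the simple identifier used for name matching.
 */
function normalizeConstructorName(raw: string): string {
  // 1. Drop the type-argument suffix: everything from the first '<' onward.
  //    Doing this first means dots inside type arguments cannot interfere
  //    with the qualified-name step below.
  const lt = raw.indexOf('<');
  const withoutTypeArgs = lt === -1 ? raw : raw.slice(0, lt);

  // 2. Keep only the trailing identifier of a qualified name (`ns.Foo` -> `Foo`).
  //    lastIndexOf returns -1 when there is no dot, so slice(0) keeps it whole.
  const lastDot = withoutTypeArgs.lastIndexOf('.');
  return withoutTypeArgs.slice(lastDot + 1).trim();
}
```

Stripping type arguments before splitting on `.` is a deliberate ordering choice: an input such as `ns.Box<a.B>` would otherwise be cut at the dot inside the type argument.
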
__tests__/resolution.test.ts

Lines changed: 104 additions & 0 deletions
@@ -606,5 +606,109 @@ function main(): void {
       // Should have attempted resolution
       expect(result.stats.total).toBeGreaterThanOrEqual(0);
     });
+
+    it('promotes calls→instantiates when target resolves to a class (Python)', async () => {
+      // Python has no `new` keyword: `Foo()` is the standard
+      // instantiation syntax. Extraction can't tell that apart from
+      // a function call without symbol info, so it emits a `calls`
+      // ref. Resolution promotes it to `instantiates` once the
+      // target is known to be a class.
+      const srcDir = path.join(tempDir, 'src');
+      fs.mkdirSync(srcDir, { recursive: true });
+
+      fs.writeFileSync(
+        path.join(srcDir, 'app.py'),
+        `class UserService:
+    def __init__(self):
+        self.db = None
+
+def bootstrap():
+    return UserService()
+`
+      );
+
+      cg = await CodeGraph.init(tempDir, { index: true });
+      cg.resolveReferences();
+
+      const bootstrap = cg
+        .getNodesByKind('function')
+        .find((n) => n.name === 'bootstrap');
+      expect(bootstrap).toBeDefined();
+
+      const outgoing = cg.getOutgoingEdges(bootstrap!.id);
+      const instantiates = outgoing.find((e) => e.kind === 'instantiates');
+      expect(instantiates).toBeDefined();
+      // Same edge must NOT also appear as a `calls` edge: promotion
+      // replaces the kind, doesn't duplicate.
+      const callsToUserService = outgoing.filter(
+        (e) => e.kind === 'calls' && e.target === instantiates!.target
+      );
+      expect(callsToUserService).toHaveLength(0);
+    });
+  });
+
+  describe('Name Matcher: kind bias for new ref kinds', () => {
+    const baseContext = (candidates: Node[]): ResolutionContext => ({
+      getNodesInFile: () => [],
+      getNodesByName: (name) => candidates.filter((c) => c.name === name),
+      getNodesByQualifiedName: () => [],
+      getNodesByKind: () => [],
+      fileExists: () => true,
+      readFile: () => null,
+      getProjectRoot: () => '/test',
+      getAllFiles: () => [],
+      getNodesByLowerName: () => [],
+      getImportMappings: () => [],
+    });
+
+    it('prefers a class candidate over a function for `instantiates` refs', () => {
+      // A class and a function share a name across the codebase.
+      // Without the kind bias, the function (which gets the +25 `calls`
+      // bonus historically applied to all candidates of that kind) would
+      // win. Now the instantiates branch reverses it.
+      const fn: Node = {
+        id: 'func:utils.ts:Logger:5', kind: 'function', name: 'Logger',
+        qualifiedName: 'utils.ts::Logger', filePath: 'utils.ts', language: 'typescript',
+        startLine: 5, endLine: 7, startColumn: 0, endColumn: 0, updatedAt: Date.now(),
+      };
+      const cls: Node = {
+        id: 'class:logger.ts:Logger:10', kind: 'class', name: 'Logger',
+        qualifiedName: 'logger.ts::Logger', filePath: 'logger.ts', language: 'typescript',
+        startLine: 10, endLine: 30, startColumn: 0, endColumn: 0, updatedAt: Date.now(),
+      };
+
+      const ref = {
+        fromNodeId: 'func:main.ts:bootstrap:1',
+        referenceName: 'Logger',
+        referenceKind: 'instantiates' as const,
+        line: 5, column: 0, filePath: 'main.ts', language: 'typescript' as const,
+      };
+
+      const result = matchReference(ref, baseContext([fn, cls]));
+      expect(result?.targetNodeId).toBe('class:logger.ts:Logger:10');
+    });
+
+    it('prefers a function candidate over a non-function for `decorates` refs', () => {
+      const variable: Node = {
+        id: 'var:config.ts:Inject:5', kind: 'variable', name: 'Inject',
+        qualifiedName: 'config.ts::Inject', filePath: 'config.ts', language: 'typescript',
+        startLine: 5, endLine: 5, startColumn: 0, endColumn: 0, updatedAt: Date.now(),
+      };
+      const decorator: Node = {
+        id: 'func:di.ts:Inject:10', kind: 'function', name: 'Inject',
+        qualifiedName: 'di.ts::Inject', filePath: 'di.ts', language: 'typescript',
+        startLine: 10, endLine: 20, startColumn: 0, endColumn: 0, updatedAt: Date.now(),
+      };
+
+      const ref = {
+        fromNodeId: 'class:svc.ts:UserService:1',
+        referenceName: 'Inject',
+        referenceKind: 'decorates' as const,
+        line: 5, column: 0, filePath: 'svc.ts', language: 'typescript' as const,
+      };
+
+      const result = matchReference(ref, baseContext([variable, decorator]));
+      expect(result?.targetNodeId).toBe('func:di.ts:Inject:10');
+    });
   });
 });

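The kind bias and the calls→instantiates promotion exercised by these tests can be illustrated in isolation. The sketch below is an assumption-laden simplification, not the project's name-matcher: `kindBonus`, `pickBest`, and `promoteRefKind` are hypothetical names, the bonus values for `instantiates` and `decorates` come from the commit message, the +25 for `calls` is taken from the test comment above, and every other scoring signal the real matcher may use is ignored.

```typescript
type NodeKind = 'class' | 'struct' | 'interface' | 'function' | 'method' | 'variable';
type RefKind = 'calls' | 'instantiates' | 'decorates';

interface Candidate {
  id: string;
  kind: NodeKind;
}

// Bonus a candidate earns purely from its kind, per reference kind.
function kindBonus(refKind: RefKind, kind: NodeKind): number {
  switch (refKind) {
    case 'instantiates':
      return kind === 'class' || kind === 'struct' || kind === 'interface' ? 25 : 0;
    case 'decorates':
      if (kind === 'function' || kind === 'method') return 25;
      return kind === 'class' ? 15 : 0;
    case 'calls':
      // Assumed value, based on the "+25 `calls` bonus" test comment.
      return kind === 'function' || kind === 'method' ? 25 : 0;
  }
}

// Picks the candidate with the highest kind bonus; earlier candidates win ties.
function pickBest(refKind: RefKind, candidates: readonly Candidate[]): Candidate | undefined {
  let best: Candidate | undefined;
  let bestScore = -1;
  for (const c of candidates) {
    const score = kindBonus(refKind, c.kind);
    if (score > bestScore) {
      best = c;
      bestScore = score;
    }
  }
  return best;
}

// After resolution: a `calls` ref whose target is a class or struct is really
// an instantiation (Python/Ruby style `Foo()`), so its kind is replaced.
function promoteRefKind(refKind: RefKind, targetKind: NodeKind): RefKind {
  const targetIsType = targetKind === 'class' || targetKind === 'struct';
  return refKind === 'calls' && targetIsType ? 'instantiates' : refKind;
}
```

Because `promoteRefKind` returns a replacement kind instead of producing a second edge, it matches the test's expectation that no `calls` edge to the same target remains after promotion.
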