
Add GitHub Copilot custom instructions and reusable prompts #1

Draft
Copilot wants to merge 60 commits into main from copilot/optimize-for-github-copilot

Copilot AI commented Feb 7, 2026

The repo has CLAUDE.md for Claude Code but no equivalent configuration for GitHub Copilot. This adds Copilot-native configuration files so Copilot gets project-specific context automatically.

Repository-level instructions

  • .github/copilot-instructions.md — Architecture, coding conventions, type system (NodeKind/EdgeKind/Language), build commands, MCP tool reference, CLI usage

File-specific instructions (auto-applied via applyTo globs)

  • testing.instructions.md → __tests__/** — vitest patterns, temp directory conventions, test structure
  • extraction.instructions.md → src/extraction/** — tree-sitter parsing pipeline, query file conventions, how to add languages
  • mcp.instructions.md → src/mcp/** — tool definition schema, handler patterns, response formatting, protocol details

Reusable prompts (.github/prompts/)

  • new-language.prompt.md — Step-by-step checklist for adding a new language (types → grammars → queries → tests)
  • new-mcp-tool.prompt.md — Step-by-step checklist for adding a new MCP tool (definition → routing → handler → formatting)
Original prompt

review this repo and see how it can be optimized for github copilot and copilot cli current versions


…nd reusable prompts

Co-authored-by: southerncoder <13037278+southerncoder@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Optimize repository for GitHub Copilot and CLI" to "Add GitHub Copilot custom instructions and reusable prompts" Feb 7, 2026
Copilot AI requested a review from southerncoder February 7, 2026 22:47
GeneralClaw and others added 22 commits February 9, 2026 14:35
Arrow functions and function expressions assigned to variables
(e.g. `export const useAuth = () => { ... }`) were not being indexed
because the arrow_function AST node has no `name` field — the name
lives on the parent variable_declarator node.

Additionally, `isExported()` for TypeScript and JavaScript extractors
only checked 10 characters back from the node's start position, which
missed `export` for deeply nested nodes like arrow functions inside
variable declarations inside export statements.

Changes:
- extractFunction(): When an arrow_function or function_expression
  resolves to '<anonymous>', look up the parent variable_declarator
  for the name before skipping.
- isExported() (TS + JS): Walk the parent chain to find an
  export_statement ancestor instead of substring matching.
- Add 6 test cases covering arrow function exports, function
  expression exports, non-exported arrow functions, anonymous
  arrow functions, multiple exports, and JavaScript files.

Tested on a real monorepo (238 files): node count increased from
779 to 958 (+23%), with 94 new nodes in packages/ that previously
had 0 coverage.
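
The two fixes described in this commit can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `AstNode` is a hypothetical stand-in for a tree-sitter syntax node, and `nameNode` stands in for whatever field lookup the real extractor uses.

```typescript
// Hypothetical, simplified stand-in for a tree-sitter syntax node.
interface AstNode {
  type: string;
  text?: string;
  parent: AstNode | null;
  nameNode?: AstNode; // assumption: models a lookup of the `name` field
}

// Walk the parent chain instead of inspecting a fixed number of characters.
function isExported(node: AstNode): boolean {
  for (let cur = node.parent; cur !== null; cur = cur.parent) {
    if (cur.type === "export_statement") return true;
  }
  return false;
}

// Arrow functions carry no name; fall back to the parent variable_declarator.
function resolveFunctionName(node: AstNode): string {
  if (node.nameNode && node.nameNode.text) return node.nameNode.text;
  const parent = node.parent;
  if (parent !== null && parent.type === "variable_declarator") {
    if (parent.nameNode && parent.nameNode.text) return parent.nameNode.text;
  }
  return "<anonymous>";
}

// Hand-built tree for: export const useAuth = () => { ... }
const exportStmt: AstNode = { type: "export_statement", parent: null };
const lexical: AstNode = { type: "lexical_declaration", parent: exportStmt };
const declarator: AstNode = {
  type: "variable_declarator",
  parent: lexical,
  nameNode: { type: "identifier", text: "useAuth", parent: null },
};
const arrow: AstNode = { type: "arrow_function", parent: declarator };
```

With this tree, `resolveFunctionName(arrow)` yields the declarator's name and `isExported(arrow)` finds the `export_statement` three levels up, which a 10-character lookback would miss.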
Extend extraction to index two additional categories of symbols
that were previously invisible:

1. Type aliases (e.g. `export type X = ...` in TypeScript,
   `type X` in Go, `type X = ...` in Rust, `typealias X` in Swift,
   `type_alias` in Kotlin). Adds `typeAliasTypes` to the
   LanguageExtractor interface with values for all 13 languages.

2. Exported variable declarations that aren't functions, including:
   - Zustand stores: `export const useX = create(...)`
   - XState machines: `export const xMachine = createMachine(...)`
   - Zod schemas: `export const schema = z.object(...)`
   - Config objects: `export const config = { ... }`
   - Constants: `export const MAX = 3`
   - Arrays: `export const NAMES = [...] as const`

   The extractExportedVariables() method is called when visiting
   export_statement nodes. It skips variable_declarator values that
   are already handled by functionTypes (arrow_function,
   function_expression) to avoid duplicate extraction.

Adds 11 new test cases (59 total extraction tests, 215 total).

Tested on production monorepo: nodes increased from 958 to 1,172
(+22%), with 109 new variable nodes and 105 new type_alias nodes.
Only 4 files remain at 0 nodes — all are re-export barrels or
ambient declaration files with no extractable symbols.
Adds support for Dart and Liquid languages with tree-sitter parsing.
Improves accuracy of code symbol extraction for existing languages.
Indexes project files to enhance code navigation features.
Migrates build system to facilitate code contributions.
Removes git hook functionality.
Integrates Sentry for error tracking and reporting.
Enhances project initialization and configuration loading.
…lias extraction

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…locking

Implements security improvements inspired by PR colbymchenry#16 (credit: MO2k4):

- Add validatePathWithinRoot() to prevent path traversal attacks in
  extraction and context building
- Clamp MCP tool inputs (limit, depth, maxDepth) to sane ranges
- Use atomic writes (temp file + rename) for config saves
- Add symlink cycle detection in directory scanning to prevent infinite loops
- Replace all JSON.parse calls in db/queries.ts with safeJsonParse fallbacks
  to handle corrupted database metadata gracefully
- Add cross-process FileLock for DB write operations (indexAll, indexFiles,
  sync) to prevent concurrent writes from CLI, MCP server, and git hooks
- Remove unused path import from context/index.ts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
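
As a rough illustration of three of the helpers named in this commit, here is one way they could look. The names come from the commit message, but the signatures and return conventions below are assumptions, and this sketch does not resolve symlinks.

```typescript
import * as path from "node:path";

// Assumed signature: returns the resolved path, or null if it escapes the root.
function validatePathWithinRoot(root: string, candidate: string): string | null {
  const resolvedRoot = path.resolve(root);
  const resolved = path.resolve(resolvedRoot, candidate);
  const rel = path.relative(resolvedRoot, resolved);
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return escapes ? null : resolved;
}

// Clamp a numeric tool input (limit, depth, maxDepth) into [min, max].
function clamp(value: number, min: number, max: number): number {
  if (Number.isNaN(value)) return min; // assumption: NaN falls back to min
  return Math.min(max, Math.max(min, value));
}

// Parse JSON, returning a caller-supplied fallback on corrupted input.
function safeJsonParse<T>(raw: string, fallback: T): T {
  try {
    return JSON.parse(raw) as T;
  } catch {
    return fallback;
  }
}
```

Comparing the relative path rather than checking a string prefix avoids treating a sibling such as `/repo-other` as being inside `/repo`.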
Tests cover all security utilities introduced in the previous commit:
- validatePathWithinRoot: path traversal prevention (7 tests)
- safeJsonParse: corrupted JSON fallback handling (6 tests)
- clamp: input range clamping (5 tests)
- FileLock: cross-process file locking (7 tests)
- Atomic config writes: temp file + rename pattern (3 tests)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Security hardening: path validation, input clamping, safe JSON, file locking
- Fix Float32Array embedder bug: was creating zero-filled array instead
  of copying data from TypedArray-like objects
- Fix VSS search query: use subquery pattern so LIMIT applies before JOIN
- Pin tree-sitter versions: remove caret ranges for ABI stability, add
  overrides to lock tree-sitter core at 0.22.4
- Lazy grammar loading: load native bindings on first use per language
  instead of all at startup, so one missing grammar doesn't affect others
- Remove stale src/extraction/queries copy from copy-assets script

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
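
The Float32Array bug in the first bullet is easy to reproduce in isolation. The commit does not show the original code, so the "buggy" variant below is only a guess at the kind of mistake involved, and `TypedArrayLike` is a hypothetical type.

```typescript
// Hypothetical shape of an embedding returned by a model runtime.
interface TypedArrayLike {
  length: number;
  [index: number]: number;
}

// Assumed shape of the bug: allocates the right size but copies nothing,
// so every element is 0.
function toEmbeddingBuggy(data: TypedArrayLike): Float32Array {
  return new Float32Array(data.length);
}

// Copies each element out of the array-like object.
function toEmbedding(data: TypedArrayLike): Float32Array {
  return Float32Array.from(data);
}

const sample: TypedArrayLike = { length: 3, 0: 0.5, 1: -1, 2: 2 };
const broken = toEmbeddingBuggy(sample); // all zeros
const fixed = toEmbedding(sample); // 0.5, -1, 2
```

A zero-filled vector is a quiet failure: similarity search still runs, but every embedding compares the same.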
- SQLite performance pragmas: synchronous=NORMAL, 64MB cache,
  memory temp store, 256MB mmap (safe with WAL mode)
- Batch insert for unresolved refs: single transaction instead of
  N individual inserts per file
- Symbol caching (warmCaches): pre-load all nodes into memory maps
  before resolution, eliminating repeated SQLite queries per ref
- Async file I/O: fs.stat/readFile in indexFile() are now non-blocking
- Denormalize filePath/language onto UnresolvedReference: avoids N
  node lookups during resolution, with schema migration v2

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
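
For reference, the pragma settings listed in the first bullet translate to SQL along these lines. The helper name and defaults are assumptions for illustration; the statements would be passed to whichever SQLite driver is in use.

```typescript
const BYTES_PER_MIB = 1024 * 1024;

// Hypothetical helper: builds the PRAGMA statements described above.
function performancePragmas(cacheMiB: number = 64, mmapMiB: number = 256): string[] {
  return [
    "PRAGMA journal_mode = WAL", // assumption: WAL is enabled elsewhere or here
    "PRAGMA synchronous = NORMAL",
    // A negative cache_size is interpreted by SQLite as a size in KiB.
    `PRAGMA cache_size = ${-cacheMiB * 1024}`,
    "PRAGMA temp_store = MEMORY",
    `PRAGMA mmap_size = ${mmapMiB * BYTES_PER_MIB}`,
  ];
}

const pragmas = performancePragmas();
```

`synchronous = NORMAL` is generally considered safe against corruption under WAL, at the cost of possibly losing the most recent transactions on power loss.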
- Fix arrow function extraction: explicitly call extractFunction() for
  arrow functions/function expressions in variable declarations instead
  of silently skipping them (all 6 arrow function tests now pass)
- Best-candidate resolution: collect candidates from all strategies and
  return highest confidence match instead of first match
- Fix graph traversal 'both' direction: correctly determine next node
  for mixed incoming/outgoing edges in BFS and DFS

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
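
The 'both' direction fix comes down to picking the endpoint that is not the current node. A minimal sketch, assuming a simple edge list rather than the project's actual graph types:

```typescript
// Hypothetical edge and direction types for illustration.
interface Edge {
  source: string;
  target: string;
}
type Direction = "outgoing" | "incoming" | "both";

// For 'both', an edge contributes whichever endpoint is not `current`.
function neighbors(edges: Edge[], current: string, direction: Direction): string[] {
  const out: string[] = [];
  for (const e of edges) {
    if (direction !== "incoming" && e.source === current) out.push(e.target);
    if (direction !== "outgoing" && e.target === current) out.push(e.source);
  }
  return out;
}

function bfs(edges: Edge[], start: string, direction: Direction): string[] {
  const visited = new Set<string>([start]);
  const queue: string[] = [start];
  const order: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    order.push(current);
    for (const next of neighbors(edges, current, direction)) {
      if (!visited.has(next)) {
        visited.add(next);
        queue.push(next);
      }
    }
  }
  return order;
}

// a -> b <- c: 'c' is only reachable from 'a' when traversing both directions.
const sampleEdges: Edge[] = [
  { source: "a", target: "b" },
  { source: "c", target: "b" },
];
```

Always following `target` (or always `source`) in 'both' mode would either revisit the current node or skip `c` entirely in this graph.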
Exposes the existing uninitialize() method via `codegraph uninit [path]`.
Includes confirmation prompt (skippable with --force) before deleting
the .codegraph/ directory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add findSymbol() helper that prefers exact name matches and notes
  alternatives when multiple symbols share the same name
- Add output truncation (15K char cap) to prevent context window bloat
- Apply to callers, callees, impact, node, search, and files tools

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
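
The output cap could be implemented along these lines. `MAX_OUTPUT_LENGTH` is named in a later commit on this PR; the function name and the wording of the truncation notice are assumptions.

```typescript
const MAX_OUTPUT_LENGTH = 15000;

// Hypothetical helper: cap tool output and say how much was dropped.
function truncateOutput(text: string, max: number = MAX_OUTPUT_LENGTH): string {
  if (text.length <= max) return text;
  const omitted = text.length - max;
  return `${text.slice(0, max)}\n... [truncated ${omitted} characters]`;
}
```

In this sketch the notice is appended after the cap, so the returned string is slightly longer than `max`; a stricter version would reserve room for the notice.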
Covers arrow function extraction, best-candidate resolution, graph
traversal direction fix, MCP symbol disambiguation, output truncation,
CLI uninit command, and more. Tests requiring better-sqlite3 native
bindings are conditionally skipped.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
sqlite-vss fails to compile on Windows — moving it to optional lets
install succeed while the code already falls back to brute-force
vector search. tree-sitter-liquid is removed entirely as it has ABI
incompatibility with tree-sitter 0.22+ and Liquid is handled by the
built-in regex extractor.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Node insertion used plain INSERT which crashes on duplicate IDs.
In large C/C++ projects, tree-sitter can produce duplicate nodes for
the same symbol (e.g. typedef struct where both struct_specifier and
type_definition resolve to the same name/kind/line, or multiple
anonymous constructs on the same line).

- Change nodes INSERT to INSERT OR REPLACE (idempotent, same data)
- Change edges INSERT to INSERT OR IGNORE (skip duplicate edges)

The node ID is sha256(filePath:kind:name:line) which already uses full
relative paths, so cross-directory collisions are not the issue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
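
To see why the collision happens, consider how an ID derived from `filePath:kind:name:line` behaves. The sketch below assumes hex encoding and a `:` separator; the `Map` merely models the effect of `INSERT OR REPLACE` and is not the project's storage layer.

```typescript
import { createHash } from "node:crypto";

// Assumed ID scheme, based on the description in the commit message.
function nodeId(filePath: string, kind: string, name: string, line: number): string {
  return createHash("sha256")
    .update(`${filePath}:${kind}:${name}:${line}`)
    .digest("hex");
}

// Two AST nodes resolving to the same name/kind/line produce the same ID.
const fromStruct = nodeId("src/list.c", "struct", "list_t", 12);
const fromTypedef = nodeId("src/list.c", "struct", "list_t", 12);

// Map.set overwrites on a duplicate key, much like INSERT OR REPLACE,
// where a plain INSERT would raise a UNIQUE constraint error.
const table = new Map<string, { name: string }>();
table.set(fromStruct, { name: "list_t" });
table.set(fromTypedef, { name: "list_t" });
```

Because both rows carry the same data, replacing one with the other is harmless, which is what makes the idempotent insert a reasonable fix.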
…raint-collision

Fix UNIQUE constraint crash on large C/C++ projects
PostToolUse(Edit|Write) marks the project dirty via .codegraph/.dirty,
and Stop syncs only if dirty — batching all edits into one sync per
Claude response. The installer now writes these hooks to settings.json.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Claude Code hooks for automatic CodeGraph sync
- Add 'svelte' to Language type, DEFAULT_CONFIG includes, grammars, and config validation
- Add SvelteExtractor that extracts <script> blocks and delegates to TS/JS TreeSitterExtractor
- Add Svelte framework resolver for runes ($state, $derived, $effect, etc.), store auto-subscriptions, SvelteKit module aliases ($app/*, $env/*, $lib/*), and SvelteKit route detection
- Update README to list Svelte and Dart in supported languages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
colbymchenry and others added 30 commits February 10, 2026 03:07
…ings

All grammar packages now have compatible peer deps with tree-sitter 0.21.x,
resulting in zero warnings during npm install. Downgraded grammars:
c 0.24.1→0.23.2, php 0.24.2→0.23.11, python 0.23.6→0.23.4,
rust 0.24.0→0.23.1, swift 0.7.1→0.6.0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Normalize paths to forward slashes in matchesGlob() and scanDirectory()
so glob exclude patterns work on Windows. Add getGitIgnoredDirectories()
using git ls-files to skip .gitignore'd directories during indexing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
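
The path normalization amounts to a one-line replacement. `normalizePath` is named elsewhere in this PR, but the body below is an assumed implementation, and `isInNodeModules` is a hypothetical check used only to show the effect.

```typescript
// Convert Windows backslashes to forward slashes so glob patterns match.
function normalizePath(p: string): string {
  return p.replace(/\\/g, "/");
}

// Hypothetical exclude check written against forward-slash paths.
function isInNodeModules(p: string): boolean {
  return normalizePath(p).indexOf("/node_modules/") !== -1;
}

const windowsPath = "src\\node_modules\\pkg\\index.js";
const normalized = normalizePath(windowsPath);
```

Without normalization, a pattern such as `**/node_modules/**` never matches a backslash-separated path, so excluded directories get indexed on Windows.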
…e lock

- Add validateProjectPath() to reject sensitive system directories
- Add isPathWithinRoot/isPathWithinRootReal for symlink-aware path checks
- Replace hand-rolled glob-to-regex with picomatch to prevent ReDoS
- Add isSafeRegex() to reject custom patterns with nested quantifiers
- Replace FileLock with PID-tracking version that detects stale locks
- Add symlink detection in removeDirectory/listDirectoryContents
- Add subdirectory name validation in ensureSubdirectory
- Add atomicWriteFileSync and corrupted file backup in config-writer
- Add MCP input validation (validateString) for all tool handlers
- Fix CLAUDE.md section replacement to handle ### subsections correctly
- Add provenance column on edges for tracking how edges were created
- Add project_metadata table for version/provenance tracking
- Make unresolved_refs file_path/language NOT NULL with defaults
- Add composite indexes for unresolved_refs and edges.provenance
- Update v2 migration to handle all new schema additions
- Record schema version on initialize to prevent re-migration
- Add dynamic prepared statement cache (getDynamicStmt) for varying SQL
- Add batch methods: getNodesByIds, getNodesByKinds, getFileHashMap, getFileSyncMap
- Add project metadata methods: getMetadata, setMetadata, getAllMetadata
- Optimize getStats to use single aggregate query
- Optimize getStaleFiles to use temporary table JOIN
- Add provenance parameter to getOutgoingEdges
- Add intent field to SearchOptions type
- Add src/search/query-utils.ts with extractSearchTerms, scorePathRelevance,
  kindBonus, detectApiIntent, inferRouteDirectories
- Add multi-signal scoring to searchNodes (kind bonus + path relevance)
- Improve FTS query sanitization (strip :^ chars, filter boolean operators)
- Add comprehensive search tests
- Create file-kind nodes for each parsed source file
- Add isInsideClassLikeNode() for method vs function detection
- Extract arrow functions and function expressions from variable declarators
- Batch file I/O with FILE_IO_BATCH_SIZE=10 using Promise.all
- Add symlink cycle detection with visitedDirs Set in scanDirectory
- Add lazy grammar loading with exported getGrammar() function
- Add indexFileWithContent() for pre-read content processing
- Add tests for file nodes and arrow function extraction
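
The batched file I/O mentioned above can be sketched as a generic helper. `FILE_IO_BATCH_SIZE` is taken from the commit message; `processInBatches` and its signature are assumptions for illustration.

```typescript
const FILE_IO_BATCH_SIZE = 10;

// Run `worker` over `items`, at most `batchSize` at a time, preserving order.
async function processInBatches<T, R>(
  items: T[],
  worker: (item: T) => Promise<R>,
  batchSize: number = FILE_IO_BATCH_SIZE
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // All items in a batch run concurrently; batches run one after another.
    const batchResults = await Promise.all(batch.map((item) => worker(item)));
    results.push(...batchResults);
  }
  return results;
}
```

Bounding the batch size keeps the number of open file handles predictable, which an unbounded `Promise.all` over thousands of files would not.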
Keeps the PR's visitedDirs rename and main's gitIgnoredDirs addition.
- Remove extractFunctionVariable() and its dispatch (already handled by extractVariable)
- Remove dead getGrammar() export (zero callers)
- Deduplicate indexFile by delegating to indexFileWithContent
- Remove redundant arrow function variable extraction tests (covered by existing suite)
feat: File nodes, arrow functions, parallel I/O
Both functions have zero callers — dead code on arrival. Remove them
and their tests (9 tests) to keep the module focused on what's
actually used: search term extraction, path relevance scoring, and
kind bonuses.
feat: Search query utilities + multi-signal scoring
- extraction/index.ts: use picomatch with static import (replacing
  dynamic require) and keep normalizePath for other call sites
- utils.ts: keep normalizePath from main, take PR's PID-based FileLock
…n-redos

security: Path validation, ReDoS prevention, picomatch
- Remove getNodesByIds, getNodesByKinds (zero callers)
- Remove getFileHashMap, getFileSyncMap (zero callers)
- Remove getDynamicStmt cache (only used by removed batch methods)
- Revert getStaleFiles to simple implementation (zero callers, no need to optimize)
- Remove intent field from SearchOptions (never read in search logic)
…ema-v2

feat: Schema v2 migration + database performance
Fixes colbymchenry#28 - Python site-packages directories (e.g.
audio_tools/python/Lib/site-packages/) were not excluded by default,
causing massive index bloat and FOREIGN KEY failures when indexing
large libraries like tensorflow. The FK crash itself was already fixed
via INSERT OR IGNORE, but excluding these directories prevents the
bloat in the first place.
The resolving refs phase stalled on large projects (3400+ files, 38k+ nodes)
because matchFuzzy loaded ALL functions/methods/classes per ref, import
mappings were re-extracted per ref, and fileExists hit disk every call.

Add kindCache, lowerNameCache, importMappingCache, and knownFiles set to
warmCaches(). Rewrite matchFuzzy to use O(1) lowercase index lookup instead
of 3x getNodesByKind scans. Cache import mappings per file. Pre-build file
existence set from the index for O(1) fileExists checks.
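
The lowercase index behind the rewritten matchFuzzy might look something like this. The cache and function names echo the commit message, but the types and exact matching rules are assumptions.

```typescript
// Hypothetical minimal node shape.
interface SymbolNode {
  id: string;
  name: string;
  kind: string;
}

// Built once during cache warm-up: lowercase name -> nodes with that name.
function buildLowerNameCache(nodes: SymbolNode[]): Map<string, SymbolNode[]> {
  const cache = new Map<string, SymbolNode[]>();
  for (const node of nodes) {
    const key = node.name.toLowerCase();
    const bucket = cache.get(key);
    if (bucket) {
      bucket.push(node);
    } else {
      cache.set(key, [node]);
    }
  }
  return cache;
}

// One map lookup per reference instead of scanning every function/method/class.
function matchFuzzy(cache: Map<string, SymbolNode[]>, refName: string): SymbolNode[] {
  return cache.get(refName.toLowerCase()) ?? [];
}

const lowerNameCache = buildLowerNameCache([
  { id: "1", name: "useAuth", kind: "function" },
  { id: "2", name: "UseAuth", kind: "class" },
  { id: "3", name: "logout", kind: "function" },
]);
```

Building the map costs one pass over all nodes, after which each unresolved reference is resolved in constant time on average.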
…ce-resolution

Optimize reference resolution with in-memory caches
Replace native tree-sitter with web-tree-sitter + tree-sitter-wasms for
universal cross-platform support. Add node-sqlite3-wasm as a fallback
when better-sqlite3 native bindings aren't available. Move better-sqlite3
and sqlite-vss to optionalDependencies so installs never fail.

Fix installer to use npx fallback when global npm install fails, so MCP
config, hooks, and quick-start instructions all work without the bare
codegraph command in PATH.

Fix tests: update schema version expectation, fix db test paths and
method names, extract MAX_OUTPUT_LENGTH as module constant, normalize
Windows path separators in import resolver.
…nd reusable prompts

Co-authored-by: southerncoder <13037278+southerncoder@users.noreply.github.com>
…lot' into copilot/optimize-for-github-copilot
5 participants