refactor(agent/agentcontext): fixed-location shallow context discovery by kylecarbs · Pull Request #26596 · coder/coder

kylecarbs · 2026-06-23T02:19:25Z

Problem

agent/agentcontext resolved workspace context by walking the working directory recursively (depth 8) and matching files by basename. This over-injected context:

Instruction-file matching was case-insensitive, so the generated docs/reference/api/agents.md was treated as an instruction file.
Symlinked instruction files (CLAUDE.md, .cursorrules -> AGENTS.md) shipped as duplicate resources.
Nested AGENTS.md files (e.g. site/AGENTS.md) were collected from anywhere in the tree.
Skills were discovered from any skills/ directory anywhere in the tree, and .mcp.json from any depth.

Resolving the repo root produced six instruction sources for what was effectively one file of guidance, plus skills/MCP found by an open-ended walk.

What changed

Replace the recursive scan with fixed-location, shallow discovery. Each scan root is inspected at its top level only: the resolver never descends into subdirectories and never climbs to a parent. Additional directories are added explicitly as sources (HTTP API) or via the CODER_AGENT_EXP_*_DIRS seeding env vars.

Single working-dir scan root. The working directory is one scan root. Instruction files and .mcp.json are read only at its top level.
Fixed-location skills. Skills are discovered only from skills, .agents/skills, .claude/skills, .codex/skills (one skill per immediate subdir with a SKILL.md), not from arbitrary skills/ directories.
Case-sensitive instruction names. Exact AGENTS.md/CLAUDE.md/.cursorrules; a lower-case agents.md is ignored.
Symlink dedup. Resources are attributed to their resolved target, so symlinked CLAUDE.md/.cursorrules collapse into the single AGENTS.md.
Watcher mirrors the same fixed-location set instead of recursively watching every scan root (no more walking node_modules).
The recursive walkDir, skipDirNames, MaxScanDepth, and isSkillsContainer are removed.

Resolving the repo root now yields AGENTS.md, .mcp.json, and the .agents/skills/.claude/skills skills, with no nested instruction-file noise.
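The instruction-file rules above (top level only, exact-case names, symlink dedup) can be sketched in Go roughly as follows. This is a minimal illustration, not the code in agent/agentcontext: discoverInstructions and demo are hypothetical names, and matching against the directory listing is an assumption about one way to get exact-case behavior on case-insensitive filesystems.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Exact, case-sensitive basenames listed in the PR description.
var instructionNames = []string{"AGENTS.md", "CLAUDE.md", ".cursorrules"}

// discoverInstructions (hypothetical helper) inspects only root's top level.
// It never descends into subdirectories and never climbs to a parent.
func discoverInstructions(root string) ([]string, error) {
	entries, err := os.ReadDir(root)
	if err != nil {
		return nil, err
	}
	// Compare against names from the directory listing rather than calling
	// os.Stat per name, so a lower-case agents.md cannot match AGENTS.md even
	// on a case-insensitive filesystem.
	present := make(map[string]bool, len(entries))
	for _, e := range entries {
		present[e.Name()] = true
	}
	seen := make(map[string]bool)
	var found []string
	for _, name := range instructionNames {
		if !present[name] {
			continue
		}
		path := filepath.Join(root, name)
		resolved, err := filepath.EvalSymlinks(path)
		if err != nil {
			resolved = path // keep the link path so an error points at it
		}
		if seen[resolved] {
			continue // another name already resolved to this file
		}
		seen[resolved] = true
		found = append(found, resolved)
	}
	return found, nil
}

// demo builds a throwaway tree and returns the hits for two scan roots.
func demo() (repoHits, lowerHits []string, err error) {
	tmp, err := os.MkdirTemp("", "agentcontext-sketch")
	if err != nil {
		return nil, nil, err
	}
	defer os.RemoveAll(tmp)

	repo := filepath.Join(tmp, "repo")
	lower := filepath.Join(tmp, "lower")
	for _, f := range []string{
		filepath.Join(repo, "AGENTS.md"),
		filepath.Join(repo, "site", "AGENTS.md"), // nested: not at the top level
		filepath.Join(lower, "agents.md"),        // wrong case: ignored
	} {
		if err = os.MkdirAll(filepath.Dir(f), 0o755); err != nil {
			return nil, nil, err
		}
		if err = os.WriteFile(f, []byte("guidance\n"), 0o644); err != nil {
			return nil, nil, err
		}
	}
	for _, link := range []string{"CLAUDE.md", ".cursorrules"} {
		if err = os.Symlink("AGENTS.md", filepath.Join(repo, link)); err != nil {
			return nil, nil, err
		}
	}
	if repoHits, err = discoverInstructions(repo); err != nil {
		return nil, nil, err
	}
	lowerHits, err = discoverInstructions(lower)
	return repoHits, lowerHits, err
}

func main() {
	repoHits, lowerHits, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(repoHits), len(lowerHits)) // prints "1 0"
}
```

In the demo tree the repo root holds AGENTS.md plus two symlinks to it and a nested site/AGENTS.md, yet only one resource comes back; the root containing only agents.md yields none.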

Behavior change

site/AGENTS.md is no longer auto-injected when the working dir is the repo root. It loads when the working dir is site/ (its top level), or when site/ is added as an explicit source. There is intentionally no walk-up to a .git project root: an agent started in a subdirectory does not auto-inherit ancestor AGENTS.md; those directories are added explicitly.
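As a rough illustration of the rule above, the set of scan roots can be thought of as the working directory plus whatever directories were added explicitly, with no ancestors mixed in. The scanRoots helper below is hypothetical and only models that idea; how the real manager stores sources or parses the seeding env vars is not shown here.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// scanRoots (hypothetical helper) returns the working directory followed by
// any explicitly added source directories, cleaned and de-duplicated.
// It deliberately adds no ancestors: there is no walk-up to a .git root.
func scanRoots(workingDir string, explicit ...string) []string {
	seen := make(map[string]bool)
	var roots []string
	for _, dir := range append([]string{workingDir}, explicit...) {
		if dir == "" {
			continue
		}
		clean := filepath.Clean(dir)
		if seen[clean] {
			continue
		}
		seen[clean] = true
		roots = append(roots, clean)
	}
	return roots
}

func main() {
	// Working dir is the repo root: site/ is not a scan root, so
	// site/AGENTS.md is not loaded.
	fmt.Println(scanRoots("/repo"))
	// Working dir is site/: the repo root is not a scan root, so the
	// ancestor AGENTS.md is not inherited.
	fmt.Println(scanRoots("/repo/site"))
	// site/ added explicitly (twice, with a trailing slash the second time).
	fmt.Println(scanRoots("/repo", "/repo/site", "/repo/site/"))
}
```

Each returned root would then be inspected at its top level only, so adding site/ explicitly is the way to pull in site/AGENTS.md alongside the repo root's file.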

Verification & decision log

codex research (confirmed via source). Instruction files: codex walks up to the first .git ancestor and reads root->cwd, one file per directory, exact-cased names (codex-rs/core/src/agents_md.rs). Skills: fixed roots (.agents/skills, .codex/skills, $CODEX_HOME/skills, ...) with bounded in-root recursion (core-skills/src/loader.rs). MCP: .codex/config.toml via walk-up; a project .mcp.json is not a runtime source in codex. No resource type triggers an unbounded downward walk.
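For contrast, the walk-up behavior described above can be sketched in Go as follows. This is an illustration of the idea only: codex itself is written in Rust, the helper names findGitRoot and rootToCwd are hypothetical, and this is the behavior the PR researched rather than code it ships.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findGitRoot (hypothetical) walks from dir toward the filesystem root and
// returns the first directory containing a .git entry. Lstat is used so that
// both a .git directory and a .git file (worktree/submodule) count.
func findGitRoot(dir string) (string, bool) {
	dir = filepath.Clean(dir)
	for {
		if _, err := os.Lstat(filepath.Join(dir, ".git")); err == nil {
			return dir, true
		}
		parent := filepath.Dir(dir)
		if parent == dir {
			return "", false // reached the filesystem root
		}
		dir = parent
	}
}

// rootToCwd (hypothetical) lists every directory from root down to cwd.
func rootToCwd(root, cwd string) []string {
	var chain []string
	for d := filepath.Clean(cwd); ; d = filepath.Dir(d) {
		chain = append([]string{d}, chain...) // prepend so the root comes first
		if d == root || filepath.Dir(d) == d {
			break
		}
	}
	return chain
}

// demo builds repo/.git and repo/site/src in a temp dir and returns the
// root->cwd chain relative to that temp dir.
func demo() (chain []string, err error) {
	tmp, err := os.MkdirTemp("", "walkup-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(tmp)

	cwd := filepath.Join(tmp, "repo", "site", "src")
	for _, d := range []string{cwd, filepath.Join(tmp, "repo", ".git")} {
		if err = os.MkdirAll(d, 0o755); err != nil {
			return nil, err
		}
	}
	root, ok := findGitRoot(cwd)
	if !ok {
		return nil, fmt.Errorf("no .git ancestor found for %s", cwd)
	}
	for _, d := range rootToCwd(root, cwd) {
		rel, relErr := filepath.Rel(tmp, d)
		if relErr != nil {
			return nil, relErr
		}
		chain = append(chain, filepath.ToSlash(rel))
	}
	return chain, nil
}

func main() {
	chain, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(chain) // prints "[repo repo/site repo/site/src]"
}
```

Reading one instruction file per directory along that chain is an upward climb with a fixed bound, which is why it never triggers an open-ended downward walk.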

Decisions.

Adopt codex's fixed-location, shallow discovery (case-sensitive names, symlink dedup, top-level-only files, container-only skills).
Deliberately omit codex's walk-up to the .git project root. In Coder the working dir is the scan root and extra directories are added explicitly (HTTP API / CODER_AGENT_EXP_*_DIRS), so the implicit ancestor climb added surprise without benefit (e.g. context add ./site should scan ./site, not the repo root).
Keep .mcp.json (codex uses config.toml, intentionally not added).
Include .claude/skills and .codex/skills in the container list so the repo's existing .claude/skills skills are not regressed; skills recurse one level inside a container.
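The container rule in the decisions above (four fixed containers, one skill per immediate subdirectory holding a SKILL.md) might look like this in Go. It is a sketch under assumptions: discoverSkills and demo are hypothetical names, and skipping symlinked skill directories is a simplification of this example, not a statement about the real resolver.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Fixed container locations from the PR description, relative to a scan root.
var skillContainers = []string{"skills", ".agents/skills", ".claude/skills", ".codex/skills"}

// discoverSkills (hypothetical helper) returns the SKILL.md path of every
// skill found one level inside a fixed container. It never looks deeper and
// never looks at skills/ directories elsewhere in the tree.
func discoverSkills(root string) []string {
	var manifests []string
	for _, c := range skillContainers {
		dir := filepath.Join(root, filepath.FromSlash(c))
		entries, err := os.ReadDir(dir)
		if err != nil {
			continue // container missing or unreadable: nothing to discover
		}
		for _, e := range entries {
			if !e.IsDir() {
				continue // simplification: symlinked directories are skipped
			}
			manifest := filepath.Join(dir, e.Name(), "SKILL.md")
			if info, err := os.Stat(manifest); err == nil && info.Mode().IsRegular() {
				manifests = append(manifests, manifest)
			}
		}
	}
	return manifests
}

// demo builds a throwaway tree and returns the discovered skill names.
func demo() (names []string, err error) {
	root, err := os.MkdirTemp("", "skills-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	for _, rel := range []string{
		".agents/skills/review/SKILL.md",     // discovered
		".claude/skills/deploy/SKILL.md",     // discovered
		".claude/skills/deploy/sub/SKILL.md", // too deep: ignored
		"pkg/skills/nested/SKILL.md",         // arbitrary skills/ dir: ignored
		"skills/empty/README.md",             // no SKILL.md: ignored
	} {
		path := filepath.Join(root, filepath.FromSlash(rel))
		if err = os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
			return nil, err
		}
		if err = os.WriteFile(path, []byte("# skill\n"), 0o644); err != nil {
			return nil, err
		}
	}
	for _, m := range discoverSkills(root) {
		names = append(names, filepath.Base(filepath.Dir(m)))
	}
	return names, nil
}

func main() {
	names, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(names) // prints "[review deploy]"
}
```

Results follow the container order, so the .agents/skills entry is listed before the .claude/skills one.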

Tests. TestManager_WorkingDirScannedShallow (working dir read at top level; ancestor root and nested subdir both excluded); TestResolver_SkillsOnlyFromFixedContainers, TestResolver_MCPConfigOnlyAtScanRoot, TestResolver_SymlinkedInstructionFilesDeduplicated, TestResolver_InstructionFilesOnlyAtScanRoot, TestResolver_InstructionNamesAreCaseSensitive. Cap tests use multiple scan roots.

Local checks. gofmt, go vet, golangci-lint, go test -race, and make lint/emdash pass for the package.

🤖 Generated by Coder Agents on behalf of @kylecarbs.

Instruction-file resolution over-injected context: the resolver walked the working directory recursively, matched filenames case-insensitively, and emitted symlinked duplicates. Resolving the repo root produced six instruction sources (AGENTS.md plus its CLAUDE.md and .cursorrules symlinks, a nested site/AGENTS.md and site/CLAUDE.md, and the lower-case docs/reference/api/agents.md API doc) that amounted to one file's worth of guidance.

Align with codex, which finds the project root by walking up to .git and reads at most one AGENTS.md per ancestor directory rather than descending into subdirectories:

- Recognize instruction files only at a scan root's top level. Skills and .mcp.json keep recursive discovery, since they are a Coder extension with no codex equivalent.
- Match instruction basenames case-sensitively, so a lower-case agents.md is not mistaken for an instruction file.
- Attribute each resource to its resolved symlink target so CLAUDE.md and .cursorrules symlinked to AGENTS.md collapse into one resource via the existing ID-based dedup instead of shipping identical content multiple times. On resolve failure the original path is kept so the error points at the offending link.

Resolving the repo root now yields a single instruction resource.

… discovery

The resolver previously walked the working directory recursively to depth 8 and matched any nested instruction file, .mcp.json, or skills/ directory. codex never walks the working tree downward: it reads fixed locations and walks up to the project root.

Replace the recursive engine with codex-style discovery.

- Manager walks up from the working dir to the nearest .git ancestor (a directory, or a worktree/submodule .git file) and feeds the root->cwd chain as separate scan roots.
- Resolver inspects only each scan root's top level for instruction files and .mcp.json, and discovers skills from fixed container locations (skills, .agents/skills, .claude/skills, .codex/skills), one skill per immediate subdirectory. The recursive walkDir, skipDirNames, MaxScanDepth, and isSkillsContainer are removed.
- Watcher mirrors the same fixed-location set instead of recursively watching every scan root, so it no longer walks node_modules and other large trees.

.mcp.json is kept (no codex config.toml support). Resolving the repo root now yields AGENTS.md, .mcp.json, and the .agents/skills and .claude/skills skills, with no nested instruction-file noise.

… dir directly

github-actions Bot assigned kylecarbs Jun 23, 2026

kylecarbs marked this pull request as ready for review June 23, 2026 02:24

sreya approved these changes Jun 23, 2026


kylecarbs changed the title ~~fix(agent/agentcontext): match codex instruction-file discovery~~ refactor(agent/agentcontext): adopt codex-style fixed-location context discovery Jun 23, 2026

kylecarbs added 2 commits June 23, 2026 03:22

test(agent/agentcontext): cover user sources not walking up to git root

cda7f00

refactor(agent/agentcontext): drop project-root walk-up, scan working…

f062097

… dir directly

kylecarbs changed the title ~~refactor(agent/agentcontext): adopt codex-style fixed-location context discovery~~ refactor(agent/agentcontext): fixed-location shallow context discovery Jun 23, 2026

kylecarbs enabled auto-merge (squash) June 23, 2026 03:39

kylecarbs merged commit 0f37522 into main Jun 23, 2026
33 of 34 checks passed

kylecarbs deleted the kylecarbs/agentcontext-instruction-discovery branch June 23, 2026 03:42

github-actions Bot locked and limited conversation to collaborators Jun 23, 2026

kylecarbs merged 4 commits into main from kylecarbs/agentcontext-instruction-discovery
