
refactor(agent/agentcontext): fixed-location shallow context discovery#26596

Merged
kylecarbs merged 4 commits into main from kylecarbs/agentcontext-instruction-discovery
Jun 23, 2026

@kylecarbs

@kylecarbs kylecarbs commented Jun 23, 2026


Problem

agent/agentcontext resolved workspace context by walking the working directory recursively (depth 8) and matching files by basename. This over-injected context:

  • Instruction-file matching was case-insensitive, so the generated docs/reference/api/agents.md was treated as an instruction file.
  • Symlinked instruction files (CLAUDE.md, .cursorrules -> AGENTS.md) shipped as duplicate resources.
  • Nested AGENTS.md files (e.g. site/AGENTS.md) were collected from anywhere in the tree.
  • Skills were discovered from any skills/ directory anywhere in the tree, and .mcp.json from any depth.

Resolving the repo root produced six instruction sources for what was effectively one file of guidance, plus skills/MCP found by an open-ended walk.

What changed

Replace the recursive scan with fixed-location, shallow discovery. Each scan root is inspected at its top level only: the resolver never descends into subdirectories and never climbs to a parent. Additional directories are added explicitly as sources (HTTP API) or via the CODER_AGENT_EXP_*_DIRS seeding env vars.

  • Single working-dir scan root. The working directory is one scan root. Instruction files and .mcp.json are read only at its top level.
  • Fixed-location skills. Skills are discovered only from skills, .agents/skills, .claude/skills, .codex/skills (one skill per immediate subdir with a SKILL.md), not from arbitrary skills/ directories.
  • Case-sensitive instruction names. Exact AGENTS.md/CLAUDE.md/.cursorrules; a lower-case agents.md is ignored.
  • Symlink dedup. Resources are attributed to their resolved target, so symlinked CLAUDE.md/.cursorrules collapse into the single AGENTS.md.
  • Watcher mirrors the same fixed-location set instead of recursively watching every scan root (no more walking node_modules).
  • The recursive walkDir, skipDirNames, MaxScanDepth, and isSkillsContainer are removed.
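
To illustrate the top-level-only, case-sensitive instruction matching described above, here is a minimal sketch. It is not the code from this PR: the names `instructionNames`, `isInstructionName`, and `discoverInstructionFiles` are hypothetical, and only the three file names come from the description.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// instructionNames holds the exact, case-sensitive basenames named in the PR.
// The variable itself is a hypothetical helper for this sketch.
var instructionNames = map[string]bool{
	"AGENTS.md":    true,
	"CLAUDE.md":    true,
	".cursorrules": true,
}

// isInstructionName reports whether name matches a known instruction file
// exactly. A lower-case "agents.md" does not match.
func isInstructionName(name string) bool {
	return instructionNames[name]
}

// discoverInstructionFiles inspects only the top level of root. It never
// descends into subdirectories and never climbs to a parent.
func discoverInstructionFiles(root string) ([]string, error) {
	entries, err := os.ReadDir(root)
	if err != nil {
		return nil, err
	}
	var found []string
	for _, e := range entries {
		// Skip directories and anything that is not an exact name match.
		if e.IsDir() || !isInstructionName(e.Name()) {
			continue
		}
		found = append(found, filepath.Join(root, e.Name()))
	}
	return found, nil
}

func main() {
	files, err := discoverInstructionFiles(".")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, f := range files {
		fmt.Println(f)
	}
}
```

Because `os.ReadDir` returns the names as stored on disk, the exact-match check stays case-sensitive even on a case-insensitive filesystem.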

Resolving the repo root now yields AGENTS.md, .mcp.json, and the .agents/skills/.claude/skills skills, with no nested instruction-file noise.
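
The fixed-location skill discovery can be sketched roughly as follows. This is an assumption-laden illustration, not the resolver from this PR: `skillContainers` and `discoverSkills` are hypothetical names, and only the four container paths and the `SKILL.md` marker come from the description.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// skillContainers lists the fixed container locations named in the PR,
// relative to a scan root. The variable name is hypothetical.
var skillContainers = []string{
	"skills",
	".agents/skills",
	".claude/skills",
	".codex/skills",
}

// discoverSkills returns the SKILL.md path of every skill found in a fixed
// container under root: one skill per immediate subdirectory. It does not
// search for other skills/ directories elsewhere in the tree.
func discoverSkills(root string) []string {
	var skills []string
	for _, c := range skillContainers {
		dir := filepath.Join(root, filepath.FromSlash(c))
		entries, err := os.ReadDir(dir)
		if err != nil {
			// A missing or unreadable container is simply skipped in this sketch.
			continue
		}
		for _, e := range entries {
			// Symlinked directories report IsDir() == false and are skipped here.
			if !e.IsDir() {
				continue
			}
			manifest := filepath.Join(dir, e.Name(), "SKILL.md")
			if info, err := os.Stat(manifest); err == nil && !info.IsDir() {
				skills = append(skills, manifest)
			}
		}
	}
	return skills
}

func main() {
	for _, s := range discoverSkills(".") {
		fmt.Println(s)
	}
}
```

A directory such as `site/skills/foo/SKILL.md` is never visited, because `site/skills` is not one of the fixed containers of the scan root.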

Behavior change

site/AGENTS.md is no longer auto-injected when the working dir is the repo root. It loads when the working dir is site/ (its top level), or when site/ is added as an explicit source. There is intentionally no walk-up to a .git project root: an agent started in a subdirectory does not auto-inherit ancestor AGENTS.md; those directories are added explicitly.

Verification & decision log

codex research (confirmed via source). Instruction files: codex walks up to the first .git ancestor and reads root->cwd, one file per directory, exact-cased names (codex-rs/core/src/agents_md.rs). Skills: fixed roots (.agents/skills, .codex/skills, $CODEX_HOME/skills, ...) with bounded in-root recursion (core-skills/src/loader.rs). MCP: .codex/config.toml via walk-up; a project .mcp.json is not a runtime source in codex. No resource type triggers an unbounded downward walk.

Decisions.

  • Adopt codex's fixed-location, shallow discovery (case-sensitive names, symlink dedup, top-level-only files, container-only skills).
  • Deliberately omit codex's walk-up to the .git project root. In Coder the working dir is the scan root and extra directories are added explicitly (HTTP API / CODER_AGENT_EXP_*_DIRS), so the implicit ancestor climb added surprise without benefit (e.g. context add ./site should scan ./site, not the repo root).
  • Keep .mcp.json (codex uses config.toml, intentionally not added).
  • Include .claude/skills and .codex/skills in the container list so the repo's existing .claude/skills skills are not regressed; skills recurse one level inside a container.

Tests.

  • TestManager_WorkingDirScannedShallow (working dir read at top level; ancestor root and nested subdir both excluded)
  • TestResolver_SkillsOnlyFromFixedContainers
  • TestResolver_MCPConfigOnlyAtScanRoot
  • TestResolver_SymlinkedInstructionFilesDeduplicated
  • TestResolver_InstructionFilesOnlyAtScanRoot
  • TestResolver_InstructionNamesAreCaseSensitive

Cap tests use multiple scan roots.

Local checks. gofmt, go vet, golangci-lint, go test -race, and make lint/emdash pass for the package.


🤖 Generated by Coder Agents on behalf of @kylecarbs.

Instruction-file resolution over-injected context: the resolver walked
the working directory recursively, matched filenames case-insensitively,
and emitted symlinked duplicates. Resolving the repo root produced six
instruction sources (AGENTS.md plus its CLAUDE.md and .cursorrules
symlinks, a nested site/AGENTS.md and site/CLAUDE.md, and the lower-case
docs/reference/api/agents.md API doc) that amounted to one file's worth
of guidance.

Align with codex, which finds the project root by walking up to .git and
reads at most one AGENTS.md per ancestor directory rather than descending
into subdirectories:

- Recognize instruction files only at a scan root's top level. Skills
  and .mcp.json keep recursive discovery, since they are a Coder
  extension with no codex equivalent.
- Match instruction basenames case-sensitively, so a lower-case
  agents.md is not mistaken for an instruction file.
- Attribute each resource to its resolved symlink target so CLAUDE.md
  and .cursorrules symlinked to AGENTS.md collapse into one resource via
  the existing ID-based dedup instead of shipping identical content
  multiple times. On resolve failure the original path is kept so the
  error points at the offending link.

Resolving the repo root now yields a single instruction resource.
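
As a rough sketch of the symlink attribution described in this commit message: each path is resolved to its target, and the resolved path is used as the dedup key. The function names `resolveTarget` and `dedupByID` are hypothetical, and the real resolver's ID scheme is not shown in this PR page, so treat this as an illustration under those assumptions.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveTarget returns the symlink-resolved path. On failure it keeps the
// original path, so a later error can point at the offending link.
func resolveTarget(path string) string {
	resolved, err := filepath.EvalSymlinks(path)
	if err != nil {
		return path
	}
	return resolved
}

// dedupByID keeps the first occurrence of each resolved ID, preserving
// input order. The resolver is injected so the logic can be exercised
// without touching the filesystem.
func dedupByID(paths []string, resolve func(string) string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, p := range paths {
		id := resolve(p)
		if seen[id] {
			continue
		}
		seen[id] = true
		out = append(out, id)
	}
	return out
}

func main() {
	candidates := []string{"AGENTS.md", "CLAUDE.md", ".cursorrules"}
	for _, id := range dedupByID(candidates, resolveTarget) {
		fmt.Println(id)
	}
}
```

If `CLAUDE.md` and `.cursorrules` are symlinks to `AGENTS.md`, all three candidates resolve to the same ID and only one resource remains.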
@kylecarbs kylecarbs marked this pull request as ready for review June 23, 2026 02:24
… discovery

The resolver previously walked the working directory recursively to
depth 8 and matched any nested instruction file, .mcp.json, or skills/
directory. codex never walks the working tree downward: it reads fixed
locations and walks up to the project root. Replace the recursive engine
with codex-style discovery.

- Manager walks up from the working dir to the nearest .git ancestor
  (a directory, or a worktree/submodule .git file) and feeds the
  root->cwd chain as separate scan roots.
- Resolver inspects only each scan root's top level for instruction
  files and .mcp.json, and discovers skills from fixed container
  locations (skills, .agents/skills, .claude/skills, .codex/skills),
  one skill per immediate subdirectory. The recursive walkDir,
  skipDirNames, MaxScanDepth, and isSkillsContainer are removed.
- Watcher mirrors the same fixed-location set instead of recursively
  watching every scan root, so it no longer walks node_modules and
  other large trees.

.mcp.json is kept (no codex config.toml support). Resolving the repo
root now yields AGENTS.md, .mcp.json, and the .agents/skills and
.claude/skills skills, with no nested instruction-file noise.
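
For reference, the walk-up described in this intermediate commit could look roughly like the sketch below. Note that the merged PR description deliberately omits this walk-up. The names `findProjectRoot`, `rootToCwdChain`, and `hasGitEntry` are hypothetical and do not come from the PR.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// hasGitEntry reports whether dir contains a .git entry. Lstat succeeds for
// both a .git directory and a worktree/submodule .git file.
func hasGitEntry(dir string) bool {
	_, err := os.Lstat(filepath.Join(dir, ".git"))
	return err == nil
}

// findProjectRoot walks up from dir to the nearest ancestor (including dir
// itself) for which hasGit returns true.
func findProjectRoot(dir string, hasGit func(string) bool) (string, bool) {
	for {
		if hasGit(dir) {
			return dir, true
		}
		parent := filepath.Dir(dir)
		if parent == dir {
			// Reached the filesystem root without finding .git.
			return "", false
		}
		dir = parent
	}
}

// rootToCwdChain returns the directories from root down to cwd, inclusive,
// so each one can be fed to the resolver as a separate scan root.
func rootToCwdChain(root, cwd string) []string {
	var chain []string
	for dir := cwd; ; dir = filepath.Dir(dir) {
		chain = append([]string{dir}, chain...)
		if dir == root || filepath.Dir(dir) == dir {
			break
		}
	}
	return chain
}

func main() {
	cwd, err := os.Getwd()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	root, ok := findProjectRoot(cwd, hasGitEntry)
	if !ok {
		fmt.Println(cwd)
		return
	}
	for _, dir := range rootToCwdChain(root, cwd) {
		fmt.Println(dir)
	}
}
```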
@kylecarbs kylecarbs changed the title fix(agent/agentcontext): match codex instruction-file discovery refactor(agent/agentcontext): adopt codex-style fixed-location context discovery Jun 23, 2026
@kylecarbs kylecarbs changed the title refactor(agent/agentcontext): adopt codex-style fixed-location context discovery refactor(agent/agentcontext): fixed-location shallow context discovery Jun 23, 2026
@kylecarbs kylecarbs enabled auto-merge (squash) June 23, 2026 03:39
@kylecarbs kylecarbs merged commit 0f37522 into main Jun 23, 2026
33 of 34 checks passed
@kylecarbs kylecarbs deleted the kylecarbs/agentcontext-instruction-discovery branch June 23, 2026 03:42
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 23, 2026