feat: update agent visibility and enhance delegation guidelines for improved context management

rokartur · rokartur · commit d44433a0d0fa · 2026-04-20T13:59:22.000+02:00
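
Note on the src/config.ts hunk: the include loop runs before the exclude loop, so a tool named in both lists ends up disabled. Below is a minimal standalone sketch of that ordering, covering only the lines visible in the hunk. `ToolSpec` and `buildToolMap` are hypothetical names introduced for illustration and are not part of the repository, and the tool names in the usage line are only sample inputs.

```typescript
// Hypothetical, simplified stand-in for the `tools` field of AgentDefinition.
interface ToolSpec {
	include?: string[]
	exclude?: string[]
}

// Hypothetical helper mirroring the loop body shown in createAgentConfigs:
// listed `include` tools are set to true first, then `exclude` tools are set
// to false, so an entry present in both lists resolves to false.
function buildToolMap(spec?: ToolSpec): Record<string, boolean> {
	const tools: Record<string, boolean> = {}
	if (spec?.include) {
		for (const tool of spec.include) {
			tools[tool] = true
		}
	}
	if (spec?.exclude) {
		for (const tool of spec.exclude) {
			tools[tool] = false
		}
	}
	return tools
}

// Sample input (assumed, not taken from any agent definition in the repo).
const toolMap = buildToolMap({
	include: ['graph-query', 'code-stats'],
	exclude: ['code-stats', 'plan-execute'],
})
console.log(toolMap)
```

With this ordering an agent cannot re-enable an excluded tool by also listing it under `include`. Whether tools missing from the map are treated as enabled or disabled is decided by the consumer of the map and is not shown in this diff.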
diff --git a/src/agents/caveman.ts b/src/agents/caveman.ts
@@ -45,4 +45,17 @@ Resume caveman after clear part done.
 
 Code blocks / commits / file content: write normal (no caveman in code).
 Level persist until changed or session end.
+
+# Token Efficiency Rules
+
+Reduce token consumption on every turn. These rules apply to ALL agents and subagents:
+
+1. **Structured queries before reads**: Use graph tools, LSP, ast-search, code-stats BEFORE reading files. They return concise metadata — one graph-symbols call replaces reading 5 files.
+2. **Narrow then read**: Never read a whole file when graph-query or LSP can pinpoint the exact lines needed. Use graph-symbols \`find\`/\`callers\`/\`callees\` to locate, then Read only the relevant range.
+3. **Batch tool calls**: Call multiple independent tools in a single response. Never serialize calls that could run in parallel.
+4. **Delegate to save context**: Sub-agents have separate context windows. Every token a sub-agent spends is a token you don't spend. Delegate research to explore/librarian/oracle instead of reading many files yourself.
+5. **code-stats over manual counting**: Never pipe \`find | wc -l\` or read files to count lines — use \`code-stats\`.
+6. **LSP over grep for symbols**: \`lsp-definition\`/\`lsp-references\`/\`lsp-hover\` are precise and return only what's needed. Grep returns noisy partial matches.
+7. **ast-search over grep for patterns**: Structural pattern matching avoids false positives and returns fewer, more relevant results.
+8. **Concise output**: Report only what was asked. No recap of unchanged context. No restating the question.
 `
diff --git a/src/agents/explore.ts b/src/agents/explore.ts
@@ -7,7 +7,7 @@ export const exploreAgent: AgentDefinition = {
 	displayName: 'explore',
 	description: 'Open-ended exploration agent — optimised for parallel codebase discovery and research.',
 	mode: 'subagent',
-	hidden: true,
+	hidden: false,
 	temperature: 0.2,
 	toolSupported: true,
 	tools: {
diff --git a/src/agents/forge.ts b/src/agents/forge.ts
@@ -106,10 +106,10 @@ You have access to three graph tools: graph-query, graph-symbols, and graph-anal
 
 ## Agent delegation
 
-**Delegation is the default for non-trivial research, exploration, and review.** Sub-agents have their own context window — every token they spend is a token you do NOT spend, and you can run several in parallel. Treat sub-agents as your primary discovery and review mechanism, not a fallback. Use inline \`Task\` for quick lookups, \`agent_<name>\` tools for direct agent calls, or \`bg_spawn\` for longer-running parallel work.
+**Delegation is the default for non-trivial research, exploration, and review.** Sub-agents have their own context window — every token they spend is a token you do NOT spend, and you can run several in parallel. Treat native OpenCode Task/subtask child sessions as your primary discovery and review mechanism, not a fallback. Prefer the built-in \`Task\` flow for subagent launches so they are visible in OpenCode and their output can be explored via child-session navigation. Use \`agent_<name>\` and \`bg_*\` only as compatibility fallbacks when the native Task/subtask flow is unavailable or you explicitly need legacy session/task IDs.
 
 ### Delegation resilience
-If a delegation tool (Task, agent_*, bg_spawn) returns an error or is unavailable, **do the work inline silently** — use graph tools, Read, Grep, and your own analysis. NEVER tell the user "agents unavailable" or "running inline analysis" — that is an implementation detail. Just do the work and present results.
+If native Task/subtask invocation returns an error or is unavailable, first try the compatibility \`agent_*\` / \`bg_*\` path if it clearly fits; otherwise **do the work inline silently** — use graph tools, Read, Grep, and your own analysis. NEVER tell the user "agents unavailable" or "running inline analysis" — that is an implementation detail. Just do the work and present results.
 
 ### When to delegate (default to YES)
 
@@ -131,18 +131,13 @@ Skip delegation when:
 - The user explicitly asked you to do it directly.
 - The result depends on context only you have (in-progress edits, pending tool output).
 
-### Background delegation
-- Use \`bg_spawn\` to run a sub-agent in a separate background session.
-- Use \`bg_status\` to check progress. Use \`bg_wait\` for critical-path tasks.
-- Use \`bg_continue\` to send follow-up prompts to a running/completed background task — full context is preserved.
-- Use \`bg_cancel\` to stop tasks that are no longer needed.
-
-### Conversational sub-agents
-Agent tools (\`agent_explore\`, \`agent_librarian\`, \`agent_sage\`, \`agent_oracle\`, \`agent_prometheus\`, \`agent_metis\`) support multi-turn conversations:
-- **First call**: Omit \`session_id\` — creates a new session. Response includes a \`session_id\`.
-- **Follow-up calls**: Provide \`session_id\` from the previous response — continues the conversation with full context.
-- Use this when the first answer is insufficient and you need the sub-agent to dig deeper, clarify, or expand.
-- Works for both sync and background modes.
+### Native subagent workflow
+- Prefer the built-in \`Task\` tool to invoke subagents by name.
+- Native Task/subtask runs create child sessions visible in OpenCode.
+- To inspect subagent output, navigate with \`session_child_first\`, \`session_child_cycle\`, \`session_child_cycle_reverse\`, and \`session_parent\`.
+- Use \`@<agent>\` when a direct user-visible subagent call is appropriate.
+- Use compatibility \`agent_*\` wrappers only when you explicitly need \`session_id\`-based continuation or the native Task/subtask path is unavailable.
+- Use \`bg_*\` only for detached compatibility/background flows when fire-and-forget execution matters more than native OpenCode child-session visibility.
 
 | Task Type | Delegate To | Notes |
 |-----------|-------------|-------|
@@ -154,11 +149,11 @@ Agent tools (\`agent_explore\`, \`agent_librarian\`, \`agent_sage\`, \`agent_ora
 | Analyse agent routing | metis | Recommends which agent to use |
 
 ### Delegation guidelines
-- **Fan out early**: Spawn up to 3 explore/librarian agents in parallel at the start of any non-trivial task — one per independent sub-question. Do not serialize research that could run concurrently.
+- **Fan out early**: Spawn up to 3 explore/librarian agents in parallel using native \`Task\` calls at the start of any non-trivial task — one per independent sub-question. Do not serialize research that could run concurrently.
 - **Brief them well**: Each delegated prompt must include the concrete question, the files/symbols already known, and the exact format you need back (a list, a snippet, a yes/no). Vague briefs waste sub-agent context.
-- **Wait, then act**: \`bg_wait\` for research that is on the critical path before editing. Poll others with \`bg_status\`.
+- **Wait, then act**: Prefer child-session navigation to inspect native subagent output on the critical path. Use \`bg_wait\` / \`bg_status\` only for detached compatibility/background runs.
 - **Review before done**: For any change spanning multiple files or touching shared code, spawn a sage review of the diff before reporting completion.
-- **Choose the right size**: Inline \`Task\` for a single quick lookup; \`bg_spawn\` for anything that would otherwise eat >100 lines of your own context or run >30s.
+- **Choose the right size**: Inline \`Task\` for a single quick lookup; native \`Task\`/subtask child sessions for parallel research and review; \`bg_spawn\` only for detached compatibility/background work.
 
 # Code references
 When referencing code, use the pattern \`file_path:line_number\` for easy navigation.
diff --git a/src/agents/librarian.ts b/src/agents/librarian.ts
@@ -8,7 +8,7 @@ export const librarianAgent: AgentDefinition = {
 	description:
 		'Research-only agent — finds information in the codebase using read-only tools. Returns structured findings.',
 	mode: 'subagent',
-	hidden: true,
+	hidden: false,
 	temperature: 0.0,
 	toolSupported: true,
 	tools: {
diff --git a/src/agents/metis.ts b/src/agents/metis.ts
@@ -7,7 +7,7 @@ export const metisAgent: AgentDefinition = {
 	displayName: 'metis',
 	description: 'Meta-agent — analyses the current session context and recommends which agent to use next.',
 	mode: 'subagent',
-	hidden: true,
+	hidden: false,
 	temperature: 0.1,
 	toolSupported: true,
 	tools: {
diff --git a/src/agents/muse.ts b/src/agents/muse.ts
@@ -64,10 +64,10 @@ You have access to four graph tools: graph-status, graph-query, graph-symbols, a
 
 ## Agent delegation
 
-**Delegation is the default for research, not a fallback.** As a planning agent your output quality depends on broad, accurate research. Sub-agents have their own context windows, can run in parallel, and use the same graph-first discovery you do — every token they spend on research is a token you keep for synthesizing the plan. Treat \`Task\`/\`agent_<name>\`/\`bg_spawn\` as your primary research mechanism.
+**Delegation is the default for research, not a fallback.** As a planning agent your output quality depends on broad, accurate research. Sub-agents have their own context windows, can run in parallel, and use the same graph-first discovery you do — every token they spend on research is a token you keep for synthesizing the plan. Treat native OpenCode \`Task\`/subtask child sessions as your primary research mechanism so subagent runs stay visible and explorable in OpenCode. Use \`agent_<name>\` and \`bg_*\` only as compatibility fallbacks when the native Task/subtask flow is unavailable or you explicitly need legacy session/task IDs.
 
 ### Delegation resilience
-If a delegation tool (Task, agent_*, bg_spawn) returns an error or is unavailable, **do the work inline silently** — use graph tools, Read, Grep, and your own analysis. NEVER tell the user "agents unavailable" or "running inline analysis" — that is an implementation detail. Just do the work and present results.
+If native Task/subtask invocation returns an error or is unavailable, first try the compatibility \`agent_*\` / \`bg_*\` path if it clearly fits; otherwise **do the work inline silently** — use graph tools, Read, Grep, and your own analysis. NEVER tell the user "agents unavailable" or "running inline analysis" — that is an implementation detail. Just do the work and present results.
 
 ### When to delegate (default to YES)
 
@@ -87,17 +87,13 @@ Skip delegation when:
 - The task is a small, well-scoped tweak to an area you already understand.
 - The result depends on synthesis only you can do (writing the plan itself, weighing tradeoffs, making the final recommendation).
 
-### Background delegation
-- Use \`bg_spawn\` to run a sub-agent in a separate background session.
-- Use \`bg_status\` to check progress. Use \`bg_wait\` for critical-path research.
-- Use \`bg_continue\` to send follow-up prompts to a running/completed background task — full context is preserved.
-- Use \`bg_cancel\` to stop tasks that are no longer needed.
-
-### Conversational sub-agents
-Agent tools (\`agent_explore\`, \`agent_librarian\`, \`agent_sage\`, \`agent_oracle\`, \`agent_prometheus\`, \`agent_metis\`) support multi-turn conversations:
-- **First call**: Omit \`session_id\` — creates a new session. Response includes a \`session_id\`.
-- **Follow-up calls**: Provide \`session_id\` from the previous response — continues the conversation with full context.
-- Use this when initial research is insufficient and you need the sub-agent to dig deeper or expand.
+### Native subagent workflow
+- Prefer the built-in \`Task\` tool to invoke subagents by name.
+- Native Task/subtask runs create child sessions visible in OpenCode.
+- To inspect subagent output, navigate with \`session_child_first\`, \`session_child_cycle\`, \`session_child_cycle_reverse\`, and \`session_parent\`.
+- Use \`@<agent>\` when a direct user-visible subagent call is appropriate.
+- Use compatibility \`agent_*\` wrappers only when you explicitly need \`session_id\`-based continuation or the native Task/subtask path is unavailable.
+- Use \`bg_*\` only for detached compatibility/background flows when fire-and-forget execution matters more than native OpenCode child-session visibility.
 
 | Task Type | Delegate To | Notes |
 |-----------|-------------|-------|
@@ -108,11 +104,11 @@ Agent tools (\`agent_explore\`, \`agent_librarian\`, \`agent_sage\`, \`agent_ora
 | Analyse agent routing | metis | Recommends which agent to use |
 
 ### Delegation guidelines
-- **Fan out early**: At the start of research, spawn up to 3 explore/librarian agents in parallel — one per independent sub-question. Do not serialize what could run concurrently.
+- **Fan out early**: At the start of research, spawn up to 3 explore/librarian agents in parallel using native \`Task\` calls — one per independent sub-question. Do not serialize what could run concurrently.
 - **Brief them well**: Each delegated prompt must include the concrete question, the files/symbols already known, the conventions you care about, and the exact format you need back (a list of files, a summary table, a yes/no with rationale). Vague briefs waste sub-agent context and produce useless results.
-- **Wait, then design**: \`bg_wait\` for research on the critical path before writing the plan. Poll the rest with \`bg_status\`.
+- **Wait, then design**: Prefer child-session navigation to inspect native subagent output on the critical path before writing the plan. Use \`bg_wait\` / \`bg_status\` only for detached compatibility/background runs.
 - **Validate assumptions**: When the design hinges on a specific behavior or convention, spawn a focused librarian/oracle call to confirm before locking it into the plan.
-- **Choose the right size**: Inline \`Task\` for a single quick lookup; \`bg_spawn\` for anything that would otherwise eat >100 lines of your own context or run >30s.
+- **Choose the right size**: Inline \`Task\` for a single quick lookup; native \`Task\`/subtask child sessions for parallel research and review; \`bg_spawn\` only for detached compatibility/background work.
 
 # Following conventions
 When planning changes, first understand the existing code conventions:
diff --git a/src/agents/oracle.ts b/src/agents/oracle.ts
@@ -7,7 +7,7 @@ export const oracleAgent: AgentDefinition = {
 	displayName: 'oracle',
 	description: 'Q&A agent — answers specific questions about the codebase with short, precise responses.',
 	mode: 'subagent',
-	hidden: true,
+	hidden: false,
 	temperature: 0.0,
 	toolSupported: true,
 	tools: {
diff --git a/src/agents/prometheus.ts b/src/agents/prometheus.ts
@@ -7,7 +7,7 @@ export const prometheusAgent: AgentDefinition = {
 	displayName: 'prometheus',
 	description: 'Generator agent — creates code scaffolding, boilerplate, migrations, and templates.',
 	mode: 'subagent',
-	hidden: true,
+	hidden: false,
 	temperature: 0.3,
 	toolSupported: true,
 	tools: {
diff --git a/src/agents/sage.ts b/src/agents/sage.ts
@@ -43,11 +43,12 @@ Your role in research mode is to investigate the codebase systematically and pro
 1. **Scope Understanding**: Start with a clear understanding of the research question.
 2. **Named-symbol lookup (LSP-first)**: When the question is about a specific named symbol in a supported language (TS/JS/Python/Rust/Go), prefer \`lsp-definition\`, \`lsp-references\`, and \`lsp-hover\` over regex grep.
 3. **High-Level Analysis**: Begin with project structure and architecture overview using graph tools (\`graph-query\` with \`top_files\`, \`packages\`).
-4. **Targeted Investigation**: Drill down into specific areas based on the research question using \`graph-symbols\` and \`graph-query\`.
-5. **Cross-Reference**: Examine relationships and dependencies across components (\`file_deps\`, \`file_dependents\`, \`callers\`, \`callees\`, \`cochanges\`).
-6. **Pattern Recognition**: Identify recurring patterns and design decisions; use \`ast-search\` for structural patterns text-grep cannot express.
-7. **Insight Synthesis**: Provide context and explanations for discovered patterns.
-8. **Actionable Recommendations**: Offer insights for better understanding or follow-up investigation.
+4. **Project overview**: Use \`code-stats\` for language/LOC summaries when project scale or composition matters — avoids reading many files.
+5. **Targeted Investigation**: Drill down into specific areas based on the research question using \`graph-symbols\` and \`graph-query\`.
+6. **Cross-Reference**: Examine relationships and dependencies across components (\`file_deps\`, \`file_dependents\`, \`callers\`, \`callees\`, \`cochanges\`).
+7. **Pattern Recognition**: Identify recurring patterns and design decisions; use \`ast-search\` for structural patterns text-grep cannot express.
+8. **Insight Synthesis**: Provide context and explanations for discovered patterns.
+9. **Actionable Recommendations**: Offer insights for better understanding or follow-up investigation.
 
 ### Research Response Structure
 
diff --git a/src/config.ts b/src/config.ts
@@ -367,6 +367,12 @@ function createAgentConfigs(agents: Record<AgentRole, AgentDefinition>): Record<
 
 	for (const agent of Object.values(agents)) {
 		const tools: Record<string, boolean> = {}
+		if (agent.tools?.include) {
+			// Whitelist mode: explicitly enable only listed tools
+			for (const tool of agent.tools.include) {
+				tools[tool] = true
+			}
+		}
 		if (agent.tools?.exclude) {
 			for (const tool of agent.tools.exclude) {
 				tools[tool] = false
diff --git a/src/runtime/agent-as-tool.ts b/src/runtime/agent-as-tool.ts
@@ -73,7 +73,9 @@ function createAgentTool(
 ): ReturnType<typeof tool> {
 	return tool({
 		description:
-			`Invoke the ${def.displayName} agent: ${def.description}\n\n` +
+			`Compatibility wrapper for the native ${def.displayName} subagent: ${def.description}\n\n` +
+			'Prefer the built-in Task/subtask flow or @mentions when you want OpenCode-native child-session visibility. ' +
+			'Use this wrapper only when you explicitly need session_id-based continuation or a legacy fallback path.\n\n' +
 			'Supports multi-turn conversations: omit session_id for a new session, ' +
 			'or provide a session_id from a previous call to continue the conversation ' +
 			'with full context preserved.',
diff --git a/src/tools/background.ts b/src/tools/background.ts
@@ -137,7 +137,8 @@ export function createBackgroundTools(
 		return {
 			bg_spawn: tool({
 				description:
-					'Spawn a background agent task using the lightweight session-backed runtime. ' +
+					'Compatibility background agent launcher using the lightweight session-backed runtime. ' +
+					'Prefer native Task/subtask child sessions when you want OpenCode-visible subagent runs and browsable output. ' +
 					'Returns a task_id and session_id that can be monitored with bg_status / bg_wait.',
 				args: {
 					agent: z
@@ -336,7 +337,7 @@ export function createBackgroundTools(
 	return {
 		bg_spawn: tool({
 			description:
-				'Spawn a background agent task. The agent runs in a separate session and can be monitored, continued, or cancelled.\n\n' +
+				'Compatibility background agent launcher. Prefer native Task/subtask child sessions when you want OpenCode-visible subagent runs and browsable output. The agent runs in a separate session and can be monitored, continued, or cancelled.\n\n' +
 				'Returns a task_id and session_id. Use session_id with bg_continue to send follow-up messages.',
 			args: {
 				agent: z
diff --git a/test/agents.test.ts b/test/agents.test.ts
@@ -27,6 +27,14 @@ describe('Agent definitions', () => {
 			expect(sageAgent.temperature).toBe(0.0)
 		})
 
+		test('plugin subagents are visible for native OpenCode invocation', async () => {
+			const { agents } = await import('../src/agents')
+			for (const name of ['explore', 'librarian', 'oracle', 'prometheus', 'metis'] as const) {
+				expect(agents[name].mode).toBe('subagent')
+				expect(agents[name].hidden).not.toBe(true)
+			}
+		})
+
 		test('sage agent has expected tool exclusions', () => {
 			expect(sageAgent.tools?.exclude).toBeDefined()
 			expect(sageAgent.tools?.exclude).toContain('plan-execute')