The `plan_and_code` MCP tool lets Claude Code trigger StackMemory’s multi‑agent flow silently and receive a single JSON result. It plans with Claude, implements with Codex or Claude, and critiques the result, with optional retry loops and context recording.
- Planner (Claude): generates a concise plan with acceptance criteria and risks.
- Implementer (Codex/Claude): applies a focused change per step.
- Critic (Claude): returns `{ approved, issues[], suggestions[] }` to gate retries.
- Verification commands: optional task-specific repro/test commands, run after each implementation attempt and included in the critic input.
- Returns a single JSON payload: `{ plan, implementation, critique, iterations[] }`.
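The implement, verify, critique cycle described above can be sketched as a bounded loop. This is a minimal illustration, not StackMemory's actual orchestrator: the `Agents` interface, `runLoop`, and the synchronous signatures are hypothetical stand-ins, and treating `ok` as "implementation succeeded and verification passed" is an assumption.

```typescript
// Critique shape as documented: { approved, issues[], suggestions[] }.
interface Critique {
  approved: boolean;
  issues: string[];
  suggestions: string[];
}

// One entry of iterations[] (outputPreview omitted for brevity).
interface Iteration {
  command: string;
  ok: boolean;
  critique: Critique;
}

// Hypothetical stand-ins for the implementer, verification, and critic steps.
// Synchronous for brevity; real agent calls would be asynchronous.
interface Agents {
  implement(task: string, feedback: string[]): { command: string; ok: boolean };
  verify(): boolean;
  critique(implOk: boolean, verified: boolean): Critique;
}

function runLoop(task: string, maxIters: number, agents: Agents): Iteration[] {
  const iterations: Iteration[] = [];
  let feedback: string[] = [];
  for (let i = 0; i < maxIters; i++) {
    const impl = agents.implement(task, feedback);
    const verified = agents.verify(); // runs after each implementation attempt
    const critique = agents.critique(impl.ok, verified);
    iterations.push({ command: impl.command, ok: impl.ok && verified, critique });
    if (critique.approved) break; // the critic gates further retries
    feedback = critique.issues; // assumption: issues feed the next attempt
  }
  return iterations;
}
```

The loop stops at the first approved critique or after `maxIters` attempts, whichever comes first.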
- name: `plan_and_code`
- arguments:
  - `task` (string, required): short task description
  - `implementer` ("codex" | "claude", default: `codex`)
  - `maxIters` (number, default: `2`): retry loop iterations
  - `execute` (boolean, default: `false`): if `false`, the implementer runs as a dry‑run
  - `verificationCommands` (string[], optional): repro/test commands that must pass after each implementation attempt
  - `record` (boolean, default: `false`): write plan/critique as simple context rows
  - `recordFrame` (boolean, default: `false`): write a real frame + anchors
If not specified in arguments, the MCP handler reads these env vars:
- `STACKMEMORY_MM_PLANNER_MODEL` (e.g., `claude-sonnet-4-20250514`)
- `STACKMEMORY_MM_REVIEWER_MODEL` (defaults to planner model if unset)
- `STACKMEMORY_MM_IMPLEMENTER` (`codex` or `claude`)
- `STACKMEMORY_MM_MAX_ITERS` (e.g., `3`)
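The precedence described above (explicit arguments first, then environment variables, then built-in defaults) might look roughly like the following. This is a sketch under assumptions, not the handler's real code: `resolveOptions` is a hypothetical helper, the `reviewerModel` argument name is assumed, and invalid env values are assumed to fall back to the documented defaults.

```typescript
type Implementer = "codex" | "claude";
type Env = Record<string, string | undefined>;

interface ToolArgs {
  implementer?: Implementer;
  maxIters?: number;
  plannerModel?: string;
  reviewerModel?: string; // assumed argument name
}

interface ResolvedOptions {
  implementer: Implementer;
  maxIters: number;
  plannerModel: string | undefined;
  reviewerModel: string | undefined;
}

// Hypothetical helper: arguments win, then env vars, then documented defaults.
function resolveOptions(args: ToolArgs, env: Env): ResolvedOptions {
  const envImpl = env["STACKMEMORY_MM_IMPLEMENTER"];
  const envIters = Number(env["STACKMEMORY_MM_MAX_ITERS"]);
  const plannerModel = args.plannerModel ?? env["STACKMEMORY_MM_PLANNER_MODEL"];
  return {
    implementer:
      args.implementer ??
      (envImpl === "claude" || envImpl === "codex" ? envImpl : "codex"),
    maxIters:
      args.maxIters ?? (Number.isInteger(envIters) && envIters > 0 ? envIters : 2),
    plannerModel,
    // The reviewer model defaults to the planner model if unset.
    reviewerModel:
      args.reviewerModel ?? env["STACKMEMORY_MM_REVIEWER_MODEL"] ?? plannerModel,
  };
}
```

Passing the environment in as a parameter (for example `resolveOptions(args, process.env)` in Node) keeps the helper easy to test.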
{
"method": "tools/call",
"params": {
"name": "plan_and_code",
"arguments": {
"task": "Refactor config loader into provider pattern",
"implementer": "codex",
"maxIters": 2,
"execute": true,
"verificationCommands": [
"npx vitest run src/orchestrators/multimodal/__tests__/determinism.test.ts --reporter=dot"
],
"recordFrame": true
}
}
}

Response content is a single text item containing a JSON string:
{
"ok": true,
"result": {
"plan": { "summary": "...", "steps": [ ... ], "risks": [ ... ] },
"implementation": { "success": true, "summary": "...", "commands": [ ... ] },
"critique": { "approved": true, "issues": [], "suggestions": [] },
"iterations": [
{ "command": "...", "ok": true, "outputPreview": "...", "critique": { ... } }
]
}
}

- `record: true` writes two entries into `.stackmemory/context.db` (simple `contexts` table):
  - `Plan: <summary>` (importance 0.8)
  - `Critique: approved|needs_changes` (importance 0.6)
- `recordFrame: true` writes a real frame + anchors using the FrameManager:
  - Frame: `Plan & Code: <task>`
  - Anchors: `DECISION` (plan summary), `FACT` (commands), `RISK` (first few issues), `TODO` (first few suggestions)
  - Closes the frame with `{ approved: true|false }`
- Both modes are best‑effort. If the DB isn’t ready, the handler still returns the JSON result instead of failing.
- Implementer `codex` calls `codex-sm` (must be on PATH). Use `--execute` in the CLI, or `execute: true` in MCP, to actually run it; otherwise it’s a dry‑run.
- Audit files are saved to `.stackmemory/build/spike-<timestamp>.json` to support review/debugging.
- You can compare models:
  - Planner/critic: override with `STACKMEMORY_MM_PLANNER_MODEL` / `STACKMEMORY_MM_REVIEWER_MODEL`.
  - Implementer: set to `claude` to A/B against Codex, or keep `codex` (default).
- Quiet JSON output:
  - `stackmemory build "Refactor config loader" --json`
  - `stackmemory skills spike --task "Refactor config loader" --json`
- Execute implementer and record as frame:
  - `stackmemory skills spike --task "Refactor" --execute --max-iters 3 --json --record-frame`
- Execute with a task-specific verification harness:
  - `stackmemory build "Fix deterministic replay drift" --verify "npm run determinism:test" --execute`
Use this two‑phase flow when you want the plan reviewed before any code runs.
Request (tools/call):
{
"method": "tools/call",
"params": {
"name": "plan_gate",
"arguments": {
"task": "Refactor config loader into provider pattern",
"plannerModel": "claude-sonnet-4-20250514"
}
}
}

Response (`content[0].text` is a JSON string):
{
"ok": true,
"approvalId": "appr_1738612345678_ab12cd",
"plan": { "summary": "...", "steps": [ ... ], "risks": [ ... ] }
}

Render the plan for review; store `approvalId` for Phase 2.
Request (tools/call):
{
"method": "tools/call",
"params": {
"name": "approve_plan",
"arguments": {
"approvalId": "appr_1738612345678_ab12cd",
"implementer": "codex",
"maxIters": 2,
"execute": true,
"recordFrame": true
}
}
}

Response (`content[0].text` is a JSON string):
{
"ok": true,
"approvalId": "appr_1738612345678_ab12cd",
"result": {
"plan": { ... },
"implementation": { "success": true, "commands": [ ... ] },
"critique": { "approved": true, "issues": [], "suggestions": [] },
"iterations": [ { "command": "...", "ok": true, "critique": { ... } } ]
}
}

Notes:
- `recordFrame: true` creates a real StackMemory frame + anchors (plan summary, commands, issues, suggestions).
- `execute: true` actually invokes the implementer; otherwise it’s a dry‑run.
- Approval IDs are persisted to `.stackmemory/build/pending.json` so editor restarts don’t lose pending approvals.
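On the client side, moving from Phase 1 to Phase 2 amounts to parsing the `plan_gate` response text and carrying its `approvalId` into an `approve_plan` call. The sketch below shows one way to do that; `buildApproveRequest` and the `ToolResponse` type are hypothetical names introduced for illustration, and the transport that actually sends the request is left out.

```typescript
// Minimal shape of a tool response: a list of content items.
interface ToolResponse {
  content: { type: string; text?: string }[];
}

// Phase 2 options, mirroring the documented approve_plan arguments.
interface ApproveOptions {
  implementer?: "codex" | "claude";
  maxIters?: number;
  execute?: boolean;
  recordFrame?: boolean;
}

// Hypothetical helper: turn a plan_gate response into an approve_plan request.
function buildApproveRequest(planGateResponse: ToolResponse, options: ApproveOptions = {}) {
  const text = planGateResponse.content[0]?.text;
  if (typeof text !== "string") {
    throw new Error("plan_gate response has no text item");
  }
  // content[0].text is itself a JSON string, so it needs a second parse.
  const payload = JSON.parse(text) as { ok?: boolean; approvalId?: string };
  if (payload.ok !== true || typeof payload.approvalId !== "string") {
    throw new Error("plan_gate did not return an approvalId");
  }
  return {
    method: "tools/call",
    params: {
      name: "approve_plan",
      arguments: { approvalId: payload.approvalId, ...options },
    },
  };
}
```

A caller would typically show `plan` to the user between the two steps and only build the Phase 2 request once the plan is accepted.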
- `plan_only`: Returns a plan JSON without running code.
- `call_claude`: Calls Claude directly (prompt/model/system).
- `call_codex`: Calls Codex via `codex-sm` (prompt/args/execute).
- `pending_list`: Lists pending approval-gated plans with `approvalId`, `task`, and `createdAt`. Supports optional filters:
  - `{ taskContains: "refactor", sort: "desc", limit: 10 }`
  - `{ olderThanMs: 3600000 }` (older than 1 hour)
  - `{ newerThanMs: 600000 }` (newer than 10 minutes)
- `pending_clear`: Clears pending approvals. Args: `{ approvalId }`, or `{ all: true }`, or `{ olderThanMs: <ms> }`.
- `pending_show`: Returns a stored pending plan by `{ approvalId }`.
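To make the `pending_list` filters concrete, here is a rough sketch of how they could be applied to a list of pending plans. It is not the tool's implementation: `filterPending` is a hypothetical function, `createdAt` is assumed to be epoch milliseconds, sorting is assumed to be by `createdAt`, and the `taskContains` match is assumed to be case-insensitive.

```typescript
interface PendingPlan {
  approvalId: string;
  task: string;
  createdAt: number; // assumption: epoch milliseconds
}

interface PendingFilter {
  taskContains?: string;
  olderThanMs?: number;
  newerThanMs?: number;
  sort?: "asc" | "desc";
  limit?: number;
}

// Hypothetical filter: `now` is passed in so results are deterministic.
function filterPending(plans: PendingPlan[], filter: PendingFilter, now: number): PendingPlan[] {
  const needle = filter.taskContains?.toLowerCase();
  const matched = plans.filter((p) => {
    const age = now - p.createdAt;
    if (needle !== undefined && !p.task.toLowerCase().includes(needle)) return false;
    if (filter.olderThanMs !== undefined && age <= filter.olderThanMs) return false;
    if (filter.newerThanMs !== undefined && age >= filter.newerThanMs) return false;
    return true;
  });
  if (filter.sort) {
    const dir = filter.sort === "desc" ? -1 : 1;
    matched.sort((a, b) => dir * (a.createdAt - b.createdAt)); // desc = newest first
  }
  return filter.limit !== undefined ? matched.slice(0, filter.limit) : matched;
}
```

With this reading, `{ olderThanMs: 3600000 }` keeps plans created more than an hour before `now`, and `{ newerThanMs: 600000 }` keeps plans created within the last ten minutes.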