Generated: 2026-04-03
Source: Deep analysis of claw-code, oh-my-codex (OMX), oh-my-claudecode (OMC), clawhip, memsearch, bb25
Status legend: [ ] pending, [~] in progress, [x] done, [!] blocked
These address the most critical architectural gaps in the runtime engine. Inspired by: claw-code hook DAG, OMC 11 lifecycle events, clawhip event pipeline.
Gap:
`engine.ts` and `engine-streaming.ts` have only 2 audit log points (FLOW_EXECUTION_START, FLOW_EXECUTION_END). No per-node or per-tool hooks exist. OMC has 11 lifecycle hooks; claw-code has a full DAG-based hook pipeline.
Current state (verified in code):
- `engine.ts` lines 140-146: `FLOW_EXECUTION_START` audit log
- `engine.ts` lines 244-254: `FLOW_EXECUTION_END` audit log
- No events emitted between nodes, before/after tool calls, or on errors
Tasks:
- A1.1 — Define `FlowHookEvent` type with 8 events (added `onPreCompact`): `onFlowStart`, `onFlowComplete`, `onFlowError`, `beforeNodeExecute`, `afterNodeExecute`, `beforeToolCall`, `afterToolCall`, `onPreCompact`
- A1.2 — Created `src/lib/runtime/hooks.ts`:
  - `FlowHookRegistry` class with `addSink(sink)` and `emit(payload)`
  - `WebhookHookSink` for fire-and-forget webhook delivery (5s timeout)
  - `createHooksFromFlowContent()` factory, `emitHook()` convenience wrapper
  - An error in any sink never crashes the flow (try/catch with logger.warn)
- A1.3 — Integrated hooks into `engine.ts` (5 points + auto-initialization)
- A1.4 — Integrated hooks into `engine-streaming.ts` (5 points in all 3 codepaths)
- A1.5 — Added `experimental_onToolCallStart`/`experimental_onToolCallFinish` to both AI handlers
- A1.6 — `WebhookHookSink` implemented (fire-and-forget POST, 5s AbortSignal timeout)
- A1.7 — Hook config in `FlowContent` JSON: `hookWebhookUrls?: string[]`, `hookEvents?: string[]`; Zod validation: max 10 URLs, enum of valid event types; no DB migration needed
- A1.8 — 17 unit tests: FlowHookRegistry, WebhookHookSink, factory, emitHook — all pass
- A1.9 — Integration test: flow with hooks → webhook receives events (deferred)
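The A1.2 registry can be sketched as follows. This is a minimal illustration, not the shipped `hooks.ts` surface; the payload shape is an assumption.

```typescript
// Hypothetical sketch of the FlowHookRegistry described in A1.2.
type FlowHookEvent =
  | "onFlowStart" | "onFlowComplete" | "onFlowError"
  | "beforeNodeExecute" | "afterNodeExecute"
  | "beforeToolCall" | "afterToolCall" | "onPreCompact";

interface HookPayload { event: FlowHookEvent; flowId: string; data?: unknown }

interface HookSink { deliver(payload: HookPayload): Promise<void> }

class FlowHookRegistry {
  private sinks: HookSink[] = [];

  addSink(sink: HookSink): void {
    this.sinks.push(sink);
  }

  // Fire-and-forget: a failing sink logs a warning but never crashes the flow.
  async emit(payload: HookPayload): Promise<void> {
    await Promise.all(
      this.sinks.map(async (sink) => {
        try {
          await sink.deliver(payload);
        } catch (err) {
          console.warn(`hook sink failed for ${payload.event}`, err);
        }
      }),
    );
  }
}
```

Sinks run in parallel, so one slow or throwing webhook sink cannot raise into the engine loop.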
Files to modify:
- `src/lib/runtime/hooks.ts` (NEW)
- `src/lib/runtime/engine.ts`
- `src/lib/runtime/engine-streaming.ts`
- `src/lib/runtime/handlers/ai-response-handler.ts`
- `src/lib/runtime/handlers/ai-response-streaming-handler.ts`
- `src/lib/runtime/types.ts` (add FlowHookEvent type)
Estimated effort: Medium (3-5 days)
Impact: Foundation for all monitoring, debugging, and external integrations
Gap:
`engine.ts` lines 134-137 do `messageHistory.slice(-MAX_HISTORY)`. This is a brutal truncation that permanently loses context. OpenClaw triggers a "silent agentic turn" before compaction: it asks the AI to save critical information to AgentMemory BEFORE truncating. OMC has a `pre-compact` hook that fires before context window compression.
Current state (verified in code):
- `engine.ts` line 134: `context.messageHistory = context.messageHistory.slice(-MAX_HISTORY)`
- `engine-streaming.ts` lines 25-27: same MAX_HISTORY=100
- No pre-truncation logic whatsoever — context is lost permanently
Tasks:
- A2.1 — Create `src/lib/runtime/context-compaction.ts` with a `compactContext(context, model)` function:
  - Step 1: Generate summary prompt: "Summarize the key facts, decisions, and state from this conversation that should be preserved"
  - Step 2: Call AI with the summarization prompt (use a cheap model: deepseek-chat or haiku)
  - Step 3: Write the summary to AgentMemory via the `memory_write` handler (key: `__context_summary_{timestamp}`, category: `context_compaction`)
  - Step 4: Optionally write important variable states to memory
  - Step 5: THEN truncate messageHistory to MAX_HISTORY
  - Step 6: Prepend a system message with the summary to the truncated history
- A2.2 — Add compaction threshold to engine.ts: `COMPACTION_THRESHOLD = 80` (trigger compaction at 80% of MAX_HISTORY)
  - At line ~134: if `messageHistory.length > COMPACTION_THRESHOLD`, call `compactContext()` before the slice
- A2.3 — Same integration in engine-streaming.ts
- A2.4 — Emit `onPreCompact` hook event before compaction runs (added to `compactContext()`)
- A2.5 — Add agent-level config: `enableSmartCompaction: boolean` (default: true for new agents)
- A2.6 — Write unit tests:
  - Test: compaction saves summary to AgentMemory
  - Test: truncated history includes prepended summary
  - Test: compaction threshold triggers correctly at 80%
  - Test: compaction works even if the AI call fails (graceful fallback to raw truncation)
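The six steps above can be sketched as a single function. `summarize` and `saveMemory` are hypothetical stand-ins for the real AI call and the `memory_write` handler.

```typescript
// Sketch of the A2.1 compaction flow (assumed shapes, not the shipped API).
interface Message { role: string; content: string }

const MAX_HISTORY = 100;

async function compactContext(
  history: Message[],
  summarize: (messages: Message[]) => Promise<string>,
  saveMemory: (key: string, value: string) => Promise<void>,
): Promise<Message[]> {
  let summary: string | null = null;
  try {
    // Steps 1-3: summarize the portion about to be dropped, persist it FIRST.
    const dropped = history.slice(0, Math.max(0, history.length - MAX_HISTORY));
    summary = await summarize(dropped);
    await saveMemory(`__context_summary_${Date.now()}`, summary);
  } catch {
    summary = null; // graceful fallback: raw truncation (A2.6 last test case)
  }
  // Step 5: truncate.
  const truncated = history.slice(-MAX_HISTORY);
  // Step 6: prepend the summary as a system message.
  return summary
    ? [{ role: "system", content: `Context summary: ${summary}` }, ...truncated]
    : truncated;
}
```

The ordering is the whole point: the summary is written to memory before any message is dropped, so an engine crash mid-compaction loses nothing.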
Files to modify:
- `src/lib/runtime/context-compaction.ts` (NEW)
- `src/lib/runtime/engine.ts` (line ~134)
- `src/lib/runtime/engine-streaming.ts`
Estimated effort: Medium (2-3 days)
Impact: ENORMOUS — eliminates the #1 problem with long-running agents (context loss)
Gap:
`reflexive_loop` has max 5 iterations and no build/test/lint verification. OMC's `$ralph` mode runs indefinitely with single ownership until a verifier confirms completion. The `persistent-mode` hook prevents stopping.
Current state (verified in code):
- `reflexive-loop-handler.ts` lines 34-37: `maxIterations` capped at 1-5, default 3
- Evaluation is AI-only (no build/test/lint commands)
- No "persistent mode" concept — loop always terminates after max iterations
Tasks:
- A3.1 — Added `persistent` mode to `reflexive_loop` node config: `mode: "bounded" | "persistent"` (default: "bounded" = current behavior)
  - In persistent mode: max iterations raised to 20 (MAX_PERSISTENT_ITERATIONS), primary exit is verifier pass
- A3.2 — Added verification commands to evaluator step:
  - New config field: `verificationCommands: string[]` (e.g., ["npm run build", "npm run test", "npm run lint"])
  - After AI evaluation passes, if commands are configured, runs them via `child_process.execFile` (NOT sandbox — see note below)
  - All commands must pass (exit code 0) for the verifier to approve
  - Command output appended to evaluator feedback for next iteration
  - Security: whitelist regex for command prefixes, shell metacharacter blocking, 60s timeout, execFile (no shell)
  - Design deviation: TASKS.md originally specified the `python_code` or `code_interpreter` sandbox, but both block `os`/`subprocess` imports. `execFile` with whitelist + metacharacter blocking is safer than `exec` and sufficient for build/test/lint commands.
- A3.3 — Added `persistent-mode` check to `end` node handler:
  - If `__persistent_mode = true` and `__verifier_confirmed = false`, routes back to `__persistent_return_node`
  - Updated maxVisits in both `engine.ts` and `engine-streaming.ts`: `reflexive_loop` gets 110 (like `loop`), `end` gets 25 when persistent
- A3.4 — Emit `onPersistentCap` hook event when persistent loop exhausts iteration cap
  - Renamed from `session.blocked` to `onPersistentCap` for camelCase consistency with existing hook events
- A3.5 — 16 unit tests for persistent mode (bounded defaults, persistent variables, cleanup on pass/fail/error, verification command filtering, end-handler routing)
- A3.6 — Persistent state cleanup: `__persistent_mode`, `__verifier_confirmed`, `__persistent_return_node` cleaned up in updatedVariables on all exit paths (passed, failed, error)
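The A3.2 security gate might look like this. The whitelist prefixes and metacharacter set below are illustrative assumptions, not the shipped values.

```typescript
// Hypothetical sketch of the A3.2 command gate (prefixes/regex are assumptions).
const ALLOWED_PREFIXES = /^(npm|pnpm|yarn|npx|node|tsc|vitest|jest|eslint)\b/;
const SHELL_METACHARACTERS = /[;&|<>`$(){}\[\]\\\n]/;

function validateCommand(command: string): { ok: boolean; reason?: string } {
  const trimmed = command.trim();
  if (!ALLOWED_PREFIXES.test(trimmed)) {
    return { ok: false, reason: "command prefix not whitelisted" };
  }
  if (SHELL_METACHARACTERS.test(trimmed)) {
    return { ok: false, reason: "shell metacharacters are blocked" };
  }
  return { ok: true };
}
```

Since `execFile` never invokes a shell, the metacharacter check is defense-in-depth rather than the primary barrier; the prefix whitelist does the heavy lifting.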
Files modified:
- `src/lib/runtime/handlers/reflexive-loop-handler.ts` — persistent mode, verification commands, variable cleanup
- `src/lib/runtime/handlers/end-handler.ts` — persistent routing back to reflexive_loop
- `src/lib/runtime/engine.ts` — maxVisits for reflexive_loop + persistent end
- `src/lib/runtime/engine-streaming.ts` — same maxVisits update
- `src/lib/runtime/types.ts` — added `onPersistentCap` to FlowHookEventType
- `src/lib/validators/flow-content.ts` — added `onPersistentCap` to Zod HOOK_EVENT_TYPES
- `src/lib/runtime/handlers/__tests__/persistent-mode.test.ts` (NEW — 16 tests)
Estimated effort: Medium (2-3 days)
Impact: Enables production-grade autonomous workflows that self-verify
Inspired by: OMC's 5 execution modes (Autopilot, Ultrapilot, Swarm, Pipeline, Ecomode).
Gap: No equivalent exists. OMC Swarm mode spawns N agents that pull from a shared task pool. Each agent atomically claims a task, executes it, and marks it complete. Prevents duplicate work.
Current state: Closest is the parallel node (MAX_BRANCHES=5, fixed branch assignment). No dynamic task claiming or shared pool.
Tasks:
- B1.1 — Add `swarm` to the `NodeType` union in `src/types/index.ts` (56th type)
  - Also added to the `flow-content.ts` NODE_TYPES validator array
- B1.2 — Created `src/lib/runtime/handlers/swarm-handler.ts` (~350 lines):
  - Config: `tasks: string[]`, `tasksVariable: string`, `workerCount: number` (1-10, default 3), `workerModel: string`, `mergeStrategy: "concat" | "summarize"`, `systemPrompt`, `taskContext`
  - Task queue with atomic claiming (safe in single-threaded Node.js)
  - N workers via `Promise.allSettled`, continue until queue empty or deadline
  - Merge: concat (join results) or summarize (AI-powered synthesis)
  - Routes to `done` or `failed` sourceHandle based on success rate
  - Respects `__model_tier_override` and `__ecomode_enabled` from cost_monitor
  - Safety: TASK_TIMEOUT_MS=60s, OVERALL_TIMEOUT_MS=300s, MAX_TASKS=50, MAX_WORKERS=10
- B1.3 — Registered handler in `src/lib/runtime/handlers/index.ts`
- B1.4 — Created `src/components/builder/nodes/swarm-node.tsx` (amber theme, Boxes icon)
- B1.5 — Added to node picker in `src/components/builder/node-picker.tsx` (ai category)
- B1.6 — Added property editor in `src/components/builder/property-panel.tsx`:
  - Worker count, model selector, system prompt, task context, tasks variable, static task list editor, merge strategy, output variable
  - Also registered in `flow-builder.tsx` NODE_TYPES map and OUTPUT_VAR_TYPES set
- B1.7 — 16 unit tests in `swarm-handler.test.ts`: empty tasks, string array, newline-separated, variable source, worker capping, MAX_TASKS, output variables, done/failed routing, model override, ecomode, error handling, empty task filtering
  - Also updated `node-picker.test.tsx`: count 55→56, ai category 13→14, added Boxes mock
- B1.8 — Updated CLAUDE.md sections 3 and 6 with the swarm node (56 types, handler description)
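The core of B1.2's shared pool can be sketched as below. Atomic claiming falls out of Node's single-threaded execution of the synchronous `shift()`; no locking is needed.

```typescript
// Sketch of the B1.2 task pool (simplified: no timeouts or merge strategies).
async function runSwarm<T>(
  tasks: string[],
  workerCount: number,
  execute: (task: string) => Promise<T>,
): Promise<T[]> {
  const queue = [...tasks];
  const results: T[] = [];

  // Each worker loops, claiming the next unclaimed task until the pool is empty.
  const worker = async () => {
    for (let task = queue.shift(); task !== undefined; task = queue.shift()) {
      results.push(await execute(task));
    }
  };

  // Promise.allSettled: one worker failing does not abort its siblings.
  await Promise.allSettled(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Because every claim happens synchronously between awaits, no task can be claimed twice, which is exactly the duplicate-work guarantee the Swarm mode needs.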
Files to create/modify:
- `src/types/index.ts` (add to NodeType union)
- `src/lib/validators/flow-content.ts` (add `"swarm"` to NODE_TYPES array — review check addition)
- `src/lib/runtime/handlers/swarm-handler.ts` (NEW)
- `src/lib/runtime/handlers/index.ts` (register)
- `src/components/builder/nodes/swarm-node.tsx` (NEW)
- `src/components/builder/flow-builder.tsx` (import + register in NODE_TYPES map — review check addition)
- `src/components/builder/node-picker.tsx`
- `src/components/builder/property-panel.tsx`
Note (review check): swarm is NOT self-routing — it uses the default edge after completion. Do NOT add it to SELF_ROUTING_NODES in engine.ts.
Estimated effort: High (3-4 days)
Impact: Enables sprint-backlog-style parallel work — a major new orchestration pattern
Gap:
`parallel-handler.ts` line 35 uses a shallow copy `{ ...context.variables }`. Nested objects (arrays, sub-objects) in variables are shared between branches — branch A can mutate branch B's data. OMC Ultrapilot uses full context isolation per worker.
Review check correction: messageHistory already uses the spread `[...context.messageHistory]` (line 80), which is sufficient for {role, content} string objects. The real bug is ONLY in the variables shallow copy. `structuredClone()` is the fix (Node 18+, handles Date/RegExp/Map).
Current state (verified in code):
- MAX_BRANCHES = 5 (hardcoded at line 4 in BOTH parallel-handler.ts AND parallel-streaming-handler.ts)
- Branch context: `{ ...context, variables: { ...context.variables } }` (shallow copy — BUG)
- messageHistory: `[...context.messageHistory]` (spread copy — OK, strings only)
Tasks:
- B2.1 — Deep-copy variables with `structuredClone()` in both handlers:
  - `parallel-handler.ts` line 35: `{ ...context.variables }` → `structuredClone(context.variables)`
  - `parallel-streaming-handler.ts` line 37: same fix
  - structuredClone handles: nested objects, arrays, Date, RegExp, Map, Set
  - structuredClone throws on functions — context.variables should never contain functions (verified)
- B2.3 — Increase MAX_BRANCHES from 5 to 10 in BOTH handlers (line 4 and line 8)
  - Also updated existing tests: parallel-handler.test.ts (12 branches → expect 10), parallel-streaming-handler.test.ts (13 branches → expect 10)
- B2.4 — Write tests for context isolation (branch A writes a nested variable, branch B doesn't see it)
  - 6 tests in `parallel-context-isolation.test.ts`: variable isolation, original untouched, deep nesting, arrays, MAX_BRANCHES=10, Date isolation
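A compact demonstration of the bug and the fix:

```typescript
// Spread copies only the top level, so nested objects stay shared between
// branch contexts; structuredClone isolates them fully.
const variables = { user: { name: "Ada" }, tags: ["a"] };

const shallow = { ...variables };          // buggy branch context
const deep = structuredClone(variables);   // fixed branch context

shallow.user.name = "mutated";             // leaks into the original
deep.tags.push("b");                       // stays isolated
```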
Files to modify:
- `src/lib/runtime/handlers/parallel-handler.ts` (structuredClone + MAX_BRANCHES)
- `src/lib/runtime/handlers/parallel-streaming-handler.ts` (structuredClone + MAX_BRANCHES)
Estimated effort: Low (0.5 day)
Impact: Prevents context pollution between parallel branches — a correctness bug fix
Gap: cost_monitor adaptive mode already has tier downgrade at 60/80/95%. OMC Ecomode additionally routes each sub-task to the cheapest capable model automatically, achieving 30-50% token savings.
Review check correction (CRITICAL): `__model_tier_override` is SET by cost-monitor adaptive mode but NEVER READ by the ai-response handlers! Both ai-response-handler.ts and ai-response-streaming-handler.ts use `(node.data.model as string) ?? DEFAULT_MODEL` and completely ignore `__model_tier_override`. Only plan_and_execute reads it. This means adaptive mode currently has NO EFFECT on regular ai_response nodes. The fix is required as a prerequisite before ecomode can work.
Review check correction #2: the ecomode classify call should use the fastest model (haiku/groq), not DEFAULT_MODEL, to minimize latency overhead.
Review check correction #3: scope is ai_response only. Other AI handlers (ai_classify, ai_extract, ai_summarize) are specialized and already use cheap models.
Current state (verified in code):
- Adaptive mode: Tier 1 (60%) → balanced, Tier 2 (80%) → fast, Tier 3 (95%) → block/fast
- Sets the `__model_tier_override` variable for downstream nodes
- BUG: ai-response handlers do NOT read `__model_tier_override` — only plan_and_execute does
Tasks:
- B3.0 — PREREQUISITE: Add `__model_tier_override` reading to BOTH ai-response handlers:
  - Added `getModelByTier` import + model selection cascade: explicit model > ecomode > tier override > default
  - Same logic in both ai-response-handler.ts and ai-response-streaming-handler.ts
  - Fixes existing adaptive mode bug: `__model_tier_override` now actually affects ai_response nodes
- B3.1 — Add `ecomode` to cost_monitor modes (alongside monitor/budget/alert/adaptive/enforce):
  - Sets `__ecomode_enabled = true` in context variables
  - ai-response handlers check this flag and call `classifyTaskComplexity()` before model selection
  - Uses the fastest available model via `getModelByTier("fast")` for the classify call
- B3.2 — Created `src/lib/cost/ecomode.ts`:
  - `classifyTaskComplexity(prompt, model)` → "simple" | "moderate" | "complex"
  - `complexityToTier()` maps to fast/balanced/powerful
  - In-memory cache: hash(prompt first 200 chars) → tier, 5 min TTL, max 500 entries
  - Graceful fallback: LLM failure → "moderate"
- B3.3 — Add per-task model selection log to cost tracking output:
  - Track `{ nodeId, taskComplexity, modelUsed, tokensSaved }` per node (deferred — needs token pricing integration)
- B3.4 — Write tests for ecomode routing and the __model_tier_override fix
  - 15 tests: ecomode.test.ts (11 — classify, cache, tier mapping), ecomode-integration.test.ts (4 — cost-monitor modes)
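The B3.2 cache described above might be sketched like this. The key function and eviction policy here are assumptions for illustration (the real code hashes the first 200 characters).

```typescript
// Sketch of the B3.2 tier mapping and TTL cache (assumed details).
type Complexity = "simple" | "moderate" | "complex";
type Tier = "fast" | "balanced" | "powerful";

const complexityToTier = (c: Complexity): Tier =>
  c === "simple" ? "fast" : c === "moderate" ? "balanced" : "powerful";

const TTL_MS = 5 * 60 * 1000;
const MAX_ENTRIES = 500;
const cache = new Map<string, { tier: Tier; expires: number }>();

function cacheKey(prompt: string): string {
  return prompt.slice(0, 200); // stand-in for hash(prompt first 200 chars)
}

function getCachedTier(prompt: string, now = Date.now()): Tier | undefined {
  const hit = cache.get(cacheKey(prompt));
  if (!hit || hit.expires < now) return undefined;
  return hit.tier;
}

function setCachedTier(prompt: string, tier: Tier, now = Date.now()): void {
  if (cache.size >= MAX_ENTRIES) {
    // Evict the oldest inserted entry (Map preserves insertion order).
    const oldest = cache.keys().next().value;
    if (oldest !== undefined) cache.delete(oldest);
  }
  cache.set(cacheKey(prompt), { tier, expires: now + TTL_MS });
}
```

The cache exists so repeated similar prompts skip the extra classify LLM call, keeping ecomode's latency overhead near zero on hot paths.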
Files to modify:
- `src/lib/runtime/handlers/ai-response-handler.ts` (read __model_tier_override)
- `src/lib/runtime/handlers/ai-response-streaming-handler.ts` (read __model_tier_override)
- `src/lib/runtime/handlers/cost-monitor-handler.ts` (add ecomode)
- `src/lib/cost/ecomode.ts` (NEW — classify helper)
Estimated effort: Medium (2 days)
Impact: 30-50% token cost reduction + fixes the existing adaptive mode bug
Inspired by: memsearch (Zilliz), OpenClaw memory system, clawhip MEMORY.md + shards, bb25.
Gap: AgentMemory stores JSON in PostgreSQL only. Not human-readable, not editable, not exportable. memsearch uses Markdown files as source of truth with vector index as cache. OpenClaw uses MEMORY.md index + memory/ shards for hot/cold tiers.
Current state (verified in code — review check 2026-04-04):
- `AgentMemory` model: `key`, `value` (String), `category`, `importance`, `embedding` (vector 1536), `accessCount`, `accessedAt`
- Two fully-implemented handler nodes exist:
  - `memory-write-handler.ts` (253 lines) — 5 merge strategies (replace, merge_object, deep_merge, append_array, increment), auto-eviction at the 1000 limit, async embedding generation
  - `memory-read-handler.ts` (300 lines) — 3 modes: key lookup, category filter, vector-semantic search with HNSW acceleration, fallback to text search
- No hot/cold tier distinction (separate optimization layer needed)
- No markdown export or import
- No human-editable UI (no Memory tab/page exists)
- No automatic memory injection into context before AI nodes — engine.ts does NOT load AgentMemory on flow start
- HNSW index exists: `agentmemory_embedding_hnsw_idx` (vector_cosine_ops, m=16, ef_construction=64)
Tasks:
- C1.1 — Created `src/lib/memory/markdown-export.ts`:
  - `exportAgentMemoryAsMarkdown(agentId)` — MEMORY.md with hot section + per-category grouping
  - `exportMemoryShards(agentId)` — per-category shard files (Map<filename, content>)
  - `parseMemoryMarkdown(markdown)` — parses the `- **key** [category]: value _(importance: 0.95, accessed: 2h ago)_` format
  - `importMemoryFromMarkdown(agentId, markdown)` — upserts parsed entries, returns { imported, skipped }
- C1.2 — Created `src/lib/memory/hot-cold-tier.ts`:
  - `getHotMemories(agentId, limit=10)` — composite score: importance×0.4 + recency×0.35 + frequency×0.25
  - `getColdMemories(agentId, query)` — HNSW vector search, 0.3 similarity threshold
  - `injectHotMemoryIntoContext(context)` — sets the `__hot_memory` variable, swallows errors
  - `formatHotMemoryForContext(memories)` — markdown list under "## Agent Memory (active context)"
  - Hot criteria: accessed in 24h OR importance > 0.8 OR accessCount > 10
- C1.3 — Integrated hot memory injection into engine.ts and engine-streaming.ts:
  - `injectHotMemoryIntoContext(context)` called before the first node, after hooks init
  - `__hot_memory` consumed by ai-response handlers, prepended to effectiveSystemPrompt
- C1.4 — Created Memory UI at the `/memory/[agentId]` page:
  - SWR data fetching, search filter, category filter, edit/delete dialogs
  - Hot memories: Flame icon + amber styling; cold: Snowflake icon + blue
  - Export/Import buttons, Memory link on agent cards (Brain icon)
- C1.5 — API routes:
  - `GET /api/agents/[agentId]/memory` — paginated list with category filter + sort
  - `PATCH /api/agents/[agentId]/memory/[memoryId]` — edit value/category/importance
  - `DELETE /api/agents/[agentId]/memory/[memoryId]` — delete with ownership check
  - `GET /api/agents/[agentId]/memory/export` — MEMORY.md download
  - `POST /api/agents/[agentId]/memory/import` — parse + upsert, 1MB limit
- C1.6 — Tests: 32 tests (16 hot-cold-tier + 16 markdown-export), all passing
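The C1.2 composite score uses the weights stated above; the recency and frequency normalizations below are illustrative assumptions.

```typescript
// Sketch of the C1.2 hot-memory scoring: importance×0.4 + recency×0.35 +
// frequency×0.25 (normalization curves are assumptions, not the shipped ones).
interface MemoryEntry {
  key: string;
  importance: number;   // 0..1
  accessedAt: number;   // epoch ms
  accessCount: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function hotScore(m: MemoryEntry, now = Date.now()): number {
  const ageDays = (now - m.accessedAt) / DAY_MS;
  const recency = Math.max(0, 1 - ageDays / 30);          // fades over ~30 days
  const frequency = Math.min(1, m.accessCount / 10);      // saturates at 10 hits
  return m.importance * 0.4 + recency * 0.35 + frequency * 0.25;
}

function getHotMemories(entries: MemoryEntry[], limit = 10, now = Date.now()): MemoryEntry[] {
  return [...entries].sort((a, b) => hotScore(b, now) - hotScore(a, now)).slice(0, limit);
}
```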
Files to create/modify:
- `src/lib/memory/markdown-export.ts` (NEW)
- `src/lib/memory/hot-cold-tier.ts` (NEW)
- `src/lib/runtime/engine.ts` (hot memory injection)
- `src/lib/runtime/engine-streaming.ts` (hot memory injection)
- `src/app/api/agents/[agentId]/memory/route.ts` (NEW)
- `src/app/api/agents/[agentId]/memory/[memoryId]/route.ts` (NEW)
- `src/app/api/agents/[agentId]/memory/export/route.ts` (NEW)
- `src/app/api/agents/[agentId]/memory/import/route.ts` (NEW)
- Agent detail page (new Memory tab)
Estimated effort: High (5-7 days)
Impact: Transparent, human-editable agent memory — a major UX and capability improvement
Gap: rbac.ts has READ/EXECUTE/ADMIN permissions only. OMC uses 3-layer composition: Guarantee → Enhancement → Execution. Guarantee skills (security, guardrails) always run first.
Current state (verified in code):
- `rbac.ts`: `AccessLevel` enum with READ/EXECUTE/ADMIN — flat hierarchy
- No skill ordering, no composition layers, no mandatory skills
Tasks:
- C2.1 — Added `compositionLayer String @default("execution")` + `@@index([compositionLayer])` to the Skill model
- C2.2 — Created `src/lib/ecc/skill-composer.ts`:
  - `composeSkillPipeline(agentId, taskSkillId?)` — raw SQL query (compositionLayer not in generated types), orders guarantee → enhancement → execution, then by name
  - `formatSkillPipelineForPrompt(skills)` — XML `<skill_pipeline>` with per-layer sections, 2000-char truncation
  - `getGuaranteeSkills(agentId)` — lightweight guarantee-only call
  - `validateLayer(raw)` — defaults unknown values to "execution"
- C2.3 — Integrated skill composition into both AI response handlers:
  - Skill pipeline injected between hot memory and safety check in effectiveSystemPrompt
  - Non-fatal: composition failure is caught and logged, never blocks the AI call
- C2.4 — Skills Browser UI: composition layer badge (red=guarantee, blue=enhancement, hidden for execution)
  - Skills API augmented with compositionLayer via raw SQL + merge (generated types pending)
- C2.5 — Prisma schema updated (applied via `pnpm db:push`)
- C2.6 — Tests: 16 tests in `skill-composer.test.ts` (pipeline ordering, layer enforcement, formatting, truncation, error handling, validateLayer)
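The C2.2 ordering rule can be sketched as:

```typescript
// Guarantee → enhancement → execution, then alphabetically by name.
// validateLayer defaults unknown values to "execution", as described above.
type Layer = "guarantee" | "enhancement" | "execution";
interface Skill { name: string; compositionLayer: string }

const LAYER_ORDER: Record<Layer, number> = { guarantee: 0, enhancement: 1, execution: 2 };

function validateLayer(raw: string): Layer {
  return raw === "guarantee" || raw === "enhancement" ? raw : "execution";
}

function orderSkillPipeline(skills: Skill[]): Skill[] {
  return [...skills].sort((a, b) => {
    const byLayer =
      LAYER_ORDER[validateLayer(a.compositionLayer)] -
      LAYER_ORDER[validateLayer(b.compositionLayer)];
    return byLayer !== 0 ? byLayer : a.name.localeCompare(b.name);
  });
}
```

Sorting in application code (rather than trusting the DB ordering) is what guarantees the safety property: a guarantee skill can never end up after an execution skill in the prompt.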
Files to create/modify:
- `prisma/schema.prisma` (add compositionLayer to Skill model)
- `src/lib/ecc/skill-composer.ts` (NEW)
- `src/lib/runtime/handlers/ai-response-handler.ts`
- `src/app/skills/page.tsx`
Estimated effort: Medium (2-3 days)
Impact: Ensures security/guardrail skills always execute — a safety improvement
Gap:
`search.ts` uses manual RRF with fixed weights (0.7 semantic / 0.3 keyword). bb25 uses Bayesian calibration to automatically balance semantic and keyword scores without scale mismatch. Proven +1.0%p NDCG on the SQuAD benchmark. Rust core.
Current state (verified in code — review check 2026-04-04):
- `search.ts` lines 282-289: `reciprocalRankFusion()` with k=60, default weights 0.5/0.5 in the function signature
- Weights overridden by `KnowledgeBase.hybridAlpha` (default 0.7 semantic / 0.3 keyword) — already configurable per-KB
- Contextual Enrichment enabled → semantic weight auto-bumped to 0.8 (line 489)
- Post-fusion: min-max normalization via `normalizeRRFScores()` (lines 315-325)
- `hybridAlpha` IS already used: `search.ts` line 491 reads it from kbConfig
- No `fusionStrategy` field yet — RRF is the only fusion strategy (Bayesian is future work)
Tasks:
- C3.1 — Decision: Option B — ported Bayesian calibration to TypeScript (no Python subprocess needed)
  - Sigmoid transform: `P(relevant | rank) = 1 / (1 + exp(-(a - b * rank)))` with a=2.0, b=0.15
  - Tuned for typical BM25 rank distributions, graceful decay at high ranks
- C3.2 — Added `fusionStrategy String @default("rrf")` to the KnowledgeBase model:
  - Added to `kbConfigUpdateSchema` and `kbConfigResponseSchema` in `src/lib/schemas/kb-config.ts`
  - Fetched via raw SQL in `loadKBConfig()` (generated types pending)
- C3.3 — Implemented `bayesianFusion()` in `search.ts`:
  - Sigmoid calibration of BM25 rank → posterior probability (0-1 range, no normalization needed)
  - Weighted sum fusion: `semanticWeight * cosineScore + keywordWeight * calibratedKeywordScore`
  - Activated when `kbConfig.fusionStrategy === "bayesian"`, applied in both `hybridSearch()` and `runSingleSearch()`
- C3.4 — Benchmark: deferred (requires production data for a meaningful NDCG/MRR comparison)
- C3.5 — Tests: 11 tests in `bayesian-fusion.test.ts` (empty inputs, merge, sigmoid decay, weights, metadata preservation)
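The C3.1 calibration and C3.3 weighted-sum fusion, using the constants stated above (the surrounding plumbing is simplified):

```typescript
// Sigmoid-calibrate the BM25 rank into a pseudo-probability, then combine
// with the cosine score by weighted sum. a=2.0, b=0.15 per C3.1.
const A = 2.0;
const B = 0.15;

// P(relevant | rank) = 1 / (1 + exp(-(a - b * rank)))
function calibrateKeywordRank(rank: number): number {
  return 1 / (1 + Math.exp(-(A - B * rank)));
}

function fuseScores(
  cosineScore: number,
  keywordRank: number,
  semanticWeight = 0.7,
): number {
  const keywordWeight = 1 - semanticWeight;
  return semanticWeight * cosineScore + keywordWeight * calibrateKeywordRank(keywordRank);
}
```

Because the calibrated keyword score is already in (0, 1) like the cosine score, the scale-mismatch problem that RRF works around disappears, and no post-fusion normalization is needed.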
Files to modify:
- `src/lib/knowledge/search.ts`
- `src/lib/schemas/kb-config.ts` (add fusionStrategy)
- `prisma/schema.prisma` (optional: add fusionStrategy to KnowledgeBase)
Estimated effort: Medium (3-4 days, includes benchmarking)
Impact: Better RAG search precision — a measurable improvement in retrieval quality
Inspired by: OMC verification protocol, OMC `omc ask` cross-provider delegation.
Gap:
The `reflexive_loop` evaluator is AI-only — it never runs build/test/lint commands. The OMC verifier runs: BUILD, TEST, LINT, FUNCTIONALITY, ARCHITECT review, ERROR_FREE.
Review check (2026-04-04):
- TASKS.md originally specified "run each command via sandbox (code_interpreter or python_code handler)" but both sandboxes block os/subprocess imports. `execFile` with a whitelist is the correct approach, identical to A3.2's design deviation.
- The `ShieldCheck` icon is already used by the `guardrails` node — use `CircleCheckBig` instead.
- `validateCommand()` and `runVerificationCommands()` in reflexive-loop-handler.ts are private. They must be extracted to a shared module before D1.1 can import them.
Tasks:
- D0 — PREREQUISITE: Extract `validateCommand()` + `runVerificationCommands()` from `reflexive-loop-handler.ts` → `src/lib/runtime/verification-commands.ts` (shared module). Both `reflexive-loop-handler.ts` and the new `verification-handler.ts` import from the shared module.
- D1.1 — Create `src/lib/runtime/handlers/verification-handler.ts`:
  - New node type: `verification`
  - Config: `checks: Array<{ type: "build" | "test" | "lint" | "custom", command: string, label?: string }>`
  - Execution: `execFile` + whitelist (imports from `verification-commands.ts`)
  - Per-check timeout: 60s, overall timeout: 300s
  - Output variable: `verificationResults: Array<{ type, command, label, exitCode, output, durationMs }>`
  - Routes: `findNextNode(context, node.id, "passed")` or `findNextNode(context, node.id, "failed")`
  - Design deviation: uses `execFile` (not sandbox) — same rationale as A3.2
- D1.2 — Register the `verification` node type:
  - `src/types/index.ts` → add `"verification"` to the NodeType union
  - `src/lib/validators/flow-content.ts` → add to NODE_TYPES array
  - `src/lib/runtime/handlers/index.ts` → register handler
  - `src/components/builder/flow-builder.tsx` → add to NODE_TYPES map
  - Do NOT add to SELF_ROUTING_NODES (uses sourceHandles via findNextNode)
- D1.3 — Create `src/components/builder/nodes/verification-node.tsx`:
  - Icon: `CircleCheckBig` (NOT ShieldCheck — already used by guardrails)
  - Theme: green (`bg-green-950 border-green-600`)
  - Shows: list of checks with type badges
- D1.4 — Add to node picker + property panel:
  - Node picker: category `"utilities"` (alongside guardrails), count 56→57
  - Property panel: checks CRUD (add/remove/edit rows), command + label input
  - OUTPUT_VAR_TYPES: add `"verification"`
  - Mock `CircleCheckBig` in `node-picker.test.tsx`, update counts
- D1.5 — Starter flow template `"verification-pipeline"` in `src/data/starter-flows.ts`:
  - `ai_response` → `verification` (checks: npm run build, npm run test) → passed: `end` / failed: `ai_response` (fix)
- D1.6 — Tests:
  - `src/lib/runtime/__tests__/verification-commands.test.ts` — shared module (whitelist, metachar blocking)
  - `src/lib/runtime/handlers/__tests__/verification-handler.test.ts` — handler (all pass → passed, one fail → failed, timeout, empty checks → passed, output variable)
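The D1.1 aggregation and routing logic can be sketched as below, with `runCheck` standing in for the shared `execFile` wrapper (a hypothetical signature):

```typescript
// Sketch of D1.1: every check must exit 0 for the node to take "passed".
interface Check { type: "build" | "test" | "lint" | "custom"; command: string; label?: string }
interface CheckResult {
  type: Check["type"]; command: string; label?: string;
  exitCode: number; output: string; durationMs: number;
}

async function runVerification(
  checks: Check[],
  runCheck: (command: string) => Promise<{ exitCode: number; output: string }>,
): Promise<{ handle: "passed" | "failed"; results: CheckResult[] }> {
  const results: CheckResult[] = [];
  for (const check of checks) {
    const started = Date.now();
    const { exitCode, output } = await runCheck(check.command);
    results.push({ ...check, exitCode, output, durationMs: Date.now() - started });
  }
  // An empty checks list counts as passed (per the D1.6 test case).
  const handle = results.every((r) => r.exitCode === 0) ? "passed" : "failed";
  return { handle, results };
}
```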
Files to create/modify:
- `src/lib/runtime/verification-commands.ts` (NEW — extracted from reflexive-loop-handler)
- `src/lib/runtime/handlers/reflexive-loop-handler.ts` (MODIFIED — import from shared module)
- `src/lib/runtime/handlers/verification-handler.ts` (NEW)
- `src/types/index.ts` (add to NodeType union)
- `src/lib/validators/flow-content.ts` (add to NODE_TYPES)
- `src/lib/runtime/handlers/index.ts` (register)
- `src/components/builder/flow-builder.tsx` (add to NODE_TYPES map)
- `src/components/builder/nodes/verification-node.tsx` (NEW)
- `src/components/builder/node-picker.tsx` (add node definition)
- `src/components/builder/property-panel.tsx` (add property editor)
- `src/data/starter-flows.ts` (new template)
Estimated effort: Medium (2-3 days)
Impact: Agents can verify their own work with real commands — production quality
Gap:
The `call_agent` handler calls sibling agents, but they all use the same provider ecosystem. OMC's `omc ask` sends tasks to Claude, Codex, and Gemini separately and synthesizes the results. OMC's `ccg` skill fans out to Codex+Gemini with Claude synthesizing.
Review check (2026-04-04):
- `providerOverride` must be applied to the CALLEE's FlowContent.nodes (not the caller's). Mechanism: same as `evalModelOverride` in chat route.ts lines 209-215 — override `data.model` on all `ai_response` nodes in-memory before sub-engine execution. call-agent-handler.ts already loads the callee FlowContent via `parseFlowContent()`.
- The starter flow should use `ai_response` nodes with different `data.model` values (not `call_agent`, which would require pre-existing agents). Works out-of-the-box.
- Provider availability: `getModel(providerOverride)` throws if the API key is missing. The handler must catch this and fall back to the callee's original model.
Tasks:
- D2.1 — Add a `providerOverride` option to the `call_agent` handler:
  - New config field: `providerOverride?: string` (model ID)
  - In internal mode: after loading the callee FlowContent, override `data.model` on all `ai_response` nodes in-memory (identical to the chat route.ts evalModelOverride logic)
  - Log audit with the `providerOverride` value for traceability
  - Graceful fallback: if the providerOverride model is unavailable, log a warning and use the original
- D2.2 — Create a `"cross-provider-synthesis"` starter flow in `src/data/starter-flows.ts`:
  - `message` → `parallel` (3 branches, each `ai_response` with a different model: claude-sonnet-4-6, deepseek-chat, gemini-2.5-flash) → `ai_response` (synthesizer, merges outputs) → `end`
  - No `call_agent` dependency — works without pre-existing agents
- D2.3 — Property panel: add a Provider Override select to the `call_agent` node editor:
  - Model dropdown (nullable — "Use agent default" option)
  - Provider badge shown next to the selected agent when providerOverride is set
- D2.4 — Tests in `src/lib/runtime/handlers/__tests__/cross-provider.test.ts`:
  - providerOverride applied to the callee's ai_response nodes
  - Without providerOverride — original model unchanged
  - Override does not persist to DB
  - Fallback when the override model is unavailable (missing API key)
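The D2.1 in-memory override can be sketched as a pure function over the callee's nodes (the node shapes are simplified assumptions):

```typescript
// Clone the callee's nodes and swap data.model on ai_response nodes only.
// Returning new objects (never mutating) is what keeps the override from
// persisting to the DB.
interface FlowNode { id: string; type: string; data: Record<string, unknown> }

function applyProviderOverride(nodes: FlowNode[], providerOverride?: string): FlowNode[] {
  if (!providerOverride) return nodes;
  return nodes.map((node) =>
    node.type === "ai_response"
      ? { ...node, data: { ...node.data, model: providerOverride } }
      : node,
  );
}
```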
Files to modify:
- `src/lib/runtime/handlers/call-agent-handler.ts`
- `src/data/starter-flows.ts` (new template)
- `src/components/builder/property-panel.tsx`
Estimated effort: Low (1-2 days)
Impact: Better results through multi-model synthesis — a quality improvement
Inspired by: clawhip typed event pipeline, renderer/sink split, session.* events.
Gap: notification-handler.ts supports generic levels (info/warning/error/success) but no standardized session lifecycle events. clawhip defines: session.started, session.blocked, session.finished, session.failed, session.pr_created.
Tasks:
- E1.1 — Define `SessionEventType` in `src/lib/runtime/types.ts`: `session.started` | `session.blocked` | `session.finished` | `session.failed` | `session.timeout` | `session.verification_passed` | `session.verification_failed`
- E1.2 — Emit session events from engine.ts and engine-streaming.ts:
  - `session.started` at flow start
  - `session.finished` at successful flow end
  - `session.failed` on flow error
  - `session.timeout` on MAX_ITERATIONS hit
  - `session.blocked` on human_approval waitForInput
- E1.3 — Auto-fire notifications for session events (configurable per agent):
  - Agent config: `sessionNotifications: { events: SessionEventType[], channel: "webhook" | "in_app", webhookUrl?: string }`
- E1.4 — Add Discord and Slack webhook presets to notification config:
  - Discord: format the message as an embed (title, color by event type, fields)
  - Slack: format as a Block Kit message
- E1.5 — Write tests for each session event type emission
Files to modify:
- `src/lib/runtime/types.ts`
- `src/lib/runtime/engine.ts`
- `src/lib/runtime/engine-streaming.ts`
- `src/lib/runtime/handlers/notification-handler.ts`
Estimated effort: Medium (2-3 days)
Impact: Real-time visibility into agent execution — essential for production monitoring
Gap: notification-handler.ts mixes formatting and transport in a single handler. clawhip separates renderer (format message) from sink (deliver to Discord/Slack/etc). This makes adding new sinks trivial without touching rendering logic.
Tasks:
- E2.1 — Refactor notification-handler.ts into the renderer + sink pattern:
  - `NotificationRenderer` interface: `render(event, options) -> RenderedMessage`
  - `NotificationSink` interface: `deliver(rendered, config) -> DeliveryResult`
- E2.2 — Create renderers:
  - `PlainTextRenderer` — current behavior
  - `DiscordRenderer` — Discord embed format (rich)
  - `SlackRenderer` — Slack Block Kit format (mrkdwn)
  - `MarkdownRenderer` — for in-app display
- E2.3 — Create sinks:
  - `WebhookSink` — current HTTP POST behavior
  - `InAppSink` — current in-app behavior
  - `LogSink` — current logger behavior
- E2.4 — Config: `{ renderer: "plain" | "discord" | "slack" | "markdown", sink: "webhook" | "in_app" | "log" }`
- E2.5 — Write tests for each renderer × sink combination
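A minimal sketch of the E2.1 split; the interface shapes are assumptions based on the signatures listed above:

```typescript
// A renderer turns an event into a message shape; a sink only delivers it.
interface NotificationEvent { level: "info" | "warning" | "error" | "success"; title: string; body: string }
interface RenderedMessage { contentType: string; payload: unknown }
interface DeliveryResult { ok: boolean; detail?: string }

interface NotificationRenderer { render(event: NotificationEvent): RenderedMessage }
interface NotificationSink { deliver(rendered: RenderedMessage): Promise<DeliveryResult> }

const plainTextRenderer: NotificationRenderer = {
  render: (e) => ({ contentType: "text/plain", payload: `[${e.level}] ${e.title}: ${e.body}` }),
};

const logSink: NotificationSink = {
  deliver: async (rendered) => {
    console.log(rendered.payload);
    return { ok: true };
  },
};

// Composing them keeps delivery logic untouched when a new format is added.
async function notify(
  event: NotificationEvent,
  renderer: NotificationRenderer,
  sink: NotificationSink,
): Promise<DeliveryResult> {
  return sink.deliver(renderer.render(event));
}
```

Adding a Telegram or Teams sink then means implementing one `deliver` method, with every existing renderer working unchanged.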
Files to modify:
- `src/lib/runtime/handlers/notification-handler.ts` (refactor)
- `src/lib/notifications/renderers/` (NEW directory)
- `src/lib/notifications/sinks/` (NEW directory)
Estimated effort: Medium (2-3 days)
Impact: Extensible notification system — easy to add Telegram, email, Teams, etc.
Implementation order: F3 → F2 → F1 (easiest → hardest; all three are mutually independent)
Review check date: 2026-04-04 — all assumptions verified by reading the code
Gap: ECC skills are loaded statically via `composeSkillPipeline()` (C2.3). The claw-code/clawhip approach: auto-detect which skill is relevant for the current task and inject ONLY that skill. Reduces context bloat dramatically.
Review check (2026-04-04):
- The `Skill` model has NO embedding field — cache embeddings in memory + Redis, key prefix `"skill-emb:"`, TTL 600s (same pattern as `embedding-cache.ts`).
- `generateEmbedding(text)` from `embeddings.ts` is directly reusable for single strings.
- `cosineSimilarity(a, b)` from `src/lib/evals/semantic.ts` is exported and handles all edge cases — directly reusable, no duplication needed.
- C2.3's `composeSkillPipeline()` is called WITHOUT any `isECCEnabled()` guard in BOTH ai-response handlers. F3 must REPLACE step 5 (skill composition), not add a new step. Logic: `if (isECCEnabled() && routedSkills.length > 0)` → use dynamic; else → fall back to `composeSkillPipeline()` (the C2.3 static pipeline).
- Injection point is step 5 in system prompt assembly (after hot memory).
- `acquireEmbeddingSemaphore()` must be respected — batch skill embeddings sequentially on first load, NOT Promise.all for 60 skills simultaneously.
- Scope: ONLY `ai-response-handler.ts` and `ai-response-streaming-handler.ts`.
Tasks:
- F3.1 — Create `src/lib/ecc/skill-router.ts`:
  - `getCachedSkillEmbedding(skillId, description)` → `number[]` — in-memory Map + Redis 600s TTL
  - `routeToSkill(prompt, agentId, topN=3)` → `Promise<Skill[]>` — cosine similarity, threshold 0.35
  - `invalidateSkillCache(skillId)` — call on skill update/delete
  - Uses `generateEmbedding()` from `embeddings.ts`, `cosineSimilarity()` from `semantic.ts`
  - Respects `acquireEmbeddingSemaphore()` / `releaseEmbeddingSemaphore()` for batch init
  - Guard: returns `[]` immediately when `isECCEnabled()` is false
- F3.2 — Integrate into both ai-response handlers (step 5 replacement):
  - `if (isECCEnabled() && routedSkills.length > 0)` → inject routed skills
  - else → fall back to `composeSkillPipeline()` (non-breaking, C2.3 preserved)
  - Non-fatal: router error always falls back to static, never blocks AI call
- F3.3 — Tests (12 tests):
  - `src/lib/ecc/__tests__/skill-router.test.ts` — cache hit/miss, cosine threshold, topN, ECC disabled → `[]`, error swallowing, semaphore, invalidation
  - `src/lib/ecc/__tests__/skill-router-integration.test.ts` — handler uses routed skills, fallback to static composition on empty result, no injection when ECC off
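The routing core of F3.1 reduces to embed, score, filter, sort. A minimal sketch under stated assumptions: the `cosineSimilarity` body mirrors what `src/lib/evals/semantic.ts` is described as exporting (signature assumed), and `rankSkills` is a hypothetical stand-in for the scoring step inside `routeToSkill` (embedding generation, caching, and the semaphore are omitted):

```typescript
interface Skill {
  id: string;
  description: string;
}

// Assumed shape of the exported helper in src/lib/evals/semantic.ts:
// standard cosine similarity over two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

const SIMILARITY_THRESHOLD = 0.35; // threshold from F3.1

// Hypothetical scoring core of routeToSkill: score each cached skill
// embedding against the prompt embedding, keep those at or above the
// threshold, and return the topN best matches in descending score order.
function rankSkills(
  promptEmbedding: number[],
  candidates: Array<{ skill: Skill; embedding: number[] }>,
  topN = 3,
): Skill[] {
  return candidates
    .map((c) => ({ skill: c.skill, score: cosineSimilarity(promptEmbedding, c.embedding) }))
    .filter((c) => c.score >= SIMILARITY_THRESHOLD)
    .sort((a, b) => b.score - a.score)
    .slice(0, topN)
    .map((c) => c.skill);
}
```

The threshold filter is what keeps irrelevant skills out entirely: an empty result triggers the static `composeSkillPipeline()` fallback rather than injecting a weak match.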
Files to create/modify:
- `src/lib/ecc/skill-router.ts` (NEW)
- `src/lib/runtime/handlers/ai-response-handler.ts` (replace step 5)
- `src/lib/runtime/handlers/ai-response-streaming-handler.ts` (replace step 5)
Estimated effort: Medium (2-3 days) Impact: Less context bloat, more relevant skill context — quality + cost improvement
Gap: No AST-level code analysis. OMC integrates ast-grep for precise pattern matching and refactoring via syntax trees.
Verification check (2026-04-04):
- `code-interpreter-handler.ts` has NO `mode` field — safe to add `"eval" | "ast_match" | "ast_replace"` with default `"eval"` (existing behavior unchanged).
- `GitBranch` icon is TAKEN (Logic category + 2 node definitions). Use `Braces` for `ast_transform`. `Code2` icon is TAKEN (1 node). `Braces`, `TreePine`, `FileCode` are all free.
- `code_interpreter` is NOT in the `OUTPUT_VAR_TYPES` set — existing bug. Fix alongside F2 work.
- `@ast-grep/napi` is a Rust native addon — use dynamic `import()` with try/catch, never top-level require. Graceful fallback: return `[]` / original code if unavailable.
- `ast_transform` is NOT self-routing — standard `nextNodeId`, NOT in `SELF_ROUTING_NODES`.
- Node count baseline: 57 (verified). `ast_transform` = node 58.
Tasks:
- F2.1 — Add `@ast-grep/napi` as optional dependency (`pnpm add @ast-grep/napi`)
- F2.2 — Create `src/lib/ast/pattern-matcher.ts`:
  - `loadAstGrep()` → module | null — dynamic import, try/catch, `logger.warn` on fail
  - `isAstGrepAvailable()` → boolean
  - `matchPattern(code, pattern, language: SgLang)` → `AstMatch[]` — graceful `[]` if unavailable
  - `replacePattern(code, pattern, replacement, language)` → string — returns original if unavailable
  - `getSupportedLanguages()` → `SgLang[]`
  - `SgLang = "TypeScript" | "JavaScript" | "Python" | "Rust" | "Go" | "Java" | "Css" | "Html"`
  - `AstMatch = { text, range: { start, end }, metaVariables: Record<string, string> }`
- F2.3 — Extend `code-interpreter-handler.ts`:
  - New `mode` field: `"eval" | "ast_match" | "ast_replace"` (default `"eval"`)
  - New fields: `pattern?: string`, `replacement?: string`, `language?: SgLang`
  - `ast_match` output: `{ matches: AstMatch[], count: number }`
  - `ast_replace` output: `{ result: string, originalLength: number, newLength: number }`
  - Fallback: if ast-grep unavailable and mode != `"eval"` → return warning in output
- F2.4 — Create `ast_transform` node type (58th):
  - Config: `operation: "match" | "replace"`, `pattern`, `replacement?`, `language`, `inputVariable`, `outputVariable`
  - Icon: `Braces` (FREE — verified), theme: purple (`bg-purple-950 border-purple-600`)
  - Register in: `types/index.ts`, `flow-content.ts`, `handlers/index.ts`, `flow-builder.tsx`
  - NOT in `SELF_ROUTING_NODES`
  - Add `"ast_transform"` to `OUTPUT_VAR_TYPES`; also add missing `"code_interpreter"` (bug fix)
- F2.5 — Update node picker + property panel + tests:
  - `node-picker.tsx`: add `ast_transform` to utilities (57→58, utilities 8→9, import `Braces`)
  - `property-panel.tsx`: `AstTransformProperties` + `mode`/`pattern`/`replacement` for code_interpreter
  - `node-picker.test.tsx`: count 57→58, add `"Braces"` to lucide mock (55→56 icons)
- F2.6 — Tests (20 tests):
  - `src/lib/ast/__tests__/pattern-matcher.test.ts` — match/replace, graceful unavailable, metaVariables capture, all language enum values, empty input, invalid pattern
  - `src/lib/runtime/handlers/__tests__/ast-transform-handler.test.ts` — match/replace operations, inputVariable resolution, outputVariable set, error never throws
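The graceful-loading contract in F2.2 can be sketched as follows. Only the module name `@ast-grep/napi` comes from the plan; the memoization variable and the stubbed rewrite step are illustrative. The point is that a failed dynamic import of the native addon degrades to the `[]` / original-code fallbacks instead of crashing the flow:

```typescript
// Memoized module handle: undefined = not yet attempted, null = unavailable.
let astGrepModule: unknown = undefined;

// Non-literal specifier so a TypeScript build without the optional dependency
// installed does not fail module resolution at compile time.
const AST_GREP_SPECIFIER = "@ast-grep/napi";

// loadAstGrep: dynamic import so a missing or broken native addon can never
// crash startup; the failure is logged once and remembered as null.
async function loadAstGrep(): Promise<unknown> {
  if (astGrepModule !== undefined) return astGrepModule;
  try {
    astGrepModule = await import(AST_GREP_SPECIFIER);
  } catch (err) {
    console.warn("ast-grep unavailable, AST features disabled:", (err as Error).message);
    astGrepModule = null;
  }
  return astGrepModule;
}

async function isAstGrepAvailable(): Promise<boolean> {
  return (await loadAstGrep()) !== null;
}

// replacePattern: returns the input unchanged when the addon is missing, so
// an ast_replace node degrades to a no-op instead of throwing mid-flow.
async function replacePattern(code: string, pattern: string, replacement: string): Promise<string> {
  const mod = await loadAstGrep();
  if (mod === null) return code; // graceful fallback per F2.2
  void pattern;
  void replacement;
  return code; // placeholder: the real rewrite via the addon's API is elided
}
```

`matchPattern` follows the same shape, returning `[]` on the `mod === null` branch.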
Files to create/modify:
- `src/lib/ast/pattern-matcher.ts` (NEW)
- `src/lib/runtime/handlers/ast-transform-handler.ts` (NEW)
- `src/lib/runtime/handlers/code-interpreter-handler.ts` (add mode field)
- `src/types/index.ts` (add `"ast_transform"`)
- `src/lib/validators/flow-content.ts` (add to NODE_TYPES array)
- `src/lib/runtime/handlers/index.ts` (register handler)
- `src/components/builder/nodes/ast-transform-node.tsx` (NEW — Braces icon, purple)
- `src/components/builder/flow-builder.tsx` (add to NODE_TYPES map)
- `src/components/builder/node-picker.tsx` (count 57→58, Braces import)
- `src/components/builder/property-panel.tsx` (AstTransformProperties, OUTPUT_VAR_TYPES fix)
- `src/components/builder/__tests__/node-picker.test.tsx` (count 57→58, Braces mock)
Estimated effort: Medium (3-4 days) Impact: Precise code transformations — quality improvement for code agents
Gap: No Language Server Protocol support. OMC has real LSP integration with 15s timeout for semantic analysis, go-to-definition, find-references, rename-symbol.
Verification check (2026-04-04):
- Use `typescript-language-server` (standard LSP over stdio, wraps tsserver). NOT `tsserver` directly (non-standard protocol). NOT `vscode-languageclient` (designed for VS Code extension context, not Node standalone). Package for types: `vscode-languageserver-protocol` (LSP message types only).
- Railway uses the Dockerfile builder (`railway.toml: builder = "DOCKERFILE"`). Add `typescript-language-server` to a Dockerfile RUN step. `nixpacks.toml` is NOT active when the Dockerfile builder is used.
- LSP pool: MAX_LSP_CONNECTIONS = 3 (NOT 5 — tsserver uses 200-500 MB RAM each). Idle TTL = 300s. Cleanup interval = 30s (more aggressive than MCP's 60s — LSP is expensive). Pattern: replicate `src/lib/mcp/pool.ts` (LRU, dead detection, SIGTERM shutdown, Redis tracking). `spawn` for persistent stdio process — same pattern as `cli-session-manager.ts`.
- LSP `initialize` handshake timeout: 30s (not 15s — tsserver is slow on cold start). Operation timeout: 15s per request. Cache initialized connection in pool.
- Security: `validateWorkspacePath(path, agentId)` — only `/tmp/agent-{agentId}/` allowed, block `..` path traversal. Analogous to `validateExternalUrlWithDNS()` for the file system.
- `Code2` icon is TAKEN. Use `FileSearch` for `lsp_query` (FREE — verified).
- `lsp_query` NOT in `SELF_ROUTING_NODES` — standard nextNodeId routing.
- Node count: depends on F2. If F2 done first: `lsp_query` = 59th. Standalone: 58th.
- MVP scope: TypeScript/JavaScript only. Python (pylsp) = future work.
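The `validateWorkspacePath` guard described above can be sketched with `path.resolve` normalization, which collapses `..` segments before the containment check. The boolean return contract is an assumption; the real guard might throw instead:

```typescript
import path from "node:path";

// validateWorkspacePath: only /tmp/agent-{agentId}/ is reachable. Resolving
// against the sandbox root first means ".." segments and absolute paths are
// normalized away before the prefix check, so traversal cannot escape.
function validateWorkspacePath(candidate: string, agentId: string): boolean {
  const root = path.resolve(`/tmp/agent-${agentId}`);
  const resolved = path.resolve(root, candidate);
  // resolved must be the root itself or a strict descendant of it
  return resolved === root || resolved.startsWith(root + path.sep);
}
```

Note the `root + path.sep` suffix in the prefix check: without it, `/tmp/agent-42-evil` would pass the check for agent `42`.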
Tasks:
- F1.1 — Package setup + Dockerfile:
  - `pnpm add vscode-languageserver-protocol`
  - Add `RUN npm install -g typescript-language-server typescript` to Dockerfile
- F1.2 — Create LSP infrastructure (`src/lib/lsp/`):
  - `src/lib/lsp/types.ts` — LSP type re-exports + local interfaces (`LSPConnection`, `LSPQueryResult`)
  - `src/lib/lsp/pool.ts` — `LSPConnectionPool`: MAX=3, TTL=300s, cleanup=30s, LRU eviction, SIGTERM graceful shutdown, dead connection detection. Pattern: `src/lib/mcp/pool.ts`.
  - `src/lib/lsp/client.ts`:
    - `startLSPServer(agentId, workspacePath)` — spawn `typescript-language-server --stdio`, `initialize` with 30s timeout, store in pool
    - `getDefinition(agentId, file, line, character)` → `Location[]` — 15s timeout
    - `getReferences(agentId, file, line, character)` → `Location[]` — 15s timeout
    - `getDiagnostics(agentId, file, content)` → `Diagnostic[]` — 15s timeout
    - `hoverInfo(agentId, file, line, character)` → `string` — 15s timeout
    - `stopLSPServer(agentId)` — evict from pool
    - `validateWorkspacePath(path, agentId)` — security guard
- F1.3 — Create `lsp_query` node type (58th or 59th — current+1):
  - Config: `operation: "definition" | "references" | "diagnostics" | "hover"`, `fileVariable`, `contentVariable?`, `lineVariable?`, `characterVariable?`, `outputVariable`
  - Icon: `FileSearch` (FREE — verified), theme: violet (`bg-violet-950 border-violet-600`)
  - Sets `__lsp_context` variable with formatted LSP result
  - Register in: `types/index.ts`, `flow-content.ts`, `handlers/index.ts`, `flow-builder.tsx`
  - NOT in `SELF_ROUTING_NODES`; add `"lsp_query"` to `OUTPUT_VAR_TYPES`
- F1.4 — Integrate LSP context into ai_response handlers:
  - Check `context.variables["__lsp_context"]` — if set, prepend `<lsp_context>...</lsp_context>`
  - Non-fatal: missing variable → no change to prompt
- F1.5 — Update node picker + property panel + tests:
  - `node-picker.tsx`: add `lsp_query` to utilities (count +1, import `FileSearch`)
  - `property-panel.tsx`: `LSPQueryProperties` component
  - `node-picker.test.tsx`: count +1, add `"FileSearch"` to lucide mock (+1 icon)
- F1.6 — Tests (25 tests):
  - `src/lib/lsp/__tests__/client.test.ts` — mock LSP server over stdio: initialize handshake, all 4 operations, 15s timeout enforcement, pool LRU eviction, dead connection cleanup, validateWorkspacePath blocks `..` traversal
  - `src/lib/runtime/handlers/__tests__/lsp-query-handler.test.ts` — all 4 operations, missing variables graceful fallback, `__lsp_context` set correctly, handler never throws
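The pool semantics in F1.2 (MAX=3, idle TTL 300s, LRU eviction on insert, periodic sweep) can be sketched against a generic connection handle. `PooledConnection.dispose` is a hypothetical stand-in for the SIGTERM shutdown; dead-process detection and the Redis tracking replicated from `src/lib/mcp/pool.ts` are omitted:

```typescript
interface PooledConnection {
  dispose(): void; // in the real pool: SIGTERM the language server process
}

interface Entry {
  conn: PooledConnection;
  lastUsed: number;
}

class LSPConnectionPool {
  // Map preserves insertion order, so the first key is the LRU entry.
  private entries = new Map<string, Entry>();

  constructor(
    private maxConnections = 3, // MAX_LSP_CONNECTIONS
    private idleTtlMs = 300_000, // idle TTL = 300s
  ) {}

  get(agentId: string): PooledConnection | undefined {
    const entry = this.entries.get(agentId);
    if (!entry) return undefined;
    entry.lastUsed = Date.now();
    // re-insert to mark as most recently used
    this.entries.delete(agentId);
    this.entries.set(agentId, entry);
    return entry.conn;
  }

  add(agentId: string, conn: PooledConnection): void {
    if (this.entries.size >= this.maxConnections) {
      // evict the least recently used entry (first key in insertion order)
      const lru = this.entries.keys().next().value as string;
      this.entries.get(lru)?.conn.dispose();
      this.entries.delete(lru);
    }
    this.entries.set(agentId, { conn, lastUsed: Date.now() });
  }

  // run on the 30s cleanup interval
  sweepIdle(now = Date.now()): void {
    for (const [id, entry] of this.entries) {
      if (now - entry.lastUsed > this.idleTtlMs) {
        entry.conn.dispose();
        this.entries.delete(id);
      }
    }
  }
}
```

The delete-then-set trick in `get()` is what makes a plain `Map` behave as an LRU without a separate linked list, which is why the MCP pool pattern transfers directly.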
Files to create/modify:
- `src/lib/lsp/types.ts` (NEW)
- `src/lib/lsp/pool.ts` (NEW)
- `src/lib/lsp/client.ts` (NEW)
- `src/lib/runtime/handlers/lsp-query-handler.ts` (NEW)
- `src/types/index.ts` (add `"lsp_query"`)
- `src/lib/validators/flow-content.ts` (add to NODE_TYPES)
- `src/lib/runtime/handlers/index.ts` (register)
- `src/components/builder/nodes/lsp-query-node.tsx` (NEW — FileSearch icon, violet)
- `src/components/builder/flow-builder.tsx` (add to NODE_TYPES map)
- `src/components/builder/node-picker.tsx` (count +1, FileSearch import)
- `src/components/builder/property-panel.tsx` (LSPQueryProperties, OUTPUT_VAR_TYPES)
- `src/components/builder/__tests__/node-picker.test.tsx` (count +1, FileSearch mock)
- `Dockerfile` (add typescript-language-server RUN step)
Estimated effort: High (7-10 days) Impact: ENORMOUS for developer agents — semantic code understanding
| Faza | Tasks | Effort | Priority |
|---|---|---|---|
| A — Runtime Hooks | A1-A3 (19 subtasks) | 7-11 days | P0 |
| B — Execution Modes | B1-B3 (15 subtasks) | 7-11 days | P1 |
| C — Memory Upgrade | C1-C3 (17 subtasks) | 10-14 days | P1 |
| D — Verification | D1-D2 (10 subtasks) | 3-5 days | P1-P2 |
| E — Notifications | E1-E2 (10 subtasks) | 4-6 days | P2 |
| F — Advanced | F1-F3 (12 subtasks) | 12-17 days | P3 |
| TOTAL | 6 phases, 83 subtasks | 43-64 days | — |
Recommended implementation order: A2 -> A1 -> B2 -> C1 -> A3 -> B1 -> D1 -> C2 -> C3 -> E1 -> E2 -> B3 -> D2 -> F3 -> F2 -> F1
- claw-code — Agent harness architecture, hook DAG, session management
- oh-my-codex (OMX) — $ralph, $team, SKILL.md, .omx/ state
- oh-my-claudecode (OMC) — 5 execution modes, 29 agents, 32 skills, hook system, verification protocol
- OMC ARCHITECTURE.md — 3-layer skills, hook events, verification protocol
- OMC REFERENCE.md — Full agent roster, skill list, CLI commands
- clawhip — Event pipeline, MEMORY.md + shards, renderer/sink split
- memsearch — Markdown-first memory, vector cache, human-editable
- bb25 — Bayesian BM25 hybrid search, Rust core, +1.0%p NDCG
- OpenClaw Memory System — Hot/cold tiers, agentic compaction