fix(mship): add tool watchdog by Sg312 · Pull Request #4991 · simstudioai/sim

Sg312 · 2026-06-12T02:32:00Z

Summary

Add tool timeout for vfs reads

Type of Change

Bug fix

Testing

Manual

Checklist

Code follows project style guidelines
Self-reviewed my changes
Tests added/updated and passing
No new warnings introduced
I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

vercel · 2026-06-12T02:32:06Z

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

Project	Deployment	Actions	Updated (UTC)
docs	Skipped		Jun 12, 2026 3:48am

greptile-apps · 2026-06-12T02:38:35Z

Greptile Summary

This PR adds a two-layer tool watchdog to prevent hung async tool executions from permanently blocking the copilot checkpoint loop. Previously, a VFS read (or any tool) that neither resolved nor rejected would park the loop's Promise.allSettled forever, wedging the chat's pending-stream lock.

Per-tool watchdog (executeToolWithWatchdog): wraps every tool execution in a Promise.race against a setTimeout. Short-lived tools (reads, metadata ops) get 60 s; legitimately long-running tools (workflow runs, media generation, research, E2B file ops) get the full orchestration budget so their inner timeouts can fire first.
Structural backstop (run.ts checkpoint loop): replaces the bare Promise.allSettled with a raced wait-budget (max watchdog across pending tools + 30 s grace). If the budget expires, forceFailHungToolCall settles each hung tool into a terminal error state — persisting a failed async row and publishing a terminal confirmation — then the loop resumes Go with an error result instead of hanging.

Confidence Score: 4/5

Safe to merge — the watchdog correctly unblocks hung tool calls without interfering with tools that complete normally, and state transitions are verified to satisfy the downstream result builder.

The two-layer design is sound: per-tool watchdog in executeToolWithWatchdog and a structural backstop in the checkpoint loop. The forceFailHungToolCall path correctly sets tool.result before the resume body builder checks it, preventing the missing result exception. The unawaited publishTerminalToolConfirmation in forceFailHungToolCall means a client could be left with a spinning tool indicator if that call fails silently, but this is a minor edge case in an already-error recovery path.

executor.ts — the forceFailHungToolCall function's unawaited terminal confirmation publish is the only call site that lacks error handling; all other code paths in the file handle errors consistently.

Important Files Changed

Filename	Overview
apps/sim/lib/copilot/constants.ts	Adds three new timeout constants: TOOL_WATCHDOG_DEFAULT_MS (60s), TOOL_WATCHDOG_LONG_RUNNING_MS (alias of ORCHESTRATION_TIMEOUT_MS), and TOOL_WATCHDOG_RESUME_GRACE_MS (30s). Values are well-reasoned and well-commented.
apps/sim/lib/copilot/request/tools/executor.ts	Adds LONG_RUNNING_TOOL_IDS set, toolWatchdogTimeoutMs classifier, executeToolWithWatchdog (per-tool race against a hard timer), and forceFailHungToolCall (last-resort state settlement). Core logic is correct; publishTerminalToolConfirmation is fire-and-forget without error handling, consistent with existing usage patterns.
apps/sim/lib/copilot/request/lifecycle/run.ts	Replaces bare Promise.allSettled with a raced wait-budget (max tool watchdog + grace), then iterates hung tools sequentially calling forceFailHungToolCall. State transitions verified to satisfy the downstream result builder's !tool.result guard.
apps/sim/lib/copilot/request/lifecycle/run.test.ts	New test uses vi.useFakeTimers, stubs forceFailHungToolCall with the correct state shape (matching setTerminalToolCallState output), and validates that forceFailHungToolCall is called, the resume URL is used, and the error result is correctly serialized.

Sequence Diagram

sequenceDiagram
    participant CL as Checkpoint Loop
    participant WD as executeToolWithWatchdog
    participant TH as Tool Handler
    participant FF as forceFailHungToolCall
    participant Go as Go Resume Endpoint

    CL->>WD: executeToolAndReport(toolCallId)
    WD->>TH: executeTool(name, params)

    alt Tool completes within watchdog cap
        TH-->>WD: result
        WD-->>CL: pendingPromise resolves
        CL->>Go: POST /api/tools/resume with results
    else Per-tool watchdog fires first
        WD-->>CL: rejects with ToolExecutionTimeoutError
        CL->>Go: POST /api/tools/resume with error result
    else Tool hangs past waitBudget (watchdog + grace)
        Note over CL: sleep(maxWatchdog + 30s) wins race
        CL->>FF: forceFailHungToolCall(toolCallId)
        FF->>FF: setTerminalToolCallState error
        FF->>FF: completeAsyncToolCall failed
        FF->>FF: publishTerminalToolConfirmation
        FF-->>CL: done
        CL->>Go: POST /api/tools/resume with error result
    end

_{Reviews (1): Last reviewed commit: "Add tool watchdog" | Re-trigger Greptile}

Add tool watchdog

4b47b91

greptile-apps Bot reviewed Jun 12, 2026

View reviewed changes

Comment thread apps/sim/lib/copilot/request/lifecycle/run.ts

Comment thread apps/sim/lib/copilot/request/tools/executor.ts

Updates

e23d2b3

vercel Bot temporarily deployed to Preview June 12, 2026 02:39 Inactive

Greptile comment

ee881ae

vercel Bot temporarily deployed to Preview June 12, 2026 02:41 Inactive

FIxes

8475399

vercel Bot temporarily deployed to Preview June 12, 2026 03:28 Inactive

Fix

0edd031

vercel Bot temporarily deployed to Preview June 12, 2026 03:48 Inactive

Sg312 merged commit ebf434f into staging Jun 12, 2026
14 checks passed

waleedlatif1 mentioned this pull request Jun 12, 2026

v0.7.4: round-robin byok support, table block fix, db read replica routing, trigger.dev, temporal, latex, quartr, brex, convex integrations #4978

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(mship): add tool watchdog#4991

fix(mship): add tool watchdog#4991
Sg312 merged 5 commits into
stagingfrom
improvement/mship-tool-watchdog

Sg312 commented Jun 12, 2026

Uh oh!

vercel Bot commented Jun 12, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented Jun 12, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Sg312 commented Jun 12, 2026

Summary

Type of Change

Testing

Checklist

Uh oh!

vercel Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

greptile-apps Bot commented Jun 12, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

vercel Bot commented Jun 12, 2026 •

edited

Loading