Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
fix(executor): address code review feedback on retry logic
- Scope retries to LLM blocks only (agent, evaluator, router) to avoid
  duplicate side effects on non-idempotent blocks like HTTP, email, webhook
- Treat statusless errors as non-retryable by default — only retry known
  network errors (ECONNRESET, ETIMEDOUT, etc.) not JS runtime errors
- Respect Retry-After header for 503 responses in addition to 429 (RFC 7231)
- Add afterEach vi.useRealTimers() to prevent timer leak between tests
  • Loading branch information
Rabba-Meghana committed Mar 25, 2026
commit 70ce97fdf02cc77d087b3ac9a19f883ecc128c6b
18 changes: 13 additions & 5 deletions apps/sim/executor/execution/block-executor.ts
Original file line number Diff line number Diff line change
Expand Up @@ -121,11 +121,19 @@ export class BlockExecutor {
cleanupSelfReference?.()

try {
const output = await withRetry(
() => handler.executeWithNode
? handler.executeWithNode(ctx, block, resolvedInputs, nodeMetadata)
: handler.execute(ctx, block, resolvedInputs)
)
const isLLMBlock = isAgentBlockType(block.metadata?.id) ||
block.metadata?.id === BlockType.EVALUATOR ||
block.metadata?.id === BlockType.ROUTER ||
block.metadata?.id === BlockType.ROUTER_V2
const output = isLLMBlock
? await withRetry(
() => handler.executeWithNode
? handler.executeWithNode(ctx, block, resolvedInputs, nodeMetadata)
: handler.execute(ctx, block, resolvedInputs)
)
: await (handler.executeWithNode
? handler.executeWithNode(ctx, block, resolvedInputs, nodeMetadata)
: handler.execute(ctx, block, resolvedInputs))
Comment thread
Rabba-Meghana marked this conversation as resolved.

const isStreamingExecution =
output && typeof output === 'object' && 'stream' in output && 'execution' in output
Expand Down
4 changes: 4 additions & 0 deletions apps/sim/executor/utils/retry.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,10 @@ describe('withRetry', () => {
vi.useFakeTimers()
})

afterEach(() => {
vi.useRealTimers()
})

it('returns result immediately on success', async () => {
Comment on lines +6 to +13
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing afterEach to restore real timers

vi.useFakeTimers() is called in beforeEach but there is no corresponding vi.useRealTimers() in an afterEach. If Vitest runs these tests in the same worker as other test files that depend on real timers (e.g. setTimeout, Date.now()), the leaked fake timer state can cause subtle test failures elsewhere.

afterEach(() => {
  vi.useRealTimers()
})

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 70ce97f — addressed all review feedback: scoped retries to LLM blocks only, added isNetworkError() for statusless errors, added Retry-After support for 503, and added afterEach vi.useRealTimers() to tests.

const fn = vi.fn().mockResolvedValue('ok')
const result = await withRetry(fn)
Expand Down
28 changes: 23 additions & 5 deletions apps/sim/executor/utils/retry.ts
Original file line number Diff line number Diff line change
Expand Up @@ -36,10 +36,24 @@ function parseRetryAfterHeader(headers: Headers): number | null {
return null
}

/**
* Returns true only for known transient network errors (no HTTP status code).
* Deliberately excludes JS runtime errors (TypeError, RangeError, etc.)
* to avoid masking bugs with silent retries.
*/
function isNetworkError(error: unknown): boolean {
if (!(error instanceof Error)) return false
const networkErrorCodes = ['ECONNRESET', 'ECONNREFUSED', 'ETIMEDOUT', 'ENOTFOUND', 'EPIPE']
const code = (error as any)?.code
if (code && networkErrorCodes.includes(code)) return true
const networkErrorNames = ['FetchError', 'NetworkError', 'AbortError']
return networkErrorNames.includes(error.name)
}

/**
* Wraps an async function with retry logic using exponential backoff and jitter.
* Respects Retry-After headers from LLM providers on 429 responses.
* Only retries on transient errors (429, 503, 529) — never on user errors (4xx).
* Respects Retry-After headers from LLM providers on 429 and 503 responses.
* Only retries on known transient errors — never on JS runtime errors or user errors (4xx).
*/
export async function withRetry<T>(
fn: () => Promise<T>,
Expand All @@ -63,17 +77,21 @@ export async function withRetry<T>(
const status = (error as any)?.status ?? (error as any)?.statusCode ?? null
const responseHeaders: Headers | null = (error as any)?.headers ?? null

// Only retry known transient HTTP codes or recognised network errors
// Statusless JS errors (TypeError, RangeError, etc.) are NOT retried
const isRetryable =
status === null ||
retryableStatusCodes.includes(status)
status !== null
? retryableStatusCodes.includes(status)
: isNetworkError(error)

if (!isRetryable) {
logger.warn('Non-retryable error, aborting retry loop', { status, attempt })
throw error
}

// Respect Retry-After header for both 429 and 503 (RFC 7231 §7.1.3)
let delayMs: number
if (responseHeaders && status === 429) {
if (responseHeaders && (status === 429 || status === 503)) {
const retryAfterMs = parseRetryAfterHeader(responseHeaders)
delayMs = retryAfterMs ?? calculateBackoffDelay(attempt, initialDelayMs, maxDelayMs)
Comment thread
cursor[bot] marked this conversation as resolved.
Outdated
} else {
Expand Down