UN-3344 [FIX] Unified retry for LLM and embedding providers #1886
muhammad-ali-e merged 12 commits into main
Conversation
litellm's retry only works for SDK-based providers (OpenAI/Azure). httpx-based providers (Anthropic, Vertex, Bedrock, Mistral) and ALL embedding calls silently ignore max_retries. This adds self-managed retry with exponential backoff at the SDK layer, disabling litellm's own retry entirely for consistency. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
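The self-managed retry with exponential backoff described here can be sketched as follows. This is a minimal illustration, not the PR's actual retry_utils.py; the signature, defaults, and jitter scheme are assumptions:

```python
import random
import time


def call_with_retry(fn, *, max_retries, retry_predicate, base_delay=1.0,
                    multiplier=2.0, max_delay=60.0):
    """Call fn(), retrying transient failures with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as e:
            # Give up on non-retryable errors or once retries are exhausted.
            if not retry_predicate(e) or attempt >= max_retries:
                raise
            delay = base_delay * (multiplier ** attempt) + random.uniform(0, base_delay)
            time.sleep(min(delay, max_delay))


calls = {"n": 0}

def flaky():
    """Fails twice with a transient error, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient provider error")
    return "ok"

result = call_with_retry(
    flaky,
    max_retries=3,
    retry_predicate=lambda e: isinstance(e, TimeoutError),
    base_delay=0.01,  # tiny delay just for the demo
)
```

Because the retry lives at the SDK layer, it applies uniformly whether litellm routes the call through an SDK client or plain httpx.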
Note: Reviews paused. This branch is under active development; to avoid overwhelming the author with review comments during an influx of new commits, CodeRabbit has automatically paused this review.
Walkthrough
Adds configurable retry support: adapter JSON schemas gain an optional `max_retries` field.

Changes
Sequence Diagram(s)
sequenceDiagram
participant Client as Client Code
participant Module as Embedding/LLM Module
participant RetryUtil as Retry Utils
participant LiteLLM as LiteLLM
participant Provider as External Provider
Client->>Module: Call get_embedding / completion
Module->>Module: Extract max_retries from kwargs
Module->>Module: Disable LiteLLM internal retry (set max_retries=0, num_retries=0)
Module->>RetryUtil: Invoke call_with_retry / acall_with_retry / iter_with_retry
loop Until success or max_retries exhausted
RetryUtil->>LiteLLM: Execute wrapped call
LiteLLM->>Provider: Send request
alt Provider success
Provider-->>LiteLLM: Return response
LiteLLM-->>RetryUtil: Return result
RetryUtil-->>Client: Return result
else Transient provider error
Provider-->>LiteLLM: Error
LiteLLM-->>RetryUtil: Raise exception
RetryUtil->>RetryUtil: is_retryable_litellm_error()? → True
RetryUtil->>RetryUtil: Compute backoff delay
RetryUtil->>RetryUtil: Wait & retry
else Non-retryable error
Provider-->>LiteLLM: Error
LiteLLM-->>RetryUtil: Raise exception
RetryUtil-->>Client: Raise exception
end
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Move retry loops out of LLM/Embedding classes into generic call_with_retry, acall_with_retry, and iter_with_retry functions in retry_utils.py. Both classes now call these directly instead of maintaining their own retry helper methods. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
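The streaming variant is the interesting one: as the review notes further down, it can only retry before the first item has been yielded. A minimal sketch of that behavior; names are illustrative and the backoff sleep is omitted for brevity, so this is not the PR's code:

```python
def iter_with_retry(fn, *, max_retries, retry_predicate):
    """Yield from fn() with retry; retries only before the first item is yielded."""
    for attempt in range(max_retries + 1):
        has_yielded = False
        try:
            for item in fn():  # fn must return a fresh iterable on each call
                has_yielded = True
                yield item
            return
        except Exception as e:
            # Once output reached the caller, partial results can't be un-yielded.
            if has_yielded or not retry_predicate(e) or attempt >= max_retries:
                raise


attempts = {"n": 0}

def stream():
    """Fails before the first chunk on the first call, then streams normally."""
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise ConnectionError("transient")
    yield from ("chunk-1", "chunk-2")

chunks = list(iter_with_retry(stream, max_retries=2,
                              retry_predicate=lambda e: isinstance(e, ConnectionError)))
```

Passing a factory (rather than a live generator) is what makes the retry safe: each attempt gets a fresh stream.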
| Filename | Overview |
|---|---|
| unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py | Core of the PR: adds is_retryable_litellm_error (MRO-safe, builtins-aware), _get_retry_delay (shared decision + backoff helper), pop_litellm_retry_kwargs (zeros both max_retries and num_retries), and call_with_retry/acall_with_retry/iter_with_retry wrappers. All previously flagged issues resolved. |
| unstract/sdk1/src/unstract/sdk1/llm.py | complete/acomplete/stream_complete all migrated to pop_litellm_retry_kwargs + call_with_retry/acall_with_retry/iter_with_retry; litellm double-retry properly disabled. |
| unstract/sdk1/src/unstract/sdk1/embedding.py | All four embedding paths (sync/async, single/batch) migrated to pop_litellm_retry_kwargs + call_with_retry/acall_with_retry; kwargs.copy() pattern prevents mutation of self.kwargs across retries. |
| unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/azure.json | Added max_retries field (default 3) to Azure embedding UI schema; consistent with OpenAI/Bedrock embedding schemas. |
| unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/vertexai.json | Added max_retries field (default 3); consistent with the pre-existing VertexAI embedding model which uses BaseEmbeddingParameters default timeout. |
| unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/ollama.json | Added max_retries field (default 3) to Ollama LLM; OllamaLLMParameters inherits max_retries from BaseChatCompletionParameters so pydantic validation handles it correctly. |
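Based on the table above, pop_litellm_retry_kwargs plausibly behaves like this sketch; the signature, return shape, and the default of 3 are assumptions taken from the schema descriptions, not the actual helper:

```python
DEFAULT_MAX_RETRIES = 3  # assumed default, matching the JSON schemas above


def pop_litellm_retry_kwargs(kwargs: dict) -> tuple[int, dict]:
    """Extract the user-facing max_retries and disable litellm's own retry."""
    kwargs = kwargs.copy()  # avoid mutating the caller's kwargs across retries
    max_retries = int(kwargs.pop("max_retries", DEFAULT_MAX_RETRIES))
    # Zero both knobs so litellm never retries internally (no double retry).
    kwargs["max_retries"] = 0
    kwargs["num_retries"] = 0
    return max_retries, kwargs


retries, call_kwargs = pop_litellm_retry_kwargs({"model": "gpt-4o", "max_retries": 5})
```

The copy mirrors the `kwargs.copy()` pattern called out for embedding.py, which prevents mutation of `self.kwargs` across retries.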
Sequence Diagram
sequenceDiagram
participant C as Caller LLM/Embedding
participant P as pop_litellm_retry_kwargs
participant R as call_with_retry or iter_with_retry
participant D as _get_retry_delay
participant L as litellm completion or embedding
C->>P: kwargs including user max_retries
P-->>C: extracted max_retries, kwargs zeroed for litellm
loop attempt 0 to max_retries
C->>R: fn lambda, max_retries, is_retryable_litellm_error
R->>L: litellm call with max_retries=0 and num_retries=0
alt Success
L-->>R: response
R-->>C: result
else Transient error 408 429 5xx or APIConnectionError
L-->>R: Exception
R->>D: error, attempt, max_retries, predicate
D->>D: check Retry-After header then exponential backoff
D-->>R: delay seconds or None if give up or non-retryable
alt delay returned
R->>R: sleep delay then retry
else None returned
R-->>C: raise original exception
end
end
end
Reviews (11) | Last reviewed commit: "Merge branch 'main' into fix/unified-ret..."
- Extract _get_retry_delay() shared helper to eliminate duplicated retry decision logic across call_with_retry, acall_with_retry, iter_with_retry, and retry_with_exponential_backoff
- Add num_retries=0 to embedding._pop_retry_params() to fully disable litellm's internal retry for embedding calls
- Expose max_retries in UI JSON schemas for embedding adapters (OpenAI, Azure, VertexAI, Ollama) and Ollama LLM — previously the field existed in Pydantic models but wasn't shown to users, silently defaulting to 0 retries
- Add debug logging to LLM and Embedding retry parameter extraction
- Clarify docstrings distinguishing is_retryable_litellm_error() from is_retryable_error() (different exception hierarchies)
- Remove stale noqa: C901 from simplified retry_with_exponential_backoff

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…dapters Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…check

- Fix `requests.ConnectionError` shadowing Python's builtin `ConnectionError` in `is_retryable_litellm_error()` — rename import to `RequestsConnectionError` and use `builtins.ConnectionError` / `builtins.TimeoutError` explicitly
- Use `__mro__`-based class name check instead of `type(error).__name__` to also catch subclasses of retryable error types
- P1 (num_retries not zeroed) was already fixed in prior commit

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
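The MRO-based, builtins-aware check this commit describes might look roughly like the sketch below; the allowlist contents and exact logic are illustrative assumptions, not the module's real code:

```python
import builtins

# Illustrative allowlist; the real module's set is larger.
_RETRYABLE_ERROR_NAMES = frozenset({
    "RateLimitError",
    "APIConnectionError",
    "ServiceUnavailableError",
    "InternalServerError",
    "APITimeoutError",
})


def is_retryable_litellm_error(error: Exception) -> bool:
    """True if the error (or any base class) is a known transient failure."""
    # Use the builtins explicitly so a shadowing import (e.g. from
    # requests.exceptions) can't narrow the check by accident.
    if isinstance(error, (builtins.ConnectionError, builtins.TimeoutError)):
        return True
    # __mro__-based name check also catches subclasses of retryable types.
    return any(cls.__name__ in _RETRYABLE_ERROR_NAMES for cls in type(error).__mro__)


# Hypothetical exception classes standing in for provider-mapped errors:
class RateLimitError(Exception): ...
class CustomRateLimit(RateLimitError): ...
```

Matching on class names rather than imported types keeps the check working across litellm's provider-specific exception wrappers.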
Actionable comments posted: 2
🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py (1)
254-317: Extract the sync retry loop instead of maintaining two copies.
`retry_with_exponential_backoff()` now reimplements the same attempt/sleep/decision flow as `call_with_retry()`, which is the SonarCloud failure on line 254. Pulling the common loop into one helper will lower the branching here and keep the decorator path from drifting away from the SDK path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py` around lines 254 - 317, The retry logic in retry_with_exponential_backoff duplicates the attempt/sleep/decision flow from call_with_retry; extract that shared loop into a single helper function (e.g., _run_retry_loop or reuse call_with_retry signature) that encapsulates attempts, calling the target, invoking _get_retry_delay, sleeping, logging, and raising, then have retry_with_exponential_backoff's decorator/wrapper call this helper (passing func, max_retries, base_delay, multiplier, jitter, exceptions, logger_instance, prefix, retry_predicate) to eliminate the duplicated flow and keep both the decorator path and call_with_retry using the same implementation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py`:
- Around line 113-186: The three retry helpers (call_with_retry,
acall_with_retry, iter_with_retry) must validate max_retries is >= 0 before
entering their loops; add an early check in each function (call_with_retry,
acall_with_retry, iter_with_retry) that raises a ValueError (or similar) when
max_retries is negative so we fail fast instead of returning None or yielding
nothing when range(max_retries + 1) is empty; this mirrors the existing
decorator validation and prevents silent no-op behavior.
- Around line 33-57: The isinstance check in is_retryable_litellm_error
currently uses a shadowed ConnectionError (imported from requests.exceptions)
which narrows detection to requests-only errors; update the import/usage so the
check uses the built-in ConnectionError and TimeoutError types (i.e., remove or
rename the requests import and reference the built-ins in
is_retryable_litellm_error) and also add "APITimeoutError" to the
_RETRYABLE_ERROR_NAMES allowlist so OpenAI/OpenAI-mapped 408 timeout exceptions
are recognized by the name-based check.
---
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py`:
- Around line 254-317: The retry logic in retry_with_exponential_backoff
duplicates the attempt/sleep/decision flow from call_with_retry; extract that
shared loop into a single helper function (e.g., _run_retry_loop or reuse
call_with_retry signature) that encapsulates attempts, calling the target,
invoking _get_retry_delay, sleeping, logging, and raising, then have
retry_with_exponential_backoff's decorator/wrapper call this helper (passing
func, max_retries, base_delay, multiplier, jitter, exceptions, logger_instance,
prefix, retry_predicate) to eliminate the duplicated flow and keep both the
decorator path and call_with_retry using the same implementation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b73e0a58-1ac6-4696-9a91-27485df74d1b
📒 Files selected for processing (9)
unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/azure.json
unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/bedrock.json
unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/ollama.json
unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/openai.json
unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/vertexai.json
unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/ollama.json
unstract/sdk1/src/unstract/sdk1/embedding.py
unstract/sdk1/src/unstract/sdk1/llm.py
unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py
🧹 Nitpick comments (2)
unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py (2)
257-324: Consider extracting the innerwrapperfunction to reduce cognitive complexity.SonarCloud flags cognitive complexity of 20 (allowed: 15). The nested decorator structure with try/except and conditional logging contributes to this. You could extract the "giving up" logging block or the entire wrapper body into a helper function.
Possible refactor approach

```diff
+def _log_final_failure(
+    logger_instance: logging.Logger,
+    func_name: str,
+    attempt: int,
+    prefix: str,
+) -> None:
+    """Log when retry attempts are exhausted."""
+    if attempt > 0:
+        logger_instance.exception(
+            "Giving up '%s' after %d attempt(s) for %s",
+            func_name,
+            attempt + 1,
+            prefix,
+        )
+
 def retry_with_exponential_backoff(...) -> Callable:
     ...
     def decorator(func: Callable) -> Callable:
         @wraps(func)
         def wrapper(*args: Any, **kwargs: Any) -> Any:
             for attempt in range(max_retries + 1):
                 try:
                     result = func(*args, **kwargs)
                     ...
                     return result
                 except exceptions as e:
                     delay = _get_retry_delay(...)
                     if delay is None:
-                        if attempt > 0:
-                            logger_instance.exception(
-                                "Giving up '%s' after %d attempt(s) for %s",
-                                func.__name__,
-                                attempt + 1,
-                                prefix,
-                            )
+                        _log_final_failure(logger_instance, func.__name__, attempt, prefix)
                         raise
                     time.sleep(delay)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py` around lines 257 - 324, The wrapper inside retry_with_exponential_backoff is too complex; extract the retry loop or the "giving up" logging and sleep/raise logic into a helper function to reduce cognitive complexity. Create a private helper (e.g., _execute_with_retries or _handle_attempt) that accepts func, args, kwargs, max_retries, base_delay, multiplier, jitter, exceptions, retry_predicate, prefix, and logger_instance and returns the result or raises; have wrapper call this helper and keep wrapper only responsible for `@wraps` and invoking the helper; ensure the helper uses _get_retry_delay and preserves the existing logging messages (including the success log in wrapper when attempt > 0 or move that logging into the helper) so behavior is unchanged.
138-145: Consider tightening type hints for the `fn` parameter. The current `Callable[[], object]` is imprecise:

- `acall_with_retry` awaits the result, so `fn` should return an awaitable
- `iter_with_retry` iterates the result, so `fn` should return an iterable

This won't cause runtime issues but weakens static type checking.

Suggested type hints

```python
from collections.abc import Awaitable, Iterable
from typing import TypeVar

T = TypeVar("T")

async def acall_with_retry(
    fn: Callable[[], Awaitable[T]],
    *,
    max_retries: int,
    ...
) -> T: ...

def iter_with_retry(
    fn: Callable[[], Iterable[T]],
    *,
    max_retries: int,
    ...
) -> Generator[T, None, None]: ...
```

Also applies to: 160-167
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py` around lines 138 - 145, Tighten the type hints for acall_with_retry and iter_with_retry: introduce a TypeVar T and change the fn parameter for acall_with_retry to Callable[[], Awaitable[T]] with the function returning T, and change iter_with_retry to accept Callable[[], Iterable[T]] and return Generator[T, None, None]; import Awaitable and Iterable from collections.abc (and Generator if not already present) and TypeVar from typing to make the signatures precise and improve static checking while keeping existing runtime behavior (refer to function names acall_with_retry and iter_with_retry and update their return annotations accordingly).
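For completeness, the async variant under the Awaitable-returning-factory typing discussed above could be sketched like this; it is illustrative only, with an assumed backoff and signature:

```python
import asyncio


async def acall_with_retry(fn, *, max_retries, retry_predicate, base_delay=0.01):
    """Await fn() with retry; fn is a zero-arg factory returning a fresh awaitable."""
    for attempt in range(max_retries + 1):
        try:
            return await fn()
        except Exception as e:
            if not retry_predicate(e) or attempt >= max_retries:
                raise
            # Non-blocking backoff so the event loop keeps serving other tasks.
            await asyncio.sleep(base_delay * (2 ** attempt))


state = {"n": 0}

async def flaky():
    """Fails twice with a transient error, then succeeds."""
    state["n"] += 1
    if state["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

result = asyncio.run(acall_with_retry(
    flaky, max_retries=3, retry_predicate=lambda e: isinstance(e, ConnectionError)
))
```

A factory is required here for the same reason as in the streaming case: a coroutine object can only be awaited once, so each retry must create a new one.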
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py`:
- Around line 257-324: The wrapper inside retry_with_exponential_backoff is too
complex; extract the retry loop or the "giving up" logging and sleep/raise logic
into a helper function to reduce cognitive complexity. Create a private helper
(e.g., _execute_with_retries or _handle_attempt) that accepts func, args,
kwargs, max_retries, base_delay, multiplier, jitter, exceptions,
retry_predicate, prefix, and logger_instance and returns the result or raises;
have wrapper call this helper and keep wrapper only responsible for `@wraps` and
invoking the helper; ensure the helper uses _get_retry_delay and preserves the
existing logging messages (including the success log in wrapper when attempt > 0
or move that logging into the helper) so behavior is unchanged.
- Around line 138-145: Tighten the type hints for acall_with_retry and
iter_with_retry: introduce a TypeVar T and change the fn parameter for
acall_with_retry to Callable[[], Awaitable[T]] with the function returning T,
and change iter_with_retry to accept Callable[[], Iterable[T]] and return
Generator[T, None, None]; import Awaitable and Iterable from collections.abc
(and Generator if not already present) and TypeVar from typing to make the
signatures precise and improve static checking while keeping existing runtime
behavior (refer to function names acall_with_retry and iter_with_retry and
update their return annotations accordingly).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 04b772de-c8c2-491a-9b94-28b8bcc53a9a
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py
…tries

- Add APITimeoutError to _RETRYABLE_ERROR_NAMES for explicit OpenAI SDK timeout coverage
- Add _validate_max_retries() guard to call_with_retry, acall_with_retry, iter_with_retry to fail fast on negative values instead of silently returning None

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
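The fail-fast guard this commit adds plausibly looks like the following sketch; the error message and wiring are assumptions:

```python
def _validate_max_retries(max_retries: int) -> None:
    """Fail fast on negative values instead of silently skipping the call loop."""
    if max_retries < 0:
        raise ValueError(f"max_retries must be >= 0, got {max_retries}")


def call_with_retry(fn, *, max_retries, retry_predicate):
    # Without this guard, range(-1 + 1) is empty: the loop body never runs
    # and the function silently returns None instead of calling fn() at all.
    _validate_max_retries(max_retries)
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as e:
            if not retry_predicate(e) or attempt >= max_retries:
                raise
```

This mirrors the validation the decorator already performed, so all entry points reject bad configuration the same way.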
🧹 Nitpick comments (3)
unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py (3)
266-333: Consider reducing cognitive complexity. SonarCloud flagged this function with cognitive complexity of 20 (allowed: 15). The nested conditionals within the exception handlers contribute to this. Consider extracting the "giving up" logging block (lines 319-325) into a helper or simplifying the control flow.

💡 Suggested refactor

```diff
+def _log_giving_up(
+    logger_instance: logging.Logger,
+    func_name: str,
+    attempt: int,
+    prefix: str,
+) -> None:
+    """Log when giving up after all retry attempts."""
+    if attempt > 0:
+        logger_instance.exception(
+            "Giving up '%s' after %d attempt(s) for %s",
+            func_name,
+            attempt + 1,
+            prefix,
+        )
+
 def retry_with_exponential_backoff(
     ...
 ) -> Callable:
     def decorator(func: Callable) -> Callable:
         @wraps(func)
         def wrapper(*args: Any, **kwargs: Any) -> Any:
             for attempt in range(max_retries + 1):
                 try:
                     result = func(*args, **kwargs)
                     if attempt > 0:
                         logger_instance.info(
                             "Successfully completed '%s' after %d retry attempt(s)",
                             func.__name__,
                             attempt,
                         )
                     return result
                 except exceptions as e:
                     delay = _get_retry_delay(...)
                     if delay is None:
-                        if attempt > 0:
-                            logger_instance.exception(
-                                "Giving up '%s' after %d attempt(s) for %s",
-                                func.__name__,
-                                attempt + 1,
-                                prefix,
-                            )
+                        _log_giving_up(logger_instance, func.__name__, attempt, prefix)
                         raise
                     time.sleep(delay)
                 except Exception:
                     raise
         return wrapper
     return decorator
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py` around lines 266 - 333, The retry_with_exponential_backoff wrapper has high cognitive complexity due to nested exception handling and an inline "giving up" logging block; extract that block into a small helper (e.g., _log_giving_up or _handle_give_up) and call it from inside the except exceptions as e: branch when _get_retry_delay returns None, passing logger_instance, prefix, func.__name__, and attempt so the main wrapper focuses on control flow only, reducing nesting; keep the existing calls to _get_retry_delay and time.sleep unchanged and ensure the helper performs the logger.exception call and any formatting before re-raising the exception.
145-152: Type hint should reflect awaitable return type. The `fn` parameter is awaited on line 158, so the type hint should be `Callable[[], Awaitable[object]]` rather than `Callable[[], object]`.

💡 Suggested fix

```diff
+from collections.abc import Awaitable
+
 async def acall_with_retry(
-    fn: Callable[[], object],
+    fn: Callable[[], Awaitable[object]],
     *,
     max_retries: int,
     retry_predicate: Callable[[Exception], bool],
     description: str = "",
     logger_instance: logging.Logger | None = None,
 ) -> object:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py` around lines 145 - 152, Update the type hint for the `fn` parameter in `acall_with_retry` to reflect that it's awaited: change `Callable[[], object]` to `Callable[[], Awaitable[object]]`, and add the necessary `Awaitable` import from `typing` (or `collections.abc`) at the top of the module so the annotation is valid; keep the function's async signature and return annotation as-is.
168-175: Type hint should reflect iterable return type. The `fn` result is iterated on line 186, so the type hint should reflect this, e.g., `Callable[[], Iterable[T]]`.

💡 Suggested fix

```diff
+from collections.abc import Iterable
+from typing import TypeVar
+
+T = TypeVar("T")
+
 def iter_with_retry(
-    fn: Callable[[], object],
+    fn: Callable[[], Iterable[T]],
     *,
     max_retries: int,
     retry_predicate: Callable[[Exception], bool],
     description: str = "",
     logger_instance: logging.Logger | None = None,
-) -> Generator:
+) -> Generator[T, None, None]:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py` around lines 168 - 175, The type hint for iter_with_retry is wrong because fn is expected to return an iterable; update the signature to use a generic TypeVar (e.g., T) and change fn's type to Callable[[], Iterable[T]] and the function's return type to Generator[T, None, None] so callers of iter_with_retry and the loop over fn() have correct typing; ensure you import or declare typing.TypeVar and Iterable where needed and update any related annotations in iter_with_retry accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py`:
- Around line 266-333: The retry_with_exponential_backoff wrapper has high
cognitive complexity due to nested exception handling and an inline "giving up"
logging block; extract that block into a small helper (e.g., _log_giving_up or
_handle_give_up) and call it from inside the except exceptions as e: branch when
_get_retry_delay returns None, passing logger_instance, prefix, func.__name__,
and attempt so the main wrapper focuses on control flow only, reducing nesting;
keep the existing calls to _get_retry_delay and time.sleep unchanged and ensure
the helper performs the logger.exception call and any formatting before
re-raising the exception.
- Around line 145-152: Update the type hint for the `fn` parameter in
`acall_with_retry` to reflect that it's awaited: change `Callable[[], object]`
to `Callable[[], Awaitable[object]]`, and add the necessary `Awaitable` import
from `typing` (or `collections.abc`) at the top of the module so the annotation
is valid; keep the function's async signature and return annotation as-is.
- Around line 168-175: The type hint for iter_with_retry is wrong because fn is
expected to return an iterable; update the signature to use a generic TypeVar
(e.g., T) and change fn's type to Callable[[], Iterable[T]] and the function's
return type to Generator[T, None, None] so callers of iter_with_retry and the
loop over fn() have correct typing; ensure you import or declare typing.TypeVar
and Iterable where needed and update any related annotations in iter_with_retry
accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 22f7cf9d-7244-483f-90ab-c81ca438acc8
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py
pk-zipstack
left a comment
@chandrasekharan-zipstack Looks like there are some comments from sonarcloud. Please check them out. LGTM otherwise.
…lause Address SonarCloud findings on PR #1886:

- S3776: Flatten retry_with_exponential_backoff.wrapper by moving the success logging + return out of the try block and using `continue` in the retry path, so the except branch only handles the give-up case.
- S2737: Drop the `except Exception: raise` clause — it was a no-op that added complexity without changing behavior (non-matching exceptions propagate naturally).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🧹 Nitpick comments (2)
unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py (2)
292-328: Optional: trim one point of cognitive complexity to satisfy SonarCloud (16 → 15). SonarCloud reports 16 vs. threshold 15. The easiest reduction is extracting the give-up logging into a small helper so `wrapper` has one less nested branch; alternatively, just raise the threshold if the team prefers. Not blocking.

♻️ Example extraction

```diff
+    def _log_giveup(func: Callable, attempt: int) -> None:
+        if attempt > 0:
+            logger_instance.exception(
+                "Giving up '%s' after %d attempt(s) for %s",
+                func.__name__,
+                attempt + 1,
+                prefix,
+            )
+
     def decorator(func: Callable) -> Callable:
         @wraps(func)
         def wrapper(*args: Any, **kwargs: Any) -> Any:  # noqa: ANN401
             for attempt in range(max_retries + 1):
                 try:
                     result = func(*args, **kwargs)
                 except exceptions as e:
                     delay = _get_retry_delay(
                         e,
                         attempt,
                         max_retries,
                         retry_predicate,
                         prefix,
                         logger_instance,
                         base_delay,
                         multiplier,
                         60.0,
                         jitter,
                     )
                     if delay is not None:
                         time.sleep(delay)
                         continue
-                    if attempt > 0:
-                        logger_instance.exception(
-                            "Giving up '%s' after %d attempt(s) for %s",
-                            func.__name__,
-                            attempt + 1,
-                            prefix,
-                        )
+                    _log_giveup(func, attempt)
                     raise
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py` around lines 292 - 328, The wrapper function inside decorator has one extra nested branch for the "give up" logging; extract that logic into a small helper function (e.g., _log_give_up or _log_and_raise) referenced from decorator/wrapper so wrapper no longer contains the nested if that calls logger_instance.exception and raise; call the helper with parameters (logger_instance, func.__name__, attempt, prefix) after _get_retry_delay returns None and before re-raising the exception, and keep existing behavior of logging only when attempt > 0 and then re-raising the exception.
145-175: Tighten `fn` type hints for async and generator variants. `acall_with_retry` awaits `fn()` and `iter_with_retry` iterates over `fn()`, but both declare `fn: Callable[[], object]`. Type checkers will not catch a caller that passes a plain sync function into `acall_with_retry`, or a non-iterable factory into `iter_with_retry`. Recommend narrowing:

♻️ Suggested tightening

```diff
-from collections.abc import Callable, Generator
+from collections.abc import Awaitable, Callable, Generator, Iterable
@@
 async def acall_with_retry(
-    fn: Callable[[], object],
+    fn: Callable[[], Awaitable[object]],
     *,
@@
 def iter_with_retry(
-    fn: Callable[[], object],
+    fn: Callable[[], Iterable[object]],
     *,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py` around lines 145 - 175, The signatures for acall_with_retry and iter_with_retry should be made generic and their fn arguments narrowed: introduce a TypeVar T and change acall_with_retry to async def acall_with_retry(fn: Callable[[], Awaitable[T]], ..., ) -> T so the checker knows fn is awaitable, and change iter_with_retry to def iter_with_retry(fn: Callable[[], Iterable[T]], ..., ) -> Generator[T, None, None] (or Callable[[], Iterator[T]] if fn yields an iterator) so the checker knows fn returns an iterable; update imports (TypeVar, Awaitable, Iterable/Iterator) and adjust the return type annotations accordingly while keeping the existing behavior in the bodies of acall_with_retry and iter_with_retry.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py`:
- Around line 292-328: The wrapper function inside decorator has one extra
nested branch for the "give up" logging; extract that logic into a small helper
function (e.g., _log_give_up or _log_and_raise) referenced from
decorator/wrapper so wrapper no longer contains the nested if that calls
logger_instance.exception and raise; call the helper with parameters
(logger_instance, func.__name__, attempt, prefix) after _get_retry_delay returns
None and before re-raising the exception, and keep existing behavior of logging
only when attempt > 0 and then re-raising the exception.
- Around line 145-175: The signatures for acall_with_retry and iter_with_retry
should be made generic and their fn arguments narrowed: introduce a TypeVar T
and change acall_with_retry to async def acall_with_retry(fn: Callable[[],
Awaitable[T]], ..., ) -> T so the checker knows fn is awaitable, and change
iter_with_retry to def iter_with_retry(fn: Callable[[], Iterable[T]], ..., ) ->
Generator[T, None, None] (or Callable[[], Iterator[T]] if fn yields an iterator)
so the checker knows fn returns an iterable; update imports (TypeVar, Awaitable,
Iterable/Iterator) and adjust the return type annotations accordingly while
keeping the existing behavior in the bodies of acall_with_retry and
iter_with_retry.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9daf0733-561a-496c-a360-99875598852e
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py
…e complexity Sonar still flagged retry_with_exponential_backoff at complexity 16 after the previous flatten. Nested def decorator / def wrapper counted against the outer function's score. Move the retry body to a module-level _invoke_with_retries helper so the decorator factory just delegates, bringing the outer function well under the 15 threshold. Behavior is unchanged — all paths (success, retry, give-up, non-retryable propagate) are preserved and covered by the existing SDK1 tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
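The restructuring this commit describes (the decorator factory delegating to a module-level helper) can be sketched as follows; beyond the names _invoke_with_retries and retry_with_exponential_backoff taken from the commit, the bodies and defaults are illustrative assumptions:

```python
import time
from functools import wraps


def _invoke_with_retries(func, args, kwargs, *, max_retries, retry_predicate,
                         base_delay=0.01):
    """Module-level retry loop; keeps the decorator factory trivially simple."""
    for attempt in range(max_retries + 1):
        try:
            return func(*args, **kwargs)
        except Exception as e:
            if not retry_predicate(e) or attempt >= max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))


def retry_with_exponential_backoff(*, max_retries=3, retry_predicate=lambda e: True):
    """Decorator factory that just delegates, so Sonar scores only the delegation."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            return _invoke_with_retries(
                func, args, kwargs,
                max_retries=max_retries,
                retry_predicate=retry_predicate,
            )
        return wrapper
    return decorator


state = {"n": 0}

@retry_with_exponential_backoff(max_retries=2)
def sometimes_fails():
    state["n"] += 1
    if state["n"] < 2:
        raise TimeoutError("transient")
    return "done"
```

Because cognitive-complexity metrics attribute nested function bodies to the enclosing definition, moving the loop to module level lowers the decorator's score without changing any behavior.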
🧹 Nitpick comments (3)
unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py (3)
87-97: Minor: debug log counter vs. warning counter use different denominators. The debug line logs `attempt+1/max_retries+1` (total attempts including the initial try), while the warning on lines 101-103 logs `attempt+1/max_retries` (retries only). When reading logs for the same failure, these will show different `/N` totals and be confusing. Additionally, on the last attempt (`attempt == max_retries`) the debug line still prints `retryable=True` even though the function returns `None` and the caller re-raises — the log reader sees "retryable=True" for a call that was not retried. Consider either unifying the denominators or appending a "giving up" branch before returning `None`.

Suggested tweak
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py` around lines 87 - 97, The debug and warning logs in the retry decision flow are inconsistent and can mislead: update the logging in the function that calls logger_instance.debug (use symbols attempt, max_retries, should_retry, description) so both messages use the same denominator (either total attempts = max_retries+1 or retries = max_retries), and add a conditional "giving up" log path just before returning None when not should_retry or attempt >= max_retries so the message shows retryable=False (or explicitly states giving up) for the final non-retry case; ensure you adjust the debug message's retryable value to reflect the actual branch taken.
176-198: Docstring: clarify that `fn` must return a fresh iterable on each call. Each retry re-invokes `fn()`, which assumes `fn` is a factory (e.g., `lambda: litellm.completion(stream=True, ...)`). Passing an already-materialized generator would be silently wrong: after the first failure the exhausted generator would either yield nothing or raise `StopIteration`/the same error. Worth a one-liner in the docstring so callers don't pass a bare iterator.

Suggested wording

```diff
-    """Yield from fn() with retry. Only retries before the first yield.
-
-    Once items have been yielded to the caller a mid-iteration failure is
-    raised immediately — partial output can't be un-yielded.
-    """
+    """Yield from fn() with retry. Only retries before the first yield.
+
+    `fn` must be a factory that returns a *fresh* iterable on each call
+    (e.g., ``lambda: litellm.completion(stream=True, ...)``). A bare
+    generator must not be passed, since retrying would re-consume an
+    already-exhausted iterator.
+
+    Once items have been yielded to the caller a mid-iteration failure is
+    raised immediately — partial output can't be un-yielded.
+    """
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py` around lines 176 - 198, Update the function docstring to explicitly state that the parameter fn must be a zero-argument factory that returns a fresh iterable each time it is called (i.e., pass a callable like lambda: generator() rather than an already-materialized iterator), because the code re-invokes fn() on each retry; reference the use of fn() and the retry loop (for attempt in range(max_retries + 1)) and the has_yielded logic so callers understand why a fresh iterable is required.
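The factory requirement above can be shown with a self-contained sketch. The function name, signature, and zero-delay default here are illustrative stand-ins, not the SDK's actual implementation:

```python
import time
from collections.abc import Callable, Generator, Iterable


def iter_with_retry_sketch(
    fn: Callable[[], Iterable[object]],
    *,
    max_retries: int = 3,
    base_delay: float = 0.0,
) -> Generator[object, None, None]:
    """Yield from fn(); retry only if nothing has been yielded yet."""
    for attempt in range(max_retries + 1):
        has_yielded = False
        try:
            # fn must build a FRESH iterable on every attempt.
            for item in fn():
                has_yielded = True
                yield item
            return
        except Exception:
            # Once output reached the caller, partial data can't be un-yielded.
            if has_yielded or attempt >= max_retries:
                raise
            time.sleep(base_delay * (2**attempt))


# Correct usage: pass a zero-arg factory, not an already-built generator.
calls = {"n": 0}


def flaky() -> Iterable[int]:
    calls["n"] += 1
    if calls["n"] < 2:
        raise ConnectionError("transient")
    return iter([1, 2, 3])


print(list(iter_with_retry_sketch(flaky, max_retries=2)))  # [1, 2, 3]
```

Passing `flaky()` (the iterator itself) instead of `flaky` would defeat the retry: the second attempt would re-consume an exhausted iterator and yield nothing.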
145-165: Type hints: `fn` parameters are incorrectly typed as `Callable[[], object]`, but actual usage differs.

- `acall_with_retry` awaits the return value (`await fn()`), so `fn` should be typed as `Callable[[], Awaitable[object]]`
- `iter_with_retry` yields from the return value, so `fn` should be typed as `Callable[[], Iterable[object]]`

Current imports at line 10 only include `Callable` and `Generator`. Add `Awaitable` and `Iterable` to fix type checker compatibility.

Suggested fix

```diff
-from collections.abc import Callable, Generator
+from collections.abc import Awaitable, Callable, Generator, Iterable
@@
 async def acall_with_retry(
-    fn: Callable[[], object],
+    fn: Callable[[], Awaitable[object]],
     *,
@@
 def iter_with_retry(
-    fn: Callable[[], object],
+    fn: Callable[[], Iterable[object]],
     *,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py` around lines 145 - 165, The type hints for the retry utilities are incorrect: update the imports to include Awaitable and Iterable, then change acall_with_retry's fn annotation from Callable[[], object] to Callable[[], Awaitable[object]] and change iter_with_retry's fn annotation from Callable[[], object] to Callable[[], Iterable[object]] so the declared callable return types match actual usage; locate and update the import list (where Callable and Generator are imported) and the fn parameter annotations in acall_with_retry and iter_with_retry accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9c300ebb-e374-45df-a953-22e5ecc58150
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py
muhammad-ali-e
left a comment
Line-by-line review. Not blocking — PR is fundamentally sound and CI is green. Main asks before merge:
- Add unit tests for the new `call_with_retry` / `acall_with_retry` / `iter_with_retry` / `is_retryable_litellm_error` / `_get_retry_delay`. The existing `test_retry_utils.py` (598 lines) has zero coverage for any of the new symbols — suggest at minimum: success path, exhaust-and-raise, non-retryable short-circuit, iter retry before first yield, iter no-retry after yield, `is_retryable_litellm_error` status_code + MRO-walk + builtin-vs-requests disambiguation.
- Close the generator on retry in `iter_with_retry` (see inline).
- Consider honoring `Retry-After` headers on 429/503 for OpenAI/Azure — disabling SDK-level retry loses this (see inline).
Rest are polish. See inline comments.
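The `is_retryable_litellm_error` cases called out above (status-code check, MRO walk) can be sketched as a duck-typed predicate. The status set and the idea of matching exception class names come from the PR description; the helper name, class-name set, and structure are assumptions:

```python
RETRYABLE_STATUS = {408, 429, 500, 502, 503, 504}
RETRYABLE_CLASS_NAMES = {
    "APIConnectionError",
    "RateLimitError",
    "ServiceUnavailableError",
    "Timeout",
}


def is_retryable_error_sketch(exc: BaseException) -> bool:
    """Transient-error check via duck typing — no litellm import needed."""
    # litellm/openai-style errors expose a status_code attribute.
    status = getattr(exc, "status_code", None)
    if isinstance(status, int):
        return status in RETRYABLE_STATUS
    # Builtin network/timeout errors are transient by definition.
    if isinstance(exc, (ConnectionError, TimeoutError)):
        return True
    # MRO walk: match litellm/httpx exception class names without importing them.
    names = {cls.__name__ for cls in type(exc).__mro__}
    return bool(names & RETRYABLE_CLASS_NAMES)


class FakeRateLimitError(Exception):
    status_code = 429  # duck-typed attribute, as provider errors expose


print(is_retryable_error_sketch(FakeRateLimitError()))  # True
print(is_retryable_error_sketch(TimeoutError()))        # True
print(is_retryable_error_sketch(ValueError("bad")))     # False
```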
…e-up log

Address review comments on PR #1886:

- #10 (resource leak): close the generator returned by `fn()` before retrying in `iter_with_retry` — otherwise streaming providers leak an in-flight HTTP socket until GC.
- #12 (behavioral regression): when we zero out SDK/wrapper retries we also lose the OpenAI SDK's native Retry-After handling on 429/503. `_get_retry_delay` now checks `error.response.headers["retry-after"]` and uses that value ahead of exponential backoff. HTTP-date form is not parsed; those fall back to backoff.
- #8 (observability gap): move the "Giving up ... after N attempt(s)" log into `_get_retry_delay` so all four retry helpers (`call_with_retry`, `acall_with_retry`, `iter_with_retry`, decorator) share the same exhaustion signal. Previously only the decorator path logged it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… wrappers

Address review comments on PR #1886:

- #9 (typing): `call_with_retry` / `acall_with_retry` / `iter_with_retry` previously returned `object`, erasing caller type info. Add PEP 695 generics so the return type flows from the wrapped callable: `acall_with_retry` now takes `Callable[[], Awaitable[T]]` and `iter_with_retry` takes `Callable[[], Iterable[T]] -> Generator[T, ...]`.
- #11 / #13 (DRY): `_pop_retry_params` in embedding.py and `_disable_litellm_retry` in llm.py were identical logic. Lift to shared `pop_litellm_retry_kwargs` helper in retry_utils.py and delete both methods.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Test Results

Summary
Runner Tests - Full Report
SDK1 Tests - Full Report
What
- `max_retries` in UI for embedding adapters (OpenAI, Azure, VertexAI, Ollama) and Ollama LLM — previously missing from the adapter form
- `_get_retry_delay()` helper used by all 4 retry paths

Why
- `max_retries` configured in the adapter UI was silently ignored for embedding calls (`embedding_with_retries()` does not exist)
- `max_retries` missing in the UI, so users could not configure retries even if the backend supported them
- `call_with_retry`, `acall_with_retry`, `iter_with_retry`, and `retry_with_exponential_backoff`

How
- `litellm.completion()` / `litellm.embedding()` calls
- `max_retries=0` (SDK-level) + `num_retries=0` (litellm wrapper)
- `is_retryable_litellm_error()` predicate uses duck-typing (no litellm import) to detect transient errors: status codes 408/429/500/502/503/504, `ConnectionError`, `TimeoutError`, and litellm/httpx exception class names
- Streaming (`iter_with_retry`) only retries before the first chunk is yielded — mid-stream failures propagate immediately
- `_get_retry_delay()` shared helper — all retry functions delegate retry decisions to this single function

Can this PR break any existing features? If yes, please list possible items. If no, please explain why.
No breaking changes:
- `max_retries` in saved metadata: Pydantic defaults to `None`, code converts to `0` — no retry, identical to current behavior
- Defaults to `max_retries=3`; users can adjust
- `litellm.completion()` / `litellm.embedding()`

Database Migrations
Env Config
`PLATFORM_SERVICE_MAX_RETRIES`, `PROMPT_SERVICE_MAX_RETRIES` etc. are unaffected.
`max_retries` to `num_retries`
Dependencies Versions
Notes on Testing
Tested locally with docker containers against:
Screenshots
N/A — backend-only changes + JSON schema additions
Checklist
I have read and understood the Contribution Guidelines.