Skip to content

ASR actor refactor to GPU/CPU#1795

Closed
jperez999 wants to merge 6 commits intoNVIDIA:mainfrom
jperez999:archetype
Closed

ASR actor refactor to GPU/CPU#1795
jperez999 wants to merge 6 commits intoNVIDIA:mainfrom
jperez999:archetype

Conversation

@jperez999
Copy link
Copy Markdown
Collaborator

Description

Fix ASR actor to have both CPU(url) and GPU(model) run capability with archetype

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • If adjusting docker-compose.yaml environment variables have you ensured those are mimicked in the Helm values.yaml file.

@jperez999 jperez999 self-assigned this Apr 3, 2026
@jperez999 jperez999 requested review from a team as code owners April 3, 2026 16:50
@jperez999 jperez999 requested a review from ChrisJar April 3, 2026 16:50
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 15, 2026

Greptile Summary

This PR refactors ASRActor into a proper ArchetypeOperator pattern: a shared _ASRBaseActor base holds all transcription logic, ASRCPUActor handles remote-endpoint (URL) and CPU-local paths, ASRGPUActor handles GPU-local inference, and the public ASRActor delegates to the correct variant at runtime based on hardware and params. The __init__.py and tests are updated accordingly.

Confidence Score: 5/5

Safe to merge; all findings are P2 style/quality suggestions with no blocking correctness issues.

The archetype resolution logic is sound, tests cover remote/local/GPU paths and hardware-based dispatch, and the public API is cleanly exported. The only findings are unreachable dead code from an incomplete cleanup and a potential silent zip truncation that depends on model behavior.

nemo_retriever/src/nemo_retriever/audio/asr_actor.py — unreachable _transcribe_local path and silent zip truncation in _call_local_batch.

Important Files Changed

Filename Overview
nemo_retriever/src/nemo_retriever/audio/asr_actor.py Refactors ASRActor into _ASRBaseActor + ASRCPUActor/ASRGPUActor + ArchetypeOperator; _transcribe_local and the local-model path in _transcribe_one are unreachable dead code left over from the refactoring.
nemo_retriever/src/nemo_retriever/audio/init.py Adds ASRCPUActor and ASRGPUActor to the public all export; straightforward and correct.
nemo_retriever/tests/test_asr_actor.py New and updated tests cover remote/local/GPU actor paths, segment_audio, and archetype resolution; good coverage of the happy paths.
nemo_retriever/tests/test_operator_flags_and_cpu_actors.py Adds ASRCPUActor/ASRGPUActor/ASRActor to the GPU/CPU flag and archetype classification tests; all assertions are accurate.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["ASRActor(params)\n(ArchetypeOperator)"] --> B{"prefers_cpu_variant?\n(_use_remote(params))"}
    B -- "Yes (URL set)" --> C["ASRCPUActor\n→ _get_client() → remote gRPC"]
    B -- "No (local model)" --> D{"GPU available?"}
    D -- "Yes" --> E["ASRGPUActor\n→ _create_local_model()"]
    D -- "No" --> F["ASRCPUActor\n→ _create_local_model()"]
    C --> G["_call_remote_batch()\n→ _transcribe_one() per row\n→ _transcribe_remote()"]
    E --> H["_call_local_batch()\nbatched transcribe()"]
    F --> H
    G --> I["_build_output_rows()"]
    H --> I
Loading

Comments Outside Diff (3)

  1. nemo_retriever/src/nemo_retriever/audio/asr_actor.py, line 372-376 (link)

    P2 _transcribe_local is unreachable dead code

    _transcribe_one is only ever called from _call_remote_batch (line 177), which itself is only called when self._client is not None (line 166). Therefore the else branch here — and the entire _transcribe_local method — can never be reached in normal execution. This appears to be an incomplete refactoring: local-model transcription is now handled exclusively in _call_local_batch, but the per-row fallback in _transcribe_one was not removed.

    Consider either removing _transcribe_local and the else branch, or restructuring _call_remote_batch to reuse _call_local_batch for rows where the remote call fails.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: nemo_retriever/src/nemo_retriever/audio/asr_actor.py
    Line: 372-376
    
    Comment:
    **`_transcribe_local` is unreachable dead code**
    
    `_transcribe_one` is only ever called from `_call_remote_batch` (line 177), which itself is only called when `self._client is not None` (line 166). Therefore the `else` branch here — and the entire `_transcribe_local` method — can never be reached in normal execution. This appears to be an incomplete refactoring: local-model transcription is now handled exclusively in `_call_local_batch`, but the per-row fallback in `_transcribe_one` was not removed.
    
    Consider either removing `_transcribe_local` and the `else` branch, or restructuring `_call_remote_batch` to reuse `_call_local_batch` for rows where the remote call fails.
    
    How can I resolve this? If you propose a fix, please make it concise.
  2. nemo_retriever/src/nemo_retriever/audio/asr_actor.py, line 238-247 (link)

    P2 Silent row truncation if model returns fewer transcripts than paths

    zip(rows_list, transcripts) silently drops tail rows when transcripts is shorter than rows_list. This can happen if self._model.transcribe skips or omits results for empty-string paths ("") that are appended to paths_for_model when path resolution fails (e.g. both path and bytes are missing). Adding a length check with a warning would surface this early.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: nemo_retriever/src/nemo_retriever/audio/asr_actor.py
    Line: 238-247
    
    Comment:
    **Silent row truncation if model returns fewer transcripts than paths**
    
    `zip(rows_list, transcripts)` silently drops tail rows when `transcripts` is shorter than `rows_list`. This can happen if `self._model.transcribe` skips or omits results for empty-string paths (`""`) that are appended to `paths_for_model` when path resolution fails (e.g. both `path` and `bytes` are missing). Adding a length check with a warning would surface this early.
    
    How can I resolve this? If you propose a fix, please make it concise.
  3. nemo_retriever/src/nemo_retriever/audio/asr_actor.py, line 420-433 (link)

    P2 apply_asr_to_df kwargs are accepted but ignored

    The **kwargs: Any parameter is declared but never forwarded to actor() or ASRParams. Either pass it through or remove the parameter to avoid silently swallowing caller-supplied options.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: nemo_retriever/src/nemo_retriever/audio/asr_actor.py
    Line: 420-433
    
    Comment:
    **`apply_asr_to_df` kwargs are accepted but ignored**
    
    The `**kwargs: Any` parameter is declared but never forwarded to `actor()` or `ASRParams`. Either pass it through or remove the parameter to avoid silently swallowing caller-supplied options.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/audio/asr_actor.py
Line: 372-376

Comment:
**`_transcribe_local` is unreachable dead code**

`_transcribe_one` is only ever called from `_call_remote_batch` (line 177), which itself is only called when `self._client is not None` (line 166). Therefore the `else` branch here — and the entire `_transcribe_local` method — can never be reached in normal execution. This appears to be an incomplete refactoring: local-model transcription is now handled exclusively in `_call_local_batch`, but the per-row fallback in `_transcribe_one` was not removed.

Consider either removing `_transcribe_local` and the `else` branch, or restructuring `_call_remote_batch` to reuse `_call_local_batch` for rows where the remote call fails.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/audio/asr_actor.py
Line: 238-247

Comment:
**Silent row truncation if model returns fewer transcripts than paths**

`zip(rows_list, transcripts)` silently drops tail rows when `transcripts` is shorter than `rows_list`. This can happen if `self._model.transcribe` skips or omits results for empty-string paths (`""`) that are appended to `paths_for_model` when path resolution fails (e.g. both `path` and `bytes` are missing). Adding a length check with a warning would surface this early.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/audio/asr_actor.py
Line: 420-433

Comment:
**`apply_asr_to_df` kwargs are accepted but ignored**

The `**kwargs: Any` parameter is declared but never forwarded to `actor()` or `ASRParams`. Either pass it through or remove the parameter to avoid silently swallowing caller-supplied options.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "Merge branch 'main' into archetype" | Re-trigger Greptile

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 15, 2026

Greptile Summary

This PR refactors ASRActor into an archetype that resolves at runtime to ASRCPUActor (remote gRPC endpoint, or local-model CPU fallback) or ASRGPUActor (local Parakeet model on GPU), following the existing ArchetypeOperator pattern used throughout the codebase.

  • P1 — _call_local_batch lacks per-row error isolation and silent row drop: Unlike _call_remote_batch which wraps each row in try/except, a model failure in the local path crashes the entire batch; empty-string sentinel paths appended for rows missing both path and bytes can trigger this. The bare zip(rows_list, transcripts) on line 246 also silently drops rows if the model returns fewer transcripts than inputs.
  • P2 — Dead code: _transcribe_local and the else branch of _transcribe_one are unreachable — _transcribe_one is only called from _call_remote_batch, which already requires self._client is not None.

Confidence Score: 4/5

Safe to merge after addressing the local-batch error-handling gap; the archetype resolution logic and GPU/CPU split are otherwise sound.

One P1 issue: _call_local_batch has no per-row error isolation (unlike the remote path) and passes empty-string sentinel paths to the model, which can crash an entire batch; the zip truncation can also silently drop rows. The rest of the design is correct and well-tested.

nemo_retriever/src/nemo_retriever/audio/asr_actor.py — specifically _call_local_batch (lines 188–253) and _transcribe_local/else-branch in _transcribe_one (lines 270–376)

Important Files Changed

Filename Overview
nemo_retriever/src/nemo_retriever/audio/asr_actor.py Introduces ASRCPUActor (remote gRPC or CPU-local fallback), ASRGPUActor (GPU-local only), and ASRActor archetype. Two issues: (1) _call_local_batch lacks per-row error recovery unlike _call_remote_batch, and empty-path strings can crash the whole batch; (2) _transcribe_local and the else branch in _transcribe_one are dead code.
nemo_retriever/src/nemo_retriever/audio/init.py Adds ASRCPUActor and ASRGPUActor to public exports; straightforward change.
nemo_retriever/tests/test_asr_actor.py Good coverage for CPU/GPU actor split, archetype resolution, remote vs. local model path, empty batch, segmentation, and rejection of remote config on GPU actor.
nemo_retriever/tests/test_operator_flags_and_cpu_actors.py Adds ASRCPUActor/ASRGPUActor to the flags-and-archetype test suite; clean additions consistent with existing patterns.

Comments Outside Diff (2)

  1. nemo_retriever/src/nemo_retriever/audio/asr_actor.py, line 238-246 (link)

    P1 Missing per-row error isolation in local batch path

    The remote batch path (_call_remote_batch) wraps each row in a try/except so one bad file only skips that row. The local batch path has no equivalent guard: if self._model.transcribe(paths_for_model) raises on any input (including the "" strings appended for rows where both path and raw were unavailable — line 235), the entire batch crashes. Additionally, when the model returns fewer transcripts than input paths (e.g., it silently skips empty-path entries), the bare zip on line 246 silently truncates output rows, causing undetected data loss.

    Consider filtering empty paths before calling the model, and using itertools.zip_longest (with a fallback of "") to ensure every input row produces an output row.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: nemo_retriever/src/nemo_retriever/audio/asr_actor.py
    Line: 238-246
    
    Comment:
    **Missing per-row error isolation in local batch path**
    
    The remote batch path (`_call_remote_batch`) wraps each row in a `try/except` so one bad file only skips that row. The local batch path has no equivalent guard: if `self._model.transcribe(paths_for_model)` raises on *any* input (including the `""` strings appended for rows where both `path` and `raw` were unavailable — line 235), the entire batch crashes. Additionally, when the model returns fewer transcripts than input paths (e.g., it silently skips empty-path entries), the bare `zip` on line 246 silently truncates output rows, causing undetected data loss.
    
    Consider filtering empty paths before calling the model, and using `itertools.zip_longest` (with a fallback of `""`) to ensure every input row produces an output row.
    
    How can I resolve this? If you propose a fix, please make it concise.
  2. nemo_retriever/src/nemo_retriever/audio/asr_actor.py, line 270-287 (link)

    P2 _transcribe_local is unreachable dead code

    _transcribe_local is only invoked from the else branch of _transcribe_one (lines 372–376), but _transcribe_one is exclusively called from _call_remote_batch, which is itself guarded by if self._client is not None in process. This means the else branch in _transcribe_one — and therefore _transcribe_local entirely — can never execute. The local single-row transcription logic is also re-implemented (with differences) inside _call_local_batch, adding to the confusion.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: nemo_retriever/src/nemo_retriever/audio/asr_actor.py
    Line: 270-287
    
    Comment:
    **`_transcribe_local` is unreachable dead code**
    
    `_transcribe_local` is only invoked from the `else` branch of `_transcribe_one` (lines 372–376), but `_transcribe_one` is exclusively called from `_call_remote_batch`, which is itself guarded by `if self._client is not None` in `process`. This means the `else` branch in `_transcribe_one` — and therefore `_transcribe_local` entirely — can never execute. The local single-row transcription logic is also re-implemented (with differences) inside `_call_local_batch`, adding to the confusion.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/audio/asr_actor.py
Line: 238-246

Comment:
**Missing per-row error isolation in local batch path**

The remote batch path (`_call_remote_batch`) wraps each row in a `try/except` so one bad file only skips that row. The local batch path has no equivalent guard: if `self._model.transcribe(paths_for_model)` raises on *any* input (including the `""` strings appended for rows where both `path` and `raw` were unavailable — line 235), the entire batch crashes. Additionally, when the model returns fewer transcripts than input paths (e.g., it silently skips empty-path entries), the bare `zip` on line 246 silently truncates output rows, causing undetected data loss.

Consider filtering empty paths before calling the model, and using `itertools.zip_longest` (with a fallback of `""`) to ensure every input row produces an output row.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/audio/asr_actor.py
Line: 270-287

Comment:
**`_transcribe_local` is unreachable dead code**

`_transcribe_local` is only invoked from the `else` branch of `_transcribe_one` (lines 372–376), but `_transcribe_one` is exclusively called from `_call_remote_batch`, which is itself guarded by `if self._client is not None` in `process`. This means the `else` branch in `_transcribe_one` — and therefore `_transcribe_local` entirely — can never execute. The local single-row transcription logic is also re-implemented (with differences) inside `_call_local_batch`, adding to the confusion.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (2): Last reviewed commit: "Merge branch 'main' into archetype" | Re-trigger Greptile

@jperez999 jperez999 closed this Apr 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants