This PR adds multilingual voice input directly to the OpenCode prompt composer.
A new voice button is available next to the file attachment control. When activated, it records microphone audio, transcribes it locally using `onnx-community/whisper-large-v3-turbo` through Transformers/WebGPU when available, and inserts the resulting transcript at the current cursor position in the prompt.
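The cursor-position insertion described above can be sketched as a pure function. This is a minimal illustration, not code from this PR's diff: the name `insertAtCursor` and its return shape are hypothetical, and replacing an active selection (rather than only inserting at a collapsed caret) is an assumption.

```typescript
// Hypothetical helper, not taken from the PR: splice a transcript into the
// prompt text at the caret and report where the caret should land afterwards.
interface InsertResult {
  text: string;   // prompt text after insertion
  cursor: number; // caret offset placed just after the inserted transcript
}

function insertAtCursor(
  text: string,
  selectionStart: number,
  selectionEnd: number,
  transcript: string,
): InsertResult {
  // Tolerate a reversed selection by ordering the two offsets.
  const from = Math.min(selectionStart, selectionEnd);
  const to = Math.max(selectionStart, selectionEnd);
  const before = text.slice(0, from);
  const after = text.slice(to);
  // Assumption: any selected range is replaced by the transcript.
  return { text: before + transcript + after, cursor: from + transcript.length };
}
```

With a collapsed caret (`selectionStart === selectionEnd`) this is a plain insertion, and the returned `cursor` lets the composer restore the caret after updating its value.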
The transcription workflow is designed to remain client-side, avoiding the need to send spoken prompts to a remote speech recognition service. The Whisper model is loaded lazily on first use to reduce startup overhead. If the local Whisper runtime is unavailable, the feature falls back to the browser's native speech recognition API.
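The lazy loading and fallback behaviour can be sketched roughly as follows. All names here (`Capabilities`, `pickBackend`, `lazy`) are hypothetical and not taken from the PR, and the rule that WebGPU availability alone decides whether the local Whisper path is used is a simplifying assumption.

```typescript
// Hypothetical sketch, not the PR's implementation.
type Backend = "whisper-local" | "browser-speech" | "none";

interface Capabilities {
  hasWebGPU: boolean;            // assumption: e.g. derived from `"gpu" in navigator`
  hasSpeechRecognition: boolean; // assumption: a native speech recognition constructor exists
}

// Prefer local Whisper, fall back to the browser's native recognizer.
function pickBackend(caps: Capabilities): Backend {
  if (caps.hasWebGPU) return "whisper-local";
  if (caps.hasSpeechRecognition) return "browser-speech";
  return "none";
}

// Memoize an async loader so the expensive work (such as fetching the model)
// starts on first use and is shared by every later caller.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (!cached) {
      cached = load().catch((err) => {
        cached = undefined; // forget a failed load so a later call can retry
        throw err;
      });
    }
    return cached;
  };
}
```

Nothing is loaded at startup; the first activation of the voice button would call the memoized loader, and repeated activations reuse the same promise.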
Additional changes include:
* Multilingual speech recognition with automatic language detection through Whisper.
* Browser speech recognition fallback using the current OpenCode locale.
* Localized labels, status messages, and error states for the voice input workflow.
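Unlike Whisper's automatic language detection, the browser fallback has to be told which language to expect. A minimal sketch of passing the locale through, assuming the app locale may use underscores (for example `pt_BR`) while the recognizer's `lang` property expects a BCP 47 tag; `RecognitionLike`, `toSpeechLang`, and `configureFallback` are hypothetical names, not from the PR.

```typescript
// Structural stand-in for the subset of the browser recognizer used here.
interface RecognitionLike {
  lang: string;
  interimResults: boolean;
}

// Hypothetical helper: turn "pt_BR" into "pt-BR"; tags already using
// hyphens pass through unchanged.
function toSpeechLang(locale: string): string {
  return locale.replace(/_/g, "-");
}

// Apply the current locale to a recognizer-like object and return it.
function configureFallback<T extends RecognitionLike>(rec: T, locale: string): T {
  rec.lang = toSpeechLang(locale);
  rec.interimResults = false; // assumption: only final transcripts are inserted
  return rec;
}
```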
How did you verify your code works?
Executed:
bun test --preload ./happydom.ts ./src/i18n/parity.test.ts
Notes:
`bun typecheck` from `packages/app` is currently blocked in this Windows worktree because `src/custom-elements.d.ts` is a text pointer to `../../ui/src/custom-elements.d.ts`.
The repository-wide pre-push typecheck is also blocked by unrelated local workspace issues (`packages/mission-control` exists locally but is not present in the lockfile) as well as the same `custom-elements.d.ts` pointer issue in `packages/enterprise`.
Screenshots / recordings
UI change — screenshots or recordings can be added here.
Checklist
I have tested my changes locally
I have not included unrelated changes in this PR
If you do not follow this template your PR will be automatically rejected.
Related: Voice/microphone input feature with Whisper support, though focused on TUI rather than app
These PRs address voice input and Whisper transcription functionality. PR #29663 and #11345 appear most closely related as they both target voice input with Whisper, though the scope and components may differ.
* PR description is missing required template sections. Please use the [PR template](../blob/dev/.github/pull_request_template.md).
Please edit this PR description to address the above within 2 hours, or it will be automatically closed.
If you believe this was flagged incorrectly, please let a maintainer know.