improvement(chat-voice): modernize ElevenLabs TTS to Flash v2.5#4943

Merged
waleedlatif1 merged 2 commits into staging from fix/elevenlabs-voice-modernize
Jun 10, 2026

Conversation

@waleedlatif1 waleedlatif1 commented Jun 10, 2026

Collaborator

Summary

  • Switch deployed-chat default TTS model from eleven_turbo_v2_5 to eleven_flash_v2_5 — ElevenLabs now recommends Flash over Turbo in all cases (functionally equivalent, ~75ms latency, built for real-time/Agents)
  • Replace the legacy default voice Sarah (EXAVITQu4vr4xnSDxMaL) with Jessica (cgSgspJ2msm6clMCkdW9) — Sarah has no high-quality eleven_flash_v2_5 base; Jessica is a current premade conversational voice optimized for Flash v2.5. Verified against the live account.
  • Drop the deprecated optimize_streaming_latency knob and the legacy use_pvc_as_ivc / enable_ssml_parsing flags from the proxy request
  • Move output_format to the query string (where ElevenLabs reads it) and raise it from mp3_22050_32 (32 kbps) to mp3_44100_128 for noticeably better audio quality
  • Switch apply_text_normalization from off to auto so numbers/dates are pronounced correctly (the previous level-4 optimize_streaming_latency mode had silently disabled the text normalizer)

Scope is limited to the deployed-chat voice path (/api/proxy/tts/stream, its contract, the audio-streaming hook, and the chat default voice). STT (scribe_v2_realtime, single-use token flow) was already current and is untouched.
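
As a rough illustration of the request shape described in the summary, the sketch below builds the upstream URL and JSON body. The helper name buildTtsRequest and the TtsUpstreamRequest type are hypothetical and are not taken from the PR. The voice_settings object that the proxy also sends is omitted because its values are not shown here.

```typescript
// Hypothetical helper; names are illustrative and not the PR's actual code.
const ELEVENLABS_BASE = "https://api.elevenlabs.io";

interface TtsUpstreamRequest {
  url: string;
  body: {
    text: string;
    model_id: string;
    apply_text_normalization: "auto" | "on" | "off";
  };
}

function buildTtsRequest(
  text: string,
  voiceId: string,
  modelId: string = "eleven_flash_v2_5",
): TtsUpstreamRequest {
  const url = new URL(
    `/v1/text-to-speech/${encodeURIComponent(voiceId)}/stream`,
    ELEVENLABS_BASE,
  );
  // output_format travels in the query string, not in the JSON body.
  url.searchParams.set("output_format", "mp3_44100_128");

  return {
    url: url.toString(),
    // No optimize_streaming_latency, use_pvc_as_ivc, or enable_ssml_parsing.
    body: { text, model_id: modelId, apply_text_normalization: "auto" },
  };
}
```

Calling buildTtsRequest("Meet me at 3pm", "cgSgspJ2msm6clMCkdW9") would yield a URL ending in /stream?output_format=mp3_44100_128 and a body that carries only the fields the summary says are still sent.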

Type of Change

  • Improvement

Testing

Smoke-tested the exact new request (Jessica + eleven_flash_v2_5 + output_format=mp3_44100_128 + apply_text_normalization=auto) against the live ElevenLabs API → HTTP 200, valid 128 kbps / 44.1 kHz MP3, numbers/dates normalized correctly. bun run check:api-validation and tsc --noEmit pass clean.
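
One way to check the "128 kbps / 44.1 kHz" claim programmatically is to read the first MPEG audio frame header from the response bytes. The sketch below is an assumption about how such a check could look, not the method used in the PR. readFirstMp3Frame is a hypothetical name, it handles Layer III only, and it scans naively for the sync word, so a leading ID3 tag containing sync-like bytes could mislead it.

```typescript
interface Mp3FrameInfo {
  bitrateKbps: number;
  sampleRateHz: number;
}

// Layer III bitrate tables in kbps; index 0 means "free format", index 15 is invalid.
const BITRATES_MPEG1_L3 = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320];
const BITRATES_MPEG2_L3 = [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160];

// Sample rates keyed by the 2-bit version field of the frame header.
const SAMPLE_RATES: Record<number, number[]> = {
  3: [44100, 48000, 32000], // MPEG-1
  2: [22050, 24000, 16000], // MPEG-2
  0: [11025, 12000, 8000], // MPEG-2.5
};

function readFirstMp3Frame(bytes: Uint8Array): Mp3FrameInfo | null {
  for (let i = 0; i + 2 < bytes.length; i++) {
    // Frame sync: 11 set bits (0xFF followed by the top 3 bits of the next byte).
    if (bytes[i] !== 0xff || (bytes[i + 1] & 0xe0) !== 0xe0) continue;

    const version = (bytes[i + 1] >> 3) & 0x03; // 3 = MPEG-1, 2 = MPEG-2, 0 = MPEG-2.5
    const layer = (bytes[i + 1] >> 1) & 0x03; // 1 = Layer III
    const bitrateIdx = bytes[i + 2] >> 4;
    const rateIdx = (bytes[i + 2] >> 2) & 0x03;

    // Skip reserved values and anything that is not a plain Layer III frame.
    if (version === 1 || layer !== 1) continue;
    if (bitrateIdx === 0 || bitrateIdx === 15 || rateIdx === 3) continue;

    const table = version === 3 ? BITRATES_MPEG1_L3 : BITRATES_MPEG2_L3;
    return { bitrateKbps: table[bitrateIdx], sampleRateHz: SAMPLE_RATES[version][rateIdx] };
  }
  return null;
}
```

For a constant-bitrate mp3_44100_128 stream the first frame header would be expected to decode to 128 kbps at 44100 Hz, while the old mp3_22050_32 format would decode to 32 kbps at 22050 Hz.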

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

- Switch default TTS model from eleven_turbo_v2_5 to eleven_flash_v2_5 (ElevenLabs recommends Flash over Turbo in all cases; ~75ms latency)
- Drop deprecated optimize_streaming_latency knob plus legacy use_pvc_as_ivc / enable_ssml_parsing flags
- Move output_format to the query string and raise it from mp3_22050_32 to mp3_44100_128 for higher audio quality
- Switch apply_text_normalization from off to auto for correct number/date pronunciation
vercel Bot commented Jun 10, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
  • Project: docs
  • Deployment: Skipped
  • Actions: Skipped
  • Updated (UTC): Jun 10, 2026 4:00pm


greptile-apps Bot commented Jun 10, 2026

Contributor

Greptile Summary

This PR modernizes the deployed-chat TTS path by switching the default model from eleven_turbo_v2_5 to eleven_flash_v2_5, upgrading audio quality from mp3_22050_32 to mp3_44100_128, and cleaning up deprecated ElevenLabs request fields.

  • Model + quality upgrade: Default modelId updated in both the Zod contract and the hook default; output_format moved to the query string at 128 kbps / 44 100 Hz (was 32 kbps / 22 050 Hz body param).
  • Deprecated flag removal: optimize_streaming_latency, use_pvc_as_ivc, and enable_ssml_parsing dropped from the request body; apply_text_normalization changed from 'off' to 'auto' so numbers and dates are spoken correctly.

Confidence Score: 5/5

Safe to merge — changes are additive quality improvements with no logic regressions on the audio proxy path.

The diff is small and targeted: a model name swap, a bitrate bump, and removal of deprecated fields that ElevenLabs no longer honours. The contract default and the hook default are kept in sync, output_format placement matches the ElevenLabs streaming endpoint spec, and no auth or data-flow logic is touched. No new error paths are introduced.

No files require special attention.

Important Files Changed

Filename: Overview
  • apps/sim/app/api/proxy/tts/stream/route.ts: Moves output_format to the query string and upgrades it to mp3_44100_128, drops the deprecated optimize_streaming_latency/enable_ssml_parsing/use_pvc_as_ivc flags, and switches apply_text_normalization to auto; all changes are consistent with ElevenLabs' current API.
  • apps/sim/app/chat/hooks/use-audio-streaming.ts: Single-line default model change from eleven_turbo_v2_5 to eleven_flash_v2_5; hook logic and streaming/buffering behaviour are otherwise unchanged.
  • apps/sim/lib/api/contracts/media/tts-stream.ts: Contract default for modelId updated from eleven_turbo_v2_5 to eleven_flash_v2_5, keeping it consistent with the hook default.
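
The review notes that the contract default and the hook default for modelId are kept in sync. One way to make that hard to break is a single shared constant. The sketch below is a plain TypeScript stand-in for that idea; the real contract is described as a Zod schema, and DEFAULT_TTS_MODEL_ID, TtsStreamInput, and withTtsDefaults are hypothetical names that do not come from the PR.

```typescript
// Hypothetical shared constant that both the contract and the hook could import.
const DEFAULT_TTS_MODEL_ID = "eleven_flash_v2_5";

// Field names follow the request shown in the sequence diagram.
interface TtsStreamInput {
  text: string;
  voiceId: string;
  chatId: string;
  modelId?: string;
}

type ResolvedTtsStreamInput = TtsStreamInput & { modelId: string };

// Fills in the model when the caller did not choose one.
function withTtsDefaults(input: TtsStreamInput): ResolvedTtsStreamInput {
  return { ...input, modelId: input.modelId ?? DEFAULT_TTS_MODEL_ID };
}
```

With this shape, a later model change would touch one line instead of two files, at the cost of an extra shared import.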

Sequence Diagram

sequenceDiagram
    participant Client as Browser (useAudioStreaming)
    participant Proxy as /api/proxy/tts/stream
    participant EL as ElevenLabs API

    Client->>Proxy: "POST {text, voiceId, modelId="eleven_flash_v2_5", chatId}"
    Proxy->>Proxy: validateChatAuth(chatId)
    Proxy->>EL: "POST /v1/text-to-speech/{voiceId}/stream?output_format=mp3_44100_128"
    Note over Proxy,EL: body: {model_id, voice_settings, apply_text_normalization="auto"}
    EL-->>Proxy: audio/mpeg stream (128 kbps, 44100 Hz)
    Proxy-->>Client: streamed audio/mpeg via TransformStream
    Client->>Client: arrayBuffer() → decodeAudioData → play

Reviews (1): Last reviewed commit: "improvement(chat-voice): modernize Eleve..."

Replace the legacy Sarah default (EXAVITQu4vr4xnSDxMaL), which has no high-quality
eleven_flash_v2_5 base, with Jessica (cgSgspJ2msm6clMCkdW9) — a current premade
conversational voice verified against the live account and optimized for Flash v2.5.
@waleedlatif1 waleedlatif1 merged commit 540e608 into staging Jun 10, 2026
9 checks passed
@waleedlatif1 waleedlatif1 deleted the fix/elevenlabs-voice-modernize branch June 10, 2026 16:01