v1.1.0

@TParcollet

This major release extends SpeechBrain's support for SpeechLLMs and introduces several new features, recipes, and improvements.

Highlights

Feature Caching — Save extracted features (e.g. wav2vec embeddings) to disk and load them on the fly, skipping recomputation. This powers our first ASR SpeechLLM recipe on LibriSpeech, enabling LLM-based training with pre-computed embeddings.
New Recipes — SpeechLLM for ASR and translation, streaming SSL, FocalCodec, and SENSE models.
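
To make the feature-caching idea above concrete, here is a minimal sketch of the general pattern: compute features once, write them to disk, and read them back on later requests. It is not the SpeechBrain API. The `cached_features` function, the `fake_extractor` stand-in, and the pickle-based on-disk layout are all hypothetical names and choices made for illustration only.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def cached_features(utt_id, extract_fn, cache_dir):
    """Return features for utt_id, running extract_fn only on a cache miss.

    Hypothetical helper for illustration; not part of SpeechBrain.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the ID so arbitrary utterance names map to safe filenames.
    key = hashlib.sha1(utt_id.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.pkl"
    if path.exists():
        # Cache hit: load from disk and skip recomputation.
        with path.open("rb") as f:
            return pickle.load(f)
    # Cache miss: compute once, then persist for later calls.
    feats = extract_fn(utt_id)
    with path.open("wb") as f:
        pickle.dump(feats, f)
    return feats


calls = []


def fake_extractor(utt_id):
    # Stand-in for an expensive encoder (e.g. a wav2vec forward pass).
    calls.append(utt_id)
    return [float(len(utt_id)), 0.5]


cache_dir = tempfile.mkdtemp()
first = cached_features("utt-0001", fake_extractor, cache_dir)
second = cached_features("utt-0001", fake_extractor, cache_dir)
print(first == second, len(calls))
```

The second call returns the stored features without invoking the extractor again, which is the saving a recipe relies on when training with pre-computed embeddings.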

This release also includes internal improvements and bug fixes. The main changes are listed below (some minor bug fixes are omitted):

What's Changed

Reforging LLaMA lobe (code from Samsung AI Center-Cambridge) by @TParcollet in #2850
Streaming recipe for BestRQ by @Chaanks in #2790
Librilight data preparation for SpeechBrain SSL (code from Samsung AI Center Cambridge) by @shucongzhang in #2765
Alignment with CTC ASR models powered by k2. by @ZhaoZeyu1995 in #2772
SpeechLLM (with LLaMA) and Conformer recipe for speech translation on CoVoST (Code from Samsung AI Center Cambridge) by @TParcollet in #2865
Feature caching proposal: CachedDynamicItem by @pplantinga in #2985
Adapting transducer greedy decoding by @younessdkhissi in #2975
Replace torchaudio I/O with soundfile-based audio_io wrapper by @Copilot in #2989
FocalCodec [NeurIPS 2025] by @lucadellalib in #3000
Caching: add compression + filename + closing/loading by @Adel-Moumen in #3005
Implement per-key padding configuration in PaddedBatch by @Adel-Moumen in #3008
SpeechLLM LibriSpeech recipe by @Adel-Moumen in #2885
Remove CTC CUDA + Move transducer loss in integrations by @Adel-Moumen in #3028
Adding SENSE models by @MaryemBouziane in #2998

New Contributors

@emmanuel-ferdman made their first contribution in #2900
@ofiryaish made their first contribution in #2871
@omidiu made their first contribution in #2923
@svecjan made their first contribution in #2934
@nouranali made their first contribution in #2855
@OscarFree made their first contribution in #2988
@younessdkhissi made their first contribution in #2975
@jordanozang made their first contribution in #2982
@jrochdi made their first contribution in #2947
@seohyunjun made their first contribution in #2996
@Daheer made their first contribution in #3022
@raotnameh made their first contribution in #2617
@Mr-Neutr0n made their first contribution in #3029
@georgesabr made their first contribution in #3038
@MaryemBouziane made their first contribution in #2998

Full Changelog: v1.0.3...v1.1.0

What's Changed

Fix broken links due to change in voxlingua107 location by @pplantinga in #2877
Reforging LLaMA lobe (code from Samsung AI Center-Cambridge) by @TParcollet in #2850
Attempt to fix self.device for AMP by @Adel-Moumen in #2882
fix convolutions docstrings by @gfdb in #2883
Attempt fix for input_norm test by @pplantinga in #2887
Fix dropout at inference by @Adel-Moumen in #2889
Avoid running tests on draft PRs by @pplantinga in #2890
Fix input normalization and global normalization variance calculation by @pplantinga in #2835
Bump version of pyroomacoustics for recipes/DNS/enhancement by @rogiervd in #2899
Fix argument to rec_loss in (Variational)AutoencoderLoss (from Samsung AI Center Cambridge) by @rogiervd in #2902
Ensure that x is not unbound in WordEmbeddingEncoder (from Samsung, AI Center, Cambridge) by @rogiervd in #2905
Fix bug in ModuleList.insert (from Samsung, AI Center, Cambridge) by @rogiervd in #2903
Fix bug constructing ValueError in Conv2d (from Samsung, AI Center, Cambridge) by @rogiervd in #2904
Make types correct (from Samsung, AI Center, Cambridge) by @rogiervd in #2912
Use correct signature for torch.tensor.type() (from Samsung, AI Center, Cambridge) by @rogiervd in #2910
Update handling of dtype in read_pkl to torch.Tensor (from Samsung, AI Center, Cambridge) by @rogiervd in #2907
Make arpa_to_fst work when arguments are str (from Samsung, AI Center, Cambridge) by @rogiervd in #2919
Make f_max integer (from Samsung, AI Center, Cambridge) by @rogiervd in #2918
Account for all combinations of arguments in IterativeCSVWriter.write (from Samsung, AI Center, Cambridge) by @rogiervd in #2916
Migrate to modern Python Logger API by @emmanuel-ferdman in #2900
Do not forget to raise ValueError (from Samsung, AI Center, Cambridge) by @rogiervd in #2909
Replace axis= by dim= for calls to Torch (from Samsung, AI Center, Cambridge) by @rogiervd in #2913
Remove unused value (from Samsung, AI Center, Cambridge) by @rogiervd in #2914
Make Scores.repr actually return something by @rogiervd in #2897
Remove deprecated abstractproperty (from Samsung, AI Center, Cambridge) by @rogiervd in #2915
Use "cls" instead of "self" in classmethod by @rogiervd in #2908
Change location from which to import tqdm by @rogiervd in #2921
Fix type annotations (from Samsung, AI Center, Cambridge) by @rogiervd in #2911
Add test for various length mask functions (code from Samsung, AI Center, Cambridge) by @rogiervd in #2894
Change test for make_padding_mask to expose mistake by @rogiervd in #2895
fix bug in reverberate with rescale_amp by @ofiryaish in #2871
docs: Fix contributing.md by @omidiu in #2923
Renaming largescale ASR to the Loquacious Set by @TParcollet in #2928
Move files with optional dependencies to integrations folder by @pplantinga in #2782
Streaming recipe for BestRQ by @Chaanks in #2790
Librilight data preparation for SpeechBrain SSL (code from Samsung AI Center Cambridge) by @shucongzhang in #2765
Re-ran tutorial to remove errors by @pplantinga in #2925
Create FetchConfig for standardizing use of fetch by @pplantinga in #2828
Fix issue 'GlobalNorm with DDP' by @svecjan in #2934
Bump torch from 2.6.0 to 2.7.1 in /docs by @dependabot[bot] in #2939
Solves #2846 Refactor run_opts into a @DataClass by @nouranali in #2855
Make run_on_main and main_process_only return the result to all proce… by @rogiervd in #2943
Saveable Generator: initial import by @flexthink in #2937
Alignment with CTC ASR models powered by k2. by @ZhaoZeyu1995 in #2772
Fix forced alignment tutorial link to google colab by @pplantinga in #2951
Fix missing defaults in run options by @pplantinga in #2952
Add recipe installation instructions to installation.md by @pplantinga in #2948
SpeechLLM (with LLaMA) and Conformer recipe for speech translation on CoVoST (Code from Samsung AI Center Cambridge) by @TParcollet in #2865
Use epoch loop loader for transfer to correctly handle end_of_epoch by @pplantinga in #2965
fix unnecessary pos_emb for RoPE by @shucongzhang in #2963
Adopt pyproject and move to Ruff by @TParcollet in #2946
Update version constraints for torch and torchaudio by @Adel-Moumen in #2978
Bump torch from 2.7.1 to 2.8.0 in /docs by @dependabot[bot] in #2976
Fix #2973 bug with period in path by @pplantinga in #2974
Bandaid fix for torchaudio 2.9+ compatibility by @OscarFree in #2988
Feature caching proposal: CachedDynamicItem by @pplantinga in #2985
Adapting transducer greedy decoding by @younessdkhissi in #2975
Fix for Tracing ECAPA-TDNN Embedding Model by @jordanozang in #2982
fix speed perturb inversion by @gfdb in #2994
Add functions for running code once per node by @pplantinga in #2992
Replace torchaudio I/O with soundfile-based audio_io wrapper by @Copilot in #2989
SGMSE Voicebank Speech Enhancement Recipe by @jrochdi in #2947
Update versions of python due to 3.9 EOL by @pplantinga in #3001
fix issue with --arg value and --arg=value by @Adel-Moumen in #2999
Rename FetchConfig.use_auth_token to token and pass token to HF hub by @seohyunjun in #2996
FocalCodec [NeurIPS 2025] by @lucadellalib in #3000
Caching: add compression + filename + closing/loading by @Adel-Moumen in #3005
HDF5 integration not loading/caching etc. by @Adel-Moumen in #3007
Fix local_rank when using single process by @Adel-Moumen in #3006
Implement per-key padding configuration in PaddedBatch by @Adel-Moumen in #3008
Add init files so pages get picked up for docs by @pplantinga in #3011
ksponspeech remove by @Adel-Moumen in #3013
Replace deprecated torch.cuda.amp.custom_fwd with torch.amp.custom_fwd by @omidiu in #2922
Fix duplicate mic position in beamforming tutorial by @Adel-Moumen in #3019
Fix text_file assignment - SentencePiece by @pplantinga in #3016
Update CI requirements by @Adel-Moumen in #3023
add tests + doc + fix recipes + new get_available_cpu_count fn by @Adel-Moumen in #3025
Fix typo in literature reference by @Daheer in #3022
Update scorer.py by @raotnameh in #2617
SpeechLLM LibriSpeech recipe by @Adel-Moumen in #2885
Remove CTC CUDA + Move transducer loss in integrations by @Adel-Moumen in #3028
Add missing mean_stat_per_model method to StatObject_SB by @Mr-Neutr0n in #3029
support torch>=2.9.0 by @eschmidbauer in #3032
Fix mutable default arguments in function signatures by @Mr-Neutr0n in #3034
Fix author name typo: Abous-Rjeili → Abou-Rjeili by @georgesabr in #3038
Adding SENSE models by @MaryemBouziane in #2998
Recipe tests new release by @Adel-Moumen in #3042
v1.0.4 Dev -> Main by @Adel-Moumen in #3044
