SpeechBrain v0.5.14
This is a minor yet important release. It significantly increases the number of available features and fixes many small bugs and issues. The notable achievements are summarized below, and the complete list of changes appears at the bottom of this release note.
Notable achievements
- 22 new contributors, thank you so much, everyone!
- 31 new recipes (ASR, SLU, AST, AER, Interpretability, SSL).
- FULL automatic recipe testing.
- Increased coverage for the continuous integration over the code, URLs, YAML, recipes, and HuggingFace models.
- New Conformer Large model for ASR.
- Integration of Whisper for fine-tuning or inference.
- Full pre-training of wav2vec2 entirely re-implemented AND documented.
- Low resource Speech Translation with IWSLT.
- Many other novelties... see below.
What's Changed
- fix 1522 by @anautsch in #1526
- bug-fix: fixed OPEN_RIR data preparation process conflict. by @xin-w8023 in #1536
- add noise and reverberance version for BinauralWSJ0Mix by @huangzj421 in #1502
- fix distributed namespace by @anautsch in #1566
- feat: use member field instead of hard-code by @xin-w8023 in #1567
- Update logo to new version by @pplantinga in #1575
- IWSLT 2022 speech translation recipe by @mzboito in #1475
- Fix Issue #1277 timit recipe missing uppercase option by @Adel-Moumen in #1564
- Update README.md by @qanastek in #1577
- Output hidden states from all the transformer layers of huggingface_wav2vec by @BenoitWang in #1570
- Fix bugs of update_learning_rate by @wangxin22 in #1578
- Fix to use output of unsqueeze() in Tacotron2 parse_decoder_outputs() by @jqug in #1525
- wav2vec2 pretraining implemented with speechbrain by @RuABraun in #1312
- In filter_ctc_output(), remove redundant filtering by @olvb in #1584
- Fixed output_all_hiddens for hubert in huggingface_wav2vec by @gorinars in #1587
- Fix return value of batch_evaluation for separation recipes by @z-wony in #1555
- fix endless doctest despite no example by @anautsch in #1591
- Fix documented min python version to 3.7 by @asumagic in #1595
- Conformer separation by @ycemsubakan in #1519
- Add CTC recipe to AISHELL-1 by @BenoitWang in #1576
- Add templates for issues by @Adel-Moumen in #1588
- Added workaround for CyclicLR saving by @Gastron in #1683
- scikit-learn import and comment fix by @underdogliu in #1485
- Adding recipe for HiFiGAN training using LibriTTS dataset by @pradnya-git-dev in #1621
- fix LibriSpeech CTC pretrainer by @BenoitWang in #1594
- wav2vec German model added by @sangeet2020 in #1557
- issue 1615 typo fix by @sharmadhiraj86 in #1700
- typo in TransformerASR.py by @Adel-Moumen in #1704
- Causality in Conv2d by @fpaissan in #1608
- Switchboard Recipe by @dwgnr in #1460
- read_audio fixes and docs cleanup by @asumagic in #1592
- Fix path flake8 in pre-commit by @Adel-Moumen in #1721
- Added german_cleaners by @padmalcom in #1642
- fixing issue 1707 by @TParcollet in #1728
- explicit fetch args & download-only option by @anautsch in #1735
- fix sorting bug by @anautsch in #1730
- remove discussions references by @Adel-Moumen in #1737
- Fix torchaudio mel_normalized for Tacotron2&HifiGAN by @BenoitWang in #1740
- Whisper finetuning by @Adel-Moumen in #1717
- loss must be avg when BS>1 when calling evaluate_batch() by @sangeet2020 in #1744
- [FIX] Flush gradients and save memory for validation. by @MartinKocour in #1739
- add coloring in tqdm progress bar by @sangeet2020 in #1573
- Fix librispeech transformer recipe by @TParcollet in #1775
- 🖍️ improving type-hints in speechbrain/pretrained/interfaces.py by @jonasvdd in #1725
- Enabling the retrieval of whisper's hidden states by @Hguimaraes in #1751
- Added fix to use DDP with hifi_gan training on ljspeech by @padmalcom in #1781
- Fix wav2vec2 masking by @TParcollet in #1799
- fix #1794 by @Adel-Moumen in #1805
- refactor: recipe testing CSVs by @anautsch in #1600
- fix 1788 by @BenoitWang in #1842
- fix docstring for pooling by @BenoitWang in #1843
- Whisper finetuning common voice by @poonehmousavi in #1809
- fixing the convtasnet causal=True bug by @ycemsubakan in #1851
- Fix Whisper doc + improve max_decode_ratio by @Adel-Moumen in #1858
- Rewrite multi-GPU documentation by @asumagic in #1861
- SLU Media recipe by @GaelleLaperriere in #1172
- edits for refactoring check tool by @anautsch in #1838
- minor fixes for recipe testing by @anautsch in #1872
- Fix Whisper avoid_if_longer_than never used by @Adel-Moumen in #1882
- Starting a recipe for ESC50 by @ycemsubakan in #1605
- Fix for #1886 by @anautsch in #1890
- fix batch_to_right by @anthony-wss in #1884
- Fixes for pre-release testing by @anautsch in #1895
- Fix Conformer Instabilities and add Large Model by @TParcollet in #1892
- Downsampling by @salah-zaiem in #1888
- fix core.py bf16 by @Adel-Moumen in #1898
- S2SGreedySearcher : Do not continue decoding when EOS token was generated for all samples from a batch by @Jeronymous in #1899
- quick fixes before minor by @anautsch in #1896
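As an illustration of the S2SGreedySearcher change listed above (#1899), the sketch below shows the general idea of stopping greedy decoding once every sample in the batch has produced an EOS token. It is a minimal, framework-free sketch and not SpeechBrain's implementation: `greedy_decode`, `step_fn`, and the scripted tokens are hypothetical names and data introduced only for this example.

```python
from typing import Callable, List, Sequence, Tuple


def greedy_decode(
    step_fn: Callable[[int, List[List[int]]], Sequence[int]],
    batch_size: int,
    eos_id: int,
    max_steps: int,
) -> Tuple[List[List[int]], int]:
    """Greedy decoding that stops once every sample has emitted EOS.

    step_fn(t, hyps) is a hypothetical stand-in for the model: it returns
    the most likely next token for each sample at step t.
    """
    hyps: List[List[int]] = [[] for _ in range(batch_size)]
    finished = [False] * batch_size
    steps_run = 0
    for t in range(max_steps):
        next_tokens = step_fn(t, hyps)
        steps_run += 1
        for i, tok in enumerate(next_tokens):
            if finished[i]:
                continue  # ignore anything predicted after EOS
            if tok == eos_id:
                finished[i] = True
            else:
                hyps[i].append(tok)
        if all(finished):
            break  # every sample is done, so further steps are wasted work
    return hyps, steps_run


# Scripted "model" for a batch of two samples (assumed data, EOS id = 0).
EOS = 0
script = [[5, 7], [6, EOS], [EOS, EOS], [9, 9], [9, 9]]
hyps, steps = greedy_decode(
    lambda t, _hyps: script[t], batch_size=2, eos_id=EOS, max_steps=5
)
```

With this scripted input, the loop exits after the third step instead of running all five, since both samples have reached EOS by then.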
New Contributors
- @xin-w8023 made their first contribution in #1536
- @mzboito made their first contribution in #1475
- @qanastek made their first contribution in #1577
- @wangxin22 made their first contribution in #1578
- @jqug made their first contribution in #1525
- @olvb made their first contribution in #1584
- @gorinars made their first contribution in #1587
- @z-wony made their first contribution in #1555
- @asumagic made their first contribution in #1595
- @pradnya-git-dev made their first contribution in #1621
- @sangeet2020 made their first contribution in #1557
- @sharmadhiraj86 made their first contribution in #1700
- @fpaissan made their first contribution in #1608
- @dwgnr made their first contribution in #1460
- @padmalcom made their first contribution in #1642
- @MartinKocour made their first contribution in #1739
- @jonasvdd made their first contribution in #1725
- @poonehmousavi made their first contribution in #1809
- @GaelleLaperriere made their first contribution in #1172
- @anthony-wss made their first contribution in #1884
- @salah-zaiem made their first contribution in #1888
- @Jeronymous made their first contribution in #1899
Full Changelog: v0.5.13...v0.5.14