Commit 4aff580

Author: Titouan Parcollet
Commit message: update new model at 2.8
Parent: 40eeb35

4 files changed: 46 additions & 305 deletions

recipes/LibriSpeech/ASR/transducer/README.md

Lines changed: 6 additions & 5 deletions
@@ -18,13 +18,14 @@ pip install numba
 python train.py train/train.yaml
 ```
 
-# Librispeech 100H Results
+# Librispeech Results
 
-| Release | hyperparams file | Val. CER | Val. WER | Test WER (test clean) | Model link | GPUs |
-|:-------------:|:---------------------------:| ------:| :-----------:| :------------------:| --------:| :-----------:|
-| 2020-10-22 | train.yaml | 5.2 | GS: 11.45 | BS (beam=4): 11.03 | Not Available | 1xRTX-8000 48GB |
+Dev. clean is evaluated with Greedy Decoding, while the test sets use either Greedy Decoding or an RNNLM + Beam Search.
+
+| Release | hyperparams file | Dev. Clean | Test-clean Greedy | Test-other Greedy | Test-clean BS+RNNLM | Test-other BS+RNNLM | Model link | GPUs |
+|:-------------:|:---------------------------:| :------:| :-----------:| :------------------:| :------------------:| :------------------:| :--------:| :-----------:|
+| 2020-10-22 | conformer_transducer.yaml | 3.0 | ... | ... | 2.8 | ... | Not Available | 4xA100 80GB |
 
-The output folder with the checkpoints and training logs is available [here](https://drive.google.com/drive/folders/17kEW0crU3tyP-8-u5TeoFom4ton_B-j2?usp=sharing).
 
 
 # **About SpeechBrain**
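The README note above contrasts Greedy Decoding with Beam Search fused with an RNNLM. A minimal pure-Python sketch of the two strategies follows; the toy score tables stand in for the recipe's transducer network and trained RNNLM, and every name and value here (`am_log_probs`, `lm_log_probs`, `lm_weight`, the 4-symbol vocabulary) is an illustrative assumption, not SpeechBrain's actual API:

```python
import math

# Toy 4-symbol vocabulary; index 0 is the blank symbol.
VOCAB = ["<blank>", "a", "b", "c"]

def am_log_probs(prefix):
    # Hypothetical acoustic log-scores keyed on prefix length:
    # first emit "a", then "b", then prefer blank.
    table = {0: [-3.0, -0.1, -2.0, -4.0],
             1: [-2.0, -3.0, -0.2, -4.0]}
    return table.get(len(prefix), [-0.05, -4.0, -4.0, -4.0])

def lm_log_probs(prefix):
    # Hypothetical LM log-scores (uniform here); a real RNNLM
    # would condition on the decoded prefix.
    return [0.0, math.log(0.25), math.log(0.25), math.log(0.25)]

def greedy_decode(max_steps=5):
    """Pick the single best symbol at each step; stop on blank."""
    prefix = []
    for _ in range(max_steps):
        scores = am_log_probs(prefix)
        best = max(range(len(VOCAB)), key=lambda i: scores[i])
        if best == 0:  # blank ends decoding in this simplified sketch
            break
        prefix.append(VOCAB[best])
    return "".join(prefix)

def beam_decode(beam_size=2, lm_weight=0.3, max_steps=5):
    """Keep the top-k hypotheses, shallow-fusing the LM score."""
    beams = [([], 0.0)]  # (prefix, cumulative log-score)
    for _ in range(max_steps):
        candidates = []
        for prefix, score in beams:
            am = am_log_probs(prefix)
            lm = lm_log_probs(prefix)
            for i, sym in enumerate(VOCAB):
                # Blank keeps the prefix and gets no LM credit.
                fused = score + am[i] + (lm_weight * lm[i] if i != 0 else 0.0)
                new_prefix = prefix if i == 0 else prefix + [sym]
                candidates.append((new_prefix, fused))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return "".join(beams[0][0])
```

On this toy example both strategies agree; the table's gap between greedy and BS+RNNLM WER comes from cases where LM fusion re-ranks near-tied hypotheses.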

recipes/LibriSpeech/ASR/transducer/hparams/conformer_transducer.yaml

Lines changed: 12 additions & 11 deletions
@@ -3,9 +3,9 @@
 # Encoder: Conformer
 # Decoder: LSTM + beamsearch + RNNLM
 # Tokens: BPE with unigram
-# losses: Transducer + CTC + CE
+# losses: Transducer + CTC (optional) + CE (optional)
 # Training: Librispeech 960h
-# Authors: Titouan Parcollet 2022, Abdel HEBA, Mirco Ravanelli, Sung-Lin Yeh 2020
+# Authors: Titouan Parcollet 2023, Abdel HEBA, Mirco Ravanelli, Sung-Lin Yeh 2020
 # ############################################################################
 
 # Seed needs to be set at top of yaml, before objects with parameters are made
@@ -58,7 +58,8 @@ ce_weight: 0.0 # Multitask with CE for the decoder (0.0 = disabled)
 max_grad_norm: 5.0
 loss_reduction: 'batchmean'
 
-# Used if and only if dynamic batching is set to False
+# The batch size is used if and only if dynamic batching is set to False.
+# Validation and testing are done with fixed batches and not dynamic batching.
 batch_size: 8
 grad_accumulation_factor: 1
 sorting: random
@@ -80,10 +81,10 @@ valid_dataloader_opts:
 test_dataloader_opts:
     batch_size: !ref <batch_size_valid>
 
-# This setup works well for V100 32GB GPU, adapts it to your needs.
+# This setup works well for an A100 80GB GPU; adapt it to your needs.
 # Or turn it off (but training speed will decrease)
 dynamic_batching: True
-max_batch_len: 900
+max_batch_len: 400
 max_batch_len_val: 100 # we reduce it as the beam is much wider (VRAM)
 num_bucket: 200
 
@@ -221,11 +222,11 @@ dec: !new:speechbrain.nnet.RNN.LSTM
     re_init: True
     dropout: 0.1
 
-# For MTL with LM over the decoder
-dec_lin: !new:speechbrain.nnet.linear.Linear
-    input_size: !ref <joint_dim>
-    n_neurons: !ref <output_neurons>
-    bias: False
+# For MTL with LM over the decoder (uncomment to activate)
+# dec_lin: !new:speechbrain.nnet.linear.Linear
+#     input_size: !ref <joint_dim>
+#     n_neurons: !ref <output_neurons>
+#     bias: False
 
 # For MTL
 ce_cost: !name:speechbrain.nnet.losses.nll_loss
@@ -275,7 +276,7 @@ modules:
     proj_ctc: !ref <proj_ctc>
    proj_dec: !ref <proj_dec>
     proj_enc: !ref <proj_enc>
-    dec_lin: !ref <dec_lin>
+    # dec_lin: !ref <dec_lin>
 
 # for MTL
 # update model if any HEAD module is added
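The yaml above switches `dynamic_batching` on and lowers `max_batch_len` from 900 to 400, i.e. batches are built by total utterance length rather than a fixed count. A minimal sketch of that idea, assuming durations in seconds; the real recipe uses SpeechBrain's dynamic batch sampler, and `make_dynamic_batches` is an illustrative name, not its API:

```python
def make_dynamic_batches(durations, max_batch_len):
    """Group length-sorted utterances so each batch's summed duration
    stays at or under the cap (cf. max_batch_len in the yaml)."""
    batches, current, total = [], [], 0.0
    # Sorting by duration keeps batches homogeneous, reducing padding waste.
    for idx, dur in sorted(enumerate(durations), key=lambda p: p[1]):
        if current and total + dur > max_batch_len:
            batches.append(current)
            current, total = [], 0.0
        current.append(idx)
        total += dur
    if current:
        batches.append(current)
    return batches

# Three short clips fit together; long clips fall into their own batches.
batches = make_dynamic_batches([3.0, 12.0, 5.0, 7.0, 2.0], max_batch_len=10)
```

Note that a single utterance longer than the cap still forms its own (oversized) batch, which is why a lower `max_batch_len_val` is used when the decoding beam widens and VRAM per utterance grows.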

recipes/LibriSpeech/ASR/transducer/hparams/train.yaml

Lines changed: 0 additions & 281 deletions
This file was deleted.
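The hparams above describe the losses as "Transducer + CTC (optional) + CE (optional)", with `ce_weight: 0.0` disabling the CE branch. A minimal sketch of that weighted multitask objective, assuming scalar per-batch losses; the function and parameter names here are illustrative, not the recipe's exact code:

```python
def total_loss(transducer_loss, ctc_loss, ce_loss,
               ctc_weight=0.0, ce_weight=0.0):
    """Weighted multitask objective: a weight of 0.0 disables that
    auxiliary loss, matching ce_weight: 0.0 in the yaml."""
    loss = transducer_loss
    if ctc_weight > 0.0:
        loss += ctc_weight * ctc_loss
    if ce_weight > 0.0:
        loss += ce_weight * ce_loss
    return loss

# With both weights at 0.0 only the transducer loss is optimized.
```

In this setup the auxiliary CTC (on the encoder) and CE (on the decoder) terms act as regularizers early in training and can be dropped entirely, which is why the `dec_lin` head is commented out by default.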
