Libriheavy (Code from SAIC-Cambridge)#2781
Conversation
|
Thanks @shucongzhang! I think @mravanelli, @Adel-Moumen and @pplantinga could have a look (or even review) :-) |
Adel-Moumen
left a comment
There was a problem hiding this comment.
Hi @shucongzhang, thanks for this great PR! I left a couple of comments :)
|
Hi all, I ran the PR, and the scripts are all working! (nice) I only have concern regarding the splits. Indeed, the current state of the PR forces the user to use the dev/test sets of Libriheavy. Unfortunately, those splits contains audio all located in the large split. If you are only using the small/medium (usually because of storage bottleneck) sets, you can't use this PR. What I would suggest in this case, is:
Other than that, it's all good to me. I made one change by adding the possibility to modify SpeechBrain audio backend instead of using soundfile directly. We should also add this note in the README saying that we are using Soundfile for performances reasons. |
TParcollet
left a comment
There was a problem hiding this comment.
LGTM! Thanks @shucongzhang and @Adel-Moumen for the work! Let's advertise the recipe this week!
| ["id", "sig", "wrd", "tokens_bos", "tokens_eos", "tokens"], | ||
| ) | ||
|
|
||
| # 5. If Dynamic Batching is used, we instantiate the needed samplers. |
There was a problem hiding this comment.
I am all for that we remove the possibility of NOT usng dynamic batching in future recipes. It's a waste of compute.
What does this PR do?
This PR adds the recipes of data preparation and training AED models with Libriheavy.