Libriheavy (Code from SAIC-Cambridge) by shucongzhang · Pull Request #2781 · speechbrain/speechbrain

shucongzhang · 2024-12-09T21:26:57Z

What does this PR do?

This PR adds recipes for data preparation and for training AED models on Libriheavy.

TParcollet · 2024-12-09T21:35:52Z

Thanks @shucongzhang! I think @mravanelli, @Adel-Moumen and @pplantinga could have a look (or even review) :-)
One last thing to do @shucongzhang, as it's a new recipe, you'd need to add a recipe testing case (see the tests/ folder). You'd need to add a csv file into a folder with a line equivalent to the ones we have for the LibriSpeech test csv. You may also want to write somewhere that the full dataset is 50k hours+ so prettttttty heavy to store :p

Adel-Moumen

Hi @shucongzhang, thanks for this great PR! I left a couple of comments :)

Adel-Moumen · 2025-02-07T18:00:20Z

Hi all,

I ran the PR, and the scripts are all working! (nice)

I only have one concern, regarding the splits. The PR currently forces the user to use the dev/test sets of Libriheavy. Unfortunately, the audio for those splits is all located in the large split. If you only use the small/medium sets (usually because of storage constraints), you can't use this PR. What I would suggest in this case is:

- Do not force the dev or test set from Libriheavy in the prepare.py.
- Document this behaviour in the README and make sure the user is aware that they can alternatively use the dev/test sets of LibriSpeech directly, by following our recipes to prepare the data and changing the location of valid_csv and test_csv.

Other than that, it all looks good to me. I made one change: I added the possibility to select the SpeechBrain audio backend instead of using soundfile directly. We should also add a note in the README saying that we use soundfile for performance reasons.
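For readers who only downloaded the small/medium subsets, one way to see the problem described above is to check whether the audio listed in a dev/test manifest is actually on disk. The snippet below is only a sketch and is not part of the PR: the helper name `missing_audio` is hypothetical, and the `wav` column name is an assumption borrowed from the LibriSpeech-style CSVs, so adjust it to whatever the prepared manifest really uses.

```python
import csv
import os
from typing import List


def missing_audio(csv_path: str, column: str = "wav") -> List[str]:
    """Return the audio paths listed in a manifest that are absent on disk.

    `column` is an assumed name for the column holding the audio path.
    """
    missing = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            path = row[column]
            # Collect every referenced file that cannot be found locally.
            if not os.path.isfile(path):
                missing.append(path)
    return missing
```

If the returned list is non-empty for the Libriheavy dev/test manifests, pointing valid_csv and test_csv at LibriSpeech manifests, as suggested above, is probably the simpler route.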

TParcollet

LGTM! Thanks @shucongzhang and @Adel-Moumen for the work! Let's advertise the recipe this week!

TParcollet · 2025-02-08T16:44:05Z

+        ["id", "sig", "wrd", "tokens_bos", "tokens_eos", "tokens"],
+    )
+
+    # 5. If Dynamic Batching is used, we instantiate the needed samplers.


I am all for removing the possibility of NOT using dynamic batching in future recipes. It's a waste of compute.
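To illustrate why fixed-size batches waste compute, here is a rough sketch of the idea behind dynamic batching: sort utterances by duration and fill each batch up to a padded-length budget, so short and long utterances are not padded together. This is an illustration only, not SpeechBrain's sampler; the function name `dynamic_batches` and the `max_batch_seconds` parameter are hypothetical.

```python
from typing import List, Sequence


def dynamic_batches(
    durations: Sequence[float], max_batch_seconds: float
) -> List[List[int]]:
    """Group utterance indices so each batch's padded length fits a budget."""
    # Visit utterances from shortest to longest.
    order = sorted(range(len(durations)), key=lambda i: durations[i])
    batches: List[List[int]] = []
    current: List[int] = []
    for idx in order:
        # Padded cost = number of items * longest item. Because of the
        # ascending sort, the candidate is the longest item in the batch.
        cost = (len(current) + 1) * durations[idx]
        if current and cost > max_batch_seconds:
            batches.append(current)
            current = []
        current.append(idx)
    if current:
        batches.append(current)
    return batches
```

With durations `[1.0, 9.0, 2.0, 8.0, 1.5]` and a 10-second budget, the three short utterances share one batch while the two long ones each get their own, so no short utterance is padded to 9 seconds.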

Shucong Zhang/Embedded AI /SRUK/Engineer/Samsung Electronics and others added 3 commits December 9, 2024 21:20

libriheavy AED

acdb94e

Update README.md

36ae236

Merge branch 'develop' into libriheavy

aa99142

TParcollet requested review from Adel-Moumen, mravanelli and pplantinga December 9, 2024 21:36

Adel-Moumen reviewed Dec 10, 2024

Shucong Zhang/Embedded AI /SRUK/Engineer/Samsung Electronics added 3 commits December 10, 2024 16:14

modified README and added the test csv

aaa57af

Merge branch 'libriheavy' of https://github.com/shucongzhang/speechbrain into libriheavy

542b709

update PR

56d8398

Adel-Moumen assigned Adel-Moumen and unassigned Adel-Moumen Feb 1, 2025

Adel-Moumen added 4 commits February 7, 2025 11:51

Merge remote-tracking branch 'origin/develop' into libriheavy

cfdeac3

add the possibility to have a different audio backend in SB

b4c3ce3

followup backend

d04dcf5

add Readme in root folder

a3da48e

Adel-Moumen and others added 12 commits February 7, 2025 13:24

add dynamic backend

fa3b28d

add backend info in header of train.py

0d70dad

placeholder

1cdc97b

dev split should be defined through the yaml

26a8a69

data root + dev split

0172b6d

dev

e1daa61

READMEs

1946dc1

pre-commit

5af3ed7

pre-commit fix: how is it possible???

2a71290

add dataclass doc

20880ed

last pre-commit inchallah

1fea170

Merge branch 'develop' into libriheavy

e141b6e

Adel-Moumen added 3 commits February 7, 2025 17:10

remove links

c3fa4ce

fix link

195c994

remove speed perturb as unused

1efd090

TParcollet approved these changes Feb 8, 2025

TParcollet merged commit 948fa2b into speechbrain:develop Feb 8, 2025

Adel-Moumen mentioned this pull request Feb 10, 2025

Bump torchaudio version + add bytes #2821

Merged

13 tasks

TParcollet merged 25 commits into speechbrain:develop from shucongzhang:libriheavy
