Recipe for SEP-28k dataset by IliasMAOUDJ · Pull Request #2574 · speechbrain/speechbrain

IliasMAOUDJ · 2024-06-18T08:15:10Z

What does this PR do?

This PR adds the SEP-28k dataset (https://github.com/apple/ml-stuttering-events-dataset) to the list of recipes, using the partitioning suggested in https://rdcu.be/dK8Ei (https://github.com/th-nuernberg/ml-stuttering-events-dataset-extended).
Additionally, we provide a repository to download the deleted podcasts "StrongVoices" and "IStutterSoWhat", so that every researcher can access the full dataset.
We provide a minimal working example for training.

Additional: This adds a new task to SpeechBrain: Stuttering Event Detection and/or Pathological Speech Detection. Other recipes (FluencyBank, UCLASS, ...) could be added if the SpeechBrain reviewers are interested.

Before submitting

Did you read the contributor guideline?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you list all the breaking changes introduced by this pull request?
Does your code adhere to project-specific code style and conventions?

PR review

Reviewer checklist

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified
Confirm that the changes adhere to compatibility requirements (e.g., Python version, platform)
Review the self-review checklist to ensure the code is ready for review

IliasMAOUDJ · 2024-06-19T15:49:52Z

Unit tests and doctests pass successfully, but the linters fail on files that I didn't change.

asumagic · 2024-06-25T12:22:13Z

Note: CI is currently failing for reasons outside of this PR. Once #2581 is merged (I will notify you), please update the fork branch against develop.

asumagic · 2024-06-25T12:31:25Z

> Note: CI is currently failing for reasons outside of this PR. Once #2581 is merged (I will notify you), please update the fork branch against develop.

Done, and it seems to work. You can now do that.

asumagic · 2024-07-02T08:20:06Z

> > Note: CI is currently failing for reasons outside of this PR. Once #2581 is merged (I will notify you), please update the fork branch against develop.
>
> Done, and it seems to work. You can now do that.

Sorry, you need to merge develop again; the CI fix had broken.

IliasMAOUDJ · 2024-07-02T13:04:56Z

Thank you for updating.

pplantinga

Thanks for adding this recipe for a new task; it will be great to have in the toolkit.

I was not able to run this recipe yet. When I tried, I got `FileNotFoundError: [Errno 2] No such file or directory: 'manifests/train.csv'`, which suggests that the recipe is not creating the manifest files as our recipes typically do. Please add code to create the manifests automatically, unless I'm somehow running the recipe incorrectly.

I will try to run again once the comments here are addressed.
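For illustration, the kind of automatic manifest creation requested in this review could be sketched as follows. This is only a minimal sketch under stated assumptions, not the code added in this PR: the function name `create_manifest`, the column names `ID`, `wav` and `label`, and the demo rows are all hypothetical, and a real preparation script would build its rows from the dataset's own label files.

```python
import csv
from pathlib import Path


def create_manifest(manifest_path, rows, fieldnames=("ID", "wav", "label")):
    """Write a CSV manifest unless it already exists.

    Hypothetical helper: returns True if the file was written and
    False if an existing manifest was left untouched.
    """
    manifest_path = Path(manifest_path)
    if manifest_path.exists():
        # Preparation was already done in an earlier run, so skip it.
        return False
    # Create the parent folder (e.g. "manifests") if it is missing.
    manifest_path.parent.mkdir(parents=True, exist_ok=True)
    with open(manifest_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(fieldnames))
        writer.writeheader()
        writer.writerows(rows)
    return True


if __name__ == "__main__":
    import tempfile

    # Placeholder rows (assumed layout, not the real SEP-28k annotations).
    demo_rows = [
        {"ID": "clip_0", "wav": "clips/clip_0.wav", "label": 0},
        {"ID": "clip_1", "wav": "clips/clip_1.wav", "label": 1},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "manifests" / "train.csv"
        print(create_manifest(target, demo_rows))  # first call writes the file
        print(create_manifest(target, demo_rows))  # second call skips it
```

Calling such a helper at the start of training, before any data loading, would avoid the `FileNotFoundError` above on a fresh checkout while leaving existing manifests in place on later runs.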

IliasMAOUDJ · 2025-01-14T16:07:57Z

Thank you for your review. All comments were addressed. Hopefully the changes can be accepted.

Concerning the task name, I suggest "Stuttering-Detection", which is straightforward and precise. It can be considered a subcategory of Fluency/Speech Impairments if the toolkit ever welcomes other datasets for specific impairments (which can be caused by various pathologies such as Parkinson's disease, Alzheimer's disease, etc.).

pplantinga

Okay, I've made a few fixes, and the recipe seems to work now, recipe tests and all.

Please make sure the recipe is still running according to your expectations and address any lingering questions. Then this is ready for merge.

pplantinga · 2025-01-14T19:36:29Z

+Run the following command to train the model:
+`python train.py hparams/train.yaml`
+
+Note that this is a minimal working example. The model and training parameters should be modified accordingly.


Do you have results to share with a bigger/better model? A brief discussion of results achieved with the recipe is often included in the README.

IliasMAOUDJ · 2025-01-16

Thanks for your fixes. I ran the recipe after your changes and it works as intended. You can merge it.

I have not yet developed a model that improves on the literature, but I can point to papers dealing with this task.
I am using the dataset for research purposes rather than to develop "a better model". That research may lead to future development of the recipe; we can discuss it in private if you'd like more information.

IliasMAOUDJ and others added 13 commits June 17, 2024 17:33

Setup a working example

2a36fff

documentation and fix useless elements

ddf8b7f

fix yaml file

5de0026

Setup recipe test (PASSED)

b669935

Fix files

0864998

Merge branch 'speechbrain:develop' into SEP28k

8ea99e0

Merge branch 'SEP28k' of https://github.com/IliasMAOUDJ/speechbrain i…

f380b3b

…nto SEP28k

Revert "Merge branch 'speechbrain:develop' into SEP28k"

ebd3890

This reverts commit 8ea99e0, reversing changes made to 5de0026.

FIX prepare file

a2d63a5

Revert "fix yaml file"

6b7c3af

This reverts commit 5de0026.

fix speechtokenizer doctest not passing because of commented line

0ba0c9e

revert changes

b8fc7c3

fix errors in test

c8f82ec

Update README.md

0123455

IliasMAOUDJ added 5 commits June 25, 2024 15:14

Merge branch 'develop' of https://github.com/speechbrain/speechbrain …

d1e54d5

…into SEP28k

regen

dffba97

clean

bff9188

Revert "clean"

befd4cf

This reverts commit bff9188.

Revert "regen"

9342900

This reverts commit dffba97.

IliasMAOUDJ added 3 commits July 2, 2024 12:50

Merge branch 'SEP28k' into develop

025da37

Revert "Merge branch 'SEP28k' into develop"

94cec8d

This reverts commit 025da37, reversing changes made to 0691acd.

Revert "Revert "Merge branch 'SEP28k' into develop""

d979193

This reverts commit 94cec8d.

IliasMAOUDJ added 3 commits July 12, 2024 10:36

Merge branch 'develop' into SEP28k

a297a0b

Merge branch 'develop' into SEP28k

c53528b

Merge branch 'develop' into SEP28k

0f37213

Merge branch 'develop' into SEP28k

d21c40e

pplantinga requested changes Jan 12, 2025

IliasMAOUDJ and others added 4 commits January 14, 2025 15:42

Merge branch 'develop' into SEP28k

54ee717

Changes following request by reviewer

830c188

Most changes here follow the comments made by @pplantinga . TODO: Look into BinaryMetrics for score computation.

Use BinaryMetricStats for score computation

1706a9c

Harmonize the task name in recipes and tests folders

4cdf58b

pplantinga added 4 commits January 14, 2025 12:39

Move Stuttering-Detection to stuttering-detection

2c715cf

Add the missed path to the README file for SEP-28k

cd1e72f

Harmonize the recipe with the recipe test

6a27e1b

Revert change to voxceleb recipe not relevant to SEP28

75462bb

pplantinga added the "recipes" label (Changes to recipes only (add/edit)) Jan 14, 2025

pplantinga approved these changes Jan 14, 2025

pplantinga merged commit 388995a into speechbrain:develop Jan 16, 2025
