Denoising Diffusion Probabilistic Models for SpeechBrain by flexthink · Pull Request #1599 · speechbrain/speechbrain

flexthink · 2022-10-10T05:00:11Z

This PR contains a basic implementation of Denoising Diffusion Probabilistic Models (DDPM) for SpeechBrain:
https://arxiv.org/pdf/2006.11239.pdf

An example recipe is provided that generates mel-spectrograms using the AudioMNIST dataset.
It also contains an implementation of DiffWave by @BenoitWang.
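
For readers unfamiliar with DDPM, the forward (noising) process and the simplified training objective from the linked paper can be sketched as below. This is a minimal pure-Python illustration under stated assumptions, not the code added by this PR: the function names and the toy `x0` vector are hypothetical, and the linear schedule (1e-4 to 0.02 over 1000 steps) uses the paper's defaults rather than values taken from the recipe.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (paper defaults, assumed here)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I).

    Returns the noisy sample together with the noise that was added,
    because the denoising model is trained to predict that noise.
    """
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal_scale * x + noise_scale * e for x, e in zip(x0, noise)]
    return x_t, noise


def simple_loss(noise, predicted_noise):
    """Mean squared error between the true and the predicted noise."""
    return sum((e - p) ** 2 for e, p in zip(noise, predicted_noise)) / len(noise)


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)

# Hypothetical stand-in for a few normalized mel-spectrogram values.
x0 = [0.5, -0.25, 1.0, 0.0]

# At the last step almost no signal remains, so x_t is close to pure noise.
x_t, noise = q_sample(x0, t=999, alpha_bars=alpha_bars, rng=rng)
```

In a real recipe, `predicted_noise` would come from a neural network (for example a UNet) that receives `x_t` and the step index `t`; generation then runs the learned reverse process starting from pure noise.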

BenoitWang · 2023-07-31T17:07:20Z

I just re-ran the FastSpeech training to check the shapes and found that squeeze(-1) is correct. Everything works fine. Thanks @flexthink, good catch!

mravanelli · 2023-08-13T19:44:44Z

I ran extensive tests, including the full diffusion recipes, and everything looks good to me now. The quality of the digits generated with diffusion is quite high, considering that AudioMNIST is a small dataset. Only the quality of latent diffusion is fairly low, but I think we can improve it in another PR. Thank you @flexthink and @BenoitWang for this great work!

flexthink added 16 commits September 16, 2022 14:25

Diffusion: Initial recipe for AudioMNIST + wrapper for the HuggingFace library

08208ba

Add sample generation

fb13554

Diffusion: Add a Transpose

3716b11

Diffusion: Some adjustments

8e83fd2

Diffusion: Decompose normalization for debugging convenience

ee52515

Diffusion: More changes

e9e6be0

Diffusion: Custom diffusion implementation for SpeechBrain

482da0f

Diffusion: Fixes

9ca1aa5

Diffusion: Implement stats, metrics and Tensorboard

95111a8

Diffusion: Re-enable generation

a758182

Diffusion: fixes, additional logging

0dec211

Diffusion: cosmetic changes

b903c4d

Diffusion: Cosmetic changes, fixed unit tests

8750859

Diffusion: Attribution

83975a6

Diffusion: Clean up OpenAI UNet, add a copyright notice per OpenAI requirements

669e280

Diffusion: Consistency tests, README

bf66e28

flexthink changed the title ~~Denoting Diffusion Probabilistic Models for SpeechBrain~~ Denoising Diffusion Probabilistic Models for SpeechBrain Oct 10, 2022

flexthink and others added 13 commits October 11, 2022 22:31

Merge branch 'develop' into diffusion-direct

4123ff6

Diffusion: cosmetic changes, remove gradient checkpointing

6233d14

Merge branch 'diffusion-direct' of https://github.com/flexthink/speechbrain into diffusion-direct

4938a59

Diffusion: Cosmetic changes

24edc97

Diffusion: Cosmetic changes

10150d4

Diffusion: Remove an obsolete parameter to pass yaml consistency

32f8296

Diffusion: Stability improvements

151114b

Change spec_norm_std to 1. Change annealing to WarmCoolDecayLRSchedule.

Diffusion: Add a generic loss scheduler, set it up

5fe372e

Diffusion: Temporary workaround for the torch 1.12 AdamW issue

faeda39

Diffusion: Add latent diffusion with a VAE

d1b5f56

Diffusion: Changed the default output directory

a4ee219

Diffusion: Latent diffusion updates/fixes/improvements

96a6368

Diffusion: Fixes

04f02fa

flexthink and others added 7 commits July 30, 2023 17:19

Merge branch 'diffusion-direct' of https://github.com/flexthink/speechbrain into diffusion-direct

9ecfcad

Diffusion: Fix typos

c5ed982

Diffusion: Push DoneDetector into the core

8ffc683

Diffusion: Fix a comment

416684b

Diffusion: Cosmetic fixes, renames

2e1045b

add examples, small fixes

2dd361f

fix conflict

eddff7a

BenoitWang reviewed Jul 31, 2023

flexthink and others added 19 commits July 31, 2023 20:03

Diffusion: Add docstrings

6bd3baa

Merge branch 'diffusion-direct' of https://github.com/flexthink/speechbrain into diffusion-direct

79738cd

Diffusion: Add doctests

c67f84d

Diffusion: Add doctests to unet, update README.

9f03d38

Diffusion: Cosmetic updates

c8edc66

Diffusion: Fixes re: metadata

5ba182a

Diffusion: Metadata update/clean-up

0612061

Diffusion: Fix recipe tests and make them work with existing examples

0552121

update interface, readme, etc.

ce92f79

Merge branch 'diffusion-direct' of https://github.com/flexthink/speechbrain into diffusion-direct

f7ffa39

update dropbox

7b76e2c

Merge remote-tracking branch 'upstream/develop' into diffusion-direct

9ba37a5

fix data preparation + other fixes

9e0ec36

reduce batch size by default to make sure it runs on a 32 GB GPU

cfe0ef4

Update README.md

432c482

Update README.md

e33fd71

remove unnecessary files

d2420a2

MAdmixture: Fix zero test loss

ab9e826

Update README.md

cea36b4

mravanelli approved these changes Aug 13, 2023

mravanelli merged commit 19afe0f into speechbrain:develop Aug 13, 2023

mravanelli merged 179 commits into speechbrain:develop from flexthink:diffusion-direct

4 participants
