[FIX] Flush gradients and save memory for validation. by MartinKocour · Pull Request #1739 · speechbrain/speechbrain

MartinKocour · 2022-12-03T21:37:22Z

When fit_train is done, we should call optimizer.zero_grad(set_to_none=True), which releases the gradients from memory. In the current version of the code, the last call is optimizer.zero_grad(), which instead leaves zero-filled gradient tensors in memory.
This minor change should allow a bigger batch size during _fit_valid or evaluate, and thus speed them up. This is especially useful with large models such as SepFormer, which have a significant memory footprint.

More details can be found in this PyTorch Tuning Guide.
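The difference between the two calls can be illustrated with a minimal sketch. The toy model, optimizer, and variable names here are assumptions for illustration only and are not taken from the SpeechBrain code:

```python
import torch

# Hypothetical toy model and optimizer, for illustration only.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# One backward pass populates the .grad buffers.
loss = model(torch.randn(8, 4)).sum()
loss.backward()

# set_to_none=False keeps the buffers allocated and fills them with zeros.
optimizer.zero_grad(set_to_none=False)
grads_after_zeroing = [p.grad for p in model.parameters()]

# set_to_none=True drops the references, so the memory can be released.
optimizer.zero_grad(set_to_none=True)
grads_after_flush = [p.grad for p in model.parameters()]
```

After the first call every parameter still holds a zero-filled gradient tensor; after the second call each `.grad` is `None`.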

MartinKocour · 2022-12-03T23:49:21Z

I also spotted that gradients are not accumulated correctly, so I fixed that too.
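The thread does not spell out what the accumulation bug was. As a point of reference only, the usual gradient accumulation pattern can be sketched as follows; the model, data, and `accumulation_steps` value are assumptions, and this is not the code from the PR:

```python
import torch

torch.manual_seed(0)

# Hypothetical toy setup, for illustration only.
model = torch.nn.Linear(4, 1)
loss_fn = torch.nn.MSELoss()
data = torch.randn(8, 4)
target = torch.randn(8, 1)
accumulation_steps = 4  # assumed value

# Reference: gradient of the loss over the full batch.
loss_fn(model(data), target).backward()
full_batch_grad = model.weight.grad.clone()
model.zero_grad(set_to_none=True)

# Accumulation: one backward pass per micro-batch, with each loss scaled
# by 1 / accumulation_steps so that the summed gradients match the
# full-batch gradient. No zero_grad call happens between micro-batches.
for x, y in zip(data.chunk(accumulation_steps), target.chunk(accumulation_steps)):
    (loss_fn(model(x), y) / accumulation_steps).backward()
accumulated_grad = model.weight.grad.clone()
```

With equally sized micro-batches and a mean-reduced loss, the accumulated gradient should match the full-batch gradient up to floating-point error.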

mravanelli · 2022-12-04T15:32:12Z

Thank you for this PR, @MartinKocour. I think it can be useful. One thing I'm not sure about is why the function you defined (zero_grad) is also called in fit_batch. As you can see, in core.py you still call the original function self.optimizer.zero_grad(), but you modified the recipe files so that they call zero_grad in their fit_batch. I'm not sure we need to delete the gradient buffers at the end of each batch (might this slow things down?); most likely we only need to do it at the end of a training epoch, right?

MartinKocour · 2022-12-04T18:30:54Z

Mirco @mravanelli, in the recipes I am not deleting anything. My zero_grad(set_to_none=False) behaves the same as the original torch.optim.Optimizer.zero_grad(): with set_to_none=False (the default) it does not delete the gradients from memory, it just sets them to zero.

You are right, I don't have to call self.zero_grad() in the recipes. I just didn't want to duplicate the same lines of code. It is not necessary, though, and I can remove it. Either way, the behaviour with self.zero_grad() is the same as with the original code.

In core.py I did not use it because it is just a single line of code, and calling optimizer.zero_grad() is faster than calling Brain.zero_grad(), which contains an additional if statement.

pplantinga

Nice change! I especially appreciate saving space for evaluation and fixing the gradient accumulation bug. Quick question: have you verified that this does in fact save space for evaluation, and that nothing else (e.g. manual garbage collection) is needed?
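One rough way to check this on CPU is to count the bytes held by the `.grad` tensors before and after the flush. This is a sketch under assumptions: `grad_bytes` is a hypothetical helper, and the toy model stands in for a real recipe:

```python
import torch


def grad_bytes(model: torch.nn.Module) -> int:
    """Return the number of bytes currently held by the model's .grad tensors."""
    return sum(
        p.grad.numel() * p.grad.element_size()
        for p in model.parameters()
        if p.grad is not None
    )


# Hypothetical toy model and optimizer, for illustration only.
model = torch.nn.Linear(1024, 1024)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

model(torch.randn(2, 1024)).sum().backward()
bytes_after_backward = grad_bytes(model)

# Zeroing keeps the buffers, so the byte count is unchanged.
optimizer.zero_grad(set_to_none=False)
bytes_after_zeroing = grad_bytes(model)

# Setting to None drops the buffers, so no gradient memory is held.
optimizer.zero_grad(set_to_none=True)
bytes_after_flush = grad_bytes(model)
```

On a GPU, comparing `torch.cuda.memory_allocated()` before and after the flush would be a more direct check.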


MartinKocour · 2022-12-07T21:41:31Z

Ready to be merged

pplantinga

Nice! LGTM

Flush gradients

3782017

MartinKocour requested review from Adel-Moumen and pplantinga December 3, 2022 21:37

Fix gradient accumulation.

14169c5

Edit formatting

c861b1d

MartinKocour requested a review from TParcollet December 3, 2022 23:56

MartinKocour added 3 commits December 3, 2022 21:06

Add zero grad function.

30e8def

Add docstring

4408740

Update docstring.

33f7834

pplantinga reviewed Dec 6, 2022

Comment thread speechbrain/core.py Outdated

MartinKocour added 2 commits December 6, 2022 11:07

Revert "Add zero grad function."

aece4b5

This reverts commit 30e8def.

Call custom zero_grad also inside fit_batch

095eb7b

MartinKocour changed the title ~~[WIP] Optimizer~~ [FIX] Flush gradients and save memory for validation. Dec 7, 2022

pplantinga approved these changes Dec 8, 2022

anautsch merged commit d7ce222 into speechbrain:develop Dec 8, 2022

anautsch merged 8 commits into speechbrain:develop from MartinKocour:fix/optimizer

4 participants
