Quaternion Networks - Bug Fixes & Improvements #2464
mravanelli merged 21 commits into speechbrain:develop from
Conversation
TParcollet
left a comment
Thanks! LGTM, see my questions.
```python
# (batch, channel, time)
x = x.transpose(1, -1)

if self.max_norm is not None:
```
Is renorming the individual quaternion components actually strictly equivalent to renorming the quaternion?
No, I don't believe it is strictly equivalent. I couldn't find any references for how to approach it, so I went with the simplest idea, which was the component-wise renorm.
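A small numeric sketch (values chosen for illustration, not from the PR) of why the two schemes are not equivalent: clipping each component's norm independently changes both the quaternion's direction and its magnitude, while scaling all four components by a shared factor preserves the direction.

```python
import torch

# One quaternion weight (1, 2, 2, 4), magnitude 5.
r, i, j, k = (torch.tensor([v]) for v in (1.0, 2.0, 2.0, 4.0))
magnitude = torch.sqrt(r**2 + i**2 + j**2 + k**2)  # tensor([5.])
max_norm = 1.0

# Component-wise: renorm each component on its own.
cw = [
    torch.renorm(w.view(1, 1), p=2, dim=0, maxnorm=max_norm)
    for w in (r, i, j, k)
]
cw_mag = torch.sqrt(sum(w**2 for w in cw))
# Each component is clipped to 1, so the resulting magnitude is 2,
# not max_norm, and the direction (1:2:2:4) is lost.

# Magnitude-based: one shared scale factor per quaternion.
factor = torch.clamp(magnitude, max=max_norm) / magnitude
mb = [w * factor for w in (r, i, j, k)]
mb_mag = torch.sqrt(sum(w**2 for w in mb))
# Magnitude is exactly max_norm; the component ratios are unchanged.
```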
@Drew-Wagner, could you please fix the conflicts, merge the latest development, and do the last modifications?
a95986e to d7b5b7f
```python
def renorm_quaternion_weights_inplace(
    r_weight, i_weight, j_weight, k_weight, max_norm
):
    """Renorms the magnitude of the quaternion-valued weights.

    Arguments
    ---------
    r_weight : torch.Parameter
        The real-component weights.
    i_weight : torch.Parameter
        The i-component weights.
    j_weight : torch.Parameter
        The j-component weights.
    k_weight : torch.Parameter
        The k-component weights.
    max_norm : float
        The maximum norm of the magnitude of the quaternion weights.
    """
    weight_magnitude = torch.sqrt(
        r_weight.data**2
        + i_weight.data**2
        + j_weight.data**2
        + k_weight.data**2
    )
    renormed_weight_magnitude = torch.renorm(
        weight_magnitude, p=2, dim=0, maxnorm=max_norm
    )
    factor = renormed_weight_magnitude / weight_magnitude

    r_weight.data *= factor
    i_weight.data *= factor
    j_weight.data *= factor
    k_weight.data *= factor
```
@TParcollet Please review this implementation, which renorms the weights according to the magnitude of the quaternions rather than by individual components.
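For reference, the helper from the diff can be exercised standalone. The shapes and values below are illustrative, not taken from the PR:

```python
import torch


def renorm_quaternion_weights_inplace(
    r_weight, i_weight, j_weight, k_weight, max_norm
):
    """Renorm the quaternion magnitudes in place (as in the diff above)."""
    weight_magnitude = torch.sqrt(
        r_weight.data**2
        + i_weight.data**2
        + j_weight.data**2
        + k_weight.data**2
    )
    renormed = torch.renorm(weight_magnitude, p=2, dim=0, maxnorm=max_norm)
    factor = renormed / weight_magnitude
    r_weight.data *= factor
    i_weight.data *= factor
    j_weight.data *= factor
    k_weight.data *= factor


# Four component tensors standing in for the r/i/j/k weight parameters.
weights = [torch.nn.Parameter(torch.randn(4, 8)) for _ in range(4)]
renorm_quaternion_weights_inplace(*weights, max_norm=1.0)

# torch.renorm with dim=0 clips each row of the magnitude tensor,
# so every row's quaternion-magnitude vector now has 2-norm <= 1.0.
mag = torch.sqrt(sum(w.data**2 for w in weights))
assert bool((mag.norm(p=2, dim=1) <= 1.0 + 1e-6).all())
```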
- A view was incorrectly being applied to broadcast tensors together
- Adds `max_norm` option
- `max_norm` - `swap` - rename `.b` -> `.bias`
- `in_channels` must be divided by `groups` when creating kernels - add checks to ensure divisibility
- The mean was not being subtracted from the input
- `rsqrt` was 8x faster
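On the last point: `torch.rsqrt(x)` computes the same value as `1 / torch.sqrt(x)` in a single fused kernel, which is where the reported speedup comes from. A minimal equivalence check (the speedup ratio itself depends on hardware and is not verified here):

```python
import torch

# Sample strictly positive inputs so both expressions are well defined.
x = torch.rand(1000) + 0.1

slow = 1.0 / torch.sqrt(x)  # two kernels: sqrt, then divide
fast = torch.rsqrt(x)       # one fused reciprocal-sqrt kernel

assert torch.allclose(slow, fast)
```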
d7b5b7f to 0697c73
Thank you @Drew-Wagner for these fixes!
What does this PR do?
This PR fixes several bugs which prevented the use of the quaternion network modules, and completes the collection by implementing avg and max pooling. No tests existed for the quaternion networks; this PR introduces a minimal (and still incomplete) set of tests.
Several bugs were present in the existing quaternion network modules and are fixed by this PR.
Several adjustments were made to improve compatibility of the QConv interface with regular convolution modules:

- A `swap` option was added for QConv2d
- A `max_norm` option was added for the QConv and QLinear modules
- `.b` was renamed to `.bias`

A QPooling2d module is added which implements avg and max pooling.
Breaking Changes:

- `.b` to `.bias`; however, given the number of bugs present, it seems unlikely that anyone was depending on this.

Before submitting
PR review
Reviewer checklist