update to explicit RMSNorm #449

Open
mserranos wants to merge 5 commits into main from mserrano/rms_norm

Conversation

@mserranos
Collaborator

We would like to clean up the code a little by switching to an explicit RMSNorm.

Comment thread fms/models/granite.py
emb_v = self.config.emb_dim // self.config.nheads

- self.ln = LayerNormParameterized(
+ self.ln = RMSNorm(
Collaborator

I think these params need to be updated. The PyTorch contract is:

RMSNorm(normalized_shape, eps=None, elementwise_affine=True, device=None, dtype=None)

Collaborator Author

Well, I wish we could switch directly to torch.nn.RMSNorm, but currently that causes some deeptools compilation problems. For the time being, this is our own "RMSNorm", which is similar to torch.nn.RMSNorm but is decomposed differently.
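
For context, here is a minimal pure-Python sketch of the computation an RMSNorm performs, written as explicit steps. It is only an illustration of the general operation: it is not the fms implementation discussed in this PR, and it does not show how that implementation or torch.nn.RMSNorm is actually decomposed. The function name `rms_norm` and the `eps` default of `1e-6` are assumptions made for the example.

```python
import math
from typing import List, Sequence


def rms_norm(x: Sequence[float], weight: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Illustrative RMSNorm over one vector (hypothetical helper, not fms code)."""
    if not x:
        raise ValueError("x must be non-empty")
    if len(x) != len(weight):
        raise ValueError("x and weight must have the same length")

    # Step 1: mean of squares over the normalized dimension.
    mean_sq = sum(v * v for v in x) / len(x)

    # Step 2: reciprocal root-mean-square; eps (assumed default) guards against division by zero.
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)

    # Step 3: rescale each element, then apply the per-element gain.
    return [v * inv_rms * w for v, w in zip(x, weight)]


if __name__ == "__main__":
    # With unit weights, the output has a root-mean-square close to 1.
    print(rms_norm([3.0, 4.0], [1.0, 1.0]))
```

The three steps can be fused or split differently without changing the result mathematically, which is the sense in which two implementations can compute the same normalization while being decomposed differently.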

Collaborator Author

Maybe we should call this GraniteRMSNorm, since it is not a general RMSNorm in the sense of torch.nn.RMSNorm.

Mauricio J Serrano and others added 4 commits July 15, 2025 07:08
Signed-off-by: Mauricio J Serrano <mserrano@us.ibm.com>
Signed-off-by: Mauricio J Serrano <mserrano@us.ibm.com>
Signed-off-by: Mauricio J Serrano <mserrano@us.ibm.com>
Signed-off-by: kcirred <16872435+kcirred@users.noreply.github.com>
Signed-off-by: Mauricio J Serrano <mserrano@us.ibm.com>
@mserranos force-pushed the mserrano/rms_norm branch from 409d9e3 to 17b6583 on July 15, 2025 11:08
3 participants