Extending Tensor Parallelism for IBM FMS: Sequence Parallelism #455

Open
Sibi-Git wants to merge 48 commits into foundation-model-stack:main from HPML-Team9:main

@Sibi-Git

This project extends the IBM Foundation Model Stack (FMS) to support both Tensor Parallelism (TP) and Sequence Parallelism (SP) in distributed model inference. TP shards parameters across GPUs but does not partition the sequence dimension, so activations outside the sharded layers (for example, in normalization layers) are replicated on every GPU, which wastes memory at long sequence lengths. We address this by integrating SP into the normalization layers, optimizing the layout transitions between sequence-sharded and tensor-sharded regions, and adding support for short sequences and for sequence lengths that are not divisible by the number of GPUs.
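As a rough illustration of the non-divisible and short-sequence cases mentioned above, the following is a minimal sketch of one common approach: split the sequence into equal-size chunks with ceiling division and pad the trailing shards. It is not the code in this PR; the helper names (`shard_bounds`, `pad_shard`, `gather_sequence`) and the choice of padding are assumptions made for illustration, and plain Python lists stand in for tensors.

```python
from typing import List, Sequence, Tuple


def shard_bounds(seq_len: int, world_size: int) -> List[Tuple[int, int]]:
    """Return the (start, end) token range owned by each rank.

    Hypothetical helper: uses ceiling division so every rank has the same
    nominal chunk size; trailing ranks may own a short or empty range.
    """
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    chunk = -(-seq_len // world_size)  # ceil(seq_len / world_size)
    bounds = []
    for rank in range(world_size):
        start = min(rank * chunk, seq_len)
        end = min(start + chunk, seq_len)
        bounds.append((start, end))
    return bounds


def pad_shard(tokens: Sequence[int], rank: int, world_size: int,
              pad_value: int = 0) -> List[int]:
    """Return this rank's slice, padded so all ranks hold equal-length shards."""
    chunk = -(-len(tokens) // world_size)
    start, end = shard_bounds(len(tokens), world_size)[rank]
    local = list(tokens[start:end])
    return local + [pad_value] * (chunk - len(local))


def gather_sequence(shards: Sequence[Sequence[int]], seq_len: int) -> List[int]:
    """Concatenate padded shards in rank order and drop the trailing padding."""
    flat = [tok for shard in shards for tok in shard]
    return flat[:seq_len]


if __name__ == "__main__":
    tokens = list(range(10))  # 10 tokens across 4 ranks: not divisible
    shards = [pad_shard(tokens, r, 4) for r in range(4)]
    print(shards)
    print(gather_sequence(shards, len(tokens)) == tokens)
```

Equal-size padded shards keep collective operations such as all-gather simple, at the cost of a small amount of wasted computation on the padded positions. Whether the PR pads, uses uneven shards, or falls back to replication for very short sequences is not stated in the description.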

Members:

  • Maria Surani (ms7019)
  • Ryan Ghosh (rg3681)
  • Sibi Marappan (sm5726)

aw471 and others added 30 commits December 19, 2024 14:06
Automate tensor parallel plan generation, sequence parallelism supported

4 participants