Commit 1fed12e

conglongli and jeffra authored
Update documentation about Megatron examples (deepspeedai#141)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
1 parent 2467552 commit 1fed12e

1 file changed

Lines changed: 2 additions & 0 deletions

README.md

@@ -4,6 +4,8 @@ This repo contains example models that use [DeepSpeed](https://github.com/micros
 
 # Note on Megatron examples
 
+NOTE: We are in the process of deprecating the 3 Megatron-LM snapshots in this repo. Our current and future features with Megatron-LM will use the [Megatron-DeepSpeed fork](https://github.com/microsoft/Megatron-DeepSpeed). Currently the Megatron-DeepSpeed fork supports 3D parallelism + ZeRO Stage 1 and Curriculum Learning. Please see this new fork for further updates in the process.
+
 Megatron-LM: This is a fairly old snapshot of Megatron-LM, and we have been using it to showcase the earlier features of DeepSpeed. This does not contain ZeRO-3 or 3D parallelism.
 
 Megatron-LM-v1.1.5-3D_parallelism: This is a relatively new Megatron (Oct 2020), but before Megatron started supporting 3D parallelism. We ported this version to showcase how to use 3D parallelism inside DeepSpeed with Megatron.
