Add snip_momentum structured pruning example with 80% sparsity ratio by ftian1 · Pull Request #348 · deepspeedai/DeepSpeedExamples

ftian1 · 2023-04-19T04:20:30Z

This PR demonstrates the snip_momentum structured pruning algorithm implemented in the companion DeepSpeed PR.

To reproduce the results below, run source ./bash_script/pruning_sparse_snip_momentum.sh with the PR mentioned above applied.

pattern	sparsity ratio	pruning method	epochs	acc / mm-acc
1x1	80%	DeepSpeed L1	2	0.8113/0.822
1x1	80%	snip_momentum	2	0.8176/0.822
4x1	80%	snip_momentum	10	0.8248/0.8305
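
For readers unfamiliar with the pattern and sparsity ratio columns, here is a minimal pure-Python sketch of block-wise pruning. It is not the DeepSpeed implementation: the function name `block_prune_mask` and the precomputed `scores` input are hypothetical, and how snip_momentum actually derives its importance scores is not shown here. The sketch only illustrates that with a 4x1 pattern, whole 4x1 blocks are kept or zeroed together until the requested fraction of blocks is pruned.

```python
def block_prune_mask(scores, sparsity, block_rows, block_cols):
    """Return a 0/1 mask that zeroes the lowest-scoring blocks.

    scores: 2-D list of non-negative importance scores (hypothetical input;
    a real pruner would compute these from weights and gradients).
    sparsity: fraction of blocks to prune, e.g. 0.8 for 80%.
    """
    rows, cols = len(scores), len(scores[0])
    assert rows % block_rows == 0 and cols % block_cols == 0

    # Sum the scores inside each block so a block is kept or pruned as a unit.
    blocks = []
    for r in range(0, rows, block_rows):
        for c in range(0, cols, block_cols):
            total = sum(
                scores[i][j]
                for i in range(r, r + block_rows)
                for j in range(c, c + block_cols)
            )
            blocks.append((total, r, c))

    # Prune the lowest-scoring fraction of blocks.
    blocks.sort(key=lambda b: b[0])
    n_prune = round(len(blocks) * sparsity)

    mask = [[1] * cols for _ in range(rows)]
    for _, r, c in blocks[:n_prune]:
        for i in range(r, r + block_rows):
            for j in range(c, c + block_cols):
                mask[i][j] = 0
    return mask


# Toy 8x5 score matrix: 4x1 blocks give 10 blocks, so 80% sparsity prunes 8.
toy_scores = [[i * 5 + j + 1 for j in range(5)] for i in range(8)]
toy_mask = block_prune_mask(toy_scores, sparsity=0.8, block_rows=4, block_cols=1)
```

With `block_rows=1, block_cols=1` the same function degenerates to the unstructured 1x1 case in the first two table rows.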

xiaoxiawu-microsoft · 2023-05-08T16:59:11Z

@ftian1, thanks for your patience. I have tested the method and it looks good. I will make some minor changes to the checkpoint saving.

* add mask for generation, otherwise the generation is broken (#468)
* Update model_utils.py (#471)
* Support huggyllama/llama-7b with DSPipeline (#484)

  During token encoding for the huggyllama/llama-7b model, self.tokenizer.batch_encode_plus returns a token_type_ids kwarg that isn't recognized in the model's generate function. This is due to AutoTokenizer returning LlamaTokenizerFast instead of LlamaTokenizer as seen in the model tokenizer config: https://huggingface.co/huggyllama/llama-7b/blob/8416d3fefb0cb3ff5775a7b13c1692d10ff1aa16/tokenizer_config.json#L24

  As a workaround, the DSPipeline will check to see if the tokenizer is LlamaTokenizerFast and only pass in input_tokens.input_ids to the self.model.generate(...) call if that is the case.
* Add snip_momentum structured pruning example with 80% sparsity ratio (#348)

---------

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Lev Kurilenko <113481193+lekurile@users.noreply.github.com>
Co-authored-by: Tian, Feng <feng.tian@intel.com>
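
The LlamaTokenizerFast workaround described in the commit message above can be sketched roughly as follows. This is an illustration under assumptions, not the DSPipeline code: the helper name `select_generate_inputs` is hypothetical, the tokenizer is matched by class name so the snippet runs without transformers installed, and the stand-in classes only mimic the real tokenizer types.

```python
def select_generate_inputs(tokenizer, input_tokens):
    """Pick the keyword arguments to forward to model.generate(...).

    Hypothetical helper: for LlamaTokenizerFast, forward only input_ids so the
    unsupported token_type_ids entry never reaches generate().
    """
    # Matching on the class name is an assumption made to keep this sketch
    # dependency-free; real code may well use isinstance instead.
    if type(tokenizer).__name__ == "LlamaTokenizerFast":
        return {"input_ids": input_tokens["input_ids"]}
    return dict(input_tokens)


class LlamaTokenizerFast:
    """Stand-in for the transformers class of the same name."""


class OtherTokenizer:
    """Stand-in for any tokenizer whose full output generate() accepts."""


# Shape of a batch_encode_plus result, with made-up token ids.
encoded = {
    "input_ids": [[1, 2, 3]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 0]],
}

llama_kwargs = select_generate_inputs(LlamaTokenizerFast(), encoded)
other_kwargs = select_generate_inputs(OtherTokenizer(), encoded)
```

In a real pipeline the returned dictionary would be unpacked into the generate call, along the lines of `model.generate(**kwargs)`.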

ftian1 · 2023-05-11T07:41:18Z

@xiaoxiawu-microsoft, I appreciate your effort.

* add mask for generation, otherwise the generation is broken
* .
* merge master (#504)
  * add mask for generation, otherwise the generation is broken (#468)
  * Update model_utils.py (#471)
  * Support huggyllama/llama-7b with DSPipeline (#484)

    During token encoding for the huggyllama/llama-7b model, self.tokenizer.batch_encode_plus returns a token_type_ids kwarg that isn't recognized in the model's generate function. This is due to AutoTokenizer returning LlamaTokenizerFast instead of LlamaTokenizer as seen in the model tokenizer config: https://huggingface.co/huggyllama/llama-7b/blob/8416d3fefb0cb3ff5775a7b13c1692d10ff1aa16/tokenizer_config.json#L24

    As a workaround, the DSPipeline will check to see if the tokenizer is LlamaTokenizerFast and only pass in input_tokens.input_ids to the self.model.generate(...) call if that is the case.
  * Add snip_momentum structured pruning example with 80% sparsity ratio (#348)

  ---------

  Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
  Co-authored-by: Lev Kurilenko <113481193+lekurile@users.noreply.github.com>
  Co-authored-by: Tian, Feng <feng.tian@intel.com>
* .
* change eos
* .
* resolve tokenizer issue
* add script
* .

---------

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Lev Kurilenko <113481193+lekurile@users.noreply.github.com>
Co-authored-by: Tian, Feng <feng.tian@intel.com>

Add snip_momentum structured pruning example with 80% sparsity ratio

f453441

ftian1 requested review from RezaYazdaniAminabadi, ShadenSmith, arashb, awan-10, cli99, conglongli, duli2012, eltonzheng, jeffra, minjiaz, mrwyattii, samyam, tjruwase, xiaoxiawu-microsoft and yaozhewei as code owners April 19, 2023 04:20

ftian1 mentioned this pull request Apr 19, 2023

Add snip_momentum structured pruning which supports higher sparse ratio deepspeedai/DeepSpeed#3300

Merged

yaozhewei assigned yaozhewei and xiaoxiawu-microsoft Apr 24, 2023

xiaoxiawu-microsoft merged commit 2ec4be7 into deepspeedai:master May 8, 2023

Syulin7 pushed a commit to Syulin7/DeepSpeedExamples that referenced this pull request May 15, 2023

Add snip_momentum structured pruning example with 80% sparsity ratio (deepspeedai#348)

26efcc3

leocnj pushed a commit to leocnj/DeepSpeedExamples that referenced this pull request May 27, 2023

Add snip_momentum structured pruning example with 80% sparsity ratio (deepspeedai#348)

d623188

hwchen2017 pushed a commit that referenced this pull request Jun 8, 2025

Add snip_momentum structured pruning example with 80% sparsity ratio (#348)

b424148

xiaoxiawu-microsoft merged 1 commit into deepspeedai:master from ftian1:master
