Add snip_momentum structured pruning example with 80% sparsity ratio #348

Merged
xiaoxiawu-microsoft merged 1 commit into deepspeedai:master from ftian1:master on May 8, 2023

ftian1 commented Apr 19, 2023

This PR demonstrates the snip_momentum structured pruning algorithm implemented in the linked PR.

To reproduce the results below, run `source ./bash_script/pruning_sparse_snip_momentum.sh` with the PR mentioned above applied.

| Pattern | Sparsity ratio | Pruning method | Epochs | Acc / mm-acc |
| --- | --- | --- | --- | --- |
| 1x1 | 80% | DeepSpeed L1 | 2 | 0.8113/0.822 |
| 1x1 | 80% | snip_momentum | 2 | 0.8176/0.822 |
| 4x1 | 80% | snip_momentum | 10 | 0.8248/0.8305 |
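
For readers unfamiliar with the "pattern" column, here is a minimal pure-Python sketch of block-wise pruning at a target sparsity. It is an illustration under assumptions, not the code from the linked PR: the scoring rule (a momentum-smoothed SNIP-style saliency, `alpha * score + beta * |w * g|`), the default `alpha`/`beta` values, the reading of 4x1 as 4 rows by 1 column, and the helper names `update_scores` and `block_mask` are all hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def update_scores(scores: Matrix, weights: Matrix, grads: Matrix,
                  alpha: float = 0.9, beta: float = 1.0) -> Matrix:
    """Assumed saliency update: s <- alpha * s + beta * |w * g|, element-wise."""
    return [
        [alpha * s + beta * abs(w * g) for s, w, g in zip(srow, wrow, grow)]
        for srow, wrow, grow in zip(scores, weights, grads)
    ]


def block_mask(scores: Matrix, sparsity: float,
               block_rows: int, block_cols: int) -> List[List[int]]:
    """Return a 0/1 mask that zeroes the lowest-scoring blocks.

    block_rows x block_cols = 1x1 gives element-wise pruning;
    4x1 prunes groups of 4 vertically adjacent weights together.
    """
    n_rows, n_cols = len(scores), len(scores[0])
    assert n_rows % block_rows == 0 and n_cols % block_cols == 0

    # Aggregate element scores into a single score per block.
    blocks = []
    for r in range(0, n_rows, block_rows):
        for c in range(0, n_cols, block_cols):
            total = sum(scores[i][j]
                        for i in range(r, r + block_rows)
                        for j in range(c, c + block_cols))
            blocks.append((total, r, c))

    # Zero out the lowest-scoring fraction of blocks.
    n_prune = round(sparsity * len(blocks))
    blocks.sort(key=lambda b: b[0])
    mask = [[1] * n_cols for _ in range(n_rows)]
    for _, r, c in blocks[:n_prune]:
        for i in range(r, r + block_rows):
            for j in range(c, c + block_cols):
                mask[i][j] = 0
    return mask
```

Calling `block_mask(scores, 0.8, 4, 1)` on a 4x5 score matrix keeps only the highest-scoring column, while `block_mask(scores, 0.8, 1, 1)` ranks every weight individually, which is the difference between the 4x1 and 1x1 rows in the table.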

@xiaoxiawu-microsoft
Contributor

@ftian1, thanks for your patience. I have tested the method and it looks good. I will make some minor changes to the checkpoint saving.

yaozhewei added a commit that referenced this pull request May 9, 2023
* add mask for generation, otherwise the generation is broken (#468)

* Update model_utils.py (#471)

* Support huggyllama/llama-7b with DSPipeline (#484)

During token encoding for the huggyllama/llama-7b model, self.tokenizer.batch_encode_plus returns a token_type_ids kwarg that isn't recognized in the model's generate function.

This is due to AutoTokenizer returning LlamaTokenizerFast instead of LlamaTokenizer as seen in the model tokenizer config:
https://huggingface.co/huggyllama/llama-7b/blob/8416d3fefb0cb3ff5775a7b13c1692d10ff1aa16/tokenizer_config.json#L24

As a workaround, the DSPipeline will check to see if the tokenizer is LlamaTokenizerFast and only pass in input_tokens.input_ids to the self.model.generate(...) call if that is the case.

* Add snip_momentum structured pruning example with 80% sparsity ratio (#348)

---------

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Lev Kurilenko <113481193+lekurile@users.noreply.github.com>
Co-authored-by: Tian, Feng <feng.tian@intel.com>
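
The LlamaTokenizerFast workaround described in the #484 commit message above can be sketched as follows. This is a simplified illustration, not the DSPipeline source: `FakeLlamaTokenizerFast`, `FakeModel`, and `generate_outputs` are hypothetical stand-ins so the example runs without the transformers library, and a plain dict replaces the tokenizer output object that the commit message accesses as `input_tokens.input_ids`.

```python
class FakeLlamaTokenizerFast:
    """Hypothetical stand-in for transformers.LlamaTokenizerFast."""

    def batch_encode_plus(self, texts):
        # Mimics the behaviour described above: token_type_ids is returned too.
        ids = [[len(word) for word in text.split()] for text in texts]
        return {"input_ids": ids,
                "token_type_ids": [[0] * len(row) for row in ids]}


class FakeModel:
    """Hypothetical stand-in whose generate() rejects token_type_ids."""

    def generate(self, input_ids, **generate_kwargs):
        if "token_type_ids" in generate_kwargs:
            raise ValueError("unused model kwarg: token_type_ids")
        return input_ids  # echo the ids; a real model would generate tokens


def generate_outputs(model, tokenizer, texts, fast_llama_cls, **generate_kwargs):
    """Encode texts and call generate(), applying the tokenizer-type check."""
    input_tokens = tokenizer.batch_encode_plus(texts)
    if isinstance(tokenizer, fast_llama_cls):
        # Workaround: pass only input_ids so token_type_ids never reaches generate().
        return model.generate(input_tokens["input_ids"], **generate_kwargs)
    # Other tokenizers: forward everything the tokenizer returned.
    return model.generate(**input_tokens, **generate_kwargs)
```

In real code the `fast_llama_cls` argument would presumably be the `LlamaTokenizerFast` class imported from transformers; it is a parameter here only to keep the sketch dependency-free.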

ftian1 commented May 11, 2023

@xiaoxiawu-microsoft, I appreciate your effort.

yaozhewei added a commit that referenced this pull request May 12, 2023
* add mask for generation, otherwise the generation is broken

* .

* merge master (#504)

* add mask for generation, otherwise the generation is broken (#468)

* Update model_utils.py (#471)

* Support huggyllama/llama-7b with DSPipeline (#484)

During token encoding for the huggyllama/llama-7b model, self.tokenizer.batch_encode_plus returns a token_type_ids kwarg that isn't recognized in the model's generate function.

This is due to AutoTokenizer returning LlamaTokenizerFast instead of LlamaTokenizer as seen in the model tokenizer config:
https://huggingface.co/huggyllama/llama-7b/blob/8416d3fefb0cb3ff5775a7b13c1692d10ff1aa16/tokenizer_config.json#L24

As a workaround, the DSPipeline will check to see if the tokenizer is LlamaTokenizerFast and only pass in input_tokens.input_ids to the self.model.generate(...) call if that is the case.

* Add snip_momentum structured pruning example with 80% sparsity ratio (#348)

---------

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Lev Kurilenko <113481193+lekurile@users.noreply.github.com>
Co-authored-by: Tian, Feng <feng.tian@intel.com>

* .

* change eos

* .

* resolve tokenizer issue

* add script

* .

---------

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Lev Kurilenko <113481193+lekurile@users.noreply.github.com>
Co-authored-by: Tian, Feng <feng.tian@intel.com>
Syulin7 pushed a commit to Syulin7/DeepSpeedExamples that referenced this pull request May 15, 2023
leocnj pushed a commit to leocnj/DeepSpeedExamples that referenced this pull request May 27, 2023