
Generic adapters implementation #2563

Merged: TParcollet merged 56 commits into speechbrain:develop from pplantinga:adapters on Sep 10, 2024

Conversation

pplantinga commented Jun 5, 2024

Closes #2526, #2534

Here is a proposal for how we can add adapters (including LoRA) to the toolkit. This branch is based on #2534, and it also implements flexible layer selection and small checkpoints (a sketch of the idea follows below).
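
For context, "small checkpoints" here means saving only the adapter parameters rather than the whole backbone, which is restored from the original pretrained weights. A minimal sketch of the idea (the helper names and the requires_grad-based filtering are illustrative assumptions, not this PR's actual implementation):

import torch

def save_adapter_checkpoint(model, path):
    # Persist only the trainable (adapter) parameters; the frozen
    # backbone is recovered from the pretrained model at load time.
    trainable = {n for n, p in model.named_parameters() if p.requires_grad}
    torch.save({n: t for n, t in model.state_dict().items() if n in trainable}, path)

def load_adapter_checkpoint(model, path):
    # strict=False because the checkpoint intentionally covers only
    # a subset of the model's parameters.
    model.load_state_dict(torch.load(path), strict=False)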

There are a few more things that would be nice to have, but I personally don't think they're necessary before merging.

  • a merge_and_unload()-style function for LoRA-type layers that folds the adapter weights back into the original model (see the sketch below)
  • the capability to use adapters from the peft library -- they have an extensive collection that will likely be updated regularly
  • more adapter types

If anyone thinks these are urgent we can work on adding them to this PR.

UPDATE (see below): This works with PEFT now.
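
For the merge_and_unload() item above: with LoRA the merge reduces to folding the low-rank update back into the base weight, W' = W + (alpha / rank) * B A. A minimal sketch for a single linear layer (the function name and the alpha/rank scaling convention are assumptions for illustration, not this PR's API):

import torch

@torch.no_grad()
def merge_lora_into_linear(linear, lora_A, lora_B, alpha, rank):
    # lora_A: (rank, in_features); lora_B: (out_features, rank).
    # Folding the update into the base weight removes the extra
    # adapter matmuls at inference time.
    linear.weight += (alpha / rank) * (lora_B @ lora_A)
    return linear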

TParcollet commented Jun 6, 2024

@pplantinga are the checkpointing features also working with this easy PEFT adaptation? We should make sure it works with the Pretrainer as well, not just checkpointing, I believe.

mravanelli added the enhancement label on Jun 17, 2024
TParcollet commented Jul 8, 2024

@Adel-Moumen @mravanelli I think we will want this in v1.0.1. And it looks ready to me?

@TParcollet

@poonehmousavi could you review and test the code as mentioned? It looks ready to me. Thanks!

@poonehmousavi

> @poonehmousavi could you review and test the code as mentioned? It looks ready to me. Thanks!

Sure. I will do it by tomorrow.

@poonehmousavi

@pplantinga have you tested it with the Pretrainer used by the inference interfaces? Also, have you checked how it works with quantization (like QLoRA)?

pplantinga commented Jul 13, 2024

> @pplantinga have you tested it with the Pretrainer used by the inference interfaces?

I tested this and it worked, but it emitted warnings because only the trained params are loaded. I have fixed this now.

The YAML I used is here:

whisper_hub: openai/whisper-small.en
lora_rank: 16
language: "english"
sample_rate: 16000

min_decode_ratio: 0.0
max_decode_ratio: 1.0
test_beam_size: 8

whisper_pretrained: !new:speechbrain.lobes.models.huggingface_transformers.whisper.Whisper
    source: !ref <whisper_hub>
    save_path: .
    language: !ref <language>
    task: "transcribe"
    sampling_rate: !ref <sample_rate>

# Wrap the pretrained model; all_linear: True attaches a LoRA adapter
# to every linear layer.
whisper: !new:speechbrain.nnet.adapters.AdaptedModel
    model_to_adapt: !ref <whisper_pretrained>
    adapter_class: !name:speechbrain.nnet.adapters.LoRA
    all_linear: True
    adapter_kwargs:
        rank: !ref <lora_rank>

test_search: !new:speechbrain.decoders.seq2seq.S2SWhisperBeamSearcher
    module: [!ref <whisper>]
    min_decode_ratio: !ref <min_decode_ratio>
    max_decode_ratio: !ref <max_decode_ratio>
    beam_size: !ref <test_beam_size>

modules:
    whisper: !ref <whisper>
    decoder: !ref <test_search>

# The Pretrainer restores the (adapted) model weights before inference.
pretrainer: !new:speechbrain.utils.parameter_transfer.Pretrainer
    loadables:
        whisper: !ref <whisper>

And the Python code:

import speechbrain as sb

model = sb.inference.ASR.WhisperASR.from_hparams(".", "lora_pre.yaml", savedir="results/whisper/1987/save/CKPT+2024-06-05+18-30-33+00")
model.transcribe_file("speechbrain/asr-streaming-conformer-librispeech/test-en.wav")

> Also, have you checked how it works with quantization (like QLoRA)?

I am not very familiar with QLoRA; it seems there's additional setup needed to get this to work.
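
For context, the extra setup is mostly on the loading side: in the usual QLoRA recipe the base model is quantized to 4-bit and frozen, and the LoRA adapters are trained in higher precision on top. A sketch of that recipe using HuggingFace transformers and peft directly (not wired into SpeechBrain's AdaptedModel here; the target module names are illustrative):

import torch
from transformers import BitsAndBytesConfig, WhisperForConditionalGeneration
from peft import LoraConfig, get_peft_model

# Load the frozen backbone in 4-bit (NF4) with fp16 compute.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = WhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-small.en",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach trainable LoRA adapters on top of the quantized weights.
lora_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_config)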

@pplantinga

One-epoch (100h) results for Whisper Small.en (published results are test-clean=3.05 and test-other=7.53):

speechbrain.utils.train_logger - Epoch loaded: 1 - test loss: 9.73e-01, test CER: 1.03, test WER: 2.81
speechbrain.utils.train_logger - Epoch loaded: 1 - test loss: 9.86e-01, test CER: 1.08, test WER: 2.90
speechbrain.utils.train_logger - Epoch loaded: 1 - test loss: 1.22, test CER: 3.00, test WER: 6.57


TParcollet commented Sep 4, 2024

@pplantinga should we merge this? Maybe with a small tutorial somewhere as well?

@pplantinga

There are more features that could be added, but I think this is ready to merge as-is; the rest can be added later.

pplantinga added this to the v1.1.0 milestone on Sep 10, 2024
TParcollet left a comment

LGTM now :)


Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

Adapters + LLama -- re-design.

4 participants