Skip to content

Move files with optional dependencies to integrations folder#2782

Merged
Adel-Moumen merged 65 commits intospeechbrain:developfrom
pplantinga:integrations
May 27, 2025
Merged

Move files with optional dependencies to integrations folder#2782
Adel-Moumen merged 65 commits intospeechbrain:developfrom
pplantinga:integrations

Conversation

@pplantinga
Copy link
Copy Markdown
Collaborator

@pplantinga pplantinga commented Dec 10, 2024

In internal discussions, we decided to move as much of our external-dependent code as possible into a single folder for a couple of reasons:

  • so we could add disclaimers that this stuff could stop working at any time
  • additional testing requirements for code placed in this part of the repo, so we know if it stops working
  • testing is only run before each release, alongside recipe tests, rather than for every CI build
  • A readme and requirements file for using each integration

The basic concept is that we will add a python module speechbrain.integrations that contains any speechbrain blocks that depend on external libraries. So far we have:

  • speechbrain.lobes.models.huggingface_transformers => speechbrain.integrations.huggingface
  • speechbrain.wordemb => speechbrain.integrations.huggingface.wordemb
  • speechbrain.k2_integration => speechbrain.integrations.k2_fsa
  • speechbrain.lobes.models.discrete.speechtokenizer => speechbrain.integrations.audio_tokenizers.speechtokenizer_interface
  • speechbrain.lobes.models.discrete.wavtokenizer => speechbrain.integrations.audio_tokenizers.wavtokenizer_interface
  • speechbrain.lobes.models.flair => speechbrain.integrations.nlp
  • speechbrain.lobes.models.spacy => speechbrain.integrations.nlp
  • speechbrain.decoders.language_model => speechbrain.integrations.decoders.kenlm_scorer
  • speechbrain.alignment.ctc_segmentation => speechbrain.integrations.alignment.ctc_segmentation
  • speechbrain.lobes.models.kmeans => speechbrain.integrations.audio_tokenizers.kmeans
  • speechbrain.utils.bleu => speechbrain.integrations.nlp.bleu
  • speechbrain.processing.diarization => speechbrain.integrations.alignment.diarization
  • speechbrain.nnet.losses.transducer => speechbrain.integrations.transducer

Deprecations:

  • speechbrain.lobes.models.fairseq_wav2vec
  • speechbrain.utils.kmeans

After this approach is verified, I will:

  • Find and replace all deprecated references to the corresponding new paths
  • Update path for any integrations that I've missed, e.g. kmeans, audio tokenizers (now merged)
  • Add documentation to explain how to add to this folder, testing requirements, etc.

We need to ensure all integrated third-party folders have at least one doctest. Still needed:

  • spacyNLP
  • flairNLP

@poonehmousavi
Copy link
Copy Markdown
Collaborator

Thanks @pplantinga … this change would be really helpful in adding new tokenizers. Following are also needed to be transferred to integration folder:

@pplantinga
Copy link
Copy Markdown
Collaborator Author

Thanks @pplantinga … this change would be really helpful in adding new tokenizers. Following are also needed to be transferred to integration folder:

I've added discrete to the list above, but I don't see any external imports in beats.py, does this code depend on an external library? IMO if the code is now internal (even if the original code was not) we don't need to put it in integrations

@pplantinga pplantinga self-assigned this Dec 10, 2024
@pplantinga pplantinga added the correctness Functionality not objectively broken, but may be surprising or wrong e.g. regarding literature label Dec 10, 2024
@pplantinga
Copy link
Copy Markdown
Collaborator Author

I've added discrete to the list above, but I don't see any external imports in beats.py, does this code depend on an external library? IMO if the code is now internal (even if the original code was not) we don't need to put it in integrations

In addition, are we keeping a folder for discrete in lobes.models? This makes the redirect a bit more complex.

@poonehmousavi
Copy link
Copy Markdown
Collaborator

Thanks @pplantinga … this change would be really helpful in adding new tokenizers. Following are also needed to be transferred to integration folder:

I've added discrete to the list above, but I don't see any external imports in beats.py, does this code depend on an external library? IMO if the code is now internal (even if the original code was not) we don't need to put it in integrations

Actually, I transferred all their code to SB so it not dependant on external library, we just initialized the weights with their checkpoints

Copy link
Copy Markdown
Collaborator

@Adel-Moumen Adel-Moumen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @pplantinga, the changes looks good to me. I think you also need to move Flair and spacy folder since they also depend on an external library. I would also advocate for having a README.md in each subfolder within integrations to explain how to install and make the dep and speechbrain working together (e.g. specifying the compatible versions). Finally, I would say that we should also discuss about the folders lobes and the naming conventions. I tend to think that we could increase the readability by having better names and maybe even subfolders? Idk. :)

@pplantinga
Copy link
Copy Markdown
Collaborator Author

Hi @pplantinga, the changes looks good to me. I think you also need to move Flair and spacy folder since they also depend on an external library.

Added to list above

I would also advocate for having a README.md in each subfolder within integrations to explain how to install and make the dep and speechbrain working together (e.g. specifying the compatible versions).

This could make good sense, we could have the requirements file that pinned a specific working version perhaps, and an author or person to contact if it stops working.

Finally, I would say that we should also discuss about the folders lobes and the naming conventions. I tend to think that we could increase the readability by having better names and maybe even subfolders? Idk. :)

What naming conventions would you change? The "lobes" name itself?

@pplantinga
Copy link
Copy Markdown
Collaborator Author

One more question for the crew, up to this point, we've been ignoring the CI tests for the integrations because it adds to the CI time and increases the likelihood that the CI breaks because we have to install a bunch of extra packages (and I'm wondering if some of them might not work on the CI server anyway). Do we want to add these tests back in, or is this something we want to continue skipping and just say that integrations are at your own risk. @TParcollet etc.

@pplantinga
Copy link
Copy Markdown
Collaborator Author

There is a long-standing issue with fairseq in newer Python versions (2.11+) that doesn't seem likely to get addressed anytime soon. Should we just go ahead and deprecate our integration?

facebookresearch/fairseq#5012

@pplantinga pplantinga changed the title Move k2 and hf transformers to integrations folder Move files with optional dependencies to integrations folder Dec 12, 2024
@TParcollet
Copy link
Copy Markdown
Collaborator

TParcollet commented Jan 15, 2025

"speechbrain.nnet.losses.transducer => speechbrain.integrations.transducer ??? Should we deprecate in favor of e.g. torchaudio?" --> No, torchaudio loss is crazily slower than our Numba impl. For the tests, i'd be in favor of having them, but it sounds like a nightmare to maintain. At the same time, we provide this folder as a main part of the library... so our users may expect it to work properly. Could we have a specific CI for this folder? Like one that is only triggered if the code in it is touched?

Coverage that we should improve (I think anything bellow 60% is not really acceptable...) and since we are changing everything now, maybe good to put this in that PR?:
speechbrain/integrations/lm/ken.py
speechbrain/integrations/processing/diarization.py
speechbrain/integrations/k2_fsa/utils.py
speechbrain/integrations/k2_fsa/lattice_decoder.py
speechbrain/integrations/huggingface/wordemb/transformer.py
speechbrain/integrations/huggingface/llama2.py
speechbrain/integrations/k2_fsa/graph_compiler.py

@TParcollet
Copy link
Copy Markdown
Collaborator

Also that is a big PR, reviewing / testing will be slow.

@pplantinga
Copy link
Copy Markdown
Collaborator Author

Also that is a big PR, reviewing / testing will be slow.

You are right, its big but perhaps not as big as it looks since most of the changes are to module path names. For reviewing I would recommend just taking a look around the new integrations folder and seeing if the organization / readmes make sense. Some testing should be done too but again the functional changes are mostly just to module path names which should be unlikely to break anything. Biggest risk I guess is that I somehow put the wrong path for something, which recipe tests should catch.

@pplantinga pplantinga added this to the v1.1.0 milestone Feb 14, 2025
Comment thread speechbrain/__init__.py
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than discrete, maybe we could have been more specific i.e. audio_tokenizers or something like that.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think keeping the full name is better no? speech_tokenizer.py or speechtokenizer.py

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't keep the same name as the imported library as this leads to some sort of import error. But perhaps there's a better option than speechtok.py like speechtokenizer_interface.py or something.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I understand now. I know that ESPNet is using the convention name tokenizer. E.g. from espnet.espnet2.speechlm.tokenizer import X The only issue with this is that tokenizer is a broad term and can be applied to anything. I am honestly not against having a folder that can have audio/speech tokenizers, as well as text (if one day this happen) or vision tokenizers.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, shouldn't sentencepiece be moved in the .integrations. folder ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well currently sentencepiece is a non-optional dependency, but integrations is only for optional dependencies. I think we use it enough throughout the toolkit that perhaps its worth keeping it that way.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand!

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for this one wavtokenizer or wav_tokenizer.py

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wavtokenizer_interface.py ?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about speechbrain.integrations.tokenizer.wavtokenizer ?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this file go to discrete folder instead? The model doesn't seems to use the HFTransformersInterface abstract class that we have in huggingface.py. It seems to be more an orchestrator of different pre-trained blocks.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good idea!

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dac or DAC ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the Python standard for file names is all lower case -- due to different operating systems treating upper/lower case differently.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep. I guess everyone in the community of codecs knows what dac refers to. We can keep it lowercase as you suggestd :)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason why we are dropping FairSeq support? I tend to agree but maybe we should be more specific about the reason why?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FB has stopped active development of FairSeq (last released version was 2022) and there is some dependency issue with a new release of some package though I can't remember which one (cuda? pytorch?)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FB stopping FairSeq, now Torchaudio :(

Comment thread speechbrain/lobes/models/kmeans.py
@pplantinga
Copy link
Copy Markdown
Collaborator Author

Unfortunately the MERT huggingface model is not compatible with the latest version of transformers (see huggingface/transformers#36134). I'll set up the test with an earlier version but we may have to deprecate at some point.

@pplantinga
Copy link
Copy Markdown
Collaborator Author

Okay, this should be pretty much ready to go @TParcollet @Adel-Moumen

Any last comments?

@pplantinga pplantinga mentioned this pull request May 23, 2025
13 tasks
Copy link
Copy Markdown
Collaborator

@Adel-Moumen Adel-Moumen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@Adel-Moumen Adel-Moumen merged commit 068a41b into speechbrain:develop May 27, 2025
5 checks passed
pplantinga added a commit to pplantinga/speechbrain that referenced this pull request Jun 2, 2025
…rain#2782)

Co-authored-by: Adel Moumen <adelmoumen.pro@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

correctness Functionality not objectively broken, but may be surprising or wrong e.g. regarding literature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants