Conversation
The error is related to the returned warning. Should I skip the doctest for MERT?
Example
-------
>>> audio = torch.randn(4, 10000) # Batch of 4 audio signals
audio = torch.randn(4, 10000) # Batch of 4 audio signals
length = torch.tensor([1.0, 0.5, 0.75, 1.0])
model = BEATs("BEATs_iter1_finetuned_on_AS2M_cpt2.pt")
outputs = model.extract_features(audio, length)
print(outputs.shape)
>>> AttributeError: 'tuple' object has no attribute 'shape'
When loading a pretrained model, I guess you do not need the predictor but just the embedding model (see line 2005).
This happens when the fine-tuned version is loaded, because it then also returns the prob_log of the prediction. I changed the code so that it works with both fine-tuned and self-supervised checkpoints.
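For readers following along, here is a minimal sketch of one way a caller could guard against the tuple return shown in the traceback above. It is not the change made in this PR. The helper name `unwrap_features`, the stand-in class `FakeTensor`, and the assumption that the feature tensor is the first element of the returned tuple are all hypothetical and used only for illustration.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor; only carries a ``shape`` attribute."""

    def __init__(self, shape):
        self.shape = shape


def unwrap_features(outputs):
    """Return the feature tensor regardless of the checkpoint type.

    Assumption (not confirmed by the PR): a fine-tuned checkpoint returns a
    tuple whose first element is the feature tensor, while a self-supervised
    checkpoint returns the tensor directly.
    """
    if isinstance(outputs, (tuple, list)):
        return outputs[0]
    return outputs


# Simulated outputs for the two checkpoint types.
features = FakeTensor((4, 24, 768))
prediction_scores = FakeTensor((4, 527))

finetuned_out = (features, prediction_scores)  # tuple: would break ``.shape``
ssl_out = features                             # plain tensor

print(unwrap_features(finetuned_out).shape)
print(unwrap_features(ssl_out).shape)
```

With a guard like this, `outputs.shape` works for either kind of checkpoint, at the cost of silently discarding the extra prediction output.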
@@ -0,0 +1,2059 @@
"""This lobe enables the integration of pretrained BBEATs: Audio Pre-Training with Acoustic Tokenizers.
Fix typo: "BBEATs" -> "BEATs"
where a projection of the CNN output is added to the beginning.
If False, the forward function outputs the hidden states only from the last transformer layer.

Example
Replace with this, it should fix the tests:
Example
-------
>>> import torch
>>> inputs = torch.rand([10, 600])
>>> model_hub = "m-a-p/MERT-v1-95M"
>>> save_path = "savedir"
>>> model = MERT(model_hub, save_path)
WARNING: feature_extractor_cqt requires the libray 'nnAudio'
>>> outputs = model(inputs)
>>> outputs.shape
torch.Size([10, 1, 768])
The warning is resolved, but since the warning message contains "libray" instead of "library", we now get a pre-commit issue.
Replace:
>>> model = MERT(model_hub, save_path)
WARNING: feature_extractor_cqt requires the libray 'nnAudio'
with
>>> model = MERT(model_hub, save_path) # doctest: +ELLIPSIS
WARNING: ...
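As a self-contained illustration of why the `+ELLIPSIS` suggestion helps, the sketch below runs a doctest whose expected output is only `WARNING: ...`, so the exact wording of the message no longer has to be reproduced (or spell-checked) in the docstring. The function `load_model`, its message text, and the helper `run_doctests` are hypothetical stand-ins, not the MERT code. Note that doctest only compares what is written to stdout, so the stand-in uses `print`.

```python
import doctest


def load_model():
    """Hypothetical loader that prints a warning with unstable wording.

    >>> model = load_model()  # doctest: +ELLIPSIS
    WARNING: ...
    >>> model
    'model'
    """
    # Stand-in message; the real warning text may differ or change over time.
    print("WARNING: optional dependency 'nnAudio' is not installed")
    return "model"


def run_doctests(func):
    """Run the doctests found in ``func`` and return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # module=False skips module lookup; globs makes ``func`` visible by name.
    tests = finder.find(
        func, name=func.__name__, module=False, globs={func.__name__: func}
    )
    for test in tests:
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_doctests(load_model)
print(results.failed, results.attempted)
```

The trade-off is that the test no longer checks the content of the warning at all, only that a line starting with `WARNING: ` is printed.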
self, attn_weights, tgt_len: int, src_len: int, bsz: int
):
"""
"""5
Thank you, @poonehmousavi, for this contribution! I reviewed the PR, and everything seems to work properly. I only have the following comments:
Expected Format: There are a few instances like this that need to be corrected.
Regarding moving BEATs to the integration folder: the current version of BEATs is implemented entirely in SB, without the need for any external library, so maybe there is no need to move it to the integration folder.
After the latest changes, I'm fine with it.
What does this PR do?
Adds SSL models for the music and audio domains: