IWSLT 2022 speech translation recipe #1475
Conversation
|
Hey @mzboito, did you finally solve your issues? I'm afraid I won't have time to look at all that for a while :/ |
|
Dear reviewers. When I try to run "pytest tests" locally, I get an error related to fairseq's progress_bar. Thanks for your time. |
|
Thanks @TParcollet, I just received the log from the failed tests. I fixed the errors (unused variables). Could you try to run it again? |
|
Hello again. Sorry about the mess! I fixed my formatting and now "pre-commit run --all-files" passes on my machine. |
TParcollet
left a comment
Hi @mzboito and thank you so much for this work! Once my three comments are fixed, we will be able to merge! Thanks again!
|
Hello Titouan, thanks for all your comments! I applied all the changes and trained a model from scratch to make sure nothing breaks with the new tokenizer integrated into train.py. |
Hi @mzboito known issue. There are scripts in the tests folder that shadow this. You can try to run locally: Just saw your message while I drafted this one - re-starting the tests, let's see. |
TParcollet
left a comment
One more change, and we're good to go! Thanks @mzboito
|
Hi again Titouan! Thanks for the feedback: it looks much better like this. :)
|
|
Nice! Lemme try it! |
|
Oh I'm sorry, the error is that I didn't update the recipe csv file!! |
anautsch
left a comment
it's just for the minor comments. lgtm otherwise !
( obligatory curiosity question: do you plan to upload the models ? )
|
GitHub indicates a conflict with the file |
|
Hello @anautsch ,
I hope it works fine now! For some reason github is still saying there's a conflict at recipes.csv |
Thanks @mzboito ! before, more lines were affected. It could be simply a tree/history versioning thingy (github still thinking of your version at branch time, and now this one being copied over - so a resolve merge cleared it up - thanks for preparing this, so it was easy!) |
|
@mzboito linters got to be kidding... edit: seriously, I can't see what it complains about - one empty line too much? |
…/speechbrain into iwslt_speech_translation
|
Sorry @anautsch, I'm on a new machine and I forgot to install black and flake8! |
|
it's about trailing whitespaces, the other linters passed. |
|
That's a bit odd, I don't know why the linter is mad about my comment, but I moved it somewhere else. Let's see! |
Indeed. Heading for lunch; fingers crossed ! |
|
Bon app @anautsch! Locally, when I run black with the hparams file as input, I get the following error: I don't understand why it is complaining about this line only now. Before, it was running fine. Moreover, I checked a different recipe, and the line is identical (e.g. https://github.com/speechbrain/speechbrain/blob/develop/templates/speaker_id/train.yaml). Not sure how to fix this. |
Thanks; let's see what we have here.
black should handle py files only; cf linters:
It has the same error when given to black.
gives When I added an empty line at the end that error disappeared (but it wasn't reported before either). Chasing down this lead: So, I tried out: ran a Idk - some cache issue? Please run twice: |
|
This is funny and frightening at the same time ... what is git up to... the only thing coming to mind is the "\r\n" vs "\n" end-of-line character thingy; or some other 'invisible' character that is not an end-of-line and thus raises the above error, as it's cast as "whitespace", which is then trailing. dunno. |
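To make the end-of-line guess above easier to check by hand, here is a minimal Python sketch that scans raw bytes for the usual suspects: trailing spaces or tabs, a carriage return before the newline (CRLF), and a missing newline at the end of the file. The function name `find_whitespace_issues` is made up for illustration; this is not how the pre-commit hooks are implemented, only a rough approximation of what they might flag.

```python
from typing import List, Tuple


def find_whitespace_issues(data: bytes) -> List[Tuple[int, str]]:
    """Return (line_number, issue) pairs for the raw contents of a file.

    Hypothetical helper, not part of pre-commit or SpeechBrain.
    """
    issues: List[Tuple[int, str]] = []
    lines = data.split(b"\n")

    # Data ending in "\n" produces an empty last element after the split.
    ends_with_newline = lines[-1] == b""
    if ends_with_newline:
        lines = lines[:-1]

    for number, line in enumerate(lines, start=1):
        # A "\r" left before the "\n" means the line used CRLF endings.
        if line.endswith(b"\r"):
            issues.append((number, "carriage return before newline (CRLF)"))
            line = line[:-1]
        # Spaces or tabs at the end of the visible content.
        if line != line.rstrip(b" \t"):
            issues.append((number, "trailing whitespace"))

    if data and not ends_with_newline:
        issues.append((len(lines), "no newline at end of file"))
    return issues
```

To try it on a real file, read the file in binary mode (for example with `pathlib.Path(path).read_bytes()`) so that Python does not silently normalise the line endings, then pass the bytes to the function.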
|
Hello @anautsch , sorry for the delay. pre-commit run trailing-whitespace --files recipes/IWSLT22_lowresource/hparams/train_w2v2_st.yaml |
anautsch
left a comment
one last bit - just went through a final time
(never ending story here... sorry for that)
(about the white space, at some point, I need to reconfigure my tools and/or clean my glasses... expected it but didn't catch it before)
This is a recipe for wav2vec 2.0 fine-tuning on the speech translation task. It includes data processing for the Tamasheq-French dataset, and the parameters from the best system in the low-resource task (which will be used as the baseline next year).