Ssofja
diff --git a/examples/asr/speech_to_label.py b/examples/asr/speech_to_label.py
Lines changed: 4 additions & 4 deletions
diff --git a/examples/asr/speech_to_text_bpe.py b/examples/asr/speech_to_text_bpe.py
Lines changed: 2 additions & 2 deletions
diff --git a/examples/asr/speech_to_text_rnnt.py b/examples/asr/speech_to_text_rnnt.py
Lines changed: 2 additions & 2 deletions
diff --git a/examples/asr/vad_infer.py b/examples/asr/vad_infer.py
Lines changed: 1 addition & 1 deletion
diff --git a/examples/speaker_recognition/speaker_reco.py b/examples/speaker_recognition/speaker_reco.py
Lines changed: 1 addition & 1 deletion
diff --git a/scripts/rttm_to_manifest.py b/scripts/ speaker_recognition/rttm_to_manifest.py
rename from scripts/rttm_to_manifest.py
rename to scripts/ speaker_recognition/rttm_to_manifest.py
diff --git a/scripts/scp_to_manifest.py b/scripts/ speaker_recognition/scp_to_manifest.py
rename from scripts/scp_to_manifest.py
rename to scripts/ speaker_recognition/scp_to_manifest.py
diff --git a/scripts/process_asr_text_tokenizer.py b/scripts/ tokenizers/process_asr_text_tokenizer.py
rename from scripts/process_asr_text_tokenizer.py
rename to scripts/ tokenizers/process_asr_text_tokenizer.py
diff --git a/scripts/create_tarred_transformer_lm_dataset.py b/scripts/asr_language_modelling/create_tarred_transformer_lm_dataset.py
rename from scripts/create_tarred_transformer_lm_dataset.py
rename to scripts/asr_language_modelling/create_tarred_transformer_lm_dataset.py
diff --git a/scripts/install_ctc_decoders.sh b/scripts/asr_language_modelling/install_ctc_decoders.sh
rename from scripts/install_ctc_decoders.sh
rename to scripts/asr_language_modelling/install_ctc_decoders.sh
@@ -16,10 +16,10 @@
 # Task 1: Speech Command
 
 ## Preparing the dataset
-Use the `process_speech_commands_data.py` script under <NEMO_ROOT>/scripts in order to prepare the dataset.
+Use the `process_speech_commands_data.py` script under <NEMO_ROOT>/scripts/dataset_processing in order to prepare the dataset.
 
 ```sh
-python <NEMO_ROOT>/scripts/process_speech_commands_data.py \
+python <NEMO_ROOT>/scripts/dataset_processing/process_speech_commands_data.py \
     --data_root=<absolute path to where the data should be stored> \
     --data_version=<either 1 or 2, indicating version of the dataset> \
     --class_split=<either "all" or "sub", indicates whether all 30/35 classes should be used, or the 10+2 split should be used> \
@@ -47,7 +47,7 @@
 # Task 2: Voice Activity Detection
 
 ## Preparing the dataset
-Use the `process_vad_data.py` script under <NEMO_ROOT>/scripts in order to prepare the dataset.
+Use the `process_vad_data.py` script under <NEMO_ROOT>/scripts/dataset_processing in order to prepare the dataset.
 
 ```sh
 python process_vad_data.py \
@@ -82,7 +82,7 @@
    Note that tarred datasets may affect validation scores, because some files are dropped so that every tarfile holds the same number of files;
    scores might be off since some data is missing.
 
-   Use the `convert_to_tarred_audio_dataset.py` script under <NEMO_ROOT>/scripts in order to prepare tarred audio dataset.
+   Use the `convert_to_tarred_audio_dataset.py` script under <NEMO_ROOT>/scripts/speech_recognition in order to prepare tarred audio dataset.
    For details, please see TarredAudioToClassificationLabelDataset in <NEMO_ROOT>/nemo/collections/asr/data/audio_to_label.py
 
 python speech_to_label.py \
 
@@ -14,10 +14,10 @@
 
 """
 # Preparing the Tokenizer for the dataset
-Use the `process_asr_text_tokenizer.py` script under <NEMO_ROOT>/scripts in order to prepare the tokenizer.
+Use the `process_asr_text_tokenizer.py` script under <NEMO_ROOT>/scripts/tokenizers/ in order to prepare the tokenizer.
 
 ```sh
-python <NEMO_ROOT>/scripts/process_asr_text_tokenizer.py \
+python <NEMO_ROOT>/scripts/tokenizers/process_asr_text_tokenizer.py \
         --manifest=<path to train manifest files, separated by commas>
         OR
         --data_file=<path to text data, separated by commas> \
 
@@ -21,10 +21,10 @@
 
 """
 # Preparing the Tokenizer for the dataset
-Use the `process_asr_text_tokenizer.py` script under <NEMO_ROOT>/scripts in order to prepare the tokenizer.
+Use the `process_asr_text_tokenizer.py` script under <NEMO_ROOT>/scripts/tokenizers/ in order to prepare the tokenizer.
 
 ```sh
-python <NEMO_ROOT>/scripts/process_asr_text_tokenizer.py \
+python <NEMO_ROOT>/scripts/tokenizers/process_asr_text_tokenizer.py \
         --manifest=<path to train manifest files, separated by commas> \
         --data_root="<output directory>" \
         --vocab_size=<number of tokens in vocabulary> \
 
@@ -17,7 +17,7 @@
     1) shift the window of length time_length (e.g. 0.63s) by shift_length (e.g. 10ms) to generate the frame and use the prediction of the window to represent the label for the frame;
        [this script demonstrates this approach]
     2) generate predictions with overlapping input segments. Then a smoothing filter is applied to decide the label for a frame spanned by multiple segments. 
-       [get frame level prediction by this script and use vad_overlap_posterior.py in NeMo/scripts/
+       [get frame level prediction by this script and use vad_overlap_posterior.py in NeMo/scripts/voice_activity_detection
        One can also find posterior about converting frame level prediction 
        to speech/no-speech segment in start and end times format in that script.]
    
 
@@ -43,7 +43,7 @@
    Note that tarred datasets may affect validation scores, because some files are dropped so that every tarfile holds the same number of files;
    scores might be off since some data is missing.
    
-   Use the `convert_to_tarred_audio_dataset.py` script under <NEMO_ROOT>/scripts in order to prepare tarred audio dataset.
+   Use the `convert_to_tarred_audio_dataset.py` script under <NEMO_ROOT>/scripts/speech_recognition in order to prepare tarred audio dataset.
    For details, please see TarredAudioToClassificationLabelDataset in <NEMO_ROOT>/nemo/collections/asr/data/audio_to_label.py
 """