
[espnet3-14.3] Add integration test for ASR#6392

Open
Masao-Someki wants to merge 48 commits into espnet:master from Masao-Someki:espnet3/integration_test_3

Conversation

@Masao-Someki
Contributor

What did you change?

Added ASR integration test


Why did you make this change?

To test espnet3 pipeline


Is your PR small enough?

No: 28 files changed and 2,069 additions.

Except for the following edits, the reviews were completed in #6331.

  • Please also add comments to the config, like those in the template directory

Additional Context

@dosubot dosubot Bot added size:XXL This PR changes 1000+ lines, ignoring generated files. ASR Automatic speech recogntion CI Travis, Circle CI, etc ESPnet3 labels Mar 17, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request adds integration tests for ASR in espnet3, introducing a new recipe structure with template configurations and a more robust optimization and inference pipeline. The changes are extensive and well-structured, improving flexibility and maintainability. A key improvement is the refactoring of the optimizer and scheduler handling in the LightningModule, moving to a manual optimization loop for multi-optimizer setups which is cleaner and more aligned with PyTorch Lightning's practices. The inference pipeline is also enhanced with better artifact handling. However, I found a significant performance issue in the new CI script.

--training_config conf/training.yaml \
--inference_config "${inference_config}" \
--metrics_config conf/metrics.yaml
rm -rf exp data
Contributor


high

The rm -rf exp data command inside the run_with_training_config function, which is called in a loop, will cause the dataset to be downloaded and prepared repeatedly for each training configuration. This is highly inefficient and will significantly slow down the CI process. The create_dataset and train_tokenizer stages are idempotent and can be run just once before the loop. To fix this, you should remove data from this rm command to persist the prepared data across test runs within the same CI job. The exp directory should still be cleaned up to ensure a fresh state for each training run.

Suggested change
rm -rf exp data
rm -rf exp
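In Python terms, the suggested restructuring looks like this (a hypothetical sketch — the actual CI is a shell script, and the stage names here are illustrative): run the idempotent preparation stages once, then clean only `exp` inside the per-config loop.

```python
import shutil

def run_integration(configs, stages):
    # Idempotent preparation stages: run once, outside the loop.
    stages["create_dataset"]()
    stages["train_tokenizer"]()
    for cfg in configs:
        # Fresh experiment state per config, but keep the prepared data.
        shutil.rmtree("exp", ignore_errors=True)
        stages["train"](cfg)
```

This keeps the download/preparation cost constant regardless of how many training configurations the CI job exercises.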

@codecov

codecov Bot commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 93.73041% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.25%. Comparing base (48cb257) to head (8f24650).
⚠️ Report is 6 commits behind head on master.

Files with missing lines Patch % Lines
espnet3/systems/asr/transducer_task.py 91.86% 10 Missing ⚠️
espnet3/utils/run_utils.py 88.00% 6 Missing ⚠️
espnet3/components/data/dataset_module.py 94.44% 3 Missing ⚠️
espnet3/systems/asr/system.py 91.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6392      +/-   ##
==========================================
+ Coverage   70.16%   70.25%   +0.08%     
==========================================
  Files         787      791       +4     
  Lines       73367    73606     +239     
==========================================
+ Hits        51477    51710     +233     
- Misses      21890    21896       +6     
Flag Coverage Δ
test_integration_espnet2 46.78% <ø> (ø)
test_python_espnet2 61.02% <0.00%> (-0.20%) ⬇️
test_python_espnet3 18.08% <93.73%> (+0.63%) ⬆️

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@Fhrozen Fhrozen added this to the v.202604 milestone Mar 17, 2026
Contributor

@sw005320 sw005320 left a comment


There are many dataset-dependent names or paths in the config file.
Please avoid them as much as possible.
You should make them under the recipe directory and avoid specifying the dataset names.

I think the tokenizer part is well-designed.
Please follow this approach elsewhere.

Comment thread egs3/mini_an4/asr/conf/inference.yaml Outdated
Comment on lines +12 to +14
exp_tag: ${load_yaml:training.yaml,exp_tag}
inference_dir: ${exp_dir}/inference
dataset_dir: ${data_dir}/mini_an4
Contributor


This would be an issue, as we need to specify the dataset name mini_an4, and that could cause errors.
How about creating the dataset_dir under egs3/mini_an4 so that we don't need to specify it?

Is there another way to avoid it?

create_dataset:
func: src.creating_dataset.create_dataset
dataset_dir: ${dataset_dir}
archive_path: ${recipe_dir}/../../egs2/mini_an4/asr1/downloads.tar.gz
Contributor


Ditto
Instead of specifying the data in egs2, it is better to put it under the recipe directory.

Also, please avoid using egs2 files.
Make espnet3 recipes independent as much as possible.

task: espnet3.systems.asr.task.ASRTask

exp_tag: train_asr_rnn_debug
dataset_dir: ${data_dir}/mini_an4
Contributor


ditto

##########################################################
task: espnet3.systems.asr.task.ASRTask

exp_tag: train_asr_rnn_debug
Contributor


Why _debug?
Please remove unnecessary names and avoid confusion.

train:
- name: train_nodev
dataset:
_target_: src.dataset.MiniAN4Dataset
Contributor


ditto
This would bring some issues as we need to specify the dataset name

create_dataset:
func: src.creating_dataset.create_dataset
dataset_dir: ${dataset_dir}
archive_path: ${recipe_dir}/../../egs2/mini_an4/asr1/downloads.tar.gz
Contributor


ditto

train:
- name: train_nodev
dataset:
_target_: src.dataset.MiniAN4Dataset
Contributor


ditto

valid:
- name: train_dev
dataset:
_target_: src.dataset.MiniAN4Dataset
Contributor


ditto

dataset_dir = Path(dataset_dir)
logger = setup_logger(name="mini_an4.create_dataset")

archive = Path("../../../egs2/mini_an4/asr1/downloads.tar.gz")
Contributor


Avoid specifying the espnet2 directory.

Contributor


This seems to be dataset independent.
It would be better to move this to the recipe common place (e.g., template)

@sw005320
Contributor

Can you even remove mini_an4/asr from the config?
Since it will be under the mini_an4/asr recipe directory, we can extract this path information from the current directory information.
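A minimal sketch of that idea (a hypothetical helper, not part of the espnet3 API): since recipes follow the egs3/<dataset>/<task> convention, both names can be read off the recipe directory instead of being written into the config.

```python
from pathlib import Path

def infer_recipe_context(recipe_dir=None):
    """Derive dataset and task names from the egs3/<dataset>/<task> layout.

    Illustrative only; assumes run.py is invoked from the recipe directory
    when recipe_dir is not given.
    """
    recipe_dir = Path(recipe_dir) if recipe_dir is not None else Path.cwd()
    return {"dataset": recipe_dir.parent.name, "task": recipe_dir.name}
```

For egs3/mini_an4/asr this yields dataset `mini_an4` and task `asr`, so the config would no longer need to spell either out.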

@Masao-Someki
Contributor Author

The change in c78db55 removes the load_yaml resolver and stops config files from reaching into sibling configs to fetch exp_tag / exp_dir.

Instead, experiment identity is handled at runner setup time:

  • configs are loaded with resolve=False
  • when training_config is present, exp_tag and exp_dir are propagated into inference_config and metrics_config
  • when training_config is absent, standalone inference/metrics configs must define exp_tag or a concrete exp_dir, otherwise the runner raises an error

Motivation

load_yaml looked convenient for reusing values across config files, but it created a weaker config-loading path.
In practice, that caused a few problems:

  • it read raw YAML files, so configs composed from TEMPLATE/**/*.yaml or using defaults: were not visible to the load_yaml resolver
  • behavior became hard to track once configs started depending on other configs during interpolation

In the CI issue, the transducer failure was a concrete example of this:
inference_transducer.yaml read exp_dir from training.yaml, but load_yaml evaluated the raw training config without merging the default config, so resolution broke.
Rather than making cross-config reads more complex, this change removes that pattern and keeps experiment-identity handling in run.py.

What changed

  • removed load_yaml from espnet3/utils/config_utils.py
  • added runner-side experiment-context helpers in espnet3/utils/run_utils.py
  • updated egs3/TEMPLATE/asr/run.py to apply and validate experiment context before final resolve
  • removed load_yaml usage from mini_an4 inference / transducer inference / metrics configs
  • moved helper coverage into test/espnet3/utils/test_run_utils.py
  • updated config-related tests accordingly
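The propagation rule above can be sketched as follows (plain dicts stand in for OmegaConf objects; the function and key names are illustrative, not the actual run_utils API):

```python
def apply_experiment_context(training_cfg, inference_cfg, metrics_cfg):
    """Propagate experiment identity from training into downstream configs,
    or validate that standalone configs carry their own identity."""
    downstream = [c for c in (inference_cfg, metrics_cfg) if c is not None]
    if training_cfg is not None:
        # Training config is the single source of truth for exp_tag/exp_dir.
        for cfg in downstream:
            cfg["exp_tag"] = training_cfg.get("exp_tag")
            cfg["exp_dir"] = training_cfg.get("exp_dir")
        return
    # No training config: each standalone config must define its identity.
    for cfg in downstream:
        if not (cfg.get("exp_tag") or cfg.get("exp_dir")):
            raise ValueError(
                "standalone inference/metrics configs must define "
                "exp_tag or a concrete exp_dir"
            )
```

The key design point is that the check runs at runner setup time, before the final resolve, so a missing experiment identity fails fast instead of surfacing as a broken interpolation.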

@Masao-Someki
Contributor Author

@sw005320 If this looks okay to you, could you merge this PR? We can move to the next (3.14.5) bug fix PR!

@Fhrozen Fhrozen modified the milestones: v.202604, v.202607 Apr 7, 2026
@Masao-Someki
Contributor Author

I have fixed the config name. I will handle other config-related items in the following PRs!

Comment thread .github/workflows/ci_on_ubuntu.yml Outdated
python-version: ["3.10", "3.12"]
pytorch-version: [2.9.1, 2.10.0, 2.11.0]
use-conda: [false]
chainer-version: [6.0.0]
Contributor


Do you need Chainer?

Contributor Author


Thanks for the comment!
chainer-version was not actually used in espnet3's CI workflow, so I removed it in 2e58522.

dataset:
test:
- name: test
data_src_args:
Contributor


Do you need args?
data_src or data_sources would be good enough?

Also, it is better to add similar explanation comments to the training data

##########################################################
# DATASET DEFINITION
##########################################################
# Dataset splits:
#   train:
#     - split="train" from the dataset.dataset.Minian4Dataset
#   valid:
#     - split="valid" from the dataset.dataset.Minian4Dataset
# Notes:
#   - actual data sources are defined via ${recipe_dir}

Contributor Author


Thank you, and yes, I think it is good to have data_src_args.
This would be passed directly to the dataset class. For example, suppose we have a dataset class like:

class MyDataset(Dataset):
    def __init__(self, split="train"):
        self.data = self._build_data(split)
    ...

Then we can feed data_src_args.split to the __init__().
We could remove data_src_args, but then we would need additional code to pop data_src out of the config and pass the remaining keys to __init__().
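As a sketch of why the extra key helps (toy class and function names, not the espnet3 internals): with data_src_args present, the whole mapping can be splatted into the dataset constructor unchanged.

```python
class MiniDataset:
    """Toy dataset whose constructor takes keyword arguments."""
    def __init__(self, split="train"):
        self.split = split

def build_dataset(entry, registry):
    # data_src selects the class; data_src_args is forwarded as **kwargs.
    cls = registry[entry["data_src"]]
    return cls(**entry.get("data_src_args", {}))
```

Without the nested key, build_dataset would first have to pop data_src out of a flat entry before splatting the remaining keys.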

@sw005320
Contributor

Please check my other comments

@Masao-Someki
Contributor Author

Thank you!

  • Chainer is removed for espnet3's CI in 2e58522
  • I think data_src_args should stay as-is.
  • I added dataset explanation comments to the training config in 1bbaeb9 and c8a80bf!

@dosubot dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Apr 14, 2026
# - uses the "test" split defined in the dataset.dataset.Minian4Dataset
dataset:
test:
- name: test
Contributor


I believe this should also include valid

Contributor


@Masao-Someki, please check this comment.
We usually do not have only the test data.
If it is intentional, to reduce the computation cost, that is valid (though I do not think it would add much computation time).

##########################################################
# PATHS AND INPUTS #
##########################################################
inference_dir: ${exp_dir}/inference
Contributor


Where do we specify the data?
Is it included in this directory specification?
I think it should be commented.

Contributor


We can skip RNN related integration test.
This is not used anymore.

Contributor


ditto
We can skip RNN related integration test.
This is not used anymore.

Contributor


In this case, the data_aug part can be moved to the other config

Comment thread egs3/TEMPLATE/asr/conf/inference.yaml Outdated
Comment on lines +36 to +45
# - name: librispeech_test_clean
# data_src: librispeech/asr
# data_src_args:
# split: test_clean
# data_path: ${dataset_dir}
# - name: librispeech_test_other
# data_src: librispeech/asr
# data_src_args:
# split: test_other
# data_path: ${dataset_dir}
Contributor


The dev sets should be included

Contributor


@Masao-Someki, please check this comment.
We usually do not have only the test data.
If it is intentional, to reduce the computation cost, that is valid (though I do not think it would add much computation time).

Comment on lines +10 to +11
- data_src: mini_an4/asr
data_src_args:
Contributor


These lines look redundant.
Do we really need data_src_args:?


Labels

ASR Automatic speech recogntion CI Travis, Circle CI, etc ESPnet3 size:XL This PR changes 500-999 lines, ignoring generated files.
