
Audio: MFCC: Use the MFCC module as compress PCM encoder with discontinuous stream #10814

Open
singalsu wants to merge 4 commits into thesofproject:main from singalsu:mfcc_compress_encoder

Conversation

@singalsu
Collaborator

@singalsu singalsu commented May 26, 2026

This PR adds the following commits on top of the earlier VAD PR #10782:

  • audio: mfcc: switch to source/sink API, int32 output, and DTX
  • base_fw: advertise BESPOKE codec for MFCC compress capture
  • audio: mfcc: update decode tools and add Python compress scripts
  • tools: topology: add MFCC compress capture for jack and DMIC

Running this requires a kernel PR that fixes ALSA controls for the encoder type.

@singalsu singalsu changed the title from "q" to "Audio: MFCC: Use the MFCC module as compress PCM encoder with discontinuous stream" May 26, 2026
@singalsu
Collaborator Author

Note: Running the MFCC compress topologies requires kernel patches thesofproject/linux#5647 and thesofproject/linux#5789.

Comment thread src/audio/mfcc/mfcc.c Outdated
Comment thread src/audio/mfcc/mfcc.c
Comment thread src/audio/mfcc/mfcc_generic.c Outdated
@singalsu singalsu force-pushed the mfcc_compress_encoder branch from d5267b3 to 969d644 Compare May 27, 2026 08:16
@singalsu singalsu changed the title from "Audio: MFCC: Use the MFCC module as compress PCM encoder with discontinuous stream" to "[DNM] Audio: MFCC: Use the MFCC module as compress PCM encoder with discontinuous stream" May 27, 2026
@singalsu singalsu marked this pull request as ready for review May 27, 2026 09:54
Copilot AI review requested due to automatic review settings May 27, 2026 09:54
Contributor

Copilot AI left a comment


Pull request overview

This PR extends the SOF MFCC component and related tooling/topology to support VAD + DTX behavior and to use MFCC as a compress PCM “encoder” that can emit discontinuous (DTX-suppressed) feature frames, including optional IPC4 control notifications for VAD state.

Changes:

  • Add MFCC VAD/DTX support in firmware (new VAD implementation, frame header with VAD/energy fields, optional IPC4 notifications, and compress-output mode).
  • Add/adjust topology2 definitions to expose MFCC feature capture for both normal PCM and compress PCM on SDW jack/DMIC, including new build targets.
  • Update MFCC tuning/export and host-side decode/visualization/transcription tools (Matlab/Octave + Python scripts), plus new documentation.

Reviewed changes

Copilot reviewed 40 out of 40 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tools/topology/topology2/platform/intel/sdw-jack-audio-feature.conf | Adds MFCC frame sizing define and VAD mixer control naming for jack feature capture. |
| tools/topology/topology2/platform/intel/sdw-jack-audio-feature-compress.conf | New compress PCM MFCC feature-capture topology for jack (MFCC encoder type, blob selection, VAD control). |
| tools/topology/topology2/platform/intel/sdw-dmic-audio-feature.conf | Adds MFCC frame sizing define and VAD mixer control naming for DMIC feature capture. |
| tools/topology/topology2/platform/intel/sdw-dmic-audio-feature-compress.conf | New compress PCM MFCC feature-capture topology for DMIC (MFCC encoder type, blob selection, VAD control). |
| tools/topology/topology2/platform/intel/dmic1-mfcc.conf | Renames MFCC bytes control and adds VAD mixer control naming. |
| tools/topology/topology2/include/pipelines/cavs/host-gateway-src-mfcc-capture.conf | Adds MFCC_FRAME_BYTES-driven ibs/obs to support variable-sized (compress) MFCC frames. |
| tools/topology/topology2/include/components/mfcc/mel80.conf | Updates exported MFCC configuration blob. |
| tools/topology/topology2/include/components/mfcc/mel80_compress.conf | New exported MFCC configuration blob for compress output. |
| tools/topology/topology2/include/components/mfcc/mel80_compress_dtx.conf | New exported MFCC configuration blob for compress output + DTX. |
| tools/topology/topology2/include/components/mfcc/default.conf | Updates exported default MFCC configuration blob. |
| tools/topology/topology2/include/components/mfcc/ceps13_compress_dtx.conf | New exported MFCC configuration blob for cepstral output + compress + DTX. |
| tools/topology/topology2/include/components/mfcc.conf | Adds mixer control template to MFCC widget and allows type override (e.g., encoder). |
| tools/topology/topology2/include/common/common_definitions.conf | Adds default feature flags for SDW jack/DMIC compress MFCC capture. |
| tools/topology/topology2/include/bench/mfcc_controls_playback.conf | Enables an MFCC mixer switch control in bench playback controls. |
| tools/topology/topology2/include/bench/mfcc_controls_capture.conf | Enables an MFCC mixer switch control in bench capture controls. |
| tools/topology/topology2/development/tplg-targets.cmake | Renames MFCC topology targets and adds compress MFCC mel/ceps variants with frame sizing + blob selection. |
| tools/topology/topology2/cavs-sdw.conf | Adds feature-gated includes for new compress MFCC capture topologies. |
| src/include/user/mfcc.h | Extends MFCC config ABI with VAD/DTX/compress flags and timing parameters. |
| src/include/sof/audio/mfcc/mfcc_vad.h | New VAD API/state definitions for MFCC. |
| src/include/sof/audio/mfcc/mfcc_comp.h | Refactors MFCC component interfaces (source/sink API, frame header, VAD/DTX state, IPC4 helpers). |
| src/audio/mfcc/tune/sof_mel_to_text_live_dsp_vad.py | New live Whisper transcription script using DSP VAD embedded in PCM stream. |
| src/audio/mfcc/tune/sof_mel_to_text_live_compress.py | New live Whisper transcription script for compress PCM + DTX/discontinuous frames. |
| src/audio/mfcc/tune/sof_mel_spectrogram_compress.py | New live mel spectrogram viewer for compress PCM MFCC frames. |
| src/audio/mfcc/tune/sof_ceps_spectrogram_compress.py | New live cepstral viewer for compress PCM MFCC frames. |
| src/audio/mfcc/tune/setup_mfcc.m | Updates blob export for new config layout; adds compress + DTX blob exports. |
| src/audio/mfcc/tune/README.txt | Removed in favor of README.md. |
| src/audio/mfcc/tune/README.md | New markdown documentation for tuning, decoding, and live scripts. |
| src/audio/mfcc/tune/decode_mel.m | Updates decoder for new int32 + header format and DTX gap filling. |
| src/audio/mfcc/tune/decode_ceps.m | Updates decoder for new int32 + header format and DTX gap filling. |
| src/audio/mfcc/tune/decode_all.m | Updates batch decode to new decoder signatures and int32 outputs. |
| src/audio/mfcc/mfcc.c | Moves MFCC to source/sink API processing, hooks VAD notifications and compress/DTX behavior. |
| src/audio/mfcc/mfcc_vad.c | New VAD implementation (noise floor tracking + weighted energy + hangover). |
| src/audio/mfcc/mfcc_setup.c | Adds VAD init, DTX/compress state init, buffer free fixes, sample-rate limit check. |
| src/audio/mfcc/mfcc_ipc4.c | New IPC4 control notification plumbing for VAD state reporting. |
| src/audio/mfcc/mfcc_hifi4.c | Removes old stream-buffer source copy implementations (now in common source/sink code). |
| src/audio/mfcc/mfcc_hifi3.c | Removes old stream-buffer source copy implementations (now in common source/sink code). |
| src/audio/mfcc/mfcc_generic.c | Removes old stream-buffer source copy implementations (now in common source/sink code). |
| src/audio/mfcc/mfcc_common.c | Adds source/sink copy funcs, header/VAD handling, legacy vs compress output paths, and DTX suppression logic. |
| src/audio/mfcc/CMakeLists.txt | Registers new mfcc_vad.c and conditionally mfcc_ipc4.c in build. |
| src/audio/base_fw.c | Advertises BESPOKE codec capability for MFCC compress capture. |

Comment thread src/audio/mfcc/tune/decode_mel.m Outdated
Comment thread src/audio/mfcc/tune/README.md Outdated
Comment thread src/audio/mfcc/tune/sof_ceps_spectrogram_compress.py Outdated
Comment thread src/audio/mfcc/mfcc_common.c Outdated
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 40 out of 40 changed files in this pull request and generated 4 comments.

Comment thread src/audio/mfcc/mfcc_common.c Outdated
Comment thread src/audio/mfcc/tune/sof_ceps_spectrogram_compress.py Outdated
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 40 out of 40 changed files in this pull request and generated 2 comments.

Comment thread src/audio/mfcc/mfcc_common.c
Comment thread src/audio/mfcc/mfcc_common.c Outdated
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 40 out of 40 changed files in this pull request and generated 7 comments.

Comment thread src/audio/mfcc/mfcc_common.c Outdated
Comment thread src/audio/mfcc/mfcc_common.c Outdated
Comment thread src/audio/mfcc/tune/README.md
Comment thread src/audio/mfcc/tune/README.md
Comment thread src/audio/mfcc/tune/sof_mel_spectrogram_compress.py Outdated
Comment thread src/audio/mfcc/tune/sof_ceps_spectrogram_compress.py Outdated
Comment thread src/audio/mfcc/mfcc.c
Comment on lines +261 to +265
+	cd->source_format = source_format;
+
-err:
-	comp_set_state(dev, COMP_TRIGGER_RESET);
-	return ret;
+	if (cd->config->compress_output)
+		comp_info(dev, "compress PCM output mode enabled");

Collaborator Author


Thanks — but the sink format is intentionally unconstrained here. mfcc_output_legacy() computes commit_bytes = sink_get_frame_bytes(sink) * frames, zero-fills that period, then copies the header + int32 payload byte-wise into the period, carrying any leftover via state->out_remain. The commit size always matches what the sink expects, so S16_LE / S24_4LE / S32_LE sinks all produce correctly-sized commits — no truncation or buffer overrun. The bench topologies that wire S16_LE on the MFCC sink rely on this behavior (the host decodes the bytes as an MFCC blob, not as PCM). Rejecting non-S32 sinks would break those bench flows. The only thing being conveyed through the sink is opaque bytes; there is no PCM-format contract on the MFCC output.
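
The packing described in this reply can be modeled on the host side with a few lines of Python. This is only a rough sketch of the idea, not the firmware code: `commit_period` is a hypothetical helper, and its `pending` argument stands in for the header + int32 payload bytes plus whatever `state->out_remain` would carry over.

```python
from typing import Tuple


def commit_period(pending: bytes, frame_bytes: int, frames: int) -> Tuple[bytes, bytes]:
    """Model one legacy-mode commit (hypothetical helper, not the SOF API).

    Returns (period, leftover): a zero-filled period of exactly
    frame_bytes * frames bytes with as much of `pending` copied in as fits,
    and the bytes that did not fit and must be carried to the next period.
    """
    commit_bytes = frame_bytes * frames
    period = bytearray(commit_bytes)      # zero-filled period
    n = min(len(pending), commit_bytes)   # byte-wise copy, no sample format involved
    period[:n] = pending[:n]
    return bytes(period), pending[n:]


# Example: a 10-byte blob into periods of 2 frames x 4 bytes (e.g. S16_LE stereo)
period, rest = commit_period(bytes(range(1, 11)), frame_bytes=4, frames=2)
next_period, next_rest = commit_period(rest, frame_bytes=4, frames=2)
```

In this model the commit size depends only on the sink frame size and frame count, which is why the sink sample format does not matter for the opaque MFCC bytes.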

singalsu added 4 commits May 28, 2026 17:54
Switch from process_audio_stream to source/sink API. Add compress
PCM output mode (variable-size frames, no zero padding) alongside
legacy mode (full period with zero-fill).

Unify all output to int32 Q9.23 regardless of source format.
Remove out_data_ptr_32, mel_spectra int16 copy, mfcc_func typedef,
and per-format output functions from mfcc_common/hifi3/hifi4.

Add DTX for compress mode: suppress silence frames after
configurable trailing count, with optional periodic keepalive.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
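
As an illustration of the DTX behavior described in the commit message above (silence frames suppressed after a trailing count, with an optional periodic keepalive), here is a minimal Python sketch. The class name `DtxGate` and its parameters `trailing_frames` and `keepalive_interval` are hypothetical and do not come from the firmware; the exact counting rules in `mfcc_common.c` may differ.

```python
class DtxGate:
    """Decide per frame whether to emit it (illustrative model only)."""

    def __init__(self, trailing_frames: int, keepalive_interval: int = 0):
        self.trailing_frames = trailing_frames        # silence frames still sent after speech
        self.keepalive_interval = keepalive_interval  # 0 disables keepalive (assumption)
        self.silence_run = 0                          # consecutive silence frames seen

    def should_send(self, voice_active: bool) -> bool:
        if voice_active:
            self.silence_run = 0
            return True
        self.silence_run += 1
        if self.silence_run <= self.trailing_frames:
            return True                               # trailing silence is still sent
        if self.keepalive_interval > 0:
            suppressed = self.silence_run - self.trailing_frames
            return suppressed % self.keepalive_interval == 0
        return False


gate = DtxGate(trailing_frames=2, keepalive_interval=3)
vad = [True, False, False, False, False, False, False, False, True]
sent = [gate.should_send(v) for v in vad]
```

With these settings the two silence frames after speech are still sent, then every third suppressed frame goes out as a keepalive until speech resumes.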
Register SND_AUDIOCODEC_BESPOKE capture in codec info TLV when
CONFIG_COMP_MFCC is enabled so the kernel detects compress capture
support via IPC4_SOF_CODEC_INFO.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
Update Octave decode scripts for int32 Q9.23 output and DTX gap
filling. Add DTX blob generation to setup_mfcc.m.

Add Python compress capture tools: sof_mel_spectrogram_compress.py,
sof_ceps_spectrogram_compress.py, sof_mel_to_text_live_compress.py.
Refactor sof_mel_to_text_live_dsp_vad.py to use shared compress
capture code. Add README with usage examples.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
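
For readers writing their own decoder, converting the int32 Q9.23 output to floating point amounts to dividing by 2^23. The sketch below is not taken from the PR's scripts; the helper name `q9_23_to_float` is hypothetical and the little-endian byte order is an assumption.

```python
import struct
from typing import List

Q_FRACTION_BITS = 23  # Q9.23: 23 fractional bits in a signed 32-bit word


def q9_23_to_float(raw: bytes) -> List[float]:
    """Convert packed int32 Q9.23 values to floats (little-endian assumed)."""
    count = len(raw) // 4
    values = struct.unpack(f"<{count}i", raw[: count * 4])
    scale = float(1 << Q_FRACTION_BITS)
    return [v / scale for v in values]


# 1.0, -0.5 and 0.0 encoded as Q9.23
payload = struct.pack("<3i", 1 << 23, -(1 << 22), 0)
decoded = q9_23_to_float(payload)
```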
Add sdw-jack-audio-feature-compress.conf (PCM 53, pipeline 132)
and sdw-dmic-audio-feature-compress.conf (PCM 54, pipeline 133)
for compress MFCC capture with DTX blobs.

Fix buffer sizes: set MFCC obs and host-copier ibs/obs to 344
bytes (24-byte header + 80 x int32). Add mel and ceps compress
topology targets for MTL and ARL. Rename normal MFCC topologies
to *-mfcc-mel-normal for clarity.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
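
The 344-byte figure in the commit message above follows from 24 + 80 x 4. A small Python sketch of that arithmetic, treating the header as opaque bytes because its field layout is not shown on this page; the constant names and `split_frame` are hypothetical.

```python
from typing import Tuple

HEADER_BYTES = 24        # frame header size from the commit message
NUM_MEL_BINS = 80        # 80 x int32 coefficients
BYTES_PER_INT32 = 4
FRAME_BYTES = HEADER_BYTES + NUM_MEL_BINS * BYTES_PER_INT32  # 24 + 320 = 344


def split_frame(frame: bytes) -> Tuple[bytes, bytes]:
    """Split one full mel frame into (opaque header, int32 payload bytes)."""
    if len(frame) != FRAME_BYTES:
        raise ValueError(f"expected {FRAME_BYTES} bytes, got {len(frame)}")
    return frame[:HEADER_BYTES], frame[HEADER_BYTES:]


header, payload = split_frame(bytes(FRAME_BYTES))
```

This applies to the 80-bin mel case only; a blob with a different coefficient count would give a different frame size.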
@singalsu singalsu force-pushed the mfcc_compress_encoder branch from 97a3c57 to b491cf1 Compare May 28, 2026 15:01
@singalsu singalsu changed the title from "[DNM] Audio: MFCC: Use the MFCC module as compress PCM encoder with discontinuous stream" to "Audio: MFCC: Use the MFCC module as compress PCM encoder with discontinuous stream" May 28, 2026
@singalsu singalsu requested a review from Copilot May 28, 2026 15:03