[DNM][WiP][PoC] userspace LL scheduling: LLEXT & multicore #10945

Draft
lyakh wants to merge 95 commits into thesofproject:main from lyakh:llext-ull
lyakh commented Jun 18, 2026

Collaborator

This includes #10558 and my patches on top to enable LLEXT and multicore. Current status: passes simple tests with nocodec with both core 0 and core 1 streaming. Running 2 streams simultaneously hits a problem when the first of them terminates. WiP.

kv2019i added 30 commits June 16, 2026 20:21
Add a build option HOST_DMA_IPC_POSITION_UPDATES to control whether
the functionality to send IPC stream position updates is enabled.
Most platforms provide more efficient means for the host to
monitor DMA state, so this code is in most cases unnecessary.

The current IPC sending code (from audio context) also assumes
kernel context, so making this functionality user-space compatible
will require extra work.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
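
A minimal sketch of how such a build option could gate the position-update path, assuming the Kconfig symbol surfaces as `CONFIG_HOST_DMA_IPC_POSITION_UPDATES`. The `demo_*` names and the struct are hypothetical stand-ins, not the actual SOF host component code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * In a real build the build system defines this symbol (or not).
 * It is left undefined here, so the no-op stub below is compiled.
 */
/* #define CONFIG_HOST_DMA_IPC_POSITION_UPDATES 1 */

/* Hypothetical host-side state, for illustration only. */
struct demo_host {
	uint64_t dma_pos;
	unsigned int updates_sent;
};

#ifdef CONFIG_HOST_DMA_IPC_POSITION_UPDATES
/* Option enabled: count one (pretend) IPC position update. */
static void demo_send_position_update(struct demo_host *hd)
{
	hd->updates_sent++;
}
#else
/* Option disabled: the call compiles to nothing. */
static inline void demo_send_position_update(struct demo_host *hd)
{
	(void)hd;
}
#endif

/* Called after each DMA copy; the update call is unconditional here. */
static void demo_host_copy_done(struct demo_host *hd, uint32_t bytes)
{
	assert(hd);
	hd->dma_pos += bytes;
	demo_send_position_update(hd);
}
```

Keeping the call site unconditional and stubbing the callee is one common way to avoid scattering `#ifdef` blocks through the audio path; the actual patch may be structured differently.
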
Drop the IRQ disable/enable in ipc4_search_for_drv(). The driver
list is only modified at FW boot and when a new driver is registered
at runtime via SOF_IPC4_GLB_LOAD_LIBRARY IPC. ipc4_search_for_drv()
is only used when processing IPC messages. As IPC processing
is serialized, it is not possible for the driver list to be modified
concurrently with a call to ipc4_search_for_drv().

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
The component driver list is only modified at FW boot and at runtime
when a library is loaded. At boot, module init runs serially on the
primary core (Zephyr SYS_INIT at APPLICATION level, before secondary
cores are started; .initcall walked on a single core for XTOS). At
runtime, registration happens from the IPC thread, which is serialized
with only one command processed at a time. These two phases never
overlap, as IPC message processing only begins after boot completes,
so the list can never be modified concurrently.

The lock was also already incoherent: comp_set_adapter_ops() iterates the
list without holding the lock, so it provided no real mutual exclusion.

Drop the spinlock from comp_register() and comp_unregister(), and from
the UUID search in the IPC3 get_drv() reader. Remove the now-unused
lock field from struct comp_driver_list and its initialization.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Add support for registering user-space LL tasks, and ability to use
the task scheduling functions from user-space.

The implementation splits scheduler list into kernel and user
portions if SOF is built with CONFIG_SOF_USERSPACE_LL. A scheduler
type can be either maintained in kernel or user, never both. With
this patch, the SOF_SCHEDULE_LL_TIMER is moved to user managed
if CONFIG_SOF_USERSPACE_LL is used.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
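
The kernel/user split described above can be sketched as follows. This is a simplified model: `DEMO_USERSPACE_LL` stands in for CONFIG_SOF_USERSPACE_LL, and all `demo_*` types and functions are hypothetical, not the SOF scheduler API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for CONFIG_SOF_USERSPACE_LL in this sketch. */
#define DEMO_USERSPACE_LL 1

enum demo_sched_type {
	DEMO_SCHED_LL_TIMER,
	DEMO_SCHED_LL_DMA,
	DEMO_SCHED_DP,
};

struct demo_sched {
	enum demo_sched_type type;
	struct demo_sched *next;
};

/* Two separate lists; a given type lives on exactly one of them. */
struct demo_sched_lists {
	struct demo_sched *kernel;
	struct demo_sched *user;
};

/* Only the LL timer scheduler is user-managed in this model. */
static bool demo_type_is_user(enum demo_sched_type type)
{
#ifdef DEMO_USERSPACE_LL
	return type == DEMO_SCHED_LL_TIMER;
#else
	(void)type;
	return false;
#endif
}

/* Push the scheduler onto the list that owns its type. */
static void demo_sched_register(struct demo_sched_lists *l,
				struct demo_sched *s)
{
	struct demo_sched **head =
		demo_type_is_user(s->type) ? &l->user : &l->kernel;

	s->next = *head;
	*head = s;
}

/* Search only the list that owns the requested type. */
static struct demo_sched *demo_sched_find(const struct demo_sched_lists *l,
					  enum demo_sched_type type)
{
	struct demo_sched *s = demo_type_is_user(type) ? l->user : l->kernel;

	for (; s; s = s->next)
		if (s->type == type)
			return s;
	return NULL;
}
```

Because the owning list is derived from the type alone, a lookup never has to touch memory belonging to the other privilege domain.
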
Ensure the scheduler objects and lists of schedulers are allocated
such that they can be used with both kernel and user-space LL
scheduler implementations.

The SOF_MEM_FLAG_KERNEL flag is removed. This flag has been a no-op
for a while, and given scheduler list is not always in kernel anymore,
it would be highly confusing to keep it.

When CONFIG_SOF_USERSPACE_LL is set, the context of all schedulers
is managed in the LL user-space domain.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
The real fix is to remove the locking around dai_get_properties() altogether,
but this depends on fixes in Zephyr DAI drivers. To unblock user-space work,
remove the calls to spinlocks for now. This opens up possibility to hit issues
with concurrent playback and capture cases on multiple cores, so this commit
remains a WIP until fixes in Zephyr drivers land.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Modify code to allocate DAI properties object on stack and
use dai_get_properties_copy(). This is required when DAI code
is run in user-space and a syscall is needed to talk to the DAI
driver. It's not possible to return a pointer to kernel memory,
so instead data needs to be copied to caller stack.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
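
The copy-out pattern this commit relies on can be illustrated with a small model. `demo_get_properties_copy()` and `struct demo_dai_props` are hypothetical stand-ins; the real Zephyr dai_get_properties_copy() signature and property fields are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical property set; the field names are illustrative. */
struct demo_dai_props {
	uint32_t fifo_address;
	uint32_t fifo_depth;
	uint32_t dma_hs_id;
};

/* Stands in for driver-owned data that lives in kernel memory. */
static const struct demo_dai_props demo_driver_props = { 0x1000, 16, 3 };

/*
 * Models the syscall boundary: instead of handing out a pointer to
 * driver memory, copy the data into a caller-provided buffer.
 */
static int demo_get_properties_copy(struct demo_dai_props *dst)
{
	if (!dst)
		return -1;
	memcpy(dst, &demo_driver_props, sizeof(*dst));
	return 0;
}

/* Caller side: the properties object lives on the caller's stack. */
static uint32_t demo_dai_fifo_depth(void)
{
	struct demo_dai_props props;

	if (demo_get_properties_copy(&props) < 0)
		return 0;
	return props.fifo_depth;
}
```

The caller only ever dereferences its own stack object, which is what makes the pattern usable from an unprivileged thread.
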
Turn the pdata->sem into a dynamic object in userspace LL builds,
implemented with Zephyr k_sem. Add POSIX no-op stubs
for sys_sem to maintain testbench build compatibility.

Keep statically allocated semaphore for kernel LL builds.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Add function scheduler_get_data_for_core() to look up scheduler
data for a particular type of scheduler. This variant takes the
core number as an argument, so it can be called from
unprivileged code.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
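
A rough sketch of a per-core lookup that takes the core as an explicit argument, so it never has to query the current CPU id (a privileged read, per a later commit in this series). The table layout and all `demo_*` names are assumptions made for illustration, not the real scheduler_get_data_for_core() implementation.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CORE_COUNT 2

enum demo_sched_type {
	DEMO_SCHED_LL_TIMER,
	DEMO_SCHED_DP,
	DEMO_SCHED_COUNT,
};

/* Hypothetical per-core table of scheduler private data. */
static void *demo_sched_data[DEMO_CORE_COUNT][DEMO_SCHED_COUNT];

static int demo_args_valid(enum demo_sched_type type, int core)
{
	return core >= 0 && core < DEMO_CORE_COUNT &&
	       (int)type >= 0 && (int)type < DEMO_SCHED_COUNT;
}

static void demo_sched_set_data(enum demo_sched_type type, int core,
				void *data)
{
	assert(demo_args_valid(type, core));
	demo_sched_data[core][type] = data;
}

/* The core is a plain argument: no "which CPU am I on" query needed. */
static void *demo_sched_get_data_for_core(enum demo_sched_type type,
					  int core)
{
	if (!demo_args_valid(type, core))
		return NULL;
	return demo_sched_data[core][type];
}
```
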
Add function user_ll_grant_access() to allow other threads
to access the scheduler mutex. This is needed if work is submitted
from other threads to the scheduler.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
When LL scheduler is run in user-space, use a different Zephyr
thread name.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
The COHERENT_CHECK_NONSHARED_CORES debug macros call cpu_get_id()
which invokes arch_proc_id() - a privileged hardware register read
that faults in user-space context. Disable the entire debug block at
compile time when CONFIG_SOF_USERSPACE_LL is enabled. This also fixes
the same latent issue in CORE_CHECK_STRUCT and CORE_CHECK_STRUCT_INIT.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Add new functions to lock/unlock the LL scheduler for a given
core. This is intended for audio application code that needs
to modify the audio pipelines and needs an interface to
get exclusive access to the pipelines on a particular core.

This interface is specific to SOF builds with CONFIG_SOF_USERSPACE_LL.
If the LL scheduler is running in kernel space, there is an option
to disable interrupts for a similar effect. For now these code
paths are kept separate.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
In user-space LL builds (CONFIG_SOF_USERSPACE_LL), the IPC user thread
cannot block interrupts while making modifications to the audio graph.

To work around this limitation, one could either protect each pipeline
object with locks, or keep the LL level lock held while executing
LL tasks.

This patch implements support for the latter approach. If building
SOF for user LL, do not release the lock when running a task. This
reduces the number of syscalls during an LL iteration, and makes it
possible to safely implement IPC handlers that need to modify the
audio graph.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Modify the locking approach for CONFIG_SOF_USERSPACE_LL builds.
Kernel LL implementation heavily relies on ability to disable
interrupts when IPC handler is modifying the graph. This ensures
a new LL tick and execution of a new graph cycle does not start
before the graph modifications done by IPC handler are complete.

In user-space, this approach is not available as a user-space thread
cannot disable interrupts. In commit 1e59ce2 ("pipeline: protect
component connections with a mutex"), sys_mutex based locking was
implemented to protect the component list and modifications to it. This
approach does not scale, as it would require taking the mutex for each
component of each pipeline on every LL cycle tick, which results in
significant system call overhead. Additionally, Zephyr sys_mutex does
not work correctly if the lock object is put into dynamically allocated
user memory.

In this commit, locking the LL graph is moved to a higher level.
A single lock is used to protect the whole LL graph, and the lock
is taken at the start of the LL tick. The same lock is taken by the IPC
handlers when modifications to the graph are made. The mutex interface
supports priority inheritance, so this usage is safe if the LL timer
tick happens while IPC processing is still in progress.

The patch only changes behaviour for userspace LL SOF builds. If
LL scheduling is kept in kernel, locking is done as before.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
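
To make the overhead argument concrete, the toy model below counts lock operations per LL tick for the two strategies. It assumes each take/give on a user-space mutex may cost a system call. The lock is a plain counter, not a real mutex, and every `demo_*` name is hypothetical.

```c
#include <assert.h>

/* Toy lock that only counts operations; it provides no exclusion. */
struct demo_lock {
	unsigned int ops;
	int held;
};

static void demo_lock_take(struct demo_lock *l)
{
	assert(!l->held);
	l->held = 1;
	l->ops++;
}

static void demo_lock_give(struct demo_lock *l)
{
	assert(l->held);
	l->held = 0;
	l->ops++;
}

/* Per-component locking: take and give around every component. */
static unsigned int demo_tick_per_component(unsigned int pipelines,
					    unsigned int comps)
{
	struct demo_lock lock = { 0, 0 };
	unsigned int p, c;

	for (p = 0; p < pipelines; p++) {
		for (c = 0; c < comps; c++) {
			demo_lock_take(&lock);
			/* one component would be processed here */
			demo_lock_give(&lock);
		}
	}
	return lock.ops;
}

/* Graph-level locking: one take/give pair around the whole tick. */
static unsigned int demo_tick_graph_lock(unsigned int pipelines,
					 unsigned int comps)
{
	struct demo_lock lock = { 0, 0 };
	unsigned int p, c, processed = 0;

	demo_lock_take(&lock);
	for (p = 0; p < pipelines; p++)
		for (c = 0; c < comps; c++)
			processed++;	/* component processed under the lock */
	demo_lock_give(&lock);

	assert(processed == pipelines * comps);
	return lock.ops;
}
```

In this model, 4 pipelines of 5 components cost 40 lock operations per tick with per-component locking and 2 with the single graph lock, independent of graph size.
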
Modify the checks in zephyr_ll_assert_core() to make them safe
to call from user-space LL threads.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Needs more review, but makes the tests pass again.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Build fails when building with CONFIG_THREAD_NAME disabled. Fix
the issue by conditional compilation of code using
CONFIG_THREAD_MAX_NAME_LEN.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Add support to run pipeline_schedule_triggered() in user-space.
Use the user_ll_lock/unlock_sched() interface if building
with CONFIG_SOF_USERSPACE_LL.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
In user-space LL builds the low-latency scheduler runs its work in a
dedicated privileged domain thread, created together with its timer and
access grants by scheduler_init_context() (zephyr_ll_init_context() ->
domain_thread_init()). This context is per-core and must exist on every
core that runs LL tasks.

So far it was only established for the primary core, so LL tasks could
not be scheduled on secondary cores when CONFIG_SOF_USERSPACE_LL is
enabled.

Allocate an LL task in secondary_core_init() and run
scheduler_init_context() on it, giving each secondary core its own LL
domain thread. A dedicated sec_core_init UUID is registered for the
task. The whole block is compiled in only for CONFIG_SOF_USERSPACE_LL.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Copier set_chmap() blocks IRQs to atomically update the converters.
This code is not safe to be moved to user-space, so replace the locks
with calls to block LL scheduler execution.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
kv2019i and others added 30 commits June 16, 2026 20:51
Place the pipeline position lookup table in the sysuser memory
partition and replace k_spinlock with a dynamically allocated
k_mutex when CONFIG_SOF_USERSPACE_LL is enabled. Spinlocks disable
interrupts which is a privileged operation unavailable from
user-mode threads.

The mutex pointer is stored in a separate APP_SYSUSER_BSS variable
outside the SHARED_DATA struct so Zephyr's kernel object tracking
can recognize it for syscall verification.

Move pipeline_posn_init() from task_main_start() to
primary_core_init() before platform_init(), so the mutex is
allocated before ipc_user_init() grants thread access to it.

In pipeline_posn_get(), bypass the sof_get() kernel singleton and
access the shared structure directly when running in user-space.
Grant the ipc_user_init thread access to the pipeline position
mutex via new pipeline_posn_grant_access() helper.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
…pace task

zephyr_ll_task_sched_free() frees an active (RUNNING/RESCHEDULE) task by
setting pdata->freeing and waiting on pdata->sem for the scheduler thread
to remove the task from its run list before the memory is released.

Under CONFIG_SOF_USERSPACE_LL this function runs in kernel context while
pdata->sem is a sys_sem allocated on the user heap. sys_sem_take() returns
-EINVAL immediately when called from kernel context, so the wait is a
no-op: pdata is freed (and the struct task is subsequently freed by
pipeline_free()) while the task is still linked in sch->tasks with
n_tasks != 0 and the scheduling domain handler still set. Because n_tasks
is non-zero, schedule_free() does not stop the LL timer, and the next
timer tick runs zephyr_ll_run() over the dangling task, dereferencing
freed memory and taking a load/store-privilege exception (EXCCAUSE 26) in
the user-space LL thread.

Stop relying on the cross-privilege semaphore handshake in this path. When
the task must be waited for, mark it cancelled so that, should it actually
be mid-execution on the scheduler's temporary list, it is removed via the
cancel path without re-running task->run() on resources the caller may
already have freed. If the task is still linked on the run list, the
scheduler thread is provably not executing it (a running task is moved off
sch->tasks with the lock dropped), so remove it directly and skip the
wait. This guarantees the task is delisted (n_tasks -> 0, handler -> NULL)
before pdata is freed, eliminating both the dangling list entry and the
stray timer wakeups.

Verified on PTL with the standalone user-space LL boot tests: the
userspace_ll suite, including pipeline_two_components_user, now passes
without the fatal exception at teardown.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
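
The decision logic described in this commit message can be modelled roughly as below. This is a single-threaded sketch with hypothetical `demo_*` types. It omits the locking and the scheduler's temporary list, and is not the actual zephyr_ll_task_sched_free() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_state { DEMO_QUEUED, DEMO_RUNNING, DEMO_CANCEL };

struct demo_task {
	enum demo_state state;
	struct demo_task *next;
};

struct demo_sched {
	struct demo_task *tasks;	/* run list */
	unsigned int n_tasks;
	bool handler_set;		/* scheduling domain handler */
};

static void demo_sched_add(struct demo_sched *sch, struct demo_task *t)
{
	t->state = DEMO_QUEUED;
	t->next = sch->tasks;
	sch->tasks = t;
	sch->n_tasks++;
	sch->handler_set = true;
}

/* Unlink t from the run list; returns true if it was linked. */
static bool demo_list_remove(struct demo_sched *sch, struct demo_task *t)
{
	struct demo_task **pp = &sch->tasks;

	while (*pp && *pp != t)
		pp = &(*pp)->next;
	if (!*pp)
		return false;
	*pp = t->next;
	t->next = NULL;
	return true;
}

/*
 * Returns true if the caller would still have to wait for the
 * scheduler thread, false if the task was delisted directly.
 */
static bool demo_task_free(struct demo_sched *sch, struct demo_task *t)
{
	/* Cancelled tasks are dropped without calling task->run() again. */
	t->state = DEMO_CANCEL;

	/* Still on the run list: the scheduler is not executing it. */
	if (demo_list_remove(sch, t)) {
		if (--sch->n_tasks == 0)
			sch->handler_set = false;
		return false;
	}

	/* Mid-execution: left to the scheduler's cancel path. */
	return true;
}
```

The point of the model is the ordering: the task is off the list, and the task count and handler are updated, before the caller goes on to release the task's memory.
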
If SOF is built with CONFIG_SOF_USERSPACE_LL, the IPC user
handler will require access to coldrodata sections to initialize
audio modules.

This logic is not required for LLEXT modules, which have existing
code to add access to coldrodata (and other sections). This commit
is needed for builds where LLEXT is not used.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
This is a set of temporary changes to audio code to remove calls
to privileged interfaces that are not mandatory to run simple
audio tests.

These need proper solutions to be able to run all use-cases in user
LL version.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
The .coldrodata partition can be empty, avoid a failure in such
cases.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Add a missing header for the zephyr_ll_(un)lock_sched() functions.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Extract a privileged LLEXT-related part from
lib_manager_module_create() into a separate function and make it a
system call. ilib_manager_mod_free_priv() already executes privileged
operations, so to make it callable from userspace, convert
lib_manager_free_module() to a system call too.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
ll_schedule_domain.h is needed for user_ll_lock_sched() and
user_ll_unlock_sched()

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Use a pointer type-cast instead of copying a structure.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Use "%p" to log a pointer.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
scheduler_get_task_info_ll() and zephyr_ll_domain() are only needed
when CONFIG_SOF_USERSPACE_LL=y

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Map library data in DRAM to the LL memory domain but only with
kernel access. This is needed for LLEXT ELF linking.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
When CONFIG_SOF_USERSPACE_USE_DRIVER_HEAP isn't selected, dynamically
allocated driver objects should still be accessible to the userspace.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
When loading and linking LLEXT modules map them automatically for the
LL memory domain, unless they belong to the DP domain.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Even if userspace LL is disabled but generic userspace is enabled,
IPC syscalls can be enabled.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Drivers are now accessible to userspace LL, remove now superfluous
copies.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
LLEXT is now working with userspace LL and can be enabled.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
"Trace context" isn't used any more, no need to warn about it.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Memory zones are only used with IPC3, mark them as such.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Eliminate multiple instances of IPC4 data copying, use simple type-
casts instead. This removes stack objects and replaces run-time
copying with compile-time pointer substitution.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
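
A small illustration of replacing a copy with a typed view of the same memory. The message layout and the `demo_*` names are invented for the example and do not correspond to real IPC4 structures. The cast assumes the raw buffer is suitably aligned and that the view struct has no padding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical typed view of a two-word message. */
struct demo_bind_req {
	uint32_t module_instance;
	uint32_t dst_queue;
};

_Static_assert(sizeof(struct demo_bind_req) == 2 * sizeof(uint32_t),
	       "view must match the raw message layout");

/* Before: copy the raw words into a stack object, then read a field. */
static uint32_t demo_dst_queue_copy(const uint32_t *msg)
{
	struct demo_bind_req req;

	memcpy(&req, msg, sizeof(req));
	return req.dst_queue;
}

/* After: read the field through a typed pointer, with no stack copy. */
static uint32_t demo_dst_queue_cast(const uint32_t *msg)
{
	const struct demo_bind_req *req = (const struct demo_bind_req *)msg;

	return req->dst_queue;
}
```

Both functions return the same value for the same buffer; the cast variant simply avoids the stack object and the run-time copy.
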
When initialising in userspace use the userspace heap for channel
memory allocations.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Instead of allocating semaphores during global initialisation, do
that later when initialising the domain for specific cores. This
also automatically grants access rights to the allocating thread.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
This is needed at least to set the .priv_data pointer to NULL.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Also when userspace is used scheduler instances have to be allocated
uncached.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Prepare for multi-core support: allocate the IPC thread dynamically
and extract thread initialisation into a separate function.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Userspace IPC context is global, allocate it uncached.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Use current core when calling scheduler_get_data_for_core().

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Sometimes 10ms aren't enough for userspace IPC processing, increase
it to 20ms.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Make scheduling LL thread and synchronisation objects per-core and
forward IPCs and scheduling events accordingly.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>