soundwire: Use get_device()/put_device() to protect callbacks#5799
soundwire: Use get_device()/put_device() to protect callbacks#5799rfvirgil wants to merge 1 commit into
Conversation
Use get_device()/put_device() around calling driver callbacks instead of holding a mutex. This avoids mutex nesting and possible mutex inversion. Rename sdw_dev_lock to probe_remove_lock to make its purpose explicit. When the core SoundWire code calls a driver callback it must be sure that the target driver is not unloaded while the callback is running. The code was using the sdw_dev_lock mutex to block driver remove() while the callback is executing. This meant the callbacks were always called while holding that mutex, which could lead to mutex inversion. For example during enumeration: 1) Take sdw_dev_lock 2) Call update_status() 3) Take regmap lock But during BRA activity we could have: 1) Take regmap lock 2) Start BRA transaction 3) Take sdw_dev_lock 4) Call port_prep() and/or bus_config() callbacks A better way to prevent the target driver being unloaded is to increment its reference count by calling get_device(). This pattern is seen in other drivers that need to call callbacks in another device. The mutex is only needed around the call to get_device() to prevent the driver being removed while its reference count is being incremented. After that, the mutex can be released becase the driver cannot be removed until its reference count is decremented. Change-Id: I1f5cfd81d6204cf3aa92ddc2896af17307edd9c1 Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
There was a problem hiding this comment.
Pull request overview
This PR refactors SoundWire slave-driver callback invocation to avoid holding a mutex across driver callbacks by introducing sdw_slave_device_get()/put() (device refcounting) and renaming sdw_dev_lock to probe_remove_lock to clarify intent.
Changes:
- Rename
sdw_dev_locktoprobe_remove_lockinstruct sdw_slaveand corresponding init/destroy sites. - Introduce
sdw_slave_device_get()/sdw_slave_device_put()helpers and switch several callback call sites to use them instead of holding the mutex across the callback. - Update multiple core callback sites (stream + bus handling) to use the new get/put helpers.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| include/linux/soundwire/sdw.h | Renames the slave mutex field to probe_remove_lock and updates kernel-doc. |
| drivers/soundwire/slave.c | Adds sdw_slave_device_get()/put() helpers; updates mutex init/destroy for renamed lock. |
| drivers/soundwire/bus.h | Declares the new helper prototypes for internal users. |
| drivers/soundwire/stream.c | Replaces mutex-held callback invocations with sdw_slave_device_get()/put() around driver ops calls. |
| drivers/soundwire/bus.c | Replaces mutex-held callback invocations with sdw_slave_device_get()/put() in several bus/interrupt/status paths. |
| drivers/soundwire/bus_type.c | Renames mutex usage in probe/remove paths to probe_remove_lock. |
Comments suppressed due to low confidence (2)
drivers/soundwire/bus_type.c:174
- sdw_bus_remove() drops probe_remove_lock before invoking drv->remove(), so it can now run concurrently with callbacks that obtained a device reference via sdw_slave_device_get(). Since get_device()/put_device() does not block unbind/remove, the driver’s remove() may race with other driver callbacks and teardown shared state.
Consider making remove wait for any in-flight callbacks (refcount + wait) before calling drv->remove(), or otherwise ensuring callbacks cannot run once removal starts.
static void sdw_bus_remove(struct device *dev)
{
struct sdw_slave *slave = dev_to_sdw_dev(dev);
struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
mutex_lock(&slave->probe_remove_lock);
slave->probed = false;
mutex_unlock(&slave->probe_remove_lock);
if (drv->remove)
drv->remove(slave);
drivers/soundwire/bus.c:1804
- Same dev->driver lifetime race: dev->driver is dereferenced after sdw_slave_device_get() without any guarantee that unbind/remove cannot proceed concurrently.
This needs a stable referenced driver/module pointer captured under probe_remove_lock (or another synchronization scheme) before calling drv->ops->interrupt_callback().
if (sdw_slave_device_get(slave)) {
struct device *dev = &slave->dev;
struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
if (drv->ops && drv->ops->interrupt_callback) {
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| struct device *sdw_slave_device_get(struct sdw_slave *slave) | ||
| { | ||
| guard(mutex)(&slave->probe_remove_lock); | ||
|
|
||
| if (!slave->probed) | ||
| return NULL; | ||
|
|
||
| return get_device(&slave->dev); | ||
| } |
| if (sdw_slave_device_get(slave)) { | ||
| struct device *dev = &slave->dev; | ||
| struct sdw_driver *drv = drv_to_sdw_driver(dev->driver); | ||
|
|
| if (sdw_slave_device_get(slave)) { | ||
| struct device *dev = &slave->dev; | ||
| struct sdw_driver *drv = drv_to_sdw_driver(dev->driver); | ||
|
|
| if (sdw_slave_device_get(slave)) { | ||
| struct device *dev = &slave->dev; | ||
| struct sdw_driver *drv = drv_to_sdw_driver(dev->driver); | ||
|
|
| if (sdw_slave_device_get(slave)) { | ||
| struct device *dev = &slave->dev; | ||
| struct sdw_driver *drv = drv_to_sdw_driver(dev->driver); | ||
|
|
||
| if (drv->ops && drv->ops->update_status) | ||
| ret = drv->ops->update_status(slave, status); |
Use get_device()/put_device() around calling driver callbacks instead of holding a mutex. This avoids mutex nesting and possible mutex inversion. Rename sdw_dev_lock to probe_remove_lock to make its purpose explicit.
When the core SoundWire code calls a driver callback it must be sure that the target driver is not unloaded while the callback is running. The code was using the sdw_dev_lock mutex to block driver remove() while the callback is executing.
This meant the callbacks were always called while holding that mutex, which could lead to mutex inversion. For example during enumeration:
But during BRA activity we could have:
A better way to prevent the target driver being unloaded is to increment its reference count by calling get_device(). This pattern is seen in other drivers that need to call callbacks in another device. The mutex is only needed around the call to get_device() to prevent the driver being removed while its reference count is being incremented. After that, the mutex can be released becase the driver cannot be removed until its reference count is decremented.
Change-Id: I1f5cfd81d6204cf3aa92ddc2896af17307edd9c1