Skip to content

Support registering extra ioeventfds#7841

Open
DemiMarie wants to merge 7 commits intocloud-hypervisor:mainfrom
DemiMarie:extra-ioeventfds
Open

Support registering extra ioeventfds#7841
DemiMarie wants to merge 7 commits intocloud-hypervisor:mainfrom
DemiMarie:extra-ioeventfds

Conversation

@DemiMarie
Copy link
Copy Markdown
Contributor

This is the other prerequisite needed for vhost-guest.

@DemiMarie DemiMarie requested a review from a team as a code owner March 14, 2026 03:13
@rbradford rbradford requested a review from sboeuf March 16, 2026 10:15
Isr = 3,
Device = 4,
Pci = 5,
Doorbell = 6,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we safely rely on the spec you pointed out in the commit message?
The latest virtio spec has the following description:

/* Common configuration */ 
#define VIRTIO_PCI_CAP_COMMON_CFG        1 
/* Notifications */ 
#define VIRTIO_PCI_CAP_NOTIFY_CFG        2 
/* ISR Status */ 
#define VIRTIO_PCI_CAP_ISR_CFG           3 
/* Device specific configuration */ 
#define VIRTIO_PCI_CAP_DEVICE_CFG        4 
/* PCI configuration access */ 
#define VIRTIO_PCI_CAP_PCI_CFG           5 
/* Shared memory region */ 
#define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8 
/* Vendor-specific data */ 
#define VIRTIO_PCI_CAP_VENDOR_CFG        9

so I guess it should be safe to use doorbell = 6, but I wanted to double check (/cc @rbradford )

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good callout. Given this is not a part of the official spec, Let's add a comment above to document the source variant spec we are referring to.

@DemiMarie DemiMarie force-pushed the extra-ioeventfds branch 4 times, most recently from 35d09bd to eb906fa Compare March 21, 2026 08:20
Copy link
Copy Markdown
Contributor Author

@DemiMarie DemiMarie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the commit messages. Hopefully that answers your questions.

@DemiMarie DemiMarie force-pushed the extra-ioeventfds branch 2 times, most recently from 5f7e04f to f4e55a3 Compare March 22, 2026 10:12
@alyssais
Copy link
Copy Markdown
Member

The first commit message uses the terminology "virtio-device-backend", which has not been previously introduced. It would be less confusing to keep calling it virtio-vhost-user for now, ideally until we have a spec proposal that refers to it as whatever name you want to give it.

@DemiMarie DemiMarie force-pushed the extra-ioeventfds branch 2 times, most recently from 6926747 to 63d4b53 Compare March 25, 2026 11:58
@DemiMarie DemiMarie requested a review from sboeuf April 1, 2026 16:16
virtio-vhost-user requires for PCI doorbells used to
notify the frontend of events.  Add support for them.

The specification for this capability is
<https://stefanha.github.io/virtio/vhost-user-slave.html>.

No functional change intended as the code is currently not used.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Virtio-vhost-user devices need one interrupt per queue, plus one
interrupt per config change, plus one interrupt per queue of the
*device they are implementing*.  Allow devices to override the number of
interrupts they need instead of assuming it is always one more than the
number of queues.  No functional change for existing devices, which use
a default implementation that returns the number of queues.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
It will always be 0.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
No functional change intended.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
The code will soon need to iterate over ioeventfds provided by the inner
virtio device.  Accessing that device requires a lock, which works fine
with internal iteration but breaks with external iteration.  While an
Arc could be used to avoid the need for the callback, implementations of
this method will need to grab a device-internal lock and invoke the
callback with the lock held.  Returning a MutexGuard would constrain the
device implementation unnecessarily.

This also changes VirtioDevice::ioeventfds() to take *two* addresses,
not one.  The callback also takes two addresses.  This is because
relocating a BAR must happen atomically with respect to registering and
unregistering ioeventfds by the device.  If the device manager
used one call to unregister ioeventfds and a second call to reregister
them, a device implementation could register ioeventfds between the two
calls.  Such ioeventfds would not be unregistered, potentially causing
errors during reregistration.  Using a single call for both
unregistration and reregistration allows the device implementation to
hold a lock across both operations.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
The device manager calls VirtioTransport::ioeventfds whenever it
needs to unregister or register ioeventfds.  This function invokes a
callback with each virtqueue ioeventfd, its old base address, and its
new base address.  Also invoke the callback with ioeventfds provided by
the device itself.

No functional change as no current device provides any.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Devices that register their own ioeventfds need to know the address they
should register them at.  It is not possible for this to be a callback
on PciDevice because PciDevice contains the VirtioDevice.

Calls to the callback after activation are ignored.  Such calls are
always preceeded by a call to VirtioPciDevice::ioeventfds(), which
already provides the new address.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
@likebreath
Copy link
Copy Markdown
Member

@sboeuf Appears all previous comments are addressed. Can you please take another look?

Copy link
Copy Markdown
Member

@sboeuf sboeuf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes thanks for addressing the comments, LGTM

Copy link
Copy Markdown
Member

@likebreath likebreath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments below.

Comment on lines +60 to +61
/// Maximum number of doorbells.
pub const MAX_DOORBELLS: u16 = (u8::MAX as u16) * 2 + 1;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some context as comments to explain why it is defined as 511?


/// The maximum number of doorbells the device supports.
/// Most devices don't support any.
/// Limited to 511 doorbells.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: better refer to MAX_DOORBELLS rather 511

Comment on lines +4211 to +4212
// This includes 1 per virtqueue, 1 per doorbell, and 1 for virtio config
// space change.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this change appears to be included in the wrong commit.

Isr = 3,
Device = 4,
Pci = 5,
Doorbell = 6,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good callout. Given this is not a part of the official spec, Let's add a comment above to document the source variant spec we are referring to.

Comment on lines +436 to +437
let num_queues =
locked_device.queue_max_sizes().len() + usize::from(locked_device.doorbells_max());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a behavior change and needs comment to explain why now "num_queues" is now the sum of "queue_num" and "doorbells".

Also, likely this change is should be included to commit "virtio-devices: Add support for PCI doorbells".

const DOORBELL_BAR_OFFSET: u64 =
next_bar_addr_align(NOTIFICATION_BAR_OFFSET, NOTIFICATION_SIZE, 1u64 << 16);
const DOORBELL_OFF_MULTIPLIER: u32 = NOTIFY_OFF_MULTIPLIER;
const DOORBELL_BAR_SIZE: u64 = (MAX_QUEUES * 2 + 1) * DOORBELL_OFF_MULTIPLIER as u64;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be:

const DOORBELL_BAR_SIZE: u64 = MAX_DOORBELLS * DOORBELL_OFF_MULTIPLIER as u64;

Comment on lines 438 to 450
if num_queues > (NOTIFICATION_SIZE / u64::from(NOTIFY_OFF_MULTIPLIER)) as usize {
return Err(VirtioPciDeviceError::CreateVirtioPciDevice(anyhow!(
"Got {} queues, but limit is {}",
num_queues,
NOTIFICATION_SIZE / u64::from(NOTIFY_OFF_MULTIPLIER)
)));
}
let num_doorbells = locked_device.doorbells_max();
if num_doorbells > MAX_DOORBELLS {
return Err(VirtioPciDeviceError::CreateVirtioPciDevice(anyhow!(
"Got {num_doorbells} doorbells, but limit is {MAX_DOORBELLS}"
)));
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following on the comment about num_queues, this check should be done earlier, and num_doorbells should be reused for calculating num_queues.

Comment on lines +14 to +19
fn ioeventfds<T>(
&self,
old_base_addr: u64,
new_base_addr: u64,
cb: &mut dyn FnMut(&EventFd, u64, u64) -> core::result::Result<(), T>,
) -> core::result::Result<(), T>;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a public trait and its semantics is getting more complex, can you please add a comment above to document its usage, particularly the cb argument?

Also a nit about the commit message for d863181, I believe you meant to say:

This also changes VirtioTransport::ioeventfds() to take ...

Comment on lines +907 to +924
let mut err = Ok(());
match self.device.lock().unwrap().ioeventfds(
old_base_addr,
new_base_addr,
&mut |eventfd, old_addr, new_addr| {
assert!(err.is_ok());
cb(eventfd, old_addr, new_addr).map_err(|e| {
err = Err(e);
PrivatelyConstructableError(PhantomData)
})
},
) {
Ok(()) => {
err.unwrap();
Ok(())
}
Err(PrivatelyConstructableError(PhantomData)) => Err(err.unwrap_err()),
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this works to resolve the error type mismatch problem.

Though, I'd suggest we make fn ioeventfds from both VirtioTransport and VirtioDevice return a concrete error type, say Anyhow::Result<>. Since both traits would share the same concrete error type, the transport impl can just forward cb directly to the device with ? — no side-channel, no PhantomData, no error smuggling.

All three call sites already discard type information (formatting into strings or wrapping into a single variant), so the generic buys nothing in practice.

Comment on lines +189 to +202
/// Sets the device's config address base. This is only necessary for
/// devices that need to know the guest physical address of the PCI BARs
/// they use. Therefore, it is only meaningful for PCIe devices.
/// Most devices won't implement this method.
///
/// One valid use of this method is to set the address at which the
/// device should register custom ioeventfds.
///
/// This method will be called *before* [`VirtioDevice::activate`] and
/// implementations must be prepared for this.
///
/// Calls after [`VirtioDevice::activate`] must not affect ioeventfd
/// registration. This is the job of [`VirtioDevice::ioeventfds`].
fn set_config_address_base(&mut self, _base: u64) {}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest we drop this (the last) commit from the series. set_config_address_base has no current consumer, while the default impl is empty and no device overrides it. Besides the comment, I can't meaningfully review if the call-site is correct. The remaining 6 commits are self-contained without it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants