Vhost-guest (was virtio-vhost-user) device #7695
DemiMarie wants to merge 16 commits into cloud-hypervisor:main
Conversation
Force-pushed from db7e861 to 187e228.
I’m now running into a serious bottleneck. The version of the virtio vhost-user device backend spec I am implementing requires support for additional interrupts beyond what virtio itself needs. Specifically, a vhost-user device backend (VDB) device will only have one queue, but implementing a device with N queues needs N MSI-X interrupts, N externally-provided irqfds, and 2N + 1 externally-provided ioeventfds. This is unique to the VDB device and is a result of VDB being optimized for performance: it aims to keep the userspace VMM out of interrupt handling at runtime. I don’t see a way to avoid this without a substantial performance loss. For the desktop workloads @alyssais and I care about, that might be acceptable, but I doubt cloud workloads would tolerate it.

How should these be wired up to the rest of Cloud Hypervisor? Should I resort to slow userspace emulation? I could handle interrupts in-band via control messages on the admin queue, which would work but would not be performant. That’s fine for a proof of concept, but fixing it later is a backwards-incompatible change (from the backend VM’s perspective), so I want to fix it before writing and submitting the kernel driver.

Once that is done, the rest of the protocol should hopefully be straightforward to implement. Most messages require no involvement from Cloud Hypervisor beyond checking that their message numbers are known and the framing is correct. Protocol feature negotiation does need Cloud Hypervisor involvement, but only to ensure that neither the frontend nor the backend generates messages with unknown types.
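The per-queue resource cost described above can be summarized in a small helper. This is purely illustrative (the function name is hypothetical, not part of any real API):

```rust
/// Hypothetical helper summarizing the resource cost described above:
/// a VDB device implementing an inner device with `n` queues needs
/// n MSI-X vectors, n externally-provided irqfds, and 2n + 1
/// externally-provided ioeventfds.
fn vdb_resource_cost(n: usize) -> (usize, usize, usize) {
    let msix_vectors = n;       // one MSI-X interrupt per inner-device queue
    let irqfds = n;             // externally provided, one per queue
    let ioeventfds = 2 * n + 1; // two per queue, plus one extra
    (msix_vectors, irqfds, ioeventfds)
}

fn main() {
    // A backend implementing a 4-queue device:
    assert_eq!(vdb_resource_cost(4), (4, 4, 9));
}
```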
Force-pushed from 187e228 to 291fa79.
Force-pushed from cf2d063 to 63d6c83.
Force-pushed from 63d6c83 to bc2de9d.
What's the status here? The PR description is empty.
Force-pushed from 995d721 to 7195efe.
This device is still incomplete. It has not been tested at all, which is not great; it's not even in a state where it can be tested. For instance, reading configuration space hits a

What is largely implemented is processing of incoming and outgoing vhost-user messages. The existing vhost-user crate makes too many assumptions that don't hold for this code, so this device reimplements parsing itself. The result is that it is quite complex. It also uses core code that is currently unused and which I implemented specifically for this device; some of it (#7841) is not even merged yet.
Force-pushed from 7195efe to e3a5270.
If there is no intention to land this code in this PR, it would be better to just reference a branch from an issue. Having too many open PRs creates noise and distraction.
I do intend to land this code. It’s not ready yet, though, which is why it is marked as draft. At this point, I expect that most work will be fixing bugs, adding the device to the rest of the system, and testing it. In particular, I’d like feedback on the code structure, as I don’t intend to change that unless required. |
Force-pushed from 22e3be5 to 7e4c22a.
Force-pushed from 668df93 to a864563.
It might help if the PR had a proper title and description! |
virtio-vhost-user requires PCI doorbells, used to notify the frontend of events. Add support for them. Each doorbell requires an MSI-X interrupt, so account for this. The specification for this capability is <https://stefanha.github.io/virtio/vhost-user-slave.html>. No functional change intended, as this code is not yet used. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
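For orientation, a sketch of what such a capability layout looks like. The generic header follows the virtio PCI capability layout, and the doorbell structure mirrors virtio's notification capability with an offset multiplier; field names here are approximations and may not match the draft spec or the PR's code exactly:

```rust
// Sketch of a virtio PCI capability plus a doorbell capability.
// Approximate layout, for illustration only.
#[repr(C, packed)]
struct VirtioPciCap {
    cap_vndr: u8,      // PCI vendor-specific capability ID
    cap_next: u8,      // offset of the next capability
    cap_len: u8,       // length of this capability
    cfg_type: u8,      // identifies the structure (doorbell, notification, ...)
    bar: u8,           // which BAR holds the structure
    padding: [u8; 3],
    offset: u32,       // offset within the BAR (little-endian)
    length: u32,       // length of the structure (little-endian)
}

#[repr(C, packed)]
struct VirtioPciDoorbellCap {
    cap: VirtioPciCap,
    // Doorbell for queue i lives at cap.offset + i * doorbell_off_multiplier.
    doorbell_off_multiplier: u32,
}

fn main() {
    assert_eq!(core::mem::size_of::<VirtioPciCap>(), 16);
    assert_eq!(core::mem::size_of::<VirtioPciDoorbellCap>(), 20);
}
```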
Virtio-vhost-user devices need one interrupt per queue, plus one for configuration changes, plus one interrupt per queue of the *device they are implementing*. Allow devices to override the number of interrupts they need instead of assuming it is always one more than the number of queues. No functional change for existing devices, which use a default implementation that returns the number of queues. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
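The override described above might look like the following trait sketch. All names here are hypothetical, not Cloud Hypervisor's actual API:

```rust
// Illustrative sketch: devices report their own interrupt count instead of
// the transport assuming it from the queue count.
trait DeviceInterrupts {
    fn queue_count(&self) -> usize;

    /// Default used by existing devices: one interrupt per queue
    /// (the config-change vector is accounted for by the transport).
    fn num_interrupts(&self) -> usize {
        self.queue_count()
    }
}

struct VhostGuest {
    own_queues: usize,   // queues of the vhost-guest device itself
    inner_queues: usize, // queues of the device it is implementing
}

impl DeviceInterrupts for VhostGuest {
    fn queue_count(&self) -> usize {
        self.own_queues
    }

    // One per own queue, one for config changes, one per inner-device queue.
    fn num_interrupts(&self) -> usize {
        self.own_queues + 1 + self.inner_queues
    }
}

fn main() {
    let dev = VhostGuest { own_queues: 4, inner_queues: 2 };
    assert_eq!(dev.num_interrupts(), 7);
}
```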
It will always be 0. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
No functional change intended. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Force-pushed from 787c6c2 to 95ee5be.
The code will soon need to iterate over ioeventfds provided by the inner virtio device. Accessing that device requires a lock, which works fine with internal iteration but breaks with external iteration. While an Arc could be used to avoid the need for a callback, implementations of this method will need to grab a device-internal lock and invoke the callback with the lock held. Returning a MutexGuard would constrain the device implementation unnecessarily.

This also changes VirtioTransport::ioeventfds() to take *two* addresses, not one, and the callback likewise takes two addresses. This is because relocating a BAR must happen atomically with respect to the device registering and unregistering ioeventfds. If the device manager used one call to unregister ioeventfds and a second call to reregister them, a device implementation could register ioeventfds between the two calls; such ioeventfds would not be unregistered, potentially causing errors during reregistration. Using a single call for both unregistration and reregistration allows the device implementation to hold a lock across both operations.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
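A minimal sketch of the internal-iteration shape described above (the struct and method names are invented for illustration): a single callback receives both the old and the new address of each ioeventfd, so unregistration and reregistration happen within one invocation while the device holds its internal lock.

```rust
use std::sync::Mutex;

// Hypothetical device holding the BAR-relative offsets of its ioeventfds.
struct Device {
    ioeventfd_offsets: Mutex<Vec<u64>>,
}

impl Device {
    // Internal iteration: the lock is held across the whole loop, so no
    // ioeventfd can be registered between "unregister at old address" and
    // "register at new address".
    fn for_each_ioeventfd(&self, old_base: u64, new_base: u64, f: &mut dyn FnMut(u64, u64)) {
        let offsets = self.ioeventfd_offsets.lock().unwrap();
        for off in offsets.iter() {
            f(old_base + off, new_base + off);
        }
    }
}

fn main() {
    let dev = Device {
        ioeventfd_offsets: Mutex::new(vec![0x0, 0x4]),
    };
    let mut moves = Vec::new();
    dev.for_each_ioeventfd(0x1000, 0x2000, &mut |old, new| moves.push((old, new)));
    assert_eq!(moves, vec![(0x1000, 0x2000), (0x1004, 0x2004)]);
}
```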
The device manager calls VirtioTransport::ioeventfds whenever it needs to unregister or register ioeventfds. This function invokes a callback with each virtqueue ioeventfd, its old base address, and its new base address. Also invoke the callback with ioeventfds provided by the device itself. No functional change as no current device provides any. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Devices that register their own ioeventfds need to know the address at which to register them. This cannot be a callback on PciDevice, because PciDevice contains the VirtioDevice. Calls to the callback after activation are ignored; such calls are always preceded by a call to VirtioPciDevice::ioeventfds(), which already provides the new address. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Map memory from externally-provided FDs into the guest address space. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Virtio-vhost-user uses four virtqueues, grouped into two pairs. In each pair, one queue is used for requests and the other for replies. Each pair is also associated with an AF_UNIX stream socket used for communication with the frontend. Implement a queue pair abstraction that handles message framing and communication with the VM. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
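For context, a hedged sketch of the message framing such a queue pair must handle. The 12-byte header layout (request, flags, size) follows the vhost-user protocol specification; the parsing helper itself is illustrative, not the PR's actual code:

```rust
// Minimal vhost-user message header per the protocol spec:
// three little-endian u32 fields, followed by `size` payload bytes.
#[derive(Debug, PartialEq)]
struct Header {
    request: u32, // message type number
    flags: u32,   // version and reply bits
    size: u32,    // payload bytes following the header
}

fn parse_header(buf: &[u8]) -> Option<Header> {
    if buf.len() < 12 {
        return None; // incomplete frame
    }
    let le32 = |b: &[u8]| u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
    Some(Header {
        request: le32(&buf[0..4]),
        flags: le32(&buf[4..8]),
        size: le32(&buf[8..12]),
    })
}

fn main() {
    let mut frame = Vec::new();
    frame.extend_from_slice(&1u32.to_le_bytes());   // request
    frame.extend_from_slice(&0x1u32.to_le_bytes()); // flags: version 1
    frame.extend_from_slice(&8u32.to_le_bytes());   // payload size
    let hdr = parse_header(&frame).unwrap();
    assert_eq!(hdr.request, 1);
    assert_eq!(hdr.size, 8);
}
```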
These are significantly simpler than frontend->backend requests. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Implement frontend-to-backend requests and the corresponding backend-to-frontend replies. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
This is a partial implementation of a virtio vhost-user device backend. There is no plan to implement migration for this device: in most use cases, the VM using it is tied to physical hardware, and migrating it makes no sense. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
This removes dependencies on the rest of Cloud Hypervisor in preparation for making it a standalone crate. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
This allows it to be used outside of Cloud Hypervisor. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
This code is now used. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
The virtio-vhost-user crate already validates eventfds, so the device model in Cloud Hypervisor doesn't need to unsafely assume that using a non-eventfd as an eventfd is safe. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Force-pushed from 95ee5be to be092a6.
Implement a vhost-guest device, formerly known as virtio-vhost-user. A vhost-guest device is a vhost-user server that forwards all messages to the guest.
Vhost-guest requires additional features not used by other virtio devices; therefore, this PR is based on top of #7841. It is also very complex, as shown by its rather large size. To mitigate the complexity, most of the code is in a separate crate that does not depend on Cloud Hypervisor internals. This will allow it to be used by other VMMs (such as crosvm) and, more importantly, will allow it to be unit tested without having to create a VM. I expect the number of unit tests needed to be significant.
A full-featured vhost-guest device also needs a fix to #7839. Securely hot-unplugging a vhost-guest device requires a fix to #7955. While the current code is not yet testable, I hope to have a testable version fairly soon after fixing #7955. It will require changes to Cloud Hypervisor core code to implement notifications.
Right now, the only vhost-guest driver I am aware of is an old version of crosvm. Crosvm uses VFIO to drive the device, which requires allow_unsafe_interrupts. I plan to upstream a Linux kernel driver that will provide a vhost-msg compatible API. There is corresponding work in QEMU to implement support for using such a driver, which will allow any QEMU-supported device to be used.

This implementation is optimized for performance at the expense of simplicity. Frontend-provided vring call and error eventfds are registered with the ioeventfd mechanism, and frontend-provided vring kick eventfds are registered as irqfds. The goal is to keep Cloud Hypervisor out of all hot paths. For instance, this might allow a future version of DPDK and/or VPP to provide high-performance network services to other VMs.
This implementation is designed to be secure against both malicious guests and malicious vhost-user clients. I assumed that vhost-user clients are likely in separate sandboxes from Cloud Hypervisor, and therefore cannot compromise it directly without a separate vulnerability. I consider it to be a security vulnerability in this code if a malicious guest or malicious vhost-user client can compromise it, even if the two must coordinate with each other to perform the attack.
This implementation uses a substantial amount of unsafe code. Some of it is taken from the vhost-user crate, which is not designed for non-AF_UNIX transports. I have encapsulated this code to the extent possible, but very careful review would be greatly appreciated.
The protocol library (under virtio-vhost-user/) is ready to review. I expect the main future change to be the addition of lots of unit tests. The device implementation (in virtio-devices/src/virtio_vhost_user.rs) is incomplete and not hooked up to the rest of Cloud Hypervisor. While review would be appreciated, it is less important at this time.