Serial/console socket connection should persist across VM reboot #7974

@maxpain

Summary

When a VM reboots (via API vm.reboot or guest-initiated), Cloud Hypervisor tears down and recreates the entire serial/console subsystem — including closing the accepted Unix socket connection, deleting the socket file, and re-binding a new listener.

This means any connected serial console client gets an EOF and must reconnect. Boot messages from the rebooted VM are lost in the gap between the old socket closing and the client reconnecting to the new one.

Current behavior

In vmm/src/lib.rs, vm_reboot():

  1. Calls vm.shutdown() → old Vm is dropped → DeviceManager dropped → SerialManager::drop() fires
  2. SerialManager::drop() kills the epoll thread, closes the UnixListener, and deletes the socket file
  3. pre_create_console_devices() is called again → new UnixListener::bind() creates a fresh socket
  4. New SerialManager is spawned, waiting for a new client connection

From step 2 until a client reconnects to the new socket, any serial output from the VM (BIOS, bootloader, early kernel) is lost because no client is connected to receive it.
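The host-side effect of this sequence can be reproduced with plain Unix sockets, independent of Cloud Hypervisor. The following is a minimal sketch, not the actual vmm code: the function name `simulate_reboot_teardown` and the socket path are hypothetical, and it only mimics steps 2 and 3 (close the accepted connection and listener, delete the socket file, bind again) to show what a connected client observes.

```rust
use std::fs;
use std::io::Read;
use std::os::unix::net::{UnixListener, UnixStream};
use std::path::Path;

/// Hypothetical helper: mimics the host-side socket teardown and re-bind.
/// Returns (bytes the old client read after teardown, whether a fresh
/// connection to the re-bound socket succeeded).
fn simulate_reboot_teardown(socket_path: &Path) -> std::io::Result<(usize, bool)> {
    // Start from a clean state in case a stale socket file exists.
    let _ = fs::remove_file(socket_path);

    // Initial boot: bind the listener; a console client connects and is accepted.
    let listener = UnixListener::bind(socket_path)?;
    let mut old_client = UnixStream::connect(socket_path)?;
    let (accepted, _addr) = listener.accept()?;

    // Step 2: the accepted connection and the listener are closed,
    // and the socket file is deleted.
    drop(accepted);
    drop(listener);
    fs::remove_file(socket_path)?;

    // Step 3: a fresh listener is bound at the same path.
    let new_listener = UnixListener::bind(socket_path)?;

    // The old client is still attached to the closed connection:
    // read() returns 0, i.e. EOF.
    let mut buf = [0u8; 16];
    let bytes_read = old_client.read(&mut buf)?;

    // Only a brand-new connection reaches the new listener.
    let reconnected = UnixStream::connect(socket_path).is_ok();

    drop(new_listener);
    fs::remove_file(socket_path)?;
    Ok((bytes_read, reconnected))
}
```

Calling this with a path in a temporary directory should show the old client reading zero bytes while a new connection succeeds, which is the reconnect burden described above.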

Expected behavior

The serial/console chardev socket connection should persist across VM reboot, similar to how QEMU handles it:

  • QEMU's system_reset command only resets UART registers (serial_reset() in hw/char/serial.c); the backing chardev socket is untouched
  • The connected client sees a brief pause, then receives boot output from the rebooted VM
  • No reconnection needed, no output lost

This is important for:

  • Debugging boot issues: if a VM fails to boot after reboot, the serial output is the only way to diagnose the failure, but it is lost in the reconnection gap.
  • Serial console proxies: systems like OpenStack nova-serialproxy rely on the connection persisting across reboots. Cloud Hypervisor's current behavior requires additional reconnect logic in every consumer.
  • Audit/logging: continuous serial log capture is broken by the gap.

Proposed solution

During vm_reboot(), preserve the SerialManager (and its accepted client connections) instead of dropping and recreating it. Reset only the UART device state, not the host-side socket infrastructure.

This would align with QEMU's approach where the chardev layer is decoupled from device reset.
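To illustrate the shape of this decoupling, here is a minimal sketch under stated assumptions. The types `UartState` and `Serial` are hypothetical and heavily simplified; they are not Cloud Hypervisor's actual SerialManager or serial device types. The point is only that `reset()` touches the guest-visible register state and leaves the host-side backend (standing in for the accepted socket connection) alone.

```rust
use std::io::{self, Write};

/// Hypothetical, heavily simplified guest-visible UART register state.
#[derive(Debug, Default, PartialEq)]
struct UartState {
    interrupt_enable: u8,
    line_control: u8,
    scratch: u8,
}

/// Hypothetical serial device: register state plus a host-side backend.
/// `W` stands in for the accepted client connection (e.g. a UnixStream).
struct Serial<W: Write> {
    state: UartState,
    backend: W,
}

impl<W: Write> Serial<W> {
    fn new(backend: W) -> Self {
        Serial { state: UartState::default(), backend }
    }

    /// Guest wrote a byte to the transmit register: forward it to the host side.
    fn guest_write(&mut self, byte: u8) -> io::Result<()> {
        self.backend.write_all(&[byte])
    }

    /// Device reset on VM reboot: only the register state is cleared.
    /// The backend is deliberately left untouched, so a connected
    /// client keeps its connection.
    fn reset(&mut self) {
        self.state = UartState::default();
    }
}
```

With a `Vec<u8>` as the backend, writing bytes, calling `reset()`, and writing more bytes leaves one continuous output stream, which is the behavior a connected console client would see if the connection survived the reboot.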

Want to contribute?

  • I would like to work on this issue.
