Skip to content

/vm.resize-disk missbehaves when VMs disk is backed by a host block device. #7923

@vincent-thomas

Description

@vincent-thomas

Describe the bug
When using a host block device (e.g. /dev/sdX, /dev/loopX) as the backing storage for a vm's disk, calling /vm.resize-disk after the external host block device has been resized does not notify the guest of the new size.

Root cause
cloud-hypervisor treats the block device as a "raw file", which seems to be classified as an unstructured file on a filesystem. When cloud-hypervisor tries to resize said file, it runs ftruncate, from which the operation errors out. Since ftruncate fails, the rest of the notification code path is never reached and the guest remains unaware of the resize.

To Reproduce
Steps to reproduce the behaviour:

Have a linux kernel ready.

Run this script (expand to see).
#!/bin/bash
# Reproduces: vm.resize-disk fails on block devices (ftruncate returns EINVAL)
set -eu

: "${KERNEL:?Set KERNEL=/path/to/vmlinux}"
CH="${CH_BINARY:-./target/release/cloud-hypervisor}"
SOCK="/tmp/ch-blk-resize-test.sock"
IMG="/tmp/ch-blk-resize-test.img"

cleanup() { kill $PID 2>/dev/null; losetup -d $LOOP 2>/dev/null; rm -f $IMG $SOCK; }
trap cleanup EXIT

# Create 100MB block device
truncate -s 100M $IMG
LOOP=$(losetup --find --show $IMG)

# Start VM
$CH --api-socket $SOCK --kernel $KERNEL --disk path=$LOOP,id=blk \
    --cpus boot=1 --memory size=512M --console off --serial null &
PID=$!
while [ ! -S $SOCK ]; do sleep 0.1; done
sleep 1

# Resize block device on host to 200MB
truncate -s 200M $IMG
losetup -c $LOOP

# Try to notify guest - this fails with EINVAL
echo "Host size after resize: $(blockdev --getsize64 $LOOP) bytes"
echo "Calling vm.resize-disk API..."
curl -s --unix-socket $SOCK -X PUT -H "Content-Type: application/json" \
    -d '{"id":"blk","desired_size":209715200}' \
    http://localhost/api/v1/vm.resize-disk
echo

Example run:

$ sudo KERNEL=<path-to-kernel>.bin ./scripts/reproduce-block-device-resize-bug.sh
cloud-hypervisor:   0.002727s: <vmm> WARN:vmm/src/device_manager.rs:2679 -- No image_type specified - detected as raw. Configuration updated to persist type across reboots and migrations.
cloud-hypervisor:   0.002756s: <vmm> WARN:vmm/src/device_manager.rs:2685 -- Autodetected raw image type. Disabling sector 0 writes.
Host size after resize: 209715200 bytes
Calling vm.resize-disk API...
["Error from API","The disk could not be resized","Error from device manager","Disk resize error","Disk resize failed","I/O error","Resize failed","Invalid argument (os error 22)"]

Version
v51.0.0

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions