Skip to content

Fix block device resize disk#7948

Merged
rbradford merged 2 commits intocloud-hypervisor:mainfrom
vincent-thomas:fix-block-device-resize-disk
Apr 17, 2026
Merged

Fix block device resize disk#7948
rbradford merged 2 commits intocloud-hypervisor:mainfrom
vincent-thomas:fix-block-device-resize-disk

Conversation

@vincent-thomas
Copy link
Copy Markdown
Contributor

@vincent-thomas vincent-thomas commented Apr 1, 2026

As it currently is, CH doesn't support resizing disks of VMs whose storage is backed by a host block device. This PR addresses this and prevents (but doesn't fix) a deadlock of the CH process (which in turn causes VM cpu to freeze) from happening when doing so.

Previous discussion and further details: #7923

Fixes #7923

@vincent-thomas vincent-thomas requested a review from a team as a code owner April 1, 2026 19:08
@vincent-thomas vincent-thomas force-pushed the fix-block-device-resize-disk branch from ecbc12e to 51a33af Compare April 1, 2026 19:22
Comment thread block/src/raw_async.rs Outdated

let nsectors = new_size / SECTOR_SIZE;

self.common.pause().map_err(Error::PauseVcpus)?;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we need this for proper synchronization during disk-resize?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, PauseVcpus is wrong here (a mistake I made). We should correct this and make it PauseVirtioThreads or similar.

Copy link
Copy Markdown
Contributor Author

@vincent-thomas vincent-thomas Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pause and resume actually caused a deadlock of the cloud-hypervisor process and froze the whole VM indefinitely.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not happen. Does it only deadlock when you use block devices? For me, using a raw file with the original code works perfectly fine.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not deadlock when using a block device on the host with "fast" I/O (for example losetup).
It does deadlock when using a host block device on the host with "slow", network-bound I/O (for example ceph; rbd mapped block device). Both scenarios happen 100% of the time when manually testing.

The removal of the pause/resume came from experimentation, which turned out to fix this issue and work well.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the VM was paused, and we can ensure that it says paused during the resize call, I would suggest that we only then skip pausing/resuming the virtio-queues. @phip1611 @vincent-thomas

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phip1611 you mix up vcpu pausing and virtqueu pausing ;)

Copy link
Copy Markdown
Contributor Author

@vincent-thomas vincent-thomas Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tpressure @phip1611 Have a look at my recently updated diff, It is changed to fix the root cause.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only then skip pausing/resuming the virtio-queues.

I agree - looks like @vincent-thomas did the right thing now. LGTM on a first glance

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll ask for another review since it sounds like this conversation has cleared up maintainers worries.

Comment thread block/src/raw_async.rs Outdated
Comment thread block/src/raw_async.rs Outdated
Comment thread block/src/raw_async.rs Outdated
@vincent-thomas vincent-thomas force-pushed the fix-block-device-resize-disk branch 3 times, most recently from bbc9d98 to 603b73b Compare April 8, 2026 10:40
@vincent-thomas vincent-thomas requested a review from phip1611 April 9, 2026 09:46
@phip1611
Copy link
Copy Markdown
Member

phip1611 commented Apr 9, 2026

please fix CI

Copy link
Copy Markdown
Member

@phip1611 phip1611 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes are good, thanks!

It looks to me that your second commit changes lines you introduced in the first commit. We value clean git histories.

Please bring your commits into clean structure, then we can move this forward.

Also see https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/CONTRIBUTING.md#patch-format--git-commit-hygiene

Comment thread block/src/raw_async.rs
@vincent-thomas vincent-thomas force-pushed the fix-block-device-resize-disk branch from 603b73b to 75256f5 Compare April 9, 2026 14:46
@vincent-thomas
Copy link
Copy Markdown
Contributor Author

Please bring your commits into clean structure

Sorry about that, should be okay now.

@vincent-thomas vincent-thomas requested a review from phip1611 April 9, 2026 16:00
@phip1611
Copy link
Copy Markdown
Member

phip1611 commented Apr 9, 2026

Please run cargo +nightly fmt --all to fix your formatting.

The commit message check can be ignored (optional CI step). You could properly fix it by using the scheme:

See [0].

[0] https://...

);
self.paused.store(true, Ordering::SeqCst);

// If already paused, return early to avoid deadlock waiting on barrier
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This situation occurs when the VMM thread holds a device mutex while
calling an operation that triggers pause(), and a vCPU thread
simultaneously needs that same mutex for MMIO access. With slow I/O
backends (like RBD/Ceph), the timing window for this race is larger,
making the deadlock more likely to occur.

Ah, wait. Doesn't this open another race window? Another thread may unpause this too early? Or are we protected by mutexes?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is another place where we need a three-way handshake with atomic

RUNNING -> PAUSING -> PAUSED

It might not strictly needed in this case since we don't really do anything after.

@vincent-thomas vincent-thomas force-pushed the fix-block-device-resize-disk branch 2 times, most recently from 1bd5f9d to 5bace4c Compare April 16, 2026 10:03
Copy link
Copy Markdown
Member

@rbradford rbradford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your race commit fix deserves its own PR as I think it is independent of the resize fix?

@vincent-thomas
Copy link
Copy Markdown
Contributor Author

Your race commit fix deserves its own PR as I think it is independent of the resize fix?

Yeah sure but if the resize fix lands and the race commit doesn't, then CH will deadlock on any resize-disk API call where:

  • VM has been paused before the resize disk call.
  • The VM is backed by a slow host block device.

@rbradford
Copy link
Copy Markdown
Member

rbradford commented Apr 16, 2026

The ordering of the commits in the PR is probably wrong then - but I asked you to split the race fix out as it's an obviously good fix.

Previously, calling pause() when already paused would wait on a barrier
for worker threads that were already parked, causing a deadlock.

This situation occurs when the VMM thread holds a device mutex while
calling an operation that triggers pause(), and a vCPU thread
simultaneously needs that same mutex for MMIO access. With slow I/O
backends (like RBD/Ceph), the timing window for this race is larger,
making the deadlock more likely to occur, see [0].

Make pause() idempotent by checking the paused state atomically and
returning early if already paused, avoiding the barrier wait.

[0] #7948 (comment)

Signed-off-by: Vincent Thomas <vincent@v-thomas.com>
@rbradford rbradford force-pushed the fix-block-device-resize-disk branch from 5bace4c to c2ec811 Compare April 16, 2026 17:16
@rbradford
Copy link
Copy Markdown
Member

I flipped the order of your commits.

@rbradford rbradford enabled auto-merge April 16, 2026 17:17
@rbradford rbradford requested a review from phip1611 April 16, 2026 17:17
@rbradford rbradford dismissed phip1611’s stale review April 16, 2026 17:18

PR was cleaned up

@rbradford rbradford disabled auto-merge April 16, 2026 17:21
Copy link
Copy Markdown
Member

@rbradford rbradford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need cargo fmt

);
self.paused.store(true, Ordering::SeqCst);

// If already paused, return early to avoid deadlock waiting on barrier
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is another place where we need a three-way handshake with atomic

RUNNING -> PAUSING -> PAUSED

It might not strictly needed in this case since we don't really do anything after.

Block devices (LVM volumes, loop devices, RBD, etc.) cannot be resized
via ftruncate - they are resized externally. When vm.resize-disk is
called for a block device backend, verify the device size matches the
requested size instead of attempting ftruncate.

This enables the resize-disk API to work with block device backends by
validating the externally-resized device matches the expected size.

Signed-off-by: Vincent Thomas <vincent@v-thomas.com>
@vincent-thomas vincent-thomas force-pushed the fix-block-device-resize-disk branch from c2ec811 to e91273d Compare April 17, 2026 07:01
@vincent-thomas
Copy link
Copy Markdown
Contributor Author

vincent-thomas commented Apr 17, 2026

Still need cargo fmt

Sorry about this, ran it a couple of times I thought.... Should be fixed now.

@rbradford rbradford enabled auto-merge April 17, 2026 07:09
@rbradford rbradford added this pull request to the merge queue Apr 17, 2026
github-merge-queue Bot pushed a commit that referenced this pull request Apr 17, 2026
Previously, calling pause() when already paused would wait on a barrier
for worker threads that were already parked, causing a deadlock.

This situation occurs when the VMM thread holds a device mutex while
calling an operation that triggers pause(), and a vCPU thread
simultaneously needs that same mutex for MMIO access. With slow I/O
backends (like RBD/Ceph), the timing window for this race is larger,
making the deadlock more likely to occur, see [0].

Make pause() idempotent by checking the paused state atomically and
returning early if already paused, avoiding the barrier wait.

[0] #7948 (comment)

Signed-off-by: Vincent Thomas <vincent@v-thomas.com>
Merged via the queue into cloud-hypervisor:main with commit fd8ded9 Apr 17, 2026
38 checks passed
@vincent-thomas vincent-thomas deleted the fix-block-device-resize-disk branch April 17, 2026 12:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

/vm.resize-disk missbehaves when VMs disk is backed by a host block device.

5 participants