KVM: allow importVm to adopt an existing RBD (Ceph) root volume #13365

Open
andrijapanicsb wants to merge 2 commits into apache:4.22 from andrijapanicsb:kvm-importvm-rbd-root

Conversation

Contributor

@andrijapanicsb andrijapanicsb commented Jun 6, 2026

Description

This PR allows KVM volume import on RBD (Ceph) primary storage to correctly adopt and record existing volumes. It covers two related flows that share the same code paths:

A) importVm (importsource=shared, KVM) adopting an existing ROOT volume on RBD — previously failed with Disk not found or is invalid (jobresultcode 530).

B) importVolume (managing a previously-unmanaged volume) on RBD — previously succeeded but recorded the wrong volume format (QCOW2 instead of RAW).

Two things were responsible:

  1. Agent side (flow A): LibvirtCheckVolumeCommandWrapper (which backs the CheckVolumeCommand issued during import) only whitelisted file-based pools (Filesystem, NetworkFilesystem, SharedMountPoint) and inspected the volume with direct file reads (checkQcow2File / getVirtualSizeFromFile). Those do not work on RBD, so the command returned "Unsupported Storage Pool", the server-side instanceof CheckVolumeAnswer check failed, and the import was rolled back with "Disk not found or is invalid".

    RBD is now supported by inspecting the volume through the RBD URI via qemu-img — the same approach already used by LibvirtGetVolumesOnStorageCommandWrapper, which backs listVolumesForImport and already works on RBD. For raw RBD images the QCOW2 validation is skipped, the virtual size comes from the pool-reported disk, and the encrypted / backing-file / locked details are still collected so the existing server-side import validations keep working.

  2. Server / engine side (flows A and B): VolumeOrchestrator.importVolume() and updateImportedVolume() hardcoded the KVM volume format to QCOW2. RBD-backed volumes are RAW. The format is now derived from the storage pool type (RBD → RAW) via a single shared helper, applied at both import call sites. Because the standalone importVolume API (VolumeImportUnmanageManagerImpl → volumeManager.importVolume(...)) goes through the same method, managing a previously-unmanaged RBD volume now also records it as RAW.
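
To illustrate the agent-side change in item 1, here is a minimal sketch of how an RBD volume could be probed through its URI with qemu-img instead of direct file reads. It is not the code in this PR: the class name, method names, and sample pool/image/monitor values are hypothetical, and authentication options (such as the cephx key) are left out on purpose. Only the `qemu-img info`, `-U` (force-share) and `--output=json` pieces are real qemu-img options.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the CloudStack implementation.
final class RbdProbeSketch {

    /**
     * Builds a qemu-style RBD URI: rbd:pool/image:mon_host=host\:port:id=user.
     * qemu separates options with ':', so the colon in host:port is escaped.
     * Credentials are intentionally omitted from this sketch.
     */
    static String rbdUri(String pool, String image, String monHost, int monPort, String authUser) {
        return "rbd:" + pool + "/" + image
                + ":mon_host=" + monHost + "\\:" + monPort
                + ":id=" + authUser;
    }

    /** Assembles the qemu-img invocation used to inspect the image. */
    static List<String> qemuImgInfoCommand(String uri) {
        List<String> cmd = new ArrayList<>();
        cmd.add("qemu-img");
        cmd.add("info");
        cmd.add("-U");             // force-share: inspect even if a running VM holds the image
        cmd.add("--output=json");  // machine-readable output for detail extraction
        cmd.add(uri);
        return cmd;
    }

    public static void main(String[] args) {
        // All values below are made-up examples.
        String uri = rbdUri("cloudstack", "vm-root-disk", "10.0.0.1", 6789, "cloudstack");
        System.out.println(String.join(" ", qemuImgInfoCommand(uri)));
    }
}
```

The command is only assembled here, not executed, so the sketch runs without a Ceph cluster. In a real agent the JSON output would be parsed for the encrypted, backing-file and locked details mentioned above.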

No new server-side guardrails were required: importKVMSharedDisk is already pool-type agnostic and passes the pool type through.
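
The format derivation described in item 2 can be sketched as a small pure function. This is an illustration under assumptions, not the helper added by the PR: the enums are simplified stand-ins for CloudStack's hypervisor, pool-type and image-format types, the method names are hypothetical, and the non-KVM defaults (OVA, VHD) are shown only to make the "delegate to the existing per-hypervisor mapping" branch concrete.

```java
// Hypothetical sketch, not the helper from this PR.
final class ImportFormatSketch {

    // Simplified stand-ins for the real CloudStack enums.
    enum HypervisorType { KVM, VMWARE, XENSERVER }
    enum StoragePoolType { FILESYSTEM, NETWORK_FILESYSTEM, SHARED_MOUNT_POINT, RBD }
    enum ImageFormat { QCOW2, RAW, OVA, VHD }

    /** Per-hypervisor default, used when the pool type does not change the answer. */
    static ImageFormat defaultFormatFor(HypervisorType hv) {
        switch (hv) {
            case KVM:       return ImageFormat.QCOW2;
            case VMWARE:    return ImageFormat.OVA;   // illustrative assumption
            case XENSERVER: return ImageFormat.VHD;   // illustrative assumption
            default: throw new IllegalArgumentException("Unknown hypervisor: " + hv);
        }
    }

    /** KVM on RBD is RAW; everything else falls back to the per-hypervisor default. */
    static ImageFormat importedVolumeFormat(HypervisorType hv, StoragePoolType pool) {
        if (hv == HypervisorType.KVM && pool == StoragePoolType.RBD) {
            return ImageFormat.RAW;
        }
        return defaultFormatFor(hv);
    }

    public static void main(String[] args) {
        System.out.println(importedVolumeFormat(HypervisorType.KVM, StoragePoolType.RBD));
        System.out.println(importedVolumeFormat(HypervisorType.KVM, StoragePoolType.NETWORK_FILESYSTEM));
        System.out.println(importedVolumeFormat(HypervisorType.VMWARE, StoragePoolType.RBD));
    }
}
```

Keeping the rule in one function is what lets both import call sites agree on the recorded format, which is the point of the shared helper.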

Upgrading existing (already-imported) volumes

This is a fix at import time; it does not retroactively rewrite volumes that were imported/managed before the fix and recorded as QCOW2. RBD/Ceph stores raw block images, so any volume on an RBD pool recorded as QCOW2 is a metadata error and can be corrected with:

UPDATE `cloud`.`volumes` v
JOIN `cloud`.`storage_pool` sp ON v.pool_id = sp.id
SET v.format = 'RAW'
WHERE sp.pool_type = 'RBD'
  AND v.format = 'QCOW2'
  AND v.removed IS NULL;

This is provided as an operator step rather than an automatic DB migration, because the 4.22 branch does not yet have a 4.22.1.0 → 4.22.2.0 schema-upgrade path scaffolded (the chain currently runs a no-op for that step). It can be promoted to an automatic migration if preferred.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • Build/CI
  • Test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

N/A

How Has This Been Tested?

Unit tests (new + extended), all passing locally on the 4.22 branch (JDK 17, Maven 3.9.9):

  • LibvirtCheckVolumeCommandWrapperTest (new, 5 tests) — RBD success path (virtual size + collected details), verification that qemu-img is pointed at the RBD URI, locked-volume detail propagation, invalid/missing volume, and unsupported pool type.
  • VolumeOrchestratorTest (3 added) — KVM+RBD → RAW, KVM+non-RBD → QCOW2, non-KVM ignores the pool type.
  • VolumeImportUnmanageManagerImplTest (1 added) — the importVolume (manage-volume) flow forwards the RBD pool type and KVM hypervisor to the orchestrator (i.e. flow B is wired to the RAW-format fix).

The tests were run with:

mvn -pl plugins/hypervisors/kvm test -Dtest='LibvirtCheckVolumeCommandWrapperTest'
mvn -pl engine/orchestration test -Dtest='VolumeOrchestratorTest'
mvn -pl server test -Dtest='VolumeImportUnmanageManagerImplTest'

To be verified on a KVM + Ceph/RBD environment: (A) importVm with importsource=shared adopting an existing ROOT volume on an RBD pool now succeeds and the resulting volume is recorded with format RAW; (B) importVolume of an unmanaged RBD volume records it as RAW.

How did you try to break this feature and the system with this change?

  • Non-RBD pools (Filesystem / NetworkFilesystem / SharedMountPoint) keep their existing behaviour (QCOW2 validation and QCOW2 format unchanged).
  • Unsupported pool types still return Unsupported Storage Pool.
  • RBD volumes that are locked, encrypted, or have a backing file are still rejected by the existing server-side checkVolume validations (details are collected via the force-share qemu-img -U probe).
  • Non-KVM hypervisors are unaffected: the format helper only maps RBD → RAW for KVM, otherwise it delegates to the existing per-hypervisor mapping.
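
As a rough illustration of the rejection cases in the list above, the following sketch shows how collected volume details could drive a server-side guard. The `Detail` keys, the class name, and the rejection messages are hypothetical placeholders and are not taken from the PR or from CloudStack's actual validation code.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch, not the existing checkVolume validation.
final class ImportGuardSketch {

    // Placeholder detail keys, assumed to be filled in by the agent-side probe.
    enum Detail { BACKING_FILE, IS_LOCKED, IS_ENCRYPTED }

    /** Returns a rejection reason, or empty if the volume may be imported. */
    static Optional<String> rejectionReason(Map<Detail, String> details) {
        // Boolean.parseBoolean(null) is false, so a missing key means "not set".
        if (Boolean.parseBoolean(details.get(Detail.IS_LOCKED))) {
            return Optional.of("volume is locked");
        }
        if (Boolean.parseBoolean(details.get(Detail.IS_ENCRYPTED))) {
            return Optional.of("volume is encrypted");
        }
        String backingFile = details.get(Detail.BACKING_FILE);
        if (backingFile != null && !backingFile.isEmpty()) {
            return Optional.of("volume has a backing file");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<Detail, String> details = new EnumMap<>(Detail.class);
        System.out.println(rejectionReason(details).orElse("importable"));
        details.put(Detail.IS_LOCKED, "true");
        System.out.println(rejectionReason(details).orElse("importable"));
    }
}
```

Because the guard only reads the collected details, it behaves the same whether they came from a file-based pool or from the RBD probe, which is why no pool-specific server-side checks were needed.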

importVm with importsource=shared on KVM failed to adopt an existing ROOT
volume located on an RBD (Ceph) primary storage pool, returning
"Disk not found or is invalid" (jobresultcode 530).

Two changes fix this:

1. LibvirtCheckVolumeCommandWrapper only whitelisted file-based pools
   (Filesystem, NetworkFilesystem, SharedMountPoint) and inspected the volume
   with direct file reads (checkQcow2File / getVirtualSizeFromFile), which do
   not work on RBD. RBD is now supported: the volume is inspected through the
   RBD URI via qemu-img (the same approach already used by
   LibvirtGetVolumesOnStorageCommandWrapper, which backs listVolumesForImport),
   the QCOW2 check is skipped for raw RBD images, the virtual size is taken from
   the pool-reported disk, and the encrypted/backing-file/locked details are
   still collected so the management server import validations keep working.

2. VolumeOrchestrator.importVolume() and updateImportedVolume() hardcoded the
   KVM volume format to QCOW2. RBD-backed volumes are raw, so the format is now
   derived from the storage pool type (RBD -> RAW) through a single shared
   helper, applied at both import call sites.

Unit tests: new LibvirtCheckVolumeCommandWrapperTest and additional
VolumeOrchestratorTest cases for the format helper.

codecov Bot commented Jun 6, 2026

Codecov Report

❌ Patch coverage is 83.33333% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.68%. Comparing base (21b2025) to head (8827182).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...urce/wrapper/LibvirtCheckVolumeCommandWrapper.java | 81.81% | 0 Missing and 2 partials ⚠️ |
| ...stack/engine/orchestration/VolumeOrchestrator.java | 85.71% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##               4.22   #13365      +/-   ##
============================================
+ Coverage     17.67%   17.68%   +0.01%     
- Complexity    15792    15808      +16     
============================================
  Files          5922     5922              
  Lines        533165   533181      +16     
  Branches      65208    65211       +3     
============================================
+ Hits          94242    94309      +67     
+ Misses       428276   428221      -55     
- Partials      10647    10651       +4     
| Flag | Coverage Δ |
|---|---|
| uitests | 3.69% <ø> (ø) |
| unittests | 18.76% <83.33%> (+0.01%) ⬆️ |


importVolume (managing a previously-unmanaged volume) routes through
VolumeOrchestrator.importVolume() and therefore also benefits from the
pool-type-aware format fix (RBD -> RAW). Add a VolumeImportUnmanageManagerImplTest
case asserting the RBD pool type and KVM hypervisor are forwarded to the
orchestrator.