KVM: fix UEFI disk-only instance snapshot NVRAM handling#13020
KVM: fix UEFI disk-only instance snapshot NVRAM handling#13020Kunalbehbud wants to merge 4 commits intoapache:4.22from
Conversation
|
@blueorangutan package |
|
@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## 4.22 #13020 +/- ##
============================================
+ Coverage 17.62% 17.67% +0.04%
- Complexity 15723 15780 +57
============================================
Files 5921 5921
Lines 532841 533245 +404
Branches 65191 65230 +39
============================================
+ Hits 93914 94228 +314
- Misses 428310 428345 +35
- Partials 10617 10672 +55
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17485 |
|
This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch. |
Add the command payload, answer metadata, and host capability plumbing required for KVM disk-only instance snapshots to carry UEFI NVRAM state between management and the KVM agent. Also synchronize host capability booleans on reconnect so stale UEFI/NVRAM support details are removed when an older agent reconnects.
Copy the active UEFI NVRAM file as a sidecar during disk-only instance snapshot creation, restore it on revert, and clean it up during delete and merge flows. Also tighten host capability checks, preserve successful snapshot metadata when post-snapshot thaw or resume fails, and reject reverting UEFI disk-only snapshots that do not contain NVRAM state.
Cover host capability synchronization, UEFI NVRAM sidecar handling across create/revert/delete flows, and the running-VM recovery paths for disk-only instance snapshots.
f657823 to
2bc9051
Compare
|
Rebased this branch on the latest Local verification passed again on top of the updated base: The new GitHub Actions runs for this fork push are currently in |
There was a problem hiding this comment.
Pull request overview
Fixes KVM disk-only instance snapshots for UEFI VMs by capturing/restoring the active NVRAM state via an NVRAM “sidecar” file, and gating these flows on explicit host capabilities to prevent unsafe reverts.
Changes:
- Plumbs UEFI/NVRAM metadata through management ↔ agent commands (create/revert/delete) and persists NVRAM sidecar path in snapshot details.
- Implements NVRAM sidecar copy on snapshot creation and atomic restore on revert; adds sidecar cleanup on delete/merge flows.
- Adds host capability advertising + reconnect-time capability syncing/cleanup, plus targeted unit tests for the new behavior.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| plugins/hypervisors/kvm/src/test/java/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtDiskOnlyVMSnapshotCommandWrapperTest.java | New unit tests covering NVRAM sidecar backup/restore/cleanup and freeze/suspend sequencing/warnings. |
| plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtCreateDiskOnlyVMSnapshotCommandWrapper.java | Copies active NVRAM into a per-snapshot sidecar; adds freeze/suspend + post-snapshot recovery warnings. |
| plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtRevertDiskOnlyVMSnapshotCommandWrapper.java | Validates presence of NVRAM sidecar for UEFI reverts and atomically restores active NVRAM from the sidecar. |
| plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtDeleteDiskOnlyVMSnapshotCommandWrapper.java | Deletes NVRAM sidecar when present, using either provided primary datastore or root snapshot lookup. |
| plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtReadyCommandWrapper.java | Advertises the new host capability for NVRAM-aware disk-only snapshots. |
| plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java | Adds host detail advertising at startup and provides getUefiNvramPath(vmUuid) helper. |
| engine/storage/snapshot/src/test/java/org/apache/cloudstack/storage/vmsnapshot/KvmFileBasedStorageVmSnapshotStrategyTest.java | New tests validating NVRAM path persistence, agent command plumbing, and capability gating. |
| engine/storage/snapshot/src/main/java/org/apache/cloudstack/storage/vmsnapshot/KvmFileBasedStorageVmSnapshotStrategy.java | Persists NVRAM sidecar metadata, gates create/revert/delete/merge cleanup on capabilities, sends alert on recovery warnings. |
| engine/orchestration/src/test/java/com/cloud/agent/manager/AgentManagerImplTest.java | Adds test ensuring stale NVRAM capability is cleared on reconnect when no longer advertised. |
| engine/orchestration/src/main/java/com/cloud/agent/manager/AgentManagerImpl.java | Syncs boolean host capabilities from ReadyAnswer details, including removal of stale DB details when not advertised. |
| core/src/main/java/com/cloud/agent/api/storage/CreateDiskOnlyVmSnapshotCommand.java | Adds vmUuid and uefiEnabled fields for agent-side NVRAM handling decisions. |
| core/src/main/java/com/cloud/agent/api/storage/CreateDiskOnlyVmSnapshotAnswer.java | Adds nvramSnapshotPath to return sidecar path back to management. |
| core/src/main/java/com/cloud/agent/api/storage/RevertDiskOnlyVmSnapshotCommand.java | Adds vmUuid, uefiEnabled, and nvramSnapshotPath to support validated/atomic NVRAM restores. |
| core/src/main/java/com/cloud/agent/api/storage/DeleteDiskOnlyVmSnapshotCommand.java | Adds optional nvramSnapshotPath and primaryDataStore to support sidecar-only cleanup. |
| api/src/main/java/com/cloud/host/Host.java | Introduces HOST_KVM_DISK_ONLY_VM_SNAPSHOT_NVRAM host capability key. |
| plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtConvertInstanceCommandWrapper.java | Fixes a typo in an error message (“calll” → “call”). |
| agent/conf/agent.properties | Removes a trailing whitespace-only line. |
| PendingReleaseNotes | Adds release note about UEFI disk-only snapshots now capturing NVRAM and legacy snapshot revert rejection. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| protected void verifyVmFilesystemsFrozen(Domain domain, String vmName) throws LibvirtException, IOException { | ||
| String status = getResultOfQemuCommand(FreezeThawVMCommand.STATUS, domain); | ||
| if (StringUtils.isBlank(status) || !new JsonParser().parse(status).isJsonObject()) { | ||
| throw new IOException(String.format("Failed to verify VM [%s] filesystem freeze state before taking the disk-only VM snapshot. Result: %s", vmName, status)); | ||
| } | ||
|
|
||
| String statusResult = new JsonParser().parse(status).getAsJsonObject().get("return").getAsString(); | ||
| if (!FreezeThawVMCommand.FREEZE.equals(statusResult)) { | ||
| throw new IOException(String.format("Failed to freeze VM [%s] filesystems before taking the disk-only VM snapshot. Status: %s", vmName, statusResult)); | ||
| } |
There was a problem hiding this comment.
verifyVmFilesystemsFrozen assumes the guest-agent response is valid JSON with a string return field. JsonParser.parse(...), .get("return"), and .getAsString() can throw unchecked exceptions (e.g., JsonSyntaxException/IllegalStateException/NullPointerException) on malformed/unexpected responses, which will bypass the surrounding catch (LibvirtException | IOException) and can cause the agent command to fail without returning a proper Answer (and potentially leave the guest frozen until the finally recovery runs). Consider parsing defensively (e.g., catch runtime JSON parsing exceptions and rethrow as IOException), validating the presence/type of return, and treating JSON error responses explicitly as failures.
Description
KVM disk-only instance snapshots do not capture the active UEFI NVRAM state, which makes revert unsafe for UEFI guests.
This PR fixes the KVM disk-only instance snapshot flow to:
quiescevm=trueOlder UEFI disk-only snapshots created without an NVRAM sidecar are intentionally rejected on revert.
This PR only covers
disk-onlyinstance snapshots for KVM UEFI VMs.snapshotMemory=true/ internal snapshots remain out of scope.Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
N/A
How Has This Been Tested?
mvn -pl engine/orchestration,engine/storage/snapshot,plugins/hypervisors/kvm -am -Dtest=AgentManagerImplTest,KvmFileBasedStorageVmSnapshotStrategyTest,LibvirtDiskOnlyVMSnapshotCommandWrapperTest -Dsurefire.failIfNoSpecifiedTests=false testHow did you try to break this feature and the system with this change?