
Convert command's timeout for snapshots commands #13210

Open
erikbocks wants to merge 1 commit into apache:main from scclouds:fix-snapshot-timeout

Collaborator

@erikbocks erikbocks commented May 21, 2026

Description

PR #9659 introduced the commands.timeout global configuration, which allows timeouts to be defined per command. An operator who wishes to increase snapshot-related timeouts can raise the CreateObjectCommand timeout in the commands.timeout configuration. The timeouts defined there are expressed in seconds.
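
As a rough illustration of the configuration described above, the sketch below parses a commands.timeout-style value into per-command timeouts in seconds. The class and method names are hypothetical, and the comma-separated `Name=seconds` layout is an assumption; only the single entry `CreateObjectCommand=300` appears in this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not CloudStack code: parses a commands.timeout-style value.
final class CommandTimeoutConfigSketch {

    // Assumption: entries look like "CommandName=seconds", separated by commas.
    static Map<String, Integer> parse(String value) {
        Map<String, Integer> timeouts = new LinkedHashMap<>();
        if (value == null || value.trim().isEmpty()) {
            return timeouts;
        }
        for (String entry : value.split(",")) {
            String[] parts = entry.trim().split("=", 2);
            if (parts.length != 2) {
                continue; // skip malformed entries in this sketch
            }
            // The parsed number is a timeout in seconds.
            timeouts.put(parts[0].trim(), Integer.parseInt(parts[1].trim()));
        }
        return timeouts;
    }

    public static void main(String[] args) {
        Map<String, Integer> timeouts = parse("CreateObjectCommand=300");
        System.out.println(timeouts.get("CreateObjectCommand") + " seconds");
    }
}
```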

However, the normal and incremental snapshot creation flows use qemu-img scripts to execute some of the necessary operations. These scripts accept timeouts in milliseconds, but receive them in seconds from the CreateObjectCommand. As a result, a timeout configured as 300 seconds is applied as 300 milliseconds, and the scripts time out far earlier than intended.

Therefore, this PR converts the CreateObjectCommand timeout from seconds to milliseconds before passing it to the qemu-img scripts.
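
A minimal sketch of the conversion described above, assuming the command's timeout is available as an integer number of seconds. The class and method names are hypothetical and do not reproduce the actual change in KVMStorageProcessor.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not the code changed by this PR.
final class SnapshotTimeoutSketch {

    // Converts a command timeout given in seconds to the milliseconds
    // that a millisecond-based script timeout expects.
    static long toScriptTimeoutMillis(int waitSeconds) {
        return TimeUnit.SECONDS.toMillis(waitSeconds);
    }

    public static void main(String[] args) {
        int waitSeconds = 300; // CreateObjectCommand=300 from commands.timeout
        System.out.println(toScriptTimeoutMillis(waitSeconds)); // prints 300000
    }
}
```

Using `TimeUnit` instead of a bare `* 1000` keeps the source and target units visible at the call site.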

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • Build/CI
  • Test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

First, I set the commands.timeout configuration to CreateObjectCommand=300, which sets this command's timeout to 5 minutes. Then, I tried to create a full volume snapshot. The process failed due to a timeout.

Command timeout log
2026-05-21 15:04:44,122 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (Work-Job-Executor-13:[ctx-fe95241a, job-77/job-78, ctx-11fcb589]) (logid:d462b71a) Wait time setting on org.apache.cloudstack.storage.command.CreateObjectCommand is 300 seconds
Failure log
2026-05-21 15:04:56,776 DEBUG [utils.script.Script] (AgentRequest-Handler-1:[]) (logid:d462b71a) Executing command [qemu-img convert -O qcow2 -U --image-opts driver=qcow2,file.filename=/mnt/4244ccc6-8f06-3f9c-87bc-a7ba8c1caae9/5331f95a-ec63-4a79-a28f-7642ed095875 /mnt/507fca2c-a424-3ffc-b5f6-f7fe9e7c17e7/snapshots/2/4/2ad278b7-e1fa-4331-ae36-4ba67db4097b ].
2026-05-21 15:04:57,080 WARN  [utils.script.Script] (AgentRequest-Handler-1:[]) (logid:d462b71a) Process [14516] for command [qemu-img convert -O qcow2 -U --image-opts driver=qcow2,file.filename=/mnt/4244ccc6-8f06-3f9c-87bc-a7ba8c1caae9/5331f95a-ec63-4a79-a28f-7642ed095875 /mnt/507fca2c-a424-3ffc-b5f6-f7fe9e7c17e7/snapshots/2/4/2ad278b7-e1fa-4331-ae36-4ba67db4097b ] timed out. Output is [].
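
The timestamps in the failure log are consistent with the 300-second setting being applied as 300 milliseconds. The sketch below is only an illustrative check, not part of the PR; it computes the time elapsed between the two log lines, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for checking the gap between two log timestamps.
final class LogElapsedSketch {

    // Matches the time-of-day portion of the log lines, e.g. "15:04:56,776".
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("HH:mm:ss,SSS");

    static long elapsedMillis(String start, String end) {
        LocalTime from = LocalTime.parse(start, LOG_TIME);
        LocalTime to = LocalTime.parse(end, LOG_TIME);
        return Duration.between(from, to).toMillis();
    }

    public static void main(String[] args) {
        // "Executing command" at 15:04:56,776 and "timed out" at 15:04:57,080.
        System.out.println(elapsedMillis("15:04:56,776", "15:04:57,080")); // prints 304
    }
}
```

The 304 ms gap is close to 300 ms and nowhere near the configured 5 minutes.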

I installed the packages containing the timeout conversion in my local environment, then tried to create another snapshot. Using a debug breakpoint, I validated that the qemu-img script instance had the converted timeout. The new snapshot was created successfully.


When trying to create an incremental snapshot, the same error occurred. Then, I installed the packages with the necessary changes and tried to create another incremental snapshot. I again used debug breakpoints to validate the script's timeout.


codecov Bot commented May 21, 2026

Codecov Report

❌ Patch coverage is 16.66667% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 18.09%. Comparing base (1fe486f) to head (c60546b).
⚠️ Report is 1 commits behind head on main.

Files with missing lines | Patch % | Lines
...ud/hypervisor/kvm/storage/KVMStorageProcessor.java | 16.66% | 5 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #13210      +/-   ##
============================================
- Coverage     18.09%   18.09%   -0.01%     
- Complexity    16732    16733       +1     
============================================
  Files          6037     6037              
  Lines        542780   542812      +32     
  Branches      66464    66471       +7     
============================================
  Hits          98233    98233              
- Misses       433499   433531      +32     
  Partials      11048    11048              
Flag | Coverage Δ
uitests | 3.51% <ø> (ø)
unittests | 19.26% <16.66%> (-0.01%) ⬇️
