engine, server, services: fix for respecting secondary storage threshold limit#3480
Conversation
…old limit ImageStoreProviderManager will return a null value when there is no image store available with free space and satisfies threshold limit. Consumers have been refactored to accept null value and throw exception or log error if needed. Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
| DataStore sourceTemplateDataStore = dataStoreManagerImpl.getImageStore(srcVolumeInfo.getDataCenterId()); | ||
| TemplateInfo sourceTemplateInfo = templateDataFactory.getTemplate(srcVolumeInfo.getTemplateId(), sourceTemplateDataStore); | ||
| TemplateObjectTO sourceTemplate = new TemplateObjectTO(sourceTemplateInfo); | ||
| if (sourceTemplateDataStore != null) { |
There was a problem hiding this comment.
@rhtyd Throwing an exception or returning null for Image store can fail this. This might not be needing a sec store for writing. Maybe need to handle this separately?
There was a problem hiding this comment.
Could this be the reason you are using (sourceTemplateDataStore != null) ? If so, you might need an else at the end to manage the nulls
There was a problem hiding this comment.
@kioie It should never be getting null value for image store as it needs the store only for reading and threshold limit shouldn't be affecting this. Will have to make changes for it.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
| s_logger.error(String.format("Can't find an image storage in zone with less than %d usage", | ||
| Math.round(_statsCollector.getImageStoreCapacityThreshold()*100))); | ||
| return null; | ||
| } |
There was a problem hiding this comment.
Besides checking for whether the store is below the threshold, are you also considering doing a check if the image will fit in the imagestore and what capacity remains afterwards....for example, if the current used capacity is at 89% and the image might move it to even upto 99%....or is that check happening on a separate method?
There was a problem hiding this comment.
@kioie I'm not sure that is possible to implement for scenarios at this point with the current design. For example, while registering a template through URL we don't know actual size after download and install.
But such a check can be considered for copy template, snapshots, etc. I've to check the code for that. Anyway, the requirement of this method is to return an image store with capacity within the threshold and I don't think we can have a check for resultant size here.
There was a problem hiding this comment.
That's okay, just wanted to know the possibility of it happening
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
… cmd Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
|
@blueorangutan package |
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
|
@blueorangutan package |
|
@shwstppr a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-148 |
|
@blueorangutan test |
1 similar comment
|
@blueorangutan test |
|
@shwstppr were the vmware specific fixes also done? You can kick tests now. |
|
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-167 |
|
@blueorangutan test centos7 vmware-65u2 |
|
@shwstppr a Trillian-Jenkins test job (centos7 mgmt + vmware-65u2) has been kicked to run smoke tests |
|
@rhtyd yes I made the changes for vmware migration issue. Would help if you can review last 2 commits |
|
|
||
| DataStore getImageStoreForWrite(long zoneId); | ||
|
|
||
| List<DataStore> getImageStoresForWrite(long zoneId); |
There was a problem hiding this comment.
Rename it something more meaningful like listImagesStoresWithFreeCapacity?
| DataStore getImageStore(long zoneId); | ||
| DataStore getImageStoreForRead(long zoneId); | ||
|
|
||
| DataStore getImageStoreForWrite(long zoneId); |
There was a problem hiding this comment.
How about getImagesStoresWithFreeCapacity?
| List<DataStore> getImageStoresByScope(ZoneScope scope); | ||
|
|
||
| DataStore getImageStore(long zoneId); | ||
| DataStore getImageStoreForRead(long zoneId); |
There was a problem hiding this comment.
getRandomImageStore or something that conveys meaning? You may also put javadocs on them to convey which method does what?
|
Trillian test result (tid-215)
|
|
I've rekicked Travis jobs, please check again in case they fail @shwstppr |
|
@rhtyd a Trillian-Jenkins matrix job (centos6 mgmt + xs71, centos7 mgmt + vmware65, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests |
|
Trillian test result (tid-220)
|
|
@blueorangutan test centos7 vmware-55u3 |
|
@rhtyd a Trillian-Jenkins test job (centos7 mgmt + vmware-55u3) has been kicked to run smoke tests |
|
Trillian test result (tid-219)
|
|
Trillian test result (tid-222)
|
borisstoyanov
left a comment
There was a problem hiding this comment.
I think I'm hitting an issue with this one. I cannot sync the templates across a newly added image store. Here's my steps:
Originally created environement with 4 secondary storages
deleted 3 of them so I just have 1 running.
On the deleted storages I've clean all files.
Registered a few templates in cloudstack and they ended up on the running storage
Added back one of the storages and destroyed the SSVM in a while
After a day I went back to the machine and the newly added templates were not on the new image store, just the default ones were added. Also the new secondary storage is not being mounted on any of the hosts
|
@borisstoyanov will look into it
|
|
Tests LGTM, any comments @shwstppr @borisstoyanov ? |
|
@rhtyd looking into the issue @blueorangutan package |
|
@shwstppr a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-193 |
@borisstoyanov I've verified, the issue you are facing is unrelated to this PR. Currently, code skips downloading a template to new secondary store if is not public and not featured and not a system template. |
|
Sure, thanks on the investigation @shwstppr, will check |
borisstoyanov
left a comment
There was a problem hiding this comment.
that solved the template sync, but the storage pool is not mounted across the hosts still, which I believe is an issue @shwstppr
|
@borisstoyanov @shwstppr have you tested this simple case: |
borisstoyanov
left a comment
There was a problem hiding this comment.
LGTM, after migrating a VM to the host all image stores were mounted
Retrieval of an image store using ImageStoreProviderManager has been refactored by introducing three different methods,
DataStore getRandomImageStore(List<DataStore> imageStores);To get an image store for reading purpose. Threshold capacity check will not be used here.
DataStore getImageStoreWithFreeCapacity(List<DataStore> imageStores);To get an image store for reading purpose. Threshold capacity check will be used here and the store with max free space will be returned. If no store with filled storage less than the threshold is found, the NULL value will be returned.
List<DataStore> listImageStoresWithFreeCapacity(List<DataStore> imageStores);To get a list of image stores for writing purpose which fulfills threshold capacity check.
Correspondingly DataStoreManager methods have been refactored to return similar values for a given zone,
Consumers have been refactored to accept NULL value and throw exception or log error if needed.
Fixes -
#3287 - NULL value will be returned when secondary storage is needed for writing but there is not store with free space.
#3041 - Rather than returning random secondary storage for writing, storage with max. free space will be returned.
#3478 - For migration on VMware, all writable secondary storage will be mounted while preparation.
Types of changes
Screenshots (if appropriate):
How Has This Been Tested?