CLOUDSTACK-10154: fixing some smoketests failures#2335

Merged
yadvr merged 3 commits into apache:master from shapeblue:fix-smoketests on Nov 28, 2017

@borisstoyanov
Contributor

ssvm tests:
==== Marvin Init Successful ====
=== TestName: test_01_list_sec_storage_vm | Status : SUCCESS ===
=== TestName: test_02_list_cpvm_vm | Status : SUCCESS ===
=== TestName: test_03_ssvm_internals | Status : SUCCESS ===
=== TestName: test_04_cpvm_internals | Status : SUCCESS ===
=== TestName: test_05_stop_ssvm | Status : SUCCESS ===
=== TestName: test_06_stop_cpvm | Status : SUCCESS ===
=== TestName: test_07_reboot_ssvm | Status : SUCCESS ===
=== TestName: test_08_reboot_cpvm | Status : SUCCESS ===
=== TestName: test_09_destroy_ssvm | Status : SUCCESS ===
=== TestName: test_10_destroy_cpvm | Status : SUCCESS ===

deploy_vm_root_resize:
=== TestName: test_00_deploy_vm_root_resize | Status : SUCCESS ===
=== TestName: test_01_deploy_vm_root_resize | Status : SUCCESS ===
=== TestName: test_02_deploy_vm_root_resize | Status : SUCCESS ===

test volumes:
=== TestName: test_10_list_volumes | Status : SUCCESS ===

test host annotations:
=== TestName: test_01_add_annotation | Status : SUCCESS ===
=== TestName: test_02_add_multiple_annotations | Status : SUCCESS ===
=== TestName: test_03_user_role_dont_see_annotations | Status : SUCCESS ===
=== TestName: test_04_remove_annotations | Status : SUCCESS ===
=== TestName: test_05_add_annotation_for_invalid_entityType | Status : SUCCESS ===

ping @rhtyd @PaulAngus @nvazquez @DaanHoogland for review


#Giving 30 seconds to management to warm-up,
#Experienced failures when trying to deploy a VM exactly when management came up
time.sleep(30)
Member

@borisstoyanov instead of a blind sleep, should we poll every few seconds using wait_until and check whether the mgmt server's port 8080 is reachable? For example, in Travis we wait using this: https://github.com/apache/cloudstack/blob/master/tools/travis/before_script.sh#L25
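
A minimal sketch of the port-polling idea in Python, assuming a plain TCP connect is an acceptable reachability check. The helper name `wait_for_port` and its parameters are hypothetical and are not part of Marvin or of the Travis script linked above.

```python
import socket
import time


def wait_for_port(host, port, interval=2.0, retries=30, connect_timeout=1.0):
    """Poll until a TCP connection to (host, port) succeeds or retries run out.

    Hypothetical helper: returns True as soon as one connect succeeds,
    False if every attempt fails.
    """
    for attempt in range(retries):
        try:
            # create_connection raises OSError (refused, timed out, ...) on failure
            with socket.create_connection((host, port), timeout=connect_timeout):
                return True
        except OSError:
            # Sleep only between attempts, not after the last one
            if attempt < retries - 1:
                time.sleep(interval)
    return False
```

A caller could use `wait_for_port("localhost", 8080)` before the first API call. Note that an open port only shows that the server accepts connections; it says nothing about whether the warm-up has finished.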

Contributor Author

This is a warm-up. We've already used the polling approach to determine that management is up; at this stage management is responding, but it needs a warm-up period.


 def waitForSystemVMAgent(self, vmname):
-    timeout = self.services["timeout"]
+    timeout = 120
Member

I've heavily refactored and fixed this test in my branch/PR #2211. In general, throughout the test codebase we should NOT sleep blindly but instead use wait_until to poll for the expected behaviour (e.g. by listing resources) at a short interval (1-10 seconds) with retries.
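
To illustrate the poll-with-retries pattern, here is a small stand-in written for this sketch. `poll_until` is a hypothetical name and is not Marvin's `wait_until`; it only assumes the same contract used in this thread, where the callback returns a `(done, result)` pair.

```python
import time


def poll_until(interval, retries, check, *args):
    """Call check(*args) up to `retries` times, sleeping `interval` seconds between tries.

    `check` must return a (done, result) tuple. Returns (True, result) on the
    first successful check, or (False, None) once the retries are exhausted.
    """
    for attempt in range(retries):
        done, result = check(*args)
        if done:
            return True, result
        if attempt < retries - 1:
            time.sleep(interval)
    return False, None


# Usage with a fake resource that becomes ready on the third poll.
state = {"polls": 0}


def fake_vm_is_running():
    state["polls"] += 1
    if state["polls"] >= 3:
        return True, {"state": "Running"}
    return False, None


ok, vm = poll_until(0.01, 10, fake_vm_is_running)
```

Compared with a fixed sleep, the loop returns as soon as the condition holds, so the worst-case wait stays bounded while the common case finishes quickly.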

Contributor Author

Shall we leave this as is until the debian9 changes get merged?

Member

Sure, let's leave them be.

host = Host.list(
    self.apiclient,
    type='Routing',
    virtualmachineid=list_vm.id
Member

LGTM. While this should have worked, listHosts, when passed a VM id, also returns other hosts in the cluster that can host this VM (i.e. those with enough CPU/RAM).
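
Because the listing can include more than one host, a test that needs the host actually running the VM may want to filter the result. The sketch below uses stand-in dataclasses rather than Marvin objects; `FakeHost`, `FakeVm`, and `find_vm_host` are hypothetical names, and it assumes the VM response carries a `hostid` field.

```python
from dataclasses import dataclass


@dataclass
class FakeHost:
    # Stand-in for a host entry in a list response (assumed fields)
    id: str
    name: str
    type: str


@dataclass
class FakeVm:
    # Stand-in for a VM response; `hostid` is assumed to name the current host
    id: str
    hostid: str


def find_vm_host(hosts, vm):
    """Return the host whose id matches vm.hostid, or None if there is no match."""
    for host in hosts or []:
        if host.id == vm.hostid:
            return host
    return None


hosts = [
    FakeHost(id="h1", name="kvm1", type="Routing"),
    FakeHost(id="h2", name="kvm2", type="Routing"),
]
vm = FakeVm(id="vm-1", hostid="h2")
current_host = find_vm_host(hosts, vm)
```

The `hosts or []` guard also covers a list call that returns `None` instead of an empty list.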

Comment thread test/integration/smoke/test_volumes.py Outdated
 def wait_for_attributes_and_return_root_vol(self):

-    for i in range(60):
+    for i in range(360):
Member

@yadvr Nov 27, 2017

@borisstoyanov can you rewrite this and use the following example/patterns in other places wherever we're sleeping blindly:

# If not already imported, add this:
from marvin.lib.utils import wait_until
[...snipped...]

    def wait_for_attributes_and_return_root_vol(self):
        def checkVolumeResponse():
            list_volume_response = Volume.list(
                self.apiClient,
                virtualmachineid=self.virtual_machine.id,
                type='ROOT',
                listall=True
            )
            # Guard against an empty or missing list response before indexing
            if list_volume_response and list_volume_response[0].virtualsize is not None:
                return True, list_volume_response[0]
            return False, None

        # sleep interval is 1s, retries is 360; this will sleep at most 360 seconds, or 6 mins
        res, response = wait_until(1, 360, checkVolumeResponse)
        if not res:
            self.fail("Failed to return root volume response")
        return response

Contributor Author

done

Member

@yadvr left a comment

Overall LGTM, thanks @borisstoyanov. Left some comments to further improve the test code.

@borisstoyanov
Contributor Author

@rhtyd I've addressed your comment. I'm also not a huge fan of implicit waits.

Member

@yadvr left a comment

LGTM.

@yadvr
Member

yadvr commented Nov 27, 2017

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-1307

@borisstoyanov
Contributor Author

@blueorangutan test

@blueorangutan

@borisstoyanov a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-1703)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 41087 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr2335-t1703-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_internal_lb.py
Intermittent failure detected: /marvin/tests/smoke/test_iso.py
Intermittent failure detected: /marvin/tests/smoke/test_network.py
Intermittent failure detected: /marvin/tests/smoke/test_privategw_acl.py
Intermittent failure detected: /marvin/tests/smoke/test_usage.py
Intermittent failure detected: /marvin/tests/smoke/test_volumes.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_vpn.py
Test completed. 63 look OK, 4 have error(s)

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_vpc_remote_access_vpn | Failure | 60.91 | test_vpc_vpn.py |
| test_07_resize_fail | Failure | 15.37 | test_volumes.py |
| test_04_rvpc_privategw_static_routes | Failure | 334.23 | test_privategw_acl.py |
| test_03_vpc_privategw_restart_vpc_cleanup | Failure | 208.47 | test_privategw_acl.py |
| test_02_vpc_privategw_static_routes | Failure | 183.38 | test_privategw_acl.py |
| test_01_vpc_privategw_acl | Failure | 61.87 | test_privategw_acl.py |
| test_04_create_iso_with_no_checksum | Error | 65.83 | test_iso.py |
| test_03_create_iso_with_checksum_md5 | Error | 65.47 | test_iso.py |
| test_02_create_iso_with_checksum_sha256 | Error | 65.52 | test_iso.py |
| test_01_create_iso_with_checksum_sha1 | Error | 65.69 | test_iso.py |
| test_change_service_offering_for_vm_with_snapshots | Skipped | 0.00 | test_vm_snapshots.py |
| test_09_copy_delete_template | Skipped | 0.01 | test_templates.py |
| test_06_copy_template | Skipped | 0.00 | test_templates.py |
| test_static_role_account_acls | Skipped | 0.02 | test_staticroles.py |
| test_11_ss_nfs_version_on_ssvm | Skipped | 0.02 | test_ssvm.py |
| test_01_scale_vm | Skipped | 0.00 | test_scale_vm.py |
| test_01_primary_storage_iscsi | Skipped | 0.11 | test_primary_storage.py |
| test_vm_nic_adapter_vmxnet3 | Skipped | 0.00 | test_nic_adapter_type.py |
| test_03_nic_multiple_vmware | Skipped | 1.08 | test_nic.py |
| test_nested_virtualization_vmware | Skipped | 0.00 | test_nested_virtualization.py |
| test_06_copy_iso | Skipped | 0.00 | test_iso.py |
| test_list_ha_for_host_valid | Skipped | 0.02 | test_hostha_simulator.py |
| test_list_ha_for_host_invalid | Skipped | 0.02 | test_hostha_simulator.py |
| test_list_ha_for_host | Skipped | 0.02 | test_hostha_simulator.py |
| test_hostha_enable_feature_without_setting_provider | Skipped | 0.02 | test_hostha_simulator.py |
| test_hostha_enable_feature_valid | Skipped | 0.02 | test_hostha_simulator.py |
| test_hostha_disable_feature_valid | Skipped | 0.02 | test_hostha_simulator.py |
| test_hostha_configure_invalid_provider | Skipped | 0.02 | test_hostha_simulator.py |
| test_hostha_configure_default_driver | Skipped | 0.02 | test_hostha_simulator.py |
| test_ha_verify_fsm_recovering | Skipped | 0.02 | test_hostha_simulator.py |
| test_ha_verify_fsm_fenced | Skipped | 0.02 | test_hostha_simulator.py |
| test_ha_verify_fsm_degraded | Skipped | 0.02 | test_hostha_simulator.py |
| test_ha_verify_fsm_available | Skipped | 0.02 | test_hostha_simulator.py |
| test_ha_multiple_mgmt_server_ownership | Skipped | 0.02 | test_hostha_simulator.py |
| test_ha_list_providers | Skipped | 0.02 | test_hostha_simulator.py |
| test_ha_enable_feature_invalid | Skipped | 0.02 | test_hostha_simulator.py |
| test_ha_disable_feature_invalid | Skipped | 0.02 | test_hostha_simulator.py |
| test_ha_configure_enabledisable_across_clusterzones | Skipped | 0.02 | test_hostha_simulator.py |
| test_configure_ha_provider_valid | Skipped | 0.03 | test_hostha_simulator.py |
| test_configure_ha_provider_invalid | Skipped | 0.02 | test_hostha_simulator.py |
| test_deploy_vgpu_enabled_vm | Skipped | 0.03 | test_deploy_vgpu_enabled_vm.py |
| test_3d_gpu_support | Skipped | 0.03 | test_deploy_vgpu_enabled_vm.py |

@yadvr
Member

yadvr commented Nov 28, 2017

LGTM, the ISO failures are environment-related. Merging now.

@yadvr yadvr merged commit f506a99 into apache:master Nov 28, 2017