CLOUDSTACK-10154: fixing some smoketests failures by borisstoyanov · Pull Request #2335 · apache/cloudstack

borisstoyanov · 2017-11-22T12:45:47Z

ssvm tests:
==== Marvin Init Successful ====
=== TestName: test_01_list_sec_storage_vm | Status : SUCCESS ===
=== TestName: test_02_list_cpvm_vm | Status : SUCCESS ===
=== TestName: test_03_ssvm_internals | Status : SUCCESS ===
=== TestName: test_04_cpvm_internals | Status : SUCCESS ===
=== TestName: test_05_stop_ssvm | Status : SUCCESS ===
=== TestName: test_06_stop_cpvm | Status : SUCCESS ===
=== TestName: test_07_reboot_ssvm | Status : SUCCESS ===
=== TestName: test_08_reboot_cpvm | Status : SUCCESS ===
=== TestName: test_09_destroy_ssvm | Status : SUCCESS ===
=== TestName: test_10_destroy_cpvm | Status : SUCCESS ===

deploy_vm_root_resize:
=== TestName: test_00_deploy_vm_root_resize | Status : SUCCESS ===
=== TestName: test_01_deploy_vm_root_resize | Status : SUCCESS ===
=== TestName: test_02_deploy_vm_root_resize | Status : SUCCESS ===

test volumes:
=== TestName: test_10_list_volumes | Status : SUCCESS ===

test host annotations
=== TestName: test_01_add_annotation | Status : SUCCESS ===
=== TestName: test_02_add_multiple_annotations | Status : SUCCESS ===
=== TestName: test_03_user_role_dont_see_annotations | Status : SUCCESS ===
=== TestName: test_04_remove_annotations | Status : SUCCESS ===
=== TestName: test_05_add_annotation_for_invalid_entityType | Status : SUCCESS ===

ping @rhtyd @PaulAngus @nvazquez @DaanHoogland for review

yadvr · 2017-11-27T07:27:33Z

+
+                #Giving 30 seconds to management to warm-up,
+                #Experienced failures when trying to deploy a VM exactly when management came up
+                time.sleep(30)


@borisstoyanov instead of a blind sleep, should we poll every few seconds using wait_until and check whether the mgmt server's port 8080 is reachable? For example, in Travis we wait using this: https://github.com/apache/cloudstack/blob/master/tools/travis/before_script.sh#L25

This is a warm-up. We've already used the polling approach to determine that management is up; at this stage management is responding, but it needs a warm-up period.
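For reference, the port-polling idea from the review comment could look roughly like the sketch below. It is not the code used in this PR or in the Travis script; `wait_for_port` and its parameters are hypothetical names chosen for illustration. As the reply notes, a reachable port does not prove the management server has finished warming up, so a real check would probably poll an API call instead of the bare socket.

```python
import socket
import time


def wait_for_port(host, port, interval=2.0, retries=30, timeout=1.0):
    """Return True once a TCP connection to host:port succeeds.

    Hypothetical helper: tries up to `retries` times, sleeping `interval`
    seconds after each failed attempt, and returns False if none succeed.
    """
    for _ in range(retries):
        try:
            # create_connection raises OSError (e.g. connection refused,
            # timeout) while nothing is listening on the port yet.
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Usage would be along the lines of `wait_for_port("localhost", 8080)`, failing the test setup if it returns False.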

yadvr · 2017-11-27T07:29:08Z


    def waitForSystemVMAgent(self, vmname):
-        timeout = self.services["timeout"]
+        timeout = 120


I've heavily refactored and fixed this test in my branch/PR #2211. In general, throughout the test codebase we should NOT sleep blindly but use wait_until and poll for behaviour (i.e. list resources) at a short interval (1-10 seconds) with retries.

Shall we leave this as is until the debian9 changes get merged?

Sure, let's leave them be.

yadvr · 2017-11-27T07:31:06Z

        host = Host.list(
            self.apiclient,
            type='Routing',
-            virtualmachineid=list_vm.id


LGTM. While this should have worked, listHosts, when passed a VM id, also returns other hosts in the cluster that can host this VM (enough CPU/RAM).
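To illustrate why the extra hosts matter: if a host listing can contain more than the host the VM is running on, a test that takes the first entry may pick the wrong one. The sketch below shows one way to select the right host by matching the VM's `hostid`. The record types and `find_running_host` are hypothetical stand-ins, not Marvin classes, and this is not necessarily how the PR resolved it.

```python
from collections import namedtuple

# Hypothetical stand-ins for the objects returned by the list calls.
HostRecord = namedtuple("HostRecord", ["id", "name"])
VmRecord = namedtuple("VmRecord", ["id", "hostid"])


def find_running_host(hosts, vm):
    """Return the host whose id matches vm.hostid, or None if absent."""
    for host in hosts or []:  # tolerate a None response
        if host.id == vm.hostid:
            return host
    return None


hosts = [HostRecord("h-1", "kvm1"), HostRecord("h-2", "kvm2")]
vm = VmRecord("vm-42", "h-2")
running_on = find_running_host(hosts, vm)
```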

yadvr · 2017-11-27T07:35:46Z

    def wait_for_attributes_and_return_root_vol(self):

-        for i in range(60):
+        for i in range(360):


@borisstoyanov can you rewrite this and use the following example/patterns in other places wherever we're sleeping blindly:

    # If not imported, do this: from marvin.lib.utils import wait_until
    [...snipped...]
    def wait_for_attributes_and_return_root_vol(self):
        def checkVolumeResponse():
            list_volume_response = Volume.list(
                self.apiClient,
                virtualmachineid=self.virtual_machine.id,
                type='ROOT',
                listall=True
            )
            if list_volume_response[0].virtualsize is not None:
                return True, list_volume_response[0]
            return False, None

        # sleep interval is 1s, retries is 360, this will sleep at most 360 seconds, or 6 mins
        res, response = wait_until(1, 360, checkVolumeResponse)
        if not res:
            self.fail("Failed to return root volume response")
        return response
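A self-contained way to try this pattern without a CloudStack environment is sketched below. The `wait_until` here is a simplified stand-in written for illustration, not the implementation in `marvin.lib.utils`, and `FakeVolumeApi` is a hypothetical object that simulates a ROOT volume whose `virtualsize` only appears on the third listing. The callback also guards against a `None` response, in the spirit of the later commit "Check that response is a list (not None)".

```python
import time


def wait_until(retry_interval, no_of_times, callback, *callback_args):
    """Simplified stand-in: sleep, call the callback, stop on first success.

    The callback must return a (bool, value) pair.
    """
    result, value = False, None
    for _ in range(no_of_times):
        time.sleep(retry_interval)
        result, value = callback(*callback_args)
        if result:
            break
    return result, value


class FakeVolumeApi:
    """Hypothetical API: virtualsize is populated from the third call on."""

    def __init__(self):
        self.calls = 0

    def list_volumes(self):
        self.calls += 1
        if self.calls == 1:
            return None  # nothing listed yet
        if self.calls == 2:
            return [{"virtualsize": None}]  # listed, size not yet known
        return [{"virtualsize": 8 * 1024 ** 3}]


def check_volume_response(api):
    response = api.list_volumes()
    # Guard against None or an empty list before indexing.
    if isinstance(response, list) and response:
        if response[0]["virtualsize"] is not None:
            return True, response[0]
    return False, None


api = FakeVolumeApi()
ok, volume = wait_until(0.01, 10, check_volume_response, api)
```

With an interval of 1 and 360 retries, as in the comment above, the total wait is bounded at roughly six minutes, but the loop returns as soon as the attribute shows up.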

yadvr

Overall LGTM, thanks @borisstoyanov. Left some comments to further improve the test code.

borisstoyanov · 2017-11-27T10:49:17Z

@rhtyd I've addressed your comment. I'm also not a huge fan of implicit waits.

yadvr

LGTM.

yadvr · 2017-11-27T11:29:46Z

@blueorangutan package

blueorangutan · 2017-11-27T11:30:31Z

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

blueorangutan · 2017-11-27T12:01:35Z

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-1307

borisstoyanov · 2017-11-27T12:23:57Z

@blueorangutan test

blueorangutan · 2017-11-27T12:24:30Z

@borisstoyanov a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

blueorangutan · 2017-11-27T23:49:22Z

Trillian test result (tid-1703)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 41087 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr2335-t1703-kvm-centos7.zip
Intermitten failure detected: /marvin/tests/smoke/test_internal_lb.py
Intermitten failure detected: /marvin/tests/smoke/test_iso.py
Intermitten failure detected: /marvin/tests/smoke/test_network.py
Intermitten failure detected: /marvin/tests/smoke/test_privategw_acl.py
Intermitten failure detected: /marvin/tests/smoke/test_usage.py
Intermitten failure detected: /marvin/tests/smoke/test_volumes.py
Intermitten failure detected: /marvin/tests/smoke/test_vpc_vpn.py
Test completed. 63 look OK, 4 have error(s)

Test	Result	Time (s)	Test File
test_01_vpc_remote_access_vpn	`Failure`	60.91	test_vpc_vpn.py
test_07_resize_fail	`Failure`	15.37	test_volumes.py
test_04_rvpc_privategw_static_routes	`Failure`	334.23	test_privategw_acl.py
test_03_vpc_privategw_restart_vpc_cleanup	`Failure`	208.47	test_privategw_acl.py
test_02_vpc_privategw_static_routes	`Failure`	183.38	test_privategw_acl.py
test_01_vpc_privategw_acl	`Failure`	61.87	test_privategw_acl.py
test_04_create_iso_with_no_checksum	`Error`	65.83	test_iso.py
test_03_create_iso_with_checksum_md5	`Error`	65.47	test_iso.py
test_02_create_iso_with_checksum_sha256	`Error`	65.52	test_iso.py
test_01_create_iso_with_checksum_sha1	`Error`	65.69	test_iso.py
test_change_service_offering_for_vm_with_snapshots	Skipped	0.00	test_vm_snapshots.py
test_09_copy_delete_template	Skipped	0.01	test_templates.py
test_06_copy_template	Skipped	0.00	test_templates.py
test_static_role_account_acls	Skipped	0.02	test_staticroles.py
test_11_ss_nfs_version_on_ssvm	Skipped	0.02	test_ssvm.py
test_01_scale_vm	Skipped	0.00	test_scale_vm.py
test_01_primary_storage_iscsi	Skipped	0.11	test_primary_storage.py
test_vm_nic_adapter_vmxnet3	Skipped	0.00	test_nic_adapter_type.py
test_03_nic_multiple_vmware	Skipped	1.08	test_nic.py
test_nested_virtualization_vmware	Skipped	0.00	test_nested_virtualization.py
test_06_copy_iso	Skipped	0.00	test_iso.py
test_list_ha_for_host_valid	Skipped	0.02	test_hostha_simulator.py
test_list_ha_for_host_invalid	Skipped	0.02	test_hostha_simulator.py
test_list_ha_for_host	Skipped	0.02	test_hostha_simulator.py
test_hostha_enable_feature_without_setting_provider	Skipped	0.02	test_hostha_simulator.py
test_hostha_enable_feature_valid	Skipped	0.02	test_hostha_simulator.py
test_hostha_disable_feature_valid	Skipped	0.02	test_hostha_simulator.py
test_hostha_configure_invalid_provider	Skipped	0.02	test_hostha_simulator.py
test_hostha_configure_default_driver	Skipped	0.02	test_hostha_simulator.py
test_ha_verify_fsm_recovering	Skipped	0.02	test_hostha_simulator.py
test_ha_verify_fsm_fenced	Skipped	0.02	test_hostha_simulator.py
test_ha_verify_fsm_degraded	Skipped	0.02	test_hostha_simulator.py
test_ha_verify_fsm_available	Skipped	0.02	test_hostha_simulator.py
test_ha_multiple_mgmt_server_ownership	Skipped	0.02	test_hostha_simulator.py
test_ha_list_providers	Skipped	0.02	test_hostha_simulator.py
test_ha_enable_feature_invalid	Skipped	0.02	test_hostha_simulator.py
test_ha_disable_feature_invalid	Skipped	0.02	test_hostha_simulator.py
test_ha_configure_enabledisable_across_clusterzones	Skipped	0.02	test_hostha_simulator.py
test_configure_ha_provider_valid	Skipped	0.03	test_hostha_simulator.py
test_configure_ha_provider_invalid	Skipped	0.02	test_hostha_simulator.py
test_deploy_vgpu_enabled_vm	Skipped	0.03	test_deploy_vgpu_enabled_vm.py
test_3d_gpu_support	Skipped	0.03	test_deploy_vgpu_enabled_vm.py

yadvr · 2017-11-28T04:25:14Z

LGTM, the ISO failures are environment-related. Merging now.

CLOUDSTACK-10154: fixing some smoketests failures

1688a60

yadvr reviewed Nov 27, 2017

yadvr requested changes Nov 27, 2017

Adding wait_until pattern to test_volumes

6a59f9f

Check that response is a list (not None)

8e19acb

yadvr approved these changes Nov 27, 2017

yadvr merged commit f506a99 into apache:master Nov 28, 2017
