CLOUDSTACK-10377: Fix Network restart for Nuage #2672

Merged
yadvr merged 2 commits into apache:4.11 from nuagenetworks:bugfix/CLOUDSTACK-10377
Jun 6, 2018

Conversation

@fmaximus
Contributor

Description

Changes in PR #2508 caused network restart to fail in a Nuage setup:
the new VR takes the same IP as the old one while the old VR is still running,
and Nuage doesn't support multiple VMs having the same IP.
To fix this, we delay provisioning the interfaces in VSD until the old VR interface is released.
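
The deferral described above can be illustrated with a minimal sketch. This is not the code from this PR: the class, its methods, and the use of a plain boolean for the rolling-restart flag are all hypothetical, and the real change goes through the Nuage VSP agent commands rather than in-memory maps.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from the CloudStack or Nuage code base.
class DeferredVrProvisioning {
    // IP address -> id of the interface currently provisioned with that IP.
    private final Map<String, String> activeByIp = new HashMap<>();
    // IP address -> id of the interface waiting for that IP to be released.
    private final Map<String, String> pendingByIp = new HashMap<>();

    /** Returns true if the interface was provisioned now, false if it was deferred. */
    boolean reserve(String interfaceId, String ip, boolean rollingRestart) {
        if (rollingRestart) {
            // Assumption: during a rolling restart the old VR still holds this IP,
            // so provisioning waits until that interface is released.
            pendingByIp.put(ip, interfaceId);
            return false;
        }
        activeByIp.put(ip, interfaceId);
        return true;
    }

    /** Releases the interface holding the IP, then provisions a deferred one if present. */
    void release(String ip) {
        activeByIp.remove(ip);
        String waiting = pendingByIp.remove(ip);
        if (waiting != null) {
            activeByIp.put(ip, waiting);
        }
    }

    /** Returns the id of the interface provisioned for the IP, or null if none. */
    String activeInterface(String ip) {
        return activeByIp.get(ip);
    }
}
```

In this sketch the new VR's interface only becomes active once `release` is called for the old one, so the two never hold the same IP at the same time.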

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Screenshots (if appropriate):

How Has This Been Tested?

Checklist:

  • I have read the CONTRIBUTING document.
  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
    Testing
  • I have added tests to cover my changes.
  • All relevant new and existing integration tests have passed.
  • A full integration testsuite with all tests that can run on my environment has passed.

@fmaximus fmaximus added this to the 4.11.1.0 milestone May 24, 2018
@fmaximus fmaximus self-assigned this May 24, 2018
@fmaximus fmaximus requested a review from yadvr May 24, 2018 10:03
Member

@yadvr yadvr left a comment

LGTM. We cannot test Nuage-related code, but I'll help kick off a Trillian test.

@yadvr
Member

yadvr commented May 24, 2018

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2085

@yadvr
Member

yadvr commented May 24, 2018

@blueorangutan test

@blueorangutan

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-2710)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 25005 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr2672-t2710-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_deploy_virtio_scsi_vm.py
Intermittent failure detected: /marvin/tests/smoke/test_internal_lb.py
Intermittent failure detected: /marvin/tests/smoke/test_privategw_acl.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_redundant.py
Smoke tests completed. 65 look OK, 2 have error(s)
Only failed tests results shown below:

Test | Result | Time (s) | Test File
ContextSuite context=TestDeployVirtioSCSIVM>:setup | Error | 0.00 | test_deploy_virtio_scsi_vm.py
test_03_vpc_privategw_restart_vpc_cleanup | Error | 130.64 | test_privategw_acl.py

ReserveVmInterfaceVspCommand cmd = new ReserveVmInterfaceVspCommand(vspNetwork, vspVm, vspNic, vspStaticNat, dhcpOption);
Answer answer = _agentMgr.easySend(nuageVspHost.getId(), cmd);

if (answer == null || !answer.getResult()) {
Member

You can use BooleanUtils here.

Contributor Author

The null check is on answer, not on answer.getResult().
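
A small sketch of the point being made here: the guard protects against a missing answer object, which a Boolean helper applied to `answer.getResult()` would not do, because the method call itself fails when `answer` is null. `SketchAnswer` and `AnswerChecks` are stand-ins invented for illustration, and the sketch assumes `getResult()` returns a primitive `boolean`.

```java
// Stand-in for the agent answer type; only the accessor used in the snippet is modelled.
class SketchAnswer {
    private final boolean result;

    SketchAnswer(boolean result) {
        this.result = result;
    }

    boolean getResult() {
        return result;
    }
}

class AnswerChecks {
    // Mirrors `answer == null || !answer.getResult()`: the null check must run
    // before getResult() is called, otherwise a null answer throws a NullPointerException.
    static boolean failed(SketchAnswer answer) {
        return answer == null || !answer.getResult();
    }
}
```

With this helper, `AnswerChecks.failed(null)` reports a failure instead of throwing, which is the behavior the original condition relies on.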

VspDhcpVMOption dhcpOption = _nuageVspEntityBuilder.buildVmDhcpOption(nicFromDb, defaultHasDns, networkHasDns);
ReserveVmInterfaceVspCommand cmd = new ReserveVmInterfaceVspCommand(vspNetwork, vspVm, vspNic, vspStaticNat, dhcpOption);
Answer answer = _agentMgr.easySend(nuageVspHost.getId(), cmd);
if (!Boolean.TRUE.equals(vm.getParameter(VirtualMachineProfile.Param.RollingRestart))) {
Member

Would you mind extracting these blocks of code to specific methods and unit testing them? Some documentation might help as well.

This comment is not only for this PR: we really need to change our mindset and start using smaller, concise, well-documented and unit-tested methods. This in turn can greatly improve our productivity.

VspDhcpVMOption dhcpOption = _nuageVspEntityBuilder.buildVmDhcpOption(nicFromDb, defaultHasDns, networkHasDns);
ReserveVmInterfaceVspCommand cmd = new ReserveVmInterfaceVspCommand(vspNetwork, vspVm, vspNic, vspStaticNat, dhcpOption);
Answer answer = _agentMgr.easySend(nuageVspHost.getId(), cmd);
if (!Boolean.TRUE.equals(vm.getParameter(VirtualMachineProfile.Param.RollingRestart))) {
Member

You can use BooleanUtils to evaluate vm.getParameter(VirtualMachineProfile.Param.RollingRestart).

Contributor Author

Then I would have to cast, as vm.getParameter() returns Object
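
To illustrate the trade-off: `Boolean.TRUE.equals(...)` accepts any `Object`, so it works on an untyped parameter without a cast and returns false for null or for values of another type. The sketch below uses a plain `Map<String, Object>` as a stand-in for the VM profile parameters; the class name and the `"RollingRestart"` key are assumptions made for this example.

```java
import java.util.Map;

// Hypothetical stand-in for reading an untyped parameter from a VM profile.
class RollingRestartFlag {
    // Boolean.TRUE.equals(Object) needs no cast and never throws:
    // null, Boolean.FALSE, and non-Boolean values all yield false.
    static boolean isRollingRestart(Map<String, Object> params) {
        return Boolean.TRUE.equals(params.get("RollingRestart"));
    }
}
```

A helper that takes a `Boolean` argument would need `(Boolean) params.get(...)` first, and that cast throws a `ClassCastException` if the stored value is some other type.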

super.deallocate(network, nic, vm);
}

if (virtualMachine.getType() == VirtualMachine.Type.DomainRouter) {
Member

Can you extract this blob to a method?

Contributor Author

Do we really want to go into creating methods for checking enum values?
I'm not in favor of creating that kind of method in this class,
so it should go either in VirtualMachine or in VirtualMachine.Type

Contributor

I am pretty sure @rafaelweingartner means the entire if block and not the enum check, @fmaximus

Member

Spot on. I was referring to the IF body, and not to the IF condition. We really need shorter methods and more unit tests.
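
As a rough illustration of that suggestion, the condition stays in the caller and the IF body moves into its own small method that can be documented and unit tested separately. The body of the real IF block is not shown in this thread, so the enum, the method names, and what the extracted method does are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the enum and method names below are illustrative only.
class NicDeallocationSketch {
    enum VmType { User, DomainRouter }

    private final List<String> releasedRouterNics = new ArrayList<>();

    // The caller keeps only the condition; the body lives in its own method.
    void deallocate(VmType type, String nicId) {
        if (type == VmType.DomainRouter) {
            releaseRouterNic(nicId);
        }
    }

    // Extracted IF body: small enough to document and to unit test on its own.
    void releaseRouterNic(String nicId) {
        releasedRouterNics.add(nicId);
    }

    List<String> releasedRouterNics() {
        return releasedRouterNics;
    }
}
```

A unit test can then call `releaseRouterNic` directly, without having to set up everything `deallocate` needs.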

@yadvr
Member

yadvr commented May 29, 2018

@fmaximus can you address outstanding review comments?

@yadvr
Member

yadvr commented Jun 4, 2018

One comment about refactoring code into a separate method is still outstanding; @fmaximus, can you comment? I'm okay to merge this in order to unblock RC2. /cc @PaulAngus @DaanHoogland

@fmaximus
Contributor Author

fmaximus commented Jun 4, 2018

I was still running some tests on the refactoring.

@yadvr
Member

yadvr commented Jun 4, 2018

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@DaanHoogland
Contributor

OK by me. @rhtyd, should it be packaged and smoke-tested again? I saw you started that.

@blueorangutan

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2088

@yadvr
Member

yadvr commented Jun 4, 2018

@blueorangutan test

@blueorangutan

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@fmaximus
Contributor Author

fmaximus commented Jun 4, 2018

We found some issues with the fix; we're looking into them.

@blueorangutan

Trillian test result (tid-2719)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 24362 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr2672-t2719-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_deploy_virtio_scsi_vm.py
Intermittent failure detected: /marvin/tests/smoke/test_privategw_acl.py
Intermittent failure detected: /marvin/tests/smoke/test_hostha_kvm.py
Smoke tests completed. 64 look OK, 3 have error(s)
Only failed tests results shown below:

Test | Result | Time (s) | Test File
ContextSuite context=TestDeployVirtioSCSIVM>:setup | Error | 0.00 | test_deploy_virtio_scsi_vm.py
test_03_vpc_privategw_restart_vpc_cleanup | Failure | 141.16 | test_privategw_acl.py
test_hostha_enable_ha_when_host_in_maintenance | Error | 1.55 | test_hostha_kvm.py

@yadvr
Member

yadvr commented Jun 5, 2018

Test LGTM. @fmaximus, let us know (soon) when you're good with your PR(s). Also, please use GitHub to log issues, thanks.

@fmaximus
Contributor Author

fmaximus commented Jun 5, 2018

It's good! :shipit:

@DaanHoogland
Contributor

@PaulAngus I did a walk through and am fine to merge this one, ok?

@yadvr
Member

yadvr commented Jun 6, 2018

LGTM. Based on the code reviews and test results, I'm okay to merge this as well.

@yadvr yadvr merged commit 8798014 into apache:4.11 Jun 6, 2018

6 participants