systemvm: ipv6 fw_input — accept return traffic from established,rela…#13173
Open
agronaught wants to merge 1 commit into
Congratulations on your first Pull Request and welcome to the Apache CloudStack community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/cloudstack/blob/main/CONTRIBUTING.md).
…ted connections
The systemvm Virtual Router's nftables `ip6 ip6_firewall fw_input` chain is created with policy=drop and only ICMPv6 accept rules. The IPv4 INPUT chain has the equivalent `iifname "eth2" ct state established,related accept` rule (added by `fw_router_routing()`); the IPv6 path has no such rule.
Effect: any v6 connection the VR itself initiates outbound (BGP to upstream PE peers, NTP, DNS lookups, etc.) has its return traffic silently dropped at the v6 INPUT hook before TCP processes it. For Isolated v6 ROUTED networks this is fatal: BGP IPv6 sessions cannot establish, tenant /64 prefixes are never advertised upstream, and VMs in the network are unreachable from the IPv6 internet.
PR apache#10970 added the equivalent rule to the FORWARD chain only (covering tenant VM return traffic). This commit adds the matching rule to the INPUT chain (covering VR-originated return traffic) by introducing `fw_router_routing_v6()` as the IPv6 mirror of `fw_router_routing()`.
Verified end-to-end on ACS 4.22.0.0 KVM: before the patch, v6 BGP sessions stay in `Connect` indefinitely; tcpdump confirms PE responds with SYN-ACK but VR's TCP stack never sees the SYN-ACK (MD5 counters zero: drop happens at netfilter). After the patch, v6 BGP sessions reach `Established` within seconds and remain stable across subsequent tenant firewall rule updates.
Fixes: apache#13171
Signed-off-by: Jason Ball <jball@resetdata.com>
This PR adds the IPv6 equivalent of `fw_router_routing()` to the systemvm Virtual Router's network configuration, so that return traffic for VR-initiated IPv6 connections (BGP to upstream PE peers, NTP, DNS lookups, etc.) is allowed back through the `ip6_firewall fw_input` chain.
Problem
The systemvm VR's nftables `ip6 ip6_firewall fw_input` chain is created with `policy=drop` and only ICMPv6 accept rules. The IPv4 INPUT chain has the equivalent `iifname "eth2" ct state established,related accept` rule (added by `fw_router_routing()` in `CsAddress.py`); the IPv6 path has no such rule.
Effect: any v6 connection the VR itself initiates outbound has its return traffic silently dropped at the v6 INPUT hook before TCP processes it. For Isolated IPv6 ROUTED networks this is fatal: BGP IPv6 sessions cannot reach `Established`, tenant `/64` prefixes are never advertised upstream, and VMs in the network are unreachable from the IPv6 internet.
#10970 added the equivalent rule to the FORWARD chain (covering tenant VM return traffic) but explicitly removed it from the INPUT chain in its second commit. This PR completes that fix for VR-originated traffic.
Behavioural change
Before this PR, IPv6 BGP sessions from VRs in `IsolatedV6RoutedFiltered` (and similar Routed v6) network offerings stay in `Connect` state indefinitely. After this PR, sessions reach `Established` within seconds of VR start and prefix advertisements work normally.
The change is additive and behind the existing `is_routed()`/`is_vpc()` gating: only routed, non-VPC networks see new INPUT rules. No change for existing v4 paths, v4 NATted networks, or VPC networks.
Fixes: #13171
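The gating described above can be sketched in a few lines of Python. This is an illustration only, not the code in this PR: `build_v6_input_rules`, `render_nft`, and the dict layout are hypothetical names, and the real `CsAddress.py` structures may differ. The rule text and the `eth2` device are taken from the description.

```python
from typing import Dict, List


def build_v6_input_rules(is_routed: bool, is_vpc: bool,
                         public_dev: str = "eth2") -> List[Dict[str, str]]:
    """Return the extra fw_input rules for a routed, non-VPC router.

    Hypothetical helper: the dict layout is illustrative and is not the
    actual data structure used by CsAddress.py.
    """
    # Same gating the description states: VPC and non-routed networks get nothing.
    if is_vpc or not is_routed:
        return []
    return [{
        "chain": "fw_input",
        "rule": f'iifname "{public_dev}" ct state established,related counter accept',
    }]


def render_nft(rules: List[Dict[str, str]],
               table: str = "ip6 ip6_firewall") -> List[str]:
    """Render each rule dict as an `nft add rule` command line (for display)."""
    return [f"nft add rule {table} {r['chain']} {r['rule']}" for r in rules]


if __name__ == "__main__":
    for cmd in render_nft(build_v6_input_rules(is_routed=True, is_vpc=False)):
        print(cmd)
```

With `is_vpc=True` or `is_routed=False` the helper returns an empty list, which matches the "no change for VPC or NATted networks" claim.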
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Justifying Major: any operator wanting to ship the `IsolatedV6RoutedFiltered` offering (or any v6 Routed isolated network with `Firewall` service) for production tenant workloads is blocked. Workaround requires per-VR `nft` injection that wipes on every tenant FW rule change, making the offering unusable as a customer product without a downstream patch like this one.
Screenshots (if appropriate)
N/A — kernel-level firewall change.
How Has This Been Tested?
Verified end-to-end on Apache CloudStack 4.22.0.0, KVM hypervisor (Ubuntu 24.04 hosts), with:
- /48)
- `IsolatedV6RoutedFiltered` offering
Before the patch:
vtysh -c "show bgp ipv6 unicast summary"
Neighbor State/PfxRcd
2400:88e0:ffff:258::2 Connect 0
2400:88e0:ffff:258::3 Connect 0
Hypervisor-side packet capture on the underlay confirms PE responds with SYN-ACK, but the VR's TCP stack never delivers it to FRR. Kernel `TCPMD5*` counters stay at zero: drop happens at netfilter before TCP processes the segment. Inside the VR:
$ nft list table ip6 ip6_firewall
table ip6 ip6_firewall {
chain fw_input {
type filter hook input priority filter; policy drop;
icmpv6 type { ... } accept
}
...
}
No `ct state established,related accept` rule.
After the patch:
vtysh -c "show bgp ipv6 unicast summary"
Neighbor State/PfxRcd
2400:88e0:ffff:258::2 Established 1
2400:88e0:ffff:258::3 Established 1
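For anyone repeating this check across several VRs, the before/after comparison could be scripted roughly as below. It is a sketch that assumes the abbreviated column layout quoted here (neighbor, state, prefix count); full FRR summary output has more columns and would need a different parser. `parse_bgp_summary` and `all_established` are hypothetical names.

```python
from typing import Dict


def parse_bgp_summary(text: str) -> Dict[str, str]:
    """Map each neighbor address to the value in its state column.

    Assumes the abbreviated layout quoted in this PR description, where the
    second field of a neighbor row is the session state.
    """
    states: Dict[str, str] = {}
    for line in text.splitlines():
        fields = line.split()
        # Neighbor rows start with an IPv6 address, which contains ':'.
        if len(fields) >= 2 and ":" in fields[0]:
            states[fields[0]] = fields[1]
    return states


def all_established(text: str) -> bool:
    """True only if at least one neighbor is listed and all are Established."""
    states = parse_bgp_summary(text)
    return bool(states) and all(s == "Established" for s in states.values())


if __name__ == "__main__":
    sample = """Neighbor State/PfxRcd
2400:88e0:ffff:258::2 Established 1
2400:88e0:ffff:258::3 Established 1"""
    print(parse_bgp_summary(sample))
    print(all_established(sample))
```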
`fw_input` now includes the new rule with active counters:
iifname "eth2" ct state established,related counter packets ... bytes ... accept
Verified end-to-end: SSH from public IPv6 internet to a VM inside the v6-routed network succeeds. Reachability survives subsequent tenant firewall rule updates (the rule is rebuilt from `nft_ipv6_fw` on every `IpTablesExecutor.process()` cycle).
How did you try to break this feature and the system with this change?
- `cmk createIpv6FirewallRule`/`deleteIpv6FirewallRule` repeatedly after the patch. `IpTablesExecutor.process()` flushes and rebuilds the v6 table each time; the new INPUT rule is re-emitted on every cycle because it's now in `nft_ipv6_fw`. Counters resume; BGP stays Established.
- `cmk rebootRouter`). After the reboot pulls fresh `cloud-scripts.tgz`, the patched `CsAddress.py` runs in the rebuilt VR and the rule is in place from boot. BGP establishes within ~30s of VR ready.
- `is_routed()` gating means standard Isolated v4 networks and VPC networks see no new rules in either chain: no behaviour change for them.
- `eth2` reference and per-VR counter), with no cross-tenant traffic leakage.
Tested with both single-tenant and multi-tenant network deployments. Validated the substrate change on ACS 4.22.0.0; same code path exists in `4.20` branch HEAD per inspection.
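A reviewer wanting to re-run the rule-churn check could confirm after each firewall update that the rule is still present in `fw_input`. The sketch below works on the text printed by `nft list table ip6 ip6_firewall`; the helper names are hypothetical, and the matching is deliberately loose (any line in the chain with `ct state`, both states, and a trailing `accept`).

```python
import re


def chain_body(ruleset: str, chain: str) -> str:
    """Return the text between the braces of `chain <name> { ... }`."""
    match = re.search(r"chain\s+" + re.escape(chain) + r"\s*\{", ruleset)
    if not match:
        return ""
    depth, start = 1, match.end()
    # Count braces so nested sets such as `icmpv6 type { ... }` are skipped.
    for i in range(start, len(ruleset)):
        if ruleset[i] == "{":
            depth += 1
        elif ruleset[i] == "}":
            depth -= 1
            if depth == 0:
                return ruleset[start:i]
    return ""


def has_return_traffic_rule(ruleset: str, chain: str = "fw_input") -> bool:
    """True if the chain holds a conntrack established,related accept rule."""
    for line in chain_body(ruleset, chain).splitlines():
        line = line.strip()
        if ("ct state" in line and "established" in line
                and "related" in line and line.endswith("accept")):
            return True
    return False
```

Feeding it the pre-patch listing from the testing section should return `False`, and a listing that includes the new `iifname "eth2" ct state established,related ... accept` line should return `True`.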