Message ID | 20231221224311.130319-1-brad@faucet.nz (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Series | [net] netfilter: nf_nat: fix action not being set for all ct states |
+ Xin Long <lucien.xin@gmail.com>
  Aaron Conole <aconole@redhat.com>
  coreteam@netfilter.org

On Fri, Dec 22, 2023 at 11:43:11AM +1300, Brad Cowie wrote:
> This fixes openvswitch's handling of nat packets in the related state.
>
> In nf_ct_nat_execute(), which is called from nf_ct_nat(), ICMP/ICMPv6
> packets in the IP_CT_RELATED or IP_CT_RELATED_REPLY state, which have
> not been dropped, will follow the goto, however the placement of the
> goto label means that updating the action bit field will be bypassed.
>
> This causes ovs_nat_update_key() to not be called from ovs_ct_nat()
> which means the openvswitch match key for the ICMP/ICMPv6 packet is not
> updated and the pre-nat value will be retained for the key, which will
> result in the wrong openflow rule being matched for that packet.
>
> Move the goto label above where the action bit field is being set so
> that it is updated in all cases where the packet is accepted.
>
> Fixes: ebddb1404900 ("net: move the nat function to nf_nat_ovs for ovs and tc")
> Signed-off-by: Brad Cowie <brad@faucet.nz>

Thanks Brad,

I agree with your analysis and that the problem appears to
have been introduced by the cited commit.

I am curious to know what use case triggers this /
why it went unnoticed for a year.

But in any case, this fix looks good to me.

Reviewed-by: Simon Horman <horms@kernel.org>

> ---
>  net/netfilter/nf_nat_ovs.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/netfilter/nf_nat_ovs.c b/net/netfilter/nf_nat_ovs.c
> index 551abd2da614..0f9a559f6207 100644
> --- a/net/netfilter/nf_nat_ovs.c
> +++ b/net/netfilter/nf_nat_ovs.c
> @@ -75,9 +75,10 @@ static int nf_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
>  	}
>
>  	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> +out:
>  	if (err == NF_ACCEPT)
>  		*action |= BIT(maniptype);
> -out:
> +
>  	return err;
>  }
>
> --
> 2.34.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
On Sun, 24 Dec 2023 at 10:13, Simon Horman <horms@kernel.org> wrote:
> Thanks Brad,
>
> I agree with your analysis and that the problem appears to
> have been introduced by the cited commit.

Thanks for the review Simon.

> I am curious to know what use case triggers this /
> why it went unnoticed for a year.

We encountered this issue while upgrading some routers from
linux 5.15 to 6.2. The dataplane on these routers is provided
by an openvswitch bridge which is controlled via openflow by
faucet. These routers are also performing SNAT on all traffic
to/from the wan interface via openvswitch conntrack openflow
rules.

We noticed that after upgrading the linux kernel, traceroute/mtr
no longer worked when run from clients behind the router.
We eventually discovered the reason for this is that the
ICMP time exceeded messages elicited by traceroute were
matching openflow rules with the incorrect destination ip,
despite there being an openflow rule to undo the nat.
Other packets in the established or new state matched the
expected openflow rules.

A git bisect between 5.15 and 6.2 showed that this change in
behaviour was introduced by commit ebddb1404900. After the
above patch is applied our routers perform nat correctly
again for traceroute/mtr.
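The symptom described above (a related ICMP packet matching a rule on the wrong destination ip) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not openvswitch code: the `model_*` names, the two-rule table and the addresses are hypothetical, and the real datapath matches on far more than one field. It assumes, as the patch description says, that the match key is refreshed from the packet only when an action bit was reported.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical addresses: 203.0.113.1 (wan side) and 10.0.0.2 (client). */
#define MODEL_WAN_IP    0xCB007101u
#define MODEL_CLIENT_IP 0x0A000002u

/* Hypothetical, heavily reduced match key: destination ip only. */
struct model_key {
	uint32_t ip_dst;
};

/* Hypothetical rule: matches on destination ip, identified by rule_id. */
struct model_rule {
	uint32_t ip_dst;
	int rule_id;
};

/*
 * Refresh the key from the (already translated) packet, but only when
 * the NAT step reported an action bit. A zero action leaves the pre-nat
 * value in the key.
 */
void model_update_key(struct model_key *key, uint32_t pkt_ip_dst,
		      unsigned int action)
{
	if (action)
		key->ip_dst = pkt_ip_dst;
}

/* Return the id of the first rule whose ip_dst equals the key, or -1. */
int model_lookup(const struct model_rule *rules, size_t n,
		 const struct model_key *key)
{
	for (size_t i = 0; i < n; i++)
		if (rules[i].ip_dst == key->ip_dst)
			return rules[i].rule_id;
	return -1;
}
```

With a key that starts at `MODEL_WAN_IP` and a packet whose destination has already been rewritten to `MODEL_CLIENT_IP`, a zero action leaves the lookup on the rule for the wan address, while any non-zero action moves it to the rule for the client address.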
On Sat, Dec 23, 2023 at 9:48 PM Brad Cowie <brad@faucet.nz> wrote:
>
> On Sun, 24 Dec 2023 at 10:13, Simon Horman <horms@kernel.org> wrote:
> > Thanks Brad,
> >
> > I agree with your analysis and that the problem appears to
> > have been introduced by the cited commit.
>
> Thanks for the review Simon.
>
> > I am curious to know what use case triggers this /
> > why it went unnoticed for a year.
>
> We encountered this issue while upgrading some routers from
> linux 5.15 to 6.2. The dataplane on these routers is provided
> by an openvswitch bridge which is controlled via openflow by
> faucet. These routers are also performing SNAT on all traffic
> to/from the wan interface via openvswitch conntrack openflow
> rules.
>
> We noticed that after upgrading the linux kernel, traceroute/mtr
> no longer worked when run from clients behind the router.
> We eventually discovered the reason for this is that the
> ICMP time exceeded messages elicited by traceroute were
> matching openflow rules with the incorrect destination ip,
> despite there being an openflow rule to undo the nat.
> Other packets in the established or new state matched the
> expected openflow rules.
>
> A git bisect between 5.15 and 6.2 showed that this change in
> behaviour was introduced by commit ebddb1404900. After the
> above patch is applied our routers perform nat correctly
> again for traceroute/mtr.

Acked-by: Xin Long <lucien.xin@gmail.com>
Simon Horman <horms@kernel.org> writes:

> + Xin Long <lucien.xin@gmail.com>
>   Aaron Conole <aconole@redhat.com>
>   coreteam@netfilter.org
>
> On Fri, Dec 22, 2023 at 11:43:11AM +1300, Brad Cowie wrote:
>> This fixes openvswitch's handling of nat packets in the related state.
>>
>> In nf_ct_nat_execute(), which is called from nf_ct_nat(), ICMP/ICMPv6
>> packets in the IP_CT_RELATED or IP_CT_RELATED_REPLY state, which have
>> not been dropped, will follow the goto, however the placement of the
>> goto label means that updating the action bit field will be bypassed.
>>
>> This causes ovs_nat_update_key() to not be called from ovs_ct_nat()
>> which means the openvswitch match key for the ICMP/ICMPv6 packet is not
>> updated and the pre-nat value will be retained for the key, which will
>> result in the wrong openflow rule being matched for that packet.
>>
>> Move the goto label above where the action bit field is being set so
>> that it is updated in all cases where the packet is accepted.
>>
>> Fixes: ebddb1404900 ("net: move the nat function to nf_nat_ovs for ovs and tc")
>> Signed-off-by: Brad Cowie <brad@faucet.nz>
>
> Thanks Brad,
>
> I agree with your analysis and that the problem appears to
> have been introduced by the cited commit.
>
> I am curious to know what use case triggers this /
> why it went unnoticed for a year.
>
> But in any case, this fix looks good to me.
>
> Reviewed-by: Simon Horman <horms@kernel.org>
>
>> ---

LGTM. I guess we should try to codify the specific flows that were used
to flag this into the ovs selftest - we clearly have a missing case
after NAT lookup. I'll add it to my (ever growing) list.

Meanwhile,

Acked-by: Aaron Conole <aconole@redhat.com>

>>  net/netfilter/nf_nat_ovs.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/netfilter/nf_nat_ovs.c b/net/netfilter/nf_nat_ovs.c
>> index 551abd2da614..0f9a559f6207 100644
>> --- a/net/netfilter/nf_nat_ovs.c
>> +++ b/net/netfilter/nf_nat_ovs.c
>> @@ -75,9 +75,10 @@ static int nf_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
>>  	}
>>
>>  	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
>> +out:
>>  	if (err == NF_ACCEPT)
>>  		*action |= BIT(maniptype);
>> -out:
>> +
>>  	return err;
>>  }
>>
>> --
>> 2.34.1
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
Applied to nf.git, thanks everyone for reviewing.
On Wed, 3 Jan 2024 at 04:10, Aaron Conole <aconole@redhat.com> wrote:
> LGTM. I guess we should try to codify the specific flows that were used
> to flag this into the ovs selftest - we clearly have a missing case
> after NAT lookup.

Thanks for the review Aaron, and the sensible suggestion to add a test to
ovs to avoid this problem occurring again in future.

I've simplified our NAT ruleset and turned it into an ovs system test,
which I've submitted as a patch [1] to ovs-dev. The test reproduces the
issue introduced by ebddb1404900 and passes when e6345d2824a3 is applied.

[1]: https://mail.openvswitch.org/pipermail/ovs-dev/2024-January/410476.html
diff --git a/net/netfilter/nf_nat_ovs.c b/net/netfilter/nf_nat_ovs.c
index 551abd2da614..0f9a559f6207 100644
--- a/net/netfilter/nf_nat_ovs.c
+++ b/net/netfilter/nf_nat_ovs.c
@@ -75,9 +75,10 @@ static int nf_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
 	}
 
 	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
+out:
 	if (err == NF_ACCEPT)
 		*action |= BIT(maniptype);
-out:
+
 	return err;
 }
 
This fixes openvswitch's handling of nat packets in the related state.

In nf_ct_nat_execute(), which is called from nf_ct_nat(), ICMP/ICMPv6
packets in the IP_CT_RELATED or IP_CT_RELATED_REPLY state, which have
not been dropped, will follow the goto, however the placement of the
goto label means that updating the action bit field will be bypassed.

This causes ovs_nat_update_key() to not be called from ovs_ct_nat()
which means the openvswitch match key for the ICMP/ICMPv6 packet is not
updated and the pre-nat value will be retained for the key, which will
result in the wrong openflow rule being matched for that packet.

Move the goto label above where the action bit field is being set so
that it is updated in all cases where the packet is accepted.

Fixes: ebddb1404900 ("net: move the nat function to nf_nat_ovs for ovs and tc")
Signed-off-by: Brad Cowie <brad@faucet.nz>
---
 net/netfilter/nf_nat_ovs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
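
The effect of the label placement described in this commit message can be reproduced in a standalone userspace model. The sketch below is not the kernel function: `model_execute_buggy()`, `model_execute_fixed()` and the `MODEL_*` macros are hypothetical stand-ins, the related branch is reduced to a single success flag, and the call to nf_nat_packet() is assumed to always accept. It only shows how a label placed after the bit update is skipped by the goto, and how moving the label above the update covers every accepted packet.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the verdict values and the BIT() macro. */
#define MODEL_DROP   0u
#define MODEL_ACCEPT 1u
#define MODEL_BIT(n) (1u << (n))

/* Pre-fix layout: the label sits after the action update. */
unsigned int model_execute_buggy(bool related, bool translate_ok,
				 unsigned int maniptype, unsigned int *action)
{
	unsigned int err = MODEL_ACCEPT;

	if (related) {
		/* models the ICMP/ICMPv6 related-state branch */
		if (!translate_ok)
			err = MODEL_DROP;
		goto out;	/* jumps past the action update below */
	}

	err = MODEL_ACCEPT;	/* models the nf_nat_packet() verdict */
	if (err == MODEL_ACCEPT)
		*action |= MODEL_BIT(maniptype);
out:
	return err;
}

/* Post-fix layout: the label sits above the action update. */
unsigned int model_execute_fixed(bool related, bool translate_ok,
				 unsigned int maniptype, unsigned int *action)
{
	unsigned int err = MODEL_ACCEPT;

	if (related) {
		if (!translate_ok)
			err = MODEL_DROP;
		goto out;	/* now lands on the action update */
	}

	err = MODEL_ACCEPT;	/* models the nf_nat_packet() verdict */
out:
	if (err == MODEL_ACCEPT)
		*action |= MODEL_BIT(maniptype);
	return err;
}
```

In this model an accepted related packet leaves `*action` untouched in the buggy layout and sets the bit for `maniptype` in the fixed layout, while a dropped related packet leaves `*action` untouched in both.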