diff mbox series

[net] tcp: add a missing nf_reset_ct() in 3WHS handling

Message ID 20220623050436.1290307-1-edumazet@google.com (mailing list archive)
State Accepted
Commit 6f0012e35160cd08a53e46e3b3bbf724b92dfe68
Delegated to: Netdev Maintainers
Headers show
Series [net] tcp: add a missing nf_reset_ct() in 3WHS handling | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 20 this patch: 20
netdev/cc_maintainers fail 1 blamed authors not CCed: kaber@trash.net; 3 maintainers not CCed: yoshfuji@linux-ipv6.org kaber@trash.net dsahern@kernel.org
netdev/build_clang success Errors and warnings before: 6 this patch: 6
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 20 this patch: 20
netdev/checkpatch warning CHECK: Alignment should match open parenthesis
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Eric Dumazet June 23, 2022, 5:04 a.m. UTC
When the third packet of 3WHS connection establishment
contains payload, it is added into socket receive queue
without the XFRM check and the drop of connection tracking
context.

This means that if the data is left unread in the socket
receive queue, conntrack module can not be unloaded.

As most applications usually reads the incoming data
immediately after accept(), bug has been hiding for
quite a long time.

Commit 68822bdf76f1 ("net: generalize skb freeing
deferral to per-cpu lists") exposed this bug because
even if the application reads this data, the skb
with nfct state could stay in a per-cpu cache for
an arbitrary time, if said cpu no longer process RX softirqs.

Many thanks to Ilya Maximets for reporting this issue,
and for testing various patches:
https://lore.kernel.org/netdev/20220619003919.394622-1-i.maximets@ovn.org/

Note that I also added a missing xfrm4_policy_check() call,
although this is probably not a big issue, as the SYN
packet should have been dropped earlier.

Fixes: b59c270104f0 ("[NETFILTER]: Keep conntrack reference until IPsec policy checks are done")
Reported-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/tcp_ipv4.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Ilya Maximets June 23, 2022, 11:48 a.m. UTC | #1
On 6/23/22 07:04, Eric Dumazet wrote:
> When the third packet of 3WHS connection establishment
> contains payload, it is added into socket receive queue
> without the XFRM check and the drop of connection tracking
> context.
> 
> This means that if the data is left unread in the socket
> receive queue, conntrack module can not be unloaded.
> 
> As most applications usually reads the incoming data
> immediately after accept(), bug has been hiding for
> quite a long time.
> 
> Commit 68822bdf76f1 ("net: generalize skb freeing
> deferral to per-cpu lists") exposed this bug because
> even if the application reads this data, the skb
> with nfct state could stay in a per-cpu cache for
> an arbitrary time, if said cpu no longer process RX softirqs.
> 
> Many thanks to Ilya Maximets for reporting this issue,
> and for testing various patches:
> https://lore.kernel.org/netdev/20220619003919.394622-1-i.maximets@ovn.org/
> 
> Note that I also added a missing xfrm4_policy_check() call,
> although this is probably not a big issue, as the SYN
> packet should have been dropped earlier.
> 
> Fixes: b59c270104f0 ("[NETFILTER]: Keep conntrack reference until IPsec policy checks are done")
> Reported-by: Ilya Maximets <i.maximets@ovn.org>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Florian Westphal <fw@strlen.de>
> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> Cc: Steffen Klassert <steffen.klassert@secunet.com>
> ---
>  net/ipv4/tcp_ipv4.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)

Thanks!  I re-tested this change with the OVS testsuite
and it works fine.  It can successfully reload ntfilter
modules now.  So, for the nf_reset_ct part of the fix:

Tested-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>

XFRM part seems correct to me, but I didn't test it.
patchwork-bot+netdevbpf@kernel.org June 24, 2022, 11:40 p.m. UTC | #2
Hello:

This patch was applied to netdev/net.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 23 Jun 2022 05:04:36 +0000 you wrote:
> When the third packet of 3WHS connection establishment
> contains payload, it is added into socket receive queue
> without the XFRM check and the drop of connection tracking
> context.
> 
> This means that if the data is left unread in the socket
> receive queue, conntrack module can not be unloaded.
> 
> [...]

Here is the summary with links:
  - [net] tcp: add a missing nf_reset_ct() in 3WHS handling
    https://git.kernel.org/netdev/net/c/6f0012e35160

You are awesome, thank you!
diff mbox series

Patch

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index fe8f23b95d32ca4a35d05166d471327bc608fa91..da5a3c44c4fb70f1d3ecc596e694a86267f1c44a 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1964,7 +1964,10 @@  int tcp_v4_rcv(struct sk_buff *skb)
 		struct sock *nsk;
 
 		sk = req->rsk_listener;
-		drop_reason = tcp_inbound_md5_hash(sk, skb,
+		if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
+			drop_reason = SKB_DROP_REASON_XFRM_POLICY;
+		else
+			drop_reason = tcp_inbound_md5_hash(sk, skb,
 						   &iph->saddr, &iph->daddr,
 						   AF_INET, dif, sdif);
 		if (unlikely(drop_reason)) {
@@ -2016,6 +2019,7 @@  int tcp_v4_rcv(struct sk_buff *skb)
 			}
 			goto discard_and_relse;
 		}
+		nf_reset_ct(skb);
 		if (nsk == sk) {
 			reqsk_put(req);
 			tcp_v4_restore_cb(skb);