[net,v3] net/tcp: Disable TCP-AO static key after RCU grace period

Message ID 20240801-tcp-ao-static-branch-rcu-v3-1-3ca33048c22d@gmail.com (mailing list archive)
State Accepted
Commit 14ab4792ee120c022f276a7e4768f4dcb08f0cdd
Delegated to: Netdev Maintainers
Series [net,v3] net/tcp: Disable TCP-AO static key after RCU grace period

Checks

Context                         Check    Description
netdev/series_format            success  Single patches do not need cover letters
netdev/tree_selection           success  Clearly marked for net
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 7 this patch: 7
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           success  CCed 6 of 6 maintainers
netdev/build_clang              success  Errors and warnings before: 7 this patch: 7
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 7 this patch: 7
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 62 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2024-08-01--06-00 (tests: 707)

Commit Message

Dmitry Safonov via B4 Relay Aug. 1, 2024, 12:13 a.m. UTC
From: Dmitry Safonov <0x7f454c46@gmail.com>

The lifetime of the TCP-AO static_key is the same as that of the last
tcp_ao_info. On socket destruction, tcp_ao_info is freed after an RCU
grace period, while the TCP-AO static branch is currently disabled by a
deferred decrement. The static key definition is
: DEFINE_STATIC_KEY_DEFERRED_FALSE(tcp_ao_needed, HZ);

which means that if the RCU grace period is delayed by more than a second
and tcp_ao_needed is in the process of being disabled, other CPUs may
still see a tcp_ao_info which isn't dead yet, but soon will be.
And that breaks the assumption of static_key_fast_inc_not_disabled().

See the comment near the definition:
> * The caller must make sure that the static key can't get disabled while
> * in this function. It doesn't patch jump labels, only adds a user to
> * an already enabled static key.

It was originally introduced in commit eb8c507296f6 ("jump_label:
Prevent key->enabled int overflow") and is needed in atomic
contexts, one of which is the creation of a full socket from a
request socket. In that atomic context, the presence of the key
(md5/ao) guarantees that the static branch is already enabled.
So, the ref counter for that static branch is just incremented
instead of holding the proper mutex.
static_key_fast_inc_not_disabled() is just a helper for that use
case. But it must not be used if the static branch could get disabled
in parallel, as it's not protected by jump_label_mutex and, as a result,
races with jump_label_update() implementation details.
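
To illustrate the constraint described above, here is a minimal user-space
sketch in C. It is a toy model, not the kernel implementation: struct
toy_key, its fields and the toy_key_*() helpers are hypothetical names made
up for this example. The real slow path serializes on jump_label_mutex and
patches code, and the real fast path uses an atomic compare-and-exchange;
the model only shows why a fast increment is fine while the key is known to
be enabled, and why it leaves the key inconsistent once the key may already
have been disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: NOT the kernel's struct static_key. */
struct toy_key {
	int users;     /* stands in for the key's enabled/user count */
	bool patched;  /* true while the branch sites are patched "on" */
};

/* Slow path: in the kernel this may sleep and takes a mutex.
 * It patches the branch on the 0 -> 1 transition. */
static inline void toy_key_slow_inc(struct toy_key *k)
{
	if (k->users++ == 0)
		k->patched = true;
}

/* Slow path: unpatches the branch on the 1 -> 0 transition. */
static inline void toy_key_slow_dec(struct toy_key *k)
{
	if (--k->users == 0)
		k->patched = false;
}

/* Fast path: only adds a user and never patches anything.
 * The caller must guarantee the key is enabled and stays enabled. */
static inline void toy_key_fast_inc(struct toy_key *k)
{
	k->users++;
}

/* The invariant the slow paths maintain: users > 0 iff patched. */
static inline bool toy_key_consistent(const struct toy_key *k)
{
	return (k->users > 0) == k->patched;
}
```

In this model, toy_key_fast_inc() on a key that another user still holds
keeps toy_key_consistent() true. Called after the last toy_key_slow_dec(),
it yields users == 1 with patched == false, which is the kind of mismatch
the fast helper's comment warns about.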

Happened on netdev test-bot[1], so not a theoretical issue:

[] jump_label: Fatal kernel bug, unexpected op at tcp_inbound_hash+0x1a7/0x870 [ffffffffa8c4e9b7] (eb 50 0f 1f 44 != 66 90 0f 1f 00)) size:2 type:1
[] ------------[ cut here ]------------
[] kernel BUG at arch/x86/kernel/jump_label.c:73!
[] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
[] CPU: 3 PID: 243 Comm: kworker/3:3 Not tainted 6.10.0-virtme #1
[] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[] Workqueue: events jump_label_update_timeout
[] RIP: 0010:__jump_label_patch+0x2f6/0x350
...
[] Call Trace:
[]  <TASK>
[]  arch_jump_label_transform_queue+0x6c/0x110
[]  __jump_label_update+0xef/0x350
[]  __static_key_slow_dec_cpuslocked.part.0+0x3c/0x60
[]  jump_label_update_timeout+0x2c/0x40
[]  process_one_work+0xe3b/0x1670
[]  worker_thread+0x587/0xce0
[]  kthread+0x28a/0x350
[]  ret_from_fork+0x31/0x70
[]  ret_from_fork_asm+0x1a/0x30
[]  </TASK>
[] Modules linked in: veth
[] ---[ end trace 0000000000000000 ]---
[] RIP: 0010:__jump_label_patch+0x2f6/0x350

[1]: https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/696681/5-connect-deny-ipv6/stderr

Cc: stable@kernel.org
Fixes: 67fa83f7c86a ("net/tcp: Add static_key for TCP-AO")
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
---
Changes in v3:
- Use hlist_for_each_entry() instead of hlist_for_each_entry_safe() in
  tcp_ao_sk_omem_free() (Jakub's suggestion)
- While at it, use hlist_del() instead of hlist_del_rcu() in
  tcp_ao_info_free_rcu()
- Link to v2: https://lore.kernel.org/r/20240730-tcp-ao-static-branch-rcu-v2-1-33dc2b7adac8@gmail.com

Changes in v2:
- Use rcu_assign_pointer() in tcp_ao_destroy_sock()
- Combined both ao_info and keys destruction in one RCU callback,
  tcp_ao_info_free_rcu() (suggested-by Jakub)
- Hopefully improved the commit message a bit
- Link to v1: https://lore.kernel.org/r/20240725-tcp-ao-static-branch-rcu-v1-1-021d009beebf@gmail.com
---
 net/ipv4/tcp_ao.c | 43 ++++++++++++++++++++++++++++++-------------
 1 file changed, 30 insertions(+), 13 deletions(-)


---
base-commit: 21b136cc63d2a9ddd60d4699552b69c214b32964
change-id: 20240725-tcp-ao-static-branch-rcu-85ede7b3a1a5

Best regards,

Comments

Eric Dumazet Aug. 1, 2024, 6:46 a.m. UTC | #1
On Thu, Aug 1, 2024 at 2:13 AM Dmitry Safonov via B4 Relay
<devnull+0x7f454c46.gmail.com@kernel.org> wrote:
>
> From: Dmitry Safonov <0x7f454c46@gmail.com>
>
> The lifetime of the TCP-AO static_key is the same as that of the last
> tcp_ao_info. On socket destruction, tcp_ao_info is freed after an RCU
> grace period, while the TCP-AO static branch is currently disabled by a
> deferred decrement. The static key definition is
> : DEFINE_STATIC_KEY_DEFERRED_FALSE(tcp_ao_needed, HZ);
>
> which means that if the RCU grace period is delayed by more than a second
> and tcp_ao_needed is in the process of being disabled, other CPUs may
> still see a tcp_ao_info which isn't dead yet, but soon will be.
> And that breaks the assumption of static_key_fast_inc_not_disabled().
>
> See the comment near the definition:
> > * The caller must make sure that the static key can't get disabled while
> > * in this function. It doesn't patch jump labels, only adds a user to
> > * an already enabled static key.
>
> It was originally introduced in commit eb8c507296f6 ("jump_label:
> Prevent key->enabled int overflow") and is needed in atomic
> contexts, one of which is the creation of a full socket from a
> request socket. In that atomic context, the presence of the key
> (md5/ao) guarantees that the static branch is already enabled.
> So, the ref counter for that static branch is just incremented
> instead of holding the proper mutex.
> static_key_fast_inc_not_disabled() is just a helper for that use
> case. But it must not be used if the static branch could get disabled
> in parallel, as it's not protected by jump_label_mutex and, as a result,
> races with jump_label_update() implementation details.
>
> Happened on netdev test-bot[1], so not a theoretical issue:
>
>
> [1]: https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/696681/5-connect-deny-ipv6/stderr
>
> Cc: stable@kernel.org
> Fixes: 67fa83f7c86a ("net/tcp: Add static_key for TCP-AO")
> Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>
Dmitry Safonov Aug. 2, 2024, 12:37 a.m. UTC | #2
On Thu, 1 Aug 2024 at 01:13, Dmitry Safonov via B4 Relay
<devnull+0x7f454c46.gmail.com@kernel.org> wrote:
>
> From: Dmitry Safonov <0x7f454c46@gmail.com>
[..]
> Happened on netdev test-bot[1], so not a theoretical issue:

Self-correction: I see a static_key fix in git.tip tree from a recent
regression, which could lead to the same kind of failure. So, I'm not
entirely sure the issue isn't theoretical.
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=224fa3552029

Yet, I guess it won't be any worse to fix this, even if it is theoretical.

> [] jump_label: Fatal kernel bug, unexpected op at tcp_inbound_hash+0x1a7/0x870 [ffffffffa8c4e9b7] (eb 50 0f 1f 44 != 66 90 0f 1f 00)) size:2 type:1
> [] ------------[ cut here ]------------
> [] kernel BUG at arch/x86/kernel/jump_label.c:73!
> [] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
> [] CPU: 3 PID: 243 Comm: kworker/3:3 Not tainted 6.10.0-virtme #1
> [] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> [] Workqueue: events jump_label_update_timeout
> [] RIP: 0010:__jump_label_patch+0x2f6/0x350
> ...
> [] Call Trace:
> []  <TASK>
> []  arch_jump_label_transform_queue+0x6c/0x110
> []  __jump_label_update+0xef/0x350
> []  __static_key_slow_dec_cpuslocked.part.0+0x3c/0x60
> []  jump_label_update_timeout+0x2c/0x40
> []  process_one_work+0xe3b/0x1670
> []  worker_thread+0x587/0xce0
> []  kthread+0x28a/0x350
> []  ret_from_fork+0x31/0x70
> []  ret_from_fork_asm+0x1a/0x30
> []  </TASK>
> [] Modules linked in: veth
> [] ---[ end trace 0000000000000000 ]---
> [] RIP: 0010:__jump_label_patch+0x2f6/0x350
>
> [1]: https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/696681/5-connect-deny-ipv6/stderr
[..]

Thanks,
             Dmitry
Kuniyuki Iwashima Aug. 2, 2024, 1:02 a.m. UTC | #3
From: Dmitry Safonov <0x7f454c46@gmail.com>
Date: Fri, 2 Aug 2024 01:37:28 +0100
> On Thu, 1 Aug 2024 at 01:13, Dmitry Safonov via B4 Relay
> <devnull+0x7f454c46.gmail.com@kernel.org> wrote:
> >
> > From: Dmitry Safonov <0x7f454c46@gmail.com>
> [..]
> > Happened on netdev test-bot[1], so not a theoretical issue:
> 
> Self-correction: I see a static_key fix in git.tip tree from a recent
> regression, which could lead to the same kind of failure. So, I'm not
> entirely sure the issue isn't theoretical.
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=224fa3552029

My syzkaller instances recently started to report similar splats over
different places (TCP-AO/MD5, fl6, netfilter, perf, etc), and I was
suspecting a bug in the jump label side.

report19:2:jump_label: Fatal kernel bug, unexpected op at fl6_sock_lookup include/net/ipv6.h:414 [inline] [000000001bd3e3db] (e9 ee 00 00 00 != 0f 1f 44 00 00)) size:5 type:1
report23:1:jump_label: Fatal kernel bug, unexpected op at nf_skip_egress include/linux/netfilter_netdev.h:136 [inline] [00000000c1241913] (e9 e9 0a 00 00 != 0f 1f 44 00 00)) size:5 type:1
report45:2:jump_label: Fatal kernel bug, unexpected op at tcp_ao_required include/net/tcp.h:2776 [inline] [000000009a4b37e9] (eb 5a e8 e1 57 != 66 90 0f 1f 00)) size:2 type:1
report49:3:jump_label: Fatal kernel bug, unexpected op at perf_sw_event include/linux/perf_event.h:1432 [inline] [00000000c1f7a26c] (eb 24 e9 63 fe != 66 90 0f 1f 00)) size:2 type:1
report58:2:jump_label: Fatal kernel bug, unexpected op at tcp_md5_do_lookup include/net/tcp.h:1852 [inline] [00000000fbd24b58] (e9 8d 01 00 00 != 0f 1f 44 00 00)) size:5 type:1

I'll cherry-pick the patch and see if it fixes them altogether.
It will take a few days.

Thanks!
Dmitry Safonov Aug. 2, 2024, 2 a.m. UTC | #4
On Fri, 2 Aug 2024 at 02:03, Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> From: Dmitry Safonov <0x7f454c46@gmail.com>
> Date: Fri, 2 Aug 2024 01:37:28 +0100
> > On Thu, 1 Aug 2024 at 01:13, Dmitry Safonov via B4 Relay
> > <devnull+0x7f454c46.gmail.com@kernel.org> wrote:
> > >
> > > From: Dmitry Safonov <0x7f454c46@gmail.com>
> > [..]
> > > Happened on netdev test-bot[1], so not a theoretical issue:
> >
> > Self-correction: I see a static_key fix in git.tip tree from a recent
> > regression, which could lead to the same kind of failure. So, I'm not
> > entirely sure the issue isn't theoretical.
> > https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=224fa3552029
>
> My syzkaller instances recently started to report similar splats over
> different places (TCP-AO/MD5, fl6, netfilter, perf, etc), and I was
> suspecting a bug in the jump label side.

I'm glad I dropped you a hint :-)

> report19:2:jump_label: Fatal kernel bug, unexpected op at fl6_sock_lookup include/net/ipv6.h:414 [inline] [000000001bd3e3db] (e9 ee 00 00 00 != 0f 1f 44 00 00)) size:5 type:1
> report23:1:jump_label: Fatal kernel bug, unexpected op at nf_skip_egress include/linux/netfilter_netdev.h:136 [inline] [00000000c1241913] (e9 e9 0a 00 00 != 0f 1f 44 00 00)) size:5 type:1
> report45:2:jump_label: Fatal kernel bug, unexpected op at tcp_ao_required include/net/tcp.h:2776 [inline] [000000009a4b37e9] (eb 5a e8 e1 57 != 66 90 0f 1f 00)) size:2 type:1
> report49:3:jump_label: Fatal kernel bug, unexpected op at perf_sw_event include/linux/perf_event.h:1432 [inline] [00000000c1f7a26c] (eb 24 e9 63 fe != 66 90 0f 1f 00)) size:2 type:1
> report58:2:jump_label: Fatal kernel bug, unexpected op at tcp_md5_do_lookup include/net/tcp.h:1852 [inline] [00000000fbd24b58] (e9 8d 01 00 00 != 0f 1f 44 00 00)) size:5 type:1
>
> I'll cherry-pick the patch and see if it fixes them altogether.
> It will take a few days.

Thanks!
patchwork-bot+netdevbpf@kernel.org Aug. 4, 2024, 1:31 p.m. UTC | #5
Hello:

This patch was applied to netdev/net.git (main)
by David S. Miller <davem@davemloft.net>:

On Thu, 01 Aug 2024 01:13:28 +0100 you wrote:
> From: Dmitry Safonov <0x7f454c46@gmail.com>
> 
> The lifetime of the TCP-AO static_key is the same as that of the last
> tcp_ao_info. On socket destruction, tcp_ao_info is freed after an RCU
> grace period, while the TCP-AO static branch is currently disabled by a
> deferred decrement. The static key definition is
> : DEFINE_STATIC_KEY_DEFERRED_FALSE(tcp_ao_needed, HZ);
> 
> [...]

Here is the summary with links:
  - [net,v3] net/tcp: Disable TCP-AO static key after RCU grace period
    https://git.kernel.org/netdev/net/c/14ab4792ee12

You are awesome, thank you!
Kuniyuki Iwashima Aug. 5, 2024, 6:04 p.m. UTC | #6
From: Dmitry Safonov <0x7f454c46@gmail.com>
Date: Fri, 2 Aug 2024 03:00:59 +0100
> On Fri, 2 Aug 2024 at 02:03, Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > From: Dmitry Safonov <0x7f454c46@gmail.com>
> > Date: Fri, 2 Aug 2024 01:37:28 +0100
> > > On Thu, 1 Aug 2024 at 01:13, Dmitry Safonov via B4 Relay
> > > <devnull+0x7f454c46.gmail.com@kernel.org> wrote:
> > > >
> > > > From: Dmitry Safonov <0x7f454c46@gmail.com>
> > > [..]
> > > > Happened on netdev test-bot[1], so not a theoretical issue:
> > >
> > > Self-correction: I see a static_key fix in git.tip tree from a recent
> > > regression, which could lead to the same kind of failure. So, I'm not
> > > entirely sure the issue isn't theoretical.
> > > https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=224fa3552029
> >
> > My syzkaller instances recently started to report similar splats over
> > different places (TCP-AO/MD5, fl6, netfilter, perf, etc), and I was
> > suspecting a bug in the jump label side.
> 
> I'm glad I dropped you a hint :-)
> 
> > report19:2:jump_label: Fatal kernel bug, unexpected op at fl6_sock_lookup include/net/ipv6.h:414 [inline] [000000001bd3e3db] (e9 ee 00 00 00 != 0f 1f 44 00 00)) size:5 type:1
> > report23:1:jump_label: Fatal kernel bug, unexpected op at nf_skip_egress include/linux/netfilter_netdev.h:136 [inline] [00000000c1241913] (e9 e9 0a 00 00 != 0f 1f 44 00 00)) size:5 type:1
> > report45:2:jump_label: Fatal kernel bug, unexpected op at tcp_ao_required include/net/tcp.h:2776 [inline] [000000009a4b37e9] (eb 5a e8 e1 57 != 66 90 0f 1f 00)) size:2 type:1
> > report49:3:jump_label: Fatal kernel bug, unexpected op at perf_sw_event include/linux/perf_event.h:1432 [inline] [00000000c1f7a26c] (eb 24 e9 63 fe != 66 90 0f 1f 00)) size:2 type:1
> > report58:2:jump_label: Fatal kernel bug, unexpected op at tcp_md5_do_lookup include/net/tcp.h:1852 [inline] [00000000fbd24b58] (e9 8d 01 00 00 != 0f 1f 44 00 00)) size:5 type:1
> >
> > I'll cherry-pick the patch and see if it fixes them altogether.
> > It will take a few days.

I haven't seen the splat so far after applying 224fa3552029.
Before that, syzkaller reported jump_label splats 4-5 times
every day, so I think 224fa3552029 fixed the regression.

Thanks!

Patch

diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 85531437890c..db6516092daf 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -267,32 +267,49 @@  static void tcp_ao_key_free_rcu(struct rcu_head *head)
 	kfree_sensitive(key);
 }
 
-void tcp_ao_destroy_sock(struct sock *sk, bool twsk)
+static void tcp_ao_info_free_rcu(struct rcu_head *head)
 {
-	struct tcp_ao_info *ao;
+	struct tcp_ao_info *ao = container_of(head, struct tcp_ao_info, rcu);
 	struct tcp_ao_key *key;
 	struct hlist_node *n;
 
+	hlist_for_each_entry_safe(key, n, &ao->head, node) {
+		hlist_del(&key->node);
+		tcp_sigpool_release(key->tcp_sigpool_id);
+		kfree_sensitive(key);
+	}
+	kfree(ao);
+	static_branch_slow_dec_deferred(&tcp_ao_needed);
+}
+
+static void tcp_ao_sk_omem_free(struct sock *sk, struct tcp_ao_info *ao)
+{
+	size_t total_ao_sk_mem = 0;
+	struct tcp_ao_key *key;
+
+	hlist_for_each_entry(key,  &ao->head, node)
+		total_ao_sk_mem += tcp_ao_sizeof_key(key);
+	atomic_sub(total_ao_sk_mem, &sk->sk_omem_alloc);
+}
+
+void tcp_ao_destroy_sock(struct sock *sk, bool twsk)
+{
+	struct tcp_ao_info *ao;
+
 	if (twsk) {
 		ao = rcu_dereference_protected(tcp_twsk(sk)->ao_info, 1);
-		tcp_twsk(sk)->ao_info = NULL;
+		rcu_assign_pointer(tcp_twsk(sk)->ao_info, NULL);
 	} else {
 		ao = rcu_dereference_protected(tcp_sk(sk)->ao_info, 1);
-		tcp_sk(sk)->ao_info = NULL;
+		rcu_assign_pointer(tcp_sk(sk)->ao_info, NULL);
 	}
 
 	if (!ao || !refcount_dec_and_test(&ao->refcnt))
 		return;
 
-	hlist_for_each_entry_safe(key, n, &ao->head, node) {
-		hlist_del_rcu(&key->node);
-		if (!twsk)
-			atomic_sub(tcp_ao_sizeof_key(key), &sk->sk_omem_alloc);
-		call_rcu(&key->rcu, tcp_ao_key_free_rcu);
-	}
-
-	kfree_rcu(ao, rcu);
-	static_branch_slow_dec_deferred(&tcp_ao_needed);
+	if (!twsk)
+		tcp_ao_sk_omem_free(sk, ao);
+	call_rcu(&ao->rcu, tcp_ao_info_free_rcu);
 }
 
 void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp)