diff mbox series

[v2,net-next,06/14] netlink: hold nlk->cb_mutex longer in __netlink_dump_start()

Message ID 20240222105021.1943116-7-edumazet@google.com (mailing list archive)
State Accepted
Commit b5590270068c4324dac4a2b5a4a156e02e21339f
Delegated to: Netdev Maintainers
Headers show
Series rtnetlink: reduce RTNL pressure for dumps | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next, async
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 945 this patch: 945
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 4 of 4 maintainers
netdev/build_clang success Errors and warnings before: 958 this patch: 958
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 962 this patch: 962
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 43 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-02-23--03-00 (tests: 1457)

Commit Message

Eric Dumazet Feb. 22, 2024, 10:50 a.m. UTC
__netlink_dump_start() releases nlk->cb_mutex right before
calling netlink_dump() which grabs it again.

This seems dangerous, even if KASAN did not bother yet.

Add a @lock_taken parameter to netlink_dump() to let it
grab the mutex if called from netlink_recvmsg() only.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/netlink/af_netlink.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

Comments

Jiri Pirko Feb. 22, 2024, 4:20 p.m. UTC | #1
Thu, Feb 22, 2024 at 11:50:13AM CET, edumazet@google.com wrote:
>__netlink_dump_start() releases nlk->cb_mutex right before
>calling netlink_dump() which grabs it again.

Yeah, I spotted this recently as well. Good to get rid of it.


>
>This seems dangerous, even if KASAN did not bother yet.
>
>Add a @lock_taken parameter to netlink_dump() to let it
>grab the mutex if called from netlink_recvmsg() only.
>
>Signed-off-by: Eric Dumazet <edumazet@google.com>


Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tetsuo Handa June 9, 2024, 8:17 a.m. UTC | #2
Hello.

While investigating hung task reports involving rtnl_mutex, I came to
suspect that commit b5590270068c ("netlink: hold nlk->cb_mutex longer
in __netlink_dump_start()") is buggy, for that commit made only
mutex_lock(nlk->cb_mutex) side conditionally. Why don't we need to make
mutex_unlock(nlk->cb_mutex) side conditionally?

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index fa9c090cf629..c23a8d4ddcae 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2352,7 +2352,8 @@ static int netlink_dump(struct sock *sk, bool lock_taken)
 
 	if (nlk->dump_done_errno > 0 ||
 	    skb_tailroom(skb) < nlmsg_total_size(sizeof(nlk->dump_done_errno))) {
-		mutex_unlock(&nlk->nl_cb_mutex);
+		if (!lock_taken)
+			mutex_unlock(&nlk->nl_cb_mutex);
 
 		if (sk_filter(sk, skb))
 			kfree_skb(skb);
@@ -2386,13 +2387,15 @@ static int netlink_dump(struct sock *sk, bool lock_taken)
 	WRITE_ONCE(nlk->cb_running, false);
 	module = cb->module;
 	skb = cb->skb;
-	mutex_unlock(&nlk->nl_cb_mutex);
+	if (!lock_taken)
+		mutex_unlock(&nlk->nl_cb_mutex);
 	module_put(module);
 	consume_skb(skb);
 	return 0;
 
 errout_skb:
-	mutex_unlock(&nlk->nl_cb_mutex);
+	if (!lock_taken)
+		mutex_unlock(&nlk->nl_cb_mutex);
 	kfree_skb(skb);
 	return err;
 }
Tetsuo Handa June 9, 2024, 8:29 a.m. UTC | #3
On 2024/06/09 17:17, Tetsuo Handa wrote:
> Hello.
> 
> While investigating hung task reports involving rtnl_mutex, I came to
> suspect that commit b5590270068c ("netlink: hold nlk->cb_mutex longer
> in __netlink_dump_start()") is buggy, for that commit made only
> mutex_lock(nlk->cb_mutex) side conditionally. Why don't we need to make
> mutex_unlock(nlk->cb_mutex) side conditionally?
> 

Sorry for the noise. That commit should be correct, for the caller
no longer calls mutex_unlock(nlk->cb_mutex).

I'll try a debug printk() patch for linux-next.
Eric Dumazet June 10, 2024, 12:59 p.m. UTC | #4
On Sun, Jun 9, 2024 at 10:29 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2024/06/09 17:17, Tetsuo Handa wrote:
> > Hello.
> >
> > While investigating hung task reports involving rtnl_mutex, I came to
> > suspect that commit b5590270068c ("netlink: hold nlk->cb_mutex longer
> > in __netlink_dump_start()") is buggy, for that commit made only
> > mutex_lock(nlk->cb_mutex) side conditionally. Why don't we need to make
> > mutex_unlock(nlk->cb_mutex) side conditionally?
> >
>
> Sorry for the noise. That commit should be correct, for the caller
> no longer calls mutex_unlock(nlk->cb_mutex).
>
> I'll try a debug printk() patch for linux-next.

I also have a lot of hung task reports as well, but in most reports
the console is flooded
before the crashes.


[  276.515597][    C1] yealink 4-1:36.0: urb_ctl_callback - urb status -71
[  276.522774][    C1] yealink 4-1:36.0: urb_irq_callback - urb status -71
[  276.529566][    C1] yealink 4-1:36.0: unexpected response 0
[  276.535875][    C1] yealink 4-1:36.0: urb_ctl_callback - urb status -71
[  276.543011][    C1] yealink 4-1:36.0: urb_irq_callback - urb status -71
[  276.549951][    C1] yealink 4-1:36.0: unexpected response 0
[  276.556111][    C1] yealink 4-1:36.0: urb_ctl_callback - urb status -71
[  276.563143][    C1] yealink 4-1:36.0: urb_irq_callback - urb status -71
[  276.570382][    C1] yealink 4-1:36.0: unexpected response 0
[  276.576399][    C1] yealink 4-1:36.0: urb_ctl_callback - urb status -71
[  276.584381][    C1] yealink 4-1:36.0: urb_irq_callback - urb status -71
[  276.591617][    C1] yealink 4-1:36.0: unexpected response 0
[  276.597904][    C1] yealink 4-1:36.0: urb_ctl_callback - urb status -71
[  276.605126][    C1] yealink 4-1:36.0: urb_irq_callback - urb status -71
[  276.612153][    C1] yealink 4-1:36.0: unexpected response 0
[  276.618588][    C1] yealink 4-1:36.0: urb_ctl_callback - urb status -71
[  276.626153][    C1] yealink 4-1:36.0: urb_irq_callback - urb status -71
[  276.631595][   T30] INFO: task dhcpcd:4749 blocked for more than 143 seconds.
[  276.633015][    C1] yealink 4-1:36.0: unexpected response 0
[  276.646813][    C1] yealink 4-1:36.0: urb_ctl_callback - urb status -71
[  276.654401][   T30]       Not tainted
6.10.0-rc2-syzkaller-00269-g96e09b8f8166 #0


2024/06/08 02:48:35 SYZFATAL: failed to recv *flatrpc.HostMessageRaw: EOF

[  276.654461][    C1] yealink 4-1:36.0: urb_irq_callback - urb status -71

I wonder how to deal with SYZFATAL, maybe the reports are truncated and we
do not see who owns rtnl mutex.
Tetsuo Handa June 10, 2024, 1:21 p.m. UTC | #5
On 2024/06/10 21:59, Eric Dumazet wrote:
> On Sun, Jun 9, 2024 at 10:29 AM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>
>> On 2024/06/09 17:17, Tetsuo Handa wrote:
>>> Hello.
>>>
>>> While investigating hung task reports involving rtnl_mutex, I came to
>>> suspect that commit b5590270068c ("netlink: hold nlk->cb_mutex longer
>>> in __netlink_dump_start()") is buggy, for that commit made only
>>> mutex_lock(nlk->cb_mutex) side conditionally. Why don't we need to make
>>> mutex_unlock(nlk->cb_mutex) side conditionally?
>>>
>>
>> Sorry for the noise. That commit should be correct, for the caller
>> no longer calls mutex_unlock(nlk->cb_mutex).
>>
>> I'll try a debug printk() patch for linux-next.
> 
> I also have a lot of hung task reports as well, but in most reports
> the console is flooded
> before the crashes.

Yeah, printk() flooding is the cause of some of hung task reports.

I queued https://sourceforge.net/p/tomoyo/tomoyo.git/ci/c2bfadd666b5852974071df0588d7eb0f499b7b5/
for linux-next.git . You can try this patch to see what the owner of rtnl_mutex is doing.
diff mbox series

Patch

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 9c962347cf859f16fc76e4d8a2fd22cdb3d142d6..94f3860526bfaa5793e8b3917250ec0e751687b5 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -130,7 +130,7 @@  static const char *const nlk_cb_mutex_key_strings[MAX_LINKS + 1] = {
 	"nlk_cb_mutex-MAX_LINKS"
 };
 
-static int netlink_dump(struct sock *sk);
+static int netlink_dump(struct sock *sk, bool lock_taken);
 
 /* nl_table locking explained:
  * Lookup and traversal are protected with an RCU read-side lock. Insertion
@@ -1987,7 +1987,7 @@  static int netlink_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 
 	if (READ_ONCE(nlk->cb_running) &&
 	    atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf / 2) {
-		ret = netlink_dump(sk);
+		ret = netlink_dump(sk, false);
 		if (ret) {
 			WRITE_ONCE(sk->sk_err, -ret);
 			sk_error_report(sk);
@@ -2196,7 +2196,7 @@  static int netlink_dump_done(struct netlink_sock *nlk, struct sk_buff *skb,
 	return 0;
 }
 
-static int netlink_dump(struct sock *sk)
+static int netlink_dump(struct sock *sk, bool lock_taken)
 {
 	struct netlink_sock *nlk = nlk_sk(sk);
 	struct netlink_ext_ack extack = {};
@@ -2208,7 +2208,8 @@  static int netlink_dump(struct sock *sk)
 	int alloc_min_size;
 	int alloc_size;
 
-	mutex_lock(nlk->cb_mutex);
+	if (!lock_taken)
+		mutex_lock(nlk->cb_mutex);
 	if (!nlk->cb_running) {
 		err = -EINVAL;
 		goto errout_skb;
@@ -2365,9 +2366,7 @@  int __netlink_dump_start(struct sock *ssk, struct sk_buff *skb,
 	WRITE_ONCE(nlk->cb_running, true);
 	nlk->dump_done_errno = INT_MAX;
 
-	mutex_unlock(nlk->cb_mutex);
-
-	ret = netlink_dump(sk);
+	ret = netlink_dump(sk, true);
 
 	sock_put(sk);