Message ID | 20240222105021.1943116-7-edumazet@google.com (mailing list archive) |
---|---|
State | Accepted |
Commit | b5590270068c4324dac4a2b5a4a156e02e21339f |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | rtnetlink: reduce RTNL pressure for dumps | expand |
Thu, Feb 22, 2024 at 11:50:13AM CET, edumazet@google.com wrote: >__netlink_dump_start() releases nlk->cb_mutex right before >calling netlink_dump() which grabs it again. Yeah, I spotted this recently as well. Good to get rid of it. > >This seems dangerous, even if KASAN did not bother yet. > >Add a @lock_taken parameter to netlink_dump() to let it >grab the mutex if called from netlink_recvmsg() only. > >Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Hello. While investigating hung task reports involving rtnl_mutex, I came to suspect that commit b5590270068c ("netlink: hold nlk->cb_mutex longer in __netlink_dump_start()") is buggy, for that commit made only mutex_lock(nlk->cb_mutex) side conditionally. Why don't we need to make mutex_unlock(nlk->cb_mutex) side conditionally? diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index fa9c090cf629..c23a8d4ddcae 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -2352,7 +2352,8 @@ static int netlink_dump(struct sock *sk, bool lock_taken) if (nlk->dump_done_errno > 0 || skb_tailroom(skb) < nlmsg_total_size(sizeof(nlk->dump_done_errno))) { - mutex_unlock(&nlk->nl_cb_mutex); + if (!lock_taken) + mutex_unlock(&nlk->nl_cb_mutex); if (sk_filter(sk, skb)) kfree_skb(skb); @@ -2386,13 +2387,15 @@ static int netlink_dump(struct sock *sk, bool lock_taken) WRITE_ONCE(nlk->cb_running, false); module = cb->module; skb = cb->skb; - mutex_unlock(&nlk->nl_cb_mutex); + if (!lock_taken) + mutex_unlock(&nlk->nl_cb_mutex); module_put(module); consume_skb(skb); return 0; errout_skb: - mutex_unlock(&nlk->nl_cb_mutex); + if (!lock_taken) + mutex_unlock(&nlk->nl_cb_mutex); kfree_skb(skb); return err; }
On 2024/06/09 17:17, Tetsuo Handa wrote: > Hello. > > While investigating hung task reports involving rtnl_mutex, I came to > suspect that commit b5590270068c ("netlink: hold nlk->cb_mutex longer > in __netlink_dump_start()") is buggy, for that commit made only > mutex_lock(nlk->cb_mutex) side conditionally. Why don't we need to make > mutex_unlock(nlk->cb_mutex) side conditionally? > Sorry for the noise. That commit should be correct, for the caller no longer calls mutex_unlock(nlk->cb_mutex). I'll try a debug printk() patch for linux-next.
On Sun, Jun 9, 2024 at 10:29 AM Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote: > > On 2024/06/09 17:17, Tetsuo Handa wrote: > > Hello. > > > > While investigating hung task reports involving rtnl_mutex, I came to > > suspect that commit b5590270068c ("netlink: hold nlk->cb_mutex longer > > in __netlink_dump_start()") is buggy, for that commit made only > > mutex_lock(nlk->cb_mutex) side conditionally. Why don't we need to make > > mutex_unlock(nlk->cb_mutex) side conditionally? > > > > Sorry for the noise. That commit should be correct, for the caller > no longer calls mutex_unlock(nlk->cb_mutex). > > I'll try a debug printk() patch for linux-next. I also have a lot of hung task reports as well, but in most reports the console is flooded before the crashes. [ 276.515597][ C1] yealink 4-1:36.0: urb_ctl_callback - urb status -71 [ 276.522774][ C1] yealink 4-1:36.0: urb_irq_callback - urb status -71 [ 276.529566][ C1] yealink 4-1:36.0: unexpected response 0 [ 276.535875][ C1] yealink 4-1:36.0: urb_ctl_callback - urb status -71 [ 276.543011][ C1] yealink 4-1:36.0: urb_irq_callback - urb status -71 [ 276.549951][ C1] yealink 4-1:36.0: unexpected response 0 [ 276.556111][ C1] yealink 4-1:36.0: urb_ctl_callback - urb status -71 [ 276.563143][ C1] yealink 4-1:36.0: urb_irq_callback - urb status -71 [ 276.570382][ C1] yealink 4-1:36.0: unexpected response 0 [ 276.576399][ C1] yealink 4-1:36.0: urb_ctl_callback - urb status -71 [ 276.584381][ C1] yealink 4-1:36.0: urb_irq_callback - urb status -71 [ 276.591617][ C1] yealink 4-1:36.0: unexpected response 0 [ 276.597904][ C1] yealink 4-1:36.0: urb_ctl_callback - urb status -71 [ 276.605126][ C1] yealink 4-1:36.0: urb_irq_callback - urb status -71 [ 276.612153][ C1] yealink 4-1:36.0: unexpected response 0 [ 276.618588][ C1] yealink 4-1:36.0: urb_ctl_callback - urb status -71 [ 276.626153][ C1] yealink 4-1:36.0: urb_irq_callback - urb status -71 [ 276.631595][ T30] INFO: task dhcpcd:4749 blocked for more than 143 seconds. [ 276.633015][ C1] yealink 4-1:36.0: unexpected response 0 [ 276.646813][ C1] yealink 4-1:36.0: urb_ctl_callback - urb status -71 [ 276.654401][ T30] Not tainted 6.10.0-rc2-syzkaller-00269-g96e09b8f8166 #0 2024/06/08 02:48:35 SYZFATAL: failed to recv *flatrpc.HostMessageRaw: EOF [ 276.654461][ C1] yealink 4-1:36.0: urb_irq_callback - urb status -71 I wonder how to deal with SYZFATAL, maybe the reports are truncated and we do not see who owns rtnl mutex.
On 2024/06/10 21:59, Eric Dumazet wrote: > On Sun, Jun 9, 2024 at 10:29 AM Tetsuo Handa > <penguin-kernel@i-love.sakura.ne.jp> wrote: >> >> On 2024/06/09 17:17, Tetsuo Handa wrote: >>> Hello. >>> >>> While investigating hung task reports involving rtnl_mutex, I came to >>> suspect that commit b5590270068c ("netlink: hold nlk->cb_mutex longer >>> in __netlink_dump_start()") is buggy, for that commit made only >>> mutex_lock(nlk->cb_mutex) side conditionally. Why don't we need to make >>> mutex_unlock(nlk->cb_mutex) side conditionally? >>> >> >> Sorry for the noise. That commit should be correct, for the caller >> no longer calls mutex_unlock(nlk->cb_mutex). >> >> I'll try a debug printk() patch for linux-next. > > I also have a lot of hung task reports as well, but in most reports > the console is flooded > before the crashes. Yeah, printk() flooding is the cause of some of hung task reports. I queued https://sourceforge.net/p/tomoyo/tomoyo.git/ci/c2bfadd666b5852974071df0588d7eb0f499b7b5/ for linux-next.git . You can try this patch to see what the owner of rtnl_mutex is doing.
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 9c962347cf859f16fc76e4d8a2fd22cdb3d142d6..94f3860526bfaa5793e8b3917250ec0e751687b5 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -130,7 +130,7 @@ static const char *const nlk_cb_mutex_key_strings[MAX_LINKS + 1] = { "nlk_cb_mutex-MAX_LINKS" }; -static int netlink_dump(struct sock *sk); +static int netlink_dump(struct sock *sk, bool lock_taken); /* nl_table locking explained: * Lookup and traversal are protected with an RCU read-side lock. Insertion @@ -1987,7 +1987,7 @@ static int netlink_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, if (READ_ONCE(nlk->cb_running) && atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf / 2) { - ret = netlink_dump(sk); + ret = netlink_dump(sk, false); if (ret) { WRITE_ONCE(sk->sk_err, -ret); sk_error_report(sk); @@ -2196,7 +2196,7 @@ static int netlink_dump_done(struct netlink_sock *nlk, struct sk_buff *skb, return 0; } -static int netlink_dump(struct sock *sk) +static int netlink_dump(struct sock *sk, bool lock_taken) { struct netlink_sock *nlk = nlk_sk(sk); struct netlink_ext_ack extack = {}; @@ -2208,7 +2208,8 @@ static int netlink_dump(struct sock *sk) int alloc_min_size; int alloc_size; - mutex_lock(nlk->cb_mutex); + if (!lock_taken) + mutex_lock(nlk->cb_mutex); if (!nlk->cb_running) { err = -EINVAL; goto errout_skb; @@ -2365,9 +2366,7 @@ int __netlink_dump_start(struct sock *ssk, struct sk_buff *skb, WRITE_ONCE(nlk->cb_running, true); nlk->dump_done_errno = INT_MAX; - mutex_unlock(nlk->cb_mutex); - - ret = netlink_dump(sk); + ret = netlink_dump(sk, true); sock_put(sk);
__netlink_dump_start() releases nlk->cb_mutex right before calling netlink_dump() which grabs it again. This seems dangerous, even if KASAN did not bother yet. Add a @lock_taken parameter to netlink_dump() to let it grab the mutex if called from netlink_recvmsg() only. Signed-off-by: Eric Dumazet <edumazet@google.com> --- net/netlink/af_netlink.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)