From patchwork Sat Feb 13 17:51:02 2021
X-Patchwork-Submitter: Taehee Yoo
X-Patchwork-Id: 12086977
X-Patchwork-Delegate: kuba@kernel.org
From: Taehee Yoo
To: davem@davemloft.net, kuba@kernel.org, xiyou.wangcong@gmail.com,
    netdev@vger.kernel.org, jwi@linux.ibm.com, kgraul@linux.ibm.com,
    hca@linux.ibm.com, gor@linux.ibm.com, borntraeger@de.ibm.com,
    mareklindner@neomailbox.ch, sw@simonwunderlich.de,
    a@unstable.cc, sven@narfation.org, yoshfuji@linux-ipv6.org,
    dsahern@kernel.org
Cc: Taehee Yoo
Subject: [PATCH net-next v2 1/7] mld: convert from timer to delayed work
Date: Sat, 13 Feb 2021 17:51:02 +0000
Message-Id: <20210213175102.28227-1-ap420073@gmail.com>

mcast.c has several timers for delaying work. A timer's expire handler
runs in atomic context, so it cannot use sleepable things such as
GFP_KERNEL allocations, mutexes, etc. In order to use sleepable APIs,
this patch converts the timers to delayed work. But some critical
sections are used by both process and BH context, so those still use
spin_lock_bh() and the rwlock.

Suggested-by: Cong Wang
Signed-off-by: Taehee Yoo
---
v1 -> v2:
 - Separated from the previous big patch.

 include/net/if_inet6.h |   8 +--
 net/ipv6/mcast.c       | 148 ++++++++++++++++++++++++-----------------
 2 files changed, 91 insertions(+), 65 deletions(-)

diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index 8bf5906073bc..af5244c9ca5c 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -120,7 +120,7 @@ struct ifmcaddr6 {
 	unsigned int		mca_sfmode;
 	unsigned char		mca_crcount;
 	unsigned long		mca_sfcount[2];
-	struct timer_list	mca_timer;
+	struct delayed_work	mca_work;
 	unsigned int		mca_flags;
 	int			mca_users;
 	refcount_t		mca_refcnt;
@@ -179,9 +179,9 @@ struct inet6_dev {
 	unsigned long		mc_qri;		/* Query Response Interval */
 	unsigned long		mc_maxdelay;
 
-	struct timer_list	mc_gq_timer;	/* general query timer */
-	struct timer_list	mc_ifc_timer;	/* interface change timer */
-	struct timer_list	mc_dad_timer;	/* dad complete mc timer */
+	struct delayed_work	mc_gq_work;	/* general query work */
+	struct delayed_work	mc_ifc_work;	/* interface change work */
+	struct delayed_work	mc_dad_work;	/* dad complete mc work */
 
 	struct ifacaddr6	*ac_list;
 	rwlock_t		lock;
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 6c8604390266..80597dc56f2a 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -29,7 +29,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -42,6 +41,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -67,14 +67,13 @@ static int __mld2_query_bugs[] __attribute__((__unused__)) = {
 	BUILD_BUG_ON_ZERO(offsetof(struct mld2_grec, grec_mca) % 4)
 };
 
+static struct workqueue_struct *mld_wq;
 static struct in6_addr mld2_all_mcr = MLD2_ALL_MCR_INIT;
 
 static void igmp6_join_group(struct ifmcaddr6 *ma);
 static void igmp6_leave_group(struct ifmcaddr6 *ma);
-static void igmp6_timer_handler(struct timer_list *t);
+static void mld_mca_work(struct work_struct *work);
 
-static void mld_gq_timer_expire(struct timer_list *t);
-static void mld_ifc_timer_expire(struct timer_list *t);
 static void mld_ifc_event(struct inet6_dev *idev);
 static void mld_add_delrec(struct inet6_dev *idev, struct ifmcaddr6 *pmc);
 static void mld_del_delrec(struct inet6_dev *idev, struct ifmcaddr6 *pmc);
@@ -713,7 +712,7 @@ static void igmp6_group_dropped(struct ifmcaddr6 *mc)
 	igmp6_leave_group(mc);
 
 	spin_lock_bh(&mc->mca_lock);
-	if (del_timer(&mc->mca_timer))
+	if (cancel_delayed_work(&mc->mca_work))
 		refcount_dec(&mc->mca_refcnt);
 	spin_unlock_bh(&mc->mca_lock);
 }
@@ -854,7 +853,7 @@ static struct ifmcaddr6 *mca_alloc(struct inet6_dev *idev,
 	if (!mc)
 		return NULL;
 
-	timer_setup(&mc->mca_timer, igmp6_timer_handler, 0);
+	INIT_DELAYED_WORK(&mc->mca_work, mld_mca_work);
 	mc->mca_addr = *addr;
 	mc->idev = idev; /* reference taken by caller */
@@ -1027,48 +1026,48 @@ bool ipv6_chk_mcast_addr(struct net_device *dev, const struct in6_addr *group,
 	return rv;
 }
 
-static void mld_gq_start_timer(struct inet6_dev *idev)
+static void mld_gq_start_work(struct inet6_dev *idev)
 {
 	unsigned long tv = prandom_u32() % idev->mc_maxdelay;
 
 	idev->mc_gq_running = 1;
-	if (!mod_timer(&idev->mc_gq_timer, jiffies+tv+2))
+	if (!mod_delayed_work(mld_wq, &idev->mc_gq_work, msecs_to_jiffies(tv + 2)))
 		in6_dev_hold(idev);
 }
 
-static void mld_gq_stop_timer(struct inet6_dev *idev)
+static void mld_gq_stop_work(struct inet6_dev *idev)
 {
 	idev->mc_gq_running = 0;
-	if (del_timer(&idev->mc_gq_timer))
+	if (cancel_delayed_work(&idev->mc_gq_work))
 		__in6_dev_put(idev);
 }
 
-static void mld_ifc_start_timer(struct inet6_dev *idev, unsigned long delay)
+static void mld_ifc_start_work(struct inet6_dev *idev, unsigned long delay)
 {
 	unsigned long tv = prandom_u32() % delay;
 
-	if (!mod_timer(&idev->mc_ifc_timer, jiffies+tv+2))
+	if (!mod_delayed_work(mld_wq, &idev->mc_ifc_work, msecs_to_jiffies(tv + 2)))
 		in6_dev_hold(idev);
 }
 
-static void mld_ifc_stop_timer(struct inet6_dev *idev)
+static void mld_ifc_stop_work(struct inet6_dev *idev)
 {
 	idev->mc_ifc_count = 0;
-	if (del_timer(&idev->mc_ifc_timer))
+	if (cancel_delayed_work(&idev->mc_ifc_work))
 		__in6_dev_put(idev);
 }
 
-static void mld_dad_start_timer(struct inet6_dev *idev, unsigned long delay)
+static void mld_dad_start_work(struct inet6_dev *idev, unsigned long delay)
 {
 	unsigned long tv = prandom_u32() % delay;
 
-	if (!mod_timer(&idev->mc_dad_timer, jiffies+tv+2))
+	if (!mod_delayed_work(mld_wq, &idev->mc_dad_work, msecs_to_jiffies(tv + 2)))
 		in6_dev_hold(idev);
 }
 
-static void mld_dad_stop_timer(struct inet6_dev *idev)
+static void mld_dad_stop_work(struct inet6_dev *idev)
 {
-	if (del_timer(&idev->mc_dad_timer))
+	if (cancel_delayed_work(&idev->mc_dad_work))
 		__in6_dev_put(idev);
 }
@@ -1080,21 +1079,20 @@ static void igmp6_group_queried(struct ifmcaddr6 *ma, unsigned long resptime)
 {
 	unsigned long delay = resptime;
 
-	/* Do not start timer for these addresses */
+	/* Do not start work for these addresses */
 	if (ipv6_addr_is_ll_all_nodes(&ma->mca_addr) ||
 	    IPV6_ADDR_MC_SCOPE(&ma->mca_addr) < IPV6_ADDR_SCOPE_LINKLOCAL)
 		return;
 
-	if (del_timer(&ma->mca_timer)) {
+	if (cancel_delayed_work(&ma->mca_work)) {
 		refcount_dec(&ma->mca_refcnt);
-		delay = ma->mca_timer.expires - jiffies;
+		delay = ma->mca_work.timer.expires - jiffies;
 	}
 
 	if (delay >= resptime)
 		delay = prandom_u32() % resptime;
 
-	ma->mca_timer.expires = jiffies + delay;
-	if (!mod_timer(&ma->mca_timer, jiffies + delay))
+	if (!mod_delayed_work(mld_wq, &ma->mca_work, msecs_to_jiffies(delay)))
 		refcount_inc(&ma->mca_refcnt);
 	ma->mca_flags |= MAF_TIMER_RUNNING;
 }
@@ -1305,10 +1303,10 @@ static int mld_process_v1(struct inet6_dev *idev, struct mld_msg *mld,
 	if (v1_query)
 		mld_set_v1_mode(idev);
 
-	/* cancel MLDv2 report timer */
-	mld_gq_stop_timer(idev);
-	/* cancel the interface change timer */
-	mld_ifc_stop_timer(idev);
+	/* cancel MLDv2 report work */
+	mld_gq_stop_work(idev);
+	/* cancel the interface change work */
+	mld_ifc_stop_work(idev);
 	/* clear deleted report items */
 	mld_clear_delrec(idev);
@@ -1398,7 +1396,7 @@ int igmp6_event_query(struct sk_buff *skb)
 			if (mlh2->mld2q_nsrcs)
 				return -EINVAL; /* no sources allowed */
 
-			mld_gq_start_timer(idev);
+			mld_gq_start_work(idev);
 			return 0;
 		}
 		/* mark sources to include, if group & source-specific */
@@ -1482,14 +1480,14 @@ int igmp6_event_report(struct sk_buff *skb)
 		return -ENODEV;
 
 	/*
-	 *	Cancel the timer for this group
+	 *	Cancel the work for this group
 	 */
 
 	read_lock_bh(&idev->lock);
 	for (ma = idev->mc_list; ma; ma = ma->next) {
 		if (ipv6_addr_equal(&ma->mca_addr, &mld->mld_mca)) {
 			spin_lock(&ma->mca_lock);
-			if (del_timer(&ma->mca_timer))
+			if (cancel_delayed_work(&ma->mca_work))
 				refcount_dec(&ma->mca_refcnt);
 			ma->mca_flags &= ~(MAF_LAST_REPORTER|MAF_TIMER_RUNNING);
 			spin_unlock(&ma->mca_lock);
@@ -2103,22 +2101,26 @@ void ipv6_mc_dad_complete(struct inet6_dev *idev)
 		mld_send_initial_cr(idev);
 		idev->mc_dad_count--;
 		if (idev->mc_dad_count)
-			mld_dad_start_timer(idev,
-					    unsolicited_report_interval(idev));
+			mld_dad_start_work(idev,
+					   unsolicited_report_interval(idev));
 	}
 }
 
-static void mld_dad_timer_expire(struct timer_list *t)
+static void mld_dad_work(struct work_struct *work)
 {
-	struct inet6_dev *idev = from_timer(idev, t, mc_dad_timer);
+	struct inet6_dev *idev = container_of(to_delayed_work(work),
+					      struct inet6_dev,
+					      mc_dad_work);
 
+	rtnl_lock();
 	mld_send_initial_cr(idev);
 	if (idev->mc_dad_count) {
 		idev->mc_dad_count--;
 		if (idev->mc_dad_count)
-			mld_dad_start_timer(idev,
-					    unsolicited_report_interval(idev));
+			mld_dad_start_work(idev,
+					   unsolicited_report_interval(idev));
 	}
+	rtnl_unlock();
 	in6_dev_put(idev);
 }
@@ -2416,12 +2418,12 @@ static void igmp6_join_group(struct ifmcaddr6 *ma)
 	delay = prandom_u32() % unsolicited_report_interval(ma->idev);
 
 	spin_lock_bh(&ma->mca_lock);
-	if (del_timer(&ma->mca_timer)) {
+	if (cancel_delayed_work(&ma->mca_work)) {
 		refcount_dec(&ma->mca_refcnt);
-		delay = ma->mca_timer.expires - jiffies;
+		delay = ma->mca_work.timer.expires - jiffies;
 	}
 
-	if (!mod_timer(&ma->mca_timer, jiffies + delay))
+	if (!mod_delayed_work(mld_wq, &ma->mca_work, msecs_to_jiffies(delay)))
 		refcount_inc(&ma->mca_refcnt);
 	ma->mca_flags |= MAF_TIMER_RUNNING | MAF_LAST_REPORTER;
 	spin_unlock_bh(&ma->mca_lock);
@@ -2458,26 +2460,34 @@ static void igmp6_leave_group(struct ifmcaddr6 *ma)
 	}
 }
 
-static void mld_gq_timer_expire(struct timer_list *t)
+static void mld_gq_work(struct work_struct *work)
 {
-	struct inet6_dev *idev = from_timer(idev, t, mc_gq_timer);
+	struct inet6_dev *idev = container_of(to_delayed_work(work),
+					      struct inet6_dev,
+					      mc_gq_work);
 
 	idev->mc_gq_running = 0;
+	rtnl_lock();
 	mld_send_report(idev, NULL);
+	rtnl_unlock();
 	in6_dev_put(idev);
 }
 
-static void mld_ifc_timer_expire(struct timer_list *t)
+static void mld_ifc_work(struct work_struct *work)
 {
-	struct inet6_dev *idev = from_timer(idev, t, mc_ifc_timer);
+	struct inet6_dev *idev = container_of(to_delayed_work(work),
+					      struct inet6_dev,
+					      mc_ifc_work);
 
+	rtnl_lock();
 	mld_send_cr(idev);
 	if (idev->mc_ifc_count) {
 		idev->mc_ifc_count--;
 		if (idev->mc_ifc_count)
-			mld_ifc_start_timer(idev,
-					    unsolicited_report_interval(idev));
+			mld_ifc_start_work(idev,
+					   unsolicited_report_interval(idev));
 	}
+	rtnl_unlock();
 	in6_dev_put(idev);
 }
@@ -2486,22 +2496,25 @@ static void mld_ifc_event(struct inet6_dev *idev)
 	if (mld_in_v1_mode(idev))
 		return;
 	idev->mc_ifc_count = idev->mc_qrv;
-	mld_ifc_start_timer(idev, 1);
+	mld_ifc_start_work(idev, 1);
 }
 
-static void igmp6_timer_handler(struct timer_list *t)
+static void mld_mca_work(struct work_struct *work)
 {
-	struct ifmcaddr6 *ma = from_timer(ma, t, mca_timer);
+	struct ifmcaddr6 *ma = container_of(to_delayed_work(work),
+					    struct ifmcaddr6, mca_work);
 
+	rtnl_lock();
 	if (mld_in_v1_mode(ma->idev))
 		igmp6_send(&ma->mca_addr, ma->idev->dev, ICMPV6_MGM_REPORT);
 	else
 		mld_send_report(ma->idev, ma);
+	rtnl_unlock();
 
-	spin_lock(&ma->mca_lock);
+	spin_lock_bh(&ma->mca_lock);
 	ma->mca_flags |= MAF_LAST_REPORTER;
 	ma->mca_flags &= ~MAF_TIMER_RUNNING;
-	spin_unlock(&ma->mca_lock);
+	spin_unlock_bh(&ma->mca_lock);
 	ma_put(ma);
 }
@@ -2537,12 +2550,12 @@ void ipv6_mc_down(struct inet6_dev *idev)
 	for (i = idev->mc_list; i; i = i->next)
 		igmp6_group_dropped(i);
 
-	/* Should stop timer after group drop. or we will
-	 * start timer again in mld_ifc_event()
+	/* Should stop work after group drop. or we will
+	 * start work again in mld_ifc_event()
 	 */
-	mld_ifc_stop_timer(idev);
-	mld_gq_stop_timer(idev);
-	mld_dad_stop_timer(idev);
+	mld_ifc_stop_work(idev);
+	mld_gq_stop_work(idev);
+	mld_dad_stop_work(idev);
 	read_unlock_bh(&idev->lock);
 }
@@ -2579,11 +2592,11 @@ void ipv6_mc_init_dev(struct inet6_dev *idev)
 	write_lock_bh(&idev->lock);
 	spin_lock_init(&idev->mc_lock);
 	idev->mc_gq_running = 0;
-	timer_setup(&idev->mc_gq_timer, mld_gq_timer_expire, 0);
+	INIT_DELAYED_WORK(&idev->mc_gq_work, mld_gq_work);
 	idev->mc_tomb = NULL;
 	idev->mc_ifc_count = 0;
-	timer_setup(&idev->mc_ifc_timer, mld_ifc_timer_expire, 0);
-	timer_setup(&idev->mc_dad_timer, mld_dad_timer_expire, 0);
+	INIT_DELAYED_WORK(&idev->mc_ifc_work, mld_ifc_work);
+	INIT_DELAYED_WORK(&idev->mc_dad_work, mld_dad_work);
 	ipv6_mc_reset(idev);
 	write_unlock_bh(&idev->lock);
 }
@@ -2596,7 +2609,7 @@ void ipv6_mc_destroy_dev(struct inet6_dev *idev)
 {
 	struct ifmcaddr6 *i;
 
-	/* Deactivate timers */
+	/* Deactivate works */
 	ipv6_mc_down(idev);
 	mld_clear_delrec(idev);
@@ -2763,7 +2776,7 @@ static int igmp6_mc_seq_show(struct seq_file *seq, void *v)
 		   &im->mca_addr, im->mca_users, im->mca_flags,
 		   (im->mca_flags&MAF_TIMER_RUNNING) ?
-		   jiffies_to_clock_t(im->mca_timer.expires-jiffies) : 0);
+		   jiffies_to_clock_t(im->mca_work.timer.expires - jiffies) : 0);
 	return 0;
 }
@@ -3002,7 +3015,19 @@ static struct pernet_operations igmp6_net_ops = {
 
 int __init igmp6_init(void)
 {
-	return register_pernet_subsys(&igmp6_net_ops);
+	int err;
+
+	err = register_pernet_subsys(&igmp6_net_ops);
+	if (err)
+		return err;
+
+	mld_wq = create_workqueue("mld");
+	if (!mld_wq) {
+		unregister_pernet_subsys(&igmp6_net_ops);
+		return -ENOMEM;
+	}
+
+	return err;
 }
 
 int __init igmp6_late_init(void)
@@ -3013,6 +3038,7 @@ int __init igmp6_late_init(void)
 void igmp6_cleanup(void)
 {
 	unregister_pernet_subsys(&igmp6_net_ops);
+	destroy_workqueue(mld_wq);
 }
 
 void igmp6_late_cleanup(void)
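
The conversion above follows the common timer-to-delayed-work shape. A
minimal, self-contained sketch of that shape, assuming a hypothetical
private struct and workqueue (my_dev, my_work_fn and my_wq are
illustrative names, not identifiers from this patch):

	#include <linux/workqueue.h>
	#include <linux/rtnetlink.h>

	struct my_dev {
		struct delayed_work work;	/* was: struct timer_list timer */
	};

	static void my_work_fn(struct work_struct *work)
	{
		/* to_delayed_work() + container_of() recover the owner,
		 * replacing from_timer().  Unlike a timer handler, this
		 * runs in process context, so rtnl_lock(), mutexes and
		 * GFP_KERNEL allocations are allowed here.
		 */
		struct my_dev *dev = container_of(to_delayed_work(work),
						  struct my_dev, work);
		rtnl_lock();
		/* ... do the deferred job on dev ... */
		rtnl_unlock();
	}

	static void my_dev_start(struct my_dev *dev,
				 struct workqueue_struct *my_wq)
	{
		INIT_DELAYED_WORK(&dev->work, my_work_fn);	/* was: timer_setup() */
		/* was: mod_timer(&dev->timer, jiffies + delay) */
		mod_delayed_work(my_wq, &dev->work, msecs_to_jiffies(1000));
	}

cancel_delayed_work() stands in for del_timer(): both return true when a
pending instance was removed, which is why the refcount balancing in the
patch keeps the same shape.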
From patchwork Sat Feb 13 17:51:27 2021
X-Patchwork-Submitter: Taehee Yoo
X-Patchwork-Id: 12086979
X-Patchwork-Delegate: kuba@kernel.org
From: Taehee Yoo
To: davem@davemloft.net, kuba@kernel.org, xiyou.wangcong@gmail.com,
    netdev@vger.kernel.org, jwi@linux.ibm.com, kgraul@linux.ibm.com,
    hca@linux.ibm.com, gor@linux.ibm.com, borntraeger@de.ibm.com,
    mareklindner@neomailbox.ch, sw@simonwunderlich.de, a@unstable.cc,
    sven@narfation.org, yoshfuji@linux-ipv6.org, dsahern@kernel.org
Cc: Taehee Yoo
Subject: [PATCH net-next v2 2/7] mld: separate two flags from ifmcaddr6->mca_flags
Date: Sat, 13 Feb 2021 17:51:27 +0000
Message-Id: <20210213175127.28300-1-ap420073@gmail.com>

ifmcaddr6->mca_flags contains the following flags:
    MAF_TIMER_RUNNING
    MAF_LAST_REPORTER
    MAF_LOADED
    MAF_NOREPORT
    MAF_GSQUERY

The mca_flags value will be protected by a spinlock starting with the
next patches. But MAF_LOADED and MAF_NOREPORT do not need the spinlock,
because they will be protected by RTNL. So, separating them out of
mca_flags reduces the amount of atomic context.

Suggested-by: Cong Wang
Signed-off-by: Taehee Yoo
---
v1 -> v2:
 - Separated from the previous big patch.
 include/net/if_inet6.h |  8 ++++----
 net/ipv6/mcast.c       | 26 +++++++++++---------------
 2 files changed, 15 insertions(+), 19 deletions(-)

diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index af5244c9ca5c..bec372283ac0 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -107,9 +107,7 @@ struct ip6_sf_list {
 
 #define MAF_TIMER_RUNNING	0x01
 #define MAF_LAST_REPORTER	0x02
-#define MAF_LOADED		0x04
-#define MAF_NOREPORT		0x08
-#define MAF_GSQUERY		0x10
+#define MAF_GSQUERY		0x04
 
 struct ifmcaddr6 {
 	struct in6_addr		mca_addr;
@@ -121,7 +119,9 @@ struct ifmcaddr6 {
 	unsigned char		mca_crcount;
 	unsigned long		mca_sfcount[2];
 	struct delayed_work	mca_work;
-	unsigned int		mca_flags;
+	unsigned char		mca_flags;
+	bool			mca_noreport;
+	bool			mca_loaded;
 	int			mca_users;
 	refcount_t		mca_refcnt;
 	spinlock_t		mca_lock;
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 80597dc56f2a..1f7fd3fbb4b6 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -661,15 +661,13 @@ static void igmp6_group_added(struct ifmcaddr6 *mc)
 	    IPV6_ADDR_SCOPE_LINKLOCAL)
 		return;
 
-	spin_lock_bh(&mc->mca_lock);
-	if (!(mc->mca_flags&MAF_LOADED)) {
-		mc->mca_flags |= MAF_LOADED;
+	if (!mc->mca_loaded) {
+		mc->mca_loaded = true;
 		if (ndisc_mc_map(&mc->mca_addr, buf, dev, 0) == 0)
 			dev_mc_add(dev, buf);
 	}
-	spin_unlock_bh(&mc->mca_lock);
 
-	if (!(dev->flags & IFF_UP) || (mc->mca_flags & MAF_NOREPORT))
+	if (!(dev->flags & IFF_UP) || mc->mca_noreport)
 		return;
 
 	if (mld_in_v1_mode(mc->idev)) {
@@ -697,15 +695,13 @@ static void igmp6_group_dropped(struct ifmcaddr6 *mc)
 	    IPV6_ADDR_SCOPE_LINKLOCAL)
 		return;
 
-	spin_lock_bh(&mc->mca_lock);
-	if (mc->mca_flags&MAF_LOADED) {
-		mc->mca_flags &= ~MAF_LOADED;
+	if (mc->mca_loaded) {
+		mc->mca_loaded = false;
 		if (ndisc_mc_map(&mc->mca_addr, buf, dev, 0) == 0)
 			dev_mc_del(dev, buf);
 	}
-	spin_unlock_bh(&mc->mca_lock);
 
-	if (mc->mca_flags & MAF_NOREPORT)
+	if (mc->mca_noreport)
 		return;
 
 	if (!mc->idev->dead)
@@ -868,7 +864,7 @@ static struct ifmcaddr6 *mca_alloc(struct inet6_dev *idev,
 
 	if (ipv6_addr_is_ll_all_nodes(&mc->mca_addr) ||
 	    IPV6_ADDR_MC_SCOPE(&mc->mca_addr) < IPV6_ADDR_SCOPE_LINKLOCAL)
-		mc->mca_flags |= MAF_NOREPORT;
+		mc->mca_noreport = true;
 
 	return mc;
 }
@@ -1733,7 +1729,7 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
 	int scount, stotal, first, isquery, truncate;
 	unsigned int mtu;
 
-	if (pmc->mca_flags & MAF_NOREPORT)
+	if (pmc->mca_noreport)
 		return skb;
 
 	mtu = READ_ONCE(dev->mtu);
@@ -1855,7 +1851,7 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
 	read_lock_bh(&idev->lock);
 	if (!pmc) {
 		for (pmc = idev->mc_list; pmc; pmc = pmc->next) {
-			if (pmc->mca_flags & MAF_NOREPORT)
+			if (pmc->mca_noreport)
 				continue;
 			spin_lock_bh(&pmc->mca_lock);
 			if (pmc->mca_sfcount[MCAST_EXCLUDE])
@@ -2149,7 +2145,7 @@ static int ip6_mc_del1_src(struct ifmcaddr6 *pmc, int sfmode,
 				psf_prev->sf_next = psf->sf_next;
 			else
 				pmc->mca_sources = psf->sf_next;
-			if (psf->sf_oldin && !(pmc->mca_flags & MAF_NOREPORT) &&
+			if (psf->sf_oldin && !pmc->mca_noreport &&
 			    !mld_in_v1_mode(idev)) {
 				psf->sf_crcount = idev->mc_qrv;
 				psf->sf_next = pmc->mca_tomb;
@@ -2410,7 +2406,7 @@ static void igmp6_join_group(struct ifmcaddr6 *ma)
 {
 	unsigned long delay;
 
-	if (ma->mca_flags & MAF_NOREPORT)
+	if (ma->mca_noreport)
 		return;
 
 	igmp6_send(&ma->mca_addr, ma->idev->dev, ICMPV6_MGM_REPORT);
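
A sketch of the idea behind this change (types and names are
illustrative, not the patch itself): state that is only ever read and
written under RTNL can live in a plain bool, so the
spin_lock_bh()/spin_unlock_bh() pair around the bit test disappears:

	struct example_addr {
		unsigned char	flags;	/* remaining MAF_* bits, still under a spinlock */
		bool		loaded;	/* RTNL-only: no spinlock needed */
	};

	/* called under RTNL */
	static void example_group_added(struct example_addr *a)
	{
		ASSERT_RTNL();		/* hypothetical check documenting the rule */
		if (!a->loaded) {	/* plain load/store, no spin_lock_bh() */
			a->loaded = true;
			/* ... program the device multicast filter ... */
		}
	}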
From patchwork Sat Feb 13 17:51:48 2021
X-Patchwork-Submitter: Taehee Yoo
X-Patchwork-Id: 12086981
X-Patchwork-Delegate: kuba@kernel.org
From: Taehee Yoo
To: davem@davemloft.net, kuba@kernel.org, xiyou.wangcong@gmail.com,
    netdev@vger.kernel.org, jwi@linux.ibm.com, kgraul@linux.ibm.com,
    hca@linux.ibm.com, gor@linux.ibm.com, borntraeger@de.ibm.com,
    mareklindner@neomailbox.ch, sw@simonwunderlich.de, a@unstable.cc,
    sven@narfation.org, yoshfuji@linux-ipv6.org, dsahern@kernel.org
Cc: Taehee Yoo
Subject: [PATCH net-next v2 3/7] mld: add a new delayed_work, mc_delrec_work
Date: Sat, 13 Feb 2021 17:51:48 +0000
Message-Id: <20210213175148.28375-1-ap420073@gmail.com>

The goal of the mc_delrec_work delayed work is to call
mld_clear_delrec(). mld_clear_delrec() is called from both the data
path and the control path, so its context can be atomic. But this
function accesses struct ifmcaddr6 and struct ip6_sf_list, which are
going to be protected by RTNL. So, this function should be called in a
sleepable context.

Suggested-by: Cong Wang
Signed-off-by: Taehee Yoo
---
v1 -> v2:
 - Separated from the previous big patch.

 include/net/if_inet6.h |  1 +
 net/ipv6/mcast.c       | 32 +++++++++++++++++++++++++++++++-
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index bec372283ac0..5946b5d76f7b 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -182,6 +182,7 @@ struct inet6_dev {
 	struct delayed_work	mc_gq_work;	/* general query work */
 	struct delayed_work	mc_ifc_work;	/* interface change work */
 	struct delayed_work	mc_dad_work;	/* dad complete mc work */
+	struct delayed_work	mc_delrec_work;	/* delete records work */
 
 	struct ifacaddr6	*ac_list;
 	rwlock_t		lock;
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 1f7fd3fbb4b6..ca8ca6faca4e 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1067,6 +1067,22 @@ static void mld_dad_stop_work(struct inet6_dev *idev)
 		__in6_dev_put(idev);
 }
 
+static void mld_clear_delrec_start_work(struct inet6_dev *idev)
+{
+	write_lock_bh(&idev->lock);
+	if (!mod_delayed_work(mld_wq, &idev->mc_delrec_work, 0))
+		in6_dev_hold(idev);
+	write_unlock_bh(&idev->lock);
+}
+
+static void mld_clear_delrec_stop_work(struct inet6_dev *idev)
+{
+	write_lock_bh(&idev->lock);
+	if (cancel_delayed_work(&idev->mc_delrec_work))
+		__in6_dev_put(idev);
+	write_unlock_bh(&idev->lock);
+}
+
 /*
  *	IGMP handling (alias multicast ICMPv6 messages)
  */
@@ -1304,7 +1320,7 @@ static int mld_process_v1(struct inet6_dev *idev, struct mld_msg *mld,
 	/* cancel the interface change work */
 	mld_ifc_stop_work(idev);
 	/* clear deleted report items */
-	mld_clear_delrec(idev);
+	mld_clear_delrec_start_work(idev);
 
 	return 0;
 }
@@ -2120,6 +2136,18 @@ static void mld_dad_work(struct work_struct *work)
 	in6_dev_put(idev);
 }
 
+static void mld_clear_delrec_work(struct work_struct *work)
+{
+	struct inet6_dev *idev = container_of(to_delayed_work(work),
+					      struct inet6_dev,
+					      mc_delrec_work);
+
+	rtnl_lock();
+	mld_clear_delrec(idev);
+	rtnl_unlock();
+	in6_dev_put(idev);
+}
+
 static int ip6_mc_del1_src(struct ifmcaddr6 *pmc, int sfmode,
 			   const struct in6_addr *psfsrc)
 {
@@ -2553,6 +2581,7 @@ void ipv6_mc_down(struct inet6_dev *idev)
 	mld_gq_stop_work(idev);
 	mld_dad_stop_work(idev);
 	read_unlock_bh(&idev->lock);
+	mld_clear_delrec_stop_work(idev);
 }
 
 static void ipv6_mc_reset(struct inet6_dev *idev)
@@ -2593,6 +2622,7 @@ void ipv6_mc_init_dev(struct inet6_dev *idev)
 	idev->mc_ifc_count = 0;
 	INIT_DELAYED_WORK(&idev->mc_ifc_work, mld_ifc_work);
 	INIT_DELAYED_WORK(&idev->mc_dad_work, mld_dad_work);
+	INIT_DELAYED_WORK(&idev->mc_delrec_work, mld_clear_delrec_work);
 	ipv6_mc_reset(idev);
 	write_unlock_bh(&idev->lock);
 }
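
The deferral pattern this patch adds, condensed into one place (this is
a restatement of the helpers in the diff above, with error handling and
locking around the scheduling elided): the atomic-context caller only
schedules the work and takes a reference per queued instance, while the
work item itself takes RTNL before touching the structures:

	static void defer_clear(struct inet6_dev *idev)	/* may run in BH */
	{
		if (!mod_delayed_work(mld_wq, &idev->mc_delrec_work, 0))
			in6_dev_hold(idev);	/* one reference per queued work */
	}

	static void clear_work(struct work_struct *work)	/* process context */
	{
		struct inet6_dev *idev = container_of(to_delayed_work(work),
						      struct inet6_dev,
						      mc_delrec_work);

		rtnl_lock();			/* sleeping locks are fine here */
		mld_clear_delrec(idev);
		rtnl_unlock();
		in6_dev_put(idev);		/* drop the reference taken at queue time */
	}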
From patchwork Sat Feb 13 17:52:24 2021
X-Patchwork-Submitter: Taehee Yoo
X-Patchwork-Id: 12086983
X-Patchwork-Delegate: kuba@kernel.org
From: Taehee Yoo
To: davem@davemloft.net, kuba@kernel.org, xiyou.wangcong@gmail.com,
    netdev@vger.kernel.org, jwi@linux.ibm.com, kgraul@linux.ibm.com,
    hca@linux.ibm.com, gor@linux.ibm.com, borntraeger@de.ibm.com,
    mareklindner@neomailbox.ch, sw@simonwunderlich.de, a@unstable.cc,
    sven@narfation.org, yoshfuji@linux-ipv6.org, dsahern@kernel.org
Cc: Taehee Yoo
Subject: [PATCH net-next v2 4/7] mld: get rid of inet6_dev->mc_lock
Date: Sat, 13 Feb 2021 17:52:24 +0000
Message-Id: <20210213175224.28511-1-ap420073@gmail.com>
The purpose of mc_lock is to protect inet6_dev->mc_tomb. But mc_tomb is
already protected by RTNL, and all functions that manipulate mc_tomb
are called under RTNL. So, mc_lock is not needed. Furthermore, it is a
spinlock, so its critical section is atomic. In order to reduce atomic
context, it should be removed.

Suggested-by: Cong Wang
Signed-off-by: Taehee Yoo
---
v1 -> v2:
 - Separated from the previous big patch.

 include/net/if_inet6.h | 1 -
 net/ipv6/mcast.c       | 9 ---------
 2 files changed, 10 deletions(-)

diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index 5946b5d76f7b..4d9855be644c 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -167,7 +167,6 @@ struct inet6_dev {
 
 	struct ifmcaddr6	*mc_list;
 	struct ifmcaddr6	*mc_tomb;
-	spinlock_t		mc_lock;
 
 	unsigned char		mc_qrv;		/* Query Robustness Variable */
 	unsigned char		mc_gq_running;
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index ca8ca6faca4e..e80b78b1a8a7 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -748,10 +748,8 @@ static void mld_add_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im)
 	}
 	spin_unlock_bh(&im->mca_lock);
 
-	spin_lock_bh(&idev->mc_lock);
 	pmc->next = idev->mc_tomb;
 	idev->mc_tomb = pmc;
-	spin_unlock_bh(&idev->mc_lock);
 }
 
 static void mld_del_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im)
@@ -760,7 +758,6 @@ static void mld_del_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im)
 	struct ip6_sf_list *psf;
 	struct in6_addr *pmca = &im->mca_addr;
 
-	spin_lock_bh(&idev->mc_lock);
 	pmc_prev = NULL;
 	for (pmc = idev->mc_tomb; pmc; pmc = pmc->next) {
 		if (ipv6_addr_equal(&pmc->mca_addr, pmca))
@@ -773,7 +770,6 @@ static void mld_del_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im)
 		else
 			idev->mc_tomb = pmc->next;
 	}
-	spin_unlock_bh(&idev->mc_lock);
 
 	spin_lock_bh(&im->mca_lock);
 	if (pmc) {
@@ -797,10 +793,8 @@ static void mld_clear_delrec(struct inet6_dev *idev)
 {
 	struct ifmcaddr6 *pmc, *nextpmc;
 
-	spin_lock_bh(&idev->mc_lock);
 	pmc = idev->mc_tomb;
 	idev->mc_tomb = NULL;
-	spin_unlock_bh(&idev->mc_lock);
 
 	for (; pmc; pmc = nextpmc) {
 		nextpmc = pmc->next;
@@ -1919,7 +1913,6 @@ static void mld_send_cr(struct inet6_dev *idev)
 	int type, dtype;
 
 	read_lock_bh(&idev->lock);
-	spin_lock(&idev->mc_lock);
 
 	/* deleted MCA's */
 	pmc_prev = NULL;
@@ -1953,7 +1946,6 @@ static void mld_send_cr(struct inet6_dev *idev)
 		} else
 			pmc_prev = pmc;
 	}
-	spin_unlock(&idev->mc_lock);
 
 	/* change recs */
 	for (pmc = idev->mc_list; pmc; pmc = pmc->next) {
@@ -2615,7 +2607,6 @@ void ipv6_mc_up(struct inet6_dev *idev)
 void ipv6_mc_init_dev(struct inet6_dev *idev)
 {
 	write_lock_bh(&idev->lock);
-	spin_lock_init(&idev->mc_lock);
 	idev->mc_gq_running = 0;
 	INIT_DELAYED_WORK(&idev->mc_gq_work, mld_gq_work);
 	idev->mc_tomb = NULL;
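
The underlying rule: a lock can be dropped when every reader and writer
of the data already serializes on a stronger lock. A sketch of how the
surviving mc_tomb manipulation looks, with a hypothetical ASSERT_RTNL()
(not part of this patch) documenting the invariant:

	static void tomb_add(struct inet6_dev *idev, struct ifmcaddr6 *pmc)
	{
		ASSERT_RTNL();			/* all mc_tomb users hold RTNL */
		pmc->next = idev->mc_tomb;	/* plain pointer ops, no mc_lock */
		idev->mc_tomb = pmc;
	}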
From patchwork Sat Feb 13 17:52:39 2021
X-Patchwork-Submitter: Taehee Yoo
X-Patchwork-Id: 12086985
X-Patchwork-Delegate: kuba@kernel.org
From: Taehee Yoo
To: davem@davemloft.net, kuba@kernel.org, xiyou.wangcong@gmail.com,
    netdev@vger.kernel.org, jwi@linux.ibm.com, kgraul@linux.ibm.com,
    hca@linux.ibm.com, gor@linux.ibm.com, borntraeger@de.ibm.com,
    mareklindner@neomailbox.ch, sw@simonwunderlich.de, a@unstable.cc,
    sven@narfation.org, yoshfuji@linux-ipv6.org, dsahern@kernel.org
Cc: Taehee Yoo
Subject: [PATCH net-next v2 5/7] mld: convert ipv6_mc_socklist->sflist to RCU
Date: Sat, 13 Feb 2021 17:52:39 +0000
Message-Id: <20210213175239.28571-1-ap420073@gmail.com>

The sflist has been protected by an rwlock, so its critical section is
atomic context. In order to switch this context, the locking must
change. The sflist is actually already protected by RTNL, so if it is
converted to use RCU, its control-path context can be switched to
sleepable.
Suggested-by: Cong Wang
Signed-off-by: Taehee Yoo
Reported-by: kernel test robot
---
v1 -> v2:
 - Separated from the previous big patch.

 include/net/if_inet6.h |  4 ++--
 net/ipv6/mcast.c       | 52 ++++++++++++++++++------------------------
 2 files changed, 24 insertions(+), 32 deletions(-)

diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index 4d9855be644c..d8507bef0a0c 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -78,6 +78,7 @@ struct inet6_ifaddr {
 struct ip6_sf_socklist {
 	unsigned int		sl_max;
 	unsigned int		sl_count;
+	struct rcu_head		rcu;
 	struct in6_addr		sl_addr[];
 };
 
@@ -91,8 +92,7 @@ struct ipv6_mc_socklist {
 	int			ifindex;
 	unsigned int		sfmode;		/* MCAST_{INCLUDE,EXCLUDE} */
 	struct ipv6_mc_socklist __rcu *next;
-	rwlock_t		sflock;
-	struct ip6_sf_socklist	*sflist;
+	struct ip6_sf_socklist	__rcu *sflist;
 	struct rcu_head		rcu;
 };
 
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index e80b78b1a8a7..cffa2eeb88c5 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -178,8 +178,7 @@ static int __ipv6_sock_mc_join(struct sock *sk, int ifindex,
 	mc_lst->ifindex = dev->ifindex;
 	mc_lst->sfmode = mode;
-	rwlock_init(&mc_lst->sflock);
-	mc_lst->sflist = NULL;
+	RCU_INIT_POINTER(mc_lst->sflist, NULL);
 
 	/*
 	 *	now add/increase the group membership on the device
@@ -335,7 +334,6 @@ int ip6_mc_source(int add, int omode, struct sock *sk,
 	struct net *net = sock_net(sk);
 	int i, j, rv;
 	int leavegroup = 0;
-	int pmclocked = 0;
 	int err;
 
 	source = &((struct sockaddr_in6 *)&pgsr->gsr_source)->sin6_addr;
@@ -364,7 +362,7 @@ int ip6_mc_source(int add, int omode, struct sock *sk,
 		goto done;
 	}
 	/* if a source filter was set, must be the same mode as before */
-	if (pmc->sflist) {
+	if (rcu_access_pointer(pmc->sflist)) {
 		if (pmc->sfmode != omode) {
 			err = -EINVAL;
 			goto done;
@@ -376,10 +374,7 @@ int ip6_mc_source(int add, int omode, struct sock *sk,
 		pmc->sfmode = omode;
 	}
 
-	write_lock(&pmc->sflock);
-	pmclocked = 1;
-
-	psl = pmc->sflist;
+	psl = rtnl_dereference(pmc->sflist);
 	if (!add) {
 		if (!psl)
 			goto done;	/* err = -EADDRNOTAVAIL */
@@ -429,9 +424,11 @@ int ip6_mc_source(int add, int omode, struct sock *sk,
 		if (psl) {
 			for (i = 0; i < psl->sl_count; i++)
 				newpsl->sl_addr[i] = psl->sl_addr[i];
-			sock_kfree_s(sk, psl, IP6_SFLSIZE(psl->sl_max));
+			atomic_sub(IP6_SFLSIZE(psl->sl_max), &sk->sk_omem_alloc);
+			kfree_rcu(psl, rcu);
 		}
-		pmc->sflist = psl = newpsl;
+		rcu_assign_pointer(psl, newpsl);
+		rcu_assign_pointer(pmc->sflist, psl);
 	}
 	rv = 1;	/* > 0 for insert logic below if sl_count is 0 */
 	for (i = 0; i < psl->sl_count; i++) {
@@ -447,8 +444,6 @@ int ip6_mc_source(int add, int omode, struct sock *sk,
 	/* update the interface list */
 	ip6_mc_add_src(idev, group, omode, 1, source, 1);
 done:
-	if (pmclocked)
-		write_unlock(&pmc->sflock);
 	read_unlock_bh(&idev->lock);
 	rcu_read_unlock();
 	if (leavegroup)
@@ -526,17 +521,16 @@ int ip6_mc_msfilter(struct sock *sk, struct group_filter *gsf,
 			(void) ip6_mc_add_src(idev, group, gsf->gf_fmode, 0, NULL, 0);
 	}
 
-	write_lock(&pmc->sflock);
-	psl = pmc->sflist;
+	psl = rtnl_dereference(pmc->sflist);
 	if (psl) {
 		(void) ip6_mc_del_src(idev, group, pmc->sfmode,
 				      psl->sl_count, psl->sl_addr, 0);
-		sock_kfree_s(sk, psl, IP6_SFLSIZE(psl->sl_max));
+		atomic_sub(IP6_SFLSIZE(psl->sl_max), &sk->sk_omem_alloc);
+		kfree_rcu(psl, rcu);
 	} else
 		(void) ip6_mc_del_src(idev, group, pmc->sfmode, 0, NULL, 0);
-	pmc->sflist = newpsl;
+	rcu_assign_pointer(pmc->sflist, newpsl);
 	pmc->sfmode = gsf->gf_fmode;
-	write_unlock(&pmc->sflock);
 	err = 0;
 done:
 	read_unlock_bh(&idev->lock);
@@ -585,16 +579,14 @@ int ip6_mc_msfget(struct sock *sk, struct group_filter *gsf,
 	if (!pmc)		/* must have a prior join */
 		goto done;
 	gsf->gf_fmode = pmc->sfmode;
-	psl = pmc->sflist;
+	psl = rtnl_dereference(pmc->sflist);
 	count = psl ? psl->sl_count : 0;
 	read_unlock_bh(&idev->lock);
 	rcu_read_unlock();
 
 	copycount = count < gsf->gf_numsrc ? count : gsf->gf_numsrc;
 	gsf->gf_numsrc = count;
-	/* changes to psl require the socket lock, and a write lock
-	 * on pmc->sflock. We have the socket lock so reading here is safe.
-	 */
+
 	for (i = 0; i < copycount; i++, p++) {
 		struct sockaddr_in6 *psin6;
 		struct sockaddr_storage ss;
@@ -630,8 +622,7 @@ bool inet6_mc_check(struct sock *sk, const struct in6_addr *mc_addr,
 		rcu_read_unlock();
 		return np->mc_all;
 	}
-	read_lock(&mc->sflock);
-	psl = mc->sflist;
+	psl = rcu_dereference(mc->sflist);
 	if (!psl) {
 		rv = mc->sfmode == MCAST_EXCLUDE;
 	} else {
@@ -646,7 +637,6 @@ bool inet6_mc_check(struct sock *sk, const struct in6_addr *mc_addr,
 		if (mc->sfmode == MCAST_EXCLUDE && i < psl->sl_count)
 			rv = false;
 	}
-	read_unlock(&mc->sflock);
 	rcu_read_unlock();
 
 	return rv;
@@ -2448,19 +2438,21 @@ static void igmp6_join_group(struct ifmcaddr6 *ma)
 static int ip6_mc_leave_src(struct sock *sk, struct ipv6_mc_socklist *iml,
 			    struct inet6_dev *idev)
 {
+	struct ip6_sf_socklist *psl;
 	int err;
 
-	write_lock_bh(&iml->sflock);
-	if (!iml->sflist) {
+	psl = rtnl_dereference(iml->sflist);
+
+	if (!psl) {
 		/* any-source empty exclude case */
 		err = ip6_mc_del_src(idev, &iml->addr, iml->sfmode, 0, NULL, 0);
 	} else {
 		err = ip6_mc_del_src(idev, &iml->addr, iml->sfmode,
-				     iml->sflist->sl_count, iml->sflist->sl_addr, 0);
-		sock_kfree_s(sk, iml->sflist, IP6_SFLSIZE(iml->sflist->sl_max));
-		iml->sflist = NULL;
+				     psl->sl_count, psl->sl_addr, 0);
+		RCU_INIT_POINTER(iml->sflist, NULL);
+		atomic_sub(IP6_SFLSIZE(psl->sl_max), &sk->sk_omem_alloc);
+		kfree_rcu(psl, rcu);
 	}
-	write_unlock_bh(&iml->sflock);
 	return err;
 }
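
After this conversion the usual RCU split applies: data-path readers
dereference under rcu_read_lock(), control-path updaters rely on RTNL
for mutual exclusion and free the old filter list only after a grace
period. A minimal sketch of the two sides (function names are
illustrative, the struct fields are the post-patch ones):

	/* data path reader */
	static bool filter_has_addr(struct ipv6_mc_socklist *mc,
				    const struct in6_addr *addr)
	{
		struct ip6_sf_socklist *psl;
		bool found = false;
		int i;

		rcu_read_lock();
		psl = rcu_dereference(mc->sflist);
		if (psl)
			for (i = 0; i < psl->sl_count; i++)
				found |= ipv6_addr_equal(&psl->sl_addr[i], addr);
		rcu_read_unlock();
		return found;
	}

	/* control path updater, caller holds RTNL */
	static void filter_replace(struct ipv6_mc_socklist *mc,
				   struct ip6_sf_socklist *newpsl)
	{
		struct ip6_sf_socklist *old = rtnl_dereference(mc->sflist);

		rcu_assign_pointer(mc->sflist, newpsl);
		if (old)
			kfree_rcu(old, rcu);	/* freed after a grace period */
	}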
From patchwork Sat Feb 13 17:52:57 2021
X-Patchwork-Submitter: Taehee Yoo
X-Patchwork-Id: 12086987
X-Patchwork-Delegate: kuba@kernel.org
From: Taehee Yoo
To: davem@davemloft.net, kuba@kernel.org, xiyou.wangcong@gmail.com,
    netdev@vger.kernel.org, jwi@linux.ibm.com, kgraul@linux.ibm.com,
    hca@linux.ibm.com, gor@linux.ibm.com, borntraeger@de.ibm.com,
    mareklindner@neomailbox.ch, sw@simonwunderlich.de, a@unstable.cc,
    sven@narfation.org, yoshfuji@linux-ipv6.org, dsahern@kernel.org
Cc: Taehee Yoo
Subject: [PATCH net-next v2 6/7] mld: convert ip6_sf_list to RCU
Date: Sat, 13 Feb 2021 17:52:57 +0000
Message-Id: <20210213175257.28642-1-ap420073@gmail.com>

The ip6_sf_list has been protected by mca_lock (a spinlock), so its
critical section is atomic context. In order to switch this context,
the locking must change. The ip6_sf_list is actually already protected
by RTNL, so if it is converted to use RCU, its control-path context can
be switched to sleepable. But this patch does not remove mca_lock yet,
because ifmcaddr6 is not converted to RCU yet. So, it is not yet fully
converted to a sleepable context.

Suggested-by: Cong Wang
Signed-off-by: Taehee Yoo
Reported-by: kernel test robot
---
v1 -> v2:
 - Separated from the previous big patch.

 include/net/if_inet6.h |   5 +-
 net/ipv6/mcast.c       | 150 +++++++++++++++++++++++++----------------
 2 files changed, 95 insertions(+), 60 deletions(-)

diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index d8507bef0a0c..b26fecb669e3 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -97,12 +97,13 @@ struct ipv6_mc_socklist {
 };
 
 struct ip6_sf_list {
-	struct ip6_sf_list	*sf_next;
+	struct ip6_sf_list __rcu *sf_next;
 	struct in6_addr		sf_addr;
 	unsigned long		sf_count[2];	/* include/exclude counts */
 	unsigned char		sf_gsresp;	/* include in g & s response? */
 	unsigned char		sf_oldin;	/* change state */
 	unsigned char		sf_crcount;	/* retrans. left to send */
+	struct rcu_head		rcu;
 };
 
 #define MAF_TIMER_RUNNING	0x01
@@ -113,7 +114,7 @@ struct ifmcaddr6 {
 	struct in6_addr		mca_addr;
 	struct inet6_dev	*idev;
 	struct ifmcaddr6	*next;
-	struct ip6_sf_list	*mca_sources;
+	struct ip6_sf_list __rcu *mca_sources;
 	struct ip6_sf_list	*mca_tomb;
 	unsigned int		mca_sfmode;
 	unsigned char		mca_crcount;
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index cffa2eeb88c5..792f16e2ad83 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -113,10 +113,20 @@ int sysctl_mld_qrv __read_mostly = MLD_QRV_DEFAULT;
  */
 
 #define for_each_pmc_rcu(np, pmc)				\
-	for (pmc = rcu_dereference(np->ipv6_mc_list);		\
-	     pmc != NULL;					\
+	for (pmc = rcu_dereference((np)->ipv6_mc_list);		\
+	     pmc;						\
 	     pmc = rcu_dereference(pmc->next))
 
+#define for_each_psf_rtnl(mc, psf)				\
+	for (psf = rtnl_dereference((mc)->mca_sources);		\
+	     psf;						\
+	     psf = rtnl_dereference(psf->sf_next))
+
+#define for_each_psf_rcu(mc, psf)				\
+	for (psf = rcu_dereference((mc)->mca_sources);		\
+	     psf;						\
+	     psf = rcu_dereference(psf->sf_next))
+
 static int unsolicited_report_interval(struct inet6_dev *idev)
 {
 	int iv;
@@ -731,22 +741,25 @@ static void mld_add_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im)
 		struct ip6_sf_list *psf;
 
 		pmc->mca_tomb = im->mca_tomb;
-		pmc->mca_sources = im->mca_sources;
-		im->mca_tomb = im->mca_sources = NULL;
-		for (psf = pmc->mca_sources; psf; psf = psf->sf_next)
+		rcu_assign_pointer(pmc->mca_sources,
+				   rtnl_dereference(im->mca_sources));
+		im->mca_tomb = NULL;
+		RCU_INIT_POINTER(im->mca_sources, NULL);
+
+		for_each_psf_rtnl(pmc, psf)
 			psf->sf_crcount = pmc->mca_crcount;
 	}
 	spin_unlock_bh(&im->mca_lock);
 
-	pmc->next = idev->mc_tomb;
+	rcu_assign_pointer(pmc->next, idev->mc_tomb);
 	idev->mc_tomb = pmc;
 }
 
 static void mld_del_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im)
 {
-	struct ifmcaddr6 *pmc, *pmc_prev;
-	struct ip6_sf_list *psf;
 	struct in6_addr *pmca = &im->mca_addr;
+	struct ip6_sf_list *psf, *sources;
+	struct ifmcaddr6 *pmc, *pmc_prev;
 
 	pmc_prev = NULL;
 	for (pmc = idev->mc_tomb; pmc; pmc = pmc->next) {
@@ -766,8 +779,12 @@ static void mld_del_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im)
 		im->idev = pmc->idev;
 		if (im->mca_sfmode == MCAST_INCLUDE) {
 			swap(im->mca_tomb, pmc->mca_tomb);
-			swap(im->mca_sources, pmc->mca_sources);
-			for (psf = im->mca_sources; psf; psf = psf->sf_next)
+
+			sources = rcu_replace_pointer(im->mca_sources,
+						      pmc->mca_sources,
+						      lockdep_rtnl_is_held());
+			rcu_assign_pointer(pmc->mca_sources, sources);
+			for_each_psf_rtnl(im, psf)
 				psf->sf_crcount = idev->mc_qrv;
 		} else {
 			im->mca_crcount = idev->mc_qrv;
@@ -803,8 +820,8 @@ static void mld_clear_delrec(struct inet6_dev *idev)
 		pmc->mca_tomb = NULL;
 		spin_unlock_bh(&pmc->mca_lock);
 		for (; psf; psf = psf_next) {
-			psf_next = psf->sf_next;
-			kfree(psf);
+			psf_next = rtnl_dereference(psf->sf_next);
+			kfree_rcu(psf, rcu);
 		}
 	}
 	read_unlock_bh(&idev->lock);
@@ -986,7 +1003,7 @@ bool ipv6_chk_mcast_addr(struct net_device *dev, const struct in6_addr *group,
 			struct ip6_sf_list *psf;
 
 			spin_lock_bh(&mc->mca_lock);
-			for (psf = mc->mca_sources; psf; psf = psf->sf_next) {
+			for_each_psf_rcu(mc, psf) {
 				if (ipv6_addr_equal(&psf->sf_addr, src_addr))
 					break;
 			}
@@ -1101,7 +1118,7 @@ static bool mld_xmarksources(struct ifmcaddr6 *pmc, int nsrcs,
 	int i, scount;
 
 	scount = 0;
-	for (psf = pmc->mca_sources; psf; psf = psf->sf_next) {
+	for_each_psf_rcu(pmc, psf) {
 		if (scount == nsrcs)
 			break;
 		for (i = 0; i < nsrcs; i++) {
@@ -1134,7 +1151,7 @@ static bool mld_marksources(struct ifmcaddr6 *pmc, int nsrcs,
 	/* mark INCLUDE-mode sources */
 	scount = 0;
-	for (psf = pmc->mca_sources; psf; psf = psf->sf_next) {
+	for_each_psf_rcu(pmc, psf) {
 		if (scount == nsrcs)
 			break;
 		for (i = 0; i < nsrcs; i++) {
@@ -1544,7 +1561,7 @@ mld_scount(struct ifmcaddr6 *pmc, int type, int gdeleted, int sdeleted)
 	struct ip6_sf_list *psf;
 	int scount = 0;
 
-	for (psf = pmc->mca_sources; psf; psf = psf->sf_next) {
+	for_each_psf_rtnl(pmc, psf) {
 		if (!is_in(pmc, psf, type, gdeleted, sdeleted))
 			continue;
 		scount++;
@@ -1719,14 +1736,16 @@ static struct sk_buff *add_grhead(struct sk_buff *skb, struct ifmcaddr6 *pmc,
 #define AVAILABLE(skb)	((skb) ? skb_availroom(skb) : 0)
 
 static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
-	int type, int gdeleted, int sdeleted, int crsend)
+				int type, int gdeleted, int sdeleted,
+				int crsend)
 {
+	int scount, stotal, first, isquery, truncate;
+	struct ip6_sf_list __rcu **psf_list;
+	struct ip6_sf_list *psf, *psf_prev;
 	struct inet6_dev *idev = pmc->idev;
 	struct net_device *dev = idev->dev;
-	struct mld2_report *pmr;
 	struct mld2_grec *pgr = NULL;
-	struct ip6_sf_list *psf, *psf_next, *psf_prev, **psf_list;
-	int scount, stotal, first, isquery, truncate;
+	struct mld2_report *pmr;
 	unsigned int mtu;
 
 	if (pmc->mca_noreport)
@@ -1743,10 +1762,13 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
 
 	stotal = scount = 0;
 
-	psf_list = sdeleted ? &pmc->mca_tomb : &pmc->mca_sources;
-
-	if (!*psf_list)
-		goto empty_source;
+	if (sdeleted) {
+		if (!pmc->mca_tomb)
+			goto empty_source;
+	} else {
+		if (!rcu_access_pointer(pmc->mca_sources))
+			goto empty_source;
+	}
 
 	pmr = skb ? (struct mld2_report *)skb_transport_header(skb) : NULL;
@@ -1761,10 +1783,12 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
 	}
 	first = 1;
 	psf_prev = NULL;
-	for (psf = *psf_list; psf; psf = psf_next) {
+	for (psf_list = &pmc->mca_sources;
+	     (psf = rtnl_dereference(*psf_list)) != NULL;
+	     psf_list = &psf->sf_next) {
 		struct in6_addr *psrc;
 
-		psf_next = psf->sf_next;
+		*psf_list = rtnl_dereference(psf->sf_next);
 
 		if (!is_in(pmc, psf, type, gdeleted, sdeleted) && !crsend) {
 			psf_prev = psf;
@@ -1811,10 +1835,11 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc,
 			psf->sf_crcount--;
 			if ((sdeleted || gdeleted) && psf->sf_crcount == 0) {
 				if (psf_prev)
-					psf_prev->sf_next = psf->sf_next;
+					rcu_assign_pointer(psf_prev->sf_next,
+							   rtnl_dereference(psf->sf_next));
 				else
-					*psf_list = psf->sf_next;
-				kfree(psf);
+					*psf_list = rtnl_dereference(psf->sf_next);
+				kfree_rcu(psf, rcu);
 				continue;
 			}
 		}
@@ -1883,16 +1908,18 @@ static void mld_clear_zeros(struct ip6_sf_list **ppsf)
 	struct ip6_sf_list *psf_prev, *psf_next, *psf;
 
 	psf_prev = NULL;
-	for (psf = *ppsf; psf; psf = psf_next) {
-		psf_next = psf->sf_next;
+	for (psf = rtnl_dereference(*ppsf); psf; psf = psf_next) {
+		psf_next = rtnl_dereference(psf->sf_next);
 		if (psf->sf_crcount == 0) {
 			if (psf_prev)
-				psf_prev->sf_next = psf->sf_next;
+				rcu_assign_pointer(psf_prev->sf_next,
+						   rtnl_dereference(psf->sf_next));
 			else
-				*ppsf = psf->sf_next;
-			kfree(psf);
-		} else
+				*ppsf = rtnl_dereference(psf->sf_next);
+			kfree_rcu(psf, rcu);
+		} else {
 			psf_prev = psf;
+		}
 	}
 }
@@ -2137,7 +2164,7 @@ static int ip6_mc_del1_src(struct ifmcaddr6 *pmc, int sfmode,
 	int rv = 0;
 
 	psf_prev = NULL;
-	for (psf = pmc->mca_sources; psf; psf = psf->sf_next) {
+	for_each_psf_rtnl(pmc, psf) {
 		if (ipv6_addr_equal(&psf->sf_addr, psfsrc))
 			break;
 		psf_prev = psf;
+2179,20 @@ static int ip6_mc_del1_src(struct ifmcaddr6 *pmc, int sfmode, /* no more filters for this source */ if (psf_prev) - psf_prev->sf_next = psf->sf_next; + rcu_assign_pointer(psf_prev->sf_next, + rtnl_dereference(psf->sf_next)); else - pmc->mca_sources = psf->sf_next; + rcu_assign_pointer(pmc->mca_sources, + rtnl_dereference(psf->sf_next)); if (psf->sf_oldin && !pmc->mca_noreport && !mld_in_v1_mode(idev)) { psf->sf_crcount = idev->mc_qrv; - psf->sf_next = pmc->mca_tomb; + rcu_assign_pointer(psf->sf_next, pmc->mca_tomb); pmc->mca_tomb = psf; rv = 1; - } else - kfree(psf); + } else { + kfree_rcu(psf, rcu); + } } return rv; } @@ -2214,7 +2244,7 @@ static int ip6_mc_del_src(struct inet6_dev *idev, const struct in6_addr *pmca, pmc->mca_sfmode = MCAST_INCLUDE; pmc->mca_crcount = idev->mc_qrv; idev->mc_ifc_count = pmc->mca_crcount; - for (psf = pmc->mca_sources; psf; psf = psf->sf_next) + for_each_psf_rtnl(pmc, psf) psf->sf_crcount = 0; mld_ifc_event(pmc->idev); } else if (sf_setstate(pmc) || changerec) @@ -2233,7 +2263,7 @@ static int ip6_mc_add1_src(struct ifmcaddr6 *pmc, int sfmode, struct ip6_sf_list *psf, *psf_prev; psf_prev = NULL; - for (psf = pmc->mca_sources; psf; psf = psf->sf_next) { + for_each_psf_rtnl(pmc, psf) { if (ipv6_addr_equal(&psf->sf_addr, psfsrc)) break; psf_prev = psf; @@ -2245,9 +2275,10 @@ static int ip6_mc_add1_src(struct ifmcaddr6 *pmc, int sfmode, psf->sf_addr = *psfsrc; if (psf_prev) { - psf_prev->sf_next = psf; - } else - pmc->mca_sources = psf; + rcu_assign_pointer(psf_prev->sf_next, psf); + } else { + rcu_assign_pointer(pmc->mca_sources, psf); + } } psf->sf_count[sfmode]++; return 0; @@ -2258,13 +2289,15 @@ static void sf_markstate(struct ifmcaddr6 *pmc) struct ip6_sf_list *psf; int mca_xcount = pmc->mca_sfcount[MCAST_EXCLUDE]; - for (psf = pmc->mca_sources; psf; psf = psf->sf_next) + for_each_psf_rtnl(pmc, psf) { if (pmc->mca_sfcount[MCAST_EXCLUDE]) { psf->sf_oldin = mca_xcount == psf->sf_count[MCAST_EXCLUDE] && !psf->sf_count[MCAST_INCLUDE]; - } else + } else { psf->sf_oldin = psf->sf_count[MCAST_INCLUDE] != 0; + } + } } static int sf_setstate(struct ifmcaddr6 *pmc) @@ -2275,7 +2308,7 @@ static int sf_setstate(struct ifmcaddr6 *pmc) int new_in, rv; rv = 0; - for (psf = pmc->mca_sources; psf; psf = psf->sf_next) { + for_each_psf_rtnl(pmc, psf) { if (pmc->mca_sfcount[MCAST_EXCLUDE]) { new_in = mca_xcount == psf->sf_count[MCAST_EXCLUDE] && !psf->sf_count[MCAST_INCLUDE]; @@ -2317,7 +2350,6 @@ static int sf_setstate(struct ifmcaddr6 *pmc) if (!dpsf) continue; *dpsf = *psf; - /* pmc->mca_lock held by callers */ dpsf->sf_next = pmc->mca_tomb; pmc->mca_tomb = dpsf; } @@ -2382,7 +2414,7 @@ static int ip6_mc_add_src(struct inet6_dev *idev, const struct in6_addr *pmca, pmc->mca_crcount = idev->mc_qrv; idev->mc_ifc_count = pmc->mca_crcount; - for (psf = pmc->mca_sources; psf; psf = psf->sf_next) + for_each_psf_rtnl(pmc, psf) psf->sf_crcount = 0; mld_ifc_event(idev); } else if (sf_setstate(pmc)) @@ -2401,11 +2433,13 @@ static void ip6_mc_clear_src(struct ifmcaddr6 *pmc) kfree(psf); } pmc->mca_tomb = NULL; - for (psf = pmc->mca_sources; psf; psf = nextpsf) { - nextpsf = psf->sf_next; - kfree(psf); + for (psf = rtnl_dereference(pmc->mca_sources); + psf; + psf = nextpsf) { + nextpsf = rtnl_dereference(psf->sf_next); + kfree_rcu(psf, rcu); } - pmc->mca_sources = NULL; + RCU_INIT_POINTER(pmc->mca_sources, NULL); pmc->mca_sfmode = MCAST_EXCLUDE; pmc->mca_sfcount[MCAST_INCLUDE] = 0; pmc->mca_sfcount[MCAST_EXCLUDE] = 1; @@ -2823,7 +2857,7 @@ static inline struct ip6_sf_list 
*igmp6_mcf_get_first(struct seq_file *seq) im = idev->mc_list; if (likely(im)) { spin_lock_bh(&im->mca_lock); - psf = im->mca_sources; + psf = rcu_dereference(im->mca_sources); if (likely(psf)) { state->im = im; state->idev = idev; @@ -2840,7 +2874,7 @@ static struct ip6_sf_list *igmp6_mcf_get_next(struct seq_file *seq, struct ip6_s { struct igmp6_mcf_iter_state *state = igmp6_mcf_seq_private(seq); - psf = psf->sf_next; + psf = rcu_dereference(psf->sf_next); while (!psf) { spin_unlock_bh(&state->im->mca_lock); state->im = state->im->next; @@ -2862,7 +2896,7 @@ static struct ip6_sf_list *igmp6_mcf_get_next(struct seq_file *seq, struct ip6_s if (!state->im) break; spin_lock_bh(&state->im->mca_lock); - psf = state->im->mca_sources; + psf = rcu_dereference(state->im->mca_sources); } out: return psf;

From patchwork Sat Feb 13 17:53:15 2021
X-Patchwork-Submitter: Taehee Yoo
X-Patchwork-Id: 12086989
X-Patchwork-Delegate: kuba@kernel.org
From: Taehee Yoo
To: davem@davemloft.net, kuba@kernel.org, xiyou.wangcong@gmail.com, netdev@vger.kernel.org, jwi@linux.ibm.com, kgraul@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, borntraeger@de.ibm.com, mareklindner@neomailbox.ch, sw@simonwunderlich.de, a@unstable.cc, sven@narfation.org, yoshfuji@linux-ipv6.org, dsahern@kernel.org
Cc: Taehee Yoo
Subject: [PATCH net-next v2 7/7] mld: convert ifmcaddr6 to RCU
Date: Sat, 13 Feb 2021 17:53:15 +0000
Message-Id: <20210213175315.28717-1-ap420073@gmail.com>
X-Mailer: git-send-email 2.17.1

The ifmcaddr6 has been protected by inet6_dev->lock (an rwlock), so its critical sections run in atomic context. In order to make that context sleepable, the locking has to change. The ifmcaddr6 is in fact already protected by RTNL, so if it is converted to use RCU, its control-path context can be switched to sleepable. This conversion changes the locking scenario to the following:

1. ifmcaddr6->mca_lock protects only the following resources:
   a) ifmcaddr6->mca_flags
   b) ifmcaddr6->mca_work
   c) ifmcaddr6->mca_sources->sf_gsresp
2. inet6_dev->lock protects only the following resources:
   a) inet6_dev->mc_gq_running
   b) inet6_dev->mc_gq_work
   c) inet6_dev->mc_ifc_count
   d) inet6_dev->mc_ifc_work
   e) inet6_dev->mc_delrec_work
3. All other resources are protected by RTNL and RCU.

Only two atomic-context locks remain, ifmcaddr6->mca_lock and inet6_dev->lock. They protect fields that are written on the datapath, where RTNL cannot be taken (see the sketch after the changelog below).

Suggested-by: Cong Wang
Signed-off-by: Taehee Yoo
Reported-by: kernel test robot
---
v1 -> v2:
 - Separated from the previous big patch.
 - Do not rename 'ifmcaddr6->mca_lock'.
 - Do not use atomic_t for 'ifmcaddr6->mca_sfcount'.
 - Do not use atomic_t for 'ipv6_mc_socklist->sf_count'.
 - Do not add mld_check_leave_group() function.
 - Do not add ip6_mc_del_src_bulk() function.
 - Do not add ip6_mc_add_src_bulk() function.
 - Do not use rcu_read_lock() in the qeth_l3_add_mcast_rtnl().
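[For readers new to this pattern, here is a minimal, self-contained sketch of the RTNL-writer / RCU-reader discipline the series converts mc_list to. It is not part of the patch; the struct and function names (demo_mc, demo_*) are illustrative only. Datapath readers walk the list under rcu_read_lock(), while the control path, already serialized by RTNL, publishes entries with rcu_assign_pointer() and frees them only after a grace period via kfree_rcu().]

/* Illustrative sketch, not from the patch: demo_* names are hypothetical. */
#include <linux/errno.h>
#include <linux/rcupdate.h>
#include <linux/rtnetlink.h>
#include <linux/slab.h>

struct demo_mc {
	struct demo_mc __rcu *next;
	int addr;
	struct rcu_head rcu;
};

static struct demo_mc __rcu *demo_list;

/* Datapath reader: may run in BH context, must not sleep. */
static bool demo_lookup(int addr)
{
	struct demo_mc *mc;
	bool found = false;

	rcu_read_lock();
	for (mc = rcu_dereference(demo_list); mc;
	     mc = rcu_dereference(mc->next)) {
		if (mc->addr == addr) {
			found = true;
			break;
		}
	}
	rcu_read_unlock();
	return found;
}

/* Control-path writer: caller holds RTNL, so it may block. */
static int demo_add(int addr)
{
	struct demo_mc *mc;

	ASSERT_RTNL();
	mc = kzalloc(sizeof(*mc), GFP_KERNEL); /* sleepable allocation is now legal */
	if (!mc)
		return -ENOMEM;
	mc->addr = addr;
	/* Publish: pairs with rcu_dereference() in the reader. */
	rcu_assign_pointer(mc->next, rtnl_dereference(demo_list));
	rcu_assign_pointer(demo_list, mc);
	return 0;
}

/* Control-path removal: unlink under RTNL, free after a grace period. */
static void demo_del(int addr)
{
	struct demo_mc __rcu **pp = &demo_list;
	struct demo_mc *mc;

	ASSERT_RTNL();
	while ((mc = rtnl_dereference(*pp)) != NULL) {
		if (mc->addr == addr) {
			rcu_assign_pointer(*pp, rtnl_dereference(mc->next));
			kfree_rcu(mc, rcu); /* readers may still hold mc */
			return;
		}
		pp = &mc->next;
	}
}

[The payoff mirrored by this patch: because writers are serialized by RTNL instead of a BH-disabled rwlock, they may use GFP_KERNEL and other sleepable APIs, while readers in BH context pay only the cost of rcu_read_lock().]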
drivers/s390/net/qeth_l3_main.c | 6 +- include/net/if_inet6.h | 5 +- net/batman-adv/multicast.c | 6 +- net/ipv6/addrconf.c | 9 +- net/ipv6/addrconf_core.c | 2 +- net/ipv6/af_inet6.c | 2 +- net/ipv6/mcast.c | 340 ++++++++++++++------------------ 7 files changed, 169 insertions(+), 201 deletions(-) diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c index dd441eaec66e..5a0ba65971cc 100644 --- a/drivers/s390/net/qeth_l3_main.c +++ b/drivers/s390/net/qeth_l3_main.c @@ -1098,8 +1098,9 @@ static int qeth_l3_add_mcast_rtnl(struct net_device *dev, int vid, void *arg) tmp.disp_flag = QETH_DISP_ADDR_ADD; tmp.is_multicast = 1; - read_lock_bh(&in6_dev->lock); - for (im6 = in6_dev->mc_list; im6 != NULL; im6 = im6->next) { + for (im6 = rtnl_dereference(in6_dev->mc_list); + im6; + im6 = rtnl_dereference(im6->next)) { tmp.u.a6.addr = im6->mca_addr; ipm = qeth_l3_find_addr_by_ip(card, &tmp); @@ -1117,7 +1118,6 @@ static int qeth_l3_add_mcast_rtnl(struct net_device *dev, int vid, void *arg) qeth_l3_ipaddr_hash(ipm)); } - read_unlock_bh(&in6_dev->lock); out: return 0; diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h index b26fecb669e3..9e7f5b4bf7ae 100644 --- a/include/net/if_inet6.h +++ b/include/net/if_inet6.h @@ -113,7 +113,7 @@ struct ip6_sf_list { struct ifmcaddr6 { struct in6_addr mca_addr; struct inet6_dev *idev; - struct ifmcaddr6 *next; + struct ifmcaddr6 __rcu *next; struct ip6_sf_list __rcu *mca_sources; struct ip6_sf_list *mca_tomb; unsigned int mca_sfmode; @@ -128,6 +128,7 @@ struct ifmcaddr6 { spinlock_t mca_lock; unsigned long mca_cstamp; unsigned long mca_tstamp; + struct rcu_head rcu; }; /* Anycast stuff */ @@ -166,7 +167,7 @@ struct inet6_dev { struct list_head addr_list; - struct ifmcaddr6 *mc_list; + struct ifmcaddr6 __rcu *mc_list; struct ifmcaddr6 *mc_tomb; unsigned char mc_qrv; /* Query Robustness Variable */ diff --git a/net/batman-adv/multicast.c b/net/batman-adv/multicast.c index 28166402d30c..1d63c8cbbfe7 100644 --- a/net/batman-adv/multicast.c +++ b/net/batman-adv/multicast.c @@ -454,8 +454,9 @@ batadv_mcast_mla_softif_get_ipv6(struct net_device *dev, return 0; } - read_lock_bh(&in6_dev->lock); - for (pmc6 = in6_dev->mc_list; pmc6; pmc6 = pmc6->next) { + for (pmc6 = rcu_dereference(in6_dev->mc_list); + pmc6; + pmc6 = rcu_dereference(pmc6->next)) { if (IPV6_ADDR_MC_SCOPE(&pmc6->mca_addr) < IPV6_ADDR_SCOPE_LINKLOCAL) continue; @@ -484,7 +485,6 @@ batadv_mcast_mla_softif_get_ipv6(struct net_device *dev, hlist_add_head(&new->list, mcast_list); ret++; } - read_unlock_bh(&in6_dev->lock); rcu_read_unlock(); return ret; diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index f2337fb756ac..b502f78d5091 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -5107,17 +5107,20 @@ static int in6_dump_addrs(struct inet6_dev *idev, struct sk_buff *skb, break; } case MULTICAST_ADDR: + read_unlock_bh(&idev->lock); fillargs->event = RTM_GETMULTICAST; /* multicast address */ - for (ifmca = idev->mc_list; ifmca; - ifmca = ifmca->next, ip_idx++) { + for (ifmca = rcu_dereference(idev->mc_list); + ifmca; + ifmca = rcu_dereference(ifmca->next), ip_idx++) { if (ip_idx < s_ip_idx) continue; err = inet6_fill_ifmcaddr(skb, ifmca, fillargs); if (err < 0) break; } + read_lock_bh(&idev->lock); break; case ANYCAST_ADDR: fillargs->event = RTM_GETANYCAST; @@ -6093,10 +6096,8 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) { - rcu_read_lock_bh(); if (likely(ifp->idev->dead == 0)) 
__ipv6_ifa_notify(event, ifp); - rcu_read_unlock_bh(); } #ifdef CONFIG_SYSCTL diff --git a/net/ipv6/addrconf_core.c b/net/ipv6/addrconf_core.c index c70c192bc91b..a36626afbc02 100644 --- a/net/ipv6/addrconf_core.c +++ b/net/ipv6/addrconf_core.c @@ -250,7 +250,7 @@ void in6_dev_finish_destroy(struct inet6_dev *idev) struct net_device *dev = idev->dev; WARN_ON(!list_empty(&idev->addr_list)); - WARN_ON(idev->mc_list); + WARN_ON(rcu_access_pointer(idev->mc_list)); WARN_ON(timer_pending(&idev->rs_timer)); #ifdef NET_REFCNT_DEBUG diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 0e9994e0ecd7..8de7eb53e5ea 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -222,7 +222,7 @@ static int inet6_create(struct net *net, struct socket *sock, int protocol, inet->mc_loop = 1; inet->mc_ttl = 1; inet->mc_index = 0; - inet->mc_list = NULL; + RCU_INIT_POINTER(inet->mc_list, NULL); inet->rcv_tos = 0; if (net->ipv4.sysctl_ip_no_pmtu_disc) diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c index 792f16e2ad83..2b33196d6a84 100644 --- a/net/ipv6/mcast.c +++ b/net/ipv6/mcast.c @@ -112,11 +112,26 @@ int sysctl_mld_qrv __read_mostly = MLD_QRV_DEFAULT; * socket join on multicast group */ +#define for_each_pmc_rtnl(np, pmc) \ + for (pmc = rtnl_dereference((np)->ipv6_mc_list); \ + pmc; \ + pmc = rtnl_dereference(pmc->next)) + #define for_each_pmc_rcu(np, pmc) \ for (pmc = rcu_dereference((np)->ipv6_mc_list); \ pmc; \ pmc = rcu_dereference(pmc->next)) +#define for_each_mc_rtnl(idev, mc) \ + for (mc = rtnl_dereference((idev)->mc_list); \ + mc; \ + mc = rtnl_dereference(mc->next)) + +#define for_each_mc_rcu(idev, mc) \ + for (mc = rcu_dereference((idev)->mc_list); \ + mc; \ + mc = rcu_dereference(mc->next)) + #define for_each_psf_rtnl(mc, psf) \ for (psf = rtnl_dereference((mc)->mca_sources); \ psf; \ @@ -153,15 +168,11 @@ static int __ipv6_sock_mc_join(struct sock *sk, int ifindex, if (!ipv6_addr_is_multicast(addr)) return -EINVAL; - rcu_read_lock(); - for_each_pmc_rcu(np, mc_lst) { + for_each_pmc_rtnl(np, mc_lst) { if ((ifindex == 0 || mc_lst->ifindex == ifindex) && - ipv6_addr_equal(&mc_lst->addr, addr)) { - rcu_read_unlock(); + ipv6_addr_equal(&mc_lst->addr, addr)) return -EADDRINUSE; - } } - rcu_read_unlock(); mc_lst = sock_kmalloc(sk, sizeof(struct ipv6_mc_socklist), GFP_KERNEL); @@ -263,10 +274,9 @@ int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const struct in6_addr *addr) } EXPORT_SYMBOL(ipv6_sock_mc_drop); -/* called with rcu_read_lock() */ -static struct inet6_dev *ip6_mc_find_dev_rcu(struct net *net, - const struct in6_addr *group, - int ifindex) +static struct inet6_dev *ip6_mc_find_dev_rtnl(struct net *net, + const struct in6_addr *group, + int ifindex) { struct net_device *dev = NULL; struct inet6_dev *idev = NULL; @@ -279,18 +289,17 @@ static struct inet6_dev *ip6_mc_find_dev_rcu(struct net *net, ip6_rt_put(rt); } } else - dev = dev_get_by_index_rcu(net, ifindex); + dev = __dev_get_by_index(net, ifindex); if (!dev) return NULL; idev = __in6_dev_get(dev); if (!idev) return NULL; - read_lock_bh(&idev->lock); - if (idev->dead) { - read_unlock_bh(&idev->lock); + + if (idev->dead) return NULL; - } + return idev; } @@ -346,22 +355,21 @@ int ip6_mc_source(int add, int omode, struct sock *sk, int leavegroup = 0; int err; + ASSERT_RTNL(); + source = &((struct sockaddr_in6 *)&pgsr->gsr_source)->sin6_addr; group = &((struct sockaddr_in6 *)&pgsr->gsr_group)->sin6_addr; if (!ipv6_addr_is_multicast(group)) return -EINVAL; - rcu_read_lock(); - idev = ip6_mc_find_dev_rcu(net, group, 
pgsr->gsr_interface); - if (!idev) { - rcu_read_unlock(); + idev = ip6_mc_find_dev_rtnl(net, group, pgsr->gsr_interface); + if (!idev) return -ENODEV; - } err = -EADDRNOTAVAIL; - for_each_pmc_rcu(inet6, pmc) { + for_each_pmc_rtnl(inet6, pmc) { if (pgsr->gsr_interface && pmc->ifindex != pgsr->gsr_interface) continue; if (ipv6_addr_equal(&pmc->addr, group)) @@ -371,6 +379,7 @@ int ip6_mc_source(int add, int omode, struct sock *sk, err = -EINVAL; goto done; } + /* if a source filter was set, must be the same mode as before */ if (rcu_access_pointer(pmc->sflist)) { if (pmc->sfmode != omode) { @@ -424,7 +433,7 @@ int ip6_mc_source(int add, int omode, struct sock *sk, if (psl) count += psl->sl_max; - newpsl = sock_kmalloc(sk, IP6_SFLSIZE(count), GFP_ATOMIC); + newpsl = sock_kmalloc(sk, IP6_SFLSIZE(count), GFP_KERNEL); if (!newpsl) { err = -ENOBUFS; goto done; @@ -454,8 +463,6 @@ int ip6_mc_source(int add, int omode, struct sock *sk, /* update the interface list */ ip6_mc_add_src(idev, group, omode, 1, source, 1); done: - read_unlock_bh(&idev->lock); - rcu_read_unlock(); if (leavegroup) err = ipv6_sock_mc_drop(sk, pgsr->gsr_interface, group); return err; @@ -473,6 +480,8 @@ int ip6_mc_msfilter(struct sock *sk, struct group_filter *gsf, int leavegroup = 0; int i, err; + ASSERT_RTNL(); + group = &((struct sockaddr_in6 *)&gsf->gf_group)->sin6_addr; if (!ipv6_addr_is_multicast(group)) @@ -481,13 +490,10 @@ int ip6_mc_msfilter(struct sock *sk, struct group_filter *gsf, gsf->gf_fmode != MCAST_EXCLUDE) return -EINVAL; - rcu_read_lock(); - idev = ip6_mc_find_dev_rcu(net, group, gsf->gf_interface); + idev = ip6_mc_find_dev_rtnl(net, group, gsf->gf_interface); - if (!idev) { - rcu_read_unlock(); + if (!idev) return -ENODEV; - } err = 0; @@ -496,7 +502,7 @@ int ip6_mc_msfilter(struct sock *sk, struct group_filter *gsf, goto done; } - for_each_pmc_rcu(inet6, pmc) { + for_each_pmc_rtnl(inet6, pmc) { if (pmc->ifindex != gsf->gf_interface) continue; if (ipv6_addr_equal(&pmc->addr, group)) @@ -508,7 +514,7 @@ int ip6_mc_msfilter(struct sock *sk, struct group_filter *gsf, } if (gsf->gf_numsrc) { newpsl = sock_kmalloc(sk, IP6_SFLSIZE(gsf->gf_numsrc), - GFP_ATOMIC); + GFP_KERNEL); if (!newpsl) { err = -ENOBUFS; goto done; @@ -543,8 +549,6 @@ int ip6_mc_msfilter(struct sock *sk, struct group_filter *gsf, pmc->sfmode = gsf->gf_fmode; err = 0; done: - read_unlock_bh(&idev->lock); - rcu_read_unlock(); if (leavegroup) err = ipv6_sock_mc_drop(sk, gsf->gf_interface, group); return err; @@ -561,13 +565,14 @@ int ip6_mc_msfget(struct sock *sk, struct group_filter *gsf, struct ip6_sf_socklist *psl; struct net *net = sock_net(sk); + ASSERT_RTNL(); + group = &((struct sockaddr_in6 *)&gsf->gf_group)->sin6_addr; if (!ipv6_addr_is_multicast(group)) return -EINVAL; - rcu_read_lock(); - idev = ip6_mc_find_dev_rcu(net, group, gsf->gf_interface); + idev = ip6_mc_find_dev_rtnl(net, group, gsf->gf_interface); if (!idev) { rcu_read_unlock(); @@ -580,7 +585,7 @@ int ip6_mc_msfget(struct sock *sk, struct group_filter *gsf, * so reading the list is safe. */ - for_each_pmc_rcu(inet6, pmc) { + for_each_pmc_rtnl(inet6, pmc) { if (pmc->ifindex != gsf->gf_interface) continue; if (ipv6_addr_equal(group, &pmc->addr)) @@ -591,8 +596,6 @@ int ip6_mc_msfget(struct sock *sk, struct group_filter *gsf, gsf->gf_fmode = pmc->sfmode; psl = rtnl_dereference(pmc->sflist); count = psl ? psl->sl_count : 0; - read_unlock_bh(&idev->lock); - rcu_read_unlock(); copycount = count < gsf->gf_numsrc ? 
count : gsf->gf_numsrc; gsf->gf_numsrc = count; @@ -610,8 +613,6 @@ int ip6_mc_msfget(struct sock *sk, struct group_filter *gsf, } return 0; done: - read_unlock_bh(&idev->lock); - rcu_read_unlock(); return err; } @@ -726,11 +727,10 @@ static void mld_add_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im) * for deleted items allows change reports to use common code with * non-deleted or query-response MCA's. */ - pmc = kzalloc(sizeof(*pmc), GFP_ATOMIC); + pmc = kzalloc(sizeof(*pmc), GFP_KERNEL); if (!pmc) return; - spin_lock_bh(&im->mca_lock); spin_lock_init(&pmc->mca_lock); pmc->idev = im->idev; in6_dev_hold(idev); @@ -749,7 +749,6 @@ static void mld_add_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im) for_each_psf_rtnl(pmc, psf) psf->sf_crcount = pmc->mca_crcount; } - spin_unlock_bh(&im->mca_lock); rcu_assign_pointer(pmc->next, idev->mc_tomb); idev->mc_tomb = pmc; @@ -774,7 +773,6 @@ static void mld_del_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im) idev->mc_tomb = pmc->next; } - spin_lock_bh(&im->mca_lock); if (pmc) { im->idev = pmc->idev; if (im->mca_sfmode == MCAST_INCLUDE) { @@ -793,7 +791,6 @@ static void mld_del_delrec(struct inet6_dev *idev, struct ifmcaddr6 *im) ip6_mc_clear_src(pmc); kfree(pmc); } - spin_unlock_bh(&im->mca_lock); } static void mld_clear_delrec(struct inet6_dev *idev) @@ -811,20 +808,16 @@ static void mld_clear_delrec(struct inet6_dev *idev) } /* clear dead sources, too */ - read_lock_bh(&idev->lock); - for (pmc = idev->mc_list; pmc; pmc = pmc->next) { + for_each_mc_rtnl(idev, pmc) { struct ip6_sf_list *psf, *psf_next; - spin_lock_bh(&pmc->mca_lock); psf = pmc->mca_tomb; pmc->mca_tomb = NULL; - spin_unlock_bh(&pmc->mca_lock); for (; psf; psf = psf_next) { psf_next = rtnl_dereference(psf->sf_next); kfree_rcu(psf, rcu); } } - read_unlock_bh(&idev->lock); } static void mca_get(struct ifmcaddr6 *mc) @@ -836,7 +829,7 @@ static void ma_put(struct ifmcaddr6 *mc) { if (refcount_dec_and_test(&mc->mca_refcnt)) { in6_dev_put(mc->idev); - kfree(mc); + kfree_rcu(mc, rcu); } } @@ -846,7 +839,7 @@ static struct ifmcaddr6 *mca_alloc(struct inet6_dev *idev, { struct ifmcaddr6 *mc; - mc = kzalloc(sizeof(*mc), GFP_ATOMIC); + mc = kzalloc(sizeof(*mc), GFP_KERNEL); if (!mc) return NULL; @@ -887,17 +880,14 @@ static int __ipv6_dev_mc_inc(struct net_device *dev, if (!idev) return -EINVAL; - write_lock_bh(&idev->lock); if (idev->dead) { - write_unlock_bh(&idev->lock); in6_dev_put(idev); return -ENODEV; } - for (mc = idev->mc_list; mc; mc = mc->next) { + for_each_mc_rtnl(idev, mc) { if (ipv6_addr_equal(&mc->mca_addr, addr)) { mc->mca_users++; - write_unlock_bh(&idev->lock); ip6_mc_add_src(idev, &mc->mca_addr, mode, 0, NULL, 0); in6_dev_put(idev); return 0; @@ -906,19 +896,14 @@ static int __ipv6_dev_mc_inc(struct net_device *dev, mc = mca_alloc(idev, addr, mode); if (!mc) { - write_unlock_bh(&idev->lock); in6_dev_put(idev); return -ENOMEM; } - mc->next = idev->mc_list; - idev->mc_list = mc; + rcu_assign_pointer(mc->next, idev->mc_list); + rcu_assign_pointer(idev->mc_list, mc); - /* Hold this for the code below before we unlock, - * it is already exposed via idev->mc_list. 
- */ mca_get(mc); - write_unlock_bh(&idev->lock); mld_del_delrec(idev, mc); igmp6_group_added(mc); @@ -937,16 +922,17 @@ EXPORT_SYMBOL(ipv6_dev_mc_inc); */ int __ipv6_dev_mc_dec(struct inet6_dev *idev, const struct in6_addr *addr) { - struct ifmcaddr6 *ma, **map; + struct ifmcaddr6 __rcu **map; + struct ifmcaddr6 *ma; ASSERT_RTNL(); - write_lock_bh(&idev->lock); - for (map = &idev->mc_list; (ma = *map) != NULL; map = &ma->next) { + for (map = &idev->mc_list; + (ma = rtnl_dereference(*map)) != NULL; + map = &ma->next) { if (ipv6_addr_equal(&ma->mca_addr, addr)) { if (--ma->mca_users == 0) { - *map = ma->next; - write_unlock_bh(&idev->lock); + *map = rtnl_dereference(ma->next); igmp6_group_dropped(ma); ip6_mc_clear_src(ma); @@ -954,11 +940,9 @@ int __ipv6_dev_mc_dec(struct inet6_dev *idev, const struct in6_addr *addr) ma_put(ma); return 0; } - write_unlock_bh(&idev->lock); return 0; } } - write_unlock_bh(&idev->lock); return -ENOENT; } @@ -993,8 +977,7 @@ bool ipv6_chk_mcast_addr(struct net_device *dev, const struct in6_addr *group, rcu_read_lock(); idev = __in6_dev_get(dev); if (idev) { - read_lock_bh(&idev->lock); - for (mc = idev->mc_list; mc; mc = mc->next) { + for_each_mc_rcu(idev, mc) { if (ipv6_addr_equal(&mc->mca_addr, group)) break; } @@ -1002,7 +985,6 @@ bool ipv6_chk_mcast_addr(struct net_device *dev, const struct in6_addr *group, if (src_addr && !ipv6_addr_any(src_addr)) { struct ip6_sf_list *psf; - spin_lock_bh(&mc->mca_lock); for_each_psf_rcu(mc, psf) { if (ipv6_addr_equal(&psf->sf_addr, src_addr)) break; @@ -1013,11 +995,9 @@ bool ipv6_chk_mcast_addr(struct net_device *dev, const struct in6_addr *group, mc->mca_sfcount[MCAST_EXCLUDE]; else rv = mc->mca_sfcount[MCAST_EXCLUDE] != 0; - spin_unlock_bh(&mc->mca_lock); } else rv = true; /* don't filter unspecified source */ } - read_unlock_bh(&idev->lock); } rcu_read_unlock(); return rv; @@ -1027,16 +1007,20 @@ static void mld_gq_start_work(struct inet6_dev *idev) { unsigned long tv = prandom_u32() % idev->mc_maxdelay; + write_lock_bh(&idev->lock); idev->mc_gq_running = 1; if (!mod_delayed_work(mld_wq, &idev->mc_gq_work, msecs_to_jiffies(tv + 2))) in6_dev_hold(idev); + write_unlock_bh(&idev->lock); } static void mld_gq_stop_work(struct inet6_dev *idev) { + write_lock_bh(&idev->lock); idev->mc_gq_running = 0; if (cancel_delayed_work(&idev->mc_gq_work)) __in6_dev_put(idev); + write_unlock_bh(&idev->lock); } static void mld_ifc_start_work(struct inet6_dev *idev, unsigned long delay) @@ -1049,9 +1033,11 @@ static void mld_ifc_start_work(struct inet6_dev *idev, unsigned long delay) static void mld_ifc_stop_work(struct inet6_dev *idev) { + write_lock_bh(&idev->lock); idev->mc_ifc_count = 0; if (cancel_delayed_work(&idev->mc_ifc_work)) __in6_dev_put(idev); + write_unlock_bh(&idev->lock); } static void mld_dad_start_work(struct inet6_dev *idev, unsigned long delay) @@ -1084,10 +1070,9 @@ static void mld_clear_delrec_stop_work(struct inet6_dev *idev) write_unlock_bh(&idev->lock); } -/* - * IGMP handling (alias multicast ICMPv6 messages) +/* IGMP handling (alias multicast ICMPv6 messages) + * called with mca_lock */ - static void igmp6_group_queried(struct ifmcaddr6 *ma, unsigned long resptime) { unsigned long delay = resptime; @@ -1425,15 +1410,14 @@ int igmp6_event_query(struct sk_buff *skb) return -EINVAL; } - read_lock_bh(&idev->lock); if (group_type == IPV6_ADDR_ANY) { - for (ma = idev->mc_list; ma; ma = ma->next) { + for_each_mc_rcu(idev, ma) { spin_lock_bh(&ma->mca_lock); igmp6_group_queried(ma, max_delay); 
spin_unlock_bh(&ma->mca_lock); } } else { - for (ma = idev->mc_list; ma; ma = ma->next) { + for_each_mc_rcu(idev, ma) { if (!ipv6_addr_equal(group, &ma->mca_addr)) continue; spin_lock_bh(&ma->mca_lock); @@ -1455,7 +1439,6 @@ int igmp6_event_query(struct sk_buff *skb) break; } } - read_unlock_bh(&idev->lock); return 0; } @@ -1496,18 +1479,17 @@ int igmp6_event_report(struct sk_buff *skb) * Cancel the work for this group */ - read_lock_bh(&idev->lock); - for (ma = idev->mc_list; ma; ma = ma->next) { + for_each_mc_rcu(idev, ma) { if (ipv6_addr_equal(&ma->mca_addr, &mld->mld_mca)) { spin_lock(&ma->mca_lock); if (cancel_delayed_work(&ma->mca_work)) refcount_dec(&ma->mca_refcnt); - ma->mca_flags &= ~(MAF_LAST_REPORTER|MAF_TIMER_RUNNING); + ma->mca_flags &= ~(MAF_LAST_REPORTER | + MAF_TIMER_RUNNING); spin_unlock(&ma->mca_lock); break; } } - read_unlock_bh(&idev->lock); return 0; } @@ -1562,8 +1544,12 @@ mld_scount(struct ifmcaddr6 *pmc, int type, int gdeleted, int sdeleted) int scount = 0; for_each_psf_rtnl(pmc, psf) { - if (!is_in(pmc, psf, type, gdeleted, sdeleted)) + spin_lock_bh(&pmc->mca_lock); + if (!is_in(pmc, psf, type, gdeleted, sdeleted)) { + spin_unlock_bh(&pmc->mca_lock); continue; + } + spin_unlock_bh(&pmc->mca_lock); scount++; } return scount; @@ -1790,10 +1776,13 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc, *psf_list = rtnl_dereference(psf->sf_next); + spin_lock_bh(&pmc->mca_lock); if (!is_in(pmc, psf, type, gdeleted, sdeleted) && !crsend) { + spin_unlock_bh(&pmc->mca_lock); psf_prev = psf; continue; } + spin_unlock_bh(&pmc->mca_lock); /* Based on RFC3810 6.1. Should not send source-list change * records when there is a filter mode change. @@ -1805,8 +1794,11 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc, goto decrease_sf_crcount; /* clear marks on query responses */ - if (isquery) + if (isquery) { + spin_lock_bh(&pmc->mca_lock); psf->sf_gsresp = 0; + spin_unlock_bh(&pmc->mca_lock); + } if (AVAILABLE(skb) < sizeof(*psrc) + first*sizeof(struct mld2_grec)) { @@ -1863,8 +1855,11 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ifmcaddr6 *pmc, if (pgr) pgr->grec_nsrcs = htons(scount); - if (isquery) + if (isquery) { + spin_lock_bh(&pmc->mca_lock); pmc->mca_flags &= ~MAF_GSQUERY; /* clear query state */ + spin_unlock_bh(&pmc->mca_lock); + } return skb; } @@ -1873,29 +1868,24 @@ static void mld_send_report(struct inet6_dev *idev, struct ifmcaddr6 *pmc) struct sk_buff *skb = NULL; int type; - read_lock_bh(&idev->lock); if (!pmc) { - for (pmc = idev->mc_list; pmc; pmc = pmc->next) { + for_each_mc_rtnl(idev, pmc) { if (pmc->mca_noreport) continue; - spin_lock_bh(&pmc->mca_lock); if (pmc->mca_sfcount[MCAST_EXCLUDE]) type = MLD2_MODE_IS_EXCLUDE; else type = MLD2_MODE_IS_INCLUDE; skb = add_grec(skb, pmc, type, 0, 0, 0); - spin_unlock_bh(&pmc->mca_lock); } } else { - spin_lock_bh(&pmc->mca_lock); if (pmc->mca_sfcount[MCAST_EXCLUDE]) type = MLD2_MODE_IS_EXCLUDE; else type = MLD2_MODE_IS_INCLUDE; skb = add_grec(skb, pmc, type, 0, 0, 0); - spin_unlock_bh(&pmc->mca_lock); } - read_unlock_bh(&idev->lock); + if (skb) mld_sendpack(skb); } @@ -1929,8 +1919,6 @@ static void mld_send_cr(struct inet6_dev *idev) struct sk_buff *skb = NULL; int type, dtype; - read_lock_bh(&idev->lock); - /* deleted MCA's */ pmc_prev = NULL; for (pmc = idev->mc_tomb; pmc; pmc = pmc_next) { @@ -1965,8 +1953,7 @@ static void mld_send_cr(struct inet6_dev *idev) } /* change recs */ - for (pmc = idev->mc_list; pmc; pmc = pmc->next) { - 
spin_lock_bh(&pmc->mca_lock); + for_each_mc_rtnl(idev, pmc) { if (pmc->mca_sfcount[MCAST_EXCLUDE]) { type = MLD2_BLOCK_OLD_SOURCES; dtype = MLD2_ALLOW_NEW_SOURCES; @@ -1986,9 +1973,8 @@ static void mld_send_cr(struct inet6_dev *idev) skb = add_grec(skb, pmc, type, 0, 0, 0); pmc->mca_crcount--; } - spin_unlock_bh(&pmc->mca_lock); } - read_unlock_bh(&idev->lock); + if (!skb) return; (void) mld_sendpack(skb); @@ -2100,17 +2086,14 @@ static void mld_send_initial_cr(struct inet6_dev *idev) return; skb = NULL; - read_lock_bh(&idev->lock); - for (pmc = idev->mc_list; pmc; pmc = pmc->next) { - spin_lock_bh(&pmc->mca_lock); + for_each_mc_rtnl(idev, pmc) { if (pmc->mca_sfcount[MCAST_EXCLUDE]) type = MLD2_CHANGE_TO_EXCLUDE; else type = MLD2_ALLOW_NEW_SOURCES; skb = add_grec(skb, pmc, type, 0, 0, 1); - spin_unlock_bh(&pmc->mca_lock); } - read_unlock_bh(&idev->lock); + if (skb) mld_sendpack(skb); } @@ -2207,24 +2190,19 @@ static int ip6_mc_del_src(struct inet6_dev *idev, const struct in6_addr *pmca, if (!idev) return -ENODEV; - read_lock_bh(&idev->lock); - for (pmc = idev->mc_list; pmc; pmc = pmc->next) { + + for_each_mc_rtnl(idev, pmc) { if (ipv6_addr_equal(pmca, &pmc->mca_addr)) break; } - if (!pmc) { + if (!pmc) /* MCA not found?? bug */ - read_unlock_bh(&idev->lock); return -ESRCH; - } - spin_lock_bh(&pmc->mca_lock); + sf_markstate(pmc); if (!delta) { - if (!pmc->mca_sfcount[sfmode]) { - spin_unlock_bh(&pmc->mca_lock); - read_unlock_bh(&idev->lock); + if (!pmc->mca_sfcount[sfmode]) return -EINVAL; - } pmc->mca_sfcount[sfmode]--; } err = 0; @@ -2243,14 +2221,15 @@ static int ip6_mc_del_src(struct inet6_dev *idev, const struct in6_addr *pmca, /* filter mode change */ pmc->mca_sfmode = MCAST_INCLUDE; pmc->mca_crcount = idev->mc_qrv; + write_lock_bh(&idev->lock); idev->mc_ifc_count = pmc->mca_crcount; + write_unlock_bh(&idev->lock); for_each_psf_rtnl(pmc, psf) psf->sf_crcount = 0; mld_ifc_event(pmc->idev); } else if (sf_setstate(pmc) || changerec) mld_ifc_event(pmc->idev); - spin_unlock_bh(&pmc->mca_lock); - read_unlock_bh(&idev->lock); + return err; } @@ -2269,7 +2248,7 @@ static int ip6_mc_add1_src(struct ifmcaddr6 *pmc, int sfmode, psf_prev = psf; } if (!psf) { - psf = kzalloc(sizeof(*psf), GFP_ATOMIC); + psf = kzalloc(sizeof(*psf), GFP_KERNEL); if (!psf) return -ENOBUFS; @@ -2346,7 +2325,7 @@ static int sf_setstate(struct ifmcaddr6 *pmc) &psf->sf_addr)) break; if (!dpsf) { - dpsf = kmalloc(sizeof(*dpsf), GFP_ATOMIC); + dpsf = kmalloc(sizeof(*dpsf), GFP_KERNEL); if (!dpsf) continue; *dpsf = *psf; @@ -2373,17 +2352,14 @@ static int ip6_mc_add_src(struct inet6_dev *idev, const struct in6_addr *pmca, if (!idev) return -ENODEV; - read_lock_bh(&idev->lock); - for (pmc = idev->mc_list; pmc; pmc = pmc->next) { + + for_each_mc_rtnl(idev, pmc) { if (ipv6_addr_equal(pmca, &pmc->mca_addr)) break; } - if (!pmc) { + if (!pmc) /* MCA not found?? 
bug */ - read_unlock_bh(&idev->lock); return -ESRCH; - } - spin_lock_bh(&pmc->mca_lock); sf_markstate(pmc); isexclude = pmc->mca_sfmode == MCAST_EXCLUDE; @@ -2413,14 +2389,16 @@ static int ip6_mc_add_src(struct inet6_dev *idev, const struct in6_addr *pmca, /* else no filters; keep old mode for reports */ pmc->mca_crcount = idev->mc_qrv; + write_lock_bh(&idev->lock); idev->mc_ifc_count = pmc->mca_crcount; + write_unlock_bh(&idev->lock); for_each_psf_rtnl(pmc, psf) psf->sf_crcount = 0; mld_ifc_event(idev); - } else if (sf_setstate(pmc)) + } else if (sf_setstate(pmc)) { mld_ifc_event(idev); - spin_unlock_bh(&pmc->mca_lock); - read_unlock_bh(&idev->lock); + } + return err; } @@ -2493,9 +2471,14 @@ static int ip6_mc_leave_src(struct sock *sk, struct ipv6_mc_socklist *iml, static void igmp6_leave_group(struct ifmcaddr6 *ma) { if (mld_in_v1_mode(ma->idev)) { - if (ma->mca_flags & MAF_LAST_REPORTER) + spin_lock_bh(&ma->mca_lock); + if (ma->mca_flags & MAF_LAST_REPORTER) { + spin_unlock_bh(&ma->mca_lock); igmp6_send(&ma->mca_addr, ma->idev->dev, ICMPV6_MGM_REDUCTION); + } else { + spin_unlock_bh(&ma->mca_lock); + } } else { mld_add_delrec(ma->idev, ma); mld_ifc_event(ma->idev); @@ -2508,10 +2491,12 @@ static void mld_gq_work(struct work_struct *work) struct inet6_dev, mc_gq_work); - idev->mc_gq_running = 0; rtnl_lock(); mld_send_report(idev, NULL); rtnl_unlock(); + write_lock_bh(&idev->lock); + idev->mc_gq_running = 0; + write_unlock_bh(&idev->lock); in6_dev_put(idev); } @@ -2523,13 +2508,16 @@ static void mld_ifc_work(struct work_struct *work) rtnl_lock(); mld_send_cr(idev); + rtnl_unlock(); + + write_lock_bh(&idev->lock); if (idev->mc_ifc_count) { idev->mc_ifc_count--; if (idev->mc_ifc_count) mld_ifc_start_work(idev, unsolicited_report_interval(idev)); } - rtnl_unlock(); + write_unlock_bh(&idev->lock); in6_dev_put(idev); } @@ -2537,8 +2525,11 @@ static void mld_ifc_event(struct inet6_dev *idev) { if (mld_in_v1_mode(idev)) return; + + write_lock_bh(&idev->lock); idev->mc_ifc_count = idev->mc_qrv; mld_ifc_start_work(idev, 1); + write_unlock_bh(&idev->lock); } static void mld_mca_work(struct work_struct *work) @@ -2568,10 +2559,8 @@ void ipv6_mc_unmap(struct inet6_dev *idev) /* Install multicast list, except for all-nodes (already installed) */ - read_lock_bh(&idev->lock); - for (i = idev->mc_list; i; i = i->next) + for_each_mc_rtnl(idev, i) igmp6_group_dropped(i); - read_unlock_bh(&idev->lock); } void ipv6_mc_remap(struct inet6_dev *idev) @@ -2587,9 +2576,7 @@ void ipv6_mc_down(struct inet6_dev *idev) /* Withdraw multicast list */ - read_lock_bh(&idev->lock); - - for (i = idev->mc_list; i; i = i->next) + for_each_mc_rtnl(idev, i) igmp6_group_dropped(i); /* Should stop work after group drop. or we will @@ -2598,7 +2585,6 @@ void ipv6_mc_down(struct inet6_dev *idev) mld_ifc_stop_work(idev); mld_gq_stop_work(idev); mld_dad_stop_work(idev); - read_unlock_bh(&idev->lock); mld_clear_delrec_stop_work(idev); } @@ -2619,20 +2605,19 @@ void ipv6_mc_up(struct inet6_dev *idev) /* Install multicast list, except for all-nodes (already installed) */ - read_lock_bh(&idev->lock); ipv6_mc_reset(idev); - for (i = idev->mc_list; i; i = i->next) { + for_each_mc_rtnl(idev, i) { mld_del_delrec(idev, i); igmp6_group_added(i); } - read_unlock_bh(&idev->lock); } /* IPv6 device initialization. 
*/ void ipv6_mc_init_dev(struct inet6_dev *idev) { - write_lock_bh(&idev->lock); + ASSERT_RTNL(); + idev->mc_gq_running = 0; INIT_DELAYED_WORK(&idev->mc_gq_work, mld_gq_work); idev->mc_tomb = NULL; @@ -2641,7 +2626,6 @@ void ipv6_mc_init_dev(struct inet6_dev *idev) INIT_DELAYED_WORK(&idev->mc_dad_work, mld_dad_work); INIT_DELAYED_WORK(&idev->mc_delrec_work, mld_clear_delrec_work); ipv6_mc_reset(idev); - write_unlock_bh(&idev->lock); } /* @@ -2652,6 +2636,8 @@ void ipv6_mc_destroy_dev(struct inet6_dev *idev) { struct ifmcaddr6 *i; + ASSERT_RTNL(); + /* Deactivate works */ ipv6_mc_down(idev); mld_clear_delrec(idev); @@ -2666,16 +2652,12 @@ void ipv6_mc_destroy_dev(struct inet6_dev *idev) if (idev->cnf.forwarding) __ipv6_dev_mc_dec(idev, &in6addr_linklocal_allrouters); - write_lock_bh(&idev->lock); - while ((i = idev->mc_list) != NULL) { - idev->mc_list = i->next; + while ((i = rtnl_dereference(idev->mc_list)) != NULL) { + rcu_assign_pointer(idev->mc_list, rtnl_dereference(i->next)); - write_unlock_bh(&idev->lock); ip6_mc_clear_src(i); ma_put(i); - write_lock_bh(&idev->lock); } - write_unlock_bh(&idev->lock); } static void ipv6_mc_rejoin_groups(struct inet6_dev *idev) @@ -2685,12 +2667,11 @@ static void ipv6_mc_rejoin_groups(struct inet6_dev *idev) ASSERT_RTNL(); if (mld_in_v1_mode(idev)) { - read_lock_bh(&idev->lock); - for (pmc = idev->mc_list; pmc; pmc = pmc->next) + for_each_mc_rtnl(idev, pmc) igmp6_join_group(pmc); - read_unlock_bh(&idev->lock); - } else + } else { mld_send_report(idev, NULL); + } } static int ipv6_mc_netdev_event(struct notifier_block *this, @@ -2737,13 +2718,11 @@ static inline struct ifmcaddr6 *igmp6_mc_get_first(struct seq_file *seq) idev = __in6_dev_get(state->dev); if (!idev) continue; - read_lock_bh(&idev->lock); - im = idev->mc_list; + im = rcu_dereference(idev->mc_list); if (im) { state->idev = idev; break; } - read_unlock_bh(&idev->lock); } return im; } @@ -2752,11 +2731,8 @@ static struct ifmcaddr6 *igmp6_mc_get_next(struct seq_file *seq, struct ifmcaddr { struct igmp6_mc_iter_state *state = igmp6_mc_seq_private(seq); - im = im->next; + im = rcu_dereference(im->next); while (!im) { - if (likely(state->idev)) - read_unlock_bh(&state->idev->lock); - state->dev = next_net_device_rcu(state->dev); if (!state->dev) { state->idev = NULL; @@ -2765,8 +2741,7 @@ static struct ifmcaddr6 *igmp6_mc_get_next(struct seq_file *seq, struct ifmcaddr state->idev = __in6_dev_get(state->dev); if (!state->idev) continue; - read_lock_bh(&state->idev->lock); - im = state->idev->mc_list; + im = rcu_dereference(state->idev->mc_list); } return im; } @@ -2800,10 +2775,9 @@ static void igmp6_mc_seq_stop(struct seq_file *seq, void *v) { struct igmp6_mc_iter_state *state = igmp6_mc_seq_private(seq); - if (likely(state->idev)) { - read_unlock_bh(&state->idev->lock); + if (likely(state->idev)) state->idev = NULL; - } + state->dev = NULL; rcu_read_unlock(); } @@ -2813,13 +2787,15 @@ static int igmp6_mc_seq_show(struct seq_file *seq, void *v) struct ifmcaddr6 *im = (struct ifmcaddr6 *)v; struct igmp6_mc_iter_state *state = igmp6_mc_seq_private(seq); + spin_lock_bh(&im->mca_lock); seq_printf(seq, "%-4d %-15s %pi6 %5d %08X %ld\n", state->dev->ifindex, state->dev->name, &im->mca_addr, im->mca_users, im->mca_flags, - (im->mca_flags&MAF_TIMER_RUNNING) ? + (im->mca_flags & MAF_TIMER_RUNNING) ? 
jiffies_to_clock_t(im->mca_work.timer.expires - jiffies) : 0); + spin_unlock_bh(&im->mca_lock); return 0; } @@ -2850,22 +2826,19 @@ static inline struct ip6_sf_list *igmp6_mcf_get_first(struct seq_file *seq) state->im = NULL; for_each_netdev_rcu(net, state->dev) { struct inet6_dev *idev; + idev = __in6_dev_get(state->dev); if (unlikely(idev == NULL)) continue; - read_lock_bh(&idev->lock); - im = idev->mc_list; + im = rcu_dereference(idev->mc_list); if (likely(im)) { - spin_lock_bh(&im->mca_lock); psf = rcu_dereference(im->mca_sources); if (likely(psf)) { state->im = im; state->idev = idev; break; } - spin_unlock_bh(&im->mca_lock); } - read_unlock_bh(&idev->lock); } return psf; } @@ -2876,12 +2849,8 @@ static struct ip6_sf_list *igmp6_mcf_get_next(struct seq_file *seq, struct ip6_s psf = rcu_dereference(psf->sf_next); while (!psf) { - spin_unlock_bh(&state->im->mca_lock); - state->im = state->im->next; + state->im = rcu_dereference(state->im->next); while (!state->im) { - if (likely(state->idev)) - read_unlock_bh(&state->idev->lock); - state->dev = next_net_device_rcu(state->dev); if (!state->dev) { state->idev = NULL; @@ -2890,12 +2859,10 @@ static struct ip6_sf_list *igmp6_mcf_get_next(struct seq_file *seq, struct ip6_s state->idev = __in6_dev_get(state->dev); if (!state->idev) continue; - read_lock_bh(&state->idev->lock); - state->im = state->idev->mc_list; + state->im = rcu_dereference(state->idev->mc_list); } if (!state->im) break; - spin_lock_bh(&state->im->mca_lock); psf = rcu_dereference(state->im->mca_sources); } out: @@ -2933,14 +2900,12 @@ static void igmp6_mcf_seq_stop(struct seq_file *seq, void *v) __releases(RCU) { struct igmp6_mcf_iter_state *state = igmp6_mcf_seq_private(seq); - if (likely(state->im)) { - spin_unlock_bh(&state->im->mca_lock); + if (likely(state->im)) state->im = NULL; - } - if (likely(state->idev)) { - read_unlock_bh(&state->idev->lock); + + if (likely(state->idev)) state->idev = NULL; - } + state->dev = NULL; rcu_read_unlock(); } @@ -3021,6 +2986,7 @@ static int __net_init igmp6_net_init(struct net *net) } inet6_sk(net->ipv6.igmp_sk)->hop_limit = 1; + net->ipv6.igmp_sk->sk_allocation = GFP_KERNEL; err = inet_ctl_sock_create(&net->ipv6.mc_autojoin_sk, PF_INET6, SOCK_RAW, IPPROTO_ICMPV6, net);