From patchwork Wed Nov 30 18:13:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13060201 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A823DC4708A for ; Wed, 30 Nov 2022 18:13:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230064AbiK3SNs (ORCPT ); Wed, 30 Nov 2022 13:13:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38140 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229852AbiK3SNc (ORCPT ); Wed, 30 Nov 2022 13:13:32 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4E109862D7; Wed, 30 Nov 2022 10:13:31 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C618961D74; Wed, 30 Nov 2022 18:13:29 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1DD0AC43152; Wed, 30 Nov 2022 18:13:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1669832008; bh=O3sihN19Kz2sN5DYDWZyYgF1v27O1pMTYDT2Vmf9Dd4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=pN9sTHaln9m6EGe3r9Eb2mbBcomn8O7ZzzGuAofcxjdtAaYBBhuhEjCdEYKuuSr06 v0BxvoaljMSpxX4FpfkPcIJSQGSpia3bFMMSWxo9Hw1m5HwSGwMZFeJpkfP4YAf9ZR xYFs1/y0W4mmseFXE/jsG/vHS0wbW4TNkTZnkww03YP1y/+SXiuDlCBqDWwywzHvf6 UVnawFrUxFkSAl1hhOSYWBxQF7dkCz4UCuF5KH2hARoBTkQY1Nu5lTOiIJ4fqmvdez t973eHlre30kymo32erxf/RHRtlVEcgD/glMh8n2Zh+g+Qak1llmk5ftZmJKfsUce4 JU7NjJFwIrRmw== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 7006E5C196D; Wed, 30 Nov 2022 10:13:27 -0800 (PST) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Joel Fernandes (Google)" , David Howells , Marc Dionne , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , linux-afs@lists.infradead.org, netdev@vger.kernel.org, "Paul E . McKenney" Subject: [PATCH rcu 14/16] rxrpc: Use call_rcu_hurry() instead of call_rcu() Date: Wed, 30 Nov 2022 10:13:23 -0800 Message-Id: <20221130181325.1012760-14-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221130181316.GA1012431@paulmck-ThinkPad-P17-Gen-1> References: <20221130181316.GA1012431@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: "Joel Fernandes (Google)" Earlier commits in this series allow battery-powered systems to build their kernels with the default-disabled CONFIG_RCU_LAZY=y Kconfig option. This Kconfig option causes call_rcu() to delay its callbacks in order to batch them. This means that a given RCU grace period covers more callbacks, thus reducing the number of grace periods, in turn reducing the amount of energy consumed, which increases battery lifetime which can be a very good thing. This is not a subtle effect: In some important use cases, the battery lifetime is increased by more than 10%. This CONFIG_RCU_LAZY=y option is available only for CPUs that offload callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y. Delaying callbacks is normally not a problem because most callbacks do nothing but free memory. If the system is short on memory, a shrinker will kick all currently queued lazy callbacks out of their laziness, thus freeing their memory in short order. Similarly, the rcu_barrier() function, which blocks until all currently queued callbacks are invoked, will also kick lazy callbacks, thus enabling rcu_barrier() to complete in a timely manner. However, there are some cases where laziness is not a good option. For example, synchronize_rcu() invokes call_rcu(), and blocks until the newly queued callback is invoked. It would not be a good for synchronize_rcu() to block for ten seconds, even on an idle system. Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of call_rcu(). The arrival of a non-lazy call_rcu_hurry() callback on a given CPU kicks any lazy callbacks that might be already queued on that CPU. After all, if there is going to be a grace period, all callbacks might as well get full benefit from it. Yes, this could be done the other way around by creating a call_rcu_lazy(), but earlier experience with this approach and feedback at the 2022 Linux Plumbers Conference shifted the approach to call_rcu() being lazy with call_rcu_hurry() for the few places where laziness is inappropriate. And another call_rcu() instance that cannot be lazy is the one in rxrpc_kill_connection(), which sometimes does a wakeup that should not be unduly delayed. Therefore, make rxrpc_kill_connection() use call_rcu_hurry() in order to revert to the old behavior. [ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ] Signed-off-by: Joel Fernandes (Google) Cc: David Howells Cc: Marc Dionne Cc: "David S. Miller" Cc: Eric Dumazet Cc: Jakub Kicinski Cc: Paolo Abeni Cc: Cc: Signed-off-by: Paul E. McKenney Reviewed-by: Eric Dumazet --- net/rxrpc/conn_object.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/rxrpc/conn_object.c b/net/rxrpc/conn_object.c index 22089e37e97f0..9c5fae9ca106c 100644 --- a/net/rxrpc/conn_object.c +++ b/net/rxrpc/conn_object.c @@ -253,7 +253,7 @@ void rxrpc_kill_connection(struct rxrpc_connection *conn) * must carry a ref on the connection to prevent us getting here whilst * it is queued or running. */ - call_rcu(&conn->rcu, rxrpc_destroy_connection); + call_rcu_hurry(&conn->rcu, rxrpc_destroy_connection); } /* From patchwork Wed Nov 30 18:13:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13060202 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4011C352A1 for ; Wed, 30 Nov 2022 18:14:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230199AbiK3SOB (ORCPT ); Wed, 30 Nov 2022 13:14:01 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38484 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230001AbiK3SNd (ORCPT ); Wed, 30 Nov 2022 13:13:33 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D4D0E862E7; Wed, 30 Nov 2022 10:13:31 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 8630DB81CA1; Wed, 30 Nov 2022 18:13:30 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2187AC4315A; Wed, 30 Nov 2022 18:13:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1669832008; bh=tLEwwsXJ50F5WMjv+BokwLjrPrLGPh46UmjaqeJOMNw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Qn/G0f3uTFbWSF42AqjtciE1KrJdOMt55GTNskCSfXG+ehVdvjQo7PT/4uic/Xzg1 Wi/E5KetpU6444YTs4tO67fBw4KORpTJ0BKVkR/Z3EMAh6cV20/1KWN/BxoKaOaAAN 4IPkSER2QT2w1PfvXv0JlBCGSXN2yG7XA0W899GLRdiFLJ5uYDWtNbzwrVYZ5jU9iz hauOlXc4XrtNkMQU9gYEnLcdZLg3winmFw/9vpHoo+ESsEjtdGfVEVJ2RCwpQJVQq1 NwKPlMpxbYYVPJjir5RwdfEOfd2sKQckEBkPR+Stgh/QD3zHNsxWDTgBaOGCX58MaF 0ukAfUlmoFqgg== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 724285C1A02; Wed, 30 Nov 2022 10:13:27 -0800 (PST) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Joel Fernandes (Google)" , David Ahern , "David S. Miller" , Eric Dumazet , Hideaki YOSHIFUJI , Jakub Kicinski , Paolo Abeni , netdev@vger.kernel.org, "Paul E . McKenney" Subject: [PATCH rcu 15/16] net: Use call_rcu_hurry() for dst_release() Date: Wed, 30 Nov 2022 10:13:24 -0800 Message-Id: <20221130181325.1012760-15-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221130181316.GA1012431@paulmck-ThinkPad-P17-Gen-1> References: <20221130181316.GA1012431@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: "Joel Fernandes (Google)" In a networking test on ChromeOS, kernels built with the new CONFIG_RCU_LAZY=y Kconfig option fail a networking test in the teardown phase. This failure may be reproduced as follows: ip netns del The CONFIG_RCU_LAZY=y Kconfig option was introduced by earlier commits in this series for the benefit of certain battery-powered systems. This Kconfig option causes call_rcu() to delay its callbacks in order to batch them. This means that a given RCU grace period covers more callbacks, thus reducing the number of grace periods, in turn reducing the amount of energy consumed, which increases battery lifetime which can be a very good thing. This is not a subtle effect: In some important use cases, the battery lifetime is increased by more than 10%. This CONFIG_RCU_LAZY=y option is available only for CPUs that offload callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y. Delaying callbacks is normally not a problem because most callbacks do nothing but free memory. If the system is short on memory, a shrinker will kick all currently queued lazy callbacks out of their laziness, thus freeing their memory in short order. Similarly, the rcu_barrier() function, which blocks until all currently queued callbacks are invoked, will also kick lazy callbacks, thus enabling rcu_barrier() to complete in a timely manner. However, there are some cases where laziness is not a good option. For example, synchronize_rcu() invokes call_rcu(), and blocks until the newly queued callback is invoked. It would not be a good for synchronize_rcu() to block for ten seconds, even on an idle system. Therefore, synchronize_rcu() invokes call_rcu_hurry() instead of call_rcu(). The arrival of a non-lazy call_rcu_hurry() callback on a given CPU kicks any lazy callbacks that might be already queued on that CPU. After all, if there is going to be a grace period, all callbacks might as well get full benefit from it. Yes, this could be done the other way around by creating a call_rcu_lazy(), but earlier experience with this approach and feedback at the 2022 Linux Plumbers Conference shifted the approach to call_rcu() being lazy with call_rcu_hurry() for the few places where laziness is inappropriate. Returning to the test failure, use of ftrace showed that this failure cause caused by the aadded delays due to this new lazy behavior of call_rcu() in kernels built with CONFIG_RCU_LAZY=y. Therefore, make dst_release() use call_rcu_hurry() in order to revert to the old test-failure-free behavior. [ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ] Signed-off-by: Joel Fernandes (Google) Cc: David Ahern Cc: "David S. Miller" Cc: Eric Dumazet Cc: Hideaki YOSHIFUJI Cc: Jakub Kicinski Cc: Paolo Abeni Cc: Signed-off-by: Paul E. McKenney Reviewed-by: Eric Dumazet --- net/core/dst.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/core/dst.c b/net/core/dst.c index bc9c9be4e0801..a4e738d321ba2 100644 --- a/net/core/dst.c +++ b/net/core/dst.c @@ -174,7 +174,7 @@ void dst_release(struct dst_entry *dst) net_warn_ratelimited("%s: dst:%p refcnt:%d\n", __func__, dst, newrefcnt); if (!newrefcnt) - call_rcu(&dst->rcu_head, dst_destroy_rcu); + call_rcu_hurry(&dst->rcu_head, dst_destroy_rcu); } } EXPORT_SYMBOL(dst_release); From patchwork Wed Nov 30 18:13:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 13060200 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0406EC352A1 for ; Wed, 30 Nov 2022 18:13:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230023AbiK3SNp (ORCPT ); Wed, 30 Nov 2022 13:13:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38410 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230114AbiK3SNb (ORCPT ); Wed, 30 Nov 2022 13:13:31 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EFFD3862D6; Wed, 30 Nov 2022 10:13:29 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7496461D69; Wed, 30 Nov 2022 18:13:29 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5EE5BC4347C; Wed, 30 Nov 2022 18:13:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1669832008; bh=yoNO14jnT7/3wJpWHWo0CYgw8+T0b3nI0VWmVo/0VZk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=QmUth/WNvneVJ7GY/NtQg2tQXkTTyS6dQZtGDqBepyOjRyaRmzdEVbxOwHWojNgNG fy274sxQ8p2fNmFr7zOX+v8PTdexYqqR+nEhTS6TAca3mgKJxgJmgNeZ7Qgfyb13vk KmRg7eHwgtZRQ3dkAQTxR3kVNdYZZUuawesr4MoOkISj5Ge9v87CEKqjYT49l8Q27B czHeZkDRTlPmeLD5CGnM9FyqIl1YmKi8jqqfbS6/XwRSsWdypQh1osFZVzDx+3doNT ZEc5DK7JuXrrQWn2jEe1fbX8zvfQHWLv13WWUGCyBNyPpiiXarzI79sX1I3H8M8AhX ZlhvX2BidzHTA== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 744535C1A77; Wed, 30 Nov 2022 10:13:27 -0800 (PST) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, Eric Dumazet , Joel Fernandes , David Ahern , "David S. Miller" , Hideaki YOSHIFUJI , Jakub Kicinski , Paolo Abeni , netdev@vger.kernel.org, "Paul E . McKenney" Subject: [PATCH rcu 16/16] net: devinet: Reduce refcount before grace period Date: Wed, 30 Nov 2022 10:13:25 -0800 Message-Id: <20221130181325.1012760-16-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20221130181316.GA1012431@paulmck-ThinkPad-P17-Gen-1> References: <20221130181316.GA1012431@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Eric Dumazet Currently, the inetdev_destroy() function waits for an RCU grace period before decrementing the refcount and freeing memory. This causes a delay with a new RCU configuration that tries to save power, which results in the network interface disappearing later than expected. The resulting delay causes test failures on ChromeOS. Refactor the code such that the refcount is freed before the grace period and memory is freed after. With this a ChromeOS network test passes that does 'ip netns del' and polls for an interface disappearing, now passes. Reported-by: Joel Fernandes (Google) Signed-off-by: Eric Dumazet Signed-off-by: Joel Fernandes (Google) Cc: David Ahern Cc: "David S. Miller" Cc: Hideaki YOSHIFUJI Cc: Jakub Kicinski Cc: Paolo Abeni Cc: Signed-off-by: Paul E. McKenney --- net/ipv4/devinet.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index e8b9a9202fecd..b0acf6e19aed3 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -234,13 +234,20 @@ static void inet_free_ifa(struct in_ifaddr *ifa) call_rcu(&ifa->rcu_head, inet_rcu_free_ifa); } +static void in_dev_free_rcu(struct rcu_head *head) +{ + struct in_device *idev = container_of(head, struct in_device, rcu_head); + + kfree(rcu_dereference_protected(idev->mc_hash, 1)); + kfree(idev); +} + void in_dev_finish_destroy(struct in_device *idev) { struct net_device *dev = idev->dev; WARN_ON(idev->ifa_list); WARN_ON(idev->mc_list); - kfree(rcu_dereference_protected(idev->mc_hash, 1)); #ifdef NET_REFCNT_DEBUG pr_debug("%s: %p=%s\n", __func__, idev, dev ? dev->name : "NIL"); #endif @@ -248,7 +255,7 @@ void in_dev_finish_destroy(struct in_device *idev) if (!idev->dead) pr_err("Freeing alive in_device %p\n", idev); else - kfree(idev); + call_rcu(&idev->rcu_head, in_dev_free_rcu); } EXPORT_SYMBOL(in_dev_finish_destroy); @@ -298,12 +305,6 @@ static struct in_device *inetdev_init(struct net_device *dev) goto out; } -static void in_dev_rcu_put(struct rcu_head *head) -{ - struct in_device *idev = container_of(head, struct in_device, rcu_head); - in_dev_put(idev); -} - static void inetdev_destroy(struct in_device *in_dev) { struct net_device *dev; @@ -328,7 +329,7 @@ static void inetdev_destroy(struct in_device *in_dev) neigh_parms_release(&arp_tbl, in_dev->arp_parms); arp_ifdown(dev); - call_rcu(&in_dev->rcu_head, in_dev_rcu_put); + in_dev_put(in_dev); } int inet_addr_onlink(struct in_device *in_dev, __be32 a, __be32 b)