From patchwork Wed Feb 2 12:28:45 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sebastian Andrzej Siewior
X-Patchwork-Id: 12732865
X-Patchwork-Delegate: kuba@kernel.org
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE85AC433FE for ; Wed, 2 Feb 2022 12:29:06 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344164AbiBBM3F (ORCPT ); Wed, 2 Feb 2022 07:29:05 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42276 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344145AbiBBM25 (ORCPT ); Wed, 2 Feb 2022 07:28:57 -0500
Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DCD22C06173D; Wed, 2 Feb 2022 04:28:56 -0800 (PST)
From: Sebastian Andrzej Siewior
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1643804934; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=K47l6RiCN9/QajBAo5t2g/2ZmVf5Mm+Chno5arbg/P0=; b=tIvFgOEIEpi+4iP6xggXWvqvQPe9d6VdvUOCoWhlkjn+3af6Lnw/fcn/WDd6FYh0yAb5dr Afp+58MgGSs6OKv1qy1pzkOLx5b89y4GR41zDajH0ZJ9wjkYSWtD75iP1oP3b0Fxjnfryv mTSLXfgDlN5XwD+BZNcU5+ejl4GYicde/2nWZB9yz/ycN+F8e+0hfMUuqj8J4wTcBplGxq oHObSPvBS9YcBqxUiORG6LqtZb3oFDS+IqYXXNbvkzJLOV6bXLp2iteeqQ/lCOV64MAe3Z LUFlLWN6V7/vqTdwRcgM7WuwWrleoAh8IZQtPhDace3g39NkX2+REeZ8ydTuKg==
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1643804934; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=K47l6RiCN9/QajBAo5t2g/2ZmVf5Mm+Chno5arbg/P0=; b=SPBDuk+aUo4wGcYLrIe7o2u8KSzV09+TP43S/k9JOLMblMEuDDHg6A9eRCX2oJQMctqX1e 6eZj0yE9zVsEEMBQ==
To: bpf@vger.kernel.org, netdev@vger.kernel.org
Cc: "David S. Miller", Alexei Starovoitov, Daniel Borkmann, Eric Dumazet, Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Thomas Gleixner, Sebastian Andrzej Siewior
Subject: [PATCH net-next 1/4] net: dev: Remove the preempt_disable() in netif_rx_internal().
Date: Wed, 2 Feb 2022 13:28:45 +0100
Message-Id: <20220202122848.647635-2-bigeasy@linutronix.de>
In-Reply-To: <20220202122848.647635-1-bigeasy@linutronix.de>
References: <20220202122848.647635-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

The preempt_disable() and rcu_read_lock() section was introduced in commit bbbe211c295ff ("net: rcu lock and preempt disable missing around generic xdp").

The backtrace in that commit shows that bottom halves were disabled, so the usage of smp_processor_id() would not trigger a warning. The "suspicious RCU usage" warning was triggered because rcu_dereference() was not used in an rcu_read_lock() section (only rcu_read_lock_bh()). An rcu_read_lock() is sufficient.

Remove the preempt_disable() statement, which is not needed.
Signed-off-by: Sebastian Andrzej Siewior
---
 net/core/dev.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 1baab07820f65..325b70074f4ae 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4796,7 +4796,6 @@ static int netif_rx_internal(struct sk_buff *skb)
 		struct rps_dev_flow voidflow, *rflow = &voidflow;
 		int cpu;
 
-		preempt_disable();
 		rcu_read_lock();
 
 		cpu = get_rps_cpu(skb->dev, skb, &rflow);
@@ -4806,7 +4805,6 @@ static int netif_rx_internal(struct sk_buff *skb)
 		ret = enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
 
 		rcu_read_unlock();
-		preempt_enable();
 	} else
 #endif
 	{

From patchwork Wed Feb 2 12:28:46 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Sebastian Andrzej Siewior
X-Patchwork-Id: 12732864
X-Patchwork-Delegate: kuba@kernel.org
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9693DC433F5 for ; Wed, 2 Feb 2022 12:29:03 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344153AbiBBM3A (ORCPT ); Wed, 2 Feb 2022 07:29:00 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42274 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344144AbiBBM25 (ORCPT ); Wed, 2 Feb 2022 07:28:57 -0500
Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D2571C061714; Wed, 2 Feb 2022 04:28:56 -0800 (PST)
From: Sebastian Andrzej Siewior
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1643804934; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YC6m1HGsplsRfYYc3H2HMTpJd7knNBRVUNFvn5d4q1Q=; b=d6Cnvsc1SZIXrfawvWSn/LDVcRzILIAVDf3hJo/bOYiZK3y/5vMMjX5RGbg5XxhcSLVPUa tT6ch7oOOV1axeQAifPn1lQ/lQvBZNpUj5D8fBEKWiyzoOrtBIHJaOikyZikRT3zVrO7Ow QlQ2l1e+p7/iH3t6RiqG/HGawbZglQ2tBrHRfmGluytC45fWMifl7DYVYdX23HccRo4Avb C0LA8qAIJejbZAhudpDDkdLRk6ZS+ELR3HDGB9ICV+0VPn+dwhjwhyTHOUcl7ag3oSs/1F JqmFTsGY5+ijalLE9iyzfO2DA8wsUmGuU72t0lk6nWcZxYN8+FW0vu0c+bxn3g==
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1643804934; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YC6m1HGsplsRfYYc3H2HMTpJd7knNBRVUNFvn5d4q1Q=; b=Y0OSf43DP6OUKNTiwanuLoGQYwOjmvLzb5qz1h1iN+hv+UJDWJ2Yc5yotqXhsMRzCbhLJp d13VvtSWGLTPI3Bw==
To: bpf@vger.kernel.org, netdev@vger.kernel.org
Cc: "David S. Miller", Alexei Starovoitov, Daniel Borkmann, Eric Dumazet, Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Thomas Gleixner, Sebastian Andrzej Siewior
Subject: [PATCH net-next 2/4] net: dev: Remove get_cpu() in netif_rx_internal().
Date: Wed, 2 Feb 2022 13:28:46 +0100
Message-Id: <20220202122848.647635-3-bigeasy@linutronix.de>
In-Reply-To: <20220202122848.647635-1-bigeasy@linutronix.de>
References: <20220202122848.647635-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

The get_cpu() usage was added in commit b0e28f1effd1d ("net: netif_rx() must disable preemption") because ip_dev_loopback_xmit() invoked netif_rx() with enabled preemption, causing a warning in smp_processor_id(). The function netif_rx() should only be invoked from an interrupt context, which implies disabled preemption. Commit e30b38c298b55 ("ip: Fix ip_dev_loopback_xmit()") addressed this and replaced netif_rx() with netif_rx_ni() in ip_dev_loopback_xmit().

Based on the discussion on the list, the former patch (b0e28f1effd1d) should not have been applied, only the latter (e30b38c298b55).

Remove get_cpu() since the function is supposed to be invoked from a context with stable per-CPU pointers (either by disabling preemption or software interrupts).

Link: https://lkml.kernel.org/r/20100415.013347.98375530.davem@davemloft.net
Signed-off-by: Sebastian Andrzej Siewior
Reviewed-by: Eric Dumazet
Reviewed-by: Toke Høiland-Jørgensen
---
 net/core/dev.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 325b70074f4ae..0d13340ed4054 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4810,8 +4810,7 @@ static int netif_rx_internal(struct sk_buff *skb)
 	{
 		unsigned int qtail;
 
-		ret = enqueue_to_backlog(skb, get_cpu(), &qtail);
-		put_cpu();
+		ret = enqueue_to_backlog(skb, smp_processor_id(), &qtail);
 	}
 	return ret;
 }

From patchwork Wed Feb 2 12:28:47 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sebastian Andrzej Siewior
X-Patchwork-Id: 12732862
X-Patchwork-Delegate: kuba@kernel.org
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3FF92C433F5 for ; Wed, 2 Feb 2022 12:28:57 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344067AbiBBM24 (ORCPT ); Wed, 2 Feb 2022 07:28:56 -0500
Received: from Galois.linutronix.de ([193.142.43.55]:46532 "EHLO galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229955AbiBBM24 (ORCPT ); Wed, 2 Feb 2022 07:28:56 -0500
From: Sebastian Andrzej Siewior
DKIM-Signature: v=1; a=rsa-sha256;
	c=relaxed/relaxed; d=linutronix.de; s=2020; t=1643804934; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uxK2Br+YSRgI8layahcCJx/qLAO0UkG+gJh96tUahN0=; b=kaTIMGt1u8YwJHjfZz40FPcEepCRDvWM2zASqyzskNmcFdvhgLpgXEXQj+xSYbfCMAaC/n meBxkx7tkzWLx7WYmHOkUVc3iN0qpFsXk1TIh/EcJv+HHYZjZUKyfN76xc9UN/UCz+pI39 ZAuPGMHWbtpEs3sNpzDSD+6Xqub8PMUhAKyzXARrFT2TOw4iv/nWcRAL24Q212rR4q47SN g5Vsm8HOVIf3WdDAeIUkN/0Hp4JzDhypOlIa69Mw16e0J1XrONTERAZzTbplMv9FMzJhjo AVubCJSnfDKKF7Xp31Tz8EOCT51piP6V51WhThsoHEoI6jP4D1XzfFPMXtJGzQ==
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1643804934; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uxK2Br+YSRgI8layahcCJx/qLAO0UkG+gJh96tUahN0=; b=dcOMhmhZoxT1bMa0qCKsJ14XIaEdRnOin1N4DHVjjYpOpJFnE7TP59FA08b7F5CRmZuvMU Und4qPBKeZqRQqDg==
To: bpf@vger.kernel.org, netdev@vger.kernel.org
Cc: "David S. Miller", Alexei Starovoitov, Daniel Borkmann, Eric Dumazet, Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Thomas Gleixner, Sebastian Andrzej Siewior
Subject: [PATCH net-next 3/4] net: dev: Makes sure netif_rx() can be invoked in any context.
Date: Wed, 2 Feb 2022 13:28:47 +0100
Message-Id: <20220202122848.647635-4-bigeasy@linutronix.de>
In-Reply-To: <20220202122848.647635-1-bigeasy@linutronix.de>
References: <20220202122848.647635-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

Dave suggested a while ago (eleven years by now) "Let's make netif_rx() work in all contexts and get rid of netif_rx_ni()". Eric agreed and pointed out that modern devices should use netif_receive_skb() to avoid the overhead. In the meantime someone added another variant, netif_rx_any_context(), which behaves as suggested.

netif_rx() must be invoked with disabled bottom halves to ensure that pending softirqs, which were raised within the function, are handled. netif_rx_ni() can be invoked only from process context (bottom halves must be enabled) because the function handles pending softirqs without checking if bottom halves were disabled or not. netif_rx_any_context() invokes one of the former functions depending on the result of in_interrupt().

netif_rx() could be taught to handle both cases (disabled and enabled bottom halves) by simply disabling bottom halves while invoking netif_rx_internal(). The local_bh_enable() invocation will then invoke pending softirqs only if the BH-disable counter drops to zero.

Add a local_bh_disable() section in netif_rx() to ensure softirqs are handled if needed. Make netif_rx_ni() and netif_rx_any_context() invoke netif_rx() so they can be removed once there are no users left.
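
The counter behaviour described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every name in it (model_bh_disable(), model_bh_enable(), model_netif_rx(), the three variables) is hypothetical and not a kernel API, and hard interrupt context, where softirqs are handled on interrupt exit, is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static int bh_disable_depth;	/* models the BH-disable counter */
static bool softirq_pending;	/* models a raised, not yet handled softirq */
static int softirqs_run;	/* how often a pending softirq was handled */

static void model_bh_disable(void)
{
	bh_disable_depth++;
}

static void model_bh_enable(void)
{
	assert(bh_disable_depth > 0);
	/* Pending softirqs are handled only when the counter drops to zero. */
	if (--bh_disable_depth == 0 && softirq_pending) {
		softirq_pending = false;
		softirqs_run++;
	}
}

/* Models netif_rx() with the BH-disabled section added by this patch. */
static void model_netif_rx(void)
{
	model_bh_disable();
	softirq_pending = true;	/* stands in for the enqueue raising a softirq */
	model_bh_enable();
}
```

Called with the counter at zero (the former netif_rx_ni() case), the model handles the softirq before model_netif_rx() returns. Called inside an outer model_bh_disable() section (the former netif_rx() case), the softirq stays pending until the outer section ends.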
Link: https://lkml.kernel.org/r/20100415.020246.218622820.davem@davemloft.net
Signed-off-by: Sebastian Andrzej Siewior
---
 include/linux/netdevice.h  | 13 +++++++++++--
 include/trace/events/net.h | 14 --------------
 net/core/dev.c             | 34 ++--------------------------------
 3 files changed, 13 insertions(+), 48 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e490b84732d16..4086f312f814e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3669,8 +3669,17 @@ u32 bpf_prog_run_generic_xdp(struct sk_buff *skb, struct xdp_buff *xdp,
 void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog);
 int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb);
 int netif_rx(struct sk_buff *skb);
-int netif_rx_ni(struct sk_buff *skb);
-int netif_rx_any_context(struct sk_buff *skb);
+
+static inline int netif_rx_ni(struct sk_buff *skb)
+{
+	return netif_rx(skb);
+}
+
+static inline int netif_rx_any_context(struct sk_buff *skb)
+{
+	return netif_rx(skb);
+}
+
 int netif_receive_skb(struct sk_buff *skb);
 int netif_receive_skb_core(struct sk_buff *skb);
 void netif_receive_skb_list_internal(struct list_head *head);
diff --git a/include/trace/events/net.h b/include/trace/events/net.h
index 78c448c6ab4c5..032b431b987b6 100644
--- a/include/trace/events/net.h
+++ b/include/trace/events/net.h
@@ -260,13 +260,6 @@ DEFINE_EVENT(net_dev_rx_verbose_template, netif_rx_entry,
 	TP_ARGS(skb)
 );
 
-DEFINE_EVENT(net_dev_rx_verbose_template, netif_rx_ni_entry,
-
-	TP_PROTO(const struct sk_buff *skb),
-
-	TP_ARGS(skb)
-);
-
 DECLARE_EVENT_CLASS(net_dev_rx_exit_template,
 
 	TP_PROTO(int ret),
@@ -312,13 +305,6 @@ DEFINE_EVENT(net_dev_rx_exit_template, netif_rx_exit,
 	TP_ARGS(ret)
 );
 
-DEFINE_EVENT(net_dev_rx_exit_template, netif_rx_ni_exit,
-
-	TP_PROTO(int ret),
-
-	TP_ARGS(ret)
-);
-
 DEFINE_EVENT(net_dev_rx_exit_template, netif_receive_skb_list_exit,
 
 	TP_PROTO(int ret),
diff --git a/net/core/dev.c b/net/core/dev.c
index 0d13340ed4054..f43d0580fa11d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4834,47 +4834,17 @@ int netif_rx(struct sk_buff *skb)
 {
 	int ret;
 
+	local_bh_disable();
 	trace_netif_rx_entry(skb);
 
 	ret = netif_rx_internal(skb);
 	trace_netif_rx_exit(ret);
 
+	local_bh_enable();
 	return ret;
 }
 EXPORT_SYMBOL(netif_rx);
 
-int netif_rx_ni(struct sk_buff *skb)
-{
-	int err;
-
-	trace_netif_rx_ni_entry(skb);
-
-	preempt_disable();
-	err = netif_rx_internal(skb);
-	if (local_softirq_pending())
-		do_softirq();
-	preempt_enable();
-	trace_netif_rx_ni_exit(err);
-
-	return err;
-}
-EXPORT_SYMBOL(netif_rx_ni);
-
-int netif_rx_any_context(struct sk_buff *skb)
-{
-	/*
-	 * If invoked from contexts which do not invoke bottom half
-	 * processing either at return from interrupt or when softrqs are
-	 * reenabled, use netif_rx_ni() which invokes bottomhalf processing
-	 * directly.
-	 */
-	if (in_interrupt())
-		return netif_rx(skb);
-	else
-		return netif_rx_ni(skb);
-}
-EXPORT_SYMBOL(netif_rx_any_context);
-
 static __latent_entropy void net_tx_action(struct softirq_action *h)
 {
 	struct softnet_data *sd = this_cpu_ptr(&softnet_data);

From patchwork Wed Feb 2 12:28:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sebastian Andrzej Siewior
X-Patchwork-Id: 12732863
X-Patchwork-Delegate: kuba@kernel.org
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7BA1FC433EF for ; Wed, 2 Feb 2022 12:28:58 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344143AbiBBM24 (ORCPT ); Wed, 2 Feb 2022 07:28:56 -0500
Received: from Galois.linutronix.de ([193.142.43.55]:46540 "EHLO galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231841AbiBBM24 (ORCPT ); Wed, 2 Feb 2022 07:28:56 -0500
From: Sebastian Andrzej Siewior
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1643804935; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3gaE2CnG9k+LQIe/sH7m2AqTnxKkca6kKPvnnhRPdxM=; b=bYeYj6+w1Ct/bV2eTxYhNSovUyZ+ayN2RRd1RIarP9MFtEftJJnMMlCFyOO9Paagv34K6k XR/lhf6oro1Ph0+qRYFwfKZnAL9zj3aXXYdEgAbOMs9rbqv8PQcAmyLn+gj6Z9R4akqNCB LV8uweL4En0cl41PsMlbO7+tlQqKC7TBNpWOkfRmngeNNflHoxkeIaQSN3q2MdbRUkZOYm efe+bRxRppKxjBYSCIDKt7hmhvUaBN2srq4xMdhEgpCk9ZJISloxMf1XyLWG5vMbkSI5SD 6JX3nJThkCR2EK181CTcfZ/F78X9n3DNegcgIt6UwR2X84AI1XP8ngzP/DJF5w==
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1643804935; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3gaE2CnG9k+LQIe/sH7m2AqTnxKkca6kKPvnnhRPdxM=; b=+tXhZooXTRVTAjLmkAbTv9Ws669/LWZMNyqfUdsHNFh84X9t0IvoXngzfHR9SvSPIiV7Ge 2U+ApwRXbBRvMLAA==
To: bpf@vger.kernel.org, netdev@vger.kernel.org
Cc: "David S. Miller", Alexei Starovoitov, Daniel Borkmann, Eric Dumazet, Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Thomas Gleixner, Sebastian Andrzej Siewior
Subject: [PATCH net-next 4/4] net: dev: Make rps_lock() disable interrupts.
Date: Wed, 2 Feb 2022 13:28:48 +0100
Message-Id: <20220202122848.647635-5-bigeasy@linutronix.de>
In-Reply-To: <20220202122848.647635-1-bigeasy@linutronix.de>
References: <20220202122848.647635-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

Disabling interrupts and in the RPS case locking input_pkt_queue is split into local_irq_disable() and optional spin_lock().
This breaks on PREEMPT_RT because the spinlock_t typed lock cannot be acquired with disabled interrupts. The sections in which the lock is acquired are usually short, in the sense that they do not cause long and unbounded latencies. One exception is the skb_flow_limit() invocation, which may invoke a BPF program (and may require sleeping locks).

By moving local_irq_disable() + spin_lock() into rps_lock(), we can keep interrupts disabled on !PREEMPT_RT and enabled on PREEMPT_RT kernels. Without RPS on a PREEMPT_RT kernel, the needed synchronisation happens as part of local_bh_disable() on the local CPU. Since interrupts remain enabled, enqueue_to_backlog() needs to disable interrupts for ____napi_schedule().

Signed-off-by: Sebastian Andrzej Siewior
---
 net/core/dev.c | 72 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 44 insertions(+), 28 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index f43d0580fa11d..e9ea56daee2f0 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -216,18 +216,38 @@ static inline struct hlist_head *dev_index_hash(struct net *net, int ifindex)
 	return &net->dev_index_head[ifindex & (NETDEV_HASHENTRIES - 1)];
 }
 
-static inline void rps_lock(struct softnet_data *sd)
+static inline void rps_lock_irqsave(struct softnet_data *sd,
+				    unsigned long *flags)
 {
-#ifdef CONFIG_RPS
-	spin_lock(&sd->input_pkt_queue.lock);
-#endif
+	if (IS_ENABLED(CONFIG_RPS))
+		spin_lock_irqsave(&sd->input_pkt_queue.lock, *flags);
+	else if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_save(*flags);
 }
 
-static inline void rps_unlock(struct softnet_data *sd)
+static inline void rps_lock_irq_disable(struct softnet_data *sd)
 {
-#ifdef CONFIG_RPS
-	spin_unlock(&sd->input_pkt_queue.lock);
-#endif
+	if (IS_ENABLED(CONFIG_RPS))
+		spin_lock_irq(&sd->input_pkt_queue.lock);
+	else if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_disable();
+}
+
+static inline void rps_unlock_irq_restore(struct softnet_data *sd,
+					  unsigned long *flags)
+{
+	if (IS_ENABLED(CONFIG_RPS))
+		spin_unlock_irqrestore(&sd->input_pkt_queue.lock, *flags);
+	else if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_restore(*flags);
+}
+
+static inline void rps_unlock_irq_enable(struct softnet_data *sd)
+{
+	if (IS_ENABLED(CONFIG_RPS))
+		spin_unlock_irq(&sd->input_pkt_queue.lock);
+	else if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_enable();
 }
 
 static struct netdev_name_node *netdev_name_node_alloc(struct net_device *dev,
@@ -4525,9 +4545,7 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
 
 	sd = &per_cpu(softnet_data, cpu);
 
-	local_irq_save(flags);
-
-	rps_lock(sd);
+	rps_lock_irqsave(sd, &flags);
 	if (!netif_running(skb->dev))
 		goto drop;
 	qlen = skb_queue_len(&sd->input_pkt_queue);
@@ -4536,26 +4554,30 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
 enqueue:
 			__skb_queue_tail(&sd->input_pkt_queue, skb);
 			input_queue_tail_incr_save(sd, qtail);
-			rps_unlock(sd);
-			local_irq_restore(flags);
+			rps_unlock_irq_restore(sd, &flags);
 			return NET_RX_SUCCESS;
 		}
 
 		/* Schedule NAPI for backlog device
 		 * We can use non atomic operation since we own the queue lock
+		 * PREEMPT_RT needs to disable interrupts here for
+		 * synchronisation needed in napi_schedule.
 		 */
+		if (IS_ENABLED(CONFIG_PREEMPT_RT))
+			local_irq_disable();
+
 		if (!__test_and_set_bit(NAPI_STATE_SCHED, &sd->backlog.state)) {
 			if (!rps_ipi_queued(sd))
 				____napi_schedule(sd, &sd->backlog);
 		}
+		if (IS_ENABLED(CONFIG_PREEMPT_RT))
+			local_irq_enable();
 		goto enqueue;
 	}
 
 drop:
 	sd->dropped++;
-	rps_unlock(sd);
-
-	local_irq_restore(flags);
+	rps_unlock_irq_restore(sd, &flags);
 
 	atomic_long_inc(&skb->dev->rx_dropped);
 	kfree_skb(skb);
@@ -5617,8 +5639,7 @@ static void flush_backlog(struct work_struct *work)
 	local_bh_disable();
 	sd = this_cpu_ptr(&softnet_data);
 
-	local_irq_disable();
-	rps_lock(sd);
+	rps_lock_irq_disable(sd);
 	skb_queue_walk_safe(&sd->input_pkt_queue, skb, tmp) {
 		if (skb->dev->reg_state == NETREG_UNREGISTERING) {
 			__skb_unlink(skb, &sd->input_pkt_queue);
@@ -5626,8 +5647,7 @@ static void flush_backlog(struct work_struct *work)
 			input_queue_head_incr(sd);
 		}
 	}
-	rps_unlock(sd);
-	local_irq_enable();
+	rps_unlock_irq_enable(sd);
 
 	skb_queue_walk_safe(&sd->process_queue, skb, tmp) {
 		if (skb->dev->reg_state == NETREG_UNREGISTERING) {
@@ -5645,16 +5665,14 @@ static bool flush_required(int cpu)
 	struct softnet_data *sd = &per_cpu(softnet_data, cpu);
 	bool do_flush;
 
-	local_irq_disable();
-	rps_lock(sd);
+	rps_lock_irq_disable(sd);
 
 	/* as insertion into process_queue happens with the rps lock held,
 	 * process_queue access may race only with dequeue
 	 */
 	do_flush = !skb_queue_empty(&sd->input_pkt_queue) ||
 		   !skb_queue_empty_lockless(&sd->process_queue);
-	rps_unlock(sd);
-	local_irq_enable();
+	rps_unlock_irq_enable(sd);
 
 	return do_flush;
 #endif
@@ -5769,8 +5787,7 @@ static int process_backlog(struct napi_struct *napi, int quota)
 
 		}
 
-		local_irq_disable();
-		rps_lock(sd);
+		rps_lock_irq_disable(sd);
 		if (skb_queue_empty(&sd->input_pkt_queue)) {
 			/*
 			 * Inline a custom version of __napi_complete().
@@ -5786,8 +5803,7 @@ static int process_backlog(struct napi_struct *napi, int quota)
 			skb_queue_splice_tail_init(&sd->input_pkt_queue,
 						   &sd->process_queue);
 		}
-		rps_unlock(sd);
-		local_irq_enable();
+		rps_unlock_irq_enable(sd);
 	}
 
 	return work;
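
As a reading aid for the rps_lock_irq_disable() helper in the diff above, the following userspace sketch spells out which primitive it selects for each combination of CONFIG_RPS and CONFIG_PREEMPT_RT. The enum and function names are hypothetical and exist only for illustration; the two bool parameters stand in for the IS_ENABLED() checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; not kernel identifiers. */
enum rps_lock_action {
	ACT_SPIN_LOCK_IRQ,	/* spin_lock_irq(&sd->input_pkt_queue.lock) */
	ACT_LOCAL_IRQ_DISABLE,	/* local_irq_disable() */
	ACT_NOTHING,		/* caller relies on local_bh_disable() */
};

/* Mirrors the if/else-if chain of rps_lock_irq_disable() in the patch. */
static enum rps_lock_action rps_lock_irq_disable_action(bool rps,
							 bool preempt_rt)
{
	if (rps)
		return ACT_SPIN_LOCK_IRQ;
	if (!preempt_rt)
		return ACT_LOCAL_IRQ_DISABLE;
	return ACT_NOTHING;
}
```

The last case (no RPS, PREEMPT_RT) is the one the commit message covers with "the needed synchronisation happens as part of local_bh_disable() on the local CPU". The other three helpers in the patch follow the same selection pattern with their respective lock, unlock, save and restore primitives.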