From patchwork Fri Jan 15 00:31:21 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 12021151
X-Patchwork-Delegate: kuba@kernel.org
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
        aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED,
        DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,
        INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,
        URIBL_BLOCKED,USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham
        autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
        by smtp.lore.kernel.org (Postfix) with ESMTP id 5F395C433E6
        for ; Fri, 15 Jan 2021 00:32:24 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
        by mail.kernel.org (Postfix) with ESMTP id 2CE3523AC2
        for ; Fri, 15 Jan 2021 00:32:24 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1731317AbhAOAcK (ORCPT ); Thu, 14 Jan 2021 19:32:10 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53780 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1731311AbhAOAcI (ORCPT ); Thu, 14 Jan 2021 19:32:08 -0500
Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com
        [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix)
        with ESMTPS id 22CB2C061757 for ; Thu, 14 Jan 2021 16:31:28 -0800 (PST)
Received: by mail-yb1-xb49.google.com with SMTP id w17so3560873ybl.15
        for ; Thu, 14 Jan 2021 16:31:28 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025;
        h=sender:date:in-reply-to:message-id:mime-version:references:subject
        :from:to:cc;
        bh=cx6sA6453BDvYfDMX3ic0zEk85EsNsUlntKHeCUyUVY=;
        b=vPqtSCPJoZszAfaszVU3lCsA8wec4AqzW3jJ+t8QXgevOhsJ57M+V0eMMm1i/4QABz
        4pTFcpz4flPEgAlrKIoYV7+X13PbsyVyTbdkHjWqeK4qpHs1rM8ctHgv5gfCNOksVpst
        ULAEBuwYGV24DHiMZGODutNpPjGSDyL8bxxv24GkDpp9QycdNB08Mqs0Qk95OijFbzbi
        KydTa6WK60QrJrOHhjoEEykQEfaZcu8Y3V99KmLIasTjy7ba+MD9Sy2+x8gPe2GGJal+
        r7B9SKp1Hjd7EYDWK3LE68Z0mVVJll6zFQLpL3rtX4CF3mCgGfFAVapazMzz0oUSfjcg
        l0Xg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025;
        h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version
        :references:subject:from:to:cc;
        bh=cx6sA6453BDvYfDMX3ic0zEk85EsNsUlntKHeCUyUVY=;
        b=igSiQAJH/6+gh+KF4WTMiDAE3VB9EXVsFolUszgDuH7hmkjVKWN3FqOyf5pWj6Iev3
        rJgu5IxRwKGU23BBpOKi9WFPBWHF0kSqidfQTCJ58wBzJ8zDD9pe1rn01NyAv8Wfiyau
        KcTQ+GAdF9YGa0m7vqn+uhyXqVxPlgWteNuTZoEOiFvmKH7RUJRI1AkH38L4p9dOhsx5
        s3+zwPHok5EzdmbeGuMVihsQxfDx178hdYfB2RkKOdCTX24b+4VrGlBHoxo6xB8mV8DU
        2SodZWZpZ10erbnIP6beIvZY5vK4p2K2DNjvXviuzbOHTc1sh447RzHTUCO6s1EJVCUF
        dU3A==
X-Gm-Message-State: AOAM530iooOvpBXX6GFdIUbDeX1XiDn3yBztL662OdDOqD+rCwb7ryAT
        lcz/8ksjOvuieSZiJ6c30pUgXEBjgEY=
X-Google-Smtp-Source: ABdhPJynV+wBiZqFptRUg+z7DWLHqk3/fZJEzlURgrULsPA3PMiCCSIZNPFVKh1G9ZE1ZVekPAWiv1MnAlg=
Sender: "weiwan via sendgmr"
X-Received: from weiwan.svl.corp.google.com ([2620:15c:2c4:201:1ea0:b8ff:fe75:cf08])
        (user=weiwan job=sendgmr) by 2002:a25:4155:: with SMTP id
        o82mr14769773yba.206.1610670687389; Thu, 14 Jan 2021 16:31:27 -0800 (PST)
Date: Thu, 14 Jan 2021 16:31:21 -0800
In-Reply-To: <20210115003123.1254314-1-weiwan@google.com>
Message-Id: <20210115003123.1254314-2-weiwan@google.com>
Mime-Version: 1.0
References: <20210115003123.1254314-1-weiwan@google.com>
X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7-goog
Subject: [PATCH net-next v6 1/3] net: extract napi poll functionality to __napi_poll()
From: Wei Wang
To: David Miller , netdev@vger.kernel.org, Jakub Kicinski
Cc: Eric Dumazet , Paolo Abeni , Hannes Frederic Sowa , Felix Fietkau
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

From: Felix Fietkau

This commit introduces a
new function __napi_poll() which does the main logic of the existing
napi_poll() function, and will be called by other functions in later
commits.
The idea and implementation are by Felix Fietkau and were proposed as
part of the patch to move napi work to workqueue context.
This commit by itself is a code restructure.

Signed-off-by: Felix Fietkau
Signed-off-by: Wei Wang
---
 net/core/dev.c | 35 +++++++++++++++++++++++++----------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index e4d77c8abe76..83b59e4c0f37 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6772,15 +6772,10 @@ void __netif_napi_del(struct napi_struct *napi)
 }
 EXPORT_SYMBOL(__netif_napi_del);
 
-static int napi_poll(struct napi_struct *n, struct list_head *repoll)
+static int __napi_poll(struct napi_struct *n, bool *repoll)
 {
-	void *have;
 	int work, weight;
 
-	list_del_init(&n->poll_list);
-
-	have = netpoll_poll_lock(n);
-
 	weight = n->weight;
 
 	/* This NAPI_STATE_SCHED test is for avoiding a race
@@ -6800,7 +6795,7 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
 			    n->poll, work, weight);
 
 	if (likely(work < weight))
-		goto out_unlock;
+		return work;
 
 	/* Drivers must not modify the NAPI state if they
 	 * consume the entire weight. In such cases this code
@@ -6809,7 +6804,7 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
 	 */
 	if (unlikely(napi_disable_pending(n))) {
 		napi_complete(n);
-		goto out_unlock;
+		return work;
 	}
 
 	/* The NAPI context has more processing work, but busy-polling
@@ -6822,7 +6817,7 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
 			 */
 			napi_schedule(n);
 		}
-		goto out_unlock;
+		return work;
 	}
 
 	if (n->gro_bitmask) {
@@ -6840,9 +6835,29 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
 	if (unlikely(!list_empty(&n->poll_list))) {
 		pr_warn_once("%s: Budget exhausted after napi rescheduled\n",
 			     n->dev ? n->dev->name : "backlog");
-		goto out_unlock;
+		return work;
 	}
 
+	*repoll = true;
+
+	return work;
+}
+
+static int napi_poll(struct napi_struct *n, struct list_head *repoll)
+{
+	bool do_repoll = false;
+	void *have;
+	int work;
+
+	list_del_init(&n->poll_list);
+
+	have = netpoll_poll_lock(n);
+
+	work = __napi_poll(n, &do_repoll);
+
+	if (!do_repoll)
+		goto out_unlock;
+
 	list_add_tail(&n->poll_list, repoll);
 
 out_unlock:

From patchwork Fri Jan 15 00:31:22 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 12021149
X-Patchwork-Delegate: kuba@kernel.org
Date: Thu, 14 Jan 2021 16:31:22 -0800
In-Reply-To: <20210115003123.1254314-1-weiwan@google.com>
Message-Id: <20210115003123.1254314-3-weiwan@google.com>
References: <20210115003123.1254314-1-weiwan@google.com>
X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7-goog
Subject: [PATCH net-next v6 2/3] net: implement threaded-able napi poll loop support
From: Wei Wang
To: David Miller , netdev@vger.kernel.org, Jakub Kicinski
Cc: Eric Dumazet , Paolo Abeni , Hannes Frederic Sowa , Felix Fietkau
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

This patch allows running each napi poll loop inside its own
kernel thread.
The threaded mode can be enabled through the napi_set_threaded()
API, and does not require a device up/down. The kthread gets
created on demand when napi_set_threaded() is called, and gets
shut down eventually in napi_disable().

Once threaded mode is enabled and the kthread is started,
napi_schedule() will wake up that thread instead of scheduling
the softirq.

The threaded poll loop behaves much like net_rx_action(),
but it does not have to manipulate local irqs and uses
an explicit scheduling point, cond_resched(), between polls.

Co-developed-by: Paolo Abeni
Signed-off-by: Paolo Abeni
Co-developed-by: Hannes Frederic Sowa
Signed-off-by: Hannes Frederic Sowa
Co-developed-by: Jakub Kicinski
Signed-off-by: Jakub Kicinski
Signed-off-by: Wei Wang
---
 include/linux/netdevice.h |  12 ++--
 net/core/dev.c            | 113 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+), 7 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 5b949076ed23..c24ed232c746 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -347,6 +347,7 @@ struct napi_struct {
 	struct list_head	dev_list;
 	struct hlist_node	napi_hash_node;
 	unsigned int		napi_id;
+	struct task_struct	*thread;
 };
 
 enum {
@@ -358,6 +359,7 @@ enum {
 	NAPI_STATE_NO_BUSY_POLL,	/* Do not add in napi_hash, no busy polling */
 	NAPI_STATE_IN_BUSY_POLL,	/* sk_busy_loop() owns this NAPI */
 	NAPI_STATE_PREFER_BUSY_POLL,	/* prefer busy-polling over softirq processing*/
+	NAPI_STATE_THREADED,		/* The poll is performed inside its own thread*/
 };
 
 enum {
@@ -369,6 +371,7 @@ enum {
 	NAPIF_STATE_NO_BUSY_POLL	= BIT(NAPI_STATE_NO_BUSY_POLL),
 	NAPIF_STATE_IN_BUSY_POLL	= BIT(NAPI_STATE_IN_BUSY_POLL),
 	NAPIF_STATE_PREFER_BUSY_POLL	= BIT(NAPI_STATE_PREFER_BUSY_POLL),
+	NAPIF_STATE_THREADED		= BIT(NAPI_STATE_THREADED),
 };
 
 enum gro_result {
@@ -510,13 +513,7 @@ void napi_disable(struct napi_struct *n);
  * Resume NAPI from being scheduled on this context.
  * Must be paired with napi_disable.
  */
-static inline void napi_enable(struct napi_struct *n)
-{
-	BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
-	smp_mb__before_atomic();
-	clear_bit(NAPI_STATE_SCHED, &n->state);
-	clear_bit(NAPI_STATE_NPSVC, &n->state);
-}
+void napi_enable(struct napi_struct *n);
 
 /**
  *	napi_synchronize - wait until NAPI is not running
@@ -2140,6 +2137,7 @@ struct net_device {
 	struct lock_class_key	*qdisc_tx_busylock;
 	struct lock_class_key	*qdisc_running_key;
 	bool			proto_down;
+	bool			threaded;
 	unsigned		wol_enabled:1;
 
 	struct list_head	net_notifier_list;
diff --git a/net/core/dev.c b/net/core/dev.c
index 83b59e4c0f37..edcfec1361e9 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -91,6 +91,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1493,6 +1494,36 @@ void netdev_notify_peers(struct net_device *dev)
 }
 EXPORT_SYMBOL(netdev_notify_peers);
 
+static int napi_threaded_poll(void *data);
+
+static int napi_kthread_create(struct napi_struct *n)
+{
+	int err = 0;
+
+	/* Create and wake up the kthread once to put it in
+	 * TASK_INTERRUPTIBLE mode to avoid the blocked task
+	 * warning and work with loadavg.
+	 */
+	n->thread = kthread_run(napi_threaded_poll, n, "napi/%s-%d",
+				n->dev->name, n->napi_id);
+	if (IS_ERR(n->thread)) {
+		err = PTR_ERR(n->thread);
+		pr_err("kthread_run failed with err %d\n", err);
+		n->thread = NULL;
+	}
+
+	return err;
+}
+
+static void napi_kthread_stop(struct napi_struct *n)
+{
+	if (!n->thread)
+		return;
+	kthread_stop(n->thread);
+	clear_bit(NAPI_STATE_THREADED, &n->state);
+	n->thread = NULL;
+}
+
 static int __dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
@@ -4252,6 +4283,11 @@ int gro_normal_batch __read_mostly = 8;
 static inline void ____napi_schedule(struct softnet_data *sd,
 				     struct napi_struct *napi)
 {
+	if (test_bit(NAPI_STATE_THREADED, &napi->state)) {
+		wake_up_process(napi->thread);
+		return;
+	}
+
 	list_add_tail(&napi->poll_list, &sd->poll_list);
 	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
 }
@@ -6697,6 +6733,27 @@ static void init_gro_hash(struct napi_struct *napi)
 	napi->gro_bitmask = 0;
 }
 
+static int napi_set_threaded(struct napi_struct *n, bool threaded)
+{
+	int err = 0;
+
+	if (threaded == !!test_bit(NAPI_STATE_THREADED, &n->state))
+		return 0;
+	if (threaded) {
+		if (!n->thread) {
+			err = napi_kthread_create(n);
+			if (err)
+				goto out;
+		}
+		set_bit(NAPI_STATE_THREADED, &n->state);
+	} else {
+		clear_bit(NAPI_STATE_THREADED, &n->state);
+	}
+
+out:
+	return err;
+}
+
 void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
 		    int (*poll)(struct napi_struct *, int), int weight)
 {
@@ -6738,12 +6795,23 @@ void napi_disable(struct napi_struct *n)
 		msleep(1);
 
 	hrtimer_cancel(&n->timer);
+	napi_kthread_stop(n);
 
 	clear_bit(NAPI_STATE_PREFER_BUSY_POLL, &n->state);
 	clear_bit(NAPI_STATE_DISABLE, &n->state);
 }
 EXPORT_SYMBOL(napi_disable);
 
+void napi_enable(struct napi_struct *n)
+{
+	BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
+	smp_mb__before_atomic();
+	clear_bit(NAPI_STATE_SCHED, &n->state);
+	clear_bit(NAPI_STATE_NPSVC, &n->state);
+	WARN_ON(napi_set_threaded(n, n->dev->threaded));
+}
+EXPORT_SYMBOL(napi_enable);
+
 static void flush_gro_hash(struct napi_struct *napi)
 {
 	int i;
@@ -6866,6 +6934,51 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
 	return work;
 }
 
+static int napi_thread_wait(struct napi_struct *napi)
+{
+	set_current_state(TASK_INTERRUPTIBLE);
+
+	while (!kthread_should_stop() && !napi_disable_pending(napi)) {
+		if (test_bit(NAPI_STATE_SCHED, &napi->state)) {
+			WARN_ON(!list_empty(&napi->poll_list));
+			__set_current_state(TASK_RUNNING);
+			return 0;
+		}
+
+		schedule();
+		set_current_state(TASK_INTERRUPTIBLE);
+	}
+	__set_current_state(TASK_RUNNING);
+	return -1;
+}
+
+static int napi_threaded_poll(void *data)
+{
+	struct napi_struct *napi = data;
+	void *have;
+
+	while (!napi_thread_wait(napi)) {
+		for (;;) {
+			bool repoll = false;
+
+			local_bh_disable();
+
+			have = netpoll_poll_lock(napi);
+			__napi_poll(napi, &repoll);
+			netpoll_poll_unlock(have);
+
+			__kfree_skb_flush();
+			local_bh_enable();
+
+			if (!repoll)
+				break;
+
+			cond_resched();
+		}
+	}
+	return 0;
+}
+
 static __latent_entropy void net_rx_action(struct softirq_action *h)
 {
 	struct softnet_data *sd = this_cpu_ptr(&softnet_data);

From patchwork Fri Jan 15 00:31:23 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 12021153
X-Patchwork-Delegate: kuba@kernel.org
Date: Thu, 14 Jan 2021 16:31:23 -0800
In-Reply-To: <20210115003123.1254314-1-weiwan@google.com>
Message-Id: <20210115003123.1254314-4-weiwan@google.com>
References: <20210115003123.1254314-1-weiwan@google.com>
X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7-goog
Subject: [PATCH net-next v6 3/3] net: add sysfs attribute to control napi threaded mode
From: Wei Wang
To: David Miller , netdev@vger.kernel.org, Jakub Kicinski
Cc: Eric Dumazet , Paolo Abeni , Hannes Frederic Sowa , Felix Fietkau
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

This patch adds a new sysfs attribute to the network device class.
Said attribute provides a per-device control to enable/disable the
threaded mode for all the napi instances of the given network device.

Co-developed-by: Paolo Abeni
Signed-off-by: Paolo Abeni
Co-developed-by: Hannes Frederic Sowa
Signed-off-by: Hannes Frederic Sowa
Co-developed-by: Felix Fietkau
Signed-off-by: Felix Fietkau
Signed-off-by: Wei Wang
---
 include/linux/netdevice.h |  2 ++
 net/core/dev.c            | 28 +++++++++++++++++
 net/core/net-sysfs.c      | 63 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 93 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index c24ed232c746..11ae0c9b9350 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -497,6 +497,8 @@ static inline bool napi_complete(struct napi_struct *n)
 	return napi_complete_done(n, 0);
 }
 
+int dev_set_threaded(struct net_device *dev, bool threaded);
+
 /**
  *	napi_disable - prevent NAPI from scheduling
  *	@n: NAPI context
diff --git a/net/core/dev.c b/net/core/dev.c
index edcfec1361e9..d5fb95316ea8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6754,6 +6754,34 @@ static int napi_set_threaded(struct napi_struct *n, bool threaded)
 	return err;
 }
 
+static void dev_disable_threaded_all(struct net_device *dev)
+{
+	struct napi_struct *napi;
+
+	list_for_each_entry(napi, &dev->napi_list, dev_list)
+		napi_set_threaded(napi, false);
+}
+
+int dev_set_threaded(struct net_device *dev, bool threaded)
+{
+	struct napi_struct *napi;
+	int ret = 0;
+
+	dev->threaded = threaded;
+	list_for_each_entry(napi, &dev->napi_list, dev_list) {
+		ret = napi_set_threaded(napi, threaded);
+		if (ret) {
+			/* Error occurred on one of the napi,
+			 * reset threaded mode on all napi.
+			 */
+			dev_disable_threaded_all(dev);
+			break;
+		}
+	}
+
+	return ret;
+}
+
 void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
 		    int (*poll)(struct napi_struct *, int), int weight)
 {
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index daf502c13d6d..2017f8f07b8d 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -538,6 +538,68 @@ static ssize_t phys_switch_id_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(phys_switch_id);
 
+static ssize_t threaded_show(struct device *dev,
+			     struct device_attribute *attr, char *buf)
+{
+	struct net_device *netdev = to_net_dev(dev);
+	struct napi_struct *n;
+	bool enabled;
+	int ret;
+
+	if (!rtnl_trylock())
+		return restart_syscall();
+
+	if (!dev_isalive(netdev)) {
+		ret = -EINVAL;
+		goto unlock;
+	}
+
+	if (list_empty(&netdev->napi_list)) {
+		ret = -EOPNOTSUPP;
+		goto unlock;
+	}
+
+	/* Only return true if all napi have threaded mode.
+	 * The inconsistency could happen when the device driver calls
+	 * napi_disable()/napi_enable() with dev->threaded set to true,
+	 * but napi_kthread_create() fails.
+	 * We return false in this case to remind the user that one or
+	 * more napi did not have threaded mode enabled properly.
+	 */
+	list_for_each_entry(n, &netdev->napi_list, dev_list) {
+		enabled = !!test_bit(NAPI_STATE_THREADED, &n->state);
+		if (!enabled)
+			break;
+	}
+
+	ret = sprintf(buf, fmt_dec, enabled);
+
+unlock:
+	rtnl_unlock();
+	return ret;
+}
+
+static int modify_napi_threaded(struct net_device *dev, unsigned long val)
+{
+	struct napi_struct *napi;
+	int ret;
+
+	if (list_empty(&dev->napi_list))
+		return -EOPNOTSUPP;
+
+	ret = dev_set_threaded(dev, !!val);
+
+	return ret;
+}
+
+static ssize_t threaded_store(struct device *dev,
+			      struct device_attribute *attr,
+			      const char *buf, size_t len)
+{
+	return netdev_store(dev, attr, buf, len, modify_napi_threaded);
+}
+static DEVICE_ATTR_RW(threaded);
+
 static struct attribute *net_class_attrs[] __ro_after_init = {
 	&dev_attr_netdev_group.attr,
 	&dev_attr_type.attr,
@@ -570,6 +632,7 @@ static struct attribute *net_class_attrs[] __ro_after_init = {
 	&dev_attr_proto_down.attr,
 	&dev_attr_carrier_up_count.attr,
 	&dev_attr_carrier_down_count.attr,
+	&dev_attr_threaded.attr,
 	NULL,
 };
 ATTRIBUTE_GROUPS(net_class);
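
For reference, the "threaded" attribute added in patch 3/3 could be exercised from userspace roughly as sketched below. This is only an illustration, not part of the series: the helper names threaded_attr_path(), threaded_attr_write() and threaded_attr_read() are hypothetical, and the path assumes the usual /sys/class/net/<ifname>/ location of net class attributes. Since the store side goes through netdev_store(), writes are expected to need CAP_NET_ADMIN, and per the patch both read and write return EOPNOTSUPP for a device without any napi instance.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "/sys/class/net/<ifname>/threaded" in buf.
 * Returns 0 on success, -1 on a bad name or a too-small buffer.
 */
static int threaded_attr_path(char *buf, size_t len, const char *ifname)
{
	int n;

	assert(buf != NULL && len > 0);

	/* Reject empty names and names that would escape the directory. */
	if (!ifname || !*ifname || strchr(ifname, '/'))
		return -1;

	n = snprintf(buf, len, "/sys/class/net/%s/threaded", ifname);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: write "1\n" or "0\n" to the attribute at path. */
static int threaded_attr_write(const char *path, int enable)
{
	FILE *f = fopen(path, "w");
	int rc;

	if (!f)
		return -1;
	rc = (fputs(enable ? "1\n" : "0\n", f) == EOF) ? -1 : 0;
	/* stdio buffers the data, so a store error from the kernel
	 * typically only surfaces when the stream is flushed at fclose().
	 */
	if (fclose(f) == EOF)
		rc = -1;
	return rc;
}

/* Hypothetical helper: read the attribute value, or -1 on error. */
static int threaded_attr_read(const char *path)
{
	FILE *f = fopen(path, "r");
	int val = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}
```

A caller would build the path with threaded_attr_path(path, sizeof(path), "eth0"), where eth0 is just a placeholder interface name, then call threaded_attr_write(path, 1) to request threaded mode and threaded_attr_read(path) to check whether all napi instances of the device actually have it enabled.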