From patchwork Wed Feb 5 00:10:49 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Samiullah Khawaja
X-Patchwork-Id: 13960232
X-Patchwork-Delegate: kuba@kernel.org
Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3A380173
	for ; Wed, 5 Feb 2025 00:10:55 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201
Date: Wed, 5 Feb 2025 00:10:49 +0000
In-Reply-To: <20250205001052.2590140-1-skhawaja@google.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
Mime-Version: 1.0
References: <20250205001052.2590140-1-skhawaja@google.com>
X-Mailer: git-send-email 2.48.1.362.g079036d154-goog
Message-ID: <20250205001052.2590140-2-skhawaja@google.com>
Subject: [PATCH net-next v3 1/4] Add support to set napi threaded for individual napi
From: Samiullah Khawaja
To: Jakub Kicinski, "David S . Miller", Eric Dumazet, Paolo Abeni, almasrymina@google.com
Cc: netdev@vger.kernel.org, skhawaja@google.com
X-Patchwork-Delegate: kuba@kernel.org

A net device has a threaded sysfs attribute that can be used to enable
threaded napi polling on all of the NAPI contexts under that device. Allow
enabling threaded napi polling at individual napi level using netlink.

Extend the netlink operation `napi-set` and allow setting the threaded
attribute of a NAPI. This will enable the threaded polling on a napi
context.

Tested using the following command in qemu/virtio-net:
./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
  --do napi-set --json '{"id": 66, "threaded": 1}'

Signed-off-by: Samiullah Khawaja
---
 Documentation/netlink/specs/netdev.yaml | 10 ++++++++
 Documentation/networking/napi.rst       | 13 ++++++++++-
 include/linux/netdevice.h               | 10 ++++++++
 include/uapi/linux/netdev.h             |  1 +
 net/core/dev.c                          | 31 +++++++++++++++++++++++++
 net/core/netdev-genl-gen.c              |  5 ++--
 net/core/netdev-genl.c                  |  9 +++++++
 tools/include/uapi/linux/netdev.h       |  1 +
 8 files changed, 77 insertions(+), 3 deletions(-)

diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index cbb544bd6c84..785240d60df6 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -268,6 +268,14 @@ attribute-sets:
         doc: The timeout, in nanoseconds, of how long to suspend irq
              processing, if event polling finds events
         type: uint
+      -
+        name: threaded
+        doc: Whether the napi is configured to operate in threaded polling
+             mode. If this is set to `1` then the NAPI context operates
+             in threaded polling mode.
+        type: u32
+        checks:
+          max: 1
   -
     name: queue
     attributes:
@@ -659,6 +667,7 @@ operations:
             - defer-hard-irqs
             - gro-flush-timeout
             - irq-suspend-timeout
+            - threaded
       dump:
         request:
           attributes:
@@ -711,6 +720,7 @@ operations:
             - defer-hard-irqs
             - gro-flush-timeout
             - irq-suspend-timeout
+            - threaded
 
 kernel-family:
   headers: [ "linux/list.h"]
diff --git a/Documentation/networking/napi.rst b/Documentation/networking/napi.rst
index f970a2be271a..73c83b4533dc 100644
--- a/Documentation/networking/napi.rst
+++ b/Documentation/networking/napi.rst
@@ -413,7 +413,18 @@ dependent). The NAPI instance IDs will be assigned in the opposite
 order than the process IDs of the kernel threads.
 
 Threaded NAPI is controlled by writing 0/1 to the ``threaded`` file in
-netdev's sysfs directory.
+netdev's sysfs directory. It can also be enabled for a specific napi using
+the netlink interface.
+
+For example, using the script:
+
+.. code-block:: bash
+
+  $ kernel-source/tools/net/ynl/pyynl/cli.py \
+            --spec Documentation/netlink/specs/netdev.yaml \
+            --do napi-set \
+            --json='{"id": 66,
+                    "threaded": 1}'
 
 .. rubric:: Footnotes
 
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2a59034a5fa2..a0e485722ed9 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -352,6 +352,7 @@ struct napi_config {
 	u64 gro_flush_timeout;
 	u64 irq_suspend_timeout;
 	u32 defer_hard_irqs;
+	bool threaded;
 	unsigned int napi_id;
 };
 
@@ -572,6 +573,15 @@ static inline bool napi_complete(struct napi_struct *n)
 
 int dev_set_threaded(struct net_device *dev, bool threaded);
 
+/*
+ * napi_set_threaded - set napi threaded state
+ * @napi: NAPI context
+ * @threaded: whether this napi does threaded polling
+ *
+ * Return 0 on success and negative errno on failure.
+ */
+int napi_set_threaded(struct napi_struct *napi, bool threaded);
+
 void napi_disable(struct napi_struct *n);
 void napi_disable_locked(struct napi_struct *n);
 
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index e4be227d3ad6..829648b2ef65 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -125,6 +125,7 @@ enum {
 	NETDEV_A_NAPI_DEFER_HARD_IRQS,
 	NETDEV_A_NAPI_GRO_FLUSH_TIMEOUT,
 	NETDEV_A_NAPI_IRQ_SUSPEND_TIMEOUT,
+	NETDEV_A_NAPI_THREADED,
 
 	__NETDEV_A_NAPI_MAX,
 	NETDEV_A_NAPI_MAX = (__NETDEV_A_NAPI_MAX - 1)
diff --git a/net/core/dev.c b/net/core/dev.c
index c0021cbd28fc..50fb234dd7a0 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6787,6 +6787,30 @@ static void init_gro_hash(struct napi_struct *napi)
 	napi->gro_bitmask = 0;
 }
 
+int napi_set_threaded(struct napi_struct *napi, bool threaded)
+{
+	if (napi->dev->threaded)
+		return -EINVAL;
+
+	if (threaded) {
+		if (!napi->thread) {
+			int err = napi_kthread_create(napi);
+
+			if (err)
+				return err;
+		}
+	}
+
+	if (napi->config)
+		napi->config->threaded = threaded;
+
+	/* Make sure kthread is created before THREADED bit is set. */
+	smp_mb__before_atomic();
+	assign_bit(NAPI_STATE_THREADED, &napi->state, threaded);
+
+	return 0;
+}
+
 int dev_set_threaded(struct net_device *dev, bool threaded)
 {
 	struct napi_struct *napi;
@@ -6798,6 +6822,11 @@ int dev_set_threaded(struct net_device *dev, bool threaded)
 		return 0;
 
 	if (threaded) {
+		/* Check if threaded is set at napi level already */
+		list_for_each_entry(napi, &dev->napi_list, dev_list)
+			if (test_bit(NAPI_STATE_THREADED, &napi->state))
+				return -EINVAL;
+
 		list_for_each_entry(napi, &dev->napi_list, dev_list) {
 			if (!napi->thread) {
 				err = napi_kthread_create(napi);
@@ -6880,6 +6909,8 @@ static void napi_restore_config(struct napi_struct *n)
 		napi_hash_add(n);
 		n->config->napi_id = n->napi_id;
 	}
+
+	napi_set_threaded(n, n->config->threaded);
 }
 
 static void napi_save_config(struct napi_struct *n)
diff --git a/net/core/netdev-genl-gen.c b/net/core/netdev-genl-gen.c
index 996ac6a449eb..a1f80e687f53 100644
--- a/net/core/netdev-genl-gen.c
+++ b/net/core/netdev-genl-gen.c
@@ -92,11 +92,12 @@ static const struct nla_policy netdev_bind_rx_nl_policy[NETDEV_A_DMABUF_FD + 1]
 };
 
 /* NETDEV_CMD_NAPI_SET - do */
-static const struct nla_policy netdev_napi_set_nl_policy[NETDEV_A_NAPI_IRQ_SUSPEND_TIMEOUT + 1] = {
+static const struct nla_policy netdev_napi_set_nl_policy[NETDEV_A_NAPI_THREADED + 1] = {
 	[NETDEV_A_NAPI_ID] = { .type = NLA_U32, },
 	[NETDEV_A_NAPI_DEFER_HARD_IRQS] = NLA_POLICY_FULL_RANGE(NLA_U32, &netdev_a_napi_defer_hard_irqs_range),
 	[NETDEV_A_NAPI_GRO_FLUSH_TIMEOUT] = { .type = NLA_UINT, },
 	[NETDEV_A_NAPI_IRQ_SUSPEND_TIMEOUT] = { .type = NLA_UINT, },
+	[NETDEV_A_NAPI_THREADED] = NLA_POLICY_MAX(NLA_U32, 1),
 };
 
 /* Ops table for netdev */
@@ -187,7 +188,7 @@ static const struct genl_split_ops netdev_nl_ops[] = {
 		.cmd		= NETDEV_CMD_NAPI_SET,
 		.doit		= netdev_nl_napi_set_doit,
 		.policy		= netdev_napi_set_nl_policy,
-		.maxattr	= NETDEV_A_NAPI_IRQ_SUSPEND_TIMEOUT,
+		.maxattr	= NETDEV_A_NAPI_THREADED,
 		.flags		= GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
 	},
 };
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index 715f85c6b62e..208c3dd768ec 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -183,6 +183,9 @@ netdev_nl_napi_fill_one(struct sk_buff *rsp, struct napi_struct *napi,
 	if (napi->irq >= 0 && nla_put_u32(rsp, NETDEV_A_NAPI_IRQ, napi->irq))
 		goto nla_put_failure;
 
+	if (nla_put_u32(rsp, NETDEV_A_NAPI_THREADED, !!napi->thread))
+		goto nla_put_failure;
+
 	if (napi->thread) {
 		pid = task_pid_nr(napi->thread);
 		if (nla_put_u32(rsp, NETDEV_A_NAPI_PID, pid))
@@ -321,8 +324,14 @@ netdev_nl_napi_set_config(struct napi_struct *napi, struct genl_info *info)
 {
 	u64 irq_suspend_timeout = 0;
 	u64 gro_flush_timeout = 0;
+	u32 threaded = 0;
 	u32 defer = 0;
 
+	if (info->attrs[NETDEV_A_NAPI_THREADED]) {
+		threaded = nla_get_u32(info->attrs[NETDEV_A_NAPI_THREADED]);
+		napi_set_threaded(napi, !!threaded);
+	}
+
 	if (info->attrs[NETDEV_A_NAPI_DEFER_HARD_IRQS]) {
 		defer = nla_get_u32(info->attrs[NETDEV_A_NAPI_DEFER_HARD_IRQS]);
 		napi_set_defer_hard_irqs(napi, defer);
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index e4be227d3ad6..829648b2ef65 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -125,6 +125,7 @@ enum {
 	NETDEV_A_NAPI_DEFER_HARD_IRQS,
 	NETDEV_A_NAPI_GRO_FLUSH_TIMEOUT,
 	NETDEV_A_NAPI_IRQ_SUSPEND_TIMEOUT,
+	NETDEV_A_NAPI_THREADED,
 
 	__NETDEV_A_NAPI_MAX,
 	NETDEV_A_NAPI_MAX = (__NETDEV_A_NAPI_MAX - 1)

From patchwork Wed Feb 5 00:10:50 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Samiullah Khawaja
X-Patchwork-Id: 13960233
X-Patchwork-Delegate: kuba@kernel.org
Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 878E2802
	for ; Wed, 5 Feb 2025 00:10:57 +0000 (UTC)
Authentication-Results:
	smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202
Date: Wed, 5 Feb 2025 00:10:50 +0000
In-Reply-To: <20250205001052.2590140-1-skhawaja@google.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
Mime-Version: 1.0
References: <20250205001052.2590140-1-skhawaja@google.com>
X-Mailer: git-send-email 2.48.1.362.g079036d154-goog
Message-ID: <20250205001052.2590140-3-skhawaja@google.com>
Subject: [PATCH net-next v3 2/4] net: Create separate gro_flush helper function
From: Samiullah Khawaja
To: Jakub Kicinski, "David S . Miller", Eric Dumazet, Paolo Abeni, almasrymina@google.com
Cc: netdev@vger.kernel.org, skhawaja@google.com
X-Patchwork-Delegate: kuba@kernel.org

Move multiple copies of the same code snippet doing `gro_flush` and
`gro_normal_list` into a separate helper function.

Signed-off-by: Samiullah Khawaja
---
 net/core/dev.c | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 50fb234dd7a0..d5dcf9dd6225 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6484,6 +6484,17 @@ static void skb_defer_free_flush(struct softnet_data *sd)
 	}
 }
 
+static void __napi_gro_flush_helper(struct napi_struct *napi)
+{
+	if (napi->gro_bitmask) {
+		/* flush too old packets
+		 * If HZ < 1000, flush all packets.
+		 */
+		napi_gro_flush(napi, HZ >= 1000);
+	}
+	gro_normal_list(napi);
+}
+
 #if defined(CONFIG_NET_RX_BUSY_POLL)
 
 static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule)
@@ -6494,14 +6505,8 @@ static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule)
 		return;
 	}
 
-	if (napi->gro_bitmask) {
-		/* flush too old packets
-		 * If HZ < 1000, flush all packets.
-		 */
-		napi_gro_flush(napi, HZ >= 1000);
-	}
+	__napi_gro_flush_helper(napi);
 
-	gro_normal_list(napi);
 	clear_bit(NAPI_STATE_SCHED, &napi->state);
 }
 
@@ -7170,14 +7175,7 @@ static int __napi_poll(struct napi_struct *n, bool *repoll)
 		return work;
 	}
 
-	if (n->gro_bitmask) {
-		/* flush too old packets
-		 * If HZ < 1000, flush all packets.
-		 */
-		napi_gro_flush(n, HZ >= 1000);
-	}
-
-	gro_normal_list(n);
+	__napi_gro_flush_helper(n);
 
 	/* Some drivers may have called napi_schedule
 	 * prior to exhausting their budget.
From patchwork Wed Feb 5 00:10:51 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Samiullah Khawaja
X-Patchwork-Id: 13960234
X-Patchwork-Delegate: kuba@kernel.org
Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id B1ADB33D8
	for ; Wed, 5 Feb 2025 00:10:58 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201
Date: Wed, 5 Feb 2025 00:10:51 +0000
In-Reply-To: <20250205001052.2590140-1-skhawaja@google.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
Mime-Version: 1.0
References: <20250205001052.2590140-1-skhawaja@google.com>
X-Mailer: git-send-email 2.48.1.362.g079036d154-goog
Message-ID: <20250205001052.2590140-4-skhawaja@google.com>
Subject: [PATCH net-next v3 3/4] Extend napi threaded polling to allow kthread based busy polling
From: Samiullah Khawaja
To: Jakub Kicinski, "David S . Miller", Eric Dumazet, Paolo Abeni, almasrymina@google.com
Cc: netdev@vger.kernel.org, skhawaja@google.com
X-Patchwork-Delegate: kuba@kernel.org

Add a new state to the napi state enum:

- NAPI_STATE_THREADED_BUSY_POLL
  Threaded busy poll is enabled/running for this napi.

The following changes are introduced in the napi scheduling and state
logic:

- When threaded busy poll is enabled through sysfs, it also enables
  NAPI_STATE_THREADED so that a kthread is created per napi. It also sets
  the NAPI_STATE_THREADED_BUSY_POLL bit on each napi to indicate that each
  napi is supposed to busy poll.

- When a napi is scheduled with NAPI_STATE_SCHED_THREADED and the
  associated kthread is woken up, the kthread owns the context. If
  NAPI_STATE_THREADED_BUSY_POLL and NAPI_STATE_SCHED_THREADED are both
  set, the kthread can busy poll.

- To keep busy polling and to avoid scheduling of the interrupts,
  napi_complete_done returns false when both the SCHED_THREADED and
  THREADED_BUSY_POLL flags are set. napi_complete_done also returns early
  so that NAPI_STATE_SCHED_THREADED is not unset.

- If at any point NAPI_STATE_THREADED_BUSY_POLL is unset,
  napi_complete_done runs to completion and also unsets the
  SCHED_THREADED bit. This makes the associated kthread go to sleep as
  per the existing logic.

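The flag handling described in the list above can be modelled in plain
userspace C. This is a minimal sketch, not kernel code: the `MODEL_*` bit
values and the `model_*` helpers are hypothetical names made up for
illustration, and the real napi_complete_done() does considerably more
work. The sketch follows the prose above (both SCHED_THREADED and
THREADED_BUSY_POLL set), whereas the diff in this patch tests a dedicated
NAPIF_STATE_SCHED_THREADED_BUSY_POLL bit.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the NAPIF_STATE_* bits; values are arbitrary. */
enum {
	MODEL_SCHED              = 1u << 0,
	MODEL_THREADED           = 1u << 1,
	MODEL_SCHED_THREADED     = 1u << 2,
	MODEL_THREADED_BUSY_POLL = 1u << 3,
};

/* The kthread may busy poll only while both bits are set. */
static bool model_can_busy_poll(unsigned int state)
{
	unsigned int need = MODEL_SCHED_THREADED | MODEL_THREADED_BUSY_POLL;

	return (state & need) == need;
}

/*
 * Simplified model of the completion path (assumption: the many other
 * checks of the real function are ignored). While busy polling, return
 * false early and leave the state untouched; otherwise clear the
 * scheduling bits and report completion.
 */
static bool model_complete_done(unsigned int *state)
{
	if (model_can_busy_poll(*state))
		return false;

	*state &= ~(MODEL_SCHED | MODEL_SCHED_THREADED);
	return true;
}
```

With both bits set, model_complete_done() returns false and the kthread
keeps ownership; once MODEL_THREADED_BUSY_POLL is cleared, the same call
clears MODEL_SCHED_THREADED, which corresponds to the kthread going back
to sleep.
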
Signed-off-by: Samiullah Khawaja
Reviewed-by: Willem de Bruijn
Acked-by: Paul Barker
---
 Documentation/ABI/testing/sysfs-class-net     |  3 +-
 Documentation/netlink/specs/netdev.yaml       | 12 ++--
 Documentation/networking/napi.rst             | 67 ++++++++++++++++-
 .../net/ethernet/atheros/atl1c/atl1c_main.c   |  2 +-
 drivers/net/ethernet/mellanox/mlxsw/pci.c     |  2 +-
 drivers/net/ethernet/renesas/ravb_main.c      |  2 +-
 drivers/net/wireless/ath/ath10k/snoc.c        |  2 +-
 include/linux/netdevice.h                     | 20 ++++--
 include/uapi/linux/netdev.h                   |  6 ++
 net/core/dev.c                                | 72 ++++++++++++++++---
 net/core/net-sysfs.c                          |  2 +-
 net/core/netdev-genl-gen.c                    |  2 +-
 net/core/netdev-genl.c                        |  2 +-
 tools/include/uapi/linux/netdev.h             |  6 ++
 14 files changed, 171 insertions(+), 29 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-net b/Documentation/ABI/testing/sysfs-class-net
index ebf21beba846..15d7d36a8294 100644
--- a/Documentation/ABI/testing/sysfs-class-net
+++ b/Documentation/ABI/testing/sysfs-class-net
@@ -343,7 +343,7 @@ Date:		Jan 2021
 KernelVersion:	5.12
 Contact:	netdev@vger.kernel.org
 Description:
-		Boolean value to control the threaded mode per device. User could
+		Integer value to control the threaded mode per device. User could
 		set this value to enable/disable threaded mode for all napi
 		belonging to this device, without the need to do device up/down.
 
@@ -351,4 +351,5 @@ Description:
 		== ==================================
 		0  threaded mode disabled for this dev
 		1  threaded mode enabled for this dev
+		2  threaded mode enabled, and busy polling enabled.
 		== ==================================
diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index 785240d60df6..db3bf1eb9a63 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -78,6 +78,10 @@ definitions:
     name: qstats-scope
     type: flags
     entries: [ queue ]
+  -
+    name: napi-threaded
+    type: enum
+    entries: [ disable, enable, busy-poll-enable ]
 
 attribute-sets:
   -
@@ -271,11 +275,11 @@ attribute-sets:
       -
         name: threaded
         doc: Whether the napi is configured to operate in threaded polling
-             mode. If this is set to `1` then the NAPI context operates
-             in threaded polling mode.
+             mode. If this is set to `enable` then the NAPI context operates
+             in threaded polling mode. If this is set to `busy-poll-enable`
+             then the NAPI kthread also does busy polling.
         type: u32
-        checks:
-          max: 1
+        enum: napi-threaded
   -
     name: queue
     attributes:
diff --git a/Documentation/networking/napi.rst b/Documentation/networking/napi.rst
index 73c83b4533dc..f6596573b777 100644
--- a/Documentation/networking/napi.rst
+++ b/Documentation/networking/napi.rst
@@ -232,7 +232,9 @@ are not well known).
 Busy polling is enabled by either setting ``SO_BUSY_POLL`` on
 selected sockets or using the global ``net.core.busy_poll`` and
 ``net.core.busy_read`` sysctls. An io_uring API for NAPI busy polling
-also exists.
+also exists. Threaded polling of NAPI also has a mode to busy poll for
+packets (:ref:`threaded busy polling <threaded_busy_poll>`) using the same
+thread that is used for NAPI processing.
 
 epoll-based busy polling
 ------------------------
@@ -395,6 +397,69 @@ Therefore, setting ``gro_flush_timeout`` and ``napi_defer_hard_irqs`` is
 the recommended usage, because otherwise setting ``irq-suspend-timeout``
 might not have any discernible effect.
 
+.. _threaded_busy_poll:
+
+Threaded NAPI busy polling
+--------------------------
+
+Threaded napi allows processing of packets from each NAPI in a kthread in the
+kernel.
+Threaded napi busy polling extends this to busy poll the NAPI continuously.
+This enables busy polling independent of the userspace application and of the
+API (epoll, io_uring, raw sockets) that it uses to process the packets.
+
+It can be enabled for each NAPI using the netlink interface or at device level
+using the ``threaded`` sysfs file.
+
+For example, using the following script:
+
+.. code-block:: bash
+
+  $ kernel-source/tools/net/ynl/pyynl/cli.py \
+            --spec Documentation/netlink/specs/netdev.yaml \
+            --do napi-set \
+            --json='{"id": 66,
+                    "threaded": "busy-poll-enable"}'
+
+
+Enabling it for each NAPI allows finer control to enable busy polling for
+only a set of NIC queues which will get traffic with low latency requirements.
+
+Depending on the application requirements, the user might want to set the
+affinity of the kthread that is busy polling each NAPI. The user might also want
+to set the priority and scheduler of the thread to match the latency needs.
+
+For a hard low-latency application, the user might want to dedicate a full core
+to the NAPI polling so the NIC queue descriptors are picked up from the queue
+as soon as they appear. For a more relaxed low-latency requirement, the user
+might want to share the core with other threads.
+
+Once threaded busy polling is enabled for a NAPI, the PID of the kthread can be
+fetched using the netlink interface so that the affinity, priority and
+scheduler can be configured.
+
+For example, the following script can be used to fetch the pid:
+
+.. code-block:: bash
+
+  $ kernel-source/tools/net/ynl/pyynl/cli.py \
+            --spec Documentation/netlink/specs/netdev.yaml \
+            --do napi-get \
+            --json='{"id": 66}'
+
+This will output something like the following. The pid `258` is the PID of the
+kthread that is polling this NAPI.
+
+.. code-block:: bash
+
+  $ {'defer-hard-irqs': 0,
+     'gro-flush-timeout': 0,
+     'id': 66,
+     'ifindex': 2,
+     'irq-suspend-timeout': 0,
+     'pid': 258,
+     'threaded': 'enable'}
+
 .. _threaded:
 
 Threaded NAPI
diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
index c571614b1d50..513328476770 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
@@ -2688,7 +2688,7 @@ static int atl1c_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	adapter->mii.mdio_write = atl1c_mdio_write;
 	adapter->mii.phy_id_mask = 0x1f;
 	adapter->mii.reg_num_mask = MDIO_CTRL_REG_MASK;
-	dev_set_threaded(netdev, true);
+	dev_set_threaded(netdev, NETDEV_NAPI_THREADED_ENABLE);
 	for (i = 0; i < adapter->rx_queue_count; ++i)
 		netif_napi_add(netdev, &adapter->rrd_ring[i].napi,
 			       atl1c_clean_rx);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.c b/drivers/net/ethernet/mellanox/mlxsw/pci.c
index 5b44c931b660..52f1ffb77a3c 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci.c
@@ -156,7 +156,7 @@ static int mlxsw_pci_napi_devs_init(struct mlxsw_pci *mlxsw_pci)
 	}
 	strscpy(mlxsw_pci->napi_dev_rx->name, "mlxsw_rx",
 		sizeof(mlxsw_pci->napi_dev_rx->name));
-	dev_set_threaded(mlxsw_pci->napi_dev_rx, true);
+	dev_set_threaded(mlxsw_pci->napi_dev_rx, NETDEV_NAPI_THREADED_ENABLE);
 
 	return 0;
 
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index c9f4976a3527..12e4f68c0c8f 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -3075,7 +3075,7 @@ static int ravb_probe(struct platform_device *pdev)
 	if (info->coalesce_irqs) {
 		netdev_sw_irq_coalesce_default_on(ndev);
 		if (num_present_cpus() == 1)
-			dev_set_threaded(ndev, true);
+			dev_set_threaded(ndev, NETDEV_NAPI_THREADED_ENABLE);
 	}
 
 	/* Network device register */
diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
index d436a874cd5a..52e0de8c3069 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.c
+++ b/drivers/net/wireless/ath/ath10k/snoc.c
@@ -935,7 +935,7 @@ static int ath10k_snoc_hif_start(struct ath10k *ar)
 
 	bitmap_clear(ar_snoc->pending_ce_irqs, 0, CE_COUNT_MAX);
 
-	dev_set_threaded(ar->napi_dev, true);
+	dev_set_threaded(ar->napi_dev, NETDEV_NAPI_THREADED_ENABLE);
 	ath10k_core_napi_enable(ar);
 	ath10k_snoc_irq_enable(ar);
 	ath10k_snoc_rx_post(ar);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a0e485722ed9..c3069a17fa7e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -352,7 +352,7 @@ struct napi_config {
 	u64 gro_flush_timeout;
 	u64 irq_suspend_timeout;
 	u32 defer_hard_irqs;
-	bool threaded;
+	u8 threaded;
 	unsigned int napi_id;
 };
 
@@ -410,6 +410,8 @@ enum {
 	NAPI_STATE_PREFER_BUSY_POLL,	/* prefer busy-polling over softirq processing*/
 	NAPI_STATE_THREADED,		/* The poll is performed inside its own thread*/
 	NAPI_STATE_SCHED_THREADED,	/* Napi is currently scheduled in threaded mode */
+	NAPI_STATE_THREADED_BUSY_POLL,	/* The threaded napi poller will busy poll */
+	NAPI_STATE_SCHED_THREADED_BUSY_POLL,  /* The threaded napi poller is busy polling */
 };
 
 enum {
@@ -423,8 +425,14 @@ enum {
 	NAPIF_STATE_PREFER_BUSY_POLL	= BIT(NAPI_STATE_PREFER_BUSY_POLL),
 	NAPIF_STATE_THREADED		= BIT(NAPI_STATE_THREADED),
 	NAPIF_STATE_SCHED_THREADED	= BIT(NAPI_STATE_SCHED_THREADED),
+	NAPIF_STATE_THREADED_BUSY_POLL	= BIT(NAPI_STATE_THREADED_BUSY_POLL),
+	NAPIF_STATE_SCHED_THREADED_BUSY_POLL
+				= BIT(NAPI_STATE_SCHED_THREADED_BUSY_POLL),
 };
 
+#define NAPIF_STATE_THREADED_BUSY_POLL_MASK \
+	(NAPIF_STATE_THREADED | NAPIF_STATE_THREADED_BUSY_POLL)
+
 enum gro_result {
 	GRO_MERGED,
 	GRO_MERGED_FREE,
@@ -571,16 +579,18 @@ static inline bool napi_complete(struct napi_struct *n)
 	return napi_complete_done(n, 0);
 }
 
-int dev_set_threaded(struct net_device *dev, bool threaded);
+int dev_set_threaded(struct net_device *dev,
+		     enum netdev_napi_threaded threaded);
 
 /*
  * napi_set_threaded - set napi threaded state
  * @napi: NAPI context
- * @threaded: whether this napi does threaded polling
+ * @threaded: threading mode
  *
  * Return 0 on success and negative errno on failure.
  */
-int napi_set_threaded(struct napi_struct *napi, bool threaded);
+int napi_set_threaded(struct napi_struct *napi,
+		      enum netdev_napi_threaded threaded);
 
 void napi_disable(struct napi_struct *n);
 void napi_disable_locked(struct napi_struct *n);
@@ -2404,7 +2414,7 @@ struct net_device {
 	struct sfp_bus		*sfp_bus;
 	struct lock_class_key	*qdisc_tx_busylock;
 	bool			proto_down;
-	bool			threaded;
+	u8			threaded;
 
 	/* priv_flags_slow, ungrouped to save space */
 	unsigned long		see_all_hwtstamp_requests:1;
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index 829648b2ef65..c2a9dbb361f6 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -74,6 +74,12 @@ enum netdev_qstats_scope {
 	NETDEV_QSTATS_SCOPE_QUEUE = 1,
 };
 
+enum netdev_napi_threaded {
+	NETDEV_NAPI_THREADED_DISABLE,
+	NETDEV_NAPI_THREADED_ENABLE,
+	NETDEV_NAPI_THREADED_BUSY_POLL_ENABLE,
+};
+
 enum {
 	NETDEV_A_DEV_IFINDEX = 1,
 	NETDEV_A_DEV_PAD,
diff --git a/net/core/dev.c b/net/core/dev.c
index d5dcf9dd6225..1964c184ce8a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -78,6 +78,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -6403,7 +6404,8 @@ bool napi_complete_done(struct napi_struct *n, int work_done)
 	 * the guarantee we will be called later.
*/ if (unlikely(n->state & (NAPIF_STATE_NPSVC | - NAPIF_STATE_IN_BUSY_POLL))) + NAPIF_STATE_IN_BUSY_POLL | + NAPIF_STATE_SCHED_THREADED_BUSY_POLL))) return false; if (work_done) { @@ -6792,8 +6794,10 @@ static void init_gro_hash(struct napi_struct *napi) napi->gro_bitmask = 0; } -int napi_set_threaded(struct napi_struct *napi, bool threaded) +int napi_set_threaded(struct napi_struct *napi, + enum netdev_napi_threaded threaded) { + unsigned long val; if (napi->dev->threaded) return -EINVAL; @@ -6811,14 +6815,20 @@ int napi_set_threaded(struct napi_struct *napi, bool threaded) /* Make sure kthread is created before THREADED bit is set. */ smp_mb__before_atomic(); - assign_bit(NAPI_STATE_THREADED, &napi->state, threaded); + val = 0; + if (threaded == NETDEV_NAPI_THREADED_BUSY_POLL_ENABLE) + val |= NAPIF_STATE_THREADED_BUSY_POLL; + if (threaded) + val |= NAPIF_STATE_THREADED; + set_mask_bits(&napi->state, NAPIF_STATE_THREADED_BUSY_POLL_MASK, val); return 0; } -int dev_set_threaded(struct net_device *dev, bool threaded) +int dev_set_threaded(struct net_device *dev, enum netdev_napi_threaded threaded) { struct napi_struct *napi; + unsigned long val; int err = 0; netdev_assert_locked_or_invisible(dev); @@ -6826,17 +6836,22 @@ int dev_set_threaded(struct net_device *dev, bool threaded) if (dev->threaded == threaded) return 0; + val = 0; if (threaded) { /* Check if threaded is set at napi level already */ list_for_each_entry(napi, &dev->napi_list, dev_list) if (test_bit(NAPI_STATE_THREADED, &napi->state)) return -EINVAL; + val |= NAPIF_STATE_THREADED; + if (threaded == NETDEV_NAPI_THREADED_BUSY_POLL_ENABLE) + val |= NAPIF_STATE_THREADED_BUSY_POLL; + list_for_each_entry(napi, &dev->napi_list, dev_list) { if (!napi->thread) { err = napi_kthread_create(napi); if (err) { - threaded = false; + threaded = NETDEV_NAPI_THREADED_DISABLE; break; } } @@ -6855,9 +6870,13 @@ int dev_set_threaded(struct net_device *dev, bool threaded) * polled. 
In this case, the switch between threaded mode and * softirq mode will happen in the next round of napi_schedule(). * This should not cause hiccups/stalls to the live traffic. + * + * Switch to busy_poll threaded napi will occur after the threaded + * napi is scheduled. */ list_for_each_entry(napi, &dev->napi_list, dev_list) - assign_bit(NAPI_STATE_THREADED, &napi->state, threaded); + set_mask_bits(&napi->state, + NAPIF_STATE_THREADED_BUSY_POLL_MASK, val); return err; } @@ -7235,7 +7254,7 @@ static int napi_thread_wait(struct napi_struct *napi) return -1; } -static void napi_threaded_poll_loop(struct napi_struct *napi) +static void napi_threaded_poll_loop(struct napi_struct *napi, bool busy_poll) { struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; struct softnet_data *sd; @@ -7264,22 +7283,53 @@ static void napi_threaded_poll_loop(struct napi_struct *napi) } skb_defer_free_flush(sd); bpf_net_ctx_clear(bpf_net_ctx); + + /* Push the skbs up the stack if busy polling. */ + if (busy_poll) + __napi_gro_flush_helper(napi); local_bh_enable(); - if (!repoll) + /* If busy polling then do not break here because we need to + * call cond_resched and rcu_softirq_qs_periodic to prevent + * watchdog warnings. + */ + if (!repoll && !busy_poll) break; rcu_softirq_qs_periodic(last_qs); cond_resched(); + + if (!repoll) + break; } } static int napi_threaded_poll(void *data) { struct napi_struct *napi = data; + bool busy_poll_sched; + unsigned long val; + bool busy_poll; + + while (!napi_thread_wait(napi)) { + /* Once woken up, this means that we are scheduled as threaded + * napi and this thread owns the napi context, if busy poll + * state is set then we busy poll this napi. + */ + val = READ_ONCE(napi->state); + busy_poll = val & NAPIF_STATE_THREADED_BUSY_POLL; + busy_poll_sched = val & NAPIF_STATE_SCHED_THREADED_BUSY_POLL; + + /* Do not busy poll if napi is disabled. 
*/ + if (unlikely(val & NAPIF_STATE_DISABLE)) + busy_poll = false; + + if (busy_poll != busy_poll_sched) + assign_bit(NAPI_STATE_SCHED_THREADED_BUSY_POLL, + &napi->state, busy_poll); - while (!napi_thread_wait(napi)) - napi_threaded_poll_loop(napi); + napi_threaded_poll_loop(napi, busy_poll); + } return 0; } @@ -12474,7 +12524,7 @@ static void run_backlog_napi(unsigned int cpu) { struct softnet_data *sd = per_cpu_ptr(&softnet_data, cpu); - napi_threaded_poll_loop(&sd->backlog); + napi_threaded_poll_loop(&sd->backlog, false); } static void backlog_napi_setup(unsigned int cpu) diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index 07cb99b114bd..beb496bcb633 100644 --- a/net/core/net-sysfs.c +++ b/net/core/net-sysfs.c @@ -657,7 +657,7 @@ static int modify_napi_threaded(struct net_device *dev, unsigned long val) if (list_empty(&dev->napi_list)) return -EOPNOTSUPP; - if (val != 0 && val != 1) + if (val > NETDEV_NAPI_THREADED_BUSY_POLL_ENABLE) return -EOPNOTSUPP; ret = dev_set_threaded(dev, val); diff --git a/net/core/netdev-genl-gen.c b/net/core/netdev-genl-gen.c index a1f80e687f53..b572beba42e7 100644 --- a/net/core/netdev-genl-gen.c +++ b/net/core/netdev-genl-gen.c @@ -97,7 +97,7 @@ static const struct nla_policy netdev_napi_set_nl_policy[NETDEV_A_NAPI_THREADED [NETDEV_A_NAPI_DEFER_HARD_IRQS] = NLA_POLICY_FULL_RANGE(NLA_U32, &netdev_a_napi_defer_hard_irqs_range), [NETDEV_A_NAPI_GRO_FLUSH_TIMEOUT] = { .type = NLA_UINT, }, [NETDEV_A_NAPI_IRQ_SUSPEND_TIMEOUT] = { .type = NLA_UINT, }, - [NETDEV_A_NAPI_THREADED] = NLA_POLICY_MAX(NLA_U32, 1), + [NETDEV_A_NAPI_THREADED] = NLA_POLICY_MAX(NLA_U32, 2), }; /* Ops table for netdev */ diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index 208c3dd768ec..7ae5f3ed0961 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -329,7 +329,7 @@ netdev_nl_napi_set_config(struct napi_struct *napi, struct genl_info *info) if (info->attrs[NETDEV_A_NAPI_THREADED]) { threaded = 
nla_get_u32(info->attrs[NETDEV_A_NAPI_THREADED]); - napi_set_threaded(napi, !!threaded); + napi_set_threaded(napi, threaded); } if (info->attrs[NETDEV_A_NAPI_DEFER_HARD_IRQS]) { diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h index 829648b2ef65..c2a9dbb361f6 100644 --- a/tools/include/uapi/linux/netdev.h +++ b/tools/include/uapi/linux/netdev.h @@ -74,6 +74,12 @@ enum netdev_qstats_scope { NETDEV_QSTATS_SCOPE_QUEUE = 1, }; +enum netdev_napi_threaded { + NETDEV_NAPI_THREADED_DISABLE, + NETDEV_NAPI_THREADED_ENABLE, + NETDEV_NAPI_THREADED_BUSY_POLL_ENABLE, +}; + enum { NETDEV_A_DEV_IFINDEX = 1, NETDEV_A_DEV_PAD, From patchwork Wed Feb 5 00:10:52 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Samiullah Khawaja X-Patchwork-Id: 13960235 X-Patchwork-Delegate: kuba@kernel.org Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C9685173 for ; Wed, 5 Feb 2025 00:11:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.73 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738714262; cv=none; b=EvJdMnw//g0MKGZMPNYkyNRHZ2teaMMnDowavU1LykhhgTpb8vBTmiOS0mg8kQREIFqGe59LsBb+oFS3KY+XHPSj2TtRYOnlU9bIRWFiVDZK2oFCTHKESsPQtHW7QQYuE+qXgHQ/QR+BEYBDE4qsE5dCmgxaWP6JUB9BL8mxDUc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738714262; c=relaxed/simple; bh=AiVG/9rO+Mccot2xBfYlRHIcLJPs5TaZbdGL/nCuPQQ=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=LMqeomN6Jwqm85sPsySv21rsGjTj3bOqu7BgfFpTd7pBT/ckVgy08qmiC2Yyoh7HdlJJz6gwHSWWlji3tB0i6kFB7qJv8azi46pxko4IbLB95JPbH8yM/5KEZR5N7AGhAsfiCGfhPr8GxhKqKy1rccE4O0zZqwCSlEal6tD9lss= ARC-Authentication-Results: i=1; 
Date: Wed, 5 Feb 2025 00:10:52 +0000
In-Reply-To: <20250205001052.2590140-1-skhawaja@google.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
References: <20250205001052.2590140-1-skhawaja@google.com>
X-Mailer: git-send-email 2.48.1.362.g079036d154-goog
Message-ID: <20250205001052.2590140-5-skhawaja@google.com>
Subject: [PATCH net-next v3 4/4] selftests: Add napi threaded busy poll test in `busy_poller`
From: Samiullah Khawaja
To: Jakub Kicinski, "David S . Miller", Eric Dumazet, Paolo Abeni, almasrymina@google.com
Cc: netdev@vger.kernel.org, skhawaja@google.com

Add testcase to run busy poll test with threaded napi busy poll enabled.

Signed-off-by: Samiullah Khawaja
---
 tools/testing/selftests/net/busy_poll_test.sh | 25 ++++++++++++++++++-
 tools/testing/selftests/net/busy_poller.c     | 14 ++++++++---
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/net/busy_poll_test.sh b/tools/testing/selftests/net/busy_poll_test.sh
index 7db292ec4884..aeca610dc989 100755
--- a/tools/testing/selftests/net/busy_poll_test.sh
+++ b/tools/testing/selftests/net/busy_poll_test.sh
@@ -27,6 +27,9 @@ NAPI_DEFER_HARD_IRQS=100
 GRO_FLUSH_TIMEOUT=50000
 SUSPEND_TIMEOUT=20000000
 
+# NAPI threaded busy poll config
+NAPI_THREADED_POLL=2
+
 setup_ns()
 {
 	set -e
@@ -62,6 +65,9 @@ cleanup_ns()
 test_busypoll()
 {
 	suspend_value=${1:-0}
+	napi_threaded_value=${2:-0}
+	prefer_busy_poll_value=${3:-$PREFER_BUSY_POLL}
+
 	tmp_file=$(mktemp)
 	out_file=$(mktemp)
 
@@ -73,10 +79,11 @@ test_busypoll()
 					     -b${SERVER_IP} \
 					     -m${MAX_EVENTS} \
 					     -u${BUSY_POLL_USECS} \
-					     -P${PREFER_BUSY_POLL} \
+					     -P${prefer_busy_poll_value} \
 					     -g${BUSY_POLL_BUDGET} \
 					     -i${NSIM_SV_IFIDX} \
 					     -s${suspend_value} \
+					     -t${napi_threaded_value} \
 					     -o${out_file}&
 
 	wait_local_port_listen nssv ${SERVER_PORT} tcp
@@ -109,6 +116,15 @@ test_busypoll_with_suspend()
 	return $?
 }
 
+test_busypoll_with_napi_threaded()
+{
+	# Only enable napi threaded poll. Set suspend timeout and prefer busy
+	# poll to 0.
+	test_busypoll 0 ${NAPI_THREADED_POLL} 0
+
+	return $?
+}
+
 ###
 ### Code start
 ###
@@ -154,6 +170,13 @@ if [ $? -ne 0 ]; then
 	exit 1
 fi
 
+test_busypoll_with_napi_threaded
+if [ $? -ne 0 ]; then
+	echo "test_busypoll_with_napi_threaded failed"
+	cleanup_ns
+	exit 1
+fi
+
 echo "$NSIM_SV_FD:$NSIM_SV_IFIDX" > $NSIM_DEV_SYS_UNLINK
 
 echo $NSIM_CL_ID > $NSIM_DEV_SYS_DEL
diff --git a/tools/testing/selftests/net/busy_poller.c b/tools/testing/selftests/net/busy_poller.c
index 04c7ff577bb8..f7407f09f635 100644
--- a/tools/testing/selftests/net/busy_poller.c
+++ b/tools/testing/selftests/net/busy_poller.c
@@ -65,15 +65,16 @@ static uint32_t cfg_busy_poll_usecs;
 static uint16_t cfg_busy_poll_budget;
 static uint8_t cfg_prefer_busy_poll;
 
-/* IRQ params */
+/* NAPI params */
 static uint32_t cfg_defer_hard_irqs;
 static uint64_t cfg_gro_flush_timeout;
 static uint64_t cfg_irq_suspend_timeout;
+static enum netdev_napi_threaded cfg_napi_threaded_poll = NETDEV_NAPI_THREADED_DISABLE;
 
 static void usage(const char *filepath)
 {
 	error(1, 0,
-	      "Usage: %s -p -b -m -u -P -g -o -d -r -s -i",
+	      "Usage: %s -p -b -m -u -P -g -o -d -r -s -t -i",
 	      filepath);
 }
 
@@ -86,7 +87,7 @@ static void parse_opts(int argc, char **argv)
 	if (argc <= 1)
 		usage(argv[0]);
 
-	while ((c = getopt(argc, argv, "p:m:b:u:P:g:o:d:r:s:i:")) != -1) {
+	while ((c = getopt(argc, argv, "p:m:b:u:P:g:o:d:r:s:i:t:")) != -1) {
 		/* most options take integer values, except o and b, so reduce
 		 * code duplication a bit for the common case by calling
 		 * strtoull here and leave bounds checking and casting per
@@ -168,6 +169,12 @@ static void parse_opts(int argc, char **argv)
 
 			cfg_ifindex = (int)tmp;
 			break;
+		case 't':
+			if (tmp == ULLONG_MAX || tmp > 2)
+				error(1, ERANGE, "napi threaded poll value must be 0-2");
+
+			cfg_napi_threaded_poll = (enum netdev_napi_threaded)tmp;
+			break;
 		}
 	}
 
@@ -246,6 +253,7 @@ static void setup_queue(void)
 						  cfg_gro_flush_timeout);
 	netdev_napi_set_req_set_irq_suspend_timeout(set_req,
 						    cfg_irq_suspend_timeout);
+	netdev_napi_set_req_set_threaded(set_req, cfg_napi_threaded_poll);
 
 	if (netdev_napi_set(ys, set_req))
 		error(1, 0, "can't set NAPI params: %s\n", yerr.msg);