From patchwork Thu Jan 2 19:12:25 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Samiullah Khawaja
X-Patchwork-Id: 13924892
X-Patchwork-Delegate: kuba@kernel.org
Date: Thu, 2 Jan 2025 19:12:25 +0000
In-Reply-To: <20250102191227.2084046-1-skhawaja@google.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
References: <20250102191227.2084046-1-skhawaja@google.com>
X-Mailer: git-send-email 2.47.1.613.gc27f4b7a9f-goog
Message-ID: <20250102191227.2084046-2-skhawaja@google.com>
Subject: [PATCH net-next 1/3] Add support to set napi threaded for individual napi
From: Samiullah Khawaja
To: Jakub Kicinski, "David S. Miller", Eric Dumazet, Paolo Abeni
Cc: netdev@vger.kernel.org, skhawaja@google.com

A net device has a `threaded` sysfs attribute that can be used to enable
threaded napi polling on all of the NAPI contexts under that device. Allow
enabling threaded napi polling at the individual napi level using netlink.

Add a new netlink operation `napi-set-threaded` that takes the napi `id`
and `threaded` attributes. It enables or disables threaded polling on that
napi context.

Tested using the following command in qemu/virtio-net:
./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
  --do napi-set-threaded --json '{"id": 513, "threaded": 1}'

Signed-off-by: Samiullah Khawaja
Reviewed-by: Willem de Bruijn
---
 Documentation/netlink/specs/netdev.yaml | 19 +++++++++++++
 include/linux/netdevice.h               |  9 ++++++
 include/uapi/linux/netdev.h             |  2 ++
 net/core/dev.c                          | 26 +++++++++++++++++
 net/core/netdev-genl-gen.c              | 13 +++++++++
 net/core/netdev-genl-gen.h              |  2 ++
 net/core/netdev-genl.c                  | 37 +++++++++++++++++++++++++
 tools/include/uapi/linux/netdev.h       |  2 ++
 8 files changed, 110 insertions(+)

diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index cbb544bd6c84..aac343af7246 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -268,6 +268,14 @@ attribute-sets:
         doc: The timeout, in nanoseconds, of how long to suspend irq
              processing, if event polling finds events
         type: uint
+      -
+        name: threaded
+        doc: Whether the napi is configured to operate in threaded polling
+             mode. If this is set to `1` then the NAPI context operates
+             in threaded polling mode.
+        type: u32
+        checks:
+          max: 1
   -
     name: queue
     attributes:
@@ -659,6 +667,7 @@ operations:
             - defer-hard-irqs
             - gro-flush-timeout
             - irq-suspend-timeout
+            - threaded
       dump:
         request:
           attributes:
@@ -711,6 +720,16 @@ operations:
             - defer-hard-irqs
             - gro-flush-timeout
             - irq-suspend-timeout
+    -
+      name: napi-set-threaded
+      doc: Set threaded napi mode on this napi.
+      attribute-set: napi
+      flags: [ admin-perm ]
+      do:
+        request:
+          attributes:
+            - id
+            - threaded
 
 kernel-family:
   headers: [ "linux/list.h"]
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2593019ad5b1..8f531d528869 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -570,6 +570,15 @@ static inline bool napi_complete(struct napi_struct *n)
 
 int dev_set_threaded(struct net_device *dev, bool threaded);
 
+/*
+ * napi_set_threaded - set napi threaded state
+ * @napi: NAPI context
+ * @threaded: whether this napi does threaded polling
+ *
+ * Return 0 on success and negative errno on failure.
+ */
+int napi_set_threaded(struct napi_struct *napi, bool threaded);
+
 /**
  * napi_disable - prevent NAPI from scheduling
  * @n: NAPI context
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index e4be227d3ad6..cefbb8f39ae7 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -125,6 +125,7 @@ enum {
 	NETDEV_A_NAPI_DEFER_HARD_IRQS,
 	NETDEV_A_NAPI_GRO_FLUSH_TIMEOUT,
 	NETDEV_A_NAPI_IRQ_SUSPEND_TIMEOUT,
+	NETDEV_A_NAPI_THREADED,
 
 	__NETDEV_A_NAPI_MAX,
 	NETDEV_A_NAPI_MAX = (__NETDEV_A_NAPI_MAX - 1)
@@ -203,6 +204,7 @@ enum {
 	NETDEV_CMD_QSTATS_GET,
 	NETDEV_CMD_BIND_RX,
 	NETDEV_CMD_NAPI_SET,
+	NETDEV_CMD_NAPI_SET_THREADED,
 
 	__NETDEV_CMD_MAX,
 	NETDEV_CMD_MAX = (__NETDEV_CMD_MAX - 1)
diff --git a/net/core/dev.c b/net/core/dev.c
index c7f3dea3e0eb..3c95994323ea 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6628,6 +6628,27 @@ static void init_gro_hash(struct napi_struct *napi)
 	napi->gro_bitmask = 0;
 }
 
+int napi_set_threaded(struct napi_struct *napi, bool threaded)
+{
+	if (napi->dev->threaded)
+		return -EINVAL;
+
+	if (threaded) {
+		if (!napi->thread) {
+			int err = napi_kthread_create(napi);
+
+			if (err)
+				return err;
+		}
+	}
+
+	/* Make sure kthread is created before THREADED bit is set. */
+	smp_mb__before_atomic();
+	assign_bit(NAPI_STATE_THREADED, &napi->state, threaded);
+
+	return 0;
+}
+
 int dev_set_threaded(struct net_device *dev, bool threaded)
 {
 	struct napi_struct *napi;
@@ -6637,6 +6658,11 @@ int dev_set_threaded(struct net_device *dev, bool threaded)
 		return 0;
 
 	if (threaded) {
+		/* Check if threaded is set at napi level already */
+		list_for_each_entry(napi, &dev->napi_list, dev_list)
+			if (test_bit(NAPI_STATE_THREADED, &napi->state))
+				return -EINVAL;
+
 		list_for_each_entry(napi, &dev->napi_list, dev_list) {
 			if (!napi->thread) {
 				err = napi_kthread_create(napi);
diff --git a/net/core/netdev-genl-gen.c b/net/core/netdev-genl-gen.c
index a89cbd8d87c3..93dc74dad6de 100644
--- a/net/core/netdev-genl-gen.c
+++ b/net/core/netdev-genl-gen.c
@@ -99,6 +99,12 @@ static const struct nla_policy netdev_napi_set_nl_policy[NETDEV_A_NAPI_IRQ_SUSPE
 	[NETDEV_A_NAPI_IRQ_SUSPEND_TIMEOUT] = { .type = NLA_UINT, },
 };
 
+/* NETDEV_CMD_NAPI_SET_THREADED - do */
+static const struct nla_policy netdev_napi_set_threaded_nl_policy[NETDEV_A_NAPI_THREADED + 1] = {
+	[NETDEV_A_NAPI_ID] = { .type = NLA_U32, },
+	[NETDEV_A_NAPI_THREADED] = NLA_POLICY_MAX(NLA_U32, 1),
+};
+
 /* Ops table for netdev */
 static const struct genl_split_ops netdev_nl_ops[] = {
 	{
@@ -190,6 +196,13 @@ static const struct genl_split_ops netdev_nl_ops[] = {
 		.maxattr	= NETDEV_A_NAPI_IRQ_SUSPEND_TIMEOUT,
 		.flags		= GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
 	},
+	{
+		.cmd		= NETDEV_CMD_NAPI_SET_THREADED,
+		.doit		= netdev_nl_napi_set_threaded_doit,
+		.policy		= netdev_napi_set_threaded_nl_policy,
+		.maxattr	= NETDEV_A_NAPI_THREADED,
+		.flags		= GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
+	},
 };
 
 static const struct genl_multicast_group netdev_nl_mcgrps[] = {
diff --git a/net/core/netdev-genl-gen.h b/net/core/netdev-genl-gen.h
index e09dd7539ff2..00c229569b7a 100644
--- a/net/core/netdev-genl-gen.h
+++ b/net/core/netdev-genl-gen.h
@@ -34,6 +34,8 @@ int netdev_nl_qstats_get_dumpit(struct sk_buff *skb,
 				struct netlink_callback *cb);
 int netdev_nl_bind_rx_doit(struct sk_buff *skb, struct genl_info *info);
 int netdev_nl_napi_set_doit(struct sk_buff *skb, struct genl_info *info);
+int netdev_nl_napi_set_threaded_doit(struct sk_buff *skb,
+				     struct genl_info *info);
 
 enum {
 	NETDEV_NLGRP_MGMT,
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index 2d3ae0cd3ad2..ace22b24be7e 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -186,6 +186,9 @@ netdev_nl_napi_fill_one(struct sk_buff *rsp, struct napi_struct *napi,
 	if (napi->irq >= 0 && nla_put_u32(rsp, NETDEV_A_NAPI_IRQ, napi->irq))
 		goto nla_put_failure;
 
+	if (nla_put_u32(rsp, NETDEV_A_NAPI_THREADED, !!napi->thread))
+		goto nla_put_failure;
+
 	if (napi->thread) {
 		pid = task_pid_nr(napi->thread);
 		if (nla_put_u32(rsp, NETDEV_A_NAPI_PID, pid))
@@ -311,6 +314,40 @@ int netdev_nl_napi_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
 	return err;
 }
 
+int netdev_nl_napi_set_threaded_doit(struct sk_buff *skb,
+				     struct genl_info *info)
+{
+	struct napi_struct *napi;
+	u32 napi_threaded;
+	u32 napi_id;
+	int err = 0;
+
+	if (GENL_REQ_ATTR_CHECK(info, NETDEV_A_NAPI_ID) ||
+	    GENL_REQ_ATTR_CHECK(info, NETDEV_A_NAPI_THREADED))
+		return -EINVAL;
+
+	napi_id = nla_get_u32(info->attrs[NETDEV_A_NAPI_ID]);
+	napi_threaded = nla_get_u32(info->attrs[NETDEV_A_NAPI_THREADED]);
+
+	rtnl_lock();
+
+	napi = napi_by_id(napi_id);
+	if (!napi) {
+		NL_SET_BAD_ATTR(info->extack, info->attrs[NETDEV_A_NAPI_ID]);
+		err = -ENOENT;
+		goto napi_set_threaded_failure;
+	}
+
+	err = napi_set_threaded(napi, napi_threaded);
+	if (err)
+		NL_SET_ERR_MSG(info->extack,
+			       "unable to set threaded state of napi");
+
+napi_set_threaded_failure:
+	rtnl_unlock();
+	return err;
+}
+
 static int
 netdev_nl_napi_set_config(struct napi_struct *napi, struct genl_info *info)
 {
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index e4be227d3ad6..cefbb8f39ae7 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -125,6 +125,7 @@ enum {
 	NETDEV_A_NAPI_DEFER_HARD_IRQS,
 	NETDEV_A_NAPI_GRO_FLUSH_TIMEOUT,
 	NETDEV_A_NAPI_IRQ_SUSPEND_TIMEOUT,
+	NETDEV_A_NAPI_THREADED,
 
 	__NETDEV_A_NAPI_MAX,
 	NETDEV_A_NAPI_MAX = (__NETDEV_A_NAPI_MAX - 1)
@@ -203,6 +204,7 @@ enum {
 	NETDEV_CMD_QSTATS_GET,
 	NETDEV_CMD_BIND_RX,
 	NETDEV_CMD_NAPI_SET,
+	NETDEV_CMD_NAPI_SET_THREADED,
 
 	__NETDEV_CMD_MAX,
 	NETDEV_CMD_MAX = (__NETDEV_CMD_MAX - 1)

From patchwork Thu Jan 2 19:12:26 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Samiullah Khawaja
X-Patchwork-Id: 13924893
X-Patchwork-Delegate: kuba@kernel.org
Date: Thu, 2 Jan 2025 19:12:26 +0000
In-Reply-To: <20250102191227.2084046-1-skhawaja@google.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
References: <20250102191227.2084046-1-skhawaja@google.com>
X-Mailer: git-send-email 2.47.1.613.gc27f4b7a9f-goog
Message-ID: <20250102191227.2084046-3-skhawaja@google.com>
Subject: [PATCH net-next 2/3] net: Create separate gro_flush helper function
From: Samiullah Khawaja
To: Jakub Kicinski, "David S. Miller", Eric Dumazet, Paolo Abeni
Cc: netdev@vger.kernel.org, skhawaja@google.com

Move the duplicated code that calls `napi_gro_flush` and `gro_normal_list`
into a separate helper function.

Signed-off-by: Samiullah Khawaja
Reviewed-by: Willem de Bruijn
---
 net/core/dev.c | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 3c95994323ea..762977a62da2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6325,6 +6325,17 @@ static void skb_defer_free_flush(struct softnet_data *sd)
 	}
 }
 
+static void __napi_gro_flush_helper(struct napi_struct *napi)
+{
+	if (napi->gro_bitmask) {
+		/* flush too old packets
+		 * If HZ < 1000, flush all packets.
+		 */
+		napi_gro_flush(napi, HZ >= 1000);
+	}
+	gro_normal_list(napi);
+}
+
 #if defined(CONFIG_NET_RX_BUSY_POLL)
 
 static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule)
@@ -6335,14 +6346,8 @@ static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule)
 		return;
 	}
 
-	if (napi->gro_bitmask) {
-		/* flush too old packets
-		 * If HZ < 1000, flush all packets.
-		 */
-		napi_gro_flush(napi, HZ >= 1000);
-	}
+	__napi_gro_flush_helper(napi);
 
-	gro_normal_list(napi);
 	clear_bit(NAPI_STATE_SCHED, &napi->state);
 }
 
@@ -6942,14 +6947,7 @@ static int __napi_poll(struct napi_struct *n, bool *repoll)
 		return work;
 	}
 
-	if (n->gro_bitmask) {
-		/* flush too old packets
-		 * If HZ < 1000, flush all packets.
-		 */
-		napi_gro_flush(n, HZ >= 1000);
-	}
-
-	gro_normal_list(n);
+	__napi_gro_flush_helper(n);
 
 	/* Some drivers may have called napi_schedule
 	 * prior to exhausting their budget.

From patchwork Thu Jan 2 19:12:27 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Samiullah Khawaja
X-Patchwork-Id: 13924894
X-Patchwork-Delegate: kuba@kernel.org
Date: Thu, 2 Jan 2025 19:12:27 +0000
In-Reply-To: <20250102191227.2084046-1-skhawaja@google.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
References: <20250102191227.2084046-1-skhawaja@google.com>
X-Mailer: git-send-email 2.47.1.613.gc27f4b7a9f-goog
Message-ID: <20250102191227.2084046-4-skhawaja@google.com>
Subject: [PATCH net-next 3/3] Extend napi threaded polling to allow kthread based busy polling
From: Samiullah Khawaja
To: Jakub Kicinski, "David S. Miller", Eric Dumazet, Paolo Abeni
Cc: netdev@vger.kernel.org, skhawaja@google.com

Add two new states to the napi state enum:

- NAPI_STATE_THREADED_BUSY_POLL
  Threaded busy poll is enabled for this napi.
- NAPI_STATE_SCHED_THREADED_BUSY_POLL
  The napi kthread is currently busy polling this napi.

The following changes are introduced in the napi scheduling and state
logic:

- When threaded busy poll is enabled through sysfs it also enables
  NAPI_STATE_THREADED so a kthread is created per napi.
  It also sets the NAPI_STATE_THREADED_BUSY_POLL bit on each napi to
  indicate that its kthread is supposed to busy poll it.

- When a napi is scheduled with NAPI_STATE_SCHED_THREADED and the
  associated kthread is woken up, the kthread owns the context. If
  NAPI_STATE_THREADED_BUSY_POLL and NAPI_STATE_SCHED_THREADED are both
  set, the kthread can busy poll.

- To keep busy polling and to avoid re-arming the interrupts,
  napi_complete_done returns false when both the SCHED_THREADED and
  THREADED_BUSY_POLL flags are set. It also returns early so that
  NAPI_STATE_SCHED_THREADED is not cleared.

- If NAPI_STATE_THREADED_BUSY_POLL is cleared at any point,
  napi_complete_done runs to completion and also clears the
  NAPI_STATE_SCHED_THREADED bit. This makes the associated kthread go to
  sleep as per the existing logic.

Signed-off-by: Samiullah Khawaja
Reviewed-by: Willem de Bruijn
---
 Documentation/ABI/testing/sysfs-class-net     |  3 +-
 Documentation/netlink/specs/netdev.yaml       |  5 +-
 .../net/ethernet/atheros/atl1c/atl1c_main.c   |  2 +-
 include/linux/netdevice.h                     | 24 +++++--
 net/core/dev.c                                | 72 ++++++++++++++++---
 net/core/net-sysfs.c                          |  2 +-
 net/core/netdev-genl-gen.c                    |  2 +-
 7 files changed, 89 insertions(+), 21 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-net b/Documentation/ABI/testing/sysfs-class-net
index ebf21beba846..15d7d36a8294 100644
--- a/Documentation/ABI/testing/sysfs-class-net
+++ b/Documentation/ABI/testing/sysfs-class-net
@@ -343,7 +343,7 @@ Date:		Jan 2021
 KernelVersion:	5.12
 Contact:	netdev@vger.kernel.org
 Description:
-		Boolean value to control the threaded mode per device. User could
+		Integer value to control the threaded mode per device. User could
 		set this value to enable/disable threaded mode for all napi
 		belonging to this device, without the need to do device up/down.
 
@@ -351,4 +351,5 @@ Description:
 		== ==================================
 		0  threaded mode disabled for this dev
 		1  threaded mode enabled for this dev
+		2  threaded mode enabled, and busy polling enabled.
 		== ==================================
diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index aac343af7246..9c905243a1cc 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -272,10 +272,11 @@ attribute-sets:
         name: threaded
         doc: Whether the napi is configured to operate in threaded polling
              mode. If this is set to `1` then the NAPI context operates
-             in threaded polling mode.
+             in threaded polling mode. If this is set to `2` then the NAPI
+             kthread also does busy polling.
         type: u32
         checks:
-          max: 1
+          max: 2
   -
     name: queue
     attributes:
diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
index c571614b1d50..a709cddcd292 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
@@ -2688,7 +2688,7 @@ static int atl1c_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	adapter->mii.mdio_write = atl1c_mdio_write;
 	adapter->mii.phy_id_mask = 0x1f;
 	adapter->mii.reg_num_mask = MDIO_CTRL_REG_MASK;
-	dev_set_threaded(netdev, true);
+	dev_set_threaded(netdev, NAPI_THREADED);
 	for (i = 0; i < adapter->rx_queue_count; ++i)
 		netif_napi_add(netdev, &adapter->rrd_ring[i].napi,
 			       atl1c_clean_rx);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 8f531d528869..c384ffe0976e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -407,6 +407,8 @@ enum {
 	NAPI_STATE_PREFER_BUSY_POLL,	/* prefer busy-polling over softirq processing*/
 	NAPI_STATE_THREADED,		/* The poll is performed inside its own thread*/
 	NAPI_STATE_SCHED_THREADED,	/* Napi is currently scheduled in threaded mode */
+	NAPI_STATE_THREADED_BUSY_POLL,	/* The threaded napi poller will busy poll */
+	NAPI_STATE_SCHED_THREADED_BUSY_POLL,	/* The threaded napi poller is busy polling */
 };
 
 enum {
@@ -420,8 +422,14 @@ enum {
 	NAPIF_STATE_PREFER_BUSY_POLL	= BIT(NAPI_STATE_PREFER_BUSY_POLL),
 	NAPIF_STATE_THREADED		= BIT(NAPI_STATE_THREADED),
 	NAPIF_STATE_SCHED_THREADED	= BIT(NAPI_STATE_SCHED_THREADED),
+	NAPIF_STATE_THREADED_BUSY_POLL	= BIT(NAPI_STATE_THREADED_BUSY_POLL),
+	NAPIF_STATE_SCHED_THREADED_BUSY_POLL
+				= BIT(NAPI_STATE_SCHED_THREADED_BUSY_POLL),
 };
 
+#define NAPIF_STATE_THREADED_BUSY_POLL_MASK \
+	(NAPIF_STATE_THREADED | NAPIF_STATE_THREADED_BUSY_POLL)
+
 enum gro_result {
 	GRO_MERGED,
 	GRO_MERGED_FREE,
@@ -568,16 +576,24 @@ static inline bool napi_complete(struct napi_struct *n)
 	return napi_complete_done(n, 0);
 }
 
-int dev_set_threaded(struct net_device *dev, bool threaded);
+enum napi_threaded_state {
+	NAPI_THREADED_OFF = 0,
+	NAPI_THREADED = 1,
+	NAPI_THREADED_BUSY_POLL = 2,
+	NAPI_THREADED_MAX = NAPI_THREADED_BUSY_POLL,
+};
+
+int dev_set_threaded(struct net_device *dev, enum napi_threaded_state threaded);
 
 /*
  * napi_set_threaded - set napi threaded state
  * @napi: NAPI context
- * @threaded: whether this napi does threaded polling
+ * @threaded: threading mode
  *
  * Return 0 on success and negative errno on failure.
  */
-int napi_set_threaded(struct napi_struct *napi, bool threaded);
+int napi_set_threaded(struct napi_struct *napi,
+		      enum napi_threaded_state threaded);
 
 /**
  * napi_disable - prevent NAPI from scheduling
@@ -2406,7 +2422,7 @@ struct net_device {
 	struct sfp_bus		*sfp_bus;
 	struct lock_class_key	*qdisc_tx_busylock;
 	bool			proto_down;
-	bool			threaded;
+	u8			threaded;
 
 	/* priv_flags_slow, ungrouped to save space */
 	unsigned long		see_all_hwtstamp_requests:1;
diff --git a/net/core/dev.c b/net/core/dev.c
index 762977a62da2..b6cd9474bdd3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -78,6 +78,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -6231,7 +6232,8 @@ bool napi_complete_done(struct napi_struct *n, int work_done)
 	 *    the guarantee we will be called later.
 	 */
 	if (unlikely(n->state & (NAPIF_STATE_NPSVC |
-				 NAPIF_STATE_IN_BUSY_POLL)))
+				 NAPIF_STATE_IN_BUSY_POLL |
+				 NAPIF_STATE_SCHED_THREADED_BUSY_POLL)))
 		return false;
 
 	if (work_done) {
@@ -6633,8 +6635,10 @@ static void init_gro_hash(struct napi_struct *napi)
 	napi->gro_bitmask = 0;
 }
 
-int napi_set_threaded(struct napi_struct *napi, bool threaded)
+int napi_set_threaded(struct napi_struct *napi,
+		      enum napi_threaded_state threaded)
 {
+	unsigned long val;
 	if (napi->dev->threaded)
 		return -EINVAL;
 
@@ -6649,30 +6653,41 @@ int napi_set_threaded(struct napi_struct *napi, bool threaded)
 
 	/* Make sure kthread is created before THREADED bit is set. */
 	smp_mb__before_atomic();
-	assign_bit(NAPI_STATE_THREADED, &napi->state, threaded);
+	val = 0;
+	if (threaded == NAPI_THREADED_BUSY_POLL)
+		val |= NAPIF_STATE_THREADED_BUSY_POLL;
+	if (threaded)
+		val |= NAPIF_STATE_THREADED;
+	set_mask_bits(&napi->state, NAPIF_STATE_THREADED_BUSY_POLL_MASK, val);
 
 	return 0;
 }
 
-int dev_set_threaded(struct net_device *dev, bool threaded)
+int dev_set_threaded(struct net_device *dev, enum napi_threaded_state threaded)
 {
 	struct napi_struct *napi;
+	unsigned long val;
 	int err = 0;
 
 	if (dev->threaded == threaded)
 		return 0;
 
+	val = 0;
 	if (threaded) {
 		/* Check if threaded is set at napi level already */
 		list_for_each_entry(napi, &dev->napi_list, dev_list)
 			if (test_bit(NAPI_STATE_THREADED, &napi->state))
 				return -EINVAL;
 
+		val |= NAPIF_STATE_THREADED;
+		if (threaded == NAPI_THREADED_BUSY_POLL)
+			val |= NAPIF_STATE_THREADED_BUSY_POLL;
+
 		list_for_each_entry(napi, &dev->napi_list, dev_list) {
 			if (!napi->thread) {
 				err = napi_kthread_create(napi);
 				if (err) {
-					threaded = false;
+					threaded = NAPI_THREADED_OFF;
 					break;
 				}
 			}
@@ -6691,9 +6706,13 @@ int dev_set_threaded(struct net_device *dev, bool threaded)
 	 * polled. In this case, the switch between threaded mode and
 	 * softirq mode will happen in the next round of napi_schedule().
 	 * This should not cause hiccups/stalls to the live traffic.
+	 *
+	 * Switch to busy_poll threaded napi will occur after the threaded
+	 * napi is scheduled.
 	 */
 	list_for_each_entry(napi, &dev->napi_list, dev_list)
-		assign_bit(NAPI_STATE_THREADED, &napi->state, threaded);
+		set_mask_bits(&napi->state,
+			      NAPIF_STATE_THREADED_BUSY_POLL_MASK, val);
 
 	return err;
 }
@@ -7007,7 +7026,7 @@ static int napi_thread_wait(struct napi_struct *napi)
 	return -1;
 }
 
-static void napi_threaded_poll_loop(struct napi_struct *napi)
+static void napi_threaded_poll_loop(struct napi_struct *napi, bool busy_poll)
 {
 	struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx;
 	struct softnet_data *sd;
@@ -7036,22 +7055,53 @@ static void napi_threaded_poll_loop(struct napi_struct *napi)
 		}
 		skb_defer_free_flush(sd);
 		bpf_net_ctx_clear(bpf_net_ctx);
+
+		/* Push the skbs up the stack if busy polling. */
+		if (busy_poll)
+			__napi_gro_flush_helper(napi);
 		local_bh_enable();
 
-		if (!repoll)
+		/* If busy polling then do not break here because we need to
+		 * call cond_resched and rcu_softirq_qs_periodic to prevent
+		 * watchdog warnings.
+		 */
+		if (!repoll && !busy_poll)
 			break;
 
 		rcu_softirq_qs_periodic(last_qs);
 		cond_resched();
+
+		if (!repoll)
+			break;
 	}
 }
 
 static int napi_threaded_poll(void *data)
 {
 	struct napi_struct *napi = data;
+	bool busy_poll_sched;
+	unsigned long val;
+	bool busy_poll;
+
+	while (!napi_thread_wait(napi)) {
+		/* Once woken up, this means that we are scheduled as threaded
+		 * napi and this thread owns the napi context, if busy poll
+		 * state is set then we busy poll this napi.
+		 */
+		val = READ_ONCE(napi->state);
+		busy_poll = val & NAPIF_STATE_THREADED_BUSY_POLL;
+		busy_poll_sched = val & NAPIF_STATE_SCHED_THREADED_BUSY_POLL;
+
+		/* Do not busy poll if napi is disabled. */
+		if (unlikely(val & NAPIF_STATE_DISABLE))
+			busy_poll = false;
+
+		if (busy_poll != busy_poll_sched)
+			assign_bit(NAPI_STATE_SCHED_THREADED_BUSY_POLL,
+				   &napi->state, busy_poll);
 
-	while (!napi_thread_wait(napi))
-		napi_threaded_poll_loop(napi);
+		napi_threaded_poll_loop(napi, busy_poll);
+	}
 
 	return 0;
 }
@@ -12205,7 +12255,7 @@ static void run_backlog_napi(unsigned int cpu)
 {
 	struct softnet_data *sd = per_cpu_ptr(&softnet_data, cpu);
 
-	napi_threaded_poll_loop(&sd->backlog);
+	napi_threaded_poll_loop(&sd->backlog, false);
 }
 
 static void backlog_napi_setup(unsigned int cpu)
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 2d9afc6e2161..36d0a22e341c 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -626,7 +626,7 @@ static int modify_napi_threaded(struct net_device *dev, unsigned long val)
 	if (list_empty(&dev->napi_list))
 		return -EOPNOTSUPP;
 
-	if (val != 0 && val != 1)
+	if (val > NAPI_THREADED_MAX)
 		return -EOPNOTSUPP;
 
 	ret = dev_set_threaded(dev, val);
diff --git a/net/core/netdev-genl-gen.c b/net/core/netdev-genl-gen.c
index 93dc74dad6de..4086d2577dcc 100644
--- a/net/core/netdev-genl-gen.c
+++ b/net/core/netdev-genl-gen.c
@@ -102,7 +102,7 @@ static const struct nla_policy netdev_napi_set_nl_policy[NETDEV_A_NAPI_IRQ_SUSPE
 /* NETDEV_CMD_NAPI_SET_THREADED - do */
 static const struct nla_policy netdev_napi_set_threaded_nl_policy[NETDEV_A_NAPI_THREADED + 1] = {
 	[NETDEV_A_NAPI_ID] = { .type = NLA_U32, },
-	[NETDEV_A_NAPI_THREADED] = NLA_POLICY_MAX(NLA_U32, 1),
+	[NETDEV_A_NAPI_THREADED] = NLA_POLICY_MAX(NLA_U32, 2),
 };
 
 /* Ops table for netdev */
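
Illustrative note, not part of the series: the way patch 3/3 maps the
`threaded` value (0, 1 or 2) onto the two state bits in napi_set_threaded()
and dev_set_threaded() can be sketched in plain userspace C. Everything
prefixed `demo_`/`DEMO_` below is a hypothetical stand-in: the bit positions
are made up for the example, and demo_set_mask_bits() is a non-atomic
approximation of what the kernel's set_mask_bits() does atomically.

```c
/* Userspace sketch only; assumes made-up bit positions. The kernel derives
 * the real ones from its NAPI_STATE_* enum.
 */
#define DEMO_STATE_SCHED		(1UL << 0)
#define DEMO_STATE_THREADED		(1UL << 1)
#define DEMO_STATE_THREADED_BUSY_POLL	(1UL << 2)
#define DEMO_THREADED_MASK \
	(DEMO_STATE_THREADED | DEMO_STATE_THREADED_BUSY_POLL)

/* Mirrors the 0/1/2 values that the patch accepts via sysfs and netlink. */
enum demo_threaded_state {
	DEMO_THREADED_OFF = 0,
	DEMO_THREADED = 1,
	DEMO_THREADED_BUSY_POLL = 2,
};

/* Non-atomic stand-in for set_mask_bits(): clear the masked bits, then set
 * the requested ones, leaving every other bit untouched.
 */
static unsigned long demo_set_mask_bits(unsigned long state,
					unsigned long mask,
					unsigned long bits)
{
	return (state & ~mask) | bits;
}

/* Compute the new state word for a requested threaded mode. */
static unsigned long demo_apply_threaded(unsigned long state,
					 enum demo_threaded_state threaded)
{
	unsigned long val = 0;

	if (threaded == DEMO_THREADED_BUSY_POLL)
		val |= DEMO_STATE_THREADED_BUSY_POLL;
	if (threaded != DEMO_THREADED_OFF)
		val |= DEMO_STATE_THREADED;

	return demo_set_mask_bits(state, DEMO_THREADED_MASK, val);
}
```

Under these assumptions, demo_apply_threaded(state, DEMO_THREADED) on a state
that previously had both bits set clears the busy-poll bit and keeps only the
threaded bit, which is the reason for updating both bits through one mask
rather than with two separate assign_bit() calls.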