From patchwork Wed Nov 22 03:44:08 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jakub Kicinski
X-Patchwork-Id: 13463908
X-Patchwork-Delegate: kuba@kernel.org
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 308D88C17
	for ; Wed, 22 Nov 2023 03:44:25 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="hqpTUv7w"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 22EEAC433C9;
	Wed, 22 Nov 2023 03:44:25 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1700624665;
	bh=Msjdx20gPtma4cyEI0vBuNnnLjWNIuuVxWWfRfEPMWA=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=hqpTUv7wDc6T+dLsIh4A/lAw75mfUHpqCKfMJ4yZdyurbU3oKkOTka//IKM9rnKaN
	 uegfyOiFPEGpUkq3Rf7sh3G1A5PgKqAvtxALClj9g4EcaaKmPg9/bGvMYNfHMkVV63
	 uKRwLH4Y6uySmz9kTk0bXh7Ri5Jcl+fCViIm/lihHw8WJdfV90e8vt1BiD2L43iTHJ
	 7otzyTHyMLshF/tOCGer5I6qgBkzGr+fWmIEyd6oAOOm0695glF0PAh05PJJP9xs+u
	 foB+BNf7eHv0QkBkcn2C+pCnirDpk0zgm7NHrpY660e116zZGUfZnLHHckFmRV5cmI
	 B8LE4yHYVwP1g==
From: Jakub Kicinski
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, edumazet@google.com, pabeni@redhat.com,
	almasrymina@google.com, hawk@kernel.org, ilias.apalodimas@linaro.org,
	dsahern@gmail.com, dtatulea@nvidia.com, willemb@google.com,
	Jakub Kicinski
Subject: [PATCH net-next v3 01/13] net: page_pool: factor out uninit
Date: Tue, 21 Nov 2023 19:44:08 -0800
Message-ID: <20231122034420.1158898-2-kuba@kernel.org>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20231122034420.1158898-1-kuba@kernel.org>
References: <20231122034420.1158898-1-kuba@kernel.org>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
We'll soon (next change in the series) need a fuller unwind path
in page_pool_create() so create the inverse of page_pool_init().

Reviewed-by: Ilias Apalodimas
Signed-off-by: Jakub Kicinski
Reviewed-by: Eric Dumazet
Acked-by: Jesper Dangaard Brouer
---
 net/core/page_pool.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index df2a06d7da52..2e4575477e71 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -238,6 +238,18 @@ static int page_pool_init(struct page_pool *pool,
 	return 0;
 }
 
+static void page_pool_uninit(struct page_pool *pool)
+{
+	ptr_ring_cleanup(&pool->ring, NULL);
+
+	if (pool->p.flags & PP_FLAG_DMA_MAP)
+		put_device(pool->p.dev);
+
+#ifdef CONFIG_PAGE_POOL_STATS
+	free_percpu(pool->recycle_stats);
+#endif
+}
+
 /**
  * page_pool_create() - create a page pool.
  * @params: parameters, see struct page_pool_params
@@ -821,14 +833,7 @@ static void __page_pool_destroy(struct page_pool *pool)
 	if (pool->disconnect)
 		pool->disconnect(pool);
 
-	ptr_ring_cleanup(&pool->ring, NULL);
-
-	if (pool->p.flags & PP_FLAG_DMA_MAP)
-		put_device(pool->p.dev);
-
-#ifdef CONFIG_PAGE_POOL_STATS
-	free_percpu(pool->recycle_stats);
-#endif
+	page_pool_uninit(pool);
 	kfree(pool);
 }

From patchwork Wed Nov 22 03:44:09 2023
X-Patchwork-Submitter: Jakub Kicinski
X-Patchwork-Id: 13463909
X-Patchwork-Delegate: kuba@kernel.org
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
From: Jakub Kicinski
To: davem@davemloft.net
Subject: [PATCH net-next v3 02/13] net: page_pool: id the page pools
Date: Tue, 21 Nov 2023 19:44:09 -0800
Message-ID: <20231122034420.1158898-3-kuba@kernel.org>
In-Reply-To: <20231122034420.1158898-1-kuba@kernel.org>
References: <20231122034420.1158898-1-kuba@kernel.org>

To give ourselves the flexibility of creating netlink commands and
ability to refer to page pool instances in uAPIs create IDs for page
pools.
Reviewed-by: Ilias Apalodimas
Signed-off-by: Jakub Kicinski
Reviewed-by: Eric Dumazet
Acked-by: Jesper Dangaard Brouer
---
 include/net/page_pool/types.h |  4 ++++
 net/core/Makefile             |  2 +-
 net/core/page_pool.c          | 21 +++++++++++++-----
 net/core/page_pool_priv.h     |  9 +++++++++
 net/core/page_pool_user.c     | 36 +++++++++++++++++++++++++++++++++++
 5 files changed, 66 insertions(+), 6 deletions(-)
 create mode 100644 net/core/page_pool_priv.h
 create mode 100644 net/core/page_pool_user.c

diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index e1bb92c192de..c19f0df3bf0b 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -187,6 +187,10 @@ struct page_pool {
 	/* Slow/Control-path information follows */
 	struct page_pool_params_slow slow;
+	/* User-facing fields, protected by page_pools_lock */
+	struct {
+		u32 id;
+	} user;
 };
 
 struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp);

diff --git a/net/core/Makefile b/net/core/Makefile
index 0cb734cbc24b..821aec06abf1 100644
--- a/net/core/Makefile
+++ b/net/core/Makefile
@@ -18,7 +18,7 @@ obj-y += dev.o dev_addr_lists.o dst.o netevent.o \
 obj-$(CONFIG_NETDEV_ADDR_LIST_TEST) += dev_addr_lists_test.o
 obj-y += net-sysfs.o
-obj-$(CONFIG_PAGE_POOL) += page_pool.o
+obj-$(CONFIG_PAGE_POOL) += page_pool.o page_pool_user.o
 obj-$(CONFIG_PROC_FS) += net-procfs.o
 obj-$(CONFIG_NET_PKTGEN) += pktgen.o
 obj-$(CONFIG_NETPOLL) += netpoll.o

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 2e4575477e71..a8d96ea38d18 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -23,6 +23,8 @@
 #include
 
+#include "page_pool_priv.h"
+
 #define DEFER_TIME (msecs_to_jiffies(1000))
 #define DEFER_WARN_INTERVAL (60 * HZ)
 
@@ -264,13 +266,21 @@ struct page_pool *page_pool_create(const struct page_pool_params *params)
 		return ERR_PTR(-ENOMEM);
 
 	err = page_pool_init(pool, params);
-	if (err < 0) {
-		pr_warn("%s() gave up with errno %d\n", __func__, err);
-		kfree(pool);
-		return ERR_PTR(err);
-	}
+	if (err < 0)
+		goto err_free;
+
+	err = page_pool_list(pool);
+	if (err)
+		goto err_uninit;
 
 	return pool;
+
+err_uninit:
+	page_pool_uninit(pool);
+err_free:
+	pr_warn("%s() gave up with errno %d\n", __func__, err);
+	kfree(pool);
+	return ERR_PTR(err);
 }
 EXPORT_SYMBOL(page_pool_create);
 
@@ -833,6 +843,7 @@ static void __page_pool_destroy(struct page_pool *pool)
 	if (pool->disconnect)
 		pool->disconnect(pool);
 
+	page_pool_unlist(pool);
 	page_pool_uninit(pool);
 	kfree(pool);
 }

diff --git a/net/core/page_pool_priv.h b/net/core/page_pool_priv.h
new file mode 100644
index 000000000000..c17ea092b4ab
--- /dev/null
+++ b/net/core/page_pool_priv.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __PAGE_POOL_PRIV_H
+#define __PAGE_POOL_PRIV_H
+
+int page_pool_list(struct page_pool *pool);
+void page_pool_unlist(struct page_pool *pool);
+
+#endif

diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
new file mode 100644
index 000000000000..630d1eeecf2a
--- /dev/null
+++ b/net/core/page_pool_user.c
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include
+#include
+#include
+
+#include "page_pool_priv.h"
+
+static DEFINE_XARRAY_FLAGS(page_pools, XA_FLAGS_ALLOC1);
+static DEFINE_MUTEX(page_pools_lock);
+
+int page_pool_list(struct page_pool *pool)
+{
+	static u32 id_alloc_next;
+	int err;
+
+	mutex_lock(&page_pools_lock);
+	err = xa_alloc_cyclic(&page_pools, &pool->user.id, pool, xa_limit_32b,
+			      &id_alloc_next, GFP_KERNEL);
+	if (err < 0)
+		goto err_unlock;
+
+	mutex_unlock(&page_pools_lock);
+	return 0;
+
+err_unlock:
+	mutex_unlock(&page_pools_lock);
+	return err;
+}
+
+void page_pool_unlist(struct page_pool *pool)
+{
+	mutex_lock(&page_pools_lock);
+	xa_erase(&page_pools, pool->user.id);
+	mutex_unlock(&page_pools_lock);
+}

From patchwork Wed Nov 22 03:44:10 2023
X-Patchwork-Submitter: Jakub Kicinski
X-Patchwork-Id: 13463910
X-Patchwork-Delegate: kuba@kernel.org
From: Jakub Kicinski
To: davem@davemloft.net
Subject: [PATCH net-next v3 03/13] net: page_pool: record pools per netdev
Date: Tue, 21 Nov 2023 19:44:10 -0800
Message-ID: <20231122034420.1158898-4-kuba@kernel.org>
In-Reply-To: <20231122034420.1158898-1-kuba@kernel.org>
References: <20231122034420.1158898-1-kuba@kernel.org>

Link the page pools with netdevs. This needs to be netns compatible
so we have two options.
Either we record the pools per netns and have to worry about moving
them as the netdev gets moved. Or we record them directly on the
netdev so they move with the netdev without any extra work.

Implement the latter option. Since pools may outlast netdev we need
a place to store orphans. In time honored tradition use loopback
for this purpose.

Reviewed-by: Mina Almasry
Signed-off-by: Jakub Kicinski
Reviewed-by: Eric Dumazet
Acked-by: Jesper Dangaard Brouer
---
v1: fix race between page pool and netdev disappearing (Simon)
---
 include/linux/list.h          | 20 ++++++++
 include/linux/netdevice.h     |  4 ++
 include/linux/poison.h        |  2 +
 include/net/page_pool/types.h |  4 ++
 net/core/page_pool_user.c     | 90 +++++++++++++++++++++++++++++++++++
 5 files changed, 120 insertions(+)

diff --git a/include/linux/list.h b/include/linux/list.h
index 1837caedf723..059aa1fff41e 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -1119,6 +1119,26 @@ static inline void hlist_move_list(struct hlist_head *old,
 	old->first = NULL;
 }
 
+/**
+ * hlist_splice_init() - move all entries from one list to another
+ * @from: hlist_head from which entries will be moved
+ * @last: last entry on the @from list
+ * @to: hlist_head to which entries will be moved
+ *
+ * @to can be empty, @from must contain at least @last.
+ */
+static inline void hlist_splice_init(struct hlist_head *from,
+				     struct hlist_node *last,
+				     struct hlist_head *to)
+{
+	if (to->first)
+		to->first->pprev = &last->next;
+	last->next = to->first;
+	to->first = from->first;
+	from->first->pprev = &to->first;
+	from->first = NULL;
+}
+
 #define hlist_entry(ptr, type, member) container_of(ptr,type,member)
 
 #define hlist_for_each(pos, head) \

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2d840d7056f2..d6554f308ff1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2435,6 +2435,10 @@ struct net_device {
 #if IS_ENABLED(CONFIG_DPLL)
 	struct dpll_pin *dpll_pin;
 #endif
+#if IS_ENABLED(CONFIG_PAGE_POOL)
+	/** @page_pools: page pools created for this netdevice */
+	struct hlist_head page_pools;
+#endif
 };
 
 #define to_net_dev(d) container_of(d, struct net_device, dev)

diff --git a/include/linux/poison.h b/include/linux/poison.h
index 851a855d3868..27a7dad17eef 100644
--- a/include/linux/poison.h
+++ b/include/linux/poison.h
@@ -83,6 +83,8 @@
 /********** net/core/skbuff.c **********/
 #define SKB_LIST_POISON_NEXT ((void *)(0x800 + POISON_POINTER_DELTA))
 
+/********** net/ **********/
+#define NET_PTR_POISON ((void *)(0x801 + POISON_POINTER_DELTA))
 
 /********** kernel/bpf/ **********/
 #define BPF_PTR_POISON ((void *)(0xeB9FUL + POISON_POINTER_DELTA))

diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index c19f0df3bf0b..b258a571201e 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -5,6 +5,7 @@
 
 #include
 #include
+#include
 
 #define PP_FLAG_DMA_MAP BIT(0) /* Should page_pool do the DMA
 				* map/unmap
@@ -48,6 +49,7 @@ struct pp_alloc_cache {
  * @pool_size:	size of the ptr_ring
  * @nid:	NUMA node id to allocate from pages from
  * @dev:	device, for DMA pre-mapping purposes
+ * @netdev:	netdev this pool will serve (leave as NULL if none or multiple)
  * @napi:	NAPI which is the sole consumer of pages, otherwise NULL
  * @dma_dir:	DMA mapping direction
  * @max_len:	max DMA sync memory size for PP_FLAG_DMA_SYNC_DEV
@@ -66,6 +68,7 @@ struct page_pool_params {
 		unsigned int	offset;
 	);
 	struct_group_tagged(page_pool_params_slow, slow,
+		struct net_device *netdev;
 /* private: used by test code only */
 		void (*init_callback)(struct page *page, void *arg);
 		void *init_arg;
@@ -189,6 +192,7 @@ struct page_pool {
 	struct page_pool_params_slow slow;
 	/* User-facing fields, protected by page_pools_lock */
 	struct {
+		struct hlist_node list;
 		u32 id;
 	} user;
 };

diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
index 630d1eeecf2a..1591dbd66d51 100644
--- a/net/core/page_pool_user.c
+++ b/net/core/page_pool_user.c
@@ -1,14 +1,31 @@
 // SPDX-License-Identifier: GPL-2.0
 
 #include
+#include
 #include
+#include
 #include
 
 #include "page_pool_priv.h"
 
 static DEFINE_XARRAY_FLAGS(page_pools, XA_FLAGS_ALLOC1);
+/* Protects: page_pools, netdevice->page_pools, pool->slow.netdev, pool->user.
+ * Ordering: inside rtnl_lock
+ */
 static DEFINE_MUTEX(page_pools_lock);
 
+/* Page pools are only reachable from user space (via netlink) if they are
+ * linked to a netdev at creation time. Following page pool "visibility"
+ * states are possible:
+ *  - normal
+ *    - user.list: linked to real netdev, netdev: real netdev
+ *  - orphaned - real netdev has disappeared
+ *    - user.list: linked to lo, netdev: lo
+ *  - invisible - either (a) created without netdev linking, (b) unlisted due
+ *    to error, or (c) the entire namespace which owned this pool disappeared
+ *    - user.list: unhashed, netdev: unknown
+ */
+
 int page_pool_list(struct page_pool *pool)
 {
 	static u32 id_alloc_next;
@@ -20,6 +37,10 @@ int page_pool_list(struct page_pool *pool)
 	if (err < 0)
 		goto err_unlock;
 
+	if (pool->slow.netdev)
+		hlist_add_head(&pool->user.list,
+			       &pool->slow.netdev->page_pools);
+
 	mutex_unlock(&page_pools_lock);
 	return 0;
 
@@ -32,5 +53,74 @@ void page_pool_unlist(struct page_pool *pool)
 {
 	mutex_lock(&page_pools_lock);
 	xa_erase(&page_pools, pool->user.id);
+	hlist_del(&pool->user.list);
 	mutex_unlock(&page_pools_lock);
 }
+
+static void page_pool_unreg_netdev_wipe(struct net_device *netdev)
+{
+	struct page_pool *pool;
+	struct hlist_node *n;
+
+	mutex_lock(&page_pools_lock);
+	hlist_for_each_entry_safe(pool, n, &netdev->page_pools, user.list) {
+		hlist_del_init(&pool->user.list);
+		pool->slow.netdev = NET_PTR_POISON;
+	}
+	mutex_unlock(&page_pools_lock);
+}
+
+static void page_pool_unreg_netdev(struct net_device *netdev)
+{
+	struct page_pool *pool, *last;
+	struct net_device *lo;
+
+	lo = __dev_get_by_index(dev_net(netdev), 1);
+	if (!lo) {
+		netdev_err_once(netdev,
+				"can't get lo to store orphan page pools\n");
+		page_pool_unreg_netdev_wipe(netdev);
+		return;
+	}
+
+	mutex_lock(&page_pools_lock);
+	last = NULL;
+	hlist_for_each_entry(pool, &netdev->page_pools, user.list) {
+		pool->slow.netdev = lo;
+		last = pool;
+	}
+	if (last)
+		hlist_splice_init(&netdev->page_pools, &last->user.list,
+				  &lo->page_pools);
+	mutex_unlock(&page_pools_lock);
+}
+
+static int
+page_pool_netdevice_event(struct notifier_block *nb,
+			  unsigned long event, void *ptr)
+{
+	struct net_device *netdev = netdev_notifier_info_to_dev(ptr);
+
+	if (event != NETDEV_UNREGISTER)
+		return NOTIFY_DONE;
+
+	if (hlist_empty(&netdev->page_pools))
+		return NOTIFY_OK;
+
+	if (netdev->ifindex != LOOPBACK_IFINDEX)
+		page_pool_unreg_netdev(netdev);
+	else
+		page_pool_unreg_netdev_wipe(netdev);
+	return NOTIFY_OK;
+}
+
+static struct notifier_block page_pool_netdevice_nb = {
+	.notifier_call = page_pool_netdevice_event,
+};
+
+static int __init page_pool_user_init(void)
+{
+	return register_netdevice_notifier(&page_pool_netdevice_nb);
+}
+
+subsys_initcall(page_pool_user_init);

From patchwork Wed Nov 22 03:44:11 2023
X-Patchwork-Submitter: Jakub Kicinski
X-Patchwork-Id: 13463911
X-Patchwork-Delegate: kuba@kernel.org
From: Jakub Kicinski
To: davem@davemloft.net
Subject: [PATCH net-next v3 04/13] net: page_pool: stash the NAPI ID for easier access
Date: Tue, 21 Nov 2023 19:44:11 -0800
Message-ID: <20231122034420.1158898-5-kuba@kernel.org>
In-Reply-To: <20231122034420.1158898-1-kuba@kernel.org>
References: <20231122034420.1158898-1-kuba@kernel.org>

To avoid any issues with race conditions on accessing napi and having
to think about the lifetime of NAPI objects in netlink GET - stash
the napi_id to which page pool was linked at creation time.

Signed-off-by: Jakub Kicinski
Reviewed-by: Eric Dumazet
Acked-by: Jesper Dangaard Brouer
---
 include/net/page_pool/types.h | 1 +
 net/core/page_pool_user.c     | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index b258a571201e..7e47d7bb2c1e 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -193,6 +193,7 @@ struct page_pool {
 	/* User-facing fields, protected by page_pools_lock */
 	struct {
 		struct hlist_node list;
+		u32 napi_id;
 		u32 id;
 	} user;
 };

diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
index 1591dbd66d51..23074d4c75fc 100644
--- a/net/core/page_pool_user.c
+++ b/net/core/page_pool_user.c
@@ -37,9 +37,11 @@ int page_pool_list(struct page_pool *pool)
 	if (err < 0)
 		goto err_unlock;
 
-	if (pool->slow.netdev)
+	if (pool->slow.netdev) {
 		hlist_add_head(&pool->user.list,
 			       &pool->slow.netdev->page_pools);
+		pool->user.napi_id = pool->p.napi ? pool->p.napi->napi_id : 0;
+	}
 
 	mutex_unlock(&page_pools_lock);
 	return 0;

From patchwork Wed Nov 22 03:44:12 2023
X-Patchwork-Submitter: Jakub Kicinski
X-Patchwork-Id: 13463912
X-Patchwork-Delegate: kuba@kernel.org
From: Jakub Kicinski
To: davem@davemloft.net
Subject: [PATCH net-next v3 05/13] eth: link netdev to page_pools in drivers
Date: Tue, 21 Nov 2023 19:44:12 -0800
Message-ID: <20231122034420.1158898-6-kuba@kernel.org>
In-Reply-To: <20231122034420.1158898-1-kuba@kernel.org>
References: <20231122034420.1158898-1-kuba@kernel.org>
Precedence: bulk
Link page pool instances to netdev for the drivers which
already link to NAPI. Unless the driver is doing something
very weird per-NAPI should imply per-netdev.

Add netsec as well, Ilias indicates that it fits the mold.

Signed-off-by: Jakub Kicinski
Reviewed-by: Eric Dumazet
Acked-by: Jesper Dangaard Brouer
---
v3: fix netsec build
v2: add netsec
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c         | 1 +
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 1 +
 drivers/net/ethernet/microsoft/mana/mana_en.c     | 1 +
 drivers/net/ethernet/socionext/netsec.c           | 2 ++
 4 files changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index d8cf7b30bd03..e35e7e02538c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3331,6 +3331,7 @@ static int bnxt_alloc_rx_page_pool(struct bnxt *bp,
 	pp.pool_size += bp->rx_ring_size;
 	pp.nid = dev_to_node(&bp->pdev->dev);
 	pp.napi = &rxr->bnapi->napi;
+	pp.netdev = bp->dev;
 	pp.dev = &bp->pdev->dev;
 	pp.dma_dir = bp->rx_dir;
 	pp.max_len = PAGE_SIZE;

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 3aecdf099a2f..af5e0d3c8ee1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -902,6 +902,7 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params,
 		pp_params.nid = node;
 		pp_params.dev = rq->pdev;
 		pp_params.napi = rq->cq.napi;
+		pp_params.netdev = rq->netdev;
 		pp_params.dma_dir = rq->buff.map_dir;
 		pp_params.max_len = PAGE_SIZE;

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index fc3d2903a80f..d1adec01299c 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2137,6 +2137,7 @@ static int mana_create_page_pool(struct mana_rxq *rxq, struct gdma_context *gc)
 	pprm.pool_size = RX_BUFFERS_PER_QUEUE;
 	pprm.nid = gc->numa_node;
 	pprm.napi = &rxq->rx_cq.napi;
+	pprm.netdev = rxq->ndev;
 
 	rxq->page_pool = page_pool_create(&pprm);

diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c
index 0891e9e49ecb..5ab8b81b84e6 100644
--- a/drivers/net/ethernet/socionext/netsec.c
+++ b/drivers/net/ethernet/socionext/netsec.c
@@ -1302,6 +1302,8 @@ static int netsec_setup_rx_dring(struct netsec_priv *priv)
 		.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
 		.offset = NETSEC_RXBUF_HEADROOM,
 		.max_len = NETSEC_RX_BUF_SIZE,
+		.napi = &priv->napi,
+		.netdev = priv->ndev,
 	};
 	int i, err;

From patchwork Wed Nov 22 03:44:13 2023
X-Patchwork-Submitter: Jakub Kicinski
X-Patchwork-Id: 13463913
X-Patchwork-Delegate: kuba@kernel.org
From: Jakub Kicinski
To: davem@davemloft.net
Subject: [PATCH net-next v3 06/13] net: page_pool: add nlspec for basic access to page pools
Date: Tue, 21 Nov 2023 19:44:13 -0800
Message-ID: <20231122034420.1158898-7-kuba@kernel.org>
In-Reply-To: <20231122034420.1158898-1-kuba@kernel.org>
References: <20231122034420.1158898-1-kuba@kernel.org>

Add a Netlink spec in YAML for getting very basic information
about page pools.

Signed-off-by: Jakub Kicinski
Reviewed-by: Eric Dumazet
Acked-by: Jesper Dangaard Brouer
---
 Documentation/netlink/specs/netdev.yaml | 46 +++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index 14511b13f305..84ca3c2ab872 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -86,6 +86,34 @@ name: netdev
           See Documentation/networking/xdp-rx-metadata.rst for more details.
         type: u64
         enum: xdp-rx-metadata
+  -
+    name: page-pool
+    attributes:
+      -
+        name: id
+        doc: Unique ID of a Page Pool instance.
+        type: uint
+        checks:
+          min: 1
+          max: u32-max
+      -
+        name: ifindex
+        doc: |
+          ifindex of the netdev to which the pool belongs.
+          May be reported as 0 if the page pool was allocated for a netdev
+          which got destroyed already (page pools may outlast their netdevs
+          because they wait for all memory to be returned).
+        type: u32
+        checks:
+          min: 1
+          max: s32-max
+      -
+        name: napi-id
+        doc: Id of NAPI using this Page Pool instance.
+        type: uint
+        checks:
+          min: 1
+          max: u32-max
 
 operations:
   list:
@@ -120,6 +148,24 @@ name: netdev
       doc: Notification about device configuration being changed.
       notify: dev-get
       mcgrp: mgmt
+    -
+      name: page-pool-get
+      doc: |
+        Get / dump information about Page Pools.
+        (Only Page Pools associated with a net_device can be listed.)
+      attribute-set: page-pool
+      do:
+        request:
+          attributes:
+            - id
+        reply: &pp-reply
+          attributes:
+            - id
+            - ifindex
+            - napi-id
+      dump:
+        reply: *pp-reply
+      config-cond: page-pool
 
 mcast-groups:
   list:

From patchwork Wed Nov 22 03:44:14 2023
X-Patchwork-Submitter: Jakub Kicinski
X-Patchwork-Id: 13463914
X-Patchwork-Delegate: kuba@kernel.org
From: Jakub Kicinski
To: davem@davemloft.net
Subject: [PATCH net-next v3 07/13] net: page_pool: implement GET in the netlink API
Date: Tue, 21 Nov 2023 19:44:14 -0800
Message-ID: <20231122034420.1158898-8-kuba@kernel.org>
In-Reply-To: <20231122034420.1158898-1-kuba@kernel.org>
References: <20231122034420.1158898-1-kuba@kernel.org>

Expose the very basic page pool information via netlink.

Example using ynl-py for a system with 9 queues:

$ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \
           --dump page-pool-get
[{'id': 19, 'ifindex': 2, 'napi-id': 147},
 {'id': 18, 'ifindex': 2, 'napi-id': 146},
 {'id': 17, 'ifindex': 2, 'napi-id': 145},
 {'id': 16, 'ifindex': 2, 'napi-id': 144},
 {'id': 15, 'ifindex': 2, 'napi-id': 143},
 {'id': 14, 'ifindex': 2, 'napi-id': 142},
 {'id': 13, 'ifindex': 2, 'napi-id': 141},
 {'id': 12, 'ifindex': 2, 'napi-id': 140},
 {'id': 11, 'ifindex': 2, 'napi-id': 139},
 {'id': 10, 'ifindex': 2, 'napi-id': 138}]

Signed-off-by: Jakub Kicinski
Reviewed-by: Eric Dumazet
Acked-by: Jesper Dangaard Brouer
---
 include/uapi/linux/netdev.h |  10 +++
 net/core/netdev-genl-gen.c  |  27 ++++++++
 net/core/netdev-genl-gen.h  |   3 +
 net/core/page_pool_user.c   | 127 ++++++++++++++++++++++++++++++++++++
 4 files changed, 167 insertions(+)

diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index 2943a151d4f1..176665bcf0da 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -64,11 +64,21 @@ enum {
 	NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1)
 };
 
+enum {
+	NETDEV_A_PAGE_POOL_ID = 1,
+	NETDEV_A_PAGE_POOL_IFINDEX,
+	NETDEV_A_PAGE_POOL_NAPI_ID,
+
+	__NETDEV_A_PAGE_POOL_MAX,
+	NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1)
+};
+
 enum {
 	NETDEV_CMD_DEV_GET = 1,
 	NETDEV_CMD_DEV_ADD_NTF,
 	NETDEV_CMD_DEV_DEL_NTF,
 	NETDEV_CMD_DEV_CHANGE_NTF,
+	NETDEV_CMD_PAGE_POOL_GET,
 
 	__NETDEV_CMD_MAX,
 	NETDEV_CMD_MAX = (__NETDEV_CMD_MAX - 1)

diff --git a/net/core/netdev-genl-gen.c b/net/core/netdev-genl-gen.c
index ea9231378aa6..9d34b451cfa7 100644
--- a/net/core/netdev-genl-gen.c
+++ b/net/core/netdev-genl-gen.c
@@ -10,11 +10,24 @@
 
 #include
 
+/* Integer value ranges */
+static const struct netlink_range_validation netdev_a_page_pool_id_range = {
+	.min = 1,
+	.max = 4294967295,
+};
+
 /* NETDEV_CMD_DEV_GET - do */
 static const struct nla_policy netdev_dev_get_nl_policy[NETDEV_A_DEV_IFINDEX + 1] = {
 	[NETDEV_A_DEV_IFINDEX] = NLA_POLICY_MIN(NLA_U32, 1),
 };
 
+/* NETDEV_CMD_PAGE_POOL_GET - do */
+#ifdef CONFIG_PAGE_POOL
+static const struct nla_policy netdev_page_pool_get_nl_policy[NETDEV_A_PAGE_POOL_ID + 1] = {
+	[NETDEV_A_PAGE_POOL_ID] = NLA_POLICY_FULL_RANGE(NLA_UINT, &netdev_a_page_pool_id_range),
+};
+#endif /* CONFIG_PAGE_POOL */
+
 /* Ops table for netdev */
 static const struct genl_split_ops netdev_nl_ops[] = {
 	{
@@ -29,6 +42,20 @@ static const struct genl_split_ops netdev_nl_ops[] = {
 		.dumpit = netdev_nl_dev_get_dumpit,
 		.flags = GENL_CMD_CAP_DUMP,
 	},
+#ifdef CONFIG_PAGE_POOL
+	{
+		.cmd = NETDEV_CMD_PAGE_POOL_GET,
+		.doit = netdev_nl_page_pool_get_doit,
+		.policy = netdev_page_pool_get_nl_policy,
+		.maxattr = NETDEV_A_PAGE_POOL_ID,
+		.flags = GENL_CMD_CAP_DO,
+	},
+	{
+		.cmd = NETDEV_CMD_PAGE_POOL_GET,
+		.dumpit = netdev_nl_page_pool_get_dumpit,
+		.flags = GENL_CMD_CAP_DUMP,
+	},
+#endif /* CONFIG_PAGE_POOL */
 };
 
 static const struct genl_multicast_group netdev_nl_mcgrps[] = {

diff --git a/net/core/netdev-genl-gen.h b/net/core/netdev-genl-gen.h
index 7b370c073e7d..a011d12abff4 100644
--- a/net/core/netdev-genl-gen.h
+++ b/net/core/netdev-genl-gen.h
@@ -13,6 +13,9 @@
 int netdev_nl_dev_get_doit(struct sk_buff *skb, struct genl_info *info);
 int netdev_nl_dev_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb);
+int netdev_nl_page_pool_get_doit(struct sk_buff *skb, struct genl_info *info);
+int netdev_nl_page_pool_get_dumpit(struct sk_buff *skb,
+				   struct netlink_callback *cb);
 
 enum {
 	NETDEV_NLGRP_MGMT,

diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
index 23074d4c75fc..24f4e54f6266 100644
--- a/net/core/page_pool_user.c
+++ b/net/core/page_pool_user.c
@@ -5,8 +5,10 @@
 #include
 #include
 #include
+#include
 
 #include "page_pool_priv.h"
+#include "netdev-genl-gen.h"
 
 static DEFINE_XARRAY_FLAGS(page_pools, XA_FLAGS_ALLOC1);
 /* Protects: page_pools, netdevice->page_pools, pool->slow.netdev, pool->user.
@@ -26,6 +28,131 @@ static DEFINE_MUTEX(page_pools_lock);
  *  - user.list: unhashed, netdev: unknown
  */
 
+typedef int (*pp_nl_fill_cb)(struct sk_buff *rsp, const struct page_pool *pool,
+			     const struct genl_info *info);
+
+static int
+netdev_nl_page_pool_get_do(struct genl_info *info, u32 id, pp_nl_fill_cb fill)
+{
+	struct page_pool *pool;
+	struct sk_buff *rsp;
+	int err;
+
+	mutex_lock(&page_pools_lock);
+	pool = xa_load(&page_pools, id);
+	if (!pool || hlist_unhashed(&pool->user.list) ||
+	    !net_eq(dev_net(pool->slow.netdev), genl_info_net(info))) {
+		err = -ENOENT;
+		goto err_unlock;
+	}
+
+	rsp = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!rsp) {
+		err = -ENOMEM;
+		goto err_unlock;
+	}
+
+	err = fill(rsp, pool, info);
+	if (err)
+		goto err_free_msg;
+
+	mutex_unlock(&page_pools_lock);
+
+	return genlmsg_reply(rsp, info);
+
+err_free_msg:
+	nlmsg_free(rsp);
+err_unlock:
+	mutex_unlock(&page_pools_lock);
+	return err;
+}
+
+struct page_pool_dump_cb {
+	unsigned long ifindex;
+	u32 pp_id;
+};
+
+static int
+netdev_nl_page_pool_get_dump(struct sk_buff *skb, struct netlink_callback *cb,
+			     pp_nl_fill_cb fill)
+{
+	struct page_pool_dump_cb *state = (void *)cb->ctx;
+	const struct genl_info *info = genl_info_dump(cb);
+	struct net *net = sock_net(skb->sk);
struct net_device *netdev; + struct page_pool *pool; + int err = 0; + + rtnl_lock(); + mutex_lock(&page_pools_lock); + for_each_netdev_dump(net, netdev, state->ifindex) { + hlist_for_each_entry(pool, &netdev->page_pools, user.list) { + if (state->pp_id && state->pp_id < pool->user.id) + continue; + + state->pp_id = pool->user.id; + err = fill(skb, pool, info); + if (err) + break; + } + + state->pp_id = 0; + } + mutex_unlock(&page_pools_lock); + rtnl_unlock(); + + if (skb->len && err == -EMSGSIZE) + return skb->len; + return err; +} + +static int +page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool, + const struct genl_info *info) +{ + void *hdr; + + hdr = genlmsg_iput(rsp, info); + if (!hdr) + return -EMSGSIZE; + + if (nla_put_uint(rsp, NETDEV_A_PAGE_POOL_ID, pool->user.id)) + goto err_cancel; + + if (pool->slow.netdev->ifindex != LOOPBACK_IFINDEX && + nla_put_u32(rsp, NETDEV_A_PAGE_POOL_IFINDEX, + pool->slow.netdev->ifindex)) + goto err_cancel; + if (pool->user.napi_id && + nla_put_uint(rsp, NETDEV_A_PAGE_POOL_NAPI_ID, pool->user.napi_id)) + goto err_cancel; + + genlmsg_end(rsp, hdr); + + return 0; +err_cancel: + genlmsg_cancel(rsp, hdr); + return -EMSGSIZE; +} + +int netdev_nl_page_pool_get_doit(struct sk_buff *skb, struct genl_info *info) +{ + u32 id; + + if (GENL_REQ_ATTR_CHECK(info, NETDEV_A_PAGE_POOL_ID)) + return -EINVAL; + + id = nla_get_uint(info->attrs[NETDEV_A_PAGE_POOL_ID]); + + return netdev_nl_page_pool_get_do(info, id, page_pool_nl_fill); +} + +int netdev_nl_page_pool_get_dumpit(struct sk_buff *skb, + struct netlink_callback *cb) +{ + return netdev_nl_page_pool_get_dump(skb, cb, page_pool_nl_fill); +} + int page_pool_list(struct page_pool *pool) { static u32 id_alloc_next; From patchwork Wed Nov 22 03:44:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Kicinski X-Patchwork-Id: 13463915 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp.kernel.org 
From: Jakub Kicinski To: davem@davemloft.net Cc: netdev@vger.kernel.org, edumazet@google.com, pabeni@redhat.com, almasrymina@google.com, hawk@kernel.org, ilias.apalodimas@linaro.org, dsahern@gmail.com, dtatulea@nvidia.com, willemb@google.com, Jakub Kicinski Subject: [PATCH net-next v3 08/13] net: page_pool: add netlink notifications for state changes Date: Tue, 21 Nov 2023 19:44:15 -0800 Message-ID: <20231122034420.1158898-9-kuba@kernel.org> Generate netlink notifications about page pool state changes.
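The patch suppresses events for pools that are not (yet) visible to user space and skips message allocation when the multicast group has no listeners; otherwise it multicasts the same attributes as a page-pool-get reply. A toy Python model of that decision flow (the dict-based pool, the `has_listeners` flag, and the returned dict are illustrative assumptions standing in for the skb/genl machinery, not kernel code):

```python
# Toy model of the notification flow: unhashed ("invisible") pools
# produce no event; otherwise the notification carries the same
# attributes as a page-pool-get reply.

LOOPBACK_IFINDEX = 1

def page_pool_event(pool, cmd, has_listeners=True):
    if pool.get("unhashed"):   # pool not visible to user space yet/anymore
        return None
    if not has_listeners:      # don't bother building a message nobody reads
        return None
    ntf = {"cmd": cmd, "id": pool["id"]}
    if pool["ifindex"] != LOOPBACK_IFINDEX:   # orphaned pools hide ifindex
        ntf["ifindex"] = pool["ifindex"]
    if pool.get("napi_id"):                   # only set when pool has a NAPI
        ntf["napi-id"] = pool["napi_id"]
    return ntf

pool = {"id": 7, "ifindex": 2, "napi_id": 140}
print(page_pool_event(pool, "page-pool-add-ntf"))
# -> {'cmd': 'page-pool-add-ntf', 'id': 7, 'ifindex': 2, 'napi-id': 140}
```

Only the ordering of the checks follows the patch; everything else is a stand-in.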
Signed-off-by: Jakub Kicinski Reviewed-by: Eric Dumazet Acked-by: Jesper Dangaard Brouer --- Documentation/netlink/specs/netdev.yaml | 20 ++++++++++++++ include/uapi/linux/netdev.h | 4 +++ net/core/netdev-genl-gen.c | 1 + net/core/netdev-genl-gen.h | 1 + net/core/page_pool_user.c | 36 +++++++++++++++++++++++++ 5 files changed, 62 insertions(+) diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index 84ca3c2ab872..82fbe81f7a49 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -166,8 +166,28 @@ name: netdev dump: reply: *pp-reply config-cond: page-pool + - + name: page-pool-add-ntf + doc: Notification about page pool appearing. + notify: page-pool-get + mcgrp: page-pool + config-cond: page-pool + - + name: page-pool-del-ntf + doc: Notification about page pool disappearing. + notify: page-pool-get + mcgrp: page-pool + config-cond: page-pool + - + name: page-pool-change-ntf + doc: Notification about page pool configuration being changed. 
+ notify: page-pool-get + mcgrp: page-pool + config-cond: page-pool mcast-groups: list: - name: mgmt + - + name: page-pool diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h index 176665bcf0da..beb158872226 100644 --- a/include/uapi/linux/netdev.h +++ b/include/uapi/linux/netdev.h @@ -79,11 +79,15 @@ enum { NETDEV_CMD_DEV_DEL_NTF, NETDEV_CMD_DEV_CHANGE_NTF, NETDEV_CMD_PAGE_POOL_GET, + NETDEV_CMD_PAGE_POOL_ADD_NTF, + NETDEV_CMD_PAGE_POOL_DEL_NTF, + NETDEV_CMD_PAGE_POOL_CHANGE_NTF, __NETDEV_CMD_MAX, NETDEV_CMD_MAX = (__NETDEV_CMD_MAX - 1) }; #define NETDEV_MCGRP_MGMT "mgmt" +#define NETDEV_MCGRP_PAGE_POOL "page-pool" #endif /* _UAPI_LINUX_NETDEV_H */ diff --git a/net/core/netdev-genl-gen.c b/net/core/netdev-genl-gen.c index 9d34b451cfa7..278b43f5ed50 100644 --- a/net/core/netdev-genl-gen.c +++ b/net/core/netdev-genl-gen.c @@ -60,6 +60,7 @@ static const struct genl_split_ops netdev_nl_ops[] = { static const struct genl_multicast_group netdev_nl_mcgrps[] = { [NETDEV_NLGRP_MGMT] = { "mgmt", }, + [NETDEV_NLGRP_PAGE_POOL] = { "page-pool", }, }; struct genl_family netdev_nl_family __ro_after_init = { diff --git a/net/core/netdev-genl-gen.h b/net/core/netdev-genl-gen.h index a011d12abff4..738097847100 100644 --- a/net/core/netdev-genl-gen.h +++ b/net/core/netdev-genl-gen.h @@ -19,6 +19,7 @@ int netdev_nl_page_pool_get_dumpit(struct sk_buff *skb, enum { NETDEV_NLGRP_MGMT, + NETDEV_NLGRP_PAGE_POOL, }; extern struct genl_family netdev_nl_family; diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c index 24f4e54f6266..35c56fb41c46 100644 --- a/net/core/page_pool_user.c +++ b/net/core/page_pool_user.c @@ -135,6 +135,37 @@ page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool, return -EMSGSIZE; } +static void netdev_nl_page_pool_event(const struct page_pool *pool, u32 cmd) +{ + struct genl_info info; + struct sk_buff *ntf; + struct net *net; + + lockdep_assert_held(&page_pools_lock); + + /* 'invisible' page pools don't matter */ + 
if (hlist_unhashed(&pool->user.list)) + return; + net = dev_net(pool->slow.netdev); + + if (!genl_has_listeners(&netdev_nl_family, net, NETDEV_NLGRP_PAGE_POOL)) + return; + + genl_info_init_ntf(&info, &netdev_nl_family, cmd); + + ntf = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL); + if (!ntf) + return; + + if (page_pool_nl_fill(ntf, pool, &info)) { + nlmsg_free(ntf); + return; + } + + genlmsg_multicast_netns(&netdev_nl_family, net, ntf, + 0, NETDEV_NLGRP_PAGE_POOL, GFP_KERNEL); +} + int netdev_nl_page_pool_get_doit(struct sk_buff *skb, struct genl_info *info) { u32 id; @@ -168,6 +199,8 @@ int page_pool_list(struct page_pool *pool) hlist_add_head(&pool->user.list, &pool->slow.netdev->page_pools); pool->user.napi_id = pool->p.napi ? pool->p.napi->napi_id : 0; + + netdev_nl_page_pool_event(pool, NETDEV_CMD_PAGE_POOL_ADD_NTF); } mutex_unlock(&page_pools_lock); @@ -181,6 +214,7 @@ int page_pool_list(struct page_pool *pool) void page_pool_unlist(struct page_pool *pool) { mutex_lock(&page_pools_lock); + netdev_nl_page_pool_event(pool, NETDEV_CMD_PAGE_POOL_DEL_NTF); xa_erase(&page_pools, pool->user.id); hlist_del(&pool->user.list); mutex_unlock(&page_pools_lock); @@ -216,6 +250,8 @@ static void page_pool_unreg_netdev(struct net_device *netdev) last = NULL; hlist_for_each_entry(pool, &netdev->page_pools, user.list) { pool->slow.netdev = lo; + netdev_nl_page_pool_event(pool, + NETDEV_CMD_PAGE_POOL_CHANGE_NTF); last = pool; } if (last) From patchwork Wed Nov 22 03:44:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Kicinski X-Patchwork-Id: 13463916 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7B6B611C9E for ; Wed, 22 Nov 2023 03:44:31 +0000 (UTC) 
From: Jakub Kicinski To: davem@davemloft.net Cc: netdev@vger.kernel.org, edumazet@google.com, pabeni@redhat.com, almasrymina@google.com, hawk@kernel.org, ilias.apalodimas@linaro.org, dsahern@gmail.com, dtatulea@nvidia.com, willemb@google.com, Jakub Kicinski Subject: [PATCH net-next v3 09/13] net: page_pool: report amount of memory held by page pools Date: Tue, 21 Nov 2023 19:44:16 -0800 Message-ID: <20231122034420.1158898-10-kuba@kernel.org> Advanced deployments need the ability to check memory use of various system components. It makes it possible to make informed decisions about memory allocation and to find regressions and leaks. Report memory use of page pools. Report both number of references and bytes held.
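The inflight count is derived from two free-running u32 counters (pages_state_hold_cnt and pages_state_release_cnt): the kernel's `_distance()` macro casts their difference to s32 so the result stays correct across counter wrap-around, and the non-strict mode added below clamps negative transients to zero before reporting. A small Python model of that arithmetic (PAGE_SIZE = 4096 and the helper names are assumptions for illustration, not kernel code):

```python
# Illustrative model of how inflight and inflight-mem are derived from
# the pool's two monotonically increasing u32 counters.

PAGE_SIZE = 4096  # assumed base page size

def u32_distance(hold_cnt, release_cnt):
    """Signed 32-bit distance, mirroring '(s32)((a) - (b))' in page_pool.c,
    so the result stays correct when the u32 counters wrap around."""
    d = (hold_cnt - release_cnt) & 0xFFFFFFFF
    return d - 0x1_0000_0000 if d >= 0x8000_0000 else d

def inflight_mem(hold_cnt, release_cnt, order=0):
    # Non-strict mode clamps a transient negative distance to zero.
    inflight = max(0, u32_distance(hold_cnt, release_cnt))
    return inflight, inflight * (PAGE_SIZE << order)  # bytes per reference

# Counters that have wrapped past the u32 boundary: 10 pages still held.
print(inflight_mem(8, 0xFFFFFFFE))  # -> (10, 40960)
```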
Signed-off-by: Jakub Kicinski Acked-by: Jesper Dangaard Brouer --- Documentation/netlink/specs/netdev.yaml | 13 +++++++++++++ include/uapi/linux/netdev.h | 2 ++ net/core/page_pool.c | 13 +++++++++---- net/core/page_pool_priv.h | 2 ++ net/core/page_pool_user.c | 8 ++++++++ 5 files changed, 34 insertions(+), 4 deletions(-) diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index 82fbe81f7a49..85209e19dca9 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -114,6 +114,17 @@ name: netdev checks: min: 1 max: u32-max + - + name: inflight + type: uint + doc: | + Number of outstanding references to this page pool (allocated + but yet to be freed pages). + - + name: inflight-mem + type: uint + doc: | + Amount of memory held by inflight pages. operations: list: @@ -163,6 +174,8 @@ name: netdev - id - ifindex - napi-id + - inflight + - inflight-mem dump: reply: *pp-reply config-cond: page-pool diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h index beb158872226..26ae5bdd3187 100644 --- a/include/uapi/linux/netdev.h +++ b/include/uapi/linux/netdev.h @@ -68,6 +68,8 @@ enum { NETDEV_A_PAGE_POOL_ID = 1, NETDEV_A_PAGE_POOL_IFINDEX, NETDEV_A_PAGE_POOL_NAPI_ID, + NETDEV_A_PAGE_POOL_INFLIGHT, + NETDEV_A_PAGE_POOL_INFLIGHT_MEM, __NETDEV_A_PAGE_POOL_MAX, NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1) diff --git a/net/core/page_pool.c b/net/core/page_pool.c index a8d96ea38d18..566390759294 100644 --- a/net/core/page_pool.c +++ b/net/core/page_pool.c @@ -529,7 +529,7 @@ EXPORT_SYMBOL(page_pool_alloc_pages); */ #define _distance(a, b) (s32)((a) - (b)) -static s32 page_pool_inflight(struct page_pool *pool) +s32 page_pool_inflight(const struct page_pool *pool, bool strict) { u32 release_cnt = atomic_read(&pool->pages_state_release_cnt); u32 hold_cnt = READ_ONCE(pool->pages_state_hold_cnt); @@ -537,8 +537,13 @@ static s32 page_pool_inflight(struct page_pool *pool) 
inflight = _distance(hold_cnt, release_cnt); - trace_page_pool_release(pool, inflight, hold_cnt, release_cnt); - WARN(inflight < 0, "Negative(%d) inflight packet-pages", inflight); + if (strict) { + trace_page_pool_release(pool, inflight, hold_cnt, release_cnt); + WARN(inflight < 0, "Negative(%d) inflight packet-pages", + inflight); + } else { + inflight = max(0, inflight); + } return inflight; } @@ -881,7 +886,7 @@ static int page_pool_release(struct page_pool *pool) int inflight; page_pool_scrub(pool); - inflight = page_pool_inflight(pool); + inflight = page_pool_inflight(pool, true); if (!inflight) __page_pool_destroy(pool); diff --git a/net/core/page_pool_priv.h b/net/core/page_pool_priv.h index c17ea092b4ab..72fb21ea1ddc 100644 --- a/net/core/page_pool_priv.h +++ b/net/core/page_pool_priv.h @@ -3,6 +3,8 @@ #ifndef __PAGE_POOL_PRIV_H #define __PAGE_POOL_PRIV_H +s32 page_pool_inflight(const struct page_pool *pool, bool strict); + int page_pool_list(struct page_pool *pool); void page_pool_unlist(struct page_pool *pool); diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c index 35c56fb41c46..d889b347f8f4 100644 --- a/net/core/page_pool_user.c +++ b/net/core/page_pool_user.c @@ -110,6 +110,7 @@ static int page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool, const struct genl_info *info) { + size_t inflight, refsz; void *hdr; hdr = genlmsg_iput(rsp, info); @@ -127,6 +128,13 @@ page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool, nla_put_uint(rsp, NETDEV_A_PAGE_POOL_NAPI_ID, pool->user.napi_id)) goto err_cancel; + inflight = page_pool_inflight(pool, false); + refsz = PAGE_SIZE << pool->p.order; + if (nla_put_uint(rsp, NETDEV_A_PAGE_POOL_INFLIGHT, inflight) || + nla_put_uint(rsp, NETDEV_A_PAGE_POOL_INFLIGHT_MEM, + inflight * refsz)) + goto err_cancel; + genlmsg_end(rsp, hdr); return 0; From patchwork Wed Nov 22 03:44:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
From: Jakub Kicinski To: davem@davemloft.net Cc: netdev@vger.kernel.org, edumazet@google.com, pabeni@redhat.com, almasrymina@google.com, hawk@kernel.org, ilias.apalodimas@linaro.org, dsahern@gmail.com, dtatulea@nvidia.com, willemb@google.com, Jakub Kicinski Subject: [PATCH net-next v3 10/13] net: page_pool: report when page pool was destroyed Date: Tue, 21 Nov 2023 19:44:17 -0800 Message-ID: <20231122034420.1158898-11-kuba@kernel.org> Report when page pool was destroyed.
Together with the inflight / memory use reporting this can serve as a replacement for the warning about leaked page pools we currently print to dmesg. Example output for a fake leaked page pool using some hacks in netdevsim (one "live" pool, and one "leaked" on the same dev): $ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \ --dump page-pool-get [{'id': 2, 'ifindex': 3}, {'id': 1, 'ifindex': 3, 'destroyed': 133, 'inflight': 1}] Tested-by: Dragos Tatulea Signed-off-by: Jakub Kicinski Acked-by: Jesper Dangaard Brouer Reviewed-by: Eric Dumazet --- Documentation/netlink/specs/netdev.yaml | 13 +++++++++++++ include/net/page_pool/types.h | 1 + include/uapi/linux/netdev.h | 1 + net/core/page_pool.c | 1 + net/core/page_pool_priv.h | 1 + net/core/page_pool_user.c | 12 ++++++++++++ 6 files changed, 29 insertions(+) diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index 85209e19dca9..695e0e4e0d8b 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -125,6 +125,18 @@ name: netdev type: uint doc: | Amount of memory held by inflight pages. + - + name: detach-time + type: uint + doc: | + Seconds in CLOCK_BOOTTIME of when Page Pool was detached by + the driver. Once detached Page Pool can no longer be used to + allocate memory. + Page Pools wait for all the memory allocated from them to be freed + before truly disappearing. "Detached" Page Pools cannot be + "re-attached", they are just waiting to disappear. + Attribute is absent if Page Pool has not been detached, and + can still be used to allocate new memory. 
operations: list: @@ -176,6 +188,7 @@ name: netdev - napi-id - inflight - inflight-mem + - detach-time dump: reply: *pp-reply config-cond: page-pool diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h index 7e47d7bb2c1e..ac286ea8ce2d 100644 --- a/include/net/page_pool/types.h +++ b/include/net/page_pool/types.h @@ -193,6 +193,7 @@ struct page_pool { /* User-facing fields, protected by page_pools_lock */ struct { struct hlist_node list; + u64 detach_time; u32 napi_id; u32 id; } user; diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h index 26ae5bdd3187..756410274120 100644 --- a/include/uapi/linux/netdev.h +++ b/include/uapi/linux/netdev.h @@ -70,6 +70,7 @@ enum { NETDEV_A_PAGE_POOL_NAPI_ID, NETDEV_A_PAGE_POOL_INFLIGHT, NETDEV_A_PAGE_POOL_INFLIGHT_MEM, + NETDEV_A_PAGE_POOL_DETACH_TIME, __NETDEV_A_PAGE_POOL_MAX, NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1) diff --git a/net/core/page_pool.c b/net/core/page_pool.c index 566390759294..a821fb5fe054 100644 --- a/net/core/page_pool.c +++ b/net/core/page_pool.c @@ -953,6 +953,7 @@ void page_pool_destroy(struct page_pool *pool) if (!page_pool_release(pool)) return; + page_pool_detached(pool); pool->defer_start = jiffies; pool->defer_warn = jiffies + DEFER_WARN_INTERVAL; diff --git a/net/core/page_pool_priv.h b/net/core/page_pool_priv.h index 72fb21ea1ddc..90665d40f1eb 100644 --- a/net/core/page_pool_priv.h +++ b/net/core/page_pool_priv.h @@ -6,6 +6,7 @@ s32 page_pool_inflight(const struct page_pool *pool, bool strict); int page_pool_list(struct page_pool *pool); +void page_pool_detached(struct page_pool *pool); void page_pool_unlist(struct page_pool *pool); #endif diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c index d889b347f8f4..f28ad2179f53 100644 --- a/net/core/page_pool_user.c +++ b/net/core/page_pool_user.c @@ -134,6 +134,10 @@ page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool, nla_put_uint(rsp, NETDEV_A_PAGE_POOL_INFLIGHT_MEM, 
inflight * refsz)) goto err_cancel; + if (pool->user.detach_time && + nla_put_uint(rsp, NETDEV_A_PAGE_POOL_DETACH_TIME, + pool->user.detach_time)) + goto err_cancel; genlmsg_end(rsp, hdr); @@ -219,6 +223,14 @@ int page_pool_list(struct page_pool *pool) return err; } +void page_pool_detached(struct page_pool *pool) +{ + mutex_lock(&page_pools_lock); + pool->user.detach_time = ktime_get_boottime_seconds(); + netdev_nl_page_pool_event(pool, NETDEV_CMD_PAGE_POOL_CHANGE_NTF); + mutex_unlock(&page_pools_lock); +} + void page_pool_unlist(struct page_pool *pool) { mutex_lock(&page_pools_lock); From patchwork Wed Nov 22 03:44:18 2023 From: Jakub Kicinski To: davem@davemloft.net Cc: netdev@vger.kernel.org,
edumazet@google.com, pabeni@redhat.com, almasrymina@google.com, hawk@kernel.org, ilias.apalodimas@linaro.org, dsahern@gmail.com, dtatulea@nvidia.com, willemb@google.com, Jakub Kicinski Subject: [PATCH net-next v3 11/13] net: page_pool: expose page pool stats via netlink Date: Tue, 21 Nov 2023 19:44:18 -0800 Message-ID: <20231122034420.1158898-12-kuba@kernel.org> Dump the stats into netlink. More clever approaches, such as dumping the stats for each CPU individually to see where the packets get consumed, can be implemented in the future. A trimmed example from a real (but recently booted) system: $ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \ --dump page-pool-stats-get [{'info': {'id': 19, 'ifindex': 2}, 'alloc-empty': 48, 'alloc-fast': 3024, 'alloc-refill': 0, 'alloc-slow': 48, 'alloc-slow-high-order': 0, 'alloc-waive': 0, 'recycle-cache-full': 0, 'recycle-cached': 0, 'recycle-released-refcnt': 0, 'recycle-ring': 0, 'recycle-ring-full': 0}, {'info': {'id': 18, 'ifindex': 2}, 'alloc-empty': 66, 'alloc-fast': 11811, 'alloc-refill': 35, 'alloc-slow': 66, 'alloc-slow-high-order': 0, 'alloc-waive': 0, 'recycle-cache-full': 1145, 'recycle-cached': 6541, 'recycle-released-refcnt': 0, 'recycle-ring': 1275, 'recycle-ring-full': 0}, {'info': {'id': 17, 'ifindex': 2}, 'alloc-empty': 73, 'alloc-fast': 62099, 'alloc-refill': 413, ... 
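The recycle counters in the dump above are kept per CPU and folded into a single total before the netlink reply is filled in. A hedged Python sketch of that aggregation (the dict representation and helper name are illustrative, not the kernel API; the sample numbers echo pool 18 from the dump above):

```python
# Sketch of summing per-CPU recycle counters into one total,
# as done before the statistics are dumped to netlink.

RECYCLE_KEYS = ("cached", "cache_full", "ring", "ring_full", "released_refcnt")

def sum_recycle_stats(per_cpu_stats):
    total = dict.fromkeys(RECYCLE_KEYS, 0)
    for cpu_stats in per_cpu_stats:        # one entry per possible CPU
        for key in RECYCLE_KEYS:
            total[key] += cpu_stats.get(key, 0)
    return total

per_cpu = [
    {"cached": 6000, "cache_full": 1000, "ring": 1200},
    {"cached": 541, "cache_full": 145, "ring": 75},
]
print(sum_recycle_stats(per_cpu))
# -> {'cached': 6541, 'cache_full': 1145, 'ring': 1275,
#     'ring_full': 0, 'released_refcnt': 0}
```

A per-CPU breakdown, as the commit message notes, could be exposed later by skipping this summation step.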
Signed-off-by: Jakub Kicinski Acked-by: Jesper Dangaard Brouer --- Documentation/netlink/specs/netdev.yaml | 78 ++++++++++++++++++ Documentation/networking/page_pool.rst | 10 ++- include/net/page_pool/helpers.h | 8 +- include/uapi/linux/netdev.h | 19 +++++ net/core/netdev-genl-gen.c | 32 ++++++++ net/core/netdev-genl-gen.h | 7 ++ net/core/page_pool.c | 2 +- net/core/page_pool_user.c | 103 ++++++++++++++++++++++++ 8 files changed, 250 insertions(+), 9 deletions(-) diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index 695e0e4e0d8b..77d991738a17 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -137,6 +137,59 @@ name: netdev "re-attached", they are just waiting to disappear. Attribute is absent if Page Pool has not been detached, and can still be used to allocate new memory. + - + name: page-pool-info + subset-of: page-pool + attributes: + - + name: id + - + name: ifindex + - + name: page-pool-stats + doc: | + Page pool statistics, see docs for struct page_pool_stats + for information about individual statistics. + attributes: + - + name: info + doc: Page pool identifying information. + type: nest + nested-attributes: page-pool-info + - + name: alloc-fast + type: uint + value: 8 # reserve some attr ids in case we need more metadata later + - + name: alloc-slow + type: uint + - + name: alloc-slow-high-order + type: uint + - + name: alloc-empty + type: uint + - + name: alloc-refill + type: uint + - + name: alloc-waive + type: uint + - + name: recycle-cached + type: uint + - + name: recycle-cache-full + type: uint + - + name: recycle-ring + type: uint + - + name: recycle-ring-full + type: uint + - + name: recycle-released-refcnt + type: uint operations: list: @@ -210,6 +263,31 @@ name: netdev notify: page-pool-get mcgrp: page-pool config-cond: page-pool + - + name: page-pool-stats-get + doc: Get page pool statistics. 
+ attribute-set: page-pool-stats + do: + request: + attributes: + - info + reply: &pp-stats-reply + attributes: + - info + - alloc-fast + - alloc-slow + - alloc-slow-high-order + - alloc-empty + - alloc-refill + - alloc-waive + - recycle-cached + - recycle-cache-full + - recycle-ring + - recycle-ring-full + - recycle-released-refcnt + dump: + reply: *pp-stats-reply + config-cond: page-pool-stats mcast-groups: list: diff --git a/Documentation/networking/page_pool.rst b/Documentation/networking/page_pool.rst index 60993cb56b32..9d958128a57c 100644 --- a/Documentation/networking/page_pool.rst +++ b/Documentation/networking/page_pool.rst @@ -41,6 +41,11 @@ Architecture overview | Fast cache | | ptr-ring cache | +-----------------+ +------------------+ +Monitoring +========== +Information about page pools on the system can be accessed via the netdev +genetlink family (see Documentation/netlink/specs/netdev.yaml). + API interface ============= The number of pools created **must** match the number of hardware queues @@ -107,8 +112,9 @@ page_pool_get_stats() and structures described below are available. It takes a pointer to a ``struct page_pool`` and a pointer to a struct page_pool_stats allocated by the caller. -The API will fill in the provided struct page_pool_stats with -statistics about the page_pool. +Older drivers expose page pool statistics via ethtool or debugfs. +The same statistics are accessible via the netlink netdev family +in a driver-independent fashion. .. 
kernel-doc:: include/net/page_pool/types.h :identifiers: struct page_pool_recycle_stats diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h index 4ebd544ae977..7dc65774cde5 100644 --- a/include/net/page_pool/helpers.h +++ b/include/net/page_pool/helpers.h @@ -55,16 +55,12 @@ #include #ifdef CONFIG_PAGE_POOL_STATS +/* Deprecated driver-facing API, use netlink instead */ int page_pool_ethtool_stats_get_count(void); u8 *page_pool_ethtool_stats_get_strings(u8 *data); u64 *page_pool_ethtool_stats_get(u64 *data, void *stats); -/* - * Drivers that wish to harvest page pool stats and report them to users - * (perhaps via ethtool, debugfs, or another mechanism) can allocate a - * struct page_pool_stats call page_pool_get_stats to get stats for the specified pool. - */ -bool page_pool_get_stats(struct page_pool *pool, +bool page_pool_get_stats(const struct page_pool *pool, struct page_pool_stats *stats); #else static inline int page_pool_ethtool_stats_get_count(void) diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h index 756410274120..2b37233e00c0 100644 --- a/include/uapi/linux/netdev.h +++ b/include/uapi/linux/netdev.h @@ -76,6 +76,24 @@ enum { NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1) }; +enum { + NETDEV_A_PAGE_POOL_STATS_INFO = 1, + NETDEV_A_PAGE_POOL_STATS_ALLOC_FAST = 8, + NETDEV_A_PAGE_POOL_STATS_ALLOC_SLOW, + NETDEV_A_PAGE_POOL_STATS_ALLOC_SLOW_HIGH_ORDER, + NETDEV_A_PAGE_POOL_STATS_ALLOC_EMPTY, + NETDEV_A_PAGE_POOL_STATS_ALLOC_REFILL, + NETDEV_A_PAGE_POOL_STATS_ALLOC_WAIVE, + NETDEV_A_PAGE_POOL_STATS_RECYCLE_CACHED, + NETDEV_A_PAGE_POOL_STATS_RECYCLE_CACHE_FULL, + NETDEV_A_PAGE_POOL_STATS_RECYCLE_RING, + NETDEV_A_PAGE_POOL_STATS_RECYCLE_RING_FULL, + NETDEV_A_PAGE_POOL_STATS_RECYCLE_RELEASED_REFCNT, + + __NETDEV_A_PAGE_POOL_STATS_MAX, + NETDEV_A_PAGE_POOL_STATS_MAX = (__NETDEV_A_PAGE_POOL_STATS_MAX - 1) +}; + enum { NETDEV_CMD_DEV_GET = 1, NETDEV_CMD_DEV_ADD_NTF, @@ -85,6 +103,7 @@ enum { 
NETDEV_CMD_PAGE_POOL_ADD_NTF, NETDEV_CMD_PAGE_POOL_DEL_NTF, NETDEV_CMD_PAGE_POOL_CHANGE_NTF, + NETDEV_CMD_PAGE_POOL_STATS_GET, __NETDEV_CMD_MAX, NETDEV_CMD_MAX = (__NETDEV_CMD_MAX - 1) diff --git a/net/core/netdev-genl-gen.c b/net/core/netdev-genl-gen.c index 278b43f5ed50..b8af054b9fe5 100644 --- a/net/core/netdev-genl-gen.c +++ b/net/core/netdev-genl-gen.c @@ -16,6 +16,17 @@ static const struct netlink_range_validation netdev_a_page_pool_id_range = { .max = 4294967295, }; +static const struct netlink_range_validation netdev_a_page_pool_ifindex_range = { + .min = 1, + .max = 2147483647, +}; + +/* Common nested types */ +const struct nla_policy netdev_page_pool_info_nl_policy[NETDEV_A_PAGE_POOL_IFINDEX + 1] = { + [NETDEV_A_PAGE_POOL_ID] = NLA_POLICY_FULL_RANGE(NLA_UINT, &netdev_a_page_pool_id_range), + [NETDEV_A_PAGE_POOL_IFINDEX] = NLA_POLICY_FULL_RANGE(NLA_U32, &netdev_a_page_pool_ifindex_range), +}; + /* NETDEV_CMD_DEV_GET - do */ static const struct nla_policy netdev_dev_get_nl_policy[NETDEV_A_DEV_IFINDEX + 1] = { [NETDEV_A_DEV_IFINDEX] = NLA_POLICY_MIN(NLA_U32, 1), @@ -28,6 +39,13 @@ static const struct nla_policy netdev_page_pool_get_nl_policy[NETDEV_A_PAGE_POOL }; #endif /* CONFIG_PAGE_POOL */ +/* NETDEV_CMD_PAGE_POOL_STATS_GET - do */ +#ifdef CONFIG_PAGE_POOL_STATS +static const struct nla_policy netdev_page_pool_stats_get_nl_policy[NETDEV_A_PAGE_POOL_STATS_INFO + 1] = { + [NETDEV_A_PAGE_POOL_STATS_INFO] = NLA_POLICY_NESTED(netdev_page_pool_info_nl_policy), +}; +#endif /* CONFIG_PAGE_POOL_STATS */ + /* Ops table for netdev */ static const struct genl_split_ops netdev_nl_ops[] = { { @@ -56,6 +74,20 @@ static const struct genl_split_ops netdev_nl_ops[] = { .flags = GENL_CMD_CAP_DUMP, }, #endif /* CONFIG_PAGE_POOL */ +#ifdef CONFIG_PAGE_POOL_STATS + { + .cmd = NETDEV_CMD_PAGE_POOL_STATS_GET, + .doit = netdev_nl_page_pool_stats_get_doit, + .policy = netdev_page_pool_stats_get_nl_policy, + .maxattr = NETDEV_A_PAGE_POOL_STATS_INFO, + .flags = GENL_CMD_CAP_DO, + }, 
+ { + .cmd = NETDEV_CMD_PAGE_POOL_STATS_GET, + .dumpit = netdev_nl_page_pool_stats_get_dumpit, + .flags = GENL_CMD_CAP_DUMP, + }, +#endif /* CONFIG_PAGE_POOL_STATS */ }; static const struct genl_multicast_group netdev_nl_mcgrps[] = { diff --git a/net/core/netdev-genl-gen.h b/net/core/netdev-genl-gen.h index 738097847100..649e4b46eccf 100644 --- a/net/core/netdev-genl-gen.h +++ b/net/core/netdev-genl-gen.h @@ -11,11 +11,18 @@ #include +/* Common nested types */ +extern const struct nla_policy netdev_page_pool_info_nl_policy[NETDEV_A_PAGE_POOL_IFINDEX + 1]; + int netdev_nl_dev_get_doit(struct sk_buff *skb, struct genl_info *info); int netdev_nl_dev_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb); int netdev_nl_page_pool_get_doit(struct sk_buff *skb, struct genl_info *info); int netdev_nl_page_pool_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb); +int netdev_nl_page_pool_stats_get_doit(struct sk_buff *skb, + struct genl_info *info); +int netdev_nl_page_pool_stats_get_dumpit(struct sk_buff *skb, + struct netlink_callback *cb); enum { NETDEV_NLGRP_MGMT, diff --git a/net/core/page_pool.c b/net/core/page_pool.c index a821fb5fe054..3d0938a60646 100644 --- a/net/core/page_pool.c +++ b/net/core/page_pool.c @@ -71,7 +71,7 @@ static const char pp_stats[][ETH_GSTRING_LEN] = { * is passed to this API which is filled in. The caller can then report * those stats to the user (perhaps via ethtool, debugfs, etc.). 
*/ -bool page_pool_get_stats(struct page_pool *pool, +bool page_pool_get_stats(const struct page_pool *pool, struct page_pool_stats *stats) { int cpu = 0; diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c index f28ad2179f53..ce56bd3aae2c 100644 --- a/net/core/page_pool_user.c +++ b/net/core/page_pool_user.c @@ -5,6 +5,7 @@ #include #include #include +#include #include #include "page_pool_priv.h" @@ -106,6 +107,108 @@ netdev_nl_page_pool_get_dump(struct sk_buff *skb, struct netlink_callback *cb, return err; } +static int +page_pool_nl_stats_fill(struct sk_buff *rsp, const struct page_pool *pool, + const struct genl_info *info) +{ +#ifdef CONFIG_PAGE_POOL_STATS + struct page_pool_stats stats = {}; + struct nlattr *nest; + void *hdr; + + if (!page_pool_get_stats(pool, &stats)) + return 0; + + hdr = genlmsg_iput(rsp, info); + if (!hdr) + return -EMSGSIZE; + + nest = nla_nest_start(rsp, NETDEV_A_PAGE_POOL_STATS_INFO); + + if (nla_put_uint(rsp, NETDEV_A_PAGE_POOL_ID, pool->user.id) || + (pool->slow.netdev->ifindex != LOOPBACK_IFINDEX && + nla_put_u32(rsp, NETDEV_A_PAGE_POOL_IFINDEX, + pool->slow.netdev->ifindex))) + goto err_cancel_nest; + + nla_nest_end(rsp, nest); + + if (nla_put_uint(rsp, NETDEV_A_PAGE_POOL_STATS_ALLOC_FAST, + stats.alloc_stats.fast) || + nla_put_uint(rsp, NETDEV_A_PAGE_POOL_STATS_ALLOC_SLOW, + stats.alloc_stats.slow) || + nla_put_uint(rsp, NETDEV_A_PAGE_POOL_STATS_ALLOC_SLOW_HIGH_ORDER, + stats.alloc_stats.slow_high_order) || + nla_put_uint(rsp, NETDEV_A_PAGE_POOL_STATS_ALLOC_EMPTY, + stats.alloc_stats.empty) || + nla_put_uint(rsp, NETDEV_A_PAGE_POOL_STATS_ALLOC_REFILL, + stats.alloc_stats.refill) || + nla_put_uint(rsp, NETDEV_A_PAGE_POOL_STATS_ALLOC_WAIVE, + stats.alloc_stats.waive) || + nla_put_uint(rsp, NETDEV_A_PAGE_POOL_STATS_RECYCLE_CACHED, + stats.recycle_stats.cached) || + nla_put_uint(rsp, NETDEV_A_PAGE_POOL_STATS_RECYCLE_CACHE_FULL, + stats.recycle_stats.cache_full) || + nla_put_uint(rsp, 
NETDEV_A_PAGE_POOL_STATS_RECYCLE_RING, + stats.recycle_stats.ring) || + nla_put_uint(rsp, NETDEV_A_PAGE_POOL_STATS_RECYCLE_RING_FULL, + stats.recycle_stats.ring_full) || + nla_put_uint(rsp, NETDEV_A_PAGE_POOL_STATS_RECYCLE_RELEASED_REFCNT, + stats.recycle_stats.released_refcnt)) + goto err_cancel_msg; + + genlmsg_end(rsp, hdr); + + return 0; +err_cancel_nest: + nla_nest_cancel(rsp, nest); +err_cancel_msg: + genlmsg_cancel(rsp, hdr); + return -EMSGSIZE; +#else + GENL_SET_ERR_MSG(info, "kernel built without CONFIG_PAGE_POOL_STATS"); + return -EOPNOTSUPP; +#endif +} + +int netdev_nl_page_pool_stats_get_doit(struct sk_buff *skb, + struct genl_info *info) +{ + struct nlattr *tb[ARRAY_SIZE(netdev_page_pool_info_nl_policy)]; + struct nlattr *nest; + int err; + u32 id; + + if (GENL_REQ_ATTR_CHECK(info, NETDEV_A_PAGE_POOL_STATS_INFO)) + return -EINVAL; + + nest = info->attrs[NETDEV_A_PAGE_POOL_STATS_INFO]; + err = nla_parse_nested(tb, ARRAY_SIZE(tb) - 1, nest, + netdev_page_pool_info_nl_policy, + info->extack); + if (err) + return err; + + if (NL_REQ_ATTR_CHECK(info->extack, nest, tb, NETDEV_A_PAGE_POOL_ID)) + return -EINVAL; + if (tb[NETDEV_A_PAGE_POOL_IFINDEX]) { + NL_SET_ERR_MSG_ATTR(info->extack, + tb[NETDEV_A_PAGE_POOL_IFINDEX], + "selecting by ifindex not supported"); + return -EINVAL; + } + + id = nla_get_uint(tb[NETDEV_A_PAGE_POOL_ID]); + + return netdev_nl_page_pool_get_do(info, id, page_pool_nl_stats_fill); +} + +int netdev_nl_page_pool_stats_get_dumpit(struct sk_buff *skb, + struct netlink_callback *cb) +{ + return netdev_nl_page_pool_get_dump(skb, cb, page_pool_nl_stats_fill); +} + static int page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool, const struct genl_info *info) From patchwork Wed Nov 22 03:44:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Kicinski X-Patchwork-Id: 13463919 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp.kernel.org 
(aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3BC07154AD for ; Wed, 22 Nov 2023 03:44:33 +0000 (UTC) From: Jakub Kicinski To: davem@davemloft.net Cc: netdev@vger.kernel.org, edumazet@google.com, pabeni@redhat.com, almasrymina@google.com, hawk@kernel.org, ilias.apalodimas@linaro.org, dsahern@gmail.com, dtatulea@nvidia.com, willemb@google.com, Jakub Kicinski Subject: [PATCH net-next v3 12/13] net: page_pool: mute the periodic warning for visible page pools Date: Tue, 21 Nov 2023 19:44:19 -0800 Message-ID: <20231122034420.1158898-13-kuba@kernel.org> In-Reply-To: <20231122034420.1158898-1-kuba@kernel.org> References: <20231122034420.1158898-1-kuba@kernel.org> X-Patchwork-Delegate: kuba@kernel.org Mute the periodic "stalled pool shutdown" warning if the page pool is visible to user space.
Rolling out a driver using page pools to just a few hundred hosts at Meta surfaces applications which fail to reap their broken sockets. Obviously it's best if the applications are fixed, but we don't generally print warnings for application resource leaks. Admins can now depend on the netlink interface for getting page pool info to detect buggy apps. While at it throw in the ID of the pool into the message, in rare cases (pools from destroyed netns) this will make finding the pool with a debugger easier. Signed-off-by: Jakub Kicinski Reviewed-by: Eric Dumazet Acked-by: Jesper Dangaard Brouer --- net/core/page_pool.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/net/core/page_pool.c b/net/core/page_pool.c index 3d0938a60646..c2e7c9a6efbe 100644 --- a/net/core/page_pool.c +++ b/net/core/page_pool.c @@ -897,18 +897,21 @@ static void page_pool_release_retry(struct work_struct *wq) { struct delayed_work *dwq = to_delayed_work(wq); struct page_pool *pool = container_of(dwq, typeof(*pool), release_dw); + void *netdev; int inflight; inflight = page_pool_release(pool); if (!inflight) return; - /* Periodic warning */ - if (time_after_eq(jiffies, pool->defer_warn)) { + /* Periodic warning for page pools the user can't see */ + netdev = READ_ONCE(pool->slow.netdev); + if (time_after_eq(jiffies, pool->defer_warn) && + (!netdev || netdev == NET_PTR_POISON)) { int sec = (s32)((u32)jiffies - (u32)pool->defer_start) / HZ; - pr_warn("%s() stalled pool shutdown %d inflight %d sec\n", - __func__, inflight, sec); + pr_warn("%s() stalled pool shutdown: id %u, %d inflight %d sec\n", + __func__, pool->user.id, inflight, sec); pool->defer_warn = jiffies + DEFER_WARN_INTERVAL; } From patchwork Wed Nov 22 03:44:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Kicinski X-Patchwork-Id: 13463920 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp.kernel.org 
(aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E46F9156FA for ; Wed, 22 Nov 2023 03:44:33 +0000 (UTC) From: Jakub Kicinski To: davem@davemloft.net Cc: netdev@vger.kernel.org, edumazet@google.com, pabeni@redhat.com, almasrymina@google.com, hawk@kernel.org, ilias.apalodimas@linaro.org, dsahern@gmail.com, dtatulea@nvidia.com, willemb@google.com, Jakub Kicinski Subject: [PATCH net-next v3 13/13] tools: ynl: add sample for getting page-pool information Date: Tue, 21 Nov 2023 19:44:20 -0800 Message-ID: <20231122034420.1158898-14-kuba@kernel.org> In-Reply-To: <20231122034420.1158898-1-kuba@kernel.org> References: <20231122034420.1158898-1-kuba@kernel.org> X-Patchwork-Delegate: kuba@kernel.org Regenerate the tools/ code after netdev spec changes.
Add sample to query page-pool info in a concise fashion: $ ./page-pool eth0[2] page pools: 10 (zombies: 0) refs: 41984 bytes: 171966464 (refs: 0 bytes: 0) recycling: 90.3% (alloc: 656:397681 recycle: 89652:270201) Signed-off-by: Jakub Kicinski Acked-by: Jesper Dangaard Brouer --- tools/include/uapi/linux/netdev.h | 36 +++ tools/net/ynl/generated/netdev-user.c | 419 ++++++++++++++++++++++++++ tools/net/ynl/generated/netdev-user.h | 171 +++++++++++ tools/net/ynl/lib/ynl.h | 2 +- tools/net/ynl/samples/.gitignore | 1 + tools/net/ynl/samples/Makefile | 2 +- tools/net/ynl/samples/page-pool.c | 147 +++++++++ 7 files changed, 776 insertions(+), 2 deletions(-) create mode 100644 tools/net/ynl/samples/page-pool.c diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h index 2943a151d4f1..9d0a48717d1f 100644 --- a/tools/include/uapi/linux/netdev.h +++ b/tools/include/uapi/linux/netdev.h @@ -64,16 +64,52 @@ enum { NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1) }; +enum { + NETDEV_A_PAGE_POOL_ID = 1, + NETDEV_A_PAGE_POOL_IFINDEX, + NETDEV_A_PAGE_POOL_NAPI_ID, + NETDEV_A_PAGE_POOL_INFLIGHT, + NETDEV_A_PAGE_POOL_INFLIGHT_MEM, + NETDEV_A_PAGE_POOL_DESTROYED, + + __NETDEV_A_PAGE_POOL_MAX, + NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1) +}; + +enum { + NETDEV_A_PAGE_POOL_STATS_INFO = 1, + NETDEV_A_PAGE_POOL_STATS_ALLOC_FAST = 8, + NETDEV_A_PAGE_POOL_STATS_ALLOC_SLOW, + NETDEV_A_PAGE_POOL_STATS_ALLOC_SLOW_HIGH_ORDER, + NETDEV_A_PAGE_POOL_STATS_ALLOC_EMPTY, + NETDEV_A_PAGE_POOL_STATS_ALLOC_REFILL, + NETDEV_A_PAGE_POOL_STATS_ALLOC_WAIVE, + NETDEV_A_PAGE_POOL_STATS_RECYCLE_CACHED, + NETDEV_A_PAGE_POOL_STATS_RECYCLE_CACHE_FULL, + NETDEV_A_PAGE_POOL_STATS_RECYCLE_RING, + NETDEV_A_PAGE_POOL_STATS_RECYCLE_RING_FULL, + NETDEV_A_PAGE_POOL_STATS_RECYCLE_RELEASED_REFCNT, + + __NETDEV_A_PAGE_POOL_STATS_MAX, + NETDEV_A_PAGE_POOL_STATS_MAX = (__NETDEV_A_PAGE_POOL_STATS_MAX - 1) +}; + enum { NETDEV_CMD_DEV_GET = 1, NETDEV_CMD_DEV_ADD_NTF, NETDEV_CMD_DEV_DEL_NTF, 
NETDEV_CMD_DEV_CHANGE_NTF, + NETDEV_CMD_PAGE_POOL_GET, + NETDEV_CMD_PAGE_POOL_ADD_NTF, + NETDEV_CMD_PAGE_POOL_DEL_NTF, + NETDEV_CMD_PAGE_POOL_CHANGE_NTF, + NETDEV_CMD_PAGE_POOL_STATS_GET, __NETDEV_CMD_MAX, NETDEV_CMD_MAX = (__NETDEV_CMD_MAX - 1) }; #define NETDEV_MCGRP_MGMT "mgmt" +#define NETDEV_MCGRP_PAGE_POOL "page-pool" #endif /* _UAPI_LINUX_NETDEV_H */ diff --git a/tools/net/ynl/generated/netdev-user.c b/tools/net/ynl/generated/netdev-user.c index b5ffe8cd1144..f9a000584493 100644 --- a/tools/net/ynl/generated/netdev-user.c +++ b/tools/net/ynl/generated/netdev-user.c @@ -18,6 +18,11 @@ static const char * const netdev_op_strmap[] = { [NETDEV_CMD_DEV_ADD_NTF] = "dev-add-ntf", [NETDEV_CMD_DEV_DEL_NTF] = "dev-del-ntf", [NETDEV_CMD_DEV_CHANGE_NTF] = "dev-change-ntf", + [NETDEV_CMD_PAGE_POOL_GET] = "page-pool-get", + [NETDEV_CMD_PAGE_POOL_ADD_NTF] = "page-pool-add-ntf", + [NETDEV_CMD_PAGE_POOL_DEL_NTF] = "page-pool-del-ntf", + [NETDEV_CMD_PAGE_POOL_CHANGE_NTF] = "page-pool-change-ntf", + [NETDEV_CMD_PAGE_POOL_STATS_GET] = "page-pool-stats-get", }; const char *netdev_op_str(int op) @@ -59,6 +64,16 @@ const char *netdev_xdp_rx_metadata_str(enum netdev_xdp_rx_metadata value) } /* Policies */ +struct ynl_policy_attr netdev_page_pool_info_policy[NETDEV_A_PAGE_POOL_MAX + 1] = { + [NETDEV_A_PAGE_POOL_ID] = { .name = "id", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_IFINDEX] = { .name = "ifindex", .type = YNL_PT_U32, }, +}; + +struct ynl_policy_nest netdev_page_pool_info_nest = { + .max_attr = NETDEV_A_PAGE_POOL_MAX, + .table = netdev_page_pool_info_policy, +}; + struct ynl_policy_attr netdev_dev_policy[NETDEV_A_DEV_MAX + 1] = { [NETDEV_A_DEV_IFINDEX] = { .name = "ifindex", .type = YNL_PT_U32, }, [NETDEV_A_DEV_PAD] = { .name = "pad", .type = YNL_PT_IGNORE, }, @@ -72,7 +87,85 @@ struct ynl_policy_nest netdev_dev_nest = { .table = netdev_dev_policy, }; +struct ynl_policy_attr netdev_page_pool_policy[NETDEV_A_PAGE_POOL_MAX + 1] = { + [NETDEV_A_PAGE_POOL_ID] = { .name = 
"id", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_IFINDEX] = { .name = "ifindex", .type = YNL_PT_U32, }, + [NETDEV_A_PAGE_POOL_NAPI_ID] = { .name = "napi-id", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_INFLIGHT] = { .name = "inflight", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_INFLIGHT_MEM] = { .name = "inflight-mem", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_DESTROYED] = { .name = "destroyed", .type = YNL_PT_UINT, }, +}; + +struct ynl_policy_nest netdev_page_pool_nest = { + .max_attr = NETDEV_A_PAGE_POOL_MAX, + .table = netdev_page_pool_policy, +}; + +struct ynl_policy_attr netdev_page_pool_stats_policy[NETDEV_A_PAGE_POOL_STATS_MAX + 1] = { + [NETDEV_A_PAGE_POOL_STATS_INFO] = { .name = "info", .type = YNL_PT_NEST, .nest = &netdev_page_pool_info_nest, }, + [NETDEV_A_PAGE_POOL_STATS_ALLOC_FAST] = { .name = "alloc-fast", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_STATS_ALLOC_SLOW] = { .name = "alloc-slow", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_STATS_ALLOC_SLOW_HIGH_ORDER] = { .name = "alloc-slow-high-order", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_STATS_ALLOC_EMPTY] = { .name = "alloc-empty", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_STATS_ALLOC_REFILL] = { .name = "alloc-refill", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_STATS_ALLOC_WAIVE] = { .name = "alloc-waive", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_STATS_RECYCLE_CACHED] = { .name = "recycle-cached", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_STATS_RECYCLE_CACHE_FULL] = { .name = "recycle-cache-full", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_STATS_RECYCLE_RING] = { .name = "recycle-ring", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_STATS_RECYCLE_RING_FULL] = { .name = "recycle-ring-full", .type = YNL_PT_UINT, }, + [NETDEV_A_PAGE_POOL_STATS_RECYCLE_RELEASED_REFCNT] = { .name = "recycle-released-refcnt", .type = YNL_PT_UINT, }, +}; + +struct ynl_policy_nest netdev_page_pool_stats_nest = { + .max_attr = NETDEV_A_PAGE_POOL_STATS_MAX, + .table = 
netdev_page_pool_stats_policy, +}; + /* Common nested types */ +void netdev_page_pool_info_free(struct netdev_page_pool_info *obj) +{ +} + +int netdev_page_pool_info_put(struct nlmsghdr *nlh, unsigned int attr_type, + struct netdev_page_pool_info *obj) +{ + struct nlattr *nest; + + nest = mnl_attr_nest_start(nlh, attr_type); + if (obj->_present.id) + mnl_attr_put_uint(nlh, NETDEV_A_PAGE_POOL_ID, obj->id); + if (obj->_present.ifindex) + mnl_attr_put_u32(nlh, NETDEV_A_PAGE_POOL_IFINDEX, obj->ifindex); + mnl_attr_nest_end(nlh, nest); + + return 0; +} + +int netdev_page_pool_info_parse(struct ynl_parse_arg *yarg, + const struct nlattr *nested) +{ + struct netdev_page_pool_info *dst = yarg->data; + const struct nlattr *attr; + + mnl_attr_for_each_nested(attr, nested) { + unsigned int type = mnl_attr_get_type(attr); + + if (type == NETDEV_A_PAGE_POOL_ID) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.id = 1; + dst->id = mnl_attr_get_uint(attr); + } else if (type == NETDEV_A_PAGE_POOL_IFINDEX) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.ifindex = 1; + dst->ifindex = mnl_attr_get_u32(attr); + } + } + + return 0; +} + /* ============== NETDEV_CMD_DEV_GET ============== */ /* NETDEV_CMD_DEV_GET - do */ void netdev_dev_get_req_free(struct netdev_dev_get_req *req) @@ -197,6 +290,314 @@ void netdev_dev_get_ntf_free(struct netdev_dev_get_ntf *rsp) free(rsp); } +/* ============== NETDEV_CMD_PAGE_POOL_GET ============== */ +/* NETDEV_CMD_PAGE_POOL_GET - do */ +void netdev_page_pool_get_req_free(struct netdev_page_pool_get_req *req) +{ + free(req); +} + +void netdev_page_pool_get_rsp_free(struct netdev_page_pool_get_rsp *rsp) +{ + free(rsp); +} + +int netdev_page_pool_get_rsp_parse(const struct nlmsghdr *nlh, void *data) +{ + struct netdev_page_pool_get_rsp *dst; + struct ynl_parse_arg *yarg = data; + const struct nlattr *attr; + + dst = yarg->data; + + mnl_attr_for_each(attr, nlh, sizeof(struct genlmsghdr)) { + 
unsigned int type = mnl_attr_get_type(attr); + + if (type == NETDEV_A_PAGE_POOL_ID) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.id = 1; + dst->id = mnl_attr_get_uint(attr); + } else if (type == NETDEV_A_PAGE_POOL_IFINDEX) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.ifindex = 1; + dst->ifindex = mnl_attr_get_u32(attr); + } else if (type == NETDEV_A_PAGE_POOL_NAPI_ID) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.napi_id = 1; + dst->napi_id = mnl_attr_get_uint(attr); + } else if (type == NETDEV_A_PAGE_POOL_INFLIGHT) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.inflight = 1; + dst->inflight = mnl_attr_get_uint(attr); + } else if (type == NETDEV_A_PAGE_POOL_INFLIGHT_MEM) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.inflight_mem = 1; + dst->inflight_mem = mnl_attr_get_uint(attr); + } else if (type == NETDEV_A_PAGE_POOL_DESTROYED) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.destroyed = 1; + dst->destroyed = mnl_attr_get_uint(attr); + } + } + + return MNL_CB_OK; +} + +struct netdev_page_pool_get_rsp * +netdev_page_pool_get(struct ynl_sock *ys, struct netdev_page_pool_get_req *req) +{ + struct ynl_req_state yrs = { .yarg = { .ys = ys, }, }; + struct netdev_page_pool_get_rsp *rsp; + struct nlmsghdr *nlh; + int err; + + nlh = ynl_gemsg_start_req(ys, ys->family_id, NETDEV_CMD_PAGE_POOL_GET, 1); + ys->req_policy = &netdev_page_pool_nest; + yrs.yarg.rsp_policy = &netdev_page_pool_nest; + + if (req->_present.id) + mnl_attr_put_uint(nlh, NETDEV_A_PAGE_POOL_ID, req->id); + + rsp = calloc(1, sizeof(*rsp)); + yrs.yarg.data = rsp; + yrs.cb = netdev_page_pool_get_rsp_parse; + yrs.rsp_cmd = NETDEV_CMD_PAGE_POOL_GET; + + err = ynl_exec(ys, nlh, &yrs); + if (err < 0) + goto err_free; + + return rsp; + +err_free: + netdev_page_pool_get_rsp_free(rsp); + return NULL; +} + +/* 
NETDEV_CMD_PAGE_POOL_GET - dump */ +void netdev_page_pool_get_list_free(struct netdev_page_pool_get_list *rsp) +{ + struct netdev_page_pool_get_list *next = rsp; + + while ((void *)next != YNL_LIST_END) { + rsp = next; + next = rsp->next; + + free(rsp); + } +} + +struct netdev_page_pool_get_list * +netdev_page_pool_get_dump(struct ynl_sock *ys) +{ + struct ynl_dump_state yds = {}; + struct nlmsghdr *nlh; + int err; + + yds.ys = ys; + yds.alloc_sz = sizeof(struct netdev_page_pool_get_list); + yds.cb = netdev_page_pool_get_rsp_parse; + yds.rsp_cmd = NETDEV_CMD_PAGE_POOL_GET; + yds.rsp_policy = &netdev_page_pool_nest; + + nlh = ynl_gemsg_start_dump(ys, ys->family_id, NETDEV_CMD_PAGE_POOL_GET, 1); + + err = ynl_exec_dump(ys, nlh, &yds); + if (err < 0) + goto free_list; + + return yds.first; + +free_list: + netdev_page_pool_get_list_free(yds.first); + return NULL; +} + +/* NETDEV_CMD_PAGE_POOL_GET - notify */ +void netdev_page_pool_get_ntf_free(struct netdev_page_pool_get_ntf *rsp) +{ + free(rsp); +} + +/* ============== NETDEV_CMD_PAGE_POOL_STATS_GET ============== */ +/* NETDEV_CMD_PAGE_POOL_STATS_GET - do */ +void +netdev_page_pool_stats_get_req_free(struct netdev_page_pool_stats_get_req *req) +{ + netdev_page_pool_info_free(&req->info); + free(req); +} + +void +netdev_page_pool_stats_get_rsp_free(struct netdev_page_pool_stats_get_rsp *rsp) +{ + netdev_page_pool_info_free(&rsp->info); + free(rsp); +} + +int netdev_page_pool_stats_get_rsp_parse(const struct nlmsghdr *nlh, + void *data) +{ + struct netdev_page_pool_stats_get_rsp *dst; + struct ynl_parse_arg *yarg = data; + const struct nlattr *attr; + struct ynl_parse_arg parg; + + dst = yarg->data; + parg.ys = yarg->ys; + + mnl_attr_for_each(attr, nlh, sizeof(struct genlmsghdr)) { + unsigned int type = mnl_attr_get_type(attr); + + if (type == NETDEV_A_PAGE_POOL_STATS_INFO) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.info = 1; + + parg.rsp_policy = &netdev_page_pool_info_nest; + 
parg.data = &dst->info; + if (netdev_page_pool_info_parse(&parg, attr)) + return MNL_CB_ERROR; + } else if (type == NETDEV_A_PAGE_POOL_STATS_ALLOC_FAST) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.alloc_fast = 1; + dst->alloc_fast = mnl_attr_get_uint(attr); + } else if (type == NETDEV_A_PAGE_POOL_STATS_ALLOC_SLOW) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.alloc_slow = 1; + dst->alloc_slow = mnl_attr_get_uint(attr); + } else if (type == NETDEV_A_PAGE_POOL_STATS_ALLOC_SLOW_HIGH_ORDER) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.alloc_slow_high_order = 1; + dst->alloc_slow_high_order = mnl_attr_get_uint(attr); + } else if (type == NETDEV_A_PAGE_POOL_STATS_ALLOC_EMPTY) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.alloc_empty = 1; + dst->alloc_empty = mnl_attr_get_uint(attr); + } else if (type == NETDEV_A_PAGE_POOL_STATS_ALLOC_REFILL) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.alloc_refill = 1; + dst->alloc_refill = mnl_attr_get_uint(attr); + } else if (type == NETDEV_A_PAGE_POOL_STATS_ALLOC_WAIVE) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.alloc_waive = 1; + dst->alloc_waive = mnl_attr_get_uint(attr); + } else if (type == NETDEV_A_PAGE_POOL_STATS_RECYCLE_CACHED) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.recycle_cached = 1; + dst->recycle_cached = mnl_attr_get_uint(attr); + } else if (type == NETDEV_A_PAGE_POOL_STATS_RECYCLE_CACHE_FULL) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.recycle_cache_full = 1; + dst->recycle_cache_full = mnl_attr_get_uint(attr); + } else if (type == NETDEV_A_PAGE_POOL_STATS_RECYCLE_RING) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.recycle_ring = 1; + dst->recycle_ring = mnl_attr_get_uint(attr); + } else if (type == 
NETDEV_A_PAGE_POOL_STATS_RECYCLE_RING_FULL) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.recycle_ring_full = 1; + dst->recycle_ring_full = mnl_attr_get_uint(attr); + } else if (type == NETDEV_A_PAGE_POOL_STATS_RECYCLE_RELEASED_REFCNT) { + if (ynl_attr_validate(yarg, attr)) + return MNL_CB_ERROR; + dst->_present.recycle_released_refcnt = 1; + dst->recycle_released_refcnt = mnl_attr_get_uint(attr); + } + } + + return MNL_CB_OK; +} + +struct netdev_page_pool_stats_get_rsp * +netdev_page_pool_stats_get(struct ynl_sock *ys, + struct netdev_page_pool_stats_get_req *req) +{ + struct ynl_req_state yrs = { .yarg = { .ys = ys, }, }; + struct netdev_page_pool_stats_get_rsp *rsp; + struct nlmsghdr *nlh; + int err; + + nlh = ynl_gemsg_start_req(ys, ys->family_id, NETDEV_CMD_PAGE_POOL_STATS_GET, 1); + ys->req_policy = &netdev_page_pool_stats_nest; + yrs.yarg.rsp_policy = &netdev_page_pool_stats_nest; + + if (req->_present.info) + netdev_page_pool_info_put(nlh, NETDEV_A_PAGE_POOL_STATS_INFO, &req->info); + + rsp = calloc(1, sizeof(*rsp)); + yrs.yarg.data = rsp; + yrs.cb = netdev_page_pool_stats_get_rsp_parse; + yrs.rsp_cmd = NETDEV_CMD_PAGE_POOL_STATS_GET; + + err = ynl_exec(ys, nlh, &yrs); + if (err < 0) + goto err_free; + + return rsp; + +err_free: + netdev_page_pool_stats_get_rsp_free(rsp); + return NULL; +} + +/* NETDEV_CMD_PAGE_POOL_STATS_GET - dump */ +void +netdev_page_pool_stats_get_list_free(struct netdev_page_pool_stats_get_list *rsp) +{ + struct netdev_page_pool_stats_get_list *next = rsp; + + while ((void *)next != YNL_LIST_END) { + rsp = next; + next = rsp->next; + + netdev_page_pool_info_free(&rsp->obj.info); + free(rsp); + } +} + +struct netdev_page_pool_stats_get_list * +netdev_page_pool_stats_get_dump(struct ynl_sock *ys) +{ + struct ynl_dump_state yds = {}; + struct nlmsghdr *nlh; + int err; + + yds.ys = ys; + yds.alloc_sz = sizeof(struct netdev_page_pool_stats_get_list); + yds.cb = netdev_page_pool_stats_get_rsp_parse; + 
yds.rsp_cmd = NETDEV_CMD_PAGE_POOL_STATS_GET; + yds.rsp_policy = &netdev_page_pool_stats_nest; + + nlh = ynl_gemsg_start_dump(ys, ys->family_id, NETDEV_CMD_PAGE_POOL_STATS_GET, 1); + + err = ynl_exec_dump(ys, nlh, &yds); + if (err < 0) + goto free_list; + + return yds.first; + +free_list: + netdev_page_pool_stats_get_list_free(yds.first); + return NULL; +} + static const struct ynl_ntf_info netdev_ntf_info[] = { [NETDEV_CMD_DEV_ADD_NTF] = { .alloc_sz = sizeof(struct netdev_dev_get_ntf), @@ -216,6 +617,24 @@ static const struct ynl_ntf_info netdev_ntf_info[] = { .policy = &netdev_dev_nest, .free = (void *)netdev_dev_get_ntf_free, }, + [NETDEV_CMD_PAGE_POOL_ADD_NTF] = { + .alloc_sz = sizeof(struct netdev_page_pool_get_ntf), + .cb = netdev_page_pool_get_rsp_parse, + .policy = &netdev_page_pool_nest, + .free = (void *)netdev_page_pool_get_ntf_free, + }, + [NETDEV_CMD_PAGE_POOL_DEL_NTF] = { + .alloc_sz = sizeof(struct netdev_page_pool_get_ntf), + .cb = netdev_page_pool_get_rsp_parse, + .policy = &netdev_page_pool_nest, + .free = (void *)netdev_page_pool_get_ntf_free, + }, + [NETDEV_CMD_PAGE_POOL_CHANGE_NTF] = { + .alloc_sz = sizeof(struct netdev_page_pool_get_ntf), + .cb = netdev_page_pool_get_rsp_parse, + .policy = &netdev_page_pool_nest, + .free = (void *)netdev_page_pool_get_ntf_free, + }, }; const struct ynl_family ynl_netdev_family = { diff --git a/tools/net/ynl/generated/netdev-user.h b/tools/net/ynl/generated/netdev-user.h index 4fafac879df3..f27ad84fe020 100644 --- a/tools/net/ynl/generated/netdev-user.h +++ b/tools/net/ynl/generated/netdev-user.h @@ -21,6 +21,16 @@ const char *netdev_xdp_act_str(enum netdev_xdp_act value); const char *netdev_xdp_rx_metadata_str(enum netdev_xdp_rx_metadata value); /* Common nested types */ +struct netdev_page_pool_info { + struct { + __u32 id:1; + __u32 ifindex:1; + } _present; + + __u64 id; + __u32 ifindex; +}; + /* ============== NETDEV_CMD_DEV_GET ============== */ /* NETDEV_CMD_DEV_GET - do */ struct netdev_dev_get_req { @@ 
-87,4 +97,165 @@ struct netdev_dev_get_ntf { void netdev_dev_get_ntf_free(struct netdev_dev_get_ntf *rsp); +/* ============== NETDEV_CMD_PAGE_POOL_GET ============== */ +/* NETDEV_CMD_PAGE_POOL_GET - do */ +struct netdev_page_pool_get_req { + struct { + __u32 id:1; + } _present; + + __u64 id; +}; + +static inline struct netdev_page_pool_get_req * +netdev_page_pool_get_req_alloc(void) +{ + return calloc(1, sizeof(struct netdev_page_pool_get_req)); +} +void netdev_page_pool_get_req_free(struct netdev_page_pool_get_req *req); + +static inline void +netdev_page_pool_get_req_set_id(struct netdev_page_pool_get_req *req, __u64 id) +{ + req->_present.id = 1; + req->id = id; +} + +struct netdev_page_pool_get_rsp { + struct { + __u32 id:1; + __u32 ifindex:1; + __u32 napi_id:1; + __u32 inflight:1; + __u32 inflight_mem:1; + __u32 destroyed:1; + } _present; + + __u64 id; + __u32 ifindex; + __u64 napi_id; + __u64 inflight; + __u64 inflight_mem; + __u64 destroyed; +}; + +void netdev_page_pool_get_rsp_free(struct netdev_page_pool_get_rsp *rsp); + +/* + * Get / dump information about Page Pools. +(Only Page Pools associated with a net_device can be listed.) 
+ + */ +struct netdev_page_pool_get_rsp * +netdev_page_pool_get(struct ynl_sock *ys, struct netdev_page_pool_get_req *req); + +/* NETDEV_CMD_PAGE_POOL_GET - dump */ +struct netdev_page_pool_get_list { + struct netdev_page_pool_get_list *next; + struct netdev_page_pool_get_rsp obj __attribute__((aligned(8))); +}; + +void netdev_page_pool_get_list_free(struct netdev_page_pool_get_list *rsp); + +struct netdev_page_pool_get_list * +netdev_page_pool_get_dump(struct ynl_sock *ys); + +/* NETDEV_CMD_PAGE_POOL_GET - notify */ +struct netdev_page_pool_get_ntf { + __u16 family; + __u8 cmd; + struct ynl_ntf_base_type *next; + void (*free)(struct netdev_page_pool_get_ntf *ntf); + struct netdev_page_pool_get_rsp obj __attribute__((aligned(8))); +}; + +void netdev_page_pool_get_ntf_free(struct netdev_page_pool_get_ntf *rsp); + +/* ============== NETDEV_CMD_PAGE_POOL_STATS_GET ============== */ +/* NETDEV_CMD_PAGE_POOL_STATS_GET - do */ +struct netdev_page_pool_stats_get_req { + struct { + __u32 info:1; + } _present; + + struct netdev_page_pool_info info; +}; + +static inline struct netdev_page_pool_stats_get_req * +netdev_page_pool_stats_get_req_alloc(void) +{ + return calloc(1, sizeof(struct netdev_page_pool_stats_get_req)); +} +void +netdev_page_pool_stats_get_req_free(struct netdev_page_pool_stats_get_req *req); + +static inline void +netdev_page_pool_stats_get_req_set_info_id(struct netdev_page_pool_stats_get_req *req, + __u64 id) +{ + req->_present.info = 1; + req->info._present.id = 1; + req->info.id = id; +} +static inline void +netdev_page_pool_stats_get_req_set_info_ifindex(struct netdev_page_pool_stats_get_req *req, + __u32 ifindex) +{ + req->_present.info = 1; + req->info._present.ifindex = 1; + req->info.ifindex = ifindex; +} + +struct netdev_page_pool_stats_get_rsp { + struct { + __u32 info:1; + __u32 alloc_fast:1; + __u32 alloc_slow:1; + __u32 alloc_slow_high_order:1; + __u32 alloc_empty:1; + __u32 alloc_refill:1; + __u32 alloc_waive:1; + __u32 recycle_cached:1; + 
__u32 recycle_cache_full:1; + __u32 recycle_ring:1; + __u32 recycle_ring_full:1; + __u32 recycle_released_refcnt:1; + } _present; + + struct netdev_page_pool_info info; + __u64 alloc_fast; + __u64 alloc_slow; + __u64 alloc_slow_high_order; + __u64 alloc_empty; + __u64 alloc_refill; + __u64 alloc_waive; + __u64 recycle_cached; + __u64 recycle_cache_full; + __u64 recycle_ring; + __u64 recycle_ring_full; + __u64 recycle_released_refcnt; +}; + +void +netdev_page_pool_stats_get_rsp_free(struct netdev_page_pool_stats_get_rsp *rsp); + +/* + * Get page pool statistics. + */ +struct netdev_page_pool_stats_get_rsp * +netdev_page_pool_stats_get(struct ynl_sock *ys, + struct netdev_page_pool_stats_get_req *req); + +/* NETDEV_CMD_PAGE_POOL_STATS_GET - dump */ +struct netdev_page_pool_stats_get_list { + struct netdev_page_pool_stats_get_list *next; + struct netdev_page_pool_stats_get_rsp obj __attribute__((aligned(8))); +}; + +void +netdev_page_pool_stats_get_list_free(struct netdev_page_pool_stats_get_list *rsp); + +struct netdev_page_pool_stats_get_list * +netdev_page_pool_stats_get_dump(struct ynl_sock *ys); + #endif /* _LINUX_NETDEV_GEN_H */ diff --git a/tools/net/ynl/lib/ynl.h b/tools/net/ynl/lib/ynl.h index e974378e3b8c..075d868f3b57 100644 --- a/tools/net/ynl/lib/ynl.h +++ b/tools/net/ynl/lib/ynl.h @@ -239,7 +239,7 @@ int ynl_error_parse(struct ynl_parse_arg *yarg, const char *msg); #ifndef MNL_HAS_AUTO_SCALARS static inline uint64_t mnl_attr_get_uint(const struct nlattr *attr) { - if (mnl_attr_get_len(attr) == 4) + if (mnl_attr_get_payload_len(attr) == 4) return mnl_attr_get_u32(attr); return mnl_attr_get_u64(attr); } diff --git a/tools/net/ynl/samples/.gitignore b/tools/net/ynl/samples/.gitignore index 2aae60c4829f..49637b26c482 100644 --- a/tools/net/ynl/samples/.gitignore +++ b/tools/net/ynl/samples/.gitignore @@ -1,3 +1,4 @@ ethtool devlink netdev +page-pool \ No newline at end of file diff --git a/tools/net/ynl/samples/Makefile b/tools/net/ynl/samples/Makefile index 
3dbb106e87d9..1afefc266b7a 100644 --- a/tools/net/ynl/samples/Makefile +++ b/tools/net/ynl/samples/Makefile @@ -18,7 +18,7 @@ include $(wildcard *.d) all: $(BINS) -$(BINS): ../lib/ynl.a ../generated/protos.a +$(BINS): ../lib/ynl.a ../generated/protos.a $(SRCS) @echo -e '\tCC sample $@' @$(COMPILE.c) $(CFLAGS_$@) $@.c -o $@.o @$(LINK.c) $@.o -o $@ $(LDLIBS) diff --git a/tools/net/ynl/samples/page-pool.c b/tools/net/ynl/samples/page-pool.c new file mode 100644 index 000000000000..18d359713469 --- /dev/null +++ b/tools/net/ynl/samples/page-pool.c @@ -0,0 +1,147 @@ +// SPDX-License-Identifier: GPL-2.0 +#define _GNU_SOURCE + +#include <stdio.h> +#include <string.h> + +#include <ynl.h> + +#include <net/if.h> + +#include "netdev-user.h" + +struct stat { + unsigned int ifc; + + struct { + unsigned int cnt; + size_t refs, bytes; + } live[2]; + + size_t alloc_slow, alloc_fast, recycle_ring, recycle_cache; +}; + +struct stats_array { + unsigned int i, max; + struct stat *s; +}; + +static struct stat *find_ifc(struct stats_array *a, unsigned int ifindex) +{ + unsigned int i; + + for (i = 0; i < a->i; i++) { + if (a->s[i].ifc == ifindex) + return &a->s[i]; + } + + a->i++; + if (a->i == a->max) { + a->max *= 2; + a->s = reallocarray(a->s, a->max, sizeof(*a->s)); + } + a->s[i].ifc = ifindex; + return &a->s[i]; +} + +static void count(struct stat *s, unsigned int l, + struct netdev_page_pool_get_rsp *pp) +{ + s->live[l].cnt++; + if (pp->_present.inflight) + s->live[l].refs += pp->inflight; + if (pp->_present.inflight_mem) + s->live[l].bytes += pp->inflight_mem; +} + +int main(int argc, char **argv) +{ + struct netdev_page_pool_stats_get_list *pp_stats; + struct netdev_page_pool_get_list *pools; + struct stats_array a = {}; + struct ynl_error yerr; + struct ynl_sock *ys; + + ys = ynl_sock_create(&ynl_netdev_family, &yerr); + if (!ys) { + fprintf(stderr, "YNL: %s\n", yerr.msg); + return 1; + } + + a.max = 128; + a.s = calloc(a.max, sizeof(*a.s)); + if (!a.s) + goto err_close; + + pools = netdev_page_pool_get_dump(ys); + 
if (!pools) + goto err_free; + + ynl_dump_foreach(pools, pp) { + struct stat *s = find_ifc(&a, pp->ifindex); + + count(s, 1, pp); + if (pp->_present.destroyed) + count(s, 0, pp); + } + netdev_page_pool_get_list_free(pools); + + pp_stats = netdev_page_pool_stats_get_dump(ys); + if (!pp_stats) + goto err_free; + + ynl_dump_foreach(pp_stats, pp) { + struct stat *s = find_ifc(&a, pp->info.ifindex); + + if (pp->_present.alloc_fast) + s->alloc_fast += pp->alloc_fast; + if (pp->_present.alloc_slow) + s->alloc_slow += pp->alloc_slow; + if (pp->_present.recycle_ring) + s->recycle_ring += pp->recycle_ring; + if (pp->_present.recycle_cached) + s->recycle_cache += pp->recycle_cached; + } + netdev_page_pool_stats_get_list_free(pp_stats); + + for (unsigned int i = 0; i < a.i; i++) { + char ifname[IF_NAMESIZE]; + struct stat *s = &a.s[i]; + const char *name; + double recycle; + + if (!s->ifc) { + name = "<orphan>\t"; + } else { + name = if_indextoname(s->ifc, ifname); + if (name) + printf("%8s", name); + printf("[%d]\t", s->ifc); + } + + printf("page pools: %u (zombies: %u)\n", + s->live[1].cnt, s->live[0].cnt); + printf("\t\trefs: %zu bytes: %zu (refs: %zu bytes: %zu)\n", + s->live[1].refs, s->live[1].bytes, + s->live[0].refs, s->live[0].bytes); + + /* We don't know how many pages are sitting in cache and ring + * so we will under-count the recycling rate a bit. + */ + recycle = (double)(s->recycle_ring + s->recycle_cache) / + (s->alloc_fast + s->alloc_slow) * 100; + printf("\t\trecycling: %.1lf%% (alloc: %zu:%zu recycle: %zu:%zu)\n", + recycle, s->alloc_slow, s->alloc_fast, + s->recycle_ring, s->recycle_cache); + } + + ynl_sock_destroy(ys); + return 0; + +err_free: + free(a.s); +err_close: + fprintf(stderr, "YNL: %s\n", ys->err.msg); + ynl_sock_destroy(ys); + return 2; +}