From patchwork Tue Feb 25 17:17:43 2025
X-Patchwork-Submitter: Alexander Lobakin
X-Patchwork-Id: 13990346
X-Patchwork-Delegate: kuba@kernel.org
From: Alexander Lobakin
To: Andrew Lunn , "David S.
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: Alexander Lobakin , Lorenzo Bianconi , Daniel Xu , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , John Fastabend , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , Jesper Dangaard Brouer , Martin KaFai Lau , netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, =?utf-8?q?Toke_H=C3=B8il?= =?utf-8?q?and-J=C3=B8rgensen?= Subject: [PATCH net-next v5 1/8] net: gro: decouple GRO from the NAPI layer Date: Tue, 25 Feb 2025 18:17:43 +0100 Message-ID: <20250225171751.2268401-2-aleksander.lobakin@intel.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250225171751.2268401-1-aleksander.lobakin@intel.com> References: <20250225171751.2268401-1-aleksander.lobakin@intel.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org In fact, these two are not tied closely to each other. The only requirements to GRO are to use it in the BH context and have some sane limits on the packet batches, e.g. NAPI has a limit of its budget (64/8/etc.). Move purely GRO fields into a new structure, &gro_node. Embed it into &napi_struct and adjust all the references. gro_node::cached_napi_id is effectively the same as napi_struct::napi_id, but to be used on GRO hotpath to mark skbs. napi_struct::napi_id is now a fully control path field. Three Ethernet drivers use napi_gro_flush() not really meant to be exported, so move it to and add that include there. napi_gro_receive() is used in more than 100 drivers, keep it in . This does not make GRO ready to use outside of the NAPI context yet. Tested-by: Daniel Xu Acked-by: Jakub Kicinski Reviewed-by: Toke Høiland-Jørgensen Signed-off-by: Alexander Lobakin --- include/linux/netdevice.h | 37 +++++++++--- include/net/busy_poll.h | 12 +++- include/net/gro.h | 35 +++++++---- drivers/net/ethernet/brocade/bna/bnad.c | 1 + drivers/net/ethernet/cortina/gemini.c | 1 + drivers/net/wwan/t7xx/t7xx_hif_dpmaif_rx.c | 1 + net/core/dev.c | 66 ++++++++++----------- net/core/gro.c | 69 +++++++++++----------- 8 files changed, 130 insertions(+), 92 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 9a387d456592..fe0d889960bb 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -340,11 +340,27 @@ struct gro_list { }; /* - * size of gro hash buckets, must less than bit number of - * napi_struct::gro_bitmask + * size of gro hash buckets, must be <= the number of bits in + * gro_node::bitmask */ #define GRO_HASH_BUCKETS 8 +/** + * struct gro_node - structure to support Generic Receive Offload + * @bitmask: bitmask to indicate used buckets in @hash + * @hash: hashtable of pending aggregated skbs, separated by flows + * @rx_list: list of pending ``GRO_NORMAL`` skbs + * @rx_count: cached current length of @rx_list + * @cached_napi_id: napi_struct::napi_id cached for hotpath, 0 for standalone + */ +struct gro_node { + unsigned long bitmask; + struct gro_list hash[GRO_HASH_BUCKETS]; + struct list_head rx_list; + u32 rx_count; + u32 cached_napi_id; +}; + /* * Structure for per-NAPI config */ @@ -370,7 +386,6 @@ struct napi_struct { unsigned long state; int weight; u32 defer_hard_irqs_count; - unsigned long gro_bitmask; int (*poll)(struct napi_struct *, int); #ifdef CONFIG_NETPOLL /* CPU actively polling if netpoll is configured */ @@ -379,11 +394,8 @@ struct napi_struct { /* CPU on which NAPI has been scheduled for processing */ int list_owner; struct 
net_device *dev; - struct gro_list gro_hash[GRO_HASH_BUCKETS]; struct sk_buff *skb; - struct list_head rx_list; /* Pending GRO_NORMAL skbs */ - int rx_count; /* length of rx_list */ - unsigned int napi_id; /* protected by netdev_lock */ + struct gro_node gro; struct hrtimer timer; /* all fields past this point are write-protected by netdev_lock */ struct task_struct *thread; @@ -391,6 +403,7 @@ struct napi_struct { unsigned long irq_suspend_timeout; u32 defer_hard_irqs; /* control-path-only fields follow */ + u32 napi_id; struct list_head dev_list; struct hlist_node napi_hash_node; int irq; @@ -4115,8 +4128,14 @@ int netif_receive_skb(struct sk_buff *skb); int netif_receive_skb_core(struct sk_buff *skb); void netif_receive_skb_list_internal(struct list_head *head); void netif_receive_skb_list(struct list_head *head); -gro_result_t napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb); -void napi_gro_flush(struct napi_struct *napi, bool flush_old); +gro_result_t gro_receive_skb(struct gro_node *gro, struct sk_buff *skb); + +static inline gro_result_t napi_gro_receive(struct napi_struct *napi, + struct sk_buff *skb) +{ + return gro_receive_skb(&napi->gro, skb); +} + struct sk_buff *napi_get_frags(struct napi_struct *napi); gro_result_t napi_gro_frags(struct napi_struct *napi); diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h index cab6146a510a..6e172d0f6ef5 100644 --- a/include/net/busy_poll.h +++ b/include/net/busy_poll.h @@ -127,18 +127,24 @@ static inline void sk_busy_loop(struct sock *sk, int nonblock) } /* used in the NIC receive handler to mark the skb */ -static inline void skb_mark_napi_id(struct sk_buff *skb, - struct napi_struct *napi) +static inline void __skb_mark_napi_id(struct sk_buff *skb, + const struct gro_node *gro) { #ifdef CONFIG_NET_RX_BUSY_POLL /* If the skb was already marked with a valid NAPI ID, avoid overwriting * it. */ if (!napi_id_valid(skb->napi_id)) - skb->napi_id = napi->napi_id; + skb->napi_id = gro->cached_napi_id; #endif } +static inline void skb_mark_napi_id(struct sk_buff *skb, + const struct napi_struct *napi) +{ + __skb_mark_napi_id(skb, &napi->gro); +} + /* used in the protocol handler to propagate the napi_id to the socket */ static inline void sk_mark_napi_id(struct sock *sk, const struct sk_buff *skb) { diff --git a/include/net/gro.h b/include/net/gro.h index 7b548f91754b..38d70c69ff80 100644 --- a/include/net/gro.h +++ b/include/net/gro.h @@ -509,26 +509,41 @@ static inline int gro_receive_network_flush(const void *th, const void *th2, int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb); int skb_gro_receive_list(struct sk_buff *p, struct sk_buff *skb); +void __gro_flush(struct gro_node *gro, bool flush_old); + +static inline void gro_flush(struct gro_node *gro, bool flush_old) +{ + if (!gro->bitmask) + return; + + __gro_flush(gro, flush_old); +} + +static inline void napi_gro_flush(struct napi_struct *napi, bool flush_old) +{ + gro_flush(&napi->gro, flush_old); +} /* Pass the currently batched GRO_NORMAL SKBs up to the stack. */ -static inline void gro_normal_list(struct napi_struct *napi) +static inline void gro_normal_list(struct gro_node *gro) { - if (!napi->rx_count) + if (!gro->rx_count) return; - netif_receive_skb_list_internal(&napi->rx_list); - INIT_LIST_HEAD(&napi->rx_list); - napi->rx_count = 0; + netif_receive_skb_list_internal(&gro->rx_list); + INIT_LIST_HEAD(&gro->rx_list); + gro->rx_count = 0; } /* Queue one GRO_NORMAL SKB up for list processing. 
If batch size exceeded, * pass the whole batch up to the stack. */ -static inline void gro_normal_one(struct napi_struct *napi, struct sk_buff *skb, int segs) +static inline void gro_normal_one(struct gro_node *gro, struct sk_buff *skb, + int segs) { - list_add_tail(&skb->list, &napi->rx_list); - napi->rx_count += segs; - if (napi->rx_count >= READ_ONCE(net_hotdata.gro_normal_batch)) - gro_normal_list(napi); + list_add_tail(&skb->list, &gro->rx_list); + gro->rx_count += segs; + if (gro->rx_count >= READ_ONCE(net_hotdata.gro_normal_batch)) + gro_normal_list(gro); } /* This function is the alternative of 'inet_iif' and 'inet_sdif' diff --git a/drivers/net/ethernet/brocade/bna/bnad.c b/drivers/net/ethernet/brocade/bna/bnad.c index ece6f3b48327..3b9107003b00 100644 --- a/drivers/net/ethernet/brocade/bna/bnad.c +++ b/drivers/net/ethernet/brocade/bna/bnad.c @@ -19,6 +19,7 @@ #include #include #include +#include #include "bnad.h" #include "bna.h" diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c index 991e3839858b..1f8067bdd61a 100644 --- a/drivers/net/ethernet/cortina/gemini.c +++ b/drivers/net/ethernet/cortina/gemini.c @@ -40,6 +40,7 @@ #include #include #include +#include #include "gemini.h" diff --git a/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_rx.c b/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_rx.c index 7a9c09cd4fdc..6a7a26085fc7 100644 --- a/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_rx.c +++ b/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_rx.c @@ -41,6 +41,7 @@ #include #include #include +#include #include "t7xx_dpmaif.h" #include "t7xx_hif_dpmaif.h" diff --git a/net/core/dev.c b/net/core/dev.c index 3f525278a871..ef66e8aaca19 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6484,7 +6484,7 @@ bool napi_complete_done(struct napi_struct *n, int work_done) return false; if (work_done) { - if (n->gro_bitmask) + if (n->gro.bitmask) timeout = napi_get_gro_flush_timeout(n); n->defer_hard_irqs_count = napi_get_defer_hard_irqs(n); } @@ -6494,15 +6494,14 @@ bool napi_complete_done(struct napi_struct *n, int work_done) if (timeout) ret = false; } - if (n->gro_bitmask) { - /* When the NAPI instance uses a timeout and keeps postponing - * it, we need to bound somehow the time packets are kept in - * the GRO layer - */ - napi_gro_flush(n, !!timeout); - } - gro_normal_list(n); + /* + * When the NAPI instance uses a timeout and keeps postponing + * it, we need to bound somehow the time packets are kept in + * the GRO layer. + */ + gro_flush(&n->gro, !!timeout); + gro_normal_list(&n->gro); if (unlikely(!list_empty(&n->poll_list))) { /* If n->poll_list is not empty, we need to mask irqs */ @@ -6566,19 +6565,15 @@ static void skb_defer_free_flush(struct softnet_data *sd) static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule) { if (!skip_schedule) { - gro_normal_list(napi); + gro_normal_list(&napi->gro); __napi_schedule(napi); return; } - if (napi->gro_bitmask) { - /* flush too old packets - * If HZ < 1000, flush all packets. - */ - napi_gro_flush(napi, HZ >= 1000); - } + /* Flush too old packets. 
If HZ < 1000, flush all packets */ + gro_flush(&napi->gro, HZ >= 1000); + gro_normal_list(&napi->gro); - gro_normal_list(napi); clear_bit(NAPI_STATE_SCHED, &napi->state); } @@ -6685,7 +6680,7 @@ static void __napi_busy_loop(unsigned int napi_id, } work = napi_poll(napi, budget); trace_napi_poll(napi, work, budget); - gro_normal_list(napi); + gro_normal_list(&napi->gro); count: if (work > 0) __NET_ADD_STATS(dev_net(napi->dev), @@ -6785,6 +6780,8 @@ void napi_resume_irqs(unsigned int napi_id) static void __napi_hash_add_with_id(struct napi_struct *napi, unsigned int napi_id) { + napi->gro.cached_napi_id = napi_id; + WRITE_ONCE(napi->napi_id, napi_id); hlist_add_head_rcu(&napi->napi_hash_node, &napi_hash[napi->napi_id % HASH_SIZE(napi_hash)]); @@ -6858,10 +6855,12 @@ static void init_gro_hash(struct napi_struct *napi) int i; for (i = 0; i < GRO_HASH_BUCKETS; i++) { - INIT_LIST_HEAD(&napi->gro_hash[i].list); - napi->gro_hash[i].count = 0; + INIT_LIST_HEAD(&napi->gro.hash[i].list); + napi->gro.hash[i].count = 0; } - napi->gro_bitmask = 0; + + napi->gro.bitmask = 0; + napi->gro.cached_napi_id = 0; } int dev_set_threaded(struct net_device *dev, bool threaded) @@ -7029,8 +7028,8 @@ void netif_napi_add_weight_locked(struct net_device *dev, napi->timer.function = napi_watchdog; init_gro_hash(napi); napi->skb = NULL; - INIT_LIST_HEAD(&napi->rx_list); - napi->rx_count = 0; + INIT_LIST_HEAD(&napi->gro.rx_list); + napi->gro.rx_count = 0; napi->poll = poll; if (weight > NAPI_POLL_WEIGHT) netdev_err_once(dev, "%s() called with weight %d\n", __func__, @@ -7151,10 +7150,13 @@ static void flush_gro_hash(struct napi_struct *napi) for (i = 0; i < GRO_HASH_BUCKETS; i++) { struct sk_buff *skb, *n; - list_for_each_entry_safe(skb, n, &napi->gro_hash[i].list, list) + list_for_each_entry_safe(skb, n, &napi->gro.hash[i].list, list) kfree_skb(skb); - napi->gro_hash[i].count = 0; + napi->gro.hash[i].count = 0; } + + napi->gro.bitmask = 0; + napi->gro.cached_napi_id = 0; } /* Must be called in process context */ @@ -7177,7 +7179,6 @@ void __netif_napi_del_locked(struct napi_struct *napi) napi_free_frags(napi); flush_gro_hash(napi); - napi->gro_bitmask = 0; if (napi->thread) { kthread_stop(napi->thread); @@ -7236,14 +7237,9 @@ static int __napi_poll(struct napi_struct *n, bool *repoll) return work; } - if (n->gro_bitmask) { - /* flush too old packets - * If HZ < 1000, flush all packets. - */ - napi_gro_flush(n, HZ >= 1000); - } - - gro_normal_list(n); + /* Flush too old packets. If HZ < 1000, flush all packets */ + gro_flush(&n->gro, HZ >= 1000); + gro_normal_list(&n->gro); /* Some drivers may have called napi_schedule * prior to exhausting their budget. 
@@ -12270,7 +12266,7 @@ static struct hlist_head * __net_init netdev_create_hash(void) static int __net_init netdev_init(struct net *net) { BUILD_BUG_ON(GRO_HASH_BUCKETS > - 8 * sizeof_field(struct napi_struct, gro_bitmask)); + BITS_PER_BYTE * sizeof_field(struct gro_node, bitmask)); INIT_LIST_HEAD(&net->dev_base_head); diff --git a/net/core/gro.c b/net/core/gro.c index 78b320b63174..9e1803fdf249 100644 --- a/net/core/gro.c +++ b/net/core/gro.c @@ -250,8 +250,7 @@ int skb_gro_receive_list(struct sk_buff *p, struct sk_buff *skb) return 0; } - -static void napi_gro_complete(struct napi_struct *napi, struct sk_buff *skb) +static void gro_complete(struct gro_node *gro, struct sk_buff *skb) { struct list_head *head = &net_hotdata.offload_base; struct packet_offload *ptype; @@ -284,43 +283,43 @@ static void napi_gro_complete(struct napi_struct *napi, struct sk_buff *skb) } out: - gro_normal_one(napi, skb, NAPI_GRO_CB(skb)->count); + gro_normal_one(gro, skb, NAPI_GRO_CB(skb)->count); } -static void __napi_gro_flush_chain(struct napi_struct *napi, u32 index, - bool flush_old) +static void __gro_flush_chain(struct gro_node *gro, u32 index, bool flush_old) { - struct list_head *head = &napi->gro_hash[index].list; + struct list_head *head = &gro->hash[index].list; struct sk_buff *skb, *p; list_for_each_entry_safe_reverse(skb, p, head, list) { if (flush_old && NAPI_GRO_CB(skb)->age == jiffies) return; skb_list_del_init(skb); - napi_gro_complete(napi, skb); - napi->gro_hash[index].count--; + gro_complete(gro, skb); + gro->hash[index].count--; } - if (!napi->gro_hash[index].count) - __clear_bit(index, &napi->gro_bitmask); + if (!gro->hash[index].count) + __clear_bit(index, &gro->bitmask); } -/* napi->gro_hash[].list contains packets ordered by age. +/* + * gro->hash[].list contains packets ordered by age. * youngest packets at the head of it. * Complete skbs in reverse order to reduce latencies. */ -void napi_gro_flush(struct napi_struct *napi, bool flush_old) +void __gro_flush(struct gro_node *gro, bool flush_old) { - unsigned long bitmask = napi->gro_bitmask; + unsigned long bitmask = gro->bitmask; unsigned int i, base = ~0U; while ((i = ffs(bitmask)) != 0) { bitmask >>= i; base += i; - __napi_gro_flush_chain(napi, base, flush_old); + __gro_flush_chain(gro, base, flush_old); } } -EXPORT_SYMBOL(napi_gro_flush); +EXPORT_SYMBOL(__gro_flush); static unsigned long gro_list_prepare_tc_ext(const struct sk_buff *skb, const struct sk_buff *p, @@ -439,7 +438,7 @@ static void gro_try_pull_from_frag0(struct sk_buff *skb) gro_pull_from_frag0(skb, grow); } -static void gro_flush_oldest(struct napi_struct *napi, struct list_head *head) +static void gro_flush_oldest(struct gro_node *gro, struct list_head *head) { struct sk_buff *oldest; @@ -455,14 +454,15 @@ static void gro_flush_oldest(struct napi_struct *napi, struct list_head *head) * SKB to the chain. 
*/ skb_list_del_init(oldest); - napi_gro_complete(napi, oldest); + gro_complete(gro, oldest); } -static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff *skb) +static enum gro_result dev_gro_receive(struct gro_node *gro, + struct sk_buff *skb) { u32 bucket = skb_get_hash_raw(skb) & (GRO_HASH_BUCKETS - 1); - struct gro_list *gro_list = &napi->gro_hash[bucket]; struct list_head *head = &net_hotdata.offload_base; + struct gro_list *gro_list = &gro->hash[bucket]; struct packet_offload *ptype; __be16 type = skb->protocol; struct sk_buff *pp = NULL; @@ -526,7 +526,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff if (pp) { skb_list_del_init(pp); - napi_gro_complete(napi, pp); + gro_complete(gro, pp); gro_list->count--; } @@ -537,7 +537,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff goto normal; if (unlikely(gro_list->count >= MAX_GRO_SKBS)) - gro_flush_oldest(napi, &gro_list->list); + gro_flush_oldest(gro, &gro_list->list); else gro_list->count++; @@ -551,10 +551,10 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff ret = GRO_HELD; ok: if (gro_list->count) { - if (!test_bit(bucket, &napi->gro_bitmask)) - __set_bit(bucket, &napi->gro_bitmask); - } else if (test_bit(bucket, &napi->gro_bitmask)) { - __clear_bit(bucket, &napi->gro_bitmask); + if (!test_bit(bucket, &gro->bitmask)) + __set_bit(bucket, &gro->bitmask); + } else if (test_bit(bucket, &gro->bitmask)) { + __clear_bit(bucket, &gro->bitmask); } return ret; @@ -593,13 +593,12 @@ struct packet_offload *gro_find_complete_by_type(__be16 type) } EXPORT_SYMBOL(gro_find_complete_by_type); -static gro_result_t napi_skb_finish(struct napi_struct *napi, - struct sk_buff *skb, - gro_result_t ret) +static gro_result_t gro_skb_finish(struct gro_node *gro, struct sk_buff *skb, + gro_result_t ret) { switch (ret) { case GRO_NORMAL: - gro_normal_one(napi, skb, 1); + gro_normal_one(gro, skb, 1); break; case GRO_MERGED_FREE: @@ -620,21 +619,21 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi, return ret; } -gro_result_t napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb) +gro_result_t gro_receive_skb(struct gro_node *gro, struct sk_buff *skb) { gro_result_t ret; - skb_mark_napi_id(skb, napi); + __skb_mark_napi_id(skb, gro); trace_napi_gro_receive_entry(skb); skb_gro_reset_offset(skb, 0); - ret = napi_skb_finish(napi, skb, dev_gro_receive(napi, skb)); + ret = gro_skb_finish(gro, skb, dev_gro_receive(gro, skb)); trace_napi_gro_receive_exit(ret); return ret; } -EXPORT_SYMBOL(napi_gro_receive); +EXPORT_SYMBOL(gro_receive_skb); static void napi_reuse_skb(struct napi_struct *napi, struct sk_buff *skb) { @@ -690,7 +689,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi, __skb_push(skb, ETH_HLEN); skb->protocol = eth_type_trans(skb, skb->dev); if (ret == GRO_NORMAL) - gro_normal_one(napi, skb, 1); + gro_normal_one(&napi->gro, skb, 1); break; case GRO_MERGED_FREE: @@ -759,7 +758,7 @@ gro_result_t napi_gro_frags(struct napi_struct *napi) trace_napi_gro_frags_entry(skb); - ret = napi_frags_finish(napi, skb, dev_gro_receive(napi, skb)); + ret = napi_frags_finish(napi, skb, dev_gro_receive(&napi->gro, skb)); trace_napi_gro_frags_exit(ret); return ret; From patchwork Tue Feb 25 17:17:44 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 13990347 X-Patchwork-Delegate: kuba@kernel.org Received: from 
mgamail.intel.com by smtp.subspace.kernel.org (Postfix) with ESMTPS id B39B119992D; Tue, 25 Feb 2025 17:24:41 +0000 (UTC)
From: Alexander Lobakin
To: Andrew Lunn , "David S.
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: Alexander Lobakin , Lorenzo Bianconi , Daniel Xu , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , John Fastabend , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , Jesper Dangaard Brouer , Martin KaFai Lau , netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, =?utf-8?q?Toke_H=C3=B8il?= =?utf-8?q?and-J=C3=B8rgensen?= Subject: [PATCH net-next v5 2/8] net: gro: expose GRO init/cleanup to use outside of NAPI Date: Tue, 25 Feb 2025 18:17:44 +0100 Message-ID: <20250225171751.2268401-3-aleksander.lobakin@intel.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250225171751.2268401-1-aleksander.lobakin@intel.com> References: <20250225171751.2268401-1-aleksander.lobakin@intel.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org Make GRO init and cleanup functions global to be able to use GRO without a NAPI instance. Taking into account already global gro_flush(), it's now fully usable standalone. New functions are not exported, since they're not supposed to be used outside of the kernel core code. Tested-by: Daniel Xu Reviewed-by: Jakub Kicinski Reviewed-by: Toke Høiland-Jørgensen Signed-off-by: Alexander Lobakin --- include/net/gro.h | 3 +++ net/core/dev.c | 37 +++---------------------------------- net/core/gro.c | 34 ++++++++++++++++++++++++++++++++++ 3 files changed, 40 insertions(+), 34 deletions(-) diff --git a/include/net/gro.h b/include/net/gro.h index 38d70c69ff80..22d3a69e4404 100644 --- a/include/net/gro.h +++ b/include/net/gro.h @@ -546,6 +546,9 @@ static inline void gro_normal_one(struct gro_node *gro, struct sk_buff *skb, gro_normal_list(gro); } +void gro_init(struct gro_node *gro); +void gro_cleanup(struct gro_node *gro); + /* This function is the alternative of 'inet_iif' and 'inet_sdif' * functions in case we can not rely on fields of IPCB. 
* diff --git a/net/core/dev.c b/net/core/dev.c index ef66e8aaca19..5ea1400066cc 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6850,19 +6850,6 @@ static enum hrtimer_restart napi_watchdog(struct hrtimer *timer) return HRTIMER_NORESTART; } -static void init_gro_hash(struct napi_struct *napi) -{ - int i; - - for (i = 0; i < GRO_HASH_BUCKETS; i++) { - INIT_LIST_HEAD(&napi->gro.hash[i].list); - napi->gro.hash[i].count = 0; - } - - napi->gro.bitmask = 0; - napi->gro.cached_napi_id = 0; -} - int dev_set_threaded(struct net_device *dev, bool threaded) { struct napi_struct *napi; @@ -7026,10 +7013,8 @@ void netif_napi_add_weight_locked(struct net_device *dev, INIT_HLIST_NODE(&napi->napi_hash_node); hrtimer_init(&napi->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); napi->timer.function = napi_watchdog; - init_gro_hash(napi); + gro_init(&napi->gro); napi->skb = NULL; - INIT_LIST_HEAD(&napi->gro.rx_list); - napi->gro.rx_count = 0; napi->poll = poll; if (weight > NAPI_POLL_WEIGHT) netdev_err_once(dev, "%s() called with weight %d\n", __func__, @@ -7143,22 +7128,6 @@ void napi_enable(struct napi_struct *n) } EXPORT_SYMBOL(napi_enable); -static void flush_gro_hash(struct napi_struct *napi) -{ - int i; - - for (i = 0; i < GRO_HASH_BUCKETS; i++) { - struct sk_buff *skb, *n; - - list_for_each_entry_safe(skb, n, &napi->gro.hash[i].list, list) - kfree_skb(skb); - napi->gro.hash[i].count = 0; - } - - napi->gro.bitmask = 0; - napi->gro.cached_napi_id = 0; -} - /* Must be called in process context */ void __netif_napi_del_locked(struct napi_struct *napi) { @@ -7178,7 +7147,7 @@ void __netif_napi_del_locked(struct napi_struct *napi) list_del_rcu(&napi->dev_list); napi_free_frags(napi); - flush_gro_hash(napi); + gro_cleanup(&napi->gro); if (napi->thread) { kthread_stop(napi->thread); @@ -12631,7 +12600,7 @@ static int __init net_dev_init(void) INIT_CSD(&sd->defer_csd, trigger_rx_softirq, sd); spin_lock_init(&sd->defer_lock); - init_gro_hash(&sd->backlog); + gro_init(&sd->backlog.gro); sd->backlog.poll = process_backlog; sd->backlog.weight = weight_p; INIT_LIST_HEAD(&sd->backlog.poll_list); diff --git a/net/core/gro.c b/net/core/gro.c index 9e1803fdf249..19bd4cdaee3a 100644 --- a/net/core/gro.c +++ b/net/core/gro.c @@ -790,3 +790,37 @@ __sum16 __skb_gro_checksum_complete(struct sk_buff *skb) return sum; } EXPORT_SYMBOL(__skb_gro_checksum_complete); + +void gro_init(struct gro_node *gro) +{ + for (u32 i = 0; i < GRO_HASH_BUCKETS; i++) { + INIT_LIST_HEAD(&gro->hash[i].list); + gro->hash[i].count = 0; + } + + gro->bitmask = 0; + gro->cached_napi_id = 0; + + INIT_LIST_HEAD(&gro->rx_list); + gro->rx_count = 0; +} + +void gro_cleanup(struct gro_node *gro) +{ + struct sk_buff *skb, *n; + + for (u32 i = 0; i < GRO_HASH_BUCKETS; i++) { + list_for_each_entry_safe(skb, n, &gro->hash[i].list, list) + kfree_skb(skb); + + gro->hash[i].count = 0; + } + + gro->bitmask = 0; + gro->cached_napi_id = 0; + + list_for_each_entry_safe(skb, n, &gro->rx_list, list) + kfree_skb(skb); + + gro->rx_count = 0; +} From patchwork Tue Feb 25 17:17:45 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 13990348 X-Patchwork-Delegate: kuba@kernel.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.15]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 40D5C19922F; Tue, 25 Feb 2025 17:24:46 +0000 (UTC) 
From: Alexander Lobakin
To: Andrew Lunn , "David S.
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: Alexander Lobakin , Lorenzo Bianconi , Daniel Xu , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , John Fastabend , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , Jesper Dangaard Brouer , Martin KaFai Lau , netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, =?utf-8?q?Toke_H=C3=B8il?= =?utf-8?q?and-J=C3=B8rgensen?= Subject: [PATCH net-next v5 3/8] bpf: cpumap: switch to GRO from netif_receive_skb_list() Date: Tue, 25 Feb 2025 18:17:45 +0100 Message-ID: <20250225171751.2268401-4-aleksander.lobakin@intel.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250225171751.2268401-1-aleksander.lobakin@intel.com> References: <20250225171751.2268401-1-aleksander.lobakin@intel.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org cpumap has its own BH context based on kthread. It has a sane batch size of 8 frames per one cycle. GRO can be used here on its own. Adjust cpumap calls to the upper stack to use GRO API instead of netif_receive_skb_list() which processes skbs by batches, but doesn't involve GRO layer at all. In plenty of tests, GRO performs better than listed receiving even given that it has to calculate full frame checksums on the CPU. As GRO passes the skbs to the upper stack in the batches of @gro_normal_batch, i.e. 8 by default, and skb->dev points to the device where the frame comes from, it is enough to disable GRO netdev feature on it to completely restore the original behaviour: untouched frames will be being bulked and passed to the upper stack by 8, as it was with netif_receive_skb_list(). Tested-by: Daniel Xu Reviewed-by: Toke Høiland-Jørgensen Signed-off-by: Alexander Lobakin --- kernel/bpf/cpumap.c | 46 ++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 43 insertions(+), 3 deletions(-) diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c index 774accbd4a22..f0909736eaa5 100644 --- a/kernel/bpf/cpumap.c +++ b/kernel/bpf/cpumap.c @@ -33,8 +33,8 @@ #include #include -#include /* netif_receive_skb_list */ -#include /* eth_type_trans */ +#include +#include /* General idea: XDP packets getting XDP redirected to another CPU, * will maximum be stored/queued for one driver ->poll() call. It is @@ -68,6 +68,7 @@ struct bpf_cpu_map_entry { struct bpf_cpumap_val value; struct bpf_prog *prog; + struct gro_node gro; struct completion kthread_running; struct rcu_work free_work; @@ -261,10 +262,36 @@ static int cpu_map_bpf_prog_run(struct bpf_cpu_map_entry *rcpu, void **frames, return nframes; } +static void cpu_map_gro_receive(struct bpf_cpu_map_entry *rcpu, + struct list_head *list) +{ + struct sk_buff *skb, *tmp; + + list_for_each_entry_safe(skb, tmp, list, list) { + skb_list_del_init(skb); + gro_receive_skb(&rcpu->gro, skb); + } +} + +static void cpu_map_gro_flush(struct bpf_cpu_map_entry *rcpu, bool empty) +{ + /* + * If the ring is not empty, there'll be a new iteration soon, and we + * only need to do a full flush if a tick is long (> 1 ms). + * If the ring is empty, to not hold GRO packets in the stack for too + * long, do a full flush. + * This is equivalent to how NAPI decides whether to perform a full + * flush. 
+ */ + gro_flush(&rcpu->gro, !empty && HZ >= 1000); + gro_normal_list(&rcpu->gro); +} + static int cpu_map_kthread_run(void *data) { struct bpf_cpu_map_entry *rcpu = data; unsigned long last_qs = jiffies; + u32 packets = 0; complete(&rcpu->kthread_running); set_current_state(TASK_INTERRUPTIBLE); @@ -282,6 +309,7 @@ static int cpu_map_kthread_run(void *data) void *frames[CPUMAP_BATCH]; void *skbs[CPUMAP_BATCH]; LIST_HEAD(list); + bool empty; /* Release CPU reschedule checks */ if (__ptr_ring_empty(rcpu->queue)) { @@ -361,7 +389,16 @@ static int cpu_map_kthread_run(void *data) trace_xdp_cpumap_kthread(rcpu->map_id, n, kmem_alloc_drops, sched, &stats); - netif_receive_skb_list(&list); + cpu_map_gro_receive(rcpu, &list); + + /* Flush either every 64 packets or in case of empty ring */ + packets += n; + empty = __ptr_ring_empty(rcpu->queue); + if (packets >= NAPI_POLL_WEIGHT || empty) { + cpu_map_gro_flush(rcpu, empty); + packets = 0; + } + local_bh_enable(); /* resched point, may call do_softirq() */ } __set_current_state(TASK_RUNNING); @@ -430,6 +467,7 @@ __cpu_map_entry_alloc(struct bpf_map *map, struct bpf_cpumap_val *value, rcpu->cpu = cpu; rcpu->map_id = map->id; rcpu->value.qsize = value->qsize; + gro_init(&rcpu->gro); if (fd > 0 && __cpu_map_load_bpf_program(rcpu, map, fd)) goto free_ptr_ring; @@ -458,6 +496,7 @@ __cpu_map_entry_alloc(struct bpf_map *map, struct bpf_cpumap_val *value, if (rcpu->prog) bpf_prog_put(rcpu->prog); free_ptr_ring: + gro_cleanup(&rcpu->gro); ptr_ring_cleanup(rcpu->queue, NULL); free_queue: kfree(rcpu->queue); @@ -487,6 +526,7 @@ static void __cpu_map_entry_free(struct work_struct *work) if (rcpu->prog) bpf_prog_put(rcpu->prog); + gro_cleanup(&rcpu->gro); /* The queue should be empty at this point */ __cpu_map_ring_cleanup(rcpu->queue); ptr_ring_cleanup(rcpu->queue, NULL); From patchwork Tue Feb 25 17:17:46 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 13990349 X-Patchwork-Delegate: kuba@kernel.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.15]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EA9E619992E; Tue, 25 Feb 2025 17:24:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.15 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740504292; cv=none; b=cLdmqflXuXyCOBtFZ+4U7sckxcYpl4J/06hhYB9xLdlmZWSRufGS96skyKKZvKS42mPGWptTY2pAp/hSHpmdXoq84K991BKYxGgneWTPMk0C4W9g0Ky/ZV5wW+20wpBI2AS4K1RGgjiPZKxtZGMWVDhdxoCLIMffcefu2HfnRKk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740504292; c=relaxed/simple; bh=IvnjDQh7UZAMI+TirlbcH5qxYIQfFE7GLFGMTzd8fM8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=nguGlsc0v7uXHkJ6tw2NAQFwxNiBNMV3M3V5wBZBGonE5XCZrpbChCM1UDJAsvo7zrAr+/Td5zvJnHgW+0R9t7acDfcaFfKsZconPGmNFx5owtl41Av1pupr3po47pUlR5JThvTvqhaFLM8KIucgR/cyOA1NOGJi5NdhnI0Skvo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Ednl5erx; arc=none smtp.client-ip=198.175.65.15 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass 
smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Ednl5erx" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1740504291; x=1772040291; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=IvnjDQh7UZAMI+TirlbcH5qxYIQfFE7GLFGMTzd8fM8=; b=Ednl5erxR5eZYHaDT9J1heapEWbkqYdtm0snZkis7mXH2iPeFxVWg1uJ vLX4M/q0fCSpLOA0qnrCDEq38pTHG8dbDqBx9Ruy0XUk1QGRTCR7mX5YF dOxUdclOoTdgGac4Madv4p0Uz8WyATHfG+rCmrrtQNoqFH2alzQRHR9iv ri2pJWMI0H6FOedhuDP3f1fMp0gKa6xC0xXHj5/49pCzt3pN4Ly5Wzye5 klgliZhdfnq69ZdklkP1rXE62Goyx3SGI0euaTxmdqV10Q/k6aZ7T9CLd otRL1o3miPy41YpunQvHqLgp5o9VIF4Ccef+Wu9PbpX9I4edEUa3B5lk4 g==; X-CSE-ConnectionGUID: 7GuMniXqQBOGagoiGwa23Q== X-CSE-MsgGUID: 0edAOWACS+OJtK3QSZVvng== X-IronPort-AV: E=McAfee;i="6700,10204,11356"; a="44974135" X-IronPort-AV: E=Sophos;i="6.13,314,1732608000"; d="scan'208";a="44974135" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by orvoesa107.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Feb 2025 09:24:51 -0800 X-CSE-ConnectionGUID: dof8zh5JScavLF97eYbYPA== X-CSE-MsgGUID: ZFaEO5UvStSH4HoZOgiNnw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.13,314,1732608000"; d="scan'208";a="116256905" Received: from newjersey.igk.intel.com ([10.102.20.203]) by fmviesa006.fm.intel.com with ESMTP; 25 Feb 2025 09:24:46 -0800 From: Alexander Lobakin To: Andrew Lunn , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: Alexander Lobakin , Lorenzo Bianconi , Daniel Xu , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , John Fastabend , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , Jesper Dangaard Brouer , Martin KaFai Lau , netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, =?utf-8?q?Toke_H=C3=B8il?= =?utf-8?q?and-J=C3=B8rgensen?= Subject: [PATCH net-next v5 4/8] bpf: cpumap: reuse skb array instead of a linked list to chain skbs Date: Tue, 25 Feb 2025 18:17:46 +0100 Message-ID: <20250225171751.2268401-5-aleksander.lobakin@intel.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250225171751.2268401-1-aleksander.lobakin@intel.com> References: <20250225171751.2268401-1-aleksander.lobakin@intel.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org cpumap still uses linked lists to store a list of skbs to pass to the stack. Now that we don't use listified Rx in favor of napi_gro_receive(), linked list is now an unneeded overhead. Inside the polling loop, we already have an array of skbs. Let's reuse it for skbs passed to cpumap (generic XDP) and keep there in case of XDP_PASS when a program is installed to the map itself. Don't list regular xdp_frames after converting them to skbs as well; store them in the mentioned array (but *before* generic skbs as the latters have lower priority) and call gro_receive_skb() for each array element after they're done. 
Tested-by: Daniel Xu Reviewed-by: Toke Høiland-Jørgensen Signed-off-by: Alexander Lobakin --- kernel/bpf/cpumap.c | 119 +++++++++++++++++++++++--------------------- 1 file changed, 61 insertions(+), 58 deletions(-) diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c index f0909736eaa5..85936f09d8d7 100644 --- a/kernel/bpf/cpumap.c +++ b/kernel/bpf/cpumap.c @@ -134,22 +134,23 @@ static void __cpu_map_ring_cleanup(struct ptr_ring *ring) } } -static void cpu_map_bpf_prog_run_skb(struct bpf_cpu_map_entry *rcpu, - struct list_head *listp, - struct xdp_cpumap_stats *stats) +static u32 cpu_map_bpf_prog_run_skb(struct bpf_cpu_map_entry *rcpu, + void **skbs, u32 skb_n, + struct xdp_cpumap_stats *stats) { - struct sk_buff *skb, *tmp; struct xdp_buff xdp; - u32 act; + u32 act, pass = 0; int err; - list_for_each_entry_safe(skb, tmp, listp, list) { + for (u32 i = 0; i < skb_n; i++) { + struct sk_buff *skb = skbs[i]; + act = bpf_prog_run_generic_xdp(skb, &xdp, rcpu->prog); switch (act) { case XDP_PASS: + skbs[pass++] = skb; break; case XDP_REDIRECT: - skb_list_del_init(skb); err = xdp_do_generic_redirect(skb->dev, skb, &xdp, rcpu->prog); if (unlikely(err)) { @@ -158,7 +159,7 @@ static void cpu_map_bpf_prog_run_skb(struct bpf_cpu_map_entry *rcpu, } else { stats->redirect++; } - return; + break; default: bpf_warn_invalid_xdp_action(NULL, rcpu->prog, act); fallthrough; @@ -166,12 +167,15 @@ static void cpu_map_bpf_prog_run_skb(struct bpf_cpu_map_entry *rcpu, trace_xdp_exception(skb->dev, rcpu->prog, act); fallthrough; case XDP_DROP: - skb_list_del_init(skb); - kfree_skb(skb); + napi_consume_skb(skb, true); stats->drop++; - return; + break; } } + + stats->pass += pass; + + return pass; } static int cpu_map_bpf_prog_run_xdp(struct bpf_cpu_map_entry *rcpu, @@ -205,7 +209,6 @@ static int cpu_map_bpf_prog_run_xdp(struct bpf_cpu_map_entry *rcpu, stats->drop++; } else { frames[nframes++] = xdpf; - stats->pass++; } break; case XDP_REDIRECT: @@ -229,48 +232,44 @@ static int cpu_map_bpf_prog_run_xdp(struct bpf_cpu_map_entry *rcpu, } xdp_clear_return_frame_no_direct(); + stats->pass += nframes; return nframes; } #define CPUMAP_BATCH 8 -static int cpu_map_bpf_prog_run(struct bpf_cpu_map_entry *rcpu, void **frames, - int xdp_n, struct xdp_cpumap_stats *stats, - struct list_head *list) +struct cpu_map_ret { + u32 xdp_n; + u32 skb_n; +}; + +static void cpu_map_bpf_prog_run(struct bpf_cpu_map_entry *rcpu, void **frames, + void **skbs, struct cpu_map_ret *ret, + struct xdp_cpumap_stats *stats) { struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; - int nframes; if (!rcpu->prog) - return xdp_n; + goto out; rcu_read_lock_bh(); bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); - nframes = cpu_map_bpf_prog_run_xdp(rcpu, frames, xdp_n, stats); + ret->xdp_n = cpu_map_bpf_prog_run_xdp(rcpu, frames, ret->xdp_n, stats); + if (unlikely(ret->skb_n)) + ret->skb_n = cpu_map_bpf_prog_run_skb(rcpu, skbs, ret->skb_n, + stats); if (stats->redirect) xdp_do_flush(); - if (unlikely(!list_empty(list))) - cpu_map_bpf_prog_run_skb(rcpu, list, stats); - bpf_net_ctx_clear(bpf_net_ctx); rcu_read_unlock_bh(); /* resched point, may call do_softirq() */ - return nframes; -} - -static void cpu_map_gro_receive(struct bpf_cpu_map_entry *rcpu, - struct list_head *list) -{ - struct sk_buff *skb, *tmp; - - list_for_each_entry_safe(skb, tmp, list, list) { - skb_list_del_init(skb); - gro_receive_skb(&rcpu->gro, skb); - } +out: + if (unlikely(ret->skb_n) && ret->xdp_n) + memmove(&skbs[ret->xdp_n], skbs, ret->skb_n * sizeof(*skbs)); } static void 
cpu_map_gro_flush(struct bpf_cpu_map_entry *rcpu, bool empty) @@ -305,10 +304,10 @@ static int cpu_map_kthread_run(void *data) struct xdp_cpumap_stats stats = {}; /* zero stats */ unsigned int kmem_alloc_drops = 0, sched = 0; gfp_t gfp = __GFP_ZERO | GFP_ATOMIC; - int i, n, m, nframes, xdp_n; + struct cpu_map_ret ret = { }; void *frames[CPUMAP_BATCH]; void *skbs[CPUMAP_BATCH]; - LIST_HEAD(list); + u32 i, n, m; bool empty; /* Release CPU reschedule checks */ @@ -334,7 +333,7 @@ static int cpu_map_kthread_run(void *data) */ n = __ptr_ring_consume_batched(rcpu->queue, frames, CPUMAP_BATCH); - for (i = 0, xdp_n = 0; i < n; i++) { + for (i = 0; i < n; i++) { void *f = frames[i]; struct page *page; @@ -342,11 +341,11 @@ static int cpu_map_kthread_run(void *data) struct sk_buff *skb = f; __ptr_clear_bit(0, &skb); - list_add_tail(&skb->list, &list); + skbs[ret.skb_n++] = skb; continue; } - frames[xdp_n++] = f; + frames[ret.xdp_n++] = f; page = virt_to_page(f); /* Bring struct page memory area to curr CPU. Read by @@ -357,39 +356,43 @@ static int cpu_map_kthread_run(void *data) } /* Support running another XDP prog on this CPU */ - nframes = cpu_map_bpf_prog_run(rcpu, frames, xdp_n, &stats, &list); - if (nframes) { - m = kmem_cache_alloc_bulk(net_hotdata.skbuff_cache, - gfp, nframes, skbs); - if (unlikely(m == 0)) { - for (i = 0; i < nframes; i++) - skbs[i] = NULL; /* effect: xdp_return_frame */ - kmem_alloc_drops += nframes; - } + cpu_map_bpf_prog_run(rcpu, frames, skbs, &ret, &stats); + if (!ret.xdp_n) { + local_bh_disable(); + goto stats; + } + + m = kmem_cache_alloc_bulk(net_hotdata.skbuff_cache, gfp, + ret.xdp_n, skbs); + if (unlikely(m < ret.xdp_n)) { + for (i = m; i < ret.xdp_n; i++) + xdp_return_frame(frames[i]); + + if (ret.skb_n) + memmove(&skbs[m], &skbs[ret.xdp_n], + ret.skb_n * sizeof(*skbs)); + + kmem_alloc_drops += ret.xdp_n - m; + ret.xdp_n = m; } local_bh_disable(); - for (i = 0; i < nframes; i++) { + for (i = 0; i < ret.xdp_n; i++) { struct xdp_frame *xdpf = frames[i]; - struct sk_buff *skb = skbs[i]; - - skb = __xdp_build_skb_from_frame(xdpf, skb, - xdpf->dev_rx); - if (!skb) { - xdp_return_frame(xdpf); - continue; - } - list_add_tail(&skb->list, &list); + /* Can fail only when !skb -- already handled above */ + __xdp_build_skb_from_frame(xdpf, skbs[i], xdpf->dev_rx); } +stats: /* Feedback loop via tracepoint. * NB: keep before recv to allow measuring enqueue/dequeue latency. 
*/ trace_xdp_cpumap_kthread(rcpu->map_id, n, kmem_alloc_drops, sched, &stats); - cpu_map_gro_receive(rcpu, &list); + for (i = 0; i < ret.xdp_n + ret.skb_n; i++) + gro_receive_skb(&rcpu->gro, skbs[i]); /* Flush either every 64 packets or in case of empty ring */ packets += n;
From patchwork Tue Feb 25 17:17:47 2025
X-Patchwork-Submitter: Alexander Lobakin
X-Patchwork-Id: 13990350
X-Patchwork-Delegate: kuba@kernel.org
From: Alexander Lobakin
To: Andrew Lunn , "David S.
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: Alexander Lobakin , Lorenzo Bianconi , Daniel Xu , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , John Fastabend , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , Jesper Dangaard Brouer , Martin KaFai Lau , netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, =?utf-8?q?Toke_H=C3=B8il?= =?utf-8?q?and-J=C3=B8rgensen?= Subject: [PATCH net-next v5 5/8] net: skbuff: introduce napi_skb_cache_get_bulk() Date: Tue, 25 Feb 2025 18:17:47 +0100 Message-ID: <20250225171751.2268401-6-aleksander.lobakin@intel.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250225171751.2268401-1-aleksander.lobakin@intel.com> References: <20250225171751.2268401-1-aleksander.lobakin@intel.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org Add a function to get an array of skbs from the NAPI percpu cache. It's supposed to be a drop-in replacement for kmem_cache_alloc_bulk(skbuff_head_cache, GFP_ATOMIC) and xdp_alloc_skb_bulk(GFP_ATOMIC). The difference (apart from the requirement to call it only from the BH) is that it tries to use as many NAPI cache entries for skbs as possible, and allocate new ones only if needed. The logic is as follows: * there is enough skbs in the cache: decache them and return to the caller; * not enough: try refilling the cache first. If there is now enough skbs, return; * still not enough: try allocating skbs directly to the output array with %GFP_ZERO, maybe we'll be able to get some. If there's now enough, return; * still not enough: return as many as we were able to obtain. Most of times, if called from the NAPI polling loop, the first one will be true, sometimes (rarely) the second one. The third and the fourth -- only under heavy memory pressure. It can save significant amounts of CPU cycles if there are GRO cycles and/or Tx completion cycles (anything that descends to napi_skb_cache_put()) happening on this CPU. Tested-by: Daniel Xu Reviewed-by: Toke Høiland-Jørgensen Signed-off-by: Alexander Lobakin --- include/linux/skbuff.h | 1 + net/core/skbuff.c | 62 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 63 insertions(+) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index f2bb8473d99a..e8e190ad2b16 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1321,6 +1321,7 @@ struct sk_buff *build_skb_around(struct sk_buff *skb, void *data, unsigned int frag_size); void skb_attempt_defer_free(struct sk_buff *skb); +u32 napi_skb_cache_get_bulk(void **skbs, u32 n); struct sk_buff *napi_build_skb(void *data, unsigned int frag_size); struct sk_buff *slab_build_skb(void *data); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 5b241c9e6f38..f12815f9c83d 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -295,6 +295,68 @@ static struct sk_buff *napi_skb_cache_get(void) return skb; } +/** + * napi_skb_cache_get_bulk - obtain a number of zeroed skb heads from the cache + * @skbs: pointer to an at least @n-sized array to fill with skb pointers + * @n: number of entries to provide + * + * Tries to obtain @n &sk_buff entries from the NAPI percpu cache and writes + * the pointers into the provided array @skbs. If there are less entries + * available, tries to replenish the cache and bulk-allocates the diff from + * the MM layer if needed. 
+ * The heads are being zeroed with either memset() or %__GFP_ZERO, so they are + * ready for {,__}build_skb_around() and don't have any data buffers attached. + * Must be called *only* from the BH context. + * + * Return: number of successfully allocated skbs (@n if no actual allocation + * needed or kmem_cache_alloc_bulk() didn't fail). + */ +u32 napi_skb_cache_get_bulk(void **skbs, u32 n) +{ + struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); + u32 bulk, total = n; + + local_lock_nested_bh(&napi_alloc_cache.bh_lock); + + if (nc->skb_count >= n) + goto get; + + /* No enough cached skbs. Try refilling the cache first */ + bulk = min(NAPI_SKB_CACHE_SIZE - nc->skb_count, NAPI_SKB_CACHE_BULK); + nc->skb_count += kmem_cache_alloc_bulk(net_hotdata.skbuff_cache, + GFP_ATOMIC | __GFP_NOWARN, bulk, + &nc->skb_cache[nc->skb_count]); + if (likely(nc->skb_count >= n)) + goto get; + + /* Still not enough. Bulk-allocate the missing part directly, zeroed */ + n -= kmem_cache_alloc_bulk(net_hotdata.skbuff_cache, + GFP_ATOMIC | __GFP_ZERO | __GFP_NOWARN, + n - nc->skb_count, &skbs[nc->skb_count]); + if (likely(nc->skb_count >= n)) + goto get; + + /* kmem_cache didn't allocate the number we need, limit the output */ + total -= n - nc->skb_count; + n = nc->skb_count; + +get: + for (u32 base = nc->skb_count - n, i = 0; i < n; i++) { + u32 cache_size = kmem_cache_size(net_hotdata.skbuff_cache); + + skbs[i] = nc->skb_cache[base + i]; + + kasan_mempool_unpoison_object(skbs[i], cache_size); + memset(skbs[i], 0, offsetof(struct sk_buff, tail)); + } + + nc->skb_count -= n; + local_unlock_nested_bh(&napi_alloc_cache.bh_lock); + + return total; +} +EXPORT_SYMBOL_GPL(napi_skb_cache_get_bulk); + static inline void __finalize_skb_around(struct sk_buff *skb, void *data, unsigned int size) { From patchwork Tue Feb 25 17:17:48 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 13990351 X-Patchwork-Delegate: kuba@kernel.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.15]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B3E1A1EA7DF; Tue, 25 Feb 2025 17:24:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.15 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740504300; cv=none; b=j59lRMCYV8y79oXcWMoJlHYbU3MmYac4Ez3kyNk/nmOjqnCawXzNmsUAJ6M1rXiCAYg3hFiyzRZiko+A4uvxxeW/DTGV0TO137b3TbInneGcPATBbcERHG5P2IdyPFpPNTmWqsq5axc59+iyrh+rnR6QVRIDmJ2uRT0cbgrmpLU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740504300; c=relaxed/simple; bh=WsQDKQL1yVZGxz9j/MJ297q+BCLvJbuCmPc5AY/pbu4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=pXgfbX4LsP0qrVPTsRQz6mflva31wAwNj7RecmmSp2E7sSWDKBTWUO/R9kM1D4PBGsjyB66ZDYiaWSTm0vvQbonViGHkgT8xTi+AT2e1GUGOYeOgfjawG97l6kr27iJDV27OD2gYSMgTOxn40oHvX9s21JKvbw6ZYYURbaaocsE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=KNcOXi6d; arc=none smtp.client-ip=198.175.65.15 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com 
From patchwork Tue Feb 25 17:17:48 2025
X-Patchwork-Submitter: Alexander Lobakin
X-Patchwork-Id: 13990351
From: Alexander Lobakin
To: Andrew Lunn , "David S. Miller" , Eric Dumazet , Jakub Kicinski ,
 Paolo Abeni
Cc: Alexander Lobakin , Lorenzo Bianconi , Daniel Xu ,
 Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko ,
 John Fastabend , Toke Høiland-Jørgensen , Jesper Dangaard Brouer ,
 Martin KaFai Lau , netdev@vger.kernel.org, bpf@vger.kernel.org,
 linux-kernel@vger.kernel.org, Toke Høiland-Jørgensen
Subject: [PATCH net-next v5 6/8] bpf: cpumap: switch to napi_skb_cache_get_bulk()
Date: Tue, 25 Feb 2025 18:17:48 +0100
Message-ID: <20250225171751.2268401-7-aleksander.lobakin@intel.com>
In-Reply-To: <20250225171751.2268401-1-aleksander.lobakin@intel.com>
References: <20250225171751.2268401-1-aleksander.lobakin@intel.com>

Now that cpumap uses GRO, which drops unused skb heads to the NAPI
cache, use napi_skb_cache_get_bulk() to try to reuse cached entries and
lower MM layer pressure. Always disable BH before checking and running
the cpumap-pinned XDP prog and don't re-enable it in between that and
allocating the skb bulk, as we can access the NAPI caches only from BH
context.
The better GRO aggregates packets, the fewer new skbs need to be
allocated. If an aggregated skb contains 16 frags, this means 15 skbs
were returned to the cache, so the next 15 skbs will be built without
allocating anything.
The same trafficgen UDP GRO test now shows:

                GRO off   GRO on
threaded GRO    2.3       4         Mpps
thr bulk GRO    2.4       4.7       Mpps
diff            +4        +17       %

Comparing to the baseline cpumap:

baseline        2.7       N/A       Mpps
thr bulk GRO    2.4       4.7       Mpps
diff            -11       +74       %

Tested-by: Daniel Xu
Reviewed-by: Toke Høiland-Jørgensen
Signed-off-by: Alexander Lobakin
---
 kernel/bpf/cpumap.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index 85936f09d8d7..67e8a2fc1a99 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -253,7 +253,7 @@ static void cpu_map_bpf_prog_run(struct bpf_cpu_map_entry *rcpu, void **frames,
 	if (!rcpu->prog)
 		goto out;
 
-	rcu_read_lock_bh();
+	rcu_read_lock();
 	bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx);
 
 	ret->xdp_n = cpu_map_bpf_prog_run_xdp(rcpu, frames, ret->xdp_n, stats);
@@ -265,7 +265,7 @@ static void cpu_map_bpf_prog_run(struct bpf_cpu_map_entry *rcpu, void **frames,
 		xdp_do_flush();
 
 	bpf_net_ctx_clear(bpf_net_ctx);
-	rcu_read_unlock_bh(); /* resched point, may call do_softirq() */
+	rcu_read_unlock();
 
 out:
 	if (unlikely(ret->skb_n) && ret->xdp_n)
@@ -303,7 +303,6 @@ static int cpu_map_kthread_run(void *data)
 	while (!kthread_should_stop() || !__ptr_ring_empty(rcpu->queue)) {
 		struct xdp_cpumap_stats stats = {}; /* zero stats */
 		unsigned int kmem_alloc_drops = 0, sched = 0;
-		gfp_t gfp = __GFP_ZERO | GFP_ATOMIC;
 		struct cpu_map_ret ret = { };
 		void *frames[CPUMAP_BATCH];
 		void *skbs[CPUMAP_BATCH];
@@ -355,15 +354,14 @@ static int cpu_map_kthread_run(void *data)
 			prefetchw(page);
 		}
 
+		local_bh_disable();
+
 		/* Support running another XDP prog on this CPU */
 		cpu_map_bpf_prog_run(rcpu, frames, skbs, &ret, &stats);
-		if (!ret.xdp_n) {
-			local_bh_disable();
+		if (!ret.xdp_n)
 			goto stats;
-		}
 
-		m = kmem_cache_alloc_bulk(net_hotdata.skbuff_cache, gfp,
-					  ret.xdp_n, skbs);
+		m = napi_skb_cache_get_bulk(skbs, ret.xdp_n);
 		if (unlikely(m < ret.xdp_n)) {
 			for (i = m; i < ret.xdp_n; i++)
 				xdp_return_frame(frames[i]);
@@ -376,7 +374,6 @@ static int cpu_map_kthread_run(void *data)
 			ret.xdp_n = m;
 		}
 
-		local_bh_disable();
 		for (i = 0; i < ret.xdp_n; i++) {
 			struct xdp_frame *xdpf = frames[i];
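To summarise the locking change in the hunks above: a single BH-off section
now covers both the pinned XDP prog run and the skb-head allocation, which is
what makes the NAPI percpu cache legal to use here. A condensed, hypothetical
sketch follows; the two stubs stand in for cpumap-internal steps shown in the
diff and are not real kernel functions:

#include <linux/bottom_half.h>
#include <linux/skbuff.h>

/* Placeholder stubs for the cpumap-internal steps; the real code lives
 * in kernel/bpf/cpumap.c (see the diff above).
 */
static void run_pinned_xdp_prog(void **frames, u32 n) { }
static void build_and_receive_skbs(void **frames, void **skbs, u32 m) { }

/* The BH scoping this patch establishes: rcu_read_lock() is enough
 * inside the prog run because BH is already disabled around the whole
 * batch, and napi_skb_cache_get_bulk() is called before BH gets
 * re-enabled.
 */
static void cpumap_batch_sketch(void **frames, void **skbs, u32 n)
{
	u32 m;

	local_bh_disable();

	run_pinned_xdp_prog(frames, n);
	m = napi_skb_cache_get_bulk(skbs, n);
	build_and_receive_skbs(frames, skbs, m);

	local_bh_enable();
}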
From patchwork Tue Feb 25 17:17:49 2025
X-Patchwork-Submitter: Alexander Lobakin
X-Patchwork-Id: 13990352
From: Alexander Lobakin
To: Andrew Lunn , "David S. Miller" , Eric Dumazet , Jakub Kicinski ,
 Paolo Abeni
Cc: Alexander Lobakin , Lorenzo Bianconi , Daniel Xu ,
 Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko ,
 John Fastabend , Toke Høiland-Jørgensen , Jesper Dangaard Brouer ,
 Martin KaFai Lau , netdev@vger.kernel.org, bpf@vger.kernel.org,
 linux-kernel@vger.kernel.org, Toke Høiland-Jørgensen
Subject: [PATCH net-next v5 7/8] veth: use napi_skb_cache_get_bulk() instead
 of xdp_alloc_skb_bulk()
Date: Tue, 25 Feb 2025 18:17:49 +0100
Message-ID: <20250225171751.2268401-8-aleksander.lobakin@intel.com>
In-Reply-To: <20250225171751.2268401-1-aleksander.lobakin@intel.com>
References: <20250225171751.2268401-1-aleksander.lobakin@intel.com>

Now that we can bulk-allocate skbs from the NAPI cache, use
napi_skb_cache_get_bulk() in veth as well instead of allocating directly
from the kmem caches. veth uses NAPI and GRO, so this is both
context-safe and beneficial.
Reviewed-by: Toke Høiland-Jørgensen
Signed-off-by: Alexander Lobakin
---
 drivers/net/veth.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index ba3ae2d8092f..05f5eeef539f 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -684,8 +684,7 @@ static void veth_xdp_rcv_bulk_skb(struct veth_rq *rq, void **frames,
 	void *skbs[VETH_XDP_BATCH];
 	int i;
 
-	if (xdp_alloc_skb_bulk(skbs, n_xdpf,
-			       GFP_ATOMIC | __GFP_ZERO) < 0) {
+	if (unlikely(!napi_skb_cache_get_bulk(skbs, n_xdpf))) {
 		for (i = 0; i < n_xdpf; i++)
 			xdp_return_frame(frames[i]);
 		stats->rx_drops += n_xdpf;
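One detail worth noting when reading the veth hunk above: the two helpers
report failure differently. xdp_alloc_skb_bulk() returned 0 or a negative
errno, while napi_skb_cache_get_bulk() returns the number of heads it
actually provided (see its kernel-doc in patch 5/8), and veth treats a zero
return as the drop case. A hypothetical, reduced illustration of the new
contract; the function name is a placeholder:

#include <linux/skbuff.h>
#include <net/xdp.h>

/* Hypothetical reduced version of the allocation step: unlike
 * xdp_alloc_skb_bulk(), which returned 0 or a negative errno,
 * napi_skb_cache_get_bulk() returns a count, so "got nothing" is
 * signalled by 0 rather than by -ENOMEM.
 */
static int rcv_bulk_alloc_sketch(struct xdp_frame **frames, void **skbs,
				 u32 n_xdpf)
{
	u32 got, i;

	got = napi_skb_cache_get_bulk(skbs, n_xdpf);
	if (unlikely(!got)) {
		/* as in the veth hunk above: drop the whole batch */
		for (i = 0; i < n_xdpf; i++)
			xdp_return_frame(frames[i]);
		return -ENOMEM;
	}

	return got;
}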
From patchwork Tue Feb 25 17:17:50 2025
X-Patchwork-Submitter: Alexander Lobakin
X-Patchwork-Id: 13990353
From: Alexander Lobakin
To: Andrew Lunn , "David S. Miller" , Eric Dumazet , Jakub Kicinski ,
 Paolo Abeni
Cc: Alexander Lobakin , Lorenzo Bianconi , Daniel Xu ,
 Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko ,
 John Fastabend , Toke Høiland-Jørgensen , Jesper Dangaard Brouer ,
 Martin KaFai Lau , netdev@vger.kernel.org, bpf@vger.kernel.org,
 linux-kernel@vger.kernel.org, Toke Høiland-Jørgensen
Subject: [PATCH net-next v5 8/8] xdp: remove xdp_alloc_skb_bulk()
Date: Tue, 25 Feb 2025 18:17:50 +0100
Message-ID: <20250225171751.2268401-9-aleksander.lobakin@intel.com>
In-Reply-To: <20250225171751.2268401-1-aleksander.lobakin@intel.com>
References: <20250225171751.2268401-1-aleksander.lobakin@intel.com>

The only user was veth, which now uses napi_skb_cache_get_bulk(). That
helper is now preferred over direct allocation and is exported as well,
so remove xdp_alloc_skb_bulk().

Reviewed-by: Toke Høiland-Jørgensen
Signed-off-by: Alexander Lobakin
---
 include/net/xdp.h |  1 -
 net/core/xdp.c    | 10 ----------
 2 files changed, 11 deletions(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 4dafc5e021f1..48efacbaa35d 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -343,7 +343,6 @@ struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 					   struct net_device *dev);
 struct sk_buff *xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 					 struct net_device *dev);
-int xdp_alloc_skb_bulk(void **skbs, int n_skb, gfp_t gfp);
 struct xdp_frame *xdpf_clone(struct xdp_frame *xdpf);
 
 static inline
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 2c6ab6fb452f..f86eedad586a 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -618,16 +618,6 @@ void xdp_warn(const char *msg, const char *func, const int line)
 };
 EXPORT_SYMBOL_GPL(xdp_warn);
 
-int xdp_alloc_skb_bulk(void **skbs, int n_skb, gfp_t gfp)
-{
-	n_skb = kmem_cache_alloc_bulk(net_hotdata.skbuff_cache, gfp, n_skb, skbs);
-	if (unlikely(!n_skb))
-		return -ENOMEM;
-
-	return 0;
-}
-EXPORT_SYMBOL_GPL(xdp_alloc_skb_bulk);
-
 /**
  * xdp_build_skb_from_buff - create an skb from &xdp_buff
  * @xdp: &xdp_buff to convert to an skb