[net-next,v6,08/12] libie: add Rx buffer management (via Page Pool)

Message ID	20231207172010.1441468-9-aleksander.lobakin@intel.com (mailing list archive)
State	Superseded
Delegated to:	Netdev Maintainers
Headers
	Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="P2Mydmz6"
	From: Alexander Lobakin <aleksander.lobakin@intel.com>
	To: "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
	Cc: Alexander Lobakin <aleksander.lobakin@intel.com>, Maciej Fijalkowski <maciej.fijalkowski@intel.com>, Michal Kubiak <michal.kubiak@intel.com>, Larysa Zaremba <larysa.zaremba@intel.com>, Alexander Duyck <alexanderduyck@fb.com>, Yunsheng Lin <linyunsheng@huawei.com>, David Christensen <drc@linux.vnet.ibm.com>, Jesper Dangaard Brouer <hawk@kernel.org>, Ilias Apalodimas <ilias.apalodimas@linaro.org>, Paul Menzel <pmenzel@molgen.mpg.de>, netdev@vger.kernel.org, intel-wired-lan@lists.osuosl.org, linux-kernel@vger.kernel.org
	Subject: [PATCH net-next v6 08/12] libie: add Rx buffer management (via Page Pool)
	Date: Thu, 7 Dec 2023 18:20:06 +0100
	Message-ID: <20231207172010.1441468-9-aleksander.lobakin@intel.com>
	In-Reply-To: <20231207172010.1441468-1-aleksander.lobakin@intel.com>
	References: <20231207172010.1441468-1-aleksander.lobakin@intel.com>
	Precedence: bulk
	MIME-Version: 1.0
	Content-Transfer-Encoding: 8bit
Series	net: intel: start The Great Code Dedup + Page Pool for iavf
	[net-next,v6,00/12] net: intel: start The Great Code Dedup + Page Pool for iavf
	[net-next,v6,01/12] page_pool: make sure frag API fields don't span between cachelines
	[net-next,v6,02/12] page_pool: don't use driver-set flags field directly
	[net-next,v6,03/12] net: intel: introduce Intel Ethernet common library
	[net-next,v6,04/12] iavf: kill "legacy-rx" for good
	[net-next,v6,05/12] iavf: drop page splitting and recycling
	[net-next,v6,06/12] page_pool: constify some read-only function arguments
	[net-next,v6,07/12] page_pool: add DMA-sync-for-CPU inline helper
	[net-next,v6,08/12] libie: add Rx buffer management (via Page Pool)
	[net-next,v6,09/12] iavf: pack iavf_ring more efficiently
	[net-next,v6,10/12] iavf: switch to Page Pool
	[net-next,v6,11/12] libie: add common queue stats
	[net-next,v6,12/12] iavf: switch queue stats to libie

Context	Check	Description
netdev/series_format	success	Posting correctly formatted
netdev/tree_selection	success	Clearly marked for net-next, async
netdev/ynl	success	Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 1115 this patch: 1115
netdev/cc_maintainers	warning	2 maintainers not CCed: anthony.l.nguyen@intel.com jesse.brandeburg@intel.com
netdev/build_clang	success	Errors and warnings before: 1142 this patch: 1142
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 1142 this patch: 1142
netdev/checkpatch	success	total: 0 errors, 0 warnings, 0 checks, 221 lines checked
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0

diff --git a/drivers/net/ethernet/intel/libie/Kconfig b/drivers/net/ethernet/intel/libie/Kconfig
index 1eda4a5faa5a..6e0162fb94d2 100644
--- a/drivers/net/ethernet/intel/libie/Kconfig
+++ b/drivers/net/ethernet/intel/libie/Kconfig
@@ -3,6 +3,7 @@
 
 config LIBIE
 	tristate
+	select PAGE_POOL
 	help
 	  libie (Intel Ethernet library) is a common library containing
 	  routines shared between several Intel Ethernet drivers.
diff --git a/drivers/net/ethernet/intel/libie/rx.c b/drivers/net/ethernet/intel/libie/rx.c
index f503476d8eef..359867714a1b 100644
--- a/drivers/net/ethernet/intel/libie/rx.c
+++ b/drivers/net/ethernet/intel/libie/rx.c
@@ -3,6 +3,75 @@
 
 #include <linux/net/intel/libie/rx.h>
 
+/* Rx buffer management */
+
+/**
+ * libie_rx_hw_len - get the actual buffer size to be passed to HW
+ * @pp: &page_pool_params of the netdev to calculate the size for
+ *
+ * Return: HW-writeable length per one buffer to pass it to the HW accounting:
+ * MTU the @dev has, HW required alignment, minimum and maximum allowed values,
+ * and system's page size.
+ */
+static u32 libie_rx_hw_len(const struct page_pool_params *pp)
+{
+	u32 len;
+
+	len = READ_ONCE(pp->netdev->mtu) + LIBIE_RX_LL_LEN;
+	len = ALIGN(len, LIBIE_RX_BUF_LEN_ALIGN);
+	len = clamp(len, LIBIE_MIN_RX_BUF_LEN, pp->max_len);
+
+	return len;
+}
+
+/**
+ * libie_rx_page_pool_create - create a PP with the default libie settings
+ * @bq: buffer queue struct to fill
+ * @napi: &napi_struct covering this PP (no usage outside its poll loops)
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int libie_rx_page_pool_create(struct libie_buf_queue *bq,
+			      struct napi_struct *napi)
+{
+	struct page_pool_params pp = {
+		.flags		= PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
+		.order		= LIBIE_RX_PAGE_ORDER,
+		.pool_size	= bq->count,
+		.nid		= NUMA_NO_NODE,
+		.dev		= napi->dev->dev.parent,
+		.netdev		= napi->dev,
+		.napi		= napi,
+		.dma_dir	= DMA_FROM_DEVICE,
+		.offset		= LIBIE_SKB_HEADROOM,
+	};
+
+	/* HW-writeable / syncable length per one page */
+	pp.max_len = LIBIE_RX_BUF_LEN(pp.offset);
+
+	/* HW-writeable length per buffer */
+	bq->rx_buf_len = libie_rx_hw_len(&pp);
+	/* Buffer size to allocate */
+	bq->truesize = roundup_pow_of_two(SKB_HEAD_ALIGN(pp.offset +
+							 bq->rx_buf_len));
+
+	bq->pp = page_pool_create(&pp);
+
+	return PTR_ERR_OR_ZERO(bq->pp);
+}
+EXPORT_SYMBOL_NS_GPL(libie_rx_page_pool_create, LIBIE);
+
+/**
+ * libie_rx_page_pool_destroy - destroy a &page_pool created by libie
+ * @bq: buffer queue to process
+ */
+void libie_rx_page_pool_destroy(struct libie_buf_queue *bq)
+{
+	page_pool_destroy(bq->pp);
+	bq->pp = NULL;
+}
+EXPORT_SYMBOL_NS_GPL(libie_rx_page_pool_destroy, LIBIE);
+
 /* O(1) converting i40e/ice/iavf's 8/10-bit hardware packet type to a parsed
  * bitfield struct.
  */
diff --git a/include/linux/net/intel/libie/rx.h b/include/linux/net/intel/libie/rx.h
index 55263930aa99..71bc9a1a9856 100644
--- a/include/linux/net/intel/libie/rx.h
+++ b/include/linux/net/intel/libie/rx.h
@@ -4,7 +4,138 @@
 #ifndef __LIBIE_RX_H
 #define __LIBIE_RX_H
 
-#include <linux/netdevice.h>
+#include <linux/if_vlan.h>
+#include <net/page_pool/helpers.h>
+
+/* Rx MTU/buffer/truesize helpers. Mostly pure software-side; HW-defined values
+ * are valid for all Intel HW.
+ */
+
+/* Space reserved in front of each frame */
+#define LIBIE_SKB_HEADROOM	(NET_SKB_PAD + NET_IP_ALIGN)
+/* Maximum headroom to calculate max MTU below */
+#define LIBIE_MAX_HEADROOM	LIBIE_SKB_HEADROOM
+/* Link layer / L2 overhead: Ethernet, 2 VLAN tags (C + S), FCS */
+#define LIBIE_RX_LL_LEN		(ETH_HLEN + 2 * VLAN_HLEN + ETH_FCS_LEN)
+
+/* Always use order-0 pages */
+#define LIBIE_RX_PAGE_ORDER	0
+/* Rx buffer size config is a multiple of 128 */
+#define LIBIE_RX_BUF_LEN_ALIGN	128
+/* HW-writeable space in one buffer: truesize - headroom/tailroom,
+ * HW-aligned
+ */
+#define __LIBIE_RX_BUF_LEN(hr)						\
+	ALIGN_DOWN(SKB_MAX_ORDER(hr, LIBIE_RX_PAGE_ORDER),		\
+		   LIBIE_RX_BUF_LEN_ALIGN)
+/* The smallest and largest size for a single descriptor as per HW */
+#define LIBIE_MIN_RX_BUF_LEN	1024U
+#define LIBIE_MAX_RX_BUF_LEN	9728U
+/* "True" HW-writeable space: minimum from SW and HW values */
+#define LIBIE_RX_BUF_LEN(hr)	min_t(u32, __LIBIE_RX_BUF_LEN(hr),	\
+				      LIBIE_MAX_RX_BUF_LEN)
+
+/* The maximum frame size as per HW (S/G) */
+#define __LIBIE_MAX_RX_FRM_LEN	16382U
+/* ATST, HW can chain up to 5 Rx descriptors */
+#define LIBIE_MAX_RX_FRM_LEN(hr)					\
+	min_t(u32, __LIBIE_MAX_RX_FRM_LEN, LIBIE_RX_BUF_LEN(hr) * 5)
+/* Maximum frame size minus LL overhead */
+#define LIBIE_MAX_MTU							\
+	(LIBIE_MAX_RX_FRM_LEN(LIBIE_MAX_HEADROOM) - LIBIE_RX_LL_LEN)
+
+/* Rx buffer management */
+
+/**
+ * struct libie_rx_buffer - structure representing an Rx buffer
+ * @page: page holding the buffer
+ * @offset: offset from the page start (to the headroom)
+ * @truesize: total space occupied by the buffer (w/ headroom and tailroom)
+ *
+ * Depending on the MTU, API switches between one-page-per-frame and shared
+ * page model (to conserve memory on bigger-page platforms). In case of the
+ * former, @offset is always 0 and @truesize is always %PAGE_SIZE.
+ */
+struct libie_rx_buffer {
+	struct page		*page;
+	u32			offset;
+	u32			truesize;
+};
+
+/**
+ * struct libie_buf_queue - structure representing a buffer queue
+ * @pp: &page_pool for buffer management
+ * @rx_bi: array of Rx buffers
+ * @truesize: size to allocate per buffer, w/overhead
+ * @count: number of descriptors/buffers the queue has
+ * @rx_buf_len: HW-writeable length per each buffer
+ */
+struct libie_buf_queue {
+	struct page_pool	*pp;
+	struct libie_rx_buffer	*rx_bi;
+
+	u32			truesize;
+	u32			count;
+
+	/* Cold fields */
+	u32			rx_buf_len;
+};
+
+int libie_rx_page_pool_create(struct libie_buf_queue *bq,
+			      struct napi_struct *napi);
+void libie_rx_page_pool_destroy(struct libie_buf_queue *bq);
+
+/**
+ * libie_rx_alloc - allocate a new Rx buffer
+ * @bq: buffer queue to allocate for
+ * @i: index of the buffer within the queue
+ *
+ * Return: DMA address to be passed to HW for Rx on successful allocation,
+ * %DMA_MAPPING_ERROR otherwise.
+ */
+static inline dma_addr_t libie_rx_alloc(const struct libie_buf_queue *bq,
+					u32 i)
+{
+	struct libie_rx_buffer *buf = &bq->rx_bi[i];
+
+	buf->truesize = bq->truesize;
+	buf->page = page_pool_dev_alloc(bq->pp, &buf->offset, &buf->truesize);
+	if (unlikely(!buf->page))
+		return DMA_MAPPING_ERROR;
+
+	return page_pool_get_dma_addr(buf->page) + buf->offset +
+	       bq->pp->p.offset;
+}
+
+/**
+ * libie_rx_sync_for_cpu - synchronize or recycle buffer post DMA
+ * @buf: buffer to process
+ * @len: frame length from the descriptor
+ *
+ * Process the buffer after it's written by HW. The regular path is to
+ * synchronize DMA for CPU, but in case of no data it will be immediately
+ * recycled back to its PP.
+ *
+ * Return: true when there's data to process, false otherwise.
+ */
+static inline bool libie_rx_sync_for_cpu(const struct libie_rx_buffer *buf,
+					 u32 len)
+{
+	struct page *page = buf->page;
+
+	/* Very rare, but possible case. The most common reason:
+	 * the last fragment contained FCS only, which was then
+	 * stripped by the HW.
+	 */
+	if (unlikely(!len)) {
+		page_pool_recycle_direct(page->pp, page);
+		return false;
+	}
+
+	page_pool_dma_sync_for_cpu(page->pp, page, buf->offset, len);
+
+	return true;
+}
 
 /* O(1) converting i40e/ice/iavf's 8/10-bit hardware packet type to a parsed
  * bitfield struct.
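
The sizing arithmetic in libie_rx_hw_len(), libie_rx_page_pool_create() and LIBIE_MAX_MTU can be followed with concrete numbers. The user-space sketch below re-implements only that arithmetic; it is not the kernel code. All ex_* names are hypothetical, and the page size, headroom, cache line and skb_shared_info overhead are assumed values for a typical x86_64 build with 4 KiB pages, not values stated in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED platform values (typical x86_64, 4 KiB pages); not from the patch */
#define EX_PAGE_SIZE		4096u
#define EX_SKB_HEADROOM		64u	/* NET_SKB_PAD + NET_IP_ALIGN, assumed 64 + 0 */
#define EX_CACHE_LINE		64u	/* SMP_CACHE_BYTES, assumed */
#define EX_SHINFO_SIZE		320u	/* aligned sizeof(struct skb_shared_info), assumed */

/* Values taken from the patch */
#define EX_RX_LL_LEN		(14u + 2u * 4u + 4u)	/* ETH_HLEN + 2 * VLAN_HLEN + ETH_FCS_LEN */
#define EX_RX_BUF_LEN_ALIGN	128u
#define EX_MIN_RX_BUF_LEN	1024u
#define EX_MAX_RX_BUF_LEN	9728u
#define EX_MAX_RX_FRM_LEN	16382u

/* Round x up to a multiple of a; a must be a power of two */
static uint32_t ex_align_up(uint32_t x, uint32_t a)
{
	assert(a && !(a & (a - 1)));
	return (x + a - 1) & ~(a - 1);
}

/* Round x down to a multiple of a; a must be a power of two */
static uint32_t ex_align_down(uint32_t x, uint32_t a)
{
	assert(a && !(a & (a - 1)));
	return x & ~(a - 1);
}

static uint32_t ex_min(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Smallest power of two that is >= x (x >= 1) */
static uint32_t ex_roundup_pow_of_two(uint32_t x)
{
	uint32_t p = 1;

	while (p < x)
		p <<= 1;
	return p;
}

/* Mirrors LIBIE_RX_BUF_LEN(hr): HW-writeable space in one order-0 page */
static uint32_t ex_rx_buf_max_len(void)
{
	uint32_t sw = EX_PAGE_SIZE - EX_SKB_HEADROOM - EX_SHINFO_SIZE;

	return ex_min(ex_align_down(sw, EX_RX_BUF_LEN_ALIGN),
		      EX_MAX_RX_BUF_LEN);
}

/* Mirrors libie_rx_hw_len(): MTU + L2 overhead, aligned, then clamped */
static uint32_t ex_rx_hw_len(uint32_t mtu)
{
	uint32_t len = ex_align_up(mtu + EX_RX_LL_LEN, EX_RX_BUF_LEN_ALIGN);
	uint32_t max = ex_rx_buf_max_len();

	if (len < EX_MIN_RX_BUF_LEN)
		len = EX_MIN_RX_BUF_LEN;
	if (len > max)
		len = max;
	return len;
}

/* Mirrors the bq->truesize computation in libie_rx_page_pool_create() */
static uint32_t ex_truesize(uint32_t rx_buf_len)
{
	uint32_t head = ex_align_up(EX_SKB_HEADROOM + rx_buf_len,
				    EX_CACHE_LINE) + EX_SHINFO_SIZE;

	return ex_roundup_pow_of_two(head);
}

/* Mirrors LIBIE_MAX_MTU: up to 5 chained buffers, minus L2 overhead */
static uint32_t ex_max_mtu(void)
{
	return ex_min(EX_MAX_RX_FRM_LEN, ex_rx_buf_max_len() * 5) -
	       EX_RX_LL_LEN;
}
```

Under these assumed constants, an MTU of 1500 gives a 1536-byte HW-writeable buffer with a truesize of 2048, so two buffers share one 4 KiB page. An MTU of 9000 is clamped to 3712 bytes per buffer with a truesize of 4096, which is the one-page-per-frame case described in the struct libie_rx_buffer kernel-doc. The resulting maximum MTU is 16356. Other architectures or kernel configs will produce different numbers.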
