[net-next,03/11] iavf: optimize Rx buffer allocation a bunch

Message ID	20230516161841.37138-4-aleksander.lobakin@intel.com (mailing list archive)
State	Superseded
Delegated to:	Netdev Maintainers
Headers	Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 771A934CF9 for <netdev@vger.kernel.org>; Tue, 16 May 2023 16:20:24 +0000 (UTC)
From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>, Maciej Fijalkowski <maciej.fijalkowski@intel.com>, Magnus Karlsson <magnus.karlsson@intel.com>, Michal Kubiak <michal.kubiak@intel.com>, Larysa Zaremba <larysa.zaremba@intel.com>, Jesper Dangaard Brouer <hawk@kernel.org>, Ilias Apalodimas <ilias.apalodimas@linaro.org>, Christoph Hellwig <hch@lst.de>, netdev@vger.kernel.org, intel-wired-lan@lists.osuosl.org, linux-kernel@vger.kernel.org
Subject: [PATCH net-next 03/11] iavf: optimize Rx buffer allocation a bunch
Date: Tue, 16 May 2023 18:18:33 +0200
Message-Id: <20230516161841.37138-4-aleksander.lobakin@intel.com>
In-Reply-To: <20230516161841.37138-1-aleksander.lobakin@intel.com>
References: <20230516161841.37138-1-aleksander.lobakin@intel.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Series	net: intel: start The Great Code Dedup + Page Pool for iavf
[net-next,00/11] net: intel: start The Great Code Dedup + Page Pool for iavf
[net-next,01/11] net: intel: introduce Intel Ethernet common library
[net-next,02/11] iavf: kill "legacy-rx" for good
[net-next,03/11] iavf: optimize Rx buffer allocation a bunch
[net-next,04/11] iavf: remove page splitting/recycling
[net-next,05/11] iavf: always use a full order-0 page
[net-next,06/11] net: page_pool: avoid calling no-op externals when possible
[net-next,07/11] net: page_pool: add DMA-sync-for-CPU inline helpers
[net-next,08/11] iavf: switch to Page Pool
[net-next,09/11] libie: add common queue stats
[net-next,10/11] libie: add per-queue Page Pool stats
[net-next,11/11] iavf: switch queue stats to libie

Context	Check	Description
netdev/series_format	success	Posting correctly formatted
netdev/tree_selection	success	Clearly marked for net-next, async
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers	warning	2 maintainers not CCed: jesse.brandeburg@intel.com anthony.l.nguyen@intel.com
netdev/build_clang	success	Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 8 this patch: 8
netdev/checkpatch	success	total: 0 errors, 0 warnings, 0 checks, 487 lines checked
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0

diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index a5a6c9861a93..ade32aa1ed78 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -1236,7 +1236,7 @@ static void iavf_configure(struct iavf_adapter *adapter)
 	for (i = 0; i < adapter->num_active_queues; i++) {
 		struct iavf_ring *ring = &adapter->rx_rings[i];
 
-		iavf_alloc_rx_buffers(ring, IAVF_DESC_UNUSED(ring));
+		iavf_alloc_rx_buffers(ring);
 	}
 }
 
diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
index a7121dc5c32b..fd08ce67380e 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
@@ -736,7 +736,6 @@ void iavf_clean_rx_ring(struct iavf_ring *rx_ring)
 	/* Zero out the descriptor ring */
 	memset(rx_ring->desc, 0, rx_ring->size);
 
-	rx_ring->next_to_alloc = 0;
 	rx_ring->next_to_clean = 0;
 	rx_ring->next_to_use = 0;
 }
@@ -792,7 +791,6 @@ int iavf_setup_rx_descriptors(struct iavf_ring *rx_ring)
 		goto err;
 	}
 
-	rx_ring->next_to_alloc = 0;
 	rx_ring->next_to_clean = 0;
 	rx_ring->next_to_use = 0;
 
@@ -812,9 +810,6 @@ static inline void iavf_release_rx_desc(struct iavf_ring *rx_ring, u32 val)
 {
 	rx_ring->next_to_use = val;
 
-	/* update next to alloc since we have filled the ring */
-	rx_ring->next_to_alloc = val;
-
 	/* Force memory writes to complete before letting h/w
 	 * know there are new descriptors to fetch. (Only
 	 * applicable for weak-ordered memory model archs,
@@ -828,12 +823,17 @@ static inline void iavf_release_rx_desc(struct iavf_ring *rx_ring, u32 val)
  * iavf_alloc_mapped_page - recycle or make a new page
  * @rx_ring: ring to use
  * @bi: rx_buffer struct to modify
+ * @dev: device used for DMA mapping
+ * @order: page order to allocate
+ * @gfp: GFP mask to allocate page
  *
  * Returns true if the page was successfully allocated or
  * reused.
  **/
 static bool iavf_alloc_mapped_page(struct iavf_ring *rx_ring,
-				   struct iavf_rx_buffer *bi)
+				   struct iavf_rx_buffer *bi,
+				   struct device *dev, u32 order,
+				   gfp_t gfp)
 {
 	struct page *page = bi->page;
 	dma_addr_t dma;
@@ -845,23 +845,21 @@ static bool iavf_alloc_mapped_page(struct iavf_ring *rx_ring,
 	}
 
 	/* alloc new page for storage */
-	page = dev_alloc_pages(iavf_rx_pg_order(rx_ring));
+	page = __dev_alloc_pages(gfp, order);
 	if (unlikely(!page)) {
 		rx_ring->rx_stats.alloc_page_failed++;
 		return false;
 	}
 
 	/* map page for use */
-	dma = dma_map_page_attrs(rx_ring->dev, page, 0,
-				 iavf_rx_pg_size(rx_ring),
-				 DMA_FROM_DEVICE,
-				 IAVF_RX_DMA_ATTR);
+	dma = dma_map_page_attrs(dev, page, 0, PAGE_SIZE << order,
+				 DMA_FROM_DEVICE, IAVF_RX_DMA_ATTR);
 
 	/* if mapping failed free memory back to system since
 	 * there isn't much point in holding memory we can't use
 	 */
-	if (dma_mapping_error(rx_ring->dev, dma)) {
-		__free_pages(page, iavf_rx_pg_order(rx_ring));
+	if (dma_mapping_error(dev, dma)) {
+		__free_pages(page, order);
 		rx_ring->rx_stats.alloc_page_failed++;
 		return false;
 	}
@@ -898,32 +896,36 @@ static void iavf_receive_skb(struct iavf_ring *rx_ring,
 }
 
 /**
- * iavf_alloc_rx_buffers - Replace used receive buffers
+ * __iavf_alloc_rx_buffers - Replace used receive buffers
  * @rx_ring: ring to place buffers on
- * @cleaned_count: number of buffers to replace
+ * @to_refill: number of buffers to replace
+ * @gfp: GFP mask to allocate pages
  *
- * Returns false if all allocations were successful, true if any fail
+ * Returns 0 if all allocations were successful or the number of buffers left
+ * to refill in case of an allocation failure.
  **/
-bool iavf_alloc_rx_buffers(struct iavf_ring *rx_ring, u16 cleaned_count)
+static u32 __iavf_alloc_rx_buffers(struct iavf_ring *rx_ring, u32 to_refill,
+				   gfp_t gfp)
 {
-	u16 ntu = rx_ring->next_to_use;
+	u32 order = iavf_rx_pg_order(rx_ring);
+	struct device *dev = rx_ring->dev;
+	u32 ntu = rx_ring->next_to_use;
 	union iavf_rx_desc *rx_desc;
 	struct iavf_rx_buffer *bi;
 
 	/* do nothing if no valid netdev defined */
-	if (!rx_ring->netdev || !cleaned_count)
-		return false;
+	if (unlikely(!rx_ring->netdev || !to_refill))
+		return 0;
 
 	rx_desc = IAVF_RX_DESC(rx_ring, ntu);
 	bi = &rx_ring->rx_bi[ntu];
 
 	do {
-		if (!iavf_alloc_mapped_page(rx_ring, bi))
-			goto no_buffers;
+		if (!iavf_alloc_mapped_page(rx_ring, bi, dev, order, gfp))
+			break;
 
 		/* sync the buffer for use by the device */
-		dma_sync_single_range_for_device(rx_ring->dev, bi->dma,
-						 bi->page_offset,
+		dma_sync_single_range_for_device(dev, bi->dma, bi->page_offset,
 						 rx_ring->rx_buf_len,
 						 DMA_FROM_DEVICE);
 
@@ -943,23 +945,17 @@ bool iavf_alloc_rx_buffers(struct iavf_ring *rx_ring, u16 cleaned_count)
 
 		/* clear the status bits for the next_to_use descriptor */
 		rx_desc->wb.qword1.status_error_len = 0;
-
-		cleaned_count--;
-	} while (cleaned_count);
+	} while (--to_refill);
 
 	if (rx_ring->next_to_use != ntu)
 		iavf_release_rx_desc(rx_ring, ntu);
 
-	return false;
-
-no_buffers:
-	if (rx_ring->next_to_use != ntu)
-		iavf_release_rx_desc(rx_ring, ntu);
+	return to_refill;
+}
 
-	/* make sure to come back via polling to try again after
-	 * allocation failure
-	 */
-	return true;
+void iavf_alloc_rx_buffers(struct iavf_ring *rxr)
+{
+	__iavf_alloc_rx_buffers(rxr, IAVF_DESC_UNUSED(rxr), GFP_KERNEL);
 }
 
 /**
@@ -1104,32 +1100,6 @@ static bool iavf_cleanup_headers(struct iavf_ring *rx_ring, struct sk_buff *skb)
 	return false;
 }
 
-/**
- * iavf_reuse_rx_page - page flip buffer and store it back on the ring
- * @rx_ring: rx descriptor ring to store buffers on
- * @old_buff: donor buffer to have page reused
- *
- * Synchronizes page for reuse by the adapter
- **/
-static void iavf_reuse_rx_page(struct iavf_ring *rx_ring,
-			       struct iavf_rx_buffer *old_buff)
-{
-	struct iavf_rx_buffer *new_buff;
-	u16 nta = rx_ring->next_to_alloc;
-
-	new_buff = &rx_ring->rx_bi[nta];
-
-	/* update, and store next to alloc */
-	nta++;
-	rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0;
-
-	/* transfer page from old buffer to new buffer */
-	new_buff->dma = old_buff->dma;
-	new_buff->page = old_buff->page;
-	new_buff->page_offset = old_buff->page_offset;
-	new_buff->pagecnt_bias = old_buff->pagecnt_bias;
-}
-
 /**
  * iavf_can_reuse_rx_page - Determine if this page can be reused by
  * the adapter for another receive
@@ -1191,30 +1161,26 @@ static bool iavf_can_reuse_rx_page(struct iavf_rx_buffer *rx_buffer)
 
 /**
  * iavf_add_rx_frag - Add contents of Rx buffer to sk_buff
- * @rx_ring: rx descriptor ring to transact packets on
- * @rx_buffer: buffer containing page to add
  * @skb: sk_buff to place the data into
+ * @rx_buffer: buffer containing page to add
  * @size: packet length from rx_desc
+ * @pg_size: Rx buffer page size
  *
  * This function will add the data contained in rx_buffer->page to the skb.
  * It will just attach the page as a frag to the skb.
  *
  * The function will then update the page offset.
  **/
-static void iavf_add_rx_frag(struct iavf_ring *rx_ring,
+static void iavf_add_rx_frag(struct sk_buff *skb,
 			     struct iavf_rx_buffer *rx_buffer,
-			     struct sk_buff *skb,
-			     unsigned int size)
+			     u32 size, u32 pg_size)
 {
 #if (PAGE_SIZE < 8192)
-	unsigned int truesize = iavf_rx_pg_size(rx_ring) / 2;
+	unsigned int truesize = pg_size / 2;
 #else
 	unsigned int truesize = SKB_DATA_ALIGN(size + IAVF_SKB_PAD);
 #endif
 
-	if (!size)
-		return;
-
 	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, rx_buffer->page,
 			rx_buffer->page_offset, size, truesize);
 
@@ -1224,63 +1190,47 @@ static void iavf_add_rx_frag(struct iavf_ring *rx_ring,
 #else
 	rx_buffer->page_offset += truesize;
 #endif
+
+	/* We have pulled a buffer for use, so decrement pagecnt_bias */
+	rx_buffer->pagecnt_bias--;
 }
 
 /**
- * iavf_get_rx_buffer - Fetch Rx buffer and synchronize data for use
- * @rx_ring: rx descriptor ring to transact packets on
- * @size: size of buffer to add to skb
+ * iavf_sync_rx_buffer - Synchronize received data for use
+ * @dev: device used for DMA mapping
+ * @buf: Rx buffer containing the data
+ * @size: size of the received data
  *
- * This function will pull an Rx buffer from the ring and synchronize it
- * for use by the CPU.
+ * This function will synchronize the Rx buffer for use by the CPU.
  */
-static struct iavf_rx_buffer *iavf_get_rx_buffer(struct iavf_ring *rx_ring,
-						 const unsigned int size)
+static void iavf_sync_rx_buffer(struct device *dev, struct iavf_rx_buffer *buf,
+				u32 size)
 {
-	struct iavf_rx_buffer *rx_buffer;
-
-	rx_buffer = &rx_ring->rx_bi[rx_ring->next_to_clean];
-	prefetchw(rx_buffer->page);
-	if (!size)
-		return rx_buffer;
-
-	/* we are reusing so sync this buffer for CPU use */
-	dma_sync_single_range_for_cpu(rx_ring->dev,
-				      rx_buffer->dma,
-				      rx_buffer->page_offset,
-				      size,
+	dma_sync_single_range_for_cpu(dev, buf->dma, buf->page_offset, size,
 				      DMA_FROM_DEVICE);
-
-	/* We have pulled a buffer for use, so decrement pagecnt_bias */
-	rx_buffer->pagecnt_bias--;
-
-	return rx_buffer;
 }
 
 /**
  * iavf_build_skb - Build skb around an existing buffer
- * @rx_ring: Rx descriptor ring to transact packets on
- * @rx_buffer: Rx buffer to pull data from
- * @size: size of buffer to add to skb
+ * @rx_buffer: Rx buffer with the data
+ * @size: size of the data
+ * @pg_size: size of the Rx page
  *
  * This function builds an skb around an existing Rx buffer, taking care
  * to set up the skb correctly and avoid any memcpy overhead.
  */
-static struct sk_buff *iavf_build_skb(struct iavf_ring *rx_ring,
-				      struct iavf_rx_buffer *rx_buffer,
-				      unsigned int size)
+static struct sk_buff *iavf_build_skb(struct iavf_rx_buffer *rx_buffer,
+				      u32 size, u32 pg_size)
 {
 	void *va;
 #if (PAGE_SIZE < 8192)
-	unsigned int truesize = iavf_rx_pg_size(rx_ring) / 2;
+	unsigned int truesize = pg_size / 2;
 #else
 	unsigned int truesize = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) +
 				SKB_DATA_ALIGN(IAVF_SKB_PAD + size);
 #endif
 	struct sk_buff *skb;
 
-	if (!rx_buffer || !size)
-		return NULL;
 	/* prefetch first cache line of first page */
 	va = page_address(rx_buffer->page) + rx_buffer->page_offset;
 	net_prefetch(va);
@@ -1301,36 +1251,33 @@ static struct sk_buff *iavf_build_skb(struct iavf_ring *rx_ring,
 	rx_buffer->page_offset += truesize;
 #endif
 
+	rx_buffer->pagecnt_bias--;
+
 	return skb;
 }
 
 /**
- * iavf_put_rx_buffer - Clean up used buffer and either recycle or free
+ * iavf_put_rx_buffer - Recycle or free used buffer
  * @rx_ring: rx descriptor ring to transact packets on
- * @rx_buffer: rx buffer to pull data from
+ * @dev: device used for DMA mapping
+ * @rx_buffer: Rx buffer to handle
+ * @pg_size: Rx page size
  *
- * This function will clean up the contents of the rx_buffer. It will
- * either recycle the buffer or unmap it and free the associated resources.
+ * Either recycle the buffer if possible or unmap and free the page.
  */
-static void iavf_put_rx_buffer(struct iavf_ring *rx_ring,
-			       struct iavf_rx_buffer *rx_buffer)
+static void iavf_put_rx_buffer(struct iavf_ring *rx_ring, struct device *dev,
+			       struct iavf_rx_buffer *rx_buffer, u32 pg_size)
 {
-	if (!rx_buffer)
-		return;
-
 	if (iavf_can_reuse_rx_page(rx_buffer)) {
-		/* hand second half of page back to the ring */
-		iavf_reuse_rx_page(rx_ring, rx_buffer);
 		rx_ring->rx_stats.page_reuse_count++;
-	} else {
-		/* we are not reusing the buffer so unmap it */
-		dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
-				     iavf_rx_pg_size(rx_ring),
-				     DMA_FROM_DEVICE, IAVF_RX_DMA_ATTR);
-		__page_frag_cache_drain(rx_buffer->page,
-					rx_buffer->pagecnt_bias);
+		return;
 	}
 
+	/* we are not reusing the buffer so unmap it */
+	dma_unmap_page_attrs(dev, rx_buffer->dma, pg_size,
+			     DMA_FROM_DEVICE, IAVF_RX_DMA_ATTR);
+	__page_frag_cache_drain(rx_buffer->page, rx_buffer->pagecnt_bias);
+
 	/* clear contents of buffer_info */
 	rx_buffer->page = NULL;
 }
@@ -1350,14 +1297,6 @@ static bool iavf_is_non_eop(struct iavf_ring *rx_ring,
 			    union iavf_rx_desc *rx_desc,
 			    struct sk_buff *skb)
 {
-	u32 ntc = rx_ring->next_to_clean + 1;
-
-	/* fetch, update, and store next to clean */
-	ntc = (ntc < rx_ring->count) ? ntc : 0;
-	rx_ring->next_to_clean = ntc;
-
-	prefetch(IAVF_RX_DESC(rx_ring, ntc));
-
 	/* if we are the last buffer then there is nothing else to do */
 #define IAVF_RXD_EOF BIT(IAVF_RX_DESC_STATUS_EOF_SHIFT)
 	if (likely(iavf_test_staterr(rx_desc, IAVF_RXD_EOF)))
@@ -1383,11 +1322,16 @@ static bool iavf_is_non_eop(struct iavf_ring *rx_ring,
 static int iavf_clean_rx_irq(struct iavf_ring *rx_ring, int budget)
 {
 	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
+	const gfp_t gfp = GFP_ATOMIC | __GFP_NOWARN;
+	u32 to_refill = IAVF_DESC_UNUSED(rx_ring);
+	u32 pg_size = iavf_rx_pg_size(rx_ring);
 	struct sk_buff *skb = rx_ring->skb;
-	u16 cleaned_count = IAVF_DESC_UNUSED(rx_ring);
-	bool failure = false;
+	struct device *dev = rx_ring->dev;
+	u32 ntc = rx_ring->next_to_clean;
+	u32 ring_size = rx_ring->count;
+	u32 cleaned_count = 0;
 
-	while (likely(total_rx_packets < (unsigned int)budget)) {
+	while (likely(cleaned_count < budget)) {
 		struct iavf_rx_buffer *rx_buffer;
 		union iavf_rx_desc *rx_desc;
 		unsigned int size;
@@ -1396,13 +1340,11 @@ static int iavf_clean_rx_irq(struct iavf_ring *rx_ring, int budget)
 		u64 qword;
 
 		/* return some buffers to hardware, one at a time is too slow */
-		if (cleaned_count >= IAVF_RX_BUFFER_WRITE) {
-			failure = failure ||
-				  iavf_alloc_rx_buffers(rx_ring, cleaned_count);
-			cleaned_count = 0;
-		}
+		if (to_refill >= IAVF_RX_BUFFER_WRITE)
+			to_refill = __iavf_alloc_rx_buffers(rx_ring, to_refill,
+							    gfp);
 
-		rx_desc = IAVF_RX_DESC(rx_ring, rx_ring->next_to_clean);
+		rx_desc = IAVF_RX_DESC(rx_ring, ntc);
 
 		/* status_error_len will always be zero for unused descriptors
 		 * because it's cleared in cleanup, and overlaps with hdr_addr
@@ -1424,24 +1366,38 @@ static int iavf_clean_rx_irq(struct iavf_ring *rx_ring, int budget)
 		       IAVF_RXD_QW1_LENGTH_PBUF_SHIFT;
 
 		iavf_trace(clean_rx_irq, rx_ring, rx_desc, skb);
-		rx_buffer = iavf_get_rx_buffer(rx_ring, size);
+		rx_buffer = &rx_ring->rx_bi[ntc];
+
+		/* Very rare, but possible case. The most common reason:
+		 * the last fragment contained FCS only, which was then
+		 * stripped by the HW.
+		 */
+		if (unlikely(!size))
+			goto skip_data;
+
+		iavf_sync_rx_buffer(dev, rx_buffer, size);
 
 		/* retrieve a buffer from the ring */
 		if (skb)
-			iavf_add_rx_frag(rx_ring, rx_buffer, skb, size);
+			iavf_add_rx_frag(skb, rx_buffer, size, pg_size);
 		else
-			skb = iavf_build_skb(rx_ring, rx_buffer, size);
+			skb = iavf_build_skb(rx_buffer, size, pg_size);
 
 		/* exit if we failed to retrieve a buffer */
 		if (!skb) {
 			rx_ring->rx_stats.alloc_buff_failed++;
-			if (rx_buffer && size)
-				rx_buffer->pagecnt_bias++;
 			break;
 		}
 
-		iavf_put_rx_buffer(rx_ring, rx_buffer);
+skip_data:
+		iavf_put_rx_buffer(rx_ring, dev, rx_buffer, pg_size);
+
 		cleaned_count++;
+		to_refill++;
+		if (unlikely(++ntc == ring_size))
+			ntc = 0;
+
+		prefetch(IAVF_RX_DESC(rx_ring, ntc));
 
 		if (iavf_is_non_eop(rx_ring, rx_desc, skb))
 			continue;
@@ -1488,8 +1444,18 @@ static int iavf_clean_rx_irq(struct iavf_ring *rx_ring, int budget)
 		total_rx_packets++;
 	}
 
+	rx_ring->next_to_clean = ntc;
 	rx_ring->skb = skb;
 
+	if (to_refill >= IAVF_RX_BUFFER_WRITE) {
+		to_refill = __iavf_alloc_rx_buffers(rx_ring, to_refill, gfp);
+		/* guarantee a trip back through this routine if there was
+		 * a failure
+		 */
+		if (unlikely(to_refill))
+			cleaned_count = budget;
+	}
+
 	u64_stats_update_begin(&rx_ring->syncp);
 	rx_ring->stats.packets += total_rx_packets;
 	rx_ring->stats.bytes += total_rx_bytes;
@@ -1497,8 +1463,7 @@ static int iavf_clean_rx_irq(struct iavf_ring *rx_ring, int budget)
 	rx_ring->q_vector->rx.total_packets += total_rx_packets;
 	rx_ring->q_vector->rx.total_bytes += total_rx_bytes;
 
-	/* guarantee a trip back through this routine if there was a failure */
-	return failure ? budget : (int)total_rx_packets;
+	return cleaned_count;
 }
 
 static inline u32 iavf_buildreg_itr(const int type, u16 itr)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.h b/drivers/net/ethernet/intel/iavf/iavf_txrx.h
index 234e189c1987..9c6661a6edf2 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_txrx.h
+++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.h
@@ -383,7 +383,6 @@ struct iavf_ring {
 	struct iavf_q_vector *q_vector;	/* Backreference to associated vector */
 
 	struct rcu_head rcu;		/* to avoid race on free */
-	u16 next_to_alloc;
 	struct sk_buff *skb;		/* When iavf_clean_rx_ring_irq() must
 					 * return before it sees the EOP for
 					 * the current packet, we save that skb
@@ -426,7 +425,7 @@ static inline unsigned int iavf_rx_pg_order(struct iavf_ring *ring)
 
 #define iavf_rx_pg_size(_ring) (PAGE_SIZE << iavf_rx_pg_order(_ring))
 
-bool iavf_alloc_rx_buffers(struct iavf_ring *rxr, u16 cleaned_count);
+void iavf_alloc_rx_buffers(struct iavf_ring *rxr);
 netdev_tx_t iavf_xmit_frame(struct sk_buff *skb, struct net_device *netdev);
 void iavf_clean_tx_ring(struct iavf_ring *tx_ring);
 void iavf_clean_rx_ring(struct iavf_ring *rx_ring);
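
The accounting change in the diff above (the refill routine now returns the number of buffers it could not refill, and the poll routine reports the full budget when that number is non-zero) can be tried outside the kernel. The following is a minimal userspace sketch of that pattern only, not the driver code: toy_ring, toy_refill(), toy_poll_result(), RING_SIZE and the avail_pages failure model are all hypothetical names invented for illustration, and no DMA or page handling is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical ring length, not an iavf value */

/* Toy stand-in for an Rx ring; none of these fields are iavf's. */
struct toy_ring {
	uint32_t next_to_use;		/* next slot to hand to "hardware" */
	uint32_t avail_pages;		/* allocations that will still succeed */
	bool filled[RING_SIZE];		/* which slots currently hold a buffer */
};

/*
 * Try to refill @to_refill slots starting at next_to_use.
 * Returns 0 if every slot was refilled, otherwise the number of
 * slots still left to refill (same contract as the kerneldoc of
 * __iavf_alloc_rx_buffers() in the diff).
 */
static uint32_t toy_refill(struct toy_ring *r, uint32_t to_refill)
{
	uint32_t ntu = r->next_to_use;

	assert(to_refill <= RING_SIZE);
	if (!to_refill)
		return 0;

	do {
		if (!r->avail_pages)
			break;		/* simulated allocation failure */
		r->avail_pages--;
		r->filled[ntu] = true;

		/* wrap the local index instead of re-reading the ring */
		if (++ntu == RING_SIZE)
			ntu = 0;
	} while (--to_refill);

	r->next_to_use = ntu;

	return to_refill;
}

/*
 * What the poll routine would report: the full budget if a refill
 * fell short (so the caller polls again), else the work actually done.
 */
static uint32_t toy_poll_result(uint32_t cleaned, uint32_t budget,
				uint32_t left_to_refill)
{
	return left_to_refill ? budget : cleaned;
}
```

Because the loop breaks before the decrement on failure, the returned value is exactly the shortfall, which the caller can pass straight into the next refill attempt instead of recomputing it from the ring.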
