Message ID | 20250122151046.574061-3-maciej.fijalkowski@intel.com |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Series | ice: fix Rx data path for heavy 9k MTU traffic |

On Wed, Jan 22, 2025 at 04:10:45PM +0100, Maciej Fijalkowski wrote:
> If we store the pgcnt on few fragments while being in the middle of
> gathering the whole frame and we stumbled upon DD bit not being set, we
> terminate the NAPI Rx processing loop and come back later on. Then on
> next NAPI execution we work on previously stored pgcnt.
>
> Imagine that second half of page was used actively by networking stack
> and by the time we came back, stack is not busy with this page anymore
> and decremented the refcnt. The page reuse algorithm in this case should
> be good to reuse the page but given the old refcnt it will not do so and
> attempt to release the page via page_frag_cache_drain() with
> pagecnt_bias used as an arg. This in turn will result in negative refcnt
> on struct page, which was initially observed by Xu Du.
>
> Therefore, move the page count storage from ice_get_rx_buf() to a place
> where we are sure that whole frame has been collected, but before
> calling XDP program as it internally can also change the page count of
> fragments belonging to xdp_buff.
>
> Fixes: ac0753391195 ("ice: Store page count inside ice_rx_buf")
> Reported-and-tested-by: Xu Du <xudu@redhat.com>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

Reviewed-by: Simon Horman <horms@kernel.org>
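
The stale-snapshot problem described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct toy_rx_buf`, `toy_snapshot()`, `toy_can_reuse()` and `toy_decision()` are hypothetical names, the numbers are made up, and the reuse rule (at most one reference beyond the driver's bias) is an assumed simplification rather than the driver's actual check. It models only the reuse decision, not the subsequent page_frag_cache_drain() path or the negative refcount.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver Rx buffer; not the real struct ice_rx_buf. */
struct toy_rx_buf {
	unsigned int page_refcnt;  /* live refcount of the backing page */
	unsigned int pgcnt;        /* snapshot of page_refcnt taken by the driver */
	unsigned int pagecnt_bias; /* references the driver still accounts for itself */
};

/* Record the current refcount, standing in for a page_count() read. */
static void toy_snapshot(struct toy_rx_buf *buf)
{
	buf->pgcnt = buf->page_refcnt;
}

/*
 * Assumed reuse rule for this sketch: the page is reusable only if at
 * most one reference exists beyond the driver's own bias.
 */
static bool toy_can_reuse(const struct toy_rx_buf *buf)
{
	assert(buf->pgcnt >= buf->pagecnt_bias);
	return buf->pgcnt - buf->pagecnt_bias <= 1;
}

/*
 * Return the reuse decision when the snapshot is taken either before
 * (snapshot_early) or after the stack drops its reference to the page.
 */
static bool toy_decision(bool snapshot_early)
{
	struct toy_rx_buf buf = { .page_refcnt = 10, .pagecnt_bias = 8 };

	if (snapshot_early)
		toy_snapshot(&buf); /* first NAPI poll, frame still incomplete */

	buf.page_refcnt--;          /* stack is done with the other half of the page */

	if (!snapshot_early)
		toy_snapshot(&buf); /* whole frame gathered, just before XDP */

	return toy_can_reuse(&buf);
}
```

In this model, `toy_decision(true)` refuses reuse because it judges the page by the old count, while `toy_decision(false)` allows it, which is the behavioral difference the patch aims for by moving the snapshot.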

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index e173d9c98988..cf46bcf143b4 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -924,7 +924,6 @@ ice_get_rx_buf(struct ice_rx_ring *rx_ring, const unsigned int size,
 	struct ice_rx_buf *rx_buf;
 
 	rx_buf = &rx_ring->rx_buf[ntc];
-	rx_buf->pgcnt = page_count(rx_buf->page);
 	prefetchw(rx_buf->page);
 
 	if (!size)
@@ -940,6 +939,31 @@ ice_get_rx_buf(struct ice_rx_ring *rx_ring, const unsigned int size,
 	return rx_buf;
 }
 
+/**
+ * ice_get_pgcnts - grab page_count() for gathered fragments
+ * @rx_ring: Rx descriptor ring to store the page counts on
+ *
+ * This function is intended to be called right before running XDP
+ * program so that the page recycling mechanism will be able to take
+ * a correct decision regarding underlying pages; this is done in such
+ * way as XDP program can change the refcount of page
+ */
+static void ice_get_pgcnts(struct ice_rx_ring *rx_ring)
+{
+	u32 nr_frags = rx_ring->nr_frags + 1;
+	u32 idx = rx_ring->first_desc;
+	struct ice_rx_buf *rx_buf;
+	u32 cnt = rx_ring->count;
+
+	for (int i = 0; i < nr_frags; i++) {
+		rx_buf = &rx_ring->rx_buf[idx];
+		rx_buf->pgcnt = page_count(rx_buf->page);
+
+		if (++idx == cnt)
+			idx = 0;
+	}
+}
+
 /**
  * ice_build_skb - Build skb around an existing buffer
  * @rx_ring: Rx descriptor ring to transact packets on
@@ -1241,6 +1265,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		if (ice_is_non_eop(rx_ring, rx_desc))
 			continue;
 
+		ice_get_pgcnts(rx_ring);
 		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
 		if (rx_buf->act == ICE_XDP_PASS)
 			goto construct_skb;
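
For readers who want to experiment with the ring walk outside the kernel, the wrap-around traversal in ice_get_pgcnts() can be sketched in plain C. This is a minimal sketch, not driver code: `struct toy_ring` and `toy_get_pgcnts()` are hypothetical, the `live` array stands in for page_count() on each buffer's page, and the returned index is added here only so the walk can be checked.

```c
#include <assert.h>

/* Hypothetical miniature of an Rx ring; field names echo the patch for readability. */
struct toy_ring {
	unsigned int count;       /* number of slots in the ring */
	unsigned int first_desc;  /* slot holding the first buffer of the frame */
	unsigned int nr_frags;    /* fragments that follow the first buffer */
	const unsigned int *live; /* stand-in for page_count() per slot */
	unsigned int *pgcnt;      /* stored snapshot per slot */
};

/*
 * Snapshot the "page count" of every buffer that makes up the current
 * frame, wrapping to slot 0 when the end of the ring is reached.
 * Returns the slot index that follows the last buffer of the frame.
 */
static unsigned int toy_get_pgcnts(struct toy_ring *ring)
{
	unsigned int nr_bufs = ring->nr_frags + 1; /* first buffer plus fragments */
	unsigned int idx = ring->first_desc;

	assert(ring->count > 0 && idx < ring->count);

	for (unsigned int i = 0; i < nr_bufs; i++) {
		ring->pgcnt[idx] = ring->live[idx];

		if (++idx == ring->count)
			idx = 0;
	}

	return idx;
}
```

With a four-slot ring, `first_desc = 3` and `nr_frags = 2`, the walk visits slots 3, 0 and 1 and leaves slot 2 untouched, matching the wrap handling in the patch.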