[RFC,net-next,v2,0/7] net: page_pool: a couple of assorted optimizations

Message ID 20230714170853.866018-1-aleksander.lobakin@intel.com (mailing list archive)

Alexander Lobakin July 14, 2023, 5:08 p.m. UTC
Here's a spin-off of the IAVF PP series[0], with 1 compile-time and several
runtime (hotpath) optimizations. They're based on and tested on top of the
hybrid PP allocation series[1], but don't require it to work and are
in general independent of it and of each other.

Per-patch breakdown:
 #1:   Was already on the lists, but this time it's done the other way,
       the one that Alex Duyck proposed during the review of the previous
       series. Slightly reduce the amount of C preprocessing by no longer
       including <net/page_pool.h> in <linux/skbuff.h> (which is included
       in half of the kernel sources). Especially useful with the
       abovementioned series applied, as it makes page_pool.h heavier;
 #2:   New. Group frag_* fields of &page_pool together to reduce cache
       misses;
 #3-4: New, prereqs to #5. Free 4 bytes in &page_pool_params and combine
       them with the already existing hole to get a free slot in the same
       cacheline (CL) where the params are inside &page_pool. Use it to
       store the internal PP flags, as opposed to the driver-set ones;
 #5:   Don't call DMA sync externals when they won't do anything anyway,
       by doing some heuristics a bit earlier (when allocating a new page).
       Was also on the lists;
 #6-7: New. In addition to recycling skb PP pages directly when @napi_safe
       is set, check for the context we're in and always try to recycle
       directly when in softirq (on the same CPU where the consumer runs).
       This allows us to use direct recycling anytime we're inside a NAPI
       polling loop or GRO stuff going right after it, covering way more
       cases than it does right now.
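
To make the layout trick from #3-4 concrete, here is a minimal userspace
sketch. The structures and field names (demo_params_*, demo_pool_*,
internal_flags) are hypothetical stand-ins, not the real &page_pool_params or
&page_pool, and the offsets assume a typical 64-bit ABI where uint64_t is
8-byte aligned. It only shows how narrowing one field frees 4 bytes that,
together with an existing 4-byte hole, make room for a new 8-byte member
without growing the outer structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" params: seven 32-bit fields, 28 bytes in total. */
struct demo_params_before {
	uint32_t flags;      /* driver-set flags */
	uint32_t order;      /* only ever holds small values */
	uint32_t dma_dir;    /* only ever holds small values */
	uint32_t pool_size;
	int32_t  nid;
	uint32_t max_len;
	uint32_t offset;
};

/* Hypothetical outer structure embedding the params. */
struct demo_pool_before {
	struct demo_params_before p;  /* bytes 0..27 */
	/* 4-byte hole: the next member needs 8-byte alignment */
	uint64_t frag_users;          /* starts at byte 32 */
};

/* "After": order and dma_dir share one 32-bit word, freeing 4 bytes. */
struct demo_params_after {
	uint32_t flags;
	uint16_t order;
	uint16_t dma_dir;
	uint32_t pool_size;
	int32_t  nid;
	uint32_t max_len;
	uint32_t offset;
};

/* The freed 4 bytes plus the old hole now form one aligned 8-byte slot. */
struct demo_pool_after {
	struct demo_params_after p;   /* bytes 0..23 */
	uint64_t internal_flags;      /* bytes 24..31, the new slot */
	uint64_t frag_users;          /* still starts at byte 32 */
};

/* Size of the padding hole that follows the params in the "before" layout. */
static inline size_t demo_hole_before(void)
{
	return offsetof(struct demo_pool_before, frag_users) -
	       sizeof(struct demo_params_before);
}
```

On such an ABI both outer structures come out at the same size, so the extra
member is free in terms of footprint and sits right next to the params it
complements.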

(complete tree with [1] + this + [0] is available here: [2])

[0] https://lore.kernel.org/netdev/20230530150035.1943669-1-aleksander.lobakin@intel.com
[1] https://lore.kernel.org/netdev/20230629120226.14854-1-linyunsheng@huawei.com
[2] https://github.com/alobakin/linux/commits/iavf-pp-frag

Alexander Lobakin (7):
  net: skbuff: don't include <net/page_pool.h> to <linux/skbuff.h>
  net: page_pool: place frag_* fields in one cacheline
  net: page_pool: shrink &page_pool_params a tiny bit
  net: page_pool: don't use driver-set flags field directly
  net: page_pool: avoid calling no-op externals when possible
  net: skbuff: avoid accessing page_pool if !napi_safe when returning
    page
  net: skbuff: always try to recycle PP pages directly when in softirq

 drivers/net/ethernet/engleder/tsnep_main.c    |  1 +
 drivers/net/ethernet/freescale/fec_main.c     |  1 +
 .../marvell/octeontx2/nic/otx2_common.c       |  1 +
 .../ethernet/marvell/octeontx2/nic/otx2_pf.c  |  1 +
 .../ethernet/mellanox/mlx5/core/en/params.c   |  1 +
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  |  1 +
 drivers/net/wireless/mediatek/mt76/mt76.h     |  1 +
 include/linux/skbuff.h                        |  3 +-
 include/net/page_pool.h                       | 23 +++---
 net/core/page_pool.c                          | 70 +++++--------------
 net/core/skbuff.c                             | 41 +++++++++++
 11 files changed, 83 insertions(+), 61 deletions(-)

---
From RFC v1[3]:
* #1: move the entire function to skbuff.c, don't try to split it (Alex);
* #2-4: new;
* #5: use internal flags field added in #4 and don't modify driver-defined
  structure (Alex, Jakub);
* #6: new;
* drop "add new NAPI state" as a redundant complication;
* #7: replace the check for the new NAPI state with just in_softirq(), should
  be fine (Jakub).
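
The decision from #7 can be modelled in plain C as below. This is only a
sketch of the check as described in the per-patch breakdown, not the code
from the patch: demo_can_recycle_direct() and its parameters are made up for
illustration, the in_softirq argument is a plain boolean standing in for the
kernel's in_softirq(), and it assumes that the same-CPU condition applies to
both the @napi_safe case and the softirq case.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of the "may we recycle directly?" decision.
 *
 * napi_safe:    caller states it is running in NAPI context
 * in_softirq:   stand-in for the kernel's in_softirq()
 * this_cpu:     CPU currently freeing the page
 * consumer_cpu: CPU where the pool's consumer runs, negative if unknown
 */
static inline bool demo_can_recycle_direct(bool napi_safe, bool in_softirq,
					   int this_cpu, int consumer_cpu)
{
	/* Outside of both contexts, direct recycling is never attempted. */
	if (!napi_safe && !in_softirq)
		return false;

	/* Without a known consumer there is nothing to compare against. */
	if (consumer_cpu < 0)
		return false;

	/* Direct recycling is only allowed on the consumer's own CPU. */
	return this_cpu == consumer_cpu;
}
```

Compared to checking @napi_safe alone, the model also accepts callers that
never set the flag but happen to run in softirq on the consumer's CPU, which
is where the wider coverage comes from.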

[3] https://lore.kernel.org/netdev/20230629152305.905962-1-aleksander.lobakin@intel.com