[intel-next,v5,0/8] i40e: support XDP multi-buffer

Message ID 20230216140043.109345-1-tirthendu.sarkar@intel.com (mailing list archive)

Message

Tirthendu Sarkar Feb. 16, 2023, 2 p.m. UTC
This patchset adds multi-buffer support for XDP. The Tx side already
supports multi-buffer, so this patchset focuses on the Rx side. The last
patch contains the actual multi-buffer changes, while the earlier ones
are preparatory patches.

On receiving the first buffer of a packet, xdp_buff is built and its
subsequent buffers are added to it as frags. While 'next_to_clean' keeps
pointing to the first descriptor, the newly introduced 'next_to_process'
keeps track of every descriptor for the packet. 
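
The two-index scheme above can be sketched in plain user-space C. This is
a minimal illustration, not the driver code: the sketch_* types, the ring
size and the "len == 0 means not yet written" convention are assumptions
made for the example. Only the names next_to_clean and next_to_process
come from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u

/* Hypothetical, simplified stand-ins; not the driver's real structures. */
struct sketch_desc {
	unsigned int len;  /* bytes written by hardware, 0 = not written yet */
	bool eop;          /* end-of-packet flag */
};

struct sketch_ring {
	struct sketch_desc desc[RING_SIZE];
	unsigned int next_to_clean;    /* first descriptor of the packet in flight */
	unsigned int next_to_process;  /* next descriptor to look at */
	unsigned int pkt_len;          /* bytes gathered for the packet in flight */
	unsigned int nr_frags;         /* buffers after the first one */
};

/* Look at up to budget descriptors; returns the number of completed packets. */
static unsigned int sketch_clean_rx(struct sketch_ring *r, unsigned int budget,
				    unsigned int *total_bytes)
{
	unsigned int done = 0;

	while (budget--) {
		struct sketch_desc *d = &r->desc[r->next_to_process];
		unsigned int len = d->len;

		if (!len) /* nothing more from hardware */
			break;
		d->len = 0; /* sketch-only: mark the descriptor as consumed */

		if (r->next_to_process == r->next_to_clean) {
			/* first buffer: start a new packet */
			r->pkt_len = len;
			r->nr_frags = 0;
		} else {
			/* later buffer: attach it as a frag */
			r->pkt_len += len;
			r->nr_frags++;
		}
		r->next_to_process = (r->next_to_process + 1) % RING_SIZE;

		if (!d->eop)
			continue; /* next_to_clean stays on the first descriptor */

		/* EOP seen: the packet is complete, release its descriptors */
		*total_bytes += r->pkt_len;
		done++;
		r->next_to_clean = r->next_to_process;
	}
	assert(r->next_to_clean < RING_SIZE);
	return done;
}
```

Because the partial packet state lives in the ring structure, a packet
that is cut off by the budget is simply continued on the next call.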

On receiving the EOP buffer, the XDP program is called and the
appropriate action is taken (building skb for XDP_PASS, reusing page for
XDP_DROP, adjusting page offsets for XDP_{REDIRECT,TX}).
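
A rough user-space sketch of that per-verdict handling follows. All
sketch_* names are hypothetical, and the offset adjustment is modelled as
a half-page flip (page_offset ^= truesize), which is an assumption of
this example rather than something stated in the series. skb
construction for the PASS case is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names that mirror the XDP verdicts named in the text. */
enum sketch_verdict { SKETCH_PASS, SKETCH_DROP, SKETCH_TX, SKETCH_REDIRECT };

/* Simplified stand-in for one Rx buffer of the packet. */
struct sketch_rx_buffer {
	unsigned int page_offset; /* offset of the buffer within its page */
	bool reused_in_place;     /* page handed back to the ring unchanged */
};

/*
 * Run once per packet, after the verdict for the EOP buffer is known.
 * Returns true when an skb would be built (the PASS case).
 */
static bool sketch_post_xdp(enum sketch_verdict verdict,
			    struct sketch_rx_buffer *bufs,
			    unsigned int nr_bufs, unsigned int truesize)
{
	unsigned int i;

	assert(nr_bufs > 0); /* a packet owns at least one buffer */

	switch (verdict) {
	case SKETCH_PASS:
		/* skb construction and its buffer accounting are omitted */
		return true;
	case SKETCH_DROP:
		for (i = 0; i < nr_bufs; i++)
			bufs[i].reused_in_place = true;
		return false;
	case SKETCH_TX:
	case SKETCH_REDIRECT:
		/* assumed half-page flip: move to the other half of the page */
		for (i = 0; i < nr_bufs; i++)
			bufs[i].page_offset ^= truesize;
		return false;
	}
	return false;
}
```

The point of the sketch is that every buffer of the packet, not only the
first one, gets the same treatment once the verdict is known.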

The patchset also streamlines page offset adjustments for buffer reuse
to make it easier to post-process the rx_buffers after running the XDP
program.

With this patchset there does not appear to be any performance
degradation for XDP_PASS, and there is some improvement (~1% for XDP_TX,
~5% for XDP_DROP) when measured using the xdp_rxq_info program from
samples/bpf/ with 64B packets.

Changelog:
    v4 -> v5:
    - Change s/size/truesize [Tony]
    - Rebased on top of commit 9dd6e53ef63d ("i40e: check vsi type before
      setting xdp_features flag") [Lorenzo]
    - Changed size of on-stack variable from u16 to u32.

    v3 -> v4:
    - Added non-linear XDP buffer support to xdp_features. [Maciej]
    - Removed double space. [Maciej]

    v2 -> v3:
    - Fixed buffer cleanup for single buffer packets on skb alloc
      failure.
    - Better naming of cleanup function.
    - Stop incrementing nr_frags for overflowing packets.
 
    v1 -> v2:
    - Instead of building xdp_buff on eop now it is built incrementally.
    - xdp_buff is now added to i40e_ring struct for preserving across
      napi calls. [Alexander Duyck]
    - Post XDP program rx_buffer processing has been simplified.
    - The Rx buffer allocation pull-out is reverted to avoid performance
      issues for smaller ring sizes; allocation is now done once at
      least half of the ring has been cleaned. With v1 there was a ~75%
      drop for XDP_PASS with the smallest ring size of 64, which v2
      mitigates. [Alexander Duyck]
    - Instead of retrying skb allocation on previous failure now the
      packet is dropped. [Maciej]
    - Simplified page offset adjustments by using xdp->frame_sz instead
      of recalculating truesize. [Maciej]
    - Change i40e_trace() to use xdp instead of skb [Maciej]
    - Reserve tailroom for legacy-rx [Maciej]
    - Centralize max frame size calculation
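
The refill rule from the v1 -> v2 notes ("at least half of the ring has
been cleaned") can be written as a small predicate. This is a sketch
only; sketch_should_refill and its parameters are hypothetical names and
do not appear in the patches.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: refill Rx buffers only once at least half of the
 * ring's descriptors have been cleaned.
 */
static bool sketch_should_refill(unsigned int cleaned_count,
				 unsigned int ring_count)
{
	assert(ring_count > 0);
	return cleaned_count >= ring_count / 2;
}
```

For the smallest ring size of 64 mentioned above, this triggers a refill
at 32 cleaned descriptors.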

Tirthendu Sarkar (8):
  i40e: consolidate maximum frame size calculation for vsi
  i40e: change Rx buffer size for legacy-rx to support XDP multi-buffer
  i40e: add pre-xdp page_count in rx_buffer
  i40e: Change size to truesize when using i40e_rx_buffer_flip()
  i40e: use frame_sz instead of recalculating truesize for building skb
  i40e: introduce next_to_process to i40e_ring
  i40e: add xdp_buff to i40e_ring struct
  i40e: add support for XDP multi-buffer Rx

 drivers/net/ethernet/intel/i40e/i40e_main.c  |  78 ++--
 drivers/net/ethernet/intel/i40e/i40e_trace.h |  20 +-
 drivers/net/ethernet/intel/i40e/i40e_txrx.c  | 420 +++++++++++--------
 drivers/net/ethernet/intel/i40e/i40e_txrx.h  |  21 +-
 4 files changed, 307 insertions(+), 232 deletions(-)

Comments

Fijalkowski, Maciej Feb. 16, 2023, 2:43 p.m. UTC | #1
On Thu, Feb 16, 2023 at 07:30:35PM +0530, Tirthendu Sarkar wrote:
> This patchset adds multi-buffer support for XDP. Tx side already has
> support for multi-buffer. This patchset focuses on Rx side. The last
> patch contains actual multi-buffer changes while the previous ones are
> preparatory patches.
> 
> On receiving the first buffer of a packet, xdp_buff is built and its
> subsequent buffers are added to it as frags. While 'next_to_clean' keeps
> pointing to the first descriptor, the newly introduced 'next_to_process'
> keeps track of every descriptor for the packet. 
> 
> On receiving EOP buffer the XDP program is called and appropriate action
> is taken (building skb for XDP_PASS, reusing page for XDP_DROP, adjusting
> page offsets for XDP_{REDIRECT,TX}).
> 
> The patchset also streamlines page offset adjustments for buffer reuse
> to make it easier to post process the rx_buffers after running XDP prog.
> 
> With this patchset there does not seem to be any performance degradation
> for XDP_PASS and some improvement (~1% for XDP_TX, ~5% for XDP_DROP) when
> measured using xdp_rxq_info program from samples/bpf/ for 64B packets.

For series:
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

Tony Nguyen Feb. 17, 2023, 5:02 p.m. UTC | #2
On 2/16/2023 6:00 AM, Tirthendu Sarkar wrote:
> This patchset adds multi-buffer support for XDP. Tx side already has
> support for multi-buffer. This patchset focuses on Rx side. The last
> patch contains actual multi-buffer changes while the previous ones are
> preparatory patches.
> 
> On receiving the first buffer of a packet, xdp_buff is built and its
> subsequent buffers are added to it as frags. While 'next_to_clean' keeps
> pointing to the first descriptor, the newly introduced 'next_to_process'
> keeps track of every descriptor for the packet.
> 
> On receiving EOP buffer the XDP program is called and appropriate action
> is taken (building skb for XDP_PASS, reusing page for XDP_DROP, adjusting
> page offsets for XDP_{REDIRECT,TX}).
> 
> The patchset also streamlines page offset adjustments for buffer reuse
> to make it easier to post process the rx_buffers after running XDP prog.
> 
> With this patchset there does not seem to be any performance degradation
> for XDP_PASS and some improvement (~1% for XDP_TX, ~5% for XDP_DROP) when
> measured using xdp_rxq_info program from samples/bpf/ for 64B packets.
> 
> Changelog:
>      v4 -> v5:
>      - Change s/size/truesize [Tony]
>      - Rebased on top of commit 9dd6e53ef63d ("i40e: check vsi type before
>        setting xdp_features flag") [Lorenzo]
>      - Changed size of on stack variable to u32 from u16.

Hi Tirthendu,

Did you move over to next-queue/dev-queue? This series still isn't
applying.

Also, I'm not seeing the truesize change on patch 4; issues are still
being reported on it.

Thanks,
Tony
