[bpf-next,0/6] ice: post-mbuf fixes

Message ID 20230210170618.1973430-1-alexandr.lobakin@intel.com (mailing list archive)

Message

Alexander Lobakin Feb. 10, 2023, 5:06 p.m. UTC
The set grew out of the poor performance of %BPF_F_TEST_XDP_LIVE_FRAMES
when the ice-backed device is the sender. Initially it reached around
3.3 Mpps / thread, whereas skb-based pktgen gives me 5.5 Mpps...

After first fixing 0005 (0004 is a prerequisite for it; strange that
nobody noticed the problem earlier), I started catching random OOMs.
This is how 0002 (and partially 0001) appeared.
0003 is a suggestion from Maciej to not waste time on refactoring dead
lines. 0006 is a "cherry on top" to arrive at the final 6.7 Mpps.
4.5 of 6 are fixes, but only the first three are tagged, since after
that it starts getting tricky. I may backport the rest manually later on.

TL;DR for the series is that shortcuts are good, but only as long as
they don't make the driver miss important things. %XDP_TX is purely
driver-local, however .ndo_xdp_xmit() is not, and sometimes assumptions
can be unsafe there.
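
As a generic illustration of such an assumption (a userspace sketch only,
not the ice code; the ring layout, RING_SIZE and all function names are
hypothetical): a Tx ring fed from outside the driver should check for a
free slot on every frame, not only on some paths, and hand ownership back
to the caller when the ring is full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical size; must be a power of two */

struct tx_ring {
	void *buf[RING_SIZE];	/* frames owned by the ring until completed */
	uint32_t next_to_use;	/* producer index */
	uint32_t next_to_clean;	/* consumer index */
};

/* Free slots. One slot always stays unused to tell "full" from "empty". */
static uint32_t ring_unused(const struct tx_ring *r)
{
	return (r->next_to_clean - r->next_to_use - 1) & (RING_SIZE - 1);
}

/*
 * Queue one frame. The free-slot check is unconditional: a frame handed
 * over by an external producer is never written over a pending one.
 * Returns false when the ring is full; the caller keeps ownership then.
 */
static bool ring_xmit(struct tx_ring *r, void *frame)
{
	assert(frame != NULL);

	if (!ring_unused(r))
		return false;

	r->buf[r->next_to_use] = frame;
	r->next_to_use = (r->next_to_use + 1) & (RING_SIZE - 1);
	return true;
}

/* Complete up to @done pending frames; returns how many were released. */
static uint32_t ring_clean(struct tx_ring *r, uint32_t done)
{
	uint32_t n = 0;

	while (n < done && r->next_to_clean != r->next_to_use) {
		r->buf[r->next_to_clean] = NULL;
		r->next_to_clean = (r->next_to_clean + 1) & (RING_SIZE - 1);
		n++;
	}

	return n;
}
```

In this sketch a rejected frame stays with the caller, which can free or
retry it, so nothing is leaked or overwritten when the producer outruns
completions.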

With this series plus one core code patch [0], "live frames" and
xdp-trafficgen are now safe'n'fast on ice (probably more to come).

[0] https://lore.kernel.org/all/20230209172827.874728-1-alexandr.lobakin@intel.com
---
Goes directly to bpf-next as it touches the recently added/changed code.

Alexander Lobakin (6):
  ice: fix ice_tx_ring::xdp_tx_active underflow
  ice: fix XDP Tx ring overrun
  ice: remove two impossible branches on XDP Tx cleaning
  ice: robustify cleaning/completing XDP Tx buffers
  ice: fix freeing XDP frames backed by Page Pool
  ice: micro-optimize .ndo_xdp_xmit() path

 drivers/net/ethernet/intel/ice/ice_txrx.c     | 67 +++++++++-----
 drivers/net/ethernet/intel/ice/ice_txrx.h     | 37 ++++++--
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 88 ++++++++++++-------
 drivers/net/ethernet/intel/ice/ice_txrx_lib.h |  4 +-
 drivers/net/ethernet/intel/ice/ice_xsk.c      | 12 +--
 5 files changed, 136 insertions(+), 72 deletions(-)

Comments

Toke Høiland-Jørgensen Feb. 10, 2023, 6:09 p.m. UTC | #1
Alexander Lobakin <alexandr.lobakin@intel.com> writes:

> The set grew from the poor performance of %BPF_F_TEST_XDP_LIVE_FRAMES
> when the ice-backed device is a sender. Initially there were around
> 3.3 Mpps / thread, while I have 5.5 on skb-based pktgen...
>
> After fixing 0005 (0004 is a prereq for it) first (strange thing nobody
> noticed that earlier), I started catching random OOMs. This is how 0002
> (and partially 0001) appeared.
> 0003 is a suggestion from Maciej to not waste time on refactoring dead
> lines. 0006 is a "cherry on top" to get away with the final 6.7 Mpps.
> 4.5 of 6 are fixes, but only the first three are tagged, since it then
> starts being tricky. I may backport them manually later on.
>
> TL;DR for the series is that shortcuts are good, but only as long as
> they don't make the driver miss important things. %XDP_TX is purely
> driver-local, however .ndo_xdp_xmit() is not, and sometimes assumptions
> can be unsafe there.
>
> With that series and also one core code patch[0], "live frames" and
> xdp-trafficgen are now safe'n'fast on ice (probably more to come).

Nice speedup! And cool to see that you're playing around with
xdp-trafficgen :)

-Toke

Alexander Lobakin Feb. 13, 2023, 2:53 p.m. UTC | #2
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Fri, 10 Feb 2023 19:09:12 +0100

> Alexander Lobakin <alexandr.lobakin@intel.com> writes:
> 
>> The set grew from the poor performance of %BPF_F_TEST_XDP_LIVE_FRAMES
>> when the ice-backed device is a sender. Initially there were around
>> 3.3 Mpps / thread, while I have 5.5 on skb-based pktgen...
>>
>> After fixing 0005 (0004 is a prereq for it) first (strange thing nobody
>> noticed that earlier), I started catching random OOMs. This is how 0002
>> (and partially 0001) appeared.
>> 0003 is a suggestion from Maciej to not waste time on refactoring dead
>> lines. 0006 is a "cherry on top" to get away with the final 6.7 Mpps.
>> 4.5 of 6 are fixes, but only the first three are tagged, since it then
>> starts being tricky. I may backport them manually later on.
>>
>> TL;DR for the series is that shortcuts are good, but only as long as
>> they don't make the driver miss important things. %XDP_TX is purely
>> driver-local, however .ndo_xdp_xmit() is not, and sometimes assumptions
>> can be unsafe there.
>>
>> With that series and also one core code patch[0], "live frames" and
>> xdp-trafficgen are now safe'n'fast on ice (probably more to come).
> 
> Nice speedup! And cool to see that you're playing around with
> xdp-trafficgen :)

It's not only good for bombing receivers without any special HW, but
also for uncovering problems with XDP in drivers and/or kernel core,
as I can see :D

> 
> -Toke
>

Thanks,
Olek

Maciej Fijalkowski Feb. 13, 2023, 5:57 p.m. UTC | #3
On Fri, Feb 10, 2023 at 06:06:12PM +0100, Alexander Lobakin wrote:
> The set grew from the poor performance of %BPF_F_TEST_XDP_LIVE_FRAMES
> when the ice-backed device is a sender. Initially there were around
> 3.3 Mpps / thread, while I have 5.5 on skb-based pktgen...
> 
> After fixing 0005 (0004 is a prereq for it) first (strange thing nobody
> noticed that earlier), I started catching random OOMs. This is how 0002
> (and partially 0001) appeared.
> 0003 is a suggestion from Maciej to not waste time on refactoring dead
> lines. 0006 is a "cherry on top" to get away with the final 6.7 Mpps.
> 4.5 of 6 are fixes, but only the first three are tagged, since it then
> starts being tricky. I may backport them manually later on.
> 
> TL;DR for the series is that shortcuts are good, but only as long as
> they don't make the driver miss important things. %XDP_TX is purely
> driver-local, however .ndo_xdp_xmit() is not, and sometimes assumptions
> can be unsafe there.
> 
> With that series and also one core code patch[0], "live frames" and
> xdp-trafficgen are now safe'n'fast on ice (probably more to come).
> 
> [0] https://lore.kernel.org/all/20230209172827.874728-1-alexandr.lobakin@intel.com
> ---
> Goes directly to bpf-next as it touches the recently added/changed code.

For the series:
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

> 
> Alexander Lobakin (6):
>   ice: fix ice_tx_ring::xdp_tx_active underflow
>   ice: fix XDP Tx ring overrun
>   ice: remove two impossible branches on XDP Tx cleaning
>   ice: robustify cleaning/completing XDP Tx buffers
>   ice: fix freeing XDP frames backed by Page Pool
>   ice: micro-optimize .ndo_xdp_xmit() path
> 
>  drivers/net/ethernet/intel/ice/ice_txrx.c     | 67 +++++++++-----
>  drivers/net/ethernet/intel/ice/ice_txrx.h     | 37 ++++++--
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 88 ++++++++++++-------
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.h |  4 +-
>  drivers/net/ethernet/intel/ice/ice_xsk.c      | 12 +--
>  5 files changed, 136 insertions(+), 72 deletions(-)
> 
> -- 
> 2.39.1
>

patchwork-bot+netdevbpf@kernel.org Feb. 13, 2023, 6:21 p.m. UTC | #4
Hello:

This series was applied to bpf/bpf-next.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Fri, 10 Feb 2023 18:06:12 +0100 you wrote:
> The set grew from the poor performance of %BPF_F_TEST_XDP_LIVE_FRAMES
> when the ice-backed device is a sender. Initially there were around
> 3.3 Mpps / thread, while I have 5.5 on skb-based pktgen...
> 
> After fixing 0005 (0004 is a prereq for it) first (strange thing nobody
> noticed that earlier), I started catching random OOMs. This is how 0002
> (and partially 0001) appeared.
> 0003 is a suggestion from Maciej to not waste time on refactoring dead
> lines. 0006 is a "cherry on top" to get away with the final 6.7 Mpps.
> 4.5 of 6 are fixes, but only the first three are tagged, since it then
> starts being tricky. I may backport them manually later on.
> 
> [...]

Here is the summary with links:
  - [bpf-next,1/6] ice: fix ice_tx_ring::xdp_tx_active underflow
    https://git.kernel.org/bpf/bpf-next/c/bc4db8347003
  - [bpf-next,2/6] ice: fix XDP Tx ring overrun
    https://git.kernel.org/bpf/bpf-next/c/0bd939b60cea
  - [bpf-next,3/6] ice: remove two impossible branches on XDP Tx cleaning
    https://git.kernel.org/bpf/bpf-next/c/923096b5cec3
  - [bpf-next,4/6] ice: robustify cleaning/completing XDP Tx buffers
    https://git.kernel.org/bpf/bpf-next/c/aa1d3faf71a6
  - [bpf-next,5/6] ice: fix freeing XDP frames backed by Page Pool
    https://git.kernel.org/bpf/bpf-next/c/055d0920685e
  - [bpf-next,6/6] ice: micro-optimize .ndo_xdp_xmit() path
    https://git.kernel.org/bpf/bpf-next/c/ad07f29b9c9a

You are awesome, thank you!