[net,v2,0/8,pull,request] ice: fix AF_XDP ZC timeout and concurrency issues

Message ID 20240729200716.681496-1-anthony.l.nguyen@intel.com (mailing list archive)

Message

Tony Nguyen July 29, 2024, 8:07 p.m. UTC
Maciej Fijalkowski says:

Changes included in this patchset address an issue that a customer has
been facing when AF_XDP ZC Tx sockets were used in combination with flow
control and regular Tx traffic.

After executing:
ethtool --set-priv-flags $dev link-down-on-close on
ethtool -A $dev rx on tx on

launching multiple ZC Tx sockets on $dev while pinging a remote interface
(so that regular Tx traffic is present) and then taking $dev down and back
up, a Tx timeout occurred, and most of the time the ice driver was unable
to recover from that state.

Combined, these patches resolve the issue described above on the
customer side. The main focus is to forbid producing Tx descriptors when
either the carrier is not yet initialized or the process of bringing the
interface down has already started.
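
As a rough illustration of that gate, here is a minimal user-space sketch.
It is not code from the series: struct model_netdev, model_tx_allowed() and
model_zc_xmit() are hypothetical names, and the two booleans only stand in
for what netif_running() and netif_carrier_ok() would report in the kernel.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the two pieces of netdev state
 * named above. It is not the kernel's struct net_device. */
struct model_netdev {
	bool running;    /* what netif_running() would report */
	bool carrier_ok; /* what netif_carrier_ok() would report */
};

/* Tx descriptors may be produced only while the interface is running
 * and the carrier is up. */
bool model_tx_allowed(const struct model_netdev *dev)
{
	return dev->running && dev->carrier_ok;
}

/* Hypothetical ZC Tx entry point: returns the number of descriptors
 * produced, or -ENETDOWN when the gate above is closed. */
int model_zc_xmit(const struct model_netdev *dev, int budget)
{
	if (!model_tx_allowed(dev))
		return -ENETDOWN;

	return budget; /* pretend every requested descriptor was produced */
}
```

With carrier_ok or running cleared, model_zc_xmit() refuses to produce
anything, which is the behaviour the paragraph above asks for during
link bring-up and teardown.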

v2:
* in patch 6, use a single READ_ONCE against xsk_pool within napi [Jakub]

v1: https://lore.kernel.org/netdev/20240708221416.625850-1-anthony.l.nguyen@intel.com/
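
The v2 note about a single READ_ONCE can be pictured with the user-space
sketch below. It is an assumption-laden model, not the patch itself:
MODEL_READ_ONCE() only approximates the kernel macro (it relies on the
GCC/Clang __typeof__ extension), and struct model_pool, struct model_ring
and model_napi_poll() are hypothetical names.

```c
#include <stddef.h>

/* User-space approximation of the kernel's READ_ONCE(); relies on the
 * GCC/Clang __typeof__ extension. */
#define MODEL_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

struct model_pool { int id; };       /* hypothetical stand-in for an XSK pool */

struct model_ring {
	struct model_pool *xsk_pool; /* may be set or cleared concurrently */
};

/* Returns 1 when the zero-copy path was taken, 0 for the regular path. */
int model_napi_poll(struct model_ring *ring)
{
	/* Read the pointer once; every later decision uses this snapshot,
	 * so a concurrent update cannot make the branches disagree. */
	struct model_pool *pool = MODEL_READ_ONCE(ring->xsk_pool);

	if (pool)
		return 1; /* ZC clean/xmit would use 'pool' here */

	return 0;         /* regular Rx/Tx path */
}
```

Reading ring->xsk_pool again later in the same poll could observe a
different value than the first read, which is what taking one snapshot
avoids.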
---
Olek,
we decided not to check IFF_UP as you initially suggested. The reason is
that when the link goes down, netif_running() has a broader scope than
IFF_UP being set, as the former (the __LINK_STATE_START bit) is cleared
earlier in the core.
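
A simplified user-space model of that ordering follows. All names are
hypothetical, and the exact placement of the driver stop callback between
the two updates is an assumption of this sketch rather than something
stated above; the real close path in the core does considerably more.

```c
#include <stdbool.h>

#define MODEL_IFF_UP 0x1u /* stands in for IFF_UP */

struct model_dev {
	unsigned int flags;       /* carries MODEL_IFF_UP */
	bool link_state_start;    /* models the __LINK_STATE_START bit */
	bool saw_running_in_stop; /* recorded by the stop callback */
	bool saw_up_in_stop;      /* recorded by the stop callback */
};

/* Models netif_running(): reports the start bit only. */
bool model_netif_running(const struct model_dev *dev)
{
	return dev->link_state_start;
}

/* Hypothetical driver stop callback: records what each check reports
 * at the moment the driver is asked to stop. */
void model_ndo_stop(struct model_dev *dev)
{
	dev->saw_running_in_stop = model_netif_running(dev);
	dev->saw_up_in_stop = (dev->flags & MODEL_IFF_UP) != 0;
}

/* Simplified close path: the start bit is cleared first, and the
 * IFF_UP stand-in is dropped only after the stop callback has run. */
void model_dev_close(struct model_dev *dev)
{
	dev->link_state_start = false;
	model_ndo_stop(dev);
	dev->flags &= ~MODEL_IFF_UP;
}
```

In this model the stop callback already sees model_netif_running() return
false while the IFF_UP stand-in is still set, so a check on the running
state closes the Tx path earlier than a check on the flag would.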

The following are changes since commit 039564d2fd37b122ec0d268e2ee6334e7169e225:
  Merge branch 'mptcp-endpoint-readd-fixes' into main
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue 100GbE

Maciej Fijalkowski (7):
  ice: don't busy wait for Rx queue disable in ice_qp_dis()
  ice: replace synchronize_rcu with synchronize_net
  ice: modify error handling when setting XSK pool in ndo_bpf
  ice: toggle netif_carrier when setting up XSK pool
  ice: improve updating ice_{t,r}x_ring::xsk_pool
  ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog
  ice: xsk: fix txq interrupt mapping

Michal Kubiak (1):
  ice: respect netif readiness in AF_XDP ZC related ndo's

 drivers/net/ethernet/intel/ice/ice.h      |  11 +-
 drivers/net/ethernet/intel/ice/ice_base.c |   4 +-
 drivers/net/ethernet/intel/ice/ice_main.c |   2 +-
 drivers/net/ethernet/intel/ice/ice_txrx.c |  10 +-
 drivers/net/ethernet/intel/ice/ice_xsk.c  | 184 +++++++++++++---------
 drivers/net/ethernet/intel/ice/ice_xsk.h  |  14 +-
 6 files changed, 135 insertions(+), 90 deletions(-)

Comments

patchwork-bot+netdevbpf@kernel.org July 31, 2024, 2 a.m. UTC | #1
Hello:

This series was applied to netdev/net.git (main)
by Tony Nguyen <anthony.l.nguyen@intel.com>:

On Mon, 29 Jul 2024 13:07:06 -0700 you wrote:
> Maciej Fijalkowski says:
> 
> Changes included in this patchset address an issue that customer has
> been facing when AF_XDP ZC Tx sockets were used in combination with flow
> control and regular Tx traffic.
> 
> After executing:
> ethtool --set-priv-flags $dev link-down-on-close on
> ethtool -A $dev rx on tx on
> 
> [...]

Here is the summary with links:
  - [net,v2,1/8] ice: respect netif readiness in AF_XDP ZC related ndo's
    https://git.kernel.org/netdev/net/c/ec145a18687f
  - [net,v2,2/8] ice: don't busy wait for Rx queue disable in ice_qp_dis()
    https://git.kernel.org/netdev/net/c/1ff72a2f6779
  - [net,v2,3/8] ice: replace synchronize_rcu with synchronize_net
    https://git.kernel.org/netdev/net/c/405d9999aa0b
  - [net,v2,4/8] ice: modify error handling when setting XSK pool in ndo_bpf
    https://git.kernel.org/netdev/net/c/d59227179949
  - [net,v2,5/8] ice: toggle netif_carrier when setting up XSK pool
    https://git.kernel.org/netdev/net/c/9da75a511c55
  - [net,v2,6/8] ice: improve updating ice_{t,r}x_ring::xsk_pool
    https://git.kernel.org/netdev/net/c/ebc33a3f8d0a
  - [net,v2,7/8] ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog
    https://git.kernel.org/netdev/net/c/6044ca26210b
  - [net,v2,8/8] ice: xsk: fix txq interrupt mapping
    https://git.kernel.org/netdev/net/c/963fb4612295

You are awesome, thank you!