Message ID | 20250411074958.2858496-1-cratiu@nvidia.com (mailing list archive) |
---|---|
Headers | show |
Series | xfrm & bonding: Correct use of xso.real_dev | expand |
On 4/11/25 10:49, Cosmin Ratiu wrote: > This patch series was motivated by fixing a few bugs in the bonding > driver related to xfrm state migration on device failover. > > struct xfrm_dev_offload has two net_device pointers: dev and real_dev. > The first one is the device the xfrm_state is offloaded on and the > second one is used by the bonding driver to manage the underlying device > xfrm_states are actually offloaded on. When bonding isn't used, the two > pointers are the same. > > This causes confusion in drivers: Which device pointer should they use? > If they want to support bonding, they need to only use real_dev and > never look at dev. > > Furthermore, real_dev is used without proper locking from multiple code > paths and changing it is dangerous. See commit [1] for example. > > This patch series clears things out by removing all uses of real_dev > from outside the bonding driver. > Then, the bonding driver is refactored to fix a couple of long standing > races and the original bug which motivated this patch series. > > [1] commit f8cde9805981 ("bonding: fix xfrm real_dev null pointer > dereference") > > v2 -> v3: > Added a comment with locking expectations for real_dev. > Removed unnecessary bond variable from bond_ipsec_del_sa(). > v1 -> v2: > Added missing kdoc for various functions. > Made bond_ipsec_del_sa() use xso.real_dev instead of curr_active_slave. > > Cosmin Ratiu (6): > Cleaning up unnecessary uses of xso.real_dev: > net/mlx5: Avoid using xso.real_dev unnecessarily > xfrm: Use xdo.dev instead of xdo.real_dev > xfrm: Remove unneeded device check from validate_xmit_xfrm > Refactoring device operations to get an explicit device pointer: > xfrm: Add explicit dev to .xdo_dev_state_{add,delete,free} > Fixing a bonding xfrm state migration bug: > bonding: Mark active offloaded xfrm_states > Fixing long standing races in bonding: > bonding: Fix multiple long standing offload races > > Documentation/networking/xfrm_device.rst | 10 +- > drivers/net/bonding/bond_main.c | 119 +++++++++--------- > .../net/ethernet/chelsio/cxgb4/cxgb4_main.c | 20 +-- > .../inline_crypto/ch_ipsec/chcr_ipsec.c | 18 ++- > .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 41 +++--- > drivers/net/ethernet/intel/ixgbevf/ipsec.c | 21 ++-- > .../marvell/octeontx2/nic/cn10k_ipsec.c | 18 +-- > .../mellanox/mlx5/core/en_accel/ipsec.c | 28 ++--- > .../mellanox/mlx5/core/en_accel/ipsec.h | 1 + > .../net/ethernet/netronome/nfp/crypto/ipsec.c | 11 +- > drivers/net/netdevsim/ipsec.c | 15 ++- > include/linux/netdevice.h | 10 +- > include/net/xfrm.h | 11 ++ > net/xfrm/xfrm_device.c | 13 +- > net/xfrm/xfrm_state.c | 16 +-- > 15 files changed, 185 insertions(+), 167 deletions(-) > For the set: Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
On Fri, Apr 11, 2025 at 10:49:52AM +0300, Cosmin Ratiu wrote: > This patch series was motivated by fixing a few bugs in the bonding > driver related to xfrm state migration on device failover. > > struct xfrm_dev_offload has two net_device pointers: dev and real_dev. > The first one is the device the xfrm_state is offloaded on and the > second one is used by the bonding driver to manage the underlying device > xfrm_states are actually offloaded on. When bonding isn't used, the two > pointers are the same. > > This causes confusion in drivers: Which device pointer should they use? > If they want to support bonding, they need to only use real_dev and > never look at dev. > > Furthermore, real_dev is used without proper locking from multiple code > paths and changing it is dangerous. See commit [1] for example. > > This patch series clears things out by removing all uses of real_dev > from outside the bonding driver. > Then, the bonding driver is refactored to fix a couple of long standing > races and the original bug which motivated this patch series. I'm still a bit skeptical about the bonding offloads itself as mentioned here: https://lore.kernel.org/all/ZsbkdzvjVf3GiYHa@gauss3.secunet.de/ but I'm OK with this particular pachset. How should we merge this patchset? It touches several subsystems, including xfrm. I'm fine merging it through the ipsec-next tree, but would be also ok if it goes though the net-next tree if that's easier.
On Mon, 14 Apr 2025 12:11:36 +0200 Steffen Klassert wrote: > I'm still a bit skeptical about the bonding offloads itself as > mentioned here: > > https://lore.kernel.org/all/ZsbkdzvjVf3GiYHa@gauss3.secunet.de/ So am I, FWIW. > but I'm OK with this particular pachset. > > How should we merge this patchset? It touches several subsystems, > including xfrm. I'm fine merging it through the ipsec-next tree, > but would be also ok if it goes though the net-next tree if > that's easier. No strong preference, but I think xfrm tree makes most sense. It touches a few other directories but all code here is xfrm related.
On Mon, Apr 14, 2025 at 09:51:47AM -0700, Jakub Kicinski wrote: > On Mon, 14 Apr 2025 12:11:36 +0200 Steffen Klassert wrote: > > I'm still a bit skeptical about the bonding offloads itself as > > mentioned here: > > > > https://lore.kernel.org/all/ZsbkdzvjVf3GiYHa@gauss3.secunet.de/ > > So am I, FWIW. > > > but I'm OK with this particular pachset. > > > > How should we merge this patchset? It touches several subsystems, > > including xfrm. I'm fine merging it through the ipsec-next tree, > > but would be also ok if it goes though the net-next tree if > > that's easier. > > No strong preference, but I think xfrm tree makes most sense. > It touches a few other directories but all code here is xfrm > related. Ok, I'll take them into ipsec-next.
On Fri, Apr 11, 2025 at 10:49:52AM +0300, Cosmin Ratiu wrote: > This patch series was motivated by fixing a few bugs in the bonding > driver related to xfrm state migration on device failover. > > struct xfrm_dev_offload has two net_device pointers: dev and real_dev. > The first one is the device the xfrm_state is offloaded on and the > second one is used by the bonding driver to manage the underlying device > xfrm_states are actually offloaded on. When bonding isn't used, the two > pointers are the same. > > This causes confusion in drivers: Which device pointer should they use? > If they want to support bonding, they need to only use real_dev and > never look at dev. > > Furthermore, real_dev is used without proper locking from multiple code > paths and changing it is dangerous. See commit [1] for example. > > This patch series clears things out by removing all uses of real_dev > from outside the bonding driver. > Then, the bonding driver is refactored to fix a couple of long standing > races and the original bug which motivated this patch series. > > [1] commit f8cde9805981 ("bonding: fix xfrm real_dev null pointer > dereference") > > v2 -> v3: > Added a comment with locking expectations for real_dev. > Removed unnecessary bond variable from bond_ipsec_del_sa(). > v1 -> v2: > Added missing kdoc for various functions. > Made bond_ipsec_del_sa() use xso.real_dev instead of curr_active_slave. > > Cosmin Ratiu (6): > Cleaning up unnecessary uses of xso.real_dev: > net/mlx5: Avoid using xso.real_dev unnecessarily > xfrm: Use xdo.dev instead of xdo.real_dev > xfrm: Remove unneeded device check from validate_xmit_xfrm > Refactoring device operations to get an explicit device pointer: > xfrm: Add explicit dev to .xdo_dev_state_{add,delete,free} > Fixing a bonding xfrm state migration bug: > bonding: Mark active offloaded xfrm_states > Fixing long standing races in bonding: > bonding: Fix multiple long standing offload races > > Documentation/networking/xfrm_device.rst | 10 +- > drivers/net/bonding/bond_main.c | 119 +++++++++--------- > .../net/ethernet/chelsio/cxgb4/cxgb4_main.c | 20 +-- > .../inline_crypto/ch_ipsec/chcr_ipsec.c | 18 ++- > .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 41 +++--- > drivers/net/ethernet/intel/ixgbevf/ipsec.c | 21 ++-- > .../marvell/octeontx2/nic/cn10k_ipsec.c | 18 +-- > .../mellanox/mlx5/core/en_accel/ipsec.c | 28 ++--- > .../mellanox/mlx5/core/en_accel/ipsec.h | 1 + > .../net/ethernet/netronome/nfp/crypto/ipsec.c | 11 +- > drivers/net/netdevsim/ipsec.c | 15 ++- > include/linux/netdevice.h | 10 +- > include/net/xfrm.h | 11 ++ > net/xfrm/xfrm_device.c | 13 +- > net/xfrm/xfrm_state.c | 16 +-- > 15 files changed, 185 insertions(+), 167 deletions(-) Series applied to ipsec-next, thanks a lot Cosmin!