Message ID | 20210820115746.3701811-1-vladimir.oltean@nxp.com (mailing list archive) |
---|---|
Series | Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking |
On Fri 20 Aug 2021 at 14:57, Vladimir Oltean <vladimir.oltean@nxp.com> wrote:
> Problem statement:
>
> Any time a driver needs to create a private association between a bridge
> upper interface and use that association within its
> SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE handler, we have an issue with FDB
> entries deleted by the bridge when the port leaves. The issue is that
> all switchdev drivers schedule a work item to have sleepable context,
> and that work item can be actually scheduled after the port has left the
> bridge, which means the association might have already been broken by
> the time the scheduled FDB work item attempts to use it.
>
> The solution is to modify switchdev to use its embedded SWITCHDEV_F_DEFER
> mechanism to make the FDB notifiers emitted from the fastpath be
> scheduled in sleepable context. All drivers are converted to handle
> SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE from their blocking notifier block
> handler (or register a blocking switchdev notifier handler if they
> didn't have one). This solves the aforementioned problem because the
> bridge waits for the switchdev deferred work items to finish before a
> port leaves (del_nbp calls switchdev_deferred_process), whereas a work
> item privately scheduled by the driver will obviously not be waited upon
> by the bridge, leading to the possibility of having the race.
>
> This is a dependency for the "DSA FDB isolation" posted here. It was
> split out of that series hence the numbering starts directly at v2.
>
> https://patchwork.kernel.org/project/netdevbpf/cover/20210818120150.892647-1-vladimir.oltean@nxp.com/
>
> Changes in v3:
> - make "addr" part of switchdev_fdb_notifier_info to avoid dangling
>   pointers not watched by RCU
> - mlx5 correction
> - build fixes in the S/390 qeth driver
>
> Vladimir Oltean (7):
>   net: bridge: move br_fdb_replay inside br_switchdev.c
>   net: switchdev: keep the MAC address by value in struct
>     switchdev_notifier_fdb_info
>   net: switchdev: move SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE to the blocking
>     notifier chain
>   net: bridge: switchdev: make br_fdb_replay offer sleepable context to
>     consumers
>   net: switchdev: drop the atomic notifier block from
>     switchdev_bridge_port_{,un}offload
>   net: switchdev: don't assume RCU context in
>     switchdev_handle_fdb_{add,del}_to_device
>   net: dsa: handle SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE synchronously
>
> .../ethernet/freescale/dpaa2/dpaa2-switch.c | 75 ++++------
> .../marvell/prestera/prestera_switchdev.c | 104 ++++++-------
> .../mellanox/mlx5/core/en/rep/bridge.c | 65 +++++++--
> .../ethernet/mellanox/mlx5/core/esw/bridge.c | 2 +-
> .../ethernet/mellanox/mlxsw/spectrum_router.c | 4 +-
> .../mellanox/mlxsw/spectrum_switchdev.c | 62 ++++++--
> .../microchip/sparx5/sparx5_mactable.c | 2 +-
> .../microchip/sparx5/sparx5_switchdev.c | 72 ++++-----
> drivers/net/ethernet/mscc/ocelot_net.c | 3 -
> drivers/net/ethernet/rocker/rocker_main.c | 67 ++++-----
> drivers/net/ethernet/rocker/rocker_ofdpa.c | 6 +-
> drivers/net/ethernet/ti/am65-cpsw-nuss.c | 4 +-
> drivers/net/ethernet/ti/am65-cpsw-switchdev.c | 54 +++----
> drivers/net/ethernet/ti/cpsw_new.c | 4 +-
> drivers/net/ethernet/ti/cpsw_switchdev.c | 57 ++++----
> drivers/s390/net/qeth_l2_main.c | 26 ++--
> include/net/switchdev.h | 33 ++++-
> net/bridge/br.c | 5 +-
> net/bridge/br_fdb.c | 54 -------
> net/bridge/br_private.h | 6 -
> net/bridge/br_switchdev.c | 128 +++++++++++++---
> net/dsa/dsa.c | 15 --
> net/dsa/dsa_priv.h | 15 --
> net/dsa/port.c | 3 -
> net/dsa/slave.c | 138 ++++++------------
> net/switchdev/switchdev.c | 61 +++++++-
> 26 files changed, 550 insertions(+), 515 deletions(-)

For mlx5 parts:

Reviewed-and-tested-by: Vlad Buslov <vladbu@nvidia.com>
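
For readers following the problem statement in the cover letter, the ordering issue can be sketched in plain user-space C. This is only a toy model under stated assumptions, not kernel code: `struct port`, `struct work_queue`, `port_leave()` and `simulate()` are hypothetical names invented for illustration. The "bridge deferred" queue stands in for the switchdev deferred work that the bridge flushes before a port leaves, and the "driver private" queue stands in for a work item scheduled by the driver itself, which the bridge knows nothing about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct port {
	bool assoc_valid;   /* driver-private port <-> bridge association */
	int handled_ok;     /* FDB events handled while the association held */
	int handled_stale;  /* FDB events handled after it was torn down */
};

typedef void (*work_fn)(struct port *p);

#define MAX_WORK 8

struct work_queue {
	work_fn fn[MAX_WORK];
	struct port *arg[MAX_WORK];
	size_t len;
};

static void queue_work_item(struct work_queue *q, work_fn fn, struct port *p)
{
	assert(q->len < MAX_WORK);
	q->fn[q->len] = fn;
	q->arg[q->len] = p;
	q->len++;
}

/* Run every pending item in order, then empty the queue. */
static void flush_queue(struct work_queue *q)
{
	for (size_t i = 0; i < q->len; i++)
		q->fn[i](q->arg[i]);
	q->len = 0;
}

/* Stand-in for a driver's FDB delete handler that needs the association. */
static void fdb_del_handler(struct port *p)
{
	if (p->assoc_valid)
		p->handled_ok++;
	else
		p->handled_stale++;
}

/* The bridge flushes only the queue it knows about, then the port leaves. */
static void port_leave(struct port *p, struct work_queue *bridge_deferred)
{
	flush_queue(bridge_deferred);
	p->assoc_valid = false;
}

/* Returns how many FDB events ran against an already broken association. */
int simulate(bool use_bridge_deferred)
{
	struct work_queue bridge_deferred = { .len = 0 };
	struct work_queue driver_private = { .len = 0 };
	struct port p = { .assoc_valid = true };

	queue_work_item(use_bridge_deferred ? &bridge_deferred : &driver_private,
			fdb_del_handler, &p);
	port_leave(&p, &bridge_deferred);
	flush_queue(&driver_private); /* private work only runs afterwards */

	return p.handled_stale;
}
```

In this model, `simulate(false)` lets the handler run after the association is gone, while `simulate(true)` has it run before the port leaves, because the flush happens inside `port_leave()`. The real series achieves the equivalent ordering through the blocking notifier chain and the deferred processing mentioned in the cover letter; the model only illustrates why the flush point matters.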
On 20.08.21 13:57, Vladimir Oltean wrote:
> Problem statement:
[...]
> 26 files changed, 550 insertions(+), 515 deletions(-)
>

For drivers/s390/net/qeth_l2_main.c :

Reviewed-and-tested-by: Alexandra Winter <wintera@linux.ibm.com>

Thank you for this proposal, it makes qeth switchdev code more robust, cleaner and gives the
opportunity for future enhancements, like you proposed.
Hi Alexandra,

On Thu, Aug 26, 2021 at 04:35:58PM +0200, Alexandra Winter wrote:
> For drivers/s390/net/qeth_l2_main.c :
>
> Reviewed-and-tested-by: Alexandra Winter <wintera@linux.ibm.com>
>
> Thank you for this proposal, it makes qeth switchdev code more robust, cleaner and gives the
> opportunity for future enhancements, like you proposed.

Thanks for reviewing and testing, and sorry that I did not copy you on
v2 too, only on v3. The discussion on v2 had not yet completed when I
posted the v3, and unless anybody has any better idea, a v4 is not going
to take place:
https://patchwork.kernel.org/project/netdevbpf/cover/20210819160723.2186424-1-vladimir.oltean@nxp.com/
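
As a side note on the v3 changelog item about keeping the MAC address by value: the idea can be illustrated with a small user-space sketch. The structures below are simplified, hypothetical stand-ins and not the kernel's definitions; `fdb_info_snapshot()` and `snapshot_survives_free()` are invented names, and `ETH_ALEN` is redefined locally with the usual value of 6. The point is only that a notifier payload which copies the address remains usable after the originating entry has been freed, whereas a payload holding a pointer into that entry would dangle once handling is deferred.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6 /* length of an Ethernet MAC address, in bytes */

/* Hypothetical stand-in for an FDB entry owned by the bridge. */
struct fdb_entry {
	unsigned char addr[ETH_ALEN];
	unsigned short vid;
};

/* Hypothetical notifier payload that embeds the address by value. */
struct fdb_info_by_value {
	unsigned char addr[ETH_ALEN];
	unsigned short vid;
};

/* Copy everything the consumer needs, so it never touches the entry later. */
struct fdb_info_by_value fdb_info_snapshot(const struct fdb_entry *e)
{
	struct fdb_info_by_value info;

	memcpy(info.addr, e->addr, ETH_ALEN);
	info.vid = e->vid;
	return info;
}

/* Returns 1 if the snapshot is still intact after the entry is freed. */
int snapshot_survives_free(void)
{
	static const unsigned char mac[ETH_ALEN] = {
		0x00, 0x11, 0x22, 0x33, 0x44, 0x55
	};
	struct fdb_entry *e = malloc(sizeof(*e));
	struct fdb_info_by_value info;

	assert(e != NULL);
	memcpy(e->addr, mac, ETH_ALEN);
	e->vid = 10;

	info = fdb_info_snapshot(e);
	free(e); /* the entry may disappear before deferred handling runs */

	return memcmp(info.addr, mac, ETH_ALEN) == 0 && info.vid == 10;
}
```

The copy costs a few bytes per notification, which seems a reasonable trade for not having to reason about the lifetime of the source entry once the event is handled in sleepable context.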