[v3,net-next,0/7] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking

Message ID 20210820115746.3701811-1-vladimir.oltean@nxp.com (mailing list archive)

Message

Vladimir Oltean Aug. 20, 2021, 11:57 a.m. UTC
Problem statement:

Any time a driver needs to create a private association with a bridge
upper interface and use that association within its
SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE handler, we have an issue with FDB
entries deleted by the bridge when the port leaves. The issue is that
all switchdev drivers schedule a work item to get sleepable context,
and that work item can actually run after the port has left the
bridge, which means the association might have already been broken by
the time the scheduled FDB work item attempts to use it.

The solution is to modify switchdev to use its embedded SWITCHDEV_F_DEFER
mechanism so that the FDB notifiers emitted from the fastpath are
delivered in sleepable context. All drivers are converted to handle
SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE from their blocking notifier block
handler (or register a blocking switchdev notifier handler if they
didn't have one). This solves the aforementioned problem because the
bridge waits for the switchdev deferred work items to finish before a
port leaves (del_nbp calls switchdev_deferred_process), whereas a work
item privately scheduled by the driver will obviously not be waited upon
by the bridge, which leaves the race possible.
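
To make the ordering argument concrete, here is a minimal single-threaded
userspace model of the two schemes. It is only a sketch: every name in it
(struct port, struct work_queue, run_scenario and so on) is invented for
illustration and is not a kernel API, and it forces the bad ordering
deterministically, whereas in the kernel it is only a possible
interleaving. The flush inside port_leave_deferred_queue() stands in for
the switchdev_deferred_process() call made from del_nbp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORK 8

/* Hypothetical model of a switch port; not a kernel structure. */
struct port {
	bool bridge_assoc_valid;	/* driver-private link to the bridge upper */
	int fdb_ok;			/* FDB work items that saw a valid link */
	int fdb_stale;			/* FDB work items that ran too late */
};

typedef void (*work_fn)(struct port *p);

/* A trivial FIFO standing in for either kind of deferred-work queue. */
struct work_queue {
	work_fn fn[MAX_WORK];
	struct port *arg[MAX_WORK];
	size_t len;
};

static bool queue_work_item(struct work_queue *q, work_fn fn, struct port *p)
{
	if (q->len == MAX_WORK)
		return false;
	q->fn[q->len] = fn;
	q->arg[q->len] = p;
	q->len++;
	return true;
}

/* Run every pending item in order, then empty the queue. */
static void flush_queue(struct work_queue *q)
{
	for (size_t i = 0; i < q->len; i++)
		q->fn[i](q->arg[i]);
	q->len = 0;
}

/* The deferred FDB handler: it needs the bridge association. */
static void fdb_del_work(struct port *p)
{
	if (p->bridge_assoc_valid)
		p->fdb_ok++;
	else
		p->fdb_stale++;
}

/* Old scheme: the bridge cannot see the driver's private queue. */
static void port_leave_private_queue(struct port *p, struct work_queue *q)
{
	(void)q;			/* nothing waits for pending items */
	p->bridge_assoc_valid = false;
}

/* Proposed scheme: pending deferred items are flushed before leaving. */
static void port_leave_deferred_queue(struct port *p, struct work_queue *q)
{
	flush_queue(q);
	p->bridge_assoc_valid = false;
}

/* Returns how many FDB work items ran after the association was broken. */
static int run_scenario(bool use_deferred_queue)
{
	struct port p = { .bridge_assoc_valid = true };
	struct work_queue q = { .len = 0 };

	assert(queue_work_item(&q, fdb_del_work, &p));
	if (use_deferred_queue)
		port_leave_deferred_queue(&p, &q);
	else
		port_leave_private_queue(&p, &q);
	flush_queue(&q);		/* anything still pending runs only now */
	return p.fdb_stale;
}
```

In this model run_scenario(false) reports one stale work item and
run_scenario(true) reports none, because the flush happens while the
association is still valid.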

This is a dependency for the "DSA FDB isolation" series posted here. It
was split out of that series, hence the numbering starts directly at v2.

https://patchwork.kernel.org/project/netdevbpf/cover/20210818120150.892647-1-vladimir.oltean@nxp.com/

Changes in v3:
- make "addr" part of switchdev_notifier_fdb_info to avoid dangling
  pointers not watched by RCU
- mlx5 correction
- build fixes in the S/390 qeth driver
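
The first item above is easier to see with a small example. The following
is a userspace sketch, not the kernel structures: struct fdb_entry,
struct fdb_info_by_ptr and struct fdb_info_by_val are hypothetical
stand-ins, and malloc/free merely imitate an FDB entry being freed before
a deferred handler gets to run. It shows why a notifier payload that
carries the MAC address by value stays usable once the handler no longer
runs in the context that kept the entry alive.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6	/* length of an Ethernet MAC address in bytes */

/* Hypothetical stand-in for a bridge FDB entry that owns the address. */
struct fdb_entry {
	unsigned char addr[ETH_ALEN];
};

/* By pointer: only valid while the entry it points into is alive. */
struct fdb_info_by_ptr {
	const unsigned char *addr;
};

/* By value: the payload carries its own copy of the address. */
struct fdb_info_by_val {
	unsigned char addr[ETH_ALEN];
};

static void fill_by_val(struct fdb_info_by_val *info,
			const struct fdb_entry *e)
{
	memcpy(info->addr, e->addr, ETH_ALEN);
}

/*
 * Returns 1 if the by-value copy still holds the right address after the
 * entry was freed, 0 if not, -1 on allocation failure.
 */
static int by_val_survives_free(void)
{
	static const unsigned char mac[ETH_ALEN] = {
		0x02, 0x00, 0x00, 0x00, 0x00, 0x01
	};
	struct fdb_entry *e = malloc(sizeof(*e));
	struct fdb_info_by_val info;

	if (!e)
		return -1;
	memcpy(e->addr, mac, ETH_ALEN);
	fill_by_val(&info, e);
	free(e);	/* the entry goes away before the deferred handler runs */
	return memcmp(info.addr, mac, ETH_ALEN) == 0;
}
```

A struct fdb_info_by_ptr filled from the same entry would be left pointing
at freed memory at the point where by_val_survives_free() does its final
comparison, which is the dangling pointer the changelog item refers to.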

Vladimir Oltean (7):
  net: bridge: move br_fdb_replay inside br_switchdev.c
  net: switchdev: keep the MAC address by value in struct
    switchdev_notifier_fdb_info
  net: switchdev: move SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE to the blocking
    notifier chain
  net: bridge: switchdev: make br_fdb_replay offer sleepable context to
    consumers
  net: switchdev: drop the atomic notifier block from
    switchdev_bridge_port_{,un}offload
  net: switchdev: don't assume RCU context in
    switchdev_handle_fdb_{add,del}_to_device
  net: dsa: handle SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE synchronously

 .../ethernet/freescale/dpaa2/dpaa2-switch.c   |  75 ++++------
 .../marvell/prestera/prestera_switchdev.c     | 104 ++++++-------
 .../mellanox/mlx5/core/en/rep/bridge.c        |  65 +++++++--
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  |   2 +-
 .../ethernet/mellanox/mlxsw/spectrum_router.c |   4 +-
 .../mellanox/mlxsw/spectrum_switchdev.c       |  62 ++++++--
 .../microchip/sparx5/sparx5_mactable.c        |   2 +-
 .../microchip/sparx5/sparx5_switchdev.c       |  72 ++++-----
 drivers/net/ethernet/mscc/ocelot_net.c        |   3 -
 drivers/net/ethernet/rocker/rocker_main.c     |  67 ++++-----
 drivers/net/ethernet/rocker/rocker_ofdpa.c    |   6 +-
 drivers/net/ethernet/ti/am65-cpsw-nuss.c      |   4 +-
 drivers/net/ethernet/ti/am65-cpsw-switchdev.c |  54 +++----
 drivers/net/ethernet/ti/cpsw_new.c            |   4 +-
 drivers/net/ethernet/ti/cpsw_switchdev.c      |  57 ++++----
 drivers/s390/net/qeth_l2_main.c               |  26 ++--
 include/net/switchdev.h                       |  33 ++++-
 net/bridge/br.c                               |   5 +-
 net/bridge/br_fdb.c                           |  54 -------
 net/bridge/br_private.h                       |   6 -
 net/bridge/br_switchdev.c                     | 128 +++++++++++++---
 net/dsa/dsa.c                                 |  15 --
 net/dsa/dsa_priv.h                            |  15 --
 net/dsa/port.c                                |   3 -
 net/dsa/slave.c                               | 138 ++++++------------
 net/switchdev/switchdev.c                     |  61 +++++++-
 26 files changed, 550 insertions(+), 515 deletions(-)

Comments

Vlad Buslov Aug. 20, 2021, 3:46 p.m. UTC | #1
For mlx5 parts:

Reviewed-and-tested-by: Vlad Buslov <vladbu@nvidia.com>

Alexandra Winter Aug. 26, 2021, 2:35 p.m. UTC | #2
For drivers/s390/net/qeth_l2_main.c :

Reviewed-and-tested-by: Alexandra Winter <wintera@linux.ibm.com>

Thank you for this proposal, it makes qeth switchdev code more robust, cleaner and gives the
opportunity for future enhancements, like you proposed.

Vladimir Oltean Aug. 26, 2021, 2:41 p.m. UTC | #3
Hi Alexandra,

On Thu, Aug 26, 2021 at 04:35:58PM +0200, Alexandra Winter wrote:
> For drivers/s390/net/qeth_l2_main.c :
> 
> Reviewed-and-tested-by: Alexandra Winter <wintera@linux.ibm.com>
> 
> Thank you for this proposal, it makes qeth switchdev code more robust, cleaner and gives the
> opportunity for future enhancements, like you proposed.

Thanks for reviewing and testing, and sorry that I did not copy you on
v2 too, only on v3. The discussion on v2 had not yet completed when I
posted the v3, and unless anybody has any better idea, a v4 is not going
to take place:

https://patchwork.kernel.org/project/netdevbpf/cover/20210819160723.2186424-1-vladimir.oltean@nxp.com/