[v5,net-next,00/10] Drop rtnl_lock from DSA .port_fdb_{add,del}

Message ID 20211024171757.3753288-1-vladimir.oltean@nxp.com (mailing list archive)

Vladimir Oltean Oct. 24, 2021, 5:17 p.m. UTC
As mentioned in the RFC posted 2 months ago:
https://patchwork.kernel.org/project/netdevbpf/cover/20210824114049.3814660-1-vladimir.oltean@nxp.com/

DSA is transitioning to a driver API where the rtnl_lock is not held
when calling ds->ops->port_fdb_add() and ds->ops->port_fdb_del().
Drivers cannot take that lock privately from those callbacks either.

This change is required so that DSA can wait for switchdev FDB work
items to finish before leaving the bridge. That change will be made in a
future patch series.

A small selftest is provided with the patch set, in the hope that it
will catch concurrency issues which are exposed by this series but which
I did not spot by code inspection.

The status of the existing drivers:

- mv88e6xxx_port_fdb_add() and mv88e6xxx_port_fdb_del() take
  mv88e6xxx_reg_lock() so they should be safe.

- qca8k_fdb_add() and qca8k_fdb_del() take mutex_lock(&priv->reg_mutex)
  so they should be safe.

- hellcreek_fdb_add() and hellcreek_fdb_del() take mutex_lock(&hellcreek->reg_lock)
  so they should be safe.

- ksz9477_port_fdb_add() and ksz9477_port_fdb_del() take mutex_lock(&dev->alu_mutex)
  so they should be safe.

- b53_fdb_add() and b53_fdb_del() did not have locking, so I've added a
  scheme based on my own judgement there (not tested).

- felix_fdb_add() and felix_fdb_del() did not have locking, so I've added
  and tested a locking scheme there.

- mt7530_port_fdb_add() and mt7530_port_fdb_del() take
  mutex_lock(&priv->reg_mutex), so they should be safe.

- gswip_port_fdb() did not have locking, so I've added a non-expert
  locking scheme based on my own judgement (not tested).

- lan9303_alr_add_port() and lan9303_alr_del_port() take
  mutex_lock(&chip->alr_mutex) so they should be safe.

- sja1105_fdb_add() and sja1105_fdb_del() did not have locking, so I've
  added and tested a locking scheme.
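
For illustration only, the general pattern that the "safe" drivers above
follow can be sketched in user space: every FDB add/del path takes a
driver-private mutex around the table access, so it no longer depends on
the caller holding rtnl_lock. This is not code from the series. The
toy_switch structure, the toy_* functions and the fixed-size table are
all hypothetical, and pthread_mutex_t stands in for the kernel's struct
mutex.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_FDB_SIZE 16

struct toy_fdb_entry {
	uint8_t addr[6];
	uint16_t vid;
	int port;
	bool valid;
};

/* Hypothetical driver-private data, not taken from any real driver. */
struct toy_switch {
	pthread_mutex_t fdb_lock;	/* serializes all access to fdb[] */
	struct toy_fdb_entry fdb[TOY_FDB_SIZE];
};

void toy_switch_init(struct toy_switch *sw)
{
	memset(sw, 0, sizeof(*sw));
	pthread_mutex_init(&sw->fdb_lock, NULL);
}

/* Must be called with sw->fdb_lock held. */
static struct toy_fdb_entry *toy_fdb_find(struct toy_switch *sw,
					  const uint8_t *addr, uint16_t vid)
{
	for (int i = 0; i < TOY_FDB_SIZE; i++) {
		struct toy_fdb_entry *e = &sw->fdb[i];

		if (e->valid && e->vid == vid && !memcmp(e->addr, addr, 6))
			return e;
	}
	return NULL;
}

/* Add or update an entry; the lock is taken here, not by the caller. */
int toy_port_fdb_add(struct toy_switch *sw, int port,
		     const uint8_t *addr, uint16_t vid)
{
	struct toy_fdb_entry *e;
	int err = 0;

	assert(port >= 0);

	pthread_mutex_lock(&sw->fdb_lock);
	e = toy_fdb_find(sw, addr, vid);
	/* No existing entry: pick the first free slot, if any. */
	for (int i = 0; i < TOY_FDB_SIZE && !e; i++)
		if (!sw->fdb[i].valid)
			e = &sw->fdb[i];
	if (!e) {
		err = -ENOSPC;
		goto out;
	}
	memcpy(e->addr, addr, 6);
	e->vid = vid;
	e->port = port;
	e->valid = true;
out:
	pthread_mutex_unlock(&sw->fdb_lock);
	return err;
}

/* Delete an entry; returns -ENOENT if it is not present. */
int toy_port_fdb_del(struct toy_switch *sw, const uint8_t *addr, uint16_t vid)
{
	struct toy_fdb_entry *e;
	int err = 0;

	pthread_mutex_lock(&sw->fdb_lock);
	e = toy_fdb_find(sw, addr, vid);
	if (e)
		e->valid = false;
	else
		err = -ENOENT;
	pthread_mutex_unlock(&sw->fdb_lock);
	return err;
}

/* Count valid entries under the lock. */
int toy_fdb_count(struct toy_switch *sw)
{
	int n = 0;

	pthread_mutex_lock(&sw->fdb_lock);
	for (int i = 0; i < TOY_FDB_SIZE; i++)
		n += sw->fdb[i].valid;
	pthread_mutex_unlock(&sw->fdb_lock);
	return n;
}
```

The lookup helper documents its locking requirement with a plain comment
rather than an annotation, and the lock is held across the whole
find-then-modify sequence, so concurrent add and del calls cannot
interleave in the middle of a table update.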

Changes in v3:
- Unlock arl_mutex only once in b53_fdb_dump().

Changes in v4:
- Use __must_hold in ocelot and b53
- Add missing mutex_init in lantiq_gswip
- Clean up the selftest a bit.

Changes in v5:
- Replace __must_hold with a comment.
- Add a new patch (01/10).

Vladimir Oltean (10):
  net: dsa: avoid refcount warnings when ->port_{fdb,mdb}_del returns
    error
  net: dsa: sja1105: wait for dynamic config command completion on
    writes too
  net: dsa: sja1105: serialize access to the dynamic config interface
  net: mscc: ocelot: serialize access to the MAC table
  net: dsa: b53: serialize access to the ARL table
  net: dsa: lantiq_gswip: serialize access to the PCE registers
  net: dsa: introduce locking for the address lists on CPU and DSA ports
  net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_work
  selftests: lib: forwarding: allow tests to not require mz and jq
  selftests: net: dsa: add a stress test for unlocked FDB operations

 MAINTAINERS                                   |  1 +
 drivers/net/dsa/b53/b53_common.c              | 36 ++++++--
 drivers/net/dsa/b53/b53_priv.h                |  1 +
 drivers/net/dsa/lantiq_gswip.c                | 28 +++++-
 drivers/net/dsa/sja1105/sja1105.h             |  2 +
 .../net/dsa/sja1105/sja1105_dynamic_config.c  | 91 ++++++++++++++-----
 drivers/net/dsa/sja1105/sja1105_main.c        |  1 +
 drivers/net/ethernet/mscc/ocelot.c            | 53 ++++++++---
 include/net/dsa.h                             |  1 +
 include/soc/mscc/ocelot.h                     |  3 +
 net/dsa/dsa2.c                                |  1 +
 net/dsa/slave.c                               |  2 -
 net/dsa/switch.c                              | 80 ++++++++++------
 .../drivers/net/dsa/test_bridge_fdb_stress.sh | 47 ++++++++++
 tools/testing/selftests/net/forwarding/lib.sh | 10 +-
 15 files changed, 280 insertions(+), 77 deletions(-)
 create mode 100755 tools/testing/selftests/drivers/net/dsa/test_bridge_fdb_stress.sh

Comments

patchwork-bot+netdevbpf@kernel.org Oct. 25, 2021, 12:10 p.m. UTC | #1
Hello:

This series was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:

On Sun, 24 Oct 2021 20:17:47 +0300 you wrote:
> As mentioned in the RFC posted 2 months ago:
> https://patchwork.kernel.org/project/netdevbpf/cover/20210824114049.3814660-1-vladimir.oltean@nxp.com/
> 
> DSA is transitioning to a driver API where the rtnl_lock is not held
> when calling ds->ops->port_fdb_add() and ds->ops->port_fdb_del().
> Drivers cannot take that lock privately from those callbacks either.
> 
> [...]

Here is the summary with links:
  - [v5,net-next,01/10] net: dsa: avoid refcount warnings when ->port_{fdb,mdb}_del returns error
    https://git.kernel.org/netdev/net-next/c/232deb3f9567
  - [v5,net-next,02/10] net: dsa: sja1105: wait for dynamic config command completion on writes too
    https://git.kernel.org/netdev/net-next/c/df405910ab9f
  - [v5,net-next,03/10] net: dsa: sja1105: serialize access to the dynamic config interface
    https://git.kernel.org/netdev/net-next/c/eb016afd83a9
  - [v5,net-next,04/10] net: mscc: ocelot: serialize access to the MAC table
    https://git.kernel.org/netdev/net-next/c/2468346c5677
  - [v5,net-next,05/10] net: dsa: b53: serialize access to the ARL table
    https://git.kernel.org/netdev/net-next/c/f7eb4a1c0864
  - [v5,net-next,06/10] net: dsa: lantiq_gswip: serialize access to the PCE registers
    https://git.kernel.org/netdev/net-next/c/cf231b436f7c
  - [v5,net-next,07/10] net: dsa: introduce locking for the address lists on CPU and DSA ports
    https://git.kernel.org/netdev/net-next/c/338a3a4745aa
  - [v5,net-next,08/10] net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_work
    https://git.kernel.org/netdev/net-next/c/5cdfde49a07f
  - [v5,net-next,09/10] selftests: lib: forwarding: allow tests to not require mz and jq
    https://git.kernel.org/netdev/net-next/c/d70b51f2845d
  - [v5,net-next,10/10] selftests: net: dsa: add a stress test for unlocked FDB operations
    https://git.kernel.org/netdev/net-next/c/edc90d15850c

You are awesome, thank you!