mbox series

[RFC,net-next,0/7] net: dsa: Sync local bridge FDB addresses to hardware

Message ID 20210116012515.3152-1-tobias@waldekranz.com (mailing list archive)
Headers show
Series net: dsa: Sync local bridge FDB addresses to hardware | expand

Message

Tobias Waldekranz Jan. 16, 2021, 1:25 a.m. UTC
This is an extension of previous work done by Vladimir Oltean:
https://lore.kernel.org/netdev/20210106095136.224739-1-olteanv@gmail.com/

With this series, local addresses belonging to bridge ports or to the
bridge itself are also synced down to the hardware FDB. As a result
the hardware can avoid flooding return traffic to the CPU which is not
only inefficient but also very confusing to users:

https://lore.kernel.org/netdev/CAGwvh_MAQWuKuhu5VuYjibmyN-FRxCXXhrQBRm34GShZPSN6Aw@mail.gmail.com/
https://lore.kernel.org/netdev/6106e3d5-31fc-388e-d4ac-c84ac0746a72@prevas.dk/

Patch 1 through 3 extends the switchdev fdb notifications to include
the local flag, and to handle the case when an entry is added to the
bridge itself.

Patches 4 through 6 enables DSA to make use of those extensions.

Finally, enable assisted learning on the CPU port for mv88e6xxx.

Tobias Waldekranz (7):
  net: bridge: switchdev: Refactor br_switchdev_fdb_notify
  net: bridge: switchdev: Include local flag in FDB notifications
  net: bridge: switchdev: Send FDB notifications for host addresses
  net: dsa: Include local addresses in assisted CPU port learning
  net: dsa: Include bridge addresses in assisted CPU port learning
  net: dsa: Sync static FDB entries on foreign interfaces to hardware
  net: dsa: mv88e6xxx: Request assisted learning on CPU port

 drivers/net/dsa/mv88e6xxx/chip.c |  1 +
 include/net/switchdev.h          |  1 +
 net/bridge/br_fdb.c              |  4 +--
 net/bridge/br_private.h          |  7 +++--
 net/bridge/br_switchdev.c        | 47 ++++++++++----------------------
 net/dsa/slave.c                  | 26 ++++++++++++------
 6 files changed, 40 insertions(+), 46 deletions(-)

Comments

Qingfang Deng Feb. 1, 2021, 6:24 a.m. UTC | #1
Hi Tobias,

I've tested your patch series on kernel 5.4 and found that it only works
when VLAN filtering is enabled.
After some debugging, I noticed DSA will add static entries to ATU 0 if
VLAN filtering is disabled, regardless of default_pvid of the bridge,
which is also the ATU# used by the bridge.

Currently I use the hack below to rewrite ATU# to 1, but it obviously
does not solve the root cause.

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index b99f27b8c084..9c897c03896f 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2106,6 +2106,7 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;
 
+	vid = vid ? : 1;
 	mv88e6xxx_reg_lock(chip);
 	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid,
 					   MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC);
@@ -2120,6 +2121,7 @@ static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;
 
+	vid = vid ? : 1;
 	mv88e6xxx_reg_lock(chip);
 	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0);
 	mv88e6xxx_reg_unlock(chip);
Tobias Waldekranz Feb. 3, 2021, 9:27 a.m. UTC | #2
On Mon, Feb 01, 2021 at 14:24, DENG Qingfang <dqfext@gmail.com> wrote:
> Hi Tobias,
>
> I've tested your patch series on kernel 5.4 and found that it only works
> when VLAN filtering is enabled.
> After some debugging, I noticed DSA will add static entries to ATU 0 if
> VLAN filtering is disabled, regardless of default_pvid of the bridge,
> which is also the ATU# used by the bridge.

Good catch, thanks! This seems to stem from a different policy regarding
VLAN assignment in the bridge vs. how a Marvell switch works.

By default, a bridge will use a default PVID of 1, even when VLAN
filtering is disabled (nbp_vlan_init). Yet it will assign all packets to
VLAN 0 on ingress (br_handle_frame_finish->br_allowed_ingress).

The switch OTOH, will use the PVID of the port for all packets when
802.1Q is disabled, thus assigning all packets to VLAN 1 when VLAN
filtering is disabled.

Andrew, Vladimir: Should mv88e6xxx always set the PVID to 0 when VLAN
filtering is disabled?

> Currently I use the hack below to rewrite ATU# to 1, but it obviously
> does not solve the root cause.

Alternatively, as a userspace workaround, you can change the default
PVID to 0:

ip link set dev $BR type bridge vlan_default_pvid 0
Vladimir Oltean Feb. 3, 2021, 10:14 a.m. UTC | #3
On Wed, Feb 03, 2021 at 10:27:02AM +0100, Tobias Waldekranz wrote:
> On Mon, Feb 01, 2021 at 14:24, DENG Qingfang <dqfext@gmail.com> wrote:
> > I've tested your patch series on kernel 5.4 and found that it only works
> > when VLAN filtering is enabled.
> > After some debugging, I noticed DSA will add static entries to ATU 0 if
> > VLAN filtering is disabled, regardless of default_pvid of the bridge,
> > which is also the ATU# used by the bridge.
> By default, a bridge will use a default PVID of 1, even when VLAN
> filtering is disabled (nbp_vlan_init). Yet it will assign all packets to
> VLAN 0 on ingress (br_handle_frame_finish->br_allowed_ingress).
>
> The switch OTOH, will use the PVID of the port for all packets when
> 802.1Q is disabled, thus assigning all packets to VLAN 1 when VLAN
> filtering is disabled.
>
> Andrew, Vladimir: Should mv88e6xxx always set the PVID to 0 when VLAN
> filtering is disabled?

For Ocelot/Felix, after trying to fight with some other fallout caused
by a mismatch between hardware pvid and bridge pvid when vlan_filtering=0:
https://patchwork.ozlabs.org/project/netdev/patch/20201015173355.564934-1-vladimir.oltean@nxp.com/
I did this and lived happily ever after:

  commit 75e5a554c87fd48f67d1674cfd34e47e3b454fb3
  Author: Vladimir Oltean <vladimir.oltean@nxp.com>
  Date:   Sat Oct 31 12:29:10 2020 +0200

      net: mscc: ocelot: use the pvid of zero when bridged with vlan_filtering=0

      Currently, mscc_ocelot ports configure pvid=0 in standalone mode, and
      inherit the pvid from the bridge when one is present.

      When the bridge has vlan_filtering=0, the software semantics are that
      packets should be received regardless of whether there's a pvid
      configured on the ingress port or not. However, ocelot does not observe
      those semantics today.

      Moreover, changing the PVID is also a problem with vlan_filtering=0.
      We are privately remapping the VID of FDB, MDB entries to the port's
      PVID when those are VLAN-unaware (i.e. when the VID of these entries
      comes to us as 0). But we have no logic of adjusting that remapping when
      the user changes the pvid and vlan_filtering is 0. So stale entries
      would be left behind, and untagged traffic will stop matching on them.

      And even if we were to solve that, there's an even bigger problem. If
      swp0 has pvid 1, and swp1 has pvid 2, and both are under a vlan_filtering=0
      bridge, they should be able to forward traffic between one another.
      However, with ocelot they wouldn't do that.

      The simplest way of fixing this is to never configure the pvid based on
      what the bridge is asking for, when vlan_filtering is 0. Only if there
      was a VLAN that the bridge couldn't mangle, that we could use as pvid....
      So, turns out, there's 0 just for that. And for a reason: IEEE
      802.1Q-2018, page 247, Table 9-2-Reserved VID values says:

              The null VID. Indicates that the tag header contains only
              priority information; no VID is present in the frame.
              This VID value shall not be configured as a PVID or a member
              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
              of a VID Set, or configured in any FDB entry, or used in any
              Management operation.

      So, aren't we doing exactly what 802.1Q says not to? Well, in a way, but
      what we're doing here is just driver-level bookkeeping, all for the
      better. The fact that we're using a pvid of 0 is not observable behavior
      from the outside world: the network stack does not see the classified
      VLAN that the switch uses, in vlan_filtering=0 mode. And we're also more
      consistent with the standalone mode now.

      And now that we use the pvid of 0 in this mode, there's another advantage:
      we don't need to perform any VID remapping for FDB and MDB entries either,
      we can just use the VID of 0 that the bridge is passing to us.

      The only gotcha is that every time we change the vlan_filtering setting,
      we need to reapply the pvid (either to 0, or to the value from the bridge).
      A small side-effect visible in the patch is that ocelot_port_set_pvid
      needs to be moved above ocelot_port_vlan_filtering, so that it can be
      called from there without forward-declarations.

      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tobias Waldekranz Feb. 3, 2021, 10:42 a.m. UTC | #4
On Wed, Feb 03, 2021 at 12:14, Vladimir Oltean <olteanv@gmail.com> wrote:
> On Wed, Feb 03, 2021 at 10:27:02AM +0100, Tobias Waldekranz wrote:
>> On Mon, Feb 01, 2021 at 14:24, DENG Qingfang <dqfext@gmail.com> wrote:
>> > I've tested your patch series on kernel 5.4 and found that it only works
>> > when VLAN filtering is enabled.
>> > After some debugging, I noticed DSA will add static entries to ATU 0 if
>> > VLAN filtering is disabled, regardless of default_pvid of the bridge,
>> > which is also the ATU# used by the bridge.
>> By default, a bridge will use a default PVID of 1, even when VLAN
>> filtering is disabled (nbp_vlan_init). Yet it will assign all packets to
>> VLAN 0 on ingress (br_handle_frame_finish->br_allowed_ingress).
>>
>> The switch OTOH, will use the PVID of the port for all packets when
>> 802.1Q is disabled, thus assigning all packets to VLAN 1 when VLAN
>> filtering is disabled.
>>
>> Andrew, Vladimir: Should mv88e6xxx always set the PVID to 0 when VLAN
>> filtering is disabled?
>
> For Ocelot/Felix, after trying to fight with some other fallout caused
> by a mismatch between hardware pvid and bridge pvid when vlan_filtering=0:
> https://patchwork.ozlabs.org/project/netdev/patch/20201015173355.564934-1-vladimir.oltean@nxp.com/
> I did this and lived happily ever after:

OK great, so there is precedent. I will add it to my TODO.

Thank you!