diff mbox series

[RFC,net-next,1/3] selftests: forwarding: add a vlan_deletion test to bridge_vlan_unaware

Message ID 20220705173114.2004386-2-vladimir.oltean@nxp.com (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Headers show
Series Delete ds->configure_vlan_while_not_filtering | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 12 maintainers not CCed: daniel@iogearbox.net songliubraving@fb.com ast@kernel.org bpf@vger.kernel.org linux-kselftest@vger.kernel.org yhs@fb.com tobias@waldekranz.com john.fastabend@gmail.com kafai@fb.com shuah@kernel.org andrii@kernel.org kpsingh@kernel.org
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success net selftest script(s) already in Makefile
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 47 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Vladimir Oltean July 5, 2022, 5:31 p.m. UTC
Historically, DSA drivers have seen problems with the model in which
bridge VLANs work, particularly with them being offloaded to switchdev
asynchronously relative to when they become active (vlan_filtering=1).

This switchdev API peculiarity was papered over by commit 2ea7a679ca2a
("net: dsa: Don't add vlans when vlan filtering is disabled"), which
introduced other problems, fixed by commit 54a0ed0df496 ("net: dsa:
provide an option for drivers to always receive bridge VLANs") through
an opt-in ds->configure_vlan_while_not_filtering bool (which later
became an opt-out).

The point is that some DSA drivers still skip VLAN configuration while
VLAN-unaware, and there is a desire to get rid of that behavior.

It's hard to deduce from the wording "at least one corner case" what
Andrew saw, but my best guess is that there is a discrepancy of meaning
between bridge pvid and hardware port pvid which caused breakage.

On one side, the Linux bridge with vlan_filtering=0 is completely
VLAN-unaware, and will accept and process a packet the same way
irrespective of the VLAN groups on the ports or the bridge itself
(there may not even be a pvid, and this makes no difference).

On the other hand, DSA switches still do VLAN processing internally,
even with vlan_filtering disabled, but they are expected to classify all
packets to the port pvid. That pvid shouldn't be confused with the
bridge pvid, and there lies the problem.

When a switch port is under a VLAN-unaware bridge, the hardware pvid
must be explicitly managed by the driver to classify all received
packets to it, regardless of bridge VLAN groups. When under a VLAN-aware
bridge, the hardware pvid must be synchronized to the bridge port pvid.
To do this correctly, the pattern is unfortunately a bit complicated,
and involves hooking the pvid change logic into quite a few places
(the ones that change the input variables which determine the value to
use as hardware pvid for a port). See mv88e6xxx_port_commit_pvid(),
sja1105_commit_pvid(), ocelot_port_set_pvid() etc.

The point is that not all drivers used to do that, especially in older
kernels. If a driver is to blindly program a bridge pvid VLAN received
from switchdev while it's VLAN-unaware, this might in turn change the
hardware pvid used by a VLAN-unaware bridge port, which might result in
packet loss depending which other ports have that pvid too (in that same
note, it might also go unnoticed).

To capture that condition, it is sufficient to take a VLAN-unaware
bridge and change the [VLAN-aware] bridge pvid on a single port, to a
VID that isn't present on any other port. This shouldn't have absolutely
any effect on packet classification or forwarding. However, broken
drivers will take the bait, and change their PVID to 3, causing packet
loss.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
In case you see some apparently unrelated failures reported by
bridge_vlan_unaware.sh, it's good to be aware of some fixes sent earlier
this week to "net", which are hence absent from net-next currently:
https://patchwork.kernel.org/project/netdevbpf/cover/20220703073626.937785-1-vladimir.oltean@nxp.com/

 .../net/forwarding/bridge_vlan_unaware.sh     | 25 ++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

Comments

Ido Schimmel July 7, 2022, 12:13 p.m. UTC | #1
On Tue, Jul 05, 2022 at 08:31:12PM +0300, Vladimir Oltean wrote:
> Historically, DSA drivers have seen problems with the model in which
> bridge VLANs work, particularly with them being offloaded to switchdev
> asynchronously relative to when they become active (vlan_filtering=1).
> 
> This switchdev API peculiarity was papered over by commit 2ea7a679ca2a
> ("net: dsa: Don't add vlans when vlan filtering is disabled"), which
> introduced other problems, fixed by commit 54a0ed0df496 ("net: dsa:
> provide an option for drivers to always receive bridge VLANs") through
> an opt-in ds->configure_vlan_while_not_filtering bool (which later
> became an opt-out).
> 
> The point is that some DSA drivers still skip VLAN configuration while
> VLAN-unaware, and there is a desire to get rid of that behavior.
> 
> It's hard to deduce from the wording "at least one corner case" what
> Andrew saw, but my best guess is that there is a discrepancy of meaning
> between bridge pvid and hardware port pvid which caused breakage.
> 
> On one side, the Linux bridge with vlan_filtering=0 is completely
> VLAN-unaware, and will accept and process a packet the same way
> irrespective of the VLAN groups on the ports or the bridge itself
> (there may not even be a pvid, and this makes no difference).
> 
> On the other hand, DSA switches still do VLAN processing internally,
> even with vlan_filtering disabled, but they are expected to classify all
> packets to the port pvid. That pvid shouldn't be confused with the
> bridge pvid, and there lies the problem.
> 
> When a switch port is under a VLAN-unaware bridge, the hardware pvid
> must be explicitly managed by the driver to classify all received
> packets to it, regardless of bridge VLAN groups. When under a VLAN-aware
> bridge, the hardware pvid must be synchronized to the bridge port pvid.
> To do this correctly, the pattern is unfortunately a bit complicated,
> and involves hooking the pvid change logic into quite a few places
> (the ones that change the input variables which determine the value to
> use as hardware pvid for a port). See mv88e6xxx_port_commit_pvid(),
> sja1105_commit_pvid(), ocelot_port_set_pvid() etc.
> 
> The point is that not all drivers used to do that, especially in older
> kernels. If a driver is to blindly program a bridge pvid VLAN received
> from switchdev while it's VLAN-unaware, this might in turn change the
> hardware pvid used by a VLAN-unaware bridge port, which might result in
> packet loss depending which other ports have that pvid too (in that same
> note, it might also go unnoticed).
> 
> To capture that condition, it is sufficient to take a VLAN-unaware
> bridge and change the [VLAN-aware] bridge pvid on a single port, to a
> VID that isn't present on any other port. This shouldn't have absolutely
> any effect on packet classification or forwarding. However, broken
> drivers will take the bait, and change their PVID to 3, causing packet
> loss.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Tested-by: Ido Schimmel <idosch@nvidia.com>
Martin Blumenstingl July 7, 2022, 1:34 p.m. UTC | #2
Hi Vladimir,

On Tue, Jul 5, 2022 at 7:32 PM Vladimir Oltean <vladimir.oltean@nxp.com> wrote:
[...]
>  .../net/forwarding/bridge_vlan_unaware.sh     | 25 ++++++++++++++++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
While working on an OpenWrt package for these selftests I found that
this should be added to the Makefile (in the same directory as the
test).
That way it can be installed by a distribution using:
  make -C tools/testing/selftests/net/forwarding/ \
      INSTALL_PATH="some/install/path" \
      install

If you agree then I can also send patches for adding no_forwarding.sh
and local_termination.sh to that Makefile (which are the only two
files which are missing currently).


Best regards,
Martin
Vladimir Oltean July 7, 2022, 1:45 p.m. UTC | #3
On Thu, Jul 07, 2022 at 03:34:07PM +0200, Martin Blumenstingl wrote:
> Hi Vladimir,
> 
> On Tue, Jul 5, 2022 at 7:32 PM Vladimir Oltean <vladimir.oltean@nxp.com> wrote:
> [...]
> >  .../net/forwarding/bridge_vlan_unaware.sh     | 25 ++++++++++++++++---
> >  1 file changed, 22 insertions(+), 3 deletions(-)
> While working on an OpenWrt package for these selftests I found that
> this should be added to the Makefile (in the same directory as the
> test).
> That way it can be installed by a distribution using:
>   make -C tools/testing/selftests/net/forwarding/ \
>       INSTALL_PATH="some/install/path" \
>       install
> 
> If you agree then I can also send patches for adding no_forwarding.sh
> and local_termination.sh to that Makefile (which are the only two
> files which are missing currently).

Yes, please do that. I noticed that was the correct way of doing things,
but rsync was enough for my caveman testing so far...
Thanks again for doing this, I haven't forgotten about your other email
and will respond to it later today.
diff mbox series

Patch

diff --git a/tools/testing/selftests/net/forwarding/bridge_vlan_unaware.sh b/tools/testing/selftests/net/forwarding/bridge_vlan_unaware.sh
index 1c8a26046589..2b5700b61ffa 100755
--- a/tools/testing/selftests/net/forwarding/bridge_vlan_unaware.sh
+++ b/tools/testing/selftests/net/forwarding/bridge_vlan_unaware.sh
@@ -1,7 +1,7 @@ 
 #!/bin/bash
 # SPDX-License-Identifier: GPL-2.0
 
-ALL_TESTS="ping_ipv4 ping_ipv6 learning flooding"
+ALL_TESTS="ping_ipv4 ping_ipv6 learning flooding pvid_change"
 NUM_NETIFS=4
 source lib.sh
 
@@ -77,12 +77,16 @@  cleanup()
 
 ping_ipv4()
 {
-	ping_test $h1 192.0.2.2
+	local msg=$1
+
+	ping_test $h1 192.0.2.2 "$msg"
 }
 
 ping_ipv6()
 {
-	ping6_test $h1 2001:db8:1::2
+	local msg=$1
+
+	ping6_test $h1 2001:db8:1::2 "$msg"
 }
 
 learning()
@@ -95,6 +99,21 @@  flooding()
 	flood_test $swp2 $h1 $h2
 }
 
+pvid_change()
+{
+	# Test that the changing of the VLAN-aware PVID does not affect
+	# VLAN-unaware forwarding
+	bridge vlan add vid 3 dev $swp1 pvid untagged
+
+	ping_ipv4 " with bridge port $swp1 PVID changed"
+	ping_ipv6 " with bridge port $swp1 PVID changed"
+
+	bridge vlan del vid 3 dev $swp1
+
+	ping_ipv4 " with bridge port $swp1 PVID deleted"
+	ping_ipv6 " with bridge port $swp1 PVID deleted"
+}
+
 trap cleanup EXIT
 
 setup_prepare