[RFC,net-next,5/9] net: dsa: Track port PVIDs

Message ID	20210426170411.1789186-6-tobias@waldekranz.com (mailing list archive)
State	RFC
Delegated to:	Netdev Maintainers
Headers
Return-Path: <netdev-owner@kernel.org>
From: Tobias Waldekranz <tobias@waldekranz.com>
To: davem@davemloft.net, kuba@kernel.org
Cc: andrew@lunn.ch, vivien.didelot@gmail.com, f.fainelli@gmail.com, olteanv@gmail.com, roopa@nvidia.com, nikolay@nvidia.com, jiri@resnulli.us, idosch@idosch.org, stephen@networkplumber.org, netdev@vger.kernel.org, bridge@lists.linux-foundation.org
Subject: [RFC net-next 5/9] net: dsa: Track port PVIDs
Date: Mon, 26 Apr 2021 19:04:07 +0200
Message-Id: <20210426170411.1789186-6-tobias@waldekranz.com>
In-Reply-To: <20210426170411.1789186-1-tobias@waldekranz.com>
References: <20210426170411.1789186-1-tobias@waldekranz.com>
MIME-Version: 1.0
Organization: Westermo
Content-Transfer-Encoding: 8bit
Precedence: bulk
Series	net: bridge: Forward offloading
[RFC,net-next,0/9] net: bridge: Forward offloading
[RFC,net-next,1/9] net: dfwd: Constrain existing users to macvlan subordinates
[RFC,net-next,2/9] net: bridge: Disambiguate offload_fwd_mark
[RFC,net-next,3/9] net: bridge: switchdev: Recycle unused hwdoms
[RFC,net-next,4/9] net: bridge: switchdev: Forward offloading
[RFC,net-next,5/9] net: dsa: Track port PVIDs
[RFC,net-next,6/9] net: dsa: Forward offloading
[RFC,net-next,7/9] net: dsa: mv88e6xxx: Allocate a virtual DSA port for each bridge
[RFC,net-next,8/9] net: dsa: mv88e6xxx: Map virtual bridge port in PVT
[RFC,net-next,9/9] net: dsa: mv88e6xxx: Forward offloading

Tobias Waldekranz April 26, 2021, 5:04 p.m. UTC

In some scenarios a tagger must know which VLAN to assign to a packet,
even if the packet is set to egress untagged. Since the VLAN
information in the skb will be removed by the bridge in this case,
track each port's PVID such that the VID of an outgoing frame can
always be determined.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
---
 include/net/dsa.h |  1 +
 net/dsa/port.c    | 16 ++++++++++++++--
 2 files changed, 15 insertions(+), 2 deletions(-)

Vladimir Oltean April 26, 2021, 7:40 p.m. UTC | #1

Hi Tobias,

On Mon, Apr 26, 2021 at 07:04:07PM +0200, Tobias Waldekranz wrote:
> In some scenarios a tagger must know which VLAN to assign to a packet,
> even if the packet is set to egress untagged. Since the VLAN
> information in the skb will be removed by the bridge in this case,
> track each port's PVID such that the VID of an outgoing frame can
> always be determined.
> 
> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
> ---

Let me give you this real-life example:

#!/bin/bash

ip link add br0 type bridge vlan_filtering 1
for eth in eth0 eth1 swp2 swp3 swp4 swp5; do
	ip link set $eth up
	ip link set $eth master br0
done
ip link set br0 up

bridge vlan add dev eth0 vid 100 pvid untagged
bridge vlan del dev swp2 vid 1
bridge vlan del dev swp3 vid 1
bridge vlan add dev swp2 vid 100
bridge vlan add dev swp3 vid 100 untagged

This is reproducible on the NXP LS1021A-TSN board.
The bridge receives an untagged packet on eth0 and floods it.
It should reach swp2 and swp3, and be tagged on swp2, and untagged on
swp3 respectively.

With your idea of sending untagged frames towards the port's pvid,
wouldn't we be leaking this packet to VLAN 1, therefore towards ports
swp4 and swp5, and the real destination ports would not get this packet?
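
The VLAN membership that the script above sets up can be modelled in a
few lines of userspace C. This is only a sketch, not kernel code: the
struct, the table and flood_mask() are made up for illustration, and it
assumes every port starts out as a member of the bridge's default
VLAN 1.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the bridge VLAN table built by the script. */
struct port_vlans {
	const char *name;
	bool vid1;	/* member of VLAN 1 (assumed bridge default) */
	bool vid100;	/* member of VLAN 100 */
};

static const struct port_vlans ports[] = {
	{ "eth0", true,  true  },	/* vid 100 added, VLAN 1 kept */
	{ "eth1", true,  false },
	{ "swp2", false, true  },	/* vid 1 deleted, vid 100 tagged */
	{ "swp3", false, true  },	/* vid 1 deleted, vid 100 untagged */
	{ "swp4", true,  false },
	{ "swp5", true,  false },
};

#define NUM_PORTS (sizeof(ports) / sizeof(ports[0]))

/*
 * Return a bitmask (bit i stands for ports[i]) of the ports that a
 * flood in @vid reaches, excluding the source port @src.
 */
static unsigned int flood_mask(int vid, const char *src)
{
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < NUM_PORTS; i++) {
		bool member = (vid == 1 && ports[i].vid1) ||
			      (vid == 100 && ports[i].vid100);

		if (member && strcmp(ports[i].name, src) != 0)
			mask |= 1u << i;
	}

	return mask;
}
```

In this model, flood_mask(100, "eth0") selects only swp2 and swp3,
while flood_mask(1, "eth0") selects eth1, swp4 and swp5. The two sets
do not overlap, which is why injecting the frame into the wrong VLAN
would miss the intended ports entirely.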

Tobias Waldekranz April 26, 2021, 8:05 p.m. UTC | #2

On Mon, Apr 26, 2021 at 22:40, Vladimir Oltean <olteanv@gmail.com> wrote:
> With your idea of sending untagged frames towards the port's pvid,
> wouldn't we be leaking this packet to VLAN 1, therefore towards ports
> swp4 and swp5, and the real destination ports would not get this packet?

I am not sure I follow. The bridge would never send the packet to
swp{4,5} because should_deliver() rejects them (as usual). So it could
only be sent either to swp2 or swp3. In the case that swp3 is first in
the bridge's port list, it would be sent untagged, but the PVID would be
100 and the flooding would thus be limited to swp{2,3}.

You did make me realize that there is a fatal flaw in the current design
though: Using this approach, it is not possible to have multiple VLANs
configured to egress untagged out of one port. Rare, but allowed.

So the VLAN information will have to remain in the skb somehow. My
initial plan was actually to always send offloaded skbs tagged. I went
this route because I thought we already had all the information we
needed in the driver. It seems reasonable that skb->vlan_tci could
always be set for offloaded frames from a filtering bridge, no?

Vladimir Oltean April 26, 2021, 8:28 p.m. UTC | #3

On Mon, Apr 26, 2021 at 10:05:52PM +0200, Tobias Waldekranz wrote:
> I am not sure I follow. The bridge would never send the packet to
> swp{4,5} because should_deliver() rejects them (as usual). So it could
> only be sent either to swp2 or swp3. In the case that swp3 is first in
> the bridge's port list, it would be sent untagged, but the PVID would be
> 100 and the flooding would thus be limited to swp{2,3}.

Sorry, _I_ don't understand.

When you say that the PVID is 100, whose PVID is it, exactly? Is it the
pvid of the source port (aka eth0 in this example)? That's not what I
see, I see the pvid of the egress port (the Marvell device)...

So to reiterate: when you transmit a packet towards your hardware switch
which has br0 inside the sb_dev, how does the switch know in which VLAN
to forward that packet? As far as I am aware, when the bridge had
received the packet as untagged on eth0, it did not insert VLAN 100 into
the skb itself, so the bridge VLAN information is lost when delivering
the frame to the egress net device. Am I wrong?

Tobias Waldekranz April 27, 2021, 9:12 a.m. UTC | #4

On Mon, Apr 26, 2021 at 23:28, Vladimir Oltean <olteanv@gmail.com> wrote:
> Sorry, _I_ don't understand.
>
> When you say that the PVID is 100, whose PVID is it, exactly? Is it the
> pvid of the source port (aka eth0 in this example)? That's not what I
> see, I see the pvid of the egress port (the Marvell device)...

I meant the PVID of swp3.

In summary: This series incorrectly assumes that a port's PVID always
corresponds to the VID that should be assigned to untagged packets on
egress. This is wrong because PVID only specifies which VID to assign
packets to on ingress; it says nothing about policy on egress. Multiple
VIDs can also be configured to egress untagged on a given port. The VID
must thus be sent along with each packet in order for the driver to be
able to assign it to the correct VID.
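
To make the failure mode concrete, here is a minimal userspace sketch.
It is not code from the series; struct port_cfg and both helpers are
hypothetical names, and a pvid of 0 is used here to mean "no PVID".

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_VLANS 4

struct vlan_entry {
	uint16_t vid;
	bool untagged;		/* VID egresses untagged on this port */
};

/* Hypothetical per-port VLAN configuration. */
struct port_cfg {
	uint16_t pvid;		/* ingress-only setting, 0 = none */
	struct vlan_entry vlans[MAX_VLANS];
	int num_vlans;
};

/* Count the VIDs that are configured to egress untagged. */
static int untagged_vid_count(const struct port_cfg *p)
{
	int i, count = 0;

	for (i = 0; i < p->num_vlans; i++)
		if (p->vlans[i].untagged)
			count++;

	return count;
}

/*
 * Guessing "untagged egress frame => PVID" is only right when the
 * port has exactly one untagged VID and that VID is also the PVID.
 */
static bool pvid_guess_is_safe(const struct port_cfg *p)
{
	int i;

	if (untagged_vid_count(p) != 1)
		return false;

	for (i = 0; i < p->num_vlans; i++)
		if (p->vlans[i].untagged)
			return p->vlans[i].vid == p->pvid;

	return false;
}
```

A port with PVID 100 and VID 100 untagged passes the check. A port
with two untagged VIDs, or one like swp3 in the example (VID 100
untagged, no PVID), does not.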

> So to reiterate: when you transmit a packet towards your hardware switch
> which has br0 inside the sb_dev, how does the switch know in which VLAN
> to forward that packet? As far as I am aware, when the bridge had
> received the packet as untagged on eth0, it did not insert VLAN 100 into
> the skb itself, so the bridge VLAN information is lost when delivering
> the frame to the egress net device. Am I wrong?

VID 100 is inserted into skb->vlan_tci on ingress from eth0, in
br_vlan.c/__allowed_ingress. It is then cleared again in
br_vlan.c/br_handle_vlan if the egress port (swp3 in our example) is set
to egress the VID untagged.

The last step only clears skb->vlan_present though, the actual VID
information still resides in skb->vlan_tci. I tried just removing 5/9
from this series, and then sourced the VID from skb->vlan_tci for
untagged packets. It works like a charm! I think this is the way
forward.
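
The behaviour described above can be sketched with a stand-in for the
two skb fields. struct fake_skb and the fake_* helpers are hypothetical
userspace names; only the 12-bit VID mask is taken from the VLAN TCI
layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define VLAN_VID_MASK 0x0fff	/* low 12 bits of the TCI hold the VID */

/* Hypothetical stand-in for the two sk_buff fields discussed above. */
struct fake_skb {
	bool vlan_present;
	uint16_t vlan_tci;
};

/* Ingress classification: record the TCI and mark the tag present. */
static void fake_set_tag(struct fake_skb *skb, uint16_t tci)
{
	skb->vlan_tci = tci;
	skb->vlan_present = true;
}

/* Untagged egress: only the present flag is cleared, the TCI stays. */
static void fake_clear_tag(struct fake_skb *skb)
{
	skb->vlan_present = false;
}

/* What a tagger could do: read the VID regardless of the flag. */
static uint16_t fake_classified_vid(const struct fake_skb *skb)
{
	return skb->vlan_tci & VLAN_VID_MASK;
}
```

After fake_set_tag(&skb, 100) followed by fake_clear_tag(&skb), the
frame looks untagged but fake_classified_vid(&skb) still returns 100.
Nothing in this model tells a stale TCI from a valid one, which is the
reason for the question that follows.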

The question is if we need another bit of information to signal that
skb->vlan_tci contains valid information, but the packet should still be
considered untagged? This could also be used on Rx to carry priority
(PCP) information to the bridge.

Vladimir Oltean April 27, 2021, 9:27 a.m. UTC | #5

On Tue, Apr 27, 2021 at 11:12:56AM +0200, Tobias Waldekranz wrote:
> The question is if we need another bit of information to signal that
> skb->vlan_tci contains valid information, but the packet should still be
> considered untagged? This could also be used on Rx to carry priority
> (PCP) information to the bridge.

My expectation is that when you do this forwarding offload thing, the
bridge passes the classified VLAN down to the port driver, encoded
inside the accel_priv alongside the sb_dev somehow.

Vladimir Oltean April 27, 2021, 10:07 a.m. UTC | #6

On Tue, Apr 27, 2021 at 11:12:56AM +0200, Tobias Waldekranz wrote:
> In summary: This series incorrectly assumes that a port's PVID always
> corresponds to the VID that should be assigned to untagged packets on
> egress. This is wrong because PVID only specifies which VID to assign
> packets to on ingress; it says nothing about policy on egress. Multiple
> VIDs can also be configured to egress untagged on a given port. The VID
> must thus be sent along with each packet in order for the driver to be
> able to assign it to the correct VID.

So yes, I think you and I are on the same page now, in that the port
driver must not inject untagged packets into the port's PVID, since the
PVID is an ingress setting. Heck, the PVID might not even be installed
on the egress port, and that doesn't mean it shouldn't send untagged
packets, it only means it shouldn't receive them.

Just to be even more clear, this is what I think happens with your
change.

Untagged packets classified to VLAN 100 are reinterpreted by the port
driver as untagged, and sent to VLAN 1 (the PVID of the egress port).
What you said about should_deliver() doesn't matter/doesn't make sense,
because the offload forwarding domain contains all of swp2, swp3, swp4,
swp5. It is not per-VLAN. So the bridge has no idea that the port driver
will inject the packet with the wrong VLAN information. The packet
_will_ end up on the wrong ports, and it has hopped VLANs.

> VID 100 is inserted into skb->vlan_tci on ingress from eth0, in
> br_vlan.c/__allowed_ingress. It is then cleared again in
> br_vlan.c/br_handle_vlan if the egress port (swp3 in our example) is set
> to egress the VID untagged.
> 
> The last step only clears skb->vlan_present though, the actual VID
> information still resides in skb->vlan_tci. I tried just removing 5/9
> from this series, and then sourced the VID from skb->vlan_tci for
> untagged packets. It works like a charm! I think this is the way
> forward.
> 
> The question is if we need another bit of information to signal that
> skb->vlan_tci contains valid information, but the packet should still be
> considered untagged? This could also be used on Rx to carry priority
> (PCP) information to the bridge.

Either we add another bit of information, or we don't clear the VLAN
in this bit of code, if the port supports fwd offload:

br_handle_vlan:

	if (v->flags & BRIDGE_VLAN_INFO_UNTAGGED)
		__vlan_hwaccel_clear_tag(skb);

The expectation that the hardware handles VLAN popping on the egress of
individual ports (as part of the replication procedure) should be valid,
I guess. In the case of DSA, all packets sent between the DSA master and
the CPU port using fwd offload should be VLAN-tagged.
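
As a rough userspace sketch of the second option (skip the clearing
when the port does forward offload): the fwd_offload parameter and all
names below are hypothetical, and this is not the actual
br_handle_vlan() signature.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the skb and the egress VLAN entry. */
struct fake_skb {
	bool vlan_present;
	uint16_t vlan_tci;
};

struct egress_vlan {
	bool untagged;		/* models BRIDGE_VLAN_INFO_UNTAGGED */
};

/*
 * Egress handling sketch: keep the tag when the port offloads
 * forwarding, so the hardware can pop it per destination port.
 */
static void fake_handle_vlan(struct fake_skb *skb,
			     const struct egress_vlan *v,
			     bool fwd_offload)
{
	if (v->untagged && !fwd_offload)
		skb->vlan_present = false;
}
```

With fwd_offload set, a frame classified to VLAN 100 leaves this
function still tagged even though the VLAN is untagged on the port;
without it, the existing behaviour is kept.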

Tobias Waldekranz April 28, 2021, 11:10 p.m. UTC | #7

On Tue, Apr 27, 2021 at 13:07, Vladimir Oltean <olteanv@gmail.com> wrote:
> So yes, I think you and I are on the same page now, in that the port
> driver must not inject untagged packets into the port's PVID, since the
> PVID is an ingress setting. Heck, the PVID might not even be installed
> on the egress port, and that doesn't mean it shouldn't send untagged
> packets, it only means it shouldn't receive them.
>
> Just to be even more clear, this is what I think happens with your
> change.
>
> Untagged packets classified to VLAN 100 are reinterpreted by the port
> driver as untagged, and sent to VLAN 1 (the PVID of the egress port).
> What you said about should_deliver() doesn't matter/doesn't make sense,
> because the offload forwarding domain contains all of swp2, swp3, swp4,
> swp5. It is not per-VLAN. So the bridge has no idea that the port driver
> will inject the packet with the wrong VLAN information. The packet
> _will_ end up on the wrong ports, and it has hopped VLANs.

My brain's iproute2 simulator must have malfunctioned :) Anyway, we
agree that the current implementation only works for the common case
where there is a single untagged VID on a port that is also set as the
PVID.

> Either we add another bit of information, or we don't clear the VLAN
> in this bit of code, if the port supports fwd offload:
>
> br_handle_vlan:
>
> 	if (v->flags & BRIDGE_VLAN_INFO_UNTAGGED)
> 		__vlan_hwaccel_clear_tag(skb);
>
> The expectation that the hardware handles VLAN popping on the egress of
> individual ports (as part of the replication procedure) should be valid,
> I guess. In the case of DSA, all packets sent between the DSA master and
> the CPU port using fwd offload should be VLAN-tagged.

Yeah I agree that for this offload, it would be fine to always send
packets tagged. There are some things that might be helped by that extra
bit of info though:

- VLAN PCP. The switchdev and bridge could communicate the priority bits
  also for untagged packets, both on ingress and egress. This would
  maintain the priority up to a VLAN upper on top of the bridge, where
  you can use the standard {ingress,egress}-qos-map feature to map PCP
  to socket priority.

- TC. Right now, matching on VLANs is messy because there is no way to
  express "match VLAN1" in a filter that can be reused across a group of
  ports ("block" in TC parlance) where some may be untagged members and
  others are tagged. In hardware, the VLAN parser typically resides much
  earlier in the pipeline (way before reaching the bridge engine) so
  TCAMs can easily do these things.

But this is perhaps a separate job. Nothing stops us from going the
always-tagged-route now and adding "untagged awareness" to the stack
later on.
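
For reference, the PCP bits mentioned in the first item share the
16-bit TCI with the VID. A small sketch of packing and unpacking them
in userspace C follows. The tci_* helper names are made up; the masks
and the shift follow the 802.1Q TCI layout (3-bit PCP, 1-bit DEI,
12-bit VID).

```c
#include <stdint.h>

#define VLAN_PRIO_MASK	0xe000	/* PCP: top 3 bits of the TCI */
#define VLAN_PRIO_SHIFT	13
#define VLAN_VID_MASK	0x0fff	/* VID: low 12 bits of the TCI */

/* Build a TCI from a priority and a VID; the DEI bit is left at 0. */
static uint16_t tci_make(uint8_t pcp, uint16_t vid)
{
	return (uint16_t)(((pcp & 0x7) << VLAN_PRIO_SHIFT) |
			  (vid & VLAN_VID_MASK));
}

/* Extract the 3-bit priority from a TCI. */
static uint8_t tci_pcp(uint16_t tci)
{
	return (uint8_t)((tci & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT);
}

/* Extract the 12-bit VID from a TCI. */
static uint16_t tci_vid(uint16_t tci)
{
	return tci & VLAN_VID_MASK;
}
```

If the TCI travels with the frame even when it is considered untagged,
both values stay available to whoever consumes it.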
