diff mbox series

[net,1/2] net: dsa: Accept software VLANs for stacked interfaces

Message ID 20210308150405.3694678-2-tobias@waldekranz.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series net: dsa: Accept software VLANs for stacked interfaces | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net
netdev/subject_prefix success Link
netdev/cc_maintainers success CCed 7 of 7 maintainers
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 27 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link

Commit Message

Tobias Waldekranz March 8, 2021, 3:04 p.m. UTC
The dsa_slave_vlan_rx_{add,kill}_vid ndos are required for hardware
that can not control VLAN filtering per port, rather it is a device
global setting, in order to support VLAN uppers on non-bridged ports.

For hardware that can control VLAN filtering per port, it is perfectly
fine to fallback to software VLANs in this scenario. So, make sure
that this "error" does not leave the DSA layer as vlan_add_vid does
not know the meaning of it.

The blamed commit removed this exemption by not advertising the
feature if the driver did not implement VLAN offloading. But as we
know see, the assumption that if a driver supports VLAN offloading, it
will always use it, does not hold in certain edge cases.

Fixes: 9b236d2a69da ("net: dsa: Advertise the VLAN offload netdev ability only if switch supports it")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
---
 net/dsa/slave.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Vladimir Oltean March 8, 2021, 3:44 p.m. UTC | #1
On Mon, Mar 08, 2021 at 04:04:04PM +0100, Tobias Waldekranz wrote:
> The dsa_slave_vlan_rx_{add,kill}_vid ndos are required for hardware
> that can not control VLAN filtering per port, rather it is a device
> global setting, in order to support VLAN uppers on non-bridged ports.
> 
> For hardware that can control VLAN filtering per port, it is perfectly
> fine to fallback to software VLANs in this scenario. So, make sure
> that this "error" does not leave the DSA layer as vlan_add_vid does
> not know the meaning of it.
> 
> The blamed commit removed this exemption by not advertising the
> feature if the driver did not implement VLAN offloading. But as we
> know see, the assumption that if a driver supports VLAN offloading, it
> will always use it, does not hold in certain edge cases.
> 
> Fixes: 9b236d2a69da ("net: dsa: Advertise the VLAN offload netdev ability only if switch supports it")
> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
> ---

So these NDOs exist for drivers that need the 'rx-vlan-filter: on'
feature in ethtool -k, which can be due to any of the following reasons:
1. vlan_filtering_is_global = true, some ports are under a VLAN-aware
   bridge while others are standalone (this is what you described)
2. Hellcreek. This driver needs it because in standalone mode, it uses
   unique VLANs per port to ensure separation. For separation of untagged
   traffic, it uses different PVIDs for each port, and for separation of
   VLAN-tagged traffic, it never accepts 8021q uppers with the same vid
   on two ports.
3. the ports that are under a VLAN-aware bridge should also set this
   feature, for 8021q uppers having a VID not claimed by the bridge.
   In this case, the driver will essentially not even know that the VID
   is coming from the 8021q layer and not the bridge.

If a driver does not fall under any of the above 3 categories, there is
no reason why it should advertise the 'rx-vlan-filter' feature, therefore
no reason why it should implement these NDOs, and return -EOPNOTSUPP.

We are essentially saying the same thing, except what I propose is to
better manage the 'rx-vlan-filter' feature of the DSA net devices. After
your patches, the network stack still thinks that mv88e6xxx ports in
standalone mode have VLAN filtering enabled, which they don't. That
might be confusing. Not only that, but any other driver that is
VLAN-unaware in standalone mode will similarly have to ignore VLANs
coming from the 8021q layer, which may add uselessly add to their
complexity. Let me prepare an alternative patch series and let's see how
they compare against each other.

As far as I see, mv88e6xxx needs to treat the VLAN NDOs in case 3 only,
and DSA will do that without any sort of driver-level awareness. It's
all the other cases (standalone ports mode) that are bothering you.
Vladimir Oltean March 8, 2021, 5 p.m. UTC | #2
On Mon, Mar 08, 2021 at 05:44:46PM +0200, Vladimir Oltean wrote:
> On Mon, Mar 08, 2021 at 04:04:04PM +0100, Tobias Waldekranz wrote:
> > The dsa_slave_vlan_rx_{add,kill}_vid ndos are required for hardware
> > that can not control VLAN filtering per port, rather it is a device
> > global setting, in order to support VLAN uppers on non-bridged ports.
> > 
> > For hardware that can control VLAN filtering per port, it is perfectly
> > fine to fallback to software VLANs in this scenario. So, make sure
> > that this "error" does not leave the DSA layer as vlan_add_vid does
> > not know the meaning of it.
> > 
> > The blamed commit removed this exemption by not advertising the
> > feature if the driver did not implement VLAN offloading. But as we
> > know see, the assumption that if a driver supports VLAN offloading, it
> > will always use it, does not hold in certain edge cases.
> > 
> > Fixes: 9b236d2a69da ("net: dsa: Advertise the VLAN offload netdev ability only if switch supports it")
> > Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
> > ---
> 
> So these NDOs exist for drivers that need the 'rx-vlan-filter: on'
> feature in ethtool -k, which can be due to any of the following reasons:
> 1. vlan_filtering_is_global = true, some ports are under a VLAN-aware
>    bridge while others are standalone (this is what you described)
> 2. Hellcreek. This driver needs it because in standalone mode, it uses
>    unique VLANs per port to ensure separation. For separation of untagged
>    traffic, it uses different PVIDs for each port, and for separation of
>    VLAN-tagged traffic, it never accepts 8021q uppers with the same vid
>    on two ports.
> 3. the ports that are under a VLAN-aware bridge should also set this
>    feature, for 8021q uppers having a VID not claimed by the bridge.
>    In this case, the driver will essentially not even know that the VID
>    is coming from the 8021q layer and not the bridge.
> 
> If a driver does not fall under any of the above 3 categories, there is
> no reason why it should advertise the 'rx-vlan-filter' feature, therefore
> no reason why it should implement these NDOs, and return -EOPNOTSUPP.
> 
> We are essentially saying the same thing, except what I propose is to
> better manage the 'rx-vlan-filter' feature of the DSA net devices. After
> your patches, the network stack still thinks that mv88e6xxx ports in
> standalone mode have VLAN filtering enabled, which they don't. That
> might be confusing. Not only that, but any other driver that is
> VLAN-unaware in standalone mode will similarly have to ignore VLANs
> coming from the 8021q layer, which may add uselessly add to their
> complexity. Let me prepare an alternative patch series and let's see how
> they compare against each other.
> 
> As far as I see, mv88e6xxx needs to treat the VLAN NDOs in case 3 only,
> and DSA will do that without any sort of driver-level awareness. It's
> all the other cases (standalone ports mode) that are bothering you.

So I stopped from sending an alternative solution, because neither mine
nor yours will fix this situation:

ip link add link lan0 name lan0.100 type vlan id 100
ip addr add 192.168.100.1/24 dev lan0.100
ping 192.168.100.2 # should work
ip link add br0 type bridge vlan_filtering 0
ip link set lan0 master br0
ping 192.168.100.2 # should still work
ip link set br0 type bridge vlan_filtering 1
ping 192.168.100.2 # should still work

Basically my point is that you disregard the vlan_vid_add from the
lan0.100 upper now because you think you don't need it, but one day will
come when you will. We've had that problem for a very long while now
with bridge VLANs, and it wasn't even completely solved yet (that's why
ds->configure_vlan_while_not_filtering is still a thing). It's
fundamentally the same with VLANs added by the 8021q layer. I think you
should see what you can do to make mv88e6xxx stop complaining and accept
the VLANs from the 8021q uppers even if they aren't needed right away.
It's a lot easier that way, otherwise you will end up having to replay
them somehow.
Florian Fainelli March 8, 2021, 5:38 p.m. UTC | #3
On 3/8/21 9:00 AM, Vladimir Oltean wrote:
> On Mon, Mar 08, 2021 at 05:44:46PM +0200, Vladimir Oltean wrote:
>> On Mon, Mar 08, 2021 at 04:04:04PM +0100, Tobias Waldekranz wrote:
>>> The dsa_slave_vlan_rx_{add,kill}_vid ndos are required for hardware
>>> that can not control VLAN filtering per port, rather it is a device
>>> global setting, in order to support VLAN uppers on non-bridged ports.
>>>
>>> For hardware that can control VLAN filtering per port, it is perfectly
>>> fine to fallback to software VLANs in this scenario. So, make sure
>>> that this "error" does not leave the DSA layer as vlan_add_vid does
>>> not know the meaning of it.
>>>
>>> The blamed commit removed this exemption by not advertising the
>>> feature if the driver did not implement VLAN offloading. But as we
>>> know see, the assumption that if a driver supports VLAN offloading, it
>>> will always use it, does not hold in certain edge cases.
>>>
>>> Fixes: 9b236d2a69da ("net: dsa: Advertise the VLAN offload netdev ability only if switch supports it")
>>> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
>>> ---
>>
>> So these NDOs exist for drivers that need the 'rx-vlan-filter: on'
>> feature in ethtool -k, which can be due to any of the following reasons:
>> 1. vlan_filtering_is_global = true, some ports are under a VLAN-aware
>>    bridge while others are standalone (this is what you described)
>> 2. Hellcreek. This driver needs it because in standalone mode, it uses
>>    unique VLANs per port to ensure separation. For separation of untagged
>>    traffic, it uses different PVIDs for each port, and for separation of
>>    VLAN-tagged traffic, it never accepts 8021q uppers with the same vid
>>    on two ports.
>> 3. the ports that are under a VLAN-aware bridge should also set this
>>    feature, for 8021q uppers having a VID not claimed by the bridge.
>>    In this case, the driver will essentially not even know that the VID
>>    is coming from the 8021q layer and not the bridge.
>>
>> If a driver does not fall under any of the above 3 categories, there is
>> no reason why it should advertise the 'rx-vlan-filter' feature, therefore
>> no reason why it should implement these NDOs, and return -EOPNOTSUPP.
>>
>> We are essentially saying the same thing, except what I propose is to
>> better manage the 'rx-vlan-filter' feature of the DSA net devices. After
>> your patches, the network stack still thinks that mv88e6xxx ports in
>> standalone mode have VLAN filtering enabled, which they don't. That
>> might be confusing. Not only that, but any other driver that is
>> VLAN-unaware in standalone mode will similarly have to ignore VLANs
>> coming from the 8021q layer, which may add uselessly add to their
>> complexity. Let me prepare an alternative patch series and let's see how
>> they compare against each other.
>>
>> As far as I see, mv88e6xxx needs to treat the VLAN NDOs in case 3 only,
>> and DSA will do that without any sort of driver-level awareness. It's
>> all the other cases (standalone ports mode) that are bothering you.
> 
> So I stopped from sending an alternative solution, because neither mine
> nor yours will fix this situation:
> 
> ip link add link lan0 name lan0.100 type vlan id 100
> ip addr add 192.168.100.1/24 dev lan0.100
> ping 192.168.100.2 # should work
> ip link add br0 type bridge vlan_filtering 0
> ip link set lan0 master br0
> ping 192.168.100.2 # should still work
> ip link set br0 type bridge vlan_filtering 1
> ping 192.168.100.2 # should still work
> 
> Basically my point is that you disregard the vlan_vid_add from the
> lan0.100 upper now because you think you don't need it, but one day will
> come when you will. We've had that problem for a very long while now
> with bridge VLANs, and it wasn't even completely solved yet (that's why
> ds->configure_vlan_while_not_filtering is still a thing). It's
> fundamentally the same with VLANs added by the 8021q layer. I think you
> should see what you can do to make mv88e6xxx stop complaining and accept
> the VLANs from the 8021q uppers even if they aren't needed right away.
> It's a lot easier that way, otherwise you will end up having to replay
> them somehow.

Agreed.
--
Florian
Tobias Waldekranz March 8, 2021, 8 p.m. UTC | #4
On Mon, Mar 08, 2021 at 19:00, Vladimir Oltean <olteanv@gmail.com> wrote:
> On Mon, Mar 08, 2021 at 05:44:46PM +0200, Vladimir Oltean wrote:
>> On Mon, Mar 08, 2021 at 04:04:04PM +0100, Tobias Waldekranz wrote:
>> > The dsa_slave_vlan_rx_{add,kill}_vid ndos are required for hardware
>> > that can not control VLAN filtering per port, rather it is a device
>> > global setting, in order to support VLAN uppers on non-bridged ports.
>> > 
>> > For hardware that can control VLAN filtering per port, it is perfectly
>> > fine to fallback to software VLANs in this scenario. So, make sure
>> > that this "error" does not leave the DSA layer as vlan_add_vid does
>> > not know the meaning of it.
>> > 
>> > The blamed commit removed this exemption by not advertising the
>> > feature if the driver did not implement VLAN offloading. But as we
>> > know see, the assumption that if a driver supports VLAN offloading, it
>> > will always use it, does not hold in certain edge cases.
>> > 
>> > Fixes: 9b236d2a69da ("net: dsa: Advertise the VLAN offload netdev ability only if switch supports it")
>> > Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
>> > ---
>> 
>> So these NDOs exist for drivers that need the 'rx-vlan-filter: on'
>> feature in ethtool -k, which can be due to any of the following reasons:
>> 1. vlan_filtering_is_global = true, some ports are under a VLAN-aware
>>    bridge while others are standalone (this is what you described)
>> 2. Hellcreek. This driver needs it because in standalone mode, it uses
>>    unique VLANs per port to ensure separation. For separation of untagged
>>    traffic, it uses different PVIDs for each port, and for separation of
>>    VLAN-tagged traffic, it never accepts 8021q uppers with the same vid
>>    on two ports.
>> 3. the ports that are under a VLAN-aware bridge should also set this
>>    feature, for 8021q uppers having a VID not claimed by the bridge.
>>    In this case, the driver will essentially not even know that the VID
>>    is coming from the 8021q layer and not the bridge.
>> 
>> If a driver does not fall under any of the above 3 categories, there is
>> no reason why it should advertise the 'rx-vlan-filter' feature, therefore
>> no reason why it should implement these NDOs, and return -EOPNOTSUPP.
>> 
>> We are essentially saying the same thing, except what I propose is to
>> better manage the 'rx-vlan-filter' feature of the DSA net devices. After
>> your patches, the network stack still thinks that mv88e6xxx ports in
>> standalone mode have VLAN filtering enabled, which they don't. That
>> might be confusing. Not only that, but any other driver that is

Alright, we do not want to lie to the stack, got it...

>> VLAN-unaware in standalone mode will similarly have to ignore VLANs
>> coming from the 8021q layer, which may add uselessly add to their
>> complexity. Let me prepare an alternative patch series and let's see how
>> they compare against each other.
>> 
>> As far as I see, mv88e6xxx needs to treat the VLAN NDOs in case 3 only,
>> and DSA will do that without any sort of driver-level awareness. It's
>> all the other cases (standalone ports mode) that are bothering you.
>
> So I stopped from sending an alternative solution, because neither mine
> nor yours will fix this situation:
>
> ip link add link lan0 name lan0.100 type vlan id 100
> ip addr add 192.168.100.1/24 dev lan0.100
> ping 192.168.100.2 # should work
> ip link add br0 type bridge vlan_filtering 0
> ip link set lan0 master br0
> ping 192.168.100.2 # should still work
> ip link set br0 type bridge vlan_filtering 1
> ping 192.168.100.2 # should still work
>
> Basically my point is that you disregard the vlan_vid_add from the
> lan0.100 upper now because you think you don't need it, but one day will
> come when you will. We've had that problem for a very long while now
> with bridge VLANs, and it wasn't even completely solved yet (that's why
> ds->configure_vlan_while_not_filtering is still a thing). It's
> fundamentally the same with VLANs added by the 8021q layer. I think you
> should see what you can do to make mv88e6xxx stop complaining and accept
> the VLANs from the 8021q uppers even if they aren't needed right away.

...hang on, are we OK with lying or not? Yes, I guess?

> It's a lot easier that way, otherwise you will end up having to replay
> them somehow.

I think vlan_for_each should be enough to perform the replay when
toggling VLAN filtering on a port?

More importantly, there are other sequences that we do not guard against
today:

- Adding VID to a bridge port that is used on an 1Q upper of another
  bridged port.

    .100  br0
       \  / \
       lan0 lan1

    $ ip link add dev br0 type bridge vlan_filtering 1
    $ ip link add dev lan0.100 link lan0 type vlan id 100
    $ ip link set dev lan0 master br0
    $ ip link set dev lan1 master br0
    $ bridge vlan add dev lan1 vid 100 # This should fail

    After this sequence, the switch will forward VID 100 tagged frames
    between lan0 and lan1.

- Briding two ports that both have 1Q uppers using the same VID.

    .100  br0  .100
       \  / \  /
       lan0 lan1

    $ ip link add dev br0 type bridge vlan_filtering 1
    $ ip link add dev lan0.100 link lan0 type vlan id 100
    $ ip link add dev lan1.100 link lan1 type vlan id 100
    $ ip link set dev lan0 master br0
    $ ip link set dev lan1 master br0 # This should fail

    This is also allowed by DSA today, and produces the same switch
    config as the previous sequence.

So in summary:

- Try to design some generic VLAN validation that can be used when:
  - Adding VLANs to standalone ports.
  - Adding VLANs to bridged ports.
  - Toggling VLAN filtering on ports.
- Remove 1/2.
- Rework 2/2 to:
  - `return 0` when adding a VLAN to a non-bridged port, not -EOPNOTSUPP.
  - Lazy load/unload VIDs from VLAN uppers when toggling filtering on a
    port using vlan_for_each or similar.

Does that sound reasonable?

Are we still in net territory or is this more suited for net-next?
Vladimir Oltean March 8, 2021, 8:50 p.m. UTC | #5
On Mon, Mar 08, 2021 at 09:00:49PM +0100, Tobias Waldekranz wrote:
> Alright, we do not want to lie to the stack, got it...

[...]

> ...hang on, are we OK with lying or not? Yes, I guess?

I'm not too happy about it. The problem in my mind, really, is that if
we disable 'rx-vlan-filter' and we gain an 8021q upper in the meantime,
we'll lose the .ndo_vlan_rx_add_vid call for it. This is worse in my
opinion than saying you're going to drop unknown VLANs but not actually
doing it.

> > It's a lot easier that way, otherwise you will end up having to replay
> > them somehow.
> 
> I think vlan_for_each should be enough to perform the replay when
> toggling VLAN filtering on a port?

Yes, good point about vlan_for_each, I didn't notice that, since almost
nobody uses it, and absolutely nobody uses it for replaying VLANs in the
RX filter, but it looks like it might be able to do the trick.

> More importantly, there are other sequences that we do not guard against
> today:
> 
> - Adding VID to a bridge port that is used on an 1Q upper of another
>   bridged port.
> 
>     .100  br0
>        \  / \
>        lan0 lan1
> 
>     $ ip link add dev br0 type bridge vlan_filtering 1
>     $ ip link add dev lan0.100 link lan0 type vlan id 100
>     $ ip link set dev lan0 master br0
>     $ ip link set dev lan1 master br0
>     $ bridge vlan add dev lan1 vid 100 # This should fail
> 
>     After this sequence, the switch will forward VID 100 tagged frames
>     between lan0 and lan1.

Yes, this is not caught today. Should be trivially fixed by iterating
over all dp->bridge_dev lowers in dsa_slave_vlan_add, when calling
dsa_slave_vlan_check_for_8021q_uppers, not just for the specified port.

> - Briding two ports that both have 1Q uppers using the same VID.
> 
>     .100  br0  .100
>        \  / \  /
>        lan0 lan1
> 
>     $ ip link add dev br0 type bridge vlan_filtering 1
>     $ ip link add dev lan0.100 link lan0 type vlan id 100
>     $ ip link add dev lan1.100 link lan1 type vlan id 100
>     $ ip link set dev lan0 master br0
>     $ ip link set dev lan1 master br0 # This should fail
> 
>     This is also allowed by DSA today, and produces the same switch
>     config as the previous sequence.

Correct, this is also not caught.
In this case it looks like there isn't even an attempt to validate the
VLAN configuration of the ports already in the bridge. We would probably
have to hook into dsa_port_bridge_join, iterate through all the VLAN
uppers of the new port, then for each VLAN upper we should construct a
fake struct switchdev_obj_port_vlan and call dsa_slave_vlan_check_for_8021q_uppers
again for all lowers of the bridge which we're about to join that are
DSA ports. Patches welcome!

> So in summary:
> 
> - Try to design some generic VLAN validation that can be used when:
>   - Adding VLANs to standalone ports.
>   - Adding VLANs to bridged ports.
>   - Toggling VLAN filtering on ports.

What do you mean 'generic'?

> - Remove 1/2.
> - Rework 2/2 to:
>   - `return 0` when adding a VLAN to a non-bridged port, not -EOPNOTSUPP.

Still in mv88e6xxx you mean? Well, if mv88e6xxx is still not going to
install the VLAN to hardware, why would it lie to DSA and return 0?

>   - Lazy load/unload VIDs from VLAN uppers when toggling filtering on a
>     port using vlan_for_each or similar.

How do you plan to do it exactly? Hook into dsa_port_vlan_filtering and:
if vlan_filtering == false, then do vlan_for_each(..., dsa_slave_vlan_rx_kill_vid)
if vlan_filtering == true, then do vlan_for_each(..., dsa_slave_vlan_rx_add_vid)?
Basically when VLAN filtering is disabled, the VTU will only contain the
bridge VLANs and none of the 8021q VLANs?

If we make this happen, then my patches for runtime toggling
'rx-vlan-filter' should also be needed.

> Does that sound reasonable?
> 
> Are we still in net territory or is this more suited for net-next?

It'll be a lot of patches, but the base logic is there already, so I
think we could still target 'net'.
Tobias Waldekranz March 8, 2021, 10:32 p.m. UTC | #6
On Mon, Mar 08, 2021 at 22:50, Vladimir Oltean <olteanv@gmail.com> wrote:
> On Mon, Mar 08, 2021 at 09:00:49PM +0100, Tobias Waldekranz wrote:
>> Alright, we do not want to lie to the stack, got it...
>
> [...]
>
>> ...hang on, are we OK with lying or not? Yes, I guess?
>
> I'm not too happy about it. The problem in my mind, really, is that if
> we disable 'rx-vlan-filter' and we gain an 8021q upper in the meantime,
> we'll lose the .ndo_vlan_rx_add_vid call for it. This is worse in my
> opinion than saying you're going to drop unknown VLANs but not actually
> doing it.
>
>> > It's a lot easier that way, otherwise you will end up having to replay
>> > them somehow.
>> 
>> I think vlan_for_each should be enough to perform the replay when
>> toggling VLAN filtering on a port?
>
> Yes, good point about vlan_for_each, I didn't notice that, since almost
> nobody uses it, and absolutely nobody uses it for replaying VLANs in the
> RX filter, but it looks like it might be able to do the trick.
>
>> More importantly, there are other sequences that we do not guard against
>> today:
>> 
>> - Adding VID to a bridge port that is used on an 1Q upper of another
>>   bridged port.
>> 
>>     .100  br0
>>        \  / \
>>        lan0 lan1
>> 
>>     $ ip link add dev br0 type bridge vlan_filtering 1
>>     $ ip link add dev lan0.100 link lan0 type vlan id 100
>>     $ ip link set dev lan0 master br0
>>     $ ip link set dev lan1 master br0
>>     $ bridge vlan add dev lan1 vid 100 # This should fail
>> 
>>     After this sequence, the switch will forward VID 100 tagged frames
>>     between lan0 and lan1.
>
> Yes, this is not caught today. Should be trivially fixed by iterating
> over all dp->bridge_dev lowers in dsa_slave_vlan_add, when calling
> dsa_slave_vlan_check_for_8021q_uppers, not just for the specified port.
>
>> - Briding two ports that both have 1Q uppers using the same VID.
>> 
>>     .100  br0  .100
>>        \  / \  /
>>        lan0 lan1
>> 
>>     $ ip link add dev br0 type bridge vlan_filtering 1
>>     $ ip link add dev lan0.100 link lan0 type vlan id 100
>>     $ ip link add dev lan1.100 link lan1 type vlan id 100
>>     $ ip link set dev lan0 master br0
>>     $ ip link set dev lan1 master br0 # This should fail
>> 
>>     This is also allowed by DSA today, and produces the same switch
>>     config as the previous sequence.
>
> Correct, this is also not caught.
> In this case it looks like there isn't even an attempt to validate the
> VLAN configuration of the ports already in the bridge. We would probably
> have to hook into dsa_port_bridge_join, iterate through all the VLAN
> uppers of the new port, then for each VLAN upper we should construct a
> fake struct switchdev_obj_port_vlan and call dsa_slave_vlan_check_for_8021q_uppers
> again for all lowers of the bridge which we're about to join that are
> DSA ports. Patches welcome!
>
>> So in summary:
>> 
>> - Try to design some generic VLAN validation that can be used when:
>>   - Adding VLANs to standalone ports.
>>   - Adding VLANs to bridged ports.
>>   - Toggling VLAN filtering on ports.
>
> What do you mean 'generic'?

I get the sense that one reason that the mentioned cases are not caught
by the existing validation logic, is that checks are scattered in
multiple places (primarily dsa_slave_check_8021q_upper and
dsa_port_can_apply_vlan_filtering).

Ideally we should have a single function that answers the question
"given the current VLAN config, is it OK to make this one modification?"

This is all still very hand-waivy though, it might not be possible.

>> - Remove 1/2.
>> - Rework 2/2 to:
>>   - `return 0` when adding a VLAN to a non-bridged port, not -EOPNOTSUPP.
>
> Still in mv88e6xxx you mean? Well, if mv88e6xxx is still not going to
> install the VLAN to hardware, why would it lie to DSA and return 0?

Because we want the core to add the `struct vlan_vid_info` to our port's
vlan list so that we can lazy load it if/when needed. But maybe your
dynamic rx-vlan-filter patch will render that unnecessary?

>>   - Lazy load/unload VIDs from VLAN uppers when toggling filtering on a
>>     port using vlan_for_each or similar.
>
> How do you plan to do it exactly? Hook into dsa_port_vlan_filtering and:
> if vlan_filtering == false, then do vlan_for_each(..., dsa_slave_vlan_rx_kill_vid)
> if vlan_filtering == true, then do vlan_for_each(..., dsa_slave_vlan_rx_add_vid)?

I was just going to hook in to mv88e6xxx_port_vlan_filtering, call
vlan_for_each and generate an mv88e6xxx-internal call to add the
VIDs to the port and the CPU port. This does rely on rx-vlan-filter
always being enabled as the VLAN will be setup on all DSA ports at that
point, just not on any user ports.

> Basically when VLAN filtering is disabled, the VTU will only contain the
> bridge VLANs and none of the 8021q VLANs?

Yes, the switch won't be able to use the 1Q VLANs for anything useful
anyway.

> If we make this happen, then my patches for runtime toggling
> 'rx-vlan-filter' should also be needed.

I am fine with that, but I think that means that we need to solve the
replay at the DSA layer in order to setup DSA ports correctly.

>> Does that sound reasonable?
>> 
>> Are we still in net territory or is this more suited for net-next?
>
> It'll be a lot of patches, but the base logic is there already, so I
> think we could still target 'net'.
Vladimir Oltean March 8, 2021, 10:52 p.m. UTC | #7
On Mon, Mar 08, 2021 at 11:32:57PM +0100, Tobias Waldekranz wrote:
> I get the sense that one reason that the mentioned cases are not caught
> by the existing validation logic, is that checks are scattered in
> multiple places (primarily dsa_slave_check_8021q_upper and
> dsa_port_can_apply_vlan_filtering).
>
> Ideally we should have a single function that answers the question
> "given the current VLAN config, is it OK to make this one modification?"
>
> This is all still very hand-waivy though, it might not be possible.

Nope, they are not caught because they are odd corner cases that we
haven't encountered in real life usage.

> >> - Remove 1/2.
> >> - Rework 2/2 to:
> >>   - `return 0` when adding a VLAN to a non-bridged port, not -EOPNOTSUPP.
> >
> > Still in mv88e6xxx you mean? Well, if mv88e6xxx is still not going to
> > install the VLAN to hardware, why would it lie to DSA and return 0?
>
> Because we want the core to add the `struct vlan_vid_info` to our port's
> vlan list so that we can lazy load it if/when needed. But maybe your
> dynamic rx-vlan-filter patch will render that unnecessary?

Does struct mv88e6xxx_port have a VLAN list that I'm not aware of (a la
struct sja1105_private :: bridge_vlans)? If it doesn't, I was going to
send some patches to the DSA core anyway which manage the VLAN RX
filtering of the user interfaces dynamically.

> >>   - Lazy load/unload VIDs from VLAN uppers when toggling filtering on a
> >>     port using vlan_for_each or similar.
> >
> > How do you plan to do it exactly? Hook into dsa_port_vlan_filtering and:
> > if vlan_filtering == false, then do vlan_for_each(..., dsa_slave_vlan_rx_kill_vid)
> > if vlan_filtering == true, then do vlan_for_each(..., dsa_slave_vlan_rx_add_vid)?
>
> I was just going to hook in to mv88e6xxx_port_vlan_filtering, call
> vlan_for_each and generate an mv88e6xxx-internal call to add the
> VIDs to the port and the CPU port. This does rely on rx-vlan-filter
> always being enabled as the VLAN will be setup on all DSA ports at that
> point, just not on any user ports.

:D
Could you wait for a few hours and see what I'm cooking up first? There
is nothing there that is specific to mv88e6xxx, it is a problem we
should solve generally. The only issue with mv88e6xxx is that it has a
pet peeve to not offload VLANs in standalone mode, while others don't
care so much.

> > Basically when VLAN filtering is disabled, the VTU will only contain the
> > bridge VLANs and none of the 8021q VLANs?
>
> Yes, the switch won't be able to use the 1Q VLANs for anything useful
> anyway.
>
> > If we make this happen, then my patches for runtime toggling
> > 'rx-vlan-filter' should also be needed.
>
> I am fine with that, but I think that means that we need to solve the
> replay at the DSA layer in order to setup DSA ports correctly.

Yup, testing them right now.
diff mbox series

Patch

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 992fcab4b552..64d330f138f5 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1346,6 +1346,12 @@  static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
 	/* User port... */
 	ret = dsa_port_vlan_add(dp, &vlan, &extack);
 	if (ret) {
+		if (ret == -EOPNOTSUPP)
+			/* Driver allows the configuration, but is not
+			 * offloading it, which is fine by us.
+			 */
+			goto add_to_master;
+
 		if (extack._msg)
 			netdev_err(dev, "%s\n", extack._msg);
 		return ret;
@@ -1360,6 +1366,7 @@  static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
 		return ret;
 	}
 
+add_to_master:
 	return vlan_vid_add(master, proto, vid);
 }
 
@@ -1379,7 +1386,7 @@  static int dsa_slave_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
 	 * ports which can be members of this VLAN as well.
 	 */
 	err = dsa_port_vlan_del(dp, &vlan);
-	if (err)
+	if (err && err != -EOPNOTSUPP)
 		return err;
 
 	vlan_vid_del(master, proto, vid);