mbox series

[v2,net-next,00/10] net: bridge: Multiple Spanning Trees

Message ID 20220301100321.951175-1-tobias@waldekranz.com (mailing list archive)
Headers show
Series net: bridge: Multiple Spanning Trees | expand

Message

Tobias Waldekranz March 1, 2022, 10:03 a.m. UTC
The bridge has had per-VLAN STP support for a while now, since:

https://lore.kernel.org/netdev/20200124114022.10883-1-nikolay@cumulusnetworks.com/

The current implementation has some problems:

- The mapping from VLAN to STP state is fixed as 1:1, i.e. each VLAN
  is managed independently. This is awkward from an MSTP (802.1Q-2018,
  Clause 13.5) point of view, where the model is that multiple VLANs
  are grouped into MST instances.

  Because of the way that the standard is written, presumably, this is
  also reflected in hardware implementations. It is not uncommon for a
  switch to support the full 4k range of VIDs, but that the pool of
  MST instances is much smaller. Some examples:

  Marvell LinkStreet (mv88e6xxx): 4k VLANs, but only 64 MSTIs
  Marvell Prestera: 4k VLANs, but only 128 MSTIs
  Microchip SparX-5i: 4k VLANs, but only 128 MSTIs

- By default, the feature is enabled, and there is no way to disable
  it. This makes it hard to add offloading in a backwards compatible
  way, since any underlying switchdevs have no way to refuse the
  function if the hardware does not support it

- The port-global STP state has precedence over per-VLAN states. In
  MSTP, as far as I understand it, all VLANs will use the common
  spanning tree (CST) by default - through traffic engineering you can
  then optimize your network to group subsets of VLANs to use
  different trees (MSTI). To my understanding, the way this is
  typically managed in silicon is roughly:

  Incoming packet:
  .----.----.--------------.----.-------------
  | DA | SA | 802.1Q VID=X | ET | Payload ...
  '----'----'--------------'----'-------------
                        |
                        '->|\     .----------------------------.
                           | +--> | VID | Members | ... | MSTI |
                   PVID -->|/     |-----|---------|-----|------|
                                  |   1 | 0001001 | ... |    0 |
                                  |   2 | 0001010 | ... |   10 |
                                  |   3 | 0001100 | ... |   10 |
                                  '----------------------------'
                                                             |
                               .-----------------------------'
                               |  .------------------------.
                               '->| MSTI | Fwding | Lrning |
                                  |------|--------|--------|
                                  |    0 | 111110 | 111110 |
                                  |   10 | 110111 | 110111 |
                                  '------------------------'

  What this is trying to show is that the STP state (whether MSTP is
  used, or ye olde STP) is always accessed via the VLAN table. If STP
  is running, all MSTI pointers in that table will reference the same
  index in the STP stable - if MSTP is running, some VLANs may point
  to other trees (like in this example).

  The fact that in the Linux bridge, the global state (think: index 0
  in most hardware implementations) is supposed to override the
  per-VLAN state, is very awkward to offload. In effect, this means
  that when the global state changes to blocking, drivers will have to
  iterate over all MSTIs in use, and alter them all to match. This
  also means that you have to cache whether the hardware state is
  currently tracking the global state or the per-VLAN state. In the
  first case, you also have to cache the per-VLAN state so that you
  can restore it if the global state transitions back to forwarding.

This series adds a new mst_enable bridge setting (as suggested by Nik)
that can only be changed when no VLANs are configured on the
bridge. Enabling this mode has the following effect:

- The port-global STP state is used to represent the CST (Common
  Spanning Tree) (1/10)

- Ingress STP filtering is deferred until the frame's VLAN has been
  resolved (1/10)

- The preexisting per-VLAN states can no longer be controlled directly
  (1/10). They are instead placed under the MST module's control,
  which is managed using a new netlink interface (described in 3/10)

- VLANs can br mapped to MSTIs in an arbitrary M:N fashion, using a
  new global VLAN option (2/10)

4-5/10 adds switchdev notifications so that a driver can track VID to
MSTI mappings and MST port states.

An offloading implementation is this provided for mv88e6xxx.

A proposal for the corresponding iproute2 interface is available here:

https://github.com/wkz/iproute2/tree/mst

Tobias Waldekranz (10):
  net: bridge: mst: Multiple Spanning Tree (MST) mode
  net: bridge: mst: Allow changing a VLAN's MSTI
  net: bridge: mst: Support setting and reporting MST port states
  net: bridge: mst: Notify switchdev drivers of VLAN MSTI migrations
  net: bridge: mst: Notify switchdev drivers of MST state changes
  net: dsa: Pass VLAN MSTI migration notifications to driver
  net: dsa: Pass MST state changes to driver
  net: dsa: mv88e6xxx: Disentangle STU from VTU
  net: dsa: mv88e6xxx: Export STU as devlink region
  net: dsa: mv88e6xxx: MST Offloading

 drivers/net/dsa/mv88e6xxx/chip.c        | 232 ++++++++++++++
 drivers/net/dsa/mv88e6xxx/chip.h        |  38 +++
 drivers/net/dsa/mv88e6xxx/devlink.c     |  94 ++++++
 drivers/net/dsa/mv88e6xxx/global1.h     |  10 +
 drivers/net/dsa/mv88e6xxx/global1_vtu.c | 311 ++++++++++--------
 include/net/dsa.h                       |   5 +
 include/net/switchdev.h                 |  17 +
 include/uapi/linux/if_bridge.h          |  17 +
 include/uapi/linux/if_link.h            |   1 +
 include/uapi/linux/rtnetlink.h          |   5 +
 net/bridge/Makefile                     |   2 +-
 net/bridge/br_input.c                   |  17 +-
 net/bridge/br_mst.c                     | 402 ++++++++++++++++++++++++
 net/bridge/br_netlink.c                 |  17 +-
 net/bridge/br_private.h                 |  31 ++
 net/bridge/br_stp.c                     |   3 +
 net/bridge/br_switchdev.c               |  57 ++++
 net/bridge/br_vlan.c                    |  20 +-
 net/bridge/br_vlan_options.c            |  24 +-
 net/dsa/dsa_priv.h                      |   3 +
 net/dsa/port.c                          |  40 +++
 net/dsa/slave.c                         |  12 +
 22 files changed, 1216 insertions(+), 142 deletions(-)
 create mode 100644 net/bridge/br_mst.c

Comments

Vladimir Oltean March 1, 2022, 4:21 p.m. UTC | #1
Hi Tobias,

On Tue, Mar 01, 2022 at 11:03:11AM +0100, Tobias Waldekranz wrote:
> A proposal for the corresponding iproute2 interface is available here:
> 
> https://github.com/wkz/iproute2/tree/mst

Please pardon my ignorance. Is there a user-mode STP protocol application
that supports MSTP, and that you've tested these patches with?
I'd like to give it a try.
Stephen Hemminger March 1, 2022, 5:19 p.m. UTC | #2
On Tue, 1 Mar 2022 18:21:42 +0200
Vladimir Oltean <olteanv@gmail.com> wrote:

> Hi Tobias,
> 
> On Tue, Mar 01, 2022 at 11:03:11AM +0100, Tobias Waldekranz wrote:
> > A proposal for the corresponding iproute2 interface is available here:
> > 
> > https://github.com/wkz/iproute2/tree/mst  
> 
> Please pardon my ignorance. Is there a user-mode STP protocol application
> that supports MSTP, and that you've tested these patches with?
> I'd like to give it a try.

https://github.com/mstpd/mstpd
Tobias Waldekranz March 1, 2022, 9:20 p.m. UTC | #3
On Tue, Mar 01, 2022 at 18:21, Vladimir Oltean <olteanv@gmail.com> wrote:
> Hi Tobias,
>
> On Tue, Mar 01, 2022 at 11:03:11AM +0100, Tobias Waldekranz wrote:
>> A proposal for the corresponding iproute2 interface is available here:
>> 
>> https://github.com/wkz/iproute2/tree/mst
>
> Please pardon my ignorance. Is there a user-mode STP protocol application
> that supports MSTP, and that you've tested these patches with?
> I'd like to give it a try.

I see that Stephen has already pointed you to mstpd in a sibling
message.

It is important to note though, that AFAIK mstpd does not actually
support MSTP on a vanilla Linux system. The protocol implementation is
in place, and they have a plugin architecture that makes it easy for people
to hook it up to various userspace SDKs and whatnot, but you can't use
it with a regular bridge.

A colleague of mine has been successfully running a modified version of
mstpd which was tailored for v1 of this series (RFC). But I do not
believe he has had the time to rework it for v2. That should mostly be a
matter of removing code though, as v2 allows you to manage the MSTIs
directly, rather than having to translate it to an associated VLAN.
Pavel Šimerda March 1, 2022, 10:30 p.m. UTC | #4
On 01/03/2022 22:20, Tobias Waldekranz wrote:
> On Tue, Mar 01, 2022 at 18:21, Vladimir Oltean <olteanv@gmail.com> wrote:
>> Hi Tobias,
>>
>> On Tue, Mar 01, 2022 at 11:03:11AM +0100, Tobias Waldekranz wrote:
>>> A proposal for the corresponding iproute2 interface is available here:
>>>
>>> https://github.com/wkz/iproute2/tree/mst
>>
>> Please pardon my ignorance. Is there a user-mode STP protocol application
>> that supports MSTP, and that you've tested these patches with?
>> I'd like to give it a try.
> 
> I see that Stephen has already pointed you to mstpd in a sibling
> message.
> 
> It is important to note though, that AFAIK mstpd does not actually
> support MSTP on a vanilla Linux system. The protocol implementation is
> in place, and they have a plugin architecture that makes it easy for people
> to hook it up to various userspace SDKs and whatnot, but you can't use
> it with a regular bridge.
> 
> A colleague of mine has been successfully running a modified version of
> mstpd which was tailored for v1 of this series (RFC). But I do not
> believe he has had the time to rework it for v2. That should mostly be a
> matter of removing code though, as v2 allows you to manage the MSTIs
> directly, rather than having to translate it to an associated VLAN.

Hello,

we experimented with mstpd with pretty reasonable kernel modifications. Vanilla kernel wasn't capable of transferring the correct mapping from mstpd to the hardware due to lack of vlan2msti mapping and per-msti port state (rather than just per-vlan port state).

https://github.com/mstpd/mstpd/pull/112

I didn't pursue this for a while, though.

Regards,
Pavel