mbox series

[v5,net-next,00/10] tag_8021q for Ocelot switches

Message ID 20210121023616.1696021-1-olteanv@gmail.com (mailing list archive)
Headers show
Series tag_8021q for Ocelot switches | expand

Message

Vladimir Oltean Jan. 21, 2021, 2:36 a.m. UTC
From: Vladimir Oltean <vladimir.oltean@nxp.com>

Changes in v5:
- Split patch series in half, removing PTP bits.
- Split previous monolithic patch "net: dsa: felix: add new VLAN-based
  tagger" into 3 smaller patches.
- Updated the sysfs documentation
- Made the tagger_lock per DSA switch tree instead of per DSA switch
- Using dsa_tree_notify instead of dsa_broadcast.

Changes in v4:
- Support simultaneous compilation of tag_ocelot.c and
  tag_ocelot_8021q.c.
- Support runtime switchover between the two taggers, by using
  echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging
- We are now actually performing cleanup instead of just probe-time
  setup, which is required for supporting tagger switchover.
- Now draining the CPU queues by continuously reading QS_XTR_READ, same
  as Ocelot, instead of one-time asserting QS_XTR_FLUSH, which actually
  needed a sleep to be effective.

Changes in v3:
Use a per-port bool is_dsa_8021q_cpu instead of a single dsa_8021q_cpu
variable, to be compatible with future work where there may be
potentially multiple tag_8021q CPU ports in a LAG.

Changes in v2:
Posted the entire rework necessary for PTP support using tag_8021q.c.
Added a larger audience to the series.



The Felix switch inside LS1028A has an issue. It has a 2.5G CPU port,
and the external ports, in the majority of use cases, run at 1G. This
means that, when the CPU injects traffic into the switch, it is very
easy to run into congestion. This is not to say that it is impossible to
enter congestion even with all ports running at the same speed, just
that the default configuration is already very prone to that by design.

Normally, the way to deal with that is using Ethernet flow control
(PAUSE frames).

However, this functionality is not working today with the ENETC - Felix
switch pair. The hardware issue is undergoing documentation right now as
an erratum within NXP, but several customers have been requesting a
reasonable workaround for it.

In truth, the LS1028A has 2 internal port pairs. The lack of flow control
is an issue only when NPI mode (Node Processor Interface, aka the mode
where the "CPU port module", which carries DSA-style tagged packets, is
connected to a regular Ethernet port) is used, and NPI mode is supported
by Felix on a single port.

In past BSPs, we have had setups where both internal port pairs were
enabled. We were advertising the following setup:

"data port"     "control port"
  (2.5G)            (1G)

   eno2             eno3
    ^                ^
    |                |
    | regular        | DSA-tagged
    | frames         | frames
    |                |
    v                v
   swp4             swp5

This works but is highly unpractical, due to NXP shifting the task of
designing a functional system (choosing which port to use, depending on
type of traffic required) up to the end user. The swpN interfaces would
have to be bridged with swp4, in order for the eno2 "data port" to have
access to the outside network. And the swpN interfaces would still be
capable of IP networking. So running a DHCP client would give us two IP
interfaces from the same subnet, one assigned to eno2, and the other to
swpN (0, 1, 2, 3).

Also, the dual port design doesn't scale. When attaching another DSA
switch to a Felix port, the end result is that the "data port" cannot
carry any meaningful data to the external world, since it lacks the DSA
tags required to traverse the sja1105 switches below. All that traffic
needs to go through the "control port".

So in newer BSPs there was a desire to simplify that setup, and only
have one internal port pair:

   eno2            eno3
    ^
    |
    | DSA-tagged    x disabled
    | frames
    |
    v
   swp4            swp5

However, this setup only exacerbates the issue of not having flow
control on the NPI port, since that is the only port now. Also, there
are use cases that still require the "data port", such as IEEE 802.1CB
(TSN stream identification doesn't work over an NPI port), source
MAC address learning over NPI, etc.

Again, there is a desire to keep the simplicity of the single internal
port setup, while regaining the benefits of having a dedicated data port
as well. And this series attempts to deliver just that.

So the NPI functionality is disabled conditionally. Its purpose was:
- To ensure individually addressable ports on TX. This can be replaced
  by using some designated VLAN tags which are pushed by the DSA tagger
  code, then removed by the switch (so they are invisible to the outside
  world and to the user).
- To ensure source port identification on RX. Again, this can be
  replaced by using some designated VLAN tags to encapsulate all RX
  traffic (each VLAN uniquely identifies a source port). The DSA tagger
  determines which port it was based on the VLAN number, then removes
  that header.
- To deliver PTP timestamps. This cannot be obtained through VLAN
  headers, so we need to take a step back and see how else we can do
  that. The Microchip Ocelot-1 (VSC7514 MIPS) driver performs manual
  injection/extraction from the CPU port module using register-based
  MMIO, and not over Ethernet. We will need to do the same from DSA,
  which makes this tagger a sort of hybrid between DSA and pure
  switchdev.

Vladimir Oltean (10):
  net: dsa: tag_8021q: add helpers to deduce whether a VLAN ID is RX or
    TX VLAN
  net: mscc: ocelot: export VCAP structures to include/soc/mscc
  net: mscc: ocelot: store a namespaced VCAP filter ID
  net: mscc: ocelot: reapply bridge forwarding mask on bonding
    join/leave
  net: mscc: ocelot: don't use NPI tag prefix for the CPU port module
  net: dsa: document the existing switch tree notifiers and add a new
    one
  net: dsa: allow changing the tag protocol via the "tagging" device
    attribute
  net: dsa: felix: convert to the new .{set,del}_tag_protocol DSA API
  net: dsa: add a second tagger for Ocelot switches based on tag_8021q
  net: dsa: felix: perform switch setup for tag_8021q

 Documentation/ABI/testing/sysfs-class-net-dsa |  11 +-
 MAINTAINERS                                   |   1 +
 drivers/net/dsa/ocelot/Kconfig                |   2 +
 drivers/net/dsa/ocelot/felix.c                | 486 ++++++++++++++++--
 drivers/net/dsa/ocelot/felix.h                |   2 +
 drivers/net/dsa/ocelot/felix_vsc9959.c        |   1 +
 drivers/net/dsa/ocelot/seville_vsc9953.c      |   1 +
 drivers/net/ethernet/mscc/ocelot.c            |  93 ++--
 drivers/net/ethernet/mscc/ocelot_flower.c     |   7 +-
 drivers/net/ethernet/mscc/ocelot_net.c        |   1 +
 drivers/net/ethernet/mscc/ocelot_vcap.c       |  19 +-
 drivers/net/ethernet/mscc/ocelot_vcap.h       | 295 +----------
 drivers/net/ethernet/mscc/ocelot_vsc7514.c    |   2 -
 include/linux/dsa/8021q.h                     |  14 +
 include/net/dsa.h                             |  23 +
 include/soc/mscc/ocelot.h                     |   6 +-
 include/soc/mscc/ocelot_vcap.h                | 297 +++++++++++
 net/dsa/Kconfig                               |  21 +-
 net/dsa/Makefile                              |   1 +
 net/dsa/dsa.c                                 |  20 +
 net/dsa/dsa2.c                                | 129 ++++-
 net/dsa/dsa_priv.h                            |  19 +
 net/dsa/master.c                              |  26 +-
 net/dsa/port.c                                |  44 +-
 net/dsa/slave.c                               |  35 +-
 net/dsa/switch.c                              |  88 ++++
 net/dsa/tag_8021q.c                           |  15 +-
 net/dsa/tag_ocelot_8021q.c                    |  68 +++
 28 files changed, 1306 insertions(+), 421 deletions(-)
 create mode 100644 net/dsa/tag_ocelot_8021q.c

Comments

Florian Fainelli Jan. 21, 2021, 3:41 a.m. UTC | #1
On 1/20/2021 6:36 PM, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> 
> Context: Ocelot switches put the injection/extraction frame header in
> front of the Ethernet header. When used in NPI mode, a DSA master would
> see junk instead of the destination MAC address, and it would most
> likely drop the packets. So the Ocelot frame header can have an optional
> prefix, which is just "ff:ff:ff:ff:ff:fe > ff:ff:ff:ff:ff:ff" padding
> put before the actual tag (still before the real Ethernet header) such
> that the DSA master thinks it's looking at a broadcast frame with a
> strange EtherType.
> 
> Unfortunately, a lesson learned in commit 69df578c5f4b ("net: mscc:
> ocelot: eliminate confusion between CPU and NPI port") seems to have
> been forgotten in the meanwhile.
> 
> The CPU port module and the NPI port have independent settings for the
> length of the tag prefix. However, the driver is using the same variable
> to program both of them.
> 
> There is no reason really to use any tag prefix with the CPU port
> module, since that is not connected to any Ethernet port. So this patch
> makes the inj_prefix and xtr_prefix variables apply only to the NPI
> port (which the switchdev ocelot_vsc7514 driver does not use).
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Florian Fainelli Jan. 21, 2021, 3:43 a.m. UTC | #2
On 1/20/2021 6:36 PM, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> 
> The existence of dsa_broadcast has generated some confusion in the past:
> https://www.mail-archive.com/netdev@vger.kernel.org/msg365042.html
> 
> So let's document the existing dsa_port_notify and dsa_broadcast
> functions and explain when each of them should be used.
> 
> Also, in fact, the in-between function has always been there but was
> lacking a name, and is the main reason for this patch: dsa_tree_notify.
> Refactor dsa_broadcast to use it.
> 
> This patch also moves dsa_broadcast (a top-level function) to dsa2.c,
> where it really belonged in the first place, but had no companion so it
> stood with dsa_port_notify.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>