diff mbox series

[net,v1] net: dsa: lantiq_gswip: Fix FDB add/remove on the CPU port

Message ID 20220630212703.3280485-1-martin.blumenstingl@googlemail.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net,v1] net: dsa: lantiq_gswip: Fix FDB add/remove on the CPU port | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 22 this patch: 22
netdev/cc_maintainers fail 1 blamed authors not CCed: f.fainelli@gmail.com; 1 maintainers not CCed: f.fainelli@gmail.com
netdev/build_clang success Errors and warnings before: 9 this patch: 9
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 19 this patch: 19
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 37 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Martin Blumenstingl June 30, 2022, 9:27 p.m. UTC
There's no bridge available when adding or removing a static FDB entry
for (towards) the CPU port. This is intentional because the CPU port is
always considered standalone, even if technically for the GSWIP driver
it's part of every bridge.

Handling FDB add/remove on the CPU port fixes the following message
during boot in OpenWrt:
  port 4 failed to add <LAN MAC address> vid 1 to fdb: -22
as well as the following message during system shutdown:
  port 4 failed to delete <LAN MAC address> vid 1 from fdb: -22

Use FID 0 (which is also the "default" FID) when adding/removing an FDB
entry for the CPU port. Testing with a BT Home Hub 5A shows that this
"default" FID works as expected:
- traffic from/to LAN (ports in a bridge) is not seen on the WAN port
  (standalone port)
- traffic from/to the WAN port (standalone port) is not seen on the LAN
  (ports in a bridge) ports
- traffic from/to LAN is not seen on another LAN port with a different
  VLAN
- traffic from/to one LAN port to another is still seen

Fixes: 58c59ef9e930c4 ("net: dsa: lantiq: Add Forwarding Database access")
Cc: stable@vger.kernel.org
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
---
This patch is "minimalistic" on purpose: the goal is to have it
backported to Linux 5.15. Linux 5.15 doesn't have
dsa_fdb_present_in_other_db() or struct dsa_db yet. Once this patch has
been accepted I will work on implementing FDB isolation for the Lantiq
GSWIP driver.

Hauke, I hope I considered all test-cases which you find relevant. If not
then please let me know.


 drivers/net/dsa/lantiq_gswip.c | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

Comments

Vladimir Oltean July 1, 2022, 1:01 p.m. UTC | #1
Hi Martin,

On Thu, Jun 30, 2022 at 11:27:03PM +0200, Martin Blumenstingl wrote:
> There's no bridge available when adding or removing a static FDB entry
> for (towards) the CPU port. This is intentional because the CPU port is
> always considered standalone, even if technically for the GSWIP driver
> it's part of every bridge.
> 
> Handling FDB add/remove on the CPU port fixes the following message
> during boot in OpenWrt:
>   port 4 failed to add <LAN MAC address> vid 1 to fdb: -22
> as well as the following message during system shutdown:
>   port 4 failed to delete <LAN MAC address> vid 1 from fdb: -22
> 
> Use FID 0 (which is also the "default" FID) when adding/removing an FDB
> entry for the CPU port.

What does "default" FID even mean, and why is the default FID relevant?

> Testing with a BT Home Hub 5A shows that this "default" FID works as
> expected:
> - traffic from/to LAN (ports in a bridge) is not seen on the WAN port
>   (standalone port)

Why is this fact relevant to the change in any way? By saying that the
FID of FDB entries installed towards the CPU doesn't affect the
forwarding isolation between a bridged and a standalone port, you're
effectively only saying that you're silencing a warning and you're not
doing any harm. But you aren't explaining how the commit is doing any
good, either (hint: it isn't).

> - traffic from/to the WAN port (standalone port) is not seen on the LAN
>   (ports in a bridge) ports

Same

> - traffic from/to LAN is not seen on another LAN port with a different
>   VLAN

Same

> - traffic from/to one LAN port to another is still seen

Same

> 
> Fixes: 58c59ef9e930c4 ("net: dsa: lantiq: Add Forwarding Database access")

I guess you don't understand the problem. That commit can't be wrong,
since it dates from v5.2, but DSA only started calling port_fdb_add() on
a CPU port at all since commit
d5f19486cee7 ("net: dsa: listen for SWITCHDEV_{FDB,DEL}_ADD_TO_DEVICE on foreign bridge neighbors") (v5.12)
- and technically, that was opt-in, and the technique only started to become more widespread with commits
81a619f78759 ("net: dsa: include fdb entries pointing to bridge in the host fdb list"),
10fae4ac89ce ("net: dsa: include bridge addresses which are local in the host fdb list") and
3068d466a67e ("net: dsa: sync static FDB entries on foreign interfaces to hardware")
(all appeared in v5.14).
Also, the most recent application of the "port_fdb_add() on CPU ports" technique was introduced in commit
5e8a1e03aa4d ("net: dsa: install secondary unicast and multicast addresses as host FDB/MDB")
(v5.18). But that is also more or less opt-in, since the driver needs to
declare support for FDB isolation to make use of it.

> Cc: stable@vger.kernel.org

We don't CC stable for patches that go through the "net" tree, the
networking maintainers send weekly pull requests and the patches get
automatically backported from there to the relevant and still-not-EOL
stable branches, based on the Fixes: tag. That's why it's important that
you fill that in correctly.

> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> ---
> This patch is "minimalistic" on purpose: the goal is to have it
> backported to Linux 5.15. Linux 5.15 doesn't have dsa_fdb_present_in_other_db()
> or struct dsa_db yet. Once this patch has been accepted I will work on
> implementing FDB isolation for the Lantiq GSWIP driver.

Don't you want to go the other way around, first understand what is the
real problem, its impact and the correct solution, then figure out what
and how can be backported, and _where_?

I'm willing to help.

First it should be understood why DSA bothers to install FDB entries on
the CPU port in the first place. It does so because there is a largeish
class of switches where the MAC source addresses of traffic originating
from Linux are not learned by the hardware. As such, packets being targeted
_towards_ Linux interfaces will not find an entry in the FDB, and will
be flooded. This can be seen if you have a system with swp0 and swp1
both under br0, and the station attached to swp0 pings br0. The ICMP
requests will also be visible by the station attached to swp1.

It's hard to say whether this is the case or not for gswip, but this has
been going on for years and years. Not really a functional (connectivity)
problem, but nonetheless undesirable.

Recently (5.12-5.14), DSA started to address the problem by accepting
the fact that MAC SA learning won't necessarily take place for packets
xmitted by Linux, and just populating the FDB by itself on the CPU port.

At first, FDB isolation and the concept of FIDs was not very well understood,
so DSA would simply pass on the bridge local FDB entries to port_fdb_add()
with little concern as to which bridge actually offloaded those addresses.
The particular driver implementation of mv88e6xxx may have shaped that
decision in large part, because
(a) if the FDB entry had a non-zero VID, that VID could be uniquely
    traced back to a FID, since that driver does not allow the same VID
    to be added to more than 1 VLAN-aware bridge
(b) if the FDB entry had a VID of 0 (aka the entry is supposed to match
    only on MAC DA, for when the port is VLAN-unaware), the driver did
    not have FDB isolation (different FIDs) between one VLAN-unaware
    bridge and another VLAN-unaware bridge. Just one FID for standalone
    and VLAN-unaware bridge ports, and one FID per bridge VLAN.

Yet, the above was incorrect because it ironically did not consider
drivers such as the gswip which have FDB isolation implemented, but not
exposed to the DSA core, because the DSA core doesn't understand FDB
isolation.

From my reading of the gswip driver, it allocates a "single port bridge"
for each standalone port (effectively a VLAN which contains only port X
and the CPU port, with VID=0 and FID=X+1). It also allocates one FID for
each VLAN-unaware bridge, and one FID for each bridge VLAN. In other
words, what more could you want.

My understanding of the "if (!bridge) return -EINVAL" code that you're
deleting is this: the gswip driver needs to map the FDB entry to a FID
(simply put, it's asking: "On which packets should the FDB entry match?
Having which VLAN, and coming from which ports?"), yet the DSA core
simply isn't providing enough information.

You're deleting that, and saying: FID 0.

FID 0, what?

I'm not at all clear on how (if at all) is FID 0 used by this driver.
The "single port bridges" use a FID in range 1 to max_ports (port + 1),
whereas gswip_vlan_active_create() allocates a fid in range max_ports to
ARRAY_SIZE(priv->vlans). Neither of those is 0.

So while you may program FDB entries in FID 0, they will probably not
match any packet. So packets would still be flooded as before.
So what you're doing
(a) is useless
(b) consumes FDB space for nothing

So I consider that introducing host FDB entries without some gating
condition was a mistake. We thought we could wing it and mass-migrate a
bunch of drivers to include new functionality, and now we can't probably
fix that up, since some would probably perceive it as being a
regression.

Yet, in a strange way, it appears that it isn't the development of new
core features that draws people's attention, but the harmless kernel log
error messages. So in a way, I don't feel so bad that now I have your
attention?

In any case, I recommend you to first set up a test bench where you
actually see a difference between packets being flooded to the CPU vs
matching an FDB entry targeting it. Then read up a bit what the provided
dsa_db argument wants from port_fdb_add(). This conversation with Alvin
should explain a few things.
https://patchwork.kernel.org/project/netdevbpf/cover/20220302191417.1288145-1-vladimir.oltean@nxp.com/#24763870

Then have a patch (set) lifting the "return -EINVAL" from gswip *properly*.
And only then do we get to ask the questions "how bad are things for
linux-5.18.y? how bad are they for linux-5.15.y? what do we need to do?".

> Hauke, I hope I considered all test-cases which you find relevant. If not
> then please let me know.
> 
> 
>  drivers/net/dsa/lantiq_gswip.c | 29 ++++++++++++++++++-----------
>  1 file changed, 18 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/dsa/lantiq_gswip.c b/drivers/net/dsa/lantiq_gswip.c
> index e531b93f3cb2..9dab28903af0 100644
> --- a/drivers/net/dsa/lantiq_gswip.c
> +++ b/drivers/net/dsa/lantiq_gswip.c
> @@ -1365,19 +1365,26 @@ static int gswip_port_fdb(struct dsa_switch *ds, int port,
>  	int i;
>  	int err;
>  
> -	if (!bridge)
> -		return -EINVAL;
> -
> -	for (i = max_ports; i < ARRAY_SIZE(priv->vlans); i++) {
> -		if (priv->vlans[i].bridge == bridge) {
> -			fid = priv->vlans[i].fid;
> -			break;
> +	if (bridge) {
> +		for (i = max_ports; i < ARRAY_SIZE(priv->vlans); i++) {
> +			if (priv->vlans[i].bridge == bridge) {
> +				fid = priv->vlans[i].fid;
> +				break;
> +			}
>  		}
> -	}
>  
> -	if (fid == -1) {
> -		dev_err(priv->dev, "Port not part of a bridge\n");
> -		return -EINVAL;
> +		if (fid == -1) {
> +			dev_err(priv->dev, "Port not part of a bridge\n");
> +			return -EINVAL;
> +		}
> +	} else if (dsa_is_cpu_port(ds, port)) {
> +		/* Use FID 0 which is the "default" and used as fallback. This
> +		 * is not used by any standalone port or a bridge, so we can
> +		 * safely use it for the CPU port.
> +		 */
> +		fid = 0;
> +	} else {
> +		return -EOPNOTSUPP;
>  	}
>  
>  	mac_bridge.table = GSWIP_TABLE_MAC_BRIDGE;
> -- 
> 2.37.0
>
Martin Blumenstingl July 2, 2022, 5:43 p.m. UTC | #2
Hi Vladimir,

On Fri, Jul 1, 2022 at 3:02 PM Vladimir Oltean <olteanv@gmail.com> wrote:
[...]
> > Use FID 0 (which is also the "default" FID) when adding/removing an FDB
> > entry for the CPU port.
>
> What does "default" FID even mean, and why is the default FID relevant?
The GSW140 datasheet [0] (which is for a newer IP than the one we are
targeting currently with the GSWIP driver - but I am not aware of any
older datasheets) page 78 mentions: "By default the FID is zero and
all entries belong to shared VLAN learning."
Further down you mention that I probably don't understand the problem,
which is probably true - so I'll cut things short here.

[...]
> > Fixes: 58c59ef9e930c4 ("net: dsa: lantiq: Add Forwarding Database access")
>
> I guess you don't understand the problem. That commit can't be wrong,
> since it dates from v5.2, but DSA only started calling port_fdb_add() on
> a CPU port at all since commit
> d5f19486cee7 ("net: dsa: listen for SWITCHDEV_{FDB,DEL}_ADD_TO_DEVICE on foreign bridge neighbors") (v5.12)
> - and technically, that was opt-in, and the technique only started to become more widespread with commits
> 81a619f78759 ("net: dsa: include fdb entries pointing to bridge in the host fdb list"),
> 10fae4ac89ce ("net: dsa: include bridge addresses which are local in the host fdb list") and
> 3068d466a67e ("net: dsa: sync static FDB entries on foreign interfaces to hardware")
> (all appeared in v5.14).
OK, this makes sense as we're not seeing these warnings in 5.10
Initially I thought that "just" the printk level was changed from
DEBUG to something higher - but it seems that this observation is
incorrect.

> Also, the most recent application of the "port_fdb_add() on CPU ports" technique was introduced in commit
> 5e8a1e03aa4d ("net: dsa: install secondary unicast and multicast addresses as host FDB/MDB")
> (v5.18). But that is also more or less opt-in, since the driver needs to
> declare support for FDB isolation to make use of it.
I did find the FDB isolation changes in 5.18 but I am not sure yet how
to integrate it into the GSWIP driver.

> > Cc: stable@vger.kernel.org
>
> We don't CC stable for patches that go through the "net" tree, the
> networking maintainers send weekly pull requests and the patches get
> automatically backported from there to the relevant and still-not-EOL
> stable branches, based on the Fixes: tag. That's why it's important that
> you fill that in correctly.
sorry, I'll clean it up in v2

> > Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> > ---
> > This patch is "minimalistic" on purpose: the goal is to have it
> > backported to Linux 5.15. Linux 5.15 doesn't have dsa_fdb_present_in_other_db()
> > or struct dsa_db yet. Once this patch has been accepted I will work on
> > implementing FDB isolation for the Lantiq GSWIP driver.
>
> Don't you want to go the other way around, first understand what is the
> real problem, its impact and the correct solution, then figure out what
> and how can be backported, and _where_?
>
> I'm willing to help.
I thought that I understood the problem but this is clearly not the
case. Thanks for offering your help!

> First it should be understood why DSA bothers to install FDB entries on
> the CPU port in the first place. It does so because there is a largeish
> class of switches where the MAC source addresses of traffic originating
> from Linux are not learned by the hardware. As such, packets being targeted
> _towards_ Linux interfaces will not find an entry in the FDB, and will
> be flooded. This can be seen if you have a system with swp0 and swp1
> both under br0, and the station attached to swp0 pings br0. The ICMP
> requests will also be visible by the station attached to swp1.
>
> It's hard to say whether this is the case or not for gswip, but this has
> been going on for years and years. Not really a functional (connectivity)
> problem, but nonetheless undesirable.
I think for GSWIP it's not flooding packets to all ports.
For testing I had one device connected to LAN1 (let's call this swp0,
in OpenWrt we actually name the port "lan1") and another one connected
to LAN4 (I'll again go with your example and call this swp1).
A ping from the device on swp0 to the IPv4 of br0 (called br-lan in
OpenWrt) was not visible in wireshark on the other device on swp1.
This is with and without this patch.

If needed I can re-test this with Linux 5.10 to make sure that none of
the DSA changes had any impact on this behavior.

[...]
> Yet, in a strange way, it appears that it isn't the development of new
> core features that draws people's attention, but the harmless kernel log
> error messages. So in a way, I don't feel so bad that now I have your
> attention?
To give you some insight how I found this warning:
I updated the Lantiq target in OpenWrt to Linux 5.15 - until then I
didn't think we had to make any GSWIP changes because "everything
worked fine".
The first feedback I got for this was basically "does the Ethernet
switch still work with the new warnings?".

Also there's no need to feel bad for this message. I'm having trouble
understanding how just one switch IP (GSWIP) works. Maintaining a
subsystem which fits many switch IPs must be a different beast.

> In any case, I recommend you to first set up a test bench where you
> actually see a difference between packets being flooded to the CPU vs
> matching an FDB entry targeting it. Then read up a bit what the provided
> dsa_db argument wants from port_fdb_add(). This conversation with Alvin
> should explain a few things.
> https://patchwork.kernel.org/project/netdevbpf/cover/20220302191417.1288145-1-vladimir.oltean@nxp.com/#24763870
I previously asked Hauke whether the RX tag (net/dsa/tag_gswip.c) has
some bit to indicate whether traffic is flooded - but to his knowledge
the switch doesn't provide this information.
So I am not sure what I can do in this case - do you have any pointers for me?

Also apologies if all of this is very obvious. So far I have only been
working on the xMII part of Ethernet drivers, meaning: I am totally
new to the FDB part.

> Then have a patch (set) lifting the "return -EINVAL" from gswip *properly*.
> And only then do we get to ask the questions "how bad are things for
> linux-5.18.y? how bad are they for linux-5.15.y? what do we need to do?".
agreed


Thanks again for your time and all these valuable hints Vladimir!
Martin


[0] https://assets.maxlinear.com/web/documents/617930_gsw140_ds_rev1.11.pdf
Vladimir Oltean July 2, 2022, 6:56 p.m. UTC | #3
On Sat, Jul 02, 2022 at 07:43:11PM +0200, Martin Blumenstingl wrote:
> Hi Vladimir,
> 
> On Fri, Jul 1, 2022 at 3:02 PM Vladimir Oltean <olteanv@gmail.com> wrote:
> [...]
> > > Use FID 0 (which is also the "default" FID) when adding/removing an FDB
> > > entry for the CPU port.
> >
> > What does "default" FID even mean, and why is the default FID relevant?
> The GSW140 datasheet [0] (which is for a newer IP than the one we are
> targeting currently with the GSWIP driver - but I am not aware of any
> older datasheets)

Thanks for the document! Really useful.

> page 78 mentions: "By default the FID is zero and all entries belong
> to shared VLAN learning."

Not talking about the hardware defaults when it's obvious the driver
changes those, in an attempt to comply to Linux networking expectations...

> > In any case, I recommend you to first set up a test bench where you
> > actually see a difference between packets being flooded to the CPU vs
> > matching an FDB entry targeting it. Then read up a bit what the provided
> > dsa_db argument wants from port_fdb_add(). This conversation with Alvin
> > should explain a few things.
> > https://patchwork.kernel.org/project/netdevbpf/cover/20220302191417.1288145-1-vladimir.oltean@nxp.com/#24763870
> I previously asked Hauke whether the RX tag (net/dsa/tag_gswip.c) has
> some bit to indicate whether traffic is flooded - but to his knowledge
> the switch doesn't provide this information.

Yeah, you generally won't find quite that level of detail even in more
advanced switches. Not that you need it...

> So I am not sure what I can do in this case - do you have any pointers for me?

Yes, I do.

gswip_setup has:

	/* Default unknown Broadcast/Multicast/Unicast port maps */
	gswip_switch_w(priv, BIT(cpu_port), GSWIP_PCE_PMAP1);
	gswip_switch_w(priv, BIT(cpu_port), GSWIP_PCE_PMAP2);
	gswip_switch_w(priv, BIT(cpu_port), GSWIP_PCE_PMAP3); <- replace BIT(cpu_port) with 0

If you can no longer ping, it means that flooding was how packets
reached the system.

It appears that what goes on is interesting.
The switch is configured to flood traffic that's unknown to the FDB only
to the CPU (notably not to other bridged ports).
In software, the packet reaches tag_gswip.c, where unlike the majority
of other DSA tagging protocols, we do not call dsa_default_offload_fwd_mark(skb).
Then, the packet reaches the software bridge, and the switch has
informed the bridge (via skb->offload_fwd_mark == 0) that the packet
hasn't been already flooded in hardware, so the software bridge needs to
do it (only if necessary, of course).

The software bridge floods the packet according to its own FDB. In your
case, the software bridge recognizes the MAC DA of the packet as being
equal to the MAC address of br0 itself, and so, it doesn't flood it,
just terminates it locally. This is true whether or not the switch
learned that address in its FDB on the CPU port.

> Also apologies if all of this is very obvious. So far I have only been
> working on the xMII part of Ethernet drivers, meaning: I am totally
> new to the FDB part.
> 
> > Then have a patch (set) lifting the "return -EINVAL" from gswip *properly*.
> > And only then do we get to ask the questions "how bad are things for
> > linux-5.18.y? how bad are they for linux-5.15.y? what do we need to do?".
> agreed
> 
> 
> Thanks again for your time and all these valuable hints Vladimir!
> Martin
> 
> 
> [0] https://assets.maxlinear.com/web/documents/617930_gsw140_ds_rev1.11.pdf

So if I'm right, the state of facts is quite "not broken" (quite the
other way around, I'm really impressed), although there are still
improvements to be made. Flooding could be offloaded to hardware, then
flooding to CPU could be turned off and controlled via port promiscuity.
This would save quite a few CPU cycles.
Martin Blumenstingl July 2, 2022, 10:53 p.m. UTC | #4
Hi Vladimir,

On Sat, Jul 2, 2022 at 8:56 PM Vladimir Oltean <olteanv@gmail.com> wrote:
>
> On Sat, Jul 02, 2022 at 07:43:11PM +0200, Martin Blumenstingl wrote:
> > Hi Vladimir,
> >
> > On Fri, Jul 1, 2022 at 3:02 PM Vladimir Oltean <olteanv@gmail.com> wrote:
> > [...]
> > > > Use FID 0 (which is also the "default" FID) when adding/removing an FDB
> > > > entry for the CPU port.
> > >
> > > What does "default" FID even mean, and why is the default FID relevant?
> > The GSW140 datasheet [0] (which is for a newer IP than the one we are
> > targeting currently with the GSWIP driver - but I am not aware of any
> > older datasheets)
>
> Thanks for the document! Really useful.
Great that this helps. Whenever you hear me contradicting statements
from that datasheet then please let me know. There have been subtle
changes between the different versions of the IP, so I may have to
double check with the vendor driver to see if things still apply to
older versions.

> > page 78 mentions: "By default the FID is zero and all entries belong
> > to shared VLAN learning."
>
> Not talking about the hardware defaults when it's obvious the driver
> changes those, in an attempt to comply to Linux networking expectations...
>
> > > In any case, I recommend you to first set up a test bench where you
> > > actually see a difference between packets being flooded to the CPU vs
> > > matching an FDB entry targeting it. Then read up a bit what the provided
> > > dsa_db argument wants from port_fdb_add(). This conversation with Alvin
> > > should explain a few things.
> > > https://patchwork.kernel.org/project/netdevbpf/cover/20220302191417.1288145-1-vladimir.oltean@nxp.com/#24763870
> > I previously asked Hauke whether the RX tag (net/dsa/tag_gswip.c) has
> > some bit to indicate whether traffic is flooded - but to his knowledge
> > the switch doesn't provide this information.
>
> Yeah, you generally won't find quite that level of detail even in more
> advanced switches. Not that you need it...
>
> > So I am not sure what I can do in this case - do you have any pointers for me?
>
> Yes, I do.
>
> gswip_setup has:
>
>         /* Default unknown Broadcast/Multicast/Unicast port maps */
>         gswip_switch_w(priv, BIT(cpu_port), GSWIP_PCE_PMAP1);
>         gswip_switch_w(priv, BIT(cpu_port), GSWIP_PCE_PMAP2);
>         gswip_switch_w(priv, BIT(cpu_port), GSWIP_PCE_PMAP3); <- replace BIT(cpu_port) with 0
>
> If you can no longer ping, it means that flooding was how packets
> reached the system.
I tried this but I can still ping OpenWrt's br-lan IP.

I zero'ed the GSWIP_PCE_PMAP2 register (which according to the
documentation is used for L2 multicast/broadcast flooding) as well,
which changes the behavior:
- once the br-lan (IP address: 192.168.1.14) interface is brought up
it cannot be ping'ed from a device connected to one of the switch
ports ("Destination Host Unreachable")
- I can ping a device connected to the switch from within OpenWrt
(meaning: ping from the CPU port to a device with IP 192.168.1.100 on
one of the switch port works)
- once I start the ping from within OpenWrt I immediately get replies
from OpenWrt to the other device

ping log:
    [similar messages omitted, only the icmp_seq is different]
    From 192.168.1.100 icmp_seq=87 Destination Host Unreachable
    [this is when I start "ping 192.168.1.100" from within OpenWrt)
    64 bytes from 192.168.1.14: icmp_seq=88 ttl=64 time=3016 ms
    64 bytes from 192.168.1.14: icmp_seq=89 ttl=64 time=2002 ms
    64 bytes from 192.168.1.14: icmp_seq=90 ttl=64 time=989 ms
    64 bytes from 192.168.1.14: icmp_seq=91 ttl=64 time=0.379 ms

I made sure that the changes from my patch are not applied:
# dmesg | grep " to fdb: -22" | wc -l
9

Also in case it's relevant: I added some printk's to
gswip_port_fdb_dump() (because I don't know how to differentiate
"hardware FDB" from "software FDB" entries in "bridge fdb show brport
lan1"):
The switch seems to learn the CPU port's MAC address automatically -
even before I issue "ping 192.168.1.100" (most likely due to something
in OpenWrt accessing the network).
The "static" flag is not set though (which is expected I think).

As a side-note: I think the comment is partially incorrect. At least
for the GSWIP IP revision which the driver is targeting,
GSWIP_PCE_PMAP1 is for the "monitoring" port. My understanding is that
this "monitoring port" is used with port mirroring (which the hardware
supports but we don't implement in the driver yet).

> It appears that what goes on is interesting.
> The switch is configured to flood traffic that's unknown to the FDB only
> to the CPU (notably not to other bridged ports).
> In software, the packet reaches tag_gswip.c, where unlike the majority
> of other DSA tagging protocols, we do not call dsa_default_offload_fwd_mark(skb).
> Then, the packet reaches the software bridge, and the switch has
> informed the bridge (via skb->offload_fwd_mark == 0) that the packet
> hasn't been already flooded in hardware, so the software bridge needs to
> do it (only if necessary, of course).
>
> The software bridge floods the packet according to its own FDB. In your
> case, the software bridge recognizes the MAC DA of the packet as being
> equal to the MAC address of br0 itself, and so, it doesn't flood it,
> just terminates it locally. This is true whether or not the switch
> learned that address in its FDB on the CPU port.
>
> > Also apologies if all of this is very obvious. So far I have only been
> > working on the xMII part of Ethernet drivers, meaning: I am totally
> > new to the FDB part.
> >
> > > Then have a patch (set) lifting the "return -EINVAL" from gswip *properly*.
> > > And only then do we get to ask the questions "how bad are things for
> > > linux-5.18.y? how bad are they for linux-5.15.y? what do we need to do?".
> > agreed
> >
> >
> > Thanks again for your time and all these valuable hints Vladimir!
> > Martin
> >
> >
> > [0] https://assets.maxlinear.com/web/documents/617930_gsw140_ds_rev1.11.pdf
>
> So if I'm right, the state of facts is quite "not broken" (quite the
> other way around, I'm really impressed), although there are still
> improvements to be made. Flooding could be offloaded to hardware, then
> flooding to CPU could be turned off and controlled via port promiscuity.
> This would save quite a few CPU cycles.
Hearing that things are not horribly broken is great!
Also saving a few CPU cycles would be awesome since this SoCs has a
500MHz MIPS 34Kc core with two VPEs (meaning: one core which supports
SMT - or "HT" as known in the Intel world). So any CPU cycle that can
be saved helps


Best regards,
Martin
Vladimir Oltean July 6, 2022, 9:06 p.m. UTC | #5
On Sun, Jul 03, 2022 at 12:53:45AM +0200, Martin Blumenstingl wrote:
> Hi Vladimir,
> 
> On Sat, Jul 2, 2022 at 8:56 PM Vladimir Oltean <olteanv@gmail.com> wrote:
> >
> > On Sat, Jul 02, 2022 at 07:43:11PM +0200, Martin Blumenstingl wrote:
> > > Hi Vladimir,
> > >
> > > On Fri, Jul 1, 2022 at 3:02 PM Vladimir Oltean <olteanv@gmail.com> wrote:
> > > [...]
> > > > > Use FID 0 (which is also the "default" FID) when adding/removing an FDB
> > > > > entry for the CPU port.
> > > >
> > > > What does "default" FID even mean, and why is the default FID relevant?
> > > The GSW140 datasheet [0] (which is for a newer IP than the one we are
> > > targeting currently with the GSWIP driver - but I am not aware of any
> > > older datasheets)
> >
> > Thanks for the document! Really useful.
> Great that this helps. Whenever you hear me contradicting statements
> from that datasheet then please let me know. There have been subtle
> changes between the different versions of the IP, so I may have to
> double check with the vendor driver to see if things still apply to
> older versions.
> 
> > > page 78 mentions: "By default the FID is zero and all entries belong
> > > to shared VLAN learning."
> >
> > Not talking about the hardware defaults when it's obvious the driver
> > changes those, in an attempt to comply to Linux networking expectations...
> >
> > > > In any case, I recommend you to first set up a test bench where you
> > > > actually see a difference between packets being flooded to the CPU vs
> > > > matching an FDB entry targeting it. Then read up a bit what the provided
> > > > dsa_db argument wants from port_fdb_add(). This conversation with Alvin
> > > > should explain a few things.
> > > > https://patchwork.kernel.org/project/netdevbpf/cover/20220302191417.1288145-1-vladimir.oltean@nxp.com/#24763870
> > > I previously asked Hauke whether the RX tag (net/dsa/tag_gswip.c) has
> > > some bit to indicate whether traffic is flooded - but to his knowledge
> > > the switch doesn't provide this information.
> >
> > Yeah, you generally won't find quite that level of detail even in more
> > advanced switches. Not that you need it...
> >
> > > So I am not sure what I can do in this case - do you have any pointers for me?
> >
> > Yes, I do.
> >
> > gswip_setup has:
> >
> >         /* Default unknown Broadcast/Multicast/Unicast port maps */
> >         gswip_switch_w(priv, BIT(cpu_port), GSWIP_PCE_PMAP1);
> >         gswip_switch_w(priv, BIT(cpu_port), GSWIP_PCE_PMAP2);
> >         gswip_switch_w(priv, BIT(cpu_port), GSWIP_PCE_PMAP3); <- replace BIT(cpu_port) with 0
> >
> > If you can no longer ping, it means that flooding was how packets
> > reached the system.
> I tried this but I can still ping OpenWrt's br-lan IP.

Yes, so I looked at the GSW140 documentation again and found:

Table 38 Special Tag Ingress Format
Bit 4 Force no learning (1 B = address is not learned, 0 B = ignore)

This header format is quite different compared to what is supported by
gswip_tag_xmit() - 8 bytes vs 4 - but if they're at all similar, the
GSWIP may have a similar "learn disable" bit per packet, which the
tag proto driver isn't setting => address learning takes place.

Please note that there is a reason why a "learn disable" bit exists.
To work properly from all angles, only traffic injected into the switch
from br-lan should be learned by the hardware. Traffic injected from
standalone ports shouldn't.

> 
> I zero'ed the GSWIP_PCE_PMAP2 register (which according to the
> documentation is used for L2 multicast/broadcast flooding) as well,
> which changes the behavior:
> - once the br-lan (IP address: 192.168.1.14) interface is brought up
> it cannot be ping'ed from a device connected to one of the switch
> ports ("Destination Host Unreachable")
> - I can ping a device connected to the switch from within OpenWrt
> (meaning: ping from the CPU port to a device with IP 192.168.1.100 on
> one of the switch port works)
> - once I start the ping from within OpenWrt I immediately get replies
> from OpenWrt to the other device
> 
> ping log:
>     [similar messages omitted, only the icmp_seq is different]
>     From 192.168.1.100 icmp_seq=87 Destination Host Unreachable
>     [this is when I start "ping 192.168.1.100" from within OpenWrt)
>     64 bytes from 192.168.1.14: icmp_seq=88 ttl=64 time=3016 ms
>     64 bytes from 192.168.1.14: icmp_seq=89 ttl=64 time=2002 ms
>     64 bytes from 192.168.1.14: icmp_seq=90 ttl=64 time=989 ms
>     64 bytes from 192.168.1.14: icmp_seq=91 ttl=64 time=0.379 ms
> 
> I made sure that the changes from my patch are not applied:
> # dmesg | grep " to fdb: -22" | wc -l
> 9
> 
> Also in case it's relevant: I added some printk's to
> gswip_port_fdb_dump() (because I don't know how to differentiate
> "hardware FDB" from "software FDB" entries in "bridge fdb show brport
> lan1"):
> The switch seems to learn the CPU port's MAC address automatically -
> even before I issue "ping 192.168.1.100" (most likely due to something
> in OpenWrt accessing the network).
> The "static" flag is not set though (which is expected I think).

Ok, so this wasn't too complicated it seems. Thanks for doing the tests.

There isn't a better way than to printk the FDB entries on the CPU port,
given that there isn't a netdev for that port through which we could
report .ndo_fdb_dump.

What exists is the concept of "devlink regions" and "devlink port regions".
These are named binary bits exposed by certain DSA drivers that are
dumped by user space over netlink, and interpreted by a vendor specific
tool.

Andrew Lunn wrote mv88e6xxx_dump for... mv88e6xxx
https://github.com/lunn/mv88e6xxx_dump

and that can be used to dump information for CPU ports. It includes
entire Address Translation Unit (ATU) entries, so the format is a quite
a bit more detailed than bridge FDB entries since it is hardware specific.

There are various forks of that tool for other pieces of hardware, like
sja1105_dump:
https://github.com/vladimiroltean/mv88e6xxx_dump

To my knowledge there hasn't been any successful attempt in unifying all
forks into a larger dsa_dump, although I've wanted to do that.

Mentioning this just in case you have some spare time to work on a small
little debugging tool. If you're fine with printk that's cool too.

> As a side-note: I think the comment is partially incorrect. At least
> for the GSWIP IP revision which the driver is targeting,
> GSWIP_PCE_PMAP1 is for the "monitoring" port. My understanding is that
> this "monitoring port" is used with port mirroring (which the hardware
> supports but we don't implement in the driver yet).
> 
> > It appears that what goes on is interesting.
> > The switch is configured to flood traffic that's unknown to the FDB only
> > to the CPU (notably not to other bridged ports).
> > In software, the packet reaches tag_gswip.c, where unlike the majority
> > of other DSA tagging protocols, we do not call dsa_default_offload_fwd_mark(skb).
> > Then, the packet reaches the software bridge, and the switch has
> > informed the bridge (via skb->offload_fwd_mark == 0) that the packet
> > hasn't been already flooded in hardware, so the software bridge needs to
> > do it (only if necessary, of course).
> >
> > The software bridge floods the packet according to its own FDB. In your
> > case, the software bridge recognizes the MAC DA of the packet as being
> > equal to the MAC address of br0 itself, and so, it doesn't flood it,
> > just terminates it locally. This is true whether or not the switch
> > learned that address in its FDB on the CPU port.
> >
> > > Also apologies if all of this is very obvious. So far I have only been
> > > working on the xMII part of Ethernet drivers, meaning: I am totally
> > > new to the FDB part.
> > >
> > > > Then have a patch (set) lifting the "return -EINVAL" from gswip *properly*.
> > > > And only then do we get to ask the questions "how bad are things for
> > > > linux-5.18.y? how bad are they for linux-5.15.y? what do we need to do?".
> > > agreed
> > >
> > >
> > > Thanks again for your time and all these valuable hints Vladimir!
> > > Martin
> > >
> > >
> > > [0] https://assets.maxlinear.com/web/documents/617930_gsw140_ds_rev1.11.pdf
> >
> > So if I'm right, the state of facts is quite "not broken" (quite the
> > other way around, I'm really impressed), although there are still
> > improvements to be made. Flooding could be offloaded to hardware, then
> > flooding to CPU could be turned off and controlled via port promiscuity.
> > This would save quite a few CPU cycles.
> 
> Hearing that things are not horribly broken is great!
> Also saving a few CPU cycles would be awesome since this SoCs has a
> 500MHz MIPS 34Kc core with two VPEs (meaning: one core which supports
> SMT - or "HT" as known in the Intel world). So any CPU cycle that can
> be saved helps

The first thing will be to get the driver to pass the existing
selftests. If you look at tools/testing/selftests/drivers/net/dsa/local_termination.sh,
that is what should ultimately pass, since it checks that the packets
that should be received are received, and the ones that shouldn't aren't.

But to get there we'll need to make smaller steps, like disable address
learning on standalone ports, isolate FDBs, maybe offload the bridge TX
forwarding process (in order to populate the "Force no learning" bit in
tag_gswip.c properly), and only then will the local_termination test
also pass. There's also some more work to do in the bridge driver, but
we're getting there slowly.

I'll give some more details about these things in the thread with the
selftest for the configure_vlan_while_not_filtering feature, there are
quite a few things to be said.
diff mbox series

Patch

diff --git a/drivers/net/dsa/lantiq_gswip.c b/drivers/net/dsa/lantiq_gswip.c
index e531b93f3cb2..9dab28903af0 100644
--- a/drivers/net/dsa/lantiq_gswip.c
+++ b/drivers/net/dsa/lantiq_gswip.c
@@ -1365,19 +1365,26 @@  static int gswip_port_fdb(struct dsa_switch *ds, int port,
 	int i;
 	int err;
 
-	if (!bridge)
-		return -EINVAL;
-
-	for (i = max_ports; i < ARRAY_SIZE(priv->vlans); i++) {
-		if (priv->vlans[i].bridge == bridge) {
-			fid = priv->vlans[i].fid;
-			break;
+	if (bridge) {
+		for (i = max_ports; i < ARRAY_SIZE(priv->vlans); i++) {
+			if (priv->vlans[i].bridge == bridge) {
+				fid = priv->vlans[i].fid;
+				break;
+			}
 		}
-	}
 
-	if (fid == -1) {
-		dev_err(priv->dev, "Port not part of a bridge\n");
-		return -EINVAL;
+		if (fid == -1) {
+			dev_err(priv->dev, "Port not part of a bridge\n");
+			return -EINVAL;
+		}
+	} else if (dsa_is_cpu_port(ds, port)) {
+		/* Use FID 0 which is the "default" and used as fallback. This
+		 * is not used by any standalone port or a bridge, so we can
+		 * safely use it for the CPU port.
+		 */
+		fid = 0;
+	} else {
+		return -EOPNOTSUPP;
 	}
 
 	mac_bridge.table = GSWIP_TABLE_MAC_BRIDGE;