diff mbox series

[net,3/3] net: dsa: Include tagger overhead when setting MTU for DSA and CPU ports

Message ID 20210524213313.1437891-4-andrew@lunn.ch (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series MTU fixes for mv88e6xxx | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net
netdev/subject_prefix success Link
netdev/cc_maintainers fail 1 blamed authors not CCed: chris.packham@alliedtelesis.co.nz; 3 maintainers not CCed: olteanv@gmail.com chris.packham@alliedtelesis.co.nz vivien.didelot@gmail.com
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch warning WARNING: suspect code indent for conditional statements (24, 40)
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link

Commit Message

Andrew Lunn May 24, 2021, 9:33 p.m. UTC
Same members of the Marvell Ethernet switches impose MTU restrictions
on ports used for connecting to the CPU or DSA. If the MTU is set too
low, tagged frames will be discarded. Ensure the tagger overhead is
included in setting the MTU for DSA and CPU ports.

Fixes: 1baf0fac10fb ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU")
Reported by: 曹煜 <cao88yu@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
---
 net/dsa/switch.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

Comments

Vladimir Oltean May 24, 2021, 10:04 p.m. UTC | #1
On Mon, May 24, 2021 at 11:33:13PM +0200, Andrew Lunn wrote:
> Same members of the Marvell Ethernet switches impose MTU restrictions
> on ports used for connecting to the CPU or DSA. If the MTU is set too
> low, tagged frames will be discarded. Ensure the tagger overhead is
> included in setting the MTU for DSA and CPU ports.
> 
> Fixes: 1baf0fac10fb ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU")
> Reported by: 曹煜 <cao88yu@gmail.com>
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> ---

Some switches account for the DSA tag automatically in hardware. So far
it has been the convention that if a switch doesn't do that, the driver
should, not DSA.

>  net/dsa/switch.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/net/dsa/switch.c b/net/dsa/switch.c
> index 9bf8e20ecdf3..48c737b0b802 100644
> --- a/net/dsa/switch.c
> +++ b/net/dsa/switch.c
> @@ -67,14 +67,26 @@ static bool dsa_switch_mtu_match(struct dsa_switch *ds, int port,
>  static int dsa_switch_mtu(struct dsa_switch *ds,
>  			  struct dsa_notifier_mtu_info *info)
>  {
> -	int port, ret;
> +	struct dsa_port *cpu_dp;
> +	int port, ret, overhead;
>  
>  	if (!ds->ops->port_change_mtu)
>  		return -EOPNOTSUPP;
>  
>  	for (port = 0; port < ds->num_ports; port++) {
>  		if (dsa_switch_mtu_match(ds, port, info)) {
> -			ret = ds->ops->port_change_mtu(ds, port, info->mtu);
> +			overhead = 0;
> +			if (dsa_is_cpu_port(ds, port)) {
> +				cpu_dp = dsa_to_port(ds, port);
> +				overhead = cpu_dp->tag_ops->overhead;
> +			}
> +			if (dsa_is_dsa_port(ds, port)) {
> +					cpu_dp = dsa_to_port(ds, port)->cpu_dp;
> +					overhead = cpu_dp->tag_ops->overhead;

Too Much Indentation.

> +			}
> +
> +			ret = ds->ops->port_change_mtu(ds, port,
> +						       info->mtu + overhead);
>  			if (ret)
>  				return ret;
>  		}
> -- 
> 2.31.1
>
Andrew Lunn May 25, 2021, 2:53 a.m. UTC | #2
On Mon, May 24, 2021 at 10:04:01PM +0000, Vladimir Oltean wrote:
> On Mon, May 24, 2021 at 11:33:13PM +0200, Andrew Lunn wrote:
> > Same members of the Marvell Ethernet switches impose MTU restrictions
> > on ports used for connecting to the CPU or DSA. If the MTU is set too
> > low, tagged frames will be discarded. Ensure the tagger overhead is
> > included in setting the MTU for DSA and CPU ports.
> > 
> > Fixes: 1baf0fac10fb ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU")
> > Reported by: 曹煜 <cao88yu@gmail.com>
> > Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> > ---
> 
> Some switches account for the DSA tag automatically in hardware. So far
> it has been the convention that if a switch doesn't do that, the driver
> should, not DSA.

O.K.

This is going to be a little bit interesting with Tobias's support for
changing the tag protocol. I need to look at the ordering.

	 Andrew
Vladimir Oltean May 25, 2021, 9:10 a.m. UTC | #3
On Tue, May 25, 2021 at 04:53:39AM +0200, Andrew Lunn wrote:
> On Mon, May 24, 2021 at 10:04:01PM +0000, Vladimir Oltean wrote:
> > On Mon, May 24, 2021 at 11:33:13PM +0200, Andrew Lunn wrote:
> > > Same members of the Marvell Ethernet switches impose MTU restrictions
> > > on ports used for connecting to the CPU or DSA. If the MTU is set too
> > > low, tagged frames will be discarded. Ensure the tagger overhead is
> > > included in setting the MTU for DSA and CPU ports.
> > > 
> > > Fixes: 1baf0fac10fb ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU")
> > > Reported by: 曹煜 <cao88yu@gmail.com>
> > > Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> > > ---
> > 
> > Some switches account for the DSA tag automatically in hardware. So far
> > it has been the convention that if a switch doesn't do that, the driver
> > should, not DSA.
> 
> O.K.
> 
> This is going to be a little bit interesting with Tobias's support for
> changing the tag protocol. I need to look at the ordering.

The dsa_switch_change_tag_proto() notifier handler already iterates
through user ports and calls dsa_slave_change_mtu(), which triggers the
whole shebang (calculates the largest_mtu of the ds, changes the master
MTU to that value plus the tagger overhead, emits a notifier for the CPU
port MTU change and another one for the user port MTU change, then
triggers the MTU bridge normalization logic if the MTU is in fact a
MRU).

So when the tagging protocol changes, you get re-notified to change the
MTU on the CPU port to the largest_mtu. That part should work correctly.

What could be interesting, and this is something I had to check, is to
see if the proper MTU values are propagated correctly to the DSA links.
dsa_slave_change_mtu() calls dsa_port_mtu_change() with
propagate_upstream == true for the CPU port (which is programmed with
the largest_mtu of the switch) and that triggers this matching logic:

static bool dsa_switch_mtu_match(struct dsa_switch *ds, int port,
				 struct dsa_notifier_mtu_info *info)
{
	if (ds->index == info->sw_index)
		return (port == info->port) || dsa_is_dsa_port(ds, port);

	if (!info->propagate_upstream)
		return false;

	if (dsa_is_dsa_port(ds, port) || dsa_is_cpu_port(ds, port)) <- returns true for all DSA links here
		return true;

	return false;
}

the "propagate_upstream" is a bit of a misnomer, since it is "propagate
to other switches" - what I really needed at the time, with
"propagate_upstream == false", was a way to send a targeted MTU change
(for the user port itself) that bypasses the cross-chip notifiers.

My updated cross-chip notifier simulator
(https://patchwork.kernel.org/project/netdevbpf/patch/20210222120248.1415075-1-olteanv@gmail.com/)
shows this "heat map" for a notifier emitted on the CPU port (port 0 of
switch 0) with propagate_upstream == true:

Heat map for test notifier emitted on sw0p0:

   sw0p0     sw0p1     sw0p2     sw0p3     sw0p4
[  cpu  ] [  user ] [  user ] [  dsa  ] [  user ]
[   x   ] [       ] [       ] [   x   ] [       ]
                                  |
                                  +---------+
                                            |
   sw1p0     sw1p1     sw1p2     sw1p3     sw1p4
[  user ] [  user ] [  user ] [  dsa  ] [  dsa  ]
[       ] [       ] [       ] [   x   ] [   x   ]
                                  |
                                  +---------+
                                            |
   sw2p0     sw2p1     sw2p2     sw2p3     sw2p4
[  user ] [  user ] [  user ] [  user ] [  dsa  ]
[       ] [       ] [       ] [       ] [   x   ]

So the largest_mtu in this case ends up being programmed in all DSA
links.

There are 2 potential problems I see:
(a) the largest_mtu is calculated as the maximum over all user ports of
    @ds. But since it is propagated in the entire tree, maybe it should
    be the maximum across the entire @dst, to avoid this situation:

ip link set sw0p1 mtu 9000
ip link set sw1p1 mtu 1500 # oops, this changes the largest_mtu of the CPU port, breaking termination for sw0p1

(b) I don't remember why I didn't make the targeted notifier
    (propagate_upstream == false) even more targeted towards only the
    port on which it was emitted. Instead DSA links of that switch are
    targeted too, and that is probably a mistake:

	if (ds->index == info->sw_index)
		return (port == info->port) || dsa_is_dsa_port(ds, port);

because if the DSA links of the entire dst were programmed in a previous
round to the largest_mtu via a "propagate_upstream == true" notification,
then the dsa_port_mtu_change(propagate_upstream == false) call that is
immediately upcoming will break the MTU on the one DSA link which is
chip-wise local to the dp whose MTU is changing right now.

Example:

ip link set sw0p1 mtu 9000
ip link set sw2p1 mtu 9000 # at this stage, sw0p1 and sw2p1 can talk to one another using jumbo frames
ip link set sw0p2 mtu 1500 # this programs the sw0p3 DSA link first to
                           # the largest_mtu of 9000, then reprograms it to 1500 with the
                           # "propagate_upstream == false" notifier, breaking communication between
                           # sw0p1 and sw2p1
diff mbox series

Patch

diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index 9bf8e20ecdf3..48c737b0b802 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -67,14 +67,26 @@  static bool dsa_switch_mtu_match(struct dsa_switch *ds, int port,
 static int dsa_switch_mtu(struct dsa_switch *ds,
 			  struct dsa_notifier_mtu_info *info)
 {
-	int port, ret;
+	struct dsa_port *cpu_dp;
+	int port, ret, overhead;
 
 	if (!ds->ops->port_change_mtu)
 		return -EOPNOTSUPP;
 
 	for (port = 0; port < ds->num_ports; port++) {
 		if (dsa_switch_mtu_match(ds, port, info)) {
-			ret = ds->ops->port_change_mtu(ds, port, info->mtu);
+			overhead = 0;
+			if (dsa_is_cpu_port(ds, port)) {
+				cpu_dp = dsa_to_port(ds, port);
+				overhead = cpu_dp->tag_ops->overhead;
+			}
+			if (dsa_is_dsa_port(ds, port)) {
+					cpu_dp = dsa_to_port(ds, port)->cpu_dp;
+					overhead = cpu_dp->tag_ops->overhead;
+			}
+
+			ret = ds->ops->port_change_mtu(ds, port,
+						       info->mtu + overhead);
 			if (ret)
 				return ret;
 		}